Introduce a recursive-descent parser for ranges and comparators #34
udoprog force-pushed the udoprog:comparators branch from ec74fc3 to cd1f9e3 Nov 26, 2017
KodrAus commented Nov 26, 2017
This is an awesome effort @udoprog! Personally, I like the idea of having a solid parser to build off so am super on-board with this. I think we should make an effort to document the design of the tokenizer and parser using non-doc comment blocks internally, since it's a significant piece of infrastructure. What do you think?
udoprog force-pushed the udoprog:comparators branch from cd1f9e3 to 4874e5b Nov 26, 2017
@KodrAus doc comments seem like a natural place to put it. I'll be expanding the module level docs a bit more. But I'm always for more documentation! What else do you have in mind?
KodrAus commented Nov 27, 2017
Do you think this should be public at all? I kind of expected it wouldn't be. But that doesn't mean you can't have doc comments. Other than that I think this is excellent.
@KodrAus I just wanted documentation to generate for them right now. We can figure out what to do when/if this is ready?
udoprog force-pushed the udoprog:comparators branch from 78f4ecd to 5e796c3 Nov 27, 2017
KodrAus commented Nov 29, 2017
@udoprog That sounds fair enough
KodrAus commented Dec 7, 2017
@steveklabnik have you had a chance to look through this? It's a lot of code, but I think it makes tweaking edge cases and supporting new features more natural. Overall I think it's a solid foundation to build on.
udoprog force-pushed the udoprog:comparators branch from eb5f042 to db57060 Dec 8, 2017
I've done a rough port and removed dead code that is no longer needed, so that you can get a better idea of the damage this would do. Most noteworthy is that I had to backpedal on strict parsing of numeric components (here: e249514) because we need to support complex metadata that contains things with numeric prefixes like commits.
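To illustrate the numeric-prefix problem, here is a minimal sketch of non-strict component classification. It is not the code from this PR: the `Identifier` type and the `classify` function are hypothetical names, and the rule shown (only an all-digit component counts as numeric, everything else falls back to alphanumeric) is an assumption about what relaxing strict numeric parsing amounts to.

```rust
/// Hypothetical classification of one dot-separated metadata component.
#[derive(Debug, PartialEq)]
pub enum Identifier<'a> {
    /// Consists only of ASCII digits and fits in a `u64`.
    Numeric(u64),
    /// Anything else, including digit-prefixed text such as "7f3a9c".
    AlphaNumeric(&'a str),
}

/// Classify a component without rejecting digit-prefixed text.
pub fn classify(s: &str) -> Identifier<'_> {
    // Check the digits explicitly: `str::parse::<u64>` alone would also
    // accept a leading `+`, which is not a digit.
    if !s.is_empty() && s.bytes().all(|b| b.is_ascii_digit()) {
        if let Ok(n) = s.parse::<u64>() {
            return Identifier::Numeric(n);
        }
    }
    // Digit-prefixed text (or an all-digit value too large for `u64`)
    // is kept as an alphanumeric identifier.
    Identifier::AlphaNumeric(s)
}
```

A strict lexer that commits to "numeric" on the first digit would fail on a component like `7f3a9c`; deciding only after seeing the whole component avoids that.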
udoprog force-pushed the udoprog:comparators branch from 2814da5 to 1392f82 Dec 8, 2017
steveklabnik approved these changes Dec 8, 2017
Okay! I think overall this looks excellent. Sorry it took me a while; parsers are something that I find intimidating, which is dumb and is one of the reasons I maintain this package. I really like this. Let's merge it in. I have a few tiny things, other than that, does this need anything in particular?
    /// Check if the current token is a whitespace token.
    pub fn is_whitespace(&self) -> bool {
        match *self {
            Whitespace(_, _) => true,
steveklabnik (Owner), Dec 8, 2017
this isn't really an issue but I thought I'd mention it; you can also do
    Whitespace(..) => true,
here, which is ever so slightly more clear, personally.
i don't think this change needs to be made but if you agree and happen to do it, no big deal :)
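For reference, a small self-contained sketch of the `..` rest pattern suggested above. The `Token` enum here is a cut-down, hypothetical stand-in; only the two-field `Whitespace` variant is taken from the diff, the other variants and the field meanings are assumptions.

```rust
/// Cut-down, hypothetical token type for illustration only.
#[derive(Debug)]
pub enum Token<'input> {
    /// Run of whitespace; the two fields are assumed to be byte offsets.
    Whitespace(usize, usize),
    Numeric(u64),
    AlphaNumeric(&'input str),
}

impl<'input> Token<'input> {
    /// Check if the current token is a whitespace token.
    pub fn is_whitespace(&self) -> bool {
        match *self {
            // `..` ignores all remaining fields, so the arm keeps
            // compiling if the variant gains or loses a field.
            Token::Whitespace(..) => true,
            _ => false,
        }
    }
}
```

Both spellings behave the same today; `(..)` just avoids restating the field count.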
    pub struct Lexer<'input> {
        input: &'input str,
        chars: str::CharIndices<'input>,
        // lookahead
    #[test]
    pub fn all_tokens() {
        assert_eq!(
            lex("=><.^~*0.1234<=>=||"),
steveklabnik (Owner), Dec 8, 2017
this only seems to have 13 out of the 16 tokens; whitespace is missing
also i wonder if we shouldn't check x/X in this test too
udoprog (Author, Contributor), Dec 8, 2017
Added coverage for all tokens, and split out tests for components (alphanumeric, numeric).
Added a separate test for testing just the is_wildcard method.
    //! use semver_parser::parser::Parser;
    //! use semver_parser::range::Op;
    //!
    //! let mut p = Parser::new("^1").expect("a working parser");
steveklabnik (Owner), Dec 8, 2017
the expect message here is backwards; if this doesn't work out, it would say we had one, but we don't
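A short sketch of the point about `expect` messages: the string is only printed when the value is an `Err`, so it should describe the failure (or the expectation that was violated) rather than the success. `parse_caret_major` below is a hypothetical stand-in for `Parser::new`, not part of `semver_parser`, and the message wording is just one option.

```rust
/// Hypothetical stand-in for `Parser::new`: accepts a caret requirement
/// such as "^1" and returns its major version.
pub fn parse_caret_major(input: &str) -> Result<u64, String> {
    let digits = input
        .strip_prefix('^')
        .ok_or_else(|| format!("expected `^` at the start of {:?}", input))?;
    if digits.is_empty() || !digits.bytes().all(|b| b.is_ascii_digit()) {
        return Err(format!("invalid major version in {:?}", input));
    }
    digits.parse::<u64>().map_err(|e| e.to_string())
}

pub fn major_of_literal() -> u64 {
    // The message is shown only on failure, so it names the failure.
    parse_caret_major("^1").expect("failed to parse the literal requirement `^1`")
}
```

An alternative convention is to phrase the message as the expectation itself, for example "`^1` should be a valid requirement"; either way the panic output reads correctly.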
    //! use semver_parser::parser::Parser;
    //! use semver_parser::range::{Op, Predicate};
    //!
    //! let mut p = Parser::new("^1.0").expect("a working parser");
    //! pre: vec![],
    //! })), p.predicate());
    //!
    //! let mut p = Parser::new("^*").expect("a working parser");
    let (patch, patch_wildcard) = self.dot_component()?;
    let pre = self.pre()?;

    // TODO: avoid illegal combinations, like `1.*.0`.
udoprog (Author, Contributor), Dec 8, 2017
This is something we currently accept, and I added a test for it in #35 to make sure we stay consistent. Not sure if we should break it or not.
steveklabnik (Owner), Dec 8, 2017
so, using https://semver.npmjs.com/, it seems that this is accepted. guess we should keep it
udoprog (Author, Contributor), Dec 8, 2017
Yeah, it's just a bit unsound. The following requirements are currently equivalent as per our and their implementation:
- 1.*
- 1.*.*
- 1.*.0, 1.*.1, ... any number. Note how they are ignored.
Maybe this should generate a warning? A user might expect that 1.*.10 matches any patch-release 10 across different minor releases.
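The warning idea could be sketched roughly as follows. This is not part of the PR: `Part` and `ignored_component` are hypothetical names, and the check simply reports the first exact component that follows a wildcard, which is the one a user might wrongly expect to be honoured.

```rust
/// Hypothetical representation of one parsed version component.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Part {
    Wildcard,
    Exact(u64),
}

/// Return the index (0 = major, 1 = minor, 2 = patch) of the first exact
/// component that follows a wildcard, if any. Such a component is
/// effectively ignored, so a caller could warn about it.
pub fn ignored_component(parts: &[Part]) -> Option<usize> {
    let mut seen_wildcard = false;
    for (i, part) in parts.iter().enumerate() {
        match *part {
            Part::Wildcard => seen_wildcard = true,
            Part::Exact(_) if seen_wildcard => return Some(i),
            Part::Exact(_) => {}
        }
    }
    None
}
```

For `1.*.10` this returns `Some(2)`, while `1.*` and `1.*.*` return `None`, so existing requirements keep parsing and only the surprising form is flagged.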
udoprog added some commits Nov 26, 2017
udoprog force-pushed the udoprog:comparators branch from 1392f82 to d4a573c Dec 8, 2017
@steveklabnik rebased and fixed comments. Didn't realize the rebase would mess up comment history, so sorry about that. At least the changes are additional commits.
Naw, it's all good, don't worry about it :)
So, yeah, looks great. Is this ready to merge?
I'd say yes. There are two outstanding improvements I'd like to see before this lands in a release, and I'll create separate issues for them if you decide to merge this:
- Error messages are currently a bit debuggy (https://github.com/steveklabnik/semver-parser/pull/34/files#diff-2c09afcdc3c420ab0678ba9b5e83959cR87) and not suitable for end users. We could do neater things like quoting the input that broke the parsing or providing suggestions for what was expected.
- There's this blanket conversion to string that I would eventually like to get rid of:
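On the first point (error messages), one possible direction is a dedicated `Display` implementation that quotes the offending input and states what was expected, instead of relying on `Debug` output. The sketch below is an assumption about how that might look; `ParseError` and its variants are hypothetical and are not the error type used in this PR.

```rust
use std::fmt;

/// Hypothetical parser error carrying enough context for a readable message.
#[derive(Debug, PartialEq)]
pub enum ParseError<'input> {
    /// A token was found where something else was required.
    UnexpectedToken { found: &'input str, expected: &'static str },
    /// The input ended before the construct was complete.
    UnexpectedEnd { expected: &'static str },
}

impl<'input> fmt::Display for ParseError<'input> {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match *self {
            // Quote the offending text and say what would have been valid.
            ParseError::UnexpectedToken { found, expected } => {
                write!(f, "unexpected `{}`, expected {}", found, expected)
            }
            ParseError::UnexpectedEnd { expected } => {
                write!(f, "unexpected end of input, expected {}", expected)
            }
        }
    }
}
```

Keeping the structured variants (rather than converting to a string early) also leaves room for callers to match on the error kind.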
There's also always the risk of regressions. But given that it currently passes the existing suite of tests, this would always be the case when introducing changes.
Both of those seem relatively minor, so let's
steveklabnik merged commit 8787d92 into steveklabnik:master Dec 8, 2017
1 check passed
udoprog deleted the udoprog:comparators branch Dec 8, 2017
Re: risk of regressions, I am opening a new issue I'll cc you on; not sure if the bug is here or not.
KodrAus commented Dec 8, 2017
udoprog commented Nov 26, 2017 (edited)
This only introduces a recursive descent parser to garner feedback.
If merged, it's currently not 'hooked up' to anything.
- `||` as a separator #22. `and` is already covered with the supported comma-based syntax.
- solved by matching identifiers.
- `X` and `x` become an issue with using a context-free lexer: since either might occur as a prefix in an identifier, it needs special handling.
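The `X`/`x` issue can be illustrated with a minimal sketch. It assumes the lexer hands over a whole alphanumeric run and that the wildcard decision is made afterwards on the complete text; `Component` and `component` are hypothetical names, not the types introduced by this PR.

```rust
/// Hypothetical result of interpreting one lexed component.
#[derive(Debug, PartialEq)]
pub enum Component<'input> {
    Wildcard,
    Numeric(u64),
    AlphaNumeric(&'input str),
}

/// Interpret a complete component. `x` and `X` are wildcards only when
/// they make up the whole component, never when they merely start an
/// identifier such as "x86".
pub fn component(text: &str) -> Component<'_> {
    match text {
        "*" | "x" | "X" => Component::Wildcard,
        _ => {
            // Only all-digit text is numeric; everything else is kept
            // as an alphanumeric identifier.
            if !text.is_empty() && text.bytes().all(|b| b.is_ascii_digit()) {
                if let Ok(n) = text.parse::<u64>() {
                    return Component::Numeric(n);
                }
            }
            Component::AlphaNumeric(text)
        }
    }
}
```

A lexer that emitted a wildcard token as soon as it saw `x` would split `x86` into a wildcard followed by `86`; matching the full identifier first sidesteps that.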