Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a recursive-descent parser for ranges and comparators #34

Merged
merged 16 commits into from Dec 8, 2017

Conversation

Projects
None yet
3 participants
@udoprog
Copy link
Contributor

udoprog commented Nov 26, 2017

This only introduces a recursive descent parser to garner feedback.

If merged, it's currently not 'hooked up' to anything.

  • Parses everything in a single pass, with one token of lookahead.
  • All tests for the old parser have been copied and are working with this parser.
  • Benchmarks indicate that this is at least as fast as the old method.
  • Parses comparators using || as a separator #22. and is already covered with the supported comma-based syntax.
  • Has the necessary infrastructure to solve #33.
  • No dependencies.
  • X and x becomes an issue with using a context-free lexer: since it might occur as a prefix in an identifier it needs special handling. solved by matching identifiers.

@udoprog udoprog force-pushed the udoprog:comparators branch from ec74fc3 to cd1f9e3 Nov 26, 2017

@KodrAus

This comment has been minimized.

Copy link

KodrAus commented Nov 26, 2017

This is an awesome effort @udoprog! Personally, I like the idea of having a solid parser to build off so am super on-board with this.

I think we should make an effort to document the design of the tokenizer and parser using non-doc comment blocks internally, since it's a significant piece of infrastructure.

What do you think?

@udoprog udoprog force-pushed the udoprog:comparators branch from cd1f9e3 to 4874e5b Nov 26, 2017

@udoprog

This comment has been minimized.

Copy link
Contributor Author

udoprog commented Nov 26, 2017

@KodrAus doc comments seem like a natural place to put it. I'll be expanding the module level docs a bit more. But I'm always for more documentation! What else do you have in mind?

@KodrAus

This comment has been minimized.

Copy link

KodrAus commented Nov 27, 2017

Do you think this should be public at all? I kind of expected it wouldn't be. But that doesn't mean you can't have doc comments.

Other than that I think this excellent.

@udoprog

This comment has been minimized.

Copy link
Contributor Author

udoprog commented Nov 27, 2017

@KodrAus I just wanted documentation to generate for them right now. We can figure out what to do when/if this is ready?

@udoprog udoprog force-pushed the udoprog:comparators branch from 78f4ecd to 5e796c3 Nov 27, 2017

@KodrAus

This comment has been minimized.

Copy link

KodrAus commented Nov 29, 2017

@udoprog That sounds fair enough 👍

@KodrAus

This comment has been minimized.

Copy link

KodrAus commented Dec 7, 2017

@steveklabnik have you had a chance to look through this? It's a lot of code, but I think it makes tweaking edge cases and supporting new features more natural. Overall I think it's a solid foundation to build on.

@udoprog udoprog force-pushed the udoprog:comparators branch from eb5f042 to db57060 Dec 8, 2017

@udoprog

This comment has been minimized.

Copy link
Contributor Author

udoprog commented Dec 8, 2017

I've done a rough porting and removal of dead code that is no longer needed so that you can get a better idea of the damage this would do.

Most noteworthy is that I had to backpedal on strict parsing numeric components (here: e249514) because we need to support complex metadata that contains things with numeric prefixes like commits.

@udoprog udoprog force-pushed the udoprog:comparators branch from 2814da5 to 1392f82 Dec 8, 2017

@steveklabnik
Copy link
Owner

steveklabnik left a comment

Okay! I think overall this looks excellent. Sorry it took me a while; parsers are something that I find intimidating, which is dumb and is one of the reasons I maintain this package.

I really like this. Let's merge it in. I have a few tiny things, other than that, does this need anything in particular?

/// Check if the current token is a whitespace token.
pub fn is_whitespace(&self) -> bool {
match *self {
Whitespace(_, _) => true,

This comment has been minimized.

@steveklabnik

steveklabnik Dec 8, 2017

Owner

this isn't really an issue but I thought I'd mention it; you can also do

Whitespace(..) => true,

here, which is ever so slightly more clear, personally.

i don't think this change needs to be made but if you agree and happen to do it, no big deal :)

This comment has been minimized.

@udoprog

udoprog Dec 8, 2017

Author Contributor

Fixed.

pub struct Lexer<'input> {
input: &'input str,
chars: str::CharIndices<'input>,
// lookeahead

This comment has been minimized.

@steveklabnik

steveklabnik Dec 8, 2017

Owner

extra e here

This comment has been minimized.

@udoprog

udoprog Dec 8, 2017

Author Contributor

Fixed, thanks :).

#[test]
pub fn all_tokens() {
assert_eq!(
lex("=><.^~*0.1234<=>=||"),

This comment has been minimized.

@steveklabnik

steveklabnik Dec 8, 2017

Owner

this only seems to have 13 out of the 16 tokens; whitespace is missing

also i wonder if we shouldn't check x/X in this test too

This comment has been minimized.

@udoprog

udoprog Dec 8, 2017

Author Contributor

Added coverage for all tokens, and split out tests for components (alphanumeric, numeric).
Added a separate test for testing just the is_wildcard method.

//! use semver_parser::parser::Parser;
//! use semver_parser::range::Op;
//!
//! let mut p = Parser::new("^1").expect("a working parser");

This comment has been minimized.

@steveklabnik

steveklabnik Dec 8, 2017

Owner

the expect message here is backwards; if this doesn't work out, it would say we had one, but we don't

This comment has been minimized.

@udoprog

udoprog Dec 8, 2017

Author Contributor

Fixed, thanks.

//! use semver_parser::parser::Parser;
//! use semver_parser::range::{Op, Predicate};
//!
//! let mut p = Parser::new("^1.0").expect("a working parser");

This comment has been minimized.

@steveklabnik

steveklabnik Dec 8, 2017

Owner

same here

This comment has been minimized.

@udoprog

udoprog Dec 8, 2017

Author Contributor

Fixed, thanks.

//! pre: vec![],
//! })), p.predicate());
//!
//! let mut p = Parser::new("^*").expect("a working parser");

This comment has been minimized.

@steveklabnik

This comment has been minimized.

@udoprog

udoprog Dec 8, 2017

Author Contributor

Fixed, thanks.

let (patch, patch_wildcard) = self.dot_component()?;
let pre = self.pre()?;

// TODO: avoid illegal combinations, like `1.*.0`.

This comment has been minimized.

@steveklabnik

steveklabnik Dec 8, 2017

Owner

when we merge this, we should make issues for each of these TODOs

This comment has been minimized.

@udoprog

udoprog Dec 8, 2017

Author Contributor

This is something we currently accept, and I added a test for it in #35 to make sure we stay consistent. Not sure if we should break it or not.

This comment has been minimized.

@steveklabnik

steveklabnik Dec 8, 2017

Owner

i think #36 should be the answer to this question, basically

This comment has been minimized.

@steveklabnik

steveklabnik Dec 8, 2017

Owner

so, using https://semver.npmjs.com/, it seems that this is accepted. guess we should keep it

This comment has been minimized.

@udoprog

udoprog Dec 8, 2017

Author Contributor

Yeah, it's just a bit unsound. The following requirements are currently equivalent as per our and their implementation:

  • 1.*
  • 1.*.*
  • 1.*.0, 1.*.1, ... any number. Note how they are ignored.

Maybe this should generate a warning? A user might expect that 1.*.10 matches any
patch-release 10 across different minor releases.

@udoprog udoprog force-pushed the udoprog:comparators branch from 1392f82 to d4a573c Dec 8, 2017

@udoprog

This comment has been minimized.

Copy link
Contributor Author

udoprog commented Dec 8, 2017

@steveklabnik rebased and fixed comments. Didn't realize the rebase would mess up comment history, so sorry about that. At least the changes are additional commits.

@steveklabnik

This comment has been minimized.

Copy link
Owner

steveklabnik commented Dec 8, 2017

Naw, it's all good, don't worry about it :)

@steveklabnik

This comment has been minimized.

Copy link
Owner

steveklabnik commented Dec 8, 2017

So, yeah, looks great. Is this ready to merge?

@udoprog

This comment has been minimized.

Copy link
Contributor Author

udoprog commented Dec 8, 2017

I'd say yes.

There are two outstanding improvements I'd like to see before this lands in a release, and I'll create separate issues for them if you decide merge this:

Error messages are currently a bit debuggy (https://github.com/steveklabnik/semver-parser/pull/34/files#diff-2c09afcdc3c420ab0678ba9b5e83959cR87) and not suitable to end-users. We could do neater things like quoting the things that broke the parsing or provide suggestions for what was expected.

There's this blanket conversion to string that I would eventually like to get rid off:
https://github.com/steveklabnik/semver-parser/pull/34/files#diff-2c09afcdc3c420ab0678ba9b5e83959cR102

@udoprog

This comment has been minimized.

Copy link
Contributor Author

udoprog commented Dec 8, 2017

There's also always the risk for regressions. But given that it currently is passing the existing suite of tests, this would always be the case on introducing changes.

@steveklabnik

This comment has been minimized.

Copy link
Owner

steveklabnik commented Dec 8, 2017

Both of those seem relatively minor, so let's :shipit: . I won't cut a release for now, but I'd like to try to get a 1.0 prerelease shipped by Christmas.

@steveklabnik steveklabnik merged commit 8787d92 into steveklabnik:master Dec 8, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@udoprog udoprog deleted the udoprog:comparators branch Dec 8, 2017

@steveklabnik

This comment has been minimized.

Copy link
Owner

steveklabnik commented Dec 8, 2017

Re: risk of regressions, I am opening a new issue I'll cc you on; not sure if the bug is here or not.

@KodrAus

This comment has been minimized.

Copy link

KodrAus commented Dec 8, 2017

🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.