Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/nom parser #9

Merged
merged 3 commits into from Nov 3, 2015
Merged

Feature/nom parser #9

merged 3 commits into from Nov 3, 2015

Conversation

@hoodie
Copy link
Contributor

@hoodie hoodie commented Oct 30, 2015

Hi there,
I'm working on replacing the parsing implementation with nom.

this blog article has layed the groundwork.
I compared the performance of datetime, chrono and this. After switching from regex to nom the results of datetime are significantly better.
Bonus: with this datetime seems to also have a more complete implementation of the ISO8601 standard.

This is not mergeworthy yet, milliseconds are not supported yet, I just wanted to let you know.

@ogham
Copy link
Collaborator

@ogham ogham commented Nov 1, 2015

Oh hey again,

I actually saw your parsing benchmarks the other day, and thought, aw man, is using a regex really that bad? And apparently it is! I’ve also heard of nom but never played around with it, and while I thought it would be faster I didn’t think it would be that much faster! So this is great.

A while back, I changed the parse module (83b464d) so that instead of parsing into LocalDateTime instances, it parses into field values, which makes the parsing implementation completely separate from the date implementation, allowing you to use the parser without using the dates. However, you’ve now got your own nom iso8601 crate, which also allows you to use the parser without using the dates. So it may be possible to replace the parse module entirely with this nom crate, as I don’t think that module does anything your crate doesn’t do too. What do you think?

@ogham
Copy link
Collaborator

@ogham ogham commented Nov 1, 2015

By the way, I was kind of hoping you didn’t notice that I removed some of your parsing tests when I made some recent changes. These tests have been living in a text file on my desktop since then, and I definitely don’t want to release a new version of datetime until all your code is back where it belongs. I hope you didn’t get the wrong idea where I accepted your PR and then removed its code… It’ll be back, I promise! 😥

@hoodie
Copy link
Contributor Author

@hoodie hoodie commented Nov 1, 2015

hey, that nom iso8601 came out shortly after you merged my regex version - but I didn't think anything of it at the time.
What I didn't notice when I started trying to adjust it to fit rust-datetime was how incomplete it is.
Right now I'm both contributing to the nom-parser and here trying to mix and match both implementations. My goal is to make iso8601 (the crate) a self sufficient crate that does nothing but what it is supposed to and meanwhile remove all parsing logic from datetime. Since parsing a date and validating it are two different jobs entirely and datetime is suited much better for telling whether a year as a Feb29 or not than a set of parsers :D Right now I'm still struggling a bit with noms macro-DSL and giving it a nicer api than it currently has. Afterwards I'll carve out the remains of my regex implementation and plugin the nom parser as an external dependency.

The thing about my Regex implementation is not that it is that much slower. I never took any time to optimize it, but as you can tell from the PR I received, what slows it down is not the application of the regex, but the initialization. After that it is only about 2x-3x slower. The nom version also gets slower as I implement more of the standard. Still my hope is, at the end of this integration rust-datetime will have a more complete implementation of the iso8601 standard than chrono and still parse strings faster.

@ogham
Copy link
Collaborator

@ogham ogham commented Nov 1, 2015

My goal is to make iso8601 (the crate) a self sufficient crate that does nothing but what it is supposed to and meanwhile remove all parsing logic from datetime.

Sounds good—that was what the parse module was supposed to be, though your crate will be more complete than my module is. Especially with regards to weird parsing edge cases...

Since parsing a date and validating it are two different jobs entirely and datetime is suited much better for telling whether a year as a Feb29 or not than a set of parsers :D

I agree! datetime’s current parser will quite happily accept “29th February 2015” as a date, because there’s no way I’m putting that logic in a parsing routine.

The thing about my Regex implementation is not that it is that much slower. I never took any time to optimize it, but as you can tell from the PR I received, what slows it down is not the application of the regex, but the initialization.

That did surprise me, actually. I hoped that as more and more work was done on the Regex crate, it would get faster, and eventually the regex would move from being created at runtime to being compiled into actual code at compile time. But if nom can do the job just as well today, then that’s great.

Anyway, the only thing I can think of that’ll change in datetime in the near future is the module where fixed-offset timezones are, and that’s quite a small change. I remember changing the API for you a little to make parsing easier last time, so with luck that won’t be necessary anymore.

@hoodie
Copy link
Contributor Author

@hoodie hoodie commented Nov 2, 2015

Alright, I got it to work with nom and it's fast.
However I left in the parse.rs because it contains all of the glue I needed.
Unfortunately since we both tried to solve the same issues (you with lazy_static) this turned out to be a rather painful merge. I stopped after about 10minutes in and gave up, partly also because I don't know what else you changed.

@ogham
Copy link
Collaborator

@ogham ogham commented Nov 2, 2015

Ah, sorry about that. Want me to do it?

@hoodie
Copy link
Contributor Author

@hoodie hoodie commented Nov 2, 2015

Would be nice if you did

@ogham ogham self-assigned this Nov 3, 2015
@ogham
Copy link
Collaborator

@ogham ogham commented Nov 3, 2015

This merge sucks :p

Could you please upload your iso8601 crate to crates.io? I can't find it there at the moment, and it's referred to locally in the Cargo.toml:

[dependencies]
iso8601 = { path = "../iso8601" }
@hoodie
Copy link
Contributor Author

@hoodie hoodie commented Nov 3, 2015

ups, sorry, just point it to my repo http://github.com/hoodie/iso8601
its not ready for crates.io yet, I'm still talking to the Original Maintainer who also got some PRs from me
This is all a bit hot-needley :D

@ogham
Copy link
Collaborator

@ogham ogham commented Nov 3, 2015

OK!

@hoodie
Copy link
Contributor Author

@hoodie hoodie commented Nov 3, 2015

don't get me wrong, I consider my current version of the iso8601 to be very stable and well tested, also the API ought to remain virtually identical, I just didn't want to go ahead and release it without consorting with @badboy who originally wrote it

@ogham
Copy link
Collaborator

@ogham ogham commented Nov 3, 2015

Don't worry, that makes sense. Given that crates.io has all the crates in a single namespace, it's good to be polite and not take the "iso8601" name until it's agreed upon.

@ogham
Copy link
Collaborator

@ogham ogham commented Nov 3, 2015

(PS, I have this merge mostly done, I'll finish it after lunch)

@hoodie
Copy link
Contributor Author

@hoodie hoodie commented Nov 3, 2015

Bon appetit

@ogham ogham merged commit a544d79 into rust-datetime:master Nov 3, 2015
@hoodie
Copy link
Contributor Author

@hoodie hoodie commented Nov 3, 2015

\O/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

2 participants
You can’t perform that action at this time.