faster pop_tz_offset_from_string, 2x faster dateparser.parse if date has no timezone info #569
When most strings don't have a tz offset, this is massively faster, as we avoid a loop over all timezones (around 800 of them). But it's possible to improve this.
word_is_tz is supposed to be case sensitive, so don't modify its behaviour.
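For readers unfamiliar with the idea, here is a minimal sketch of the fast path, not the actual dateparser code. The timezone table, the pattern shapes and the function name `pop_tz_offset` are all assumptions made for illustration; the point is only that one combined, case-sensitive regex check lets strings without a timezone skip the per-timezone loop.

```python
import re

# Hypothetical, tiny stand-in for the real timezone table (around 800 entries).
TIMEZONES = {
    "UTC": 0,
    "EST": -5 * 3600,
    "CET": 1 * 3600,
    "PST": -8 * 3600,
}

# One compiled pattern per timezone: this is what the slow loop iterates over.
_TZ_PATTERNS = [
    (name, offset, re.compile(r"(?:^|\s)%s(?=\s|$)" % re.escape(name)))
    for name, offset in TIMEZONES.items()
]

# A single combined pattern used as a cheap pre-check (case sensitive on purpose).
_ANY_TZ = re.compile(
    r"(?:^|\s)(?:%s)(?=\s|$)" % "|".join(re.escape(name) for name in TIMEZONES)
)


def pop_tz_offset(date_string):
    """Return (string without the timezone, offset in seconds or None)."""
    # Fast path: most strings carry no timezone, so skip the loop entirely.
    if not _ANY_TZ.search(date_string):
        return date_string, None
    # Slow path: find which timezone matched and strip it from the string.
    for name, offset, pattern in _TZ_PATTERNS:
        if pattern.search(date_string):
            return pattern.sub("", date_string, count=1).strip(), offset
    return date_string, None
```

With this shape, the common case costs one regex search instead of one search per timezone, while strings that do contain a timezone pay for the extra pre-check.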
Tests run quite a bit faster after this change (e.g.
And here are results (on branch first):
So parsing without TZ is more than 2x faster, while with TZ it's a tiny bit slower. It should also be possible to bring the time of TZ parsing down.
Codecov Report
@@ Coverage Diff @@
## master #569 +/- ##
==========================================
+ Coverage 95.11% 95.12% +<.01%
==========================================
Files 302 302
Lines 2498 2500 +2
==========================================
+ Hits 2376 2378 +2
Misses 122 122
@asadurski do you think it's fine to merge this PR as it is now (big speedup for dates without TZ, small slowdown for dates with TZ), or is it better to improve the case with TZ as well? It should be possible to eliminate the loop from
I also checked import time, and it's affected a little as well:
It should be possible to reverse this, though, if we lazy-init the stuff used by the timezone parser.
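One possible way to do the lazy init, sketched under assumptions: the timezone names and the helper `get_tz_patterns` below are hypothetical and not part of dateparser. The regexes are compiled on first use rather than at import time, and `functools.lru_cache` keeps the result so later calls are free.

```python
import re
from functools import lru_cache

# Hypothetical short list; the real table is much larger.
_TIMEZONE_NAMES = ("UTC", "EST", "CET")


@lru_cache(maxsize=None)
def get_tz_patterns():
    """Compile the timezone patterns on the first call and cache them."""
    return tuple(
        re.compile(r"(?:^|\s)%s(?=\s|$)" % re.escape(name))
        for name in _TIMEZONE_NAMES
    )
```

Importing the module then costs almost nothing, and the compile cost is paid once by the first caller that actually needs timezone parsing.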
Thank you! Huge gain with negligible trade-off. Let's merge this and focus on negative case in another PR.
And I'll look some more into performance benchmark options with Travis.
Thank you @asadurski 👍
I think even a benchmark suite which can be run locally to compare with master would be very useful, and there are plenty of nice cases in the tests which could be used as benchmark input.
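As a rough idea of what a locally runnable benchmark could look like, here is a small sketch built on the standard `timeit` module. The helper name `bench` and its parameters are assumptions; in practice `func` would be the parsing function under test and `inputs` would come from the existing test cases.

```python
import timeit


def bench(func, inputs, number=1000, repeat=5):
    """Return the best observed time per call, in seconds."""
    def run():
        for value in inputs:
            func(value)

    # Take the minimum over several repeats to reduce noise from the machine.
    best = min(timeit.repeat(run, number=number, repeat=repeat))
    return best / (number * len(inputs))


if __name__ == "__main__":
    # str.upper is only a stand-in for the function being measured.
    samples = ["2017 10 05 12:00", "2017 10 05 12:00 EST"]
    print("%.3g s per call" % bench(str.upper, samples))
```

Running the same script on the branch and on master gives two numbers that can be compared directly.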
TODO: