Very slow parsing of an invalid date #426

Closed
lopuhin opened this issue Jun 26, 2018 · 1 comment · Fixed by #428

lopuhin commented Jun 26, 2018

The first several calls take more than 1 second for me (Python 3.6 on OS X):

In [1]: import dateparser

In [2]: %time dateparser.parse('not a date')
CPU times: user 1.94 s, sys: 99.2 ms, total: 2.04 s
Wall time: 2.13 s

In [3]: %time dateparser.parse('not a date')
CPU times: user 1.32 s, sys: 7.37 ms, total: 1.32 s
Wall time: 1.33 s

After some time it goes down to 150-200 ms, which is still quite slow:

In [6]: %time dateparser.parse('not a date')
CPU times: user 145 ms, sys: 2.08 ms, total: 147 ms
Wall time: 152 ms

In [7]: %time dateparser.parse('not a date')
CPU times: user 147 ms, sys: 2 ms, total: 149 ms
Wall time: 153 ms

lopuhin commented Jun 26, 2018

It looks like it's possible to make this about 10x-20x faster, with only the first run being slow, by fixing how caching is performed in the dictionary module and by being smarter about stripping the timezone offset (pop_tz_offset_from_string). That stripping is currently done for every language, which is quite slow, and a locale is checked even if the result is the same.
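
To illustrate the two ideas above, here is a minimal sketch. It is not dateparser's actual code: `strip_tz_offset`, `load_language_dictionary`, `parse_with_languages`, the regex, and the tiny word lists are all hypothetical stand-ins. It assumes the offset can be stripped once before the per-language loop instead of inside it, and that the per-language dictionary can be cached so only the first call pays the build cost.

```python
import re
from functools import lru_cache

# Hypothetical pattern for a trailing UTC offset such as "+03:00" or "-0500".
_TZ_OFFSET_RE = re.compile(r"\s*([+-]\d{2}:?\d{2})$")


def strip_tz_offset(text):
    """Return (text_without_offset, offset_or_None)."""
    match = _TZ_OFFSET_RE.search(text)
    if match is None:
        return text, None
    return text[:match.start()], match.group(1)


@lru_cache(maxsize=None)
def load_language_dictionary(language):
    """Build a per-language lookup table; cached, so built once per language."""
    # Stand-in for expensive work such as compiling many regexes.
    words = {"en": {"today", "yesterday"}, "es": {"hoy", "ayer"}}
    return frozenset(words.get(language, ()))


def parse_with_languages(text, languages=("en", "es")):
    """Try each language in turn; return (language, word, offset) or None."""
    # Strip the offset once, not once per language.
    stripped, offset = strip_tz_offset(text)
    stripped = stripped.strip().lower()
    for language in languages:
        if stripped in load_language_dictionary(language):
            return language, stripped, offset
    return None
```

With this shape, a repeated call on an invalid input such as 'not a date' still visits every language, but each visit is only a cached lookup, and the offset regex runs once per call rather than once per language.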
