Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempted to implement lazy loading of languages data #294

Merged
merged 20 commits into from
Jan 15, 2018
Merged

Attempted to implement lazy loading of languages data #294

merged 20 commits into from
Jan 15, 2018

Conversation

sarthakmadaan
Copy link
Contributor

Attempted to implement lazy loading to resolve issue #253 .

@codecov-io
Copy link

codecov-io commented Mar 22, 2017

Codecov Report

Merging #294 into master will increase coverage by 0.38%.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff            @@
##           master   #294      +/-   ##
========================================
+ Coverage   97.61%    98%   +0.38%     
========================================
  Files          20     20              
  Lines        1674   1650      -24     
========================================
- Hits         1634   1617      -17     
+ Misses         40     33       -7
Impacted Files Coverage Δ
dateparser/utils/__init__.py 98.09% <ø> (-0.14%) ⬇️
dateparser/languages/detection.py 100% <100%> (ø) ⬆️
dateparser/conf.py 97.91% <100%> (ø) ⬆️
dateparser/date.py 97.61% <100%> (+1.54%) ⬆️
dateparser/languages/validation.py 100% <100%> (ø) ⬆️
dateparser/languages/loader.py 96.61% <100%> (+10.24%) ⬆️
dateparser/languages/__init__.py 100% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 5308434...3f65cd7. Read the comment docs.

@sarthakmadaan
Copy link
Contributor Author

I checked the times of importing of dateparser in current system and after making changes.

Current :

>>> import timeit
>>> timeit.timeit('import dateparser',number=1)
1.249682903289795

After making changes :

>>> import timeit
>>> timeit.timeit('import dateparser',number=1)
0.6022160053253174

@sarthakmadaan
Copy link
Contributor Author

@waqasshabbir could you please have a look at this PR. I have implemented on-demand loading of languages data and avoided loading at import time to reduce import time and have better performance in long run.

self.languages=languages
self.are_languages_passed = bool(languages)

available_language_shortnames=['en', 'ar', 'be', 'bg', 'bn', 'cs', 'da', 'de', 'es',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We must not mix data loading logic with this class here especially when we can completely avoid it by making necessary changes in LanguageDataLoader class.

@sarthakmadaan
Copy link
Contributor Author

Thanks for the review @waqasshabbir . I have made required changes to implement loading logic entirely in LanguageDataLoader.

@dposada
Copy link

dposada commented Oct 6, 2017

What's the status of this? I would love to take advantage of the speedup

@asadurski asadurski merged commit 3f65cd7 into scrapinghub:master Jan 15, 2018
@sarthakmadaan sarthakmadaan deleted the reduce_import_time branch February 5, 2018 18:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants