Loading all SpaCy models slows down extraction #30

Open
loctimize opened this issue Aug 17, 2022 · 0 comments

loctimize commented Aug 17, 2022

All SpaCy models are loaded up front, even the ones that are never used. See file spacy_models.py, line 19 and following.
As more languages are added, or if lg models are used, this could become a bottleneck and slow down the extraction process significantly.

Suggestion:
Check which language is requested and load only the model it requires, e.g. by changing the code from line 58 on to the following (and removing spacy_model = spacy_models[lang]):

        if lang == 'de':
            spacy_model = de_core_news_md.load()
        elif lang == 'en':
            spacy_model = en_core_web_md.load()
        ...

Lines 19-26 could then be removed as well.
What do you think?
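To make the idea more concrete, here is a minimal sketch of loading on demand with a small cache, so each model is loaded at most once and only when its language is actually requested. The names LazyModels, MODEL_PACKAGES and _import_and_load are made up for illustration and are not part of the project. The sketch assumes the model packages expose load(), as in the snippet above, and it only covers the two languages mentioned here.

```python
import importlib

# Language code -> spaCy model package name.
# Only the two packages named in this issue; extend as needed.
MODEL_PACKAGES = {
    "de": "de_core_news_md",
    "en": "en_core_web_md",
}


def _import_and_load(package_name):
    """Import a model package only when it is needed and call its load()."""
    return importlib.import_module(package_name).load()


class LazyModels:
    """Hypothetical helper: loads a model on first request and caches it."""

    def __init__(self, packages=MODEL_PACKAGES, loader=_import_and_load):
        self._packages = dict(packages)
        self._loader = loader  # injectable, so it can be swapped out in tests
        self._cache = {}

    def get(self, lang):
        """Return the model for `lang`, loading it only on the first call."""
        if lang not in self._packages:
            raise ValueError(f"Unsupported language: {lang!r}")
        if lang not in self._cache:
            self._cache[lang] = self._loader(self._packages[lang])
        return self._cache[lang]
```

With something like this, the lookup at line 58 could become spacy_model = models.get(lang), where models is a single shared LazyModels() instance. Compared with a plain if/elif chain, the cache avoids reloading a model when the same language is requested repeatedly.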
