Optimize startup time using local & lazy imports (take 2) #544

Merged
osma merged 5 commits into master from issue514-optimize-lazy-imports-take2
Dec 20, 2021

Conversation

osma
Member

@osma osma commented Dec 20, 2021

Simplified version of PR #543
Fixes #514

The goal of this PR is to reduce CLI startup time by avoiding useless work, especially imports that are not necessary for the requested operation.

It makes the following changes to the import statements within the Annif codebase:

  • complete rewrite of annif/backend/__init__.py; the end result is that backends (and the libraries they require, e.g. fasttext, omikuji and tensorflow) are only imported when they are actually used
  • avoid importing NLTK and sklearn unless actually required, by moving import statements inside functions and methods
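The lazy backend lookup described in the first bullet could be sketched roughly as follows. This is not the actual content of annif/backend/__init__.py: the registry `_BACKENDS`, the function `get_backend` and the error messages are hypothetical, and standard-library modules stand in for heavy libraries such as fasttext, omikuji and tensorflow.

```python
import importlib

# Hypothetical registry: backend id -> (module path, class name).
# Stdlib modules stand in for heavy optional libraries here.
_BACKENDS = {
    "json": ("json", "JSONDecoder"),
    "fractions": ("fractions", "Fraction"),
    "missing": ("module_that_is_not_installed", "Whatever"),
}


def get_backend(backend_id):
    """Import and return the class for backend_id, only when requested."""
    try:
        module_name, class_name = _BACKENDS[backend_id]
    except KeyError:
        raise ValueError(f"No such backend type {backend_id}") from None
    try:
        # The import cost is paid here, on first use, not at startup.
        module = importlib.import_module(module_name)
    except ImportError as err:
        raise ValueError(
            f"Backend {backend_id} is not available: {err}"
        ) from err
    return getattr(module, class_name)
```

With this shape, a command that never asks for a backend never triggers any of the backend imports, and a missing optional library surfaces as a ValueError only when that backend is requested.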

I tried to keep the impact on the code minimal, so I made imports local only where the import was used in very few places within the same module.
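As a minimal illustration of a local import (not code from this PR), the hypothetical function below loads its dependency only when it is called. The `statistics` module stands in for a heavy library such as NLTK or sklearn.

```python
def mean_score(scores):
    """Return the arithmetic mean of a non-empty sequence of scores."""
    # Local import: the module is loaded the first time this function
    # runs, not when the enclosing module is imported.
    import statistics

    return statistics.mean(scores)
```

Repeated calls stay cheap because Python caches loaded modules in `sys.modules`, so only the first call pays the import cost. The trade-off is that the dependency is less visible than a top-of-file import, which is why this suits modules with very few uses.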

Startup time for simple commands such as annif --help and annif --version has been reduced by about two thirds (from roughly 4.1 s to 1.4 s of wall-clock time in the measurements below).

Before:

$ time annif --version
0.56.0.dev0

real	0m4,052s
user	0m4,001s
sys	0m0,568s

After:

$ time annif --version
0.56.0.dev0

real	0m1,385s
user	0m1,470s
sys	0m0,183s

As explained in #514, I also used tuna to visualize where the remaining import time is spent after this PR:
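For readers who want to reproduce this kind of measurement, here is a rough sketch that collects per-module import times using CPython's `-X importtime` option (available since Python 3.7). It assumes that option as the data source; the helper name `import_times` is made up for illustration, and `json` is used only as an example module.

```python
import subprocess
import sys


def import_times(module_name):
    """Return (module, cumulative_microseconds) pairs, slowest first."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = []
    prefix = "import time:"
    # The timing report is written to stderr, one line per imported module.
    for line in result.stderr.splitlines():
        if not line.startswith(prefix):
            continue
        fields = line[len(prefix):].split("|")
        if len(fields) != 3 or not fields[1].strip().isdigit():
            continue  # skip the header row
        rows.append((fields[2].strip(), int(fields[1])))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, microseconds in import_times("json")[:5]:
        print(f"{microseconds:>10} us  {name}")
```

Modules that the interpreter already loads during its own startup are reported too, so the list is usually longer than the single module requested.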

[image: tuna visualization of the remaining import time]

The main culprits are now connexion (with most of the time spent initializing openapi_spec_validator!) and flask. Those are core libraries and I don't think we can avoid importing them even for the simplest CLI commands.

TODO:

  • add tests for the ImportError/ValueError clauses in annif/backend/__init__.py

@osma osma added this to the 0.56 milestone Dec 20, 2021
@codecov

codecov bot commented Dec 20, 2021

Codecov Report

Merging #544 (40c1884) into master (fbd1f92) will increase coverage by 0.00%.
The diff coverage is 100.00%.


@@           Coverage Diff           @@
##           master     #544   +/-   ##
=======================================
  Coverage   99.48%   99.49%           
=======================================
  Files          80       80           
  Lines        5282     5313   +31     
=======================================
+ Hits         5255     5286   +31     
  Misses         27       27           
Impacted Files Coverage Δ
annif/analyzer/analyzer.py 100.00% <100.00%> (ø)
annif/analyzer/snowball.py 100.00% <100.00%> (ø)
annif/backend/__init__.py 100.00% <100.00%> (ø)
annif/cli.py 99.62% <100.00%> (+<0.01%) ⬆️
tests/test_backend.py 100.00% <100.00%> (ø)

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@osma osma changed the title from "Optimize startup time using lazy imports (take 2)" to "Optimize startup time using local & lazy imports (take 2)" Dec 20, 2021
@sonarcloud

sonarcloud bot commented Dec 20, 2021

Kudos, SonarCloud Quality Gate passed!

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information
0.0% Duplication

@osma osma marked this pull request as ready for review December 20, 2021 11:15
@osma osma merged commit 5c6af91 into master Dec 20, 2021
@osma osma deleted the issue514-optimize-lazy-imports-take2 branch December 20, 2021 12:05
@osma osma mentioned this pull request Dec 20, 2021
@osma osma mentioned this pull request Mar 21, 2023
Development

Successfully merging this pull request may close these issues.

Optimize startup time with lazy imports
2 participants