Optimize startup time with lazy imports #514

Closed
osma opened this issue Aug 17, 2021 · 1 comment · Fixed by #544

osma commented Aug 17, 2021

Annif takes several seconds to start even when it's doing nothing but printing the version number or help text:

$ time annif --version
0.54.0.dev0

real	0m4,398s
user	0m4,322s
sys	0m0,470s

I investigated this a little bit using the -X importtime feature in Python 3.7+ and the tuna tool for visualizing profiling information. It seems that the time is mostly spent importing large libraries such as tensorflow, scikit-learn, optuna, connexion and nltk:

[Image: import-time profile visualized with tuna]
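
For anyone wanting to reproduce this kind of measurement without tuna, here is a minimal sketch that runs a statement in a fresh interpreter with -X importtime and lists the slowest imports. The helper names (parse_importtime, slowest_imports) are made up for illustration and are not part of Annif; the parser assumes the "import time: self | cumulative | name" line layout that CPython writes to stderr.

```python
import re
import subprocess
import sys

# Matches data lines such as "import time:       123 |        456 |   json.decoder".
# The header line ("self [us] | cumulative | ...") has no leading digits and is skipped.
_LINE = re.compile(r"^import time:\s+(\d+)\s+\|\s+(\d+)\s+\|\s+(\S+)")


def parse_importtime(stderr_text):
    """Parse `python -X importtime` output into (module, self_us, cumulative_us) tuples."""
    rows = []
    for line in stderr_text.splitlines():
        match = _LINE.match(line)
        if match:
            self_us, cumulative_us, name = match.groups()
            rows.append((name, int(self_us), int(cumulative_us)))
    return rows


def slowest_imports(statement, top=5):
    """Run `statement` in a fresh interpreter and return its slowest imports."""
    result = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", statement],
        capture_output=True,
        text=True,
        check=True,
    )
    rows = parse_importtime(result.stderr)
    # Sort by cumulative time, which includes everything a module imports itself.
    return sorted(rows, key=lambda row: row[2], reverse=True)[:top]


if __name__ == "__main__":
    # "import json" is only a stand-in; replace it with the import to be profiled.
    for name, self_us, cumulative_us in slowest_imports("import json"):
        print(f"{cumulative_us / 1000:8.1f} ms  {name}")
```

Sorting by cumulative time points at the top-level packages that pull in the most code, which is roughly what the tuna view shows graphically.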

These libraries are all unnecessary for simple operations such as annif --help and annif --version, so it would be better to avoid importing them altogether in those cases. There are some tutorials on lazy importing (e.g. this one), and the importlib library contains (since Python 3.5) a LazyLoader utility class, importlib.util.LazyLoader, that could be used here.

I experimented a bit with this lazy_import function but couldn't get it to work for nltk submodules:

# Adapted from: https://stackoverflow.com/questions/42703908/
import importlib.util
import sys


def lazy_import(fullname):
    """Lazily import a module the first time one of its attributes is used."""
    try:
        return sys.modules[fullname]
    except KeyError:
        spec = importlib.util.find_spec(fullname)
        loader = importlib.util.LazyLoader(spec.loader)
        spec.loader = loader
        module = importlib.util.module_from_spec(spec)
        # Register the module in sys.modules before the lazy loader runs,
        # so that later imports reuse this same (still unloaded) module.
        sys.modules[fullname] = module
        loader.exec_module(module)
        return module

This needs more experimentation but for now I'm just opening the issue...

osma added this to the Short term milestone Aug 17, 2021

osma commented Aug 26, 2021

Implementing lazy import of backends could partially solve the problem of using AVX instructions within VirtualBox, which quite frequently causes problems for participants in the Annif tutorial (see Troubleshooting in the VirtualBox install instructions). With lazy import, TensorFlow (which requires AVX) would only be imported if the NN ensemble backend is used, so the AVX problem would not affect other backends.
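
As a rough sketch of what per-backend lazy importing could look like (this is not Annif's actual backend code), a registry can map backend ids to module paths and only call importlib.import_module when a backend is requested. The registry name _BACKENDS, the function get_backend and the "module:attribute" convention are all assumptions made for this example, and standard-library modules stand in for heavy dependencies such as TensorFlow.

```python
import importlib

# Hypothetical registry: backend id -> "module:attribute".
# Standard-library modules stand in for heavy libraries such as TensorFlow.
_BACKENDS = {
    "decimal": "decimal:Decimal",
    "fraction": "fractions:Fraction",
}


def get_backend(backend_id):
    """Import the module behind backend_id only when it is first requested."""
    try:
        module_name, attr = _BACKENDS[backend_id].split(":")
    except KeyError:
        raise ValueError(f"unknown backend: {backend_id}") from None
    # import_module caches the result in sys.modules, so repeated calls are cheap.
    module = importlib.import_module(module_name)
    return getattr(module, attr)


if __name__ == "__main__":
    fraction_cls = get_backend("fraction")  # only now is `fractions` imported
    print(fraction_cls(1, 3) + fraction_cls(1, 6))
```

With this layout, a failing import (for example a library that needs AVX) would only surface when that particular backend is requested, not at startup.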
