optimization: load a vocabulary only once even if used in different languages #736

Merged: 1 commit, Oct 3, 2023

@osma osma commented Sep 22, 2023

While looking at ways to implement #735, I discovered an opportunity for optimization in the registry code that handles loading of vocabularies. For some reason (probably my mistake), the registry loads each vocabulary multiple times, once per language. This wastes both time and memory.

This PR adjusts the code slightly so that vocabularies are always loaded just once. This was always the intention since the introduction of multilingual vocabularies (#559, PR #600 etc.) and especially PR #610 which implemented vocabularies that are shared between projects.
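The idea can be illustrated with a minimal sketch. This is not the actual Annif registry code: the `Vocabulary` and `VocabRegistry` classes and the `get_vocab` method below are hypothetical stand-ins. The only point is that the cache is keyed by vocabulary ID alone, so requests for different languages share one loaded object.

```python
# Hypothetical sketch, NOT Annif's actual registry implementation.
class Vocabulary:
    """Stand-in for a multilingual vocabulary; counts how often one is loaded."""

    load_count = 0

    def __init__(self, vocab_id):
        Vocabulary.load_count += 1  # pretend this is the expensive load
        self.vocab_id = vocab_id


class VocabRegistry:
    def __init__(self):
        # Keyed by vocabulary ID only, not by (vocabulary ID, language).
        self._vocabs = {}

    def get_vocab(self, vocab_id, language):
        """Return the shared vocabulary object together with the requested language."""
        if vocab_id not in self._vocabs:
            self._vocabs[vocab_id] = Vocabulary(vocab_id)
        return self._vocabs[vocab_id], language


registry = VocabRegistry()
# Three projects using the same vocabulary in different languages.
results = [registry.get_vocab("yso", lang) for lang in ("fi", "sv", "en")]
```

With a cache keyed by `(vocab_id, language)` instead, the same three calls would construct three separate `Vocabulary` objects.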

I benchmarked this with an installation where I have three Finto AI MLLM projects (languages fi, sv, en) that all use the YSO vocabulary, but in different languages. I ran the command

ANNIF_CONFIG=annif.default_config.ProductionConfig /usr/bin/time -v annif list-projects

The idea here is to use ProductionConfig, which causes all projects to be loaded on startup instead of on demand. This means that the vocabulary is loaded as well.

Before

(showing selected stats)

	User time (seconds): 13.04
	System time (seconds): 6.13
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:12.27
	Maximum resident set size (kbytes): 539600

After

	User time (seconds): 12.82
	System time (seconds): 7.26
	Elapsed (wall clock) time (h:mm:ss or m:ss): 0:11.66
	Maximum resident set size (kbytes): 428940

So there's a slight speedup in wall-clock time (12.27 s to 11.66 s), and the peak memory usage (maximum resident set size) drops by about 110 MB, roughly 20%. Not bad for a patch that also reduces the amount of code by 3 lines.
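For reference, the differences can be recomputed from the figures quoted above. This is just arithmetic on the reported numbers; the variable names are made up for the illustration.

```python
# Figures copied from the "Before" and "After" output of /usr/bin/time -v above.
before_rss_kb = 539600
after_rss_kb = 428940
before_wall_s = 12.27
after_wall_s = 11.66

saved_rss_kb = before_rss_kb - after_rss_kb        # 110660 kbytes
saved_rss_pct = 100 * saved_rss_kb / before_rss_kb  # a little over 20 %
saved_wall_s = round(before_wall_s - after_wall_s, 2)

print(f"RSS saved: {saved_rss_kb} kbytes ({saved_rss_pct:.1f} %)")
print(f"Wall clock saved: {saved_wall_s} s")
```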

@osma osma added this to the 1.1 milestone Sep 22, 2023
@osma osma self-assigned this Sep 22, 2023
sonarcloud bot commented Sep 22, 2023

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

No Coverage information
0.0% Duplication


@juhoinkinen juhoinkinen left a comment

Good find

@osma osma merged commit ff1d32c into main Oct 3, 2023
12 of 13 checks passed
@osma osma deleted the optimize-registry-vocab-language branch October 3, 2023 13:03