Upgrade Simplemma to version 0.7 #594

Merged
osma merged 1 commit into master from upgrade-simplemma-0.7 on Jun 23, 2022

osma (Member) commented on Jun 23, 2022

This PR upgrades the Simplemma dependency (which is used in the simplemma analyzer, added in #591) from version 0.6 to 0.7.

Simplemma now loads language data internally, so the calling code could be simplified further. The `__getstate__` trick is no longer needed, so it was removed as well.
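
To illustrate why internal data loading makes the `__getstate__` trick unnecessary, here is a minimal sketch of the general pattern. It is not the actual Annif or Simplemma code: `_load_language_data`, `lemmatize`, `OldStyleAnalyzer` and `NewStyleAnalyzer` are hypothetical stand-ins, and the tiny dictionary replaces real language data.

```python
import functools


# Hypothetical stand-in for a lemmatizer library (NOT Simplemma's real API).
@functools.lru_cache(maxsize=None)
def _load_language_data(lang):
    """Pretend to load a large lemma dictionary; cached once per process."""
    toy_data = {"fi": {"kissat": "kissa", "talossa": "talo"}}
    return toy_data.get(lang, {})


def lemmatize(word, lang):
    """Library-side lookup: language data is loaded internally on first use."""
    return _load_language_data(lang).get(word, word)


class OldStyleAnalyzer:
    """The caller holds the language data, so it must be kept out of pickles."""

    def __init__(self, lang):
        self.lang = lang
        self._langdata = None  # loaded lazily on first use

    def normalize_word(self, word):
        if self._langdata is None:
            self._langdata = _load_language_data(self.lang)
        return self._langdata.get(word, word)

    def __getstate__(self):
        # Drop the bulky data before pickling; it is reloaded on demand.
        return {"lang": self.lang, "_langdata": None}


class NewStyleAnalyzer:
    """The caller only stores the language code; nothing bulky to pickle."""

    def __init__(self, lang):
        self.lang = lang

    def normalize_word(self, word):
        return lemmatize(word, lang=self.lang)
```

In this sketch the new-style class carries only a short string as state, so pickling it (for example when handing work to worker processes) needs no special handling.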

Related to this comment

Note that I haven't tested this for real (yet), just verified that the tests pass, so keeping it as a draft PR for now.

osma added this to the 0.58 milestone Jun 23, 2022
osma self-assigned this Jun 23, 2022

sonarcloud (bot) commented on Jun 23, 2022

Kudos, SonarCloud Quality Gate passed!

0 Bugs (rating A)
0 Vulnerabilities (rating A)
0 Security Hotspots (rating A)
0 Code Smells (rating A)

No Coverage information
0.0% Duplication

codecov (bot) commented on Jun 23, 2022

Codecov Report

Merging #594 (add4b19) into master (8702efb) will decrease coverage by 0.00%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #594      +/-   ##
==========================================
- Coverage   99.48%   99.48%   -0.01%     
==========================================
  Files          86       86              
  Lines        5645     5636       -9     
==========================================
- Hits         5616     5607       -9     
  Misses         29       29              

| Impacted Files | Coverage Δ |
|---|---|
| tests/test_analyzer_simplemma.py | 100.00% <ø> (ø) |
| annif/analyzer/simplemma.py | 100.00% <100.00%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

osma (Member, Author) commented on Jun 23, 2022

Test results added to the table taken from #591:

| backend | train set | lang | analyzer | train jobs | train time | train RSS | eval set | eval jobs | eval time | eval RSS | F1@5 | NDCG |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| mllm | kirjastonhoitaja/fulltext-train | fi | voikko | 4 | 357.78 | 1123776 | kirjastonhoitaja/test | 4 | 192.90 | 340248 | 0.3259 | 0.4368 |
| mllm | kirjastonhoitaja/fulltext-train | fi | simplemma-initial | 4 | 362.77 | 3456424 | kirjastonhoitaja/test | 4 | 221.09 | 2691512 | 0.3269 | 0.4303 |
| mllm | kirjastonhoitaja/fulltext-train | fi | simplemma+lru_cache | 4 | 357.15 | 3465288 | kirjastonhoitaja/test | 4 | 220.12 | 2691228 | 0.3217 | 0.4351 |
| mllm | kirjastonhoitaja/fulltext-train | fi | simplemma+lru_cache+load_data | 4 | 330.41 | 1900016 | kirjastonhoitaja/test | 4 | 213.71 | 1291668 | 0.3246 | 0.4363 |
| mllm | kirjastonhoitaja/fulltext-train | fi | simplemma 0.7 | 4 | 328.74 | 1904932 | kirjastonhoitaja/test | 4 | 215.02 | 1290344 | 0.3231 | 0.4380 |
| omikuji-parabel | yso-finna (100k) | fi | voikko | 4 | 243.78 | 748772 | kirjastonhoitaja/test | 4 | 21.84 | 557580 | 0.2033 | 0.2903 |
| omikuji-parabel | yso-finna (100k) | fi | simplemma-initial | 4 | 305.54 | 2748572 | kirjastonhoitaja/test | 4 | 49.17 | 2738524 | 0.2043 | 0.2890 |
| omikuji-parabel | yso-finna (100k) | fi | simplemma+lru_cache | 4 | 290.90 | 2775552 | kirjastonhoitaja/test | 4 | 41.80 | 2738080 | 0.2123 | 0.2826 |
| omikuji-parabel | yso-finna (100k) | fi | simplemma+lru_cache+load_data | 4 | 233.96 | 1648580 | kirjastonhoitaja/test | 4 | 42.36 | 1531068 | 0.2071 | 0.2970 |
| omikuji-parabel | yso-finna (100k) | fi | simplemma 0.7 | 4 | 228.22 | 1696524 | kirjastonhoitaja/test | 4 | 40.85 | 1530076 | 0.2103 | 0.2877 |

The results are essentially unchanged from the simplemma+lru_cache+load_data runs (three of the four timings are slightly faster) and no problems were encountered. I will merge this.
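
For readers who want to check the timing comparison, a small sketch like the following computes the relative change between the simplemma+lru_cache+load_data rows and the simplemma 0.7 rows. The numbers are copied from the table above; the variable and function names are illustrative only.

```python
# Timings copied from the table for the two most recent configurations.
# Keys are (backend, phase); values are (simplemma+lru_cache+load_data, simplemma 0.7).
timings = {
    ("mllm", "train"): (330.41, 328.74),
    ("mllm", "eval"): (213.71, 215.02),
    ("omikuji-parabel", "train"): (233.96, 228.22),
    ("omikuji-parabel", "eval"): (42.36, 40.85),
}


def percent_change(old, new):
    """Relative change from old to new, in percent (negative means faster)."""
    return (new - old) / old * 100.0


changes = {key: percent_change(old, new) for key, (old, new) in timings.items()}

for (backend, phase), change in changes.items():
    print(f"{backend:16s} {phase:5s} {change:+.2f}%")
```

Since each configuration appears to have been run once, differences of this size may well be within normal run-to-run variation.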

osma marked this pull request as ready for review June 23, 2022 10:17
osma merged commit aa50441 into master Jun 23, 2022
osma deleted the upgrade-simplemma-0.7 branch June 23, 2022 10:17