Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve import times of aiida.orm and aiida.storage #6382

Merged
merged 1 commit into from May 8, 2024

Conversation

danielhollas
Copy link
Collaborator

@danielhollas danielhollas commented May 7, 2024

  1. aiida.orm went from 600ms back to 300ms by deferring import of aiida.storage.
  2. Also some small improvements to both aiida.orm and aiida.storage, mainly from deferring build of pydantic models.

Main branch

Benchmark 1: python -c "import aiida.orm"
  Time (mean ± σ):     650.2 ms ±   9.2 ms    [User: 622.8 ms, System: 852.6 ms]
  Range (min … max):   641.0 ms … 663.2 ms    10 runs
 
Benchmark 1: python -c "import aiida.storage"
  Time (mean ± σ):     644.3 ms ±  12.0 ms    [User: 608.7 ms, System: 859.6 ms]
  Range (min … max):   632.4 ms … 671.6 ms    10 runs

Benchmark 1: verdi code list
  Time (mean ± σ):     926.9 ms ±  30.6 ms    [User: 871.6 ms, System: 852.1 ms]
  Range (min … max):   889.9 ms … 997.4 ms    10 runs

This branch

Benchmark 1: python -c "import aiida.orm"
  Time (mean ± σ):     282.4 ms ±   6.3 ms    [User: 296.7 ms, System: 794.5 ms]
  Range (min … max):   273.1 ms … 290.3 ms    10 runs
 
Benchmark 1: python -c "import aiida.storage"
  Time (mean ± σ):     588.4 ms ±  16.3 ms    [User: 564.5 ms, System: 857.0 ms]
  Range (min … max):   560.4 ms … 610.8 ms    10 runs

Benchmark 1: verdi code list
  Time (mean ± σ):     910.0 ms ±  29.4 ms    [User: 862.8 ms, System: 872.7 ms]
  Range (min … max):   836.6 ms … 939.5 ms    10 runs

@danielhollas danielhollas changed the title WIP: Fix regression in import time of aiida.orm module Fix regression in import time of aiida.orm May 8, 2024
1. Don't import aiida.storage from aiida.orm
2. use defer_build=True for all pydantic models
3. Improve import of storage.sqlite_zip
@danielhollas danielhollas marked this pull request as ready for review May 8, 2024 02:04
@danielhollas danielhollas requested a review from sphuber May 8, 2024 02:04
@danielhollas danielhollas changed the title Fix regression in import time of aiida.orm Improve import times of aiida.orm and aiida.storage May 8, 2024
Copy link
Contributor

@sphuber sphuber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @danielhollas . I guess the CI test we added a long time ago to catch these regressions is not doing its job. Guess it can be difficult to tune where we catch real significant regressions in performance but don't get hit by false positives all the time. Appreciate your effort to optimize this manually 👍

@sphuber sphuber merged commit fb9b6cc into aiidateam:main May 8, 2024
19 checks passed
@danielhollas danielhollas deleted the import-time-orm branch May 8, 2024 05:09
@danielhollas
Copy link
Collaborator Author

danielhollas commented May 13, 2024

Appreciate your effort to optimize this manually 👍

Cheers. :-)

I guess the CI test we added a long time ago to catch these regressions is not doing its job.

The tests works fine, but they only preventregressions in verdi. I've opened #6387 with some suggestions for further improvements.

I am afraid that even though I fixed the aiida.orm regression, it's practical impact will be small, since for any DB interaction one needs to load aiida.storage anyway (and one can see that the timings for verdi code list did not change much here). I am quite sad that even a very simple DB interaction takes a whole second (and it can pose real issues on AiiDAlab side). I'd like to understand more where the time is being spent. By some initial investigation, a lot of time is spent in sqlalchemy initialization. Seems like building the orm models is expensive, and we seem to be paying an upfront cost for all the DB backends unfortunately. I wonder if SQL supports something like defer_build in pydantic?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants