Significant performance improvements to AddResult / GetParameter #4446

fblanchetNaN · 2022-08-01T14:11:33Z

Saving and loading data from the SQL database rely on _adapt_xxx and _convert_xxx methods. However, currently they are using many numpy functions (np.isnan, np.isinf, np.save, np.load) that are slow when dealing with arrays containing single values. This PR makes the following changes that significantly improve data loading and saving performance:

For float: Replace numpy functions with math functions, and use math.isfinite instead of combining isnan and isinf;
For complex: Do not save complex single values as numpy arrays of one value. It saves both time and space if we avoid reading and writing the relatively large numpy header. This changes the representation of the data in the database, but it's fully backwards compatible (and tested);
For array: Use functions from np.lib.format directly, which skips some of the work done by np.save and np.load.
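As an illustration of the float item, here is a minimal sketch of an adapter built on the standard math module. The name adapt_float and the choice to store NaN as the text "nan" are assumptions made for this example, not necessarily the exact code in this PR.

```python
import math
from typing import Union


def adapt_float(value: float) -> Union[float, str]:
    """Hypothetical adapter: NaN becomes text, everything else stays a float."""
    # Fast path: a single check covers every ordinary (finite) value.
    if math.isfinite(value):
        return float(value)
    # Only NaN, +inf and -inf reach this point.
    if math.isnan(value):
        return "nan"
    return float(value)
```

A function like this could be registered with sqlite3.register_adapter(float, adapt_float). The point of the design is that math.isfinite works on a plain Python float without first wrapping it in a numpy scalar or array.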

b0b6528 rewrites adapters and converters avoiding numpy functions and significantly improves performance (around x3 when loading regular numeric data and up to x70 for complex data).
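For the array case, a rough sketch of what calling np.lib.format directly can look like is shown below. The function names adapt_array and convert_array are hypothetical, and the real adapters in the commit may differ in detail.

```python
import io

import numpy as np
from numpy.lib import format as npy_format


def adapt_array(arr: np.ndarray) -> bytes:
    """Hypothetical adapter: serialize an array to .npy-formatted bytes."""
    out = io.BytesIO()
    # write_array emits the .npy header followed by the raw data.
    npy_format.write_array(out, arr, allow_pickle=False)
    return out.getvalue()


def convert_array(blob: bytes) -> np.ndarray:
    """Hypothetical converter: rebuild the array from .npy-formatted bytes."""
    # read_array parses the header and then the data from the stream.
    return npy_format.read_array(io.BytesIO(blob), allow_pickle=False)
```

Because the bytes are still in the regular .npy format, data written this way should remain readable with np.load, and blobs written earlier with np.save should remain readable with convert_array.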

20c58d7 is a minor improvement (~10% when loading complex or numeric data): it avoids unpacking sqlite3.Row into a list when that is unnecessary. The affected functions can reorder the columns, but this is rarely needed because the SQL queries already return the columns in the right order.
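The difference can be sketched with a toy table. The table and column names below are made up for the example and are not taken from the QCoDeS schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row
conn.execute("CREATE TABLE results (x REAL, y REAL)")
conn.executemany("INSERT INTO results VALUES (?, ?)", [(0.0, 1.0), (0.5, 2.0)])

columns = ["x", "y"]
rows = conn.execute("SELECT x, y FROM results ORDER BY rowid").fetchall()

# Slower pattern: rebuild every row as a list, looking up each column by name.
unpacked = [[row[name] for name in columns] for row in rows]

# Cheaper pattern: the SELECT already fixes the column order, so the rows can
# be consumed positionally; sqlite3.Row supports iteration directly.
x_values, y_values = zip(*rows)
```

Both patterns yield the same values here. The second one simply skips the per-row list construction and the name lookups.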

3d7b0fd follows the previous one: as the SQL queries are properly sorted, the named tuple feature of sqlite3.Row (row[column_name]) becomes superfluous.
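A minimal sketch of the idea, assuming a hypothetical helper fetch_ordered and a toy table: with the default row factory, sqlite3 returns plain tuples, and listing the columns explicitly in the SELECT makes positional access sufficient.

```python
import sqlite3


def fetch_ordered(conn, table, columns):
    """Hypothetical helper: return rows as plain tuples in `columns` order."""
    # Quoting identifiers like this is for illustration only; real code
    # should validate table and column names before building SQL.
    column_sql = ", ".join(f'"{name}"' for name in columns)
    query = f'SELECT {column_sql} FROM "{table}" ORDER BY rowid'
    return conn.execute(query).fetchall()


conn = sqlite3.connect(":memory:")  # default row factory: plain tuples
conn.execute("CREATE TABLE results (x REAL, y REAL)")
conn.executemany("INSERT INTO results VALUES (?, ?)", [(0.0, 1.0), (0.5, 2.0)])

rows = fetch_ordered(conn, "results", ["y", "x"])
first_y, first_x = rows[0]  # positions follow the requested column order
```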

459b342 contains up-to-date benchmarks. (Note that the benchmarks are not run in CI.)
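For readers unfamiliar with asv, a benchmark is a plain class whose time_* methods get timed, with optional setup, teardown and params. The sketch below is a hypothetical stand-in that times a raw sqlite3 fetch. It is not the benchmark added in this PR, which targets the QCoDeS data set API.

```python
import sqlite3


class LoadNumericData:
    """Hypothetical asv-style benchmark; asv times methods named time_*."""

    # asv runs the benchmark once per parameter value.
    params = [1_000, 10_000]
    param_names = ["n_rows"]

    def setup(self, n_rows):
        # Build a fresh in-memory table so only the fetch itself is timed.
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute("CREATE TABLE results (x REAL, y REAL)")
        self.conn.executemany(
            "INSERT INTO results VALUES (?, ?)",
            ((i * 0.1, i * 0.2) for i in range(n_rows)),
        )

    def time_fetch_all(self, n_rows):
        self.conn.execute("SELECT x, y FROM results").fetchall()

    def teardown(self, n_rows):
        self.conn.close()
```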

Commit b0b6528 is the most important one; the other two modifications give fairly minor improvements relative to the number of changes to the codebase. If you feel that it's not worth changing so much for such a small gain, they can be left out.

(We worked on these performance improvements together with @mgunyho.)

closes #1238 (?)

codecov · 2022-08-02T09:21:38Z

Codecov Report

Merging #4446 (3362c91) into master (34f942b) will decrease coverage by 0.02%.
The diff coverage is 93.93%.

❗ Current head 3362c91 differs from pull request most recent head b476b79. Consider uploading reports for the commit b476b79 to get more accurate results

@@            Coverage Diff             @@
##           master    #4446      +/-   ##
==========================================
- Coverage   68.23%   68.21%   -0.03%     
==========================================
  Files         339      339              
  Lines       31801    31778      -23     
==========================================
- Hits        21700    21676      -24     
- Misses      10101    10102       +1

jenshnielsen · 2022-08-02T09:54:06Z

Thanks @fblanchetNaN and @mgunyho. I will need a bit of time to look at this, so please bear with me.

jenshnielsen · 2022-09-13T12:25:40Z

@fblanchetNaN and @mgunyho thanks for the contribution. @astafan8 and I finally had a chance to look at it today. We agreed that this looks good and we would like to merge most of it. We are not completely confident in the logic for scalar complex numbers, so we would prefer to merge the other optimizations and leave that one for a subsequent PR.

fblanchetNaN · 2022-09-14T08:00:30Z

Thanks for the response!

We agree that the complex scalar logic looks a bit convoluted, even though it is just following the numpy format standard. We can take it out from this PR and discuss it further separately, although for us this is actually the more significant change because it really affects the loading times in some of our experiments.
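To give a feel for the overhead in question, the sketch below compares the size of a single complex value saved through np.save with the 16 bytes of its two raw doubles. The struct packing is only a yardstick for the payload size; it is not the encoding proposed in this PR.

```python
import io
import struct

import numpy as np

value = 1.0 + 2.0j

# A one-element array written by np.save carries the .npy header plus the data.
buf = io.BytesIO()
np.save(buf, np.array([value]))
npy_size = len(buf.getvalue())

# The payload itself is just two little-endian 8-byte doubles.
raw = struct.pack("<dd", value.real, value.imag)
raw_size = len(raw)

# The raw bytes are enough to reconstruct the value.
restored = complex(*struct.unpack("<dd", raw))
```

Here raw_size is 16 bytes, while npy_size is several times larger because of the header, which is the per-value cost when every scalar is stored as its own array.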

I can remove the changes related to complex stuff and force-push if that sounds good. I guess we anyway need to rebase to fix the conflicts.

fblanchetNaN · 2022-09-14T09:29:37Z

The branch is now updated excluding the changes related to complex numbers

astafan8

Could you also add a newsfragment (read about them here https://qcodes.github.io/Qcodes/community/contributing.html?highlight=newsfragment#pull-requests )?

qcodes/dataset/sqlite/queries.py

qcodes/dataset/sqlite/query_helpers.py

qcodes/tests/dataset/test_sqlite_connection.py

qcodes/utils/types.py

fblanchetNaN · 2022-09-14T10:35:01Z

We just realized that some stuff from the complex adapter/converter was left over (like pointed out here, and also here). Is it ok to rebase and force-push? Or should I add new commits to preserve your review? Sorry about the mess.

astafan8 · 2022-09-14T11:13:22Z

> Is it ok to rebase and force-push? Or should I add new commits to preserve your review? Sorry about the mess.

no worries :) both are fine, feel free to do what's most convenient for you, the comments will be preserved anyways.

fblanchetNaN · 2022-09-20T11:24:32Z

The branch is now updated including the required changes, all checks have passed.

I have marked the conversations as resolved after replying/accepting them.

astafan8

bors merge

bors · 2022-09-20T12:29:50Z

Build succeeded:

fblanchetNaN force-pushed the master branch 5 times, most recently from 9d76b73 to 8ddeda1 Compare August 2, 2022 09:10

fblanchetNaN force-pushed the master branch 2 times, most recently from 8af5eef to 3d7b0fd Compare August 2, 2022 10:54

Add DataSet.get_parameter_data asv benchmark

005aa53

fblanchetNaN force-pushed the master branch from 3d7b0fd to 7270bca Compare September 14, 2022 09:28

astafan8 reviewed Sep 14, 2022


fblanchetNaN force-pushed the master branch 5 times, most recently from 008b641 to ec9bd8c Compare September 20, 2022 10:17

Florian Blanchet added 3 commits September 20, 2022 13:50

Improve sqlite3 converters and adapters

5fc0032

Do not unpack sqlite3.Row to list

4b72385

Get rid of sqlite3.Row and adjust SQL queries

b476b79

fblanchetNaN force-pushed the master branch from ec9bd8c to b476b79 Compare September 20, 2022 10:50

astafan8 approved these changes Sep 20, 2022


bors bot merged commit b48686e into microsoft:master Sep 20, 2022

fblanchetNaN mentioned this pull request Sep 21, 2022

Significant performance improvements for complex scalars #4642

Open
