Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up database querying #103

Closed
angus-g opened this issue Oct 12, 2018 · 0 comments · Fixed by #106
Closed

Speed up database querying #103

angus-g opened this issue Oct 12, 2018 · 0 comments · Fixed by #106
Assignees

Comments

@angus-g
Copy link
Collaborator

angus-g commented Oct 12, 2018

It is possible that the database will be overhauled as this project progresses, but in the meantime we can gain significant improvements by indexing on key columns in the database. From my own testing, an easy fix is: CREATE INDEX experiment_id ON ncfiles (experiment);.

Before index creation, a sample query:

sqlite> select distinct ncfile, dimensions, chunking from ncfiles
  WHERE experiment = '1deg_jra55v13_iaf_spinup1_A'
    AND (basename_pattern = 'ocean.nc' OR basename GLOB 'ocean.nc')
    AND variable in ('u')
  ORDER BY ncfile;
...
CPU Time: user 9.605138 sys 18.436621

Index creation on the full database, post-creation:

sqlite> create index experiment_id on ncfiles (experiment);
CPU Time: user 84.730643 sys 139.632335

Same query, after index creation:

sqlite> select distinct ncfile, dimensions, chunking from ncfiles
  WHERE experiment = '1deg_jra55v13_iaf_spinup1_A'
    AND (basename_pattern = 'ocean.nc' OR basename GLOB 'ocean.nc')
    AND variable in ('u')
  ORDER BY ncfile;
...
CPU Time: user 0.168388 sys 0.589691

This is a few orders of magnitude faster. Obviously this isn't the most rigorous of tests (filesystem caching etc.) One thing I'm not sure about is the effect on build_index

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant