[ENH] make nested queries faster #239

Merged
merged 42 commits into from
Mar 22, 2022

@jdkent (Member) commented Mar 15, 2022

SQLAlchemy uses lazy loading by default, meaning that accessing a relationship attribute of a loaded object emits another SQL query to fetch that attribute. This is costly when displaying nested attributes, so an eager loading scheme is a better fit here.
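To make the cost difference concrete, here is a minimal, library-agnostic sketch using only the standard-library `sqlite3` module. It is not SQLAlchemy code and not the code in this PR: the `CountingDB` wrapper, the two-table schema, and the helper names are all hypothetical. It only illustrates why one query per parent row (the lazy pattern) issues more queries than a single JOIN (the eager pattern).

```python
import sqlite3


class CountingDB:
    """Hypothetical in-memory database that counts the SELECTs it runs."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.query_count = 0
        self.conn.executescript(
            """
            CREATE TABLE analyses (id INTEGER PRIMARY KEY, name TEXT);
            CREATE TABLE points (
                id INTEGER PRIMARY KEY,
                analysis_id INTEGER REFERENCES analyses(id),
                x REAL
            );
            """
        )
        # Five analyses with three points each (made-up data).
        self.conn.executemany(
            "INSERT INTO analyses (id, name) VALUES (?, ?)",
            [(i, f"analysis-{i}") for i in range(1, 6)],
        )
        self.conn.executemany(
            "INSERT INTO points (analysis_id, x) VALUES (?, ?)",
            [(a, float(p)) for a in range(1, 6) for p in range(3)],
        )

    def query(self, sql, params=()):
        self.query_count += 1
        return self.conn.execute(sql, params).fetchall()


def lazy_style(db):
    """One query for the parents, then one extra query per parent."""
    result = {}
    for analysis_id, name in db.query("SELECT id, name FROM analyses ORDER BY id"):
        rows = db.query(
            "SELECT x FROM points WHERE analysis_id = ? ORDER BY id",
            (analysis_id,),
        )
        result[name] = [x for (x,) in rows]
    return result


def eager_style(db):
    """A single JOIN that fetches parents and children together."""
    result = {}
    rows = db.query(
        "SELECT a.name, p.x FROM analyses AS a "
        "LEFT JOIN points AS p ON p.analysis_id = a.id "
        "ORDER BY a.id, p.id"
    )
    for name, x in rows:
        points = result.setdefault(name, [])
        if x is not None:  # LEFT JOIN yields NULL for parents without children
            points.append(x)
    return result


lazy_db, eager_db = CountingDB(), CountingDB()
lazy_result = lazy_style(lazy_db)
eager_result = eager_style(eager_db)
print(lazy_db.query_count, eager_db.query_count)
```

Both functions return the same nested data, but the lazy pattern needs one query per parent on top of the initial one. In SQLAlchemy, eager loading is requested through loader options such as `joinedload` or `selectinload`; which strategy fits best depends on the shape of the relationships.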

changes:

  • eager load nested relationships with SQLAlchemy
  • skip marshmallow when serializing Data/Studysets into Python objects
  • use orjson instead of the built-in json library to serialize Python objects to JSON
  • eager load annotations to reduce the number of SQL queries
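The second and third items can be sketched roughly as follows. This is an illustration under stated assumptions, not the code from this PR: the `Point` and `Analysis` classes, the `*_to_dict` helpers, and `to_json_bytes` are hypothetical stand-ins. The idea is to build plain dicts directly from the loaded objects instead of going through a schema layer, then hand them to a fast JSON encoder. `orjson.dumps` returns `bytes`; the sketch falls back to the built-in `json` module when orjson is not installed.

```python
import json
from dataclasses import dataclass, field

try:
    import orjson  # optional third-party dependency

    def to_json_bytes(obj):
        # orjson.dumps returns bytes, not str.
        return orjson.dumps(obj)

except ImportError:

    def to_json_bytes(obj):
        # Fallback so the sketch still runs without orjson installed.
        return json.dumps(obj, separators=(",", ":")).encode("utf-8")


@dataclass
class Point:  # hypothetical stand-in for an ORM model
    x: float
    y: float
    z: float


@dataclass
class Analysis:  # hypothetical stand-in for an ORM model
    id: str
    name: str
    points: list = field(default_factory=list)


def point_to_dict(point):
    # Direct attribute access, no per-field schema machinery.
    return {"x": point.x, "y": point.y, "z": point.z}


def analysis_to_dict(analysis):
    return {
        "id": analysis.id,
        "name": analysis.name,
        "points": [point_to_dict(p) for p in analysis.points],
    }


analysis = Analysis("a1", "example", [Point(1.0, 2.0, 3.0), Point(-4.5, 0.0, 12.0)])
payload = to_json_bytes(analysis_to_dict(analysis))
print(payload)
```

The trade-off is that hand-written conversion gives up the validation and field configuration a schema library provides, so it is probably best reserved for the large, read-only nested responses where serialization time dominates.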

@jdkent (Member, Author) commented Mar 16, 2022

slowest query:

SELECT points.id AS points_id, points.created_at AS points_created_at,
       points.updated_at AS points_updated_at, points.x AS points_x,
       points.y AS points_y, points.z AS points_z, points.space AS points_space,
       points.kind AS points_kind, points.image AS points_image,
       points.label_id AS points_label_id, points.analysis_id AS points_analysis_id,
       points.user_id AS points_user_id, analyses_1.id AS analyses_1_id
FROM (SELECT datasets.id AS datasets_id
      FROM datasets
      WHERE datasets.public = true OR %(param_1)s = datasets.user_id
      ORDER BY datasets.created_at DESC
      LIMIT %(param_2)s OFFSET %(param_3)s) AS anon_1
JOIN dataset_studies AS dataset_studies_1 ON anon_1.datasets_id = dataset_studies_1.dataset_id
JOIN studies AS studies_1 ON studies_1.id = dataset_studies_1.study_id
JOIN analyses AS analyses_1 ON studies_1.id = analyses_1.study_id
JOIN points ON analyses_1.id = points.analysis_id

@jdkent (Member, Author) commented Mar 17, 2022

where time is going:

neurostore     | PATH: '/api/datasets/wXQ9Fxw3mPz3'
neurostore     |          34998995 function calls (24141844 primitive calls) in 26.160 seconds
neurostore     | 
neurostore     |    Ordered by: cumulative time
neurostore     |    List reduced from 1220 to 30 due to restriction <30>
neurostore     | 
neurostore     |    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
neurostore     |         1    0.000    0.000   26.170   26.170 /usr/local/lib/python3.10/site-packages/werkzeug/middleware/profiler.py:102(runapp)
neurostore     |         1    0.000    0.000   26.170   26.170 /usr/local/lib/python3.10/site-packages/flask/app.py:2086(__call__)
neurostore     |         1    0.000    0.000   26.170   26.170 /usr/local/lib/python3.10/site-packages/flask/app.py:2043(wsgi_app)
neurostore     |         1    0.000    0.000   26.138   26.138 /usr/local/lib/python3.10/site-packages/flask/app.py:1504(full_dispatch_request)
neurostore     |         1    0.017    0.017   26.138   26.138 /usr/local/lib/python3.10/site-packages/flask/app.py:1480(dispatch_request)
neurostore     |         1    0.000    0.000   26.121   26.121 /usr/local/lib/python3.10/site-packages/connexion/decorators/decorator.py:65(wrapper)
neurostore     |         1    0.000    0.000   18.975   18.975 /usr/local/lib/python3.10/site-packages/connexion/decorators/uri_parsing.py:132(wrapper)
neurostore     |         1    0.000    0.000   18.975   18.975 /usr/local/lib/python3.10/site-packages/connexion/decorators/validation.py:360(wrapper)
neurostore     |         1    0.000    0.000   18.973   18.973 /usr/local/lib/python3.10/site-packages/connexion/decorators/parameter.py:78(wrapper)
neurostore     |         1    0.000    0.000   18.973   18.973 /neurostore/neurostore/resources/data.py:231(get)
neurostore     |   44004/1    0.096    0.000   15.799   15.799 /usr/local/lib/python3.10/site-packages/marshmallow/schema.py:527(dump)
neurostore     |   84268/1    0.694    0.000   15.799   15.799 /usr/local/lib/python3.10/site-packages/marshmallow/schema.py:501(_serialize)
neurostore     |  407649/9    0.695    0.000   15.799    1.755 /usr/local/lib/python3.10/site-packages/marshmallow/fields.py:313(serialize)
neurostore     |   45872/1    0.076    0.000   15.799   15.799 /neurostore/neurostore/schemas/data.py:23(_serialize)
neurostore     |   44003/1    0.053    0.000   15.798   15.798 /usr/local/lib/python3.10/site-packages/marshmallow/schema.py:514(<listcomp>)
neurostore     |    407648    0.270    0.000    9.607    0.000 /usr/local/lib/python3.10/site-packages/marshmallow/fields.py:250(get_value)
neurostore     |    407648    0.180    0.000    9.337    0.000 /usr/local/lib/python3.10/site-packages/marshmallow/schema.py:469(get_attribute)
neurostore     |    407648    0.339    0.000    9.157    0.000 /usr/local/lib/python3.10/site-packages/marshmallow/utils.py:225(get_value)
neurostore     |    407648    0.242    0.000    8.724    0.000 /usr/local/lib/python3.10/site-packages/marshmallow/utils.py:251(_get_value_for_key)
neurostore     | 1166570/1022588    0.461    0.000    8.616    0.000 {built-in method builtins.getattr}
neurostore     |    523641    0.541    0.000    8.139    0.000 /usr/local/lib/python3.10/site-packages/sqlalchemy/orm/attributes.py:286(__get__)
neurostore     | 82266/41133    0.161    0.000    7.598    0.000 /usr/local/lib/python3.10/site-packages/sqlalchemy/orm/attributes.py:706(get)
neurostore     |     41133    0.248    0.000    7.237    0.000 /usr/local/lib/python3.10/site-packages/sqlalchemy/orm/strategies.py:675(_load_for_state)
neurostore     |         1    0.000    0.000    7.143    7.143 /usr/local/lib/python3.10/site-packages/connexion/apis/flask_api.py:137(get_response)
neurostore     |         1    0.000    0.000    7.143    7.143 /usr/local/lib/python3.10/site-packages/connexion/apis/abstract.py:266(_get_response)
neurostore     |         1    0.001    0.001    7.143    7.143 /usr/local/lib/python3.10/site-packages/connexion/apis/abstract.py:300(_response_from_handler)
neurostore     |         1    0.000    0.000    7.143    7.143 /usr/local/lib/python3.10/site-packages/connexion/apis/flask_api.py:183(_build_response)
neurostore     |         2    0.000    0.000    7.134    3.567 /usr/local/lib/python3.10/site-packages/flask/json/__init__.py:116(dumps)
neurostore     |         1    0.000    0.000    7.133    7.133 /usr/local/lib/python3.10/site-packages/connexion/apis/abstract.py:401(_prepare_body_and_status_code)
neurostore     |         1    0.000    0.000    7.133    7.133 /usr/local/lib/python3.10/site-packages/connexion/apis/flask_api.py:200(_serialize_data)
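
In the listing above, roughly 15.8 s of the 26.2 s total sits under marshmallow's `dump`, about 7.1 s under `flask/json` `dumps`, and `_load_for_state` is called 41133 times. The trace itself comes from werkzeug's profiler middleware, as the first frame shows. A report in the same shape (sorted by cumulative time, restricted to 30 rows) can be produced for any callable with the standard-library `cProfile` and `pstats` modules. The sketch below assumes a hypothetical `profile_call` helper and a toy workload; it is not the profiling setup used in this PR.

```python
import cProfile
import io
import pstats


def profile_call(func, *args, limit=30, **kwargs):
    """Run func under cProfile and return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)

    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Same ordering and row limit as the report in this thread.
    stats.sort_stats("cumulative").print_stats(limit)
    return result, stream.getvalue()


def toy_workload(n):
    # Hypothetical stand-in for an expensive serialization step.
    return [str(i) for i in range(n)]


result, report = profile_call(toy_workload, 1000)
print(report)
```

Reading the `cumtime` column top-down, as in the listing above, is usually the quickest way to see which layer of a request dominates.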

@jdkent changed the title from "[ENH] eager load nested queries" to "[ENH] make nested queries faster" Mar 17, 2022
@jdkent jdkent merged commit 0afca6c into master Mar 22, 2022