[enhancement] Result database performance improvements #3546
This PR adds two significant performance improvements for DB queries involving session information.
The performance of the `--describe-stored-sessions` option is considerably improved by removing the unnecessary decoding and re-encoding of the results. Since the result is going to be returned as JSON anyway, there is no need to pay the cost of decoding and re-encoding it, so it is returned raw from the database.
Another minor improvement further optimizes the JSON decoding inside the backend.
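The raw-return idea can be illustrated with a minimal sketch. It assumes an SQLite table that stores each session as a JSON text blob. The table name `sessions`, the column `json_blob`, and both helper functions are hypothetical and are not taken from the actual backend code.

```python
import json
import sqlite3


def make_db():
    # Hypothetical schema: one JSON text blob per stored session
    conn = sqlite3.connect(':memory:')
    conn.execute(
        'CREATE TABLE sessions(uuid TEXT PRIMARY KEY, json_blob TEXT)'
    )
    for i in range(3):
        blob = json.dumps({'uuid': f's{i}', 'num_cases': i})
        conn.execute('INSERT INTO sessions VALUES (?, ?)', (f's{i}', blob))

    conn.commit()
    return conn


def describe_decoded(conn):
    # Slow path: decode every blob, then re-encode the whole list
    rows = conn.execute('SELECT json_blob FROM sessions ORDER BY uuid')
    sessions = [json.loads(blob) for (blob,) in rows]
    return json.dumps(sessions)


def describe_raw(conn):
    # Fast path: the blobs are already valid JSON, so splice them
    # into a JSON array without decoding them
    rows = conn.execute('SELECT json_blob FROM sessions ORDER BY uuid')
    return '[' + ','.join(blob for (blob,) in rows) + ']'


conn = make_db()
```

Both helpers produce JSON documents that decode to the same list of sessions, but `describe_raw()` never parses the stored blobs, which is where the savings would come from on large result sets.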
More specifically, on a 15GB database, the following query spanning roughly 3K sessions was ~30x faster using the index:
`reframe --list-stored-testcases='now-7d:now?ci_pipeline_id=="123456789"'/mean:/+presult`
where `ci_pipeline_id` is a custom session property stored in the DB (see `--session-extras`).
The `--describe-stored-sessions` option with a similar span of records was 3x faster using the improvements from this PR.
Finally, the index addition is seamless. The first time a query is made to the DB with the new version, the necessary DB indexes will be created if they do not exist. Note that this very first query may take some time to execute, depending on the size of the DB.
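As a rough illustration of how such on-demand index creation can work in SQLite, the sketch below uses `CREATE INDEX IF NOT EXISTS`, which makes the operation safe to repeat on every connection. The table, column, and index names, as well as the `ensure_indexes()` helper, are hypothetical and may differ from what this PR actually does.

```python
import sqlite3


def ensure_indexes(conn):
    # Idempotent: SQLite skips the statement if the index already exists,
    # so only the very first call pays the cost of building it
    conn.execute(
        'CREATE INDEX IF NOT EXISTS idx_sessions_start '
        'ON sessions(session_start_unix)'
    )
    conn.commit()


conn = sqlite3.connect(':memory:')
# Hypothetical schema for the stored sessions
conn.execute(
    'CREATE TABLE sessions('
    'uuid TEXT PRIMARY KEY, session_start_unix REAL, json_blob TEXT)'
)

ensure_indexes(conn)   # builds the index
ensure_indexes(conn)   # no-op on subsequent calls

index_names = [
    name for (name,) in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'index' AND name = 'idx_sessions_start'"
    )
]
```

On an existing large database the first call has to scan the whole table to build the index, which is consistent with the note above that the very first query may take a while.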