Fix lookup table tsv indexes #3841

Merged: 1 commit, Jan 29, 2024

codyebberson (Member) commented Jan 29, 2024

Fixes an issue in a previous migration, which failed to create the indexes on the tsvector expressions.
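For readers who have not seen the migration, here is a minimal sketch of the kind of statement such a fix would need to run. It is not the migration code from this PR. The helper names `quoteIdent` and `buildTsvIndexSql` are hypothetical. The `<table>_<column>_idx_tsv` naming and the `'simple'` configuration are inferred from the index name and filter expression in the analysis below. The use of `CONCURRENTLY` and `IF NOT EXISTS` is an assumption.

```typescript
// Hypothetical helper: quote a PostgreSQL identifier so mixed-case names
// such as "Address" keep their case.
function quoteIdent(name: string): string {
  return '"' + name.replace(/"/g, '""') + '"';
}

// Hypothetical helper: build the DDL for a GIN index over a tsvector expression.
// Assumption: the index name follows the "<table>_<column>_idx_tsv" pattern
// seen in the plan ("Address_state_idx_tsv").
function buildTsvIndexSql(table: string, column: string): string {
  const indexName = `${table}_${column}_idx_tsv`;
  return (
    `CREATE INDEX CONCURRENTLY IF NOT EXISTS ${quoteIdent(indexName)} ` +
    `ON ${quoteIdent(table)} USING gin (to_tsvector('simple', ${quoteIdent(column)}))`
  );
}
```

Calling `buildTsvIndexSql('Address', 'state')` produces a statement whose indexed expression matches the `to_tsvector('simple', ...)` expression used in the query filter, which is what allows the planner to pick the index.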


ChatGPT analysis of resulting queries:

The EXPLAIN output shows the execution plan for your SQL query. This query is selecting records from the "Patient" table where certain conditions are met, and there exists at least one matching record in the "Address" table. Let's break down the key components of this execution plan:

  1. Limit: The top node of the plan is a Limit of 21 rows (the query's LIMIT 21), so only the first 21 rows after sorting are returned.

  2. Sort: Before limiting the results, it sorts the data based on "Patient"."lastUpdated" in descending order. Sorting operations can be computationally expensive, especially if the dataset is large.

  3. Nested Loop: The query uses a nested loop join to combine records from "Patient" and "Address" tables. This method is generally efficient for queries where one of the tables returns a small set of rows.

  4. HashAggregate: This step is part of processing the EXISTS subquery. It aggregates the results from the "Address" table, grouping by "resourceId". This ensures that duplicate "resourceId" values from the "Address" table are consolidated.

  5. Bitmap Heap Scan on "Address": The plan scans the "Address" table using a bitmap heap scan, which is efficient for queries that retrieve a moderate number of rows. It's filtering based on the text search condition (to_tsvector('simple',"Address"."state") @@ to_tsquery('simple','CA:*')).

  6. Bitmap Index Scan on "Address_state_idx_tsv": This indicates that the query is using an index specifically designed for full-text search on the "state" column of the "Address" table. This helps in efficiently finding rows that match the text search criteria.

  7. Index Scan using "Patient_pkey": After identifying the relevant "resourceId" values from the "Address" table, the query then performs an index scan on the "Patient" table using its primary key. The conditions in the WHERE clause are applied to filter the results.
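The PR does not include the query itself, so the following is only a sketch of a query shape that could produce the plan steps listed above: an EXISTS subquery on "Address", a prefix text search, a descending sort, and a limit. The function names, the selected columns, and the join column "Patient"."id" are assumptions. The other WHERE conditions on "Patient" are omitted because they are not shown. Unlike the plan, which shows the literal 'CA:*', this sketch passes the values as parameters.

```typescript
interface SearchSql {
  text: string;
  values: (string | number)[];
}

// Turn a search term into a tsquery prefix term such as 'CA:*'.
// Simplification: only ASCII letters and digits are kept.
function toPrefixTsQuery(term: string): string {
  const cleaned = term.replace(/[^A-Za-z0-9]+/g, '');
  if (cleaned.length === 0) {
    throw new Error('search term has no searchable characters');
  }
  return cleaned + ':*';
}

// Hypothetical query builder. The WHERE clause contains only the EXISTS
// condition; the real query has additional "Patient" filters.
function buildPatientByStateQuery(statePrefix: string, limit: number): SearchSql {
  const text =
    'SELECT "Patient".* FROM "Patient" ' +
    'WHERE EXISTS (' +
    'SELECT 1 FROM "Address" ' +
    'WHERE "Address"."resourceId" = "Patient"."id" ' +
    `AND to_tsvector('simple', "Address"."state") @@ to_tsquery('simple', $1)` +
    ') ORDER BY "Patient"."lastUpdated" DESC LIMIT $2';
  return { text, values: [toPrefixTsQuery(statePrefix), limit] };
}
```

The `to_tsvector('simple', "Address"."state")` expression is written the same way as in the index definition on purpose, since an expression index is only considered when the query expression matches it.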

Performance Insights

  • Index Utilization: The query efficiently utilizes indexes, specifically a full-text search index on "Address"."state" and a primary key index on "Patient".

  • Sorting Overhead: Sorting the results by "Patient"."lastUpdated" could be expensive if many rows match the filter. An index on "Patient"."lastUpdated" helps only if the planner chooses to read rows in index order instead of sorting, so verify with EXPLAIN before adding one.

  • Nested Loop Efficiency: The nested loop join is efficient in this context because the EXISTS subquery likely limits the number of rows that need to be joined.

  • Cost Estimates: The cost estimates (2411.21..2411.26 for the limit and 2411.21..2411.81 for the sort) are planner estimates in arbitrary cost units, not milliseconds. They are useful for comparing alternative plans for the same query, but they do not by themselves show that the query is fast.

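  • Full-Text Search: The expression index on to_tsvector('simple',"Address"."state") can only be used when the query uses the same expression, including the 'simple' configuration. If the query and the index definition drift apart, the planner can no longer use this index.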

Overall, the query seems well-optimized, especially with the use of appropriate indexes. However, actual performance should be tested with real data, as execution plans can vary based on data size, distribution, and database configuration.
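Since the point of this PR is that the tsvector indexes actually exist and get used, one way to check this against real data is to run EXPLAIN and look for the index names in the text output. The sketch below is a hypothetical helper, not part of the Medplum codebase. It assumes the default text format of EXPLAIN, where index usage appears as `Bitmap Index Scan on <index>` or `Index Scan using <index> on <table>`.

```typescript
// Hypothetical helper: list the index names referenced in EXPLAIN text output.
// Handles bitmap index scans, index scans, index-only scans, and backward scans.
function indexesUsedInPlan(planText: string): string[] {
  const pattern =
    /(?:Bitmap Index Scan on|Index (?:Only )?Scan(?: Backward)? using) ("[^"]+"|\S+)/g;
  const names: string[] = [];
  let match: RegExpExecArray | null = pattern.exec(planText);
  while (match !== null) {
    // EXPLAIN quotes identifiers that need quoting; strip the quotes.
    const raw: string = match[1] ?? '';
    const name = raw.replace(/^"|"$/g, '');
    if (name.length > 0 && names.indexOf(name) === -1) {
      names.push(name);
    }
    match = pattern.exec(planText);
  }
  return names;
}

// Convenience check for a single expected index.
function planUsesIndex(planText: string, indexName: string): boolean {
  return indexesUsedInPlan(planText).indexOf(indexName) !== -1;
}
```

For the plan described in this PR, `planUsesIndex(planText, 'Address_state_idx_tsv')` would be the check of interest. A plan taken before the fix would be expected to show a sequential scan on "Address" instead.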

@codyebberson codyebberson requested a review from a team as a code owner January 29, 2024 21:09

vercel bot commented Jan 29, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

3 Ignored Deployments

| Name | Status | Preview | Updated (UTC) |
| --- | --- | --- | --- |
| medplum-app | ⬜️ Ignored (Inspect) | | Jan 29, 2024 9:09pm |
| medplum-storybook | ⬜️ Ignored (Inspect) | | Jan 29, 2024 9:09pm |
| medplum-www | ⬜️ Ignored (Inspect) | | Jan 29, 2024 9:09pm |


Messages
📖 @medplum/core: 153.9 kB
📖 @medplum/react: 338.4 kB

Generated by 🚫 dangerJS against a578e1c


sonarcloud bot commented Jan 29, 2024

Quality Gate passed

Kudos, no new issues were introduced!

0 New issues
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

@codyebberson codyebberson added this pull request to the merge queue Jan 29, 2024
Merged via the queue into main with commit 01e86eb Jan 29, 2024
18 checks passed
@codyebberson codyebberson deleted the cody-fix-lookup-table-tsv-indexes branch January 29, 2024 21:47
medplumbot added a commit that referenced this pull request Jan 31, 2024
Fixes #3794 - MeasureReport.period search (#3850)
Extra check for vmcontext bots (#3863)
Add and use vite-plugin-turbosnap (#3849)
Downgrade chromatic (#3848)
Repo sql fixes for cockroachdb (#3844)
Remove Health Gorilla from medplum-demo-bots (#3845)
fix-3815 cache presigned s3 binary urls (#3834)
Use tsvector index for token text search (#3791)
rate limit should return `OperationOutcome` (#3843)
Add global var "module" to vm context bots (#3842)
Fix lookup table tsv indexes (#3841)
Always use estimate count first (#3840)
Disambiguate getClient (#3839)
Fix invalid mermaid graph in diagnostic catalog docs (#3836)
fix-3809 race condition in Subscription extension fhir-path-criteria-expression %previous value lookup (#3810)
Fix Sonar code smells: mark React props readonly (#3832)
RDS proxy (#3827)
Fixed lookup tables in migration generator (#3830)
Fixed deprecated jest matchers (#3831)
Update README.md (#3828)
Update fhir-basics.md (#3829)
Case study content and images (#3820)
Added rdsReaderInstanceType and RDS upgrade docs (#3826)
Dependency upgrades (#3825)
Separate search popup menus for 'text' and 'token' (#3824)
Improve performance of token sort (#3823)
Additional logging (#3790)
Fix calendar input button style (#3817)
Don't add _total default in SearchControl (#3818)
Dark mode (#3814)
Fixes #3812 - FHIR profile cache bug (#3813)
Document using medplum client to integrate with external FHIR servers (#3811)
Use specific advisory locks (#3805)
Nested transactions (#3788)
Fix signin page on graphiql (#3802)
fix(heartbeat): start heartbeat on first bind to sub (#3793)
Fix async job tests (#3795)
Document using vm context bots (#3784)
Refactored access policy docs based on customer feedback (#3785)
Support Redis TLS config from Env (#3787)
feat(subscriptions): add `heartbeat` for WS subs (#3740)
Update Bot metrics (#3763)
github-merge-queue bot pushed a commit that referenced this pull request Jan 31, 2024 (commit message identical to the medplumbot commit listed above)
@reshmakh reshmakh added this to the February 29th, 2024 milestone Feb 2, 2024
@reshmakh reshmakh added the search Features and fixes related to search label Feb 28, 2024
Labels
search Features and fixes related to search
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

None yet

2 participants