Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve search speed #2553

Merged
merged 3 commits into from
Jun 6, 2024
Merged

Improve search speed #2553

merged 3 commits into from
Jun 6, 2024

Conversation

MJ1998
Copy link
Collaborator

@MJ1998 MJ1998 commented May 27, 2024

IMPORTANT: All PRs must be linked to an issue (except for extremely trivial and straightforward changes).

Fixes #2547
As commented in the above issue, this change reduces time from ~25 seconds to few milliseconds (7k patients).

Description
Remove extra JOIN clause (a.resourceType = b.resourceType). This clause is not needed as we already filter the resourceType in WHERE clause.

Sample DSL search:-

fhirEngine.search<Patient> {
        sort(Patient.NAME, Order.ASCENDING)
        count = PaginationConstant.DEFAULT_PAGE_SIZE
        from = currentPage * PaginationConstant.DEFAULT_PAGE_SIZE
      }

Query before this change:-

SELECT a.resourceUuid, a.serializedResource
FROM ResourceEntity a
         LEFT JOIN StringIndexEntity b
                   ON a.resourceType = b.resourceType AND a.resourceUuid = b.resourceUuid AND b.index_name = 'name'
WHERE a.resourceType = 'Patient'
ORDER BY b.index_value ASC
LIMIT 20 OFFSET 0

Query after this change:-

SELECT a.resourceUuid, a.serializedResource
FROM ResourceEntity a
         LEFT JOIN StringIndexEntity b
                   ON a.resourceUuid = b.resourceUuid AND b.index_name = 'name'
WHERE a.resourceType = 'Patient'
ORDER BY b.index_value ASC
LIMIT 20 OFFSET 0

Notice that a.resourceType = b.resourceType clause is not present after ON.

Alternative(s) considered
None

Type
Choose one: (Bug fix | Feature | Documentation | Testing | Code health | Builds | Releases | Other)

Screenshots (if applicable)

Checklist

  • I have read and acknowledged the Code of conduct.
  • I have read the Contributing page.
  • I have signed the Google Individual CLA, or I am covered by my company's Corporate CLA.
  • I have discussed my proposed solution with code owners in the linked issue(s) and we have agreed upon the general approach.
  • I have run ./gradlew spotlessApply and ./gradlew spotlessCheck to check my code follows the style guide of this project.
  • I have run ./gradlew check and ./gradlew connectedCheck to test my changes locally.
  • I have built and run the demo app(s) to verify my change fixes the issue and/or does not break the demo app(s).

@jingtang10
Copy link
Collaborator

great pr thanks @MJ1998!

one thing i want to clarify, do you know if this column in the string index table (and similarly in other index tables) https://github.com/google/android-fhir/blob/5efb955619cc1f0ca54e0ce4bee9b7c69360d67b/engine/src/main/java/com/google/android/fhir/db/impl/entities/StringIndexEntity.kt#L49C26-L50C34 can be removed?

@jingtang10 jingtang10 enabled auto-merge (squash) June 6, 2024 11:20
@jingtang10 jingtang10 merged commit 6d3a5d3 into google:master Jun 6, 2024
4 checks passed
@chandrashekar-s
Copy link

Sorry for commenting late here, based on the solution I see that we are going with the assumption that Resource Id is unique across all resource types. But, as per the FHIR specification The logical id is unique within the space of all resources of the same type on the same server. Just raising it here so that we are aware of this.

@jingtang10
Copy link
Collaborator

Sorry for commenting late here, based on the solution I see that we are going with the assumption that Resource Id is unique across all resource types. But, as per the FHIR specification The logical id is unique within the space of all resources of the same type on the same server. Just raising it here so that we are aware of this.

In our query we already use a uuid to join the two tables so the resource type join is superfluous.

The uuid is not the resource logical id. It's an id we create for each resource.

@chandrashekar-s
Copy link

Sorry for commenting late here, based on the solution I see that we are going with the assumption that Resource Id is unique across all resource types. But, as per the FHIR specification The logical id is unique within the space of all resources of the same type on the same server. Just raising it here so that we are aware of this.

In our query we already use a uuid to join the two tables so the resource type join is superfluous.

The uuid is not the resource logical id. It's an id we create for each resource.

Oh I see. I had a look at the ResourceEntity table and the uuid field is unique and there is also a ResourceId column which I asume is the logical id. Thanks for clarifying.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Complete
Development

Successfully merging this pull request may close these issues.

Improve FhirEngine Search performance for sorting
5 participants