Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide the ability to retrieve a set of COMMON_TOKEN_VALUE_IDs in one request #2184

Closed
michaelwschroeder opened this issue Mar 31, 2021 · 3 comments
Assignees
Labels
enhancement New feature or request P3 Priority 3 - Nice To Have performance performance search

Comments

@michaelwschroeder
Copy link
Contributor

michaelwschroeder commented Mar 31, 2021

Is your feature request related to a problem? Please describe.
We have cases when building queries (processing inclusion criteria, processing :in, :not-in, :above, :below modifiers) where we want to attempt to retrieve COMMON_TOKEN_VALUE_IDs for a list of system+code pairs. It could be prohibitive performance-wise to go to the DB one-at-a-time for each cache miss.

Describe the solution you'd like
Provide a way to retrieve the COMMON_TOKEN_VALUE_IDs for a list of system+code pairs in a single request.

Describe alternatives you've considered

  1. Using existing interface, risking going out to the DB for every cache miss
  2. Don't use the COMMON_TOKEN_VALUE_IDs

Acceptance Criteria
1.
GIVEN [a precondition]
AND [another precondition]
WHEN [test step]
AND [test step]
THEN [verification step]
AND [verification step]

Additional context
Add any other context or screenshots about the feature request here.

@michaelwschroeder michaelwschroeder added enhancement New feature or request search performance performance labels Mar 31, 2021
@lmsurpre lmsurpre changed the title Provide the ability to retrieve a set of COMMON_TOKEN_VALUES in one request Provide the ability to retrieve a set of COMMON_TOKEN_VALUE_IDs in one request Mar 31, 2021
@lmsurpre
Copy link
Member

lmsurpre commented Jun 7, 2021

the new QueryBuilder has support for handling an "in list" of common token values, so it will already handle conversion between = 'a' and in (a,b,c), so just a matter of ensuring we get the values before that efficiently (even in face of cache misses)

@lmsurpre lmsurpre added the P3 Priority 3 - Nice To Have label Jun 7, 2021
@lmsurpre lmsurpre added this to the Sprint 2021-11 milestone Aug 10, 2021
@lmsurpre lmsurpre self-assigned this Aug 10, 2021
lmsurpre added a commit that referenced this issue Aug 13, 2021
Added
* getCommonTokenValueIds to JDBCIdentityCache
* resolveCommonTokenValueIds to ICommonTokenValuesCache
* readCommonTokenValueIds to ResourceReferenceDAO

And updated SearchQueryRenderer to use this mechanism in two cases:


Needs testing and perhaps customizations in ResourceReferenceDAO
subclasses for specific Dbs.

Also includes some code for variations I was playing with...that all
needs to get cleaned up now.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 13, 2021
* the derby-specific readCommonTokenValueId implementation replaces the
`(VALUES (?,value1), (?,value2), ...` construct with a temporary
table--following the same pattern used for upsertCommonTokenValues.
* removed dead code from getReferenceFilter; we had code to handle query
parameters of type reference which had modifiers that we don't actually
support for reference parameters
* removed commented-out code in from populateCodesSubSegment; I
convinced myself that we can use the common_token_values table for both
the IN and NOT-IN cases.
* removed stale getCommonTokenValueIds code / alternate implementations
* reordered field names and select statements to follow the convention
of existing methods (value, system, id)--personally I prefer
(system,code) to (code,system) but I opted for consistency with the
current codebase
* added defensive code to return an empty set of CommonTokenValueResult
if passed an empty Collection of CommonTokenValue

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 13, 2021
* the derby-specific readCommonTokenValueId implementation replaces the
`(VALUES (?,value1), (?,value2), ...` construct with a temporary
table--following the same pattern used for upsertCommonTokenValues.
* removed dead code from getReferenceFilter; we had code to handle query
parameters of type reference which had modifiers that we don't actually
support for reference parameters
* removed commented-out code in from populateCodesSubSegment; I
convinced myself that we can use the common_token_values table for both
the IN and NOT-IN cases.
* removed stale getCommonTokenValueIds code / alternate implementations
* reordered field names and select statements to follow the convention
of existing methods (value, system, id)--personally I prefer
(system,code) to (code,system) but I opted for consistency with the
current codebase
* added defensive code to return an empty set of CommonTokenValueResult
if passed an empty Collection of CommonTokenValue

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
readCommonTokenValueId

Because the table is not cleared until the entire transaction is
committed/rolled back, we can't be sure that the only values in this
table are the ones that we added specifically for the current search.
For example, consider transaction bundle cases.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
Added
* getCommonTokenValueIds to JDBCIdentityCache
* resolveCommonTokenValueIds to ICommonTokenValuesCache
* readCommonTokenValueIds to ResourceReferenceDAO

And updated SearchQueryRenderer to use this mechanism in two cases:


Needs testing and perhaps customizations in ResourceReferenceDAO
subclasses for specific Dbs.

Also includes some code for variations I was playing with...that all
needs to get cleaned up now.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
* the derby-specific readCommonTokenValueId implementation replaces the
`(VALUES (?,value1), (?,value2), ...` construct with a temporary
table--following the same pattern used for upsertCommonTokenValues.
* removed dead code from getReferenceFilter; we had code to handle query
parameters of type reference which had modifiers that we don't actually
support for reference parameters
* removed commented-out code in from populateCodesSubSegment; I
convinced myself that we can use the common_token_values table for both
the IN and NOT-IN cases.
* removed stale getCommonTokenValueIds code / alternate implementations
* reordered field names and select statements to follow the convention
of existing methods (value, system, id)--personally I prefer
(system,code) to (code,system) but I opted for consistency with the
current codebase
* added defensive code to return an empty set of CommonTokenValueResult
if passed an empty Collection of CommonTokenValue

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
readCommonTokenValueId

Because the table is not cleared until the entire transaction is
committed/rolled back, we can't be sure that the only values in this
table are the ones that we added specifically for the current search.
For example, consider transaction bundle cases.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
* the derby-specific readCommonTokenValueId implementation replaces the
`(VALUES (?,value1), (?,value2), ...` construct with a temporary
table--following the same pattern used for upsertCommonTokenValues.
* removed dead code from getReferenceFilter; we had code to handle query
parameters of type reference which had modifiers that we don't actually
support for reference parameters
* removed commented-out code in from populateCodesSubSegment; I
convinced myself that we can use the common_token_values table for both
the IN and NOT-IN cases.
* removed stale getCommonTokenValueIds code / alternate implementations
* reordered field names and select statements to follow the convention
of existing methods (value, system, id)--personally I prefer
(system,code) to (code,system) but I opted for consistency with the
current codebase
* added defensive code to return an empty set of CommonTokenValueResult
if passed an empty Collection of CommonTokenValue

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
readCommonTokenValueId

Because the table is not cleared until the entire transaction is
committed/rolled back, we can't be sure that the only values in this
table are the ones that we added specifically for the current search.
For example, consider transaction bundle cases.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
* Favor StringBuilder over String with lots of concatenation
* Try using NULL instead of -1 for readCommonTokenValueIds
* Add javadoc to CommonTokenValue constructor

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
* Favor StringBuilder over String with lots of concatenation
* Try using NULL instead of -1 for readCommonTokenValueIds
* Add javadoc to CommonTokenValue constructor

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
* Favor StringBuilder over String with lots of concatenation
* Use literals instead of bind variables for common token value ids
* Add javadoc to CommonTokenValue constructor

I also tried to use NULL instead of -1 for cases where there are no
common token values, but our database utils (BaseWhereAdapter) doesn't
seem to support that at the moment--it only supports this for bind
variables.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
* Favor StringBuilder over String with lots of concatenation
* Use literals instead of bind variables for common token value ids
* Add javadoc to CommonTokenValue constructor

I also tried to use NULL instead of -1 for cases where there are no
common token values, but our database utils (BaseWhereAdapter) doesn't
seem to support that at the moment--it only supports this for bind
variables.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
* Favor StringBuilder over String with lots of concatenation
* Use literals instead of bind variables for common token value ids
* Add javadoc to CommonTokenValue constructor

I also tried to use NULL instead of -1 for cases where there are no
common token values, but our database utils (BaseWhereAdapter) doesn't
seem to support that at the moment--it only supports this for bind
variables.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
lmsurpre added a commit that referenced this issue Aug 16, 2021
#2684)

* issue #2184 - initial attempt at resolveCommonTokenValueIds

Added
* getCommonTokenValueIds to JDBCIdentityCache
* resolveCommonTokenValueIds to ICommonTokenValuesCache
* readCommonTokenValueIds to ResourceReferenceDAO

And updated SearchQueryRenderer to use this mechanism in two cases:


Needs testing and perhaps customizations in ResourceReferenceDAO
subclasses for specific Dbs.

Also includes some code for variations I was playing with...that all
needs to get cleaned up now.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* issue #2184 - add derby-specific readCommonTokenValueId and do cleanup

* the derby-specific readCommonTokenValueId implementation replaces the
`(VALUES (?,value1), (?,value2), ...` construct with a temporary
table--following the same pattern used for upsertCommonTokenValues.
* removed dead code from getReferenceFilter; we had code to handle query
parameters of type reference which had modifiers that we don't actually
support for reference parameters
* removed commented-out code in from populateCodesSubSegment; I
convinced myself that we can use the common_token_values table for both
the IN and NOT-IN cases.
* removed stale getCommonTokenValueIds code / alternate implementations
* reordered field names and select statements to follow the convention
of existing methods (value, system, id)--personally I prefer
(system,code) to (code,system) but I opted for consistency with the
current codebase
* added defensive code to return an empty set of CommonTokenValueResult
if passed an empty Collection of CommonTokenValue

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* issue #2184 - avoid common_token_values_tmp in derby-specific
readCommonTokenValueId

Because the table is not cleared until the entire transaction is
committed/rolled back, we can't be sure that the only values in this
table are the ones that we added specifically for the current search.
For example, consider transaction bundle cases.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>

* issue #2184 - address PR feedback

* Favor StringBuilder over String with lots of concatenation
* Use literals instead of bind variables for common token value ids
* Add javadoc to CommonTokenValue constructor

I also tried to use NULL instead of -1 for cases where there are no
common token values, but our database utils (BaseWhereAdapter) doesn't
seem to support that at the moment--it only supports this for bind
variables.

Signed-off-by: Lee Surprenant <lmsurpre@us.ibm.com>
@michaelwschroeder
Copy link
Contributor Author

This issue has been verified.

I started with a clean schema and performed a search using an :in modifier:

  • verified via log that, as expected, for the values in the specified value set, all were cache misses
  • verified that a DB query was executed to retrieve the common token value IDs for the cache misses
  • verified that since none were found in the DB, a count query for the search was generated with only a -1 for the COMMON_TOKEN_VALUE_ID value

I then ingested resources which caused some of the values in the value set to be indexed and added to the DB and cache. I reran the search:

  • verified via log that, as expected, for the values in the specified value set, all were cache misses except the ones that were indexed for the ingested resources
  • verified a DB query was executed to retrieve the common token value IDs for the cache misses
  • verified that since none of the misses were in the DB, a count query for the search was generated with only the cache hits specified for the COMMON_TOKEN_VALUE_ID values

I then stopped and restarted the server to clear the cache. I reran the search:

  • verified via log that, as expected, for the values in the specified value set, all were cache misses
  • verified a DB query was executed to retrieve the common token value IDs for the cache misses, and the ones that were indexed for the ingested resources were returned and added to the cache
  • verified that a count query for the search was generated with only the DB hits specified for the COMMON_TOKEN_VALUE_ID values

@lmsurpre
Copy link
Member

lmsurpre commented Aug 26, 2021

additionally I ran a big Patient export and confirmed it completed as expected:

INFO  ---- FHIR resources exported in 19778.829seconds ----
INFO ResourceType 	| Exported
INFO Provenance[5660,5591,5593,5498,5496,4620]	|32458
INFO AllergyIntolerance[14930]	|14930
INFO CarePlan[115147,15470]	|130617
INFO Procedure[200126,200469,201637,158950]	|761182
INFO Immunization[200254,200208,48477]	|448939
INFO DocumentReference[84370,83722,83449,83819,83875,82646,83497,83280,83852,83317,84182,81328]	|1001337
INFO ExplanationOfBenefit[35133,36952,36031,36179,36062,37199,33985,37204,34438,34701,37459,36602,36543,36499,34250,37406,34850,34243,35068,36951,34755,36169,36537,34982,36596,34787,35394,34366	|1001341
INFO MedicationAdministration[9069]	|9069
INFO MedicationRequest[193899,193111,193595,114813]	|695418
INFO Group[1]	|1
INFO CareTeam[123285,7348]	|130633
INFO ImagingStudy[10397]	|10397
INFO Condition[198044,162333]	|360377
INFO Patient[32790]	|32790
INFO Coverage[17]	|17
INFO Encounter[131321,132825,131197,130990,132328,133484,131076	|1001337
INFO SupplyDelivery[173683]	|173683
INFO Claim[127444,128416,124530,127380,126698,124717,125015,127280,128045,124841,126991,127147,127163,51088]	|1696755
INFO Observation[205719,207265,209549,211104,216089,213304,208107,206351,214532,201366,213804,214574,205584,215498,202202,203337,218912,217765,215941,201220,203507,216579,209253,217251,207894,206875,208682,210512,204011,209014,64017]	|6359818
INFO  ---- Total: 13861099 ExportRate: 700.80 ----

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request P3 Priority 3 - Nice To Have performance performance search
Projects
None yet
Development

No branches or pull requests

2 participants