Improve performance of chained queries into contained resources #3312
Codecov Report
@@             Coverage Diff              @@
##             master    #3312      +/-   ##
==========================================
  Coverage     82.82%   82.83%
- Complexity    20377    20399      +22
==========================================
  Files          1366     1367       +1
  Lines         73172    73337     +165
  Branches      11010    11033      +23
==========================================
+ Hits          60608    60748     +140
- Misses         8343     8367      +24
- Partials       4221     4222       +1
…chain, so later we do not need to guess what it was
hapi-fhir-jpaserver-base/src/main/java/ca/uhn/fhir/jpa/search/builder/QueryStack.java
Overall looks good to me, just curious why this way improves the performance?
Approved with some questions. Happy to walk through this with you in the morning.
return createPredicateComposite(theSourceJoinColumn, theResourceName, theSpnamePrefix, theParamDef, theNextAnd, theRequestPartitionId, mySqlBuilder);
}

private Condition createPredicateComposite(@Nullable DbColumn theSourceJoinColumn, String theResourceName, String theSpnamePrefix, RuntimeSearchParam theParamDef, List<? extends IQueryParameterType> theNextAnd, RequestPartitionId theRequestPartitionId, SearchQueryBuilder theSqlBuilder) {
Ugh. It would be nice to move these out to a new class, but I understand why you didn't. Maybe some Javadoc explaining they are only used for chained recursion?
Technically, I made this change to make them easier to use for subselects, and they could also be used to construct subselects outside of the chained recursion algorithm. But I agree that pulling these createPredicateX methods out should be one step in decomposing this monster class.
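For readers following the thread, here is a minimal sketch of the overload pair under discussion, with the Javadoc the reviewer asked for. It is not the real QueryStack code: `SketchQueryBuilder` and `SketchQueryStack` are hypothetical stand-ins, and the string result only shows which builder a call was routed to.

```java
import java.util.List;

/** Hypothetical stand-in for SearchQueryBuilder; it only carries a label. */
final class SketchQueryBuilder {
    private final String myName;

    SketchQueryBuilder(String theName) {
        myName = theName;
    }

    String name() {
        return myName;
    }
}

/** Hypothetical stand-in for QueryStack, reduced to one overload pair. */
final class SketchQueryStack {
    private final SketchQueryBuilder mySqlBuilder = new SketchQueryBuilder("root");

    /**
     * Builds the predicate against the root query by delegating to the
     * overload below with this stack's own builder.
     */
    String createPredicateComposite(String theResourceName, List<String> theValues) {
        return createPredicateComposite(theResourceName, theValues, mySqlBuilder);
    }

    /**
     * Builds the predicate against an explicitly supplied builder, for example
     * one that backs a subselect. Callers working on the root query should use
     * the overload without the builder argument.
     */
    String createPredicateComposite(String theResourceName, List<String> theValues, SketchQueryBuilder theSqlBuilder) {
        return theSqlBuilder.name() + ":" + theResourceName + "=" + String.join(",", theValues);
    }
}
```

Documenting the delegation on both overloads is one way to reduce the risk of calling the wrong one until the methods move to their own class.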

String paramName = getParamNameWithPrefix(theSpnamePrefix, theSearchParam.getName());

if (theList.get(0).getMissing() != null) {
	QuantityBasePredicateBuilder join = createOrReusePredicateBuilder(PredicateBuilderTypeEnum.QUANTITY, theSourceJoinColumn, theSearchParam.getName(), () -> mySqlBuilder.addQuantityPredicateBuilder(theSourceJoinColumn)).getResult();
How confident are you that you got every type (quantity, date, etc) of clause? The tests look like they are more focused on the chaining than on exercising the different types as leaf nodes.
There are other tests that assert the correctness of the leaf clauses. They didn't change as part of this MR, but they did catch a few bugs for me.
import java.util.function.Supplier;
import java.util.stream.Collectors;

import static org.apache.commons.lang3.ObjectUtils.defaultIfNull;
import static org.apache.commons.lang3.StringUtils.isBlank;
import static org.apache.commons.lang3.StringUtils.isNotBlank;
import static org.apache.commons.lang3.StringUtils.split;

public class QueryStack {
Wowza. This is a big class. I'm impressed by your wrangling.
We should certainly decompose and refactor this monster before attempting any further enhancements.
}

// restore the state of this collection to turn caching back on before we exit
myReusePredicateBuilderTypes.addAll(cachedReusePredicateBuilderTypes);
Should this be cleared first like above? Or is the re-use barrier only one direction?
myReusePredicateBuilderTypes is only populated on class initialization. Its contents don't change during processing. Other than the lines I added in this method, it's only accessed for reads elsewhere in the class. When I cleared it at the top of the method, it should still be empty at the bottom, where I restore its previous contents. That being said, in hindsight, I might have preferred to disable the caching with an explicit boolean switch rather than messing with the contents of the collection at all.
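The explicit boolean switch mentioned in hindsight might look roughly like the sketch below. Everything here is an assumption for illustration: `PredicateBuilderCacheSketch`, `myReuseEnabled`, and `withReuseDisabled` are hypothetical names, and the real `createOrReusePredicateBuilder` has a different signature.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical cache that can be switched off for the duration of a block. */
final class PredicateBuilderCacheSketch {
    private final Map<String, Object> myCache = new HashMap<>();
    private boolean myReuseEnabled = true;

    /** Returns the cached builder for the key, or a fresh one when reuse is off. */
    Object createOrReuse(String theKey, Supplier<Object> theFactory) {
        if (!myReuseEnabled) {
            return theFactory.get();
        }
        return myCache.computeIfAbsent(theKey, k -> theFactory.get());
    }

    /** Runs the work with reuse disabled, then restores the previous setting. */
    <T> T withReuseDisabled(Supplier<T> theWork) {
        boolean previous = myReuseEnabled;
        myReuseEnabled = false;
        try {
            return theWork.get();
        } finally {
            // restore even if theWork throws, so caching cannot stay off by accident
            myReuseEnabled = previous;
        }
    }
}
```

Compared with clearing and refilling the collection, the flag leaves the set of reusable types untouched, and the `finally` block makes the restore step hard to forget.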
@@ -368,10 +383,15 @@ public Condition createPredicateCoords(@Nullable DbColumn theSourceJoinColumn,
public Condition createPredicateDate(@Nullable DbColumn theSourceJoinColumn, String theResourceName,
	String theSpnamePrefix, RuntimeSearchParam theSearchParam, List<? extends IQueryParameterType> theList,
	SearchFilterParser.CompareOperation theOperation, RequestPartitionId theRequestPartitionId) {
	return createPredicateDate(theSourceJoinColumn, theResourceName, theSpnamePrefix, theSearchParam, theList, theOperation, theRequestPartitionId, mySqlBuilder);
I am uncomfortable having both of these in the same class. The risk of calling the wrong one seems very high.
I don't see a double for createPredicateCoords(), or createPredicateHas(). Should I?
Not every type of predicate is supported for use in a chain. I only converted the ones I needed to use. One way or another, when we pull these methods out to their own class during decomposition, the doubles will probably go away anyhow.
String theResourceName, String theParamName, List<String> theQualifiers, RuntimeSearchParam theSearchParam,
List<? extends IQueryParameterType> theList, SearchFilterParser.CompareOperation theOperation,
RequestDetails theRequest, RequestPartitionId theRequestPartitionId) {
private class ChainElement {
Do these all need to be in QueryStack?
No. This class badly needs to be decomposed, and promoting all these inner classes to top level will be an important part of that.
updateMapOfReferenceLinks(referenceLinks, Lists.newArrayList(),
	leafNodes
		.stream()
		.map(t -> t.withPathPrefix(nextChain.get(0).getResourceType(), nextChain.get(0).getSearchParam().getName() + "." + nextChain.get(1).getSearchParam().getName()))
This line is the only literal "." here. Is there a reason?
Actually, literal "." also occurs on lines 1042 and 1048. They're correct, but the code would probably be more readable if I pulled the name merge out into a helper method like I did for the path merge.
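A name-merge helper along those lines could be as small as the sketch below. `ChainNameSketch` and `mergeNames` are hypothetical names chosen here; the PR does not define them.

```java
/** Hypothetical holder for the name-merge helper suggested above. */
final class ChainNameSketch {
    /** Joins search parameter names into a chained name such as "subject.organization". */
    static String mergeNames(String... theNames) {
        return String.join(".", theNames);
    }
}
```

A call such as `mergeNames(a.getSearchParam().getName(), b.getSearchParam().getName())` would then replace the inline `+ "." +` concatenation.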
		.map(t -> t.withPathPrefix(nextChain.get(2).getResourceType(), nextChain.get(2).getSearchParam().getName()))
		.collect(Collectors.toSet()));
// discrete -> discrete -> contained -> discrete
updateMapOfReferenceLinks(referenceLinks, Lists.newArrayList(nextChain.get(0).getSearchParam().getPath(), mergePaths(nextChain.get(1).getSearchParam().getPath(), nextChain.get(2).getSearchParam().getPath())), leafNodes);
A couple of helpers on ChainElement could make this more readable:
static String mergePaths(ChainElement ...)
Also: every call to LeafNodeDefinition.withPathPrefix(String, String) seems to be with a disassembled ChainElement. Maybe simplify to .withPathPrefix(ChainElement)?
I think it will have to take a List of ChainElements, but I like this idea. Also, since we always take the elements in sequence, I could probably also replace all the Lists.newArrayList(...) with nextChain.subList().
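As a rough illustration of that idea, the sketch below gives a leaf node a `withPathPrefix` that accepts a list of chain elements, so callers can pass `nextChain.subList(...)` directly. The classes are hypothetical simplifications: the real ChainElement and LeafNodeDefinition hold a RuntimeSearchParam and more state, and the prefixing rule (first element's resource type, dot-joined names) is an assumption taken from the call shown in the diff above.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical, simplified chain element: a resource type and a search parameter name. */
final class ChainElementSketch {
    final String resourceType;
    final String searchParamName;

    ChainElementSketch(String theResourceType, String theSearchParamName) {
        resourceType = theResourceType;
        searchParamName = theSearchParamName;
    }
}

/** Hypothetical, simplified leaf node definition. */
final class LeafNodeSketch {
    final String resourceType;
    final String paramName;

    LeafNodeSketch(String theResourceType, String theParamName) {
        resourceType = theResourceType;
        paramName = theParamName;
    }

    /**
     * Returns a copy whose parameter name is prefixed with the dot-joined names
     * of the given chain elements and whose resource type is that of the first one.
     */
    LeafNodeSketch withPathPrefix(List<ChainElementSketch> thePrefix) {
        String prefix = thePrefix.stream()
            .map(e -> e.searchParamName)
            .collect(Collectors.joining("."));
        return new LeafNodeSketch(thePrefix.get(0).resourceType, prefix + "." + paramName);
    }
}
```

With this shape, a call like `leaf.withPathPrefix(nextChain.subList(0, 2))` would stand in for the version that disassembles `nextChain.get(0)` and `nextChain.get(1)` by hand.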
return createPredicateUri(theSourceJoinColumn, theResourceName, theSpnamePrefix, theSearchParam, theList, theOperation, theRequestDetails, theRequestPartitionId, mySqlBuilder);
}

public Condition createPredicateUri(@Nullable DbColumn theSourceJoinColumn, String theResourceName,
I don't see chained versions of createIndexPredicate(), or createPredicateResourceId() either. Should I?
Resource IDs aren't supported in chains, so it isn't needed. createIndexPredicate() is a method that I extracted from createPredicateReferenceForContainedResource() to improve readability. It isn't used anywhere else, so it only needs one signature.
* base restructuring of query
* fix unit tests
* suppress unnecessary resource type parameter
* pass the resource type used to fetch the search param as part of the chain, so later we do not need to guess what it was
* add query structure tests
* changelog
* fix test failures
* got one of the branches wrong in the 3-reference case

* Add versionenum, bump maven version
* failing tests
* more logs
* Add more test cases, not passing
* fixed the double chain with contained resource second case
* clean up a bit
* add tests for qualifiers
* force the resource table to be the root of the query if we might be traversing a contained reference
* tidy up
* tests for longer chains
* changelog
* code review feedback
* backport changelog
* Add version.yaml
* [2935] Escape "%" in like expression
* Add backport to changelog
* Bumping version number. Update VersionEnum to support 5.5.3
* Jr 20211013 chained references (#3079)
* Create index entries for outbound references of contained resources
* build query for chained reference
* fix case where the contained reference is an explicit id rather than a continued chain
* fix contained index to use path names not search param names
* make qualified search work
* cleanup and changelog
* code review
* fix broken tests
* add imports from bad merge
* Add forgotten import from merge
* Jr 20211018 chained references 2 (#3099)
* Create index entries for outbound references of contained resources
* build query for chained reference
* fix case where the contained reference is an explicit id rather than a continued chain
* fix contained index to use path names not search param names
* make qualified search work
* cleanup and changelog
* recurse while creating indexes on contained resources
* double link both contained
* longer contained subchains
* adding some failing test cases to illustrate the limitations of qualified searches
* clean up merge cruft
* changelog
* Jr 20211021 chained references 3 (#3107)
* Create index entries for outbound references of contained resources
* build query for chained reference
* fix case where the contained reference is an explicit id rather than a continued chain
* fix contained index to use path names not search param names
* make qualified search work
* cleanup and changelog
* recurse while creating indexes on contained resources
* double link both contained
* longer contained subchains
* adding some failing test cases to illustrate the limitations of qualified searches
* clean up merge cruft
* changelog
* create recursive resource links
* add test coverage for a more complicated case
* changelog
* remove unnecessary check for _contained flag
* fix broken tests
* Add backport information
* Bump version, add version enum
* begin with failing test
* fix bug
* change log
* change log
* Add backport information
* Improve performance of chained queries into contained resources (#3312)
* base restructuring of query
* fix unit tests
* suppress unnecessary resource type parameter
* pass the resource type used to fetch the search param as part of the chain, so later we do not need to guess what it was
* add query structure tests
* changelog
* fix test failures
* got one of the branches wrong in the 3-reference case
* Add backport info
* Bump version
* Fix errors due to merge
* Bump license year
* Add folder for release version
* Version bump
* Jr 20220210 handle common search params in contained searches (#3377)
* fix handling of common search parameters
* add support for reference search parameters with multiple paths
* Add backport info
* add new folder for version
* Add missing message
* New version enum, and folder
* Version bump
* Bump spring lib version
* Tidy versions
* Remove test
Co-authored-by: Ken Stevens <khstevens@gmail.com>
Co-authored-by: Jason Roberts <jason.roberts@smilecdr.com>
Co-authored-by: katie_smilecdr <katie@smilecdr.com>
Co-authored-by: JasonRoberts-smile <85363818+JasonRoberts-smile@users.noreply.github.com>