Improve performance of getAllAfter queries into DTOs [8] #9320
Comments
Hi there, I am not sure if this is related, but recently we have been having issues getting all tasks created within the last 24 hours via /tasks/all/{since}. This happens no matter the size of the response, e.g. even if only 1 task matches the time criteria: there is no response even after 5 minutes. The longest timespan that we got working on some instances is the last 4 hours, and even that one takes a couple of minutes to respond. Let us know if you need more info about the issue, and keep us posted about the outcome here.
Identifying the changed entities is what we want to improve. That also includes tasks.
Perfect! Are there already concrete plans for when this can be expected to make its way to production?
Discussed current status and next steps with @HolgerReiseVSys and @syntakker. Results:
- Removed superfluous includeExtendedChangeDateFilters parameter
- Removed redundant getAll(Predicate, Integer) method
- Sorting matches BaseAdoService.getList queries
- Unified methods in CoreAdoServices for Facade:
  - getInJurisdictionIds: List<Long>
  - inJurisdictionOrOwned: boolean
  - inJurisdictionOrOwned: Predicate
- Made jurisdiction check and convertToDto resilient against null entities
- new: BaseAdoService.getIdList with batching
Applied toPseudonymizedDtos and createPseudonymizer pattern in corresponding Facades
- Adjusted to match patterns for CoreAdo, added VisitQueryContext
- Case dependent entities: SurveillanceReport, ClinicalVisit, Prescription, Treatment
- Sample dependent entities: AdditionalTest, PathogenTest
- Immunization dependent entities: Vaccination
- Removed duplicate ContactJurisdictionFlagsDto
- Simplified jurisdiction check where only Contact itself is relevant
- Removed obsolete method BaseAdoService.getBatchedQueryResults
- Removed duplicate method CaseService.createActiveCasesFilter
- Fixed naming of a Case unit test
…getAllAfter_methods #9320 improve getAllAfter methods and batch pseudonymization
Changes on performance tooling still open
Problem Description
As shown by the following analysis, many getAllAfter methods show an inefficient pattern:
- The "Entity"Service.getAllAfter method takes some seconds. As shown in Improve performance of PersonService.getAllAfter method [3] #8946 (comment), this can be improved by initially fetching only the ids (reduced distinct effort) and using a dedicated index with appropriate sorting.
- Calling "Entity"Service.inJurisdictionOrOwned once per entity for pseudonymization seems to be slow. For Cases it took ~330 ms per entity, for Persons ~0.3 ms per entity with an IN clause (PersonService.getInJurisdictionIDs).
Analysis
Dataset:
The following measurements were taken from backend logs (EJB methods) and the Postgres logs.
Observations:
inJurisdictionOrOwned methods, here mainly the number of calls
Requests:
http://localhost:6080/sormas-rest/persons/all/1637090372005/10000/NO_LAST_SYNCED_UUID : 3 min 5 sec
server: 2 min 58 sec
SQL queries:
duration: 948.716 ms
duration: 2297.275 ms
duration: 4012.220 ms
duration: 4393.538 ms
duration: 696.277 ms
~ 12.5 sec
Note: SQL query times are inaccurate, either due to an error in the analysis or changes introduced later, see later analysis of this method.
http://localhost:6080/sormas-rest/cases/all/1637090372005/10000/NO_LAST_SYNCED_UUID : 8 min 21 sec
server: 8 min 18 sec
SQL queries:
duration: 787.011 ms
duration: 449.616 ms
duration: 146.224 ms
duration: 178.515 ms
duration: 154.552 ms
duration: 136.836 ms
duration: 142.597 ms
duration: 143.688 ms
duration: 150.584 ms
duration: 152.298 ms
duration: 133.814 ms
duration: 147.687 ms
~ 3 sec
http://localhost:6080/sormas-rest/contacts/all/1637090372005/10000/NO_LAST_SYNCED_UUID : 35 sec
server: 35 sec
SQL queries:
duration: 151.413 ms
duration: 143.237 ms
duration: 152.661 ms
duration: 144.054 ms
duration: 145.374 ms
duration: 155.861 ms
duration: 177.212 ms
duration: 153.647 ms
duration: 201.516 ms
duration: 187.382 ms
~ 1.6 sec
http://localhost:6080/sormas-rest/tasks/all/1637090372005/1000/NO_LAST_SYNCED_UUID : 1 min 27 sec
server: 1 min 24 sec
SQL queries:
duration: 466.097 ms
duration: 154.093 ms
duration: 132.150 ms
duration: 137.751 ms
duration: 140.757 ms
duration: 135.526 ms
duration: 136.612 ms
duration: 126.399 ms
duration: 124.695 ms
duration: 127.888 ms
duration: 133.940 ms
~1.9 sec
http://localhost:6080/sormas-rest/samples/all/1637090372005/10000/NO_LAST_SYNCED_UUID : 27 sec
server: 27 sec
SQL queries:
duration: 129.587 ms
http://localhost:6080/sormas-rest/immunizations/all/1637090372005/1000/NO_LAST_SYNCED_UUID : 1 min 50 sec
server: 1 min 48 sec
SQL queries:
duration: 6985.128 ms
duration: 106.608 ms
~ 7 sec
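To make the gap between logged SQL time and server time concrete, here is a minimal sketch that sums the statement durations of the cases request above and relates them to the reported server time. The class and method names are made up for illustration; only the numbers come from the logs quoted above.

```java
import java.util.List;

// Hypothetical helper, not part of SORMAS: relates logged SQL time to server time.
final class SqlShareSketch {

    /** Sums logged statement durations given in milliseconds. */
    static double sumMillis(List<Double> durations) {
        return durations.stream().mapToDouble(Double::doubleValue).sum();
    }

    /** Fraction of the server-side time covered by the logged SQL statements. */
    static double sqlShare(double sqlMillis, double serverMillis) {
        return sqlMillis / serverMillis;
    }

    public static void main(String[] args) {
        // Durations copied from the cases request above.
        List<Double> caseQueries = List.of(
            787.011, 449.616, 146.224, 178.515, 154.552, 136.836,
            142.597, 143.688, 150.584, 152.298, 133.814, 147.687);
        double sql = sumMillis(caseQueries);
        double server = (8 * 60 + 18) * 1000.0; // server: 8 min 18 sec
        System.out.printf("SQL: %.0f ms of %.0f ms server time (%.2f %%)%n",
            sql, server, 100 * sqlShare(sql, server));
    }
}
```

For cases, the logged statements add up to roughly 2.7 seconds out of 498 seconds of server time, i.e. well under 1 %. Under the assumption that the logged statements are complete, this suggests that most of the time is spent outside the listed queries.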
Proposed Change
- AdoServiceWithUserFilter.getAllAfter to first fetch the needed ids (see pattern in PersonService.getAllAfter), then fetch the entities by id with IN-clause (use BaseAdoService.getByIds).
- PersonService.getInJurisdictionIDs to query by ids with IN clause also for other entities where "Entity"Service.inJurisdictionOrOwned is currently running one query per Entity.
Acceptance Criteria
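The difference between a per-entity jurisdiction check and a single check by ids, as described in the proposed change, can be illustrated with a small in-memory sketch. This is not the SORMAS implementation: the class below is hypothetical, uses a map as a stand-in for the database, and only counts how many "queries" each variant would issue.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a table; counts the "queries" issued.
final class JurisdictionCheckSketch {

    private final Map<Long, Boolean> inJurisdictionById; // id -> in jurisdiction?
    int queryCount = 0;

    JurisdictionCheckSketch(Map<Long, Boolean> inJurisdictionById) {
        this.inJurisdictionById = inJurisdictionById;
    }

    /** One query per entity, comparable to a per-entity inJurisdictionOrOwned call. */
    boolean inJurisdiction(long id) {
        queryCount++;
        return inJurisdictionById.getOrDefault(id, false);
    }

    /** One query for all ids, comparable to a query with "id IN (:ids)". */
    Set<Long> inJurisdictionIds(List<Long> ids) {
        queryCount++;
        Set<Long> result = new HashSet<>();
        for (Long id : ids) {
            if (inJurisdictionById.getOrDefault(id, false)) {
                result.add(id);
            }
        }
        return result;
    }
}
```

With n entities, the first variant issues n queries and the second one issues a single query. That matches the per-entity cost difference reported for Cases and Persons in the problem description.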
Implementation Details
- All getAllAfter and getInJurisdictionIDs methods avoid parameter limit exception with IterableHelper.executeBatched batching.
- Remove or adapt getAllAfter implementations that are not aligned with the superclass in:
  - EventUserFilterCriteria)
- Remove or adapt implementations parallel to getAllAfter that are not aligned with the superclass:
  - boolean includeExtendedChangeDateFilters that is never used outside of tests (introduced with Also consider indirectly dependent objects when determining when a case was last updated [0.5] #2059, partly removed by [SORMAS2SurvNet] Cleanup: Remove CaseResource::getAllCasesWithExtendedChangeDateFilters #2674).
Additional Information
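The batching mentioned in the implementation details can be sketched as follows. This is a minimal illustration of the idea (split the ids into chunks so that no single IN clause exceeds the parameter limit), not the actual IterableHelper.executeBatched code. The class name, method name and signature are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper, named for illustration only.
final class BatchingSketch {

    /**
     * Splits ids into chunks of at most batchSize, runs the query function
     * once per chunk and concatenates the results in chunk order.
     */
    static <I, R> List<R> queryBatched(List<I> ids, int batchSize, Function<List<I>, List<R>> query) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<R> result = new ArrayList<>();
        for (int from = 0; from < ids.size(); from += batchSize) {
            int to = Math.min(from + batchSize, ids.size());
            // Each call sees at most batchSize ids, e.g. for one "id IN (:ids)" query.
            result.addAll(query.apply(ids.subList(from, to)));
        }
        return result;
    }
}
```

A caller would pass the function that runs the IN-clause query for one chunk, for example a lookup of entities by id. The order of the input ids is preserved across chunks, so a sort order established when fetching the ids carries over if the per-chunk query keeps it.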
- CREATE INDEX CONCURRENTLY IF NOT EXISTS ... before on running instance.
- getAllAfter methods) for most to all entities.
- getIndexList) for AdditionalTest, PathogenTest
- getByUuids, getBy"Reference"/"criteria") for Case, Sample, Task, AdditionalTest, PathogenTest, Prescription, Treatment, Vaccination, Visit.
Performance Analysis
Preliminary analysis of performance before and after the changes introduced in PR #10282
(tested on dafd47d)
The analysis was conducted by measuring the runtime of REST calls to the respective endpoints with batch size 1000.
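How such a measurement could be scripted is sketched below with the JDK HTTP client (Java 11+). The issue does not say which tool was used, so this is only one possible approach. The class and method names are hypothetical, the URL layout is taken from the requests listed in the issue description, and a real instance will most likely require authentication, which is omitted here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical timing helper, not part of SORMAS.
final class RestTimingSketch {

    /** Builds .../<entity>/all/<since>/<batchSize>/NO_LAST_SYNCED_UUID. */
    static URI allAfterUri(String baseUrl, String entity, long since, int batchSize) {
        return URI.create(baseUrl + "/" + entity + "/all/" + since + "/" + batchSize + "/NO_LAST_SYNCED_UUID");
    }

    /** Sends a GET request, discards the body and returns the elapsed wall-clock time. */
    static Duration timeGet(HttpClient client, URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        long start = System.nanoTime();
        client.send(request, HttpResponse.BodyHandlers.discarding());
        return Duration.ofNanos(System.nanoTime() - start);
    }
}
```

For example, `RestTimingSketch.timeGet(HttpClient.newHttpClient(), RestTimingSketch.allAfterUri("http://localhost:6080/sormas-rest", "cases", 1637090372005L, 1000))` would time one call with batch size 1000 against a local instance.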
A substantial improvement can be observed for all entities with long response times:
An exception is pathogen test with no improvement / slightly worse performance:
The results are based on the dataset of the initial analysis (see issue description). Further measurements will be performed as needed (e.g., retesting of individual endpoints, larger batch sizes, etc.).
Analysis data:
performance_9320.zip
Detailed analysis
Full list of REST calls
Performance of REST calls (.../<entities>/all/) before the refactoring:
... and after:
Execution details for selected entities
Cases
before:
after:
Here the calls to CaseService.inJurisdictionOrOwned per selected case have been dropped, roughly accounting for the measured difference.
Contacts
before:
after:
As for cases, a heavy reduction of calls to ContactService.inJurisdictionOrOwned.
Immunizations
before:
after:
Here the main culprits were calls to ImmunizationService.inJurisdictionOrOwned and FeatureConfigurationFacadeEjb.isPropertyValueTrue.
Tasks
before:
after:
Here, calls to TaskService.inJurisdictionOrOwned on a per object basis accounted for most of the time spent.
Analysis for different batch sizes
An analysis with different batch sizes after completion of this task shows:
pathogentests is down to 15 sec now)
Response times for selected entities:
Analysis data:
differentBatchSizes.zip