[PP-3524] Remove complex sort script for "last_update" order in order…#3018
Conversation
5ed9084 to
4415704
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3018 +/- ##
==========================================
- Coverage 93.02% 93.01% -0.02%
==========================================
Files 482 481 -1
Lines 43650 43589 -61
Branches 6066 6054 -12
==========================================
- Hits 40607 40544 -63
- Misses 1968 1972 +4
+ Partials 1075 1073 -2 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
… to reduce search load.
b3186f8 to
76c1073
Compare
76c1073 to
22237cf
Compare
jonathangreen
left a comment
There was a problem hiding this comment.
Looks good, just a couple comments to take a look at.
| if has_script_fields: | ||
| # Wrap the Work objects in WorkSearchResult so the | ||
| # data from script fields isn't lost. | ||
| work = WorkSearchResult(work, hit) |
There was a problem hiding this comment.
After this change I think WorkSearchResult is dead code. I think this is the only place it was ever created. It should either get cleaned up or at least get a TODO comment to be cleaned up in the future.
| ), # Made up keyword type, because we don't want text fuzzyness on this | ||
| "licensepools.available": dict(path="licensepools"), | ||
| "licensepools.availability_time": dict(path="licensepools"), | ||
| "licensepools.last_updated": dict(path="licensepools"), |
There was a problem hiding this comment.
This field still exists right? Do we actually want to remove this?
There was a problem hiding this comment.
Good point - we might as well leave it in there.
… to reduce search load.
Description
Based on my analysis of the value of the work_last_update custom sorting script in the v5.py I determined that the script was not useful enough to justify their high computational cost given how work and license pool timestamps are updated in the live operation of the system.
This update simply removes the script invocation and uses a simpler sorting method that will hopefully tax the opensearch cluster less. The last_update now simply sorts according to the last_update_time on the work. This timestamp is updated when the metadata or circulation data changes for one of the license pools.
Additionally, the calculated "last_update" script has been removed as well. So the value of the "updated" element in the feed will always map to Work.last_update_time.
There is one final change in this PR: I have removed the sort by licenesepool.last_updated.
Since there is no appreciable different between the licensepools.last_update and the works.last_update_time, the ability to order by the licensepool.last_updated is unnecessary because we are now correctly updating the work.last_update_time when any associated license pool changes. Therefore it is not worth maintaining a separate sort option in the sort facet.
Motivation and Context
There is a fair amount of context and analysis here that supports the conclusion to remove the expensive sort script.
https://ebce-lyrasis.atlassian.net/browse/PP-3524
How Has This Been Tested?
Unit tests updated. Needs to be load tested on minotaur.
Checklist