DM-33164: Reimplement `order_by` method for query results #630
8d7bf43 to 5cde0aa
Codecov Report
@@            Coverage Diff             @@
##             main     #630      +/-   ##
==========================================
- Coverage   84.12%   84.11%   -0.01%
==========================================
  Files         237      237
  Lines       30075    30138      +63
  Branches     4996     5014      +18
==========================================
+ Hits        25301    25351      +50
- Misses       3635     3643       +8
- Partials     1139     1144       +5
2754147 to 90acda6
Looks good; code looks cleaner overall, I think, in addition to fixing the bug.
But we should probably file a follow-up ticket to actually move the dimension record sorting into the database, and in the meantime caution people like @mfisherlevine that if they want an efficient `ORDER BY ... LIMIT 1` on a dimension records query, they should really use `queryDataIds`, not `queryDimensionRecords`, at least for now. I will also take a look at moving dimension record sorting into the DB on DM-31725, whenever I can actually get back to it, and that may be the best time to address it if deeper structural changes are needed.
I think even now (and in unpatched code) LIMIT 1 should work fine, as it is applied in
`DataCoordinateQueryResults` now uses the window function `row_number()` to compute an ordering rank for DataIds queries; this simplifies handling of the ordering columns and generation of the materialized query. `queryDimensionRecords()` does its ordering in memory, using an ordering function that compares attributes of `DimensionRecord` instances.
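As a rough illustration of the rank idea described above (not the code in this PR), the sketch below uses the standard-library `sqlite3` module. The table `data_ids`, its columns, and the column name `rank_` are all hypothetical; window functions also require SQLite 3.25 or newer. The point it tries to show is that a multi-column, mixed-direction ordering collapses into a single integer column, so a materialized copy of the query only has to carry that one column to reproduce the order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical stand-in for a data ID query result.
    CREATE TABLE data_ids (instrument TEXT, exposure INTEGER, begin_time INTEGER);
    INSERT INTO data_ids VALUES
        ('CamA', 101, 30), ('CamA', 102, 10), ('CamA', 103, 20), ('CamA', 104, 10);
    """
)

# Collapse "begin_time DESC, exposure ASC" into one integer rank per row.
ranked_sql = """
    SELECT instrument, exposure,
           row_number() OVER (ORDER BY begin_time DESC, exposure ASC) AS rank_
    FROM data_ids
"""

# "Materialize" the ranked query. The original ordering column (begin_time)
# is not copied; the rank alone is enough to restore the order later.
conn.execute(f"CREATE TEMP TABLE materialized AS {ranked_sql}")

all_ordered = conn.execute(
    "SELECT exposure FROM materialized ORDER BY rank_"
).fetchall()
first_two = conn.execute(
    "SELECT instrument, exposure FROM materialized ORDER BY rank_ LIMIT 2"
).fetchall()
print(all_ordered)
print(first_two)
```

Because `rank_` is a plain integer, `ORDER BY rank_ LIMIT n` on the materialized table needs no knowledge of which columns, or which sort directions, the caller originally asked for.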
90acda6 to 4df5dc9
Oh, that makes me realize that I didn't check whether the in-memory sorting of dimension records happens before or after the LIMIT 1 would be applied. If it happens after, doesn't that mean that we'd get a completely arbitrary single record, rather than the first one in the sort order?
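To make the concern concrete, here is a small pure-Python sketch, assuming a hypothetical `Record` dataclass in place of `DimensionRecord` and a hypothetical `make_compare` helper. It builds an ordering function that compares record attributes and then contrasts sorting before truncation with truncating before sorting.

```python
from dataclasses import dataclass
from functools import cmp_to_key


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a DimensionRecord."""

    id: int
    day_obs: int
    seq_num: int


def make_compare(order):
    """Build a comparison function from (attribute_name, ascending) pairs."""

    def compare(a, b):
        for name, ascending in order:
            x, y = getattr(a, name), getattr(b, name)
            if x != y:
                result = -1 if x < y else 1
                return result if ascending else -result
        return 0  # equal on every ordering attribute

    return compare


records = [
    Record(1, 20220101, 5),
    Record(2, 20220102, 1),
    Record(3, 20220102, 7),
    Record(4, 20220101, 9),
]
# Latest first: day_obs descending, then seq_num descending.
key = cmp_to_key(make_compare([("day_obs", False), ("seq_num", False)]))

sort_then_limit = sorted(records, key=key)[:1]   # the latest record
limit_then_sort = sorted(records[:1], key=key)   # whatever happened to come first
print(sort_then_limit, limit_then_sort)
```

With this input the two results differ (record 3 versus record 1), which is exactly the failure mode being asked about if the limit were applied to unsorted rows.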
Both LIMIT and ORDER BY apply to a sub-query generated by
Ah, that makes sense, thanks. And I'm happy with that; I could imagine it giving the query optimizer some trouble sometimes, but it's hard to do better without much bigger changes.
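A minimal sketch of the general pattern discussed in this exchange, again with `sqlite3` and a hypothetical `exposure` table; the actual SQL generated by the butler is not shown in this thread and may look quite different. Here `ORDER BY` and `LIMIT` both live in a sub-query that selects the keys, the outer query fetches the full records for those keys, and an in-memory sort then only has to reorder rows that were already chosen correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Hypothetical dimension table.
    CREATE TABLE exposure (id INTEGER PRIMARY KEY, day_obs INTEGER, seq_num INTEGER);
    INSERT INTO exposure VALUES
        (1, 20220101, 5), (2, 20220102, 1), (3, 20220102, 7), (4, 20220101, 9);
    """
)

# ORDER BY and LIMIT are applied together inside the sub-query, so the
# database picks the right keys before any record is fetched.
sql = """
    SELECT e.id, e.day_obs, e.seq_num
    FROM exposure AS e
    JOIN (
        SELECT id FROM exposure
        ORDER BY day_obs DESC, seq_num DESC
        LIMIT 2
    ) AS picked ON e.id = picked.id
"""
rows = conn.execute(sql).fetchall()

# The outer query has no ORDER BY, so its row order is unspecified;
# sorting in memory restores the requested order on the selected rows.
ordered = sorted(rows, key=lambda r: (r[1], r[2]), reverse=True)
print(ordered)
```

The nested `ORDER BY ... LIMIT` is also the kind of construct that can be harder for a query planner to optimize than a flat query, which matches the caveat raised above.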
Checklist
doc/changes