[query] Add hl.query_table function for reading tables with indices…
#12376
Important infrastructure for Seqr.
2832ee7 to f2629ce
hail/python/hail/expr/functions.py
Outdated
f"query_table: key mismatch: cannot query a table with key "
f"({', '.join(builtins.str(x) for x in key_typ.values())}) with query point type {point.dtype}")

if isinstance(point_or_interval.dtype, hl.tinterval):
This shouldn't include the case where the table is keyed by intervals. I think the right logic is something like:
if point_or_interval.dtype is a prefix of key_type:
    ...
elif point_or_interval.dtype == tinterval(key_type[0]):
    ...
else:
    error
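To make the ordering of those checks concrete, here is a minimal sketch in plain Python. It does not use Hail: `Interval` and `classify_query` are hypothetical stand-ins, key types are modeled as a tuple of field types, and a struct query is modeled as a tuple. The point is only that the prefix check runs first, so a table keyed by intervals treats an interval-typed query as a point lookup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Hypothetical stand-in for an interval type over `point` (not hl.tinterval)."""
    point: object


def classify_query(query_type, key_types):
    """Classify a query as 'point' or 'interval' against a table key.

    key_types: tuple of key field types, in key order.
    query_type: a single type, a tuple of types (struct query), or an Interval.
    """
    fields = query_type if isinstance(query_type, tuple) else (query_type,)
    n = len(fields)
    # Point lookup: the query fields form a non-empty prefix of the key fields.
    if 0 < n <= len(key_types) and tuple(key_types[:n]) == fields:
        return "point"
    # Interval lookup: an interval over the first key field's type.
    if key_types and query_type == Interval(key_types[0]):
        return "interval"
    raise TypeError(f"key mismatch: cannot query key {key_types} with {query_type}")
```

With this ordering, `classify_query(Interval("int32"), (Interval("int32"),))` is a point lookup, while the same query against a table keyed by `("int32",)` is an interval lookup.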
hail/python/hail/ir/ir.py
Outdated
self.row_typ_with_uid = tstruct(**self.row_typ, __uid=ttuple(tint64, tint64))
self.drop_uid = False
self.reader.set_uid_field('__uid')
All the other _handle_randomness methods return new nodes; they don't modify the node. I think that's necessary, since an IR can be reused in different contexts.
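As a rough illustration of the non-mutating pattern, the sketch below uses a hypothetical `ReadNodeSketch` class and a hypothetical `handle_randomness` method; neither is Hail's actual IR API. The rewrite copies the node and sets the uid field on the copy, so any other context still holding the original sees it unchanged.

```python
import copy


class ReadNodeSketch:
    """Hypothetical stand-in for an IR read node (not Hail's actual class)."""

    def __init__(self, row_fields):
        self.row_fields = dict(row_fields)
        self.uid_field = None

    def handle_randomness(self, uid_field="__uid"):
        """Return a new node that carries a uid field; leave `self` untouched."""
        new = copy.copy(self)
        # Build a fresh dict so the copy does not share mutable state with self.
        new.row_fields = {**self.row_fields, uid_field: "tuple(int64, int64)"}
        new.uid_field = uid_field
        return new


original = ReadNodeSketch({"x": "int32"})
rewritten = original.handle_randomness()
```

Here `original` can still be embedded in another IR that does not want the uid field, while `rewritten` carries it.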
def test_query_table():
    f = new_temp_file(extension='ht')
Maybe include a few tests with a compound key?
((cb, elt) => elt),
bound = "lower",
compareEnd(_, _, _, _ < 1)
).search(cb, partitionerRuntime, EmitCode.present(cb.emb, endAndSignTuple)))
From a quick look, I think you've reimplemented the registered function partitionerFindIntervalRange. If so, you should just factor out the implementation to a staged function (as most of the interval registered functions are) and call that. I think we should try to keep all the staged interval ordering stuff in one place, since it's relatively complicated.
Good comments; addressed these and did a fair bit of redesigning of the Python stuff. Refactored the Scala to use that method you mentioned, also got mad that we didn't already have an SStackInterval and added that.
… from Python