Skip to content

Conversation

@tpoterba
Copy link
Contributor

… from Python

@tpoterba
Copy link
Contributor Author

Important infrastructure for Seqr.

f"query_table: key mismatch: cannot query a table with key "
f"({', '.join(builtins.str(x) for x in key_typ.values())}) with query point type {point.dtype}")

if isinstance(point_or_interval.dtype, hl.tinterval):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't include the case where the table is keyed by intervals. I think the right logic is something like

if point_or_interval.dtype is a prefix of key_type:
  ...
elif point_or_interval.dtype == tinterval(key_type[0]):
  ...
else:
  error

Comment on lines 2911 to 2913
self.row_typ_with_uid = tstruct(**self.row_typ, __uid=ttuple(tint64, tint64))
self.drop_uid = False
self.reader.set_uid_field('__uid')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the other _handle_randomness methods return new nodes, they don't modify the node. I think that's necessary, since an IR can be reused in different contexts.



def test_query_table():
f = new_temp_file(extension='ht')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe include a few tests with a compound key?

((cb, elt) => elt),
bound = "lower",
compareEnd(_, _, _, _ < 1)
).search(cb, partitionerRuntime, EmitCode.present(cb.emb, endAndSignTuple)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a quick look, I think you've reimplemented the registered function partitionerFindIntervalRange. If so, you should just factor out the implementation to a staged function (as most of the interval registered functions are) and call that. I think we should try to keep all the staged interval ordering stuff in one place, since it's relatively complicated.

@tpoterba
Copy link
Contributor Author

good comments; addressed these and did a fair bit of redesigning of the Python stuff. Refactored the scala to use that method you mentioned, also got mad that we didn't already have a SStackInterval and added that.

@danking danking merged commit 04328f6 into hail-is:main Nov 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants