[hail] Promote localize_entries to public & tested #5247
Changes from all commits
887b9ec
6e38e6a
af99b15
ba8a800
0c6b63c
3867a07
045fb3e
00b833d
d31aa61
ad7c0ae
83691c1
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2533,6 +2533,92 @@ def _localize_entries(self, entries_field_name, cols_field_name) -> 'Table': | |
return Table(CastMatrixToTable( | ||
self._mir, entries_field_name, cols_field_name)) | ||
|
||
@typecheck_method(entries_array_field_name=nullable(str), | ||
columns_array_field_name=nullable(str)) | ||
def localize_entries(self, | ||
entries_array_field_name=None, | ||
columns_array_field_name=None) -> 'Table': | ||
"""Convert the matrix table to a table with entries localized as an array of structs. | ||
|
||
Examples | ||
-------- | ||
Build a numpy ndarray from a small :class:`.MatrixTable`: | ||
|
||
>>> mt = hl.utils.range_matrix_table(3, 3) | ||
>>> mt = mt.select_entries(x=mt.row_idx * mt.col_idx) | ||
>>> mt.x.show() | ||
+---------+---------+-------+ | ||
| row_idx | col_idx | x | | ||
+---------+---------+-------+ | ||
| int32 | int32 | int32 | | ||
+---------+---------+-------+ | ||
| 0 | 0 | 0 | | ||
| 0 | 1 | 0 | | ||
| 0 | 2 | 0 | | ||
| 1 | 0 | 0 | | ||
| 1 | 1 | 1 | | ||
| 1 | 2 | 2 | | ||
| 2 | 0 | 0 | | ||
| 2 | 1 | 2 | | ||
| 2 | 2 | 4 | | ||
+---------+---------+-------+ | ||
|
||
>>> t = mt.localize_entries('entry_structs', 'columns') | ||
>>> t.describe() | ||
---------------------------------------- | ||
Global fields: | ||
'columns': array<struct { | ||
col_idx: int32 | ||
}> | ||
---------------------------------------- | ||
Row fields: | ||
'row_idx': int32 | ||
'entry_structs': array<struct { | ||
x: int32 | ||
}> | ||
---------------------------------------- | ||
Key: ['row_idx'] | ||
---------------------------------------- | ||
|
||
>>> t = t.select(entries = t.entry_structs.map(lambda entry: entry.x)) | ||
add a |
||
>>> import numpy as np | ||
>>> np.array(t.entries.collect()) | ||
array([[0, 0, 0], | ||
[0, 1, 2], | ||
[0, 2, 4]]) | ||
|
||
add a note that filtered entries are represented as a missing struct?
I ditched the struct on the second to last line so this would blow up without latest numpy, I assume, b/c you can't represent a missing value in a numpy ndarray. I think this is doctested, no?
I mean add a "Notes" section to the docs, and describe that the array of entries always contains Ncols elements (structs), with filtered entries appearing as missing structs.
This is the second comment.
I still want a notes section with a description of the type! |
||
Notes | ||
----- | ||
Both of the added fields are arrays of length equal to | ||
``mt.count_cols()``. Missing entries are represented as missing structs | ||
in the entries array. | ||
|
||
Parameters | ||
---------- | ||
entries_array_field_name : :obj:`str` | ||
The name of the table field containing the array of entry structs | ||
oh, you added it in the parameter description. Can we move the last sentence of both of these up?
I think I'm generally a bit allergic to the proliferation of Notes that we have in the docs. Seems like the parameters or returns section is the right place to put this information, right? |
||
for the given row. | ||
columns_array_field_name : :obj:`str` | ||
The name of the global field containing the array of column | ||
structs. | ||
|
||
Returns | ||
------- | ||
:class:`.Table` | ||
A table whose fields are the row fields of this matrix table plus | ||
one field named ``entries_array_field_name``. The global fields of | ||
these parameter references should be single backticks to italicize (numpy style) |
||
this table are the global fields of this matrix table plus one field | ||
named ``columns_array_field_name``. | ||
""" | ||
entries = entries_array_field_name or Env.get_uid() | ||
cols = columns_array_field_name or Env.get_uid() | ||
t = self._localize_entries(entries, cols) | ||
if entries_array_field_name is None: | ||
t = t.drop(entries) | ||
if columns_array_field_name is None: | ||
t = t.drop(cols) | ||
return t | ||
|
||
def _unfilter_entries(self): | ||
entry_ir = hl.cond(hl.is_defined(self.entry), self.entry, hl.struct(**self.entry))._ir | ||
return MatrixTable(MatrixMapEntries(self._mir, entry_ir)) | ||
|
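The optional-name handling at the end of `localize_entries` (fall back to a temporary name, then drop the field if the caller passed `None`) can be modeled without Hail. The following is a minimal sketch, not the PR's code: `get_uid` is a hypothetical stand-in for `Env.get_uid()`, and `localize_names` is an illustrative helper that only tracks field names rather than building a table.

```python
import itertools

_uid_counter = itertools.count()


def get_uid():
    """Hypothetical stand-in for Env.get_uid(): return a fresh temporary name."""
    return f"__uid_{next(_uid_counter)}"


def localize_names(row_fields, global_fields, entries_name=None, cols_name=None):
    """Return (row field names, global field names) after localization.

    Mirrors the control flow in the diff: a missing name is replaced by a
    temporary one so the underlying operation can run, and the temporary
    field is dropped again afterwards.
    """
    entries = entries_name or get_uid()
    cols = cols_name or get_uid()

    # Stand-in for self._localize_entries(entries, cols): both fields are added.
    rows = list(row_fields) + [entries]
    globals_ = list(global_fields) + [cols]

    # Drop whichever field the caller did not ask for.
    if entries_name is None:
        rows.remove(entries)
    if cols_name is None:
        globals_.remove(cols)
    return rows, globals_


print(localize_names(["row_idx"], [], "entry_structs", "columns"))
print(localize_names(["row_idx"], [], None, "columns"))
```

Because the fallback uses `or`, an empty-string name is also replaced by a temporary name but is not dropped afterwards, since it is not `None`. The sketch reproduces that behavior; whether it matters in practice is left to the author.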
this describe should be above the select -- to best demonstrate the localized schema
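For readers without a Hail session at hand, the localized schema discussed in this thread can be approximated in plain Python. This is a rough model under stated assumptions, not Hail's implementation: the `localize` helper, the dict-based structs, and the use of `None` for a filtered entry are all illustrative choices.

```python
def localize(row_keys, col_keys, entries, entries_name, cols_name):
    """Model a localized table.

    `entries` maps (row_key, col_key) to an entry struct (a dict);
    filtered entries are simply absent from the mapping.
    """
    # One global array holding a struct per column.
    globals_ = {cols_name: [{"col_idx": c} for c in col_keys]}
    # One row per row key; the entries array always has len(col_keys)
    # elements, with None standing in for a missing struct.
    rows = [
        {"row_idx": r, entries_name: [entries.get((r, c)) for c in col_keys]}
        for r in row_keys
    ]
    return globals_, rows


# Same 3x3 example as the docstring (x = row_idx * col_idx),
# with the entry at (1, 1) filtered out.
entries = {
    (r, c): {"x": r * c}
    for r in range(3)
    for c in range(3)
    if (r, c) != (1, 1)
}
globals_, rows = localize(range(3), range(3), entries, "entry_structs", "columns")

# Unpack x, keeping None where the entry struct is missing.
matrix = [
    [e["x"] if e is not None else None for e in row["entry_structs"]]
    for row in rows
]
print(matrix)  # [[0, 0, 0], [0, None, 2], [0, 2, 4]]
```

Each `entry_structs` array keeps three elements even though one entry was filtered, which is the property the reviewers asked to have documented.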