[hail] add delimiter argument to import_matrix_table #7379

Merged
merged 4 commits into from Oct 25, 2019
27 changes: 22 additions & 5 deletions hail/python/hail/methods/impex.py
@@ -1494,7 +1494,8 @@ def import_table(paths,
            min_partitions=nullable(int),
            no_header=bool,
            force_bgz=bool,
-           sep=str)
+           sep=nullable(str),
+           delimiter=nullable(str))
 def import_matrix_table(paths,
                         row_fields={},
                         row_key=[],
@@ -1503,7 +1504,8 @@ def import_matrix_table(paths,
                         min_partitions=None,
                         no_header=False,
                         force_bgz=False,
-                        sep='\t') -> MatrixTable:
+                        sep=None,
+                        delimiter=None) -> MatrixTable:
     """Import tab-delimited file(s) as a :class:`.MatrixTable`.

     Examples
@@ -1643,12 +1645,29 @@ def import_matrix_table(paths,
     force_bgz : :obj:`bool`
         If ``True``, load **.gz** files as blocked gzip files, assuming
         that they were actually compressed using the BGZ codec.
+    sep : :obj:`str`
+        This parameter is a deprecated name for `delimiter`, please use that
+        instead.
+    delimiter : :obj:`str`
+        A single character string which separates values in the file.

     Returns
     -------
Collaborator

I was expecting to see something like this:

assert one of sep and delimiter is defined, but not both
use value from whichever is defined

Contributor Author

Wow good point, I must've been falling asleep when I wrote this. Fixed.

Collaborator

Do we need tests for this?
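
One way such tests might look is sketched below. It is only an illustration under stated assumptions: `resolve_delimiter` and `raises_value_error` are hypothetical helpers that do not exist in Hail, and the helper copies the `sep`/`delimiter` handling from this diff into a standalone function so it can be exercised without reading a file. The diff raises `FatalError` for a delimiter that is not a single character; `ValueError` stands in for it here so the sketch does not need to import Hail.

```python
from typing import Optional


def resolve_delimiter(sep: Optional[str] = None,
                      delimiter: Optional[str] = None) -> str:
    """Hypothetical helper mirroring the sep/delimiter handling in this diff."""
    if sep is not None:
        if delimiter is not None:
            raise ValueError(
                f'expecting either sep or delimiter but received both: '
                f'{sep}, {delimiter}')
        delimiter = sep  # the deprecated name is still honoured
    if delimiter is None:
        delimiter = '\t'  # default when neither argument is given
    if len(delimiter) != 1:
        # Assumption: ValueError replaces Hail's FatalError in this sketch.
        raise ValueError('delimiter or sep must be a single character')
    return delimiter


def raises_value_error(**kwargs) -> bool:
    """Return True if resolve_delimiter rejects the given arguments."""
    try:
        resolve_delimiter(**kwargs)
    except ValueError:
        return True
    return False


# Neither argument: fall back to tab.
assert resolve_delimiter() == '\t'
# Either name alone is accepted.
assert resolve_delimiter(sep=',') == ','
assert resolve_delimiter(delimiter=' ') == ' '
# Both names together are rejected, even when they agree.
assert raises_value_error(sep=',', delimiter=',')
# Anything other than exactly one character is rejected.
assert raises_value_error(delimiter='::')
assert raises_value_error(delimiter='')
```

A test against the real function would presumably call `import_matrix_table` on a small fixture file with each argument combination; the cases above only cover the argument handling.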

     :class:`.MatrixTable`
         MatrixTable constructed from imported data
     """
+    if sep is not None:
+        if delimiter is not None:
+            raise ValueError(
+                f'expecting either sep or delimiter but received both: '
+                f'{sep}, {delimiter}')
+        delimiter = sep
+    del sep
+
+    if delimiter is None:
+        delimiter = '\t'
+    if len(delimiter) != 1:
+        raise FatalError('delimiter or sep must be a single character')

     add_row_id = False
     if isinstance(row_key, list) and len(row_key) == 0:
@@ -1668,16 +1687,14 @@ def import_matrix_table(paths,
     if entry_type not in {tint32, tint64, tfloat32, tfloat64, tstr}:
         raise FatalError("""import_matrix_table expects entry types to be one of:
         'int32', 'int64', 'float32', 'float64', 'str': found '{}'""".format(entry_type))
-    if len(sep) != 1:
-        raise FatalError('sep must be a single character')

     reader = TextMatrixReader(paths,
                               min_partitions,
                               row_fields,
                               entry_type,
                               missing,
                               not no_header,
-                              sep,
+                              delimiter,
                               force_bgz,
                               add_row_id)
