[hail] table tail #7386

Merged: 11 commits into hail-is:master, Oct 29, 2019


iitalics commented Oct 25, 2019

closes #7357

  • implements .tail function on tables:
def tail(self, n) -> 'Table':
    """Subset table to last `n` rows.

    Examples
    --------
    Subset to the last three rows:

    >>> table_result = table1.tail(3)
    >>> table_result.count()
    3

    Notes
    -----

    The number of partitions in the new table is equal to the number of
    partitions containing the last `n` rows.

    Parameters
    ----------
    n : int
        Number of rows to include.

    Returns
    -------
    :class:`.Table`
        Table including the last `n` rows.
    """

    return Table(TableTail(self._tir, n))
  • refactored some of the logic in TableHead, because a lot of the behavior is the same

  • specifically, moved some partition-counts calculations to is.hail.utils.PartitionCounts
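The partition-count logic described above can be illustrated with a small pure-Python sketch. It is not the code in is.hail.utils.PartitionCounts; the function name `tail_partition_counts` and its return shape are hypothetical, chosen only to show how the last `n` rows map onto trailing partitions.

```python
from typing import List, Tuple


def tail_partition_counts(counts: List[int], n: int) -> Tuple[int, List[int]]:
    """Hypothetical helper: given per-partition row counts, return the index
    of the first partition that is kept and how many rows are kept from each
    retained partition so that exactly the last `n` rows survive."""
    if n < 0:
        raise ValueError("n must be non-negative")
    kept: List[int] = []
    remaining = n
    i = len(counts)
    # Walk partitions from last to first until `n` rows are covered.
    while remaining > 0 and i > 0:
        i -= 1
        take = min(counts[i], remaining)
        kept.append(take)
        remaining -= take
    kept.reverse()  # restore original partition order
    return i, kept


# Five partitions of 20 rows each: the last 25 rows span two partitions.
print(tail_partition_counts([20, 20, 20, 20, 20], 25))
```

Under this sketch, the length of the returned list is the partition count of the tailed table, which matches the note in the docstring above.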

iitalics added 7 commits Oct 23, 2019
fix python documentation for Table.tail
implement getTailPartitionCounts

tentative: computeSubsetRange returns Option

refactor head PCs/tail PCs into new module

fix other uses of getHeadPartitionCounts

fix: remove 'private'
fixup: PrunDeadFields entry for TableTail
testIncrementalPCSubset

catoverdrive commented Oct 25, 2019

can you add a python test that does something to the effect of:

t = hl.utils.range_table(100, 5)
# hl.scan.count() counts the rows preceding the current one, so idx2 == idx
t = t.annotate(idx2=hl.scan.count())
result = t.tail(10).collect()
expected = [hl.Struct(idx=i, idx2=i) for i in range(90, 100)]
self.assertEqual(result, expected)
iitalics requested a review from danking Oct 28, 2019
danking self-assigned this Oct 28, 2019

danking commented Oct 28, 2019

FYI, the PR shows up in my scorecard.hail.is queue if I'm an assignee.

danking left a comment

This looks awesome, a couple small comments.

I think there's an obvious extension here to TableInterval(l, r) which handles the head case when l=0 and the tail case when r=0 but also permits things like skip rows! I'm excited to design that after this PR lands!
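One possible reading of the TableInterval(l, r) idea, sketched on plain Python lists rather than Hail tables: here `l` is assumed to be the number of rows dropped from the start and `r` the number dropped from the end. These semantics and the names `row_interval`, `head`, `tail`, and `skip` are assumptions for illustration, not a design from this PR.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def row_interval(rows: Sequence[T], l: int, r: int) -> List[T]:
    """Hypothetical semantics: drop `l` rows from the start and `r` rows
    from the end, returning whatever is left (possibly nothing)."""
    if l < 0 or r < 0:
        raise ValueError("l and r must be non-negative")
    stop = max(len(rows) - r, l)  # never let the end cross the start
    return list(rows[l:stop])


def head(rows: Sequence[T], n: int) -> List[T]:
    # Head is the l == 0 case: keep the start, drop everything after row n.
    return row_interval(rows, 0, max(len(rows) - n, 0))


def tail(rows: Sequence[T], n: int) -> List[T]:
    # Tail is the r == 0 case: drop everything before the last n rows.
    return row_interval(rows, max(len(rows) - n, 0), 0)


def skip(rows: Sequence[T], k: int) -> List[T]:
    # Skipping rows also has r == 0, with a fixed offset from the start.
    return row_interval(rows, k, 0)
```

Under these assumptions head, tail, and skip are all special cases of a single interval operation, which is the appeal of the extension.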

Review thread on hail/python/hail/table.py (outdated, resolved)
danking merged commit 0ed10c8 into hail-is:master Oct 29, 2019
1 check passed: ci-test success