Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python IR binding and environment infrastructure #7000

Merged
merged 5 commits into from Sep 9, 2019

Conversation

patrick-schultz
Copy link
Collaborator

This is the stuff I had to add to the python IR hierarchy for the CSE pass. Some of it feels a bit hacky; I would welcome suggestions on ways to clean it up.

commit 07e37a3
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Sep 5 15:10:30 2019 -0400

    fixes

commit f6b74c5
Merge: 6320d30 baa5251
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Sep 4 15:28:04 2019 -0400

    Merge branch 'tim-agglet-fix' into python-cse

commit 6320d30
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Sep 4 15:24:51 2019 -0400

    put back old renderer

commit 367819d
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Sep 4 11:00:55 2019 -0400

    enable AggLets, other refactoring

commit baa5251
Author: Tim Poterba <tpoterba@gmail.com>
Date:   Fri Aug 30 07:20:42 2019 -0400

    bump!

commit e3be663
Author: Tim Poterba <tpoterba@gmail.com>
Date:   Fri Aug 30 06:34:45 2019 -0400

    fix let lifting bug

commit 0e54c39
Author: Tim Poterba <tpoterba@gmail.com>
Date:   Thu Aug 29 10:24:15 2019 -0400

    [hail] Fix performance problems in aggregators by binding arguments

commit 2786aec
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Aug 28 14:10:39 2019 -0400

    use namedtuple and __slots__

commit 92c1f6b
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Aug 28 13:34:20 2019 -0400

    refactoring

commit 3842f38
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Aug 28 11:53:02 2019 -0400

    refactoring

commit f64ac3e
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Mon Aug 26 14:08:21 2019 -0400

    refactoring

commit 636aa5e
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Mon Aug 26 10:51:52 2019 -0400

    refactoring, documenting

commit 3d3671f
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 22 15:35:42 2019 -0400

    refactoring

commit 6aa0ba1
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 22 10:10:54 2019 -0400

    refactoring

commit 427d8c7
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Aug 14 12:50:30 2019 -0400

    wip

commit fb6e986
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Aug 14 10:43:19 2019 -0400

    bugfixes

commit 1cccb05
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Tue Aug 13 10:05:40 2019 -0400

    refactoring, bugfixes

commit d2df592
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Mon Aug 12 11:46:21 2019 -0400

    refactoring

commit 6ca6b23
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Mon Aug 12 10:48:31 2019 -0400

    linkedlist -> stack

commit 50ff007
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Fri Aug 9 16:18:55 2019 -0400

    recursion -> loop

commit 55c32ba
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Fri Aug 9 15:34:17 2019 -0400

    defunctionalize

commit 3db71db
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 8 15:05:37 2019 -0400

    wip

commit d196b2e
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 8 14:34:15 2019 -0400

    inline print

commit 16eb524
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 8 11:36:28 2019 -0400

    inline print_renderable

commit 8460050
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 8 11:23:40 2019 -0400

    factor out args into state object

commit f10306e
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 8 10:50:38 2019 -0400

    CPS

commit 654f219
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Aug 7 13:04:47 2019 -0400

    second pass: loop -> recursion

commit ed6ce81
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Aug 7 09:56:43 2019 -0400

    more refactoring

commit e12b7e5
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Aug 7 08:50:49 2019 -0400

    some refactoring

commit e0c76ad
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Fri Aug 2 15:11:21 2019 -0400

    cleaning up

commit 144a1bd
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 1 16:40:53 2019 -0400

    recursion -> loop

commit 28f23e3
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 1 16:28:47 2019 -0400

    defunctionalization

commit 6619b17
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 1 15:12:36 2019 -0400

    inline recur

commit c2e9f92
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 1 15:02:48 2019 -0400

    CPS transform

commit 9e788d3
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 1 14:29:51 2019 -0400

    loop -> recursion

commit 3c392e6
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Thu Aug 1 14:04:06 2019 -0400

    finally fully working?

commit ed133da
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Jul 31 14:40:18 2019 -0400

    some refactoring

commit 689ca40
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Mon Jul 29 14:57:16 2019 -0400

    works with agg bindings

commit 62e5680
Author: patrick-schultz <pschultz@broadinstitute.org>
Date:   Wed Jul 24 11:51:06 2019 -0400

    New CSE IR renrerer in Python

    Doesn’t handle agg environments correctly.
Copy link
Contributor

@tpoterba tpoterba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clarify/fix binds, otherwise good to go

def is_effectful() -> bool:
return False

def binds(self, i: int) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure what this method is supposed to do. In certain places it returns a boolean, in others a set?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should always return a boolean; (/agg_/scan_)bindings returns a set. I originally had this as a way to quickly check if a node binds anything, without actually constructing the dict of bindings. But I think it may not be needed anymore.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

places like here were throwing me off:
https://github.com/hail-is/hail/pull/7000/files#diff-de84e7c683cb75eeb5eaddc118d82078R75

I originally had this as a way to quickly check if a node binds anything,

I had guessed this, but wasn't sure! Thanks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Currently it's only used in BaseIR.child_context, which adds new bindings to the current context. It allows a short-cut in the common case where nothing is added. But I think that's not worth the complexity. I'll remove it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, wow, it appears I was somebody else while editing blockmatrix_ir. I'm looking forward to getting back to a statically typed language.

@@ -46,10 +66,17 @@ def __init__(self, child, pred):
self.child = child
self.pred = pred

@staticmethod
def new_block(i):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry, something else that's confused me --

new_block conceptually indicates a different value IR group, right? In that case, it feels like only value IRs should have a concept of a new block -- all relational IRs will define a new block for all value IR children.

Can we remove this from table/matrix/blockmatrix?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new_block conceptually indicates that there is some non-trivial control flow in the relationship between this node and its ith child. Or it would if I were being consistent and made loop bodies be new blocks. So it really just means "CSE won't add lets across this boundary". (Making loop bodies new blocks is technically correct for CSE to never add work to any control path, but heuristically we assume loops will run at least once.)

I considered doing it that way, but it feels to me like part of the structure of the IR, and should not be encoded differently in relational IR vs. value IR. If I remove it from BaseIR, then CSE will have to do a case on whether an IR is a value IR, which feels wrong to me.

It's not a big deal, and an easy change to make, if you want me to.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'm fine leaving this on the relational nodes, but it should just be return True on the base classes, right?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TableIR, MatrixIR, and BlockMatrixIR have overrides to return True, while IR overrides it to return False, with some subclasses like If overriding again. I think you're saying it would be simpler to define it as True in BaseIR? I can do that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

er, right -- I mean delete all of the overrides on relational classes, e.g. https://github.com/hail-is/hail/pull/7000/files#diff-69e69044397dd5692bf5013604bdfadaR322

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh whoops! I didn't realize I had done that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks! sorry for the confusion

Copy link
Contributor

@tpoterba tpoterba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

new_block

@@ -46,10 +66,17 @@ def __init__(self, child, pred):
self.child = child
self.pred = pred

@staticmethod
def new_block(i):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I'm fine leaving this on the relational nodes, but it should just be return True on the base classes, right?

@danking danking merged commit bce8471 into hail-is:master Sep 9, 2019
@patrick-schultz patrick-schultz deleted the python-binding branch September 17, 2019 13:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants