New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python IR binding and environment infrastructure #7000
Conversation
commit 07e37a3 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Sep 5 15:10:30 2019 -0400 fixes commit f6b74c5 Merge: 6320d30 baa5251 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Sep 4 15:28:04 2019 -0400 Merge branch 'tim-agglet-fix' into python-cse commit 6320d30 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Sep 4 15:24:51 2019 -0400 put back old renderer commit 367819d Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Sep 4 11:00:55 2019 -0400 enable AggLets, other refactoring commit baa5251 Author: Tim Poterba <tpoterba@gmail.com> Date: Fri Aug 30 07:20:42 2019 -0400 bump! commit e3be663 Author: Tim Poterba <tpoterba@gmail.com> Date: Fri Aug 30 06:34:45 2019 -0400 fix let lifting bug commit 0e54c39 Author: Tim Poterba <tpoterba@gmail.com> Date: Thu Aug 29 10:24:15 2019 -0400 [hail] Fix performance problems in aggregators by binding arguments commit 2786aec Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Aug 28 14:10:39 2019 -0400 use namedtuple and __slots__ commit 92c1f6b Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Aug 28 13:34:20 2019 -0400 refactoring commit 3842f38 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Aug 28 11:53:02 2019 -0400 refactoring commit f64ac3e Author: patrick-schultz <pschultz@broadinstitute.org> Date: Mon Aug 26 14:08:21 2019 -0400 refactoring commit 636aa5e Author: patrick-schultz <pschultz@broadinstitute.org> Date: Mon Aug 26 10:51:52 2019 -0400 refactoring, documenting commit 3d3671f Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 22 15:35:42 2019 -0400 refactoring commit 6aa0ba1 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 22 10:10:54 2019 -0400 refactoring commit 427d8c7 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Aug 14 12:50:30 2019 -0400 wip commit fb6e986 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Aug 14 10:43:19 2019 -0400 bugfixes commit 1cccb05 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Tue Aug 13 10:05:40 2019 -0400 refactoring, bugfixes commit d2df592 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Mon Aug 12 11:46:21 2019 -0400 refactoring commit 6ca6b23 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Mon Aug 12 10:48:31 2019 -0400 linkedlist -> stack commit 50ff007 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Fri Aug 9 16:18:55 2019 -0400 recursion -> loop commit 55c32ba Author: patrick-schultz <pschultz@broadinstitute.org> Date: Fri Aug 9 15:34:17 2019 -0400 defunctionalize commit 3db71db Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 8 15:05:37 2019 -0400 wip commit d196b2e Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 8 14:34:15 2019 -0400 inline print commit 16eb524 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 8 11:36:28 2019 -0400 inline print_renderable commit 8460050 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 8 11:23:40 2019 -0400 factor out args into state object commit f10306e Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 8 10:50:38 2019 -0400 CPS commit 654f219 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Aug 7 13:04:47 2019 -0400 second pass: loop -> recursion commit ed6ce81 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Aug 7 09:56:43 2019 -0400 more refactoring commit e12b7e5 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Aug 7 08:50:49 2019 -0400 some refactoring commit e0c76ad Author: patrick-schultz <pschultz@broadinstitute.org> Date: Fri Aug 2 15:11:21 2019 -0400 cleaning up commit 144a1bd Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 1 16:40:53 2019 -0400 recursion -> loop commit 28f23e3 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 1 16:28:47 2019 -0400 defunctionalization commit 6619b17 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 1 15:12:36 2019 -0400 inline recur commit c2e9f92 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 1 15:02:48 2019 -0400 CPS transform commit 9e788d3 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 1 14:29:51 2019 -0400 loop -> recursion commit 3c392e6 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Thu Aug 1 14:04:06 2019 -0400 finally fully working? commit ed133da Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Jul 31 14:40:18 2019 -0400 some refactoring commit 689ca40 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Mon Jul 29 14:57:16 2019 -0400 works with agg bindings commit 62e5680 Author: patrick-schultz <pschultz@broadinstitute.org> Date: Wed Jul 24 11:51:06 2019 -0400 New CSE IR renrerer in Python Doesn’t handle agg environments correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clarify/fix binds, otherwise good to go
hail/python/hail/ir/base_ir.py
Outdated
def is_effectful() -> bool: | ||
return False | ||
|
||
def binds(self, i: int) -> bool: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm unsure what this method is supposed to do. In certain places it returns a boolean, in others a set?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should always return a boolean; (/agg_/scan_)bindings returns a set. I originally had this as a way to quickly check if a node binds anything, without actually constructing the dict of bindings. But I think it may not be needed anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
places like here were throwing me off:
https://github.com/hail-is/hail/pull/7000/files#diff-de84e7c683cb75eeb5eaddc118d82078R75
I originally had this as a way to quickly check if a node binds anything,
I had guessed this, but wasn't sure! Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, right. Currently it's only used in BaseIR.child_context
, which adds new bindings to the current context. It allows a short-cut in the common case where nothing is added. But I think that's not worth the complexity. I'll remove it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, wow, it appears I was somebody else while editing blockmatrix_ir. I'm looking forward to getting back to a statically typed language.
hail/python/hail/ir/matrix_ir.py
Outdated
@@ -46,10 +66,17 @@ def __init__(self, child, pred): | |||
self.child = child | |||
self.pred = pred | |||
|
|||
@staticmethod | |||
def new_block(i): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry, something else that's confused me --
new_block conceptually indicates a different value IR group, right? In that case, it feels like only value IRs should have a concept of a new block -- all relational IRs will define a new block for all value IR children.
Can we remove this from table/matrix/blockmatrix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new_block
conceptually indicates that there is some non-trivial control flow in the relationship between this node and its i
th child. Or it would if I were being consistent and made loop bodies be new blocks. So it really just means "CSE won't add lets across this boundary". (Making loop bodies new blocks is technically correct for CSE to never add work to any control path, but heuristically we assume loops will run at least once.)
I considered doing it that way, but it feels to me like part of the structure of the IR, and should not be encoded differently in relational IR vs. value IR. If I remove it from BaseIR, then CSE will have to do a case on whether an IR is a value IR, which feels wrong to me.
It's not a big deal, and an easy change to make, if you want me to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I'm fine leaving this on the relational nodes, but it should just be return True
on the base classes, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TableIR
, MatrixIR
, and BlockMatrixIR
have overrides to return True
, while IR
overrides it to return False
, with some subclasses like If
overriding again. I think you're saying it would be simpler to define it as True
in BaseIR
? I can do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
er, right -- I mean delete all of the overrides on relational classes, e.g. https://github.com/hail-is/hail/pull/7000/files#diff-69e69044397dd5692bf5013604bdfadaR322
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh whoops! I didn't realize I had done that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! sorry for the confusion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new_block
hail/python/hail/ir/matrix_ir.py
Outdated
@@ -46,10 +66,17 @@ def __init__(self, child, pred): | |||
self.child = child | |||
self.pred = pred | |||
|
|||
@staticmethod | |||
def new_block(i): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, I'm fine leaving this on the relational nodes, but it should just be return True
on the base classes, right?
This is the stuff I had to add to the python IR hierarchy for the CSE pass. Some of it feels a bit hacky; I would welcome suggestions on ways to clean it up.