ENH/DEV: Make ibis Node instances hashable #1611
ibis/sql/compiler.py
Outdated
     return False

 this_exprs = self._all_exprs()
 other_exprs = other._all_exprs()

 if self.limit != other.limit:
-    cache[(self, other)] = False
+    cache[self, other] = False
     return False

 for x, y in zip(this_exprs, other_exprs):
This needs to check the length of this_exprs and other_exprs otherwise we'll hit the same bug we have for equality.
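To illustrate the concern: `zip` stops at the shorter input, so a pairwise comparison with no length check can report sequences of different lengths as equal. A minimal sketch, using hypothetical helper names that are not part of ibis:

```python
def pairwise_equal_unchecked(left, right):
    # zip truncates to the shorter input, so trailing extras are never compared.
    return all(x == y for x, y in zip(left, right))


def pairwise_equal_checked(left, right):
    # Compare lengths first so a prefix is never mistaken for an equal sequence.
    return len(left) == len(right) and all(x == y for x, y in zip(left, right))


this_exprs = ["a", "b"]
other_exprs = ["a", "b", "c"]

print(pairwise_equal_unchecked(this_exprs, other_exprs))  # True (the bug)
print(pairwise_equal_checked(this_exprs, other_exprs))    # False
```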
ibis/expr/operations.py
Outdated
     return False

 if len(self.args) != len(other.args):
-    cache[(self, other)] = False
+    cache[self, other] = False
     return False

 for left, right in zip(self.args, other.args):
You could simply use:
all_equal(self.args, other.args, cache=cache)
But the length of the arguments needs to be checked here as well, because map zips its arguments.
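For readers unfamiliar with this kind of helper, here is a rough sketch of a recursive equality function that checks lengths before zipping and threads a cache through to objects that define `equals`. The names `all_equal_sketch` and `Leaf` are hypothetical, and this is an assumption about the general shape, not the ibis implementation:

```python
def all_equal_sketch(left, right, cache=None):
    """Hypothetical recursive equality helper with an explicit length check."""
    if cache is None:
        cache = {}
    if isinstance(left, (list, tuple)):
        return (
            isinstance(right, (list, tuple))
            and len(left) == len(right)  # guard against zip truncation
            and all(
                all_equal_sketch(a, b, cache=cache)
                for a, b in zip(left, right)
            )
        )
    # Objects exposing equals(other, cache=...) take part in the cached comparison.
    if hasattr(left, "equals"):
        return left.equals(right, cache=cache)
    return left == right


class Leaf:
    """Hypothetical node type used only for this illustration."""

    def __init__(self, value):
        self.value = value

    def equals(self, other, cache=None):
        if cache is None:
            cache = {}
        key = (id(self), id(other))
        if key not in cache:
            cache[key] = isinstance(other, Leaf) and self.value == other.value
        return cache[key]


left_args = [Leaf(1), Leaf(2)]
right_args = [Leaf(1), Leaf(2)]
shared_cache = {}
print(all_equal_sketch(left_args, right_args, cache=shared_cache))
print(all_equal_sketch(left_args, right_args + [Leaf(3)], cache=shared_cache))
```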
The equality PR will address this length issue. I will refactor this and the code near your other comment to use all_equal once that PR is merged.
ibis/sql/compiler.py
Outdated
-cache[(self, other)] = True
-return True
+cache[self, other] = len(this_exprs) == len(other_exprs) and all(
Use all_equal instead?
Will address after #1600.
The code got much much cleaner and better!
42cbdc0 to 1c3db1b
We were incorrectly using weakref.WeakValueDictionary to hold temp table instances that mapped to themselves. This defeats the purpose of using a weakref container, because a WeakValueDictionary holds a strong reference to the key and a weak reference to the value. In this scenario we would forever hold at least one strong reference per key and leak memory. By using weakref.WeakSet we now hold only weak references to the table instances.
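The difference can be reproduced with a small standalone sketch. `TempTable` is a hypothetical stand-in for the temp table class, and the collection timing shown assumes CPython's reference counting:

```python
import gc
import weakref


class TempTable:
    """Hypothetical stand-in for a temporary table object."""


# Mapping an object to itself: the dictionary key is a strong reference,
# so the weakly referenced value can never be collected.
leaky = weakref.WeakValueDictionary()
table = TempTable()
leaky[table] = table
del table
gc.collect()
print(len(leaky))    # 1: the entry keeps itself alive

# A WeakSet stores only a weak reference to each member.
tracked = weakref.WeakSet()
table = TempTable()
tracked.add(table)
del table
gc.collect()
print(len(tracked))  # 0 on CPython once the last strong reference is gone
```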
@kszucs arg it looks like there was a backwards incompatible change in
c0e67b7 to a1b02e9
@kszucs Can you review this again?
Sure, reviewing now
-    substitutor = Substitutor()
-    return substitutor.substitute(expr, mapping)
+    """Substitute subexpressions in `expr` with expression to expression
+    mapping `substitutions`.
I got sweaty. Surprisingly rhythmical and symmetric sentence :)
Yes, now that you've pointed it out it's making my head hurt.
There's a large performance regression introduced by this PR that I'm looking into.
ibis/util.py
Outdated
@@ -291,3 +249,36 @@ def get_logger(name, level=None, format=None, propagate=False):
         logging, os.environ.get('LOGLEVEL', 'WARNING').upper()))
     logger.addHandler(handler)
     return logger


+def safe_get_name(expr):
IMO these would be nicer as Expr methods.
ibis/expr/analysis.py
Outdated
----------
expr : ibis.expr.types.Expr
    An Ibis expression
substitutions : Mapping[ibis.expr.types.Expr, ibis.expr.types.Expr]
Based on the dict comprehension below, substitutions seems like a list of 2-tuples.
ibis/expr/analysis.py
Outdated
@@ -61,38 +60,38 @@ def _substitute(self, expr, mapping):
 Parameters
 ----------
 expr : ibis.expr.types.Expr
-mapping : Dict, OrderedDict
+mapping : Mapping[ibis.expr.types.Expr, ibis.expr.types.Expr]
Mapping[ibis.expr.types.Node, ibis.expr.types.Expr]?
unchanged = True
for i, arg in enumerate(new_args):
    if isinstance(arg, ir.Expr):
        new_arg = self.substitute(arg, mapping)
This recursion seems like a task for lin.traverse. Issue?
self._hash = hash(
    (type(self),) + tuple(
        element.op() if isinstance(element, ir.Expr) else element
        for element in self.flat_args()
We should get rid of flat_args and prevent having any non-expressions in the hierarchy, except the literal leaves.
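As a rough illustration of the pattern under discussion, the sketch below caches a structural hash built from the node type and its arguments. `Node`, `Add`, and `Literal` here are hypothetical classes written for this example, not the ibis ones, and the sketch assumes every argument is itself hashable (either another node or a plain literal value):

```python
class Node:
    """Hypothetical immutable expression node with a cached structural hash."""

    def __init__(self, *args):
        self.args = args
        self._hash = None

    def __hash__(self):
        # Compute once and reuse; child nodes cache their own hashes too.
        if self._hash is None:
            self._hash = hash((type(self),) + self.args)
        return self._hash

    def __eq__(self, other):
        return type(self) is type(other) and self.args == other.args


class Literal(Node):
    pass


class Add(Node):
    pass


# Structurally equal nodes hash alike, so they can serve as dictionary keys.
mapping = {Add(Literal(1), Literal(2)): "replacement"}
print(mapping[Add(Literal(1), Literal(2))])
```

Keeping non-hashable values such as lists out of `args` is what makes the flat tuple hash workable, which is one reading of why the comment asks for literal leaves only.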
Ok, figured out what the perf problem is: I wasn't short circuiting with
@kszucs Fixed the docs, and added those functions as private methods.
Closes #890