ENH/DEV: Make ibis Node instances hashable #1611

Closed
wants to merge 34 commits into from

2 participants
@cpcloud
Member

cpcloud commented Sep 5, 2018

Closes #890

@cpcloud cpcloud added this to the Next Release milestone Sep 5, 2018

@cpcloud cpcloud added this to To do in Refactoring via automation Sep 5, 2018

return False

this_exprs = self._all_exprs()
other_exprs = other._all_exprs()

 if self.limit != other.limit:
-    cache[(self, other)] = False
+    cache[self, other] = False
     return False

for x, y in zip(this_exprs, other_exprs):

@cpcloud

cpcloud Sep 5, 2018

Author Member

This needs to check the lengths of this_exprs and other_exprs, otherwise we'll hit the same bug we have for equality.
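For readers following along, here is a minimal sketch of why the length check matters. The helper names `naive_all_equal` and `length_checked_all_equal` are hypothetical and are not part of ibis; they only illustrate how `zip` truncates to the shorter input.

```python
def naive_all_equal(left, right):
    # zip() stops at the shorter input, so trailing items are never compared.
    return all(x == y for x, y in zip(left, right))


def length_checked_all_equal(left, right):
    # Comparing lengths first rules out the "one is a prefix of the other" case.
    return len(left) == len(right) and naive_all_equal(left, right)


shorter = ["a", "b"]
longer = ["a", "b", "c"]

print(naive_all_equal(shorter, longer))           # True, although the lists differ
print(length_checked_all_equal(shorter, longer))  # False
```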

return False

 if len(self.args) != len(other.args):
-    cache[(self, other)] = False
+    cache[self, other] = False
     return False

for left, right in zip(self.args, other.args):

@kszucs

kszucs Sep 5, 2018

Member

You could use simply:

all_equal(self.args, other.args, cache=cache)

But the length of the arguments needs to be checked here as well, because map zips its arguments.

@cpcloud

cpcloud Sep 6, 2018

Author Member

The equality PR will address this length issue. I will refactor this and the code near your other comment to use all_equal once that PR is merged.

ibis/expr/visualize.py (outdated)

-cache[(self, other)] = True
-return True
+cache[self, other] = len(this_exprs) == len(other_exprs) and all(

@kszucs

kszucs Sep 5, 2018

Member

Use all_equal instead?

@cpcloud

cpcloud Sep 6, 2018

Author Member

Will address after #1600.

@kszucs
Member

kszucs left a comment

The code got much much cleaner and better!

cpcloud added some commits Jul 4, 2018

@cpcloud cpcloud force-pushed the cpcloud:hashable-ops branch from 42cbdc0 to 1c3db1b Sep 6, 2018

cpcloud added some commits Sep 6, 2018

Bug
Use weakref.WeakSet to hold ImpalaTemporaryTable instances
We were incorrectly using weakref.WeakValueDictionary to hold temp table
instances that mapped to themselves.

This defeats the purpose of using a weakref container because the
WeakValueDictionary will hold a strong reference to the key and weak
reference to the value. In this scenario we will forever hold at least
one strong reference per key and leak memory.

By using weakref.WeakSet we now hold only weak references to the table
instance.
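
The difference described in this commit message can be reproduced with the standard library alone. The sketch below is an illustration, not the Impala client code: `TempTable` is a hypothetical stand-in for `ImpalaTemporaryTable`, and the counts assume CPython's reference counting plus an explicit `gc.collect()`.

```python
import gc
import weakref


class TempTable:
    """Hypothetical stand-in for a temporary table handle."""


# Pattern described as buggy: the instance is both key and value.
leaky = weakref.WeakValueDictionary()
table = TempTable()
leaky[table] = table
del table
gc.collect()
# The dictionary's strong reference to the key keeps the object alive,
# so the weakly referenced value never dies either.
leaky_count = len(leaky)

# Pattern described as the fix: only a weak reference is stored.
tracked = weakref.WeakSet()
table = TempTable()
tracked.add(table)
del table
gc.collect()
tracked_count = len(tracked)

print(leaky_count, tracked_count)  # 1 0 on CPython
```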
@cpcloud

Member Author

cpcloud commented Sep 7, 2018

@kszucs arg it looks like there was a backwards incompatible change in nbconvert from 5.3.1 to 5.4.0 :(

cpcloud added some commits Sep 10, 2018

@cpcloud cpcloud force-pushed the cpcloud:hashable-ops branch from c0e67b7 to a1b02e9 Sep 10, 2018

cpcloud added some commits Sep 10, 2018

@cpcloud

Member Author

cpcloud commented Sep 10, 2018

@kszucs Can you review this again?

@kszucs

Member

kszucs commented Sep 10, 2018

Sure, reviewing now

substitutor = Substitutor()
return substitutor.substitute(expr, mapping)
"""Substitute subexpressions in `expr` with expression to expression
mapping `substitutions`.

@kszucs

kszucs Sep 10, 2018

Member

I got sweaty. Surprisingly rhythmical and symmetric sentence :)

@cpcloud

cpcloud Sep 10, 2018

Author Member

Yes, now that you've pointed it out it's making my head hurt.

@cpcloud

Member Author

cpcloud commented Sep 10, 2018

There's a large performance regression introduced by this PR that I'm looking into.

@@ -291,3 +249,36 @@ def get_logger(name, level=None, format=None, propagate=False):
logging, os.environ.get('LOGLEVEL', 'WARNING').upper()))
logger.addHandler(handler)
return logger


def safe_get_name(expr):

@kszucs

kszucs Sep 10, 2018

Member

IMO these would be nicer as Expr methods.

----------
expr : ibis.expr.types.Expr
An Ibis expression
substitutions : Mapping[ibis.expr.types.Expr, ibis.expr.types.Expr]

@kszucs

kszucs Sep 10, 2018

Member

Based on the dict comprehension below, substitutions seems like a list of 2-tuples.

@@ -61,38 +60,38 @@ def _substitute(self, expr, mapping):
Parameters
----------
expr : ibis.expr.types.Expr
-mapping : Dict, OrderedDict
+mapping : Mapping[ibis.expr.types.Expr, ibis.expr.types.Expr]

@kszucs

kszucs Sep 10, 2018

Member

Mapping[ibis.expr.types.Node, ibis.expr.types.Expr]?

unchanged = True
for i, arg in enumerate(new_args):
if isinstance(arg, ir.Expr):
new_arg = self.substitute(arg, mapping)

@kszucs

kszucs Sep 10, 2018

Member

This recursion seems like a task for lin.traverse. Issue?
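
As a rough illustration of the recursion under discussion, and of why hashable nodes make a node-keyed mapping possible, here is a toy sketch. It uses nested tuples rather than ibis expressions, the `substitute` function is hypothetical, and it does not use `lin.traverse`.

```python
def substitute(node, mapping):
    """Replace any subtree found in `mapping`; rebuild a parent only if a child changed."""
    if node in mapping:  # works because every node here is hashable
        return mapping[node]
    if not isinstance(node, tuple):
        return node  # leaf value, nothing to recurse into
    new_children = tuple(substitute(child, mapping) for child in node)
    unchanged = all(new is old for new, old in zip(new_children, node))
    return node if unchanged else new_children


expr = ("add", ("col", "a"), ("lit", 1))
result = substitute(expr, {("col", "a"): ("col", "b")})
print(result)  # ('add', ('col', 'b'), ('lit', 1))
```

Returning the original object when nothing changed mirrors the `unchanged` flag in the snippet above and keeps untouched subtrees shared.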

self._hash = hash(
(type(self),) + tuple(
element.op() if isinstance(element, ir.Expr) else element
for element in self.flat_args()

@kszucs

kszucs Sep 10, 2018

Member

We should get rid of flat_args and prevent having any non-expressions in the hierarchy, except the literal leaves.
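
To make the hashing approach in the snippet concrete, here is a minimal sketch with a toy `Node` class. It is not the ibis implementation: it assumes every argument is already hashable and skips the `flat_args()` and `Expr` unwrapping steps.

```python
class Node:
    """Toy node (hypothetical): hashes on its type and its arguments."""

    def __init__(self, *args):
        self.args = args
        self._hash = None  # computed lazily, then reused

    def __hash__(self):
        if self._hash is None:
            # Including the type keeps Add(1, 2) and Mul(1, 2) apart.
            self._hash = hash((type(self),) + tuple(self.args))
        return self._hash

    def __eq__(self, other):
        return type(self) is type(other) and self.args == other.args


class Add(Node):
    pass


class Mul(Node):
    pass


seen = {Add(1, 2): "first"}
lookup_hit = Add(1, 2) in seen   # equal type and args
lookup_miss = Mul(1, 2) in seen  # same args, different type
print(lookup_hit, lookup_miss)   # True False
```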

@kszucs

kszucs approved these changes Sep 10, 2018

cpcloud added some commits Sep 10, 2018

@cpcloud

Member Author

cpcloud commented Sep 10, 2018

Ok, figured out what the perf problem is: I wasn't short-circuiting with self is other in Node.equals.
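
A small sketch of the short circuit, for illustration only. The `Node` class and its `comparisons` counter are hypothetical and far simpler than `Node.equals` in ibis, which also threads a cache through the comparison.

```python
class Node:
    """Toy expression node (hypothetical), compared structurally."""

    comparisons = 0  # counts structural comparisons, for illustration only

    def __init__(self, *args):
        self.args = args

    def equals(self, other):
        if self is other:
            # Identity implies equality, so skip the recursive walk entirely.
            return True
        Node.comparisons += 1
        if type(self) is not type(other) or len(self.args) != len(other.args):
            return False
        return all(
            left.equals(right) if isinstance(left, Node) else left == right
            for left, right in zip(self.args, other.args)
        )


deep = Node(Node(Node(Node("a"))))
same_object = deep.equals(deep)   # returns at the identity check
after_identity = Node.comparisons

rebuilt = Node(Node(Node(Node("a"))))
structurally_equal = deep.equals(rebuilt)  # walks all four levels
print(same_object, after_identity, structurally_equal, Node.comparisons)
```

Without the identity check, comparing a large shared subtree against itself would walk every node each time it is reached.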

cpcloud added some commits Sep 10, 2018

@cpcloud

Member Author

cpcloud commented Sep 10, 2018

@kszucs Fixed the docs, and added those functions as private methods.

@cpcloud cpcloud closed this in 2cfd385 Sep 10, 2018

Refactoring automation moved this from To do to Done Sep 10, 2018

@cpcloud cpcloud deleted the cpcloud:hashable-ops branch Sep 10, 2018
