Use different hash function for arrays of `AbstractExpr`. #378

riccardomurri · 2020-03-12T09:35:30Z

This is an update on top of #369.

Testing with my own code, this definition of hash is faster than the one on branch "master" by a good margin (3 sec per iteration vs. 7 sec) and still faster than the hashing variant currently used in #369.

Somewhat strangely, `reduce(..., map(...))` is faster, in this case, than `mapreduce(...)`.

ericphanson · 2020-03-12T12:58:02Z

Thanks for the PR and the benchmarking.

I am concerned about the correctness of the xor based hashing (which is also a problem in the other PR). We are getting the same failure as examined in https://discourse.julialang.org/t/convex-jl-slow-in-creating-constraints/34775/26 but also two other ones this time. I'm glad our tests are sensitive enough to catch these; they are pretty subtle issues where hash collisions cause two expressions to be identified as the same, causing correctness problems if the potential collisions actually occur in one of the problems being tested.

I am a bit reluctant to mess with the hashing because of that-- even if we do find a fast version that doesn't cause test failures, we don't know it won't cause correctness problems in other code out in the wild, unless we are sure that the new implementation is not susceptible to collisions, and I'm not entirely sure how to check that.

Is the hashing still a bottleneck in your work?

riccardomurri · 2020-03-12T13:03:34Z

Is the hashing still a bottleneck in your work?

No, definitely not. I just thought I would save the work done on the hashing somewhere before I forget, but please feel free to ignore it, if it breaks tests.

ericphanson · 2020-03-12T13:17:08Z

Is the hashing still a bottleneck in your work?

No, definitely not. I just thought I would save the work done on the hashing somewhere before I forget, but please feel free to ignore it, if it breaks tests.

Sure, makes sense :). I will close this for now because I now think we might be able to wait for a better solution to avoid hashing altogether, ref #78. Or some other solution... I think the internals definitely need a performance redesign, but I haven't yet figured out how that might work.

Use different hash function for arrays of AbstractExpr.

d141562

Somewhat strangely, `reduce(..., map(...))` is faster, in this case, than `mapreduce(...)`.

ericphanson closed this Mar 12, 2020

ericphanson mentioned this pull request Mar 12, 2020

Remove slow hash by defining isequal #369

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use different hash function for arrays of `AbstractExpr`. #378

Use different hash function for arrays of `AbstractExpr`. #378

riccardomurri commented Mar 12, 2020

ericphanson commented Mar 12, 2020 •

edited

riccardomurri commented Mar 12, 2020

ericphanson commented Mar 12, 2020

Use different hash function for arrays of AbstractExpr. #378

Use different hash function for arrays of AbstractExpr. #378

Conversation

riccardomurri commented Mar 12, 2020

ericphanson commented Mar 12, 2020 • edited

riccardomurri commented Mar 12, 2020

ericphanson commented Mar 12, 2020

Use different hash function for arrays of `AbstractExpr`. #378

Use different hash function for arrays of `AbstractExpr`. #378

ericphanson commented Mar 12, 2020 •

edited