Add a faster heuristic for IntegerRelation::isEqual #2505
copybara-service[bot] merged 1 commit into google:main
asraa left a comment:
everything is totally non-blocking, i think this gives a really great perf improvement as is!
    // If this is still too slow, would it be faster to sample or enumerate range
    // points?
    return fixedRel1.isEqual(fixedRel2);
ah i see - it may still be slow if the domain point maps to some complicated range space maybe? heuristically i guess we kind of doubt that, and we also know that our range space is finite so it can't be that complicated. but i wonder if the slowness can be determined by the volume or rank or some other measure on the resulting fixed relations
yeah I didn't have the time to experiment here. I figured it would be better to wait until this becomes a bottleneck again (or not)
    // Since these are layouts mapping data tensors to ciphertext-semantic
    // tensors, both the domain and range spaces are simple grids from (0, 0, ...,
    // 0) to (bound0, bound1, ..., boundK). We can sample this grid however we
    // like, but it should suffice for most cases to check some corners and a few
I like the idea of using these points - but I also wonder too whether we should convert to ISL and sample some random subset of points after that from one relation and check if they exist in the other? and just since ISL came to mind for point sampling - would ISL equality checks be faster in any case? (e.g. the isEqual used for the sameDomainForRangePoint above?)
Explicit randomness would probably be more annoying than helpful (i.e., have to keep a random seed configured for determinism). The ISL "sample" method also only produces one point, I don't know of a way to ask for 10 points (random or arbitrary, or "spread out") and I worry that the exactness required there (IIUC it's something like a simplex walk) would still make it slow.
Maybe there's a better approach by fixing a single point in the domain (0, 0) and then doing equality on the subsets of the range? Something maybe when we're past deadline territory.
Fixes google#2489

The main idea, in Layout/Utils.h::tryProveUnequal, is to use some known points in the domain or range of a packing relation and test them for identical image/preimage across the two layouts. If they differ, then the layouts must be unequal. If all tests fail to prove a difference, then the exact `isEqual` test is run.

This heuristic makes sense for ciphertext layout relations, but not general relations, because, for example, we know that (0, 0) is almost always in the domain of some element of a relation with a 2D domain (all packed 2D tensors have a first entry), and (0, 0) is almost always in the range of every relation (all packings have a first ciphertext with a first slot, though that slot may be unused).

In this implementation, we test the extreme points of each dimension, as well as some interior points. For the lenet example, the runtime of `layout-propagation` on lenet.mlir was reduced from many minutes to 0.8760 seconds in -c opt mode and 8.8 seconds in -c dbg mode. Note, the original implementation was slow, not because `isEqual` is slow for one particular call, but because `isEqual` is called many times and is mildly slow for all of them, so the speedup is realized by making most, but not all, tests take the fast heuristic path.
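The control flow described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the HEIR implementation: `ToyRelation` is an explicit set of (domain, range) pairs standing in for an `IntegerRelation`, and `imageOf`, `tryProveUnequalToy`, and `isEqualWithHeuristic` are hypothetical names. Plain set equality stands in for the exact `isEqual` check.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

using Point = std::vector<int64_t>;
// Toy stand-in for a layout relation: an explicit set of (domain, range) pairs.
using ToyRelation = std::set<std::pair<Point, Point>>;

// Image of one domain point: every range point related to it.
std::set<Point> imageOf(const ToyRelation &rel, const Point &domainPoint) {
  std::set<Point> image;
  for (const auto &entry : rel)
    if (entry.first == domainPoint) image.insert(entry.second);
  return image;
}

// Returns true only when some probe point has different images in the two
// relations, which proves they differ. Returning false proves nothing.
bool tryProveUnequalToy(const ToyRelation &a, const ToyRelation &b,
                        const std::vector<Point> &probes) {
  for (const Point &p : probes)
    if (imageOf(a, p) != imageOf(b, p)) return true;
  return false;
}

// Fast path first; the exact comparison runs only when probes are inconclusive.
bool isEqualWithHeuristic(const ToyRelation &a, const ToyRelation &b,
                          const std::vector<Point> &probes) {
  if (tryProveUnequalToy(a, b, probes)) {
    // Soundness check: a probe disagreement implies genuine inequality.
    assert(a != b && "heuristic must never reject equal relations");
    return false;
  }
  return a == b;  // stands in for the exact (slow) isEqual call
}
```

The key property is that the heuristic is one-sided: a disagreement at any probe point is a proof of inequality, while agreement at every probe still requires the exact check. The speedup therefore depends on how often unequal layouts already differ at the probed points.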