Add a DagOptimizer test #745

johnynek · 2017-08-19T21:29:23Z

We are using the DagOptimizer at Stripe before planning to reduce the size of some online graphs (in one example, from about 115 Storm bolts to 69).

However, even in the case where we reach 69, there are rules that don't seem to be fully applied and I have not yet found out why.

Anyway, more test coverage never hurts.

johnynek · 2017-08-19T21:29:44Z

@ttim can you take a look?

codecov-io · 2017-08-19T21:42:41Z

Codecov Report

Merging #745 into develop will increase coverage by 0.1%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           develop     #745     +/-   ##
==========================================
+ Coverage    72.23%   72.34%   +0.1%     
==========================================
  Files          154      154             
  Lines         3742     3742             
  Branches       209      209             
==========================================
+ Hits          2703     2707      +4     
+ Misses        1039     1035      -4

Impacted Files	Coverage Δ
...witter/summingbird/scalding/ScaldingPlatform.scala	`76.19% <0%> (+0.59%)`	⬆️
.../main/scala/com/twitter/summingbird/Producer.scala	`77.27% <0%> (+1.51%)`	⬆️
...a/com/twitter/summingbird/scalding/LoopState.scala	`75% <0%> (+25%)`	⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 0f06d22...39fa04c.

johnynek · 2017-08-19T22:40:06Z

okay, this now fails for me (both the fanOut and the idempotency test).

cc @non

johnynek · 2017-08-19T22:47:59Z

actually, I can't get idempotency to fail now... maybe the issue is just that fanOut, which some of the rules use, is broken...

johnynek · 2017-08-20T02:10:29Z

Okay, I don't actually see a bug. It was an issue with the test, and an inconsistency between how fanOut was defined in Dependants and ExpressionDag.

This test seems to pass for me now. I'll try to find a counterexample next week, but so far everything still looks correct.

johnynek · 2017-08-20T23:44:36Z

okay, idempotency failure:

 arg0 = Summer(IdentityKeyedProducer(NamedProducer(IdentityKeyedProducer(MergedProducer(IdentityKeyedProducer(FlatMappedProducer(LeftJoinedProducer(IdentityKeyedProducer(NamedProducer(IdentityKeyedProducer(NamedProducer(IdentityKeyedProducer(OptionMappedProducer(IdentityKeyedProducer(FlatMappedProducer(Source(List(-483916215, 1947511647, -1, 2147483647, -1094725117, -1107074050, 1316607071, -894421428, 438074163, -1618870436, 2147483647, -927761159, 1000599281, 1, 2147483647, 1, -941762984, 1, -2147483648, 1134838592, -2147483648, -1360150793, -1774434453, -1968350011, -1999846660, 1, 940697649, -561677987, 1326923701, 734987219, 0, -1731750202, 782845389, 0, 158805678, -1, 1, -560692419, -1, 879491512, 1, 685151277, 333970011, -2147483648, 961509966, -2147483648, -1, 870342926, 2051572436, 2147483647, -178556721, 0, -1028446598, -2147483648, 1, 2147483647, 1, -1915898801, 1834110173, 519019579, 2147483647, 0, 1514822104, -1, -877887704, 2147483647, -224941626, -675798454, 1, 2092027473, -487281773, 638669148, -2147483648, 1, 2147483647, -2147483648, 1, -567098335, -1, 795549827, -1995849024, -577239589, 1867920629, 1, 0, -1839762855, 0, 1650235957, -385664255, 1676297418, 2147483647, -1701772912, -1, -1, 2147483647, 330749634, 1, 2147483647, -2147483648, 1462163691)),org.scalacheck.GenArities$$Lambda$3183/1815517983@3d533ae3)),org.scalacheck.GenArities$$Lambda$3183/1815517983@368308bf)),tjiposzOlkplcu)),tvpwpdyScehGnwcaVjjWvlfuwxatxhdjhozscucpbq)),Map(1122506458 -> -422595330, 985940285 -> 1773535903, -1012102957 -> 577191710, 1865284784 -> 0, 1133011483 -> -1, -1571538327 -> -106754684, -883421086 -> -1843831487, -1186741108 -> -1297188435, 511615026 -> 892838107, 2147483647 -> 0, -1 -> 0, 654842167 -> 2147483647, 766280486 -> 0, 46050994 -> -2147483648, -146388742 -> -1080181663, 1883141387 -> 2147483647, -1170327451 -> 41927314, -1265823639 -> 2038185509, -1531812404 -> -1, -1426771473 -> 0, 107420900 -> 1, -778321474 -> -613855929, 730121132 -> -1824061783, 
-193486826 -> 0, -1603389086 -> -1, -1663776702 -> 2025651619, 1791509260 -> -1, -249195648 -> 0, 1795996912 -> 470882984, -1635961338 -> 995590274, -1315404816 -> 1, 303482491 -> 1, -1120629917 -> -913091299, 1135859794 -> 0, 1 -> 431843073, -1055837163 -> -428420920, 1920175445 -> -2147483648, -248886982 -> -2003982440, 1518754224 -> -1, 796959542 -> 1, -1658050660 -> -1870329784, 1129081426 -> -1878450200, 992361458 -> -1, -315984592 -> 1918575397, -1137239383 -> 1, -102255667 -> -1, -734906869 -> -611180863, -473252927 -> 2147483647, 886258153 -> 1, -916284450 -> 1, 1559234564 -> -1740554658, -2147483648 -> -1363669838, 0 -> -1)),org.scalacheck.GenArities$$Lambda$3183/1815517983@2bb51e67)),IdentityKeyedProducer(OptionMappedProducer(IdentityKeyedProducer(FlatMappedProducer(Source(List(-483916215, 1947511647, -1, 2147483647, -1094725117, -1107074050, 1316607071, -894421428, 438074163, -1618870436, 2147483647, -927761159, 1000599281, 1, 2147483647, 1, -941762984, 1, -2147483648, 1134838592, -2147483648, -1360150793, -1774434453, -1968350011, -1999846660, 1, 940697649, -561677987, 1326923701, 734987219, 0, -1731750202, 782845389, 0, 158805678, -1, 1, -560692419, -1, 879491512, 1, 685151277, 333970011, -2147483648, 961509966, -2147483648, -1, 870342926, 2051572436, 2147483647, -178556721, 0, -1028446598, -2147483648, 1, 2147483647, 1, -1915898801, 1834110173, 519019579, 2147483647, 0, 1514822104, -1, -877887704, 2147483647, -224941626, -675798454, 1, 2092027473, -487281773, 638669148, -2147483648, 1, 2147483647, -2147483648, 1, -567098335, -1, 795549827, -1995849024, -577239589, 1867920629, 1, 0, -1839762855, 0, 1650235957, -385664255, 1676297418, 2147483647, -1701772912, -1, -1, 2147483647, 330749634, 1, 2147483647, -2147483648, 1462163691)),org.scalacheck.GenArities$$Lambda$3183/1815517983@3d533ae3)),org.scalacheck.GenArities$$Lambda$3183/1815517983@368308bf)))),ncn)),Map(),com.twitter.algebird.IntRing$@4deb204c),

 arg1 = com.twitter.summingbird.planner.DagOptimizer$RemoveNames$@7d710ac9.orElse(com.twitter.summingbird.planner.DagOptimizer$RemoveIdentityKeyed$@2a6e2829).orElse(com.twitter.summingbird.planner.DagOptimizer$FlatMapFusion$@728d18e6).orElse(com.twitter.summingbird.planner.DagOptimizer$OptionMapFusion$@5dba444a).orElse(com.twitter.summingbird.planner.DagOptimizer$OptionToFlatMap$@28e82a76).orElse(com.twitter.summingbird.planner.DagOptimizer$KeyFlatMapToFlatMap$@731c6022).orElse(com.twitter.summingbird.planner.DagOptimizer$FlatMapKeyFusion$@69cc1c8f).orElse(com.twitter.summingbird.planner.DagOptimizer$ValueFlatMapToFlatMap$@66db9de8).orElse(com.twitter.summingbird.planner.DagOptimizer$FlatMapValuesFusion$@f09a0df).orElse(com.twitter.summingbird.planner.DagOptimizer$FlatThenOptionFusion$@1f04dc6c).orElse(com.twitter.summingbird.planner.DagOptimizer$DiamondToFlatMap$@16258367).orElse(com.twitter.summingbird.planner.DagOptimizer$MergePullUp$@2c2b7896).orElse(com.twitter.summingbird.planner.DagOptimizer$AlsoPullUp$@6ee63a94)

)

I'll try to reproduce that failure and fix.
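For readers unfamiliar with the property being checked, here is a minimal sketch of an idempotency check on a toy rewriter. It is not the summingbird code: `Expr`, `step`, `optimize`, and `isIdempotent` are hypothetical names, and the two toy rules (dropping names, fusing adjacent maps) only loosely mirror RemoveNames and the fusion rules listed in arg1. The point it illustrates is that a single pass can leave a rule under-applied, while an optimizer that runs to a fixed point should satisfy `optimize(optimize(e)) == optimize(e)`.

```scala
import scala.annotation.tailrec

object IdempotencySketch {
  // Toy expression type; a stand-in for a producer graph, not the real Producer.
  sealed trait Expr
  final case class Source(name: String) extends Expr
  final case class Named(input: Expr, name: String) extends Expr
  final case class Mapped(input: Expr, fn: String) extends Expr

  // One pass of two toy rules: drop Named wrappers and fuse directly nested maps.
  def step(e: Expr): Expr = e match {
    case Named(in, _)             => step(in)
    case Mapped(Mapped(in, f), g) => step(Mapped(in, s"$f andThen $g"))
    case Mapped(in, f)            => Mapped(step(in), f)
    case src: Source              => src
  }

  // Apply passes until nothing changes (a fixed point).
  @tailrec
  def optimize(e: Expr): Expr = {
    val next = step(e)
    if (next == e) e else optimize(next)
  }

  // The property under test: optimizing an optimized graph changes nothing.
  def isIdempotent(e: Expr): Boolean =
    optimize(optimize(e)) == optimize(e)

  // A Named node sits between two maps, so one pass removes the name
  // but does not get to fuse the maps it has just made adjacent.
  val example: Expr = Mapped(Named(Mapped(Source("s"), "a"), "n"), "b")
}
```

In this toy, `step(example)` still contains two nested `Mapped` nodes, whereas `optimize(example)` fuses them. A property test would generate many such expressions and assert `isIdempotent` on each.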

There was a subtle bug in ExpressionDag.fanOut which caused non-idempotency, and underapplication of rules in some cases. The problem was in computing fanOut in Id space, where there is actually no one-to-one mapping between Ids and N nodes. The solution is to compute fanOut directly in N space, which is what is meaningful anyway. This seems to fix the bug even with rather large numbers of trials.

johnynek · 2017-08-21T08:48:12Z

This seems to fix the issue.

The bug was with fanOut. It was working in the Expr[_, _] and Id[_] space, which is not meaningful to the rules, which operate on the original N[_] space. The problem is that rewrites can give the same node two different Ids. When that happens, it looks like the fanOut is greater than it is, and this can prevent rules from applying.

If you jump back to the N[_] space, and then back into the optimizer, you reset the Ids, and the rule may then get a chance to apply. This seems to be the problem.
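The over-counting described above can be sketched with a toy model. This is an illustration under assumptions, not the ExpressionDag implementation: `Node`, the `Map[Id, Node]` table, `fanOutById`, and `fanOutByNode` are all hypothetical stand-ins. The table plays the role of the Id-to-expression mapping, and the example assumes a rewrite has left the same node registered under two different Ids.

```scala
object FanOutSketch {
  // Toy node type standing in for N[_]; not the summingbird Producer type.
  sealed trait Node
  final case class Src(name: String) extends Node
  final case class MapNode(input: Node, label: String) extends Node
  final case class Merge(left: Node, right: Node) extends Node

  // Hypothetical Id type and Id -> Node table (assumed shape, for illustration).
  type Id = Int

  // Direct inputs of a node.
  def inputs(n: Node): List[Node] = n match {
    case Src(_)         => Nil
    case MapNode(in, _) => List(in)
    case Merge(l, r)    => List(l, r)
  }

  // Fan-out counted in Id space: every Id whose node reads from `target` counts.
  def fanOutById(table: Map[Id, Node], target: Node): Int =
    table.count { case (_, n) => inputs(n).contains(target) }

  // Fan-out counted in node space: equal nodes are counted once.
  def fanOutByNode(table: Map[Id, Node], target: Node): Int =
    table.values.toList.distinct.count(n => inputs(n).contains(target))

  val src: Node = Src("s")
  val mapped: Node = MapNode(src, "f")

  // The same `mapped` node has ended up under two Ids (1 and 2).
  val table: Map[Id, Node] = Map(0 -> src, 1 -> mapped, 2 -> mapped)
}
```

Here `fanOutById(table, src)` reports 2 while `fanOutByNode(table, src)` reports 1. A rule that only fires when a node has a single consumer would be blocked by the Id-space count even though the graph, seen in node space, allows it.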

Along the way I cleaned things up slightly.

Can you take a look, @ttim?

johnynek · 2017-08-21T08:58:01Z

had a storm flake. Restarted a 2.12 build.

johnynek · 2017-08-21T21:25:30Z

I'd like to cherry-pick this when we merge and publish a 0.10.1 release that includes this fix. We are using the optimizer and it would be helpful to us.

Also, I'm considering breaking this code out into a standalone library. I copied it into scalding, but it is inconvenient to have to publish these large projects to update this totally generic DAG rewriting tool.

piyushnarang

Looks good to me, but I'm not super familiar with the code, so it might be nice to get another pair of eyes on this.

piyushnarang · 2017-08-21T23:34:33Z

summingbird-core-test/src/test/scala/com/twitter/summingbird/planner/DagOptimizerTest.scala

+
+  implicit val generatorDrivenConfig =
+    PropertyCheckConfig(minSuccessful = 1000, maxDiscarded = 1000) // the producer generator uses filter, I think
+    //PropertyCheckConfig(minSuccessful = 100, maxDiscarded = 1000) // the producer generator uses filter, I think


piyushnarang · 2017-08-22T17:53:21Z

summingbird-core-test/src/test/scala/com/twitter/summingbird/planner/DagOptimizerTest.scala

+
+  }
+
+  test("test some idempotency specific past failures") {


Maybe add a comment above with the details of the past failure? It might be hard for folks reading the code to know what it covers.

johnynek · 2017-08-22

I don't know what to say. This was a hand-minimized example (it took me about an hour) from a failure case found by scalacheck. Since the failures were quite rare, it took a long time to even find one with scalacheck, so once I found it, I wanted to test that failure every time.

That's what I mean by "specific past failures".

I understand (somewhat) why this failed now, but I couldn't easily generate another example that would also show the bug.

Can you suggest some specific text you would like me to add?

piyushnarang · 2017-08-22

Yeah, I guess the title alone doesn't give you a sense of what the issue is and what it's testing. Would it make sense to add a tl;dr of your understanding of why it failed?

CLAassistant · 2019-11-16T23:49:34Z

All committers have signed the CLA.

Add a DagOptimizer test

8894914

johnynek assigned ttim Aug 19, 2017

johnynek requested a review from ttim August 19, 2017 21:29

johnynek added 2 commits August 19, 2017 11:38

add a fanOut test

3f21029

Make sure we are using the right Gen

39fa04c

Add idempotency check, turn up number of trials

7be4298

add a better toString on Rule

3e7b47b

fix Dependants

996a15e

turn down the number of trials

6048196

piyushnarang approved these changes Aug 22, 2017
