Zipper unselection benchmark #60

ncreep · 2011-09-26T17:42:20Z

I think we should devise some target benchmark for zipper unselection (deep, shallow, or whatever).

Otherwise, uber horrible performance like the one before the fix in issue #55 may creep in without us noticing.

djspiewak · 2011-09-28T03:47:42Z

I think this is absolutely a good idea. I put some thought into this a while ago, but I didn't act on it because, frankly, it's such a mess to encode the equivalent operations in scala.xml or jaxp. :-S Also, I was a little afraid of what the performance would turn out to be.

You're quite right that we should be measuring this empirically.

josharnold52 · 2011-10-11T21:46:18Z

If no one has started on this, I'll take a crack at it. I also want to get some measurements on some of the other Zipper methods (updated, flatMap, etc.)

djspiewak · 2011-10-12T03:28:02Z

I haven't started on it yet (haven't found the time). It's definitely an important item though; thanks for tackling it!

josharnold52 · 2011-10-19T07:39:10Z

The above pull request includes these benchmarks along with some optimizations. The new "modify" trials measure an entire select/modify/unselect cycle (or its equivalent). To run them, enter test-perf :modify from SBT. I also added benchmarks for individual zipper methods, which can be run using test-perf :zipperOps. See test-perf --help for more information on the new command line options.

Just to give a flavor for some of the new trials, I'll discuss some of the deep-modify results. These trials find an element by name and add an attribute to it. anti-xml does this using a Zipper. The other implementations use the most straightforward algorithm I could think of.

The first deep-modify trial uses the same query as deepSelectSmall and adds an attribute to each element:

[deepModifyFewSmall]        deep modify a few elements of a 7 MB tree          [modify 8 nodes]
 + anti-xml:                    min:   2 ms, max:   3 ms, average:   2 ms          [result has 378832 nodes, 133010 elems, 2780 attrs]
 + scala.xml:                   min:  57 ms, max:  61 ms, average:  58 ms          [result has 378832 nodes, 133010 elems, 2780 attrs]
 + javax.xml:                   min:  65 ms, max:  71 ms, average:  67 ms          [result has 378832 nodes, 133010 elems, 2780 attrs]

As can be seen, anti-xml significantly outperforms the other implementations. This is primarily due to the bloom filter optimization, which gives anti-xml a huge advantage in this trial.

A different extreme can be seen in the deepModifyManySmall trial. It is structurally similar, but it selects 2772 nodes rather than 8. This amounts to one element from every "doc" structure in the spending.xml file.

[deepModifyManySmall]       deep modification of many elements of a 7 MB tree  [modify 2771 nodes]
 + anti-xml:                    min:  89 ms, max: 105 ms, average:  94 ms          [result has 378832 nodes, 133010 elems, 5543 attrs]
 + scala.xml:                   min:  51 ms, max:  55 ms, average:  52 ms          [result has 378832 nodes, 133010 elems, 5543 attrs]
 + javax.xml:                   min:6867 ms, max:6867 ms, average:6867 ms          [result has 378832 nodes, 133010 elems, 5543 attrs]

The first item that jumps out is the awful jaxp performance. I haven't fully analyzed this, but it seems that updating nodes found via getElementsByTagName is very slow.

Notice also that scala.xml outperforms anti-xml. To analyze this, we can look at the selection-only times for this particular query.

[deepModifyManySmallSel]    Selection-only times for [deepModifyManySmall]     [378832 nodes, 133010 elems, 2772 attrs]
 + anti-xml:                    min:  60 ms, max:  65 ms, average:  62 ms          [Selected 2771 nodes]
 + scala.xml:                   min: 170 ms, max: 188 ms, average: 175 ms          [Selected 2771 nodes]
 + javax.xml:                   min:   6 ms, max:   8 ms, average:   6 ms          [Selected 2771 nodes]
 + anti-xml - no zipper:        min:  60 ms, max:  65 ms, average:  62 ms          [Selected 2771 nodes]
 + anti-xml - no bloom:         min:  32 ms, max:  35 ms, average:  33 ms          [Selected 2771 nodes]

As can be seen, roughtly 2/3 of the anti-xml modify time was spent on selection. It turns out that the bloom filter doesn't help with this particular query, because we can't eliminate any significant branches of the tree. The anti-xml - no bloom implementation measures this effect by using a custom selector that does not use the filter. For this query, the "no-bloom" version runs in roughly half the time as the normal version.

Strangely, the scala.xml selection time is worse than its modify time! That's because the selection trial uses \\, which is apparently slower than the manual traversal I used for the modify trial.

To get an idea of what's possible, I tried writing a some modify algorithms in anti-xml that didn't use selectors or zippers. The full set of modify results is:

[deepModifyManySmall]       deep modification of many elements of a 7 MB tree  [modify 2771 nodes]
 + anti-xml:                    min:  89 ms, max: 105 ms, average:  94 ms          [result has 378832 nodes, 133010 elems, 5543 attrs]
 + scala.xml:                   min:  51 ms, max:  55 ms, average:  52 ms          [result has 378832 nodes, 133010 elems, 5543 attrs]
 + scala-xml, opt:              min:  39 ms, max:  40 ms, average:  39 ms          [result has 378832 nodes, 133010 elems, 5543 attrs]
 + javax.xml:                   min:6867 ms, max:6867 ms, average:6867 ms          [result has 378832 nodes, 133010 elems, 5543 attrs]
 + anti-xml no-zip:             min: 251 ms, max: 268 ms, average: 259 ms          [result has 378832 nodes, 133010 elems, 5543 attrs]
 + anti-xml no-zip, opt:        min:  25 ms, max:  27 ms, average:  25 ms          [result has 378832 nodes, 133010 elems, 5543 attrs]
 + anti-xml no-zip, update:     min:  28 ms, max:  35 ms, average:  29 ms          [result has 378832 nodes, 133010 elems, 5543 attrs]

My best time was 25ms, in the anti-xml no-zip, opt implementation. You can find the code for all of these in the modifyTrial.scala file.

josharnold52 · 2011-10-21T20:16:53Z

Followup - I actually found some optimizations that significantly reduce the overhead of the bloom filter. Indeed, they actually turn it into a net positive for the deepModifyManySmallSel trial, above. They took the corresponding modify trial down from 94ms to 61ms.

djspiewak · 2011-10-23T16:13:39Z

I'm going to try to find some time in the near future to create some new charts on the website based on the new benchmarks. This is really amazing stuff. I was hoping to get zipper to within an order of 5 worse than manual rebuilding. The idea that it could be faster than rebuilding with scala.xml just never occurred to me!

josharnold52 · 2011-10-23T20:29:48Z

:) Let me know if anything could use further explanation. Implementations like "anti-xml - no bloom" are mainly of internal interest, so I hacked it such that they don't show up at smaller run levels. If you run test-perf without arguments, then you should just get the main results without the extra noise.

josharnold52 mentioned this issue Oct 19, 2011

New Performance Trials and Optimizations #72

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Zipper unselection benchmark #60

Zipper unselection benchmark #60

ncreep commented Sep 26, 2011

djspiewak commented Sep 28, 2011

josharnold52 commented Oct 11, 2011

djspiewak commented Oct 12, 2011

josharnold52 commented Oct 19, 2011

josharnold52 commented Oct 21, 2011

djspiewak commented Oct 23, 2011

josharnold52 commented Oct 23, 2011

Zipper unselection benchmark #60

Zipper unselection benchmark #60

Comments

ncreep commented Sep 26, 2011

djspiewak commented Sep 28, 2011

josharnold52 commented Oct 11, 2011

djspiewak commented Oct 12, 2011

josharnold52 commented Oct 19, 2011

josharnold52 commented Oct 21, 2011

djspiewak commented Oct 23, 2011

josharnold52 commented Oct 23, 2011