Add Image Optimisations #54

Closed
wants to merge 5 commits into
from

Conversation

Projects
None yet
2 participants

I need to be able to generate spreadsheets with 2000 images fast enough for a synchronous HTTP request. 3.16 takes ~25 seconds for this usecase for me. These changes take it down to ~1 second. I've added a test for my case, and I don't get any more failures than trunk. I don't think I've broken any invariants but it's definitely worth a 2nd look!

The slowdown was caused by the cost of creating and sorting PackagePartNames. I assume it's part of the OOXML spec so there's no avoiding the overhead. But addPicture happened to make some redundant usage of these:

  • adding a new relationship enumerated all current relationships, building PackagePartNames for each
  • PackageParts were stored as as a TreeMap<PackagePartName, PackagePart>

Instead we

  • cache relationship lookups by name (similarly to what is already done for ID and type)
  • Store PackageParts in a HashMap for quick lookups, and explicitly sort its .values()

First commit adds a benchmark using http://openjdk.java.net/projects/code-tools/jmh/

Prior to my changes addPicture gets:

# Run complete. Total time: 00:00:31

Benchmark                                                          Mode  Cnt        Score         Error   Units
AddImageBench.benchCreatePicture                                   avgt   10     2831.586 ±      38.824   us/op
AddImageBench.benchCreatePicture:·gc.alloc.rate                    avgt   10      810.418 ±      22.303  MB/sec
AddImageBench.benchCreatePicture:·gc.alloc.rate.norm               avgt   10  2407955.352 ±   33327.581    B/op
AddImageBench.benchCreatePicture:·gc.churn.PS_Eden_Space           avgt   10      847.676 ±     361.511  MB/sec
AddImageBench.benchCreatePicture:·gc.churn.PS_Eden_Space.norm      avgt   10  2520570.616 ± 1084187.937    B/op
AddImageBench.benchCreatePicture:·gc.churn.PS_Survivor_Space       avgt   10        0.561 ±       0.645  MB/sec
AddImageBench.benchCreatePicture:·gc.churn.PS_Survivor_Space.norm  avgt   10     1667.673 ±    1912.256    B/op
AddImageBench.benchCreatePicture:·gc.count                         avgt   10       16.000                counts
AddImageBench.benchCreatePicture:·gc.time                          avgt   10       69.000                    ms
AddImageBench.benchCreatePicture:·stack                            avgt               NaN                   ---

Afterwards we get 10x improvement in execution time, and 100x in memory:

# Run complete. Total time: 00:00:31

Benchmark                                                          Mode  Cnt      Score       Error   Units
AddImageBench.benchCreatePicture                                   avgt   10    227.339 ±    49.226   us/op
AddImageBench.benchCreatePicture:·gc.alloc.rate                    avgt   10    119.667 ±    25.859  MB/sec
AddImageBench.benchCreatePicture:·gc.alloc.rate.norm               avgt   10  28021.776 ±    54.539    B/op
AddImageBench.benchCreatePicture:·gc.churn.PS_Eden_Space           avgt   10     98.653 ±   314.433  MB/sec
AddImageBench.benchCreatePicture:·gc.churn.PS_Eden_Space.norm      avgt   10  19826.075 ± 63192.153    B/op
AddImageBench.benchCreatePicture:·gc.churn.PS_Survivor_Space       avgt   10      0.228 ±     1.090  MB/sec
AddImageBench.benchCreatePicture:·gc.churn.PS_Survivor_Space.norm  avgt   10     45.594 ±   217.979    B/op
AddImageBench.benchCreatePicture:·gc.count                         avgt   10      2.000              counts
AddImageBench.benchCreatePicture:·gc.time                          avgt   10     88.000                  ms
AddImageBench.benchCreatePicture:·stack                            avgt             NaN                 ---

Happy to back out the benchmark inclusion if you don't want to include another test dependency.

thelmstedt added some commits Oct 23, 2016

@thelmstedt thelmstedt Benchmark adding images c26b958
@thelmstedt thelmstedt PackageRelationshipCollection caches lookup by targetPart
Building partnames for all relationships is expensive. Here we avoid
this in findExistingRelation, which is used every time we add a relation
to a DocumentPart.
1d7cf35
@thelmstedt thelmstedt Drawing test 0fc2463
@thelmstedt thelmstedt PackagePartCollection optimisations
Instead of extending a lookup TreeMap and incurring the natural ordering
cost for each insertion, we wrap a HashMap and ensure calls to .values()
are sorted.
d4f01a9

You should be able to just execute the main method in the benchmark to run, although I am using intellij (with the project loaded from gradle) to do so - YMMV.

@onealj

Looks good to me, with a few questions.

@@ -239,8 +239,8 @@ private static ZipEntrySource openZipEntrySourceStream(ThresholdInputStream zis)
}
if (this.zipArchive == null) {
- return this.partList.values().toArray(
@onealj

onealj May 15, 2017

Is partList.sortedValues() an expensive call? If so, the result should be saved in a temp variable so it doesn't have to be called twice.

Does this work?

return this.partList.sortedValues().toArray(
    new PackagePart[this.partList.size()]);
@thelmstedt

thelmstedt May 15, 2017

Whoops! Yeah that's actually how it's done further down the method. I'll make the change and ammend the pull request

@@ -344,7 +344,7 @@ else if (contentType != null) {
}
}
- return partList.values().toArray(new ZipPackagePart[partList.size()]);
@onealj

onealj May 15, 2017

Any reason for returning a PackagePart array instead of a ZipPackagePart array?

@thelmstedt

thelmstedt May 15, 2017

They actually are ZipPackageParts at that point, but the method returns PackagePart[] so no consumer can know about that. Seemed odd to force it to cast at that point, though it makes no real difference if you wanna keep it as it was.

@@ -649,12 +649,11 @@ public PackagePart getPart(PackagePartName partName) {
*/
@onealj

onealj May 15, 2017

I see that you started this pull request back in 2016. Have you rebased this to poi trunk (currently progressing towards 3.17 beta 1)?

@thelmstedt

thelmstedt May 15, 2017

Yep, I've been running a version of this in production since Jan. I rebased before opening this pull request yesterday

@onealj

onealj approved these changes May 15, 2017

Thanks for adding the testXSSFSAddPicture unit test

onealj commented May 15, 2017 edited

I like that benchmarking library. I have no reservations about making it a test dependency. There are plenty of other places we could use a benchmarking library. We have a few places of roll-your-own timed unit tests which will inevitably break on a sluggish build slave.

Aside from our unit tests, integration tests, api checks, we could also trend execution memory and speed. This will be especially helpful as we look for an xmlbeans replacement.

I've committed your changes, essentially unmodified, with a couple whitespace changes in SVN r1795252 https://svn.apache.org/viewvc?view=revision&revision=1795252

This is really good stuff. Be careful, though. We might just ask you to be a committer with this quality of pull requests.

asfgit closed this in 5c62492 May 15, 2017

onealj commented May 16, 2017

These changes will be included in POI 3.17 beta 1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment