New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Image Optimisations #54

Closed
wants to merge 5 commits into
from

Conversation

Projects
None yet
2 participants
@thelmstedt

I need to be able to generate spreadsheets with 2000 images fast enough for a synchronous HTTP request. 3.16 takes ~25 seconds for this usecase for me. These changes take it down to ~1 second. I've added a test for my case, and I don't get any more failures than trunk. I don't think I've broken any invariants but it's definitely worth a 2nd look!

The slowdown was caused by the cost of creating and sorting PackagePartNames. I assume it's part of the OOXML spec so there's no avoiding the overhead. But addPicture happened to make some redundant usage of these:

  • adding a new relationship enumerated all current relationships, building PackagePartNames for each
  • PackageParts were stored as as a TreeMap<PackagePartName, PackagePart>

Instead we

  • cache relationship lookups by name (similarly to what is already done for ID and type)
  • Store PackageParts in a HashMap for quick lookups, and explicitly sort its .values()

First commit adds a benchmark using http://openjdk.java.net/projects/code-tools/jmh/

Prior to my changes addPicture gets:

# Run complete. Total time: 00:00:31

Benchmark                                                          Mode  Cnt        Score         Error   Units
AddImageBench.benchCreatePicture                                   avgt   10     2831.586 ±      38.824   us/op
AddImageBench.benchCreatePicture:·gc.alloc.rate                    avgt   10      810.418 ±      22.303  MB/sec
AddImageBench.benchCreatePicture:·gc.alloc.rate.norm               avgt   10  2407955.352 ±   33327.581    B/op
AddImageBench.benchCreatePicture:·gc.churn.PS_Eden_Space           avgt   10      847.676 ±     361.511  MB/sec
AddImageBench.benchCreatePicture:·gc.churn.PS_Eden_Space.norm      avgt   10  2520570.616 ± 1084187.937    B/op
AddImageBench.benchCreatePicture:·gc.churn.PS_Survivor_Space       avgt   10        0.561 ±       0.645  MB/sec
AddImageBench.benchCreatePicture:·gc.churn.PS_Survivor_Space.norm  avgt   10     1667.673 ±    1912.256    B/op
AddImageBench.benchCreatePicture:·gc.count                         avgt   10       16.000                counts
AddImageBench.benchCreatePicture:·gc.time                          avgt   10       69.000                    ms
AddImageBench.benchCreatePicture:·stack                            avgt               NaN                   ---

Afterwards we get 10x improvement in execution time, and 100x in memory:

# Run complete. Total time: 00:00:31

Benchmark                                                          Mode  Cnt      Score       Error   Units
AddImageBench.benchCreatePicture                                   avgt   10    227.339 ±    49.226   us/op
AddImageBench.benchCreatePicture:·gc.alloc.rate                    avgt   10    119.667 ±    25.859  MB/sec
AddImageBench.benchCreatePicture:·gc.alloc.rate.norm               avgt   10  28021.776 ±    54.539    B/op
AddImageBench.benchCreatePicture:·gc.churn.PS_Eden_Space           avgt   10     98.653 ±   314.433  MB/sec
AddImageBench.benchCreatePicture:·gc.churn.PS_Eden_Space.norm      avgt   10  19826.075 ± 63192.153    B/op
AddImageBench.benchCreatePicture:·gc.churn.PS_Survivor_Space       avgt   10      0.228 ±     1.090  MB/sec
AddImageBench.benchCreatePicture:·gc.churn.PS_Survivor_Space.norm  avgt   10     45.594 ±   217.979    B/op
AddImageBench.benchCreatePicture:·gc.count                         avgt   10      2.000              counts
AddImageBench.benchCreatePicture:·gc.time                          avgt   10     88.000                  ms
AddImageBench.benchCreatePicture:·stack                            avgt             NaN                 ---

Happy to back out the benchmark inclusion if you don't want to include another test dependency.

thelmstedt added some commits Oct 23, 2016

PackageRelationshipCollection caches lookup by targetPart
Building partnames for all relationships is expensive. Here we avoid
this in findExistingRelation, which is used every time we add a relation
to a DocumentPart.
PackagePartCollection optimisations
Instead of extending a lookup TreeMap and incurring the natural ordering
cost for each insertion, we wrap a HashMap and ensure calls to .values()
are sorted.
@thelmstedt

This comment has been minimized.

Show comment
Hide comment
@thelmstedt

thelmstedt May 15, 2017

You should be able to just execute the main method in the benchmark to run, although I am using intellij (with the project loaded from gradle) to do so - YMMV.

You should be able to just execute the main method in the benchmark to run, although I am using intellij (with the project loaded from gradle) to do so - YMMV.

@onealj

Looks good to me, with a few questions.

@@ -239,8 +239,8 @@ private static ZipEntrySource openZipEntrySourceStream(ThresholdInputStream zis)
}
if (this.zipArchive == null) {
return this.partList.values().toArray(

This comment has been minimized.

@onealj

onealj May 15, 2017

Contributor

Is partList.sortedValues() an expensive call? If so, the result should be saved in a temp variable so it doesn't have to be called twice.

Does this work?

return this.partList.sortedValues().toArray(
    new PackagePart[this.partList.size()]);
@onealj

onealj May 15, 2017

Contributor

Is partList.sortedValues() an expensive call? If so, the result should be saved in a temp variable so it doesn't have to be called twice.

Does this work?

return this.partList.sortedValues().toArray(
    new PackagePart[this.partList.size()]);

This comment has been minimized.

@thelmstedt

thelmstedt May 15, 2017

Whoops! Yeah that's actually how it's done further down the method. I'll make the change and ammend the pull request

@thelmstedt

thelmstedt May 15, 2017

Whoops! Yeah that's actually how it's done further down the method. I'll make the change and ammend the pull request

@@ -649,12 +649,11 @@ public PackagePart getPart(PackagePartName partName) {
*/

This comment has been minimized.

@onealj

onealj May 15, 2017

Contributor

I see that you started this pull request back in 2016. Have you rebased this to poi trunk (currently progressing towards 3.17 beta 1)?

@onealj

onealj May 15, 2017

Contributor

I see that you started this pull request back in 2016. Have you rebased this to poi trunk (currently progressing towards 3.17 beta 1)?

This comment has been minimized.

@thelmstedt

thelmstedt May 15, 2017

Yep, I've been running a version of this in production since Jan. I rebased before opening this pull request yesterday

@thelmstedt

thelmstedt May 15, 2017

Yep, I've been running a version of this in production since Jan. I rebased before opening this pull request yesterday

@@ -344,7 +344,7 @@ else if (contentType != null) {
}
}
return partList.values().toArray(new ZipPackagePart[partList.size()]);

This comment has been minimized.

@onealj

onealj May 15, 2017

Contributor

Any reason for returning a PackagePart array instead of a ZipPackagePart array?

@onealj

onealj May 15, 2017

Contributor

Any reason for returning a PackagePart array instead of a ZipPackagePart array?

This comment has been minimized.

@thelmstedt

thelmstedt May 15, 2017

They actually are ZipPackageParts at that point, but the method returns PackagePart[] so no consumer can know about that. Seemed odd to force it to cast at that point, though it makes no real difference if you wanna keep it as it was.

@thelmstedt

thelmstedt May 15, 2017

They actually are ZipPackageParts at that point, but the method returns PackagePart[] so no consumer can know about that. Seemed odd to force it to cast at that point, though it makes no real difference if you wanna keep it as it was.

@onealj

onealj approved these changes May 15, 2017

Thanks for adding the testXSSFSAddPicture unit test

@onealj

This comment has been minimized.

Show comment
Hide comment
@onealj

onealj May 15, 2017

Contributor

I like that benchmarking library. I have no reservations about making it a test dependency. There are plenty of other places we could use a benchmarking library. We have a few places of roll-your-own timed unit tests which will inevitably break on a sluggish build slave.

Aside from our unit tests, integration tests, api checks, we could also trend execution memory and speed. This will be especially helpful as we look for an xmlbeans replacement.

I've committed your changes, essentially unmodified, with a couple whitespace changes in SVN r1795252 https://svn.apache.org/viewvc?view=revision&revision=1795252

This is really good stuff. Be careful, though. We might just ask you to be a committer with this quality of pull requests.

Contributor

onealj commented May 15, 2017

I like that benchmarking library. I have no reservations about making it a test dependency. There are plenty of other places we could use a benchmarking library. We have a few places of roll-your-own timed unit tests which will inevitably break on a sluggish build slave.

Aside from our unit tests, integration tests, api checks, we could also trend execution memory and speed. This will be especially helpful as we look for an xmlbeans replacement.

I've committed your changes, essentially unmodified, with a couple whitespace changes in SVN r1795252 https://svn.apache.org/viewvc?view=revision&revision=1795252

This is really good stuff. Be careful, though. We might just ask you to be a committer with this quality of pull requests.

@asfgit asfgit closed this in 5c62492 May 15, 2017

@onealj

This comment has been minimized.

Show comment
Hide comment
@onealj

onealj May 16, 2017

Contributor

These changes will be included in POI 3.17 beta 1.

Contributor

onealj commented May 16, 2017

These changes will be included in POI 3.17 beta 1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment