Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Alternative PNG alpha-channel extraction #600

Closed
wants to merge 5 commits into from

Conversation

pointlessone
Copy link
Member

According to PNG spec filtering happens on a per-byte basis and uses corresponding bytes from other pixels. This means that we don't have to actually unfilter all the data. Instead it's possible to split pixel data into planes and reuse filters.

A few points to note:

  • Filters are reused from original image
  • Resulting opaque PNG uses the same filtered data
  • Resulting mask PNG image uses the same filteres data
  • Both resulting images are valid PNG images
  • All the unfiltering math is skipped which results in great boosts

Benchmarks

Benchmark is done by creating a document and adding a PNG image, repeated 100 times.

Mem is recorder by watching Activity Monitor. Values are not exact and provided just to illustrate approximate scale.

RBX

Original

                   user     system      total        real
opaque:        2.724633   0.541629   3.266262 (  1.698649)
transparent: 340.380027   9.747552 350.127579 (326.535695)

Mem: 225-227 MB

Modified

                   user     system      total        real
opaque:        2.675154   0.526945   3.202099 (  1.706802)
transparent:  77.626282   0.724561  78.350843 ( 72.843238)

Mem: 95-97 MB

Change

Execution time: 4.47x faster
Memory: 57% less

MRI

Original

                   user     system      total        real
opaque:        1.900000   0.190000   2.090000 (  2.206105)
transparent: 339.740000   6.330000 346.070000 (349.369761)

Mem: 110-130 -> 150-180 MB (Leak?)

Notes: Mem fluctuates widely. Median steadily rises. Mem leak?

Modified

                   user     system      total        real
opaque:        1.780000   0.170000   1.950000 (  1.964678)
transparent:  50.380000   0.350000  50.730000 ( 51.129041)

Mem: 36-38 MB

Change

Execution time: 6.82x faster
Memory: 62% - 79% less

JRuby 1.7.8

Original

                   user     system      total        real
opaque:        5.390000   0.270000   5.660000 (  2.817000)
transparent: 564.830000   5.690000 570.520000 (474.866000)

Mem: 287.4 -> 298, 293 -> 312.2 MB (Leak?)

Modified

                    user     system       total         real
opaque:         5.190000   0.210000    5.400000 (   2.497000)
transparent: 3559.290000  10.180000 3569.470000 (3614.186000)

Mem: 197.9 -> 211, 185.2 -> 187.2 MB (Leak?)

                    user     system       total         real
opaque:         5.440000   0.240000    5.680000 (   2.985000)
transparent: 3553.670000  12.020000 3565.690000 (3668.508000)

Mem: 224, 223.1 MB

Notes: Mem steadily rises, no fluctuation. Sometimes it drops.

Change

Execution time: 6.25x slower
Memory: 25%—34% less

JRuby 1.7.9

Original

                   user     system      total        real
opaque:        5.570000   0.240000   5.810000 (  3.035000)
transparent: 628.690000   7.210000 635.900000 (585.308000)

Mem: 298MB

Modified

                  user     system      total        real
opaque:       5.130000   0.240000   5.370000 (  2.919000)
transparent: 27.050000   0.760000  27.810000 ( 24.731000)

Mem: 234.1 MB

Change

Execution time: 22.87x faster
Memory: 21% less


PNG-related specs are broken. That is expected because "unfiltered" data is different with this implementation.

Also note that this code is extremely slow on JRuby. I'm looking into it.

@practicingruby
Copy link
Member

All of this is really exciting, hopefully you can get the tests to go green and figure out why we're slow on JRuby, but otherwise I'm very interested in this approach.

For the MRI memory variation, this may be due to mark-and-sweep GC running "whenever it gets around to it". So although there may be leak, seeing memory spikes in Ruby are not uncommon whenever there a bunch of short-lived objects that haven't been GCed yet.

@practicingruby
Copy link
Member

@cheba Which JRuby did you run on? I haven't even run my own optimizations there yet, but I'd like to be consistent. Either 1.7.8 or 1.7.9 would be OK. Earlier than that and we run into JRuby bugs affecting Prawn's font stuff.

@pointlessone
Copy link
Member Author

@sandal jruby 1.7.8 (1.9.3p392) 2013-11-14 0ce429e on Java HotSpot(TM) 64-Bit Server VM 1.6.0_65-b14-462-11M4609 [darwin-x86_64]

@pointlessone
Copy link
Member Author

JRuby 1.7.8 seem to be broken.

JRuby 1.7.9 is much faster. On the other hand, 1.7.9 is inconsistent between runs on memory side. Sometimes it's 230MB and sometimes it's over 400MB. It's hard to get any meaningful improvement (or otherwise) ratios.

@practicingruby
Copy link
Member

Still, we are seeing less memory usage overall on JRuby 1.7.9 with this patch than without it. I would be OK with requiring JRuby 1.7.9+ since it's a bug fix release anyway.

Please work on figuring out the spec failures, and revising test coverage where necessary. This seems more promising than my branch at the moment.

This was referenced Dec 16, 2013
@practicingruby
Copy link
Member

I think you're still seeing GC related memory fluctuations: we do care about peak utilization but it may be hard to get a clear sense of how much of this is actual memory used vs. garbage not being collected.

Anyway, I feel like this definitely is an improvement over what we have now, so revise the tests as needed and let's see if we can try to get this merged before 0.14.0 next month.

@yob
Copy link
Member

yob commented Dec 16, 2013

Are you measuring jruby memory usage using activity monitor or similar?

I'm no JVM expert, but I seem to recall it can allocate itself a large heap
space and is often reluctant to return memory to the OS. We might need to
find a more sophisticated way to measure memory usage on jruby.
On 17/12/2013 2:25 AM, "Alexander Mankuta" notifications@github.com wrote:

JRuby 1.7.8 seem to be broken.

JRuby 1.7.9 is much faster. On the other hand, 1.7.9 is inconsistent
between runs on memory side. Sometimes it's 230MB and sometimes it's over
400MB. It's hard to get any meaningful improvement (or otherwise) ratios.


Reply to this email directly or view it on GitHubhttps://github.com//pull/600#issuecomment-30668729
.

@bradediger
Copy link
Member

@yob is correct. You can measure JRuby memory usage most accurately using a tool like jvisualvm, which will also allow you to take snapshots to look at individual object allocations.

@pointlessone
Copy link
Member Author

@yob Yes, I just watch Activity Monitor. Simply to have some consistent numbers between implementations.

Though, it might be useful to have some better numbers to measure the changes. Can anybody share a short HOWTO on how to measure actually used memory on JVM?

It certainly looks like what you've described. Memory usage doesn't fluctuate much on JRuby.

According to PNG spec filtering happens on a per-byte basis and uses
corresponding bytes from other pixels. This means that we don't have to actually
unfilter all the data. Instead it's possible to split pixel data into planes.

A few points to note:

* Filters are reused from original image
* Resulting opaque PNG uses the same filtered data
* Resulting mask PNG image uses the same filteres data
* Both resulting images are valid PNG images
* All the unfiltering masth is skipped which results in great boosts

Here are some benchmarks:

RBX
==========================================================

Original
--------

                   user     system      total        real
opaque:        2.724633   0.541629   3.266262 (  1.698649)
transparent: 340.380027   9.747552 350.127579 (326.535695)

Mem: 225-227 MB

Modified
--------

                   user     system      total        real
opaque:        2.675154   0.526945   3.202099 (  1.706802)
transparent:  77.626282   0.724561  78.350843 ( 72.843238)

Mem: 95-97 MB

MRI
=========================================================

Original
--------
                   user     system      total        real
opaque:        1.900000   0.190000   2.090000 (  2.206105)
transparent: 339.740000   6.330000 346.070000 (349.369761)

Mem: 110-130 -> 150-180 MB (Leak?)

Notes: Mem fluctuates widely. Median steadily rises. Mem leak?

Modified
--------
                   user     system      total        real
opaque:        1.780000   0.170000   1.950000 (  1.964678)
transparent:  50.380000   0.350000  50.730000 ( 51.129041)

Mem: 36-38 MB

JRuby
=========================================================

Original
--------
                   user     system      total        real
opaque:        5.390000   0.270000   5.660000 (  2.817000)
transparent: 564.830000   5.690000 570.520000 (474.866000)

Mem: 287.4 -> 298, 293 -> 312.2 MB (Leak?)

Modified
--------
                    user     system       total         real
opaque:         5.190000   0.210000    5.400000 (   2.497000)
transparent: 3559.290000  10.180000 3569.470000 (3614.186000)

Mem: 197.9 -> 211, 185.2 -> 187.2 MB (Leak?)

                    user     system       total         real
opaque:         5.440000   0.240000    5.680000 (   2.985000)
transparent: 3553.670000  12.020000 3565.690000 (3668.508000)

Mem: 224, 223.1 MB

Notes: Mem steadily rises, no fluctuation. Sometimes it drops.
@pointlessone
Copy link
Member Author

@sandal I consider this pull request to be completed. Please review.

@practicingruby
Copy link
Member

I merged it! In doing so, I squashed everything down into a single commit to make reverting easier if we experience problems.

Two points:

  1. In the future please use feature branches on prawnpdf rather than your fork, it makes upstream management a little bit easier and allows all Prawn committers to make revisions to the same pull request.
  2. I am considering releasing this code as part of a 0.13.1 maintenance release, because it only touches internal features. Any concerns there?

@pointlessone
Copy link
Member Author

  1. "Make revisions" means commit to the branch? Otherwise I don't see any advantages. On my fork I can safely rewrite history to organize commits nicely. On the upstream repo that is a bit dangerous.
  2. Apart from supporting 16bit alpha channels it's the same. It may break some tests on user projects if they are trying to do string comparisons of rendered documents as it was done in repo. Otherwise it should be safe.

@practicingruby
Copy link
Member

Yes, we want to be able to have all contributors make commits to each other's feature branches. This makes it so that simple changes can be directly applied, speeding up feedback loops. I think it's still safe to rewrite history as needed as long as you only affect the commits on your branch and not master or stable. Also, we can always rebase before we merge, as I did with this branch.

I'm going to merge this into stable now, and it will be included in the 0.13.1 release, whenever that happens. Thanks!

@pointlessone
Copy link
Member Author

Got it. Will use topic branches.

Thanks for merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants