optimize applyMask in PdImageXObject #121

Closed. gunnar-ifp wants to merge 5 commits.


@gunnar-ifp gunnar-ifp commented Jun 18, 2021

There was a severe performance issue with very large masks when the image has to be scaled up to the mask size (e.g. 10000*10000 pixels): bicubic scaling can take 6-10 seconds. This patch switches to bilinear resizing in these cases, although the threshold may still need fine-tuning.

There was also a double allocation for the final masked image; we can simply reuse the image, since applyMask() is always fed a newly created one. Reference hogging and needless allocation have been removed.

Additionally, the alpha blending routines were very slow because they worked pixel by pixel. There is now a staggered approach:

  • direct byte masking, which is very fast even for big images (right now it does not work with padded buffers),
  • exploiting the data buffer's sample system to merge the alpha component into the ARGB image, letting the sample model do the bit masking,
  • slow pixel expansion to reverse premultiplied matte values (but using fixed-point integer arithmetic).
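The second path in the list above can be sketched as follows. This is only an illustration under stated assumptions, not the code from this patch: the class and method names are hypothetical, the image is assumed to be TYPE_INT_ARGB (non-premultiplied), and the mask is assumed to be an 8 bit TYPE_BYTE_GRAY image of the same size.

```java
import java.awt.image.BufferedImage;
import java.awt.image.Raster;
import java.awt.image.WritableRaster;

/** Hypothetical helper for illustration; not the PDFBox implementation. */
final class AlphaBandMerge {

    /**
     * Copies the gray samples of the mask into the alpha band of the image,
     * one row at a time. The sample model of the ARGB raster does the
     * shifting and bit masking, so no per-pixel packing code is needed.
     */
    static void mergeAlpha(BufferedImage argb, BufferedImage mask) {
        int w = argb.getWidth();
        int h = argb.getHeight();
        if (mask.getWidth() != w || mask.getHeight() != h) {
            throw new IllegalArgumentException("image and mask must have the same size");
        }
        // For TYPE_INT_ARGB the raster bands are R, G, B, A, so alpha is band 3.
        WritableRaster dst = argb.getRaster();
        Raster src = mask.getRaster();
        int[] row = new int[w];
        for (int y = 0; y < h; y++) {
            src.getSamples(0, y, w, 1, 0, row); // read one row of gray values
            dst.setSamples(0, y, w, 1, 3, row); // write them as alpha values
        }
    }
}
```

Working on whole rows keeps the number of calls into the raster small while still needing only a single int[] of the image width as scratch space.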

The interpolate flag of the mask is now also used to decide whether the mask should be interpolated.

optimized PDImageXObject:
- applyMask(): faster alpha blending routines, avoid extra image allocation
- scaleImage(): use a lower quality scaling algorithm if large image sizes are involved
fixed some formatting, also explicitly disabled interpolation on image format conversion "scaling".
@THausherr
Contributor

Do you have a PDF where this large scaling happened? I found only one in my entire collection where the "largeScale switch is arbitrarily chosen" segment is hit, which is the second page of PDFBOX-2103, but this was a small image.

@gunnar-ifp
Author

> Do you have a PDF where this large scaling happened? I found only one in my entire collection where the "largeScale switch is arbitrarily chosen" segment is hit, which is the second page of PDFBOX-2103, but this was a small image.

Yes, you can download the PDF from here: https://archive.org/details/AlfaWaffenkatalog1911
It's around 46 MB. It has been processed with ABBYY FineReader, which probably did the optimization. It's a nice and clever one, getting the 400 pages into 46 MB, but the file is very hard for most PDF libraries to work with.

@THausherr
Contributor

Oops, you did mention that one in your mailing list post. OK, I'll need some time to look at all of this.

@THausherr
Contributor

I tested this one with the Alfa file (page 5) and all my files. Rendering of the Alfa file is definitely much faster. I intend to remove the "interpolate" block from scaleImage because the effect is almost invisible (but I would leave a comment). I saw no differences in the rendering tests (done at 96 dpi); at 300 dpi I did a visual test and saw only meaningless differences, but it is a bit slower (with interpolation).

Is there a reason that you'd like to keep that block (besides that the block was there before)? It comes from PDFBOX-4218, where I had corrected previous code (PDFBOX-2750) that always interpolated. I retested the file from PDFBOX-2750 and it always looks OK. Even when trying the "worst" interpolation I can't get a bad rendering of that file. I was wondering whether Java itself had been improved, but no: when running an old version of PDFBox the rendering was still bad. So there were other changes that improved the quality of the scaling.

@gunnar-ifp
Author

gunnar-ifp commented Jun 24, 2021

Yes, I would like to keep the interpolation block.

Section 8.9.6.2 of the PDF spec says that stencil masks can have an interpolate flag and that, when it is enabled, the mask is interpolated, not the colors, so that low-res stencil masks don't appear with jagged edges.

Normally this happens when the image / mask are rendered onto the canvas, but we apply the mask beforehand without knowing the target resolution. This is why we have to interpolate ourselves, possibly at a much larger image size than needed.

But I noticed a few things while writing this text:

We do extend the 1 bit stencil mask to 8 bit, which might pose an interesting problem:
Looking at PDFBOX-4218, if instead the image were 48 pixels wide and the mask 6 pixels, with the interpolate flag set for the mask, this would create alpha blending near the masked-out areas. It would be interesting to create such a file and see how other PDF engines behave. But if this is wrong, one can simply clamp the alpha value to 0 and 255 with a threshold at 127 when writing the alpha value into the stencil mask image (I reckon this should not be done for soft masks). I do believe that the current behaviour is the better choice.
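The clamping mentioned above could look roughly like this. It is a sketch only: the class and method names are hypothetical, and it assumes that "threshold at 127" means values above 127 become fully opaque and everything else fully transparent.

```java
/** Hypothetical helper for illustration; not part of the patch. */
final class StencilAlphaClamp {

    /** Maps an 8 bit alpha value to 0 or 255 (assumed rule: above 127 is opaque). */
    static int clamp(int alpha) {
        return alpha > 127 ? 255 : 0;
    }

    /** Clamps a buffer of 8 bit alpha values in place. */
    static void clampInPlace(byte[] alpha) {
        for (int i = 0; i < alpha.length; i++) {
            // Java bytes are signed, so mask to get the unsigned 0..255 value first.
            alpha[i] = (byte) clamp(alpha[i] & 0xFF);
        }
    }
}
```

Applied to an interpolated stencil mask, this would restore hard edges; as noted above, it should not be applied to soft masks.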

As for interpolating the image: if we knew the resolution at which it will finally be rendered, we could scale everything bigger than that down to it and then proceed. But we don't, and the PDF spec says that mask and image can have different resolutions, even different aspect ratios, but get squeezed into the same target rectangle. We need to emulate that, and that's why we scale up to the larger one.
So if we did not interpolate the image when scaling it up to the mask, we could introduce a crappy, jagged image, which might actually be written 1:1 to the canvas because the mask may already be at the resolution of the final render. Usually, rendering a non-masked image to the canvas does the proper interpolation, using whatever is set in the canvas graphics, but we are circumventing that in applyMask() with large masks.

PDFBOX-2750 doesn't use interpolation, so scaleImage doesn't interpolate. And it's just a color soup background (you can't really see it in PDFDebugger because the mask is always applied, but if one disables the alpha composition in applyMask() it shows the soup). So differences are probably almost invisible. With disabled alpha composition and forced interpolation, jagged vs. not so jagged diagonal lines are clearly visible in the soup; they are simply not visible with the stencil mask:
[Images: PDF2750-bgsoup_jagged, PDF2750-bgsoup_soft]
Still, this shows that image interpolation is good when it is requested by the image.

Even worse, we sometimes violate that overlay principle:
If there is an image of 100 x 100 and the mask is 200 x 10, we will scale the mask to 100 x 100, when we actually must scale everything to 200 x 100 so as not to deliberately reduce the resolution of image or mask.
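The target size described above is simply the per-axis maximum of image and mask. A minimal sketch, with hypothetical class and method names:

```java
import java.awt.Dimension;

/** Hypothetical helper for illustration; not part of the patch. */
final class MaskTargetSize {

    /**
     * Returns the size both image and mask should be scaled to so that
     * neither loses resolution in either direction.
     */
    static Dimension targetSize(int imageW, int imageH, int maskW, int maskH) {
        return new Dimension(Math.max(imageW, maskW), Math.max(imageH, maskH));
    }
}
```

For the example above, targetSize(100, 100, 200, 10) yields 200 x 100, so both the image and the mask would have to be rescaled.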

Seeing that interpolation is not that slow once we switch to bilinear, I could live with it. For very large scaling operations, even worse than 10*10, maybe one should switch to nearest neighbor. I was looking for faster image scaling in Java and found AffineTransformOp, which is handled natively by Java; I had thought that the Graphics instances already use it. It turns out that it is still faster, except for nearest neighbor (results for Alfa page 1):
Graphics.drawImage():
Bicubic / Quality: 5.83s
Bicubic / Speed: 5.83s
Bilinear / Speed: 1.74s
Nearest Neighbor: 0.11s
AffineTransformOp.filter() (no difference in speed between Raster and BufferedImage, even if src = RGB and dst = ARGB):
Bicubic: 4.79s
Bilinear: 1.5s
Nearest Neighbor: 0.47s
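For reference, the two scaling paths compared above can be written roughly as below. This is a sketch with hypothetical names, not the code of the patch or of the benchmark; it assumes the source has one of the standard BufferedImage types (not TYPE_CUSTOM) and that the destination uses the same type.

```java
import java.awt.Graphics2D;
import java.awt.RenderingHints;
import java.awt.geom.AffineTransform;
import java.awt.image.AffineTransformOp;
import java.awt.image.BufferedImage;

/** Hypothetical helper for illustration; not the PDFBox implementation. */
final class ScaleSketch {

    /** Scales via AffineTransformOp; type is one of AffineTransformOp.TYPE_*. */
    static BufferedImage scaleWithOp(BufferedImage src, int w, int h, int type) {
        AffineTransform at = AffineTransform.getScaleInstance(
                (double) w / src.getWidth(), (double) h / src.getHeight());
        AffineTransformOp op = new AffineTransformOp(at, type);
        BufferedImage dst = new BufferedImage(w, h, src.getType());
        return op.filter(src, dst);
    }

    /** Scales via Graphics2D; hint is one of RenderingHints.VALUE_INTERPOLATION_*. */
    static BufferedImage scaleWithGraphics(BufferedImage src, int w, int h, Object hint) {
        BufferedImage dst = new BufferedImage(w, h, src.getType());
        Graphics2D g = dst.createGraphics();
        try {
            g.setRenderingHint(RenderingHints.KEY_INTERPOLATION, hint);
            g.drawImage(src, 0, 0, w, h, null);
        } finally {
            g.dispose(); // always release the graphics context
        }
        return dst;
    }
}
```

Timing both methods with System.nanoTime() around the call, for the same source and target size, is enough to reproduce the kind of comparison shown above.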

I added some code for both the Op and the Max(mask, image).

@gunnar-ifp
Author

gunnar-ifp commented Jun 25, 2021

I did some further testing, because I felt that the large-scale test requiring the scaling to be bigger than factor 9 x 9 is pointless and that only the target resolution matters, i.e. no matter how small the source image is, if the target is 10000x10000, it's going to be slow. And I was right. Here are some tests, scaling up from 10, 50, 100, 500, 1000, 2000, 5000 and 10000 square to 10000 square:
afto = AffineTransformOp.filter(), draw = Graphics2D.drawImage(). Times in milliseconds.

GRAY -> GRAY
~~~~~~~~~~~~
      | Bilinear                 | Bicubic                  | Nearest Neighbor
00010 | afto =  686, draw = 2679 | afto = 2526, draw = 8882 | afto = 227, draw = 111
00050 | afto =  494, draw = 2679 | afto = 1658, draw = 8873 | afto = 180, draw = 110
00100 | afto =  477, draw = 2678 | afto = 1537, draw = 8870 | afto = 181, draw = 111
00500 | afto =  461, draw = 2678 | afto = 1451, draw = 8877 | afto = 179, draw = 112
01000 | afto =  459, draw = 2680 | afto = 1434, draw = 8877 | afto = 181, draw = 110
02000 | afto =  458, draw = 2677 | afto = 1424, draw = 8876 | afto = 182, draw = 111
05000 | afto =  461, draw = 2678 | afto = 1430, draw = 8904 | afto = 182, draw = 115
10000 | afto =  461, draw =   23 | afto = 1467, draw =   23 | afto = 182, draw =  25


RGB -> ARGB
~~~~~~~~~~~
      | Bilinear                 | Bicubic                  | Nearest Neighbor
00010 | afto = 2133, draw = 2290 | afto = 7576, draw = 7674 | afto = 587, draw = 147
00050 | afto = 1994, draw = 2302 | afto = 6573, draw = 7757 | afto = 530, draw = 146
00100 | afto = 1985, draw = 2305 | afto = 6445, draw = 7680 | afto = 530, draw = 146
00500 | afto = 1979, draw = 2417 | afto = 6327, draw = 7673 | afto = 533, draw = 145
01000 | afto = 1979, draw = 2338 | afto = 6330, draw = 7670 | afto = 533, draw = 147
02000 | afto = 1987, draw = 2297 | afto = 6312, draw = 7688 | afto = 538, draw = 146
05000 | afto = 2030, draw = 2294 | afto = 6477, draw = 7691 | afto = 577, draw = 150
10000 | afto = 2144, draw =  136 | afto = 6502, draw =  141 | afto = 720, draw = 141


Bilinear
~~~~~~~~~~~~
      | ARGB -> ARGB             | GRAY -> ARGB             | INDEXED -> ARGB
00010 | afto = 2130, draw = 2284 | afto = 2812, draw = 2758 | afto = 2133, draw = 2625
00050 | afto = 1984, draw = 2281 | afto = 2573, draw = 2759 | afto = 1988, draw = 2561
00100 | afto = 1972, draw = 2282 | afto = 2558, draw = 2758 | afto = 1977, draw = 2560
00500 | afto = 1974, draw = 2282 | afto = 2436, draw = 2756 | afto = 1977, draw = 2560
01000 | afto = 1975, draw = 2282 | afto = 2452, draw = 2756 | afto = 1986, draw = 2563
02000 | afto = 1976, draw = 2282 | afto = 2527, draw = 2756 | afto = 1997, draw = 2568
05000 | afto = 1986, draw = 2288 | afto = 3096, draw = 2760 | afto = 2092, draw = 2573
10000 | afto = 1963, draw =  152 | afto = 5072, draw =  149 | afto = 2229, draw =  112
      | BINARY -> BINARY         | INDEXED -> INDEXED
00010 | afto = 2741, draw = 4463 | afto = 2943, draw = 3388
00050 | afto = 2504, draw = 4238 | afto = 2634, draw = 3388
00100 | afto = 2349, draw = 4235 | afto = 2585, draw = 3393
00500 | afto = 2365, draw = 4236 | afto = 2574, draw = 3404
01000 | afto = 2383, draw = 4247 | afto = 2570, draw = 3404
02000 | afto = 2393, draw = 4274 | afto = 2795, draw = 3417
05000 | afto = 2582, draw = 4507 | afto = 2638, draw = 3384
10000 | afto = 3196, draw =  460 | afto = 2809, draw =   22

What we can see is:

  • Bicubic is ~3 times slower than bilinear.
  • Graphics is always faster for nearest neighbor and if source size = destination size.
  • Using a Graphics to scale up anything is slow if the target is not ARGB or RGB, e.g. scaling up the mask to its final size.
  • Using indexed is also very slow.

With the commits of last night, which use AffineTransformOp if src < dest and interpolation is enabled, scaleImage should be a lot faster for mask scaling and faster overall.
We can remove the scale factor test and simply test on destination pixel area to establish thresholds for bicubic, bilinear and maybe nearest neighbor.
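A destination-area test of this kind could be sketched as follows. The class name, method name and both threshold values are made up for illustration and are not the values used in the commits; they would need tuning against measurements like the ones above.

```java
import java.awt.image.AffineTransformOp;

/** Hypothetical helper for illustration; thresholds are placeholders. */
final class InterpolationChoice {

    // Assumed limits in destination pixels; not taken from the patch.
    static final long BICUBIC_MAX_PIXELS = 4_000_000L;
    static final long BILINEAR_MAX_PIXELS = 200_000_000L;

    /** Picks an AffineTransformOp interpolation type from the target size only. */
    static int choose(int dstWidth, int dstHeight) {
        long area = (long) dstWidth * dstHeight; // long avoids int overflow
        if (area <= BICUBIC_MAX_PIXELS) {
            return AffineTransformOp.TYPE_BICUBIC;
        }
        if (area <= BILINEAR_MAX_PIXELS) {
            return AffineTransformOp.TYPE_BILINEAR;
        }
        return AffineTransformOp.TYPE_NEAREST_NEIGHBOR;
    }
}
```

With these placeholder limits, a 10000 x 10000 target falls into the bilinear range, and only considerably larger targets would drop to nearest neighbor.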

- Ensure that mask is 8 bit gray when doing alpha composition.
- Scaling speed depends on destination size only, so largeScale switch will only depend on that, too.
@THausherr
Contributor

THausherr commented Jun 27, 2021

Thanks, this has been committed in https://issues.apache.org/jira/browse/PDFBOX-5229

@asfgit asfgit closed this in 3b5a080 Jun 27, 2021
@gunnar-ifp gunnar-ifp deleted the PDImageXObject branch November 24, 2021 16:22