Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TurboJPEG has problems tiling images from PDFBox #482

Closed
cmhdave opened this issue May 7, 2021 · 9 comments
Closed

TurboJPEG has problems tiling images from PDFBox #482

cmhdave opened this issue May 7, 2021 · 9 comments
Labels
Milestone

Comments

@cmhdave
Copy link
Contributor

cmhdave commented May 7, 2021

We are seeing issues with TurboJPEG and tiling PDFs. It is really strange and arbitrary. What I can say is that it does not happen when you remove TurboJPEG from the processing chain. I am able to reproduce this every time with the sample.pdf file that I've created and attached.

What you need to do is start Cantaloupe 5.0 with TurboJPEG available in the library path. We do this like so:

java -cp cantaloupe.jar -Djava.library.path=/usr/local/opt/jpeg-turbo/lib -Dcantaloupe.config=cantaloupe.properties edu.illinois.library.cantaloupe.StandaloneEntry

Then you open a tile like below (which happens to be a tile that was being requested from our manifest via Mirador)

http://localhost:8182/iiif/2/sample.pdf/512,512,512,512/512,/0/default.jpg

The resulting image starts at x=0, y=0 instead of x=512, y=512.

Now try the same URL with a PNG extension and you will get the correct tile.

http://localhost:8182/iiif/2/sample.pdf/512,512,512,512/512,/0/default.png

Here's where it gets strange. If you load the PDF as a JPG with a slightly different size it works correctly (notice 513, rather than 512,

http://localhost:8182/iiif/2/sample.pdf/512,512,512,512/513,/0/default.jpg

Another strange thing is that I can debug this in IntelliJ into TurboJPEGImageWriter in write(BufferedImage image, OutputStream os) where when you call tjc.setSourceImage(image, 0, 0, 0, 0) the image IS THE CORRECT TILE.

image

image

Yet this is what gets rendered in the browser.

image

I have all very basic controls turned on and no caching enabled (to rule out an caching issue)

========

sample.pdf

@adolski
Copy link
Contributor

adolski commented May 14, 2021

@cmhdave Could you try this in 5.0.2?

@cmhdave
Copy link
Contributor Author

cmhdave commented May 14, 2021

Sure enough, works like a charm with 5.0.2. I will mark as closed. Thanks @adolski !

@cmhdave cmhdave closed this as completed May 14, 2021
@cmhdave cmhdave reopened this Dec 3, 2021
@cmhdave
Copy link
Contributor Author

cmhdave commented Dec 3, 2021

We are seeing this exact same issue once again with v5.0.4. I can remove TurboJPEG and it works fine, add it back in and the behavior we see above returns.

adolski pushed a commit that referenced this issue Dec 22, 2021
…482)

This works around an apparent bug in TJCompressor whereby it does not deal correctly with images produced by BufferedImage.getSubimage()
@adolski adolski added the bug label Dec 22, 2021
@adolski adolski added this to the 5.0.6 milestone Dec 22, 2021
@adolski
Copy link
Contributor

adolski commented Dec 22, 2021

In the past, the code in Java2DUtil.cropPhysically() was used to crop BufferedImages, but at some point I switched to Java2DUtil.cropVirtually() because it is so much more efficient. But it seems that TJCompressor doesn't work correctly with BufferedImages that have been cropped this way. The above commit is a workaround that redraws a virtually-cropped image into a new one that is TJCompressor-safe.

@adolski adolski closed this as completed Dec 22, 2021
@hrvoj3e
Copy link

hrvoj3e commented Apr 20, 2022

I tried sample.pdf with this commit fff5425 and it mostly works but not for images with x=0 on y axis.

E.g.
0,1024,512,512: /iiif/2/https:%2F%2Fgithub.com%2Fcantaloupe-project%2Fcantaloupe%2Ffiles%2F6443024%2Fsample.pdf/0,1024,512,512/512,/0/default.jpg?cache=nocache

will return the same image as

0,0,512,512: /iiif/2/https:%2F%2Fgithub.com%2Fcantaloupe-project%2Fcantaloupe%2Ffiles%2F6443024%2Fsample.pdf/0,0,512,512/512,/0/default.jpg?cache=nocache

@adolski could you check please?

@DiegoPino
Copy link

From my limited understanding the issue is that the patch has a typo (the same condition twice which escapes the Y offset when X is not offset):

  if (image.getRaster().getSampleModelTranslateX() < 0 ||
                image.getRaster().getSampleModelTranslateX() < 0) {

Should be

  if (image.getRaster().getSampleModelTranslateX() < 0 ||
                image.getRaster().getSampleModelTranslateY() < 0) {

I will make a fork and test locally during the weekend if anyone is still observing this. removing the TurboJPEG library is still a workaround but because it is quite deep into the code, manual strategy needs to be selected and an ugly warning appears.

@hrvoj3e
Copy link

hrvoj3e commented Aug 14, 2023

I am so glad that this issue got your attention.
Please let us know when you have something to test.

@DiegoPino
Copy link

@hrvoj3e I'm building a multi arch docker container right now, if all goes well will share the tag so you can test (or use in production). Will still make a pull request against develop (again/if the fix is good enough) but who will merge this/if/someone will merge at all is an unknown to me.

@DiegoPino
Copy link

See #593 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

4 participants