JPX rendering issues #5649

Closed
timvandermeij opened this issue Jan 14, 2015 · 13 comments

Comments

@timvandermeij
Contributor

While testing various .jp2 files, I have come across numerous images that are not rendered correctly by PDF.js, mainly because of unimplemented header types. I'm listing the broken PDFs along with the source of their images below.

First file

PDF
https://pdf.yt/d/fSMs0rOfBIjacwHv

Source
Images taken from https://code.google.com/p/openjpeg/source/browse/data/input/conformance/?r=2833, files subsampling_1.jp2 and subsampling_2.jp2

License
See https://code.google.com/p/openjpeg/source/browse/data/input/conformance/COPYRIGHT?r=2833. I don't think it's permissive enough to include them in the repository.

Console output

"Warning: Unsupported header type 1970628964 (uuid)"
"Warning: Trying to recover from JPX Error: Out of packets"
"Warning: Unsupported header type 1970628964 (uuid)"
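
The number in each "Unsupported header type" warning is the four-byte box type read as a big-endian 32-bit integer, which is why the ASCII form is printed next to it. A minimal Python sketch of that mapping follows; the helper name box_type_name is made up for illustration and is not part of PDF.js, and the sample values are copied from the warnings quoted in this issue.

```python
import struct


def box_type_name(value: int) -> str:
    """Convert a 32-bit box type, as printed in the warnings, to its four characters."""
    # Pack as big-endian unsigned 32-bit, then read the four bytes as text.
    return struct.pack(">I", value).decode("latin-1")


# Values taken from the console output quoted in this issue.
for code in (1970628964, 2020437024, 1667523942, 1885564018, 1668112752, 1969843814):
    print(code, repr(box_type_name(code)))
```

The trailing space in 'xml ' is part of the box type, which explains the odd-looking "(xml )" in the log.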

Second file

PDF
https://pdf.yt/d/Zd4e1zB9oGGPBO85

Source
Images taken from https://code.google.com/p/openjpeg/source/browse/data/input/conformance/?r=2833, files file1.jp2 to file9.jp2

License
See https://code.google.com/p/openjpeg/source/browse/data/input/conformance/COPYRIGHT?r=2833. I don't think it's permissive enough to include them in the repository.

Console output

"Warning: Unsupported header type 2020437024 (xml )"
"Warning: Unknown colorspace 21"
"Warning: Unknown colorspace 20"
"Warning: Trying to recover from JPX Error: Out of packets"
"Warning: Unsupported header type 1667523942 (cdef)"
"Warning: Unsupported header type 2020437024 (xml )"
"Warning: Unsupported header type 1885564018 (pclr)"
"Warning: Unsupported header type 1668112752 (cmap)"
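
The "Unknown colorspace" numbers look like EnumCS values from the Colour Specification (colr) box. The sketch below decodes a colr payload under the assumption that the layout is METH (1 byte), PREC (1 byte), APPROX (1 byte), followed by a 4-byte big-endian EnumCS when METH is 1. The name table is a partial list of enumerated colourspaces written from memory and should be checked against the standard; describe_colr is a hypothetical helper, not a PDF.js function.

```python
import struct

# Partial, assumed table of enumerated colourspaces (JP2 plus some JPX values).
ENUM_CS_NAMES = {
    12: "CMYK",
    16: "sRGB",
    17: "greyscale",
    18: "sYCC",
    19: "CIEJab",
    20: "e-sRGB",
    21: "ROMM-RGB",
}


def describe_colr(payload: bytes) -> str:
    """Describe the payload of a colr box (the bytes after the 8-byte box header)."""
    if len(payload) < 3:
        raise ValueError("colr payload is too short")
    method = payload[0]  # METH: 1 = enumerated colourspace, 2 = restricted ICC profile
    if method == 1:
        if len(payload) < 7:
            raise ValueError("enumerated colr payload is too short")
        (enum_cs,) = struct.unpack_from(">I", payload, 3)
        name = ENUM_CS_NAMES.get(enum_cs, "not in this table")
        return f"enumerated colourspace {enum_cs} ({name})"
    if method == 2:
        return "restricted ICC profile"
    return f"other method {method}"


# Synthetic payload: METH=1, PREC=0, APPROX=0, EnumCS=21.
print(describe_colr(bytes([1, 0, 0]) + struct.pack(">I", 21)))
```

Under these assumptions, colourspaces 20 and 21 from the log above would be e-sRGB and ROMM-RGB, both of which come from the JPX extensions rather than plain JP2.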

Third file

PDF
https://pdf.yt/d/heg0slSRtdEA-vqi

Source
Image taken from http://opf-labs.org/format-corpus/jp2k-formats/, file balloon.jp2

License
From http://opf-labs.org/format-corpus/jp2k-formats/readme.md: created from https://commons.wikimedia.org/wiki/File:1783_balloonj.jpg, which is in public domain, so we can include this file in the repository.

Console output

"Warning: Unsupported header type 1969843814 (uinf)"
"Warning: Unsupported header type 2020437024 (xml )"
"Warning: Trying to recover from JPX Error: Out of packets"
"Warning: Unable to decode image: Error: JPX Error: Invalid segmentation symbol"
"Warning: Dependent image isn't ready yet"
@fkaelberer
Contributor

I wonder which features of those images are actually supported by the PDF standard. The PDF specification says on page 87:

To promote interoperability, the specifications define a subset of JPX called JPX baseline (of which JP2 is also a subset). The complete details of the baseline set of JPX features are contained in ISO/IEC 15444-2, Information Technology—JPEG 2000 Image Coding System: Extensions.
Data used in PDF image XObjects should be limited to the JPX baseline set of features, except for enumerated color space 19 (CIEJab).

Unfortunately, the JPEG2000 spec isn't freely available to check the baseline feature set.
Some of the images in the first two documents have problems in Sumatra PDF as well, the first one even in Acrobat 8.

By the way, the image corpus of the second file can be found at http://www.gwg.nga.mil/ntb/baseline/software/testfile/Jpeg2000/index.htm, where they also list the JPEG 2000 feature sets of the individual images and a copyright notice.

@jpambrun

I think this is related to #5727, since the behavior is very similar to what I experienced: Boats.pdf and ballon.pdf both contain images with multiple quality layers. Unfortunately, #5727 does not fix this issue.

Everything read after the bug occurs is misaligned, which causes the "Out of packets" error. When reading boat.pdf and ballons.pdf with PR #5727, "JPX Error: Invalid tag tree" is thrown, suggesting that the inclusion tree was not built while decoding the first layer.

@boxerab

boxerab commented Nov 23, 2015

+1 to fixing these issues. Is anyone currently working on improving j2k support in pdf.js?

@timvandermeij
Contributor Author

Not that I'm aware of. Please note that the images I posted might not be supported by the PDF standard, so there might be no need to actually support them.

@boxerab

boxerab commented Nov 23, 2015

Thanks. How would I find out which part of the JPEG 2000 standard is supported by PDF? All of the images in the OpenJPEG repo are JPEG 2000 Part I images.

@timvandermeij
Contributor Author

Hard to say because from #5649 (comment) it appears that the baseline set is not freely available. We might need to check what other PDF readers support.

@boxerab

boxerab commented Nov 23, 2015

Thanks. From the standard doc:

"To promote interoperability, the specifications define a subset of JPX called
JPX baseline (of which JP2 is also a subset)."

So, since all of the files in the OpenJPEG repo are jp2 files (jp2 == jpeg 2000 part I), pdf.js should support them.

Perhaps you can interest some of the OpenJPEG guys in taking a look at the issues.

@brendandahl
Contributor

Another example:
https://bugzilla.mozilla.org/show_bug.cgi?id=1695361
Warning: Unknown colorspace 12

@skierpage

Someone sent me a PDF with a jpx image (according to pdfimages -list) that renders repeated and pixelated in Thunderbird and Firefox 97.0a1. It also appears the same in the Gwenview and Okular viewers in Fedora 35, and doesn't appear at all in LibreOffice Writer and Inkscape (using either Poppler/Cairo or internal import).

In the Firefox browser console, I see

PDF ca5ebce36b1609665d1737484e8a8020 [1.4 macOS Version 12.0.1 (Build 21A559) Quartz PDFContext / Pages] (PDF.js: 2.12.248)    viewer.js:1508:13
Warning: Unsupported header type 1667523942 (cdef). pdf.worker.js:1098:13

suggesting the user made the document in the macOS Pages app.
The warning is the same as one of the warnings in the second file in this bug. Here's what it looks like in Thunderbird:
[Image: garbled_jpx_JPEG2000]
Kinda cool 😉, but not what it looks like if I extract the image with pdfimages -jp2 and convert with ImageMagick. I can't supply the original PDF but if you need the extracted image let me know.

I'm not finding similar bug reports for Okular and Gwenview. Does the similar garbled appearance mean the problem is in a shared library?

@mrtcode

mrtcode commented Aug 19, 2022

Another JPX that fails to be rendered on PDF.js 2.16.75. Works well on Preview.app and all other tested PDF viewers.

wu-89008118317-11-1660897219.pdf.

[Log] Warning: Dependent image isn't ready yet (pdf.js, line 456)
[Log] Warning: Unsupported header type 1970628964 (uuid). (pdf.worker.js, line 1150)
[Log] Warning: JPX: Unsupported COD options (terminationOnEachCodingPass, verticallyStripe, predictableTermination). (pdf.worker.js, line 1150)
[Log] Warning: Unable to decode image "img_p0_1": "JpxError: JPX error: JPX error: Out of packets". (pdf.worker.js, line 1150)
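
The three option names in the COD warning appear to correspond to bits of the code-block style byte in the COD marker segment. A small sketch of decoding that byte is shown below, assuming the bit layout of JPEG 2000 Part 1 (bit 0 bypass, bit 1 reset, bit 2 termination on each coding pass, bit 3 vertically causal context, bit 4 predictable termination, bit 5 segmentation symbols). The function name and the descriptive strings are illustrative only and are not taken from PDF.js.

```python
from typing import List

# (bit mask, description) pairs for the code-block style byte; layout assumed from Part 1.
CODE_BLOCK_STYLE_FLAGS = (
    (0x01, "selective arithmetic coding bypass"),
    (0x02, "reset context probabilities"),
    (0x04, "termination on each coding pass"),
    (0x08, "vertically causal context"),
    (0x10, "predictable termination"),
    (0x20, "segmentation symbols"),
)


def decode_code_block_style(style: int) -> List[str]:
    """Return the descriptions of all options that are set in the style byte."""
    return [name for mask, name in CODE_BLOCK_STYLE_FLAGS if style & mask]


# 0x1C sets bits 2, 3 and 4, the combination that would match the warning above.
for option in decode_code_block_style(0x1C):
    print(option)
```

If this reading is right, a style byte of 0x1C would produce exactly the three options listed in the warning, and bit 5 is the option behind the "Invalid segmentation symbol" error seen earlier in this thread.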

@Snuffleupagus
Collaborator

Unfortunately none of the PDF links in #5649 (comment) work now, and given #5649 (comment) it's not clear if those JPEG2000 images are actually supported when used in PDF files.

@timvandermeij Given the points above, and the age of this issue, should we perhaps close this one now?
Please note that we have a number of, maybe slightly more actionable, JPEG 2000 issues in the image-jpx category.

@andylizi

andylizi commented Jun 19, 2023

Unfortunately none of the PDF links in #5649 (comment) work now

Here's an easy way to reproduce the issue:

  1. Make a PNG image with an alpha channel.
    test.png
  2. Download OpenJPEG.
  3. Run opj_compress -i test.png -o test.jp2. Because the input file has alpha, OpenJPEG will generate a cdef header.
  4. Open Adobe Acrobat, create a PDF from the JP2 file.
  5. The resultant PDF renders correctly in Acrobat, SumatraPDF and Chrome (PDFium). But not in pdf.js:

[Image: result]

In this case, the transparent pixels are just shown as black, but I've seen cases where the image is missing entirely.

Files involved: samples.zip
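
To check whether a JP2 file produced this way really declares an alpha channel, the cdef payload can be decoded directly. The sketch below assumes the Channel Definition box layout from JPEG 2000 Part 1: a 2-byte entry count followed by that many (Cn, Typ, Asoc) triples of 2-byte big-endian integers, with Typ 1 meaning opacity and Typ 2 premultiplied opacity. The helper names parse_cdef and has_alpha are hypothetical, and the sample payload is synthetic rather than extracted from samples.zip.

```python
import struct
from typing import List, Tuple

CHANNEL_TYPES = {0: "colour", 1: "opacity", 2: "premultiplied opacity", 65535: "unspecified"}


def parse_cdef(payload: bytes) -> List[Tuple[int, str, int]]:
    """Parse a cdef payload into (channel index, channel type, association) tuples."""
    if len(payload) < 2:
        raise ValueError("cdef payload is too short")
    (count,) = struct.unpack_from(">H", payload, 0)
    if len(payload) < 2 + 6 * count:
        raise ValueError("truncated cdef payload")
    entries = []
    for i in range(count):
        channel, typ, assoc = struct.unpack_from(">HHH", payload, 2 + 6 * i)
        entries.append((channel, CHANNEL_TYPES.get(typ, f"reserved ({typ})"), assoc))
    return entries


def has_alpha(payload: bytes) -> bool:
    """True if any channel is declared as (premultiplied) opacity."""
    return any(kind.endswith("opacity") for _, kind, _ in parse_cdef(payload))


# Synthetic RGBA description: three colour channels plus one opacity channel.
rgba = struct.pack(">H", 4) + b"".join(
    struct.pack(">HHH", cn, typ, assoc)
    for cn, typ, assoc in [(0, 0, 1), (1, 0, 2), (2, 0, 3), (3, 1, 0)]
)
print(parse_cdef(rgba))
print(has_alpha(rgba))
```

A decoder that skips the cdef box has no way to tell that the fourth component is opacity rather than colour, which would be consistent with the transparent pixels coming out black.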


given #5649 (comment) it's not clear if those JPEG2000 images are actually supported when used in PDF files.

Regarding whether the cdef header is actually included by the PDF standard, I've done some research. According to the format description of JPEG 2000 Part 1 summarized by the Library of Congress: (emphasis mine)

Full name: ISO/IEC 15444-1:2016. Information technology -- JPEG 2000 image coding system -- Part 1: Core coding system, Annex I: JP2 file format syntax (formal name)

Color maintenance: Rich support, further extended in JPX. In JP2_FF, the color space of the decompressed image data is indicated in the Color Specification box inside the JP2 Header box…… For palettized images, ……the Component Mapping box defines which codestream components map to which palette components or bypass the palette…… Finally, the Channel Definition box maps codestream components (if unpalettized) or channels to color components, allowing them to be permuted if desired and enabling support for alpha channels (opacity) as well as color channels.

History: ISO/IEC 15444-2:2004. Information technology -- JPEG 2000 image coding system: Extensions. Defines a set of lossless (bit-preserving) and lossy compression methods for coding continuous-tone, bi-level, grey-scale, colour digital still images, or multi-component images;

The "Channel Definition box" mentioned in the document is the cdef header, according to OpenJPEG source code:

#define     JP2_JP   0x6a502020    /**< JPEG 2000 signature box */
#define     JP2_FTYP 0x66747970    /**< File type box */
#define     JP2_JP2H 0x6a703268    /**< JP2 header box (super-box) */
#define     JP2_IHDR 0x69686472    /**< Image header box */
#define     JP2_COLR 0x636f6c72    /**< Colour specification box */
/* ... */
#define     JP2_CMAP 0x636d6170    /**< Component Mapping box */
#define     JP2_CDEF 0x63646566    /**< Channel Definition box */
/* ... */

Also, the document mentions that ISO/IEC 15444-2 supports multi-component images. And in the PDF standard excerpted above:

To promote interoperability, the specifications define a subset of JPX called JPX baseline (of which JP2 is also a subset). The complete details of the baseline set of JPX features are contained in ISO/IEC 15444-2, Information Technology—JPEG 2000 Image Coding System: Extensions.

Considering all of this secondary evidence together, I do feel it makes a strong case for the standardized inclusion of at least some of the unimplemented features reported in this issue.
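
For anyone who wants to check which of these boxes a given file actually contains, here is a rough Python sketch that walks the box structure. It assumes the usual JP2 box header (a 4-byte big-endian length and a 4-byte type, with length 1 meaning an 8-byte extended length follows and length 0 meaning the box runs to the end of the data) and only descends into the jp2h super-box. The function names and the synthetic sample are made up for illustration; this is not how PDF.js or OpenJPEG parse the file.

```python
import struct
from typing import Iterator, Optional, Tuple

SUPERBOXES = {b"jp2h"}  # boxes whose payload is itself a sequence of boxes


def iter_boxes(data: bytes, start: int = 0, end: Optional[int] = None,
               depth: int = 0) -> Iterator[Tuple[int, str, bytes]]:
    """Yield (nesting depth, box type, payload) for each box in data[start:end]."""
    end = len(data) if end is None else end
    pos = start
    while pos + 8 <= end:
        length, box_type = struct.unpack_from(">I4s", data, pos)
        header = 8
        if length == 1:  # extended 64-bit length follows the type
            if pos + 16 > end:
                raise ValueError(f"truncated extended header at offset {pos}")
            (length,) = struct.unpack_from(">Q", data, pos + 8)
            header = 16
        elif length == 0:  # box extends to the end of the enclosing data
            length = end - pos
        if length < header or pos + length > end:
            raise ValueError(f"malformed box at offset {pos}")
        payload_start, payload_end = pos + header, pos + length
        yield depth, box_type.decode("latin-1"), data[payload_start:payload_end]
        if box_type in SUPERBOXES:
            yield from iter_boxes(data, payload_start, payload_end, depth + 1)
        pos = payload_end


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a box with a plain 32-bit length field (used for the synthetic sample)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic data: signature box, then a jp2h super-box holding dummy ihdr and cdef boxes.
sample = make_box(b"jP  ", b"\r\n\x87\n") + make_box(
    b"jp2h", make_box(b"ihdr", bytes(14)) + make_box(b"cdef", bytes(2))
)
for depth, name, payload in iter_boxes(sample):
    print("  " * depth + repr(name), len(payload))
```

Running the same loop over the bytes of a real .jp2 file, for example one extracted with pdfimages -jp2 as mentioned earlier in this thread, should list any cdef, pclr, cmap, uuid or xml boxes it carries.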

@timvandermeij
Contributor Author

Closing since most files aren't available anymore and the ones that are render fine now, most likely thanks to #17946. Moreover, given that we now delegate JPX parsing to OpenJPEG, we can also be sure that any spec-compliant images will be handled properly.
