Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

imageio-jpeg does not handle IIOMetaData #4

Closed
marcuslinke opened this issue May 30, 2013 · 37 comments
Closed

imageio-jpeg does not handle IIOMetaData #4

marcuslinke opened this issue May 30, 2013 · 37 comments
Assignees

Comments

@marcuslinke
Copy link

I'm currently trying to convert a CMYK jpeg to an RGB one while preserving its meta data. Then i recognized that JPEGImageReader.getImageMetadata(...) method always returns null.

@haraldk
Copy link
Owner

haraldk commented May 30, 2013

Hi Marcus,

You are right, and I'm aware of the issue. :-(

I have a quick fix in the local branch, where the methods are overridden to just call delegate.getXxMetadata() instead of super.getXxMetadata() which always return null.

Unfortunately, the metadata from the delegate isn't necessary correct (ie: it never contains a APP2/ICC_PROFILE), because I filter out certain JPEG segments from the delegate, to prevent it from crashing. Now, this is probably OK for you, as you won't need the original CMYK color profile in the converted RGB image. But it's still an issue that I feel should be resolved, and relying on this "feature" might break things in the future.

I do have a couple of ideas for how to fix the issue, but they are all quite time consuming, as the com.sun.imageio.plugins.jpeg.JPEGMetadata isn't extendable or mutable... So I need to recreate it all from scratch + extend it with the fixes that the new reader brings along. In addition, we probably want support for the EXIF flavor (in contrast to JFIF) that the current metadata just ignores.

A rough guesstimate would be a full weeks worth of work, just for the basics.

So... How badly do you need metadata? ;-)

@marcuslinke
Copy link
Author

Hi Harald,

thanks for explanation. I'm not sure if I need the metadata but hey...maybe my employer will! ;-)

Sadly your advice to call the delegate does not work:

javax.imageio.IIOException: Inconsistent metadata read from stream
at com.sun.imageio.plugins.jpeg.JPEGMetadata.(JPEGMetadata.java:343)
at com.sun.imageio.plugins.jpeg.JPEGImageReader.getImageMetadata(JPEGImageReader.java:998)

However, i'm not very familar with all this image/metadata stuff, but wouldn't it be better to not rely on that delegate with all the filtering stuff needed? What about to implement the fixed logic directly (by copying most of the OpenJDK implementation for example). Is it that what you have in mind? At least this is it what i would try to do if my employer insists this functionality. Maybe thats some kind naive and/or violates some licence terms. WDYT?

Regards

@haraldk
Copy link
Owner

haraldk commented May 30, 2013

Too bad, we're out of luck for a quick fix then... (these exceptions are part of the reason I created the reader in the first place).

I agree, the best thing is not to rely on the delegate, and that is my goal. However, the metadata code for the JPEG format alone is about 4000+ lines, and requires quite extensive knowledge of the JPEG/JFIF format. The OpenJDK code is LGPL licensed, and I can't use it, but I have thought about looking at the code from JAI (jai-imageio), which is BSD licensed and thus more compatible (license-wise). That code should provide compatible metadata.

It's still several days work I'm afraid, just to shoe-horn one of these implementations onto my reader...

@marcuslinke
Copy link
Author

Yeah, i agree. It will be much work to implement it. After consulting my employer we decided to renounce the meta data functionality as we can live without it at the moment. However, could you shortly explain the licence compatibility issue with LGPL code? I did not found any licence terms in your code.

@haraldk
Copy link
Owner

haraldk commented May 31, 2013

The project may be distributed under BSD license, every source file should include the license in the header. Most modules have a license.txt in them as well.

I'm not a lawyer, but from my understanding of the (L)GPL license, is that if I add parts from (L)GPL licensed projects (in contrast to just dynamic linking), my project has to become (L)GPL licensed as well. So, I try to avoid that, just in case.

@marcuslinke
Copy link
Author

Hmmm. So i should have searched for 'license' and not 'licence' :) Thanks for clearing this up!

@marcuslinke
Copy link
Author

While the research regarding the meta data topic i found

https://github.com/cescoffier/Commons-Image-IO

Would it be an option to handle the metadata stuff via Apache commons-imaging?

@haraldk
Copy link
Owner

haraldk commented May 31, 2013

Hmmm.. Re. license: I wasn't even aware there were alternative spellings.. But yes, I use american english. ;-)

The twelvemonkeys library has decent support for reading (not writing at the moment) EXIF, IPTC, and XMP metadata (see the com.twelvemonkeys.imageio.medata package).

Commons Image IO seems to have a nice API, but my guess is, it relies on ImageIO for the metadata, so you'll have the same problem with "Inconsistent metadata". Apache commons-imaging might help. I have no experience with either of these libraries though.

Might be worth giving it a try!

@haraldk
Copy link
Owner

haraldk commented Sep 28, 2013

Hi Marcus!

Good news! :-)
I think I "accidentally" found a way to fix this issue, without having to re-invent all of the metadata-classes, as part of another issue I was working on. Do you have some more sample images that throws the "inconsistent metadata" exception you could add to this issue? Preferably not too large in size, but I'll take what you've got. :-)

This way, I can create a more robust test case/regression test, and get rid of this bug, hopefully.

Thanks for all your input so far!

Regards,

Harald K

@marcuslinke
Copy link
Author

Hi Harald,

sorry for the late answer but i was really busy last days... I will continue the work on the mentioned CMYK -> RGB topic soon. I have a bunch of CMYK test/example images here and probably i will find some proper images for the regression test then. So stay patient with me for 2-3 weeks please. However, have you already committed the fix or do you have it in a local branch only?

@haraldk
Copy link
Owner

haraldk commented Oct 10, 2013

No stress! :-)

I checked in a small part of it, but I have much more comprehensive changes in the local branch. Unfortunately, I got stuck on some bugs in the com.sun...metadata classes when I started more heavy testing, but I might work them out. It's annoying, because if not, I'd have a solution by now... In any case, more test cases can only be good.

Regards,

Harald K

@haraldk
Copy link
Owner

haraldk commented Oct 23, 2013

Hi Marcus,

Update: I now checked in the fix I was working on, please give it a try.

@marcuslinke
Copy link
Author

Hi Harald,

i checked out your latest changes and tested it with the attached image. Unfortunately the known

javax.imageio.IIOException: Inconsistent metadata read from stream

is thrown then.

bb286f0f-3b9aca00-2-luk_cvt_kette_und_gleitschiene_0001a5af_pra_4c_de_de

@marcuslinke
Copy link
Author

...while another image's metadata can be read. But when i try to write metadata back to a file via:

writer.write(null, new IIOImage(image, null, metadata),writer.getDefaultWriteParam());

i get

javax.imageio.IIOException: Metadata components != number of destination bands
at com.sun.imageio.plugins.jpeg.JPEGImageWriter.checkSOFBands(JPEGImageWriter.java:1324)
at com.sun.imageio.plugins.jpeg.JPEGImageWriter.writeOnThread(JPEGImageWriter.java:695)
at com.sun.imageio.plugins.jpeg.JPEGImageWriter.write(JPEGImageWriter.java:340)
at com.twelvemonkeys.imageio.plugins.jpeg.JPEGImageWriter.write(JPEGImageWriter.java:155)

Any idea?

sabinespitz_dhu_pr_02

@haraldk
Copy link
Owner

haraldk commented Oct 24, 2013

Thanks for giving it a try!

Really appreciate the feedback. Close but no cigar, it seems then.. :-(

I'll try to create some test cases from your attached images, and see if I can figure it out. Will report back when I have some progress!

Harald K

@marcuslinke
Copy link
Author

I have to thank you Harald. I'm looking forward for a working solution.. ;) Having some other images here for both cases but they are rather big. But if you need more test files i will post it here...

@haraldk
Copy link
Owner

haraldk commented Oct 24, 2013

I'm sure we'll get a working solution one day... ;-)

I'm not able to reproduce the "inconsistent metadata" issue for your first image, so I'm not sure how we should proceed there. I've tested both Apple's Java 1.6 and Oracle's 7 on OS X (Mountain Lion and Mavericks), and no issues. What Java version/vendor do you use, and what platform?

If you have some more images, especially for the "inconsistent metadata" issue, maybe you can email them to me directly (or share a zip via dropbox or similar)?

For the second issue I might have to do some work on the JPEGImageWriter as well, to handle writing of the metadata. The problem is that the image is converted from CMYK to RGB by default, but the metadata still contains SOF, SOS, Adobe DCT, ICC profile etc for CMYK data... I was hoping the writer would just see that these values were no longer correct, thus ignoring them.

It IS possible to read CMYK images as CMYK (without conversion), you could try that as a workaround (using ImageReader.getImageTypes and ImageReadParam.setDestination/setDestinationType).

Harald K

@marcuslinke
Copy link
Author

The first image fails with latest 1.6.0_51 from Apple under Mountain Lion here. Don't know why this is not reproducable. Could you post your test code to see if there are any differences regarding our test conditions?

Beside this here are a few CMYK test images: https://www.dropbox.com/sh/avsx712khejcbyh/PwYm3cLDiK

@haraldk
Copy link
Owner

haraldk commented Oct 24, 2013

Great,

Thanks I've downloaded them.

I'm for some reason still using 1.6.0_33, could be some changes there... Downloading the latest 1.6.0_65 now, hoping it will expose the problem. Will try it out tonight. :-)

Harald K

On 24. okt. 2013, at 13:34, marcuslinke notifications@github.com wrote:

The first image fails with latest 1.6.0_51 from Apple under Mountain Lion here. Don't know why this is not reproducable. Could you post your test code to see if there are any differences regarding our test conditions?

Beside this here are a few CMYK test images: https://www.dropbox.com/sh/avsx712khejcbyh/PwYm3cLDiK


Reply to this email directly or view it on GitHub.

@marcuslinke
Copy link
Author

Hi Harald,

just working on the CMYK -> RGB conversion topic and one question raised up. How could i detect that the source JPEG was CMYK encoded? Is there any possibility to check this?

Regards

@haraldk
Copy link
Owner

haraldk commented Oct 25, 2013

Hi Marcus,

Good question! ;-)

The idea is that the metadata should be able to tell you this. Unfortunately, the algorithm (heuristic) for detecting it is somewhat complicated. The basics are described here http://docs.oracle.com/javase/6/docs/api/javax/imageio/metadata/doc-files/jpeg_metadata.html#color, but not all JPEGs are conforming to this (the "inconsistent metadata" issue, others are just malformed or contain garbage data that confuses the IJG library), further complicating matters.

Normally, the source color space can be derived from ImageReader.getRawImageType. However, there's no Java color space matching YCbCr or YCCK, and the documentation (same link as above) says we should return null in these cases.

I suggest you have a look at JPEGImageReader.getSourceCSType, which is what the reader itself uses to determine source color space. Maybe it could make sense to expose this functionality (or something similar) as a utility method?

Regards,

Harald K

On 25. okt. 2013, at 10:08, marcuslinke notifications@github.com wrote:

Hi Harald,

just working on the CMYK -> RGB conversion topic and one question raised up. How could i detect that the source JPEG was CMYK encoded? Is there any possibility to check this?

Regards


Reply to this email directly or view it on GitHub.

@marcuslinke
Copy link
Author

Thanks for the detailed explanation. What a mess... things get even more complicated than i thought ;) So it would be really cool to have that functionality as a utility method. For me at the moment it seems enough to detect whether the source image is RGB and otherwise load and save as such. But maybe requirements changes sometimes...

Regards

@marcuslinke
Copy link
Author

Regarding the non-reproducable "javax.imageio.IIOException: Inconsistent metadata read from stream" while processing of the first image: I just found that i used accidentally an old version of TwelveMonkeys lib for the tests. So with the latest 3.0-SNAPSHOT it works as expected. No Exception at all then... Sorry for confusing you!

@haraldk
Copy link
Owner

haraldk commented Oct 25, 2013

All right!

Thanks for reporting back. I'm not spending more time on that issue then. The main focus is to make the writer accept the metadata it receives, and possibly adapt it to the current image data.

BTW: How to you read your images? Using readAll() or reading the image and metadata separately?

Harald K

@marcuslinke
Copy link
Author

Looking at your ImageReader.getRawImageType implementation i see that the method mostly returns null in case the delegate returns null. The only exception is made when CMYK is detected via metadata. Is this the expected behavior?
I mean if metadata says its RGB it returns null nevertheless.

...and back to your question: I read the images using read() method and get the metadata separately.

@haraldk
Copy link
Owner

haraldk commented Oct 25, 2013

Hmmm..

Yes, it's intentional. The delegate will return null if the image is encoded in YCbCr (which does not have a corresponding Java ColorSpace), OR if it's in a color space the delegate does not support (like CMYK). So my idea, is to keep this behavior, with the exception that the reader now supports CMYK and YCCK. There's no color space for YCCK, so I return null (analogous to YCbCr). I have a color space for CMYK, so I can return that. Makes sense? :-)

Ok, thanks. I was just considering if it would make sense to make sure the image/metadata from readAll would correspond to each other. But it won't matter then. The only thing that makes sense I think, is leaving it to the Writer to figure it out. Any metadata that doesn't match the input image, should be updated to reflect the current image/write param (and generate warnings, not exceptions). Does that make sense?

Harald K

On 25. okt. 2013, at 14:04, marcuslinke notifications@github.com wrote:

Looking at your ImageReader.getRawImageType implementation i see that the method mostly returns null in case the delegate returns null. The only exception is made when CMYK is detected via metadata. Is this the expected behavior?

...and back to your question: I read the images using read() method and get the metadata separately.


Reply to this email directly or view it on GitHub.

@marcuslinke
Copy link
Author

Sounds good! However I checked approx. 1800 JPEGs with the ImageReader.getRawImageType method and in 1700 cases the method returns null. This seems strange to me. The test images are uploaded by our customers over the last 6 months. So RGB JPEGs are rare cases?

@haraldk
Copy link
Owner

haraldk commented Oct 25, 2013

Yes,

RGB JPEGs are rather rare (If you find one, please send it over :-) ). I'd guess 90+ % of all JPEGs are YCbCr encoded (note that JFIF only supports YCbCr and Gray). The conversion to/from RGB is mostly perfect and subsampling in YCbCr space gives a lot better visual results.

Harald K

On 25. okt. 2013, at 14:45, marcuslinke notifications@github.com wrote:

Sounds good! However I checked approx. 1800 JPEGs with the ImageReader.getRawImageType method and in 1700 cases the method returns null. This seems strange to me. The test images are uploaded by our customers over the last 6 month. So RGB JPEGs are rare cases?


Reply to this email directly or view it on GitHub.

@marcuslinke
Copy link
Author

OK. Thanks. So i'm fucked ATM... ;-) Need to think about it. The only solution for me seems to extract ImageReader.getSourceCSType as utility method... However, have a nice weekend!

@haraldk
Copy link
Owner

haraldk commented Oct 25, 2013

Hmm..

Just got an idea. If this is your lucky day, have a look at the ColroSpaceType of the Chroma node int the standard metadata format ("javax_imageio_1.0"/IIOMetadataFormatImpl.standardMetadataFormatName). It seems to report the correct color space type (as a string, but anyway). :-)

Didn't think of it before, and it seems almost too easy.. ;-)

Have a nice weekend!

Harald K

On 25. okt. 2013, at 15:18, marcuslinke notifications@github.com wrote:

OK. Thanks. So i'm fucked ATM... ;-) Need to think about it. The only solution for me seems to extract the ImageReader.getSourceCSType as utility method... However have a nice weekend!


Reply to this email directly or view it on GitHub.

@haraldk
Copy link
Owner

haraldk commented Oct 25, 2013

Good news: A quick test shows that the Chroma node is correct most of the time. :-)

It sometimes reports RGB for YCbCr, but I guess that's less of a problem, as the image will be decoded to RGB in both cases.

I also found that getSourceCSType reports wrong in a few cases where the Adobe transform segment is present (but the transform doesn't match the number of color components in the SOF, ie. YCCK but only 3 components). I'll fix that at some point. Chroma node was right in this case (but I think that was coincidence).

@marcuslinke
Copy link
Author

Yeah but parsing Chromanode sometimes throws:

javax.imageio.metadata.IIOInvalidTreeException: Xdensity attribute out of range
at com.sun.imageio.plugins.jpeg.MarkerSegment.getAttributeValue(MarkerSegment.java:141)
at com.sun.imageio.plugins.jpeg.JFIFMarkerSegment.updateFromNativeNode(JFIFMarkerSegment.java:248)
at com.sun.imageio.plugins.jpeg.JFIFMarkerSegment.(JFIFMarkerSegment.java:133)
at com.sun.imageio.plugins.jpeg.JPEGMetadata.setFromNativeTree(JPEGMetadata.java:2184)
at com.sun.imageio.plugins.jpeg.JPEGMetadata.setFromTree(JPEGMetadata.java:2149)
at com.twelvemonkeys.imageio.plugins.jpeg.JPEGImage10MetadataCleaner.cleanMetadata(JPEGImage10MetadataCleaner.java:259)
at com.twelvemonkeys.imageio.plugins.jpeg.JPEGImageReader.getImageMetadata(JPEGImageReader.java:1043)

So for now i will use getSourceCSTypemethod probably. Therefore I switched it to public so i can use it. Would it be possible to merge that into master? WDYT?

@haraldk
Copy link
Owner

haraldk commented Oct 25, 2013

Hmm.. I just ran through my extended test case with 200+ bad (and some good) images, and all of them happily reported chroma without exception... Do you have a test image for me (and possibly show the code, the stack trace looks odd)? :-)

Feel free to use getSourceCSType as a workaround for now, but I really don't want to make it public. I'd rather fix the metadata and stick to the standard interface.

Harald K

@marcuslinke
Copy link
Author

OK. So refactor method getSourceCSType out of JPEGImageReaderwould be an option for master? Anyhow here is one test image as requested.

obs2_100

@haraldk
Copy link
Owner

haraldk commented Oct 25, 2013

Progress: Just pushed a fix for the X density out of range issue. Your sample image reads fine now. It was unrelated to the Chroma node reading. It's just another example on how to break the JFIF spec.. sigh

I'm still cheering for the Chroma node. And no, I don't want the getSourceCSType public in master. Sorry. :-)

.k

@marcuslinke
Copy link
Author

Great! Now that i can use the metadata approach there is no need for making getSourceCSType public anymore. Thanks for your effort and have a nice weekend!

@ghost ghost assigned haraldk Oct 25, 2013
@haraldk
Copy link
Owner

haraldk commented Oct 30, 2013

Cool,

I'd like to close this issue for now, if you don't mind? And if you discover more problems, maybe file new issues with more specific details?

Regards,

Harald K

@haraldk haraldk closed this as completed Oct 30, 2013
jessieZZZZ pushed a commit to jessieZZZZ/TwelveMonkeys that referenced this issue Jul 6, 2017
[ThumbnailLibrary] Adding test comment to test travis and build script
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants