Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PDF converted from an icc-profiled raster images should use /ColorSpace /ICCBased #417

Closed
d-ph opened this issue Apr 2, 2017 · 24 comments
Closed

Comments

@d-ph
Copy link

d-ph commented Apr 2, 2017

Hello,

Sorry for the wall of text. The problem is simple. I think the solution isn't that difficult either. The explanation is what needs elaboration.

tl;dr:

In order for PDFs produced by ImageMagick from icc-profiled raster images to be correctly displayed in Adobe Reader, I need ImageMagick to instead of doing the following:

8 0 obj
<<
/Type /XObject
/Subtype /Image
/Name /Im0
/Filter [ /DCTDecode ]
/Width 420
/Height 315
/ColorSpace 10 0 R
/BitsPerComponent 8
/Length 9 0 R
>>
stream

(image binary data)

endstream
endobj

10 0 obj
/DeviceRGB
endobj

to be doing the following:

8 0 obj
<<
/Type /XObject
/Subtype /Image
/Name /Im0
/Filter [ /DCTDecode ]
/Width 420
/Height 315
/ColorSpace 10 0 R
/BitsPerComponent 8
/Length 9 0 R
>>
stream

(image binary data)

endstream
endobj

10 0 obj
[ /ICCBased 11 0 R ]
endobj
11 0 obj
<<
/N 3
/Filter /ASCIIHexDecode
/Alternate /DeviceRGB
>>
stream

(icc profile in hex)

endstream
endobj

Further ado:

Would it be possible to make PDFs, converted from icc-profiled raster images, be created with the image XObject that uses /ColorSpace [ /ICCBased (data) ], please?

Currently ImageMagick includes the icc profile in the XObject's binary data. This icc profile is not interpreted by Adobe Reader (and many other apps), making the image be rendered with incorrect colours. I propose, that ImageMagick continue doing what it's doing (there is a very good reason for it), but additionally ImageMagick should also use the /ColorSpace /ICCBased, so the whole world beside ImageMagick can also benefit from icc-profiled images.

Test case:

λ magick.exe --version
Version: ImageMagick 7.0.5-4 Q16 x64 2017-03-25 http://www.imagemagick.org
Copyright: Copyright (C) 1999-2015 ImageMagick Studio LLC
License: http://www.imagemagick.org/script/license.php
Visual C++: 180040629
Features: Cipher DPC Modules OpenMP
Delegates (built-in): bzlib cairo flif freetype jng jp2 jpeg lcms lqr openexr pangocairo png ps rsvg tiff webp xml zlib

You'll also need:

  1. Download and unzip the test file: link
  2. Run magick jpeg_ProphotoRGB_8bits_embedded_profile.jpg 1.pdf
  3. Open both 1.pdf and jpeg_sRGB_8bits.jpg --> Result: colours are different (especially the red).
  4. Run magick jpeg_ProphotoRGB_8bits_embedded_profile.jpg 1.icc --> icc profile is extracted
  5. Generate the hex of the 1.icc. One way of doing it is to run this: php -r "echo bin2hex(file_get_contents('1.icc'));"
  6. Run cp 1.pdf 1dirty.pdf
  7. Run vim 1dirty.pdf. Find this:
10 0 obj
/DeviceRGB
endobj

Replace it with this

10 0 obj
[ /ICCBased 20 0 R ]
endobj
20 0 obj
<<
/N 3
/Filter /ASCIIHexDecode
/Alternate /DeviceRGB
>>
stream
{{ PUT THE HEX OF THE ICC PROFILE GENERATED IN STEP 5 HERE }}
endstream
endobj
  1. Run pdftk 1dirty.pdf output 1fixed.pdf
  2. Open 1fixed.pdf --> Result: the colours are exactly the same as in jpeg_sRGB_8bits.jpg and in jpeg_ProphotoRGB_8bits_embedded_profile.jpg

I included 0.pdf and 0fixed.pdf in the file available in step 1, as a reference.

Implementation details:

File: coders/pdf.c

  1. The colour space is decided here. At this point, ImageMagick should see, whether an icc profile can be extracted from the input image. If yes, then it should use a /ICCBased colour space (ref page 252):
8 0 obj [/ICCBased 9 0 R]
endobj
9 0 obj
<<
/N 3 (number of colour channels, here 3, because the icc profile is for the RGB colour model)
/Filter (choose appropriate filter to achieve best compression or none)
/Length 1881 (this is the byte-length of the binary stream)
/Alternate /DeviceRGB (fallback colour space; this should be set to whatever's used currently instead of /ICCBased colour space)
>>
stream
( icc profile binary )
endstream
endobj
  1. Choosing PDF Version link

According to the PDF Reference link, I say it's best to use PDF version 1.7 for the /ColorSpace /ICCBased. See page 253 (starting at the bottom).

PDF ver | ICC standard version support
1.3     | 3.3
1.4     | ICC.1:1998-09 and its addendum ICC.1A:1999-04
1.5     | ICC.1:2001-12
1.6     | ICC.1:2003-09
1.7     | ICC.1:2004-10

If ImageMagick is able to extract the icc profile from the input raster file (i.e. step 4 from the test case succeeds), then the output pdf should be produced in version 1.7.

  1. Both the image XObject and the Thumbnail (link) should use the same /ICCBased colour space. Alternatively, the thumbnail might just be in sRGB, but ImageMagick would need to convert the color space from the original one to sRGB prior Thumbnail inclusion.

Final words.

This change would make one more thing to JustWorkTM in ImageMagick. There are a couple of links in the google showing, that some people stumble upon this problem. I am one of them.

I'd prefer someone knowledgable in C to create the PR instead of me. I'm a php/javascript developer. I don't know how to compile stuff.

Many thanks,
@d-ph

@d-ph d-ph changed the title PDF converted from a icc-profiled raster images should use /ColorSpace /ICCBased PDF converted from an icc-profiled raster images should use /ColorSpace /ICCBased Apr 2, 2017
@dlemstra
Copy link
Member

dlemstra commented Apr 2, 2017

Thanks for the detailed information that you provided. We will take a look at this at later time but this will help us a lot to figure out how to solve it. I will mark this issues as something that someone else could pick up also.

@mikayla-grace
Copy link

Thanks for the problem report. We can reproduce it and will have a patch to fix it in GIT master branch @ https://github.com/ImageMagick/ImageMagick later today. The patch will be available in the beta releases of ImageMagick @ http://www.imagemagick.org/download/beta/ by sometime tomorrow.

@d-ph
Copy link
Author

d-ph commented Apr 2, 2017

That's great news! Thanks.

I looked at the diff and saw, that PDF version 1.7 isn't being chosen, when /ICCBased is to be used. I.e. this part of the code hasn't been altered. Is there a reason why you didn't decide to up the PDF version, when /ICCBased is used?

• For the most predictable and consistent results, a producer of a given PDF version
should embed only profiles conforming to the corresponding version of
the ICC specification. 

~ PDF Reference, page 254

Also, I saw, that in the ChangeLog file you reference this github issue. I was planning on removing the test file (mentioned in the test case) from my Dropbox. Is it fine for me to do that, or would you like me to upload it somewhere else?

Thanks.

@mikayla-grace
Copy link

Feel free to remove the test case image. IMv7 Beta produces a PDF version of 1.7 when it uses an ICCBased object.

@d-ph
Copy link
Author

d-ph commented Apr 2, 2017

Okay, that's brilliant. Thank you for such a quick turnaround. I'll check the IMv7 Beta tomorrow.

@d-ph
Copy link
Author

d-ph commented Apr 3, 2017

Just tested latest IMv7 build.

It produces corrupted pdfs, when input file is without icc.

10 0 obj
DeviceRGB
endobj

The issue is the missing slash / character before the PDF name: /DeviceRGB:

  1. link
  2. link
  3. link
  4. link

@dlemstra

@dlemstra
Copy link
Member

dlemstra commented Apr 3, 2017

Just pushed a fix for the slash issue. Should be available in the beta tomorrow or you can download the pdf.c file from github and replace that in your local copy.

@d-ph
Copy link
Author

d-ph commented Apr 3, 2017

Thank you very much @dlemstra, will do. Other than that, everything looks good to me 👍

@d-ph
Copy link
Author

d-ph commented Apr 3, 2017

It works fantastic now. Thanks again @dlemstra .

One thing though. If you are able to stream the icc profile as a binary into the output pdf, instead of hex-encoding it, it would produce a smaller pdf. In this case you just need to drop the /Filter /ASCIIHexDecode\n here (just remove it and PDF Readers will read the enclosed icc stream as binary; I mentioned hex-encoding in my original post, because I don't know how to copy&paste binary streams using system clipboard). But if you'd rather leave it be, then that's also fine.

Thanks again. Can I close this ticket now or do you have a different process for closing tickets?

@mikayla-grace
Copy link

We changed it to ASCII85Decode and saved about 800 characters.

@dlemstra
Copy link
Member

dlemstra commented Apr 3, 2017

Can you give it another try @d-ph? If you are happy with the results you are free to close this issue yourself. Thanks for bringing this to our attention in this way and helping us add this feature.

@d-ph
Copy link
Author

d-ph commented Apr 4, 2017

Everything still works and the output pdfs are now smaller.

Thanks for such a rapid idea-to-release time. Pleasure's all mine.

Cheers.

@d-ph d-ph closed this as completed Apr 4, 2017
jsonn pushed a commit to jsonn/pkgsrc that referenced this issue May 2, 2017
2017-04-24  7.0.5-5 Cristy  <quetzlzacatenango@image...>
  * Release ImageMagick version 7.0.5-5, GIT revision 19908:bc92979:20170424.

2017-03-26  7.0.5-5 Cristy  <quetzlzacatenango@image...>
  * Minimize buffer copies to improve OpenCL performance.
  * Morphology thinning is no longer a no-op (reference
    https://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=31650).
  * Patch two PCD writer problems, corrupt output and dark pixels (reference
    https://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=3164).
  * Support ICC based PDF's (reference
    ImageMagick/ImageMagick#417).
  * Fix improper EPS clip path rendering (reference
    http://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=31722).
@Bastlynn
Copy link

Bastlynn commented May 2, 2017

I don't see the fix regarding the ICC profile backported to 6.9.8-4 yet? Did that get overlooked or will that be on -5? https://github.com/ImageMagick/ImageMagick/blob/6.9.8-4/coders/pdf.c#L2188

@dlemstra
Copy link
Member

dlemstra commented May 2, 2017

@Bastlynn This is the commit for IM6: bc46a1c

@Bastlynn
Copy link

Bastlynn commented May 2, 2017

Right, the fix for this got in - but I'm curious if the additional fix in this comment got covered? #417 (comment)

When I look at ln 2188 in the 6.x branch I'm seeing no slashes and running into a problem that looks identical to the bug he reported. Or am I chasing a red herring and the slashes described there weren't the issue?

@dlemstra
Copy link
Member

dlemstra commented May 2, 2017

@Bastlynn Thats this commit: 2b0924a

@Bastlynn
Copy link

Bastlynn commented May 2, 2017

And I just found that commit and was going to update with that here so you didn't have to go hunting. Cool - then whatever is slagging my PDF is probably coming from something else. Thanks! :)

@dlemstra
Copy link
Member

dlemstra commented May 2, 2017

@d-ph It seems that these changes broke something when we write PDF files. This is the other issues: #463. This won't happen when we revert our changes. Would you mind helping us figure out why this is no longer working?

@d-ph
Copy link
Author

d-ph commented May 3, 2017

Ok, I'll take a look.

@d-ph
Copy link
Author

d-ph commented May 3, 2017

@Bastlynn There's currently an issue with conversion from raster files with icc profiles to pdfs. The issue is tracked by #463.

If you're not converting from a raster file (with icc) to pdf, then unfortunately your issue is unrelated with the icc support introduced by this ticket.

@d-ph
Copy link
Author

d-ph commented May 3, 2017

Or maybe it is. If you could create a ticket with steps to reproduce your problem (including the test file), I might be able to tell more.

@fmw42
Copy link

fmw42 commented May 3, 2017

@d-ph
Copy link
Author

d-ph commented May 4, 2017

@fmw42 The malformed Example.pdf from the quoted thread is affected by the issue tracked in #463

The CaliSolo-DVD-Cover-Art.jpg is with an ICC profile, therefore is affected by the issue.

To all people getting here from google:

ImageMagick versions 7.0.5-5 and 6.9.8-4 produce malformed pdfs, when converting from images, that have embedded icc profiles. You can tell, whether the pdf is broken because of that by opening the created pdf in a text editor (preferably vim, so it doesn't crash) and try to find the following string: ICCBased. If you find at least one occurrence of this string in your pdf (e.g. [/ICCBased 53 0 R]), then you are experiencing the bug introduced in the latest ImageMagick. I advise you to revert to a previous version until #463 gets resolved (a solution is known, so I reckon the fix will be available in days/weeks).

Sorry for the inconvenience caused and thanks for your reports.

@mikayla-grace
Copy link

Thanks for the problem report. We can reproduce it and will have a patch to fix it in GIT master branch @ https://github.com/ImageMagick/ImageMagick later today. The patch will be available in the beta releases of ImageMagick @ http://www.imagemagick.org/download/beta/ by sometime tomorrow.

jsonn pushed a commit to jsonn/pkgsrc that referenced this issue May 17, 2017
Pkgsrc changes:
Adapt to new version, PLIST (2 new files).

Upstream changes:
2017-05-12  6.9.8-5 Cristy  <quetzlzacatenango@image...>
  * Release ImageMagick version 6.9.8-5, GIT revision 11575:186b15d:20170512.

2017-05-10  6.9.8-5 John Cupitt <jcupitt@gmail.com>
  * Revise DICOM window and rescale handling (reference
    ImageMagick/ImageMagick#484)

2017-05-08  6.9.8-5 Cristy  <quetzlzacatenango@image...>
  * Fix transient PDF bug (reference
    ImageMagick/ImageMagick#463).
  * Check for EOF conditions for RLE image format.
  * Reset histogram page geometry (reference
    https://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=31920).

2017-04-24  6.9.8-4 Cristy  <quetzlzacatenango@image...>
  * Release ImageMagick version 6.9.8-4, GIT revision 11521:d7433aa:20170424.

2017-03-26  6.9.8-4 Cristy  <quetzlzacatenango@image...>
  * Minimize buffer copies to improve OpenCL performance.
  * Patch a PCD writer problem, dark pixels (reference
    https://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=3164).
  * Support ICC based PDF's (reference
    ImageMagick/ImageMagick#417).

2017-03-24  6.9.8-3 Cristy  <quetzlzacatenango@image...>
  * Release ImageMagick version 6.9.8-3, GIT revision 11444:3f523e5:20170324.

2017-03-20  6.9.8-3 Cristy  <quetzlzacatenango@image...>
  * MagickWand-config, use --cflags, not --clags (reference
    https://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=31602).
  * Respect -loop option for animate -window (reference
    https://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=31619).

2017-03-18  6.9.8-2 Cristy  <quetzlzacatenango@image...>
  * Release ImageMagick version 6.9.8-2, GIT revision 11436:a912791:20170318.

2017-03-14  6.9.8-2 Cristy  <quetzlzacatenango@image...>
  * Support namespaces for the security policy.

2017-03-11  6.9.8-1 Cristy  <quetzlzacatenango@image...>
  * Release ImageMagick version 6.9.8-1.

2017-03-03  6.9.8-0 Cristy  <quetzlzacatenango@image...>
  * Release ImageMagick version 6.9.8-0, GIT revision 11408:da91a7c:20170311.

2017-03-06  6.9.8-0 Cristy  <quetzlzacatenango@image...>
  * Respect throttle policy (reference
    ImageMagick/ImageMagick#393).
  * Support the -authenticate option for PDF (reference
    https://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=31530).

2017-03-03  6.9.7-10 Cristy  <quetzlzacatenango@image...>
  * Release ImageMagick version 6.9.7-10, GIT revision 11396:44b1bc6:20170303.

2017-02-21  6.9.7-10 Cristy  <quetzlzacatenango@image...>
  * Fix Spurious memory allocation message (reference
    https://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=31438).
  * Identical images should return inf for PSNR (reference
    https://www.imagemagick.org/discourse-server/viewtopic.php?f=3&t=31487).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

5 participants