[MRG] Improvements for JPEG decoding #1570

Closed
wants to merge 19 commits into from

scaramallion
Member

@scaramallion scaramallion commented Jan 13, 2022

Describe the changes

  • Improvements for handling colour space when decoding JPEG 10918-1 using Pillow
  • Adds jpeg module and JPEG parsing utilities
  • Adds util.debug_pixel_data function for printing debug information when troubleshooting pixel data issues
>>> from pydicom.util import debug_pixel_data
>>> from pydicom.data import get_testdata_file
>>> ds = get_testdata_file("JPEG2000.dcm", read=True)
>>> debug_pixel_data(ds)
File Meta Information: present
  Transfer Syntax UID: 1.2.840.10008.1.2.4.91 (JPEG 2000 Image Compression)

Dataset
  (0028, 0002) Samples per Pixel                   US: 1
  (0028, 0004) Photometric Interpretation          CS: 'MONOCHROME2'
  (0028, 0008) Number of Frames                    IS: '1'
  (0028, 0009) Frame Increment Pointer             AT: [(0054, 0010), (0054, 0020)]
  (0028, 0010) Rows                                US: 1024
  (0028, 0011) Columns                             US: 256
  (0028, 0030) Pixel Spacing                       DS: [2.260000, 2.260000]
  (0028, 0051) Corrected Image                     CS: ['NRGY', 'LIN']
  (0028, 0100) Bits Allocated                      US: 16
  (0028, 0101) Bits Stored                         US: 16
  (0028, 0102) High Bit                            US: 15
  (0028, 0103) Pixel Representation                US: 1
  (0028, 0106) Smallest Image Pixel Value          SS: 0
  (0028, 0107) Largest Image Pixel Value           SS: 278
  (0028, 2110) Lossy Image Compression             CS: '01'
  (0028, 2112) Lossy Image Compression Ratio       DS: '2097.0'
  (7fe0, 0010) Pixel Data                          OB: Array of 266 elements

JPEG 2000 codestream info for frame 0
  SOI (FF 4F) marker found @ offset 0
  SIZ (FF 51) segment found @ offset 2
    Rows: 1024
    Columns: 256
    Components:
      0: signed, precision 16
  COD (FF 52) segment found @ offset 45
    Multiple component transform: none
    Wavelet transform: 9-7 irreversible

Tasks

  • Unit tests added that reproduce the issue or prove feature is working
  • Fix or feature added
  • Code typed and mypy shows no errors
  • Unit tests passing and overall coverage the same or better

@pep8speaks

pep8speaks commented Jan 13, 2022

Hello @scaramallion! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-02-16 22:01:18 UTC

@codecov

codecov bot commented Jan 13, 2022

Codecov Report

Merging #1570 (426d09b) into master (37dd49e) will increase coverage by 0.15%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1570      +/-   ##
==========================================
+ Coverage   97.56%   97.72%   +0.15%     
==========================================
  Files          66       71       +5     
  Lines       10643    10878     +235     
==========================================
+ Hits        10384    10630     +246     
+ Misses        259      248      -11     
Impacted Files Coverage Δ
pydicom/__init__.py 100.00% <100.00%> (ø)
pydicom/jpeg/__init__.py 100.00% <100.00%> (ø)
pydicom/jpeg/jpeg10918.py 100.00% <100.00%> (ø)
pydicom/jpeg/jpeg15444.py 100.00% <100.00%> (ø)
pydicom/pixel_data_handlers/pillow_handler.py 96.77% <100.00%> (+0.62%) ⬆️
pydicom/pixel_data_handlers/util.py 100.00% <100.00%> (ø)
pydicom/util/__init__.py 100.00% <100.00%> (ø)
pydicom/util/debug.py 100.00% <100.00%> (ø)
pydicom/valuerep.py 99.53% <0.00%> (+0.15%) ⬆️
pydicom/filewriter.py 98.06% <0.00%> (+0.24%) ⬆️
... and 3 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Comment on lines +183 to +186
# Assume *Photometric Interpretation* is correct:
# * If the decoded pixel data is correct then source is RGB
# * If the decoded pixel data is incorrect then source is YCbCr
# but this can be fixed by user applying YCbCr -> RGB transform
Member Author

@scaramallion scaramallion Jan 14, 2022

Taking the opposite approach, i.e. assuming the Photometric Interpretation is incorrect (which it probably is when cs is None), we'd let libjpeg apply the YCbCr -> RGB transform, which gives:

  • If the source is YCbCr, libjpeg applies YCbCr -> RGB, decoded pixel data is RGB and correct
  • If the source is RGB, libjpeg applies YCbCr -> RGB, so the decoded pixel data ends up in an unintended colour space and is incorrect. The user would then need to apply an RGB -> YCbCr transform to get back to RGB (which is non-intuitive)
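
To make the two outcomes above concrete, here is a minimal sketch in plain Python. It assumes the full-range BT.601 equations that JFIF uses for 8-bit samples; the function names are illustrative only and are not part of pydicom or libjpeg.

```python
def _clip(value):
    """Round to the nearest integer and clamp to the 8-bit range."""
    return max(0, min(255, int(round(value))))


def ycbcr_to_rgb(pixel):
    """Full-range BT.601 YCbCr -> RGB for one 8-bit pixel (assumed equations)."""
    y, cb, cr = pixel
    return (
        _clip(y + 1.402 * (cr - 128)),
        _clip(y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)),
        _clip(y + 1.772 * (cb - 128)),
    )


def rgb_to_ycbcr(pixel):
    """Full-range BT.601 RGB -> YCbCr for one 8-bit pixel (assumed equations)."""
    r, g, b = pixel
    return (
        _clip(0.299 * r + 0.587 * g + 0.114 * b),
        _clip(-0.168736 * r - 0.331264 * g + 0.5 * b + 128),
        _clip(0.5 * r - 0.418688 * g - 0.081312 * b + 128),
    )


pure_red = (255, 0, 0)

# Source really is YCbCr and the decoder applies YCbCr -> RGB:
# the result is (approximately) the original colour.
stored_as_ycbcr = rgb_to_ycbcr(pure_red)
decoded_ok = ycbcr_to_rgb(stored_as_ycbcr)

# Source is already RGB but the decoder applies YCbCr -> RGB anyway:
# the result is no longer red.
decoded_bad = ycbcr_to_rgb(pure_red)

print("YCbCr source, transform applied:", decoded_ok)
print("RGB source, transform applied:  ", decoded_bad)
```

Because out-of-range values are clipped, undoing a wrongly applied transform with the inverse is not guaranteed to recover the original values exactly.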

Member

Yes, I agree. Something like this was the case with the recent question, which confused me because I assumed it would be the other way round (the way you are doing it now, as I understand it).
I'm not really able to review this; I'd have to read up on JPEG first, and I won't have much time for a week due to family stuff. The concept looks fine to me, though, from what I can understand.

Contributor

I would not make assumptions about the value of Photometric Interpretation being incorrect. If the value is "RGB", it means that the data has been compressed in RGB color space without transformation into YCbCr color space and the decompressed data shall therefore not be color transformed.

Contributor

If there is a mismatch between the information in the JPEG codestream and the DICOM metadata, I tend to argue that an error should be raised instead of a warning being logged. If necessary, the caller can work around such issues by correcting the DICOM metadata before decompression.
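
A rough sketch of the raise-versus-warn choice described here. Everything in it is hypothetical and for illustration only: `check_colour_space`, `ColourSpaceMismatch`, the `strict` flag and the `"YCbCr"`/`"RGB"` labels are not pydicom API.

```python
import warnings


class ColourSpaceMismatch(ValueError):
    """Hypothetical error for a codestream/metadata disagreement."""


def check_colour_space(photometric_interpretation, codestream_colour_space, strict=True):
    """Compare the DICOM metadata against what the JPEG codestream signals.

    `codestream_colour_space` is "YCbCr", "RGB", or None when the codestream
    carries no usable information (nothing to compare in that case).
    """
    if codestream_colour_space is None:
        return

    # Map the DICOM value onto the same labels used for the codestream
    if photometric_interpretation.startswith("YBR"):
        expected = "YCbCr"
    else:
        expected = photometric_interpretation

    if expected == codestream_colour_space:
        return

    message = (
        f"Photometric Interpretation is '{photometric_interpretation}' but the "
        f"JPEG codestream indicates '{codestream_colour_space}'"
    )
    if strict:
        raise ColourSpaceMismatch(message)
    warnings.warn(message)


# Consistent metadata passes silently
check_colour_space("YBR_FULL_422", "YCbCr")

# Inconsistent metadata raises when strict, warns otherwise
try:
    check_colour_space("RGB", "YCbCr")
except ColourSpaceMismatch as exc:
    print("error:", exc)
```

With the strict variant, a caller who knows the metadata is wrong would set the dataset's Photometric Interpretation to the correct value before decoding, as suggested above.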

Member Author

@scaramallion scaramallion Jan 26, 2022

I would not make assumptions about the value of Photometric Interpretation being incorrect.

I haven't though, I've assumed it is correct (which is also the logic currently used).

I was going to raise on mismatches, but I think that YCbCr is much more common than unflagged RGB, so I was worried about how frequently we would get "I can't decode my pixel data" and similar issues.

Contributor

@hackermd hackermd Jan 27, 2022

My comment is related to the following lines:

    if photometric_interpretation == "RGB":
        ...
        if cs is None:
            # Source data may be YCbCr (most likely) or RGB (less likely)

If the value of Photometric Interpretation is "RGB", then no color transformation should be applied during decoding. If for whatever reason the value of the data element is wrong, then in my opinion that's not something the library should try to be smart about and work around, since it could lead to a wrong result. We have to be able to rely on the correctness of the metadata in the DICOM image.

Member Author

I agree

@scaramallion
Member Author

@hackermd are you happy with the changes to the colour space logic?

@scaramallion scaramallion marked this pull request as ready for review January 17, 2022 04:01
@scaramallion scaramallion changed the title [WIP] Improvements for JPEG decoding [MRG] Improvements for JPEG decoding Jan 17, 2022
@darcymason
Member

I'm not expert enough to review the technical details here, but I looked through it quickly and don't see any issues.

A couple of thoughts, though, related to the common questions we get about colour issues etc.:

  • perhaps some more documentation is necessary - reviewing again, I see some descriptions in existing documentation, but perhaps they are not visible enough. Maybe a "my image doesn't look right" page in either a Troubleshooting section, or Guide of some kind would get better SEO.
  • I'd suggest updating the issue template with some more information/links to existing documentation, and if still trouble, then ask for the debugging info. I expect that's where you were heading with this anyway.

Not that those have to be added with this PR - just some thoughts.

@hackermd
Contributor

@hackermd are you happy with the changes to the colour space logic?

Thanks for these changes @scaramallion. At first glance, this looks great. I will do some testing and get back to you.

@darcymason darcymason mentioned this pull request Jan 21, 2022
4 tasks
@hackermd
Contributor

@scaramallion something still seems to be quite wrong with JPEG decoding (on current master as well as this development branch).

Here is an example for a WG-26 test file: image.zip

from io import BytesIO

import pydicom
import pydicom.encaps
import numpy as np
from matplotlib import pyplot as plt
from PIL import Image

ds = pydicom.dcmread("2.16.840.1.113995.3.110.3.0.10118.2000002.922101.1.dcm")

# Decode the first encapsulated frame directly with Pillow for comparison
frames = list(pydicom.encaps.generate_pixel_data_frame(ds.PixelData))
image = Image.open(BytesIO(frames[0]))
pixel_array = np.asarray(image)

fig, axes = plt.subplots(1, 2)
axes[0].imshow(ds.pixel_array)
axes[0].axis('off')
axes[1].imshow(pixel_array)
axes[1].axis('off')
plt.show()

[Screenshot, 2022-02-15: output of the script above, showing ds.pixel_array (left) next to the Pillow-decoded frame (right)]

@scaramallion
Member Author

scaramallion commented Feb 16, 2022

@hackermd it seems correct, the source JPEG data is YCbCr given the APP0 marker and the Photometric Interpretation is YBR_FULL_422, so the resulting array should be in YCbCr colour space (as seen in the LHS image). Converting the colour space from YCbCr to RGB gives the result from the RHS image.

Unless I'm missing something?

@hackermd
Contributor

@scaramallion if the pixel data are JPEG-compressed (encapsulated), then the value of Photometric Interpretation describes the color space prior to encoding/compression. I tend to argue that the data should always be converted into RGB upon decompression (i.e., when accessing the pixel_array property).

@hackermd
Contributor

It raises an interesting question with regard to color correction. @dclunie do you have any thoughts on this?

@scaramallion
Member Author

scaramallion commented Feb 16, 2022

As I understand it, our practice is to return the pixel data as found in the dataset for JPEG 10918, matching the Photometric Interpretation (i.e. not performing any additional transforms).

The new pixel data backend I'm working on will have an option for transforming YBR to RGB which will default to True

@hackermd
Contributor

hackermd commented Feb 16, 2022

As I understand it, our practice is to return the pixel data as found in the dataset for JPEG 10918, matching the Photometric Interpretation (i.e. not performing any additional transforms).

The new pixel data backend I'm working on will have an option for transforming YBR to RGB which will default to True

I think this will lead to downstream problems and I am not sure it is correct either.

In this specific case, it would assume that the image has been acquired in YBR color space. However, the included ICC Profile says that the data color space is RGB.

@scaramallion
Member Author

Unfortunately there's no perfect answer for this, "common sense" would dictate that we return RGB, but there are use cases where returning the raw YCbCr data is better (transcoding from one JPEG format to another, for example).

Our general philosophy is to make as few changes to the data found in the dataset as possible.

@dclunie

dclunie commented Feb 17, 2022

I won't comment on whether your toolkit should be returning the untransformed or transformed color values (in terms of RGB or YCbCr), but I can describe what I understand to be the relationship between what the DICOM PhotometricInterpretation data element says (RGB or YBR_FULL_422) versus what is signaled in the encapsulated compressed bitstream (if anything), and the consequences of what codecs "do", for some commonly encountered patterns:

  1. PhotometricInterpretation = YBR_FULL_422, compressed bitstream has no information about component meaning, most codecs will assume YCbCr and transform to RGB
  2. PhotometricInterpretation = YBR_FULL_422, compressed bitstream has APP0 JFIF marker segment, then bitstream components are to be interpreted as YCbCr, most codecs will detect this and transform to RGB
  3. PhotometricInterpretation = YBR_FULL_422, compressed bitstream has APPE Adobe marker segment with ColorTransform 1 = YCbCr, some codecs will detect this and transform to RGB, though most codecs are probably ignoring this and just assuming YCbCr, which happens to be correct
  4. PhotometricInterpretation = RGB, compressed bitstream has APP0 JFIF marker segment, then bitstream components are to be interpreted as YCbCr, most codecs will detect this and transform to RGB, DICOM PhotometricInterpretation is "wrong"
  5. PhotometricInterpretation = RGB, compressed bitstream has APPE Adobe marker segment with ColorTransform 0 = Unknown (RGB or CMYK), some codecs will detect this and suppress usual YCbCr to RGB transformation, which is what is intended (this is the tactic we use to make Aperio/Leica SVS converted images that really are RGB not YCbCr components work with most codecs)
  6. PhotometricInterpretation = RGB, compressed bitstream has no APP0 JFIF or Adobe APPE, components are not numbered from 1 and channels not downsampled, some codecs (notably Java JRE) will detect this and assume RGB rather than YCbCr, and suppress usual YCbCr to RGB transformation, which is what is intended
  7. PhotometricInterpretation = RGB, compressed bitstream has no APP0 JFIF or Adobe APPE, codec doesn't apply any special heuristics and applies YCbCr to RGB transformation, which may (if DICOM PhotometricInterpretation is "wrong") be correct, or may not (if DICOM PhotometricInterpretation is "right") - this is an ambiguous situation, encountered sometimes
  8. PhotometricInterpretation = YBR_FULL_422, compressed bitstream has no APP0 JFIF or Adobe APPE, components are not numbered from 1 and channels not downsampled, some codecs (notably Java JRE) will detect this and incorrectly assume RGB rather than YCbCr, and suppress usual YCbCr to RGB transformation, which is not what is intended, so caller has to figure this out, trust the DICOM PhotometricInterpretation, and perform the transformation themselves
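
The patterns above can be condensed into a small decision function. This is a simplified sketch of the codec behaviour described in the list, not the logic of any particular codec or of pydicom; the function name and parameters are made up for illustration, and CMYK/YCCK data is ignored.

```python
def guess_codestream_colour_space(has_jfif, adobe_transform, component_ids, subsampled):
    """Guess what a typical codec would assume about a 3-component JPEG.

    has_jfif:        True if an APP0 JFIF marker segment is present
    adobe_transform: ColorTransform value from an Adobe APPE segment, or None
    component_ids:   component identifiers from the frame header
    subsampled:      True if the components use different sampling factors
    """
    if has_jfif:
        # Patterns 2 and 4: JFIF means the components are YCbCr
        return "YCbCr"
    if adobe_transform == 1:
        # Pattern 3
        return "YCbCr"
    if adobe_transform == 0:
        # Pattern 5: transformation suppressed, components are RGB
        return "RGB"
    if not subsampled and list(component_ids) != [1, 2, 3]:
        # Patterns 6 and 8: Java-style heuristic; it can be wrong (pattern 8)
        return "RGB"
    # Patterns 1 and 7: most codecs simply assume YCbCr
    return "YCbCr"


def metadata_agrees(photometric_interpretation, guess):
    """True if the DICOM value and the codec's assumption point the same way."""
    is_ybr = photometric_interpretation.startswith("YBR")
    return is_ybr == (guess == "YCbCr")


# Pattern 4: RGB in the DICOM header, but a JFIF marker in the codestream
guess = guess_codestream_colour_space(True, None, (1, 2, 3), True)
print(guess, metadata_agrees("RGB", guess))
```

A disagreement between the two does not say which side is wrong; pattern 8 is a case where the heuristic is wrong and the DICOM value is right.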

The chrominance component downsampling is actually a pretty good clue, since that is almost always done for chrominance channels and never done for RGB channels, but it does require examining the JPEG marker segment that describes this.

The business of the numbering of the components is a strange one to rely on, but that is a consequence of JFIF making specific requirements for these.

Since I use Java and its various JPEG codecs a lot, I find the behavior described in https://docs.oracle.com/javase/8/docs/api/javax/imageio/metadata/doc-files/jpeg_metadata.html#color informative.

Note also that what is present in any DICOM ICCProfile or ColorSpace has nothing to do with the components in the JPEG bitstream; these are intended to be applied after the bitstream has been decompressed and transformed (if necessary) into RGB. The same applies to any ICC profile embedded in the JPEG bitstream in an APP2 marker segment, since AFAIK, ICC profiles are specified with input and output in RGB or CMYK only.

@hackermd
Contributor

Unfortunately there's no perfect answer for this, "common sense" would dictate that we return RGB, but there are use cases where returning the raw YCbCr data is better (transcoding from one JPEG format to another, for example).

Our general philosophy is to make as few changes to the data found in the dataset as possible.

From a user perspective, I find this behavior very unexpected and problematic. Most users will probably expect the data to be returned as RGB and will not consider additional color space conversions before using the data for image analysis or display. There is also the problem that one cannot readily tell the color space from the returned NumPy array and it's unclear what transformations have been applied during decoding. How is the caller in your opinion supposed to handle the image provided here as an example?

To me, the RGB -> YCbCr color space conversion during JPEG compression is part of the encoding and should be reversed upon decoding. As you said this is common sense and doing anything else will probably get the majority of users into trouble.

@scaramallion
Member Author

scaramallion commented Feb 17, 2022

Most users will probably expect the data to be returned as RGB and will not consider additional color space conversions before using the data for image analysis or display.

I'm not disagreeing! Hence the new pixel data backend. Changing Dataset.pixel_array in a backwards incompatible way is a massive no because it's a fundamental part of the library, so I'm stuck working within the limits of the current behaviour. And since the current behaviour is based on the philosophy of "make no changes" this is what you get.

How is the caller in your opinion supposed to handle the image provided here as an example?

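from pydicom.pixel_data_handlers.util import convert_color_space

arr = ds.pixel_array
# Convert only when the decoded array is still in YCbCr colour space
if ds.PhotometricInterpretation in ("YBR_FULL", "YBR_FULL_422"):
    arr = convert_color_space(arr, ds.PhotometricInterpretation, "RGB")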

@hackermd
Contributor

Most users will probably expect the data to be returned as RGB and will not consider additional color space conversions before using the data for image analysis or display.

I'm not disagreeing! Hence the new pixel data backend. Changing Dataset.pixel_array in a backwards incompatible way is a massive no because it's a fundamental part of the library, so I'm stuck working within the limits of the current behaviour. And since the current behaviour is based on the philosophy of "make no changes" this is what you get.

Thanks for the clarification. It's unfortunate. I hope that the new pixel data backend will not be as tightly integrated into the Dataset class and that the pixel_array property ultimately gets deprecated (see also #1447 (comment)).

@hackermd
Contributor

hackermd commented Feb 18, 2022

@scaramallion I just noted that this behavior is specific to JPEG. Color images encoded with JPEG 2000 or JPEG-LS appear to be transformed into RGB color space upon decoding. Is that intended? If that's the case, then this should be fixed (regardless of whether it would break backwards compatibility) in my opinion. It would be super confusing if the behavior differed depending on the transfer syntax.

@hackermd
Contributor

@scaramallion we will provide a workaround in highdicom for the time being. I remain convinced that this is a bug in pydicom that should be addressed. See ImagingDataCommons/highdicom#152 for further information.

@darcymason
Member

Our general philosophy is to make as few changes to the data found in the dataset as possible.

I've probably been the one primarily driving that philosophy, and in general it is a good one, and in fact I always anticipated other libraries (like highdicom) would take low-level abilities and add to them. But ...

From a user perspective, I find this behavior very unexpected and problematic. Most users will probably expect the data to be returned as RGB
...
To me, the RGB -> YCbCr color space conversion during JPEG compression is part of the encoding and should be reversed upon decoding.

... I like this idea of thinking of it as just part of the decoding, but it's not my area of expertise, so I'm not sure on whether that would be everyone's interpretation. Pydicom is in the business of decoding DICOM data elements into the 'most natural' python types, but as I've said, I don't know myself what that interpretation should be in this case.

Regardless, as long as documentation is very clear, I think users can be expected to read a half-page or page describing how it works. In any case, they should obviously be checking any outputs against known/expected results, and not assuming they know what the data is.

I'm not disagreeing! Hence the new pixel data backend. Changing Dataset.pixel_array in a backwards incompatible way is a massive no because it's a fundamental part of the library, so I'm stuck working within the limits of the current behaviour.

I do agree here that we can't fundamentally change existing behavior, so very much support it happening through a different backend.

So, I'm not sure I've added much clarity overall, but thought I should at least express some thoughts...

@hackermd
Contributor

Thanks for your feedback @darcymason.

Regardless, as long as documentation is very clear, I think users can be expected to read a half-page or page describing how it works. In any case, they should obviously be checking any outputs against known/expected results, and not assuming they know what the data is.

I would be fine with this as long as the behavior is consistent across transfer syntaxes. It appears that currently a different approach is used for JPEG baseline 8-bit transfer syntax versus JPEG 2000 or JPEG-LS transfer syntaxes. If that's indeed the case (please correct me if I am wrong), then I would suggest changing one of them to use a consistent approach.
As elaborated above, my preference would be to change the behavior for JPEG baseline 8-bit. However, that's probably the more common use case and an API change would consequently be more disruptive. Therefore, we could consider changing the behavior for JPEG 2000 and JPEG-LS instead and document the behavior well. For reference see https://github.com/herrmannlab/highdicom/blob/41d7b9c179ef20ec91a4f74c233a67ca965d9823/src/highdicom/frame.py#L395-L417

@darcymason
Member

I would be fine with this as long as the behavior is consistent across transfer syntaxes. It appears that currently a different approach is used for JPEG baseline 8-bit transfer syntax versus JPEG 2000 or JPEG-LS transfer syntaxes.

Ah, yes, I meant to address your previous comments about that too. That is a big problem if true.

If that's indeed the case (please correct me if I am wrong), then I would suggest changing one of them to use a consistent approach.

I'm not sure about that - I would defer to @scaramallion, but I will say that would make the new solution an even higher priority, since it would allow everything to work consistently with no backwards baggage and become the clear "right way" to do things from here forward. We could strongly promote the new methods in documentation and release messages, perhaps even suggesting converting existing code to it.

@darcymason
Member

@scaramallion, I've been delinquent on getting a release out, but I'm wondering where you think this and #1605 stand in terms of the release - it seems reasonable to wait on it if possible, but I'm not sure if it is all sorted out yet given this discussion, or how available your time is. We have no fixed schedule in terms of minor releases before 3.0, so could do some closer together if needed.

@hackermd
Contributor

hackermd commented Apr 3, 2022

@scaramallion @darcymason PS 3.3 C.11.15.1.1 ICC Profile states

The color space of the input shall be RGB, i.e., header bytes 16 through 19, Color Space Signature, shall be "RGB", regardless of the Photometric Interpretation of the image pixel data prior to decompression

That means that pixels of a color image after decompression shall be in RGB color space.
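
For reference, the Color Space Signature mentioned in the quoted text can be read straight from the profile bytes. The sketch below assumes the standard 128-byte ICC header layout, where the four-character signature sits at bytes 16 through 19 and is padded with a space (`"RGB "`). The helper name is made up, and the demo uses a synthetic header rather than a real profile; with pydicom the bytes would typically come from the dataset's ICC Profile element.

```python
def icc_data_colour_space(profile):
    """Return the data colour space signature from an ICC profile header."""
    if len(profile) < 128:
        raise ValueError("an ICC profile header is 128 bytes long")
    # Bytes 16..19 hold a 4-character signature, space padded (e.g. b"RGB ")
    return profile[16:20].decode("ascii").rstrip()


# Synthetic 128-byte header with only the colour space signature filled in
header = bytearray(128)
header[16:20] = b"RGB "
print(icc_data_colour_space(bytes(header)))
```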

@scaramallion
Member Author

Sorry, I've been super busy (and I needed a break).

My opinion is that this PR is a simple extension of what our current practice is regarding colour spaces, plus some housekeeping code that will get moved around anyway. I think I'll just close it and work on the future pixel data decoding back end instead.


6 participants