-
-
Notifications
You must be signed in to change notification settings - Fork 473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Improvements for JPEG decoding #1570
Conversation
Hello @scaramallion! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2022-02-16 22:01:18 UTC |
Codecov Report
@@ Coverage Diff @@
## master #1570 +/- ##
==========================================
+ Coverage 97.56% 97.72% +0.15%
==========================================
Files 66 71 +5
Lines 10643 10878 +235
==========================================
+ Hits 10384 10630 +246
+ Misses 259 248 -11
Continue to review full report at Codecov.
|
# Assume *Photometric Interpretation* is correct: | ||
# * If the decoded pixel data is correct then source is RGB | ||
# * If the decoded pixel data is incorrect then source is YCbCr | ||
# but this can be fixed by user applying YCbCr -> RGB transform |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Taking the opposite approach, i.e. assuming the Photometric Interpretation is incorrect (which it probably is when cs is None
), we'd let libjpeg apply the YCbCr -> RGB transform, which gives:
- If the source is YCbCr, libjpeg applies YCbCr -> RGB, decoded pixel data is RGB and correct
- If the source is RGB, libjpeg applies YCbCr -> RGB, decoded pixel data is weird colour space and incorrect. The user would then need to apply RGB -> YCbCr transform to return to RGB (which is non-intuitive)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I agree. Something like this was the case with the recent question, which confused me, because I assumed it would be the other way (as you are doing it now, as I understand).
I'm not really able to review this, would have to do some readup on JPEG first, but I won't have much time for a week due to family stuff. The concept looks fine to me, though, from what I can understand.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not make assumptions about the value of Photometric Interpretation being incorrect. If the value is "RGB"
, it means that the data has been compressed in RGB color space without transformation into YCbCr color space and the decompressed data shall therefore not be color transformed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there is a mismatch between the information in the JPEG codestream and the DICOM metadata, I tend argue that an error should be raised instead of a warning being logged. If necessary, the caller can work around such issues by correcting the DICOM metadata before decompression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would not make assumptions about the value of Photometric Interpretation being incorrect.
I haven't though, I've assumed it is correct (which is also the logic currently used).
I was going to raise on mismatches, but I think that YCbCr is much more common then unflagged RGB, so I was worried about how frequently we would get "I can't decode my pixel data" and similar issues.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My comment is related to the following lines:
if photometric_interpretation == "RGB":
...
if cs is None:
# Source data may be YCbCr (most likely) or RGB (less likely)
If the value of Photometric Interpretation is "RGB"
, than no color transformation should be applied during decoding. If for whatever reason the value of the data element is wrong, than that's not something the library should try to be smart about and work around in my opinion, since it could lead to a wrong result. We have to be able to rely on the correctness of the metadata in the DICOM image.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree
@hackermd are you happy with the changes to the colour space logic? |
I'm not expert enough to review the technical details here, but I looked through it quickly and don't see any issues. A couple of thoughts, though, related to the common questions we get about colour issues etc.:
Not that those have to be added with this PR - just some thoughts. |
Thanks for these changes @scaramallion. At first glance, this looks great. I will do some testing and get back to you. |
@scaramallion something still seems to be something quite wrong with JPEG decoding (current master as well as this development branch). Here is an example for a WG-26 test file: image.zip import pydicom
import numpy as np
from matplotlib import pyplot as plt
from PIL import Image
ds = pydicom.dcmread("2.16.840.1.113995.3.110.3.0.10118.2000002.922101.1.dcm")
frames = list(pydicom.encaps.generate_pixel_data_frame(ds.PixelData))
image = Image.open(BytesIO(frames[0]))
pixel_array = np.asarray(image)
fig, axes = plt.subplots(1, 2)
axes[0].imshow(ds.pixel_array)
axes[0].axis('off')
axes[1].imshow(pixel_array)
axes[1].axis('off')
plt.show() |
@hackermd it seems correct, the source JPEG data is YCbCr given the APP0 marker and the Photometric Interpretation is Unless I'm missing something? |
@scaramallion if the pixel data are JPEG-compressed (encapsulated), then the value of Photometric Interpretation describes the color space prior to encoding/compression. I tend argue that the data should always be converted into RGB upon decompression (i.e., when accessing the |
It raises an interesting question with regard to color correction. @dclunie do you have any thoughts on this? |
As I understand it, our practice is to return the pixel data as found in the dataset for JPEG 10918, matching the Photometric Interpretation (i.e. not performing any additional transforms). The new pixel data backend I'm working on will have an option for transforming YBR to RGB which will default to |
I think this will lead to downstream problems and I am not sure it is correct either. In this specific case, it would assume that the image has been acquired in YBR color space. However, the included ICC Profile says that the data color space is RGB. |
Unfortunately there's no perfect answer for this, "common sense" would dictate that we return RGB, but there are use cases where returning the raw YCbCr data is better (transcoding from one JPEG format to another, for example). Our general philosophy is to make as few changes to the data found in the dataset as possible. |
I won't comment on whether your toolkit should be returning the untransformed or transformed color values (in terms of RGB or YCbCr), but I can describe what I understand to be the relationship between what the DICOM PhotometricInterpretation data element says (RGB or YBR_FULL_422) versus what is signaled in the encapsulated compressed bitstream (if anything), and the consequences of what codecs "do", for some commonly encountered patterns:
The chrominance component downsampling is actually a pretty good clue, since that is almost always done for chrominance channels and never done for RGB channels, but it does require examining the JPEG marker segment that describes this. The business of the numbering of the components is a strange one to rely on, but that is a consequence of JFIF making specific requirements for these. Since I use Java and their various JPEG codecs a lot, if find that the behavior described in https://docs.oracle.com/javase/8/docs/api/javax/imageio/metadata/doc-files/jpeg_metadata.html#color informative. Note also that what is present in any DICOM ICCProfile or ColorSpace has nothing to do with the components in the JPEG bitstream; these are intended to be applied after the bitstream has been decompressed and transformed (if necessary) into RGB. The same applies to any ICC profile embedded in the JPEG bitstream in an APP2 marker segment, since AFAIK, ICC profiles are specified with input and output in RGB or CMYK only. |
From a user perspective, I find this behavior very unexpected and problematic. Most users will probably expect the data to be returned as RGB and will not consider additional color space conversions before using the data for image analysis or display. There is also the problem that one cannot readily tell the color space from the returned NumPy array and it's unclear what transformations have been applied during decoding. How is the caller in your opinion supposed to handle the image provided here as an example? To me, the RGB -> YCbCr color space conversion during JPEG compression is part of the encoding and should be reversed upon decoding. As you said this is common sense and doing anything else will probably get the majority of users into trouble. |
I'm not disagreeing! Hence the new pixel data backend. Changing
arr = ds.pixel_array
if ds.PhotometricInterpretation in ("YBR_FULL", "YBR_FULL_422")
arr = convert_color_space(arr, ds.PhotometricInterpretation, "RGB") |
Thanks for the clarification. It's unfortunate. I hope that the new pixel data backend will not be as tightly integrated into the |
@scaramallion I just noted that this behavior is specific to JPEG. Color images encoded with JPEG 2000 or JPEG-LS appear to be transformed into RGB color space upon decoding. Is that intended? If that's the case, then this should be fixed (regardless of whether if would break backwards compatibility) in my opinion. It would be super confusing if the behavior would differ depending on the transfer syntax. |
@scaramallion we will provide a workaround in highdicom for the time being. I remain convinced that this is a bug in pydicom that should be addressed. See ImagingDataCommons/highdicom#152 for further information. |
I've probably been the one primarily driving that philosophy, and in general it is a good one, and in fact I always anticipated other libraries (like highdicom) would take low-level abilities and add to them. But ...
... I like this idea of thinking of it as just part of the decoding, but it's not my area of expertise, so I'm not sure on whether that would be everyone's interpretation. Pydicom is in the business of decoding DICOM data elements into the 'most natural' python types, but as I've said, I don't know myself what that interpretation should be in this case. Regardless, as long as documentation is very clear, I think users can be expected to read a half-page or page describing how it works. In any case, they should obviously be checking any outputs against known/expected results, and not assuming they know what the data is.
I do agree here that we can't fundamentally change existing behavior, so very much support it happening through a different backend. So, I'm not sure I've added much clarity overall, but thought I should at least express some thoughts... |
Thanks for your feedback @darcymason.
I would be fine with this as long as the behavior is consistent across transfer syntaxes. It appears that currently a different approach is used for JPEG baseline 8-bit transfer syntax versus JPEG 2000 or JPEG-LS transfer syntaxes. If that's indeed the case (please correct me if I am wrong), then I would suggest changing one of them to use a consistent approach. |
Ah, yes, I meant to address your previous comments about that too. That is a big problem if true.
I'm not sure about that - I would defer to @scaramallion, but I will say that would really make the new solution even a higher priority, which would allow everything to work consistently with no backwards baggage and become the clear "right way" to do things from here forward. We could strongly promote the new methods in documentation and release messages, perhaps even suggesting converting existing code to it. |
@scaramallion, I've been delinquent on getting a release out, but I'm wondering where you think this and #1605 stand in terms of the release - it seems reasonable to wait on it if possible, but I'm not sure if it is all sorted out yet given this discussion, or how available your time is. We have no fixed schedule in terms of minor releases before 3.0, so could do some closer together if needed. |
@scaramallion @darcymason PS 3.3 C.11.15.1.1 ICC Profile states
That means that pixels of a color image after decompression shall be in RGB color space. |
Sorry, I've been super busy (and I needed a break). My opinion is that this PR is a simple extension of what our current practice is regarding colour spaces, plus some housekeeping code that will get moved around anyway. I think I'll just close it and work on the future pixel data decoding back end instead. |
Describe the changes
jpeg
module and JPEG parsing utilitiesutil.debug_pixel_data
function for printing debug information when troubleshooting pixel data issuesTasks