Color space and saving issues with JPEG Baseline (Process 1) #1781

Open
naterichman opened this issue Mar 31, 2023 · 3 comments

Comments

@naterichman

naterichman commented Mar 31, 2023

Describe the bug
I'm not sure whether any of this is a bug, but I have a sample file that is JPEG encoded, and I'm having issues converting the color space and saving after decompressing.

Steps To Reproduce

I wrote a simple script to visualize the issues I'm having with this JPEG-encoded ultrasound (US) DICOM file.

When I load the image, decompress using pylibjpeg and plot the image, the color is in YBR, which is fine, but I tried (1) saving the image and reloading and it gets distorted, and (2) converting the color space to RGB and saving (I did remember to update the PhotometricInterpretation), and that also gets distorted (see figure below).

My question is, am I using this wrong? I would at least expect decompressing, saving, and reloading to give the same results.

import pydicom
from pydicom.pixel_data_handlers.util import convert_color_space
from matplotlib import pyplot as plt
import tempfile
from importlib import metadata

if __name__ == '__main__':
    print(metadata.version('pydicom'))
    fig, ax = plt.subplots(1,3)
    dcm = pydicom.dcmread('/home/naterichman/Downloads/ob_us_single.dcm')
    dcm.decompress(handler_name='pylibjpeg')
    ax[0].imshow(dcm.pixel_array[0,:,:,:])
    ax[0].set_title('Original')

    tmp1 = tempfile.NamedTemporaryFile()
    dcm.save_as(tmp1.name)
    dcm2 = pydicom.dcmread(tmp1.name)
    ax[1].imshow(dcm2.pixel_array[0,:,:,:])
    ax[1].set_title('Decompressed')


    tmp2 = tempfile.NamedTemporaryFile()
    dcm.PixelData = convert_color_space(
        dcm.pixel_array, dcm.PhotometricInterpretation, 'RGB'
    ).tobytes()
    dcm.PhotometricInterpretation = 'RGB'
    dcm.save_as(tmp2.name)
    dcm3 = pydicom.dcmread(tmp2.name)
    ax[2].imshow(dcm3.pixel_array[0,:,:,:])
    ax[2].set_title('Converted Color Space')
    plt.show()

This script prints the following warning as well:

/home/naterichman/.pyenv/versions/3.9.6/lib/python3.9/site-packages/pydicom/pixel_data_handlers/numpy_handler.py:250: UserWarning: The Photometric Interpretation of the dataset is YBR_FULL_422, however the length of the pixel data (43338240 bytes) is a third larger than expected (28892160 bytes) which indicates that this may be incorrect. You may need to change the Photometric Interpretation to the correct value.
  warnings.warn(msg)
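
The two byte counts in the warning can be checked by hand. The following is a minimal sketch, assuming 8-bit samples and three samples per pixel (neither is stated in the report). The helper `expected_length` is hypothetical and not part of pydicom. The idea is that YBR_FULL_422 shares one Cb and one Cr sample between two pixels, so it needs 2 bytes per pixel rather than 3.

```python
def expected_length(total_pixels: int, photometric_interpretation: str) -> int:
    """Bytes needed for uncompressed 8-bit, 3-sample pixel data (assumed)."""
    full = total_pixels * 3  # one byte for each of the three samples
    if photometric_interpretation == "YBR_FULL_422":
        # 4:2:2 subsampling: two pixels share one Cb and one Cr sample,
        # so four bytes describe two pixels.
        return full // 3 * 2
    return full


actual_bytes = 43338240    # PixelData length reported in the warning
expected_bytes = 28892160  # length the warning expects for YBR_FULL_422

# Pixels across all frames, if the data is not subsampled.
total_pixels = actual_bytes // 3

print(expected_length(total_pixels, "YBR_FULL_422") == expected_bytes)
print(expected_length(total_pixels, "YBR_FULL") == actual_bytes)
```

Both comparisons hold, which suggests the pixel data holds full-resolution 3-byte pixels while PhotometricInterpretation still says YBR_FULL_422.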

Expected behavior

Given the following figure, I expect the first to look like it does, the second to look like the first, and the third to have the correct colors.

[Screenshot from 2023-03-31 14-35-16: three panels titled "Original", "Decompressed", and "Converted Color Space"]

Your environment

$ python -m pydicom.env_info
module       | version
------       | -------
platform     | Linux-5.15.0-67-generic-x86_64-with-glibc2.35
Python       | 3.9.6 (default, Oct 10 2021, 15:58:45)  [GCC 10.3.0]
pydicom      | 2.3.1
gdcm         | _module not found_
jpeg_ls      | _module not found_
numpy        | 1.22.3
PIL          | 9.0.1
pylibjpeg    | 1.4.0
openjpeg     | 1.3.2
libjpeg      | 1.3.4

Edit: Added the warning

@darcymason
Member

Hopefully others will comment, as color conversion is not an area of expertise for me. It is possibly a bug in pydicom, though: decompress does promise to update data elements as necessary, although it does not specifically mention PhotometricInterpretation.

@naterichman
Author

So it turns out it's saving the dataset after decompressing, without first converting the color space, that messes up the pixel data.

I tried removing the middle section of the script above, and it produces the expected results.

Updated script:

import pydicom
from pydicom.pixel_data_handlers.util import convert_color_space
from pydicom import env_info
from matplotlib import pyplot as plt
import tempfile
from importlib import metadata

if __name__ == '__main__':
    env_info.main()
    fig, ax = plt.subplots(1, 2)
    dcm = pydicom.dcmread('/tmp/dicom/OB/us_acuson2.dcm')
    dcm.decompress('pylibjpeg')
    print(dcm.file_meta.TransferSyntaxUID, dcm.PhotometricInterpretation)
    ax[0].imshow(dcm.pixel_array[0,:,:,:])
    ax[0].set_title('Original')

    tmp1 = tempfile.NamedTemporaryFile()
    # Note, uncommenting the below produces the same result as first (corrupt pixel data)
    #dcm.save_as(tmp1.name)

    tmp = tempfile.NamedTemporaryFile()
    dcm.PixelData = convert_color_space(
        dcm.pixel_array, 'YBR_FULL', 'RGB'
    ).tobytes()
    dcm.PhotometricInterpretation = 'RGB'
    dcm.save_as(tmp.name)
    dcm3 = pydicom.dcmread(tmp.name)
    print(dcm3.file_meta.TransferSyntaxUID, dcm3.PhotometricInterpretation)
    ax[1].imshow(dcm3.pixel_array[0,:,:,:])
    ax[1].set_title('Converted Color Space')
    plt.show()

@mrbean-bremen
Member

This may not be obvious from the documentation, but decompress does not change PixelData - it just changes the cached NumPy array accessed via pixel_array. To update the PixelData element, you have to write it explicitly (as you do in the second part).
decompress does change the transfer syntax though - which is a bit inconsistent, and leads to the wrong image saved in this case. As far as I remember @darcymason at one point had tried to make this more transparent by writing back PixelData on decompress, but that turned out to be quite complicated.

I'm not sure if and how we want to change this behavior, but it probably deserves clearer documentation at least.
