pixel_array is read as broken values #966

Closed
seilna opened this issue Nov 2, 2019 · 11 comments

Comments

@seilna

seilna commented Nov 2, 2019

Describe the issue
I want to access the pixel data of the DICOM file linked below. It opens successfully in DICOM viewers such as Horos or DWV, but when I use pydicom to access the pixel data, every pixel value is read as zero.

Expected behavior
Read the pixel array without broken pixel values.

Steps To Reproduce
How to reproduce the issue. Please include:

  1. A minimum working code sample
import pydicom
x = pydicom.dcmread('dcm_path.dcm')
pixel = x.pixel_array
print(pixel.min(), pixel.mean(), pixel.max())
>>> 0, 0.0, 0
  2. The traceback (if one occurred)
Stream too short, expected SOT
Failed to decode tile 1/1
  3. Which of the following packages are available and their versions:
  • Numpy == 1.16.5
  • Pillow == 6.2.0
  • GDCM == 2.8.9
  4. The anonymized DICOM dataset (if possible).
    https://drive.google.com/file/d/1QEDbjcK09cxqxflJkOvqBUIHgTJbtBaa/view?usp=sharing

Your environment
Please run the following and paste the output.

$ python -c "import platform; print(platform.platform())"
>>> Linux-4.4.0-87-generic-x86_64-with-debian-stretch-sid
$ python -c "import sys; print('Python ', sys.version)"
>>> Python  3.7.4 (default, Aug 13 2019, 20:35:49)
[GCC 7.3.0]
$ python -c "import pydicom; print('pydicom ', pydicom.__version__)"
>>> pydicom  1.4.0.dev0
@scaramallion
Member

scaramallion commented Nov 4, 2019

Transfer Syntax UID is 1.2.840.10008.1.2.4.90 - JPEG 2000 Lossless.

I've checked the image data stored in the pixel data using jpylyzer and it's reporting that it's not a valid JPEG2K file.

Edit: See this comment

@seilna
Author

seilna commented Nov 4, 2019

@scaramallion Thank you so much for your help. Your comment raises two questions for me.

  1. I also used jpylyzer as follows to check validity of JPEG 2000 lossless image:
import pydicom
x = pydicom.dcmread('failed_dcm.dcm')
with open('failed.jp2', 'wb') as f:
    f.write(x.PixelData)
$ jpylyzer failed.jp2 > failed.xml

and the result below shows that the image data stored in the DICOM file is not valid:

...
<isValidJP2>False</isValidJP2>
...

However, when I run the same check on a DICOM file whose pixel data pydicom reads successfully, jpylyzer also reports that image data as invalid. How is that possible? Do you have any idea how pydicom can handle pixel data that jpylyzer reports as broken?

  2. I also thought that the image data was not valid, but it confuses me that the DICOM file opens without problems in DICOM viewers such as DWV or Horos. Do you have any idea why they can handle broken image data? How can I read the image data successfully using pydicom?

@mrbean-bremen
Member

Handling compressed pixel data is not done directly by pydicom, but rather by GDCM or Pillow, so we rely on these libraries to correctly decode the data. Other libraries used by the viewers you mentioned may be able to handle some invalid DICOM files that these cannot handle.
So, unfortunately, you are out of luck here. I cannot think of a solution other than an ugly workaround such as re-saving the DICOM data with some other program that can handle this kind of data. You could also open an issue with GDCM asking for support for such data, though that won't help you in the short term, of course.

@mrbean-bremen
Member

One last thing to check is whether it is supported by Pillow (not very likely, but worth a try). To test this, you have to use an environment without GDCM, or adapt config.pixel_data_handlers by removing gdcm_handler or moving it behind pillow_handler.
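
The reordering described above could look roughly like the sketch below. It uses stand-in objects rather than the real handler modules, `prefer_handler` is a hypothetical helper (not part of pydicom), and it assumes each entry in `config.pixel_data_handlers` is a module whose `__name__` ends with the handler name.

```python
from types import SimpleNamespace


def prefer_handler(handlers, preferred):
    """Return a new list with handlers whose name ends in `preferred` first.

    The relative order of all other handlers is preserved.
    """
    first = [h for h in handlers if h.__name__.endswith(preferred)]
    rest = [h for h in handlers if not h.__name__.endswith(preferred)]
    return first + rest


# Stand-ins for the real handler modules (assumption: the real entries are
# modules named like these).
gdcm = SimpleNamespace(__name__="pydicom.pixel_data_handlers.gdcm_handler")
pillow = SimpleNamespace(__name__="pydicom.pixel_data_handlers.pillow_handler")

ordered = prefer_handler([gdcm, pillow], "pillow_handler")
print([h.__name__.rsplit(".", 1)[-1] for h in ordered])
```

With pydicom itself, the same idea would presumably be applied by assigning the reordered list back to `pydicom.config.pixel_data_handlers` before accessing `pixel_array`.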

@seilna
Author

seilna commented Nov 4, 2019

@mrbean-bremen Thank you very much for your help. Unfortunately, pillow_handler also failed to decode the pixel data, with the following traceback:

NotImplementedError                       Traceback (most recent call last)
<ipython-input-4-7c0b6bd55455> in <module>
----> 1 x.pixel_array

/opt/conda/lib/python3.7/site-packages/pydicom/dataset.py in pixel_array(self)
   1513             The (7fe0,0010) *Pixel Data* converted to a :class:`numpy.ndarray`.
   1514         """
-> 1515         self.convert_pixel_data()
   1516         return self._pixel_array
   1517

/opt/conda/lib/python3.7/site-packages/pydicom/dataset.py in convert_pixel_data(self)
   1374         )
   1375
-> 1376         raise last_exception
   1377
   1378     def decompress(self):

/opt/conda/lib/python3.7/site-packages/pydicom/dataset.py in convert_pixel_data(self)
   1342             try:
   1343                 # Use the handler to get a 1D numpy array of the pixel data
-> 1344                 arr = handler.get_pixeldata(self)
   1345                 self._pixel_array = reshape_pixel_array(self, arr)
   1346

/opt/conda/lib/python3.7/site-packages/pydicom/pixel_data_handlers/pillow_handler.py in get_pixeldata(ds)
    129                "Pillow lacks the jpeg 2000 decoder plugin"
    130                .format(transfer_syntax.name))
--> 131         raise NotImplementedError(msg)
    132
    133     if transfer_syntax not in PillowSupportedTransferSyntaxes:

NotImplementedError: this transfer syntax JPEG 2000 Image Compression (Lossless Only), can not be read because Pillow lacks the jpeg 2000 decoder plugin

As you said, it is better to try a workaround with other programs. But it was really helpful to know that this is not a problem in pydicom. Thank you again.

@mrbean-bremen
Member

mrbean-bremen commented Nov 5, 2019

I think you need OpenJPEG installed for Pillow JPEG 2000 support, though I have no experience there, and it is not guaranteed that this will work anyway...
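
One way to check whether the installed Pillow was built with JPEG 2000 support is Pillow's own `features` module. This is a minimal sketch; `pillow_has_jpeg2000` is a hypothetical helper name, and a positive result only says the codec is compiled in, not that it can decode this particular file.

```python
def pillow_has_jpeg2000():
    """Return True/False for JPEG 2000 codec support, or None without Pillow."""
    try:
        from PIL import features
    except ImportError:
        # Pillow is not installed in this environment.
        return None
    # "jpg_2000" is the codec name Pillow uses for its OpenJPEG-backed decoder.
    return bool(features.check_codec("jpg_2000"))


print("Pillow JPEG 2000 support:", pillow_has_jpeg2000())
```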

@mrbean-bremen
Member

Closing as answered.

@scaramallion
Member

scaramallion commented Dec 23, 2019

For future readers, something I've realised after working on better support for JPEG2000 is that jpylyzer won't return a valid result for the data stored in PixelData because as per the DICOM Standard, only the JPEG codestream is included and not the file format header.

The optional JP2 file format header shall NOT be included.

After re-checking using the openjpeg library, the JPEG2000 codestream data itself in this issue was showing as broken.
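
To make the distinction concrete, here is a small sketch that tells a JP2 file (with the file format header) apart from a bare JPEG 2000 codestream by looking at the leading bytes. `classify_jpeg2000_bytes` is a hypothetical helper; the third case covers the raw value of an encapsulated Pixel Data element, which starts with a DICOM item tag rather than with image data.

```python
# 12-byte JP2 signature box that opens a .jp2 file.
JP2_SIGNATURE = b"\x00\x00\x00\x0cjP  \r\n\x87\n"
# A bare codestream starts with the SOC marker followed by the SIZ marker.
J2K_SOC_SIZ = b"\xff\x4f\xff\x51"
# Item tag (FFFE,E000) in little endian, as found at the start of
# encapsulated Pixel Data (the Basic Offset Table item).
DICOM_ITEM_TAG = b"\xfe\xff\x00\xe0"


def classify_jpeg2000_bytes(data: bytes) -> str:
    """Classify a byte string by its leading bytes only (no full validation)."""
    if data.startswith(JP2_SIGNATURE):
        return "jp2-file"
    if data.startswith(J2K_SOC_SIZ):
        return "j2k-codestream"
    if data.startswith(DICOM_ITEM_TAG):
        return "dicom-encapsulated"
    return "unknown"


print(classify_jpeg2000_bytes(b"\xff\x4f\xff\x51" + b"\x00" * 8))
```

A tool that expects the "jp2-file" form would be expected to reject the other two, regardless of whether the codestream inside is intact.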

@scaramallion
Member

scaramallion commented Dec 26, 2019

And looking at it again after seeing that openjpeg (as used by Pillow and GDCM) can have issues with trailing padding, it looks like the problem with the data is the COM marker at offset 4099003 (which I think is non-conformant to the JPEG2K standard). If you do the following your image is readable:

from pydicom import dcmread
from pydicom.encaps import defragment_data, encapsulate

ds = dcmread('966.dcm')
bs = defragment_data(ds.PixelData)
bs = bs[:4099003] + b'\xff\xd9'

ds.PixelData = encapsulate([bs])
arr = ds.pixel_array

So final conclusion: image broken but savable. The other viewers likely use a different JPEG2K library.
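
For other files, the offset would have to be found first. The sketch below is one heuristic for that, not what was done in this thread: it looks for the last COM marker (0xFF64) and accepts it only if its declared segment length runs to the end of the data, allowing one padding byte. Both helper names are hypothetical, the byte pair 0xFF64 may also occur inside compressed data, and whether the stray segment in this issue's file actually runs to the end is an assumption.

```python
import struct
from typing import Optional

EOC = b"\xff\xd9"  # end-of-codestream marker
COM = b"\xff\x64"  # comment marker


def find_trailing_com(codestream: bytes, max_padding: int = 1) -> Optional[int]:
    """Return the offset of a COM segment that ends the data, else None."""
    offset = codestream.rfind(COM)
    if offset < 0 or offset + 4 > len(codestream):
        return None
    # The 2-byte big-endian length counts itself but not the marker.
    (seg_len,) = struct.unpack(">H", codestream[offset + 2:offset + 4])
    remaining = len(codestream) - (offset + 2 + seg_len)
    return offset if 0 <= remaining <= max_padding else None


def terminate_at(codestream: bytes, offset: int) -> bytes:
    """Cut the codestream at `offset` and close it with an EOC marker."""
    return codestream[:offset] + EOC


# Synthetic data: SOC, three payload bytes, then a trailing COM segment.
demo = b"\xff\x4f\x01\x02\x03" + COM + struct.pack(">H", 6) + b"\x00\x01hi"
cut = find_trailing_com(demo)
print(cut, terminate_at(demo, cut))
```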

@georgeivan24

I have a similar case: when I try to open my .dcm file with pydicom, the pixel values are all read as zero.
Using the solution from [scaramallion] I couldn't make my DICOM file readable. How can I read it in this case?

https://drive.google.com/file/d/1AC44KMxfKdR6_ODO97mCBeLepNV7-x4d/view?usp=sharing

@scaramallion
Member

scaramallion commented Dec 1, 2023

Welp, looking at the file, I'd say your image is all zeros because your image is all zeros.

(0002,0010) UI =LittleEndianExplicit                    #  20, 1 TransferSyntaxUID
(0028,0002) US 1                                        #   2, 1 SamplesPerPixel
(0028,0004) CS [MONOCHROME2]                            #  12, 1 PhotometricInterpretation
(0028,0006) US 0                                        #   2, 1 PlanarConfiguration
(0028,0010) US 512                                      #   2, 1 Rows
(0028,0011) US 512                                      #   2, 1 Columns
(0028,0100) US 16                                       #   2, 1 BitsAllocated
(0028,0101) US 12                                       #   2, 1 BitsStored
(0028,0102) US 11                                       #   2, 1 HighBit
(0028,0103) US 0                                        #   2, 1 PixelRepresentation
(0028,1050) DS [200]                                    #   4, 1 WindowCenter
(0028,1051) DS [50]                                     #   2, 1 WindowWidth
(0028,1052) DS [0]                                      #   2, 1 RescaleIntercept
(0028,1053) DS [1]                                      #   2, 1 RescaleSlope
(0028,1055) LO [Selección del usuario]                  #  22, 1 WindowCenterWidthExplanation
(0028,1056) CS [LINEAR]                                 #   6, 1 VOILUTFunction
(0028,2110) CS [01]                                     #   2, 1 LossyImageCompression
(0028,2112) DS [4228]                                   #   4, 1 LossyImageCompressionRatio
(7fe0,0010) OW 0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000\0000... # 524288, 1 PixelData

Data is uncompressed (i.e. not JPEG), entire Pixel Data (offset is 74566) is 0x00 (i.e. zero).
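
The same check can be done in a few lines. This is a sketch with hypothetical helper names, assuming uncompressed data and a Bits Allocated value that is a multiple of 8; the expected length computed from the attributes above matches the 524288 bytes shown in the dump.

```python
def expected_pixel_bytes(rows, columns, samples_per_pixel, bits_allocated, frames=1):
    """Expected length in bytes of uncompressed Pixel Data (before any padding)."""
    return rows * columns * samples_per_pixel * frames * (bits_allocated // 8)


def is_all_zero(data: bytes) -> bool:
    """True if every byte is 0x00 (also True for empty input)."""
    return not any(data)


# Values taken from the dump above: 512 x 512, 1 sample per pixel, 16 bits allocated.
n_bytes = expected_pixel_bytes(512, 512, 1, 16)
print(n_bytes)  # 524288

# Stand-in for ds.PixelData from the file discussed here.
print(is_all_zero(bytes(n_bytes)))  # True
```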

Are you expecting to see text? There does appear to be Overlay Data, but I didn't look at it. Edit: 0x6002 has some text in it.

from pydicom import dcmread
import matplotlib.pyplot as plt

ds = dcmread("966b.dcm")
plt.imshow(ds.overlay_array(0x6002))
plt.show()
