Open JPEG Lossless image #532

Closed
alexattia opened this issue Jan 15, 2018 · 60 comments

@alexattia

Description

Hello,
I am trying to use pydicom to open images in a DICOM directory.
But when I try to open an image, I get this error:
NotImplementedError: No available image handler could decode this transfer syntax JPEG Lossless, Non-Hierarchical, First-Order Prediction (Process 14 [Selection Value 1])

Steps/Code to Reproduce

I run this:

for record in ds.DirectoryRecordSequence:
    if record.DirectoryRecordType == "IMAGE":
        # Extract the relative path to the DICOM file
        path = os.path.join(*record.ReferencedFileID)
        dcm = dicom.read_file(path)
        d = dcm.pixel_array

and it raises the NotImplementedError shown above.

Versions

I have tried with pydicom 1.0.

Do you have any idea how to solve this error, or is there another package that can open these images?
Thank you very much in advance.

@mrbean-bremen
Member

This probably means that you do not have a package installed that can handle this syntax.
There are 3 packages that can handle this: jpeg_ls (CharPyLS on PyPI), pillow and gdcm. Pydicom itself uses these packages for decoding compressed images (additionally, you need numpy for any functionality related to image data).
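
A quick way to see which of these optional packages Python can actually locate is sketched below. It uses only the standard library; `available_modules` is a hypothetical helper, and the import names (`gdcm`, `PIL`, `jpeg_ls`) are assumed from the package names mentioned in this comment. Being importable does not guarantee that a package can decode a given transfer syntax.

```python
from importlib.util import find_spec

# Import names of the optional decoders mentioned above, plus NumPy.
# These names are assumptions based on the packages listed in the comment.
CANDIDATES = ("numpy", "gdcm", "PIL", "jpeg_ls")


def available_modules(names=CANDIDATES):
    """Map each top-level import name to True if Python can locate it."""
    return {name: find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in available_modules().items():
        print(f"{name:8s} {'found' if found else 'missing'}")
```

Run this with the same interpreter (or notebook kernel) that raises the error, since packages installed in a different environment will not show up.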

@alexattia
Author

I just installed pillow and gdcm (numpy was already installed), but I couldn't find a way to install CharPyLS on Mac. I tried without CharPyLS and it throws the same error.

@mrbean-bremen
Member

Hm, I just checked the code, and gdcm should definitely handle this transfer syntax, while pillow can handle it if the pillow jpeg plugin is installed, so it definitely should work for you. Can you check that you really can import gdcm or import PIL in your Python environment?

@alexattia
Author

Thank you for your answer. I just checked, and import gdcm and import PIL both work fine, but when I run:

sr = pydicom.read_file(path)
sr.pixel_array

I have this error:
AttributeError: module 'gdcm' has no attribute 'ImageReader'
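
An AttributeError like this one suggests that `import gdcm` succeeded but picked up something other than the complete GDCM Python bindings. That is an assumption, not a confirmed diagnosis. The sketch below reports where a module was loaded from and whether it exposes a given attribute; `inspect_module` is a hypothetical helper written for illustration.

```python
import importlib


def inspect_module(name, required_attr):
    """Report where `name` was imported from and whether it has `required_attr`."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return {"importable": False, "file": None, "has_attr": False}
    return {
        "importable": True,
        # Namespace packages and some built-ins have no __file__.
        "file": getattr(module, "__file__", None),
        "has_attr": hasattr(module, required_attr),
    }


if __name__ == "__main__":
    # ImageReader is the attribute named in the AttributeError above.
    print(inspect_module("gdcm", "ImageReader"))
```

If `file` points somewhere unexpected (for example, a local file or folder named `gdcm`), that would explain the missing attribute.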

@mrbean-bremen
Member

Sorry for the late answer - but I don't have a solution. Sounds like a problem with the version of gdcm. I don't have a Mac to reproduce this, and on my Windows machine I had trouble installing gdcm (though it works with pillow).
Can you post the output of pip list to check the installed versions?
Also, you could try to remove gdcm and check if it works with only pillow installed.
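
If `pip list` is awkward to run (for example, inside a notebook), roughly the same information can be gathered from Python itself. This is a minimal sketch that assumes Python 3.8 or later for `importlib.metadata`; `installed_versions` is a hypothetical helper, and the distribution names are assumptions. A conda-installed GDCM may not register a distribution under the name `gdcm`, in which case it is reported as `None` even when it is importable.

```python
import platform
from importlib import metadata

# Distribution names are assumptions; adjust them to match your install.
DISTRIBUTIONS = ("pydicom", "numpy", "Pillow", "gdcm", "CharPyLS")


def installed_versions(names=DISTRIBUTIONS):
    """Return the Python version plus the version of each named distribution."""
    versions = {"python": platform.python_version()}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed, or not registered
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version}")
```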

@mrbean-bremen
Member

There is also the option to install gdcm using conda-forge (see https://pydicom.github.io/pydicom/dev/getting_started.html). This works at least under Linux and Windows.

@dalcacer

Hi @alexattia, hi @mrbean-bremen, the hint with gdcm via conda-forge works out for that file format/transfer syntax. CharPyLS/pillow can't read 1.2.840.10008.1.2.4.70.

A little heads-up regarding gdcm (on Mac?): the conda-forge gdcm only works in a python=3.6.1 environment. Higher versions crash at runtime.

@mrbean-bremen
Member

@glemaitre - do you know anything about this conda-forge problem?

@glemaitre
Contributor

@dalcacer could you open a PR with a minimal example that triggers the crash?
I don't see why a different version of Python should trigger a crash, but you never know.

@dalcacer

dalcacer commented Feb 27, 2018

Just in case, a little more information.
I started out with Python 3.6.1, pydicom 0.9.9 and gdcm 2.6.6 (conda-forge).
I then moved to Python 3.6.4, pydicom 1.0.0a (via clinical-graphics) and gdcm 2.6.6 (via clinical-graphics).
Finally, I ended up building gdcm 2.6.6 and gdcm 2.8.4 based on the conda recipes, and I tinkered with the involved version numbers.
Whenever a Python > 3.6.1 is involved I end up with a

Fatal Python error: PyThreadState_Get: no current thread

when the tests are run.

@glemaitre
Contributor

Could you upgrade pydicom from conda-forge to the new version, 1.0.1?

@glemaitre
Contributor

Whenever a Python > 3.6.1 is involved I end up with a
Fatal Python error: PyThreadState_Get: no current thread

Could you give a small snippet of code that triggers that error?

@dalcacer

First of all, thanks for the conda-forge release :)
The issue is not directly related to pydicom, but to (conda-forge/clinical-graphics) gdcm and macOS.
An issue report against https://github.com/conda-forge/gdcm-feedstock/ would be useful, I guess.
Though the gdcm-feedstock seems to be somewhat stale: https://travis-ci.org/conda-forge/gdcm-feedstock.

I could fork and link the recipe with the according changes.

@rhaxton
Contributor

rhaxton commented Feb 27, 2018

I have encountered the above error many times ( PyThreadState_Get: no current thread )
This is usually with cx_Freeze, but others as well. Almost always, it is related to some python C extension that is compiled with a different compiler than the python library. This is the bane of my existence with python. Installing a fresh conda and then trying to reinstall the modules does not always fix it because pip install/python setup.py install will not always recompile the extension. I have to remember to force recompilation...

gdcm's python wrapper has always been difficult for me to use because this issue appears on some builds/os's but not others, or I will get a crash on exit (a "double free" of something). Other times, it works perfectly. Again, the issue is almost always that I missed some step of compiling/installing correctly.
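
When chasing this kind of compiler mismatch, it can help to record how the running interpreter itself was built, so it can be compared against how the extension was compiled. A minimal sketch using only standard-library calls follows; `interpreter_build_info` is a hypothetical helper, and the pip command in the comment is only one possible way to force a rebuild, with `<package>` as a placeholder.

```python
import platform
import sys
import sysconfig


def interpreter_build_info():
    """Collect details about how the running Python interpreter was built."""
    return {
        "version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "compiler": platform.python_compiler(),  # compiler used to build Python
        "platform": sysconfig.get_platform(),
        "executable": sys.executable,
    }


if __name__ == "__main__":
    for key, value in interpreter_build_info().items():
        print(f"{key}: {value}")
    # One way to force pip to rebuild an extension from source instead of
    # reusing a cached or prebuilt wheel (<package> is a placeholder):
    #   pip install --force-reinstall --no-cache-dir --no-binary :all: <package>
```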

@mrbean-bremen
Member

Looks like #513 is a similar issue.

@Liu0329

Liu0329 commented Aug 16, 2018

Updating to 1.2.0 solved the problem.
1.1.0 cannot handle it.

@fredaTian

Hi @Liu0329, I ran into the same problem, but could not solve it even though I have already updated to 1.2.0 dev0.

@darcymason
Member

@fredaTian, what supporting libraries do you have installed? (see, e.g. second comment in this issue)

@scaramallion
Member

And what's your transfer syntax UID?

from pydicom import dcmread
ds = dcmread('/path/to/file')
print(ds.file_meta.TransferSyntaxUID)
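
Once the UID is printed, it helps to know which compression family it belongs to, since different decoders cover different families. The lookup below is a small sketch covering only the JPEG-family transfer syntaxes defined in the DICOM standard; `describe_transfer_syntax` is a hypothetical helper and is not part of pydicom.

```python
# JPEG-family transfer syntax UIDs and their names from the DICOM standard.
JPEG_TRANSFER_SYNTAXES = {
    "1.2.840.10008.1.2.4.50": "JPEG Baseline (Process 1)",
    "1.2.840.10008.1.2.4.51": "JPEG Extended (Process 2 and 4)",
    "1.2.840.10008.1.2.4.57": "JPEG Lossless, Non-Hierarchical (Process 14)",
    "1.2.840.10008.1.2.4.70": (
        "JPEG Lossless, Non-Hierarchical, First-Order Prediction "
        "(Process 14 [Selection Value 1])"
    ),
    "1.2.840.10008.1.2.4.80": "JPEG-LS Lossless Image Compression",
    "1.2.840.10008.1.2.4.81": "JPEG-LS Lossy (Near-Lossless) Image Compression",
    "1.2.840.10008.1.2.4.90": "JPEG 2000 Image Compression (Lossless Only)",
    "1.2.840.10008.1.2.4.91": "JPEG 2000 Image Compression",
}


def describe_transfer_syntax(uid):
    """Return the name for a JPEG-family UID, or None if it is not in the table."""
    return JPEG_TRANSFER_SYNTAXES.get(str(uid))


if __name__ == "__main__":
    print(describe_transfer_syntax("1.2.840.10008.1.2.4.70"))
```

The `str(uid)` call means the value printed by the snippet above can be passed in directly.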

@darcymason
Member

@jasonminsookim

I fixed the issue by updating pydicom to 1.2.0dev() then restarting my kernel.

@Luxas98

Luxas98 commented Sep 10, 2018

I found that the GDCM installation was not compiling/importing its libs for Python 3. I made a little wrapper, hope it helps: https://github.com/HealthplusAI/python3-gdcm

After that, just run pip install Pillow. At least that's what our current production servers do to get rid of NotImplementedError: No available image handler could decode this transfer syntax JPEG 2000 Image Compression

@darcymason
Member

@Luxas98, I think that could be helpful for users -- perhaps we could add a link to python3-gdcm to the pydicom documentation at the page I linked above.

Otherwise, I think we can close this issue. While we can try to point people in the right direction with errors like the ones discussed above, generally this is beyond the scope of pydicom. There are so many possible variations in platform, install, compiling, etc. as has been noted above, that the user really has to take that on themselves if using these external libraries. And the pydicom google group can be used to reach a broader user base for possible help.

@fimafurman

I had a similar issue, and after installing gdcm (CentOS) I am now getting a 'utf-8' codec can't decode byte 0xa8 in position 24806: invalid start byte error.

I tried loading a few different DICOM images and seem to always get utf8 errors.

@scaramallion
Member

scaramallion commented Sep 12, 2018

Could you start a new issue and attach an anonymised version of an example file please?

@Liu0329

Liu0329 commented Oct 8, 2018

I also have the problem using pydicom 1.2.0 dev and cannot solve it.
I also tried the conda gdcm package:
gdcm_array = image.GetBuffer() only gives a string '\udc88',
and image.GetDimension(0) gives a segmentation fault.
The DICOM viewer can display the image, but I need to access the pixels and convert them to a numpy array.
@scaramallion I have uploaded my file, with '.txt' appended to follow GitHub's upload rules.
7428_000001_1.2.840.113619.2.203.4.2147483647.1417723747.750376.txt
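
The '\udc88' value is what Python produces when the raw byte 0x88 is decoded as UTF-8 with the `surrogateescape` error handler. The same mechanism explains "invalid start byte" errors when binary pixel data is treated as text. The sketch below shows the round trip using plain Python only. Whether GDCM's wrapper returns such a string in a given build is an assumption based on the value reported above.

```python
raw = b"\x88"  # a byte that cannot start a UTF-8 sequence

# Strict decoding fails, as in the "invalid start byte" errors in this thread.
try:
    raw.decode("utf-8")
    error_reason = None
except UnicodeDecodeError as exc:
    error_reason = exc.reason

# surrogateescape maps each undecodable byte to a lone surrogate code point...
text = raw.decode("utf-8", errors="surrogateescape")

# ...and encoding with the same handler recovers the original bytes.
recovered = text.encode("utf-8", errors="surrogateescape")

print(error_reason, repr(text), recovered)
```

So a string like this is not pixel data by itself, but the original bytes can be recovered from it as long as the same error handler is used in both directions.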

@Liu0329

Liu0329 commented Oct 8, 2018

Hi @Liu0329, I ran into the same problem, but could not solve it even though I have already updated to 1.2.0 dev0.

@fredaTian
Yes, 1.2.0 still fails sometimes

@scaramallion
Member

scaramallion commented Oct 8, 2018

Transfer syntax: 1.2.840.10008.1.2.4.70 (JPEG Lossless, Non-hierarchical, first-order prediction), 1 frame, 2021 x 2021, 1 sample/pixel, 16/14 bit, with 3 LUTs, overlay and image icon.

Pixel Data element starts at offset 616798, has BOT with length 4, value 0. Image data starts at offset 616830 and ends at 4700086 (length 4083257).

JPG has SOF11 marker, format matches transfer syntax, 2021x2021, 1 channel, 16 bit precision on input sample. Dataset can be decompressed with DCMTK's dcmdjpeg.

@Liu0329 I don't have any issues in a new conda virtualenv with python 3.7, numpy 1.15.2, gdcm 2.8.4 and pydicom 1.2 running on ubuntu 18.04. Could you post a minimal working example and your system/package versions?

The following works for me:

from pydicom import dcmread

ds = dcmread('path/to/dataset')
arr = ds.pixel_array

Also, is it OK if we add your dataset as one of our testing datasets?
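
The JPEG header check described in this comment (start-of-frame marker, dimensions, sample precision) can be approximated with a few lines of standard-library Python. This is a minimal sketch, not pydicom's or DCMTK's implementation. `read_sof` is a hypothetical helper that assumes `jpeg` holds one complete encoded frame and that only length-prefixed segments precede the SOF segment.

```python
import struct

# SOF markers are 0xC0-0xCF, except DHT (0xC4), JPG (0xC8) and DAC (0xCC).
SOF_MARKERS = set(range(0xC0, 0xD0)) - {0xC4, 0xC8, 0xCC}


def read_sof(jpeg: bytes):
    """Return the frame header fields of a JPEG stream, or None if not found."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG stream (missing SOI marker)")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        # The 2-byte big-endian length includes the length field itself.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        if marker in SOF_MARKERS:
            precision, rows, columns, components = struct.unpack(
                ">BHHB", jpeg[pos + 4:pos + 10]
            )
            return {
                "marker": f"SOF{marker - 0xC0}",
                "precision": precision,
                "rows": rows,
                "columns": columns,
                "components": components,
            }
        pos += 2 + length  # skip the marker and its segment
    return None
```

Comparing the returned precision and dimensions with Bits Allocated, Rows and Columns in the dataset is one way to confirm that the encoded frame matches the DICOM header.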

@slowvak

slowvak commented May 10, 2019 via email

@slowvak

slowvak commented May 10, 2019 via email

@mrbean-bremen
Member

Well, it should log that - something along the lines in the original description.

@mrbean-bremen
Member

@darcymason - can we close this issue? I think it gets confusing when problems with unsupported transfer syntaxes and problems with GDCM are mixed together. Maybe we should add a troubleshooting guide that explains these kinds of problems in a more prominent place...

@slowvak

slowvak commented May 10, 2019 via email

@scaramallion
Member

@amandagb please don't upload non-anonymised datasets.

I would suggest starting a new issue with an anonymised version of the dataset

@pydicom pydicom deleted a comment from amandagb Jul 20, 2020
@amandagb

amandagb commented Jul 20, 2020

@scaramallion The dataset was anonymized by Orthanc; I simply forced it to have a "real looking" name with the faker package. If you'd like, I can force the name to be Anonymous. All PHI covered by HIPAA and GDPR was removed.
You can look at the tag "PatientIdentityRemoved" : "YES" --- I assure you I had taken that step.

@scaramallion
Member

@amandagb my apologies then, all the patient related info (name, ID, birthdate) seemed like it was from a real identity.

@amandagb

@scaramallion I just finished re-creating the dataset with an anonymous name as you requested, but it seems the refresh lost my previous post... Can you retrieve it somehow, or do I have to re-type the issue? I have a new txt DICOM file (which should be basically the same but with the name "Anonymous JPEGfailure").
84b0ff9b-594e-4f23-ae3e-e9050ab99a46.txt

@scaramallion
Member

@amandagb's original comment:

I'm having this same issue when trying to read the pixel array from a dicom sequence with transfer syntax '1.2.840.10008.1.2.4.70' (JPEG Lossless, Non-Hierarchical, First-Order Prediction (Process 14 [Selection Value 1]) )

The behavior is a bit complicated. The files are hosted on an Orthanc server, and I typically retrieve the files via the request (shown below), convert the file content to a bytes object, then feed it into pydicom. When I then try to access the pixel_array from the pydicom dataset, the kernel just dies (see Example Code Block 1).

Alternatively, if I go through Chrome, and make the same request to download the file (type http://169.254.100.100:8042/instances/efd2303d-bd5b83e5-5656f3c1-31ece5f7-db2dfbbf/file into Chrome and get the attached dicom file (which I've re-defined as a txt file per Git file attachment requirements) ), then execute Example Code Block 2, I can access the pixel_array without any issues.

Any insight into why this might be happening? I'm running Windows 10 with an Anaconda3 64bit install. I've attached my conda list output as well as the anonymized file causing the problems.

python=3.6.8
pydicom=1.3.0=py_0
gdcm=2.8.4=py36_vc14_0
pillow=5.4.1=py36hdc69c19_0

Example Code Block 1

import pydicom
import io
import requests
dcmInstance = 'http://169.254.100.100:8042/instances/efd2303d-bd5b83e5-5656f3c1-31ece5f7-db2dfbbf/file'
dcmFile = requests.get(dcmInstance)
dcmBytes = io.BytesIO(dcmFile.content)
ds = pydicom.dcmread(dcmBytes)
# Running this line will then kill the kernel
testArr = ds.pixel_array

Example Code Block 2

import pydicom
ds = pydicom.dcmread(r'C:\Users\Downloads\aaf20e6f-f35f-4fb4-a52a-d257d2ccf4bc.txt')
testArr = ds.pixel_array

pydicomEnv.txt

aaf20e6f-f35f-4fb4-a52a-d257d2ccf4bc.txt

@scaramallion
Member

scaramallion commented Jul 21, 2020

It's difficult to determine the problem without more information (or a traceback). Pillow doesn't actually support JPEG Lossless, so that may be a cause, although it should just fall back to GDCM...

You could try running python with pdb or -d and see if there's anything useful.

@amandagb

amandagb commented Jul 21, 2020

I have a feeling it has to do with how the request contents are encoded or something. I'm not sure I understand how to use pydicom's pydicom.encaps.generate_pixel_data but when I run this command:
pydicom.encaps.generate_pixel_data(dcmBytes)
I don't have an issue. I don't understand how to use this to get the actual pixel data out, though...

I actually save the request contents (the dcmBytes object) to an actual dicom file. If I then re-read the file, I can access the pixel_array without problems. Is there any chance that there's a race condition where we're trying to access pixel_data property before the entire file has been read?

@scaramallion
Member

scaramallion commented Jul 21, 2020

generate_pixel_data() will return a generator that yields encoded frames, so you'd still need to decode them. It should be run like:

generator = generate_pixel_data(ds.PixelData, getattr(ds, 'NumberOfFrames', 1))
encoded_frame = next(generator)  # each frame is a JPEG image

Is there any chance that there's a race condition where we're trying to access pixel_data property before the entire file has been read?

As far as I can tell requests.get() is synchronous, so it should be complete by the time pixel_array is called.

@slowvak

slowvak commented Jul 21, 2020 via email

@scaramallion
Member

I have been burned by images compressed with that codec and the explanation from pydicom is not so clear.

We'd love to improve the documentation if you'd like to explain where/how it's deficient.

@slowvak

slowvak commented Jul 21, 2020 via email

@amandagb

Yes, gdcm is installed.
I think the issue is in the dcmread function. It seems that dcmread somehow doesn't interpret the byte object the same way as a file path. If I write the byte object to a file and then feed dcmread the file path, it seems to work, though not in a loop (hence my race condition question; it's geared more toward the writing and then reading of the file than toward accessing the request content).
I'm going to poke around pydicom source to see if I can figure anything out. Otherwise, I'll just have to refactor my code! The file content seems completely fine, it's just how dcmread is interpreting the input argument (byte object vs file path) that's the issue.

@scaramallion
Member

scaramallion commented Jul 21, 2020

BytesIO and other file-like objects should work. If you can print(ds) successfully then it's working as intended.

It might be a long shot, but you should probably try with the most recent release (v2.0) as well.

@amandagb

Other parts of the dataset are accessible, and with other transfer syntaxes it works with the byte objects. For some reason just these files seem to kill the kernel when I try to access pixel_data. I’m not sure how generalizable it is. I’ll do some more investigation tonight and post anything I figure out.

@scaramallion
Member

scaramallion commented Jul 21, 2020

Hmm, GDCM 2.8.4 doesn't have the in-memory pixel data decoding, so it'll be using Dataset.filename which is the BytesIO() instance which could be causing the crash. Let me test this and see if I can reproduce.

Side note: I really hate debugging GDCM issues, it's such a pain to install
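
If the handler reads from `Dataset.filename`, as suggested above, a cheap guard before calling `pixel_array` is to check that the attribute really is a path to a file on disk rather than `None` or a file-like object. A minimal sketch follows; `has_usable_path` is a hypothetical helper, and the demo uses `SimpleNamespace` stand-ins instead of real datasets so that it runs without pydicom.

```python
import io
import os
from types import SimpleNamespace


def has_usable_path(ds):
    """True if ds.filename is a str or path-like naming an existing file."""
    filename = getattr(ds, "filename", None)
    if not isinstance(filename, (str, os.PathLike)):
        return False  # None, BytesIO and other file-like objects end up here
    return os.path.isfile(filename)


if __name__ == "__main__":
    # Stand-ins for datasets read from memory rather than from disk.
    print(has_usable_path(SimpleNamespace(filename=None)))
    print(has_usable_path(SimpleNamespace(filename=io.BytesIO(b""))))
```

When the check fails, the dataset can be saved to disk and re-read from its path before decoding.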

@scaramallion
Member

scaramallion commented Jul 21, 2020

No, I just get a regular exception:

>>> import gdcm
>>> gdcm.Version_GetVersion()
'2.8.4'
>>> from pydicom import __version__
>>> __version__
'1.3.0'
>>> from pydicom import dcmread
>>> from io import BytesIO
>>> with open('532b.dcm', 'rb') as f:
...     bs = BytesIO(f.read())
... 
>>> ds = dcmread(bs)
>>> arr = ds.pixel_array
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../pydicom/dataset.py", line 1362, in pixel_array
    self.convert_pixel_data()
  File ".../pydicom/dataset.py", line 1308, in convert_pixel_data
    raise last_exception
  File ".../pydicom/dataset.py", line 1276, in convert_pixel_data
    arr = handler.get_pixeldata(self)
  File ".../pydicom/pixel_data_handlers/gdcm_handler.py", line 210, in get_pixeldata
    raise TypeError("GDCM could not read DICOM image")
TypeError: GDCM could not read DICOM image

Although that's a bug all on its own...

@amandagb

I ultimately decided to just code a workaround. With the versions and files I mentioned, the Python kernel still just dies without any indication of the problem. I looked into pydicom's filereader module, but wasn't able to find anything that seemed like an issue. I may see if the issue can be resolved by upgrading pydicom/gdcm, but for now I can't quite figure out why the byte object fails but the file is fine.

@amandagb

amandagb commented Jul 21, 2020

@scaramallion, I finally got to the root of the issue. As you mentioned, the ds.filename is used in the gdcm_handler, lines 208 - 211. While your environment somehow gracefully handles the error, mine kills the kernel

  File ".../pydicom/pixel_data_handlers/gdcm_handler.py", line 210, in get_pixeldata
    raise TypeError("GDCM could not read DICOM image")
TypeError: GDCM could not read DICOM image

It seems what's happening is that since type(ds.filename) = NoneType, when calling image_reader = gdcm.ImageReader().SetFileName(NoneType) there's an exception since the type given to SetFileName is not supported by gdcm.

As best I can tell, you can't use the ImageReader directly for a bytes object. Seems like you'd have to use a gdcm.Fragment (see this Stackoverflow post which gives C# code). Am I correct, then, in saying that for the gdcm supported syntaxes (basically all the JPEG ones), you can't pass in a Bytes object?
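
Given that this GDCM path needs a real file name, one workaround consistent with the observation that reading from a saved file works is to write the downloaded bytes to a temporary file and pass its path to dcmread. This is a sketch of that idea, not the fix applied in pydicom. `write_temp_dicom` is a hypothetical helper, and the commented usage assumes pydicom is installed and `response` is the object returned by `requests.get()`.

```python
import os
import tempfile


def write_temp_dicom(data: bytes, suffix: str = ".dcm") -> str:
    """Write `data` to a new temporary file and return its path."""
    # delete=False keeps the file after closing, so it can be reopened
    # by name afterwards (reopening an open temp file fails on Windows).
    with tempfile.NamedTemporaryFile(suffix=suffix, delete=False) as tmp:
        tmp.write(data)
    return tmp.name


# Hypothetical usage:
#
#   path = write_temp_dicom(response.content)
#   try:
#       ds = pydicom.dcmread(path)
#       arr = ds.pixel_array
#   finally:
#       os.remove(path)  # the caller is responsible for cleanup
```

The file is closed before it is read back, which also rules out reading a partially written file.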

@scaramallion
Member

@amandagb this should be fixed in master, is it too much trouble to test it out?

@SimonBiggs
Contributor

@scaramallion I find that creating a dev release on PyPI makes it much easier for people to test these sorts of things. Not sure if that's something that is easy to do for pydicom... But it makes user testing super simple...

@amandagb

@scaramallion let me see what I can do. Is the best way to just clone from github and add that reference to pydicom to my path? I'm accustomed to using conda to set up environments so I'm not too clear on how to test the master branch.

@scaramallion
Member

pip install git+https://github.com/pydicom/pydicom should work
