Improve support for non-compliant DICOM and edge cases #327
Comments
Thanks @lnogueir! I agree that we can attempt some enhanced fallback behavior, particularly if that's standard practice in other parsers. Our current fallback when no transfer syntax is found in the metadata is to proceed with implicit little endian, but it shouldn't be too difficult to try a couple of different transfer syntaxes speculatively.
For missing metadata group length we do have a flag, but we can consider automatic fallback behavior: Line 184 in 65259e5
I think the original idea was that the safest option is to force callers to be as explicit as possible about which loosened restrictions they want, in case there are safety issues with fallback behavior, or in case the user would want the DICOM's non-compliance raised to them (since it may point to other concerns). That being said, recently I've been thinking it's more reasonable to just have more automatic fallbacks, at least the ones that are "standard" in the industry for DICOM non-compliance.
Going through these in a little more detail, and adding some more notes:
The draft changes in #330 address 1 and 2.
I agree. I think it's better to have these automatic fallbacks and just log warnings when non-compliant things are found. That's the behavior most libraries choose, I believe.
Also, the changes in #331 address 6 (with SkipPixelData). They almost address 4, but some other changes are still needed for that.
This addresses #220 (and one of the dicoms in #327) by treating elements with a VR=UN and undefined VL as Sequences. This pulls from work that I and @jabillings did previously (#235).

Some follow-on changes may be needed to support this part of [the spec CP](https://dicom.nema.org/dicom/cp/cp246_01.pdf):

> If at some point an application knows the actual VR for an Attribute of VR UN (e.g. has its own applicable data dictionary), it can assume that the Value Field of the Attribute is encoded in Little Endian byte ordering with implicit VR encoding, irrespective of the current TransferSyntax.

---------

Co-authored-by: jabillings <jabillings@users.noreply.github.com>
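The rule described in this change can be sketched as a small decision function. This is only an illustration, not the library's code: `readPlan`, `planRead`, and `NestedImplicitLE` are hypothetical names, and the implicit little endian flag reflects the CP text quoted above rather than anything confirmed about the merged implementation.

```go
package main

// undefinedLength is the DICOM value length that means "length not stated up front".
const undefinedLength uint32 = 0xFFFFFFFF

// readPlan is a hypothetical type for this sketch. It says how a parser
// could treat an element after looking at its VR and value length.
type readPlan struct {
	VR               string // VR to parse the element as
	NestedImplicitLE bool   // parse the value as implicit VR little endian
}

// planRead applies the rule from the change above: an element with VR=UN
// and undefined length is treated as a Sequence. Per the quoted CP text,
// the value field of such an element is assumed to be implicit VR little
// endian, irrespective of the outer transfer syntax.
func planRead(vr string, vl uint32) readPlan {
	if vr == "UN" && vl == undefinedLength {
		return readPlan{VR: "SQ", NestedImplicitLE: true}
	}
	// Everything else is parsed with its declared VR and the current syntax.
	return readPlan{VR: vr}
}
```

A UN element with a defined length stays UN in this sketch, since only the undefined-length case needs item delimiters to find the end of the value.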
This fixes #328. This change will also help set the stage for roundtrip tests for transfer syntax uid inference in the case of missing transfer syntax UIDs (part of #327). For the future: should we give this option as something to always override whatever transfer syntax is present (or if not present, use the override as well)?
…next 100 bytes. (#330)

This PR attempts to infer missing transfer syntax in dicoms during Parse. Specifically:

* When transfer syntax is missing in dicom metadata, attempt to infer the correct transfer syntax by peeking the next 100 bytes and trying to read an element without an error. This isn't foolproof, but it is one option to start with.
* This also makes test updates to support testfiles/ that may not have PixelData.
* This also introduces a write option to write dicoms without transfer syntax elements, in order to write some "roundtrip" unit tests for this behavior on Parse.

I was able to successfully test using some test data from #327, but I need to do some more investigation to see if we can safely add those to our test files (licensing and otherwise).

Things to consider in the future:

* Try deflated little endian explicit as well.
* Peek more/less than the initial 100 bytes, or move away from a fixed peek.
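The peek-and-try approach described in this PR can be sketched with the standard library alone. This is not the code from #330: the names `candidate`, `tryImplicitLE`, `tryExplicit`, `inferFromBytes`, and `InferTransferSyntax` are hypothetical, the candidate order is an assumption, and "reading an element without an error" is reduced here to checking that a short-form element header has a plausible VR and an even length that fits in the peeked bytes.

```go
package main

import (
	"bufio"
	"encoding/binary"
	"errors"
	"io"
)

// peekSize matches the fixed 100-byte peek described above.
const peekSize = 100

// candidate pairs a label with a trial parser (hypothetical type).
type candidate struct {
	name    string
	tryRead func(b []byte) error
}

// tryImplicitLE expects: tag (4 bytes), then a 4-byte little endian length.
func tryImplicitLE(b []byte) error {
	if len(b) < 8 {
		return errors.New("need at least 8 bytes")
	}
	length := binary.LittleEndian.Uint32(b[4:8])
	if length%2 != 0 || int64(length) > int64(len(b)-8) {
		return errors.New("implausible implicit VR length")
	}
	return nil
}

// tryExplicit expects: tag (4 bytes), VR (2 ASCII letters), 2-byte length.
// Simplification: VRs that use the long form (2 reserved bytes plus a
// 4-byte length, e.g. OB or SQ) are not handled in this sketch.
func tryExplicit(order binary.ByteOrder) func([]byte) error {
	return func(b []byte) error {
		if len(b) < 8 {
			return errors.New("need at least 8 bytes")
		}
		if b[4] < 'A' || b[4] > 'Z' || b[5] < 'A' || b[5] > 'Z' {
			return errors.New("VR is not two uppercase letters")
		}
		length := int(order.Uint16(b[6:8]))
		if length%2 != 0 || length > len(b)-8 {
			return errors.New("implausible explicit VR length")
		}
		return nil
	}
}

// inferFromBytes returns the first candidate whose trial read succeeds.
func inferFromBytes(b []byte) (string, error) {
	candidates := []candidate{
		{"implicit little endian", tryImplicitLE},
		{"explicit little endian", tryExplicit(binary.LittleEndian)},
		{"explicit big endian", tryExplicit(binary.BigEndian)},
	}
	for _, c := range candidates {
		if c.tryRead(b) == nil {
			return c.name, nil
		}
	}
	return "", errors.New("no candidate transfer syntax fits the peeked bytes")
}

// InferTransferSyntax peeks without consuming, so the caller can parse the
// same bytes again once a transfer syntax has been chosen.
func InferTransferSyntax(r *bufio.Reader) (string, error) {
	b, err := r.Peek(peekSize)
	if err != nil && !errors.Is(err, io.EOF) {
		return "", err
	}
	return inferFromBytes(b) // on EOF, b holds whatever bytes were available
}
```

Because `Peek` does not advance the reader, the real parse can start from the same position afterwards. The length check is what separates the two explicit byte orders here, so a first element with a zero-length value, or one longer than the peek window, would be misjudged; that is one reason a fixed peek is listed above as something to revisit.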
Edited the original comment to update current status, link to PRs, etc.
I noticed that the following files (also from pydicom) do not error, but drop values:
The issue seems to be that they change the transfer syntax when writing the SQ dataset. The top-level dataset is encoded with explicit little endian, but the dataset of the Referenced RT Plan Sequence is encoded with implicit little endian. These were the pydicom issues that handled that: pydicom/pydicom#1035, pydicom/pydicom#1067
I was playing around with the pydicom test files and noticed that parsing the following files (with the `SkipProcessingPixelDataValue()` option enabled) results in errors:

These files were parseable by dcm4che.
For most of them, the issue is related to missing some or all meta tags.
Ideally, if the meta tags aren't included, we should ignore that and try parsing the first data element. We can guess the transfer syntax by peeking at the tag and VR of the first element. We try [implicit LE, explicit BE, explicit LE]; whichever one decodes a known tag and VR is likely the correct transfer syntax, and we should use it for the rest of the dataset.
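A minimal sketch of that guess, assuming a tiny hand-written dictionary in place of a full DICOM data dictionary. `TransferSyntax`, `knownTags`, and `GuessTransferSyntax` are hypothetical names for illustration and are not part of the library. Since implicit LE and explicit LE decode the same tag, this sketch additionally assumes that the implicit candidate only wins when the two bytes after the tag are not the VR expected for that tag.

```go
package main

import "encoding/binary"

// TransferSyntax is a hypothetical enum used only in this sketch.
type TransferSyntax int

const (
	Unknown TransferSyntax = iota
	ImplicitLE
	ExplicitLE
	ExplicitBE
)

// knownTags is a tiny illustrative dictionary keyed by group<<16 | element.
// A real parser would consult its full data dictionary instead.
var knownTags = map[uint32]string{
	0x00080005: "CS", // Specific Character Set
	0x00080008: "CS", // Image Type
	0x00080016: "UI", // SOP Class UID
	0x00080018: "UI", // SOP Instance UID
	0x00080020: "DA", // Study Date
	0x00080060: "CS", // Modality
}

// readTag decodes the 4-byte tag at the start of b using the given byte order.
func readTag(b []byte, order binary.ByteOrder) uint32 {
	return uint32(order.Uint16(b[0:2]))<<16 | uint32(order.Uint16(b[2:4]))
}

// GuessTransferSyntax inspects the first 6 bytes of the first data element
// and tries the candidates in the order listed above.
func GuessTransferSyntax(header []byte) TransferSyntax {
	if len(header) < 6 {
		return Unknown
	}
	vrBytes := string(header[4:6])
	leVR, leKnown := knownTags[readTag(header, binary.LittleEndian)]
	beVR, beKnown := knownTags[readTag(header, binary.BigEndian)]
	switch {
	case leKnown && vrBytes != leVR:
		// Known tag, but no VR follows it: the bytes are part of a length.
		return ImplicitLE
	case beKnown && vrBytes == beVR:
		return ExplicitBE
	case leKnown && vrBytes == leVR:
		return ExplicitLE
	}
	return Unknown
}
```

For example, the bytes `08 00 05 00 'C' 'S'` decode as (0008,0005) with VR CS in little endian, so the sketch reports explicit LE; the same tag followed by `0A 00` is reported as implicit LE. A first element whose tag is missing from the dictionary yields `Unknown`, at which point a caller could fall back to the current implicit little endian default.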