
(Silently) produces .nii (not .nii.gz) despite -z i switch #124

Closed
yarikoptic opened this issue Aug 30, 2017 · 10 comments
@yarikoptic (Contributor):

bids@rolando:~$ mkdir /tmp/heudicon-eshin-run2
bids@rolando:~$ dcm2niix -b y -z i -x n -t n -m n -f func -o /tmp/heudicon-eshin-run2 -s n -v n /inbox/DICOM/2017/08/28/unknown/015-func_run-02_task-movie
Compression will be faster with /usr/local/bin/pigz
Chris Rorden's dcm2niiX version v1.0.20170624 GCC6.3.0 (64-bit Linux)
Found 3473 DICOM image(s)
slices stacked despite varying acquisition numbers (if this is not desired please recompile)
Convert 3473 DICOM as /tmp/heudicon-eshin-run2/func (82x82x48x3473)
Conversion required 33.909003 seconds (9.394634 for core code).
bids@rolando:~$ ls -l /tmp/heudicon-eshin-run2
total 2189304
-rw-rw-r-- 1 bids bids       1489 Aug 30 13:56 func.json
-rw-rw-r-- 1 bids bids 2241835744 Aug 30 13:56 func.nii

It works correctly for another run, though (I didn't compare whether the acquisitions are of the same kind, but the sizes differ; it is not my data):

bids@rolando:~$ dcm2niix -b y -z i -x n -t n -m n -f func -o /tmp/heudicon-eshin-run1 -s n -v n /inbox/DICOM/2017/08/28/unknown/008-func_run-01_task-movie
Compression will be faster with /usr/local/bin/pigz
Chris Rorden's dcm2niiX version v1.0.20170624 GCC6.3.0 (64-bit Linux)
Found 3225 DICOM image(s)
slices stacked despite varying acquisition numbers (if this is not desired please recompile)
Convert 3225 DICOM as /tmp/heudicon-eshin-run1/func (82x82x48x3225)
Conversion required 246.323882 seconds (188.650848 for core code).
bids@rolando:~$ ls -l /tmp/heudicon-eshin-run*
/tmp/heudicon-eshin-run1:
total 821140
-rw-rw-r-- 1 bids bids      1482 Aug 30 13:58 func.json
-rw-rw-r-- 1 bids bids 840838757 Aug 30 14:01 func.nii.gz

/tmp/heudicon-eshin-run2:
total 2189304
-rw-rw-r-- 1 bids bids       1489 Aug 30 13:56 func.json
-rw-rw-r-- 1 bids bids 2241835744 Aug 30 13:56 func.nii

What could be the culprit?

neurolabusc added a commit that referenced this issue Aug 30, 2017
@neurolabusc (Collaborator):

@yarikoptic - I note you are running an older version of dcm2niix: recent versions do provide a warning:
Saving uncompressed data: internal compressor limited to %n byte images.

Prior to today, the internal compressor was limited to 2 GB. I have just updated it to support files up to 3758096384 bytes (3.5 GiB) for 64-bit builds. However, I have not tested this, so please try it out.

By the way, if you use the external compressor, the maximum uncompressed input size is limited by #define kMaxPigz 3758096384. The issue here is that while in theory the gz format can support files larger than 4 GB (the format stores the uncompressed input size as 4 bytes, modulo 2^32), in practice I think many reading tools treat this value as the total uncompressed input size. Therefore, limiting compression to files < 4 GB seems like a safe choice. Regardless, recent versions should give you meaningful feedback.
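The 4-byte trailer in question (the ISIZE field of the gzip format) is easy to inspect. A small illustration, assuming GNU coreutils on a little-endian machine (od decodes in host byte order, matching gzip's little-endian trailer):

```shell
# The last 4 bytes of a gzip stream store the uncompressed input
# length modulo 2^32. For inputs under 4 GB the stored value equals
# the true size; for larger inputs it silently wraps around.
head -c 12345 /dev/zero | gzip | tail -c 4 | od -An -tu4 | tr -d ' '
# prints: 12345
```

A reader that trusts this field as the total size would allocate or loop over the wrapped value, which is exactly the "only processing initial volumes" failure mode discussed below.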

@neurolabusc (Collaborator):

@yarikoptic and @satra - you may want to consider adding pigz as a dependency for heudiconv. While my software includes a serial compressor that works if pigz is missing, unless you are working with network storage you will find pigz is dramatically faster. My software looks for "pigz" in the user path, but will fall back to "pigz_afni" or "pigz_mricro" in the same folder as dcm2niix. Since compression takes longer than all the other stages combined, including pigz dramatically accelerates processing.
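The lookup order described above can be sketched in shell. This is illustrative only (the `pick_gz` name and the "internal" marker are hypothetical; dcm2niix does this in C):

```shell
# Echo the first candidate compressor that resolves on the PATH (or
# as an absolute path); otherwise print "internal", standing in for
# dcm2niix's built-in serial compressor.
pick_gz() {
    for gz in "$@"; do
        if command -v "$gz" >/dev/null 2>&1; then
            echo "$gz"
            return 0
        fi
    done
    echo "internal"
}

# Search order from the comment above; in the real tool pigz_afni and
# pigz_mricro are looked for in the same folder as the dcm2niix binary.
pick_gz pigz pigz_afni pigz_mricro
```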

@yarikoptic (Contributor, Author):

Thank you! I will try the recent version.
But with the advent of parallel acquisition methods we will see this situation more and more often... I wonder whether dcm2niix should just try to do the best possible, i.e. not impose limits on the assumption of crippled tools downstream?

@yarikoptic (Contributor, Author):

And yeah, I love pigz as well... Usually I just symlink it to be gzip in my PATH ;-) I guess we should indeed switch to using/recommending it.
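For reference, shadowing gzip with pigz can be as simple as the following (a sketch, assuming ~/bin precedes the system directories in $PATH; it falls back to the real gzip when pigz is absent so the link never dangles):

```shell
# Point ~/bin/gzip at pigz if available, else at the real gzip.
mkdir -p "$HOME/bin"
ln -sf "$(command -v pigz || command -v gzip)" "$HOME/bin/gzip"
ls -l "$HOME/bin/gzip"
```

This works because pigz accepts the common gzip flags, so callers that shell out to "gzip" transparently get the parallel implementation.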

@neurolabusc (Collaborator):

I think having large gz files fail on tools could have many catastrophic effects for users. Pipelines that use many different tools would be at the mercy of the weakest link, and tools may fail in non-obvious ways (e.g. only processing initial volumes).

As a personal preference, I would advise most users to stick with .nii instead of .nii.gz for the initial DICOM to NIfTI conversion. Due to poor SNR, gz does not compress raw data well. I do think that .nii.gz makes a lot more sense after scalp stripping (e.g. bet), where there is a lot more redundancy in the images, which yields more efficient compression and faster decompression. One thing to consider is whether you want to use drive-level disk compression (e.g. NTFS compression) rather than file-level compression (e.g. .gz); the former is transparent to the software.

@yarikoptic (Contributor, Author):

> I think having large gz files fail on tools could have many catastrophic effects for users. Pipelines that use many different tools would be at the mercy of the weakest link, and tools may fail in non-obvious ways (e.g. only processing initial volumes).

Yeah -- but how do you expect to encourage/facilitate/push the developers of those tools to fix them, so they are ready to deal with growing data sizes, if they otherwise would not even be aware? So IMHO it would be counterproductive; and, again, it could always be worked around by uncompressing the files IF a particular piece of software cannot handle them.

> Due to poor SNR, gz does not compress raw data well.

[bids@rolando heudicon-eshin-run1] > zcat func.nii.gz > func.nii   
[bids@rolando heudicon-eshin-run1] > du -scm func.nii*          
1986	func.nii
802	func.nii.gz
2788	total

So a 60% reduction in size is IMHO worthwhile ;)

> One thing to consider is if you want to use drive-level disk compression

I love those too, BUT they have to balance compression ratio against IO performance, so their ratio is often not as good as a dedicated compressor run achieves. E.g., here are those files on ZFS:

[bids@rolando BIDS] > du -scm func.nii*   
1328	func.nii
803	func.nii.gz

So only a 35% reduction in size for func.nii, compared to 60% for .nii.gz.

@yarikoptic (Contributor, Author):

FWIW, -z y (use pigz, after upgrading dcm2niix) worked fine on this file! And it did fall back to the built-in compressor and produced .nii whenever I removed pigz (temporarily ;) ).

neurolabusc added a commit that referenced this issue Aug 30, 2017
Compiler switch "-dmyDisableGzSizeLimits" will generate GZ files regardless of size, otherwise GZ limited to ~4Gb
@neurolabusc (Collaborator):

I have added a new compiler directive, -DmyDisableGzSizeLimits. This allows you to build versions of dcm2niix that will generate GZ files exceeding 4 GB. Please test this out; once we have confidence that popular tools support these images, we can make this the default compilation.

yarikoptic added a commit to neurodebian/dcm2niix that referenced this issue Aug 30, 2017
* commit 'v1.0.20170818-8-gdd1d994':
  Fix for rordenlab#124
@aqw commented Aug 31, 2017:

@yarikoptic Does that ZFS dataset have gzip-9 set as the compression method?

---Alex

@yarikoptic
Copy link
Contributor Author

Unlikely (I am not the administrator on that one); isn't it CPU-heavy and rarely recommended (except for backups)? That is what I meant by talking about tradeoffs.
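For anyone checking their own pool, the active algorithm and the achieved ratio are visible per dataset. A configuration sketch (the dataset name "tank/bids" is a placeholder; requires ZFS list/set permissions):

```shell
# Inspect the compression setting and the realized ratio:
zfs get compression,compressratio tank/bids
# gzip-9 trades write-time CPU for ratio, which is the tradeoff
# discussed above; the default lz4 favors throughput:
# zfs set compression=gzip-9 tank/bids
```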
