Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dataset cannot be copied if masker comes from a gzipped file #399

Closed
tsalo opened this issue Nov 16, 2020 · 8 comments · Fixed by #431
Closed

Dataset cannot be copied if masker comes from a gzipped file #399

tsalo opened this issue Nov 16, 2020 · 8 comments · Fixed by #431
Labels
bug Issues noting problems and PRs fixing those problems.

Comments

@tsalo
Copy link
Member

tsalo commented Nov 16, 2020

When I try to use the ale_2mm target (e.g., when loading a Sleuth dataset), the resulting Dataset cannot be copied due to the following error:

TypeError: cannot serialize 'IndexedGzipFile' object

After digging a bit, if I just uncompress the associated mask file, everything works fine. I believe it's an issue I came across a while ago with nibabel (see cosanlab/nltools#281). I thought the issue was resolved in the newest versions of nibabel, but I may have just misunderstood what was causing it. In any case, I think we need to either enforce uncompressed files for our masks or find some safer way of loading them in our masker objects.

BTW, converting to a masker with utils.get_masker() doesn't help.

@tsalo tsalo added the bug Issues noting problems and PRs fixing those problems. label Nov 16, 2020
@tsalo tsalo changed the title Dataset cannot be copied if masker is nii.gz Dataset cannot be copied if masker comes from a gzipped file Nov 16, 2020
@tsalo
Copy link
Member Author

tsalo commented Nov 16, 2020

As a temporary solution, we could incorporate a check into the get_masker() function that determines if an input is from a gzipped file and raises a warning if so?

@tsalo
Copy link
Member Author

tsalo commented Nov 17, 2020

@tyarkoni Based on some testing in #400 and the original cosanlab issue, it looks like the bug is eliminated by not having indexed_gzip installed. This is why I encountered the bug on my laptop and not in CircleCI.

Do you have any thoughts on what we should do? Add indexed_gzip as a dependency and work around the bug by default or add some kind of check to see if it's installed and raise a warning if the right conditions are met?

@tyarkoni
Copy link
Contributor

I don't feel I know enough about the packages in question to have an informed opinion, so I'll defer to your judgment on this.
The only thing I do feel pretty strongly about is that things should still work fine without any user intervention—a warning is fine, but no additional configuration should be necessary. Assuming that's what's on the table (i.e., implicitly unpack the gzipped mask to a temporary location and then use that instead, with a warning), that seems fine. But if this isn't already a known issue on the NiBabel repo, might be worth raising it for discussion there.

@tsalo
Copy link
Member Author

tsalo commented Nov 17, 2020

@rmarkello shared the following trick that we can use:

import gzip
import nibabel as nib

nib.openers.HAVE_INDEXED_GZIP = False
nib.openers.IndexedGzipFile = gzip.GzipFile

At the moment, it might be sufficient to place this in get_masker(), and then reset the nib.openers parameters after the image is loaded. There are probably a bunch of edge cases where this might not be enough (e.g., if a mask is "fetched" from a nii.gz file and loaded before reaching get_masker or if an image is provided at Dataset initialization instead of being loaded as part of __init__), but I don't think we can cover everything.

@tsalo
Copy link
Member Author

tsalo commented Nov 17, 2020

It looks like the original issue is resolved in indexed_gzip v1.1.0 (pauldmccarthy/indexed_gzip#28), but now I get an error trying to do the same thing:

AttributeError: '_io.BytesIO' object has no attribute 'mode'

It looks like, now that indexed_gzip has resolved its issue, NiftiMasker is the problem.

from copy import deepcopy
from nilearn.input_data import NiftiMasker

masker1 = NiftiMasker('nifti.nii').fit()
deepcopy(masker1)  # no error

masker2 = NiftiMasker('nifti.nii.gz').fit()
deepcopy(masker2)  # error

@tsalo
Copy link
Member Author

tsalo commented Nov 17, 2020

Back to nibabel. I've opened nipy/nibabel#969 about it.

@effigies
Copy link

Ref: pauldmccarthy/indexed_gzip#50

@tsalo
Copy link
Member Author

tsalo commented Jan 4, 2021

This should be fixed in indexed_gzip v1.4.0, so it will be a matter of adding indexed_gzip >= 1.4.0 to our dependencies.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Issues noting problems and PRs fixing those problems.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants