nibabel.load crashes for very large files on python 3, not python 2 #362
What specific file from HCP did you use to produce this crash? Also, can you try with Python 3.4? I don't believe nibabel is testing on 3.5 yet.
I used a resting-state NIfTI file with all four sessions combined, hence the 1200 timepoints.
Is it possible you have a 32-bit version of Python 3.5?
Pretty sure I got 64-bit.
Can you try …
Sure.
As the error message says, that will just overflow a 32-bit unsigned integer. Would you mind putting a 1/0 or similar at line 512 of volumeutils.py in order to raise an error, then check the size of …
It's pretty close: …
So, that should overflow a 32-bit unsigned int as well. I guess the difference is in the Python implementations.
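For reference, the raw data size of an image this shape does sit just above the 32-bit limit, which is easy to verify directly (a quick sanity check I've added; it is not code from the thread):

```python
# Sanity check (not from the thread): the uncompressed size of a
# (91, 109, 91, 1200) float32 image, versus the 32-bit unsigned limit.
shape = (91, 109, 91, 1200)
itemsize = 4  # bytes per float32 voxel
nbytes = shape[0] * shape[1] * shape[2] * shape[3] * itemsize
print(nbytes)            # 4332619200
print(2 ** 32)           # 4294967296
print(nbytes > 2 ** 32)  # True: a single read of this size overflows a uint32
```

It really is "pretty close": the data exceed 2**32 bytes by only about 38 MB.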
I'm not sure I'm following. Why would there be a 32-bit constraint when running on a 64-bit Python shell?
I don't know why that would happen, but it will depend on how Python implements gzip decompression. My guess is Python changed the implementation at some point, and the newer implementation happens to use an unsigned 32-bit int. Can you check other Python versions? Sorry to ask...
Not easily, no. Sorry as well. I'll try on another machine. But this is seriously weird.
Fair enough. I'll do my best to replicate on my laptop.
Any movement on this?
I will take a look at this.
I was able to reproduce the error in Python 3 (and no error in Python 2) by creating a dummy image of the same size:

```python
import os

import numpy as np
import nibabel as nib

img = nib.Nifti1Image(np.zeros((91, 109, 91, 1200), dtype=np.float32), affine=np.eye(4))
nib.save(img, 'test.nii.gz')
del img
try:
    data = nib.load('test.nii.gz').get_data()
finally:
    os.remove('test.nii.gz')
```

Output: …
I will debug a bit from here, but the bottom line is …
Yep, it's exactly as expected: the call to …

So I guess the question is whether we report a bug, or work around the issue. Will try to find some relevant documentation. To start, perhaps: …
Thanks for tracking that down. Looks like that issue was superseded and they went with this patch. For now, it makes sense to me to report upstream and see if they're willing to fix this before trying any workarounds here.
I am having the same issue here. Hasn't the patch above already been merged into the Python 3 default branch?
Sorry, to clarify: I'm guessing that's the patch causing the problem.
Ben - how about reporting the bug AND working around the issue :)
Don't have a good test rig, so I'm trying to set up a test case using Ben's code (and a smaller Travis test suite) and getting weird Travis errors. Did I screw something up, or is this just a thing where I should wait a bit and re-run the tests?

Edit: It was the test I added. It took too much time or memory, so it got killed. Will see if I can upload a sample dataset so we're not wasting time writing, at least.
Loading a pre-made dataset still causes either a memory or time issue. Not sure testing for this can be automated...

My initial thought for a strategy would be: try loading with gzip, and if that fails, pipe through …
Just as a note, I don't see …
After further review, I don't see a quick way to do this either. I would suggest that, for a known bug, we should at least …

I also think we ought to re-open #209. I will do that now.
Don't we need to file a Python bug?
Ben - do you get an error reading a file created like this? …
Ah - how about:
http://unix.stackexchange.com/questions/32988/why-does-dd-from-dev-random-give-different-file-sizes
If you edit the nibabel code not to pass the size, but just ask to read the whole file, do you still get a crash?
And nibabel won't open it; the file type is unrecognized (even if I change to …).
I don't see anywhere that we explicitly pass in the size. We pass in a buffer in …
Sorry, I meant: do you get the same error as for an image if you do: …
My thought was to try a buffered read loop, to avoid the problem. This seems to work: …
Ah, I see; sorry. Yes, you get the same error.
Or, more succinctly: …
I guess another option is to disable use of …
Any preference?
I guess buffering will be more efficient. But - I think we need to cover the case where the read returns fewer bytes than we thought. How about something like: …
Sounds great. Since it's gzip-specific, we should probably put that in …

For testing, I think a unit test on …

I'll try the unit test in the meantime.
Er... we'll also have to wrap …
I see that it's possible to pass a file-like object to …
Well, I guess it'll have to be 4GB plus the size of the gzip output in memory, vs. the 4GB (just the image in memory) that seems to be unavoidable. So 6GB vs. 4GB... but the major concern is the time. As I'm running this code, I'm realizing that it's gzip itself that's slow. I'll decrease the compression level in the test...
I hate to say this now, but maybe we could do without a test here, on the basis that it's too time-consuming and memory-intensive to test routinely. Or have a separate test repo for these kinds of very long / slow tests, which we trigger Travis on / run on the buildbots from time to time.
Sure thing. I want a test case to make sure things are working as expected before doing a PR, so it's useful even if we disable the test by default.
How about an environment variable …?
Sure. Even something like …
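The variable name itself was lost from this transcript; as a hedged illustration only, gating a slow test on an environment variable could look like this (`NIBABEL_EXTRA_TESTS` is an invented placeholder, not necessarily the name that was chosen):

```python
import os
import unittest

# NIBABEL_EXTRA_TESTS is a placeholder name, not the one from the thread.
# The test is skipped unless the variable is set to a non-empty value,
# so the slow, memory-hungry case only runs when explicitly requested.
@unittest.skipUnless(os.environ.get('NIBABEL_EXTRA_TESTS'),
                     'slow big-file test; set NIBABEL_EXTRA_TESTS to run')
def test_big_gzip_image():
    # The expensive >4GB round-trip from the reproduction script would go here.
    pass
```

Calling the skipped function raises `unittest.SkipTest`, which test runners report as a skip rather than a failure.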
Hi,
I just noticed the following error: loading a very large file (such as a resting-state image from the HCP dataset, e.g. with dimensions (91, 109, 91, 1200)) works fine with nibabel on Python 2.7. The same command on Python 3.5 fails with the following error: …
Happy to provide further info; not sure what is going on.