Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: Detect when writing to open mmap #492

Open
fepegar opened this issue Oct 25, 2016 · 10 comments
Open

Feature Request: Detect when writing to open mmap #492

fepegar opened this issue Oct 25, 2016 · 10 comments

Comments

@fepegar
Copy link
Contributor

fepegar commented Oct 25, 2016

I have written a small function in IPython to delete the origin information from an image:
In [1]: def removeOrigin(imPath):
...: nii=nib.load(imPath)
...: affine = nii.affine
...: affine[:3, 3] = 0
...: data = nii.get_data()
...: new = nib.Nifti1Pair(data, affine)
...: nib.save(new, imPath)

My images are in Nifti1Pair format and very small (500 kB - 1 MB).

When I call my function, IPython crashes, the Terminal says Bus error and something very bad happens: my image data is gone!! The .hdr/.imgs file contain now 0 bytes, so I just lost my data.

I've tried outside the function and it happens the same when I try to overwrite the original image, i.e.:
nib.save(new, imPath) -> Bus error
nib.save(new, imPath.replace('.hdr', '_2.hdr')) -> Ok

@effigies
Copy link
Member

Hi, sorry about the delay.

My guess is that the image is an mmap, so that when you begin writing, you''re truncating the file, and thus losing the data you want to write.

You could try data = nii.get_data().copy() to make sure the data is loaded into memory.

@fepegar
Copy link
Contributor Author

fepegar commented Mar 29, 2017

Better late than never.

I don't have the technical background to fully understand your solution, but I'm sure it works. However, shouldn't nibabel be able to prevent this from hapenning?

@effigies
Copy link
Member

Perhaps. Certainly worth a discussion. I do believe (though I haven't tested) that #466 or something similar would resolve the problem by unlinking the input file before writing to that name.

It should also be possible to verify that the target filename is the same as the source mmap, but there might be weirdness there I'm not immediately seeing.

As an aside, I would say that, for reproducibility and documentation reasons, I'd be very cautious about overwriting files. In this case, performing your action twice is the same as performing it once, so at least there's no danger to forgetting you've already done it. But in general, I'd strongly recommend setting up your workflows such that there is a clear flow of (input file, operation, output file).

@fepegar
Copy link
Contributor Author

fepegar commented Mar 29, 2017

Sure, one has to be cautious when overwriting. But if everything I want is, say, change the origin of my 2GB nifti, maybe I wouldn't want two copies of my file...

@effigies
Copy link
Member

Fair enough. Well, in the short term, I'd say use nib.load(fname, mmap=False), which won't require a manual copy command.

@fepegar
Copy link
Contributor Author

fepegar commented Mar 29, 2017

Duly noted, thanks!

@effigies effigies changed the title Images are deleted after Bus error Feature Request: Detect when writing to open mmap Mar 30, 2017
@VincentXWD
Copy link

Thank you @effigies . .get_data().copy() helps me a lot!

@matthew-brett
Copy link
Member

I should also say, that in general you should prefer img.get_fdata(), which, I believe, always does a copy.

@fepegar
Copy link
Contributor Author

fepegar commented Jan 30, 2020

I confirm this problem is still happening: fepegar/torchio#77

@effigies
Copy link
Member

effigies commented Jan 30, 2020

Right.

The reason it looked like get_fdata() always did a copy was because we always coerced to float64, so get_fdata(dtype=np.float32) would cast up and then down on data that is already float32, and casting necessarily makes a copy. When we fixed that bug, memmaps started being exposed again.

Loading with mmap=False is still the right move.

Fiddling with this, though, I think we can reliably predict the BusError with the criterion:

getattr(new.dataobj, '_mmap') and \
    os.path.realpath(new.dataobj.filename) == os.path.realpath(imPath)

The simplest and safest thing to do is to detect this case and copy the data. It would have to go in *Image.to_file_map, to handle the case where you have a single-file image and copy the data before the file is opened to write the header.

This would in any case be no greater a penalty than requiring mmap=False, although it would be more magic and might have some surprising consequences. For example, I suspect that the original image would stop being usable, and it would change the BusError from new.to_filename() to attempting to look at img.get_fdata() again.

This is a long way to say: I think we can (sometimes) detect when writing to an open mmap. I'm not sure that doing so is a good idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants