Some datasets contain .mrc and .mrcs files that the Python library mrcfile will not open for various reasons. Currently, we allow mrcfile to throw whatever error it decides when trying to load such files.
Not all of these errors mean that the mrc is unusable though. We may want to report in more detail to the user what the problem is, and/or provide some kind of utility. Alternatively, we could open the file in "permissive mode" and work from there, only failing if the problem is critical.
See ComputationalCryoEM/ASPIRE-Data#2
Some common messages:
Unrecognised machine stamp: 0x00 0x00 0x00 0x00
This means the header does not contain information about endianness.
Map ID string not found - not an MRC file, or file is corrupt
This means the map field of the header is not set to its standard value, the constant string "MAP " (available as mrcfile.constants.MAP_ID).
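To report these two problems in more detail before handing a file to mrcfile, the relevant header bytes can be inspected directly. The following is a minimal sketch, not part of mrcfile or ASPIRE: the helper name `probe_mrc_header` is hypothetical, and it assumes the MRC2014 layout (map ID at byte offset 208, machine stamp at offset 212) and that only the stamps 0x44 0x44, 0x44 0x41 (little-endian) and 0x11 0x11 (big-endian) encode a byte order.

```python
MAP_ID = b"MAP "       # standard value of the map header field
MAP_OFFSET = 208       # assumed byte offset of the map ID (MRC2014 layout)
STAMP_OFFSET = 212     # assumed byte offset of the machine stamp
HEADER_SIZE = 1024     # size of the main MRC header in bytes

# Machine stamp prefixes assumed to encode a byte order.
KNOWN_STAMPS = (b"\x44\x44", b"\x44\x41", b"\x11\x11")


def probe_mrc_header(path):
    """Return a list of problems found in the header (hypothetical helper)."""
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        return [f"file is shorter than the {HEADER_SIZE}-byte header"]

    problems = []
    if header[MAP_OFFSET:MAP_OFFSET + 4] != MAP_ID:
        problems.append("map ID string is not 'MAP '")
    if header[STAMP_OFFSET:STAMP_OFFSET + 2] not in KNOWN_STAMPS:
        problems.append("machine stamp does not encode a byte order")
    return problems
```

An empty list means neither of the two header problems above was detected; it does not guarantee that mrcfile will open the file.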
The mrcfile library provides a permissive read mode which will attempt to open the file anyway. Once this is done, sometimes the header can be fixed and the file re-saved with the update. For example, the following code attempts to fix the two example issues above:
    import sys

    import mrcfile

    fp = sys.argv[1]

    # Open permissively so the broken header does not raise, then repair it in place.
    with mrcfile.open(fp, "r+", permissive=True) as mrc:
        if not mrc.header.map == mrcfile.constants.MAP_ID:
            mrc.header.map = mrcfile.constants.MAP_ID
        if mrc.data is not None:
            mrc.update_header_from_data()
        else:
            print(f"ERROR with {fp}: data is None!")

    # Re-open in strict mode to confirm the file now loads.
    try:
        with mrcfile.open(fp, "r") as mrc:
            pass
    except ValueError as e:
        print(f"ERROR with {fp}: {e}")
In EMPIAR 10005, the files can be fixed with the above script. The code can and should be fleshed out into a fix-it script that suits our purposes.
However, there can be more complex cases. There are two situations in which an MrcFile object will be returned, but its data field, typically a NumPy array containing the image data, is None:
The mode number is not recognised. Currently accepted modes are 0, 1, 2, 4 and 6.
or
The data block is not large enough for the specified data type and dimensions.
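As a starting point for telling these two cases apart, the header can be compared against the file size on disk. This is a rough sketch under stated assumptions, not mrcfile's own logic: the helper name `diagnose_data_block` is hypothetical, the header is assumed to be little-endian unless another byte order is passed, the extended header size is taken from nsymbt at byte offset 92, and the bytes-per-voxel table covers only the modes listed above.

```python
import os
import struct

HEADER_SIZE = 1024
# Assumed bytes per voxel for the accepted modes:
# 0 = int8, 1 = int16, 2 = float32, 4 = complex64, 6 = uint16.
MODE_ITEMSIZE = {0: 1, 1: 2, 2: 4, 4: 8, 6: 2}


def diagnose_data_block(path, byte_order="<"):
    """Say why the data block may be unreadable (hypothetical helper).

    byte_order is a struct prefix; little-endian ("<") is assumed by default.
    """
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)
    if len(header) < HEADER_SIZE:
        return "truncated header"

    # nx, ny, nz and mode are the first four 32-bit integers of the header.
    nx, ny, nz, mode = struct.unpack_from(byte_order + "4i", header, 0)
    # nsymbt (size of the extended header in bytes) is assumed at offset 92.
    (nsymbt,) = struct.unpack_from(byte_order + "i", header, 92)

    if mode not in MODE_ITEMSIZE:
        return f"unrecognised mode {mode}"

    expected = nx * ny * nz * MODE_ITEMSIZE[mode]
    actual = os.path.getsize(path) - HEADER_SIZE - nsymbt
    if actual < expected:
        return f"data block too small: {actual} bytes, expected {expected}"
    return "ok"
```

A truncated data block might still be partly recoverable (for example, the leading complete images of an .mrcs stack), whereas an unrecognised mode needs the correct data type to be known from elsewhere; the sketch only reports which case applies.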
We should figure out how to fix these last two categories and/or at what point to decide that an .mrc file we have received is truly "corrupted" and unusable. On resolution we should store the above information as well as any new methods that we discover.
Note that the mrcfile.validate() function and the mrcfile-validate CLI tool will return False even for usable MRCs. The warnings given are generally not critical, and mrcfile will open such files without complaint. These tools are also slow, taking 3-5 seconds per MRC file. For example:
python
>>> mrcfile.validate("patch/10028/data/Particles/MRC_0601/037_particles_shiny_nb50_new.mrcs")
Error in header labels: nlabl is 10 but 0 labels contain text
File does not declare MRC format version 20140: nversion = 0
Error in data statistics: RMS deviation is 0.9954950213432312 but the value in the header is 0.9954612255096436
False
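The two situations in which data is None are described in the mrcfile usage guide under permissive read mode: https://mrcfile.readthedocs.io/en/latest/usage_guide.html#permissive-read-mode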