Replace all usages of mrcfile and spider_file3 by ImageHandler #107

Closed
MohamadHarastani opened this issue Apr 2, 2022 · 11 comments
Labels
enhancement New feature or request

Comments

@MohamadHarastani
Collaborator

It works as follows:

from pwem.emlib.image import ImageHandler
volume_data = ImageHandler().read(volume_filename).getData()
@MohamadHarastani
Collaborator Author

Eventually, remove the dependency on mrcfile.

@MohamadHarastani
Collaborator Author

We could also replace the reading and writing of PDBs with AtomicStructHandler:
from pwem.convert.atom_struct import AtomicStructHandler

@mms29
Collaborator

mms29 commented Apr 12, 2022

Unfortunately, ImageHandler does not save the header information (sampling rate and origin) when writing to a file, as it should, and I need these values in the header for the GENESIS input. For instance, when I open a volume with a sampling rate of 2.0 Å:

file = "/home/guest/Workspace/test.mrc"
volume = ImageHandler().read(file)

I obtain an Image object with:

Sampling rate :
X-rate (Angstrom/pixel) = 2
Y-rate (Angstrom/pixel) = 2
Z-rate (Angstrom/pixel) = 2

Then I save it and open it again:

volume.write(file)
volume = ImageHandler().read(file)

Sampling rate :
X-rate (Angstrom/pixel) = 1
Y-rate (Angstrom/pixel) = 1
Z-rate (Angstrom/pixel) = 1

The same thing happens for the origin. That is why, for the moment, I need mrcfile to write these two parameters to the header.
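To make the header fields in question concrete, here is a minimal sketch that uses only the Python standard library. It assumes the MRC2014 layout (MX/MY/MZ at byte offset 28, CELLA at offset 40, ORIGIN at offset 196) and little-endian files; `make_header` and `read_voxel_size_and_origin` are hypothetical helper names for illustration, not part of mrcfile, Scipion, or Xmipp.

```python
import struct

HEADER_SIZE = 1024  # fixed length of the MRC2014 main header, in bytes


def make_header(shape, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Build a minimal little-endian MRC2014 header (illustration only)."""
    nx, ny, nz = shape
    header = bytearray(HEADER_SIZE)
    # NX, NY, NZ, MODE (2 = float32), NXSTART, NYSTART, NZSTART, MX, MY, MZ
    struct.pack_into("<10i", header, 0, nx, ny, nz, 2, 0, 0, 0, nx, ny, nz)
    # CELLA: cell dimensions in angstroms = grid sampling * voxel size
    struct.pack_into("<3f", header, 40,
                     nx * voxel_size, ny * voxel_size, nz * voxel_size)
    # CELLB: cell angles in degrees
    struct.pack_into("<3f", header, 52, 90.0, 90.0, 90.0)
    # MAPC, MAPR, MAPS: axis order (columns = X, rows = Y, sections = Z)
    struct.pack_into("<3i", header, 64, 1, 2, 3)
    # ORIGIN: position of the first voxel, in angstroms
    struct.pack_into("<3f", header, 196, *origin)
    header[208:212] = b"MAP "
    header[212:216] = bytes([0x44, 0x44, 0x00, 0x00])  # little-endian stamp
    return bytes(header)


def read_voxel_size_and_origin(header):
    """Return ((x, y, z) voxel size in angstroms/pixel, (x, y, z) origin)."""
    mx, my, mz = struct.unpack_from("<3i", header, 28)
    xlen, ylen, zlen = struct.unpack_from("<3f", header, 40)
    origin = struct.unpack_from("<3f", header, 196)
    return (xlen / mx, ylen / my, zlen / mz), origin


header = make_header((64, 64, 64), 2.0, origin=(-64.0, -64.0, -64.0))
print(read_voxel_size_and_origin(header))
```

Under these assumptions, a rate of 1 after rewriting would correspond to CELLA being reset to the grid size, and a lost origin to the ORIGIN words being zeroed. Whether that is exactly what happens in the ImageHandler write path is not confirmed here.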

@MohamadHarastani
Collaborator Author

MohamadHarastani commented Apr 12, 2022 via email

@mms29
Collaborator

mms29 commented Apr 12, 2022

Concerning AtomicStructHandler, I have an issue when opening large PDB files such as the ribosome (3j77), caused by the large number of distinct chains. The PDB format allows only one character for the chain name, so the maximum number of chains is 36 (0-9 and A-Z), whereas the ribosome has more than 80 chains.

One solution could be to read this file as CIF. However, at some point I need to convert it to PDB to run GENESIS.

Another solution is to use the "segment ID" field of the PDB format, which VMD uses to define the segments of a structure and which allows 3 or more characters, I believe. This is what I do for the moment with my own PDB handler. However, BioPython relies on the chain field, and that cannot be changed.

The best option for now might be to keep the PDB handler I wrote (I compared my I/O parser with BioPython's and they are the same). I implemented some of the functions used in AtomicStructHandler, for instance the one that aligns two PDBs as rigid bodies. It also has several very convenient features (for instance, matching two different structures to find the RMSD). I prepared a clean version of the handler named ContinuousFlexPDBHandler, which is available in protocols.utilities.pdb_handler.py
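The segment-ID idea above can be sketched with plain fixed-column string handling. This is only an illustration, not the code of ContinuousFlexPDBHandler: it assumes the legacy convention in which columns 73-76 of an ATOM record hold a segment ID of up to 4 characters while column 22 holds the one-character chain ID, and it follows the thread's count of 36 chain characters. The function names and the `S000`-style segment labels are hypothetical.

```python
import string

# One-character chain labels as counted in the thread: A-Z and 0-9 (36 total).
CHAIN_CHARS = string.ascii_uppercase + string.digits


def format_atom_line(serial, name, resname, chain, resseq, x, y, z, segid=""):
    """Format one ATOM record with chain ID in col 22 and segID in cols 73-76."""
    if len(chain) != 1:
        raise ValueError("PDB chain ID must be exactly one character")
    if len(segid) > 4:
        raise ValueError("segment ID must be at most four characters")
    # Short atom names conventionally start in column 14.
    name_field = name if len(name) == 4 else f" {name:<3s}"
    line = (
        f"ATOM  {serial:5d} {name_field} {resname:>3s} {chain}{resseq:4d}    "
        f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}      {segid:<4s}"
    )
    return line.ljust(80)


def parse_chain_and_segid(line):
    """Read the chain ID (col 22) and segment ID (cols 73-76) from a record."""
    padded = line.ljust(80)
    return padded[21].strip(), padded[72:76].strip()


# 80 segments: the chain characters have to repeat, the segment IDs do not.
lines = [
    format_atom_line(i + 1, "CA", "ALA", CHAIN_CHARS[i % len(CHAIN_CHARS)],
                     1, 0.0, 0.0, 0.0, segid=f"S{i:03d}")
    for i in range(80)
]
labels = [parse_chain_and_segid(line) for line in lines]
print(len({chain for chain, _ in labels}), len({seg for _, seg in labels}))
```

With this layout the chain column alone cannot tell the 80 segments apart, but the segment column can, which is why a parser keyed on the chain field is not enough for such structures.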

@MohamadHarastani
Collaborator Author

MohamadHarastani commented Apr 12, 2022 via email

@mms29
Collaborator

mms29 commented Apr 12, 2022

I see. Keep mrcfile, of course. I will replace the usages of spider_file3, since I handle the sampling rate through Scipion object attributes rather than the header. We can open an issue asking Scipion to allow setting the true sampling rate (at least when we pass a flag), since we have a motive: connecting with other packages.

It would be good to open an issue for Scipion, as this is a major problem (all image formats carry this information in the header, so all packages assume it is present). It might also be a problem in Xmipp, since I noticed the same behavior with "xmipp_image_header", which does not do the job it should (saving the sampling rate etc. in the header).

@MohamadHarastani
Collaborator Author

@mms29 Hi,
Do you know whether this has been fixed in Scipion/Xmipp?
I recently noticed xmipp_image_header being used to set the sampling rate of an MRC file.

@mms29
Collaborator

mms29 commented Nov 29, 2022

I just tried xmipp_image_header and it looks fixed. I will check whether saving an image from Scipion takes the sampling rate and the other header parameters into account. If so, I'll remove the dependency on mrcfile.

@MohamadHarastani
Collaborator Author

MohamadHarastani commented Nov 29, 2022 via email

@mms29
Collaborator

mms29 commented Dec 23, 2022

I believe we can continue using mrcfile, especially now that we have our own environment.
If so, we can close this issue.

@mms29 mms29 closed this as completed Dec 23, 2022