PDB files with >=100k atoms #226
Comments
This issue was also discussed on the OpenMM GitHub. It looks like they chose the VMD format, which switches to hexadecimal past 100k. This is what I obtained by reading an xyz file into VMD and writing it back out:
This, however, would not be compatible with our needs, since we need the serial to be unique when we store non-consecutive atoms for RMSD calculation. Indeed, in the VMD representation, past 100k some hex numbers contain only digits and are therefore indistinguishable from decimal numbers. E.g.,
corresponds to atom
The hybrid-36 format discussed there seems more consistent, since there is a one-to-one map between numbers and strings. The sequence of strings would be something like
etc. In addition, it uses base-36 numbers, resulting in a maximum atom number of ~87M for 5-character strings (for the VMD syntax it is ~1M, so much less). However, I am not sure how easy it would be to produce such files. They provide small C and Python tools to manipulate these numbers. Actually, supporting either the VMD format or hybrid-36 in PLUMED would be very easy and backward compatible (with the exception of the problem with the VMD format mentioned above).
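To make the comparison concrete, here is a minimal Python sketch of the two schemes. It is not the reference hybrid-36 tool and not PLUMED code: the function names (`hy36_encode`, `hy36_decode`, `vmd_style_serial`) are made up for illustration, negative values are not handled, and the VMD behaviour is assumed to be "decimal below 100000, plain hexadecimal otherwise" as described above (the letter case VMD uses does not matter for the point being made).

```python
import string

UPPER = string.digits + string.ascii_uppercase  # 0-9 then A-Z
LOWER = string.digits + string.ascii_lowercase  # 0-9 then a-z


def _to_base36(value, digits):
    """Convert a non-negative integer to a base-36 string using `digits`."""
    if value == 0:
        return digits[0]
    out = []
    while value:
        value, rem = divmod(value, 36)
        out.append(digits[rem])
    return "".join(reversed(out))


def hy36_encode(width, value):
    """Encode a non-negative integer into a hybrid-36 field of `width` chars."""
    if value < 0:
        raise ValueError("negative values are not handled in this sketch")
    decimal_limit = 10 ** width
    if value < decimal_limit:
        return f"{value:{width}d}"       # plain right-justified decimal
    value -= decimal_limit
    block = 26 * 36 ** (width - 1)       # strings starting with one letter
    offset = 10 * 36 ** (width - 1)      # base-36 value of "A000..."
    for digits in (UPPER, LOWER):        # upper-case block first, then lower
        if value < block:
            return _to_base36(value + offset, digits)
        value -= block
    raise ValueError("value out of range for this field width")


def hy36_decode(width, field):
    """Decode a hybrid-36 field back to an integer."""
    text = field.strip()
    if len(field) != width or not text:
        raise ValueError("bad field width")
    if text[0] in string.digits:
        return int(text)                 # decimal range
    if len(text) != width:
        raise ValueError("letter-prefixed fields must fill the whole width")
    block = 26 * 36 ** (width - 1)
    offset = 10 * 36 ** (width - 1)
    if text.isupper():
        return int(text, 36) - offset + 10 ** width
    if text.islower():
        return int(text, 36) - offset + 10 ** width + block
    raise ValueError("mixed-case field")


def vmd_style_serial(value):
    """Assumed VMD-like scheme: decimal below 100000, hexadecimal otherwise."""
    return f"{value:5d}" if value < 100000 else f"{value:5x}"


# Hex collision: two different atoms get the same 5-character serial.
print(vmd_style_serial(20000), vmd_style_serial(0x20000))
# Hybrid-36 keeps them distinct and round-trips.
print(hy36_encode(5, 20000), hy36_encode(5, 0x20000))
print(hy36_decode(5, hy36_encode(5, 0x20000)) == 0x20000)
```

With 5-character fields this gives 100000 decimal values plus two blocks of 26 * 36^4 letter-prefixed strings, i.e. 87,440,032 distinct serials, versus 16^5 = 1,048,576 for pure hexadecimal.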
PDB format does not allow atom numbers >=100k.
Currently we are bound to PDB format in two places:
For MOLINFO we need to access residue numbers, chains, and atom types, so we are probably forced to remain with PDB format. A possible solution would be to allow multiple PDBs (see also #134).
For reference structures, instead, it could be complicated to split a system across multiple files. We could perhaps think about a different format?
We could look for an already existing modification of the PDB format, for instance the hybrid format mentioned here.