PDBReader reopens files reading every frame #850

Closed
kain88-de opened this issue May 12, 2016 · 2 comments

Comments

@kain88-de
Member

The PDBReader opens and closes the trajectory file every time it reads a frame. Constantly reopening the file means we are effectively doing direct I/O. This is a problem when MDAnalysis runs on an HPC cluster with 10 to 100 different processes at the same time.

HPC clusters usually have some kind of heavily buffered distributed filesystem, such as GPFS. These types of filesystems hate direct I/O; they prefer buffered I/O. That means keeping a file open for a longer period of time and reading from it, which they can buffer. Closing the file breaks their buffering mechanism. The end result is super slow file access for everybody on the cluster.

The solution is to keep the file pointer open. I did the same for the XTC/TRR readers.
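
To illustrate the idea, here is a minimal sketch of a reader that opens its file once and then only seeks. It is not the actual MDAnalysis PDBReader: `PersistentFrameReader`, `write_demo_pdb`, and the `open_count` counter are hypothetical names made up for this example, and it assumes a multi-frame PDB file where each frame is wrapped in `MODEL` / `ENDMDL` records.

```python
import os
import tempfile


class PersistentFrameReader:
    """Hypothetical reader that keeps one file handle for its whole lifetime."""

    def __init__(self, filename):
        self.filename = filename
        self.open_count = 0  # illustration only: counts calls to open()
        self._file = self._open()
        self._offsets = self._index_frames()

    def _open(self):
        self.open_count += 1
        # Binary mode so that tell()/seek() offsets are plain byte positions.
        return open(self.filename, "rb")

    def _index_frames(self):
        """Scan the file once and record the byte offset of every MODEL record."""
        offsets = []
        self._file.seek(0)
        while True:
            pos = self._file.tell()
            line = self._file.readline()
            if not line:
                break
            if line.startswith(b"MODEL"):
                offsets.append(pos)
        return offsets

    def __len__(self):
        return len(self._offsets)

    def read_frame(self, i):
        """Return the lines of frame i by seeking on the already open handle."""
        self._file.seek(self._offsets[i])
        lines = []
        for line in iter(self._file.readline, b""):
            lines.append(line.decode("ascii"))
            if line.startswith(b"ENDMDL"):
                break
        return lines

    def close(self):
        self._file.close()

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.close()


def write_demo_pdb(path, n_frames=3):
    """Write a tiny multi-frame file with made-up coordinates for the demo."""
    with open(path, "w", newline="\n") as f:
        for i in range(n_frames):
            f.write("MODEL     %4d\n" % (i + 1))
            f.write("ATOM      1  CA  ALA A   1    %8.3f%8.3f%8.3f\n" % (i, 0.0, 0.0))
            f.write("ENDMDL\n")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.pdb")
    write_demo_pdb(path)
    with PersistentFrameReader(path) as reader:
        for i in (2, 0, 1):  # random access without reopening the file
            reader.read_frame(i)
        print("frames:", len(reader), "open() calls:", reader.open_count)
```

However many frames are read, and in whatever order, `open()` is called only once, so the filesystem sees one long-lived handle that it can buffer.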

@richardjgowers
Member

#239 was trying to move towards always accessing files this way, but it sounds like this is a bad idea now.

@kain88-de
Member Author

Yes, if we want MDAnalysis to run on HPC clusters for massively parallel analysis, we have to keep the file descriptors open.

richardjgowers added a commit that referenced this issue May 12, 2016
PDBReader only opens a single file handle in its lifetime
kain88-de added a commit that referenced this issue May 13, 2016
 First draft of faster pdb reading (fixes #848, #850)