Segmentation fault in ANM calculations on large systems, despite tons of available RAM #220
Comments
LAPACK handles those through SciPy. I guess the problem is due to the diagonalizer. Currently, we have an effort to build GPU-based eigenvalue decomposition for large molecules with MAGMA. It will be available with version 1.9.
This is still in progress. There is a GPU implementation that is pretty much complete but not yet integrated.
You could maybe try using Dask. This package handles much larger arrays and splits the computation into parallel chunks. It has a singular value decomposition (https://docs.dask.org/en/latest/array-api.html?highlight=linalg#dask.array.linalg.svd) that you could perhaps use for the matrix decomposition. I have also thought about incorporating it, but never got a chance.
Alternatively, you could try coarse-graining further. One option is to take every second or third residue, or even larger gaps (see https://pubmed.ncbi.nlm.nih.gov/11913377/, where this seems to work up to 1/40 of the residues using a shifting scheme), or to take the average position of every few residues. Another option is more rigorous Markov methods for hierarchical coarse-graining (https://pubmed.ncbi.nlm.nih.gov/17691893/), although this might be more challenging. Another alternative is to use our new interface CryoDy to obtain a coarse-grained elastic network directly from a cryo-EM map (see http://prody.csb.pitt.edu/tutorials/cryoem_tutorial/).
Best wishes
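A rough sketch of the Dask idea follows. This is not part of ProDy: the file name, chunk size, and `k` are illustrative choices, and a truncated SVD returns the largest singular components, so extracting the low-frequency ANM modes from its output would still need extra care.

```python
# Sketch only: out-of-core decomposition of a precomputed Hessian with Dask.
# Assumes the 3N x 3N Hessian was built elsewhere (e.g. with ProDy) and saved.
import numpy as np
import dask.array as da

hessian = np.load('hessian.npy')                 # hypothetical file name
H = da.from_array(hessian, chunks=(4392, 4392))  # chunk size is illustrative

# Approximate truncated SVD, computed over the chunks in parallel.
# Note: this returns the *largest* singular components, so mapping the
# result onto the lowest-frequency ANM modes needs additional work.
u, s, v = da.linalg.svd_compressed(H, k=20)
u, s = u.compute(), s.compute()                  # trigger the parallel computation
```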
I was just trying this yesterday. I think the best option is to use buildHessian with sparse=True and calcModes with turbo=False. It may also help to limit the number of CPUs. What I did was along the lines of the sketch below:
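The exact code that followed is not preserved here; this is a minimal sketch of the approach described above, reusing the structure and selection from the original post. The cutoff and number of modes are illustrative, not prescribed.

```python
# Sketch of the sparse route described above; cutoff/n_modes are illustrative.
from prody import parsePDB, ANM, writeNMD

calphas = parsePDB('3J8H_ACEG.pdb').select('protein and name CA')

anm = ANM('RyR closed ANM analysis')
anm.buildHessian(calphas, cutoff=15.0, sparse=True)  # keep the Hessian as a scipy sparse matrix
anm.calcModes(n_modes=20, turbo=False)               # avoid the dense turbo code path
writeNMD('RyR_closed.ANM.nmd', anm, calphas)
```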
I've found that limiting the number of threads can also speed up the calculations sometimes.
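One common way to do that is through the BLAS/LAPACK thread environment variables, set before NumPy is imported; which variable takes effect depends on how NumPy was built, so the names below cover the usual cases.

```python
# Limit BLAS/LAPACK threading before importing prody (which imports NumPy/SciPy).
# Which variable is honored depends on the BLAS backend NumPy was built against.
import os
os.environ['OMP_NUM_THREADS'] = '4'
os.environ['OPENBLAS_NUM_THREADS'] = '4'
os.environ['MKL_NUM_THREADS'] = '4'

from prody import *  # must come after the environment settings to take effect
```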
Dear all,
I am trying to perform ANM calculations on a tetrameric membrane protein (PDB ID 3J8H, chains A, C, E, G) using only CA atoms, which amount to a total of 14,640 atoms. At our university we have a computer cluster with up to 1 TB of RAM, so for testing purposes I am booking up to 800 GB, but the job crashes with a segmentation fault (and core dump). Here is the script:
from prody import *
import matplotlib as mpl
mpl.use('Agg')
import matplotlib.pylab as plab
import matplotlib.pyplot as plt
import numpy as np
RyR_closed = parsePDB('3J8H_ACEG.pdb')
RyR_closed = RyR_closed.select('protein and name CA')
# compute ANM modes
anm = ANM('RyR closed ANM analysis')
anm.buildHessian(RyR_closed)
anm.calcModes()
# write modes to be represented by VMD
# to be read by parseNMD subroutine
writeNMD('RyR_closed.ANM.nmd',anm,RyR_closed)
quit()
And here is the error:
@> 107796 atoms and 1 coordinate set(s) were parsed in 1.23s.
@> Hessian was built in 65.85s.
/netscr/EX/lsf/killdevil/lsbatch/1420483745.214510: line 8: 12861 Segmentation fault (core dumped) ./prody_anm.CA.py
However, I know that the maximum memory used was about 29 GB:
Exited with exit code 139.
Resource usage summary:
I see two possible problems (I am not a Python programmer):
Python's memory allocation for arrays is predetermined to a size below what is needed in the diagonalization step (the Hessian computation itself finishes extremely fast).
Or perhaps it is known that the LAPACK/BPACK subroutines used as the diagonalizer (the ones in C) cannot handle arrays of this size (see the rough size estimate below).
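For scale, a quick back-of-the-envelope estimate (not from the original report, just arithmetic on the numbers above): a dense double-precision Hessian for 14,640 CA atoms is already around 14 GiB, and the dense eigensolver needs additional workspace on top of that.

```python
# Rough size estimate, assuming a dense float64 Hessian (3N x 3N).
n_atoms = 14640
dof = 3 * n_atoms                      # 43,920 degrees of freedom
gib = dof * dof * 8 / 1024**3          # 8 bytes per float64
print(f"Hessian alone: {gib:.1f} GiB") # ~14.4 GiB; diagonalization adds workspace
```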
Can anyone help me perform ANM on large systems?
Thank you so much,
Raul