.wait() method should report if acemd is stuck #56

Closed
alejandrovr opened this issue Jun 9, 2016 · 18 comments

@alejandrovr
Contributor

alejandrovr commented Jun 9, 2016

Hello,
we are trying to equilibrate the system, but when the script reaches "mdx.wait()" it takes too long (more than 2 hours and still running), whereas the documentation says it should take about 5 minutes.

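from glob import glob
from htmd import *  # tutorial-style import; provides AcemdLocal
from htmd.protocols.equilibration_v1 import Equilibration
from natsort import natsorted

md = Equilibration()
md.numsteps = 1000
md.temperature = 298

# Write one equilibration folder per built system
builds = natsorted(glob('./docked/build/*/'))
for i, b in enumerate(builds):
    md.write(b, 'docked/equil/{}/'.format(i+1))

# Submit every equilibration folder to the local machine, then block until done
mdx = AcemdLocal()
mdx.submit(glob('./docked/equil/*/'))

mdx.wait()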

It does not give any errors. Should we keep waiting, or has something gone wrong?

@stefdoerr
Contributor

Can you check, in the directory it is simulating, what is written in the log.txt files and in the FINISH_TIME file?
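
The check suggested above could be scripted roughly as follows. This is only a sketch: it assumes each simulation folder contains a log.txt and that a FINISH_TIME file (the name mentioned above) marks completion. The helper name `summarize_sim_dir` is hypothetical and is not part of HTMD.

```python
from pathlib import Path


def summarize_sim_dir(sim_dir, tail=5):
    """Return the last `tail` lines of log.txt and whether FINISH_TIME exists.

    Hypothetical helper, not an HTMD function.
    """
    sim_dir = Path(sim_dir)
    log = sim_dir / "log.txt"
    has_log = log.is_file()
    lines = []
    if has_log:
        # errors="replace" keeps the helper usable on logs with odd bytes
        lines = log.read_text(errors="replace").splitlines()[-tail:]
    return {
        "dir": str(sim_dir),
        "has_log": has_log,
        "log_tail": lines,
        "finished": (sim_dir / "FINISH_TIME").exists(),
    }
```

Calling it on each folder under docked/equil and printing the result shows at a glance which runs never wrote a log and which never finished.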

@stefdoerr
Contributor

I've had ACEMD get stuck in an infinite minimization loop before. To debug it, try running acemd directly from the simulation folder (from the command line). Then you will see what's happening.
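
For anyone who prefers to do this from Python, here is a minimal sketch that runs a command inside a simulation folder and captures everything it prints. The function name `run_in_folder` is hypothetical. The command list is whatever you would type by hand in that folder, and nothing here is taken from HTMD itself.

```python
import subprocess


def run_in_folder(cmd, folder, timeout=None):
    """Run `cmd` (a list of arguments) inside `folder`.

    Returns (exit_code, combined_output). exit_code is None if the command
    was still running after `timeout` seconds and had to be stopped.
    """
    try:
        done = subprocess.run(
            cmd,
            cwd=folder,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr so loader errors are visible
            universal_newlines=True,
            timeout=timeout,
        )
        return done.returncode, done.stdout
    except subprocess.TimeoutExpired as exc:
        out = exc.output or ""
        if isinstance(out, bytes):
            out = out.decode(errors="replace")
        return None, out
```

A non-zero exit code together with the captured text is usually enough to tell a launch failure apart from a run that is simply slow.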

@giadefa
Contributor

giadefa commented Jun 9, 2016

Maybe wait() could also give more info.

@stefdoerr
Contributor

Like what info? Projected runtime maybe? But I think that still would not help in debugging this. Only looking at the acemd real-time output helps with such problems, and printing that would be impossible (and messy).

@nesilin

nesilin commented Jun 9, 2016

Hi!
For your information, the HTMD tutorial barnase-barstar-building-simulation, which also uses ACEMD like

mdx = AcemdLocal()
mdx.submit('./equil')
mdx.wait()

also got stuck at this point.
Apart from that, we tried what you suggested, but the FINISH_TIME file does not exist and log.txt does not report any problem, nor anything about trouble in the wait method.
We ran ACEMD with the input file inside the /docked/equil/1 directory and obtained output.coor, output.xsc and the other output files.
Could it be that we are missing the NVIDIA driver? We ran ACEMD from the command line on another machine and, as I said, it worked, but on alejandrovr's computer it did not. On alejandrovr's computer the error was the following:

 /home/alejandro/miniconda3/bin/acemd.bin: error while loading shared libraries: libcuda.so.1: cannot open shared object file: No such file or directory

@stefdoerr
Contributor

Hm, OK, that error means the CUDA driver library (libcuda.so.1, which is installed with the NVIDIA driver) is missing on that machine. ACEMD needs CUDA and an NVIDIA GPU to work.

So the mdx.wait() problem was only on @alejandrovr's machine?
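
A quick way to test for this condition before submitting anything is to ask the dynamic loader for the library directly. The sketch below uses only the standard-library `ctypes` module. The helper name `can_load_library` is hypothetical and is not an HTMD or ACEMD function.

```python
import ctypes


def can_load_library(name):
    """Return True if the dynamic loader can open the shared library `name`."""
    try:
        ctypes.CDLL(name)
    except OSError:
        # Raised when the library is missing or cannot be loaded
        return False
    return True
```

On a Linux machine, `can_load_library("libcuda.so.1")` returning False would correspond to the loader error quoted above.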

@j3mdamas
Contributor

j3mdamas commented Jun 9, 2016

yeah, gianni said that some of their machines wouldn't be able to run acemd...

@stefdoerr
Contributor

Would be nice though if AcemdLocal did not get stuck and reported something. But I need a test machine to see what goes wrong.
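
One possible shape for such reporting, sketched with the standard library only: poll the launched process, and distinguish "exited with an error" from "still running after a deadline". This is an illustration under the assumption that the simulation is started as a `subprocess.Popen` object. It is not how AcemdLocal is implemented, and `wait_and_report` is a hypothetical name.

```python
import time


def wait_and_report(proc, timeout, poll=1.0):
    """Wait for a subprocess.Popen `proc` and describe how the wait ended.

    Returns "finished", "failed (exit code N)" or "still running after T s".
    """
    deadline = time.monotonic() + timeout
    while True:
        code = proc.poll()  # None while the process is still alive
        if code is not None:
            if code == 0:
                return "finished"
            return "failed (exit code {})".format(code)
        if time.monotonic() >= deadline:
            return "still running after {} s".format(timeout)
        time.sleep(poll)
```

With a check like this, a launcher that dies immediately (for example because a shared library is missing) would be reported as failed rather than leaving the caller waiting.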

@giadefa
Contributor

giadefa commented Jun 9, 2016

workspace1

@j3mdamas
Contributor

j3mdamas commented Jun 9, 2016

So, this issue is about changing AcemdLocal (anything that runs acemd, really) to report something if it gets stuck.

As for the students, well, the groups are large, so I hope there is at least one person per group who can run acemd.

@j3mdamas j3mdamas changed the title .wait() method taking too long. .wait() method should report it acemd is stuck Jun 9, 2016
@j3mdamas j3mdamas added this to the v1.2.0 milestone Jun 9, 2016
@giadefa
Contributor

giadefa commented Jun 9, 2016

only 4 people need to run

@alejandrovr
Contributor Author

@stefdoerr it did not work for me, nor for my colleagues who do not have NVIDIA. Thanks for answering; running from the command line has helped us.

@giadefa
Contributor

giadefa commented Jun 10, 2016

Why are you not running on the provided machines with Nvidia GPUs?

@alejandrovr
Contributor Author

Because VMD was not installed. Now it is; however, VMD is not fully working, because it only runs if one connects via ssh without the -Y option, and then no window appears, only the VMD command line. That seems to be enough to run the scripts that use VMD.

@giadefa
Contributor

giadefa commented Jun 10, 2016

Hmm, no script uses VMD except for visualization?

@alejandrovr
Contributor Author

alejandrovr commented Jun 10, 2016

We found this piece of code that saves a view into a variable:

from ipywidgets.widgets import Box
w = []
for i, m in enumerate(molbuilt):
    m.view(sel='protein', style='NewCartoon', hold=True)
    m.view(sel='water', style='Lines', hold=True)
    h = m.view(sel='resname MOL', style='Licorice', color=0)
    w.append(h)
b = Box(children=(w[0],w[1]))
b

Now our script fails at Box, but we think that is Jupyter-specific and not really needed for the simulation process, so we are trying to run the script without it. Is that correct? Thanks.

P.S.: We have removed the Box command and it worked fine.

@stefdoerr stefdoerr changed the title .wait() method should report it acemd is stuck .wait() method should report if acemd is stuck Jul 12, 2016
@stefdoerr
Contributor

If there is still an issue, please report it. Closing for now.

@mj-harvey
Contributor

What I take from this is that there's a problem with ACEMD sometimes getting stuck minimizing ill-conditioned input. There's a case about that open elsewhere, so I'm going to close this one. If you think there's another problem, please reopen and elaborate.

6 participants