parallel computing error #51

Closed
phymalidoust opened this issue Oct 19, 2018 · 7 comments


@phymalidoust
Collaborator

When running on a supercomputer with 1 node and 1 task per node, the code works, although it returns several warnings. However, when the number of nodes or tasks per node is increased, the code stops with an error. The output files of both cases are attached.

output_1node_1task.txt
output_3nodes_1task.txt

@davidkleiven
Owner

I have a few questions:

  1. What was the exact command used to install the package?
  2. Which modules were loaded (output of module list)?
  3. Which compilers were used (I guess Intel or GCC)?

@davidkleiven
Owner

How are you parallelizing?
It looks like these are extremely small calculations with only 2500 MC steps. How long does each one take, a few seconds?

@phymalidoust
Collaborator Author

Answers to your questions:
1- pip3 install -e . --user (on Idun)
2,3- gcc-8, mpi, python3 ... (it does not complain about a missing compiler, and I'm getting results with 1 node and 1 task). Also, I have installed and loaded the compiler/modules from my own directory.

I assumed it is parallelized internally and that I don't need to parallelize the outer for loop. Am I correct?
Please note that it raises these errors even when 1 node and 1 task per node are requested. It complains about something inside the MC code.

The time per step sounds reasonable, but I didn't check exactly how long the steps take.

@davidkleiven
Owner

davidkleiven commented Oct 20, 2018

Monte Carlo sampling is a Markov chain, so you cannot parallelize within a single chain: each step depends on the previous one.
There are a couple of options:

  1. If you use a Markov chain to sample the configurational space, the only thing you can do is start multiple Markov chains and average their outcomes.
  2. If you are searching for the ground state, you can start multiple searches and take the one with the lowest energy as the ground state.
  3. If you have multiple parameters (such as several compositions), you can parallelize over those.

For your case I think option 3 is the best, which means that you have to parallelize a for loop over compositions.
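A minimal sketch of what option 3 could look like, assuming mpi4py is available on the cluster. The helper names `split_round_robin` and `run_for_composition` and the composition values are hypothetical placeholders for illustration; the body of `run_for_composition` is where the serial Monte Carlo run for one composition would go. Without mpi4py the script falls back to a single serial process.

```python
def split_round_robin(items, rank, size):
    """Return the subset of items that the given rank should process."""
    return items[rank::size]


def run_for_composition(conc):
    # Hypothetical placeholder: set up and run one serial Monte Carlo
    # calculation for this composition and return its result.
    return {"conc": conc, "energy": None}


def main():
    try:
        from mpi4py import MPI
        comm = MPI.COMM_WORLD
        rank, size = comm.Get_rank(), comm.Get_size()
    except ImportError:
        # Assumption: fall back to a plain serial run if mpi4py is missing.
        comm, rank, size = None, 0, 1

    compositions = [0.1, 0.2, 0.3, 0.4, 0.5]  # example values only

    # Each rank handles its own share of the outer for loop.
    my_results = [run_for_composition(c)
                  for c in split_round_robin(compositions, rank, size)]

    # Collect the per-rank lists on rank 0.
    gathered = comm.gather(my_results, root=0) if comm else [my_results]
    if rank == 0:
        all_results = sorted((r for part in gathered for r in part),
                             key=lambda r: r["conc"])
        print(all_results)


if __name__ == "__main__":
    main()
```

Launched with something like `mpirun -n 3 python3 script.py`, each rank would run an independent set of compositions, so no communication is needed until the final gather.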

@davidkleiven
Owner

davidkleiven commented Oct 20, 2018

Option 1 is handled internally, but you have to pass an MPI communicator object to the Monte Carlo class to activate it. For ground-state search I recommend option 3.
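To illustrate the idea behind option 1, here is a toy sketch that does not use the package's own API (the exact way the communicator is passed to the Monte Carlo class is not shown in this thread). It runs several independent Metropolis chains with different seeds on an assumed toy energy E(x) = x²/2 at kT = 1 and averages their estimates of ⟨x²⟩. The names `run_chain` and `average_over_chains` are hypothetical; in an MPI run each rank would execute one chain and the averages would be reduced across ranks.

```python
import math
import random


def run_chain(seed, n_steps=5000, step_size=1.0):
    """Run one Metropolis chain on E(x) = x^2 / 2 and return the mean of x^2."""
    rng = random.Random(seed)  # independent, reproducible stream per chain
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        trial = x + rng.uniform(-step_size, step_size)
        delta_e = 0.5 * trial ** 2 - 0.5 * x ** 2
        # Metropolis acceptance rule at kT = 1.
        if delta_e <= 0.0 or rng.random() < math.exp(-delta_e):
            x = trial
        total += x * x
    return total / n_steps


def average_over_chains(seeds, n_steps=5000):
    """Average the per-chain estimates; each chain could run on its own rank."""
    means = [run_chain(seed, n_steps) for seed in seeds]
    return sum(means) / len(means)


if __name__ == "__main__":
    print(average_over_chains([1, 2, 3, 4]))
```

The chains never exchange information while running, which is why this kind of averaging parallelizes trivially even though a single chain does not.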

@phymalidoust
Collaborator Author

phymalidoust commented Oct 20, 2018

OK, understood. I will try to make an example of option 3, and then maybe you could complement it with option 1.

@davidkleiven
Owner

davidkleiven commented Oct 20, 2018 via email
