parallel computing error #51

phymalidoust · 2018-10-19T11:47:48Z

When running on a supercomputer with 1 node and 1 task per node, the code works, although it returns several warnings. However, when the number of nodes or tasks per node is increased, the code stops with an error. The output files for both cases are attached.

output_1node_1task.txt
output_3nodes_1task.txt

davidkleiven · 2018-10-19T17:33:11Z

I have a few questions:

1. What was the exact command used to install the package?
2. Which modules were loaded (output of module list)?
3. Which compilers were used (I guess it is Intel or GCC)?

davidkleiven · 2018-10-19T17:38:28Z

How are you parallelizing?
It looks like these are extremely small calculations with only 2500 MC steps. How long does each one take, a few seconds?

phymalidoust · 2018-10-20T06:57:33Z

Answers to your questions:
1- pip3 install -e . --user (on Idun)
2,3- gcc-8, mpi, python3 ... (it does not complain about a missing compiler; I'm getting results with 1 node and 1 task). Also, I have installed and loaded the compiler/modules from my own directory.

I assumed it is internally parallelized and that I don't need to parallelize the outer for-loop. Am I correct?
Please note that it raises these errors even when 1 node and 1 task per node are requested. It complains about something inside the MC code.

The time per run sounds reasonable, but I haven't checked exactly how long the runs take.

davidkleiven · 2018-10-20T07:56:15Z

Monte Carlo sampling is a Markov chain, so you cannot parallelize a single run.
There are a couple of options:

1. If you use a Markov chain to sample the configurational space, the only thing you can do is start multiple Markov chains and average their outcomes.
2. If you are searching for the ground state, you can start multiple searches and take the one with the lowest energy as the ground state.
3. If you have multiple parameters (like several compositions), you can parallelize over them.

For your case I think option 3 is the best, which means that you have to parallelize a for loop over compositions.
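
A minimal sketch of option 3, assuming the compositions are known up front and each rank takes every `size`-th one. `run_mc` is a hypothetical placeholder for one complete Monte Carlo run, not a function of this package. Under MPI, `rank` and `size` would typically come from `mpi4py` (`MPI.COMM_WORLD.Get_rank()` and `MPI.COMM_WORLD.Get_size()`); here three ranks are simulated in one process so the split can be inspected without MPI.

```python
def compositions_for_rank(compositions, rank, size):
    """Round-robin split: rank r gets items r, r + size, r + 2*size, ..."""
    return compositions[rank::size]


def run_mc(composition):
    """Hypothetical placeholder for one independent Monte Carlo run."""
    return {"composition": composition, "energy": None}


def run_my_share(compositions, rank, size):
    """Run only the compositions assigned to this rank."""
    mine = compositions_for_rank(compositions, rank, size)
    return [run_mc(c) for c in mine]


# Simulate three ranks in a single process to show how the work is divided.
compositions = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7]
for rank in range(3):
    print(rank, compositions_for_rank(compositions, rank, 3))
```

Because every run is independent, the ranks do not need to communicate until the results are collected at the end.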

davidkleiven · 2018-10-20T08:08:33Z

Option 1 is handled internally, but you have to pass an MPI communicator object to the Monte Carlo class to activate it. For ground-state search I recommend option 3.
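
To illustrate what option 1 amounts to conceptually, here is a toy sketch that is not the package's internal implementation: several independent Metropolis chains with different seeds sample a two-level system (energies 0 and 1, with k_B = 1), and their mean energies are averaged. All names here are made up for illustration. In an MPI setting each chain would run on its own rank and the per-chain averages would be combined at the end.

```python
import math
import random


def run_chain(temperature, n_steps, seed):
    """Metropolis sampling of a toy two-level system with energies 0 and 1."""
    rng = random.Random(seed)
    state = 0  # the state index doubles as its energy
    total_energy = 0
    for _ in range(n_steps):
        proposed = 1 - state
        delta_e = proposed - state
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / temperature):
            state = proposed
        total_energy += state
    return total_energy / n_steps


def average_over_chains(temperature, n_steps, seeds):
    """Average the mean energy of independent chains, one chain per seed."""
    results = [run_chain(temperature, n_steps, seed) for seed in seeds]
    return sum(results) / len(results)


exact = 1.0 / (1.0 + math.exp(1.0))  # Boltzmann average at temperature 1
estimate = average_over_chains(1.0, 20000, seeds=[1, 2, 3, 4])
print(f"estimate={estimate:.3f} exact={exact:.3f}")
```

Each chain must use a different seed; chains started with the same seed would reproduce the same trajectory and add no statistical information.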

phymalidoust · 2018-10-20T08:54:26Z

OK, understood. I will try to make an example of option 3, and then maybe you could complement it with option 1.

davidkleiven · 2018-10-20T09:01:05Z

Option 1 is of no use when searching for ground states. At low temperatures it might also be rather inefficient, because all the Markov chains can easily end up visiting the same states, and then there is not much point. If you really want to make sure that you use the CPUs efficiently, option 3 is the best.
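
For the ground-state case, the "start multiple searches and keep the lowest energy" idea from option 2 can be sketched with a toy example. The one-dimensional energy landscape and the greedy descent below are illustrative assumptions only, standing in for a real ground-state search; in a parallel run each start would be handled by a different rank and the results compared afterwards.

```python
def greedy_descent(energies, start):
    """Move to the lowest neighbour until no neighbour is lower."""
    i = start
    while True:
        neighbours = [j for j in (i - 1, i + 1) if 0 <= j < len(energies)]
        best = min(neighbours, key=lambda j: energies[j])
        if energies[best] >= energies[i]:
            return energies[i], i  # local minimum reached
        i = best


def lowest_of_searches(energies, starts):
    """Run one search per start and keep the (energy, index) with lowest energy."""
    return min(greedy_descent(energies, s) for s in starts)


# Toy landscape with several local minima; different starts get trapped in different ones.
energies = [3, 1, 2, 0, 4, -1, 5, 2]
print(lowest_of_searches(energies, starts=[0, 2, 4, 7]))
```

A single search can get stuck in a local minimum, which is why the lowest result over several independent starts is taken as the ground-state candidate.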

davidkleiven closed this as completed Jan 11, 2019
