FDS6 6.5.3 - Fatal error in PMPI_init_thread Windows MPI runs #5162
Comments
For one thing, you are trying to run a 2 mesh case with four CPUs: 2 on machine1 and 2 on machine2. Also, I do not understand what this command means:
mpiexec -n 4 fds CHID.fds test_mpi -> returns hello world
Are you trying to run fds or the test_mpi program? |
Thanks for the reply, and apologies for the typo.
The commands I actually ran were:
mpiexec -n 2 fds CHID.FDS
mpiexec -n 2 test_mpi
They both seem to work on my machine, as do the other commands I specified in my original post.
I also corrected the command shown in the figure I attached to use 1 process per machine and still got the same error. Should I also be specifying the number of threads/processors per machine (OMP_NUM_THREADS=2)?
Thanks
|
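The mismatch pointed out above (a 2 mesh case launched with four processes) can be caught before calling mpiexec. The following is a minimal sketch, not part of FDS: the helper names `count_meshes` and `build_mpiexec_command` are hypothetical, and it assumes each mesh appears as its own `&MESH` line in the input file (replication through `MULT` is ignored). It builds the `-hosts <count> <name> <nprocs> ...` form of the command that appears elsewhere in this thread, with one process per mesh.

```python
import re


def count_meshes(fds_text):
    """Count &MESH namelist records in FDS input text.

    Assumption: one mesh per &MESH line; MULT replication is not handled.
    """
    pattern = r"^[ \t]*&MESH\b"
    return len(re.findall(pattern, fds_text, flags=re.MULTILINE | re.IGNORECASE))


def build_mpiexec_command(hosts, n_meshes, executable, args=()):
    """Return an argv list that starts one MPI process per mesh.

    Processes are dealt out to hosts round-robin, and hosts that end up
    with no process are left out of the -hosts list.
    """
    if n_meshes < 1:
        raise ValueError("need at least one mesh")
    if not hosts:
        raise ValueError("need at least one host")
    per_host = {h: 0 for h in hosts}
    for i in range(n_meshes):
        per_host[hosts[i % len(hosts)]] += 1
    used = [(h, n) for h, n in per_host.items() if n > 0]
    cmd = ["mpiexec", "-hosts", str(len(used))]
    for host, nprocs in used:
        cmd += [host, str(nprocs)]
    cmd.append(executable)
    cmd += list(args)
    return cmd


if __name__ == "__main__":
    # Hypothetical two-mesh input, used only to exercise the helpers.
    text = (
        "&HEAD CHID='demo' /\n"
        "&MESH IJK=10,10,10, XB=0,1,0,1,0,1 /\n"
        "&MESH IJK=10,10,10, XB=1,2,0,1,0,1 /\n"
        "&TAIL /\n"
    )
    n = count_meshes(text)
    print(" ".join(build_mpiexec_command(["machine1", "machine2"], n, "fds", ["CHID.fds"])))
```

For a two-mesh file and two hosts this yields one process on each machine, which is the layout the reply above describes trying.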
I cannot say what is wrong. If you say that you can run a case across two machines, I cannot say why you can or cannot run another case on another pair of machines. I do not understand the error message. |
I can run the file on each machine alone through the mpiexec command using the same shared directory path. However, the problem arises when I try running across two computers. It just seems to time out. I wanted to check if anyone had come across such an error before.
Thanks
|
Hmm, I haven't had a chance to try this myself and dig into it, but this might be an issue with administrator privileges on both of the computers you are trying to use. I would suggest you try following these steps, and let us know at which step you can't proceed further: https://github.com/firemodels/fds/wiki/Installing-and-Running-FDS-on-a-Windows-PC |
Thanks. Will make sure I follow that and get back to you.
|
Make sure that your computers are on a Windows Domain Network; that is, you can log in to each machine using the same name and password. But even then, my experience with running FDS across two Windows machines is that it sometimes fails with an error message I cannot understand. I am not an expert on Windows or networks, so I cannot say what your particular problem is. Is FDS installed on both machines? Same version? Same location? Are the two machines identical in the version of the OS? |
Yes to all the above. I followed the FDS user guide on how to run an MPI process in Windows on multiple computers at home and it worked fine. However, when trying to set it up at work I had a few issues with restrictions, which I was able to solve. The command:
mpiexec -hosts 2 machine1 1 machine2 1 test_mpi
works fine. It is when specifying the shared directory and trying to run the fds file that I get the error I described in my original post.
I can only think that it is because the shared directory is located within Program Files or because of work network restrictions. I will keep trying and update with the outcome.
Thanks
|
Log in to each computer and see if you can access the shared directory EXACTLY as you have specified it in your mpiexec command. Do not use a directory within Program Files or other restricted parts of the computer. |
I did try what you suggested, and I can run the simulation on each computer from the shared directory I specified, but it fails when I tell mpiexec to run on the two host machines. I will try getting IT at my work to allow a shared directory outside the Program Files folder and get back to you. Thanks!
Roger
|
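The advice above (check the shared directory from each machine, exactly as written in the mpiexec command, and stay out of Program Files) can be turned into a small script to run on every host. This is only a sketch under stated assumptions: `check_shared_dir` is a hypothetical helper, not an FDS or MPI tool, and passing this check does not prove that mpiexec itself can reach the directory, since the MPI processes may run under different credentials.

```python
import os
import tempfile


def check_shared_dir(path):
    """Return a list of problems found with `path`; an empty list means it looks usable."""
    problems = []
    if not os.path.isdir(path):
        problems.append(f"not reachable as a directory: {path}")
        return problems
    if "program files" in path.lower():
        problems.append("path is inside Program Files, which is often restricted")
    try:
        # Create and automatically delete a scratch file to confirm write access.
        with tempfile.NamedTemporaryFile(dir=path, prefix="fds_write_test_"):
            pass
    except OSError as exc:
        problems.append(f"cannot write to directory: {exc}")
    return problems


if __name__ == "__main__":
    import sys

    # Pass the directory exactly as it appears in the mpiexec command.
    target = sys.argv[1] if len(sys.argv) > 1 else os.getcwd()
    issues = check_shared_dir(target)
    print("OK" if not issues else "\n".join(issues))
```

Running it on each machine with the same path string, for example a UNC path such as `\\machine1\share` (a made-up name here), shows whether every host resolves and can write to that location.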
Any luck with this issue? |
Hi Kevin, thanks for following up. Unfortunately, no luck. IT was not able to help me. It still times out when specifying two host machines, and I presume it has to do with network restrictions.
To try to get around the restrictions, I got admin privileges on a couple of towers and disconnected them from the work network and internet. I then connected them to a spare router I had (no internet). With this setup I was not even able to get the test_mpi command to work. I am really confused as to what I am missing. Any feedback would be much appreciated.
Regards
Roger
|
This is why more and more users are switching to Linux clusters or cloud computing services. I cannot offer much advice about Windows. |
@rpas1231 We've just released FDS 6.6. See if that helps with your issue, and report back. |
Thank you, Salah. I will report back once I do a test run.
Regards,
Roger Pasco
|
I want to know what the solution was, because I also have the same problem |
Hi,
I have recently started using FDS6 at work. I am trying to run mpiexec across multiple computers. I have used the FDS 6 user guide to set up MPI runs on Windows. I have tested the commands below and they seem to work fine.
However, when I get around to specifying (shown below) the working directory and the two machines, I get the error in the attached image.
I have tried doing this at home across two computers and it works fine as well. The only thing I can think of is that the issue is related to one of the following:
I would appreciate any information or alternatives regarding this issue. This is my first time posting, so please advise if you need me to upload any extra info. I am attaching the test file 2MeshSim.fds used for MPI testing and a snippet of the error I am getting.
Thanks,
Roger
2MeshSim.fds.txt