FDS 6.5.3 Upgrade Issue #4904
Comments
Try installing using this option, i.e. use the OpenMPI library that came
with the FDS installer:
OpenMPI options
Press 1 to install OpenMPI manually [default]
…On Fri, Mar 31, 2017 at 11:17 PM, tgob ***@***.***> wrote:
I have FDS 6.5.2 with OpenMPI 1.8.4 running stably on a Linux Ubuntu 16.04
LTS cluster with Mellanox Infiniband. The installation was completed using
the NIST pre-compiled FDS binaries. I recently attempted to upgrade to FDS
6.5.3 using the NIST pre-compiled binaries but there is a problem with the
new installation:
Installing 64 bit Linux FDS 6.5.3 and Smokeview 6.4.4
Options:
1. Press to begin installation
2. Type "extract" to copy the installation files to
FDS_6.5.3-SMV_6.4.4_linux64.tar.gz
[Enter]
FDS install options
Press 1 to install in /home/ob1/FDS/FDS6 [default]
Press 2 to install in /opt/FDS/FDS6
Press 3 to install in /usr/local/bin/FDS/FDS6
Enter a directory path to install elsewhere
[1][Enter]
OpenMPI options
Press 1 to install OpenMPI manually [default]
See /home/ob1/FDS/FDS6/bin/README.html for details
Press 2 to use /shared/openmpi_64ib
[2][Enter]
Installation directory: /home/ob1/FDS/FDS6
OpenMPI directory: /shared/openmpi_64ib
Installation beginning
The directory, /home/ob1/FDS/FDS6, already exists.
The installation directory, /home/ob1/FDS/FDS6, has been created.
Creating directory /home/ob1/FDS/FDS6/Uninstall
The installation directory, /home/ob1/FDS/FDS6/Uninstall, has been created.
Copying FDS installation files to /home/ob1/FDS/FDS6
Copy complete.
Backing up /home/ob1/.bashrc_fds to /home/ob1/.bashrc_fds_20170401_102309
Updating .bashrc_fds
Backing up /home/ob1/.bashrc to /home/ob1/.bashrc_20170401_102309
Updating .bashrc
*** Log out and log back in so changes will take effect.
Installation complete.
No issues were reported during the install but when I execute fds from a
terminal command prompt on the Master node I get the following output:
------------------------------
Sorry! You were supposed to get help about:
ini file:file not found
But I couldn't open the help file:
/shared/openmpi_64/share/openmpi/help-mpi-btl-openib.txt: No such file or
directory. Sorry!
libibverbs: Warning: couldn't load driver '/usr/lib/libibverbs/libmlx5':
/usr/lib/libibverbs/libmlx5-rdmav2.so: symbol ibv_cmd_destroy_flow,
version IBVERBS_1.0 not defined in file libibverbs.so.1 with link time
reference
libibverbs: Warning: couldn't load driver '/usr/lib/libibverbs/libmlx4':
/usr/lib/libibverbs/libmlx4-rdmav2.so: symbol ibv_cmd_destroy_flow,
version IBVERBS_1.0 not defined in file libibverbs.so.1 with link time
reference
libibverbs: Warning: no userspace device-specific driver found for
/sys/class/infiniband_verbs/uverbs0
Sorry! You were supposed to get help about:
btl:no-nics
But I couldn't open the help file:
/shared/openmpi_64/share/openmpi/help-mpi-btl-base.txt: No such file or directory. Sorry!
------------------------------
Fire Dynamics Simulator
Current Date : April 1, 2017 10:32:42
Version : FDS 6.5.3
Revision : FDS6.5.3-598-geb56ed1
Revision Date : Thu Jan 19 16:12:59 2017 -0500
Compilation Date : Jan 22, 2017 18:04:30
MPI Enabled; Number of MPI Processes: 1
OpenMP Enabled; Number of OpenMP Threads: 4
MPI version: 3.0
MPI library version: Open MPI v1.8.4, package: Open MPI ***@***.***
Distribution, ident: 1.8.4, repo rev: v1.8.3-330-g0344f04, Dec 19, 2014
Consult FDS Users Guide Chapter, Running FDS, for further instructions.
Hit Enter to Escape...
For some reason FDS appears to be trying to access a non-Infiniband OpenMPI
installation and associated help files at:
/shared/openmpi_64/share/openmpi
However, OpenMPI resides in the default Infiniband installation directory
(as for FDS 6.5.2):
/shared/openmpi_64ib/share/openmpi
.bashrc and .bashrc_fds are setting the environment variables
appropriately, and in particular PATH, LD_LIBRARY_PATH and FDSNETWORK as
follows:
PATH:
/shared/openmpi_64ib/bin:/home/ob1/FDS/FDS6/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
LD_LIBRARY_PATH:
/shared/openmpi_64ib/lib:/home/ob1/FDS/FDS6/bin/LIB64:/home/ob1/FDS/FDS6/bin/INTELLIBS16
FDSNETWORK:
Infiniband
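A quick way to sanity-check colon-separated variables such as the PATH and LD_LIBRARY_PATH values listed above is to print one entry per line and flag directories that do not exist on the node. The following is only a sketch; `check_path_list` is a hypothetical helper name and is not part of FDS or OpenMPI.

```shell
# Hypothetical helper (not part of FDS or OpenMPI): print each entry of a
# colon-separated list and mark whether the directory exists on this node.
check_path_list() {
  printf '%s\n' "$1" | tr ':' '\n' | while IFS= read -r dir; do
    [ -n "$dir" ] || continue            # skip empty entries
    if [ -d "$dir" ]; then
      echo "ok      $dir"
    else
      echo "missing $dir"
    fi
  done
}

check_path_list "${LD_LIBRARY_PATH:-}"
check_path_list "${PATH:-}"
```

Running this on each compute node, not only the master, may help show whether the shared OpenMPI directories are visible everywhere.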
Past upgrades (for example from 6.5.0 to 6.5.2) have been completed
successfully by simply downloading the FDS and SmokeView precompiled Linux
bundle and running the script (.sh) file.
Infiniband is still working, openmpi is still working (via Infiniband) and
SSH is still working (password-less access) between all nodes.
A Windows 7 upgrade to 6.5.3 worked just fine (albeit without OpenMPI and
Infiniband on my Windows workstation).
I also tried a Linux Ubuntu 16.04 LTS install without openmpi or
Infiniband. This also worked fine.
Any suggestions on how I might complete the upgrade to FDS 6.5.3?
--
Glenn Forney
|
Thank you for responding so quickly Glenn.
Yes, I have already tried the manual OpenMPI install option, followed by manually adjusting the .bashrc MPIDIST_IB variable to /shared/openmpi_64ib, and passing this to .bashrc_fds through the source line.
The result was the same. Running fds from a terminal command line produced the same error message as previously posted. I'm not sure why fds is looking for the directory:
/shared/openmpi_64/share/openmpi/
The actual directory with the help files is /shared/openmpi_64ib/share/openmpi.
Have the default OpenMPI or Infiniband installation directories changed since FDS6.5.2 was built?
One thing I haven't done is to recompile OpenMPI or use the bundled OpenMPI. This was originally compiled with GNU for Infiniband using the FDS 6.5.0 default installation directories and is working just fine (as it was with FDS 6.5.0 and FDS 6.5.2).
t.
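The mismatch described above can be confirmed directly by testing both prefixes for the help file named in the error message. This is a minimal sketch; `check_ompi_help` is a hypothetical helper, and the two prefixes are the ones quoted in this thread.

```shell
# Hypothetical helper: report whether an OpenMPI prefix contains the help
# file named in the error message above.
check_ompi_help() {
  helpdir="$1/share/openmpi"
  if [ -f "$helpdir/help-mpi-btl-openib.txt" ]; then
    echo "found   $helpdir"
  else
    echo "missing $helpdir"
  fi
}

check_ompi_help /shared/openmpi_64      # prefix named in the error message
check_ompi_help /shared/openmpi_64ib    # prefix chosen during the install
```

If the files exist only under a prefix other than the one in the error message, Open MPI's documented `OPAL_PREFIX` environment variable, intended for installations moved away from their build-time prefix, may be worth a look. Whether it applies here is not established in this thread.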
|
Can you try the following:
1. Reinstall FDS 6.5.3 and choose:
FDS install options
Press 1 to install in /home/ob1/FDS/FDS6 [default]
1
OpenMPI options
Press 1 to install OpenMPI manually [default]
1
Continue with yes, yes, since you're overwriting the previous installation.
2. cd to FDS/FDS6/bin and then type:
gunzip openmpi_1.8.4_linux_64.tar.gz
tar -xvf openmpi_1.8.4_linux_64.tar
3. Open your .bashrc and make sure these two lines are found:
export MPIDIST_FDS=/home/ob1/FDS/FDS6/bin/openmpi_64
source ~/.bashrc_fds $MPIDIST_FDS
4. Now, test if mpirun works or not with any simple example case.
|
Thank you for your reply Salah.
I can do this but I doubt that your precompiled version of OpenMPI will utilize my Mellanox Infiniband network.
It took quite a lot of effort to get OpenMPI 1.8.4 working over Infiniband on my cluster; however, the current OpenMPI install runs over Infiniband with FDS 6.5.2 and FDS 6.5.0 using the FDS documented default installation directories.
You can read about the install procedure in the attached document.
Would you please advise if the directory structure for OpenMPI has changed from the FDS 6.5.2 to the 6.5.3 build?
With kindest regards,
Tim
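Whether a given OpenMPI build can use Infiniband at all can be checked by looking for the `openib` BTL component in the output of `ompi_info`. The sketch below assumes the component is listed on a line of the form `MCA btl: openib (...)`, which is how the 1.8 series prints it as far as I know; `has_openib` is a hypothetical helper name.

```shell
# Hypothetical helper: read `ompi_info` output on stdin and report whether
# the openib BTL (Infiniband transport) component is listed.
has_openib() {
  if grep -q 'MCA btl: openib'; then
    echo "openib BTL present"
  else
    echo "openib BTL not listed"
  fi
}

# Typical use, guarded so it is skipped when ompi_info is not on PATH:
if command -v ompi_info >/dev/null 2>&1; then
  ompi_info | has_openib
fi
```

Running the `ompi_info` from each candidate installation (for example the bundled one and the one under /shared) would show which of them were built with Infiniband support.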
|
Can you run these commands:
[salah@jbk27s092]$ echo $MPIDIST_FDS
/home/salah/FDS/FDS6/bin/openmpi_64
[salah@jbk27s092]$ echo $MPIDIST_ETH
/shared/openmpi_64
[salah@jbk27s092]$ echo $MPIDIST_IB
[salah@jbk27s092]$ echo $MPIDIST
/shared/openmpi_64
|
Here are the variables from the original install as listed in my initial post:
FDS install options
Press 1 to install in /home/ob1/FDS/FDS6 [default]
OpenMPI options
Press 2 to use /shared/openmpi_64ib
$MPIDIST_FDS /home/ob1/FDS/FDS6/bin/openmpi_64
$MPIDIST_ETH ("" or null)
$MPIDIST_IB /shared/openmpi_64ib
$MPIDIST /shared/openmpi_64ib
These are all set correctly in accordance with .bashrc and .bashrc_fds.
The problem appears to be associated with the reference to /shared/openmpi_64/share/openmpi/.
The actual directory that contains these files is /shared/openmpi_64ib/share/openmpi/.
Which takes me back to my question: have the default Infiniband and OpenMPI installation directories changed between FDS 6.5.2 and 6.5.3?
With kindest regards,
Tim
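The four `echo` commands requested above can be collapsed into one loop that also makes empty values visible, so that a blank line is not mistaken for missing output. This is a small sketch; `show_mpidist` is a hypothetical helper name, while the variable names are the ones used in this thread.

```shell
# Print the four MPIDIST variables discussed above; empty or unset ones are
# shown explicitly so they are not mistaken for a blank line.
show_mpidist() {
  for name in MPIDIST_FDS MPIDIST_ETH MPIDIST_IB MPIDIST; do
    eval "value=\${$name:-}"
    printf '%s=%s\n' "$name" "${value:-<unset or empty>}"
  done
}

show_mpidist
```

Running it in a fresh login shell shows what .bashrc and .bashrc_fds actually set, rather than what a long-lived session has left over.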
|
Tim,
As far as I know it didn't change. The colleagues at NIST are using the latest version as well and it's working.
Wait for Kevin or gforney to detail on this matter.
Regards,
Salah
|
FDS doesn't care where the OpenMPI library is located, but you need to tell it the location correctly. The installer script "asks" for the OpenMPI location so it can set environment variables (PATH, etc.). So edit your .bashrc to have
source ~/.bashrc_fds path_to_open_mpi_library
|
The previously posted error message indicates that there is an issue with how FDS 6.5.3 is interpreting the variables passed to it. This is not an issue with either .bashrc or .bashrc_fds. FDS 6.5.3 is looking for error message files in a non-existent directory. I figure that this is specific to Infiniband installs.
I'll start looking at the code in detail over the next few days to try and sort out what is going on, but I reiterate that FDS 6.5.2 and 6.5.0 worked perfectly from the bundled .sh install files, and my install of openmpi 1.8.4 is rock solid using Infiniband. The only thing that has changed here is FDS 6.5.3.
With kindest regards,
Tim
|
Tim,
Could you please share the FDS environment in your .bashrc file?
FDS 6.5.3 changes your .bashrc after installing it.
Check if this library path is set for .bashrc_fds:
source ~/.bashrc_fds $MPIDIST_IB
|
Yes, that is exactly what is in .bashrc.
The file is attached.
I really do appreciate NIST's responsiveness on this. Thank you for this.
t.
|
Tim,
There is no attached file.
You mentioned the installation document earlier, but you didn't attach it.
Attach it as a .txt file or zip it. Some formats are not supported.
|
Thank you for your email Glenn.
It's nice to know that FDS shouldn't care where OpenMPI is installed (equivalent to earlier version behaviour) but this doesn't seem to hold for Infiniband.
My understanding is that the FDS install script detects OpenMPI (and OpenMPI with IB) if it is installed in the default directory locations. This worked perfectly with FDS 6.5.2 and 6.5.0. FDS 6.5.3 detects /shared/openmpi_64ib correctly and offers this as the installation parameter. That's good, but why is it looking for an error message in /shared/openmpi_64/share/openmpi? Look at these directories again please. /shared/openmpi_64ib/share/openmpi exists. /shared/openmpi_64/share/openmpi does not exist.
I'll be looking at the FDS code over the next few days to see if I can figure out what has changed. But it's way too late here in God's Own (NZ) for my tired grey cells to solve this.
With kindest regards,
t.
|
Here they are in .zip format.
t.
|
Tim |
Try this.., |
I've checked your .bashrc file; the parameters are set accordingly. This version was compiled with Intel 16, while your OpenMPI was previously built with GNU for 6.5.0.
Can you try to test this please:
Modify your .bashrc file by adding IFORT as in the line below:
source ~/.bashrc_fds $MPIDIST_IB $IFORT_COMPILER_LIB
Then update your .bashrc in your terminal:
source ~/.bashrc
Then check if fds is now working. Otherwise, I would recommend you use the latest version of OpenMPI, https://www.open-mpi.org/software/ompi/v2.1/, and if you can get the Intel 17 compiler it would be better as well: https://software.intel.com/en-us/intel-parallel-studio-xe
They provide a free version for students that you can use on 3 different machines.
And maybe @gforney <https://github.com/gforney> has other suggestions that might fix it within this setup.
Regards,
Salah
|
No joy Salah - same error message.
The variable $IFORT_COMPILER_LIB is not initialized in .bashrc so the proposed change is simply passing a null as the second parameter to .bashrc_fds.
Similarly, $INTEL_SHARED_LIB is initialized to /intel64 in .bashrc. There is no intel64 directory at the root or in ~/FDS/FDS6/bin/INTELLIB16.
Would you please confirm that I should be changing to OpenMPI version 2.1? In the past anything other than 1.8.4 (from FDS 6.3.0) caused problems, as identified in the FDS Users manual.
Unfortunately I am not a student and I use fds for commercial fire engineering design. The Intel compiler suite is around US$3,000 (plus GST) for a version that supports MPI.
With kindest regards,
t.
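The point about $IFORT_COMPILER_LIB can be illustrated with a few lines of shell. When a variable that is empty or unset is expanded without quotes, the shell drops it altogether, so the receiving script sees one argument and not an empty second one. In this sketch `count_args` is a hypothetical stand-in for a script receiving arguments, and the variable is set to empty to mimic one that .bashrc never initializes.

```shell
# Hypothetical stand-in for a script receiving arguments: report how many
# arguments arrive.
count_args() {
  echo "$#"
}

MPIDIST_IB=/shared/openmpi_64ib
IFORT_COMPILER_LIB=      # left empty to mimic a variable .bashrc never sets

# Unquoted, as in the proposed line: the empty expansion disappears entirely.
count_args $MPIDIST_IB $IFORT_COMPILER_LIB        # prints 1
# Quoted: an explicit empty second argument is passed.
count_args "$MPIDIST_IB" "$IFORT_COMPILER_LIB"    # prints 2
```

Either way the proposed change would add no usable library path, which is consistent with the unchanged error message reported above.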
|
Where exactly is your infiniband openmpi library located? What version is
it? What compiler was used to build it?
|
OpenMPI is installed at: /shared/openmpi_64/bin
The OpenMPI libraries are at: /shared/openmpi_64/lib
OpenMPI is version 1.8.4
It was compiled using GNU g++ 4.8.2 in accordance with the OpenMPI Installation Instructions.
OpenMPI works fine with both FDS 6.5.2 and FDS 6.5.0 over Infiniband using the bundled NIST binaries. It also works fine with the OpenMPI test programs connectivity_c, hello_c and ring_c.
t.
|
Based on what you said, I would have said your OpenMPI library/distribution
is located at /shared/openmpi_64. Do you have this line in your .bashrc
file?
source ~/.bashrc_fds /shared/openmpi_64
When you type fds at a command line what does it say?
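To answer the first question quickly, the active lines in a startup file that mention .bashrc_fds can be listed with line numbers, leaving out commented lines. This is only a sketch; `show_fds_source_lines` is a hypothetical helper name.

```shell
# Hypothetical helper: show the lines of a startup file that source
# .bashrc_fds, with line numbers, ignoring commented-out lines.
show_fds_source_lines() {
  grep -n 'bashrc_fds' "$1" | grep -v '^[0-9]*:[[:space:]]*#'
}

if [ -f "$HOME/.bashrc" ]; then
  show_fds_source_lines "$HOME/.bashrc" || echo "no .bashrc_fds line found"
fi
```

If more than one active line appears, the last one sourced presumably decides which OpenMPI directory ends up in the environment.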
|
Yes, that's what he has in his bashrc, #FDS ----------------------------------- |
As he said, once he tries to execute it in the terminal, it throws the warnings he shared above. This might be a library issue.
Sorry! You were supposed to get help about:
libibverbs: Warning: couldn't load driver '/usr/lib/libibverbs/libmlx5': /usr/lib/libibverbs/libmlx5-rdmav2.so: symbol ibv_cmd_destroy_flow, version IBVERBS_1.0 not defined in file libibverbs.so.1 with link time reference
libibverbs: Warning: couldn't load driver '/usr/lib/libibverbs/libmlx4': /usr/lib/libibverbs/libmlx4-rdmav2.so: symbol ibv_cmd_destroy_flow, version IBVERBS_1.0 not defined in file libibverbs.so.1 with link time reference
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
Sorry! You were supposed to get help about:
btl:no-nics
/shared/openmpi_64/share/openmpi/help-mpi-btl-base.txt: No such file or directory. Sorry!
Current Date : April 1, 2017 10:32:42
Version : FDS 6.5.3
Revision : FDS6.5.3-598-geb56ed1
Revision Date : Thu Jan 19 16:12:59 2017 -0500
Compilation Date : Jan 22, 2017 18:04:30
MPI Enabled; Number of MPI Processes: 1
OpenMP Enabled; Number of OpenMP Threads: 4
MPI version: 3.0
MPI library version: Open MPI v1.8.4, package: Open MPI gforney@burn Distribution, ident: 1.8.4, repo rev: v1.8.3-330-g0344f04, Dec 19, 2014
Consult FDS Users Guide Chapter, Running FDS, for further instructions.
Hit Enter to Escape... |
My error entirely Glenn ( I am transcribing stuff from the Linux cluster [which does almost nothing except FDS] to my regular Windows workstation via my hand-written notebooks which I keep as a record of what I have done).
OpenMPI is installed at: /shared/openmpi_64ib/bin
The OpenMPI libraries are at: /shared/openmpi_64ib/lib
Yes, these are in .bashrc and appear in $PATH and $LD_LIBRARY_PATH on interactive and non-interactive login.
The fds error test has been posted previously (and I see that Salah has posted this again).
With kindest regards,
Tim
From: Glenn Forney [mailto:notifications@github.com]
Sent: Sunday, 2 April 2017 10:51 p.m.
To: firemodels/fds
Cc: tgob; Author
Subject: Re: [firemodels/fds] FDS 6.5.3 Upgrade Issue (#4904)
Based on what you said I would have said your OpenMPI library/distribution
is located at /shared/openmpi_64. Do you have this line in your .bashrc
file?
source ~/.bashrc_fds /shared/openmpi_64
When you type fds at a command line what does it say?
On Apr 2, 2017 4:57 AM, "tgob" <notifications@github.com> wrote:
OpenMPI is installed at: /shared/openmpi_64/bin
The OpenMPI libraries are at: /shared/openmpi_64/lib
OpenMPI is version 1.8.4
It was compiled using GNU g++ 4.8.2 in accordance with the OpenMPI
Installation Instructions.
OpenMPI works fine with both FDS 6.5.2 and FDS 6.5.0 over Infiniband using
the bundled NIST binaries. It also works fine with the OpenMPI test
programs connectivity_c, hello_c and ring_c.
t.
From: Glenn Forney [mailto:notifications@github.com]
Sent: Sunday, 2 April 2017 3:29 p.m.
To: firemodels/fds
Cc: tgob; Author
Subject: Re: [firemodels/fds] FDS 6.5.3 Upgrade Issue (#4904)
Where exactly is your infiniband openmpi library located? What version is
it? What compiler was used to build it?
On Apr 1, 2017 10:12 PM, "tgob" ***@***.***> wrote:
No joy Salah - same error message.
|
Thought he said his library was at /shared/openmpi_64 not
/shared/openmpi_64ib
On Apr 2, 2017 6:58 AM, "Salah Benkorichi" <notifications@github.com> wrote:
Yes, that's what he has in his bashrc,
bashrc.txt <https://github.com/firemodels/fds/files/888314/bashrc.txt>
#FDS -----------------------------------
export MPIDIST_FDS=/home/ob1/FDS/FDS6/bin/openmpi_64
export MPIDIST_ETH=
export MPIDIST_IB=/shared/openmpi_64ib
INTEL_SHARED_LIB=$IFORT_COMPILER_LIB/intel64
source ~/.bashrc_fds $MPIDIST_IB
#FDS -----------------------------------
|
what version of openmpi do you have?
here are a couple of experiments to try
The following link contains an openmpi 1.8.4 infiniband distribution and an fds linux executable built against this openmpi library. The source for this fds is identical to the source used to build the latest "official" fds, ie githash eb56ed1 .
Experiment 1
Experiment 2
Experiment 3
with all these experiments, after any edits to .bashrc you have to log out and log back in. (in theory you can just source .bashrc - but logging out and logging back in is "safer" )
Note all previous fds' that we have distributed were built against an ethernet (not infiniband) version of Openmpi.
[20:54:34 gforney@blaze:~ ] $ ls /shared/openmpi_64ib
You need to invoke .bashrc_fds using |
Thanks for the suggestions Glenn.
OpenMPI is 1.8.4 from https://www.open-mpi.org/software/ompi/v1.8/
First up I'll try recompiling OpenMPI with Infiniband support into /shared/openmpi_64. If nothing else this should correct the fds error messages.
You have previously indicated that FDS 6.5.3 doesn't actually care where OpenMPI is located so long as its location (and that of the associated dynamically linked libraries) is correctly passed to .bashrc_fds. For whatever reason FDS is losing the 'ib' in the directory path to the error messages (it is looking for /shared/openmpi_64/share/openmpi/help-mpi-btl-openib.txt when it should be looking for /shared/openmpi_64ib/share/openmpi/help-mpi-btl-openib.txt). I figure that if fds is looking in the wrong place for the error messages then it is probably looking in the wrong place for init files, which is the cause of the error.
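One way to probe this diagnosis is sketched below; it is not something confirmed in this thread. Open MPI records its install prefix at build time and uses it to locate its help and init files, and the OPAL_PREFIX environment variable overrides that prefix for a relocated installation. The helper name mpi_libs_of is hypothetical, and /shared/openmpi_64ib is the install location assumed from the posts above. If fds is statically linked against MPI, ldd will list no MPI libraries at all.

```shell
# Hypothetical helper: list the Open MPI shared libraries a binary resolves to.
mpi_libs_of() {
    # ldd prints the shared objects the dynamic loader would pick up;
    # "|| true" keeps the function quiet when nothing matches.
    ldd "$1" 2>/dev/null | grep -i -E 'libmpi|libopen-(rte|pal)' || true
}

# Assumed install location, taken from the posts above.
MPI_ROOT=/shared/openmpi_64ib

# OPAL_PREFIX tells a relocated Open MPI where its share/ and etc/ trees live.
export OPAL_PREFIX="$MPI_ROOT"

# Only inspect fds if it is actually on PATH.
if command -v fds >/dev/null 2>&1; then
    mpi_libs_of "$(command -v fds)"
fi
```

If the help-file path in the error message changes after OPAL_PREFIX is set, the compiled-in prefix is the likely source of the /shared/openmpi_64 path.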
I shall then proceed with your suggested experiments.
This may take a few days but I'll report back shortly...
From: Glenn Forney [mailto:notifications@github.com]
Sent: Monday, 3 April 2017 1:40 p.m.
To: firemodels/fds
Cc: tgob; Author
Subject: Re: [firemodels/fds] FDS 6.5.3 Upgrade Issue (#4904)
what version of openmpi do you have?
here are a couple of experiments to try
The following link contains an openmpi 1.8.4 infiniband distribution and an fds linux executable built against this openmpi library. The source for this fds is identical to the source used to build the latest "official" fds, ie githash <eb56ed1> eb56ed1 .
https://drive.google.com/drive/folders/0B-W-dkXwdHWNSVJhWXJBMXlSMDQ?usp=sharing
Experiment 1
1. create a directory named ~/test
2. cd ~/test
3. assuming you downloaded the openmpi...tar.gz file to your home directory, type:
tar xvf ~/openmpi_1.8.4_linux_64ib.tar.gz
4. edit your .bashrc file replacing the source ~/.bashrc_fds xxx to the following
source ~/test/openmpi_64ib
5. log out and log back in then type fds
what does it say?
Experiment 2
1. do steps 1 -> 4 in experiment 1
2. assuming you downloaded fds_mpi_intel_linux_64ib from the above google drive link to the current directory, type
./fds_mpi_intel_linux_64ib
what does it say?
Experiment 3
reinstall the "official" FDS but select the openmpi library we distribute
with all these experiments, after any edits to .bashrc you have to log out and log back in. (in theory you can just source .bashrc - but logging out and logging back in is "safer" )
Note all previous fds' that we have distributed were built against an ethernet (not infiniband) version of Openmpi.
[20:54:34 gforney@blaze:~ ] $ ls /shared/openmpi_64ib
bin etc include lib share
You need to invoke .bashrc_fds using
source ~/.bashrc_fds /shared/openmpi_64ib
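As a rough sketch of the check implied above: the directory handed to .bashrc_fds should look like an OpenMPI distribution (the ls output shows bin, etc, include, lib and share). The function name check_mpi_dist is hypothetical, and the default path is the one used in this thread.

```shell
# Hypothetical helper: confirm a directory looks like an OpenMPI distribution.
check_mpi_dist() {
    dist=$1
    for sub in bin lib share; do
        if [ ! -d "$dist/$sub" ]; then
            echo "missing: $dist/$sub" >&2
            return 1
        fi
    done
    return 0
}

# Assumed location from this thread; override by exporting MPIDIST_IB first.
MPIDIST_IB=${MPIDIST_IB:-/shared/openmpi_64ib}

# Only hand the directory to .bashrc_fds when it passes the check.
if check_mpi_dist "$MPIDIST_IB" 2>/dev/null && [ -f "$HOME/.bashrc_fds" ]; then
    . "$HOME/.bashrc_fds" "$MPIDIST_IB"
fi
```

In bash, the arguments after the file name become the positional parameters of the sourced script, which is how .bashrc_fds receives the OpenMPI directory.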
|
Tim, |
Thank you for your further response Salah.
In trying to sort out the source of this error my focus is on what was working and what has changed with the installation.
My OpenMPI installation over IB has worked perfectly with FDS 6.5.2 for over 6 months clocking up 1,000s of core hours of processing on numerous fds models. It also runs OpenMPI diagnostics without error, has been subjected to multi-mesh verification, and strong and weak scaling tests. My OpenMPI installation also worked faultlessly with FDS 6.5.0.
Each of my experiments is preceded by a clean mirror install. The OS kernel is locked and other Linux software updates are highly managed. OpenMPI over Infiniband continues to perform faultlessly across all nodes in my cluster.
The only thing that has changed in recent times is fds (through the upgrade to FDS 6.5.3). The fds error that I have reported is from running the command fds at a terminal prompt on a single node. So OpenMPI should not even be instigated and Infiniband should not be exercised. In my mind this suggests that the problem is not with my OpenMPI or Infiniband installation, but with FDS 6.5.3.
From my efforts to date it appears that FDS 6.5.3 is looking in the wrong directory for OpenMPI components and error messages. This appears to be something to do with OpenMPI being installed in the /shared/openmpi_64ib directory, as the error messages indicate that fds is searching for OpenMPI components in /shared/openmpi_64. The 'ib' in the path changes the $FDSNETWORK environment variable to 'Infiniband' in .bashrc_fds, but I have yet to establish what this does in the compiled code.
Please elaborate if you can see a flaw in what I think is a logical diagnostic approach.
In a preceding post you recommended updating to OpenMPI Version 2.1 and I have asked you to confirm this as the FDS documentation still recommends Version 1.8.4. I agree that there are good reasons for updating to a more recent version of OpenMPI (primarily because OpenMPI Version 1.8.4 is now listed as 'retired' by the OpenMPI development team). However, the more changes I make to my system without discovering the cause of the issue, the more difficult solving the problem potentially becomes.
I appreciate that I seem to be the only dude having a problem with FDS 6.5.3 running with OpenMPI implemented over Infiniband on a Linux cluster. If anyone else in the community has had similar problems I would love to hear from you and so would NIST.
With kindest regards,
Tim
From: Salah Benkorichi [mailto:notifications@github.com]
Sent: Monday, 3 April 2017 5:51 p.m.
To: firemodels/fds
Cc: tgob; Author
Subject: Re: [firemodels/fds] FDS 6.5.3 Upgrade Issue (#4904)
Tim,
Try what gforney has suggested.
As for the type of error you're receiving, it might be due to something broken in your ompi. There is a miscommunication or a setup mismatch between the libraries. It's a fairly generic issue. I've seen other people receive it after they did not properly install and set up ompi with whatever they wanted to run it with.
Let us know how it went.
|
Okay, now I think I got the issue solved (hopefully it's the last trial LOL). Go to the makefile in the /build directory and scroll down to line 601.
Then change -O3 to -O2 in the first line.
Let us know if it solves it for you. |
Thank you Salah. I know that this post is getting to be long, but I experimented with compiler options some days ago. This is documented in my post of 21 April above:
I have experimented with FDS 6.5.3 compiler optimization options (O1, O2, O3). This has no effect on the intermittent behaviour of FDS 6.5.3 with multiple OpenMPI processes.
I'll try changing the compiler optimizations again with the re-cloned Git repository but, unless I have made an inadvertent error in my earlier tests, I believe I have already tried this.
t.
|
Actually, the error persists from time to time. |
Tested this example, the cpu time is working.
run it |
From the tests that I've run, PS: when you try to recompile it again, make sure to remove all the files except the make_fds.sh before you compile it. |
I have recompiled FDS 6.5.3 with the -O0 and -O1 compiler optimizations and the error still occurs, just as it did previously (21 April). Sometimes it works and sometimes it doesn't. The error message is identical to that previously reported.
Note that the intermittent nature of the fault means that sometimes FDS will fail repeatedly, and sometimes it will run to conclusion repeatedly (say over ten times in succession).
Everything is still pointing to that CPU_TIME() subroutine call.
Should I be seeking advice from the GNU gurus?
t.
|
Interesting, |
fyi, I've tested commenting the CPU_TIME for the sake of testing it. |
I have reloaded everything (including the Git clone) and re-tried the O0 compiler option without commenting out 'CPU_TIME' and the following variable assignment line 'CPU_TIME_START = ...'.
FDS is now running to completion under OpenMPI on a single node and on multiple nodes. The out files are sensible (although I haven't verified these yet).
However I am still getting IEEE_UNDERFLOW and IEEE_DENORMAL flag warnings at the end of processing, with one issued for each MPI process that is launched. While the warnings do not appear to be critical I can't seem to turn them off. If I set the FDS input file to T_END=0 then I only get the IEEE_DENORMAL flag warning.
Interestingly this may also have links to CPU_TIME(). https://gcc.gnu.org/onlinedocs/gcc-4.6.4/gfortran/Debugging-Options.html advises:
…-ffpe-trap=list
Specify a list of IEEE exceptions when a Floating Point Exception (FPE) should be raised. On most systems, this will result in a SIGFPE signal being sent and the program being interrupted, producing a core file useful for debugging. list is a (possibly empty) comma-separated list of the following IEEE exceptions: ‘invalid’ (invalid floating point operation, such as SQRT(-1.0)), ‘zero’ (division by zero), ‘overflow’ (overflow in a floating point operation), ‘underflow’ (underflow in a floating point operation), ‘precision’ (loss of precision during operation) and ‘denormal’ (operation produced a denormal value).
Some of the routines in the Fortran runtime library, like ‘CPU_TIME’, are likely to trigger floating point exceptions when ffpe-trap=precision is used. For this reason, the use of ffpe-trap=precision is not recommended.
At the moment I assume that the cause of the warnings is the FDS6.5.3. Although I have tried to change the compiler warning options in ~/fds/Build/make to incorporate -w (disable warnings) this doesn't seem to do anything. The warnings persist.
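A note on why -w has no effect here, with a hedged sketch: -w suppresses compile-time diagnostics, whereas the IEEE_UNDERFLOW and IEEE_DENORMAL notes are printed by the gfortran runtime when the program stops. To my understanding, gfortran 4.9 and later print this summary and accept -ffpe-summary=none to silence it, and gfortran 4.8 does neither; the thread does not confirm this as the cause. The helper fpe_summary_flag is hypothetical.

```shell
# Hypothetical helper: print -ffpe-summary=none only for gfortran versions
# assumed to support it (4.9 and later); print nothing otherwise.
fpe_summary_flag() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # second version component
    if [ "$major" -gt 4 ] || { [ "$major" -eq 4 ] && [ "$minor" -ge 9 ]; }; then
        echo "-ffpe-summary=none"
    fi
}

# Build a flag string for whichever gfortran is installed, if any.
if command -v gfortran >/dev/null 2>&1; then
    FFLAGS="-O2 $(fpe_summary_flag "$(gfortran -dumpversion)")"
    echo "FFLAGS=$FFLAGS"
fi
```

Silencing the summary hides the notes but does not change the arithmetic, so the underflow and denormal conditions would still occur.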
With kindest regards, t.
|
Yes, there is still that warning after the completion of each process. However, notice that if you run it without mpirun, you won't receive it. I will play around with it a bit. |
I agree that the IEEE errors are being reported by OpenMPI but they appear to be generated from FDS.
I'll be doing some more experimenting over the next few days on this.
t.
|
I've now tested under CentOS 7 a couple of times; it's working with -O3, with no memory or floating point errors.
It is under Ubuntu that it throws that warning at the end, even though it's up to date.
|
Your post raises two possibilities as you refer to both the compiler AND the OS.
Changing my OS to CentOS is on the cards but this is generally regarded as not as well supported or used as Ubuntu.
Note that while you may think you have the latest gnu compiler (through apt-get) the current gnu compiler is 6.3.0! My installed version is 5.4.0 (previously reported in this post under ompi-info output). My next set of experiments are going to be looking at this very issue. t.
|
Yes you are right about the version. |
Alright, here is an update on what I've done
Then Create a link to gfortran
Now open .bashrc file
save it and close it, then test the path of mpirun
If it matches the path you set, then you're ready for compilation. Now test it :
Now it works with no errors, whether memory or floating point. (I've tested once with this file, and then had to reduce the number of cells and the time to 5 sec since my laptop has only 4 cores, just to gain some time during testing.) Try to follow these steps, and let us know if it works now for you. Cheers, |
Thank you Salah.
I had all manner of difficulties installing gcc 6.3.0 to the point where it was actually compiling under the mpifort wrapper so I have put this on hold for now.
I'll try gfortran-4.8 tomorrow in accordance with your instructions and let you know how I get on.
t.
|
LOL, I had a bit of a hassle getting it installed; even the instructions from Stack Overflow didn't help. |
GCC-6.3.0 installation,
then with make You can check it, |
gfortran 4.9 , gfortran 5 and gfortran 6 all failed. |
Dear Salah, my reply to your post of Tuesday, 25 Apr at 9:03 pm:
Your instructions didn't work for me. Specifically:
sudo apt-get remove gfortran OK
sudo apt-get install gfortran-4.8 OK
But we have the first issue. This installs the executable gfortran-4.8 in /usr/bin. So the command gfortran can't find the compiler any more!
As a consequence the OpenMPI 2.1.0 build fails with errors.
Renaming or aliasing gfortran-4.8 only leads to further problems. I note that the gfortran-5 executable also still exists in /usr/bin.
There seems to be a problem with your:
./configure CC=icc CC=gcc --enable-mpi-fortran --enable-static --disable-shared -disable-dlopen --prefix=/home/salah/openmpi3
Why are you specifying CC twice? Surely only one instance holds? I understand that the way to specify a non-standard Fortran compiler name is to incorporate it in the configure command as FC=gfortran-4.8. However we still fail the build. Something is fundamentally wrong with one or both of gcc and gfortran 4.8.
OpenMPI installation instructions suggests that there is no need for playing with configure, except perhaps to explicitly specify transports that are not in Standard paths and perhaps the prefix.
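To make the FC= point concrete, here is a minimal sketch that checks whether the versioned compiler is on PATH and prints the configure line it would use, without running it. CC, FC and --prefix are standard configure settings; the helper name ompi_configure_cmd and the two values below are assumptions for illustration only.

```shell
# Assumed values for illustration; adjust to the local system.
FC_BIN=gfortran-4.8
PREFIX=/shared/openmpi_64ib

# Hypothetical helper: print (not run) a configure line with one CC and one FC.
ompi_configure_cmd() {
    # $1 = Fortran compiler name, $2 = install prefix
    echo "./configure CC=gcc FC=$1 --prefix=$2"
}

# Report whether the versioned compiler can be found before configuring.
if command -v "$FC_BIN" >/dev/null 2>&1; then
    echo "found $FC_BIN, version $("$FC_BIN" -dumpversion)"
else
    echo "$FC_BIN is not on PATH; configure would not find a Fortran compiler"
fi

ompi_configure_cmd "$FC_BIN" "$PREFIX"
```

Naming the compiler through FC avoids depending on a plain gfortran command existing after the unversioned package has been removed.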
Your comments would be appreciated.
t.
|
Yes Tim, you would receive an error, because first you need to create a symbolic link |
I have a stable FDS 6.5.3 install running this evening Salah.
The cluster runs both OpenMPI 2.1.0 (over 40 GBs Infiniband) and OpenMP on single nodes and across multiple nodes now. There are no intermittent SIGSEGV errors (over hundreds of runs). There are no IEEE_UNDERFLOW or IEEE_DENORMAL warnings.
For completeness the OS is Linux Ubuntu 16.04 LTS with Kernel 4.4.0-35-generic.
The Mellanox OFED is compiled under GNU GCC 5.4.0. Both OpenMPI and FDS are compiled under GNU GCC 5.4.0 with gfortran reverted to 4.8.5.
My OpenMPI configuration is necessarily slightly different from your recommendation to force Infiniband verbs and to remove Slurm support. I will fully document this shortly but it is essentially clean as recommended by www.open-mpi.org.
Although I have yet to re-run the verification suite the output of my test model is consistent with earlier installations.
Interestingly there has been no overall speed increase with the new installation and I have yet to apply any tuning such as the fast maths routines that are available in the GCC suite. This will wait for now.
First things first. I am completing a mirror backup of each cluster node so I can revert the cluster to the current state at any time in the future (I have a set of these that covers every major software upgrade).
There are a number of FDS documentation revisions arising from this issue that might be useful for other users. I'll be looking at this and will forward my suggestions. I have also found a few formatting inconsistencies and comment typos in the source code but I figure that these are really low on your priority list.
I'd like to keep this issue open until I have completed verification, but note that I don't anticipate any problems from this.
My sincere thanks to you, Glenn and Kevin for your advice and persistence with attending to this issue. It would have been so easy for you to simply say 'This is not an FDS issue. Try something else on the other side of the demarcation.' I do hope that other folk benefit from this issue, and I owe you all several beers.
t.
|
Glad we could be of help. Regards |
In addition to beers, is there some way FireNZE can contribute to NIST's FDS program funding? I do appreciate that I have used a heap of your time, that NIST has provided my small fire engineering consultancy with a particularly valuable resource in FDS, and that your support is unequalled in the software industry (try asking Intel a question about their compiler suite. You'll be waiting to 'infinity and beyond' for a response - even if you're prepared to pay for their product)! t.
|
The way you (and others) can help is by doing what you have already done -
actively participating, adding to everyone's knowledge base. The FDS and
Smokeview software ends up being better for it. Thanks for all your help.
glenn
|
The new installation has been verified (FDS Technical Reference Manual, Volume 2, Appendix B) today without issue. It also verifies with FDS 6.5.3 compiled using -O3 -ffast-math compiler options. These, combined with OpenMPI 2.1.0, provide a typical decrease of 10% in overall processing time over the previous installation (subject of course to the model and process allocations). |
FYI. We got away from using -O3 because we showed it gave in some cases significantly different numerical results from -O2. Ultimately, we lost trust in it. But my memory is fuzzy on this. We are going back a few years to when Kris Overholt was here. |
Thanks Randy. -O3 and -O3 -ffast-math under gfortran 4.8 are performing very well with FDS 6.5.3 against the verification suite criteria (the margins are actually better than some of my earlier installations). I've also run some FireNZE in-house models with several million cells over 16 meshes across IB for extended periods (~ 6 hours). These tests are on a contrived compartment model with outputs typical of those used in a fire report. The outputs are essentially identical to previous FDS installations.
I have almost finished writing up the Verification report for this upgrade. Please do let me know if you'd like this, or the verification spreadsheets (MS Excel).
Next to validation and verification, processing speed is important in a commercial setting (noting that model appropriateness is another issue). Oftentimes I'll have scores of fire model scenarios to run and analyse (including sensitivity studies) in developing a fire design solution, perhaps in as little as four weeks. While I do put some effort into optimizing models, occasionally run times can exceed a week.
Any significant changes to my FDS cluster are tested. I have previously posted a discussion document on Verification which you may not have seen. You can read it at: http://fire.aquacoustics.biz/html/publications.html Click on the 'Verification of Fire Dynamics Simulator' download link. |
The problem with -O3 is that it has in the past generated seg faults and other problems that -O2 did not. It is faster, in general, but we have to make sure our released executable runs well on all platforms. I think that -O3 is a good option for those compiling themselves, where they can tune the compilation to their exact hardware. |
I have FDS 6.5.2 with OpenMPI 1.8.4 running stably on a Linux Ubuntu 16.04 LTS cluster with Mellanox Infiniband. The installation was completed using the NIST pre-compiled FDS binaries. I recently attempted to upgrade to FDS 6.5.3 using the NIST pre-compiled binaries but there is a problem with the new installation:
Installing 64 bit Linux FDS 6.5.3 and Smokeview 6.4.4
Options:
FDS install options
Press 1 to install in /home/ob1/FDS/FDS6 [default]
Press 2 to install in /opt/FDS/FDS6
Press 3 to install in /usr/local/bin/FDS/FDS6
Enter a directory path to install elsewhere
OpenMPI options
Press 1 to install OpenMPI manually [default]
See /home/ob1/FDS/FDS6/bin/README.html for details
Press 2 to use /shared/openmpi_64ib
Installation directory: /home/ob1/FDS/FDS6
OpenMPI directory: /shared/openmpi_64ib
Installation beginning
The directory, /home/ob1/FDS/FDS6, already exists.
The installation directory, /home/ob1/FDS/FDS6, has been created.
Creating directory /home/ob1/FDS/FDS6/Uninstall
The installation directory, /home/ob1/FDS/FDS6/Uninstall, has been created.
Copying FDS installation files to /home/ob1/FDS/FDS6
Copy complete.
Backing up /home/ob1/.bashrc_fds to /home/ob1/.bashrc_fds_20170401_102309
Updating .bashrc_fds
Backing up /home/ob1/.bashrc to /home/ob1/.bashrc_20170401_102309
Updating .bashrc
*** Log out and log back in so changes will take effect.
Installation complete.
No issues were reported during the install but when I execute fds from a terminal command prompt on the Master node I get the following output:
Sorry! You were supposed to get help about:
ini file:file not found
But I couldn't open the help file:
/shared/openmpi_64/share/openmpi/help-mpi-btl-openib.txt: No such file or directory. Sorry!
libibverbs: Warning: couldn't load driver '/usr/lib/libibverbs/libmlx5': /usr/lib/libibverbs/libmlx5-rdmav2.so: symbol ibv_cmd_destroy_flow, version IBVERBS_1.0 not defined in file libibverbs.so.1 with link time reference
libibverbs: Warning: couldn't load driver '/usr/lib/libibverbs/libmlx4': /usr/lib/libibverbs/libmlx4-rdmav2.so: symbol ibv_cmd_destroy_flow, version IBVERBS_1.0 not defined in file libibverbs.so.1 with link time reference
libibverbs: Warning: no userspace device-specific driver found for /sys/class/infiniband_verbs/uverbs0
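Sorry! You were supposed to get help about:
btl:no-nics
But I couldn't open the help file:
/shared/openmpi_64/share/openmpi/help-mpi-btl-base.txt: No such file or directory. Sorry!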
Fire Dynamics Simulator
Current Date : April 1, 2017 10:32:42
Version : FDS 6.5.3
Revision : FDS6.5.3-598-geb56ed1
Revision Date : Thu Jan 19 16:12:59 2017 -0500
Compilation Date : Jan 22, 2017 18:04:30
MPI Enabled; Number of MPI Processes: 1
OpenMP Enabled; Number of OpenMP Threads: 4
MPI version: 3.0
MPI library version: Open MPI v1.8.4, package: Open MPI gforney@burn Distribution, ident: 1.8.4, repo rev: v1.8.3-330-g0344f04, Dec 19, 2014
Consult FDS Users Guide Chapter, Running FDS, for further instructions.
Hit Enter to Escape...
For some reason FDS appears to be trying to access a non-Infiniband openmpi installation and associated help files at:
However openmpi resides in the default Infiniband installation directory (as for FDS 6.5.2):
.bashrc and .bashrc_fds are setting the environment variables appropriately, and in particular PATH, LD_LIBRARY_PATH and FDSNETWORK as follows:
PATH:
/shared/openmpi_64ib/bin:/home/ob1/FDS/FDS6/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin
LD_LIBRARY_PATH
/shared/openmpi_64ib/lib:/home/ob1/FDS/FDS6/bin/LIB64:/home/ob1/FDS/FDS6/bin/INTELLIBS16
FDSNETWORK
Infiniband
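A quick way to confirm that these variables really contain the Infiniband OpenMPI directories, and not the similarly named /shared/openmpi_64, is an exact-entry match rather than a substring search. This is a minimal sketch; the function name path_contains is hypothetical and the paths are those quoted in this post.

```shell
# Hypothetical helper: succeed only if a colon-separated list has the exact entry.
path_contains() {
    # $1 = colon-separated list, $2 = entry to look for
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Install location as reported in this post.
MPI_ROOT=/shared/openmpi_64ib

if path_contains "${PATH:-}" "$MPI_ROOT/bin"; then
    echo "PATH includes $MPI_ROOT/bin"
else
    echo "PATH does not include $MPI_ROOT/bin"
fi

if path_contains "${LD_LIBRARY_PATH:-}" "$MPI_ROOT/lib"; then
    echo "LD_LIBRARY_PATH includes $MPI_ROOT/lib"
else
    echo "LD_LIBRARY_PATH does not include $MPI_ROOT/lib"
fi
```

Wrapping both the list and the entry in colons keeps /shared/openmpi_64/bin from matching /shared/openmpi_64ib/bin, which matters here because the two names differ only by the 'ib' suffix.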
Past upgrades (for example from 6.5.0 to 6.5.2) have been completed successfully by simply downloading the FDS and SmokeView precompiled Linux bundle and running the script (.sh) file.
Infiniband is still working, openmpi is still working (via Infiniband) and SSH is still working (password-less access) between all nodes.
A Windows 7 upgrade to 6.5.3 worked just fine (albeit without openmpi and Infiniband on my Windows workstation).
I also tried a Linux Ubuntu 16.04 LTS install without openmpi or Infiniband. This also worked fine.
Any suggestions on how I might complete the upgrade to FDS 6.5.3?