Error with install_wps_openmpi.sh - geogrid.exe and metgrid.exe not built. #294

Closed
dazzag24 opened this issue Jun 16, 2020 · 6 comments

@dazzag24
Contributor

mpif90 -o geogrid.exe cio.o wrf_debug.o bitarray_module.o constants_module.o module_stringutil.o geogrid.o gridinfo_module.o hash_module.o interp_module.o list_module.o llxy_module.o misc_definitions_module.o module_debug.o module_map_utils.o output_module.o parallel_module.o process_tile_module.o proc_point_module.o queue_module.o read_geogrid.o smooth_module.o source_data_module.o \
/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/frame/module_driver_constants.o \
/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/frame/pack_utils.o /home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/frame/module_machine.o \
/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/frame/module_internal_header_util.o \
-I/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/external/io_netcdf -I/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/external/io_grib_share -I/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/external/io_grib1 -I/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/external/io_int -I/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/inc -I/home/darreng/apps/spack/d/linux-centos7-broadwell/gcc-9.2.0/netcdf-fortran-4.5.2-6wg7fpk7qmq6wnoscmeui7pjsp67a2jc/include \
-L/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/external/io_grib1 -lio_grib1 -L/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/external/io_grib_share -lio_grib_share -L/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/external/io_int -lwrfio_int -L/home/darreng/apps/d/wrf-openmpi/WRF-4.1.5/external/io_netcdf -lwrfio_nf -L/home/darreng/apps/spack/d/linux-centos7-broadwell/gcc-9.2.0/netcdf-fortran-4.5.2-6wg7fpk7qmq6wnoscmeui7pjsp67a2jc/lib -lnetcdff -lnetcdf \
/opt/openmpi-4.0.3/lib
/opt/openmpi-4.0.3/lib: file not recognized: Is a directory
collect2: error: ld returned 1 exit status

Something (I am not sure whether it is spack or something else in the setup) is setting the MPI_LIB env var incorrectly. It was set to /opt/openmpi-4.0.3/lib, a bare directory path, which ends up on the link line and causes SOME of the WPS executables to fail to build. ungrib.exe was still being built, however.
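A quick way to confirm this kind of failure is to scan the arguments of the failing link command for bare directories, since the linker treats any argument that is not an option as an input file. The sketch below is illustrative only: `find_bare_dirs` is a hypothetical helper name, not part of WPS or azurehpc, and the sample arguments are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of WPS or azurehpc): print every argument
# that is an existing directory passed without a leading option such as -L.
find_bare_dirs() {
    local arg
    for arg in "$@"; do
        case "$arg" in
            -*) continue ;;          # options like -L/path or -lmpi are fine
        esac
        if [ -d "$arg" ]; then
            echo "$arg"              # a directory given as a link input
        fi
    done
}

# Example with placeholder arguments: /tmp stands in for the MPI lib directory.
find_bare_dirs -L/tmp -lnetcdff -lnetcdf /tmp
```

Run against the real link line, a helper like this would be expected to flag /opt/openmpi-4.0.3/lib, the same path that ld rejects with "file not recognized: Is a directory".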

To work around this issue I have done this:

export MPI_LIB="-L/opt/openmpi-4.0.3/lib -lmpi"
WRF_VERSION=4.1.5 ./install_wps_openmpi.sh

I'd appreciate someone's thoughts on whether my workaround is acceptable or whether there is a more correct solution.

Thanks

Darren

dazzag24 added a commit to dazzag24/azurehpc that referenced this issue Jun 16, 2020
Something is incorrectly setting MPI_LIB which results in metgrid and geogrid not compiling.
@dazzag24
Contributor Author

It turns out that the workaround above was not sufficient. I ended up having to override MPI_LIB inside the install_wps_openmpi.sh script itself. See the change in my branch:

https://github.com/dazzag24/azurehpc/blob/a3b2f6b778c63997e7529825036a3821b925cbe2/apps/wrf/install_wps_openmpi.sh#L30

@garvct
Collaborator

garvct commented Jun 16, 2020

Is the openmpi module being loaded correctly?

@dazzag24
Contributor Author

dazzag24 commented Jun 17, 2020 via email

@dazzag24
Contributor Author

[cyclecloud@ip-0A060006 ~]$ export SHARED_APP=$HOME/apps
[cyclecloud@ip-0A060006 ~]$ export SKU_TYPE=hb
[cyclecloud@ip-0A060006 ~]$ export MODULEPATH=${SHARED_APP}/modulefiles/${SKU_TYPE}:$MODULEPATH
[cyclecloud@ip-0A060006 ~]$
[cyclecloud@ip-0A060006 ~]$ export SPACK_ROOT=$HOME/apps/spack/0.14.2/spack
[cyclecloud@ip-0A060006 ~]$ source $SPACK_ROOT/share/spack/setup-env.sh
[cyclecloud@ip-0A060006 ~]$ set | grep MPI_
   
[cyclecloud@ip-0A060006 ~]$ module load mpi/openmpi-4.0.3
[cyclecloud@ip-0A060006 ~]$ set | grep MPI
MPI_BIN=/opt/openmpi-4.0.3/bin
MPI_HOME=/opt/openmpi-4.0.3
MPI_INCLUDE=/opt/openmpi-4.0.3/include
MPI_LIB=/opt/openmpi-4.0.3/lib
MPI_MAN=/opt/openmpi-4.0.3/share/man

So it is module load mpi/openmpi-4.0.3 that is setting the MPI_LIB env var.

However, I am not sure whether it is being set to an incorrect value OR whether the WPS configure setup is using it incorrectly. In any case, it ends up not building all of the required WPS executables.
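To make the distinction concrete, the value of MPI_LIB can be classified as either a bare path or a set of ready-made linker flags. This is a minimal sketch; `classify_mpi_lib` is a hypothetical helper name, and the categories are an assumption about what a Makefile that pastes the variable onto a link line can tolerate.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report what kind of value MPI_LIB holds.
classify_mpi_lib() {
    local value="$1"
    case "$value" in
        "")       echo "empty" ;;         # nothing is added to the link line
        -L*|-l*)  echo "linker-flags" ;;  # safe to paste onto a link line
        /*)       echo "bare-path" ;;     # the linker sees a directory as an input file
        *)        echo "other" ;;
    esac
}

# Classify whatever the current environment provides.
classify_mpi_lib "${MPI_LIB:-}"
```

With the value set by the module above (/opt/openmpi-4.0.3/lib), this reports a bare path, which is a reasonable value for a variable naming a directory but not for one that is pasted directly into a link command.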

Thanks

@garvct
Collaborator

garvct commented Jun 17, 2020

Thanks for finding this build error. I took a closer look at it today. Here is the problem.
The OpenMPI modulefile sets MPI_LIB to the location of its libraries. WPS uses MPI_LIB
directly in its Makefiles (without a -L option), so the bare directory lands on the link line and we see the Makefile build error. Since WPS
uses the MPI wrappers (mpif90, mpicc), it does not need MPI_LIB, because the wrappers already know the location of the OpenMPI libraries and include files. So one simple solution is to just clear MPI_LIB (i.e. export MPI_LIB=""). I have created a PR (#295) with this fix.
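A minimal sketch of that approach, assuming the module name from the session earlier in this thread: load the module, then clear MPI_LIB before the WPS build runs. The guards around `module` and `mpif90` are additions for illustration so the snippet also runs on a machine without them; the exact change in the PR may differ.

```shell
#!/usr/bin/env bash
# Load the OpenMPI module if the `module` command exists in this shell.
if command -v module >/dev/null 2>&1; then
    module load mpi/openmpi-4.0.3 || echo "module load failed" >&2
fi

# The modulefile sets MPI_LIB to a bare directory; clear it so the WPS
# Makefiles do not paste that directory onto the link line.
export MPI_LIB=""

# Optional check: Open MPI wrappers can print the link flags they add themselves.
if command -v mpif90 >/dev/null 2>&1; then
    mpif90 --showme:link
fi

# The build would then be started as before, for example:
#   WRF_VERSION=4.1.5 ./install_wps_openmpi.sh
```

The clearing has to happen after the module is loaded; otherwise the modulefile sets MPI_LIB again, which may be why exporting it before calling the script was not enough.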

@kanchanm
Contributor

Closing, as this was fixed by the PR above. Please reopen if you still hit issues.
