Question: Is AWS Pcluster supported by SCHISM? #88
Comments
I seem to get a similar error when I run this on our controller node without mpiexec.
[centos@ip-10-137-0-172 tune44h]$ /modeling/pschism/icm_Balg/build/bin/pschism_ICM_ANALYSIS_PREC_EVAP_TVD-VL
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
It appears pschism will work on AWS pcluster if the Intel ifort compiler and Intel MPI libraries are used. I am thinking that this code may not be compatible with gfortran. Has anyone successfully run pschism built with gfortran? To get things to compile on our end I had to modify two files. Maybe there are other gfortran compatibility issues I am missing. Files and lines modified:
Ref: ./icm_Balg/src/Hydro/schism_init.F90:5451.114
','CPOC ','tlfveg ','tstveg ','trtveg ','hcanveg','lfsav ','stsav ','rts
Error: Different CHARACTER lengths (7/6) in array constructor at (1)
Ref: ./icm_Balg/src/ICM/read_icm_input.F90:326:
Output:
Hi Teddy:
Many of us have used gcc. The error you mentioned below seems to come from an older tag (5.7.0?) and has been fixed in the newer tag v5.10.0.
-Joseph
Y. Joseph Zhang
Web: schism.wiki
Office: 804 684 7466
From: Teddy Knab ***@***.***>
Sent: Thursday, September 29, 2022 8:10 AM
To: schism-dev/schism ***@***.***>
Cc: Subscribed ***@***.***>
Subject: Re: [schism-dev/schism] Question: Is AWS Pcluster supported by SCHISM? (Issue #88)
Per an email from NOAA, who run this on AWS, it appears pschism will work on AWS pcluster if the Intel ifort compiler and Intel MPI libraries are used. I am thinking that this code may not be compatible with gfortran. Has anyone successfully run pschism built with gfortran? To get things to compile on our end I had to modify two files. Maybe there are other compatibility issues I am missing.
Files and lines modified:
1. File: ./icm_Balg/src/Hydro/schism_init.F90:5451.114
2. File: ./icm_Balg/src/ICM/icm_sed_flux.F90
Ref: ./icm_Balg/src/Hydro/schism_init.F90:5451.114
Summary: array constructor issue. gfortran requires all strings in the constructor to have the same length.
','CPOC ','tlfveg ','tstveg ','trtveg ','hcanveg','lfsav ','stsav ','rts
Error: Different CHARACTER lengths (7/6) in array constructor at (1)
make[3]: *** [Hydro/CMakeFiles/hydro.dir/schism_init.F90.o] Error 1
make[2]: *** [Hydro/CMakeFiles/hydro.dir/all] Error 2
make[1]: *** [Driver/CMakeFiles/pschism.dir/rule] Error 2
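For anyone hitting the same "Different CHARACTER lengths" error on an older tag: the message says the literals in the array constructor differ in length (7 vs 6). One way around it is to blank-pad every literal to a common width. The helper below is only a sketch of that padding step; `pad_names` is a made-up name and is not part of SCHISM. As far as I know, Fortran 2003 also allows a typed constructor such as `[character(len=7) :: 'CPOC', 'tlfveg']`, which avoids manual padding, but I have not checked how the newer SCHISM tags handle it.

```shell
#!/bin/sh
# pad_names WIDTH NAME...: print the NAMEs as a comma-separated list of
# single-quoted literals, each right-padded with blanks to WIDTH characters.
# (Hypothetical helper for illustration; not part of the SCHISM sources.)
pad_names() {
    width=$1
    shift
    out=""
    for name in "$@"; do
        # %-Ns left-justifies and pads with spaces up to N characters
        padded=$(printf "%-${width}s" "$name")
        out="${out:+$out,}'$padded'"
    done
    printf '%s\n' "$out"
}

pad_names 7 CPOC tlfveg hcanveg
# prints: 'CPOC   ','tlfveg ','hcanveg'
```

The output can be pasted into the constructor so that every element has the same length.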
Ref: ./icm_Balg/src/ICM/read_icm_input.F90:326:
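Summary: the preprocessor run by gfortran warns about the extra token after #endif on line 326, and seems to treat '/*' in the Fortran comment at icm_sed_flux.F90:1385 as the start of an unterminated C comment.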
Output:
/modeling/pschism/icm_Balg/src/ICM/read_icm_input.F90:326:0: warning: extra tokens at end of #endif directive [enabled by default]
#endif ICM_PH
^
/modeling/pschism/icm_Balg/src/ICM/icm_sed_flux.F90:1385:0: error: unterminated comment
!with all state variables in unit of g/*, no need to transfer
^
Error: Unexpected end of file in '/modeling/pschism/icm_Balg/src/ICM/icm_sed_flux.F90'
make[3]: *** [ICM/CMakeFiles/icm.dir/icm_sed_flux.F90.o] Error 1
make[3]: *** Waiting for unfinished jobs....
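Going by the output above, the preprocessor appears to read `/*` inside a Fortran `!` comment as the opening of a C comment that never closes. A quick way to find other places that could trip the same way is to search the preprocessed sources (`*.F90`) for that sequence. This is a rough sketch; `find_c_comment_openers` is a hypothetical name, and it assumes a grep that supports `-r` and `--include` (GNU grep does).

```shell
#!/bin/sh
# find_c_comment_openers DIR: print "file:line:text" for every line in
# preprocessed Fortran sources (*.F90) under DIR that contains "/*".
# (Hypothetical helper; assumes grep supports -r and --include.)
find_c_comment_openers() {
    # -F treats "/*" as a fixed string rather than a regular expression
    grep -rn --include='*.F90' -F '/*' "$1"
}

# Example: find_c_comment_openers ./icm_Balg/src
```

Rewording the offending comment so that it no longer contains `/*` should be enough to get past the preprocessor on a tag where this has not been fixed.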
Since ICM is very important for EPA/CBPO, v5.10.0 is the one to use. Zhengui has spent a lot of time cleaning up ICM over the past year.
-Joseph
Y. Joseph Zhang
Web: schism.wiki
Office: 804 684 7466
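A small sketch for checking whether an existing binary is at least v5.10.0, based on the string that the `-version` flag prints (for example "schism v5.9.0mod"). The function names are made up for illustration, and the comparison assumes a `sort` that supports `-V` (GNU coreutils does). A build that reports no numeric version yields an empty string and has to be checked by hand.

```shell
#!/bin/sh
# extract_version TEXT: print the first vN.N.N number found in TEXT,
# e.g. "schism v5.9.0mod" -> 5.9.0. Prints nothing if there is none.
extract_version() {
    printf '%s\n' "$1" |
        sed -n 's/.*v\([0-9][0-9]*\.[0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' |
        head -n 1
}

# version_at_least HAVE WANT: succeed if dotted version HAVE >= WANT.
# (Assumes sort -V, i.e. version sort, is available.)
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

have=$(extract_version "schism v5.9.0mod")
if version_at_least "$have" 5.10.0; then
    echo "version $have is recent enough"
else
    echo "version $have is older than 5.10.0"
fi
```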
Thanks. It seems I was using the wrong version.
#version pulled
schism v5.9.0mod
#version 5.10
schism develop
The proper version seems to be running.
[centos@ip-10-137-0-172 tune44h]$ squeue
ifort (IFORT) 2021.6.0 20220226
I am closing this ticket. I have recently been able to compile SCHISM with gcc-8.5.0 using OpenMPI 4, gcc-12.0.2 using OpenMPI 5, intel@2021.10.0 with intel-oneapi-mpi@2021.12.1, and intel@2021.6.0 with intel-oneapi-mpi@2021.9.0. It seems my GCC environment was not set up correctly, or I had some errors in my cmake file.
Has anyone got SCHISM working on AWS pcluster 3.2?
I created an 8-node cluster on AWS with 256 cores and 512 GB of RAM, but the run fails with segfaults.
SlurmQueues:
  - Name: compute
    ComputeResources:
      - Name: slurmworkers
        InstanceType: c4.8xlarge
        MinCount: 0
        MaxCount: 8
os: centos7
modules loaded: hdf5-1.12.2-gcc-4.8.5-omqotpp openmpi-4.1.4-gcc-4.8.5-23hmmfu netcdf-fortran-4.5.4-gcc-4.8.5-y6iccqw netcdf-c-4.8.1-gcc-4.8.5-2eml4r3
compiled: GNU Fortran (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
Binary: -> /modeling/pschism/icm_Balg/build/bin/pschism_ICM_ANALYSIS_PREC_EVAP_TVD-VL -version
schism v5.9.0mod
git hash 2e289ae (20 commits since semantic tag, edits=True)
My mpirun call:
--> I am trying to tell MPI to run 32 processes on each node. Is this syntax correct?
/opt/parallelcluster/shared/spack/opt/spack/linux-centos7-haswell/gcc-4.8.5/openmpi-4.1.4-23hmmfud3rw4njh3m5ilmukatjrgn4i2/bin/mpirun --hostfile hostnames.txt -n 32 --map-by node /modeling/pschism/icm_Balg/build/bin/pschism_ICM_ANALYSIS_PREC_EVAP_TVD-VL
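On the syntax question: as far as I understand Open MPI 4.x, `-n 32` is the total number of ranks and `--map-by node` spreads them round-robin over the nodes, so the call above would start 32 ranks in total (about 4 per node on 8 nodes) rather than 32 per node. A sketch of what a "32 per node" request could look like with `--map-by ppr:32:node` follows. `build_mpirun_cmd` is a made-up helper that only prints the command line, and it assumes the hostfile lists all 8 nodes with enough slots; if it does not, Open MPI may refuse to start unless slots are declared in the hostfile or `--oversubscribe` is given.

```shell
#!/bin/sh
# build_mpirun_cmd NODES PPN HOSTFILE BINARY: print an mpirun command line
# that requests PPN ranks on each of NODES nodes (NODES*PPN ranks in total).
# (Hypothetical helper; it prints the command and does not run it.)
build_mpirun_cmd() {
    nodes=$1 ppn=$2 hostfile=$3 binary=$4
    total=$((nodes * ppn))
    # ppr:N:node asks Open MPI to place N processes per node
    printf 'mpirun --hostfile %s -n %d --map-by ppr:%d:node %s\n' \
        "$hostfile" "$total" "$ppn" "$binary"
}

build_mpirun_cmd 8 32 hostnames.txt /modeling/pschism/icm_Balg/build/bin/pschism_ICM_ANALYSIS_PREC_EVAP_TVD-VL
```

The rank layout is a separate matter from the segfault itself. It only decides how many processes share a node.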
job out says:
7 total processes killed (some possibly by mpirun during cleanup)
did we get an error 139
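A note on the 139: by shell convention an exit status above 128 means the process was ended by a signal, and the signal number is the status minus 128. So 139 corresponds to signal 11, which matches the SIGSEGV reported in the error file. A small sketch (the function name is made up):

```shell
#!/bin/sh
# signal_from_status STATUS: if STATUS is above 128, print the name of the
# signal that ended the process (shell convention: status = 128 + signal).
signal_from_status() {
    if [ "$1" -gt 128 ]; then
        kill -l "$(( $1 - 128 ))"
    else
        echo "exit code $1 (not a signal)"
    fi
}

signal_from_status 139
# prints: SEGV
```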
The error file says invalid memory references.
[centos@ip-10-137-0-172 tune44h]$ cat job.78.err
Currently Loaded Modulefiles:
Warning: Permanently added 'compute-dy-slurmworkers-4,10.137.0.181' (ECDSA) to the list of known hosts.
Warning: Permanently added 'compute-dy-slurmworkers-7,10.137.0.179' (ECDSA) to the list of known hosts.
Warning: Permanently added 'compute-dy-slurmworkers-3,10.137.0.137' (ECDSA) to the list of known hosts.
Warning: Permanently added 'compute-dy-slurmworkers-6,10.137.0.161' (ECDSA) to the list of known hosts.
Warning: Permanently added 'compute-dy-slurmworkers-5,10.137.0.133' (ECDSA) to the list of known hosts.
Warning: Permanently added 'compute-dy-slurmworkers-2,10.137.0.143' (ECDSA) to the list of known hosts.
Warning: Permanently added 'compute-dy-slurmworkers-8,10.137.0.182' (ECDSA) to the list of known hosts.
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
Program received signal SIGSEGV: Segmentation fault - invalid memory reference.
Backtrace for this error:
#0 0x7F27901B96D7
#1 0x7F27901B9D1E
#2 0x7F278F4983FF
#3 0x53C5D0 in calkwq_
#4 0x554151 in ecosystem_
#5 0x45F1EC in schism_step_
#6 0x404C4B in schism_main_
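The backtrace names the routines (calkwq_ called from ecosystem_) but gives no source lines. If the binary is rebuilt with debug information (gfortran's `-g`, optionally with `-fbacktrace` and `-fcheck=bounds`), the frame addresses can be mapped to file and line with binutils' `addr2line`. The sketch below pulls the addresses of the named frames out of a backtrace and hands them to `addr2line`. The function names are hypothetical, and how the extra compiler flags get passed through the SCHISM cmake setup is not shown here.

```shell
#!/bin/sh
# frame_addresses: read a gfortran backtrace on stdin and print the address
# of every frame that names a symbol ("#3 0x53C5D0 in calkwq_" -> 0x53C5D0).
frame_addresses() {
    awk '$1 ~ /^#[0-9]+$/ && $3 == "in" { print $2 }'
}

# resolve_frames BINARY: map those addresses to function and file:line.
# Needs addr2line (binutils) and a binary that was built with -g.
resolve_frames() {
    frame_addresses | xargs addr2line -f -e "$1"
}

# Example: resolve_frames /path/to/pschism_binary < job.78.err
```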