Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix standalone build #1

Closed
uturuncoglu opened this issue Feb 6, 2024 · 36 comments
Closed

Fix standalone build #1

uturuncoglu opened this issue Feb 6, 2024 · 36 comments
Assignees
Labels
bug Something isn't working

Comments

@uturuncoglu
Copy link
Collaborator

The changes that are done to build model under UFS Coastal are creating issue in standalone build outside of the UFS Coastal. @josephzhang8 test threw a lot of errors related to netcdf and util scripts. There is a need for a target that does not compile any util scripts. Need also check pschism target.

@uturuncoglu
Copy link
Collaborator Author

@platipodium @josephzhang8 I could see BLD_STANDALONE option in the changes that I did. I think if you pass -DBLD_STANDALONE=ON to your cmake command, it will probably activate it. Anyway, I am in a training now but I'll look at it later.

@uturuncoglu
Copy link
Collaborator Author

BTW, i am not seeing any changes in my PR that removes pschism target. The changes just to compile the model inside of the UFS Coastal.

@pvelissariou1
Copy link

@josephzhang8, @platipodium , @uturuncoglu I checked the latest version of SCHISM and it compiled successfuly only if I explicitly supply to cmake the option: -DBLD_STANDALONE=ON. This should be set ON by default. Maybe I missed it, why we need to have the BLD_STANDALONE flag? What is its purpose?

@uturuncoglu
Copy link
Collaborator Author

@pvelissariou1 The build system was creating issue under UFS Coastal since it was not design to compile it under another application. So, I introduce BLD_STANDALONE=ON to get rid of those issues. The default is OFF but if we don't want to provide this option every time when we are trying to install SCHSIM, then I could change the default to ON and then make required change in the UFS Coastal SCHSIM build interface.

@pvelissariou1
Copy link

@uturuncoglu Let's set BLD_STANDALONE=ON as the default and use BLD_STANDALONE=ON/OFF in UFS-Coastal depending upon the type of compilation. Still, I believe we need to address the NetCDF issues in a different way, without having this BLD_STANDALONE flag in place. What were the issues while building SCHISM in UFS-Coastal? I remember building SCHISM using previous versions of UFS-Coastal without any issues. Your thoughts?

@josephzhang8
Copy link

josephzhang8 commented Feb 9, 2024

Thx @pvelissariou1 @uturuncoglu @platipodium . Adding standalone flag worked on my side.

If it's hard to change the default, we can live with this by adding the flag inside SCHISM.local.build.
@water-e

@pvelissariou1
Copy link

@josephzhang8 My thinking is that, because most people use SCHISM in a standalone configuration it is better to have BLD_STANDALONE=ON by default. This is easily done within CMake. I haven't checked the GNU make configs yet. Do we need to implement the BLD_STANDALONE over there too?

@water-e
Copy link

water-e commented Feb 9, 2024 via email

@josephzhang8
Copy link

Thx @water-e

@josephzhang8
Copy link

I'll go ahead push the change to SCHISM repo, b/c many users are depending on cmake working. If we later on decide to set flag on UFS side we can revise. Thx

@josephzhang8
Copy link

@josephzhang8 My thinking is that, because most people use SCHISM in a standalone configuration it is better to have BLD_STANDALONE=ON by default. This is easily done within CMake. I haven't checked the GNU make configs yet. Do we need to implement the BLD_STANDALONE over there too?

Thx @pvelissariou1 for teh reminder on gcc. I just tested it on our HPC and it works also.

My fix is to add STANDALONE inside cmake/SCHISM.local.build, together with other SCHISM options like OLDIO etc, except that it should not be turned off for SCHISM alone.

@uturuncoglu
Copy link
Collaborator Author

@josephzhang8 @pvelissariou1 @water-e I think we could change the default in the SCHSIM build system. I think it will be easy to adjust the UFS Coastal.

@pvelissariou1
Copy link

@uturuncoglu I completely agree.
@josephzhang8 I submitted a PR for PaHM updates in SCHISM, could you please take a look at it? We tested it here at NOAA and it seems to be working as expected

@pvelissariou1
Copy link

@uturuncoglu , @josephzhang8 Ufuk did you have the opportunity to compile with WWM ON? This configuration was failing during compilation. I'll check on this later today from my side.

@uturuncoglu
Copy link
Collaborator Author

@pvelissariou1 No. Is this using internal WW3? We have a regression test working with external WW3 and it is working.

@pvelissariou1
Copy link

@uturuncoglu ok, I will check and report back. WWM is the internal wave model.

@josephzhang8
Copy link

@uturuncoglu I completely agree. @josephzhang8 I submitted a PR for PaHM updates in SCHISM, could you please take a look at it? We tested it here at NOAA and it seems to be working as expected

Will do.. thx @pvelissariou1

@josephzhang8
Copy link

@uturuncoglu , @josephzhang8 Ufuk did you have the opportunity to compile with WWM ON? This configuration was failing during compilation. I'll check on this later today from my side.

I'm able to compile with WWM on.

@pvelissariou1
Copy link

Thank you Joseph. Let me try it within UFS-Coastal and I'll report back.

@pvelissariou1
Copy link

@uturuncoglu , @josephzhang8 I compiled SCHISM standalone and in UFS-Coastal. The results are as follows (please read through):

In SCHISM standalone
--------------------
commit 148e16b4b732f1775819c593db3f33b3496bc2f4 (HEAD -> master, origin/master, origin/HEAD)
Author: Joseph Zhang <yjzhang@vims.edu>
Date:   Fri Feb 9 14:21:11 2024 -0500

a) COMPILATION COMMAND: cmake -S src -B build -DUSE_WWM=ON -DNO_PARMETIS=OFF -DOLDIO=ON

Produces errors like the ones shown next and the compilation fails:

/scratch2/STI/coastal_temp/save/Panagiotis.Velissariou/schism/src/WWMIII/wwm_datapl.F90(20): error #7002: Error in opening the compiled module file.  Check INCLUDE paths.   [SCHISM_MSGP]
      use schism_msgp !, only: comm,             & ! MPI communicator
----------^
/scratch2/STI/coastal_temp/save/Panagiotis.Velissariou/schism/src/WWMIII/wwm_datapl.F90(23): error #7002: Error in opening the compiled module file.  Check INCLUDE paths.   [SCHISM_GLBL]
      use schism_glbl, only  : MNE => nea_wwm,       & ! Elements of the augmented domain
----------^
/scratch2/STI/coastal_temp/save/Panagiotis.Velissariou/schism/src/WWMIII/wwm_datapl.F90(125): error #6683: A kind type parameter must be a compile-time constant.   [RKIND]
         REAL(rkind), allocatable           :: nwild_gb(:)
--------------^
.
.
.

b) COMPILATION COMMAND: cmake -S src -B build -DUSE_WWM=ON -DNO_PARMETIS=OFF -DOLDIO=ON -DBLD_STANDALONE=ON

In this case the compilation is successful

In UFS-Coastal
--------------

a) COMPILATION COMMAND: compile.sh hera "-DAPP=CSTLS -DUSE_WWM=ON -DNO_PARMETIS=OFF -DOLDIO=ON" "coastalSWWM" intel YES NO

Produces errors like the ones shown next and the compilation fails:

/scratch2/STI/coastal_temp/save/Panagiotis.Velissariou/ufs-coastal-coastal_app-02092024/SCHISM-interface/SCHISM/src/WWMIII/wwm_parall_solver.F90(2214): error #6580: Name in only-list does not exist or is not accessible.   [EXCHANGE_P4D_WWM]
      USE datapool, only : exchange_p4d_wwm, myrank
---------------------------^
/scratch2/STI/coastal_temp/save/Panagiotis.Velissariou/ufs-coastal-coastal_app-02092024/SCHISM-interface/SCHISM/src/WWMIII/wwm_parall_solver.F90(2632): error #6580: Name in only-list does not exist or is not accessible.   [EXCHANGE_P4D_WWM]
      USE datapool, only : exchange_p4d_wwm, myrank
---------------------------^
/scratch2/STI/coastal_temp/save/Panagiotis.Velissariou/ufs-coastal-coastal_app-02092024/SCHISM-interface/SCHISM/src/WWMIII/wwm_parall_solver.F90(3315): error #6580: Name in only-list does not exist or is not accessible.   [EXCHANGE_P4D_WWM]
      USE datapool, only : myrank, exchange_p4d_wwm, WRITESTATFLAG
-----------------------------------^
compilation aborted for /scratch2/STI/coastal_temp/save/Panagiotis.Velissariou/ufs-coastal-coastal_app-02092024/SCHISM-interface/SCHISM/src/WWMIII/wwm_parall_solver.F90 (code 1)

b) COMPILATION COMMAND: compile.sh hera "-DAPP=CSTLS -DUSE_WWM=ON -DNO_PARMETIS=OFF -DOLDIO=ON -DBLD_STANDALONE=ON" "coastalSWWM" intel YES NO

In this case the compilation is successful, though the ufs_model executable is not found by the script as it is located in
build*/SCHISM-interface/SCHISM/src/bin/

NOTE: The SCHISM commit in UFS-Coastal is the same as in the standalone case (I replaced SCHISM in UFS-Coastal). The UFS executable contains SCHISM, CDEPS and CMEPS.


In UFS-Coastal coupled with WW3
-------------------------------

a) COMPILATION COMMAND: compile.sh hera "-DAPP=CSTLSW -DUSE_ATMOS=ON -DUSE_WW3=ON -DNO_PARMETIS=OFF -DOLDIO=ON -DPDLIB=ON" coastalSWW3 intel YES NO

In this case the compilation is successful

b) COMPILATION COMMAND: compile.sh hera "-DAPP=CSTLSW -DUSE_ATMOS=ON -DUSE_WW3=ON -DBLD_STANDALONE=ON -DNO_PARMETIS=OFF -DOLDIO=ON -DPDLIB=ON" coastalSWW3 intel YES NO

Again in this case the compilation was successful, though the ufs_model executable is not found by the script as it is located in
build*/SCHISM-interface/SCHISM/src/bin/

NOTE: The UFS executable contains SCHISM, WW3, CDEPS and CMEPS.


My honest opinion is that the compilation approach of SCHISM (standalone or in UFS-Coastal) is not exactly what we want. We might need to improve on this.

@janahaddad
Copy link
Collaborator

@uturuncoglu we can close, correct?

@uturuncoglu
Copy link
Collaborator Author

@janahaddad Lets's keep it open for now. I need to double check.

kjnam added a commit to schism-dev/schism that referenced this issue Feb 16, 2024
This commit fixes the issue related to non BLD_STANDALONE.
@uturuncoglu
Copy link
Collaborator Author

@josephzhang8 I have just tested recent changes in SCHSIM to fix the standalone build (schism-dev@eb562eb) and it is working under ufs-coastal side too. I run the RT and it passed. I'll sync out fork and update ufs-coastal then I'll close this ticket if this is also fine for you.

@josephzhang8
Copy link

Thx Kijin, @uturuncoglu for your help! I've also tested on our system and the new changes in cmake are fine.

@uturuncoglu
Copy link
Collaborator Author

@josephzhang8 Okay. Great. I synced the SCHSIM under ufs-costal. @kjnam Thanks again for your help.

@kjnam
Copy link

kjnam commented Feb 16, 2024

@uturuncoglu @josephzhang8 Happy to be of help.

@uturuncoglu
Copy link
Collaborator Author

@josephzhang8 @kjnam I am opening this issue since coastal_ike_shinnecock_atm2sch2ww3 has started to fail. In this case, the only change is updating SCHISM with main (no change in SCHISM-ESMF side). Here is the details, I also checked the coastal_ike_shinnecock_atm2sch2ww3 case coupled with WW3 and it seems that the test is failing. In the PET logs I am seeing following warning,

20240216 230738.223 WARNING          PET08 OCN connected field eastward_wave_radiation_stress  not used with nws = 2 (needs nws = 3)
20240216 230738.223 INFO             PET08 OCN created array eastward_northward_wave_radiation_stress on all resident nodes
20240216 230738.223 INFO             PET08 OCN created field eastward_northward_wave_radiation_stress
20240216 230738.223 INFO             PET08 OCN added halo route to field eastward_northward_wave_radiation_stress
20240216 230738.223 INFO             PET08 OCN realized field eastward_northward_wave_radiation_stress
20240216 230738.223 WARNING          PET08 OCN connected field eastward_northward_wave_radiation_stress  not used with nws = 2 (needs nws = 3)
20240216 230738.223 INFO             PET08 OCN created array northward_wave_radiation_stress on all resident nodes
20240216 230738.223 INFO             PET08 OCN created field northward_wave_radiation_stress
20240216 230738.223 INFO             PET08 OCN added halo route to field northward_wave_radiation_stress
20240216 230738.223 INFO             PET08 OCN realized field northward_wave_radiation_stress
20240216 230738.223 WARNING          PET08 OCN connected field northward_wave_radiation_stress  not used with nws = 2 (needs nws = 3)

and also model crashes with following error,

20240216 230738.224 INFO             PET08 --- checking connection state of mesh_topology
20240216 230738.224 ERROR            PET08 /work/noaa/epic/role-epic/spack-stack/hercules/spack-stack-1.5.1/cache/build_stage/spack-stage-esmf-8.5.0-ffgsxo7vntsk3fr5lwbjnjrrviof6dz5/spack-src/src/Infrastructure/Base/src/ESMCI_Info.C:830 Info::get(): Key not found (JSON trace will follow): /NUOPC/Instance/Connected
20240216 230738.224 ERROR            PET08 ESMCI_Info.C:832 Info::get() Attribute not set  - [json.exception.out_of_range.403] key 'NUOPC' not found
20240216 230738.224 ERROR            PET08 ESMCI_Info.C:836 Info::get() Attribute not set  - Internal subroutine call returned Error
20240216 230738.224 ERROR            PET08 ESMC_InfoCDef.C:601 ESMC_InfoGetCH() Attribute not set  - Internal subroutine call returned Error
20240216 230738.224 ERROR            PET08 ESMF_Info.F90:986 ESMF_InfoGetCH() Attribute not set  - Internal subroutine call returned Error
20240216 230738.224 ERROR            PET08 src/addon/NUOPC/src/NUOPC_Base.F90:1047 Attribute not set  - Passing error in return code
20240216 230738.224 ERROR            PET08 src/addon/NUOPC/src/NUOPC_Base.F90:2901 Attribute not set  - Passing error in return code

I think that second error is due to the first one. I am not sure why coastal_ike_shinnecock_atm2sch2ww3 case gives warning about nws but coastal_ike_shinnecock_atm2sch is fine and there is no warning or issue in there. Do you have any idea?

Anyway, here are the details of configurations,

  • coastal_ike_shinnecock_atm2sch: DATM+SCHSIM (working fine)
Compiled with: -DAPP=CSTLS -DUSE_ATMOS=ON -DNO_PARMETIS=OFF -DOLDIO=ON
param.nml: nws = 2
  • coastal_ike_shinnecock_atm2sch2ww3: DATM+WW3+SCHSIM (crashes)
Compiled with: -DAPP=CSTLSW -DUSE_ATMOS=ON -DUSE_WW3=ON -DNO_PARMETIS=OFF -DOLDIO=ON -DPDLIB=ON
param.nml: nws = 2

Anyway, the only extra flag for SCHSIM is -DUSE_WW3=ON in the failed case and only that one complains about nws. BTW, if I try to run the coastal_ike_shinnecock_atm2sch2ww3 case with the version before sync with main, it runs without any issue. Probably something changed in the last sync but not sure what.

@uturuncoglu uturuncoglu reopened this Feb 17, 2024
@uturuncoglu
Copy link
Collaborator Author

@janahaddad I reopened this ticket since last sync is breaking the ww3 coupling. Maybe cap needs to be adjusted but not sure at this point. Still investigating. I will point the working hash (f9b42db) in ufs-coastal until we solve the issue.

@uturuncoglu
Copy link
Collaborator Author

@josephzhang8 I think there is a incompatibility issue between SCHSIM and SCHSIM-ESMF repository. Here is the line that prints out first WARNING message, https://github.com/schism-dev/schism-esmf/blob/9fc62b21afc1edcb691bef5ffa5b309375d5b07c/src/schism/schism_esmf_util.F90#L1127. It indicates that nws needs to be 3 for coupling but I think this is not true since we are using 2 for coupling. I also double checked the atm2sch and warning also exists over there. So, I don't think this is the root cause of the crash in atm2sch2ww3 but the cap still needs to be fixed since the PET logs has misleading warning message and needs to be removed. @platipodium do you agree with me?

@uturuncoglu
Copy link
Collaborator Author

@josephzhang8 @platipodium Okay. I seems that the issue was arising due to recent changes that brings wave related fields to the output. I created a new issue for it #4. @platipodium I'll also create a PR to the NUOPC cap that fixes the Attribute not set issue (this is a warning actually but it is appearing as error but does not result crash. It seems that the message is marked wrongly in ESMF side) and also remove the misleading nsw message.

@uturuncoglu
Copy link
Collaborator Author

@josephzhang8 @platipodium I created a PR in upstream that fixes the couple of issues. schism-dev/schism-esmf#29. I think I could close this issue but prefer to wait until PR is merged.

@platipodium
Copy link
Member

@josephzhang8 @platipodium I created a PR in upstream that fixes the couple of issues. schism-dev/schism-esmf#29. I think I could close this issue but prefer to wait until PR is merged.

Merged upstream. THank you for the hotfix

@josephzhang8
Copy link

@uturuncoglu
The latest master has removed IMPOSE_NET_FLUX and 'meth_sink', but I don't think this is the reason. Check outputs/fatal.error to see if there is anything there.

@josephzhang8
Copy link

@uturuncoglu The changes in param.nml are documented live inside src/Readme.beta_notes. In particular pay attention to 'remove' b/c of the way .nml works (you can omit parameters but cannot include those that are not in the list).

@uturuncoglu
Copy link
Collaborator Author

@josephzhang8 @platipodium As I know this issue is fixed. If you don't mind could you confirm. So, I could close the issue.

@josephzhang8
Copy link

I think so. Thx @uturuncoglu !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

7 participants