Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prefix modules with something like 'CTSM' to avoid name collisions #869

Open
billsacks opened this issue Dec 12, 2019 · 16 comments
Open

Prefix modules with something like 'CTSM' to avoid name collisions #869

billsacks opened this issue Dec 12, 2019 · 16 comments
Labels
enhancement new capability or improved behavior of existing capability

Comments

@billsacks
Copy link
Member

In trying to include CTSM in WRF, we have run into name collisions. In this case, it is because WRF includes some old CLM code, and we plan to handle this by renaming the code in WRF. But this made me realize that this could be a potential issue in general. For example, we have a number of generically-named modules like 'fileutils'. If another model being linked with CTSM happened to have a module with the same name as one of CTSM's, with a subroutine inside that also had the same name, then there would be problems. Depending on the details, this either causes an error at link time (due to duplicate symbols) or causes the executable to silently be created incorrectly - where just one of the multiply-defined subroutines is used in all cases.

I propose that we rename all of our modules to have a common prefix like "CTSM". I realize that this will be invasive, but I think it's important to avoid issues down the line as other models begin to link to CTSM. It is also potentially important just within CESM: There could be subtle bugs arising from name collisions between CTSM and other components.

If others agree, we'll need to decide on:

  • What prefix to use: CTSM, Ctsm, CTSM_, Ctsm_, etc. I'd probably vote for Ctsm, because it is easier to read a module name like CtsmMyModule than CTSMMyModule or especially something like CTSMCNC14DecayMod.F90 - the long sequence of uppercase characters at the beginning of the latter is hard to parse. For what it's worth, I see that FATES prefixes its files with Fates.

  • Do we want to rename all the files, or just the modules contained within them? The benefit of renaming the files is that it keeps file names consistent with module names, which could be helpful. The downsides are that it leads to longer and harder to read file names, and in the short-term it could cause more merge issues. (@ekluzek thought that our dependency-generation tool requires file names to match module names, but I checked and that is no longer the case.)

If we do rename files at this time, this would also be a good time to broadly introduce the new file naming convention laid out in #835 .

@billsacks billsacks added the enhancement new capability or improved behavior of existing capability label Dec 12, 2019
@billsacks
Copy link
Member Author

I was able to demonstrate the possible issue with multiple definitions, even within the CESM build, with the following diffs:

Diffs in CTSM:

diff --git a/src/main/bill_mod.F90 b/src/main/bill_mod.F90
new file mode 100644
index 00000000..0b582471
--- /dev/null
+++ b/src/main/bill_mod.F90
@@ -0,0 +1,14 @@
+module bill_mod
+  implicit none
+  private
+
+  public :: bill_sub
+
+contains
+
+  subroutine bill_sub(x)
+    character(len=*), intent(in) :: x
+
+    print *, 'bill_sub in land: ', x
+  end subroutine bill_sub
+end module bill_mod
diff --git a/src/main/clm_driver.F90 b/src/main/clm_driver.F90
index f931d3ee..a981ebee 100644
--- a/src/main/clm_driver.F90
+++ b/src/main/clm_driver.F90
@@ -81,6 +81,7 @@ module clm_driver
   use clm_instMod
   use clm_instMod            , only : soil_water_retention_curve
   use EDBGCDynMod            , only : EDBGCDyn, EDBGCDynSummary
+  use bill_mod               , only : bill_sub
   !
   ! !PUBLIC TYPES:
   implicit none
@@ -199,6 +200,8 @@ subroutine clm_drv(doalb, nextsw_cday, declinp1, declin, rstwr, nlend, rdate, ro
     ! are passed to CLM in initialization, then this code block can be removed.
     ! ========================================================================

+    call bill_sub('hello')
+
     need_glacier_initialization = is_first_step()

     if (need_glacier_initialization) then

Diffs in datm:

diff --git a/src/components/data_comps/datm/mct/bill_mod.F90 b/src/components/data_comps/datm/mct/bill_mod.F90
new file mode 100644
index 000000000..b067b8845
--- /dev/null
+++ b/src/components/data_comps/datm/mct/bill_mod.F90
@@ -0,0 +1,14 @@
+module bill_mod
+  implicit none
+  private
+
+  public :: bill_sub
+
+contains
+
+  subroutine bill_sub(x)
+    integer, intent(in) :: x
+
+    print *, 'bill_sub in atmosphere: ', x
+  end subroutine bill_sub
+end module bill_mod
diff --git a/src/components/data_comps/datm/mct/datm_comp_mod.F90 b/src/components/data_comps/datm/mct/datm_comp_mod.F90
index def030c67..f39650dc0 100644
--- a/src/components/data_comps/datm/mct/datm_comp_mod.F90
+++ b/src/components/data_comps/datm/mct/datm_comp_mod.F90
@@ -31,6 +31,7 @@ module datm_comp_mod
   use datm_shr_mod   , only: factorfn       ! namelist input
   use datm_shr_mod   , only: iradsw         ! namelist input
   use datm_shr_mod   , only: nullstr
+  use bill_mod       , only : bill_sub

   ! !PUBLIC TYPES:

@@ -271,6 +272,8 @@ subroutine datm_comp_init(Eclock, x2a, a2x, &
     character(*), parameter :: subName = "(datm_comp_init) "
     !-------------------------------------------------------------------------------

+    call bill_sub(42)
+
     call t_startf('DATM_INIT')

     if (phase == 1) then

Then, on my mac (using gfortran) I built and ran SMS_D_Ld1_P4x1.f10_f10_musgs.I2000Clm50BgcCropQianRsGs.bishorn_gnu.clm-default. The result was that both the atmosphere and land used the atmosphere's version of the subroutine, and the land (which was passing a string to a subroutine expecting an integer) printed garbage like this:

[0]  bill_sub in atmosphere:   1819043176

@billsacks
Copy link
Member Author

I played around with a few possible prefixes, and am coming to prefer a prefix of ctsm_ – i.e., lowercase with an underscore. This clearly separates the package name (ctsm) from the "true" module name, and the lowercase ctsm together with _ makes the actual name of the module stand out more.

Compare

ctsm_ColumnType.F90
ctsm_GetGlobalValues.F90
ctsm_Driver.F90
ctsm_AbortUtils.F90

with

CtsmColumnType.F90
CtsmGetGlobalValues.F90
CtsmDriver.F90
CtsmAbortUtils.F90

At least for me, it's easy for my eye to skip the "ctsm_" in the first example, focusing in on the important part of the module name, whereas it is significantly harder for my eye to skip the "Ctsm" in the second example.

(Another possibility would be to use a suffix rather than prefix of _ctsm. That would help even more with focusing in on what's important, though it somehow feels wrong to me to use a suffix for the package name.)

@billsacks
Copy link
Member Author

Feeling from today's CLM software meeting: People are okay with this. @wwieder did float the idea of just renaming things in utils (and maybe main), but is also okay with a full rename. We'll probably go ahead with the full rename, using ctsm_ prefix.

@ekluzek
Copy link
Collaborator

ekluzek commented Dec 16, 2019

@rgknox @glemieux I wanted to make sure you were aware about this change that will be coming in where we'll add ctsm_ as a prefix to all the filenames in CTSM (or changes instances of clm_ to ctsm_). FATES already has a prefix for everything in it, so we don't need FATES to do anything different than they are now.

The one question I have for Ryan and Greg though are the current CLM FATES interface files currently start with a clmfates_ prefix. So with the new names should that then be changed to a

ctsmfates_ prefix? Or something else "ctsm_fates_" maybe?

@rgknox
Copy link
Collaborator

rgknox commented Dec 16, 2019

I don't have strong feelings about either proposed name. I guess I would describe it as a ctsm module formost, and one that communicates with fates. How about:

ctsm_FatesInterfaceMod.F90 ?

On the FATES side of the code, we also have FatesInterfaceMod.F90, but the module name inside the file will also have the "ctsm_" right?

@ekluzek
Copy link
Collaborator

ekluzek commented Dec 16, 2019

Yes, the module name will match the filename. That used to be a requirement of the build system. @billsacks found that it isn't anymore, but practically it's still more readable if the two match.

I think I like

ctsm_Fates....F90

We had a separate proposal to remove all of the "Mod" suffixes from filenames, as well as some of the "Type" suffixes. It's possible that name change would happen at this same time. So if it does it would be:

ctsm_FatesInterface.F90 and ctsm_FatesParameterInterface.F90

Note, that's also moving the filenames to camelcase (or pascal case or whatever you call it for mixed case with an upper case letter to start a new word).

@rgknox
Copy link
Collaborator

rgknox commented Dec 17, 2019

camel case and dropping Mod namespace sounds good to me @ekluzek

@billsacks
Copy link
Member Author

@mvertens points out that all files / modules in src/cpl/ should be left as is: there are some assumptions about how some of these are named, and we should leave all of them as is for consistency.

@negin513
Copy link
Contributor

negin513 commented Jan 15, 2020

I started looking at the file names that need to be changed. There are a few things/exceptions not laid out in the file naming convention in #835 or #875 . I'd like to point them out here in case anyone has any comments.

  1. There are files currently with the prefix clm_ (e.g. clm_instMod.F90 ) or clm (e.g. clmfates_paraminterfaceMod_stub.F90). In both cases clm_ or clm should be replaced to ctsm_ instead of simply adding a ctsm_ prefix to all file names.

  2. Regarding the acronyms used in the file names, do we want to have them camel case or all uppercase or lowercase?

  • For example: io in file names, do we want it all uppercase or lower case or camel case? Should ncdio_utils.F90 be changed to ctsm_NcdIoUtils or ctsm_NcdioUtils or ctsm_NcdIOUtils?

  • More examples: lnd, atm ,glc or pft. How would you like these to be named? e.g. Should pftconMod.F90 be named ctsm_PftCon.F90 or ctsm_pftCon.F90 or ctsm_PFTCon.F90? Shouldatm2lndType.F90 be ctsm_Atm2LndType.F90 or ctsm_atm2lndType.F90?

  • FUN: should it be kept as is or changed to Fun. CNFUN.F90 : do we want to name is ctsm_CNFUN.F90 or ctms_CNFun.F90?

  1. In lilac cap, should we want to replace lilac_ with ctsm_ or should we write it as ctsm_ lilac_. I'd prefer to keep the lilac in the names of the files. For example: lilac_cpl.F90: should this be changed to ctsm_LilacCpl.F90 or ctsm_cpl.F90 or ctsm_lilac_cpl.F90 or any other variation of this name?

  2. Some files have CLMin their name (not the prefix). For example NutrientCompetitionCLM45default.F90. Do we want to add ctsm_ to this or keep it as is?

  3. Chemical species: do we want to have a consistent naming convention for them? For example, currently, we have ch4.F90 that can be written as ctsm_CH4.F90 or ctsm_Ch4.F90 or ctsm_Methane.F90. Similarly ozone do we want to change it to Ozone, O3, or o3?

@negin513
Copy link
Contributor

There are also some more inconsistencies and suggestions to improve them. I would like to hear your comments on them:

  1. Biogeochemistry in filenames is mostly spelled out as Biogeochem. However, sometimes it is denoted as BGC. I think using Biogeochem in filenames (instead of BGC) makes it more readable and less crowded with uppercase letters.
  • For example: Currently we have EDBGCDyn.F90. Do we want to rename it to ctsm_EDBGCDyn.F90 or ctsm_EDBiogeochemDyn.F90?
  1. What do you think about removingBGC from the file names that already include Biogeochem?
    For example:
    SoilBiogeochemDecompCascadeBGCMod.F90 --> ctsm_SoilBiogeochemDecompCascade.F90

  2. Currently, we have some files that include DV. Should we replace this with DynVeg? Currently, in most file names both dynamics and vegetations are abbreviated as dyn and veg and it makes more readable.

  • For example:
    CNDVType.F90 --> ctsm_CNDynVegType.F90

or

  • CNDVLight.F90 --> ctsm_CNDynVegLight.F90
  1. Urban in filenames is mostly spelled out as urban in all filenames. Except for UrbBuildTempOleson2015Mod.F90 . Should we change it for consistency as following?
  • UrbBuildTempOleson2015Mod.F90 --> ctsm_UrbanBuildTempOleson2015.F90
  1. Flux is spelled out in most of the filenames while in others it is denoted by F.
    For example: Ch4FInundatedStreamType.F90. Should we change this to Flux?
  • Ch4FInundatedStreamType.F90 --> ctsm_MethaneFluxInundatedStreamType.F90

@billsacks
Copy link
Member Author

(1) & (2): The only examples I see are the two you list here. Your idea of renaming EDBGCDyn to EDBiogeochemDyn seems good. However, SoilBiogeochemDecompCascadeBGC should be left with BGC to contrast it with the CN version. The older soil bgc scheme is referred to as CN and the newer one is referred to as BGC, and this difference is reflected in these file names.

(3) I'd say leave CNDV as is, because this (deprecated) sub-component is commonly referenced in this way.

(4) I agree about changing this to Urban

(5) In this example, F stands for fraction, not flux. I'd leave it as is, because it is consistent with variable names throughout the code.

As you can tell from my responses, a lot of this is case by case. So please post any other specific cases you have.

@dlawrenncar
Copy link
Contributor

dlawrenncar commented Jan 15, 2020 via email

@negin513
Copy link
Contributor

@dlawrenncar : Thanks for bringing this up! Based on the discussion I had with @billsacks ED stands for FATES and I am going to change the file names accordingly:

EDBGCDyn.F90 --> ctsm_FatesBiogeochemDyn.F90
dynEDMod.F90 --> ctsm_DynFates.F90

@negin513
Copy link
Contributor

Few more suggestions for better readability:

  1. Changine surfrd in filenames to SurfRead :
  • surfrdMod.F90 --> ctsm_SurfRead.F90
  • surfrdUtilsMod.F90 --> ctsm_SurfReadUtils.F90
  1. I'd suggest the following changes for making these files more distinguishable from each other :
  • CNMResp.F90 --> ctsm_CNMaintResp.F90
  • CNGResp.F90 --> ctsm_CNGrowthResp.F90

@billsacks
Copy link
Member Author

@negin513 as we discussed in person, but for the record: I agree with your proposals in the last comment. Thanks.

@billsacks
Copy link
Member Author

@negin513 here are a couple of other things that should be done (which I noticed in putting together negin513/ctsm_filenames#1):

(1) I noticed that init_hydrology.F90 and getdatetime.F90 don't have a module, just a bare subroutine. They should be turned into modules. I'd suggest doing this as a first step - i.e., first turn these into modules named init_hydrology and getdatetime (adding use statements as needed elsewhere in the code), then do the big rename on top of this.

(2) In addition to the F90 files, there are also directories and .pf files for unit testing, whose names sometimes mirror the names of the F90 files they are testing. It's not essential for these to be kept consistent, but it's probably best if they're mostly consistent to avoid confusion. These can all be found in the test subdirectories of various src subdirectories (e.g. src/biogeophys/test, etc.). Can you take a shot at renaming these test directories and pf files to remain consistent with the modules they are testing? I'd only do this for directories and files that currently mimic existing F90 files (there may be others that are named more abstractly, or named after subroutines rather than entire files/modules). I'd say there's no need to add the ctsm prefix here. For example, a test of module ctsm_Foo.F90 could have a directory named Foo_test and a .pf file named test_Foo.pf. (We don't need to worry about namespace collisions for these unit test files, and adding the ctsm prefix to these test files starts making them particularly long.) In addition to renaming the .pf files and directories, you'll need to update various CMakeLists.txt files throughout the source tree to give the new file and directory names.

@ekluzek ekluzek added the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Jan 9, 2023
@billsacks billsacks removed the next this should get some attention in the next week or two. Normally each Thursday SE meeting. label Mar 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement new capability or improved behavior of existing capability
Projects
None yet
Development

No branches or pull requests

5 participants