# XML Namelist Format in ParamGen
Alper Altuntas, NCAR\
Boulder, CO - 2021

## 1. Introduction

Here, we briefly describe a special use case of ParamGen: the XML-based namelist format. This document complements the README.ipynb file, which describes ParamGen in a broader context and in more detail. We recommend reading README.ipynb first.

## 2. XML Namelist Template

The YAML and JSON frontends of ParamGen are quite flexible in terms of data layout, or schema. In the case of XML, however, we work with a predefined schema that resembles CESM's `entry_id_namelist.xsd`. The new ParamGen schema, called `entry_id_pg.xsd`, is located in `cime/scripts/lib/CIME/ParamGen/xml_schema/`.

We first write an example XML file, named `my_template.xml`, that conforms to this new schema.

In [1]:
%%writefile my_template.xml
<?xml version="1.0"?>

<entry_id_pg version="0.1">

  <entry id="days_per_year">
    <type>real</type>
    <group>setup_nml</group>
    <desc>Days per year</desc>
    <values>
      <value>365</value>
    </values>
  </entry>


  <entry id="f_anglet">
    <type>logical</type>
    <group>icefields_nml</group>
    <desc>f_anglet</desc>
    <values>
      <value>.true.</value>
      <value cice_mode="prescribed">.false.</value>
    </values>
  </entry>

  <entry id="ice_ic">
    <type>char</type>
    <group>setup_nml</group>
    <desc>Method of ice cover initialization.</desc>
    <values>
      <value>UNSET</value>
      <value ICE_GRID="gx3v7">${DIN_LOC_ROOT}/ice/cice/b40.t31x3.20th.cice.r.2006-01-01-00000.nc</value>
      <value guard='$ICE_GRID .startswith("gx1v")'>${DIN_LOC_ROOT}/ice/cice/b.e15.B1850G.f09_g16.pi_control.25.cice.r.0041-01-01-00000.nc</value>
      <value guard='$ICE_GRID .startswith("tx0.1v")'>${DIN_LOC_ROOT}/ice/cice/g.e11.G.T62_t12.002.cice.r.0016-01-01-00000.nc</value>
      <value guard='$ICE_GRID .startswith("ar9v3")'>${DIN_LOC_ROOT}/ice/cice/cice5ic/r26RBRCICE5g0.cice.r.1990-09-01-00000.nc</value>
    </values>
  </entry>


</entry_id_pg>

Writing my_template.xml
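
To see how entries, groups, and candidate values are laid out in such a template, a shortened copy can be walked with Python's standard `xml.etree.ElementTree` module. This is only an inspection sketch under the assumption that the layout matches the file above; it does not use ParamGen and does not validate against `entry_id_pg.xsd`.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the template above (one entry only).
TEMPLATE = """<?xml version="1.0"?>
<entry_id_pg version="0.1">
  <entry id="f_anglet">
    <type>logical</type>
    <group>icefields_nml</group>
    <desc>f_anglet</desc>
    <values>
      <value>.true.</value>
      <value cice_mode="prescribed">.false.</value>
    </values>
  </entry>
</entry_id_pg>
"""

root = ET.fromstring(TEMPLATE)
entries = []
for entry in root.findall("entry"):
    # Each <value> carries its guard attributes (possibly none) and its text.
    candidates = [(dict(v.attrib), v.text) for v in entry.find("values")]
    entries.append((entry.get("id"), entry.findtext("group"), candidates))

for entry_id, group, candidates in entries:
    print(group, entry_id, candidates)
```

Each candidate is a pair of guard attributes and value text; an empty attribute dictionary marks an unguarded value.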


The above XML file includes three namelist variable definitions taken from CICE's `namelist_definition_cice.xml`. Notice how the format is very similar to the original namelist definition format. Some of the differences between the original `entry_id_namelist.xsd` and the new `entry_id_pg.xsd`:

- To easily distinguish these schemas, the root element in the new schema is called `entry_id_pg`, and not `entry_id`.
- Currently, only a subset of descriptive child elements is supported for `entry_id_pg` entries. These are `type`, `group`, and `desc`. More elements may be added as needed.
- In the traditional format, value propositions would be specified with arbitrary `<value>` attributes, e.g., `hgrid="gx3v7"`. The new format also supports this specification type. Within a value entry, multiple key=value attributes may be specified, in which case they are joined with logical AND. The ParamGen implementation of the XML specification, however, offers an alternative, more flexible method of specifying guards.
- The more flexible method specifies guards via the `guard=` attribute. A `guard=` attribute can be any Python expression that evaluates to True or False. These expressions can involve any variables (see the `expand_func` description in README.ipynb) and any standard Python operators, methods, list comprehensions, etc. Notice how the `.startswith()` method is used to shorten the `ice_ic` value list compared to the traditional proposition specification, which would require a separate value entry for each grid starting with "gx1v", "tx0.1v", and "ar9v3".
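
The two guard styles in the list above can be sketched in plain Python. This is a minimal illustration and not ParamGen's actual implementation: the helpers `attrs_to_guard` and `eval_guard` are hypothetical names, and `$VAR` substitution is approximated here with a regular expression.

```python
import re

def expand_func_demo(varname):
    # Hypothetical variable values, for illustration only.
    return {"ICE_GRID": "gx1v6", "cice_mode": "thermo_only"}[varname]

def attrs_to_guard(attrs):
    """Join key="value" attributes into one guard expression with logical AND."""
    return " and ".join(f'${key} == "{val}"' for key, val in attrs.items())

def eval_guard(guard, expand_func):
    """Replace each $VAR with the repr of its value, then evaluate the expression."""
    expanded = re.sub(r"\$(\w+)", lambda m: repr(expand_func(m.group(1))), guard)
    # eval() is acceptable here only because the template is trusted input.
    return bool(eval(expanded))

print(attrs_to_guard({"ICE_GRID": "gx3v7", "cice_mode": "prescribed"}))
print(eval_guard('$ICE_GRID .startswith("gx1v")', expand_func_demo))
print(eval_guard(attrs_to_guard({"cice_mode": "prescribed"}), expand_func_demo))
```

Writing attribute-style propositions as equality tests joined by `and` shows that they are a special case of a `guard=` expression.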

Before showcasing the ParamGen module, we define an `expand_func`. Recall that ParamGen makes use of a custom `expand_func` to infer the values of expandable variables. In the above XML file, we have three such variables: `cice_mode`, `ICE_GRID`, and `DIN_LOC_ROOT`.

In [2]:
def expand_func_demo(varname):
    # Look up the value of an expandable variable by name.
    return {
        'ICE_GRID': 'gx1v6',
        'DIN_LOC_ROOT': '/glade/p/cesmdata/cseg/inputdata',
        'cice_mode': 'thermo_only',
    }[varname]

While the above `expand_func_demo` is trivial and serves demonstration purposes only, the `expand_func` below is a real-world example taken from MOM6 within CESM:

In [3]:
def expand_func(varname):
    return case.get_value(varname)

where `case` is a CIME case object whose `get_value()` method returns the values of XML variables like `ICE_GRID`, `DIN_LOC_ROOT`, etc.
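
To experiment with this pattern outside a CESM case directory, `case` can be replaced by a stand-in. The `FakeCase` class below is hypothetical and mimics only the `get_value()` call used above; it is not part of CIME, and its handling of unknown variables is an assumption.

```python
class FakeCase:
    """Hypothetical stand-in for a CIME case object (illustration only)."""

    def __init__(self, xml_vars):
        self._xml_vars = dict(xml_vars)

    def get_value(self, varname):
        # Return None for unknown variables; a real case object may behave differently.
        return self._xml_vars.get(varname)

case = FakeCase({
    "ICE_GRID": "gx1v6",
    "DIN_LOC_ROOT": "/glade/p/cesmdata/cseg/inputdata",
})

def expand_func(varname):
    return case.get_value(varname)

print(expand_func("ICE_GRID"))
```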

## 3. ParamGen XML Namelist Format in Action

We first import the ParamGen class as follows:

In [4]:
from paramgen import ParamGen

We can now instantiate a ParamGen object by passing the `my_template.xml` file path to the `from_xml_nml()` method of ParamGen:

In [5]:
pg = ParamGen.from_xml_nml("./my_template.xml")

After instantiating the ParamGen object, we can call its `reduce()` method to evaluate guards and infer final namelist variable values. Notice that we pass in the `expand_func_demo` function so ParamGen can infer the values of expandable variables such as `ICE_GRID`.

In [6]:
pg.reduce(expand_func_demo)
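
Conceptually, reducing one entry means evaluating the guard of each candidate value and keeping one whose guard holds. The sketch below assumes a last-match-wins rule, which is consistent with the namelist output shown in this document but should be checked against README.ipynb. `reduce_values` is a hypothetical helper, not a ParamGen function, and the `.nc` file names are shortened placeholders.

```python
def reduce_values(candidates, expand_func):
    """Pick a value from (guard, value) pairs; a guard is None or a callable."""
    chosen = None
    for guard, value in candidates:
        if guard is None or guard(expand_func):
            chosen = value  # later matches override earlier ones (assumed rule)
    return chosen

def expand_func_demo(varname):
    return {"ICE_GRID": "gx1v6", "cice_mode": "thermo_only"}[varname]

f_anglet = [
    (None, ".true."),
    (lambda ef: ef("cice_mode") == "prescribed", ".false."),
]
ice_ic = [
    (None, "UNSET"),
    (lambda ef: ef("ICE_GRID") == "gx3v7", "gx3v7_file.nc"),       # placeholder name
    (lambda ef: ef("ICE_GRID").startswith("gx1v"), "gx1v_file.nc"),  # placeholder name
]

print(reduce_values(f_anglet, expand_func_demo))
print(reduce_values(ice_ic, expand_func_demo))
```

Under this rule, an unguarded first value acts as a default that any later matching guard overrides.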

Finally, we can write out a Fortran namelist file as follows:

In [7]:
pg.write_nml("./my_nml_file.nml")

The resulting namelist file is as follows:

In [8]:
!cat ./my_nml_file.nml

&setup_nml
    days_per_year = 365
    ice_ic = /glade/p/cesmdata/cseg/inputdata/ice/cice/b.e15.B1850G.f09_g16.pi_control.25.cice.r.0041-01-01-00000.nc
/

&icefields_nml
    f_anglet = .true.
/
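
The group layout of the file above can be reproduced with a few lines of string formatting. The sketch below is not ParamGen's `write_nml()`; `format_nml` is a hypothetical helper that takes a plain `{group: {variable: value}}` dictionary of already reduced values.

```python
def format_nml(groups):
    """Render {group: {variable: value}} as Fortran namelist text."""
    lines = []
    for group, variables in groups.items():
        lines.append(f"&{group}")
        for name, value in variables.items():
            lines.append(f"    {name} = {value}")
        lines.append("/")
        lines.append("")  # blank line between groups
    return "\n".join(lines)

text = format_nml({
    "setup_nml": {"days_per_year": "365"},
    "icefields_nml": {"f_anglet": ".true."},
})
print(text)
```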



In addition to writing out the namelist file, we can access the data directly, both before and after the `reduce()` method is called. Some examples:

In [9]:
# print out the final value of `f_anglet`:
print(pg.data['icefields_nml']['f_anglet']['values'])

.true.


In [10]:
# print out the description of `days_per_year`:
print(pg.data['setup_nml']['days_per_year']['desc'])

Days per year


In [11]:
# print out the type of `ice_ic`:
print(pg.data['setup_nml']['ice_ic']['type'])

char
