# Tutorial 1c: Cloning a NEON case

**This is an optional tutorial.** It's a little more advanced, but it will help you think about how to modify the model configuration to run new sites or model experiments.  

The `run_neon` script we used in the [introductory tutorial 1b](1b_NEON_Simulation_Intro.ipynb) created and ran a base case as well as a `.transient` case. 

You'll recall that building the case was pretty time consuming. Luckily, if you aren't changing the model code, you can run from this base case without rebuilding multiple times. 

<br>

---

## In this tutorial

The tutorial has several components. Below you will find steps to: 
1. Create a clone for a new NEON site from the base case you already built.
2. Introduce ways to customize the configuration of your cases.

*Extra credit* 

3. Create another clone for an experiment you're hoping to do.

<div class="alert alert-block alert-warning">

<b>NOTICE:</b>  This tutorial assumes that you've done your homework! 
    
If you haven't downloaded CTSM from the GitHub repository you need to go back to: 
<ul>
    <li><b>1b_NEON_Simulation_Tutorial</b> </li> 
</ul>
    
Do these first!

    
</div>


<div class="alert alert-block alert-warning">

<b>NOTICE:</b> If you're running this notebook through the NCAR JupyterHub login, you need to be on a Cheyenne login node (NOT Casper).  

</div>


<h1> 1. Create a clone </h1>

CTSM cases can be cloned to save time and create model experiments for different sites or different configurations that are otherwise identical to each other.   

### 1.1.1 Set up environment
Setting up your environment is important so that you have all the tools and packages you need to run simulations. 

The following code **is needed** if you're running in CESM-lab in the cloud.


In [None]:
cp -rp /opt/ncar/ctsm/ ~/CTSM

*The code below **is not** needed in the cloud*.

<div class="alert alert-block alert-info">
   
<b>TIP:</b> <i>If you're running on Cheyenne</i>, you may need to uncomment the following two lines of code. This will set up your conda environment correctly.

<b>This is not required if you're running CESM-Lab in the cloud.</b>

</div>


In [None]:
#module purge
#module load conda ncarenv

### 1.1.2 Navigate to your source code

In [None]:
cd ~/CTSM/tools/site_and_regional

## 1.2 Create a clone with `run_neon.py`
We'll use `run_neon` again, but this time point to the **base case** you created in the `1b` tutorial.

<div class="alert alert-block alert-info">
   
<b>TIP:</b> <i>If you're running on Cheyenne</i>, you may need to uncomment the following two lines of code. This will set up your conda environment correctly.
    
<b>This is not required if you're running CESM-Lab in the cloud.</b>
</div>


In [None]:
#module purge
#module load conda ncarenv

The following code will:
- create a clone,
- download input data from NEON, and
- submit the simulation.

As before, it takes a little bit for all this to happen, but you'll notice it's MUCH faster since we don't have to build the model.

In [None]:
# Change the 4-character NEON site below.
export base_case='KONZ'   # should match the base case you created in 1b
export new_case='CPER'    # the new site you want to run

In [None]:
# then run_neon
./run_neon.py --neon-sites $new_case  \
   --output-root ~/scratch/NEON_cases \
   --base-case ~/scratch/NEON_cases/$base_case \
   --overwrite

Two new flags are being used here:
- `--base-case` points to a base case that's already been built, letting you run your new simulation more quickly.  
- `--overwrite` will let you overwrite an existing case.  

<div class="alert alert-block alert-warning">

<b>NOTICE:</b> 
    
- Creating clones while pointing to a `--base-case` saves time by sharing the built model executable.
- Creating clones like this also requires that your base-case and new-case are the same type of simulation, `.transient` simulations in this example.  
- It also assumes that you're <b>not making modifications to the model source code </b> that would require you to rebuild a case.  

</div>


You can check to make sure things are running as expected. On the command line of a terminal window you can enter any of the following (HINT: you'll likely have to define `new_case` in the terminal window first, as you did in the notebook above): 
> `ls ~/scratch/NEON_cases/$new_case.transient/run` **or** 

>`qstat -u $USER` **or**

>`tail  ~/scratch/NEON_cases/${new_case}.transient/CaseStatus`
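
If you'd rather have a yes/no answer than read the log yourself, the check can be wrapped in a small function. This is a minimal sketch, not part of the CTSM tools: the function name `case_finished` is made up, and it assumes that a completed run leaves a line containing `case.run success` in `CaseStatus`. The demo runs against a mock case directory with invented timestamps so that it works anywhere.

```shell
# Sketch: report whether a case has finished, based on its CaseStatus log.
# ASSUMPTION: a finished run leaves a line containing "case.run success".
case_finished() {
    # $1 = path to a case directory
    grep -q "case.run success" "$1/CaseStatus"
}

# Demo on a mock case directory (a stand-in, not a real CTSM case)
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/CPER.transient"
printf '%s\n' "2023-01-01 10:00:00: case.submit success" \
              "2023-01-01 10:30:00: case.run success" \
    > "$demo_dir/CPER.transient/CaseStatus"

if case_finished "$demo_dir/CPER.transient"; then
    status="finished"
else
    status="not finished"
fi
echo "CPER.transient: $status"
```

With a real case you would call it as `case_finished ~/scratch/NEON_cases/${new_case}.transient`.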

<div class="alert alert-block alert-info" markdown='1'>
How are our simulations different? Let's take a quick look to see.
</div>

----
# 2. Introduction to controlling case configuration

There are a few places that summarize differences in simulations.  These files include:
- `env_run.xml`, which contains variables that can be changed to configure your case;  
- `lnd_in`, which cannot be modified directly, but gets its information from `user_nl_clm` (and provides additional control over the way the land model is configured in your simulation);  
- `datm.streams.xml`, which also cannot be modified directly, but can be changed through `user_nl_datm_streams` to point to different data atmosphere streams.

The easiest way to see how two cases differ is to compare the same files in each case directory with the `diff` command. 
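
Before reading the full `diff` output file by file, it can help to see at a glance which files differ at all. The helper below is a minimal sketch: `compare_cases` is a made-up name, not a CTSM or CIME tool, and the demo uses two mock directories rather than real cases. Note that a file missing from either directory is also reported as `differ`.

```shell
# Sketch: report which files differ between two case directories.
# Usage: compare_cases DIR_A DIR_B FILE [FILE ...]
compare_cases() {
    dir_a=$1
    dir_b=$2
    shift 2
    for f in "$@"; do
        # diff exits 0 when the files match, non-zero otherwise
        if diff "$dir_a/$f" "$dir_b/$f" > /dev/null 2>&1; then
            echo "same:   $f"
        else
            echo "differ: $f"
        fi
    done
}

# Demo on two mock case directories (stand-ins, not real CTSM cases)
demo=$(mktemp -d)
mkdir -p "$demo/A/CaseDocs" "$demo/B/CaseDocs"
echo "site = KONZ" > "$demo/A/CaseDocs/lnd_in"
echo "site = CPER" > "$demo/B/CaseDocs/lnd_in"
echo "shared setting" > "$demo/A/env_run.xml"
echo "shared setting" > "$demo/B/env_run.xml"

report=$(compare_cases "$demo/A" "$demo/B" env_run.xml CaseDocs/lnd_in)
echo "$report"
```

From `~/scratch/NEON_cases`, the equivalent call on your own cases would be `compare_cases $new_case.transient $base_case.transient env_run.xml CaseDocs/lnd_in CaseDocs/datm.streams.xml`.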

## 2.1 `env_run.xml` 

In [None]:
cd ~/scratch/NEON_cases
diff $new_case.transient/env_run.xml $base_case.transient/env_run.xml 

**What are the differences between these two cases**, based on their `env_run.xml` files?

These differences are expected, because NEON cases are set up with `usermod_dirs` that control some of these .xml variables.

You can look at the particular changes for your `$new_case` below.

In [None]:
cat ~/CTSM/cime_config/usermods_dirs/NEON/$new_case/shell_commands

We won't really go into usermod directories here, but they are a useful way to modify the way cases are configured for different model experiments.

usermod_dirs were used when you created your NEON cases with `run_neon`.  
- The `shell_commands` and `user_nl_*` files are copied from `~/CTSM/cime_config/usermods_dirs/NEON` into your case directory.  
- Shell commands are executed after the case is created.
- This makes it easier to set up cases in a consistent, repeatable way.
**Pretty slick software engineering!**

Some NEON sites have additional changes, notably for sites with gaps in the NEON input data.  These sites have shorter run times (than the 2018-2021 defaults).  

In [None]:
cat ~/CTSM/cime_config/usermods_dirs/NEON/MOAB/shell_commands

These .xml changes set up high-level control over how your simulation is run.  What are some of the specifics related to the land model?  We can see this by looking at our `lnd_in` file.

## 2.2 `lnd_in` 

**What are the differences between these two cases**, based on their `lnd_in` files?

In [None]:
diff $new_case.transient/CaseDocs/lnd_in $base_case.transient/CaseDocs/lnd_in 

<style> 
table td, table th, table tr {text-align:left !important;}
</style>
<div class="alert alert-block alert-info">
<b>REMEMBER:</b>
    
- The <i>lnd_in</i> file provides a high-level summary of all the namelist changes and files that are being used by CLM. 
- It can be found in the CaseDocs directory, or in your run directory. 
- You cannot directly modify the <i>lnd_in</i> file; instead, users can modify <i>user_nl_clm</i>.

</div>

**Initial conditions dataset:** `finidat`
  - These are initial conditions files that we created by spinning up the model.  
  - Spin up requires starting the model from bare ground conditions (we call it a *coldstart*).
  - Spin up takes a few hundred years of simulation so that ecosystem carbon and nitrogen pools achieve steady state conditions (e.g. average net ecosystem exchange equals zero).  
  - Since this takes a long time, we provide initial conditions for you to start from.
  - **This also means that if you change model parameterizations, input data, or anything else, you have to spin up the model again!** 

**Surface dataset:** `fsurdat`
  - The surface datasets describe what vegetation is growing in a grid cell, characteristics of soil physical properties, and much more information about what the land surface 'looks like' to the model.  
  - We modified the default surface dataset for each NEON simulation with information about the dominant plant functional type (PFT) and soil properties, based on NEON measurements. 
  - This could likely be further refined, but it's a step towards making the model look more like the real world ecosystems we are trying to simulate.


There are other differences in the build, but this is basically just reflecting the different case directories for the two NEON cases.
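
To focus on just the settings discussed above, you can pull individual variables out of a namelist file instead of reading the whole `diff`. The following is a minimal sketch: `nl_value` is a made-up helper name, and the mock file and its paths are invented for illustration, so the values will not match your real `lnd_in`.

```shell
# Sketch: print the value assigned to a namelist variable in a file such as lnd_in.
# Usage: nl_value FILE VARIABLE
nl_value() {
    # match lines like:  finidat = '/path/to/file.nc'  and keep the text after "="
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1"
}

# Demo on a mock lnd_in (paths are hypothetical)
mock=$(mktemp)
cat > "$mock" <<'EOF'
&clm_inparm
 finidat = '/hypothetical/inputdata/initial_conditions.nc'
 fsurdat = '/hypothetical/inputdata/surfdata_example.nc'
 paramfile = '/hypothetical/inputdata/params_example.nc'
/
EOF

for var in finidat fsurdat paramfile; do
    echo "$var -> $(nl_value "$mock" "$var")"
done
```

On a real case you would point it at `$new_case.transient/CaseDocs/lnd_in` and `$base_case.transient/CaseDocs/lnd_in` in turn and compare the results.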

---

The `lnd_in` files are controlled by `user_nl_clm`.  Let's see how these are different.

In [None]:
diff $new_case.transient/user_nl_clm $base_case.transient/user_nl_clm

It looks like the only difference here is the initial conditions file, but that's because we used environment variables to get the right surface dataset.

If you open one of the `user_nl_clm` files you'll see:
```
fsurdat = "$DIN_LOC_ROOT/lnd/clm2/surfdata_map/NEON/surfdata_1x1_NEON_${NEONSITE}_hist_78pfts_CMIP6_simyr2000_c230111.nc"
```
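
To see why the same `fsurdat` line can serve every site, here is a minimal sketch of the substitution idea. It is only an analogy: in a real case the value of `NEONSITE` comes from the case's XML settings rather than from your shell, and the function name `surfdata_name` is made up for this illustration.

```shell
# Sketch: one filename pattern, resolved with a different NEONSITE per case.
# The shell does the substitution here purely as an analogy for how the
# case configuration fills in ${NEONSITE}.
surfdata_name() {
    # $1 = 4-character NEON site code
    NEONSITE=$1
    echo "surfdata_1x1_NEON_${NEONSITE}_hist_78pfts_CMIP6_simyr2000_c230111.nc"
}

surfdata_name KONZ
surfdata_name CPER
```

The two calls print filenames that differ only in the site code, which is why the `user_nl_clm` files themselves can stay identical for this setting.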

We saw this already by `diff`ing the `env_run.xml` files, above, but now we'll use `./xmlquery` to see how these are different in each case.

In [None]:
echo moving to base_case directory
cd ~/scratch/NEON_cases/$base_case.transient/
./xmlquery NEONSITE

echo moving to new_case directory
cd ~/scratch/NEON_cases/$new_case.transient/
./xmlquery NEONSITE


You can also see what **parameter file** is being used for your case.  Since we haven't changed this, the model just points to the default CTSM5.1 parameter file.

In [None]:
cat ~/scratch/NEON_cases/$base_case.transient/CaseDocs/lnd_in | grep paramfile
cat ~/scratch/NEON_cases/$new_case.transient/CaseDocs/lnd_in | grep paramfile


The land model is using site specific initial conditions and surface data for each NEON site.  How else are our simulations different?  

## 2.3 `datm.streams.xml` 

**What are the differences between these two cases**, based on their `datm.streams.xml` files?

The answer here isn't very interesting... the two cases likely point to different input data reflecting local meteorology at each site.  It's still helpful to know how these files are set up.


In [None]:
cd ~/scratch/NEON_cases/
cat $new_case.transient/CaseDocs/datm.streams.xml | head -20

*Which aspects of this file could be changed for a different site?*

You can check with this code *(HINT: you'll have to paste it in the command line or into a code cell).*

> ```
> diff $new_case.transient/CaseDocs/datm.streams.xml         $base_case.transient/CaseDocs/datm.streams.xml 
> ```

<div class="alert alert-block alert-info" markdown='1'>
<b>REMEMBER:</b> 

- The <i>datm.streams.xml</i> file points to all of the atmospheric boundary conditions (input data) that are being read in for a case. 
- Like your <i>lnd_in</i> files, it can be found in the CaseDocs directory, or in your run directory. 
- You cannot directly modify this file; instead, users can modify <i>user_nl_datm_streams</i>.  

</div>
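
If you only want to know which input files a streams file points to, you can extract just the file entries. This is a minimal sketch under stated assumptions: it assumes each data file is listed on its own line inside a `<file>...</file>` element, the helper name `list_stream_files` is made up, and the mock XML below (stream name and paths included) is a simplified, hypothetical stand-in for a real `datm.streams.xml`.

```shell
# Sketch: list the data files named in a streams XML file.
# ASSUMPTION: each file path sits on its own line inside <file>...</file>.
list_stream_files() {
    # print only the text between <file> and </file>
    sed -n 's:.*<file>\(.*\)</file>.*:\1:p' "$1"
}

# Demo on a simplified mock file (stream name and paths are hypothetical)
mock=$(mktemp)
cat > "$mock" <<'EOF'
<stream_info name="EXAMPLE.STREAM">
  <datafiles>
    <file>/hypothetical/atm/SITE_2018-01.nc</file>
    <file>/hypothetical/atm/SITE_2018-02.nc</file>
  </datafiles>
</stream_info>
EOF

list_stream_files "$mock"
n_files=$(list_stream_files "$mock" | wc -l | tr -d ' ')
echo "number of files: $n_files"
```

Running the same function on the streams files of two cases and comparing the output is a quick way to confirm that each case reads its own site's data.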

---

<div class="alert alert-success">
<strong>Congratulations!</strong> 
    
You have now cloned a CTSM case to run a simulation at a new NEON tower site. Check that you can locate the history files from this site, and try plotting some of the new results.
</div>

---

# 3. Create an experimental clone:

<h4> This step is optional, but provides helpful information that you may use in your own workflow </h4>

So far, everything we've done has been *out of the box*, looking at different NEON sites without changing anything in the underlying model code.  You may want to do model experiments where you alter the vegetation growing at a site, modify some of the model parameters, modify namelist settings, change the input data, or even alter model code.  We'll get into how to make these changes later, but for now we'll get a test case set up.

Since we've already run an out-of-the-box case for KONZ, we can create a paired experimental case at the same site.

<div class="alert alert-block alert-info" markdown='1'>
<b>RECOMMENDATION:</b> Use a short, descriptive name for your experiment; it will help you down the road.
</div>


This example below just builds on what you've already been doing:

In [None]:
# Change the 4-character NEON site below.
cd ~/CTSM/tools/site_and_regional
export base_case='KONZ'   # should match the base case you created in 1b
export new_case='KONZ'    # same site as the base case; the experiment name sets the new case apart

# then run_neon
./run_neon.py --neon-sites $new_case  \
   --output-root ~/scratch/NEON_cases \
   --base-case ~/scratch/NEON_cases/$base_case \
   --overwrite \
   --experiment test1 \
   --setup-only  

Two new flags are being used here:
- `--experiment` adds the experiment name to the case name.
- `--setup-only` will create the case, but not submit it.

<div class="alert alert-block alert-warning" markdown='1'>

<b>WARNING:</b> Because we're also using the `--base-case` flag, we won't have to rebuild our experimental case.  This may not be advisable if you're modifying model code.

</div>

At this point your experimental case has been created.  
- What is the case name?
- Can you navigate to your case directory?
- Are there any differences between this experimental case and the base case you already ran?
- Can you find the datm input data for your case?
- Is the model going to use an initial conditions file?
- Where is the surface dataset that's being used?
- Can you find the parameter file for your case?

**Extra Credit**
- What changes might you like to make to the model in this new experimental case?