# Tutorial 0e 
## *BONUS OPTION: Creating NEON cases from 'scratch'*

The `run_neon` script we used in the `Day0b` and `Day0d` tutorials is very useful, but it simplifies a number of steps that users may want to take control of in their workflow.  We'll introduce how you can control aspects of your case setup and configuration here. 

If you're new to running CTSM, this is a somewhat advanced tutorial.  You can do quite a bit if you're comfortable with the `run_neon` script alone.

If you're more familiar with running CTSM, these steps may be somewhat redundant.  That said, they illustrate important features, like how to get initial conditions / restart files from NEON.

<br>

---

## In this tutorial

The tutorial has several components. Below you will find steps to: 
1. Create, set up, build, and run an out-of-the-box CTSM case at a NEON site
2. Locate history files

*Extra credit* 

3. Create a clone for a new NEON site that reuses the build from the case created in #1. This is **not required**, but helpful.

<div class="alert alert-block alert-warning">

<b>NOTICE:</b>  This tutorial assumes that you've done your homework! 
    
If you haven't downloaded CTSM from the github repository you need to go back to the: 
<ul>
    <li><b>Day0a_GitStarted tutorial</b> and </li> 
    <li><b>Day0b_NEON_Simulation_Tutorial</b> </li> 
</ul>
    
Do these first!

    
</div>


<div class="alert alert-block alert-warning">

<b>NOTICE:</b> If you're running this notebook through the NCAR JupyterHub login you need to be on a Cheyenne login node (NOT Casper).  

</div>



<div class="alert alert-block alert-info" markdown="1">

<b>TIP:</b>  Before we get started, make sure you're in a bash kernel 
<ul>
    <li> Switch kernel (upper right of your current notebook)</li>
    <li> Select either one of the Bash Kernels from the pop-up window</li>    
    <li> Click select</li>    
</ul>
</div>

<div>
<img src="https://github.com/NCAR/CTSM-Tutorial-2022/raw/main/images/kernel.png" width="670" />
</div>


<div class="alert alert-block alert-info" markdown="1">

<b>HINT:</b>  Most of the examples in this tutorial can be run directly from code 
cells of this notebook.  

It's also helpful to have a terminal window open to run from the command line.
To open a tab with a terminal window connection within CESM-Lab:
<ol>
    <li> Click on the + symbol in the upper left for a <i>New Launcher</i></li>
    <li> Click on the <i>Terminal</i> icon</li>
</ol>
</div>


<div>
    <img src="https://github.com/NCAR/CTSM-Tutorial-2022/raw/main/images/LaunchTerminal.png" 
         style="width:655px; height:375px;" />
</div>

***

<h1> 1. Set up and run a simulation</h1>

### CTSM can be run in 4 simple steps.

<h4>1.1 create a new case </h4>

- *This step sets up a new simulation. It is the most complicated of these four steps because it involves making choices to set up the model configuration*

#### 1.2 invoke case.setup
- *This step configures the model so that it can compile*

#### 1.3 build the executable
- *This step compiles the model*

#### 1.4 submit your run to the batch queue
- *This step submits the model simulation*

***
<h1> 1.1 create a new case</h1>

*Set up a new simulation*
***

## 1.1.1 Create a directory for cases
This is a one-time step to create a directory to store your experiment cases, or your **case directory**:

<div class="alert alert-block alert-info">

<b>NOTE:</b>  The code cell below is set up in a bash kernel, which means you can execute commands just like on the command line of a terminal window. It may not render correctly if you're looking at this online, but we've copied the command in the NOTE below.

    
<b>TIP:</b>  Execute code by clicking in the cell below, and then either clicking the play button in the menu bar above, or pressing 'shift+enter'

</div>

In [None]:
mkdir ~/NEON_cases

<div class="alert alert-block alert-info">
   
<b>TIP:</b> If you're running on Cheyenne, creating a softlink to scratch in your home directory is helpful.  
    
You can create this with the following: `ln -s /glade/scratch/$USER ~/scratch`
 
</div>

With `run_neon` the case and run directories were combined for convenience.

Now we'll create a new space for model results in a **run directory**.

In [None]:
mkdir ~/scratch/NEON_runs

## 1.1.2 Navigate to the source code directory

Your source code is in the `CTSM` directory within your home directory.

<div>
<img src="https://github.com/NCAR/CTSM-Tutorial-2022/raw/main/images/SourceCode.png" width="460" height="460" />
</div>

In [None]:
cd ~/CTSM/cime/scripts

## 1.1.3 Create a new case

In [None]:
# Change the 4-character NEON site below.
export neon_site="UNDE"
# then create a new case.
./create_newcase --case ~/NEON_cases/${neon_site}.transient \
    --res CLM_USRDAT \
    --compset IHist1PtClm51Bgc \
    --output-root ~/scratch/NEON_runs \
    --user-mods-dirs NEON/${neon_site} \
    --run-unsupported     

***
The code above doesn't always render properly online.  Here's what the code actually says:

```
./create_newcase --case ~/NEON_cases/${neon_site}.transient \
    --res CLM_USRDAT \
    --compset IHist1PtClm51Bgc \
    --output-root ~/scratch/NEON_runs \
    --user-mods-dirs NEON/${neon_site} \
    --run-unsupported
```
***
### **./create_newcase**

<div class="alert alert-block alert-info">

<b>NOTE:</b> There is a lot of information that goes into creating a case.

You can learn more about the options by typing <i>./create_newcase --help</i> on the command line or in a new code cell.

<b>We'll briefly go over some of the highlights here.</b>

</div>

---

### Required arguments to create a new case
There are 3 required arguments needed to create a new case.  These include: 
1. `--case`, which specifies the *location* and *name* of the case being created
  - `~` = your home directory
  - `NEON_cases` = the subdirectory we created to store your cases in 1.1.1
  - `${neon_site}.transient` = the name of the case you're creating
  - *Recommendation:* Use meaningful names, including model version, type of simulation, and any additional details to help you remember the configuration of this simulation
<br><br>
2. `--res` Defines the model *resolution*, or grid,
  - `CLM_USRDAT` is an option that lets you set up a case without having to define the grid resolution yet. 
  - In global cases, the land model is commonly run at a nominal 1 degree `f09_g17` or 2 degree `f19_g17` resolution 
  - Using `./query_config --grids` provides a list of supported model resolutions
<br><br>  
3. `--compset` Defines the *component set* for your case, 
  - The component set specifies the default configuration for the case, which includes:
    - Component models (e.g. active vs. data vs. stub), 
    - Time period of simulations and forcing scenarios (e.g. 1850 vs 2000 vs. HIST) and 
    - Physics options (e.g. CLM5.1 vs CLM5.0). 
  - `IHist1PtClm51Bgc` is an alias for a much longer set of components that are being used for this single point case. 
  - All CLM-only compsets start with *"I"*.
    - Using `./query_config --compsets clm` provides examples of other CLM compsets
<br>

*We'll come back to compsets, but there are a few other **optional** flags used to create this new case that we'll briefly touch on here*
<br>

4. `--output-root`, which specifies the *location* of your run directory
  - `~/scratch/NEON_runs` = the subdirectory we created for your run directory in 1.1.1
<br> 

5. `--user-mods-dirs` sets up the configuration of the case with a user modification directory that defines the location of the site.
  - `NEON/${neon_site}` has files that define the format of the history files and a few other custom settings for your case. 
  - This includes setting some of the .xml variables and namelist settings correctly for a NEON single point case.  We'll go over this in more depth in future tutorials.
<br>

6. `--run-unsupported` avoids an error when using compsets that are not scientifically supported.
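
Because the site code feeds into both the case name and the user-mods directory, a typo tends to surface only once `create_newcase` fails. Here is a minimal sketch of a guard you could run first. It assumes NEON site codes are always four upper-case letters, and the function name `check_neon_site` is hypothetical (it is not part of CTSM or CIME):

```shell
# Hypothetical helper (not part of CTSM/CIME): succeed only if the
# argument looks like a 4-letter, upper-case NEON site code.
check_neon_site() (
    LC_ALL=C   # make [A-Z] match upper-case ASCII letters only
    case "$1" in
        [A-Z][A-Z][A-Z][A-Z]) exit 0 ;;
        *) echo "'$1' does not look like a NEON site code" >&2; exit 1 ;;
    esac
)

neon_site="UNDE"
if check_neon_site "$neon_site"; then
    # Print the case name that the create_newcase command would use.
    echo "case name: ${neon_site}.transient"
fi
```

This only checks the shape of the code; it does not confirm that a matching `NEON/${neon_site}` user-mods directory exists.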

<div class="alert alert-block alert-info" markdown='1'>

<b>NOTE:</b> You may notice an error about project codes when you create your case. The project code isn't important for these simulations. But you may need to change this if you're running on Cheyenne.

</div>

***

### More on component sets

- All CLM-only compsets start with *"I"*.
 - Using `./query_config --compsets clm` provides examples of other CLM compsets

You can try this here:

In [None]:
./query_config --compsets clm

The long name for the compset used here is `HIST_DATM%1PT_CLM51%BGC_SICE_SOCN_SROF_SGLC_SWAV_SESP`, which defines
   - time = `HIST_` (vs. 1850, 2000, SSP, etc.)
   - data atmosphere `DATM%1PT_`, from a single point, as opposed to model atmosphere (e.g., CAM) 
   - land model `CLM51%BGC_` CLM5.1 physics package with active biogeochemistry (BGC)
   - stub sea ice model `SICE_`
   - stub ocean model `SOCN_` (see there's no active ocean model!)
   - stub river model `SROF_` (other options include MOSART, MIZUROUTE or RTM)
   - stub glacier `SGLC_`
   - stub wave model `SWAV_`
   - stub external system processing component `SESP`
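
If it helps to see the pieces of the long name separately, here is a small sketch that splits it on underscores. The variable names `compset_long` and `components` are just for illustration:

```shell
# Split the compset long name on underscores, one component per line.
compset_long="HIST_DATM%1PT_CLM51%BGC_SICE_SOCN_SROF_SGLC_SWAV_SESP"
components=$(echo "$compset_long" | tr '_' '\n')
echo "$components"
```

Each printed line corresponds to one entry in the list above, with any option after the `%` sign (such as `%BGC`) kept attached to its component.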

<div>
<img src="https://github.com/NCAR/CTSM-Tutorial-2022/raw/main/images/Components.png" width="460" height="460" />
</div>


Key Definitions:
- **Active:** Simulation is using the code from the model during the run
- **Data:** Simulation is reading in data from a file for this component
- **Stub:** Component is not being used
<br><br> 

<div class="alert alert-block alert-warning" markdown="1">

<b>HINT:</b> Some compsets are "scientifically supported" and others are not. A scientifically supported compset just means that we've done some additional testing and evaluation with that compset.  You can use an unsupported compset (which is encouraged!), but you will need to add the option <i>--run-unsupported</i> at the end of the <i>create_newcase</i> command line.

</div>


<div class="alert alert-block alert-info" markdown="1">

<b>TIP:</b> More information on model <a href="https://www.cesm.ucar.edu/models/cesm2" target="_blank">Configurations and Grids can be found on  the CESM website</a> (see <i>Configurations and Grids</i> subheading at the bottom of the page)

</div>



***
<h2> 1.2 Invoke case.setup </h2>

*This step configures the model so that it can compile* 
***

### 1.2.1 Move to your case directory

In [None]:
cd ~/NEON_cases/${neon_site}.transient

As before, this doesn't always render properly online:

`~/NEON_cases/${neon_site}.transient`

### 1.2.2 Set up your case

The `./case.setup` script:
1. configures the model
2. creates files to modify input data and run options

In [None]:
./case.setup

Using this command, we just configured the model and created the files to modify options & input data.
<div>
<img src="https://github.com/NCAR/CTSM-Tutorial-2022/raw/main/images/CaseSetup.png" width="670" />
</div>

***
<h2> 1.3 Build the executable  </h2>

*This step compiles the model*

It also takes a long time, so be patient.
***

The `./case.build` script:
1. Checks input data
2. Creates a build/run directory with model executable and namelists

**NOTE:** The cell below runs `./case.build` directly, because `qcmd -- ./case.build` has failed with `wget` errors while fetching the meteorological data.

In [None]:
./case.build


---
This takes some time and will print a bunch of error messages along the way. Don't worry, just give it time.


**When the build completes successfully you'll see a notice that the `MODEL BUILD HAS FINISHED SUCCESSFULLY`**


<div class="alert alert-block alert-warning">
   
<b>HINT:</b>  Building a NEON case requires that you have the `listing.csv` file in your `~/NEON_cases` directory.  
<p> Don't worry, you have this if you did the <b>Day0b tutorial</b>  <span>&#x1F609;</span> </p>

</div>

You can read on, but before executing any code blocks in the notebook **wait for the model to build.**
This can take a while, especially while you're waiting for your `qcmd` job to start and as code for the land model compiles.

<style> 
table td, table th, table tr {text-align:left !important;}
</style>
<div class="alert alert-block alert-info">

<b>NOTE:</b> The command <i>qcmd -- ./case.build</i> is specific for NCAR environments, including Cheyenne and cloud configurations, and runs the command on a computing node, reducing the load on the login node. <b>You must include <i>qcmd --</i> when running on Cheyenne</b>, and it's highly advised on shared cloud systems too.  On single-user cloud systems, it isn't needed, though it may speed up builds.

</div>

---

### 1.3.2 Customize your case by modifying user_nl_clm

Namelists are one way you can customize the settings of your case. 

Namelist changes can be made after the model is built, which can save you time down the road!

We'll explore these features more in upcoming tutorials, but for now we just need to add an initial conditions file (or restart file) to your case.

This is done automatically by the `run_neon` script that we used in the *Day0b tutorial*. If you aren't making code or parameter changes, you can use the restart files that we've prestaged on the NEON server.  These can be found in the *listing.csv* file. 

The command below will:
- use `awk` and `grep` to get the URL of the restart file on NEON's servers
- use `tail -1` to keep the last (most recent) file in the list

*Note: this is NOT common knowledge, at least for me.  I spent more than an hour figuring this out!* 

In [None]:
awk -F "," '{print $1}' ~/NEON_cases/listing.csv | grep lnd/ctsm/initdata/${neon_site} | tail -1
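
To see what each stage of this pipeline does without touching the real file, here is a sketch that runs the same commands on a small made-up listing. The URLs, the second column, and the `atm` entry are invented for illustration; the real `listing.csv` may look different, and the sketch only assumes that the first comma-separated field holds the URL:

```shell
# Build a small made-up listing file (illustration only).
listing=$(mktemp)
cat > "$listing" <<'EOF'
https://example.org/neon/lnd/ctsm/initdata/UNDE.2022-06-01.clm2.r.0318-01-01-00000.nc,2022-06-01
https://example.org/neon/lnd/ctsm/initdata/ONAQ.2023-01-13.clm2.r.0318-01-01-00000.nc,2023-01-13
https://example.org/neon/lnd/ctsm/initdata/UNDE.2023-01-13.clm2.r.0318-01-01-00000.nc,2023-01-13
https://example.org/neon/atm/UNDE/2020-01.nc,2020-01
EOF

neon_site="UNDE"
# 1. awk keeps the first comma-separated field (the URL)
# 2. grep keeps only restart files for this site
# 3. tail -1 keeps the last matching line in the file
restart_url=$(awk -F "," '{print $1}' "$listing" \
    | grep "lnd/ctsm/initdata/${neon_site}" \
    | tail -1)
echo "$restart_url"
rm -f "$listing"
```

Note that `tail -1` returns the last match in file order, so it is only the most recent restart file if the listing is ordered from oldest to newest.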

Now we'll use `wget` to copy this file from the NEON servers into our run directory

*NOTE* run_neon handles this a bit differently. 
- **OPTIONAL EXTRA CREDIT** Can you see where the restart file is in your `Day0b` example?

In [None]:
# "$( ... )" substitutes the URL printed by the pipeline into the wget command
wget -P ~/scratch/NEON_runs/${neon_site}.transient/run/ \
    "$(awk -F "," '{print $1}' ~/NEON_cases/listing.csv | grep lnd/ctsm/initdata/${neon_site} | tail -1)"
   

Finally we'll add the path to this restart to our user_nl_clm file

First print the file name

In [None]:
ls ~/scratch/NEON_runs/${neon_site}.transient/run/${neon_site}*.clm2.r.*

<div class="alert alert-block alert-warning">
   
<b>WARNING:</b>  You have to modify the code block below by pasting the string that prints from the cell above into the part that says XXX below
</div>

Here's an example of what I mean:

```
echo "finidat='/glade/u/home/wwieder/scratch/NEON_runs/ONAQ.transient/run/ONAQ.2023-01-13.clm2.r.0318-01-01-00000.nc' " >> user_nl_clm
```

In [None]:
echo "finidat = 'XXX' " >> user_nl_clm

Open your user_nl_clm file.  

**Are you pointing to the initial conditions file copied from NEON?**

If so, let's move on.
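
If you'd rather not paste the path by hand, the `finidat` line can also be generated. The sketch below is one possible way to do that: `finidat_line` is a hypothetical helper (not part of CTSM), and it assumes the restart file you want is the last one in alphabetical order among files matching `SITE*.clm2.r.*`. The demo uses an empty stand-in file in a temporary directory:

```shell
# Hypothetical helper: print a finidat line for the last restart file
# (in glob order) matching SITE*.clm2.r.* inside RUN_DIR.
finidat_line() (
    run_dir=$1
    site=$2
    latest=""
    for f in "$run_dir/$site"*.clm2.r.*; do
        [ -e "$f" ] && latest=$f
    done
    if [ -z "$latest" ]; then
        echo "no restart file for $site in $run_dir" >&2
        exit 1
    fi
    printf "finidat = '%s'\n" "$latest"
)

# Demo on a throwaway directory with an empty stand-in file.
demo_dir=$(mktemp -d)
touch "$demo_dir/ONAQ.2023-01-13.clm2.r.0318-01-01-00000.nc"
demo_line=$(finidat_line "$demo_dir" ONAQ)
echo "$demo_line"
rm -rf "$demo_dir"
```

In a real case you could run something like `finidat_line ~/scratch/NEON_runs/${neon_site}.transient/run ${neon_site} >> user_nl_clm` from your case directory, then open `user_nl_clm` to check the result.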

<h2> 1.4 Submit your case </h2>

Now you're ready to submit the case!

*This step submits the model simulation*
- You'll also be downloading all the meteorological data from NEON for your site. This also takes a little time and prints lots of information to the screen.
***

In [None]:
./case.submit

When you submit a job, you will see confirmation that it was submitted successfully.

### Congratulations! You've created and submitted a single point NEON run from scratch.

Next, you will probably want to check on the status of your jobs.

<div class="alert alert-block alert-info" markdown="1">

<b>TIP:</b> This is dependent on the scheduler that you're using. 

<ul>
<li>Cheyenne uses <b>PBS</b> where status is checked with <i>qstat -u $USER</i></li>
<li>This is also enabled in the cloud for you, try it in the code block below</li></ul>

If you want to stop the simulation, you can do so with qdel here (or on Cheyenne).
<ul>
<li> Find your Job ID after typing <i>qstat</i> </li>
<li> Type <i>qdel {Job ID}</i> </li>
</ul>
</div>

In [None]:
qstat -u $USER

---
Once your jobs are complete (or show the 'C' state under the 'Use' column, which means complete), we can check the CaseStatus file to ensure there were no errors and it completed successfully.  To do this, we'll 'tail' the end of the CaseStatus file:

In [None]:
tail  ~/NEON_cases/${neon_site}.transient/CaseStatus

You should see several lines, with the middle one saying 'case.run success'.  

Before that you'll see notifications about xml changes, case.setup, case.submit, and case.run.
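
If you want to check for the success message without reading the file by eye, `grep` can do it. This is a minimal sketch: `case_run_succeeded` is a hypothetical helper, and the CaseStatus contents below (including the timestamp format) are made up for illustration:

```shell
# Hypothetical helper: succeed if the given CaseStatus file
# contains the text 'case.run success'.
case_run_succeeded() {
    grep -q "case.run success" "$1"
}

# Made-up CaseStatus contents, for illustration only.
status_file=$(mktemp)
cat > "$status_file" <<'EOF'
2023-01-13 10:00:00: case.submit success
2023-01-13 10:05:00: case.run starting
2023-01-13 10:20:00: case.run success
2023-01-13 10:20:05: st_archive starting
EOF

if case_run_succeeded "$status_file"; then
    run_state="finished"
else
    run_state="not finished (or failed)"
fi
echo "$run_state"
rm -f "$status_file"
```

For a real case you would point the helper at `~/NEON_cases/${neon_site}.transient/CaseStatus` instead of the temporary file.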

<div class="alert alert-success">
<strong>Congratulations!</strong> 
    
You've created a CTSM case for the NEON tower you selected.

We'll build on these basics in additional tutorials to customize your simulations.
</div>




****
# 2. Locate model history files
Your simulation will likely take some time to complete. The information
provided next shows where the model output will be located while the
model is running and once the simulation is complete. We also provide
files from a simulation that is already complete so that you can do the
next exercises before your simulation completes. 

<div class="alert alert-block alert-info" markdown="1">

<b>When your simulation is running</b> history files go to your scratch directory: 

<ul>
<li><i>~/scratch/NEON_runs/{CASE}</i> </li>
</ul>

Within this directory you can find <i>/run</i> and <i>/bld</i> subdirectories.

<b>When the simulation is complete</b>, a short-term archive directory is created, and history files are moved here: 

<ul>
<li><i>~/scratch/NEON_runs/<b>archive</b>/{CASE}/lnd/hist/</i> </li>
</ul>

Note that files necessary to continue the run are left in the run directory: <i>~/scratch/NEON_runs/{CASE}/run</i>
</div>
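
The directory layout described above can be sketched as a small function that prints the three locations for a given output root and case name. `case_paths` is a hypothetical helper used only for illustration:

```shell
# Hypothetical helper: print where a case's files live, given the
# directory passed to --output-root and the case name.
case_paths() (
    output_root=$1
    case_name=$2
    echo "run:     ${output_root}/${case_name}/run"
    echo "build:   ${output_root}/${case_name}/bld"
    echo "archive: ${output_root}/archive/${case_name}/lnd/hist"
)

neon_site="UNDE"
case_paths "$HOME/scratch/NEON_runs" "${neon_site}.transient"
```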

## 2.1 Run directory
*What's in your run directory?*

In [None]:
ls ~/scratch/NEON_runs/${neon_site}.transient/run

*Do you see any log or history files?*
- log files look like `lnd.log*`
- history files look like `${neon_site}.transient.clm2.h0.*.nc`

You can keep rerunning the cell above until you see log and history files. Once they appear, the model is running.
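
Instead of scanning the full listing each time, you could count the two kinds of files. The sketch below assumes the file name patterns given above; `run_progress` is a hypothetical helper, and the demo file names are made-up stand-ins created in a temporary directory:

```shell
# Hypothetical helper: count land log files and h0 history files
# for CASE_NAME inside RUN_DIR.
run_progress() (
    run_dir=$1
    case_name=$2
    n_log=0
    n_hist=0
    for f in "$run_dir"/lnd.log*; do
        [ -e "$f" ] && n_log=$((n_log + 1))
    done
    for f in "$run_dir/$case_name".clm2.h0.*.nc; do
        [ -e "$f" ] && n_hist=$((n_hist + 1))
    done
    echo "log files: $n_log, history files: $n_hist"
)

# Demo on a throwaway directory with empty stand-in files.
demo_run=$(mktemp -d)
touch "$demo_run/lnd.log.0001" "$demo_run/UNDE.transient.clm2.h0.2018-01.nc"
progress=$(run_progress "$demo_run" UNDE.transient)
echo "$progress"
rm -rf "$demo_run"
```

For a real case, the call would look like `run_progress ~/scratch/NEON_runs/${neon_site}.transient/run ${neon_site}.transient`.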

---

## 2.2 Archive directory
*What's in your archive directory?*

In [None]:
ls ~/scratch/NEON_runs/archive/${neon_site}.transient/lnd/hist | head -5

You can drop the `| head -5` part of the cell above if you want to see ALL the files that have been archived.

If you don't see any history files your simulation is likely still running or in the queue (check using squeue or qstat). Check again before you leave today to see if your simulation completed and if the files were transferred to archive. Even if your run isn't finished, you can move on in this tutorial.

If you'd like to visualize these results you can go back to tutorial Day0c and modify to point to these results from the neon_site you just ran.

***
# 3. Create a Clone
<h4> This step is optional, but provides helpful information </h4>
    
Creating and building a new case is slow.  

Cloning cases can speed this up!

<h2>3.1 Create a clone </h2>

- Clones use the same resolution, compset, and output root.
- Clones can also share the same executable (build), which can save time!

<div class="alert alert-block alert-warning" markdown="1">

<b>WARNING:</b> If you're making code modifications be careful using clones that share the same build. The example below is a good way to run the same model configuration at different NEON sites.

</div>



In [None]:
# Move back to your source code directory
cd ~/CTSM/cime/scripts

# Change the 4-character new NEON site you want to run.
export neon_site2="TEAK"

# then create a cloned case.
./create_clone --case ~/NEON_cases/${neon_site2}.transient --clone ~/NEON_cases/${neon_site}.transient --user-mods-dirs ~/CTSM/cime_config/usermods_dirs/NEON/${neon_site2} --keepexe

As before, the code above doesn't always render properly online.  Here's what we're doing:

`./create_clone --case ~/NEON_cases/${neon_site2}.transient --clone ~/NEON_cases/${neon_site}.transient --user-mods-dirs ~/CTSM/cime_config/usermods_dirs/NEON/${neon_site2} --keepexe`
***
### **./create_clone**

<div class="alert alert-block alert-info">

<b>NOTE:</b> There is a lot of information that goes into creating a clone.

You can learn more about the options by typing <i>./create_clone --help</i> on the command line or in a new code cell.

<b>We'll briefly go over some of the highlights here.</b>

</div>

---

### Required arguments to create a clone
There are 2 required arguments needed to create a clone.  These include: 
1. `--case`, this is the path and name of the new case you're creating.
<br><br>
2. `--clone`, this is the path and name of the existing case you want to clone.
<br><br>  
### Additional information we provided here were: 
3. `--user-mods-dirs` as before, this sets up the configuration of the case with a user modification directory that defines the location of the new site we're running, but it requires the full path to the `user-mods-dir` you want to use.
<br><br>  
4. `--keepexe` Point to original build from the case being cloned.
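
If you want clones for several sites, the same command can be generated in a loop. The sketch below only prints the commands (a dry run) so you can inspect them before running anything; `print_clone_cmds` is a hypothetical helper, and the paths follow the ones used in this tutorial:

```shell
# Hypothetical helper: print (not run) one create_clone command per
# new site, cloning from the case that belongs to BASE_SITE.
print_clone_cmds() (
    base_site=$1
    shift
    for site in "$@"; do
        echo "./create_clone --case ~/NEON_cases/${site}.transient --clone ~/NEON_cases/${base_site}.transient --user-mods-dirs ~/CTSM/cime_config/usermods_dirs/NEON/${site} --keepexe"
    done
)

# First argument is the existing case's site, the rest are new sites.
print_clone_cmds UNDE TEAK ONAQ
```

Each cloned case still needs its own restart file and `finidat` entry before you submit it.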

***
<h2> 3.2 Modify namelists </h2>

*Because you created a clone and point to an existing build we can skip the `case.setup` and `case.build` steps* 
***

### 3.2.1 Move to your case directory 

In [None]:
cd ~/NEON_cases/${neon_site2}.transient

### 3.2.2 Make namelist changes
We'll use the same series of commands as before to get the restart file into our run directory

In [None]:
# "$( ... )" substitutes the URL printed by the pipeline into the wget command
wget -P ~/scratch/NEON_runs/${neon_site2}.transient/run/ \
    "$(awk -F "," '{print $1}' ~/NEON_cases/listing.csv | grep lnd/ctsm/initdata/${neon_site2} | tail -1)"
   

Print the file name for the restart file

In [None]:
ls ~/scratch/NEON_runs/${neon_site2}.transient/run/${neon_site2}*.clm2.r.*

And now paste it into `user_nl_clm`  

**REMEMBER** you need to replace XXX with the results of the cell above

In [None]:
echo "finidat = 'XXX' " >> ~/NEON_cases/${neon_site2}.transient/user_nl_clm

***
<h2> 3.3 Submit the case </h2>

*Because you created a clone and point to an existing build we can skip the `case.setup` and `case.build` steps* 
***

- As before, you'll be downloading all the meteorological data, which takes a little time and prints lots of information to the screen
***


In [None]:
./case.submit

<div class="alert alert-success">
<strong>Congratulations!</strong> 
    
You've created and run a clone of CLM for the NEON tower you selected.
</div>


You can track progress on your run using steps outlined in part #2 of this tutorial.
You can also work on visualizing your results for different sites using code provided in `Day_0c` materials.




If you're developing this tutorial:
Before saving and pushing this code to github go to `Kernel` and `Restart kernel and clear all outputs...`