# CTSM Simulations at NEON Tower Sites -- Tutorial 

This tutorial is an introduction to running CLM at NEON tower sites. It guides you through running a simulation and provides example visualizations of the simulation results.

***
**The tutorial uses a Jupyter Notebook.** A Jupyter Notebook is an interactive computing environment that enables the creation and sharing of documents that contain discrete cells of text or documentation and executable code, including plots. It allows users to access, run, and edit the code in a browser. To interact with this notebook:

- Execute or "run" cells of executable code (cells denoted with '[ ]:') using the play button in the menu at the top (see below image)

- The results of running code, such as plots, will appear below that cell

- Each step must be followed in order, otherwise you may encounter errors

![run cell](https://problemsolvingwithpython.com/02-Jupyter-Notebooks/images/run_cell.png)

For more information on Jupyter notebooks please see their [user guide](https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/what_is_jupyter.html). 

***

## In this tutorial

The tutorial has several components. Below you will find steps to: 
1. Set up and run a simulation
2. Explore CTSM model data
3. Explore NEON observation data
4. Compare CTSM and NEON latent heat flux data

Note that step 1 is required to run the simulation. If you would like to explore the available data, please continue with steps 2-4.

****
<h2> 1. Set up and run a simulation</h2>

*This step guides you through running a CTSM model simulation for a NEON flux tower using meteorology observations from the tower. Here, the many steps to run CTSM are condensed into a single function. If you would like a more advanced tutorial to change the model configuration or code, please see [tutorial not yet available] for additional instructions.* <br>

***
<h4>1.1 Select a NEON tower site to simulate.</h4>

NEON towers available for simulation include:  

>ABBY, BART, BLAN, CPER, DCFS, DSNY, GRSM, HARV, JERC, JORN, KONZ, MOAB, NOGP, OAES, ORNL, OSBS, SCBI, SERC, SOAP, SRER, STEI, TALL, TREE, UKFS, UNDE, WOOD.  

The [NEON website](https://www.neonscience.org/field-sites/explore-field-sites) describes tower sites in more detail.

Before running the below code, keep in mind:
- The executable code below selects the **ABBY** site for simulation. If you would like to simulate a different tower, select the NEON tower from the above list and change the 4-character site name inside the quotes below.
- The site must be defined correctly so you do not get an error in the next step.
    - use all capital letters 
    - double-check that the letters match one of the above site names

*Execute the cell below by clicking on the play button in the menu bar above.*

In [2]:
#Change the 4-character NEON site below.
import os
os.environ['site'] = 'ABBY'
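
Because a mistyped site name only fails in the next step, it can help to check the name first. The sketch below is an illustration only and is not part of the tutorial's tooling: `AVAILABLE_SITES` copies the list of towers above, and `check_site` is a hypothetical helper name.

```python
# Site codes copied from the list of available NEON towers above.
AVAILABLE_SITES = (
    "ABBY BART BLAN CPER DCFS DSNY GRSM HARV JERC JORN KONZ MOAB NOGP "
    "OAES ORNL OSBS SCBI SERC SOAP SRER STEI TALL TREE UKFS UNDE WOOD"
).split()


def check_site(site):
    """Return the site code if it is valid, otherwise raise ValueError.

    Hypothetical helper, written only for this illustration.
    """
    if site != site.upper():
        raise ValueError(f"Use all capital letters: {site!r}")
    if site not in AVAILABLE_SITES:
        raise ValueError(f"{site!r} is not in the list of available sites")
    return site


print(check_site("ABBY"))
```

If the check passes, the same string can be assigned to `os.environ['site']` as in the cell above.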

***
<h4>1.2 Create and run a NEON tower site simulation.</h4>  

The executable code below runs a python script, run_neon.py. This script

* creates (`create_newcase`)
* sets up (`case.setup`)
* compiles (`case.build`)
* runs (`case.submit`)
    
a simulation for the tower site you specified above for all years that tower meteorology data are available.

*Run the cell below. Note that this step downloads the input data (~7GB) required for the model simulation. It may take several minutes to run, so please be patient. Many things will print below the cell as data are downloaded and the model compiles.* 

In [2]:
run_neon --neon-sites $site --output-root ~/CLM-NEON

Download finished successfully for listing.csv .
---- building a base case -------
base_case_name : ABBY
user_mods_dir  : /opt/ncar/cesm2/cime_config/usermods_dirs/NEON/ABBY
---- creating a base case -------
Compset longname is 2000_DATM%1PT_CLM51%BGC_SICE_SOCN_SROF_SGLC_SWAV
Compset specification file is /opt/ncar/cesm2/cime_config/config_compsets.xml
Automatically adding SESP to compset
Compset forcing is 1972-2004
ATM component is  Data driven ATM single point tower site data set 
LND component is clm5.1:BGC (vert. resol. CN and methane):
ICE component is Stub ice component
OCN component is Stub ocn component
ROF component is Stub river component
GLC component is Stub glacier (land ice) component
WAV component is Stub wave component
ESP component is Stub external system processing (ESP) component
Pes     specification file is /opt/ncar/cesm2/cime_config/config_pes.xml
Compset specific settings: name is RUN_STARTDATE and value is 2000-01-01
Could not find machine match for '518fb6c9b

<br><br>You might see lines print that say 'ERROR 404: Not Found' and this is ok. You will also likely notice that many files are moved after the simulation completes. The model is done running when you see: 

>st_archive completed <br>
>Submitted job case.run with id None <br>
>Submitted job case.st_archive with id None


If you see this message, great! You've created and run CTSM for the NEON tower you selected, and we'll look at simulation data next.

Note that you can also run the above line of code in your terminal shell but you will need to point to the location of the file:
`/opt/ncar/cesm2/tools/site_and_regional/run_neon.py --neon-sites <site> --output-root <path>`
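
If you prefer to launch the script from Python rather than a shell, the command can be assembled as a list of arguments. This is a minimal sketch: `build_run_neon_command` is a hypothetical helper, and it only uses the script path and the two flags shown above. It prints the command instead of running it.

```python
import os
import shlex


def build_run_neon_command(site, output_root,
                           script="/opt/ncar/cesm2/tools/site_and_regional/run_neon.py"):
    """Assemble the run_neon.py command line as a list (hypothetical helper)."""
    return [script,
            "--neon-sites", site,
            "--output-root", os.path.expanduser(output_root)]


cmd = build_run_neon_command("ABBY", "~/CLM-NEON")
print(shlex.join(cmd))

# To actually start the simulation you could pass the list to subprocess:
#   import subprocess
#   subprocess.run(cmd, check=True)
```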

More detail about setting up and modifying simulations will be available in an upcoming tutorial. Additional information about CLM is available [on the website](https://www.cesm.ucar.edu/models/cesm2/land/), including [technical documentation](https://escomp.github.io/ctsm-docs/versions/release-clm5.0/html/tech_note/index.html), a [user's guide](https://escomp.github.io/ctsm-docs/versions/release-clm5.0/html/users_guide/index.html), and a [quickstart guide](https://escomp.github.io/CESM/release-cesm2/quickstart.html#create-a-case) to running other model configurations.
<br><br>
***

<h4> Optional </h4>

If you would like to see additional and more advanced options for running NEON tower site simulations, many are listed in the python script you just executed using a `--help` option. <p>

*Executing the below cell is optional and will print the options available in the `run_neon` python script.*

In [6]:
run_neon --help

usage: run_neon.py [-h] [-d] [-v] [-s]
                   [--neon-sites {ABBY,BARR,BART,BLAN,BONA,CLBJ,CPER,DCFS,DEJU,DELA,DSNY,GRSM,GUAN,HARV,HEAL,JERC,JORN,KONA,KONZ,LAJA,LENO,MLBS,MOAB,NIWO,NOGP,OAES,ONAQ,ORNL,OSBS,PUUM,RMNP,SCBI,SERC,SJER,SOAP,SRER,STEI,STER,TALL,TEAK,TOOL,TREE,UKFS,UNDE,WOOD,WREF,YELL,all} [{ABBY,BARR,BART,BLAN,BONA,CLBJ,CPER,DCFS,DEJU,DELA,DSNY,GRSM,GUAN,HARV,HEAL,JERC,JORN,KONA,KONZ,LAJA,LENO,MLBS,MOAB,NIWO,NOGP,OAES,ONAQ,ORNL,OSBS,PUUM,RMNP,SCBI,SERC,SJER,SOAP,SRER,STEI,STER,TALL,TEAK,TOOL,TREE,UKFS,UNDE,WOOD,WREF,YELL,all} ...]]
                   [--base-case BASE_CASE_ROOT] [--output-root OUTPUT_ROOT]
                   [--overwrite] [--setup-only] [--rerun] [--no-batch]
                   [--run-type {ad,postad,transient,sasu}]
                   [--run-length RUN_LENGTH] [--start-date START_DATE]
                   [--end-date END_DATE] [--run-from-postad]

|------------------------------------------------------------------|
|---------------------  Instruc

________
<h2> 2. Explore CLM model data </h2>

Now that your simulation has completed, we can explore the data.
There are countless ways of analyzing and processing model data. This tutorial uses matplotlib. 
***

<h4> 2.1 Locate model data </h4>
When the simulation completes, the data are transferred to the `archive` directory. There are files that include data for every day of the simulation, as well as files that average model variables monthly. <p>

*Run the code below to see a subset of the files listed.*

In [None]:
!ls ~/archive/$site.transient/lnd/hist/*2018*.nc |head -20

The list includes the file path and file names.

The file names are automatically generated and include
* the simulation name, which includes
    * the NEON site
    * the type of simulation (here the simulation is "transient", meaning the model ran for the full length of available data)
* the date of simulated data
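
The parts of a file name can be pulled apart with a regular expression. The sketch below is an illustration under stated assumptions: the `clm2.h1` portion follows the pattern used later in this tutorial, but the exact layout of the date stamp (`YYYY-MM-DD-SSSSS`) and the example file name are assumptions, so compare them with the output of the `ls` command above. `parse_history_name` is a hypothetical helper.

```python
import re

# Assumed layout: <site>.<simulation type>.clm2.<h0|h1>.<date stamp>.nc
PATTERN = re.compile(
    r"^(?P<site>[A-Z]{4})\.(?P<sim_type>\w+)\.clm2\."
    r"(?P<stream>h\d)\.(?P<date>[\d-]+)\.nc$"
)


def parse_history_name(filename):
    """Split a history file name into its parts (hypothetical helper)."""
    match = PATTERN.match(filename)
    if match is None:
        raise ValueError(f"Unexpected file name: {filename!r}")
    return match.groupdict()


# Example file name written by hand for illustration.
info = parse_history_name("ABBY.transient.clm2.h1.2018-01-01-00000.nc")
print(info)
```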

The files are saved in netcdf format (denoted with the `.nc` file extension), a file format commonly used for storing large, multi-dimensional scientific variables.
Netcdf files are platform independent and self-describing; each file includes metadata that describes the data, including: **variables**, **dimensions**, and **attributes**.


The figure below provides an example of the data structure in a netcdf file. This dataset has two three-dimensional variables (temperature and precipitation) and three dimensions (x, y, and time), each with a one-dimensional coordinate. It also includes two-dimensional coordinates (latitude and longitude).

![Netcdf](https://gdfa.ugr.es/python/climate_data/img/xarray2.png)


The NEON tower simulations generate two types of files:
* `*h0*`: Variables that are averaged monthly. One file is available for every month of the simulation.
* `*h1*`: Variables that are recorded every 30 minutes. Values are aggregated into one file for each day of the simulation. Each file includes 48 data points.
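
The count of 48 data points follows from the 30-minute interval: 24 hours x 2 values per hour. A short sketch of that arithmetic, using an illustrative date (whether the first time stamp in a real file falls at the start or the end of the first interval is not assumed here):

```python
from datetime import datetime, timedelta

step = timedelta(minutes=30)
steps_per_day = int(timedelta(days=1) / step)

# Illustrative day only; the time stamps in the actual files may be offset.
start = datetime(2018, 1, 1)
stamps = [start + i * step for i in range(steps_per_day)]

print(steps_per_day)
print(stamps[0], "to", stamps[-1])
```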
****


<h4>2.2 Preprocess and Explore Model Simulation Data </h4>

This step has several components, which are broken into smaller steps.


<h5>2.2.1 Load Python Libraries</h5>

*Run the below code to import the required python libraries*

In [None]:
# Import libraries
%matplotlib inline

import os
import sys
import time
import argparse
import requests

import numpy as np
import pandas as pd
import xarray as xr


from glob import glob
from os.path import join, expanduser

import matplotlib
import matplotlib.pyplot as plt

from scipy import stats


<h5>2.2.2 Select Simulation Year</h5>

For simplicity, we focus on analyzing and evaluating a single year of data. <p>

The code below uses data for **2018**, but data are available through the current year. You can select a different year by changing the year in the quotes below. <p> 
*Run the below code to identify the year of interest for making the plots. Note that the same year will be used to download observational data below.*

In [None]:
#Specify the year below
year = "2018"

#Reading in the tower site selected above as a python variable
neon_site = os.environ['site']

<h5>2.2.3 Load model data files</h5>

For reading the netCDF files, we are using the `xarray` Python package. You can learn more about [xarray on this website](http://xarray.pydata.org/en/stable/).

>The variable `sim_path` points to the local directory where simulation data are stored. *If you changed the location in the above code, please update this path.*

*Run the code below to load the model data files. Note that the code will print how many files were loaded*

In [None]:
sim_path = "/home/user/archive/"+neon_site+".transient/lnd/hist/"
sim_files = sorted(glob(join(sim_path,neon_site+".transient.clm2.h1."+year+"*.nc")))

print("All Simulation files: [", len(sim_files), "files]")
#print(*sim_files,sep='\n')

<h5> 2.2.4 Open model data files </h5>

Here we use the python function `xarray.open_mfdataset`, which opens multiple netcdf files as a single xarray dataset. For more information on this function, check [this link](http://xarray.pydata.org/en/stable/generated/xarray.open_mfdataset.html).

*Run the below cell to read in the data files. Note that this step might take a few minutes.*

In [None]:
start = time.time()

ds_ctsm = xr.open_mfdataset(sim_files, decode_times=True, combine='by_coords',parallel=True)

end = time.time()
print("Reading all simulation files took:", end-start, "s.")


<h5> 2.2.5 Optional Step: Explore simulation data </h5>

*Run the below cell to find more information about the data*

In [None]:
ds_ctsm

Here you can click on `Data Variables`, `Attributes`, `Dimensions`, and `Coordinates` to expand and see the details. 

If you click on `Data Variables`, you will see a list of all the available variables. You can click on each variable to see a description of the variable (the `long_name`) and its `units`, as well as other information.

**Questions to consider** 
1. What are the available variables in the dataset?
2. What are the `long_name` and `units` of the variable `FSH`? 
3. Can you find the dimensions of this variable? 

**Tip**: Xarray has built-in plotting functions. For quick inspection of a variable, we can use `.plot()` to see it. Below, let's make a quick plot of the Gross Primary Production (`GPP`) variable:

In [None]:
ds_ctsm['GPP'].plot()

In [None]:
# Variables we want from these files (a subset of everything listed above)
plot_vars = ['TBOT','FSH','GPP','AR','HR','FCEV', 'FCTR', 'FGEV','FSA','FIRA']

In [None]:
df_ctsm = pd.DataFrame({'time':ds_ctsm.time})

for var in plot_vars:
    field = np.ravel ( ds_ctsm[var])     
    df_ctsm[var]=field


________
<h2> 3. Explore NEON Tower Observation Data </h2>

<h4> 3.1 Load NEON data </h4>
    
The following code will download the NEON observational data files for the site and year specified above. 

*Run the two cells of code below. You will see what files are available.*

In [None]:
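eval_dir = "/home/user/evaluation_files/"

# NOTE: download_eval_files is not defined in this notebook as shown;
# it must be defined or imported before this cell is run.
download_eval_files(neon_site, eval_dir)

eval_path = os.path.join(eval_dir, neon_site)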

eval_files = sorted(glob(join(eval_path,neon_site+"_eval_"+year+"*.nc")))

print("All Observation files:")
print(*eval_files,sep='\n')

In [None]:
start = time.time()

ds_eval = xr.open_mfdataset(eval_files, decode_times=True, combine='by_coords')

end = time.time()
print("Reading all observation files took:", end-start, "s.")

<h4> 3.2 Optional Step: Explore NEON data </h4>

Similar to above, you can explore NEON data by clicking to see details about the variables, coordinates, and attributes.

In [None]:
ds_eval

Let's quickly inspect the Net Radiation (`Rnet`) variable by making a quick plot:

In [None]:
ds_eval['Rnet'].plot()

In [None]:
# Variables we want from these files (a subset of everything listed above)
eval_vars = ['NEE','FSH','EFLX_LH_TOT','GPP']

__________
## 4. Compare CTSM and NEON latent heat flux data

So far we have read files of observational and model data. In this section we will compare observed and simulated **latent heat fluxes**. You can also explore other available variables with this code.


First we need to process the data into a common format for easier analysis. 

*Run the below cell of code*

In [None]:
#Convert to a Pandas Dataframe for easier handling:
df_eval = pd.DataFrame({'time':ds_eval.time})

for var in eval_vars:
    field = np.ravel ( ds_eval[var])     
    df_eval[var]=field

#-- extract year, month, day, hour information from time
df_eval['year'] = df_eval['time'].dt.year
df_eval['month'] = df_eval['time'].dt.month
df_eval['day'] = df_eval['time'].dt.day
df_eval['hour'] = df_eval['time'].dt.hour

#-- make df_all that includes both obs and sim
#-- (copy so that df_eval itself is left unchanged)
df_all = df_eval.copy()

plot_vars = ['TBOT','FSH','GPP','AR','HR','FCEV', 'FCTR', 'FGEV','FSA','FIRA']

#-- add simulation data to df_all:
for var in plot_vars:
    sim_var_name = "sim_"+var
    #-- shift simulation data by one time step
    df_all[sim_var_name]=df_ctsm[var].shift(-1).values
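
To see what `.shift(-1)` does to a column, here is a minimal sketch on made-up numbers (the values are hypothetical and only the pandas behaviour is being shown). Every value moves one row earlier, and the last row is left as a missing value:

```python
import pandas as pd

# Hypothetical values, one per model time step.
sim = pd.Series([10.0, 11.0, 12.0, 13.0], name="FSH")
shifted = sim.shift(-1)

print(pd.DataFrame({"original": sim, "shifted": shifted}))
```

Because the last row becomes `NaN`, the final time step of the year has no simulated value after the shift.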

## What is latent heat flux?

Latent heat flux is the energy for water evaporation from the ecosystem. It is measured by NEON towers and is also simulated by CLM. Here we are going to explore how well CLM simulates the observed tower data.

Latent heat flux is a combination of plant transpiration, evaporation from leaf surfaces (e.g., from dew, after precipitation events), and evaporation from the soil. 

NEON towers cannot distinguish how much each of these processes contributes to latent heat flux, but CLM simulations can help us to disentangle this. 

Not all the observation variables are in the CLM history output files. But we can calculate them from their components. 

For example, we can calculate the latent heat flux as the sum of plant transpiration, canopy evaporation, and ground evaporation. 

$$ \text{Latent Heat Flux} = FCEV + FCTR + FGEV $$

$FCEV$ : Canopy evaporation in W/m^2

$FCTR$ : Canopy transpiration in W/m^2

$FGEV$ : Ground evaporation in W/m^2


In [None]:
clm_var = 'sim_EFLX_LH_TOT'

#EFLX_LH_TOT = FCEV + FCTR +FGEV
df_all [clm_var] = df_ctsm['FCEV'].shift(-1).values \
                 + df_ctsm['FCTR'].shift(-1).values \
                 + df_ctsm['FGEV'].shift(-1).values
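
One property of summing columns with `+` is worth knowing: if any component is missing at a time step, the total for that time step is missing too. A minimal sketch with hypothetical values in W/m^2:

```python
import pandas as pd

# Hypothetical component fluxes; the last FCEV value is deliberately missing.
df = pd.DataFrame({
    "FCEV": [5.0, 2.0, float("nan")],
    "FCTR": [20.0, 30.0, 10.0],
    "FGEV": [3.0, 4.0, 1.0],
})

df["EFLX_LH_TOT"] = df["FCEV"] + df["FCTR"] + df["FGEV"]
print(df)
```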

# Simple Visualization

Let's create a time-series plot that compares daily average latent heat flux from observations (NEON) and simulations. 

But first, we should calculate the daily average. 

In [None]:
df_daily = df_all.groupby(['year','month','day']).mean().reset_index()
df_daily['time']=pd.to_datetime(df_daily[["year", "month", "day"]])
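
The daily averaging step can be tried on synthetic data first. This is a minimal sketch assuming two days of made-up half-hourly values; the column name `flux` and all numbers are hypothetical.

```python
import pandas as pd

# Two days of half-hourly time stamps (48 per day).
times = pd.date_range("2018-01-01", periods=96, freq="30min")
df = pd.DataFrame({"time": times, "flux": [10.0] * 48 + [20.0] * 48})

# Same pattern as above: split the time stamp, group, then average.
df["year"] = df["time"].dt.year
df["month"] = df["time"].dt.month
df["day"] = df["time"].dt.day

daily = df.groupby(["year", "month", "day"])["flux"].mean().reset_index()
daily["time"] = pd.to_datetime(daily[["year", "month", "day"]])
print(daily)
```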

In [None]:
plot_var = 'EFLX_LH_TOT'
sim_var = 'sim_'+plot_var

plot_var_desc = "Latent Heat Flux"
plot_var_unit= "Wm-2"

Now that we have calculated the daily average, we can easily create the plots using the Python matplotlib package:

In [None]:
plt.figure(num=None, figsize=(13, 5),  facecolor='w', edgecolor='k')
        
ax = plt.gca()
df_daily.plot ( x= 'time', y = plot_var , marker = 'o' ,ax =ax , color = 'b',label="NEON")
df_daily.plot ( x= 'time', y = 'sim_'+plot_var , marker = 'o' ,ax =ax , color = 'r',label="CLM")

plt.xlabel('Time', fontsize=17)
plt.ylabel(plot_var_desc+" ["+plot_var_unit+"]", fontsize=17)
plt.title(year+" "+neon_site, fontweight='bold',fontsize=17)
# Save the figure before showing it; saving after plt.show() can write an empty image
plt.savefig(year+"_"+neon_site+"_TSeries_Daily_Ave_V2.png")
plt.show()

Now, let's add the daily standard deviation as a shaded area on top:

In [None]:
df_daily_std = df_all.groupby(['year','month','day']).std().reset_index()
df_daily_std['time'] = pd.to_datetime(df_daily_std[["year", "month", "day"]])

plt.figure(num=None, figsize=(13, 5),  facecolor='w', edgecolor='k')

plt.plot ( df_daily.time, df_daily[plot_var], marker = 'o' , color = 'b',label="NEON")
plt.plot ( df_daily.time, df_daily[sim_var], marker = 'o' , color = 'r',label="CLM")

plt.fill_between(df_daily.time, df_daily.EFLX_LH_TOT-df_daily_std.EFLX_LH_TOT, df_daily.EFLX_LH_TOT+df_daily_std.EFLX_LH_TOT ,alpha=0.1, color = 'b')
plt.fill_between(df_daily.time, df_daily.sim_EFLX_LH_TOT-df_daily_std.sim_EFLX_LH_TOT, df_daily.sim_EFLX_LH_TOT+df_daily_std.sim_EFLX_LH_TOT ,alpha=0.1, color = 'r')

plt.legend()
plt.xlabel('Time', fontsize=17)
plt.ylabel(plot_var_desc+" ["+plot_var_unit+"]", fontsize=17)
plt.title(year+" "+neon_site, fontweight='bold',fontsize=17)
# Save the figure before showing it; saving after plt.show() can write an empty image
plt.savefig(year+"_"+neon_site+"_TSeriesShaded_Daily_Ave_V2.png")
plt.show()

We can see that there are areas where CLM under-predicts or over-predicts the NEON observation data.

**Question) At your site of interest, when does CLM show the best and worst performance in predicting Latent Heat Flux?**

**Question) When is the Latent Heat Flux highest at this site? When is it lowest?**

**Question) Does CLM match NEON tower observations?**


*****

Now we can take a look at the components of Latent Heat Flux. As we mentioned above, the components of the fluxes are not available in the NEON dataset. 

But we can find the components from the CLM simulation data. Therefore, we can find out how the different components of Latent Heat Flux contribute to the total over time. 

For this, let's look at the monthly average data:

In [None]:
df_monthly = df_all.groupby(['year','month']).mean().reset_index()
df_monthly["day"]=15
df_monthly['time']=pd.to_datetime(df_monthly[["year", "month","day"]])


Now let's make a barplot showing components of Latent Heat Flux over different months:


In [None]:
def line_format(label):
    """
    Helper function to convert time label to the format of pandas line plot
    """
    month = label.month_name()[:3]
    if month == 'Jan':
        month += f'\n{label.year}'
    return month


plt.figure(num=None, figsize=(13, 5),  facecolor='w', edgecolor='k')
        
ax = plt.gca()

df_monthly.plot ( x= 'time', y = 'EFLX_LH_TOT' , marker = 'o' ,ax =ax , color = 'black',label="NEON Latent Heat Flux",use_index=False)
df_monthly[['time','sim_FCEV','sim_FCTR','sim_FGEV']].plot.bar ( x= 'time',stacked=True,ax=ax,rot=0)

ax.set_xticklabels(map(line_format, df_monthly.time))

plt.xlabel('Time', fontsize=18)
plt.ylabel('Latent Heat Flux', fontsize=16)
        
plt.title(year+" "+neon_site, fontweight='bold')
plt.show()

Here we are showing the observation (NEON) Latent Heat Flux over the barplot.

**Question) Do trends in Canopy Transpiration (FCTR) mirror the trends in latent heat fluxes? What times of year are trends most similar?**

**Question) For your NEON site, when is Canopy Transpiration (FCTR) the highest contributor to the total CLM latent heat flux?**

**Question) Are the times when Canopy Transpiration (FCTR) mirrors latent heat flux also times when there are biases in CLM simulations of latent heat flux?**

**Question) During which months is Canopy Evaporation (FCEV) the lowest contributor to the total CLM latent heat flux?**

**Question) For each month, what does the total bar length show in this plot?**


To check how the contribution of each component of Latent Heat Flux changes over different months, let's look at a more advanced plot with two y axes. 

In [None]:
df_monthly ['sim_FCEV_perc']= df_monthly ['sim_FCEV']/(df_monthly ['sim_FCEV']+df_monthly ['sim_FCTR']+df_monthly ['sim_FGEV'])*100
df_monthly ['sim_FCTR_perc']= df_monthly ['sim_FCTR']/(df_monthly ['sim_FCEV']+df_monthly ['sim_FCTR']+df_monthly ['sim_FGEV'])*100
df_monthly ['sim_FGEV_perc']= df_monthly ['sim_FGEV']/(df_monthly ['sim_FCEV']+df_monthly ['sim_FCTR']+df_monthly ['sim_FGEV'])*100

plt.figure(num=None, figsize=(13, 7),  facecolor='w', edgecolor='k')
        
ax = plt.gca()

#df_monthly.plot ( x= 'time', y = 'EFLX_LH_TOT' , marker = 'o' ,ax =ax , color = 'black',label="NEON EFLX_LH_TOT",use_index=False)
df_monthly[['time','sim_FCEV_perc','sim_FCTR_perc','sim_FGEV_perc']].plot.bar ( x= 'time',stacked=True,ax=ax,rot=0)

ax.set_xticklabels(map(line_format, df_monthly.time))
ax.set_ylabel('% Components Latent Heat Flux', fontsize=16)
ax.legend(bbox_to_anchor=(1.04,0), loc="lower left",labels=["% FCEV","% FCTR","% FGEV"])
ax2 = ax.twinx()
df_monthly.plot ( x= 'time', y = 'sim_EFLX_LH_TOT' , marker = 'o' ,ax =ax2 , color = 'b',label="CLM Latent Heat",use_index=False)
df_monthly.plot ( x= 'time', y = 'EFLX_LH_TOT' , marker = 'o' ,ax =ax2 , color = 'r',label="NEON Latent Heat",use_index=False)

plt.ylabel('Latent Heat Flux [W/m2]', fontsize=16)
plt.legend(bbox_to_anchor=(1.04,1), loc="upper left")

plt.title(year+" "+neon_site, fontweight='bold')
plt.show()
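
The percentage calculation used in the cell above can be checked by hand on a single month. A minimal sketch with hypothetical monthly mean fluxes in W/m^2:

```python
# Hypothetical monthly mean component fluxes in W/m^2.
components = {"FCEV": 25.0, "FCTR": 50.0, "FGEV": 25.0}

total = sum(components.values())
percent = {name: value / total * 100 for name, value in components.items()}

for name, value in percent.items():
    print(f"{name}: {value:.1f}%")
```

The three percentages always add up to 100. Keep in mind that when the total is close to zero, or when a component is negative, the percentages become hard to interpret.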

______________

To better quantify the performance of the CLM simulation and compare it with NEON observations, we can also use scatter plots.

First, let's define a generic function that plots a scatter diagram and adds a regression line on top of it. 

In [None]:
def detailed_scatter (x, y, color):
    """Scatter plot of y against x with a fitted linear regression line."""
    plt.scatter (x,y, marker="o",color = color)
    slope, intercept, r_value, p_value, std_err = stats.linregress(x,y)
    line = slope*x+intercept
    #-- r_value is the correlation coefficient, so R2 is its square
    plt.plot(x,line,'black', label='y={:.2f}x+{:.2f}'.format(slope,intercept)+" (R2="+"{:.2f}".format(r_value**2)+")")
    plt.legend(fontsize=13)
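
In a simple linear regression, the coefficient of determination R2 is the square of the correlation coefficient r (the `r_value` returned by `stats.linregress`). The sketch below illustrates this relationship with NumPy on hypothetical data; it uses `np.polyfit` and `np.corrcoef` in place of `linregress` only to keep the example small.

```python
import numpy as np

# Hypothetical data, roughly linear.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares straight line and correlation coefficient.
slope, intercept = np.polyfit(x, y, 1)
r = np.corrcoef(x, y)[0, 1]
r_squared = r ** 2

# R2 computed from the residuals, for comparison.
residuals = y - (slope * x + intercept)
r_squared_check = 1 - residuals.var() / y.var()

print(f"r = {r:.4f}, R2 = {r_squared:.4f}, R2 from residuals = {r_squared_check:.4f}")
```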

In [None]:
plt.figure(num=None, figsize=(15, 5),  facecolor='w', edgecolor='k')

plt.subplot(1, 3, 1)
detailed_scatter (df_daily.sim_FCEV, df_daily.sim_EFLX_LH_TOT, '#1f77b4')
plt.ylabel('Latent Heat [Wm-2]')
plt.xlabel('Canopy Evaporation [Wm-2]')

plt.subplot(1, 3, 2)
detailed_scatter (df_daily.sim_FCTR, df_daily.sim_EFLX_LH_TOT, '#ff7f0e')

plt.ylabel('Latent Heat [Wm-2]')
plt.xlabel('Canopy Transpiration [Wm-2]')

plt.subplot(1, 3, 3)
detailed_scatter (df_daily.sim_FGEV, df_daily.sim_EFLX_LH_TOT,'#2ca02c')

plt.ylabel('Latent Heat [Wm-2]')
plt.xlabel('Ground Evaporation [Wm-2]')

plt.suptitle(year+" "+neon_site+" Scatter Plots", fontweight='bold')

plt.tight_layout()
plt.show()

Similarly, we can create these plots for different seasons:

In [None]:
df_daily['season'] = ((df_daily['month']%12+3)//3).map({1:'DJF', 2: 'MAM', 3:'JJA', 4:'SON'})
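
The expression `(month % 12 + 3) // 3` maps December, January, and February to 1, March through May to 2, and so on. A short sketch that prints the result for every month makes this easy to verify:

```python
season_names = {1: 'DJF', 2: 'MAM', 3: 'JJA', 4: 'SON'}

# Same formula as in the cell above, applied to each month number.
month_to_season = {
    month: season_names[(month % 12 + 3) // 3] for month in range(1, 13)
}

for month, season in month_to_season.items():
    print(month, season)
```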


In [None]:

fig = plt.figure(num=None, figsize=(18, 12), facecolor='w', edgecolor='k')

#-- columns are seasons; rows are the components of latent heat flux
seasons = ['MAM', 'JJA', 'SON', 'DJF']
components = [('sim_FCEV', 'Canopy Evaporation [Wm-2]', '#1f77b4'),
              ('sim_FCTR', 'Canopy Transpiration [Wm-2]', '#ff7f0e'),
              ('sim_FGEV', 'Ground Evaporation [Wm-2]', '#2ca02c')]

for col, season in enumerate(seasons):
    df_season = df_daily[df_daily['season'] == season]

    for row, (var, xlabel, color) in enumerate(components):
        #-- subplot numbers run left to right, then top to bottom
        plt.subplot(3, 4, row * 4 + col + 1)
        detailed_scatter(df_season[var], df_season.sim_EFLX_LH_TOT, color)
        if row == 0:
            plt.title(season, fontweight='bold')
        plt.ylabel('Latent Heat [Wm-2]')
        plt.xlabel(xlabel)

fig.suptitle(year+" "+neon_site+" Seasonal Scatter Plots", fontweight='bold')
fig.tight_layout()

plt.show()

--------
Now, let's take a look at how latent heat flux changes during the day. For this we look at the diurnal cycle of the flux. 

To do so, we need to calculate the average for each hour of the day across all days. 

In [None]:
diel_df_mean = df_all.groupby('hour').mean().reset_index()
diel_df_std = df_all.groupby('hour').std().reset_index()

diel_df_mean['EFLX_LH_TOT_bias'] = diel_df_mean['sim_EFLX_LH_TOT']- diel_df_mean['EFLX_LH_TOT']
print (diel_df_mean)
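
The hourly averaging and the bias calculation can be checked on a tiny example. A minimal sketch assuming two days with two hours each; all numbers are hypothetical:

```python
import pandas as pd

# Hypothetical observed and simulated fluxes for hours 0 and 12 on two days.
df = pd.DataFrame({
    "hour": [0, 12, 0, 12],
    "EFLX_LH_TOT": [5.0, 100.0, 7.0, 120.0],
    "sim_EFLX_LH_TOT": [6.0, 90.0, 10.0, 110.0],
})

# Average all values that share the same hour, then subtract obs from sim.
diel = df.groupby("hour").mean().reset_index()
diel["EFLX_LH_TOT_bias"] = diel["sim_EFLX_LH_TOT"] - diel["EFLX_LH_TOT"]
print(diel)
```

A positive bias means the simulation is higher than the observation for that hour, and a negative bias means it is lower.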

In [None]:
plt.figure(num=None, figsize=(13, 9),  facecolor='w', edgecolor='k')
        
plt.subplot(2, 1, 1)  
        
plt.plot ( diel_df_mean.hour, diel_df_mean['EFLX_LH_TOT'], marker = 'o' , color = 'b',label="NEON")
plt.plot ( diel_df_mean.hour, diel_df_mean['sim_EFLX_LH_TOT'], marker = 'o' , color = 'r',label="CLM")

plt.fill_between(diel_df_mean.hour, diel_df_mean.EFLX_LH_TOT-diel_df_std.EFLX_LH_TOT, diel_df_mean.EFLX_LH_TOT+diel_df_std.EFLX_LH_TOT ,alpha=0.1, color = 'b')
plt.fill_between(diel_df_mean.hour, diel_df_mean.sim_EFLX_LH_TOT-diel_df_std.sim_EFLX_LH_TOT, diel_df_mean.sim_EFLX_LH_TOT+diel_df_std.sim_EFLX_LH_TOT ,alpha=0.1, color = 'r')
plt.ylabel('Latent Heat Flux [Wm-2]', fontsize=16)

plt.title("Diurnal Cycle of Latent Heat Flux "+neon_site+" "+year, fontweight='bold')
plt.legend()
plt.subplot(4, 1, 3)  

ax = plt.gca()    
diel_df_mean.plot ( x= 'hour', y = 'EFLX_LH_TOT_bias' , marker = 'o' ,ax =ax , color = 'black',label='Bias')
plt.xlabel('Hour', fontsize=18)
plt.ylabel('Bias', fontsize=16)
plt.show()

In [None]:
diel_df_mean['sim_FCEV_perc']= diel_df_mean ['sim_FCEV']/(diel_df_mean ['sim_FCEV']+diel_df_mean ['sim_FCTR']+diel_df_mean ['sim_FGEV'])*100
diel_df_mean ['sim_FCTR_perc']= diel_df_mean ['sim_FCTR']/(diel_df_mean ['sim_FCEV']+diel_df_mean ['sim_FCTR']+diel_df_mean ['sim_FGEV'])*100
diel_df_mean ['sim_FGEV_perc']= diel_df_mean ['sim_FGEV']/(diel_df_mean ['sim_FCEV']+diel_df_mean ['sim_FCTR']+diel_df_mean ['sim_FGEV'])*100


In [None]:
from matplotlib import gridspec

fig= plt.figure(num=None, figsize=(13, 10),  facecolor='w', edgecolor='k')

gs = gridspec.GridSpec(3, 1, height_ratios=[3, 1, 2]) 

ax0 = plt.subplot(gs[0])
        
ax0.plot ( diel_df_mean.hour, diel_df_mean['EFLX_LH_TOT'], marker = 'o' , color = 'b',label="NEON")
ax0.plot ( diel_df_mean.hour, diel_df_mean['sim_EFLX_LH_TOT'], marker = 'o' , color = 'r',label="CLM")

ax0.fill_between(diel_df_mean.hour, diel_df_mean.EFLX_LH_TOT-diel_df_std.EFLX_LH_TOT, diel_df_mean.EFLX_LH_TOT+diel_df_std.EFLX_LH_TOT ,alpha=0.1, color = 'b')
ax0.fill_between(diel_df_mean.hour, diel_df_mean.sim_EFLX_LH_TOT-diel_df_std.sim_EFLX_LH_TOT, diel_df_mean.sim_EFLX_LH_TOT+diel_df_std.sim_EFLX_LH_TOT ,alpha=0.1, color = 'r')
plt.ylabel('Latent Heat Flux [Wm-2]', fontsize=16)
plt.legend()
plt.title("Diurnal Cycle of Latent Heat Flux "+neon_site+" "+year, fontweight='bold')

ax1 = plt.subplot(gs[1])
diel_df_mean.plot ( x= 'hour', y = 'EFLX_LH_TOT_bias' , marker = 'o' ,ax =ax1 , color = 'black',label='Bias')
plt.xlabel('', fontsize=18)
plt.ylabel('Bias', fontsize=16)

ax2 = plt.subplot(gs[2])
diel_df_mean[['hour','sim_FCEV','sim_FCTR','sim_FGEV']].plot.bar ( x= 'hour',stacked=True,ax=ax2,rot=0)
plt.ylabel('Latent Heat Components [Wm-2]', fontsize=16)
plt.xlabel('Hour', fontsize=18)

plt.tight_layout()

plt.show()

**Question: How do the different components of latent heat flux contribute to the total latent heat flux during the day?**

___________________
## Soil Profiles:

Now, let's take a look at the soil temperature (`TSOI`) profile over time.



In [None]:
ds_ctsm.TSOI.isel(levgrnd=(slice(0,9))).plot(x="time",yincrease=False, robust=True,cmap='YlOrRd',figsize=(15, 5))


<br><br>Congratulations - you've walked through the basic steps of running a NEON case and visualizing the results. You're all set for some of the more advanced tutorials now. 