![<CW3E Logo>](https://cw3e.ucsd.edu/images/cw3e_logo_files/wetransfer-b4ff74/CW3E%20Final%20Logo%20Suite/4-Horizontal-Acronym/Digital/PNG/CW3E-Logo-Horizontal-Acronym-FullColor.png "Center for Western Weather and Water Extremes Logo")

# Creating Configuration Files

---

## Overview
---

One way to make code more replicable and user-friendly is to use a universal configuration file. Environment variables and global parameters are defined in the configuration file rather than directly in the scripts. This removes the need for a user to find and edit a hard-coded environment variable or parameter in every script in a workflow. Hence, a configuration file makes a workflow easier to maintain and saves its users time.

This notebook will go over:
1. Generating a configuration file for Bash scripts
2. Generating a configuration file for Python scripts
3. Implementing these configuration files in their respective scripts

## Prerequisites
---

| **Concept** | **Importance** | **Notes** |
|:-----------:|:--------------:|:---------:|
| Bash           | Necessary              | Understand on an intermediate level         |
| Python           | Necessary              | Understand on an intermediate level         |

* **Time to Learn:** 1 hour est.

## Generating an Executable Bash Configuration File

---

Before creating a configuration file, make sure you are in the same working directory as your scripts. For this workflow, I will be using the [MET-tools scripts](https://github.com/CW3E/MET-tools/tree/main/Grid-Stat), which are in the following directory on my system.

In [1]:
cd

/home/jlconti


In [2]:
cd MET-tools/Grid-Stat/

/home/jlconti/MET-tools/Grid-Stat


In [3]:
ls *.sh

batch_gridstat.sh*         pre_processing_config.sh*  run_wrfout_cf.sh*
batch_gridstat_global.sh*  run_gridstat.sh*
batch_wrfout_cf.sh*        run_vxmask.sh*


Once in the desired directory, we can generate a shell configuration file by running the following command in the terminal:

```
vim <configuration_filename>.sh
```

This opens a new shell script with your chosen file name in the Vim editor. For the purposes of this notebook, the configuration file I made is named `pre_processing_config.sh`. Before using this file in our code, we will make it executable. (A file that is only loaded with `source` needs read permission rather than execute permission, but making it executable keeps it consistent with the other scripts in the directory.) First, add the following line at the very top of `pre_processing_config.sh`.

```
#!/bin/bash
```

For example, in my Bash configuration file for the MET-tools shell scripts, the top of the file looks like

In [4]:
%%bash
head -4 pre_processing_config.sh

#!/bin/bash
##################################################################################
# Description
##################################################################################


This line at the top of the script, known as a shebang, tells the operating system which interpreter to use when the file is run directly. Here, it instructs the operating system to run `pre_processing_config.sh` as a Bash script. Now we can save the file, exit the editor, and return to our working directory.

If we run the command `ls -F` in the terminal, we can see a classified list of everything in our working directory. Look at your new `pre_processing_config.sh` file; if there is a trailing `*`, it is an executable file. If there is no trailing `*`, then it is not an executable file, and we need to make it one by running the following line of code in the terminal within our working directory.

```
chmod +x pre_processing_config.sh
```

Verify that `pre_processing_config.sh` is now executable by running `ls -F` in the terminal again.

In [5]:
ls -F *.sh

batch_gridstat.sh*         pre_processing_config.sh*  run_wrfout_cf.sh*
batch_gridstat_global.sh*  run_gridstat.sh*
batch_wrfout_cf.sh*        run_vxmask.sh*


### Adding Variables to the Bash Configuration File

---

Now that our configuration file has been made executable, we can define environment variables in it. For example, in the [`batch_wrfout_cf.sh` script](https://github.com/CW3E/MET-tools/blob/main/Grid-Stat/batch_wrfout_cf.sh), the majority of the global parameters can be removed from the script and added to the configuration file instead. A selection of these global parameters is shown below.

```
export TZ="GMT"
export USR_HME=/home/jlconti/MET-tools
export STRT_DT=2022121400
export END_DT=2023011800
```

Defining these environment variables in the configuration file removes the need to hard-code them in the scripts.

> The only global parameters excluded from being transferred to a configuration file were any ISO directory variables (for example: `export IN_ROOT=...` and `export OUT_ROOT=...`).
>> Variables that are script-dependent or change frequently (such as input/output paths) are not ideal for a universal configuration file.

We can now add the variables related to the Singularity container into the configuration file. We can do this by adding the following lines of code (and similar for any other commands now dependent on the Singularity container).

>For more information on the NetCDF Singularity container that is being used in these following commands, refer to the [Singularity Container Jupyter notebook](./Creating-a-Singularity-Container-for-a-Conda-Environment.ipynb).

```
export NCKS_CMD="singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/projects/cwp106/scratch/jlconti/MET_tools_conda_netcdf.sif ncks"
export NCL_CMD="singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/projects/cwp106/scratch/jlconti/MET_tools_conda_netcdf.sif ncl"
export CDO_CMD="singularity exec --bind /cw3e:/cw3e,/scratch:/scratch /cw3e/mead/projects/cwp106/scratch/jlconti/MET_tools_conda_netcdf.sif cdo"
```

Once all of the desired environment variables are in our configuration file, it should look like the following.

In [6]:
cat pre_processing_config.sh

#!/bin/bash
##################################################################################
# Description
##################################################################################
# This configuration file is for the MET-tools workflow. These environmental 
# variables are defined for the use of the pre-processing bash scripts in the 
# workflow. Furthermore, in this file are a few Singularity container-dependent 
# variables designed for the conda environents implimented in the MET-tools 
# scripts.
#
##################################################################################
# GLOBAL PARAMETERS TO BE SET BY USER
##################################################################################

# Root directory for MET-tools git clone
export USR_HME=/home/jlconti/MET-tools

# Refine the case-wise sub-directory for path names, leave as empty string if not needed
export CSE=jlconti

# Root directory for MET singularity image
export SOFT_ROOT=/hom

<a name="config_order"></a>

> When making a configuration file, it is often useful to separate environment variables into three distinct categories:
>
> 1. Variables the user will definitely have to change (e.g., directories, paths, commonly changed values)
> 2. Variables that the user may need to change (e.g., variables that depend on the data being used or the desired result)
> 3. Variables that the user will probably not need to change (e.g., system functions)


### Adding the Configuration File to Bash Scripts

---

Once all the necessary variables for our code have been set in the configuration file, we can add the file to our scripts. This can be done by adding the following line of code near the top of each script, before any of the variables are used.

```
source pre_processing_config.sh
```

or

```
. pre_processing_config.sh
```

This command runs the configuration file in the current shell, so the environment variables and global parameters it defines become available to the rest of the script. For example, we sourced the configuration file at line 42 of the `batch_wrfout_cf.sh` script, as printed below.

In [7]:
%%bash
sed -n '41,42p' batch_wrfout_cf.sh

# Source the configuration file to define majority of required variables
source pre_processing_config.sh


With our environment variables now defined in the configuration file, we can reference any needed variable in the script itself using the syntax `${VAR_NAME}`. For example, in our scripts, we used the following.

```
${USR_HME}

${NCKS_CMD}
```

In practice, these references look like the following lines printed from `run_wrfout_cf.sh`.

In [8]:
%%bash
sed -n '44p' run_wrfout_cf.sh

if [ ! ${USR_HME} ]; then


In [9]:
%%bash
sed -n '217p' run_wrfout_cf.sh

          cmd="${NCKS_CMD} -A -v forecast_reference_time ${out_name} ${out_name}_tmp"


Now that our configuration file has been sourced in our scripts and the global parameters have been referenced, we can run our scripts (e.g., `sbatch batch_wrfout_cf.sh`). If everything runs correctly, we know the configuration file has successfully replaced the hard-coded variables within the scripts. Now, if we have to change any parameters, we can edit the configuration file rather than the scripts themselves.

## Generating a Python Configuration File
---
Making a Python configuration file is similar to making a Bash configuration file. However, some steps and syntax are different. This section will review the specifics for a Python configuration file.

Before building the configuration file, ensure you are in the same working directory as your Python scripts. I will be using the same MET-tools/Grid-Stat directory as before, which, as a reminder, is at the following path.

In [10]:
cd

/home/jlconti


In [11]:
cd MET-tools/Grid-Stat/

/home/jlconti/MET-tools/Grid-Stat


In [12]:
ls *.py

plt_gridstat_multidate_heatplot.py        plt_gridstat_multilevel_heatplot.py
plt_gridstat_multidate_heatplot_level.py  post_processing_config.py
plt_gridstat_multilead_lineplot.py        proc_gridstat.py
plt_gridstat_multilead_lineplot_level.py  proc_gridstat_global.py


Similar to before, we can generate a Python configuration file by running

```
vim <configuration_filename>.py
```

This opens a new Python script with your chosen file name. For the purposes of my work, I made a configuration file named `post_processing_config.py`. Unlike the Bash configuration file, we do not have to add anything to this file or make it executable, because the Python scripts will import it as a module rather than run it.

### Adding Variables to a Python Configuration File
---
Now that our `*.py` configuration file has been created, we can start adding variables to it. The majority of the variables defined in the various Python scripts in the working directory can be moved to the global configuration file. A selection of these variables and their syntax is shown below.

>Adding a variable to a configuration file removes the need for the variable to be hard-coded in the scripts themselves.
>>In other words, you can remove any variable definitions in your scripts that are in your configuration file.

```
STRT_DT = '2022121400'
END_DT = '2023011800'
LND_MSK = 'San_Francisco_Bay'
FIG_CSE = '/Case_Study/Bay_Area'
```

Defining variables like these once, in a centralized configuration file, makes the Python plotting workflow easier to maintain. The user now only has to change a variable definition in one place instead of editing every script in which it was hard-coded.
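
Because every script reads the same values, a typo in the configuration file affects the whole workflow. One optional safeguard is to check the values once before they are used. The following is only a minimal sketch and is not part of MET-tools: the `check_dates` helper is hypothetical, and the `YYYYMMDDHH` date format is an assumption based on the example values above.

```python
from datetime import datetime

# Example values copied from the configuration excerpt above
STRT_DT = '2022121400'
END_DT = '2023011800'


def check_dates(start, end, fmt='%Y%m%d%H'):
    """Hypothetical helper: parse both date strings and confirm their order."""
    # strptime raises ValueError if a string does not match the format
    start_dt = datetime.strptime(start, fmt)
    end_dt = datetime.strptime(end, fmt)
    if start_dt > end_dt:
        raise ValueError(f'start date {start} is after end date {end}')
    return start_dt, end_dt


start_dt, end_dt = check_dates(STRT_DT, END_DT)

# Length of the configured date range
print(end_dt - start_dt)
```

A check like this could live in the configuration file itself or in a script that imports it; either way, a malformed date is caught in one place instead of surfacing later in a plotting script.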

Once every variable has been defined in the configuration file, the contents should be similar to the file shown below.

In [13]:
cat post_processing_config.py

##################################################################################
# Description
##################################################################################
# This configuration file is for the post-processing section of the MET-tools
# workflow. These environmental variables are defined for the python scripts in
# the MET-tools directory. 
#
##################################################################################
# GLOBAL PARAMETERS TO BE SET BY THE USER
##################################################################################

# define control flow to analyze for heatmaps 
CTR_FLW = 'NRT_ecmwf'

# define control flows to analyze for lineplots 
CTR_FLWS = [
            'NRT_gfs',
            'NRT_ecmwf',
            'GFS',
            'ECMWF',
           ]

# verification domain for the forecast data for heatmaps
GRD = 'd01'

# verification domains for the forecast data for lineplots
GRDS = [
        'd01',
        

> For information on how this configuration file was organized, refer back to this [previous blockquote](#config_order).

### Importing the Configuration File to Python Scripts
---

Once every global parameter has been defined in our configuration file, we can import the file into our Python scripts. As with any other Python module, we can import the configuration file by adding the following line.

```
import post_processing_config as config
```

> Instead of "config", you can use any preferred alias for the configuration file.

For example, in the `proc_gridstat.py` script, the configuration file was imported as shown in the import list below

In [14]:
%%bash
sed -n '32,48p' proc_gridstat.py

##################################################################################
# Imports
##################################################################################
import sys
import os
import numpy as np
import pandas as pd
import pickle
import copy
import glob
from datetime import datetime as dt
from datetime import timedelta
import multiprocessing 
from multiprocessing import Pool
import post_processing_config as config

##################################################################################


With our configuration file imported in our script, we can now reference its variables wherever necessary using the following syntax:

```
<config>.<variable_name>
```

For example, the `STRT_DT` variable within the configuration file can be referenced in the scripts as

```
config.STRT_DT
```

An example of these variables being referenced can be seen in the following excerpt from the `proc_gridstat.py` script.

In [15]:
%%bash
sed -n '83,84p' proc_gridstat.py

        s_iso = config.STRT_DT[:4] + '-' + config.STRT_DT[4:6] + '-' + config.STRT_DT[6:8] +\
                '_' + config.STRT_DT[8:]
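
The slicing above builds a string of the form `YYYY-MM-DD_HH` from the `YYYYMMDDHH` value. As an alternative sketch (not the approach used in `proc_gridstat.py`), the same string can be produced with `strptime` and `strftime` from the standard `datetime` module, which also raises an error if the configured value is malformed. `STRT_DT` is set inline here in place of `config.STRT_DT` so that the example runs on its own.

```python
from datetime import datetime as dt

# Stand-in for config.STRT_DT so the sketch is self-contained
STRT_DT = '2022121400'

# Slicing, as in the excerpt above
s_iso_sliced = (STRT_DT[:4] + '-' + STRT_DT[4:6] + '-' + STRT_DT[6:8]
                + '_' + STRT_DT[8:])

# Parse the string into a datetime, then format it
s_iso_parsed = dt.strptime(STRT_DT, '%Y%m%d%H').strftime('%Y-%m-%d_%H')

print(s_iso_sliced)
print(s_iso_parsed)
```

Both lines print `2022-12-14_00`.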


## Summary 
---
In this notebook we discussed how to generate usable universal configuration files for the purposes of both Bash and Python scripting. This included file creation, variable addition, and script implementation.