# Create Dynamic Climate Indices

Notebook 1/X

This notebook was written by Logan Qualls. Data for this work is sourced from the National Center for Atmospheric Research's Catchment Attributes and Meteorology for Large-Sample Studies (CAMELS) dataset. The notebook is designed to work specifically with Frederik Kratzert's NeuralHydrology (NH; https://github.com/neuralhydrology/neuralhydrology) and Grey Nearing's SACSMA-SNOW17 (SAC-SMA; https://github.com/Upstream-Tech/SACSMA-SNOW17). NH provides a flexible framework with a variety of tools designed for straightforward application of Long Short-Term Memory networks to hydrological modeling. SACSMA-SNOW17 provides a Python interface for the SAC-SMA model.

As climate change continues to impact our world, it becomes increasingly important to understand the robustness of both our best hydrological models (Long Short-Term Memory networks) and our most commonly used ones (conceptual models like SAC-SMA). To begin characterizing model robustness, we first need to create dynamic climate indices to serve as the independent variable of our experiments. This notebook outputs a pickled dictionary that holds, for each basin, a dataframe of dynamic climate indices.

### Import Libraries

In [1]:
import os
import functions
import pickle as pkl
from pathlib import Path
from functions import config

#Import NH functions
from functions.utils import load_basin_file
from functions.climateindices import calculate_camels_us_dyn_climate_indices

### Define Parameters

##### Most Important Experiment Parameters

First, we need to define which forcing source we want to create dynamic climate indices from. Five forcing sources are available through CAMELS: daymet, maurer, maurer_extended, nldas, and nldas_extended. Next, we need to specify which basin list we want to calculate the dynamic climate indices for. Finally, the "window" variable sets the length of the rolling-mean window used to calculate the dynamic climate indices.

An explicit overwrite variable is included to prevent accidental overwrites if the dynamic climate indices file already exists.

In [2]:
#########################################################################################

#Specify which forcing source we want to create dynamic climate indices from
forcing = 'maurer_extended'

#Specify the number of the basin list we want to use; must already exist! (531 and 8 provided)
basin_list_num = 531

#Specify length of mean rolling window to calculate dynamic climate indices
window = 365

#########################################################################################

#If the dynamic climate indices file for the forcing, window, and basin list already exists, overwrite it?
overwrite = True

#########################################################################################
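
The parameters above can be checked before any calculation starts. The following is a minimal sketch, not part of the original notebook: `check_parameters` and `VALID_FORCINGS` are hypothetical names, and the set of valid forcings is simply the five sources listed in the text above.

```python
# Forcing names as listed in this notebook; treat this set as an assumption
VALID_FORCINGS = {"daymet", "maurer", "maurer_extended", "nldas", "nldas_extended"}


def check_parameters(forcing: str, window: int) -> None:
    """Hypothetical helper: raise early if a parameter looks wrong."""
    if forcing not in VALID_FORCINGS:
        raise ValueError(
            f"Unknown forcing '{forcing}'; expected one of {sorted(VALID_FORCINGS)}"
        )
    if not isinstance(window, int) or window < 1:
        raise ValueError(f"window must be a positive integer, got {window!r}")


# Passes silently for the values used in this notebook
check_parameters("maurer_extended", 365)
```

Failing here is cheaper than discovering a misspelled forcing name after the basin loop has started.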

##### Paths

Next, we need to specify several paths and files. Example path endings are included above each requested path to help with this. Most should not have to be changed if this repository's native file structure is used.

In [3]:
#########################################################################################

#Working dir (current path; ../NeuralHydrology-Climate-Experiments)
working_dir = Path(os.getcwd())

#Path to config comps (../config_complementaries)
config_comp_dir = working_dir / 'config_complementaries'

#Path to camels dir (../camels/basin_dataset_public_v1p2)
camels_dir = working_dir / 'camels' / 'basin_dataset_public_v1p2'

#Path to dynamic climate indices directory (../configs/dynamic_climate_indices)
#This is where the dynamic climate indices files will be saved
dyn_clim_ind_dir = config_comp_dir / 'dynamic_climate_indices'

#Path to basin list file; uses basin_list_num; MAKE SURE TO DOUBLE CHECK THIS
basin_list_file = config_comp_dir / 'basin_lists' / f'{basin_list_num}_basin_list.txt'

#Path to static dummy config file (../dummy_configs/climate_experiment_static_dummy.yml)
dummy_config_file = config_comp_dir / 'dummy_configs' / 'climate_experiment_static_dummy.yml'

#File path and name of output file; named according to forcing
output_file = dyn_clim_ind_dir / f'dyn_clim_indices_{forcing}_{basin_list_num}basins_{window}.p'

#########################################################################################

**You should not have to edit anything below this cell.**

### Explicit Warnings

As a safety measure, explicit warnings are printed if the dynamic climate indices file defined above already exists and if continuing will overwrite it. The cell also checks that the basin list you provided exists.

In [4]:
#ANSI escape codes for bold red text, plus a reset so later output is unaffected
WARN = '\033[91m' + '\033[1m'
RESET = '\033[0m'

#If the dynamic climate indices file specified above already exists...
if output_file.exists():

    #Warn us!
    print(WARN + 'Dynamic climate indices file already exists.' + RESET)

    #If we said we wanted to overwrite, that's fine, but...
    if overwrite:

        #Warn us!
        print(WARN + 'Dynamic climate indices file will be overwritten.' + RESET)

#If the specified basin file does not exist...
if not basin_list_file.exists():

    #Warn us!
    print(WARN + f'{basin_list_num} basin list does not exist.' + RESET)
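
The checks above only print warnings, so a missing basin list would still surface later as an error in the next cell. As an alternative, the sketch below fails fast and also makes sure the output directory exists before anything is written. `preflight` is a hypothetical helper, not part of this repository or of NH, and the temporary paths in the usage lines are placeholders.

```python
import tempfile
from pathlib import Path


def preflight(output_file: Path, basin_list_file: Path, overwrite: bool) -> bool:
    """Hypothetical helper: return True if the indices should be (re)calculated."""
    # Stop immediately if the basin list is missing
    if not basin_list_file.exists():
        raise FileNotFoundError(f"Basin list not found: {basin_list_file}")

    # Create the output directory if it is not there yet
    output_file.parent.mkdir(parents=True, exist_ok=True)

    # Recalculate when the file is missing or an overwrite was requested
    return overwrite or not output_file.exists()


# Usage with throwaway paths (placeholders for the real paths defined above)
with tempfile.TemporaryDirectory() as tmp:
    basin_file = Path(tmp) / "8_basin_list.txt"
    basin_file.write_text("placeholder_basin_id\n")
    out_file = Path(tmp) / "dynamic_climate_indices" / "example.p"
    should_run = preflight(out_file, basin_file, overwrite=False)
    out_dir_exists = out_file.parent.is_dir()
```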

### Calculate Dynamic Climate Indices

Now we can calculate dynamic climate indices for the specified forcing using NH's calculate_camels_us_dyn_climate_indices function.

In [5]:
#Get list of basins from basin_list_file
basins = load_basin_file(basin_list_file)

#If the output file above does not already exist OR we want to overwrite the existing file...
if not output_file.exists() or overwrite:

    #Calculate dynamic climate indices from CAMELS forcing data
    climate_indices = calculate_camels_us_dyn_climate_indices(data_dir=camels_dir,
                                                              basins=basins,
                                                              window_length=window,
                                                              forcings=forcing,
                                                              output_file=output_file)

    #Save climate_indices to the specified file path and name
    with open(output_file,'wb') as f:
        pkl.dump(climate_indices, f)

#Otherwise, load the existing file so climate_indices is defined for the cells below
else:
    with open(output_file,'rb') as f:
        climate_indices = pkl.load(f)

100%|██████████| 531/531 [01:21<00:00,  6.48it/s]


In [6]:
#Take a peek at our new climate indices dataframe for an example basin
#.tail() is used because .head() often shows all NaNs, which is normal
climate_indices[basins[0]].tail()

date,p_mean_dyn,pet_mean_dyn,aridity_dyn,t_mean_dyn,frac_snow_dyn,high_prec_freq_dyn,high_prec_dur_dyn,low_prec_freq_dyn,low_prec_dur_dyn
2008-12-27,4.085863,1.349292,0.330234,7.191932,0.258834,0.035616,1.3,0.50137,3.8125
2008-12-28,4.070795,1.348977,0.331379,7.203274,0.254099,0.035616,1.3,0.50137,3.8125
2008-12-29,4.056521,1.348238,0.332363,7.214274,0.251163,0.035616,1.3,0.50411,3.755102
2008-12-30,4.029288,1.347872,0.334519,7.215863,0.246102,0.035616,1.3,0.506849,3.77551
2008-12-31,3.986795,1.347671,0.338034,7.220685,0.238067,0.041096,1.363636,0.509589,3.795918
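
The NaNs at the start of each dataframe are most likely a consequence of the rolling window: an index is only defined once a full window of days is available. The sketch below illustrates this with pandas on synthetic data. It is not the NH implementation, and the column names `prcp` and `pet` are placeholders. It also reproduces the relationship visible in the output above, where aridity_dyn equals pet_mean_dyn divided by p_mean_dyn.

```python
import numpy as np
import pandas as pd

# Synthetic daily forcing data (placeholder values, not CAMELS data)
dates = pd.date_range("2000-01-01", periods=10, freq="D")
forcing_df = pd.DataFrame(
    {"prcp": np.arange(1.0, 11.0), "pet": np.full(10, 2.0)},
    index=dates,
)

# A short window for illustration; the notebook uses 365
window = 4

# With an integer window, pandas requires a full window by default,
# so the first window - 1 rows are NaN
p_mean_dyn = forcing_df["prcp"].rolling(window).mean()
pet_mean_dyn = forcing_df["pet"].rolling(window).mean()

indices = pd.DataFrame(
    {
        "p_mean_dyn": p_mean_dyn,
        "pet_mean_dyn": pet_mean_dyn,
        # Aridity as the ratio of mean PET to mean precipitation
        "aridity_dyn": pet_mean_dyn / p_mean_dyn,
    }
)

print(indices.head())  # first rows are NaN
print(indices.tail())  # later rows are fully populated
```

With a 365-day window, the first 364 rows of each basin's dataframe would be empty under this scheme, which is why `.tail()` is the more informative view.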


Now that we have dynamic climate indices calculated, we can create configuration files that use them.
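
Later notebooks need to read the pickled dictionary back in. The sketch below shows the round trip under the assumption, taken from the description at the top of this notebook, that the file holds a dictionary mapping basin IDs to dataframes. The basin ID and values are placeholders, and a temporary file stands in for the real output file.

```python
import pickle as pkl
import tempfile
from pathlib import Path

import pandas as pd

# Stand-in for the real output: {basin_id: DataFrame} with placeholder values
example = {
    "placeholder_basin": pd.DataFrame(
        {"p_mean_dyn": [3.1, 3.2]},
        index=pd.date_range("2008-12-30", periods=2, freq="D"),
    )
}

with tempfile.TemporaryDirectory() as tmp:
    example_file = Path(tmp) / "dyn_clim_indices_example.p"

    # Write the dictionary the same way the notebook does
    with open(example_file, "wb") as f:
        pkl.dump(example, f)

    # Read it back and look up one basin
    with open(example_file, "rb") as f:
        loaded = pkl.load(f)

basin_df = loaded["placeholder_basin"]
print(basin_df)
```

Because pickle can execute arbitrary code on load, only open index files that you created yourself or that come from a trusted source.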