# Precipitation Variability Across Timescales

This notebook demonstrates how to use the precipitation variability metrics driver and the `calc_ratio.py` script to obtain the precipitation variability metric.

The metric is based on the simulated-to-observed ratio of spectral power. Spectral power itself is sensitive to the processing choices made in power spectrum analysis (e.g., window length, overlap length, and windowing function). The ratio is much less affected by these choices, which makes the analysis results more robust.
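
As a rough illustration of why a ratio is robust, the sketch below uses an idealized case in which the "simulated" series is simply a scaled copy of the "observed" one. The helper `segment_power` is a hypothetical, NumPy-only stand-in for a segment-averaged power spectrum; it is not the driver's code, and the synthetic data and the factor 1.5 are assumptions made for the demonstration.

```python
import numpy as np


def segment_power(x, nperseg, noverlap):
    """Average the Hann-windowed periodograms of overlapping segments.

    Hypothetical stand-in for a Welch-style estimate, for illustration only.
    """
    step = nperseg - noverlap
    window = np.hanning(nperseg)
    spectra = []
    for start in range(0, len(x) - nperseg + 1, step):
        segment = x[start:start + nperseg]
        segment = (segment - segment.mean()) * window
        spectra.append(np.abs(np.fft.rfft(segment)) ** 2)
    return np.mean(spectra, axis=0) / np.sum(window ** 2)


# Synthetic daily series: an annual cycle plus noise (10 years of 365 days)
rng = np.random.default_rng(0)
days = np.arange(3650)
obs = np.sin(2 * np.pi * days / 365) + 0.5 * rng.standard_normal(days.size)
sim = 1.5 * obs  # idealized "model": same signal with 1.5x the amplitude

# Two different processing choices: (segment length, overlap)
peak_obs_a = segment_power(obs, 730, 365).max()
peak_obs_b = segment_power(obs, 365, 0).max()
ratio_a = segment_power(sim, 730, 365).max() / peak_obs_a
ratio_b = segment_power(sim, 365, 0).max() / peak_obs_b

print("peak power (obs):", peak_obs_a, peak_obs_b)  # depends on the choices
print("sim/obs ratio:   ", ratio_a, ratio_b)        # both close to 1.5**2
```

In this constructed case the ratio is the same for both settings, while the raw peak power is not. With real model output the ratio is only approximately insensitive to the processing choices.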

This notebook should be run in an environment with Python, JupyterLab, the PCMDI Metrics Package (PMP), and CDAT installed. It is expected that you have downloaded the sample data as demonstrated in [the download notebook](Demo_0_download_data.ipynb).  

The following cell reads in the choices you made during the download data step:

In [1]:
from user_choices import demo_data_directory, demo_output_directory

## Basic Use

### Help
Use the `--help` flag for assistance with the precip variability driver:

In [2]:
%%bash
variability_across_timescales_PS_driver.py --help

usage: variability_across_timescales_PS_driver.py [-h]
                                                  [--parameters PARAMETERS]
                                                  [--diags OTHER_PARAMETERS [OTHER_PARAMETERS ...]]
                                                  [--mip MIP] [--exp EXP]
                                                  [--mod MOD] [--var VAR]
                                                  [--frq FRQ]
                                                  [--modpath MODPATH]
                                                  [--results_dir RESULTS_DIR]
                                                  [--case_id CASE_ID]
                                                  [--prd PRD [PRD ...]]
                                                  [--fac FAC]
                                                  [--nperseg NPERSEG]
                                                  [--noverlap NOVERLAP]
                                                  [--ref REF] [--



### Parameter file
Settings can be specified in a parameter file or on the command line. The basic case demonstrated here uses a parameter file, which is printed below.  

Note that this driver should only be used to run **one** model or dataset at a time.  

The `mod` variable can be set either to a particular file name, as shown here, or to a model name (e.g., "GISS-E2-H").
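
How a wildcard in a file name pattern selects files can be sketched with Python's standard `fnmatch` module. This only illustrates shell-style pattern matching and assumes nothing about how the driver resolves `mod` internally; the second file name in the list is made up for contrast.

```python
from fnmatch import fnmatch

pattern = "pr_day_GISS-E2-H_historical_r6i1p1_*.nc"
candidates = [
    "pr_day_GISS-E2-H_historical_r6i1p1_20000101-20051231.nc",
    "pr_day_GISS-E2-H_historical_r1i1p1_20000101-20051231.nc",  # hypothetical file
]

# Keep only the file names that match the wildcard pattern
matches = [name for name in candidates if fnmatch(name, pattern)]
print(matches)
```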

In [3]:
# print parameter file
with open("basic_precip_variability_param.py") as f:
    print(f.read())

mip = "cmip5"
exp = "historical"
mod = "pr_day_GISS-E2-H_historical_r6i1p1_*.nc"
var = "pr"
frq = "day"
modpath = 'demo_data/CMIP5_demo_timeseries/historical/atmos/day/pr/'
results_dir = 'demo_output/precip_variability/GISS-E2-H/'
prd = [2000,2005]  # analysis period
fac = 86400  # factor to make unit of [mm/day]

# length of segment in power spectra (~10 years)
# shortened to 2 years for demo purposes
nperseg = 2 * 365
# length of overlap between segments in power spectra (~5 years)
# shortened to 1 year for demo purposes
noverlap = 1 * 365

# flag for cmec formatted JSON
cmec = False
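
To make the numbers in the parameter file concrete, the following sketch works out two of them. It assumes the input precipitation flux is in kg m-2 s-1 (the CMIP convention for `pr`), so that multiplying by 86400 s/day gives mm/day. It also assumes a 365-day calendar and the usual overlapping-segment (Welch-style) layout when counting segments. Neither assumption is taken from the driver's source, and the sample flux value is arbitrary.

```python
fac = 86400          # seconds per day
nperseg = 2 * 365    # segment length in days
noverlap = 1 * 365   # overlap between consecutive segments in days
n_days = 6 * 365     # 2000-2005 on an assumed 365-day calendar

# Unit conversion: 1 kg m-2 of water is a 1 mm layer, so flux * s/day = mm/day
flux = 3.0e-5                # arbitrary sample value in kg m-2 s-1
rate_mm_per_day = flux * fac

# Each segment starts (nperseg - noverlap) days after the previous one
step = nperseg - noverlap
n_segments = (n_days - noverlap) // step

print(rate_mm_per_day, n_segments)
```

With these demo settings only a handful of segments are averaged, which is why the parameter file notes that the segment length was shortened from about 10 years for demonstration purposes.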



### Running the driver
The parameter file is passed to the driver using the `-p` flag, similar to other PMP metrics. The basic command is:  
`variability_across_timescales_PS_driver.py -p parameter_file_name.py`

The next cell uses the command line syntax to execute the driver as a subprocess.

In [4]:
%%bash
variability_across_timescales_PS_driver.py -p basic_precip_variability_param.py

demo_data/CMIP5_demo_timeseries/historical/atmos/day/pr/
pr_day_GISS-E2-H_historical_r6i1p1_*.nc
[2000, 2005]
730 365
demo_output/precip_variability/GISS-E2-H/
demo_output/precip_variability/GISS-E2-H/
demo_output/precip_variability/GISS-E2-H/
GISS-E2-H.r6i1p1
['demo_data/CMIP5_demo_timeseries/historical/atmos/day/pr/pr_day_GISS-E2-H_historical_r6i1p1_20000101-20051231.nc']
GISS-E2-H.r6i1p1 365_day
syr, eyr: 2000 2005
2000
Complete regridding from (365, 90, 144) to (365, 90, 180)
2000 (365, 90, 180)
2001
Complete regridding from (365, 90, 144) to (365, 90, 180)
2001 (730, 90, 180)
2002
Complete regridding from (365, 90, 144) to (365, 90, 180)
2002 (1095, 90, 180)
2003
Complete regridding from (365, 90, 144) to (365, 90, 180)
2003 (1460, 90, 180)
2004
Complete regridding from (365, 90, 144) to (365, 90, 180)
2004 (1825, 90, 180)
2005
Complete regridding from (365, 90, 144) to (365, 90, 180)
2005 (2190, 90, 180)
Complete calculating climatology and anomaly for calendar of 365_day
Complet

INFO::2023-02-23 11:51::pcmdi_metrics:: Results saved to a json file: /home/ahn6/PCMDI/pcmdi_metrics/branch/900_msa_precip_variability_demo/pcmdi_metrics/doc/jupyter/Demo/demo_output/precip_variability/GISS-E2-H/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json
2023-02-23 11:51:11,427 [INFO]: base.py(write:237) >> Results saved to a json file: /home/ahn6/PCMDI/pcmdi_metrics/branch/900_msa_precip_variability_demo/pcmdi_metrics/doc/jupyter/Demo/demo_output/precip_variability/GISS-E2-H/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json
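
Outside a notebook, the same command could be launched from a Python script. The sketch below only assembles the command and runs it if the driver is found on `PATH`; it assumes the parameter file sits in the working directory, as in the cell above.

```python
import shlex
import shutil
import subprocess

driver = "variability_across_timescales_PS_driver.py"
command = [driver, "-p", "basic_precip_variability_param.py"]
print(shlex.join(command))

# Run only when the driver is installed in the active environment
if shutil.which(driver) is not None:
    subprocess.run(command, check=True)
else:
    print(f"{driver} not found on PATH; activate the PMP environment first")
```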


### Results
Running the precipitation variability driver produces three output files, found in the demo output directory:  

Spatial pattern of spectral power (forced variability) (netCDF)   
Spatial pattern of spectral power (unforced variability) (netCDF)  
Average of spectral power (forced and unforced) (JSON)  

In [5]:
!ls {demo_output_directory + "/precip_variability/GISS-E2-H"}

PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json
PS_pr.day_regrid.180x90_GISS-E2-H.r6i1p1.nc
PS_pr.day_regrid.180x90_GISS-E2-H.r6i1p1_unforced.nc


The next cell displays the metrics from the JSON file.

In [6]:
import json
import os
output_path = os.path.join(demo_output_directory,"precip_variability/GISS-E2-H/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json")
with open(output_path) as f:
    metric = json.load(f)["RESULTS"]
print(json.dumps(metric, indent=2))

{
  "GISS-E2-H.r6i1p1": {
    "forced": {
      "Land_30N50N": {
        "annual": 1.154266721510137,
        "semi-annual": 0.3692551744241903
      },
      "Land_30S30N": {
        "annual": 6.8655960795131294,
        "semi-annual": 1.1969126049181855
      },
      "Land_50S30S": {
        "annual": 0.7829928891110198,
        "semi-annual": 0.33398811326967975
      },
      "Land_50S50N": {
        "annual": 4.803117924524398,
        "semi-annual": 0.8989181591887316
      },
      "Ocean_30N50N": {
        "annual": 1.4467988289024762,
        "semi-annual": 0.37232162338162866
      },
      "Ocean_30S30N": {
        "annual": 4.568654517465613,
        "semi-annual": 1.5044899979603008
      },
      "Ocean_50S30S": {
        "annual": 0.5918242629787758,
        "semi-annual": 0.1927211439124904
      },
      "Ocean_50S50N": {
        "annual": 3.3099973296409195,
        "semi-annual": 1.0764366904440072
      },
      "Total_30N50N": {
        "annual": 1.311098668230797

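For further analysis it can be handy to flatten the nested results into rows. The sketch below does this with plain Python for a dictionary shaped like the output above (model, then variability type, then region, then frequency). The excerpt reuses two of the values printed above, and `flatten_results` is a hypothetical helper rather than part of PMP.

```python
def flatten_results(results):
    """Turn {model: {kind: {region: {freq: value}}}} into a list of row tuples."""
    rows = []
    for model, kinds in results.items():
        for kind, regions in kinds.items():
            for region, freqs in regions.items():
                for freq, value in freqs.items():
                    rows.append((model, kind, region, freq, value))
    return rows


# Small excerpt of the metrics printed above
excerpt = {
    "GISS-E2-H.r6i1p1": {
        "forced": {
            "Land_30N50N": {
                "annual": 1.154266721510137,
                "semi-annual": 0.3692551744241903,
            }
        }
    }
}

for row in flatten_results(excerpt):
    print(row)
```
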
## Command line usage with Obs data

To calculate the precipitation variability spectral power ratio, we also need results for a reference dataset. This example shows how to call the `variability_across_timescales_PS_driver` using a combination of the parameter file and command line arguments with daily observational data. Command line arguments override the corresponding values in the parameter file.  

The `modpath` and `results_dir` values are set first in a separate cell to easily combine the `demo_data_directory` and `demo_output_directory` variables with other strings. The new variables are then passed to the shell command in the second cell.

In [7]:
modpath = demo_data_directory + '/obs4MIPs_PCMDI_daily/NASA-JPL/GPCP-1-3/day/pr/gn/latest/'
results_dir = demo_output_directory + '/precip_variability/GPCP-1-3/'

In [8]:
%%bash -s "$modpath" "$results_dir"
variability_across_timescales_PS_driver.py -p basic_precip_variability_param.py \
--mip 'obs' \
--mod 'pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc' \
--modpath $1 \
--results_dir $2 \
--prd 1997 2016

demo_data/obs4MIPs_PCMDI_daily/NASA-JPL/GPCP-1-3/day/pr/gn/latest/
pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc
[1997, 2016]
730 365
demo_output/precip_variability/GPCP-1-3/
demo_output/precip_variability/GPCP-1-3/
demo_output/precip_variability/GPCP-1-3/
GPCP-1-3
['demo_data/obs4MIPs_PCMDI_daily/NASA-JPL/GPCP-1-3/day/pr/gn/latest/pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc']
GPCP-1-3 gregorian
syr, eyr: 1997 2016
1997
Complete regridding from (365, 180, 360) to (365, 90, 180)
1997 (365, 90, 180)
1998
Complete regridding from (365, 180, 360) to (365, 90, 180)
1998 (730, 90, 180)
1999
Complete regridding from (365, 180, 360) to (365, 90, 180)
1999 (1095, 90, 180)
2000
Complete regridding from (366, 180, 360) to (366, 90, 180)
2000 (1461, 90, 180)
2001
Complete regridding from (365, 180, 360) to (365, 90, 180)
2001 (1826, 90, 180)
2002
Complete regridding from (365, 180, 360) to (365, 90, 180)
2002 (2191, 90, 180)
2003
Complete regridding from (365, 180, 360) to (365, 90, 180)
2003 (2

INFO::2023-02-23 11:58::pcmdi_metrics:: Results saved to a json file: /home/ahn6/PCMDI/pcmdi_metrics/branch/900_msa_precip_variability_demo/pcmdi_metrics/doc/jupyter/Demo/demo_output/precip_variability/GPCP-1-3/PS_pr.day_regrid.180x90_area.freq.mean_GPCP-1-3.json
2023-02-23 11:58:48,155 [INFO]: base.py(write:237) >> Results saved to a json file: /home/ahn6/PCMDI/pcmdi_metrics/branch/900_msa_precip_variability_demo/pcmdi_metrics/doc/jupyter/Demo/demo_output/precip_variability/GPCP-1-3/PS_pr.day_regrid.180x90_area.freq.mean_GPCP-1-3.json
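
The precedence between the two sources of settings (command line values replace parameter file values) can be pictured as a dictionary merge. This is only a conceptual sketch using the values from this example; it is not how the driver parses its arguments.

```python
# Values from basic_precip_variability_param.py (subset)
param_file = {
    "mip": "cmip5",
    "mod": "pr_day_GISS-E2-H_historical_r6i1p1_*.nc",
    "prd": [2000, 2005],
    "nperseg": 2 * 365,
    "noverlap": 1 * 365,
}

# Values given on the command line in the cell above (subset)
command_line = {
    "mip": "obs",
    "mod": "pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc",
    "prd": [1997, 2016],
}

# Later entries win, so command line values replace parameter file values
effective = {**param_file, **command_line}
print(effective)
```

Settings that are not repeated on the command line, such as `nperseg` and `noverlap`, keep their parameter file values, which matches the `730 365` line in the log above.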


## Precipitation Variability Metric

The precipitation variability metric can be generated once the model and observational spectral averages have been computed.

A script called `calc_ratio.py` is provided in the precip_variability codebase. It takes three arguments to generate the ratio:  
`ref`: path to the obs results JSON  
`modpath`: directory containing the model results JSONs (not CMEC-formatted JSONs)  
`results_dir`: directory for the `calc_ratio.py` results

This script can be accessed via the PMP repo, which is how it is run here. It does not come with the PMP conda installation.

In [9]:
%%bash -s "$demo_output_directory"
python ../../../pcmdi_metrics/precip_variability/scripts_pcmdi/calc_ratio.py \
--ref $1/precip_variability/GPCP-1-3/PS_pr.day_regrid.180x90_area.freq.mean_GPCP-1-3.json \
--modpath $1/precip_variability/GISS-E2-H/ \
--results_dir $1/precip_variability/ratio/

reference:  demo_output/precip_variability/GPCP-1-3/PS_pr.day_regrid.180x90_area.freq.mean_GPCP-1-3.json
modpath:  demo_output/precip_variability/GISS-E2-H/
outdir:  demo_output/precip_variability/ratio/
['demo_output/precip_variability/GISS-E2-H/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json']
Complete  GISS-E2-H.r6i1p1
Complete all
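
Conceptually, the metric is the model value divided by the reference value for each region and frequency band. The sketch below shows that element-wise division on nested dictionaries. `power_ratio` is a hypothetical helper, not the code in `calc_ratio.py`, and the numbers are made up so the arithmetic is easy to follow.

```python
def power_ratio(model, reference):
    """Divide model spectral power by reference power, region by region."""
    return {
        region: {
            freq: model[region][freq] / reference[region][freq]
            for freq in freqs
        }
        for region, freqs in model.items()
    }


# Made-up spectral power averages, for illustration only
model_power = {"Land_30N50N": {"annual": 3.0, "semi-annual": 1.0}}
obs_power = {"Land_30N50N": {"annual": 2.0, "semi-annual": 4.0}}

ratio = power_ratio(model_power, obs_power)
print(ratio)  # {'Land_30N50N': {'annual': 1.5, 'semi-annual': 0.25}}
```

A ratio above 1 means the model has more spectral power than the reference in that band, and a ratio below 1 means it has less.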


This outputs one JSON file in the `results_dir` folder. The results in this file are shown below.

In [10]:
output_path = os.path.join(demo_output_directory,"precip_variability/ratio/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json")
with open(output_path) as f:
    metric = json.load(f)["RESULTS"]
print(json.dumps(metric, indent=2))

{
  "GISS-E2-H.r6i1p1": {
    "forced": {
      "Land_30N50N": {
        "annual": 1.6223030227997506,
        "semi-annual": 1.873331686130227
      },
      "Land_30S30N": {
        "annual": 1.3409912459551663,
        "semi-annual": 1.3385919476532728
      },
      "Land_50S30S": {
        "annual": 1.1582631259388922,
        "semi-annual": 1.903328778893799
      },
      "Land_50S50N": {
        "annual": 1.3568447816315299,
        "semi-annual": 1.3967541356262723
      },
      "Ocean_30N50N": {
        "annual": 1.0571429112202069,
        "semi-annual": 0.8535214354376027
      },
      "Ocean_30S30N": {
        "annual": 1.4932022320513534,
        "semi-annual": 1.817141396507603
      },
      "Ocean_50S30S": {
        "annual": 1.4346150163209932,
        "semi-annual": 1.053929465464535
      },
      "Ocean_50S50N": {
        "annual": 1.4578241823817903,
        "semi-annual": 1.6866782169880241
      },
      "Total_30N50N": {
        "annual": 1.2324909366302752,


## Regional metrics

The precipitation variability metrics have a set of default regions, but users can instead define a single spatial region to compute metrics over. There are two ways to do this.

1. Use the `regions_specs` parameter to define a latitude/longitude box.  
Parameter file example:
```
regions_specs={"CONUS": {"domain": {"latitude": (24.7, 49.4), "longitude": (235.22, 293.08)}}}
```

2. Use a shapefile to define a region. Users must provide the path to the shapefile along with the attribute/feature pair that defines the region.  
Parameter file example:
```
region_file="CONUS.shp" # Shapefile path
attr="NAME"             # An attribute in the shapefile
feature="CONUS"         # A unique feature name that can be 
                        # found under the "attr" attribute
```
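
The longitudes in the `regions_specs` example are given in degrees east on a 0 to 360 scale. If a region's bounds are known in the -180 to 180 convention, they can be converted with a modulo operation, as in this sketch. The western-hemisphere input values are chosen to reproduce the box above, and `in_box` is a hypothetical helper for checking a point, not a PMP function.

```python
def to_0_360(lon):
    """Convert a longitude in degrees east from [-180, 180] to [0, 360)."""
    return lon % 360.0


def in_box(lat, lon, lat_bounds, lon_bounds):
    """Return True if (lat, lon) lies inside a simple lat/lon box."""
    return (lat_bounds[0] <= lat <= lat_bounds[1]
            and lon_bounds[0] <= to_0_360(lon) <= lon_bounds[1])


lat_bounds = (24.7, 49.4)
lon_bounds = (to_0_360(-124.78), to_0_360(-66.92))  # about (235.22, 293.08)

print(lon_bounds)
print(in_box(40.0, -105.0, lat_bounds, lon_bounds))  # a point in Colorado
print(in_box(40.0, 10.0, lat_bounds, lon_bounds))    # a point in Europe
```

This simple check assumes the box does not cross the 0/360 meridian, which holds for the CONUS example.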

First, we generate a simple shapefile for use in this example. The shapefile contains one feature, a box that defines the CONUS region.

In [None]:
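import os

import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Corners of a lon/lat box around CONUS (longitudes in 0-360 degrees east)
coords = ((233., 22.), (233., 50.), (294., 50.), (294., 22.))
df = pd.DataFrame({"Region": ["CONUS"], "Coords": [Polygon(coords)]})
gdf = gpd.GeoDataFrame(df, geometry="Coords", crs="EPSG:4326")

# Make sure the output folder exists before writing the shapefile
shp_dir = os.path.join(demo_output_directory, "shp")
os.makedirs(shp_dir, exist_ok=True)
gdf.to_file(os.path.join(shp_dir, "CONUS.shp"))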

Next, pass the shapefile path, attribute, and feature name to the `variability_across_timescales_PS_driver.py` command.

In [None]:
%%bash -s "$demo_output_directory"
variability_across_timescales_PS_driver.py -p basic_precip_variability_param.py \
--region_file $1/shp/CONUS.shp \
--attr 'Region' \
--feature 'CONUS' \
--results_dir $1/precip_variability/region_ex

The metrics output differs from the default example: metrics are produced only for the single region defined in the shapefile.

In [None]:
output_path = os.path.join(demo_output_directory,"precip_variability/region_ex/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json")
with open(output_path) as f:
    metric = json.load(f)["RESULTS"]
print(json.dumps(metric, indent=2))