# 7. Precipitation Variability Across Timescales

This notebook demonstrates how to use the precipitation variability metrics driver and calc_ratio script to obtain the precipitation variability metric.

Our metric is the simulated-to-observed ratio of spectral power. Spectral power itself is quite sensitive to the processing choices made in power spectrum analysis (e.g., window length, overlap length, and windowing function). Taking the ratio makes the metric much less sensitive to these choices, which improves the robustness of the results.
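
As a toy illustration of this point, the sketch below computes a segment-averaged (Welch-style) periodogram with NumPy for two synthetic white-noise series. This is not the driver's implementation: the function name `segment_averaged_power`, the synthetic data, and the deliberately unnormalized power are all assumptions made for the demonstration. The raw power changes with the segment length, while the model-to-reference ratio stays close to the same value.

```python
import numpy as np


def segment_averaged_power(x, nperseg, noverlap):
    """Average the Hann-windowed, unnormalized periodogram over overlapping segments."""
    step = nperseg - noverlap
    window = np.hanning(nperseg)
    starts = range(0, len(x) - nperseg + 1, step)
    spectra = [np.abs(np.fft.rfft(window * x[s:s + nperseg])) ** 2 for s in starts]
    return np.mean(spectra, axis=0)


rng = np.random.default_rng(0)
reference = rng.standard_normal(7300)      # synthetic "observations"
model = 2.0 * rng.standard_normal(7300)    # twice the amplitude, so about 4x the power

raw_power = {}
ratio = {}
for nperseg in (730, 365):
    noverlap = nperseg // 2
    p_ref = segment_averaged_power(reference, nperseg, noverlap).mean()
    p_mod = segment_averaged_power(model, nperseg, noverlap).mean()
    raw_power[nperseg] = p_ref
    ratio[nperseg] = p_mod / p_ref
    print(f"nperseg={nperseg}: reference power={p_ref:.1f}, ratio={ratio[nperseg]:.2f}")
```

In this toy setup the reference power roughly doubles when the segment length doubles, whereas both ratios remain near 4.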

This notebook should be run in an environment with Python, JupyterLab, the PCMDI Metrics Package, and CDAT installed. It is expected that you have downloaded the sample data as demonstrated in [the download notebook](Demo_0_download_data.ipynb).  

The following cell reads in the choices you made during the download data step:

In [1]:
from user_choices import demo_data_directory, demo_output_directory

## Basic Use

### Help
Use the `--help` flag for assistance with the precip variability driver:

In [2]:
%%bash
variability_across_timescales_PS_driver.py --help

usage: variability_across_timescales_PS_driver.py [-h]
                                                  [--parameters PARAMETERS]
                                                  [--diags OTHER_PARAMETERS [OTHER_PARAMETERS ...]]
                                                  [--mip MIP] [--exp EXP]
                                                  [--mod MOD] [--var VAR]
                                                  [--frq FRQ]
                                                  [--modpath MODPATH]
                                                  [--results_dir RESULTS_DIR]
                                                  [--case_id CASE_ID]
                                                  [--prd PRD [PRD ...]]
                                                  [--fac FAC]
                                                  [--nperseg NPERSEG]
                                                  [--noverlap NOVERLAP]
                                                  [--ref REF] [--



### Parameter file
Settings can be specified in a parameter file or on the command line. The basic case demonstrated here uses a parameter file, which is printed below.  

Note that this driver should only be used to run **one** model or dataset at a time.  

The `mod` variable can be set either to a particular file name, as shown here, or to a model name (e.g., "GISS-E2-H").

In [3]:
# print parameter file
with open("basic_precip_variability_param.py") as f:
    print(f.read())

mip = "cmip5"
exp = "historical"
mod = "pr_day_GISS-E2-H_historical_r6i1p1_*.nc"
var = "pr"
frq = "day"
modpath = 'demo_data_tmp/CMIP5_demo_timeseries/historical/atmos/day/pr/'
results_dir = 'demo_output_tmp/precip_variability/GISS-E2-H/'
prd = [2000,2005]  # analysis period
fac = 86400  # factor to make unit of [mm/day]

# length of segment in power spectra (~10 years)
# shortened to 2 years for demo purposes
nperseg = 2 * 365
# length of overlap between segments in power spectra (~5 years)
# shortened to 1 year for demo purposes
noverlap = 1 * 365

# flag for cmec formatted JSON
cmec = False



### Running the driver
The parameter file is passed to the driver using the `-p` flag, similar to other PMP metrics. The basic command is:  
`variability_across_timescales_PS_driver.py -p parameter_file_name.py`

The next cell uses the command line syntax to execute the driver as a subprocess.

In [4]:
%%bash
variability_across_timescales_PS_driver.py -p basic_precip_variability_param.py

INFO::2024-09-18 15:56::pcmdi_metrics:: Results saved to a json file: /home/ordonez4/git/pcmdi_metrics/doc/jupyter/Demo/demo_output_tmp/precip_variability/GISS-E2-H/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json
2024-09-18 15:56:49,164 [INFO]: base.py(write:422) >> Results saved to a json file: /home/ordonez4/git/pcmdi_metrics/doc/jupyter/Demo/demo_output_tmp/precip_variability/GISS-E2-H/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json


demo_data_tmp/CMIP5_demo_timeseries/historical/atmos/day/pr/
pr_day_GISS-E2-H_historical_r6i1p1_*.nc
[2000, 2005]
730 365
2
demo_output_tmp/precip_variability/GISS-E2-H/
demo_output_tmp/precip_variability/GISS-E2-H/
demo_output_tmp/precip_variability/GISS-E2-H/
['demo_data_tmp/CMIP5_demo_timeseries/historical/atmos/day/pr/pr_day_GISS-E2-H_historical_r6i1p1_20000101-20051231.nc']
GISS-E2-H.r6i1p1
['demo_data_tmp/CMIP5_demo_timeseries/historical/atmos/day/pr/pr_day_GISS-E2-H_historical_r6i1p1_20000101-20051231.nc']
GISS-E2-H.r6i1p1 365_day
2000 2005
Complete regridding from (2190, 90, 144) to (2190, 90, 180)
Complete calculating climatology and anomaly for calendar of 365_day
Complete power spectra (segment:  730  nps: 5.0 )
Complete domain and frequency average of spectral power
Complete power spectra (segment:  730  nps: 5.0 )
Complete domain and frequency average of spectral power
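
The `nps: 5.0` value in the log above is consistent with the number of overlapping segments that fit into the time series, assuming `nps` denotes the segment count (an interpretation, not something stated by the driver). A minimal sketch of that arithmetic, using a hypothetical helper `count_segments`:

```python
def count_segments(n_samples, nperseg, noverlap):
    """Number of full overlapping segments that fit in a series of n_samples."""
    step = nperseg - noverlap
    if step <= 0:
        raise ValueError("noverlap must be smaller than nperseg")
    if n_samples < nperseg:
        return 0
    return (n_samples - noverlap) // step


# 2190 daily time steps (2000-2005 on a 365_day calendar), as in the regridding log line
model_segments = count_segments(2190, nperseg=2 * 365, noverlap=1 * 365)
# 7305 daily time steps (1997-2016 on a gregorian calendar), as in the GPCP run in this notebook
obs_segments = count_segments(7305, nperseg=2 * 365, noverlap=1 * 365)
print(model_segments, obs_segments)
```

With the demo settings this gives 5 segments for the model and 19 for the observations.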




### Results
Running the precipitation variability driver produces three output files, found in the demo output directory:  

Spatial pattern of spectral power (forced variability) (netCDF)   
Spatial pattern of spectral power (unforced variability) (netCDF)  
Average of spectral power (forced and unforced) (JSON)  

In [5]:
!ls {demo_output_directory + "/precip_variability/GISS-E2-H"}

PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json
PS_pr.day_regrid.180x90_GISS-E2-H.r6i1p1.nc
PS_pr.day_regrid.180x90_GISS-E2-H.r6i1p1_unforced.nc


The next cell displays the metrics from the JSON file.

In [6]:
import json
import os
output_path = os.path.join(demo_output_directory,"precip_variability/GISS-E2-H/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json")
with open(output_path) as f:
    metric = json.load(f)["RESULTS"]
print(json.dumps(metric, indent=2))

{
  "GISS-E2-H.r6i1p1": {
    "forced": {
      "Land_30N50N": {
        "annual": 1.153948602189096,
        "semi-annual": 0.3675381067314767
      },
      "Land_30S30N": {
        "annual": 6.8509958100746555,
        "semi-annual": 1.1945015595812805
      },
      "Land_50S30S": {
        "annual": 0.8090939740005696,
        "semi-annual": 0.34297346148163804
      },
      "Land_50S50N": {
        "annual": 4.793570167683052,
        "semi-annual": 0.8971106124805638
      },
      "Ocean_30N50N": {
        "annual": 1.450126151318265,
        "semi-annual": 0.3738726067518909
      },
      "Ocean_30S30N": {
        "annual": 4.561426422605001,
        "semi-annual": 1.5069884231014545
      },
      "Ocean_50S30S": {
        "annual": 0.5890515819402276,
        "semi-annual": 0.19150748548003316
      },
      "Ocean_50S50N": {
        "annual": 3.3050864193776026,
        "semi-annual": 1.0780758057454556
      },
      "Total_30N50N": {
        "annual": 1.3110986682307972
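
The nested layout shown above (model, then variability type, then region, then frequency band) can be flattened into rows for easier filtering or tabulation. The following is a minimal sketch; `flatten_results` is a hypothetical helper, and the sample dictionary copies only a few of the values printed above.

```python
def flatten_results(results):
    """Turn the nested RESULTS dictionary into (model, type, region, band, value) rows."""
    rows = []
    for model, by_type in results.items():
        for variability, by_region in by_type.items():
            for region, by_band in by_region.items():
                for band, power in by_band.items():
                    rows.append((model, variability, region, band, power))
    return rows


# A small excerpt of the output printed above
sample = {
    "GISS-E2-H.r6i1p1": {
        "forced": {
            "Land_30N50N": {"annual": 1.153948602189096, "semi-annual": 0.3675381067314767},
            "Ocean_30N50N": {"annual": 1.450126151318265, "semi-annual": 0.3738726067518909},
        }
    }
}

rows = flatten_results(sample)
for row in rows:
    print(row)
```

In the notebook, the `metric` variable loaded in the previous cell could be passed in place of `sample`.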

## Command line usage with Obs data

To calculate the precipitation variability spectral power ratio, we also need results for a reference dataset. This example shows how to call the `variability_across_timescales_PS_driver` with daily observational data, using a combination of the parameter file and command line arguments. Command line arguments override the values in the parameter file.  

The `modpath` and `results_dir` values are set first in a separate cell to easily combine the `demo_data_directory` and `demo_output_directory` variables with other strings. The new variables are then passed to the shell command in the second cell.

In [7]:
modpath = demo_data_directory + '/obs4MIPs_PCMDI_daily/NASA-JPL/GPCP-1-3/day/pr/gn/latest/'
results_dir = demo_output_directory + '/precip_variability/GPCP-1-3/'

In [8]:
%%bash -s "$modpath" "$results_dir"
variability_across_timescales_PS_driver.py -p basic_precip_variability_param.py \
--mip 'obs' \
--mod 'pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc' \
--modpath $1 \
--results_dir $2 \
--prd 1997 2016

INFO::2024-09-18 16:08::pcmdi_metrics:: Results saved to a json file: /home/ordonez4/git/pcmdi_metrics/doc/jupyter/Demo/demo_output_tmp/precip_variability/GPCP-1-3/PS_pr.day_regrid.180x90_area.freq.mean_GPCP-1-3.json
2024-09-18 16:08:28,962 [INFO]: base.py(write:422) >> Results saved to a json file: /home/ordonez4/git/pcmdi_metrics/doc/jupyter/Demo/demo_output_tmp/precip_variability/GPCP-1-3/PS_pr.day_regrid.180x90_area.freq.mean_GPCP-1-3.json


demo_data_tmp/obs4MIPs_PCMDI_daily/NASA-JPL/GPCP-1-3/day/pr/gn/latest/
pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc
[1997, 2016]
730 365
2
demo_output_tmp/precip_variability/GPCP-1-3/
demo_output_tmp/precip_variability/GPCP-1-3/
demo_output_tmp/precip_variability/GPCP-1-3/
['demo_data_tmp/obs4MIPs_PCMDI_daily/NASA-JPL/GPCP-1-3/day/pr/gn/latest/pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc']
GPCP-1-3
['demo_data_tmp/obs4MIPs_PCMDI_daily/NASA-JPL/GPCP-1-3/day/pr/gn/latest/pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc']
GPCP-1-3 gregorian
1997 2016
Complete regridding from (7305, 180, 360) to (7305, 90, 180)
Complete calculating climatology and anomaly for calendar of gregorian
Complete power spectra (segment:  730  nps: 19.0 )
Complete domain and frequency average of spectral power
Complete power spectra (segment:  730  nps: 19.0 )
Complete domain and frequency average of spectral power
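
The same parameter-file-plus-overrides pattern can also be assembled from Python rather than a `%%bash` cell. The sketch below only builds and prints the command; `build_driver_command` is a hypothetical helper, the flags are the ones used in the cell above, and the paths are copied from the log output (in practice they would come from `demo_data_directory` and `demo_output_directory`).

```python
import shlex


def build_driver_command(param_file, **overrides):
    """Build the driver command line; keyword arguments become --name value flags."""
    cmd = ["variability_across_timescales_PS_driver.py", "-p", param_file]
    for name, value in overrides.items():
        cmd.append(f"--{name}")
        if isinstance(value, (list, tuple)):
            cmd.extend(str(v) for v in value)  # e.g. --prd 1997 2016
        else:
            cmd.append(str(value))
    return cmd


cmd = build_driver_command(
    "basic_precip_variability_param.py",
    mip="obs",
    mod="pr_day_GPCP-1-3_PCMDI_gn_19961002-20170101.nc",
    modpath="demo_data_tmp/obs4MIPs_PCMDI_daily/NASA-JPL/GPCP-1-3/day/pr/gn/latest/",
    results_dir="demo_output_tmp/precip_variability/GPCP-1-3/",
    prd=[1997, 2016],
)
print(shlex.join(cmd))
# To execute it: import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting issues when paths contain spaces.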




## Precipitation Variability Metric

The precipitation variability metric can be generated once the spectral averages have been computed for both the model and the observations. 

A script called "calc_ratio.py" is provided in the precip_variability codebase. This script can be called with three arguments to generate the ratio.  
`ref`: path to obs results JSON  
`modpath`: directory containing model results JSONs (not CMEC formatted JSONs)  
`results_dir`: directory for calc_ratio.py results

This script is not included in the PMP conda installation. It is available in the PMP repository, which is how it is run here.

In [9]:
%%bash -s "$demo_output_directory"
python ../../../pcmdi_metrics/precip_variability/scripts_pcmdi/calc_ratio.py \
--ref $1/precip_variability/GPCP-1-3/PS_pr.day_regrid.180x90_area.freq.mean_GPCP-1-3.json \
--modpath $1/precip_variability/GISS-E2-H/ \
--results_dir $1/precip_variability/ratio/

reference:  demo_output_tmp/precip_variability/GPCP-1-3/PS_pr.day_regrid.180x90_area.freq.mean_GPCP-1-3.json
modpath:  demo_output_tmp/precip_variability/GISS-E2-H/
outdir:  demo_output_tmp/precip_variability/ratio/
['demo_output_tmp/precip_variability/GISS-E2-H/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json']
Complete  GISS-E2-H.r6i1p1
Complete all




This outputs one JSON file in the `results_dir` folder. The results in this file are shown below.

In [10]:
output_path = os.path.join(demo_output_directory,"precip_variability/ratio/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json")
with open(output_path) as f:
    metric = json.load(f)["RESULTS"]
print(json.dumps(metric, indent=2))

{
  "GISS-E2-H.r6i1p1": {
    "forced": {
      "Land_30N50N": {
        "annual": 1.6279984575673894,
        "semi-annual": 1.867057373601494
      },
      "Land_30S30N": {
        "annual": 1.3338114720532706,
        "semi-annual": 1.333350181560781
      },
      "Land_50S30S": {
        "annual": 1.164227264547647,
        "semi-annual": 1.9246852085563568
      },
      "Land_50S50N": {
        "annual": 1.3503132388688357,
        "semi-annual": 1.391749566706111
      },
      "Ocean_30N50N": {
        "annual": 1.0524861972773814,
        "semi-annual": 0.8517712548298377
      },
      "Ocean_30S30N": {
        "annual": 1.499118828822202,
        "semi-annual": 1.8222593026548162
      },
      "Ocean_50S30S": {
        "annual": 1.4363958284724372,
        "semi-annual": 1.0484119422307991
      },
      "Ocean_50S50N": {
        "annual": 1.4625476582104198,
        "semi-annual": 1.6902905191733497
      },
      "Total_30N50N": {
        "annual": 1.2324909366302752,
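
Conceptually, each value above is the model's spectral power divided by the reference's spectral power for the same variability type, region, and frequency band, so values above 1 indicate more variability than in the reference. The sketch below illustrates that element-wise division. It is not the code in `calc_ratio.py`: `spectral_power_ratio` is a hypothetical helper, and the reference values are made up for illustration (they are not the GPCP results).

```python
def spectral_power_ratio(model, reference):
    """Divide model by reference power for every shared type/region/band entry."""
    ratio = {}
    for variability in model.keys() & reference.keys():
        ratio[variability] = {}
        for region in model[variability].keys() & reference[variability].keys():
            mod_bands = model[variability][region]
            ref_bands = reference[variability][region]
            ratio[variability][region] = {
                band: mod_bands[band] / ref_bands[band]
                for band in mod_bands.keys() & ref_bands.keys()
            }
    return ratio


# Model values copied from the driver output earlier in this notebook
model_power = {"forced": {"Land_30N50N": {"annual": 1.153948602189096,
                                          "semi-annual": 0.3675381067314767}}}
# Hypothetical reference values, for illustration only
reference_power = {"forced": {"Land_30N50N": {"annual": 0.7, "semi-annual": 0.2}}}

ratios = spectral_power_ratio(model_power, reference_power)
print(ratios)
```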
 

## Regional metrics

The precipitation variability metrics have a set of default regions. However, users can instead define a single spatial region to compute metrics over. There are two ways to do this.

1. Use the `regions_specs` parameter to define a latitude/longitude box.  
Parameter file example:
```
regions_specs={"CONUS": {"domain": {"latitude": (24.7, 49.4), "longitude": (235.22, 293.08)}}}
```

2. Use a shapefile to define a region. Users must provide the path to the shapefile along with the attribute/feature pair that defines the region.  
Parameter file example:
```
region_file="CONUS.shp" # Shapefile path
attr="NAME"             # An attribute in the shapefile
feature="CONUS"         # A unique feature name that can be 
                        # found under the "attr" attribute
```

Both options can be used at the same time. In that case, the area defined by `regions_specs` is applied first and can be used to trim down very large, high resolution datasets. Then the metrics are computed for the area defined by the shapefile region.
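
The `regions_specs` example above gives longitudes in degrees east (0 to 360). If your box is specified with negative west longitudes, it can be converted first, assuming your data uses the same 0 to 360 convention as the example. The sketch below uses a hypothetical helper, `make_region_spec`, and the west-longitude bounds are simply the example's values shifted by 360 degrees.

```python
def make_region_spec(name, lat_bounds, lon_bounds):
    """Build a regions_specs entry, mapping longitudes into the 0-360 range."""
    lon = tuple(round(x % 360.0, 2) for x in lon_bounds)
    return {name: {"domain": {"latitude": tuple(lat_bounds), "longitude": lon}}}


# -124.78 and -66.92 degrees correspond to 235.22 and 293.08 degrees east
regions_specs = make_region_spec("CONUS", (24.7, 49.4), (-124.78, -66.92))
print(regions_specs)
```

Note that a box crossing the prime meridian would end up with a minimum longitude larger than its maximum after this conversion, which this sketch does not handle.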

### Region example
First, we generate a simple shapefile for use in this demo. The shapefile contains one feature, a box that defines the CONUS region.

In [13]:
import os

import geopandas as gpd
import pandas as pd
from shapely.geometry import Polygon

# Define region box as (longitude, latitude) corners, longitudes in degrees east
coords = ((233., 22.), (233., 50.), (294., 50.), (294., 22.))

# Add to pandas dataframe, then convert to geopandas dataframe
df = pd.DataFrame({"Region": ["CONUS"], "Coords": [Polygon(coords)]})
gdf = gpd.GeoDataFrame(df, geometry="Coords", crs="EPSG:4326")

# Create the output location if needed, then write the shapefile
shp_dir = os.path.join(demo_output_directory, "shp")
os.makedirs(shp_dir, exist_ok=True)
gdf.to_file(os.path.join(shp_dir, "CONUS.shp"))

Next, pass the shapefile path, attribute, and feature name to the `variability_across_timescales_PS_driver.py` run command.

In [14]:
%%bash -s "$demo_output_directory"
variability_across_timescales_PS_driver.py -p basic_precip_variability_param.py \
--region_file $1/shp/CONUS.shp \
--attr 'Region' \
--feature 'CONUS' \
--results_dir $1/precip_variability/region_ex

  clim = np.nanmean(dseg, axis=0)
INFO::2024-09-18 16:11::pcmdi_metrics:: Results saved to a json file: /home/ordonez4/git/pcmdi_metrics/doc/jupyter/Demo/demo_output_tmp/precip_variability/region_ex/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json
2024-09-18 16:11:40,113 [INFO]: base.py(write:422) >> Results saved to a json file: /home/ordonez4/git/pcmdi_metrics/doc/jupyter/Demo/demo_output_tmp/precip_variability/region_ex/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json


demo_data_tmp/CMIP5_demo_timeseries/historical/atmos/day/pr/
pr_day_GISS-E2-H_historical_r6i1p1_*.nc
[2000, 2005]
730 365
2
demo_output_tmp/precip_variability/region_ex
demo_output_tmp/precip_variability/region_ex
demo_output_tmp/precip_variability/region_ex
['demo_data_tmp/CMIP5_demo_timeseries/historical/atmos/day/pr/pr_day_GISS-E2-H_historical_r6i1p1_20000101-20051231.nc']
GISS-E2-H.r6i1p1
['demo_data_tmp/CMIP5_demo_timeseries/historical/atmos/day/pr/pr_day_GISS-E2-H_historical_r6i1p1_20000101-20051231.nc']
GISS-E2-H.r6i1p1 365_day
2000 2005
Complete regridding from (2190, 90, 144) to (2190, 90, 180)
Cropping from shapefile
Reading region from file.
Complete calculating climatology and anomaly for calendar of 365_day
Complete power spectra (segment:  730  nps: 5.0 )
Complete domain and frequency average of spectral power
Complete power spectra (segment:  730  nps: 5.0 )
Complete domain and frequency average of spectral power




The metrics output differs from the default example: metrics are produced only for the single region defined in the shapefile.

In [15]:
output_path = os.path.join(demo_output_directory,"precip_variability/region_ex/PS_pr.day_regrid.180x90_area.freq.mean_GISS-E2-H.r6i1p1.json")
with open(output_path) as f:
    metric = json.load(f)["RESULTS"]
print(json.dumps(metric, indent=2))

{
  "GISS-E2-H.r6i1p1": {
    "forced": {
      "CONUS": {
        "annual": 1.2011870574080201,
        "semi-annual": 0.380975826207154
      }
    },
    "unforced": {
      "CONUS": {
        "interannual": 0.1521909521737256,
        "seasonal-annual": 0.20428410514869913,
        "sub-seasonal": 0.20652699240276465,
        "synoptic": 0.10360220715481439
      }
    }
  }
}
