# Compound Events
- https://github.com/e-baumer/standard_precip
- https://github.com/e-baumer/standard_precip/blob/master/standard_precip/base_sp.py
- Handling missing data when calculating SPI (see last paragraph): https://www.droughtmanagement.info/literature/WMO_standardized_precipitation_index_user_guide_en_2012.pdf#page=9


### SPI:
- Add options
    - freq = ['D', 'M'] (daily or monthly pr data)
    - scale = # (integer length of the SPI accumulation window, in units of freq)
- If freq == 'M': get monthly pr files
- If freq == 'D': get daily pr files
- Concat pr data into 3 dfs, one per scenario:
    - [historical, ssp126]
    - [historical, ssp245]
    - [historical, ssp370]
- Calculate SPI for each model using "from standard_precip.spi import SPI"
- Extract JJA results for 2015-2100 for spi and tasmax
    - Filter by common model/column names
    - Concat spi and tasmax (axis=0)
    - Concat ssps (axis=1)
    - Determine if each day is compound
    - Output 1 df per threshold combo
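
A minimal sketch of the last two steps (flag compound days, then build one df per threshold combo). The helper names `flag_compound` and `longest_run`, the synthetic data, and the extra thresholds (-1.5, 35.0) are assumptions for illustration; only the `spi<-1_tasmax>32.2` key format and the `compound_day_total` name come from this notebook. The notebook's own `event_max` and `event__total` definitions are not shown here, so they are not reproduced.

```python
from itertools import product

import pandas as pd


def flag_compound(spi, tasmax, spi_max, tasmax_min):
    """True on days that are both dry (spi < spi_max) and hot (tasmax > tasmax_min)."""
    return (spi < spi_max) & (tasmax > tasmax_min)


def longest_run(flags):
    """Length of the longest run of consecutive True values (hypothetical metric)."""
    flags = flags.astype(bool)
    # Every False day starts a new group; summing within a group counts its True days.
    groups = (~flags).cumsum()
    runs = flags.groupby(groups).sum()
    return int(runs.max()) if len(runs) else 0


# Synthetic daily values for one model (illustrative only).
dates = pd.date_range("2021-06-01", periods=8, freq="D")
spi = pd.Series([-1.5, -1.2, -0.5, -1.3, -1.4, -1.6, 0.2, -2.0], index=dates)
tasmax = pd.Series([33.0, 34.1, 35.0, 32.5, 33.3, 31.0, 36.0, 33.0], index=dates)

spi_thresholds = [-1.0, -1.5]      # -1.5 is an assumed extra threshold
tasmax_thresholds = [32.2, 35.0]   # 35.0 is an assumed extra threshold

results = {}
for s, t in product(spi_thresholds, tasmax_thresholds):
    compound = flag_compound(spi, tasmax, s, t)
    by_year = compound.groupby(compound.index.year)
    # One df per threshold combo, indexed by year.
    results[f"spi<{s:g}_tasmax>{t:g}"] = pd.DataFrame({
        "compound_day_total": by_year.sum(),
        "longest_run": by_year.apply(longest_run),
    })

print(results["spi<-1_tasmax>32.2"])
```

Keying the dict by the threshold string keeps each combo addressable the same way as `results['spi<-1_tasmax>32.2']` further down.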

### Problem:
- KACE-1-0-G is currently skipped because it raises a division-by-zero error (~1000 missing values)
- The SPI calculation does not handle NaNs well: 1 row of missing pr results in at least 30 rows of missing spi

- Why are there missing values?
    - KACE-1-0-G: no values for the 31st day of any month
    - Models missing 37 values: no value for Feb 29 (leap years)
        - [INM-CM4-8, INM-CM5-0, NorESM2-MM, NorESM2-LM, GFDL-ESM4, GISS-E2-1-G, FGOALS-g3, BCC-CSM2-MR, CMCC-ESM2, CESM2]
     
- How to handle missing values? 
    - Check original code
    - Options: interpolation by time, multivariate imputation (slow), back/front fill, ...
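
The NaN propagation noted above can be reproduced with a plain rolling sum. This is a sketch that assumes the SPI accumulates pr over a rolling window of `scale` rows; it uses `pandas` directly rather than `standard_precip`, so it only illustrates the likely mechanism, not the library's actual code path.

```python
import numpy as np
import pandas as pd

scale = 30  # same scale as used in this notebook

# Synthetic daily pr series with a single missing day (illustrative only).
dates = pd.date_range("2021-01-01", periods=120, freq="D")
pr = pd.Series(np.ones(len(dates)), index=dates)
pr.iloc[60] = np.nan

# With the default min_periods (= window size), any window that
# contains the NaN has too few valid rows and evaluates to NaN.
rolled = pr.rolling(scale).sum()

# Ignore the first scale-1 rows, which are NaN only because the window is not full yet.
warm_up = scale - 1
n_missing = int(rolled.iloc[warm_up:].isna().sum())
print(n_missing)  # 30: every window that ends on or after the gap, for `scale` rows
```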
#### Missing value solution: linear interpolation (df sorted by time)
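
A minimal sketch of this approach, assuming the pr data is a DataFrame with a `DatetimeIndex` and one column per model. The helper name `fill_missing_days` and the column `model_a_pr` are hypothetical; the actual implementation in the `process` module is not shown in this notebook and may differ.

```python
import pandas as pd


def fill_missing_days(df, start, end):
    """Reindex to a complete daily calendar and fill the gaps by linear interpolation."""
    full_index = pd.date_range(start, end, freq="D")
    # Sort by time first so neighbouring rows are neighbouring days.
    out = df.sort_index().reindex(full_index)
    # method="linear" treats rows as equally spaced, which holds once the
    # index is a complete daily calendar; limit_direction="both" also fills edge gaps.
    return out.interpolate(method="linear", limit_direction="both")


# A model on a no-leap calendar has no row for Feb 29 (illustrative values).
dates = pd.to_datetime(["2024-02-27", "2024-02-28", "2024-03-01", "2024-03-02"])
pr = pd.DataFrame({"model_a_pr": [1.0, 2.0, 4.0, 5.0]}, index=dates)

filled = fill_missing_days(pr, "2024-02-27", "2024-03-02")
print(filled)
```

Reindexing first makes the missing days explicit as NaN rows, so the same call covers both the Feb 29 gaps and the missing 31st days. `method="time"` is an alternative when the index is not evenly spaced.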

In [1]:
%%time

from process import *
import os
import pandas as pd
import warnings
import matplotlib.pyplot as plt
import seaborn as sns
from standard_precip.spi import SPI

warnings.filterwarnings('ignore')

plt.rcParams['figure.figsize'] = (15, 4)
plt.rcParams['figure.dpi'] = 300 # 600
plt.rcParams['font.size'] = 10
plt.rcParams['figure.titlesize'] = 15
plt.rcParams['axes.linewidth'] = 0.1
plt.rcParams['patch.linewidth'] = 0
plt.rcParams['grid.linewidth'] = 0.1

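event = ['CWHE', 'CDHE'][1]  # select the event type by index; [1] is 'CDHE'
months = [6, 7, 8]  # June, July, August (JJA)
center = 'LARC'

freq, scale = 'D', 30  # daily pr data, SPI scale of 30

# initialize() and main() come from the local process module
initialize(center, event, months, freq, scale)
results, compound, pr_spi, tm, pr_, tm_ = main()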

Processing ssp126 spi (1981-01-01 00:00:00 - 2100-12-31 00:00:00)...
Processing ssp245 spi (1981-01-01 00:00:00 - 2100-12-31 00:00:00)...
Processing ssp370 spi (1981-01-01 00:00:00 - 2100-12-31 00:00:00)...
CPU times: user 38.8 s, sys: 1.36 s, total: 40.1 s
Wall time: 40.8 s


In [2]:
pr_spi

ssp,date,INM-CM4-8_pr,INM-CM5-0_pr,NorESM2-MM_pr,NorESM2-LM_pr,MIROC6_pr,GFDL-ESM4_pr,MIROC-ES2L_pr,GISS-E2-1-G_pr,FGOALS-g3_pr,MPI-ESM1-2-HR_pr,...,MRI-ESM2-0_spi,CMCC-ESM2_spi,ACCESS-ESM1-5_spi,EC-Earth3_spi,ACCESS-CM2_spi,IPSL-CM6A-LR_spi,CNRM-ESM2-1_spi,CNRM-CM6-1_spi,KACE-1-0-G_spi,CESM2_spi
ssp126,2021-06-01,0.000000,0.081689,0.000000,0.000000,0.000814,2.004056,2.367464,0.008973,1.138105,2.315744,...,1.944710,-0.067509,0.615972,-1.286553,-0.789865,-0.011746,-0.093620,1.087828,0.415413,0.048708
ssp126,2021-06-02,0.297987,8.573282,0.000000,0.000000,0.000000,0.000000,0.508515,0.347845,0.963457,8.882218,...,1.513595,-0.528473,0.692402,-1.232343,-0.452993,0.150711,0.104578,1.072619,0.517579,0.071000
ssp126,2021-06-03,0.462374,8.971298,0.000000,0.000000,0.000000,3.174705,0.072587,9.452589,0.681418,2.714005,...,1.009377,-0.554914,0.716502,-0.997270,-0.499355,0.276396,0.447028,1.056469,0.535615,0.216405
ssp126,2021-06-04,0.120682,0.630614,0.000000,0.094576,13.903137,0.507504,0.000169,2.450261,2.285066,3.401442,...,0.807251,-0.569984,0.756254,-0.828589,-0.465400,0.036494,0.715795,1.093336,0.509715,0.180110
ssp126,2021-06-05,0.639671,0.000000,3.440921,4.472532,2.895703,0.934759,0.019373,2.092968,0.986643,5.269065,...,0.840292,-0.642846,0.978917,-0.822810,-0.321942,-0.351529,1.019566,1.072641,0.437197,-0.112267
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
ssp370,2100-08-27,13.431886,6.918253,0.064743,0.091623,0.000000,1.345558,15.426550,2.007805,14.151917,6.271328,...,0.412934,-0.092322,-0.944546,1.536825,0.919899,0.353573,-1.029783,1.878583,0.928181,-0.373419
ssp370,2100-08-28,16.964950,0.076420,0.131618,0.115677,0.000920,0.000000,0.431562,0.572648,1.349138,0.000000,...,0.441634,-0.060102,-0.774077,1.321439,0.966694,0.364413,-1.036629,1.787735,0.901117,-0.375359
ssp370,2100-08-29,13.477572,0.029418,8.323636,2.779805,9.479126,0.000000,0.008801,5.638641,0.000000,0.000000,...,0.523189,-0.040056,0.146903,0.786670,0.940155,0.417501,-1.010875,1.791265,0.804114,-0.393232
ssp370,2100-08-30,7.680827,2.135523,16.355370,17.000638,2.276963,1.311583,0.000000,0.009842,0.281959,0.000000,...,0.607270,-0.022949,0.178776,0.679696,0.929014,0.325915,-1.036792,1.696052,0.760785,-0.410191


In [6]:
metrics = ['_day_total$', '_event__total$', '_event_max$']  # regex suffixes of the metric columns
display(results['spi<-1_tasmax>32.2'].filter(regex='ACCESS-CM2'))

ssp,date,ACCESS-CM2_compound_event_max,ACCESS-CM2_compound_event__total,ACCESS-CM2_compound_day_total
ssp126,2021,6,6,6
ssp126,2022,10,43,46
ssp126,2023,0,0,0
ssp126,2024,3,3,3
ssp126,2025,0,0,0
...,...,...,...,...
ssp370,2096,3,3,3
ssp370,2097,6,8,8
ssp370,2098,13,15,16
ssp370,2099,2,2,2


In [4]:
# for column in spi_dfs[ssp_name].columns[1:]:
#     spi_dfs[ssp_name][column].plot.kde()
#     plt.title(f'Precipitation')
# plt.xlabel('Models')
# plt.ylabel('Density')
# plt.show()