This notebook computes the frequency, total days, and intensity of urban heat waves (UHWs) from **CMIP5 urban predictions**.

The input data sets are read from:
```bash
/glade/scratch/zhonghua/CMIP5_pred/
```

The results are saved at:
```
/glade/scratch/zhonghua/uhws/UHWs_CMIP/
```

Note:     
**2006**: The 98th percentile threshold is computed from the 2006 data itself and then used to calculate the 2006 frequency, total days, and intensity.  
**2061**: The percentile threshold from **2006** is reused to calculate the 2061 frequency, total days, and intensity.  
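The baseline-threshold idea in the note can be sketched as follows. This is a minimal illustration on synthetic data, not the code in `util`: the helper names `baseline_threshold` and `flag_exceedance` and the `time` column are hypothetical, while the `lat`, `lon`, `mean`, and `quant` column names are taken from the cells and warning output in this notebook.

```python
import numpy as np
import pandas as pd

def baseline_threshold(df, q=0.98):
    """Per-grid-cell q-th quantile of the daily 'mean' column (hypothetical helper)."""
    return (df.groupby(["lat", "lon"])["mean"]
              .quantile(q)
              .rename("quant")
              .reset_index())

def flag_exceedance(df, threshold):
    """Attach a precomputed threshold to each row and mark days above it."""
    out = df.merge(threshold, on=["lat", "lon"], how="inner")
    out["hot"] = out["mean"] > out["quant"]
    return out

# Synthetic single-grid-cell data standing in for one model's predictions
rng = np.random.default_rng(0)
days = pd.date_range("2006-01-01", periods=100)
base = pd.DataFrame({"lat": 40.0, "lon": -88.0, "time": days,
                     "mean": rng.normal(300.0, 2.0, size=100)})
future = base.assign(mean=base["mean"] + 3.0)  # synthetic warmer year

thr = baseline_threshold(base)               # threshold from the baseline year only
base_flags = flag_exceedance(base, thr)      # baseline year against its own threshold
future_flags = flag_exceedance(future, thr)  # later year against the baseline threshold
```

Because the later year is compared against the baseline threshold rather than its own, a warmer year produces more exceedance days instead of simply shifting the threshold upward.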

In [1]:
import xarray as xr
import datetime
import pandas as pd
import numpy as np
from mpl_toolkits.basemap import Basemap
import matplotlib.pyplot as plt
import time
import gc
import util
# from s3fs.core import S3FileSystem
# s3 = S3FileSystem()

save_dir = "/glade/scratch/zhonghua/uhws/UHWs_CMIP/"

In [2]:
CMIP5_ls = ["ACCESS1-0", "ACCESS1-3", "CanESM2", "CNRM-CM5", "CSIRO-Mk3-6-0",
            "FGOALS-s2","GFDL-CM3", "GFDL-ESM2G", "GFDL-ESM2M", "HadGEM2-CC",
            "HadGEM2-ES", "IPSL-CM5A-MR", "MIROC5", "MIROC-ESM", "MIROC-ESM-CHEM",
            "MRI-CGCM3", "MRI-ESM1"]

## Step 1: Use the 98th percentile of 2006 to compute the frequency (events/year), total days (days/year), and intensity (K) for 2006 and 2061
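To make the three metrics concrete, here is a minimal sketch for a single grid cell. The definitions are assumptions, since the internals of `util` are not shown: an event is taken to be a run of at least `min_days` consecutive days above the threshold, frequency is the number of events, total days is the number of days inside events, and intensity is the mean exceedance above the threshold on those days. The function name `heat_wave_metrics` is hypothetical.

```python
import numpy as np

def heat_wave_metrics(temps, threshold, min_days=2):
    """Return (frequency, total_days, intensity) under the assumed definitions."""
    temps = np.asarray(temps, dtype=float)
    hot = temps > threshold

    # Pad with False on both sides so every run has a detectable start and end
    padded = np.concatenate(([False], hot, [False])).astype(int)
    changes = np.diff(padded)
    starts = np.flatnonzero(changes == 1)   # first hot day of each run
    ends = np.flatnonzero(changes == -1)    # one past the last hot day of each run

    # Keep only runs that are long enough to count as an event
    events = [(s, e) for s, e in zip(starts, ends) if e - s >= min_days]
    frequency = len(events)
    total_days = int(sum(e - s for s, e in events))
    if total_days == 0:
        return 0, 0, float("nan")

    event_temps = np.concatenate([temps[s:e] for s, e in events])
    intensity = float(event_temps.mean() - threshold)  # mean exceedance in K
    return frequency, total_days, intensity

# Two qualifying runs (2 and 3 days) and one isolated hot day that is ignored
freq, days, inten = heat_wave_metrics(
    [300, 305, 306, 300, 305, 300, 304, 305, 306], threshold=303.0)
```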

In [3]:
frequency_2006_ls=[]
duration_2006_ls=[]
intensity_2006_ls=[]
quantile_avail_2006_ls=[]

frequency_2061_ls=[]
duration_2061_ls=[]
intensity_2061_ls=[]

for model in CMIP5_ls:
    print("start model:",model)
    
    # start 2006
    start_time_2006=time.time()
    df_2006=util.load_df("/glade/scratch/zhonghua/CMIP5_pred/2006/"+model+".csv")
    cmip_2006_hw, quantile_avail_2006=util.get_heat_waves_df(df_2006, 0.98, 2, "cmip", None)
    
    frequency_2006_ls.append(util.get_frequency(cmip_2006_hw,model))
    duration_2006_ls.append(util.get_duration(cmip_2006_hw,model))
    intensity_2006_ls.append(util.get_intensity(cmip_2006_hw,model))
    quantile_avail_2006_ls.append(quantile_avail_2006.copy().rename(columns={"quant": model}).set_index(["lat","lon"]))
    print("It took",time.time()-start_time_2006,"to deal with",model,"for year 2006")

    
    # start 2061
    start_time_2061=time.time()
    df_2061=util.load_df("/glade/scratch/zhonghua/CMIP5_pred/2061/"+model+".csv")
    cmip_2061_hw, quantile_avail_2061=util.get_heat_waves_df(df_2061, None, 2, "cmip", quantile_avail_2006)
    
    frequency_2061_ls.append(util.get_frequency(cmip_2061_hw,model))
    duration_2061_ls.append(util.get_duration(cmip_2061_hw,model))
    intensity_2061_ls.append(util.get_intensity(cmip_2061_hw,model))
    
    print("It took",time.time()-start_time_2061,"to deal with",model,"for year 2061")
    print("\n")
    
    del df_2006, df_2061, quantile_avail_2006, quantile_avail_2061
    gc.collect()

start model: ACCESS1-0
Start to load csv /glade/scratch/zhonghua/CMIP5_pred/2006/ACCESS1-0.csv
It takes 10.0939781665802 to load csv
The quantile is: 0.98
The duration threshold is: 2


A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_with_quantile["HW"][df_with_quantile["mean"]> df_with_quantile["quant"]] = 0


It took 38.14654755592346 to deal with ACCESS1-0 for year 2006
Start to load csv /glade/scratch/zhonghua/CMIP5_pred/2061/ACCESS1-0.csv
It takes 7.930373191833496 to load csv
The quantile is: None
The duration threshold is: 2
It took 31.62481689453125 to deal with ACCESS1-0 for year 2061


start model: ACCESS1-3
Start to load csv /glade/scratch/zhonghua/CMIP5_pred/2006/ACCESS1-3.csv
It takes 8.231269121170044 to load csv
The quantile is: 0.98
The duration threshold is: 2
It took 35.42331409454346 to deal with ACCESS1-3 for year 2006
Start to load csv /glade/scratch/zhonghua/CMIP5_pred/2061/ACCESS1-3.csv
It takes 10.763758897781372 to load csv
The quantile is: None
The duration threshold is: 2
It took 34.87822866439819 to deal with ACCESS1-3 for year 2061


start model: CanESM2
Start to load csv /glade/scratch/zhonghua/CMIP5_pred/2006/CanESM2.csv
It takes 7.6610589027404785 to load csv
The quantile is: 0.98
The duration threshold is: 2
It took 35.112098693847656 to deal with CanESM2 for 
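The `SettingWithCopyWarning` in the output above is raised by the chained assignment `df_with_quantile["HW"][...] = 0`. A common way to avoid it is a single `.loc` assignment, sketched below on a toy frame. Only the quoted line of `util` is known, so the starting `HW` values here are made up for illustration.

```python
import pandas as pd

# Toy stand-in for the frame named in the warning; values are illustrative only
df_with_quantile = pd.DataFrame({
    "mean":  [305.0, 299.0, 310.0],
    "quant": [303.0, 303.0, 303.0],
    "HW":    [1, 1, 1],
})

# One indexing operation selects rows and column together, so pandas
# writes into the original frame rather than a temporary copy
mask = df_with_quantile["mean"] > df_with_quantile["quant"]
df_with_quantile.loc[mask, "HW"] = 0
```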

In [4]:
frequency_2006 = pd.concat(frequency_2006_ls, axis=1)
duration_2006 = pd.concat(duration_2006_ls, axis=1)
intensity_2006 = pd.concat(intensity_2006_ls, axis=1)
quantile_avail_2006 = pd.concat(quantile_avail_2006_ls, axis=1)

frequency_2061 = pd.concat(frequency_2061_ls, axis=1)
duration_2061 = pd.concat(duration_2061_ls, axis=1)
intensity_2061 = pd.concat(intensity_2061_ls, axis=1)
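`pd.concat(..., axis=1)` aligns the per-model frames on their index and takes the union of index labels, so grid cells missing from a model appear as `NaN` in that model's column. The sketch below shows this on two toy frames. The `(lat, lon)` index matches how the percentile frames are built above; that the frequency, duration, and intensity frames share this index is an assumption.

```python
import pandas as pd

idx_a = pd.MultiIndex.from_tuples([(40.0, -88.0), (41.0, -87.0)], names=["lat", "lon"])
idx_b = pd.MultiIndex.from_tuples([(41.0, -87.0), (42.0, -86.0)], names=["lat", "lon"])

# One column per model, named after the model (toy values)
a = pd.DataFrame({"ACCESS1-0": [1.0, 2.0]}, index=idx_a)
b = pd.DataFrame({"CanESM2": [3.0, 4.0]}, index=idx_b)

# Outer join on the index: three grid cells, two model columns
combined = pd.concat([a, b], axis=1)
```

Checking `combined.isna().sum()` per column is a quick way to see whether every model covers the same set of grid cells.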

In [5]:
# 2061 reuses the 2006 percentile thresholds, so only the 2006 percentiles are saved
frequency_2006.to_csv(save_dir+"2006_frequency.csv")
duration_2006.to_csv(save_dir+"2006_totaldays.csv")
intensity_2006.to_csv(save_dir+"2006_intensity.csv")
quantile_avail_2006.to_csv(save_dir+"2006_percentile.csv")

frequency_2061.to_csv(save_dir+"2061_frequency.csv")
duration_2061.to_csv(save_dir+"2061_totaldays.csv")
intensity_2061.to_csv(save_dir+"2061_intensity.csv")