## Calculate Seasonal ENSO Skill

### In this example, we demonstrate: 
1. How to remotely access data from the North American Multi-model Ensemble (NMME) hindcast database and set it up to be used in `climpred`
2. How to calculate the Anomaly Correlation Coefficient (ACC) using seasonal data

### The North American Multi-model Ensemble (NMME)

Further information on NMME is available from [Kirtman et al. 2014](https://journals.ametsoc.org/doi/full/10.1175/BAMS-D-12-00050.1) and the [NMME project website](https://www.cpc.ncep.noaa.gov/products/NMME/).

The NMME public database is hosted on the International Research Institute for Climate and Society (IRI) data server: http://iridl.ldeo.columbia.edu/SOURCES/.Models/.NMME/

### Definitions

Anomalies
: Departure from normal, where normal is defined as the climatological value based on the average value for each month over all years.

Nino3.4
: An index used to represent the evolution of the El Nino-Southern Oscillation (ENSO). It is calculated as the average sea surface temperature (SST) anomaly over the region 5°S-5°N, 190°E-240°E.
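
As a minimal sketch of the Nino3.4 box average, the block below builds a synthetic 1-degree grid (the values and the names `in_box`, `nino34_box`, and `nino34_mean` are hypothetical, not taken from the NMME files) and averages over 5°S-5°N, 190°E-240°E. It assumes `lat` is stored in ascending order and `lon` runs from 0 to 360; with a descending `lat` the slice bounds would need to be reversed.

```python
import numpy as np
import xarray as xr

# Synthetic 1-degree SST grid (hypothetical values, for illustration only).
lat = np.arange(-10.0, 11.0, 1.0)
lon = np.arange(180.0, 251.0, 1.0)
sst = xr.DataArray(
    np.full((lat.size, lon.size), 27.0),
    coords={"lat": lat, "lon": lon},
    dims=["lat", "lon"],
    name="sst",
)

# Make the Nino3.4 box 1 degree warmer than its surroundings.
in_box = (sst.lat >= -5) & (sst.lat <= 5) & (sst.lon >= 190) & (sst.lon <= 240)
sst = sst.where(~in_box, 28.0)

# Label-based slices include both endpoints of the box.
nino34_box = sst.sel(lat=slice(-5, 5), lon=slice(190, 240))

# Unweighted mean over the box, the same reduction used later in this example.
nino34_mean = float(nino34_box.mean(["lat", "lon"]))
print(nino34_box.sizes, nino34_mean)
```

The mean here is unweighted. An area-weighted mean would be a different design choice and is not what this example uses.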

In [None]:
import warnings

import matplotlib.pyplot as plt
import xarray as xr
import pandas as pd
import numpy as np

from climpred import HindcastEnsemble
import climpred

import fsspec

In [None]:
warnings.filterwarnings("ignore")

In [None]:
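def decode_cf(ds, time_var):
    """Decode CF-encoded times after fixing a nonstandard calendar name."""
    # A calendar labeled '360' is not a CF calendar name; use '360_day' instead.
    if ds[time_var].attrs['calendar'] == '360':
        ds[time_var].attrs['calendar'] = '360_day'
    ds = xr.decode_cf(ds, decode_times=True)
    return ds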

Load the monthly sea surface temperature (SST) hindcast data for the NCEP-CFSv2 model, with the Nino3.4 region already extracted.

In [None]:
filepath = '/glade/work/rbrady/workshops/climpred/NMME_NCEP-CFSv2_SSTENSOREG_Hindcast.nc'
fcstds = xr.open_dataset(filepath, decode_times=False)

# If not on Cheyenne, comment out above and uncomment below.
# fcstds = xr.open_zarr(fsspec.get_mapper('gcs://climpred_workshop/NMME_NCEP-CFSv2_SSTENSOREG_Hindcast'),
#                      decode_times=False)

fcstds = decode_cf(fcstds, 'S')

The NMME data dimensions correspond to the following `climpred` dimension definitions: `X=lon`, `L=lead`, `Y=lat`, `M=member`, `S=init`. We will rename the dimensions to their `climpred` names.

In [None]:
fcstds=fcstds.rename({'S': 'init','L': 'lead','M': 'member', 'X': 'lon', 'Y': 'lat'})

Let's make sure that the `lead` dimension is set properly for `climpred`. NMME stores `lead` values as 0.5, 1.5, 2.5, etc., which correspond to 0, 1, 2, ... months since initialization. We will change `lead` to integers starting at zero.

In [None]:
fcstds['lead']=(fcstds['lead']-0.5).astype('int')
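
The lead conversion above can be checked on a small array. This is a sketch on a hypothetical five-month lead axis (the names `lead` and `lead_int` are illustrative); it also asserts that subtracting 0.5 yields whole numbers, since `astype('int')` would otherwise truncate silently.

```python
import numpy as np
import xarray as xr

# Hypothetical NMME-style lead axis in months (0.5, 1.5, ... convention).
lead = xr.DataArray(np.arange(0.5, 5.5, 1.0), dims="lead", name="lead")

# Shift to whole months since initialization and cast to integers.
lead_int = (lead - 0.5).astype("int")

# Guard against silent truncation if the axis were not on the x.5 grid.
assert np.allclose((lead - 0.5).values, lead_int.values)
print(lead_int.values)
```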

Now we need to make sure that the `init` dimension is set properly for `climpred`. For monthly data, the `init` dimension must be an `xr.CFTimeIndex` or a `pd.DatetimeIndex`. We convert the `init` values to a `pd.DatetimeIndex`.

In [None]:
fcstds['init']=pd.to_datetime(fcstds.init.values.astype(str))
fcstds['init']=pd.to_datetime(fcstds['init'].dt.strftime('%Y%m01 00:00'))
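
The second line above resets every initialization date to the first day of its month at 00:00, presumably so that forecast and verification time stamps follow the same convention. A minimal sketch of that idea with plain `pandas`, using hypothetical mid-month dates and an ISO-style format string rather than the one in the cell above:

```python
import pandas as pd

# Hypothetical mid-month time stamps (illustration only).
raw = pd.to_datetime(
    ["1982-01-16 12:00", "1982-02-15 00:00", "1982-03-16 12:00"]
)

# Keep the year and month, force the day to 1 and the time to 00:00.
month_start = pd.to_datetime(raw.strftime("%Y-%m-01"))
print(month_start)
```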

Next, we load the verification SST data, also with the Nino3.4 region already extracted.

In [None]:
filepath = '/glade/work/rbrady/workshops/climpred/NMME_NOAA-OISSTv2_SSTENSOREG_Verif.nc'
verifds = xr.open_dataset(filepath, decode_times=False)

# If not on Cheyenne, comment out above and uncomment below.
# verifds = xr.open_zarr(fsspec.get_mapper('gcs://climpred_workshop/NMME_NOAA-OISSTv2_SSTENSOREG_Verif'),
#                      decode_times=False)

verifds = decode_cf(verifds, 'T')

Rename the dimensions to correspond to `climpred` dimensions.

In [None]:
verifds=verifds.rename({'T': 'time','X': 'lon', 'Y': 'lat'})

Convert the `time` data to a `pd.DatetimeIndex`.

In [None]:
verifds['time']=pd.to_datetime(verifds.time.values.astype(str))
verifds['time']=pd.to_datetime(verifds['time'].dt.strftime('%Y%m01 00:00'))
verifds

Subset the data to 1982-2010.

In [None]:
verifds=verifds.sel(time=slice('1982-01-01','2010-12-01'))
fcstds=fcstds.sel(init=slice('1982-01-01','2010-12-01'))

Calculate the Nino3.4 index for the forecast and verification data, then subtract the monthly climatology to obtain anomalies.

In [None]:
fcstnino34=fcstds.sel(lat=slice(-5,5),lon=slice(190,240)).mean(['lat','lon'])
verifnino34=verifds.sel(lat=slice(-5,5),lon=slice(190,240)).mean(['lat','lon'])

fcstclimo = fcstnino34.groupby('init.month').mean('init')
fcstanoms = (fcstnino34.groupby('init.month') - fcstclimo)

verifclimo = verifnino34.groupby('time.month').mean('time')
verifanoms = (verifnino34.groupby('time.month') - verifclimo)

print(fcstanoms)
print(verifanoms)
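
To see what the `groupby` steps above do, here is a small sketch on a synthetic monthly series (the numbers and the names `seasonal_cycle` and `yearly_offset` are hypothetical). The series is a fixed seasonal cycle plus a year-to-year offset, so removing the monthly climatology should leave only the offset.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Three years of synthetic monthly data: a repeating seasonal cycle plus a
# different constant offset in each year (illustration only).
time = pd.date_range("1982-01-01", periods=36, freq="MS")
seasonal_cycle = np.tile(np.arange(12.0), 3)
yearly_offset = np.repeat([-1.0, 0.0, 1.0], 12)
sst = xr.DataArray(
    seasonal_cycle + yearly_offset, coords={"time": time}, dims="time"
)

# Climatology: the mean of each calendar month over all years.
climo = sst.groupby("time.month").mean("time")

# Anomalies: each value minus the climatology of its calendar month.
anoms = sst.groupby("time.month") - climo
print(anoms.values)
```

In the forecast case the grouping is on `init.month`, so the climatology is computed separately for each start month and lead.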

Make seasonal averages with `center=True` and drop NaNs. This means that the first value is the average of the first three months and is labeled with the middle one.

In [None]:
fcstnino34seas=fcstanoms.rolling(lead=3, center=True).mean().dropna(dim='lead')
verifnino34seas=verifanoms.rolling(time=3, center=True).mean().dropna(dim='time')
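
The effect of a centered three-point rolling mean followed by `dropna` can be seen on a toy array. This sketch uses six hypothetical monthly values; the first and last positions have incomplete windows and become NaN, so they are dropped.

```python
import numpy as np
import xarray as xr

# Six hypothetical monthly values on leads 0..5 (illustration only).
monthly = xr.DataArray(
    np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0]),
    coords={"lead": np.arange(6)},
    dims="lead",
)

# Centered 3-month window: position i holds the mean of i-1, i, and i+1.
rolled = monthly.rolling(lead=3, center=True).mean()

# The two edge positions lack a full window, so they are NaN and removed.
seasonal = rolled.dropna(dim="lead")
print(seasonal.lead.values, seasonal.values)
```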

Create a new `xr.DataArray` with seasonal data, keeping every third rolling mean so that the seasons do not overlap.

In [None]:
nleads=fcstnino34seas['lead'][::3].size
fcst=xr.DataArray(fcstnino34seas['sst'][:,::3,:], 
                           coords={'init' : fcstnino34seas['init'],
                                   'lead': np.arange(0,nleads),
                                   'member': fcstanoms['member'],
                                   },
                           dims=['init','lead','member'])
fcst.name = 'sst'
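
The `[::3]` stride above keeps every third rolling mean. A sketch of the bookkeeping, assuming a hypothetical model with 10 monthly leads (0 to 9); the actual number of leads in the file is not stated here, so treat the numbers as illustrative.

```python
import numpy as np

# Hypothetical case: 10 monthly leads (0..9). After a centered 3-month
# rolling mean and dropna, the surviving window centers are leads 1..8.
centers = np.arange(1, 9)

# Keep every third window so that consecutive seasons share no months.
season_centers = centers[::3]

# Months covered by each retained window (center - 1, center, center + 1).
season_months = [(int(c) - 1, int(c), int(c) + 1) for c in season_centers]

# New integer lead axis counted in seasons instead of months.
season_lead = np.arange(season_centers.size)
print(season_months, season_lead)
```

Under this assumption the retained seasons cover months 0-2, 3-5, and 6-8, and the new `lead` axis counts them as 0, 1, 2.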

Set the `units` attribute of the `lead` coordinate to `seasons`.

In [None]:
fcst['lead'].attrs={'units': 'seasons'}

Create a `climpred` `HindcastEnsemble` object and add the verification data to it.

In [None]:
hindcast = HindcastEnsemble(fcst)
hindcast = hindcast.add_observations(verifnino34seas, 'observations')

Compute the Anomaly Correlation Coefficient (ACC) at each seasonal lead time.

In [None]:
skillds = hindcast.verify(metric='acc')
print(skillds)
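
For intuition about the metric, the sketch below treats the ACC as the Pearson correlation between forecast and observed anomalies across initializations. That is an assumption for illustration; how `climpred` aligns forecasts with observations and reduces over members is not reproduced here, and `anomaly_correlation` is a hypothetical helper, not a `climpred` function.

```python
import numpy as np


def anomaly_correlation(forecast_anom, observed_anom):
    """Pearson correlation between two 1-D anomaly series (illustrative helper)."""
    f = np.asarray(forecast_anom, dtype=float)
    o = np.asarray(observed_anom, dtype=float)
    # Remove any residual mean so the result is a centered correlation.
    f = f - f.mean()
    o = o - o.mean()
    return float((f * o).sum() / np.sqrt((f ** 2).sum() * (o ** 2).sum()))


# Hypothetical observed anomalies for five start dates.
obs = np.array([-1.0, 0.5, 1.5, -0.5, -0.5])

# A forecast with the right pattern but the wrong amplitude still scores ~1,
# because correlation is insensitive to scaling.
perfect = anomaly_correlation(2.0 * obs, obs)

# A forecast with the opposite sign everywhere scores ~-1.
opposite = anomaly_correlation(-obs, obs)
print(perfect, opposite)
```

Because correlation ignores amplitude and mean bias, a high ACC does not by itself indicate a small forecast error.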

Make a bar plot of Nino3.4 skill at each seasonal lead time.

In [None]:
x = np.arange(nleads)
plt.bar(x,skillds['sst'])
plt.xticks(x)
plt.title('NCEP-CFSv2 Nino34 ACC')
plt.xlabel('Lead (Season)')
plt.ylabel('ACC')

### References

1. Kirtman, B.P., D. Min, J.M. Infanti, J.L. Kinter, D.A. Paolino, Q. Zhang, H. van den Dool, S. Saha, M.P. Mendez, E. Becker, P. Peng, P. Tripp, J. Huang, D.G. DeWitt, M.K. Tippett, A.G. Barnston, S. Li, A. Rosati, S.D. Schubert, M. Rienecker, M. Suarez, Z.E. Li, J. Marshak, Y. Lim, J. Tribbia, K. Pegion, W.J. Merryfield, B. Denis, and E.F. Wood, 2014: The North American Multimodel Ensemble: Phase-1 Seasonal-to-Interannual Prediction; Phase-2 toward Developing Intraseasonal Prediction. Bull. Amer. Meteor. Soc., 95, 585–601, https://doi.org/10.1175/BAMS-D-12-00050.1