# Copyright Netherlands eScience Center <br>
** Function     : Testing scoring system with climatology** <br>
** Author       : Yang Liu ** <br>
** First Built  : 2020.08.21 ** <br>
** Last Update  : 2020.08.21 ** <br>
** Library      : PyTorch, NumPy, NetCDF4, os, iris, cartopy, deepclim, matplotlib **<br>
Description     : This notebook predicts Arctic sea ice using deep learning. Several climate indices are included to represent the forcing from the atmosphere. A convolutional Long Short-Term Memory (ConvLSTM) neural network is used to handle this spatio-temporal sequence problem, with PyTorch as the deep learning framework. <br>
<br>
** Here we predict sea ice concentration with one extra relevant field from either ocean or atmosphere to test the predictor.** <br>

Return Values   : Time series and figures <br>

The regionalization adopted here follows that of the MASIE (Multisensor Analyzed Sea Ice Extent) product available from the National Snow and Ice Data Center:<br>
https://nsidc.org/data/masie/browse_regions<br>
It follows J. Walsh et al. (2019), Benchmark seasonal prediction skill estimates based on regional indices.<br>

The method comes from the study by Shi et al. (2015), Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting. <br>

In [1]:
%matplotlib inline

import sys
import numbers

# for data loading
import os
from netCDF4 import Dataset
# for pre-processing and machine learning
import numpy as np
#import sklearn
#import scipy
import torch
import torch.nn.functional

#sys.path.append(os.path.join('C:','Users','nosta','ML4Climate','Scripts','DeepClim'))
sys.path.append("C:\\Users\\nosta\\ML4Climate\\Scripts\\DLACs")
import dlacs
import dlacs.preprocess
import dlacs.ConvLSTM

The testing device is a Dell Inspiron 5680 with an Intel Core i7-8700 x64 CPU and an Nvidia GTX 1060 6GB GPU.<br>
A benchmark comparing a CPU with the GTX 1060 is available here: <br>
https://www.analyticsindiamag.com/deep-learning-tensorflow-benchmark-intel-i5-4210u-vs-geforce-nvidia-1060-6gb/

In [2]:
# constants
constant = {'g' : 9.80616,      # gravitational acceleration [m / s2]
            'R' : 6371009,      # radius of the earth [m]
            'cp': 1004.64,      # heat capacity of air [J/(kg*K)]
            'Lv': 2264670,      # latent heat of vaporization [J/kg]
            'R_dry' : 286.9,    # gas constant of dry air [J/(kg*K)]
            'R_vap' : 461.5,    # gas constant for water vapour [J/(kg*K)]
            'rho' : 1026,       # sea water density [kg/m3]
            }

** Data ** <br>
Time span of each product included: <br>
** Reanalysis ** <br>
- **ERA-Interim** 1979 - 2016 (ECMWF)
- **ORAS4**       1958 - 2014 (ECMWF)

** Index ** <br>
- **NINO3.4**     1950 - 2017 (NOAA)
- **AO**          1950 - 2017 (NOAA)
- **NAO**         1950 - 2017 (NOAA)
- **AMO**         1950 - 2017 (NOAA)
- **PDO**         1950 - 2017 (University of Washington)

!! These indices are given by NCEP/NCAR Reanalysis (CDAS) <br>


Alternative (not in use yet) <br>
** Reanalysis ** <br>
- **MERRA2**      1980 - 2016 (NASA)
- **JRA55**       1979 - 2015 (JMA)
- **GLORYS2V3**   1993 - 2014 (Mercator Ocean)
- **SODA3**       1980 - 2015
- **PIOMAS**      1980 - 2015

** Observations ** <br>
- **NSIDC**       1958 - 2017 

In [3]:
################################################################################# 
#########                           datapath                             ########
#################################################################################
# please specify data path
#datapath_ERAI = '/home/ESLT0068/WorkFlow/Core_Database_DeepLearn/ERA-Interim'
datapath_ERAI = 'H:\\Creator_Zone\\Core_Database_DeepLearn\\ERA-Interim'
output_path = 'C:\\Users\\nosta\\ML4Climate\\PredictArctic\\Maps'

In [4]:
if __name__=="__main__":
    print ('*********************** get the key to the datasets *************************')
    # weekly variables on ERAI grid
    dataset_ERAI_fields_sic = Dataset(os.path.join(datapath_ERAI,
                                      'sic_weekly_erai_1979_2017.nc'))
    print ('*********************** extract variables *************************')
    #################################################################################
    #########                        data gallery                           #########
    #################################################################################
    # we use weekly time series from 1979 to 2016 (38 years, 48 steps per year, 1824 steps in total)
    # training data: 1979 - 2013
    # validation: 2014 - 2016
    # variables list:
    # SIC (ERA-Interim) / SIV (PIOMAS) / SST (ERA-Interim) / ST (ERA-Interim) / OHC (ORAS4) / AO-NAO-AMO-NINO3.4 (NOAA)
    # integrals from spatial fields cover the area from 20N - 90N (4D fields [year, week, lat, lon])
    # *************************************************************************************** #
    # SIC (ERA-Interim) - benchmark
    SIC_ERAI = dataset_ERAI_fields_sic.variables['sic'][:-1,:,:,:] # 4D fields [year, week, lat, lon]
    year_ERAI = dataset_ERAI_fields_sic.variables['year'][:-1]
    week_ERAI = dataset_ERAI_fields_sic.variables['week'][:]
    latitude_ERAI = dataset_ERAI_fields_sic.variables['latitude'][:]
    longitude_ERAI = dataset_ERAI_fields_sic.variables['longitude'][:]

*********************** get the key to the datasets *************************
*********************** extract variables *************************


In [5]:
    #################################################################################
    ###########                 global land-sea mask                      ###########
    #################################################################################
    sea_ice_mask_global = np.ones((len(latitude_ERAI),len(longitude_ERAI)),dtype=float)
    sea_ice_mask_global[SIC_ERAI[0,0,:,:]==-1] = 0
    #################################################################################
    ###########                regionalization sea mask                   ###########
    #################################################################################
    print ('*********************** create mask *************************')
    # W:-156 E:-124 N:80 S:67
    mask_Beaufort = np.zeros((len(latitude_ERAI),len(longitude_ERAI)),dtype=int)
    # W:-180 E:-156 N:80 S:66
    mask_Chukchi = np.zeros((len(latitude_ERAI),len(longitude_ERAI)),dtype=int)
    # W:146 E:180 N:80 S:67
    mask_EastSiberian = np.zeros((len(latitude_ERAI),len(longitude_ERAI)),dtype=int)
    # W:100 E:146 N:80 S:67
    mask_Laptev = np.zeros((len(latitude_ERAI),len(longitude_ERAI)),dtype=int)
    # W:60 E:100 N:80 S:67
    mask_Kara = np.zeros((len(latitude_ERAI),len(longitude_ERAI)),dtype=int)
    # W:18 E:60 N:80 S:64
    mask_Barents = np.zeros((len(latitude_ERAI),len(longitude_ERAI)),dtype=int)
    # W:-44 E:18 N:80 S:55
    mask_Greenland = np.zeros((len(latitude_ERAI),len(longitude_ERAI)),dtype=int)
    # W:-180 E:180 N:90 S:80
    mask_CenArctic = np.zeros((len(latitude_ERAI),len(longitude_ERAI)),dtype=int)
    print ('*********************** calc mask *************************')
    mask_Beaufort[13:31,32:76] = 1

    mask_Chukchi[13:32,0:32] = 1
    mask_Chukchi[13:32,-1] = 1

    mask_EastSiberian[13:31,434:479] = 1

    mask_Laptev[13:31,374:434] = 1

    mask_Kara[13:31,320:374] = 1

    mask_Barents[13:36,264:320] = 1

    mask_Greenland[13:47,179:264] = 1
    mask_Greenland[26:47,240:264] = 0

    mask_CenArctic[:13,:] = 1
    print ('*********************** packing *************************')
    mask_dict = {'Beaufort': mask_Beaufort[:,:],
                 'Chukchi': mask_Chukchi[:,:],
                 'EastSiberian': mask_EastSiberian[:,:],
                 'Laptev': mask_Laptev[:,:],
                 'Kara': mask_Kara[:,:],
                 'Barents': mask_Barents[:,:],
                 'Greenland': mask_Greenland[:,:],
                 'CenArctic': mask_CenArctic[:,:]}
    seas_namelist = ['Beaufort','Chukchi','EastSiberian','Laptev',
                     'Kara', 'Barents', 'Greenland','CenArctic']

*********************** create mask *************************
*********************** calc mask *************************
*********************** packing *************************
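
The hard-coded index ranges in the mask cell can be cross-checked against the coordinate vectors. The following is a minimal sketch, assuming a regular 0.75° grid with latitude descending from 89.5 N and longitude ascending from -180 E (consistent with the coordinates printed later in this notebook, but the grid here is synthetic). `box_slices` is a hypothetical helper, not part of `dlacs`.

```python
import numpy as np

def box_slices(lat, lon, north, south, west, east):
    """Return (lat_slice, lon_slice) for south <= lat <= north and west <= lon < east."""
    lat_idx = np.where((lat >= south) & (lat <= north))[0]
    lon_idx = np.where((lon >= west) & (lon < east))[0]
    lat_slice = slice(int(lat_idx[0]), int(lat_idx[-1]) + 1)
    lon_slice = slice(int(lon_idx[0]), int(lon_idx[-1]) + 1)
    return lat_slice, lon_slice

# synthetic coordinates (assumption): 0.75 degree spacing
lat = 89.5 - 0.75 * np.arange(94)     # 89.5 N down to 19.75 N
lon = -180.0 + 0.75 * np.arange(480)  # -180 E up to 179.25 E

# Barents Sea box from the comments above: W:18 E:60 N:80 S:64
lat_sl, lon_sl = box_slices(lat, lon, north=80, south=64, west=18, east=60)
mask = np.zeros((len(lat), len(lon)), dtype=int)
mask[lat_sl, lon_sl] = 1
print(lat_sl, lon_sl)
```

Under these conventions the Barents box maps to rows `13:35` and columns `264:320`, one row fewer than the hand-picked `13:36`, so the result depends on whether the box boundaries are treated as inclusive.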


In [6]:
    print ('******************  calculate extent from spatial fields  *******************')
    # size of the grid box
    dx = 2 * np.pi * constant['R'] * np.cos(2 * np.pi * latitude_ERAI /
                                            360) / len(longitude_ERAI)
    dy = np.pi * constant['R'] / 480
    # calculate the sea ice area
    SIC_ERAI_area = np.zeros(SIC_ERAI.shape, dtype=float)
    for i in np.arange(len(latitude_ERAI[:])):
        # weight the concentration by the area of the grid box
        SIC_ERAI_area[:,:,i,:] = SIC_ERAI[:,:,i,:]* dx[i] * dy / 1E+6 # unit km2
    SIC_ERAI_area[SIC_ERAI_area<0] = 0 # switch the mask from -1 to 0
    print ('================  reshape input data into time series  =================')
    SIC_ERAI_area_series = dlacs.preprocess.operator.unfold(SIC_ERAI_area)

******************  calculate extent from spatial fields  *******************
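
The area weighting can also be written without the loop over latitude by using broadcasting. This is a minimal sketch on synthetic data, assuming a regular latitude-longitude grid. `grid_cell_area_km2` and its `dlat_deg` parameter are hypothetical names introduced for illustration.

```python
import numpy as np

R_EARTH = 6371009.0  # radius of the earth [m], same value as constant['R']

def grid_cell_area_km2(lat_deg, n_lon, dlat_deg):
    """Approximate area [km2] of the grid boxes at each latitude of a regular grid."""
    # zonal width of a grid box shrinks with cos(latitude) [m]
    dx = 2 * np.pi * R_EARTH * np.cos(np.deg2rad(lat_deg)) / n_lon
    # meridional height of a grid box [m]
    dy = np.pi * R_EARTH * dlat_deg / 180.0
    return dx * dy / 1e6

# synthetic grid and concentration field (assumptions for illustration)
lat = 89.5 - 0.75 * np.arange(94)
area = grid_cell_area_km2(lat, n_lon=480, dlat_deg=0.75)
sic = np.random.default_rng(0).uniform(0, 1, size=(2, 48, lat.size, 480))

# broadcast the per-latitude area over [year, week, lat, lon]
sic_area = sic * area[np.newaxis, np.newaxis, :, np.newaxis]
print(sic_area.shape)
```

Here the meridional spacing is an explicit argument. The cell above uses a fixed `dy = np.pi * constant['R'] / 480`, which corresponds to a spacing of 0.375°; whether that matches the grid of the input file is worth checking against the latitude vector.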


In [7]:
    print ('******************  choose the fields from target region  *******************')
    # select land-sea mask
    sea_ice_mask_barents = sea_ice_mask_global[12:36,264:320]
    print ('******************  choose the fields from target region  *******************')
    # select the Barents Sea box (63.25-80.5 N / 18-59.25 E, see the printed coordinates)
    sic_exp = SIC_ERAI_area_series[:,12:36,264:320]
    print(sic_exp.shape)
    print(latitude_ERAI[12:36])
    print(longitude_ERAI[264:320])
    #print(latitude_ERAI[26:40])
    #print(longitude_ERAI[180:216])

******************  choose the fields from target region  *******************
******************  choose the fields from target region  *******************
(1824, 24, 56)
[80.5  79.75 79.   78.25 77.5  76.75 76.   75.25 74.5  73.75 73.   72.25
 71.5  70.75 70.   69.25 68.5  67.75 67.   66.25 65.5  64.75 64.   63.25]
[18.   18.75 19.5  20.25 21.   21.75 22.5  23.25 24.   24.75 25.5  26.25
 27.   27.75 28.5  29.25 30.   30.75 31.5  32.25 33.   33.75 34.5  35.25
 36.   36.75 37.5  38.25 39.   39.75 40.5  41.25 42.   42.75 43.5  44.25
 45.   45.75 46.5  47.25 48.   48.75 49.5  50.25 51.   51.75 52.5  53.25
 54.   54.75 55.5  56.25 57.   57.75 58.5  59.25]


In [8]:
    print ('*******************  pre-processing  *********************')
    print ('=========================   normalize data   ===========================')
    sic_exp_norm = dlacs.preprocess.operator.normalize(sic_exp[:-48*4,:,:])
    print('================  save the normalizing factor  =================')
    sic_max = np.amax(sic_exp)
    sic_min = np.amin(sic_exp)
    print(sic_max,"km2")
    print(sic_min,"km2")  
    print ('====================    A series of time (index)    ====================')
    _, yy, xx = sic_exp_norm.shape # get the lat lon dimension
    year = np.arange(1979,2017,1)
    year_cycle = np.repeat(year,48)
    month_cycle = np.repeat(np.arange(1,13,1),4)
    month_cycle = np.tile(month_cycle,len(year)+1) # one extra repeat for lead time dependent prediction
    month_cycle = month_cycle.astype(float) # astype returns a new array, so the result must be assigned
    month_2D = np.repeat(month_cycle[:,np.newaxis],yy,1)
    month_exp = np.repeat(month_2D[:,:,np.newaxis],xx,2)
    print ('===================  artificial data for evaluation ====================')
    # calculate climatology of SIC
#     seasonal_cycle_SIC = np.zeros(48,dtype=float)
#     for i in np.arange(48):
#         seasonal_cycle_SIC[i] = np.mean(SIC_ERAI_sum_norm[i::48],axis=0)
    # weight for loss
#     weight_month = np.array([0,1,1,
#                              1,0,0,
#                              1,1,1,
#                              0,0,0])
    #weight_loss = np.repeat(weight_month,4)
    #weight_loss = np.tile(weight_loss,len(year))

*******************  pre-processing  *********************
1565.2049481856002 km2
0.0 km2
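
The later denormalization (multiplying by `sic_max` with a minimum of 0) suggests a min-max scaling. The sketch below illustrates that idea on synthetic data. It is an assumption about what `dlacs.preprocess.operator.normalize` does, not a copy of its implementation, and both helper functions are hypothetical.

```python
import numpy as np

def minmax_normalize(x):
    """Scale x to [0, 1] and return the factors needed to undo the scaling."""
    x_min, x_max = np.amin(x), np.amax(x)
    return (x - x_min) / (x_max - x_min), x_min, x_max

def minmax_denormalize(x_norm, x_min, x_max):
    """Invert minmax_normalize."""
    return x_norm * (x_max - x_min) + x_min

# synthetic field [time, lat, lon] in km2 (assumption for illustration)
rng = np.random.default_rng(0)
field = rng.uniform(0, 1500, size=(96, 4, 5))

# normalize only the training part and keep its own factors
train = field[:-48]
train_norm, f_min, f_max = minmax_normalize(train)
restored = minmax_denormalize(train_norm, f_min, f_max)
print(f_min, f_max)
```

In the cell above the normalization is applied to `sic_exp[:-48*4,:,:]`, while `sic_max` and `sic_min` are taken from the full `sic_exp`. The two sets of factors agree only if the extremes occur inside the normalized part, so it may be safer to take the factors from the same slice.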


# Procedure for LSTM <br>
** We use PyTorch to implement an LSTM neural network on time series of climate data. ** <br>

In [11]:
    print ('*******************  evaluation matrix  *********************')
    # The prediction will be evaluated through RMSE against climatology
    
    # error score for temporal-spatial fields, without keeping spatial pattern
    def RMSE(x,y):
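        """
        Calculate the RMSE. x is the input series and y is the reference series.
        The RMSE is computed over the domain for each time step, not over time,
        so the spatial structure is not kept.
        Parameters
        ----------------------
        x: input time series with the shape [time, lat, lon]
        y: reference time series with the shape [time, lat, lon]
        Returns
        ----------------------
        rmse: RMSE over the domain for each time step, shape [time]
        rmse_std: square root of the standard deviation of the squared errors
                  over the domain for each time step, shape [time]
        """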
        x_series = x.reshape(x.shape[0],-1)
        y_series = y.reshape(y.shape[0],-1)
        rmse = np.sqrt(np.mean((x_series - y_series)**2,1))
        rmse_std = np.sqrt(np.std((x_series - y_series)**2,1))
    
        return rmse, rmse_std
    
    # error score for temporal-spatial fields, keeping spatial pattern
    def MAE(x,y):
        """
        Calculate the MAE. x is the input series and y is the reference series.
        It calculates the MAE over time and keeps the spatial structure.
        """
        mae = np.mean(np.abs(x-y),0)
        
        return mae

*******************  evaluation matrix  *********************
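
The difference between the two scores is easiest to see on a tiny example. The sketch below re-implements them in compact form (the names `rmse_over_domain` and `mae_over_time` are introduced here for illustration) and applies them to a hand-made field: the RMSE collapses the spatial dimensions and returns one value per time step, while the MAE collapses time and returns a map.

```python
import numpy as np

def rmse_over_domain(x, y):
    """RMSE over all grid points for each time step; x and y are [time, lat, lon]."""
    err = (x - y).reshape(x.shape[0], -1)
    return np.sqrt(np.mean(err ** 2, axis=1))

def mae_over_time(x, y):
    """MAE over time for each grid point; x and y are [time, lat, lon]."""
    return np.mean(np.abs(x - y), axis=0)

# toy data: 3 time steps on a 2 x 2 grid
pred = np.zeros((3, 2, 2))
truth = np.full((3, 2, 2), 2.0)
truth[0, 0, 0] = 0.0  # one perfect grid point at the first time step

rmse = rmse_over_domain(pred, truth)  # shape [time]
mae = mae_over_time(pred, truth)      # shape [lat, lon]
print(rmse.shape, mae.shape)
```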


In [9]:
def week2month(series, m):
    """
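    Select a certain month from multi-year data at weekly resolution
    (48 steps per year, 4 steps per month).
    Parameters
    ----------
    series : array-like
        Three-dimensional numeric array with time as the first dimension [time, lat, lon]
    m: int
        Month (from 1 to 12).
    Returns
    -------
    series_month : numpy.ndarray
        The steps of month m from every year, in chronological order [time // 12, lat, lon]
    """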
    time_year, lat, lon = series.shape
    time_month = time_year // 12
    series_month = np.zeros((time_month, lat, lon), dtype=float)
    series_month[::4,:,:] = series[(m-1)*4::48,:,:]
    series_month[1::4,:,:] = series[(m-1)*4+1::48,:,:]
    series_month[2::4,:,:] = series[(m-1)*4+2::48,:,:]
    series_month[3::4,:,:] = series[(m-1)*4+3::48,:,:]
    
    return series_month
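
The same selection can be expressed with a single reshape, which makes the "48 steps per year, 4 steps per month" layout explicit. This is a minimal sketch, assuming the series covers whole years and starts in the first week of January. `select_month` is a hypothetical name and the series is synthetic.

```python
import numpy as np

def select_month(series, m, steps_per_month=4):
    """Return all steps of month m (1 to 12) from a [time, lat, lon] series."""
    steps_per_year = 12 * steps_per_month
    n_years = series.shape[0] // steps_per_year
    # view the time axis as [year, month, step within month]
    by_month = series[:n_years * steps_per_year].reshape(
        n_years, 12, steps_per_month, *series.shape[1:])
    # pick one month and flatten [year, step] back into a time axis
    return by_month[:, m - 1].reshape(n_years * steps_per_month, *series.shape[1:])

# two synthetic years in which every value equals its own time index
series = np.arange(96, dtype=float).reshape(96, 1, 1)
february = select_month(series, 2)
print(february[:, 0, 0])
```

For this two-year series, February corresponds to time indices 4 to 7 in the first year and 52 to 55 in the second.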

In [12]:
    #################################################################################
    ########                performance evaluation with RMSE                 ########
    ########              RMSE over time, and sum over domain                ########
    #################################################################################
    sequence_len, height, width = sic_exp_norm.shape
    test_year = 4
    print('##############################################################')
    print('############   start prediction with climatology  ############')
    print('##############################################################')
    length_year = 10
    signal = sic_exp_norm[-length_year*48:,:,:]
    # compute climatology
    climatology = np.zeros((48, height, width),dtype=float)
    climatology_full_len = np.zeros((48, height, width),dtype=float)
    for i in range(48):
        climatology[i,:,:] = np.mean(signal[i::48,:,:],axis=0)
        climatology_full_len[i,:,:] = np.mean(sic_exp_norm[i::48,:,:],axis=0)
    # repeat this climatology and calculate the RMSE
    climatology_seq = np.tile(climatology,(test_year,1,1))
    climatology_full_len_seq = np.tile(climatology_full_len,(test_year,1,1))
    RMSE_climatology, RMSE_climatology_std  = RMSE(climatology_seq * sic_max,sic_exp_norm[-test_year*12*4:,:,:] * sic_max)
    RMSE_climatology_full_len, RMSE_climatology_full_len_std  = RMSE(climatology_full_len_seq * sic_max,
                                                                     sic_exp_norm[-test_year*12*4:,:,:] * sic_max)
    RMSE_climatology = np.mean(RMSE_climatology)
    RMSE_climatology_std = np.mean(RMSE_climatology_std)
    
    RMSE_climatology_full_len = np.mean(RMSE_climatology_full_len)
    RMSE_climatology_full_len_std = np.mean(RMSE_climatology_full_len_std)

    print("*******************     Lead time 0     *******************")
    print("Mean RMSE with testing data - Climatology 10 years")
    print(RMSE_climatology,"+-",RMSE_climatology_std)
    print("Mean RMSE with testing data - Climatology 38 years")
    print(RMSE_climatology_full_len,"+-",RMSE_climatology_full_len_std)    

    print("*******************     Lead time 0     *******************")
    for i in np.arange(1,13,1):
        climatology_monthly_series = week2month(climatology_seq, i)
        truth_monthly_series = week2month(sic_exp_norm[-test_year*12*4:,:,:], i)
        rmse_climatology_monthly, rmse_climatology_monthly_std = RMSE(climatology_monthly_series * sic_max,truth_monthly_series * sic_max)
        
        print("*******************    {}     *******************".format(i))
        print("RMSE - Climatology    {} + - {}".format(np.mean(rmse_climatology_monthly), np.mean(rmse_climatology_monthly_std)))

##############################################################
############   start prediction with climatology  ############
##############################################################
*******************     Lead time 0     *******************
Mean RMSE with testing data - Climatology 10 years
101.98881351721018 +- 175.8285481305861
Mean RMSE with testing data - Climatology 38 years
127.7026373094444 +- 201.82080805390896
*******************     Lead time 0     *******************
*******************    1     *******************
RMSE - Climatology    146.9372690330108 + - 260.5194161210163
*******************    2     *******************
RMSE - Climatology    139.25492995755374 + - 221.61197178571095
*******************    3     *******************
RMSE - Climatology    134.36612699564316 + - 224.2519433370879
*******************    4     *******************
RMSE - Climatology    145.44036924538642 + - 252.245390223904
*******************    5     *******************
RMSE - Climat
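
The climatology benchmark amounts to averaging each of the 48 weekly slots over the available years and repeating that cycle over the evaluation period. The following sketch shows the idea on a synthetic seasonal signal; `weekly_climatology` is a hypothetical helper and the data are made up for illustration.

```python
import numpy as np

STEPS_PER_YEAR = 48

def weekly_climatology(series, steps_per_year=STEPS_PER_YEAR):
    """Mean over years for each step of the year; series is [time, lat, lon]."""
    n_years = series.shape[0] // steps_per_year
    trimmed = series[:n_years * steps_per_year]
    return trimmed.reshape(n_years, steps_per_year, *series.shape[1:]).mean(axis=0)

# synthetic data: a seasonal cycle plus noise, 10 years on a 2 x 3 grid
rng = np.random.default_rng(0)
cycle = np.sin(2 * np.pi * np.arange(48) / 48)[:, np.newaxis, np.newaxis] * np.ones((1, 2, 3))
series = np.tile(cycle, (10, 1, 1)) + 0.1 * rng.standard_normal((480, 2, 3))

# climatology from the first 8 years, evaluated on the last 2 years
test_years = 2
clim = weekly_climatology(series[:-test_years * 48])
forecast = np.tile(clim, (test_years, 1, 1))
truth = series[-test_years * 48:]
rmse = np.sqrt(np.mean((forecast - truth).reshape(truth.shape[0], -1) ** 2, axis=1))
print(rmse.mean())
```

In this sketch the averaging window and the evaluation period do not overlap. In the cell above, `signal` and the full-length climatology both include the last `test_year` years of `sic_exp_norm`, so the benchmark has seen the data it is scored on; which setup is appropriate depends on the purpose of the benchmark.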

In [12]:
    #################################################################################
    ########          transfer the sea ice fields into binary data           ########
    #################################################################################
    # ice concentration below the threshold is regarded as no ice, the value is from
    # https://nsidc.org/cryosphere/seaice/data/terminology.html
    criterion_0 = 0.15 
    # remove the area weight
    sic_exp_denorm = np.zeros(sic_exp_norm.shape, dtype=float)
    for i in np.arange(height):
        # note: during normalization, the maximum sic is exactly dy * dx[35] and the minimum is 0
        # so, for denormalization, dx[i+12] * dy / (dx[35] * dy) = dx[i+12] /dx[35]
        sic_exp_denorm[:,i,:] = sic_exp_norm[:,i,:] / dx[i+12] * dx[35] # index 12 and 35 correspond to the area slice
    # turn sea ice fields into binary data
    sic_exp_bin = sic_exp_denorm.copy() # copy, so that the thresholding below does not overwrite sic_exp_denorm
    sic_exp_bin[sic_exp_bin <= criterion_0] = 0
    sic_exp_bin[sic_exp_bin > criterion_0] = 1
    # turn matrix into int
    sic_exp_bin = sic_exp_bin.astype(int)
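
The thresholding step can also be written as a single comparison, which returns a new array and leaves the concentration field untouched. A minimal sketch with made-up values (`to_binary_ice` is a hypothetical helper):

```python
import numpy as np

ICE_THRESHOLD = 0.15  # concentration at or below this value counts as no ice

def to_binary_ice(concentration, threshold=ICE_THRESHOLD):
    """Return 1 where concentration exceeds the threshold and 0 elsewhere."""
    return (concentration > threshold).astype(int)

conc = np.array([[0.0, 0.15, 0.16],
                 [0.5, 1.0, 0.1]])
ice = to_binary_ice(conc)
print(ice)
```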

In [13]:
    print ('*******************  module for calculating IIEE score *********************')

    # positive is sea ice = 1
    
    def iiee(pred, label, grid_x, grid_y):
        """
        Integrated ice-edge error (IIEE) is defined to evaluate the forecast around
        sea ice edge. It can be decomposed to two components, which are the overestimated (O)
        and underestimated (U) score of local sea ice extent:
        IIEE = O + U
        
        The definition is given by Goessling et al. (2016),
        Predictability of the Arctic sea ice edge. Geophysical Research Letters.
        
        param pred: forecast fields [seq, lat, lon]
        param label: target observation field [seq, lat, lon]
        param grid_x: vector of size of grid box in longitudinal direction [lat]
        param grid_y: size of grid box in latitudinal direction (scalar)
        """
        seq, lat, lon = pred.shape
        # initialize matrix to store U and O scores
        O_array = np.zeros(pred.shape,dtype=float) # overestimated
        U_array = np.zeros(pred.shape,dtype=float) # underestimated
        iiee_array = np.zeros(pred.shape,dtype=float)
        # calculate the difference
        diff = pred - label
        # compute the scores
        O_array[diff>0.01] = 1
        U_array[diff<-0.01] = 1
        iiee_array[:] = O_array[:] + U_array[:]
        # weight by grid size
        O_weight = np.zeros(pred.shape,dtype=float)
        U_weight = np.zeros(pred.shape,dtype=float)
        iiee_weight = np.zeros(pred.shape,dtype=float)
        for i in range(lat):
            O_weight[:,i,:] = O_array[:,i,:] * grid_x[i] * grid_y
            U_weight[:,i,:] = U_array[:,i,:] * grid_x[i] * grid_y
            iiee_weight[:,i,:] = iiee_array[:,i,:] * grid_x[i] * grid_y    
        # take temporal average and overall average
        O_spatial = np.mean(O_weight,0)
        U_spatial = np.mean(U_weight,0)
        iiee_spatial = np.mean(iiee_weight,0)
        
        O = np.sum(O_spatial) 
        U = np.sum(U_spatial)
        iiee = np.sum(iiee_spatial)

        return O, U, iiee, O_spatial, U_spatial, iiee_spatial

*******************  module for calculating IIEE score *********************
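
A small hand-made case helps to check the decomposition IIEE = O + U. The sketch below is a compact variant written for illustration (the name `iiee_simple` is hypothetical). It assumes binary input fields and, like the function above, averages over the sequence before summing over the domain.

```python
import numpy as np

def iiee_simple(pred, label, grid_x, grid_y):
    """Return (O, U, IIEE) for binary fields pred and label of shape [seq, lat, lon].

    grid_x is a vector [lat] of zonal grid sizes, grid_y a scalar meridional size.
    """
    area = (np.asarray(grid_x) * grid_y)[np.newaxis, :, np.newaxis]
    over = ((pred == 1) & (label == 0)) * area   # ice predicted, none observed
    under = ((pred == 0) & (label == 1)) * area  # ice observed, none predicted
    O = over.mean(axis=0).sum()
    U = under.mean(axis=0).sum()
    return O, U, O + U

# one time step on a 2 x 2 grid
pred = np.array([[[1, 1],
                  [0, 0]]])
label = np.array([[[1, 0],
                   [1, 0]]])
grid_x = np.array([2.0, 3.0])  # zonal size per latitude row
grid_y = 10.0                  # meridional size

O, U, total = iiee_simple(pred, label, grid_x, grid_y)
print(O, U, total)
```

Here one cell in the first row is overestimated (area 2 x 10) and one cell in the second row is underestimated (area 3 x 10), so O = 20, U = 30 and IIEE = 50 in the units of the grid sizes squared.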


In [None]:
    #################################################################################
    ########                performance evaluation with IIEE                 ########
    ########              IIEE over time, and sum over domain                ########
    #################################################################################


In [14]:
    #################################################################################
    ########                persistence evaluation with RMSE                 ########
    ########              RMSE over time, and sum over domain                ########
    ################################################################################# 
    # two ways of defining persistence
    # (1) actual SIC observed at lead 0 carried through the forecast
    # (2) the SIC anomaly at lead 0 added to the climatology at each lead time
    sequence_len, height, width = sic_exp_norm.shape
    # calculate persistence as anomaly added to the climatology
    anomaly_step_0 = np.zeros(sic_exp_norm[-test_year*12*4:,:,:].shape, dtype=float) # only take the testing period
    climatology_seq_long = np.tile(climatology,(test_year+1,1,1)) # add one more year for the calculation
    anomaly_step_0[:] = sic_exp_norm[-test_year*12*4:,:,:] - climatology_seq_long[:test_year*12*4,:,:]
    persist_anomaly_plus_clim = np.zeros((test_year*12*4, 16, height, width)) # we check up to 16 steps
    for i in range(16): # loop over the lead times
        persist_anomaly_plus_clim[:,i,:,:] = anomaly_step_0[:] + climatology_seq_long[i+2:test_year*12*4+i+2,:,:]
    print('#######################################################################')
    print('############   start prediction with persistence carry on  ############')
    print('#######################################################################')
    RMSE_persist_0, RMSE_persist_0_std = RMSE(sic_exp_norm[-test_year*12*4-1:-1,:,:] * sic_max,
                                              sic_exp_norm[-test_year*12*4:,:,:] * sic_max)
    RMSE_persist_1, RMSE_persist_1_std = RMSE(sic_exp_norm[-test_year*12*4-1:-2,:,:] * sic_max,
                                              sic_exp_norm[-test_year*12*4+1:,:,:] * sic_max)
    RMSE_persist_2, RMSE_persist_2_std = RMSE(sic_exp_norm[-test_year*12*4-1:-3,:,:] * sic_max,
                                              sic_exp_norm[-test_year*12*4+2:,:,:] * sic_max)
    RMSE_persist_3, RMSE_persist_3_std = RMSE(sic_exp_norm[-test_year*12*4-1:-4,:,:] * sic_max,
                                              sic_exp_norm[-test_year*12*4+3:,:,:] * sic_max)
    RMSE_persist_4, RMSE_persist_4_std = RMSE(sic_exp_norm[-test_year*12*4-1:-5,:,:] * sic_max,
                                              sic_exp_norm[-test_year*12*4+4:,:,:] * sic_max)
    RMSE_persist_5, RMSE_persist_5_std = RMSE(sic_exp_norm[-test_year*12*4-1:-6,:,:] * sic_max,
                                              sic_exp_norm[-test_year*12*4+5:,:,:] * sic_max)
    print('###########################################################################################')
    print('############   start prediction with persistence anomaly added to climatology  ############')
    print('###########################################################################################')
    RMSE_persist_anomaly_0, RMSE_persist_anomaly_0_std = RMSE(persist_anomaly_plus_clim[:,0,:,:] * sic_max,
                                                              sic_exp_norm[-test_year*12*4:,:,:] * sic_max)
    RMSE_persist_anomaly_1, RMSE_persist_anomaly_1_std = RMSE(persist_anomaly_plus_clim[:-1,1,:,:] * sic_max,
                                                              sic_exp_norm[-test_year*12*4+1:,:,:] * sic_max)
    RMSE_persist_anomaly_2, RMSE_persist_anomaly_2_std = RMSE(persist_anomaly_plus_clim[:-2,2,:,:] * sic_max,
                                                              sic_exp_norm[-test_year*12*4+2:,:,:] * sic_max)
    RMSE_persist_anomaly_3, RMSE_persist_anomaly_3_std = RMSE(persist_anomaly_plus_clim[:-3,3,:,:] * sic_max,
                                                              sic_exp_norm[-test_year*12*4+3:,:,:] * sic_max)
    RMSE_persist_anomaly_4, RMSE_persist_anomaly_4_std = RMSE(persist_anomaly_plus_clim[:-4,4,:,:] * sic_max,
                                                              sic_exp_norm[-test_year*12*4+4:,:,:] * sic_max)
    RMSE_persist_anomaly_5, RMSE_persist_anomaly_5_std = RMSE(persist_anomaly_plus_clim[:-5,5,:,:] * sic_max,
                                                              sic_exp_norm[-test_year*12*4+5:,:,:] * sic_max)
    RMSE_persist_anomaly_6, RMSE_persist_anomaly_6_std = RMSE(persist_anomaly_plus_clim[:-6,6,:,:] * sic_max,
                                                              sic_exp_norm[-test_year*12*4+6:,:,:] * sic_max)
    RMSE_persist_anomaly_7, RMSE_persist_anomaly_7_std = RMSE(persist_anomaly_plus_clim[:-7,7,:,:] * sic_max,
                                                              sic_exp_norm[-test_year*12*4+7:,:,:] * sic_max)
    RMSE_persist_anomaly_8, RMSE_persist_anomaly_8_std = RMSE(persist_anomaly_plus_clim[:-8,8,:,:] * sic_max,
                                                              sic_exp_norm[-test_year*12*4+8:,:,:] * sic_max)
    RMSE_persist_anomaly_9, RMSE_persist_anomaly_9_std = RMSE(persist_anomaly_plus_clim[:-9,9,:,:] * sic_max,
                                                              sic_exp_norm[-test_year*12*4+9:,:,:] * sic_max)
    RMSE_persist_anomaly_10, RMSE_persist_anomaly_10_std = RMSE(persist_anomaly_plus_clim[:-10,10,:,:] * sic_max,
                                                                sic_exp_norm[-test_year*12*4+10:,:,:] * sic_max)
    RMSE_persist_anomaly_11, RMSE_persist_anomaly_11_std = RMSE(persist_anomaly_plus_clim[:-11,11,:,:] * sic_max,
                                                                sic_exp_norm[-test_year*12*4+11:,:,:] * sic_max)
    RMSE_persist_anomaly_12, RMSE_persist_anomaly_12_std = RMSE(persist_anomaly_plus_clim[:-12,12,:,:] * sic_max,
                                                                sic_exp_norm[-test_year*12*4+12:,:,:] * sic_max)
    RMSE_persist_anomaly_13, RMSE_persist_anomaly_13_std = RMSE(persist_anomaly_plus_clim[:-13,13,:,:] * sic_max,
                                                                sic_exp_norm[-test_year*12*4+13:,:,:] * sic_max)
    RMSE_persist_anomaly_14, RMSE_persist_anomaly_14_std = RMSE(persist_anomaly_plus_clim[:-14,14,:,:] * sic_max,
                                                                sic_exp_norm[-test_year*12*4+14:,:,:] * sic_max)
    RMSE_persist_anomaly_15, RMSE_persist_anomaly_15_std = RMSE(persist_anomaly_plus_clim[:-15,15,:,:] * sic_max,
                                                                sic_exp_norm[-test_year*12*4+15:,:,:] * sic_max)    
    print("*******************     Lead time 0     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_0),"+-",np.mean(RMSE_persist_anomaly_0_std))
    print("Mean RMSE with persistence - carried through the forecast")
    print(np.mean(RMSE_persist_0),"+-",np.mean(RMSE_persist_0_std))
    print("*******************     Lead time 1     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_1),"+-",np.mean(RMSE_persist_anomaly_1_std))
    print("Mean RMSE with persistence - carried through the forecast")
    print(np.mean(RMSE_persist_1),"+-",np.mean(RMSE_persist_1_std))
    print("*******************     Lead time 2     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_2),"+-",np.mean(RMSE_persist_anomaly_2_std))
    print("Mean RMSE with persistence - carried through the forecast")
    print(np.mean(RMSE_persist_2),"+-",np.mean(RMSE_persist_2_std))
    print("*******************     Lead time 3     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_3),"+-",np.mean(RMSE_persist_anomaly_3_std))
    print("Mean RMSE with persistence - carried through the forecast")
    print(np.mean(RMSE_persist_3),"+-",np.mean(RMSE_persist_3_std))
    print("*******************     Lead time 4     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_4),"+-",np.mean(RMSE_persist_anomaly_4_std))
    print("Mean RMSE with persistence - carried through the forecast")
    print(np.mean(RMSE_persist_4),"+-",np.mean(RMSE_persist_4_std))
    print("*******************     Lead time 5     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_5),"+-",np.mean(RMSE_persist_anomaly_5_std))
    print("Mean RMSE with persistence - carried through the forecast")
    print(np.mean(RMSE_persist_5),"+-",np.mean(RMSE_persist_5_std))
    print("*******************     Lead time 6     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_6),"+-",np.mean(RMSE_persist_anomaly_6_std))
    print("*******************     Lead time 7     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_7),"+-",np.mean(RMSE_persist_anomaly_7_std))    
    print("*******************     Lead time 8     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_8),"+-",np.mean(RMSE_persist_anomaly_8_std))    
    print("*******************     Lead time 9     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_9),"+-",np.mean(RMSE_persist_anomaly_9_std))    
    print("*******************     Lead time 10     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_10),"+-",np.mean(RMSE_persist_anomaly_10_std))    
    print("*******************     Lead time 11     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_11),"+-",np.mean(RMSE_persist_anomaly_11_std))
    print("*******************     Lead time 12     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_12),"+-",np.mean(RMSE_persist_anomaly_12_std))
    print("*******************     Lead time 13     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_13),"+-",np.mean(RMSE_persist_anomaly_13_std))
    print("*******************     Lead time 14     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_14),"+-",np.mean(RMSE_persist_anomaly_14_std))
    print("*******************     Lead time 15     *******************")
    print("Mean RMSE with persistence - anomaly plus climatology")
    print(np.mean(RMSE_persist_anomaly_15),"+-",np.mean(RMSE_persist_anomaly_15_std))

#######################################################################
############   start prediction with persistence carry on  ############
#######################################################################
###########################################################################################
############   start prediction with persistence anomaly added to climatology  ############
###########################################################################################
*******************     Lead time 0     *******************
Mean RMSE with persistence - anomaly plus climatology
48.37543218828063 +- 88.19821114166643
Mean RMSE with persistence - carried through the forecast
53.5430826714379 +- 108.17843957592652
*******************     Lead time 1     *******************
Mean RMSE with persistence - anomaly plus climatology
72.64112525180285 +- 132.47112669589706
Mean RMSE with persistence - carried through the forecast
81.07405426004486 +- 159.7012589980829
*******