# Calculations of SIA and SIE from observational datasets

### Author: Chris Wyburn-Powell, [github](https://github.com/chrisrwp/synthetic-ensemble/blob/main/SIA/SIA_calculations_observations.ipynb) <br>

#### Input:
All datasets contain monthly data of which all months 1979-2020 are used:
- **Hadley Centre Sea Ice and Sea Surface Temperature data set, version 1 (HadISST1)** - SIC. [`doi:10.1029/2002JD002670`](https://doi.org/10.1029/2002JD002670)
- **NOAA/NSIDC Climate Data Record of Passive Microwave Sea Ice Concentration, Version 4 (CDR)** - SIC. Contains Climate Data Record (CDR), NASA Team (NT) and NASA Bootstrap (BT) datasets. [`doi:10.7265/efmz-2t65`](https://doi.org/10.7265/efmz-2t65)
 
#### Output:
Monthly SIC, SIA and SIE for 1979-2020, with missing/spurious data corrected.

#### Corrections to datasets:
The objective in making corrections is to remove spurious data and fill missing data with values that are close to what actually happened and/or represent a realistic spatial scenario. For example, linear interpolation may be acceptable for SIA and SIE, but applying it to SIC could produce unrealistic spatial distributions, such as an unrealistically wide ice edge zone where 100% and 0% concentrations are averaged over a large area. To ensure that SIA/SIE remain consistent with the SIC fields, SIC data from the same month of a different year are used to fill missing data.
- **NOAA/NSIDC CDR version 4**: The pole hole is filled using the average SIC of the surrounding grid cells (built in). Missing months (1984-07, 1987-12, 1988-01) are filled by looking at the closest valid months for SIA (CDR), identifying whether the previous or following year's SIA for those valid months is closest to that of the year with missing data, and then copying SIC from the previous or following year accordingly. E.g. for 1984-07: SIA for 1983-06 and 1985-06 is compared with 1984-06, and SIA for 1983-08 and 1985-08 is compared with 1984-08. 1985 is found to be closer to 1984 than 1983 is, so 1985-07 is copied to fill 1984-07. Similarly, SIC values for 1988-12 and 1989-01 are used to fill 1987-12 and 1988-01.
- **HadISST1**: Discontinuities were found for 2009-03 and 2009-04, with extreme negative anomalies that do not appear in other datasets. SIC from 2007-03 is used for 2009-03 and SIC from 2008-04 is used for 2009-04.
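
The donor-year choice described for 1984-07 can be sketched as below. This is a minimal illustration, not the code used to make the decision: the function name `choose_donor_year`, the dictionary layout, the use of a summed absolute difference as the closeness measure, and all SIA values are assumptions made for the example.

```python
def choose_donor_year(sia, missing_year, neighbour_months):
    """Return the adjacent year whose SIA is closest to `missing_year`.

    sia: dict mapping (year, month) -> SIA value (hypothetical layout)
    neighbour_months: valid months adjacent to the missing month
    """
    def total_abs_diff(candidate_year):
        # assumed closeness measure: summed absolute SIA difference
        return sum(abs(sia[(candidate_year, m)] - sia[(missing_year, m)])
                   for m in neighbour_months)

    candidates = (missing_year - 1, missing_year + 1)
    return min(candidates, key=total_abs_diff)


# made-up SIA values (million square km), for illustration only
toy_sia = {
    (1983, 6): 10.9, (1984, 6): 10.6, (1985, 6): 10.5,
    (1983, 8): 6.9,  (1984, 8): 6.5,  (1985, 8): 6.4,
}

donor = choose_donor_year(toy_sia, 1984, neighbour_months=[6, 8])
print(donor)
```

With these toy numbers the following year is selected, so its July SIC field would be copied into the gap. The sketch does not handle gaps whose neighbouring months cross a year boundary, such as 1987-12 and 1988-01.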

In [1]:
import numpy as np
import xarray as xr
import pandas as pd
import glob 
import datetime
import warnings

In [2]:
#define the root path for the project directory
data_path = '/glade/scratch/cwpowell/Synthetic_ensemble/'

#define the mid-month dates to be used across all datasets
CLIVAR = xr.open_dataset(data_path+'/SIA/SIA_SIE_SIV/CLIVAR_SIA_1850_2100_RCP85.nc')
CLIVAR_time = CLIVAR['time'].sel(time=slice('1979','2020'))

# NSIDC CDR Version 4
## Exclude non-sea ice data and interpolate missing months

In [3]:
#load all monthly files into a single xarray dataset and correct time dimension
#suppress warnings as the non-standard time produces an error, later overwritten
warnings.filterwarnings("ignore", 'variable ') 

all_CDR_data = []

for file in glob.glob(data_path+'Raw_data/observations/NSIDC_CDR_v4/seaice_conc_monthly_nh*'):
    all_CDR_data.append(xr.open_dataset(file))
    
#concatenate all of the monthly files into a single xarray dataset
all_CDR_xr = xr.concat((all_CDR_data), dim='tdim') 

#rename dimension so they match coordinates
all_CDR_xr = all_CDR_xr.rename({'tdim':'time', 'y':'ygrid', 'x':'xgrid'}) 

#sort by time dimension, files were loaded in a random order
all_CDR_xr = all_CDR_xr.sortby(all_CDR_xr['time']) 

#replace the time dimension with numpy.datetime64 objects for mid-month
all_CDR_xr['time'] = CLIVAR_time 

In [4]:
#set all non-sea ice to np.nan: keep only valid concentrations (<1.1), which excludes the larger flag values (e.g. land/coast)
CDR = all_CDR_xr['cdr_seaice_conc_monthly'].where(
    all_CDR_xr['cdr_seaice_conc_monthly']<1.1)

BT  = all_CDR_xr['nsidc_bt_seaice_conc_monthly'].where(
    all_CDR_xr['nsidc_bt_seaice_conc_monthly']<1.1) 

NT  = all_CDR_xr['nsidc_nt_seaice_conc_monthly'].where(
    all_CDR_xr['nsidc_nt_seaice_conc_monthly']<1.1) 

In [5]:
#for filling values of 1984-07, 1987-12 and 1988-01 the following years (1985-07, 1988-12 and 1989-01)
#were found to be closest to the year with the missing values for other months of the year
filled = []

for data_var in [CDR, BT, NT]: #loop through all 3 datasets
    CDR_xr_1984_07 = data_var.sel(time='1985-07').copy()
    CDR_xr_1984_07['time'] = xr.DataArray(data = CLIVAR_time.sel(time='1984-07').values, 
                                          coords={'time': CLIVAR_time.sel(time='1984-07').values}, 
                                          dims=['time'])
    
    CDR_xr_1987_12 = data_var.sel(time='1988-12').copy()
    CDR_xr_1987_12['time'] = xr.DataArray(data = CLIVAR_time.sel(time='1987-12').values, 
                                          coords={'time': CLIVAR_time.sel(time='1987-12').values}, 
                                          dims=['time'])
    
    CDR_xr_1988_01 = data_var.sel(time='1989-01').copy()
    CDR_xr_1988_01['time'] = xr.DataArray(data = CLIVAR_time.sel(time='1988-01').values, 
                                          coords={'time': CLIVAR_time.sel(time='1988-01').values}, 
                                          dims=['time'])

    filled.append(xr.concat((data_var.sel(time=slice('1979-01','1984-06')), 
                             CDR_xr_1984_07, data_var.sel(time=slice('1984-08','1987-11')), 
                             CDR_xr_1987_12, CDR_xr_1988_01, 
                             data_var.sel(time=slice('1988-02','2020-12'))), 
                            dim='time'))

In [6]:
#save interpolated SIC to NetCDF
CDR_filled = xr.Dataset({'CDR':filled[0], 'BT':filled[1], 'NT':filled[2]})

CDR_filled.attrs = {'Description': 'Arctic sea ice concentration (SIC) from the Climate Data Record'\
                        +' (CDR), NASA Team (NT) and NASA Bootstrap (BT). All months 1979-2020, '\
                        +'missing data (1984-07, 1987-12, 1988-01) filled with data from the '\
                        +'following years (1985-07, 1988-12, 1989-01) as the following year SIA is '\
                        +'closer than the preceding year SIA for the months with data adjacent to '\
                        +'the missing months.', 
                    'Units'      : 'fraction (0-1)',
                    'Timestamp'  : str(datetime.datetime.utcnow().strftime("%H:%M UTC %a %Y-%m-%d")),
                    'Data source': 'NOAA/NSIDC Climate Data Record of Passive Microwave Sea Ice '\
                        +'Concentration, Version 4, doi:10.7265/efmz-2t65.',
                    'Analysis'   : 'https://github.com/chrisrwp/synthetic-ensemble/SIA/'\
                        +'SIA_calculations_observations.ipynb'}

CDR_filled.to_netcdf(data_path+'Raw_data/observations/NSIDC_CDR_v4/SIC_CDR_BT_NT_79-20_filled.nc')

## Calculate SIA, SIE from interpolated SIC
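
SIA weights each grid cell by its concentration, whereas SIE counts the full area of every cell above the 15% threshold. The toy example below is a sketch of that distinction on a made-up 3x3 field using the 25 km nominal cell size; the array values and variable names are illustrative assumptions, not data from the CDR.

```python
import numpy as np

CELL_AREA_KM2 = 25 * 25  # nominal cell area on the 25 km grid

# toy SIC field (fraction 0-1); NaN marks land or other non-ice cells
sic = np.array([[np.nan, 0.90, 1.00],
                [0.10,   0.50, 0.80],
                [0.00,   0.12, 0.20]])

# SIA: concentration-weighted area, ignoring NaN cells
sia_km2 = np.nansum(sic) * CELL_AREA_KM2

# SIE: full area of every cell with more than 15% concentration
# (NaN compares as False, so masked cells are not counted)
sie_km2 = np.sum(sic > 0.15) * CELL_AREA_KM2

print(sia_km2 / 1e6, sie_km2 / 1e6)  # both in million square km
```

Cells at or below the threshold still contribute to SIA but not to SIE, which is why the two measures can diverge most in the marginal ice zone.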

In [15]:
#calculate SIA
CDR_SIA = CDR_filled.sum('xgrid').sum('ygrid')*625/1e6 #each grid cell is 25x25 km

#calculate SIE
CDR_SIE = {}

for var_ in ['CDR', 'BT', 'NT']:
    ones_zeros = np.where(CDR_filled[var_]>0.15, np.ones(np.shape(CDR_filled[var_])), 
                          np.zeros(np.shape(CDR_filled[var_])))
    
    CDR_SIE[var_] = np.sum(ones_zeros, axis=(1,2))*625/1e6

In [16]:
#combine all the calculations into a single dataset and save to NetCDF

CDR_SIA_SIE = xr.Dataset(data_vars = {'CDR_SIA':(('time'), CDR_SIA['CDR']),
                                      'BT_SIA':(('time'), CDR_SIA['BT']),
                                      'NT_SIA':(('time'), CDR_SIA['NT']),
                                      'CDR_SIE':(('time'), CDR_SIE['CDR']),
                                      'BT_SIE':(('time'), CDR_SIE['BT']),
                                      'NT_SIE':(('time'), CDR_SIE['NT'])},
                         coords    = {'time': CDR_SIA['time']})

CDR_SIA_SIE.attrs = {'Description': 'Arctic sea ice area (SIA) and sea ice extent (SIE) from the '\
                         +'Climate Data Record (CDR), NASA Team (NT) and NASA Bootstrap (BT). All '\
                         +'months 1979-2020, missing data (1984-07, 1987-12, 1988-01) are filled '\
                         +'with the following year (1985-07, 1988-12, 1989-01).', 
                     'Units'      : 'million square km',
                     'Timestamp'  : str(datetime.datetime.utcnow().strftime("%H:%M UTC %a %Y-%m-%d")),
                     'Data source': 'NOAA/NSIDC Climate Data Record of Passive Microwave Sea Ice '\
                         +'Concentration, Version 4, doi:10.7265/efmz-2t65.',
                     'Analysis'   : 'https://github.com/chrisrwp/synthetic-ensemble/SIA/'\
                         +'SIA_calculations_observations.ipynb'}

CDR_SIA_SIE.to_netcdf(data_path+'Raw_data/observations/NSIDC_CDR_v4/SIA_SIE_CDR_BT_NT_79-20_filled.nc')

## Calculate the SIA and SIE of the pole hole for use with SII
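
The idea behind the pole hole calculation is to turn a quality flag into a 0/1 mask, then sum concentration (for SIA) or whole cells (for SIE) under that mask. The sketch below shows this on a made-up 4x4 grid; the flag value 47 is the one tested in the cells that follow, while the arrays and variable names are assumptions for illustration.

```python
import numpy as np

CELL_AREA_KM2 = 25 * 25   # nominal cell area on the 25 km grid
POLE_HOLE_FLAG = 47       # QA value treated as pole hole in this notebook

# toy QA field and SIC field on the same grid (made-up values)
qa = np.array([[0,  0,  0, 0],
               [0, 47, 47, 0],
               [0, 47, 47, 0],
               [0,  0,  0, 0]])
sic = np.full(qa.shape, 0.5)
sic[1:3, 1:3] = [[0.95, 0.97], [0.96, 0.98]]

# 1 inside the pole hole, 0 everywhere else
mask = (qa == POLE_HOLE_FLAG).astype(float)

ph_sia_km2 = np.sum(sic * mask) * CELL_AREA_KM2  # concentration-weighted
ph_sie_km2 = np.sum(mask) * CELL_AREA_KM2        # whole hole counted as ice

print(ph_sia_km2, ph_sie_km2)
```

Because the hole is assumed to be fully ice covered for extent purposes, its SIE is a constant per sensor period, while its SIA varies with the filled concentrations.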

In [20]:
#make mask of 1 for pole hole, 0 for not pole hole for the 3 sizes of pole holes
warnings.filterwarnings("ignore", 'variable ')

#pole hole valid for 1978-10 to 1987-07
qa_1979_01 = xr.open_dataset(data_path+'Raw_data/observations/NSIDC_CDR_v4/seaice_conc_monthly_nh_197901_n07_v04r00.nc')
qa_1979_01 = qa_1979_01['qa_of_cdr_seaice_conc_monthly'].where(qa_1979_01['qa_of_cdr_seaice_conc_monthly']==47,0)
ph_1987_07 = qa_1979_01.where(qa_1979_01==0,1)

#pole hole valid for 1987-08 to 2007-12
qa_1988_09 = xr.open_dataset(data_path+'Raw_data/observations/NSIDC_CDR_v4/seaice_conc_monthly_nh_198809_f08_v04r00.nc')
qa_1988_09 = qa_1988_09['qa_of_cdr_seaice_conc_monthly'].where(qa_1988_09['qa_of_cdr_seaice_conc_monthly']==47,0)
ph_2007_12 = qa_1988_09.where(qa_1988_09==0,1)

#pole hole valid for 2008-01 to present
qa_2008_09 = xr.open_dataset(data_path+'Raw_data/observations/NSIDC_CDR_v4/seaice_conc_monthly_nh_200809_f17_v04r00.nc')
qa_2008_09 = qa_2008_09['qa_of_cdr_seaice_conc_monthly'].where(qa_2008_09['qa_of_cdr_seaice_conc_monthly']==47,0)
ph_current = qa_2008_09.where(qa_2008_09==0,1)

In [21]:
#calculate the SIA of the pole hole - using the edge interpolated data
#each grid cell is 25x25 km so 625/1000000 million square km

#from the interpolated CDR data only select the grid cells containing the pole hole
ph_SIA_1987_07 = (CDR_filled.sel(time=slice('1979-01','1987-07')) * ph_1987_07.values).sum('xgrid').sum('ygrid')*625*1e-6
ph_SIA_2007_12 = (CDR_filled.sel(time=slice('1987-08','2007-12')) * ph_2007_12.values).sum('xgrid').sum('ygrid')*625*1e-6
ph_SIA_present = (CDR_filled.sel(time=slice('2008-01','2020-12')) * ph_current.values).sum('xgrid').sum('ygrid')*625*1e-6

#SIE of pole hole is just the area of the pole hole
                  #1978-11--1987-07, 1987-08--2007-12, 2008-01--present
#pole_hole_SIE = [1.19,              0.31,             0.029]

ph_SIE_1987_07 = (ph_SIA_1987_07['CDR'] * 0) + 1.19
ph_SIE_2007_12 = (ph_SIA_2007_12['CDR'] * 0) + 0.31
ph_SIE_present = (ph_SIA_present['CDR'] * 0) + 0.029

In [22]:
#combine the data into a single dataset and save to NetCDF
ph_SIA = xr.concat((ph_SIA_1987_07, ph_SIA_2007_12, ph_SIA_present), dim='time')
ph_SIE = xr.concat((ph_SIE_1987_07, ph_SIE_2007_12, ph_SIE_present), dim='time')

ph_SIA_SIE = xr.Dataset({'CDR_SIA':ph_SIA['CDR'], 'BT_SIA':ph_SIA['BT'], 'NT_SIA':ph_SIA['NT'] ,'SIE':ph_SIE})

ph_SIA_SIE.attrs = {'Description': 'Arctic sea ice area (SIA) and sea ice extent '\
                        +'(SIE) of the pole hole from NOAA/NSIDC Climate Data Record '\
                        +'of Passive Microwave Sea Ice Concentration, Version 4. '\
                        +'All months 1979-2020.', 
                    'Units'      : 'million square km',
                    'Timestamp'  : str(datetime.datetime.utcnow().strftime(
                        "%H:%M UTC %a %Y-%m-%d")),
                    'Data source': 'NOAA/NSIDC Climate Data Record of Passive Microwave '\
                        +'Sea Ice Concentration, Version 4, doi:10.7265/efmz-2t65.',
                    'Analysis'   : 'https://github.com/chrisrwp/synthetic-ensemble/'\
                        +'SIA/SIA_calculations_observations.ipynb'}

ph_SIA_SIE.to_netcdf(data_path+'Raw_data/observations/NSIDC_CDR_v4/pole_hole_SIA_edge_CDR_BT_NT_79-20.nc')

# HadISST 1 

## Reduce the dataset to 1979-2020 >30N and replace 2009-03 with 2007-03 and 2009-04 with 2008-04
N.B. March and April 2009 have large drops in SIA and SIE (notably large negative SIC anomalies in Hudson Bay, the Labrador Sea and the Sea of Okhotsk); this discontinuity is not shown in other datasets.
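
One way a discontinuity like this could be spotted is to compare a dataset's offset from the other datasets year by year and flag years where that offset is unusual. The sketch below is not necessarily how the 2009 discontinuity was identified here; the values, the `THRESHOLD` tolerance and the variable names are all assumptions for illustration.

```python
import numpy as np

# made-up March SIA (million square km) for three consecutive years
years = np.array([2008, 2009, 2010])
candidate = np.array([13.4, 12.1, 13.3])     # dataset being checked
references = np.array([[13.5, 13.4, 13.3],   # other datasets
                       [13.3, 13.3, 13.2]])

# offset of the candidate from the mean of the reference datasets
offset = candidate - references.mean(axis=0)

# flag years whose offset departs from the median offset by more than
# a hypothetical tolerance
THRESHOLD = 0.5
suspect = years[np.abs(offset - np.median(offset)) > THRESHOLD]

print(suspect)
```

Using the median offset as the baseline keeps a persistent bias between datasets from being flagged, so only sudden departures stand out.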

In [4]:
HadISST1 = xr.open_dataset(data_path+'Raw_data/observations/HadISST/'\
                           +'HadISST_ice.nc')

#select 1979-2020 and the area above 30N
HadISST1_30N = HadISST1['sic'].sel(time=slice('1979','2020')).where(
    HadISST1['latitude']>30, drop=True) 

HadISST1_30N['time'] = CLIVAR_time #adjust to mid-month to exactly match models

HadISST1_30N.to_netcdf(data_path+'Raw_data/observations/HadISST/'\
                       +'HadISST1_NH_79-20.nc') #save to NetCDF

In [66]:
#fill the spurious data with the most appropriate nearby year's data
#HadISST1_30N is already the 'sic' DataArray, so it is indexed by time directly
HadISST1_2009_03 = HadISST1_30N.sel(time='2007-03').copy()
HadISST1_2009_03['time'] = xr.DataArray(data = CLIVAR_time.sel(time='2009-03').values, 
                                        coords={'time': CLIVAR_time.sel(time='2009-03').values}, 
                                        dims=['time'])

HadISST1_2009_04 = HadISST1_30N.sel(time='2008-04').copy()
HadISST1_2009_04['time'] = xr.DataArray(data = CLIVAR_time.sel(time='2009-04').values, 
                                        coords={'time': CLIVAR_time.sel(time='2009-04').values}, 
                                        dims=['time'])

HadISST1_30N_correct = xr.concat((HadISST1_30N.sel(time=slice('1979','2009-02')), 
                                  HadISST1_2009_03, HadISST1_2009_04, 
                                  HadISST1_30N.sel(time=slice('2009-05','2020'))), dim='time')

#save this corrected data to NetCDF
HadISST1_30N_correct.to_netcdf(data_path+'Raw_data/observations/HadISST/HadISST1_NH_79-20_filled.nc')

## From the corrected dataset and area file, compute the SIA and SIE
N.B. Concentrations <0.15 are set to 0; this will affect SIA but not SIE.
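
On a regular latitude-longitude grid the cell area shrinks towards the pole, so SIA and SIE must be area weighted. The sketch below approximates cell areas on a spherical Earth with the standard formula $R^2 \, \Delta\lambda \, |\sin\phi_2 - \sin\phi_1|$ and applies them to a made-up SIC field. It is an illustration only: the notebook itself uses an area file produced by CDO, and the function name `cell_areas_m2`, the radius constant and the toy values are assumptions.

```python
import numpy as np

EARTH_RADIUS_M = 6.371e6  # mean Earth radius, assuming a sphere

def cell_areas_m2(lat_edges_deg, dlon_deg):
    """Area of lat-lon cells between successive latitude edges."""
    lat = np.deg2rad(lat_edges_deg)
    return (EARTH_RADIUS_M**2 * np.deg2rad(dlon_deg)
            * np.abs(np.diff(np.sin(lat))))

# toy 1-degree grid: 3 latitude bands x 2 longitudes
lat_edges = np.array([80.0, 79.0, 78.0, 77.0])
areas = cell_areas_m2(lat_edges, 1.0)[:, np.newaxis] * np.ones((3, 2))

# made-up concentrations (fraction 0-1)
sic = np.array([[1.00, 0.90],
                [0.60, 0.20],
                [0.10, 0.00]])

# dividing m^2 by 1e12 gives million square km
sia = np.sum(sic * areas) / 1e12
sie = np.sum(np.where(sic >= 0.15, areas, 0)) / 1e12

print(sia, sie)
```

Values from this spherical approximation may differ slightly from those in a CDO-generated area file.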

In [72]:
#open area file created from: cdo gridarea -selgrid,2 HadISST_ice.nc HadISST_ice_area.nc
HadISST1_areas = xr.open_dataset(data_path+'Raw_data/observations/HadISST/HadISST_ice_area.nc')
HadISST1_areas_NH = HadISST1_areas['cell_area'].where(HadISST1_areas['latitude']>30,drop=True) #select >30N

In [73]:
#compute SIA and SIE
NH_SIA = (HadISST1_30N_correct * HadISST1_areas_NH / 1e12).sum('latitude').sum('longitude') 
NH_SIE = HadISST1_areas_NH.where(HadISST1_30N_correct>=0.15,0).sum('latitude').sum('longitude') / 1e12

#save calculations to NetCDF
HadISST1_SIA_SIE = xr.Dataset({'SIA' : NH_SIA, 'SIE' : NH_SIE})

HadISST1_SIA_SIE.attrs = {'Description': 'Arctic sea ice area (SIA) and sea ice extent (SIE) from '\
                              +'HadISST1 for all months 1979-2020, calculated using a grid area file '\
                              +'from CDO. Note large negative SIE and SIA anomalies for 2009-03 and '\
                              +'2009-04 are filled with 2007-03 and 2008-04 values.', 
                          'Units'      : 'million square km',
                          'Timestamp'  : str(datetime.datetime.utcnow().strftime(
                              "%H:%M UTC %a %Y-%m-%d")),
                          'Data source': 'Hadley Centre Sea Ice and Sea Surface Temperature data set '\
                              +'(HadISST), doi:10.1029/2002JD002670',
                          'Analysis'   : 'https://github.com/chrisrwp/synthetic-ensemble/SIA/'\
                              +'SIA_calculations_observations.ipynb'}

HadISST1_SIA_SIE.to_netcdf(data_path+'Raw_data/observations/HadISST/HadISST1_SIA_SIE_79-20_filled.nc')

# Compare all SIA and SIE from the different datasets
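
The comparison below relies on a least-squares linear trend fitted to one calendar month per year. As a minimal sketch of that step, the example fits `np.polyfit` to a synthetic series; the series, its slope and noise level, and the variable names are assumptions for illustration and are not observational values.

```python
import numpy as np

years = np.arange(1979, 2021)

# synthetic March SIA: a linear decline plus fixed-seed noise
rng = np.random.default_rng(0)
sia = 14.0 - 0.04 * (years - 1979) + rng.normal(0, 0.1, years.size)

# degree-1 fit returns the slope first, then the intercept
slope, intercept = np.polyfit(years, sia, 1)
trend_line = slope * years + intercept

print(slope * 10)  # trend in million square km per decade
```

Fitting against calendar years rather than an index means the slope is directly in units per year.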

In [53]:
#open the SIA and SIE data sets
CDR_SIA_SIE = xr.open_dataset(
    data_path+'Raw_data/observations/NSIDC_CDR_v4/SIA_SIE_CDR_BT_NT_79-20_filled.nc')

HadISST1_SIA_SIE = xr.open_dataset(
    data_path+'Raw_data/observations/HadISST/HadISST1_SIA_SIE_79-20_filled.nc')

In [54]:
#compute the average from all data sets
average_SIA = (CDR_SIA_SIE['CDR_SIA'] + CDR_SIA_SIE['NT_SIA'] + CDR_SIA_SIE['BT_SIA']
               + HadISST1_SIA_SIE['SIA'])  / 4
average_SIE = (CDR_SIA_SIE['CDR_SIE'] + CDR_SIA_SIE['NT_SIE'] + CDR_SIA_SIE['BT_SIE']
               + HadISST1_SIA_SIE['SIE']) / 4

In [30]:
import matplotlib.pyplot as plt
month_list = ['January', 'February', 'March', 'April', 'May', 'June', 'July', 
              'August', 'September', 'October', 'November', 'December']

In [59]:
#plot a single month time series of SIA or SIE
month_ = 3 
SIE_SIA = 'A'

plt.figure(figsize=[10,5])
CDR_SIA_SIE['CDR_SI{}'.format(SIE_SIA)].sel(time=CDR_SIA_SIE['time.month']==month_).plot(label='CDR')
CDR_SIA_SIE['BT_SI{}'.format(SIE_SIA)].sel(time=CDR_SIA_SIE['time.month']==month_).plot(label='BT')
CDR_SIA_SIE['NT_SI{}'.format(SIE_SIA)].sel(time=CDR_SIA_SIE['time.month']==month_).plot(label='NT')
HadISST1_SIA_SIE['SI{}'.format(SIE_SIA)].sel(time=HadISST1_SIA_SIE['time.month']==month_).plot(
    label='HadISST1')


average_ = average_SIE if SIE_SIA == 'E' else average_SIA #mean matching the chosen variable
average_.sel(time=average_['time.month']==month_).plot(label='Mean', c='k', linewidth=2)

plt.legend(fontsize=12)
plt.xlim(np.datetime64('1979-01'), np.datetime64('2020-12'))
plt.xticks(fontsize=12)
plt.xlabel('Time', fontsize=16)

plt.ylabel(r'$SI{} \ [10^6 \ km^2]$'.format(SIE_SIA), fontsize=16)
plt.yticks(fontsize=12)

plt.title('{} SI{} 1979-2020'.format(month_list[month_-1], SIE_SIA), fontsize=18);

In [60]:
#plot a single month trend plot for all data sets
month_ = 3 
SIE_SIA = 'A'

plt.figure(figsize=[10,5])

coefs = np.polyfit(np.arange(1979,2021), CDR_SIA_SIE['CDR_SI{}'.format(SIE_SIA)].sel(
    time=CDR_SIA_SIE['time.month']==month_), 1)
plt.plot(np.arange(1979,2021), np.arange(1979,2021)*coefs[0] + coefs[1], label='CDR')

coefs = np.polyfit(np.arange(1979,2021), CDR_SIA_SIE['BT_SI{}'.format(SIE_SIA)].sel(
    time=CDR_SIA_SIE['time.month']==month_), 1)
plt.plot(np.arange(1979,2021), np.arange(1979,2021)*coefs[0] + coefs[1], label='BT')

coefs = np.polyfit(np.arange(1979,2021), CDR_SIA_SIE['NT_SI{}'.format(SIE_SIA)].sel(
    time=CDR_SIA_SIE['time.month']==month_), 1)
plt.plot(np.arange(1979,2021), np.arange(1979,2021)*coefs[0] + coefs[1], label='NT')

coefs = np.polyfit(np.arange(1979,2021), HadISST1_SIA_SIE['SI{}'.format(SIE_SIA)].sel(
    time=HadISST1_SIA_SIE['time.month']==month_), 1)
plt.plot(np.arange(1979,2021), np.arange(1979,2021)*coefs[0] + coefs[1], label='HadISST1')

average_ = average_SIE if SIE_SIA == 'E' else average_SIA #mean matching the chosen variable
coefs = np.polyfit(np.arange(1979,2021), average_.sel(
    time=average_['time.month']==month_), 1)
plt.plot(np.arange(1979,2021), np.arange(1979,2021)*coefs[0] + coefs[1], 
         label='Average', c='k', linewidth=2)


plt.legend(fontsize=12)
plt.xlim(1979,2020)
plt.xticks(fontsize=12)
plt.xlabel('Time', fontsize=16)

plt.ylabel(r'$SI{} \ [10^6 \ km^2]$'.format(SIE_SIA), fontsize=16)
plt.yticks(fontsize=12)

plt.title('{} SI{} 1979-2020'.format(month_list[month_-1], SIE_SIA), fontsize=18);

In [None]:
#plot SIA or SIE time series for all months

s_y = [0,1,2,0,1,2,0,1,2,0,1,2] #axes counting
s_x = [0,0,0,1,1,1,2,2,2,3,3,3]

SIE_SIA = 'E' #select either E for extent or A for area

if SIE_SIA == 'E':
    str_name = 'Extent'
    ave = average_SIE.copy()
else:
    str_name = 'Area'
    ave = average_SIA.copy()

fig, axes = plt.subplots(4,3,figsize=[19,12])

for month_i, month_ in enumerate(np.arange(1,13,1)):
    #for each month plot each of the datasets
    CDR_SIA_SIE['CDR_SI{}'.format(SIE_SIA)].sel(time=CDR_SIA_SIE['time.month']==month_).plot(
        label='CDR', ax=axes[s_x[month_i]][s_y[month_i]])
    CDR_SIA_SIE['BT_SI{}'.format(SIE_SIA)].sel(time=CDR_SIA_SIE['time.month']==month_).plot(
        label='BT', ax=axes[s_x[month_i]][s_y[month_i]])
    CDR_SIA_SIE['NT_SI{}'.format(SIE_SIA)].sel(time=CDR_SIA_SIE['time.month']==month_).plot(
        label='NT', ax=axes[s_x[month_i]][s_y[month_i]])
    HadISST1_SIA_SIE['SI{}'.format(SIE_SIA)].sel(time=HadISST1_SIA_SIE['time.month']==month_).plot(
        label='HadISST1', ax=axes[s_x[month_i]][s_y[month_i]])
        
    ave.sel(time=ave['time.month']==month_).plot(label='Mean', c='k', linewidth=2, 
                                                 ax=axes[s_x[month_i]][s_y[month_i]], linestyle='--')
    
    #set the x-axis limits
    axes[s_x[month_i]][s_y[month_i]].set_xlim(np.datetime64('1979-{}'.format(str(month_).zfill(2))), 
                                              np.datetime64('2020-{}'.format(str(month_).zfill(2))))
    axes[s_x[month_i]][s_y[month_i]].set_title(month_list[month_-1], fontsize=16)
    axes[s_x[month_i]][s_y[month_i]].set_xlabel('')
    
    if s_x[month_i] == 3:
        axes[s_x[month_i]][s_y[month_i]].set_xlabel('Year', fontsize=14)
    if s_y[month_i] == 0:
        axes[s_x[month_i]][s_y[month_i]].set_ylabel(
            r'$Sea \ Ice \ {} \ [10^6 \ km^2]$'.format(str_name), fontsize=14)
        
    plt.tight_layout()
    
extra_legend = plt.legend(bbox_to_anchor=(1.2, 1), loc='upper center', borderaxespad=0, 
                          ncol=1, fontsize=14)
plt.gca().add_artist(extra_legend);

In [None]:
#plot trends of SIA or SIE for all months
s_y = [0,1,2,0,1,2,0,1,2,0,1,2]
s_x = [0,0,0,1,1,1,2,2,2,3,3,3]

SIE_SIA = 'A'

if SIE_SIA == 'E':
    str_name = 'Extent'
    ave = average_SIE.copy()
else:
    str_name = 'Area'
    ave = average_SIA.copy()

fig, axes = plt.subplots(4,3,figsize=[19,12])

for month_i, month_ in enumerate(np.arange(1,13,1)):
    #for each month calculate the trend for each data set and plot it
    coefs = np.polyfit(np.arange(1979,2021), CDR_SIA_SIE['CDR_SI{}'.format(SIE_SIA)].sel(
        time=CDR_SIA_SIE['time.month']==month_), 1)
    axes[s_x[month_i]][s_y[month_i]].plot(np.arange(1979,2021), 
                                          np.arange(1979,2021)*coefs[0] + coefs[1], label='CDR')

    coefs = np.polyfit(np.arange(1979,2021), CDR_SIA_SIE['BT_SI{}'.format(SIE_SIA)].sel(
        time=CDR_SIA_SIE['time.month']==month_), 1)
    axes[s_x[month_i]][s_y[month_i]].plot(np.arange(1979,2021), 
                                          np.arange(1979,2021)*coefs[0] + coefs[1], label='BT')

    coefs = np.polyfit(np.arange(1979,2021), CDR_SIA_SIE['NT_SI{}'.format(SIE_SIA)].sel(
        time=CDR_SIA_SIE['time.month']==month_), 1)
    axes[s_x[month_i]][s_y[month_i]].plot(np.arange(1979,2021), 
                                          np.arange(1979,2021)*coefs[0] + coefs[1], label='NT')

    coefs = np.polyfit(np.arange(1979,2021), HadISST1_SIA_SIE['SI{}'.format(SIE_SIA)].sel(
        time=HadISST1_SIA_SIE['time.month']==month_), 1)
    axes[s_x[month_i]][s_y[month_i]].plot(np.arange(1979,2021), 
                                          np.arange(1979,2021)*coefs[0] + coefs[1], label='HadISST1')


    coefs = np.polyfit(np.arange(1979,2021), ave.sel(time=ave['time.month']==month_), 1)
    axes[s_x[month_i]][s_y[month_i]].plot(np.arange(1979,2021), 
                                          np.arange(1979,2021)*coefs[0] + coefs[1], 
                                          label='Average', c='k', linewidth=2, linestyle='--')
    
    
    axes[s_x[month_i]][s_y[month_i]].set_xlim(1979,2020)
    axes[s_x[month_i]][s_y[month_i]].set_title(month_list[month_-1], fontsize=16)
    axes[s_x[month_i]][s_y[month_i]].set_xlabel('')
    
    if s_x[month_i] == 3:
        axes[s_x[month_i]][s_y[month_i]].set_xlabel('Year', fontsize=14)
    if s_y[month_i] == 0:
        axes[s_x[month_i]][s_y[month_i]].set_ylabel(
            r'$Sea \ Ice \ {} \ [10^6 \ km^2]$'.format(str_name), fontsize=14)
        
    plt.tight_layout()
    
extra_legend = plt.legend(bbox_to_anchor=(1.2, 1), loc='upper center', borderaxespad=0, 
                          ncol=1, fontsize=14)
plt.gca().add_artist(extra_legend);