In [2]:
import numpy as np
import xarray as xr

Read in the two-month lead time forecast data for all six locations.

In [6]:
FCDAT = xr.open_dataset('C:/Users/durka/Downloads/Thesis_Data/type_fcmean_LM02.nc')

Set the latitude and longitude coordinates of each location, and use .sel() to select the gridpoint nearest to each location.

In [7]:
LATLON = np.array([[53.428, -6.241],[51.847, -8.486],[54.228, -10.007],[51.938, -10.241],[55.372, -7.339],[53.537, -7.362]])

# Build one nearest-gridpoint selection per location, keyed by location index 0-5.
FC_LM02 = {i: FCDAT.t2m.sel(latitude=xr.DataArray([lat], dims='lat'),
                            longitude=xr.DataArray([lon], dims='lon'),
                            method='nearest')
           for i, (lat, lon) in enumerate(LATLON)}

The task now is to find the ensemble member that remains closest to the ensemble mean over the five-year period.

The first step is to build an array, ``devs``, that measures how much each ensemble member deviates from the ensemble mean at each time step. Its shape is 6x60x51, where the dimensions represent 6 locations, 60 time steps and 51 ensemble members respectively.

In [8]:
devs = np.zeros([6,60,51])

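for key in FC_LM02:
    # Compute the ensemble mean once per location instead of inside the inner loops.
    ens_mean = FC_LM02[key].mean(dim='number')
    for t in range(60):
        for num in range(51):
            # Absolute deviation of member `num` from the ensemble mean at time step `t`.
            deviation = np.abs(FC_LM02[key][t,num] - ens_mean[t])
            # `deviation` has shape (1, 1) (lat, lon); .item() extracts the scalar.
            devs[key,t,num] = deviation.values.item()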
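The same deviations can be computed without explicit loops by using NumPy broadcasting. The sketch below is only an illustration on synthetic data: ``forecasts`` is a hypothetical array standing in for the six selections stacked into shape (location, time, member).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the forecasts, in Kelvin: (location, time, member).
forecasts = 283.0 + rng.normal(size=(6, 60, 51))

# Ensemble mean over the member axis; keepdims=True keeps a length-1
# member axis so the mean broadcasts against every member.
ens_mean = forecasts.mean(axis=2, keepdims=True)

# Absolute deviation of every member from the mean at every time step.
devs_vec = np.abs(forecasts - ens_mean)
print(devs_vec.shape)  # (6, 60, 51)
```

For arrays of this size the loop and the vectorised form give the same values; the vectorised form mainly avoids recomputing the mean and runs much faster.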

Next, we compute the L2 norm of each column of ``devs``, that is, of the 60 deviations along the time axis for each location and ensemble member. The results are stored in an array called ``col_norms``. Each column norm summarises the overall deviation of a particular ensemble member from the mean throughout the entire five-year period.

The shape of ``col_norms`` is 6x51, where the dimensions represent 6 locations and 51 column norms, one norm per ensemble member.

In [9]:
col_norms = np.zeros([6,51])

for loc in range(6):
    for num in range(51):
        norms = np.linalg.norm(devs[loc,:,num])
        col_norms[loc,num] = norms
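As a sketch of the same step in vectorised form, ``np.linalg.norm`` accepts an ``axis`` argument, so the norm over the time axis can be taken in one call. The array ``devs_demo`` below is synthetic and only stands in for ``devs``. Dividing the norm by the square root of the number of time steps gives a root-mean-square deviation, which ranks the members identically.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for devs: (location, time, member), non-negative.
devs_demo = np.abs(rng.normal(size=(6, 60, 51)))

# L2 norm along the time axis (axis=1) for every location and member.
col_norms_vec = np.linalg.norm(devs_demo, axis=1)
print(col_norms_vec.shape)  # (6, 51)

# Root-mean-square deviation: the same norm scaled by 1/sqrt(n_times).
rms_dev = col_norms_vec / np.sqrt(devs_demo.shape[1])
```

Because the scaling factor is the same for every member, ``np.argmin`` returns the same member for ``col_norms_vec`` and ``rms_dev``.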

Finally, we create an array called ``nearest_ensemble``, which contains, for each location, the index of the ensemble member that deviates the least from the ensemble mean.

This is obtained using NumPy's ``np.argmin()`` function.

In [10]:
nearest_ensemble = np.zeros([6])

for loc in range(6):
    nearest_ensemble[loc] = np.argmin(col_norms[loc,:])

In [11]:
nearest_ensemble

array([50., 50., 50., 50., 50., 50.])

The output above shows that the ensemble member at index 50 (the 51st member, since indexing is zero-based) deviates the least from the mean at all six locations.
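The selection step can also be written in one call with the ``axis`` argument of ``np.argmin``, which returns integer indices directly. The sketch below uses a synthetic ``col_norms_demo`` array whose minimum is deliberately placed at index 50 for illustration; it is not the data from this notebook.

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic stand-in for col_norms: (location, member).
col_norms_demo = rng.uniform(1.0, 5.0, size=(6, 51))
# Place a known minimum at index 50 so the result is easy to check.
col_norms_demo[:, 50] = 0.5

# Index of the smallest norm for each location (integer dtype).
nearest_idx = np.argmin(col_norms_demo, axis=1)

# Zero-based index 50 corresponds to the 51st ensemble member.
member_ordinal = nearest_idx + 1
print(nearest_idx, member_ordinal)
```

Keeping the indices as integers also removes the need for an ``int(...)`` conversion when they are later used to index an array.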

We then create an array ``FC_LM02_data`` containing the predictions of the ensemble member that deviates the least from the mean for each of the six locations. Each value is converted from Kelvin to Celsius because the observational and climatological data are in Celsius.

The array is saved using ``%store FC_LM02_data``, which persists the variable so that it can be used in other notebooks.

In [9]:
FC_LM02_data = np.zeros([6,60])

for i in range(6):
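    # Take the chosen member's 60-step series at the single selected gridpoint
    # and convert from Kelvin to Celsius.
    FC_LM02_data[i] = FC_LM02[i][:,int(nearest_ensemble[i]),0,0].values - 273.15
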
%store FC_LM02_data

Stored 'FC_LM02_data' (ndarray)
