# Spatial resampling of gridded ECMWF data onto a satellite's swath

## Comparison between MetView and pyResample

For the retrieval of physical parameters in the atmosphere via Optimal Estimation, we need a priori knowledge of the quantity we want to retrieve; this a priori knowledge can be obtained from a forecast model.

A walk-through on how this a priori knowledge can be obtained has been presented [here](https://github.com/deweatherman/ExtractApriori/blob/main/ExtractApriori_v1.ipynb).

The goal of this notebook is to compare the *Nearest Neighbours with Gaussian weights* resampling from *pyResample* with the *Bilinear interpolation* from *MetView*. We remind the reader that MetView leverages the [*Meteorological Interpolation and Regridding*](https://confluence.ecmwf.int/display/UDOC/MARS+interpolation+with+MIR) (*MIR*) package for [regridding](https://www.ecmwf.int/en/newsletter/169/computing/advanced-regridding-metview) (i.e. resampling).
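
To make the two methods concrete, here is a minimal sketch on a single toy grid cell. It assumes Gaussian weights of the form exp(-d²/σ²) applied to the four surrounding points; the function names, the value of `sigma` and the corner values are illustrative and are not taken from either library.

```python
import numpy as np

def bilinear(corners, fx, fy):
    """Bilinear interpolation inside one grid cell.

    corners = (v00, v10, v01, v11): values at (x0,y0), (x1,y0), (x0,y1), (x1,y1).
    fx, fy: fractional position of the target inside the cell, in [0, 1].
    """
    v00, v10, v01, v11 = corners
    bottom = v00 * (1.0 - fx) + v10 * fx
    top = v01 * (1.0 - fx) + v11 * fx
    return bottom * (1.0 - fy) + top * fy

def gaussian_weighted(values, distances, sigma):
    """Weighted mean of neighbour values, weights exp(-d^2 / sigma^2) (assumed form)."""
    values = np.asarray(values, dtype=float)
    distances = np.asarray(distances, dtype=float)
    weights = np.exp(-(distances ** 2) / sigma ** 2)
    return float(np.sum(weights * values) / np.sum(weights))

# Toy cell with unit spacing (made-up corner values) and a target point inside it
corners = (1.0, 2.0, 3.0, 4.0)
fx, fy = 0.25, 0.5
corner_xy = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
dist = np.hypot(corner_xy[:, 0] - fx, corner_xy[:, 1] - fy)

value_bilinear = bilinear(corners, fx, fy)
value_gauss = gaussian_weighted(corners, dist, sigma=0.5)
```

At the exact centre of the cell all four distances are equal, so both estimates reduce to the plain mean of the corners; away from the centre they generally differ, which is the kind of difference this notebook looks at.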


In this notebook we use: 
- [xarray](https://docs.xarray.dev/en/stable/)'s functionality to handle high dimensional datasets and to perform the time interpolation. 
- [Pyresample](https://pyresample.readthedocs.io/en/latest/)'s functionality to efficiently resample data from a reduced Gaussian grid onto a swath.
- [MetView](https://metview.readthedocs.io/en/latest/index.html)'s functionality to efficiently resample data from a reduced Gaussian grid onto a swath.
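
As a rough illustration of the time interpolation mentioned in the list above, the sketch below linearly interpolates a field between forecast steps at the observation times. Linear interpolation, the step spacing and the values are assumptions made for the example; in xarray the corresponding operation would typically go through `Dataset.interp`.

```python
import numpy as np

# Hypothetical forecast steps (hours since the base time) and made-up values
# of one wind component (m/s) at a single location
step_hours = np.array([0.0, 3.0, 6.0])
u10n = np.array([4.0, 5.5, 7.0])

# Hypothetical observation times, also in hours since the base time
obs_hours = np.array([1.0, 4.5])

# Piecewise-linear interpolation in time
u_at_obs = np.interp(obs_hours, step_hours, u10n)
```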

First we will import the needed libraries and packages:

In [1]:
import sys
import os

import xarray as xr

import numpy as np
import pandas as pd

import matplotlib
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

import cartopy
import cartopy.crs as ccrs
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER

import pyproj
import pyresample
from pyresample import create_area_def, load_area, data_reduce, utils, AreaDefinition
from pyresample.geometry import SwathDefinition, GridDefinition
from pyresample.kd_tree import resample_nearest, resample_gauss 
from pyresample.bilinear import XArrayBilinearResampler, NumpyBilinearResampler

#sys.path.append('/home/mario/anaconda3/envs/pyOpEst/lib/python3.9/site-packages/')

import metview as mv

import time

%matplotlib inline


We define some directories for easy access to the datasets:

In [3]:
# Satellite data:
#dataSatDir = '/home/mario/Data/CMSAF/ssims/F16/'
dataSatDir = '/home/mario/Data/CMSAF/ssims/F16/ORD47662/'
#dataSatDir = '/nobackup/users/echeverr/data/cmsaf/ssmis/F16/'
#fileSatID = 'BTRin20140909000000324SSF1601GL.nc'

# ECMWF data:
dataECMWFDir_RG ='/home/mario/Data/Covariance_means/MARS_api_data/datasetsApriori/'
#dataECMWFDir = '/nobackup/users/echeverr/data/ECMWF_era5/MARS_api_data/datasetsApriori/'
#dataECMWFDir_Reg ='/home/mario/Data/Covariance_means/MARS_api_data/datasetsAprioriRegGrid/'

sys.path.append('support') # where supporting_routines_m live

import support_routines 

# Data preparation

We have two sources of data that we care about in this notebook: ECMWF's data and the satellite's swath definition (we do not explicitly use the observations per se; rather, we use the locations of the observations, i.e. the points onto which we want to resample and interpolate our ECMWF data).
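
The role of the swath locations can be sketched as follows: for every swath point we look for the closest source grid point on the sphere. This is a brute-force illustration with made-up coordinates and hypothetical helper names; resampling libraries normally use a KD-tree for this search rather than a full distance matrix.

```python
import numpy as np

def lonlat_to_unit_xyz(lon_deg, lat_deg):
    """Convert longitude/latitude in degrees to points on the unit sphere."""
    lon = np.radians(np.asarray(lon_deg, dtype=float))
    lat = np.radians(np.asarray(lat_deg, dtype=float))
    return np.stack(
        [np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat)],
        axis=-1,
    )

def nearest_source_index(src_lon, src_lat, tgt_lon, tgt_lat):
    """Index of the nearest source point for each target point (brute force)."""
    src = lonlat_to_unit_xyz(src_lon, src_lat)  # (n_src, 3)
    tgt = lonlat_to_unit_xyz(tgt_lon, tgt_lat)  # (n_tgt, 3)
    # The largest dot product corresponds to the smallest great-circle distance
    return np.argmax(tgt @ src.T, axis=1)

# Made-up source grid points and swath locations (degrees)
grid_lon = np.array([0.0, 10.0, 350.0, 180.0])
grid_lat = np.array([0.0, 0.0, 0.0, 80.0])
swath_lon = np.array([-9.0, 9.0, 170.0])
swath_lat = np.array([1.0, -1.0, 85.0])

idx = nearest_source_index(grid_lon, grid_lat, swath_lon, swath_lat)
```

Working in Cartesian coordinates avoids special handling of the longitude wrap-around: the swath point at -9° is matched to the grid point at 350°, not to the one at 0°.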

First we load ECMWF's datasets using xarray; because the focus is on the resampling of the 10m wind speed components over the ocean, we use *surface*-like datasets (i.e. with variables defined at the surface, e.g. 10m wind speed or 2m temperature).


In [4]:
# xarray surface dataset:

xarray_ds = xr.open_mfdataset(dataECMWFDir_RG+'surface*.grib', 
                                 engine="cfgrib") #, chunks={'time': 50,'latitude': 50, 'longitude': 200})
xarray_ds


Output (dataset summary, condensed from the HTML table): one-dimensional arrays of shape (542080,), float64, 4.14 MiB each; datetime64[ns] values of shape (14, 16), 1.75 kiB; variables of shape (14, 16, 542080), float32, 463.20 MiB each, each held as a single chunk.


Next we read the dataset using *MetView*'s API:

In [6]:
# metview read dataset

# Opening ...*... (BAD) type is not supported.
metview_ds = mv.read(dataECMWFDir_RG+'surface_2014.grib')

Once the dataset is read, we have to access the proper field of interest; the reader is free to explore *MetView*'s documentation on how exactly to do this. 

It took me a bit because the interface is highly dependent on ECMWF's terminology; this is one **disadvantage** with respect to *xarray*'s transparent and very informative interface, as seen two cells up.

For this test we take the *u* component of the 10m neutral wind; for simplicity we take a single time instant of interest (i.e. one map), since this notebook is only concerned with the spatial resampling comparison.

In [7]:
u_metview_ds = mv.read(data=metview_ds, param='u10n',
                       date = '20141001', time='0000',
                       step='3',lsm='on')
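
The `date`, `time` and `step` keywords together identify one forecast map. Assuming the step is expressed in hours, the valid time of the selected field can be worked out as in this small sketch (the helper `valid_time` is hypothetical and not part of MetView):

```python
from datetime import datetime, timedelta

def valid_time(date, time, step_hours):
    """Base date/time of the forecast plus the forecast step (assumed in hours)."""
    base = datetime.strptime(date + time, "%Y%m%d%H%M")
    return base + timedelta(hours=int(step_hours))

# Same keyword values as in the cell above
vt = valid_time("20141001", "0000", "3")
```

Under that assumption the selected map is valid at 03:00 on 1 October 2014.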


Our ECMWF dataset seems to be ready to use; we now focus on our satellite observations.

In this notebook we use CMSAF's [data](https://wui.cmsaf.eu/safira/action/viewDoiDetails?acronym=FCDR_MWI_V003) (brightness temperatures).

CMSAF's data is structured in logical groups, where each group coincides with a group of channels that share the same antenna of the instrument; this logical separation is very useful because different antennas will (likely) have effectively different footprints and sampling on the ground. 

We first open the datasets (7 days of observations, overlapping as much as possible in time with our ECMWF data); the open method in xarray (*open_mfdataset*) will open only the highest level (the root group) of the datasets, which is useful to grasp their contents (*channels*, *time*, *swath*, etc.). Notice that *open_mfdataset*, unlike a plain *open_dataset* call, uses **Dask** [under the hood](https://docs.xarray.dev/en/stable/user-guide/dask.html) by default, so the data are read in a *lazy* way, in chunks, rather than loaded into memory at once:
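
To see why lazy loading matters here, the sketch below estimates the in-memory size of a dense array from its shape and data type. The two shapes are the ones reported in the dataset summaries of this notebook; the helper `size_mib` is a hypothetical name introduced for the example.

```python
import math
import numpy as np

def size_mib(shape, dtype):
    """Size in MiB of a dense array with the given shape and dtype."""
    return math.prod(shape) * np.dtype(dtype).itemsize / 2**20

# One satellite variable at the root level: (time, date, channel)
satellite_var = size_mib((318533, 7, 26), "float64")

# One ECMWF surface variable: (time, step, grid point)
surface_var = size_mib((14, 16, 542080), "float32")
```

Each of these variables alone takes several hundred MiB, so loading every variable of every file eagerly would quickly add up.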

In [9]:
# Open satellite dataset at highest level (just to get the channels information):

#ds = xr.open_dataset(dataSatDir+fileID)
ds = xr.open_mfdataset(dataSatDir+'*.nc')
ds

Output (dataset summary, condensed from the HTML table): arrays of shape (318533, 7, 26) in chunks of (45505, 7, 26), float64 or object, 442.30 MiB each; arrays of shape (7, 318533) in chunks of (7, 45505), float64 or object (17.01 MiB each) and float32 (8.51 MiB each); arrays of shape (7, 318533, 26), float32, 221.15 MiB each; arrays of shape (318533, 7), float64, 17.01 MiB each. All arrays are split into 7 chunks.

Logical groups in NetCDF files (.nc) can be accessed directly with *xarray* if you know the name of the group (if the data provider does not document the group names, they can easily be listed with the netCDF4 library).

For our particular application (10m wind speed retrieval via Optimal Estimation) we want to access the groups *scene_env1* and *scene_env2* (containing channels 19 and 37 GHz horizontal/vertical polarizations):

In [10]:
# Open specific scenes containing the satellite observations:

scenes_list = ['scene_env1', 'scene_env2']
scene_BT = []

for scene in scenes_list:        
    scene_BT.append(xr.open_mfdataset(
        dataSatDir+'*.nc', combine = 'nested', 
        concat_dim='time', group = scene)) 

#for scene in scenes_list:
    #scene_BT.append(xr.open_dataset(dataSatDir+fileID, group = scene))
    #scene_BT.append(xr.open_mfdataset(dataSatDir+'*.nc', group = scene))

Let's check *scene_env1*:

In [11]:
scene_BT[0]

Output (dataset summary, condensed from the HTML table): arrays of shape (318533, 90) in chunks of (45505, 90), float64 (218.72 MiB each) and float32 (109.36 MiB each); arrays of shape (318533, 3, 90) in chunks of (45505, 3, 90), float32, 328.08 MiB each. All arrays are split into 7 chunks.

We can now simply concatenate scenes *scene_env1* and *scene_env2*, because they share the same swath definition (*lon*/*lat* values for each *time*/*scene_across_track* combination; you can check this in CMSAF's product [documentation](https://www.cmsaf.eu/SharedDocs/Literatur/document/2016/saf_cm_dwd_pum_fcdr_ssmis_1_4_pdf.pdf?__blob=publicationFile)).

In [12]:
ds_BT = xr.concat(scene_BT, dim = 'scene_channel') #.drop_vars([])

In [13]:
ds_BT

Output (dataset summary, condensed from the HTML table): arrays of shape (9, 318533, 90) in chunks of (6, 45505, 90), float64 (1.92 GiB each) and float32 (0.96 GiB each); arrays of shape (318533, 9, 90) in chunks of (45505, 6, 90), float32, 0.96 GiB each. All arrays are split into 14 chunks.

After concatenating along a dimension (in our case *scene_channel*), *xarray* fills in any "missing" information; in our case we see that *xarray* added the dimension *scene_channel* to variables that do not really depend on it.

Here we simply correct this by selecting the first element along this dimension for all the variables that do not actually depend on it; we also notice that the attributes (e.g. comment: channels h19, v19, etc.) are not complete after the concatenation; **we will not focus on this** in this notebook.
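
Taking the first element is only safe if every slice along the added dimension is identical. A minimal NumPy sketch of such a check is given here; the small array is a made-up stand-in for a variable such as *lat* after concatenation, and `drop_redundant_axis` is a hypothetical helper, not an xarray function.

```python
import numpy as np

# Made-up stand-in: a 2-D field repeated along a leading "scene_channel" axis,
# giving shape (scene_channel, time, scene_across_track) = (9, 4, 3)
lat_2d = np.linspace(-60.0, 60.0, 12).reshape(4, 3)
lat_broadcast = np.broadcast_to(lat_2d, (9, 4, 3))

def drop_redundant_axis(arr, axis=0):
    """Return the first slice along `axis` after checking that all slices agree."""
    first = np.take(arr, 0, axis=axis)
    same = np.all(np.isclose(arr, np.expand_dims(first, axis), equal_nan=True))
    if not same:
        raise ValueError("slices differ along the axis; it is not redundant")
    return first

lat_clean = drop_redundant_axis(lat_broadcast)
```

If the check raises, the variable does depend on the channel dimension and should not be reduced this way.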

In [14]:
ds_BT['lat'] = ds_BT.lat[0,:,:]
ds_BT['lon'] = ds_BT.lon[0,:,:]
ds_BT['eia'] = ds_BT.eia[0,:,:]
ds_BT['sft'] = ds_BT.sft[0,:,:]
ds_BT['qc_fov'] = ds_BT.qc_fov[0,:,:]
ds_BT['laz'] = ds_BT.laz[0,:,:]

# And visualize again:
ds_BT

Output (dataset summary, condensed from the HTML table): arrays of shape (318533, 90) in chunks of (45505, 90), float64 (218.72 MiB each) and float32 (109.36 MiB each), 7 chunks; arrays of shape (318533, 9, 90) in chunks of (45505, 6, 90), float32, 0.96 GiB each, 14 chunks.

Finally, we want to keep track of the frequency of the channels that we will use, so we select those channels (they are intended for use only over the ocean, where sft==0; note that this mask is left commented out in the cell below). We also copy the *central_freq* and *polarization* variables, taking the values that we already have in our higher-level dataset *ds*:

In [15]:
ds_aux = ds_BT.assign_coords(time=ds.time).sel(
    scene_channel=[11,12,14,15])#.where(ds_BT.sft==0)

ds_aux['central_freq'] = ds['central_freq'][0,0,ds_aux['scene_channel']]
ds_aux['polarization'] = ds['polarization'][0,0,ds_aux['scene_channel']]

# Create working satellite dataset (setting 'scene_channel' as last dimension):

SAT_ds = ds_aux.transpose(...,'scene_channel') #.drop_dims(drop_dims = ['date','channel'])
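
The two steps in the cell above (selection by channel label, then moving the channel dimension to the end) can be mimicked in plain NumPy. In this sketch the channel labels 8 to 16 and the tiny array sizes are hypothetical; only the wanted labels `[11, 12, 14, 15]` come from the cell.

```python
import numpy as np

# Hypothetical labels of the nine concatenated channels
channel_labels = np.arange(8, 17)

# Toy brightness-temperature array, (time, scene_channel, scene_across_track);
# each channel is filled with its own label so the result is easy to inspect
tb = np.broadcast_to(channel_labels[None, :, None], (5, 9, 3)).astype(np.float32)

# Label-based selection: find the position of each wanted label
wanted = [11, 12, 14, 15]
positions = [int(np.flatnonzero(channel_labels == w)[0]) for w in wanted]
selected = tb[:, positions, :]

# Put the channel axis last, as transpose(..., 'scene_channel') does
channel_last = np.moveaxis(selected, 1, -1)
```

The point of the sketch is that `.sel` works on coordinate labels, not on positions, so the list `[11, 12, 14, 15]` refers to channel identifiers and not to array indices.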

In [16]:
SAT_ds

Output (dataset summary, condensed from the HTML table): arrays of shape (318533, 90) in chunks of (45505, 90), float64 (218.72 MiB each) and float32 (109.36 MiB each), 7 chunks; arrays of shape (318533, 90, 4) in chunks of (45505, 90, 2), float32, 437.44 MiB each, 14 chunks; an array of shape (4,), float64, 32 B, 1 chunk.

Unnamed: 0,Array,Chunk
Bytes,32 B,32 B
Shape,"(4,)","(4,)"
Count,74 Graph Layers,1 Chunks
Type,float64,numpy.ndarray

Unnamed: 0,Array,Chunk
Bytes,32 B,32 B
Shape,"(4,)","(4,)"
Count,81 Graph Layers,1 Chunks
Type,object,numpy.ndarray
"Array Chunk Bytes 32 B 32 B Shape (4,) (4,) Count 81 Graph Layers 1 Chunks Type object numpy.ndarray",4  1,

Unnamed: 0,Array,Chunk
Bytes,32 B,32 B
Shape,"(4,)","(4,)"
Count,81 Graph Layers,1 Chunks
Type,object,numpy.ndarray


# Spatial resampling of ECMWF's data onto our satellite's swath

We now resample using **two** different techniques: 
- Bilinear interpolation provided by [*MetView*](https://metview.readthedocs.io/en/latest/index.html) (ECMWF + INPE).
- Nearest Neighbours with Gaussian weights (radial weights) provided by [*pyResample*](https://pyresample.readthedocs.io/en/latest/).

The test is simple: resample onto our **satellite swath** using both approaches and check the differences.
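
To make the two schemes concrete, the sketch below resamples a single target point in plain NumPy. It assumes a flat Cartesian toy geometry (not the sphere) and the weight function exp(-d²/σ²) described in the pyResample documentation; `gauss_weighted_nn` and `bilinear` are hypothetical helper names, not functions of either library.

```python
import numpy as np

def gauss_weighted_nn(src_xy, src_val, target_xy, sigma, radius, neighbours):
    """Gaussian-weighted mean of the nearest source points around one target."""
    dist = np.hypot(*(src_xy - target_xy).T)       # distance to every source point
    nearest = np.argsort(dist)[:neighbours]        # indices of the k nearest points
    nearest = nearest[dist[nearest] <= radius]     # drop those outside the radius
    if nearest.size == 0:
        return np.nan                              # nothing close enough
    w = np.exp(-dist[nearest] ** 2 / sigma ** 2)   # Gaussian weights
    return np.sum(w * src_val[nearest]) / np.sum(w)

def bilinear(x, y, x0, x1, y0, y1, f00, f10, f01, f11):
    """Bilinear interpolation inside the cell [x0, x1] x [y0, y1]."""
    tx = (x - x0) / (x1 - x0)
    ty = (y - y0) / (y1 - y0)
    return (f00 * (1 - tx) * (1 - ty) + f10 * tx * (1 - ty)
            + f01 * (1 - tx) * ty + f11 * tx * ty)

# Four corners of a unit cell carrying the linear toy field f(x, y) = 2x + 3y
src_xy = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
src_val = 2 * src_xy[:, 0] + 3 * src_xy[:, 1]
target = np.array([0.25, 0.5])

u_gauss = gauss_weighted_nn(src_xy, src_val, target,
                            sigma=1.0, radius=2.0, neighbours=4)
u_bilin = bilinear(0.25, 0.5, 0.0, 1.0, 0.0, 1.0, *src_val)
print(u_gauss, u_bilin)
```

In this toy case the bilinear scheme reproduces the linear field exactly, whereas the Gaussian-weighted mean does not; small systematic differences between the two schemes are therefore to be expected.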

In [22]:
# User defined desired period of time to analyze:
initSat_time = np.datetime64('2014-10-01T00:00:00.000') 
endSat_time = np.datetime64('2014-10-01T23:59:59.000')

# Find best match (i.e. nearest) for the times present in the dataset:
init_time = SAT_ds.time.sel(time=initSat_time, method = "nearest")
end_time = SAT_ds.time.sel(time=endSat_time, method = "nearest")

In [23]:
work_SAT_ds = SAT_ds.sel(time=slice(init_time,end_time))
                             
work_SAT_ds

[Output: xarray Dataset repr, abridged. After the time selection the 2-D swath variables have shape (45505, 90) in a single chunk (float64 or float32), the 3-D variables have shape (45505, 90, 4) in 2 chunks (float32), and the two 1-D variables have shape (4,).]


Now we come to the *pyresample* specifics. In order to resample, *pyresample* needs a [*geometry*](https://pyresample.readthedocs.io/en/latest/geo_def.html) definition for both the source and the target; several kinds are available. Here we use the swath definition for both the satellite data and the ECMWF data (the latter are on an ECMWF reduced Gaussian grid, so their points do not form a regular lon/lat mesh).

The definitions are more or less self-explanatory:

In [24]:

# Define swath using PyResample's SwathDefinition (geometry def.): 
SAT_swath_def = SwathDefinition(lons = work_SAT_ds.lon.values, 
                                lats = work_SAT_ds.lat.values)

# Reduced Gaussian as swath:
# Longitude values for SwathDefinition need to be in [-180,180]
ECMWF_grid_def_RG = SwathDefinition(lons = xarray_ds.longitude.values-180, 
                                lats = xarray_ds.latitude.values)
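
A note on longitude conventions: subtracting 180 maps a [0, 360] axis into [-180, 180], but it also relabels every meridian by 180 degrees, so it is only appropriate when the swath longitudes follow the matching convention. If instead both datasets use the standard conventions (model grid in [0, 360], swath in [-180, 180]; this is an assumption to be checked against the data), a wrap that keeps every point at its geographic position can be sketched as follows. `wrap_to_180` and `wrap_to_360` are hypothetical helper names.

```python
import numpy as np

def wrap_to_180(lon):
    """Map longitudes in degrees to [-180, 180) without moving any point."""
    return (np.asarray(lon, dtype=float) + 180.0) % 360.0 - 180.0

def wrap_to_360(lon):
    """Map longitudes in degrees to [0, 360) without moving any point."""
    return np.asarray(lon, dtype=float) % 360.0

lon_0_360 = np.array([0.0, 90.0, 180.0, 270.0, 359.5])
lon_wrapped = wrap_to_180(lon_0_360)
print(lon_wrapped)
```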


We now resample using MetView's **Bilinear interpolation**; the result is stored in an xarray `DataArray` that is added directly to our satellite dataset:

In [25]:
# Bilinear interpolation using MetView:

startTime = time.time()

work_SAT_ds['u_metview_bilinear'] = xr.DataArray(
    # Longitude values for mv.interpolate need to be in [0,360]
    data   = mv.interpolate(u_metview_ds,
                            work_SAT_ds.lat.values.reshape(-1),
                            work_SAT_ds.lon.values.reshape(-1)+180.0
                           ).reshape(work_SAT_ds.lat.values.shape),
    dims   = ['time','scene_across_track'],
    coords = {'time': work_SAT_ds.time,
              'scene_across_track': work_SAT_ds.scene_across_track,},
    attrs  = {
        #'_FillValue': -999.9,
        # Adjacent string literals are concatenated, so no stray
        # indentation ends up inside the description
        'description': "u10n from ECMWF's forecast resampled with "
                       "MetView's bilinear interpolation to satellite swath",
        'units'     : 'm/s'
    }
)

print("%.2f s , Time Bilinear MetView" % (time.time()-startTime)) 

4.86 s , Time Bilinear MetView


And we resample using pyResample's **Nearest Neighbours Gaussian weights**:

In [35]:
# pyResample Nearest Neighbours Gaussian weights:

startTime = time.time()

sigma = 30000  # width of the Gaussian weight function, in metres
work_SAT_ds['u_pyresample_GNN'] = xr.DataArray(
    data   = resample_gauss(ECMWF_grid_def_RG,
                            xarray_ds.u10n[0,0,:].values,
                            SAT_swath_def,
                            radius_of_influence=30000,  # search radius, in metres
                            neighbours=10,
                            sigmas=sigma,
                            fill_value=None),  # None: return a masked array
    dims   = ['time','scene_across_track'],
    coords = {'time': work_SAT_ds.time,
              'scene_across_track': work_SAT_ds.scene_across_track,},
    attrs  = {
        #'_FillValue': -999.9,
        'description': "u10n from ECMWF's forecast resampled with "
                       "PyResample (Nearest Neighbour Gaussian weights) "
                       "to satellite swath",
        'units'     : 'm/s'
    }
)

print("%.2f s , Time NN Gaussian weights" % (time.time()-startTime)) 

# Alternative (plain nearest neighbour, no weighting):
# data = resample_nearest(ECMWF_grid_def_RG,
#                         xarray_ds.u10n[0,0,:].values,
#                         SAT_swath_def,
#                         radius_of_influence=20000,
#                         fill_value=None)

1.99 s , Time NN Gaussian weights


The first difference is computation speed: in this run *pyResample-NNGW* (1.99 s) is roughly 2.4 times faster than *MetView-Bilinear* (4.86 s).
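
The timings above come from a single run each, and single wall-clock measurements can be noisy. A minimal sketch of a more robust measurement, using only the standard library, repeats the call and keeps the shortest time. `best_time` is a hypothetical helper, and the lambda is a placeholder workload standing in for the resampling call.

```python
import time

def best_time(func, repeats=5):
    """Return the shortest wall-clock time (s) over several calls of func()."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return min(timings)

# Placeholder workload; in the notebook this would wrap the resampling call.
elapsed = best_time(lambda: sum(i * i for i in range(100_000)))
print("%.4f s , best of 5" % elapsed)
```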


We now compute the difference between the two approaches: 

In [27]:
work_SAT_ds['u_diff_bilinear_GNN'] = work_SAT_ds['u_metview_bilinear']-\
work_SAT_ds['u_pyresample_GNN']

At this point we have our ECMWF data resampled with both approaches; let's visualize a few results.

First, let's create an area of interest to plot; in this case I want to plot some of the computed variables on the entire globe:


In [None]:
area_def_world = load_area('areas.yaml', 'worldeqc30km')# 'worldeqc30km70') # for plots


To keep the plotting simple I have created a single routine (**basicMapPlotScat1**) for plotting *swath* data (i.e. data that are not on a regular grid). It can display observations as well as the resampled variables; the parameter *chan* is set to -1 for this simple plot.

Now let's check the spatially resampled *u* component of the wind speed with both schemes:

**MetView-Bilinear:**


In [None]:
# Plot the ECMWF data resampled with MetView's bilinear interpolation
# in the new 'grid' (i.e. the satellite swath):

reduced_lon_scene, reduced_lat_scene, reduced_data_scene =\
support_routines.get_Sat_frame(work_SAT_ds.where(work_SAT_ds.sft==0), area_def_world, chan=-1, 
              var = 'u_metview_bilinear', begin_t=None, end_t=None)

support_routines.basicMapPlotScat1(reduced_lon_scene, reduced_lat_scene, reduced_data_scene,
                 'u_metview_bilinear', area_def_world, vmin=-25, vmax=25, 
                                   proj="PlateCarree", var = "u_metview_bilinear")

**pyResample-NNGW**

In [None]:
# Plot the ECMWF data resampled with pyResample (NN, Gaussian weights)
# in the new 'grid' (i.e. the satellite swath):

reduced_lon_scene, reduced_lat_scene, reduced_data_scene =\
support_routines.get_Sat_frame(work_SAT_ds.where(work_SAT_ds.sft==0), area_def_world, chan=-1, 
              var = 'u_pyresample_GNN', begin_t=None, end_t=None)

support_routines.basicMapPlotScat1(reduced_lon_scene, reduced_lat_scene, reduced_data_scene,
                 'u_pyresample_GNN', area_def_world, vmin=-25, vmax=25, 
                                   proj="PlateCarree", var = "u_pyresample_GNN")

# Comparison of Nearest Neighbours Gaussian weights and bilinear interpolation

Here we map the difference between the two schemes and look at a few metrics.

In [None]:
# Plot the difference between the two resampled fields
# (MetView bilinear minus pyResample NNGW) on the satellite swath:

reduced_lon_scene, reduced_lat_scene, reduced_data_scene =\
support_routines.get_Sat_frame(work_SAT_ds.where(work_SAT_ds.sft==0), area_def_world, chan=-1, 
              var = 'u_diff_bilinear_GNN', begin_t=None, end_t=None)

support_routines.basicMapPlotScat1(reduced_lon_scene, reduced_lat_scene, reduced_data_scene,
                 'u_diff_bilinear_GNN', area_def_world, vmin=-0.5, vmax=0.5, 
                                   proj="PlateCarree", var = "u_diff_bilinear_GNN")

Let's plot the histogram of the differences:

In [None]:
xr.plot.hist(work_SAT_ds['u_diff_bilinear_GNN'].where(work_SAT_ds.sft==0))

Finally, let's define a metric of how similar **pyResample-NNGW** is to **MetView-Bilinear**: the percentage of samples whose absolute difference is below a threshold in *m/s* (here *0.2 m/s*).

This is an arbitrary metric; it tries to resemble the way differences between two libraries used at ECMWF are presented: [MIR vs EMOSLIB](https://confluence.ecmwf.int/display/UDOC/Differences+for+10m+winds).

Here we focus on the ocean samples.

In [28]:
np.sum(np.abs(work_SAT_ds['u_diff_bilinear_GNN'].where(
    work_SAT_ds.sft==0))<0.2).values / np.sum(~np.isnan(
    work_SAT_ds['u_diff_bilinear_GNN'].where(work_SAT_ds.sft==0))).values


0.8299989994956641
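
In other words, about 83% of the ocean samples agree to within 0.2 m/s. The same metric can be wrapped in a small function so that other thresholds are easy to try; the sketch below uses toy numbers, and `fraction_within` is a hypothetical helper name. Here NaN stands for the samples masked out by the `where` call.

```python
import numpy as np

def fraction_within(diff, threshold):
    """Fraction of finite samples whose absolute value is below threshold."""
    diff = np.asarray(diff, dtype=float)
    valid = np.isfinite(diff)                 # ignore NaN (masked) samples
    if not valid.any():
        return np.nan
    n_below = np.count_nonzero(np.abs(diff[valid]) < threshold)
    return n_below / np.count_nonzero(valid)

# Toy differences in m/s; NaN stands for a masked (non-ocean) sample.
toy_diff = np.array([0.05, -0.15, 0.30, np.nan, -0.01])
score = fraction_within(toy_diff, 0.2)
print(score)
```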

## Sigma and Neighbours:
When using *pyResample*'s NN Gaussian weights, two parameters influence the resulting resampling: the width of the Gaussian (*sigma*) and the number of neighbours used for each sample.

Here we want to check how sensitive the resampling is to these two parameters; for this we take the bilinear interpolation from *MetView* as a fixed "truth" and initially use the following metric: the percentage of samples with differences lower than *0.2 m/s*. 
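
The structure of such a sensitivity test can be sketched with a self-contained 1-D toy problem. The assumptions are deliberate simplifications: linear interpolation (`np.interp`) stands in for the bilinear "truth", a hand-written Gaussian-weighted mean (`gauss_resample_1d`, a hypothetical helper using the weights exp(-d²/σ²)) stands in for `resample_gauss`, and the field is synthetic. The numbers it prints say nothing about the real ECMWF data.

```python
import numpy as np

def gauss_resample_1d(src_x, src_val, tgt_x, sigma, neighbours):
    """Gaussian-weighted mean of the k nearest source points (1-D toy)."""
    dist = np.abs(tgt_x[:, None] - src_x[None, :])     # (n_target, n_source)
    idx = np.argsort(dist, axis=1)[:, :neighbours]     # k nearest per target
    d = np.take_along_axis(dist, idx, axis=1)
    w = np.exp(-d ** 2 / sigma ** 2)
    return np.sum(w * src_val[idx], axis=1) / np.sum(w, axis=1)

# Toy "model grid" and "swath" positions (arbitrary units), smooth toy field
src_x = np.linspace(0.0, 100.0, 101)
src_val = 10.0 * np.sin(src_x / 10.0)
tgt_x = np.linspace(0.5, 99.5, 400)

reference = np.interp(tgt_x, src_x, src_val)           # linear "truth"

# Sweep both parameters and store the fraction of samples within 0.2
scores = {}
for sigma in (0.5, 1.0, 2.0, 4.0):
    for neighbours in (2, 4, 10):
        est = gauss_resample_1d(src_x, src_val, tgt_x, sigma, neighbours)
        scores[(sigma, neighbours)] = np.mean(np.abs(est - reference) < 0.2)

for (sigma, neighbours), score in sorted(scores.items()):
    print(f"sigma={sigma:4.1f}  neighbours={neighbours:2d}  within 0.2: {score:.2%}")
```

For the real data the inner call would be replaced by `resample_gauss` with the corresponding `sigmas` and `neighbours` arguments, and the reference by `u_metview_bilinear`.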



Sometimes we are interested in a specific region or location; we can focus on it by *zooming in* on the *area of interest*.
We can create such an area (which *Cartopy* can interpret afterwards) by defining a bounding box of *min/max* longitudes and latitudes. We select an *eqc* projection ([Plate Carree](https://proj.org/operations/projections/eqc.html) projection) as the default. The datum defines the ellipsoid used in the transformations. 

I provide this auxiliary way of plotting specific regions, but it is by no means generic; it's intended as a way to help visualize your data and it can definitely be improved!

In [None]:
corners = {"min_lon": 35 , "max_lon": 75, "min_lat": -30 , "max_lat": +30, "lat_0": 0, "lon_0":0}
proj_id = 'eqc'  # eqc
datum = 'WGS84'
area_interest = support_routines.defineArea(corners, proj_id, datum)
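
As a rough illustration of what restricting the swath to such a bounding box amounts to, the sketch below builds a boolean mask in plain NumPy. It is not the implementation in `support_routines`; `bbox_mask` is a hypothetical helper, and it assumes longitudes in the same convention as the corners and a box that does not cross the antimeridian.

```python
import numpy as np

def bbox_mask(lon, lat, corners):
    """Boolean mask of samples inside a lon/lat bounding box (degrees)."""
    lon = np.asarray(lon)
    lat = np.asarray(lat)
    return ((lon >= corners["min_lon"]) & (lon <= corners["max_lon"])
            & (lat >= corners["min_lat"]) & (lat <= corners["max_lat"]))

corners = {"min_lon": 35, "max_lon": 75, "min_lat": -30, "max_lat": 30}

# Four toy samples: only the second lies inside the box
lon = np.array([30.0, 40.0, 70.0, 80.0])
lat = np.array([0.0, -10.0, 35.0, 10.0])
inside = bbox_mask(lon, lat, corners)
print(inside)
```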

Using the *area of interest* that we just defined, we proceed to get the *frame* (the specific lon/lat/data combinations, as 3 arrays) and plot it using the *mapPlotScatZoom* function (in support/support_routines).

In [None]:
zoom_lon_scene, zoom_lat_scene, zoom_data_scene =\
support_routines.get_Sat_frame(work_SAT_ds, area_interest, chan=-1, 
              var = 'u10n_apriori_gauss_interp_RG', begin_t=None, end_t=None)

support_routines.mapPlotScatZoom(zoom_lon_scene, zoom_lat_scene, zoom_data_scene,
                 'zoomPlot_Gauss_RG', -25,25,var='Apriori u10n, from RG grid', area=area_interest)