## DestinE Data Streaming

This service provides compressed climate and ERA5 data through a high-quality, memory-efficient streaming solution. The [SSIM](https://en.wikipedia.org/wiki/Structural_similarity_index_measure) and the mean relative error serve as quality measures.

<div style='white-space: nowrap' align='center'>

<div style='display:inline-block' align='center'>ERA5 2 metre dewpoint temperature (01-01-1940 09:00)<br>
<img src="images/2d9_og_.jpeg" width="450px"><br><img src="images/2d9_cp_.jpeg" width="450px"><br>Mean SSIM: 0.996<br>Compression rate 1:13<br>Mean relative error 0.1 %</div>

<div style='display:inline-block' align='center'>ERA5 10 metre U wind component (01-01-1940 09:00)<br>
<img src="images/10u9_og_.jpeg" width="450px"><br><img src="images/10u9_cp_.jpeg" width="450px"><br>Mean SSIM: 0.995<br>Compression rate 1:27<br>Mean relative error 0.3 %</div>

</div>
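
To make the two numeric quality figures above more concrete, here is a minimal sketch of how a mean relative error and a compression rate could be computed with NumPy. It uses one common definition of the relative error and synthetic data; the helper names `mean_relative_error` and `compression_rate` and the `eps` guard are assumptions for illustration and are not taken from the service, whose exact computation is not documented here.

```python
import numpy as np


def mean_relative_error(original, compressed, eps=1e-12):
    """Mean of |original - compressed| / |original| as a fraction (hypothetical helper)."""
    original = np.asarray(original, dtype=np.float64)
    compressed = np.asarray(compressed, dtype=np.float64)
    # eps (an assumption of this sketch) avoids division by zero for zero-valued cells
    denom = np.maximum(np.abs(original), eps)
    return float(np.mean(np.abs(original - compressed) / denom))


def compression_rate(original_bytes, compressed_bytes):
    """Ratio N in a '1:N' compression rate (hypothetical helper)."""
    return original_bytes / compressed_bytes


# Synthetic temperature-like field in kelvin, not real ERA5 data
rng = np.random.default_rng(0)
original = 280.0 + 10.0 * rng.random((4, 8))
compressed = original * 1.001  # every value is 0.1 % too large

print(f"Mean relative error: {mean_relative_error(original, compressed):.2%}")
print(f"Compression rate 1:{compression_rate(1300, 100):.0f}")
```

Multiplying the returned fraction by 100 gives a percentage comparable to the values shown under the images.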


## Prerequisites
### DestinE Platform Credentials

You need to have an account on the [Destination Earth Platform](https://auth.destine.eu/realms/desp/account).

#### ⚠️ Warning: Authorized Access Only
Use of this notebook and access to the data are reserved for authorized user groups.

## Access the Data
With a DESP account you can access the streamed data presented in this tutorial.

In [None]:
%%capture cap
%run ./auth.py

In [None]:
output = cap.stdout.split('\n')
refresh_token = output[1]
token = output[2]

# Imports and general definitions
We start by importing the necessary packages and defining the date format used for the streaming API.

Note: The API token, including the user group, must be set here. This happens during **Authentication**.

In [None]:
from dtelib2 import DTEStreamer, get_stream_overview
from datetime import datetime
import xarray as xa
import rioxarray  # noqa
from pyproj import CRS
from rasterio.transform import from_origin
import matplotlib.pyplot as plt
import cartopy
import cartopy.crs as ccrs

FORMAT = '%Y-%m-%dT%H:%M'

# Stream overview
The code in the cell below calls the DTE API to receive an overview of all available streams.

In [None]:
get_stream_overview(token)

# Parameters for stream access

Here the parameters are set to access the data from the service.

*short_name*: an abbreviated name for the data<br>
*category_name*: a name for the category for the data<br>
*start_date*: the time and date to start the stream<br>
*end_date*: the time and date to end the stream<br>
<br>

To select a stream, choose parameter values from the table above, or if you have a *code snippet*, use it to replace the code in the cell below:

In [None]:
short_name = "2t"
category_name = "Climate DT"
start_date = "2020-01-01T00:00"
end_date = "2020-01-01T00:00"

start_date = datetime.strptime(start_date, FORMAT)
end_date = datetime.strptime(end_date, FORMAT)
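
Because the date strings must match `FORMAT` exactly, it can help to parse and check both dates in one place. The following is a minimal sketch using only the standard library; the helper `parse_range` is hypothetical, and it assumes that an end date earlier than the start date is a mistake rather than something the service accepts.

```python
from datetime import datetime

FORMAT = '%Y-%m-%dT%H:%M'


def parse_range(start, end, fmt=FORMAT):
    """Parse two date strings and check their order (hypothetical helper)."""
    # strptime raises ValueError if a string does not match fmt
    start_dt = datetime.strptime(start, fmt)
    end_dt = datetime.strptime(end, fmt)
    # Assumption of this sketch: a reversed range is treated as an error
    if end_dt < start_dt:
        raise ValueError(f"end_date {end} is earlier than start_date {start}")
    return start_dt, end_dt


start_date, end_date = parse_range("2020-01-01T00:00", "2020-01-01T00:00")
print(start_date, end_date)
```

Identical start and end dates, as in the cell above, pass the check.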

# Loading the stream

With the DTEStreamer class we can access the data stream through the API and retrieve individual data frames.
First, we create a DTEStreamer object with the parameters defined in the step above. On initialization, the object calls the API to get meta information about the stream and its location. (You can take a look at the API yourself in Swagger [here](https://dev.destinestreamer.geoville.com/api/streaming/metadata).) In addition, ffmpeg is used to seek to the first image of the stream, according to *start_date*.

The load_next_image() method then loads the next image into a numpy array, which is appended to the list *time_series* for further use.

Note that in this example, the data and time stamps are loaded into lists. A print statement at the end reports how many images were loaded.

Note: Adapt this example to your purpose, especially if you plan to analyze a long time series, as it loads all the data of the loop into memory.


In [None]:
streamer = DTEStreamer(category_name=category_name,
                       short_name=short_name,
                       start_date=start_date,
                       end_date=end_date,
                       token=token)

time_series = list()
time_stamps = list()

for image, time_stamp in streamer.load_next_image():

    time_stamps.append(time_stamp)
    time_series.append(image)

print(f'A total of {len(time_series)} images loaded')
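
For long time series, one way to avoid holding every frame in memory is to reduce the frames as they arrive. The sketch below computes a running mean over a generator of `(image, time_stamp)` pairs, the same shape of loop as above. `fake_stream` is a stand-in that only mimics the iteration pattern with synthetic arrays, and `running_mean` is a hypothetical helper; neither is part of `dtelib2`.

```python
from datetime import datetime, timedelta

import numpy as np


def fake_stream(n_frames, shape=(4, 8)):
    """Stand-in for streamer.load_next_image(): yields (image, time_stamp) pairs."""
    start = datetime(2020, 1, 1)
    for i in range(n_frames):
        # Frame i is filled with the constant value i (synthetic data)
        yield np.full(shape, float(i)), start + timedelta(hours=i)


def running_mean(frames):
    """Average all frames while keeping only one frame and one accumulator in memory."""
    total = None
    count = 0
    for image, _time_stamp in frames:
        image = np.asarray(image, dtype=np.float64)
        if total is None:
            total = image.copy()
        else:
            total += image  # in-place add, no list of frames is kept
        count += 1
    if count == 0:
        raise ValueError("stream yielded no frames")
    return total / count, count


mean_image, n_frames = running_mean(fake_stream(5))
print(f'Averaged {n_frames} images, shape {mean_image.shape}')
```

With a real stream, `fake_stream(5)` would be replaced by `streamer.load_next_image()`; other reductions such as a minimum or maximum follow the same pattern.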

# Creating the xarray.DataArray

With the time_series and time_stamps we can create a geo-referenced DataArray. The method create_lat_lon_grid() creates latitudes and longitudes appropriate to our data. The DataArray is created with the dimensions t, latitude and longitude, each with a coordinate of the same name, using the time_series and time_stamps from the previous cell. We also set the name with streamer.name() and the unit of the data with streamer.unit(). EPSG:4326 is set as the CRS.

In [None]:
lats, lons = streamer.create_lat_lon_grid()

da = xa.DataArray(time_series,
                  dims=['t', 'latitude', 'longitude'],
                  coords={'t': time_stamps, 'longitude': lons, 'latitude': lats},
                  name=streamer.name(),
                  attrs={'units': streamer.unit(),
                         'crs': 'EPSG:4326'}
                 )
da
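
To illustrate what one-dimensional latitude and longitude coordinates for such an array can look like, here is a small NumPy sketch. It assumes a regular global grid with cell-centre coordinates, latitudes ordered north to south and longitudes from -180 to 180; the helper `regular_lat_lon` is hypothetical, and the actual behaviour of `create_lat_lon_grid()` may differ.

```python
import numpy as np


def regular_lat_lon(n_lat, n_lon):
    """Cell-centre coordinates of a regular global grid (hypothetical helper)."""
    d_lat = 180.0 / n_lat  # cell height in degrees
    d_lon = 360.0 / n_lon  # cell width in degrees
    # Assumption: first row is the northernmost, first column the westernmost
    lats = 90.0 - d_lat * (np.arange(n_lat) + 0.5)
    lons = -180.0 + d_lon * (np.arange(n_lon) + 0.5)
    return lats, lons


lats, lons = regular_lat_lon(4, 8)
print(lats)
print(lons)
```

The lengths of the two arrays must match the last two dimensions of the image arrays, otherwise the DataArray constructor raises an error.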

# Subselecting regions

With rio.clip() from rioxarray it is also possible to subselect polygons in the georeferenced data. With add_feature(cartopy.feature.BORDERS) it is possible to outline countries as well.

In [None]:
# Bounding box around Europe as a GeoJSON-style polygon; coordinates are [longitude, latitude]
europe = [
    {
        'type': 'Polygon',
        'coordinates': [[
            [40.47389429569694, 71.51798987593267],
            [-9.863756639532625, 71.51798987593267],
            [-9.863756639532625, 36.1174025295482],
            [40.47389429569694, 36.1174025295482],
            [40.47389429569694, 71.51798987593267],
        ]],
    }
]

# Clip the first time step to the polygon and plot it
dat = da.isel(t=0).rio.clip(geometries=europe, drop=True)

fig = plt.figure()

dat.plot()
plt.show()

In [None]:
# Bounding box around Germany as a GeoJSON-style polygon; coordinates are [longitude, latitude]
germany = [
    {
        'type': 'Polygon',
        'coordinates': [[
            [2.7283245259702937, 57.55574030864628],
            [2.7283245259702937, 45.19668374239939],
            [17.542851520061646, 45.19668374239939],
            [17.542851520061646, 57.55574030864628],
            [2.7283245259702937, 57.55574030864628],
        ]],
    }
]

# Clip the first time step to the polygon and plot it
dat = da.isel(t=0).rio.clip(geometries=germany, drop=True)

fig = plt.figure()

dat.plot()
plt.show()
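
Both regions above are rectangular, so the polygon can also be generated from four bounds instead of being written out by hand. The following sketch builds the same kind of GeoJSON-style dictionary; `bbox_polygon` is a hypothetical helper, and the example bounds are rounded values chosen for illustration.

```python
def bbox_polygon(min_lon, min_lat, max_lon, max_lat):
    """Return a GeoJSON-style Polygon dict for a lon/lat bounding box (hypothetical helper)."""
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("bounds must satisfy min < max")
    # Each position is [longitude, latitude]; the ring ends where it starts
    ring = [
        [min_lon, max_lat],
        [min_lon, min_lat],
        [max_lon, min_lat],
        [max_lon, max_lat],
        [min_lon, max_lat],
    ]
    return {'type': 'Polygon', 'coordinates': [ring]}


# Rounded example bounds, for illustration only
region = [bbox_polygon(2.7, 45.2, 17.5, 57.6)]
print(region[0]['coordinates'][0])
```

The resulting list can be passed as `geometries=region` in the same way as the hand-written polygons.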