## DestinE Data Streaming

This service provides compressed climate and ERA5 data through a high-quality, memory-efficient streaming solution. The [SSIM](https://en.wikipedia.org/wiki/Structural_similarity_index_measure) and the mean relative error serve as quality measures.
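
To make the two numeric quality figures concrete, the sketch below computes a mean relative error and a compression ratio in plain Python. The formula is one common definition of mean relative error; it is an assumption that the service uses the same one. The function names and the sample values are hypothetical.

```python
def mean_relative_error(original, compressed, eps=1e-12):
    """Mean of |compressed - original| / |original| over all values.

    Assumed definition; eps guards against division by zero.
    """
    if len(original) != len(compressed):
        raise ValueError("inputs must have the same length")
    total = 0.0
    for o, c in zip(original, compressed):
        total += abs(c - o) / max(abs(o), eps)
    return total / len(original)


def compression_ratio(original_bytes, compressed_bytes):
    """Return N for a ratio written as 1 : N (original size / compressed size)."""
    return original_bytes / compressed_bytes


# Hypothetical temperatures in kelvin, each perturbed by about 0.1 %
original = [280.0, 285.0, 290.0, 295.0]
compressed = [280.28, 284.715, 290.29, 294.705]

mre = mean_relative_error(original, compressed)
ratio = compression_ratio(1300, 100)
print(f"mean relative error: {mre:.2%}, compression ratio 1:{ratio:.0f}")
```

A ratio of 1 : 13 therefore means the compressed stream needs about one thirteenth of the original storage.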

<div style='white-space: nowrap' align='center'>

<div style='display:inline-block' align='center'>ERA5 2 metre dewpoint temperature (01-01-1940 09:00)<br>
<img src="images/2d9_og_.jpeg" width="450px"><br><img src="images/2d9_cp_.jpeg" width="450px"><br>Mean SSIM: 0.996<br>Compression ratio 1:13<br>Mean relative error 0.1 %</div>

<div style='display:inline-block' align='center'>ERA5 10 metre U wind component (01-01-1940 09:00)<br>
<img src="images/10u9_og_.jpeg" width="450px"><br><img src="images/10u9_cp_.jpeg" width="450px"><br>Mean SSIM: 0.995<br>Compression ratio 1:27<br>Mean relative error 0.3 %</div>

</div>


# Prerequisites
### DestinE Platform Credentials

You need to have an account on the [Destination Earth Platform](https://auth.destine.eu/realms/desp/account).

#### ⚠️ Warning: Authorized Access Only
Use of this notebook and access to the data are reserved for authorized user groups.

In [1]:
%%capture cap
%run ./auth.py

In [2]:
output_1 = cap.stdout.split('}\n')
token_loc = output_1[-1][0:-1]

# Imports and general definitions
We start by importing the necessary packages and defining the date format used for the stream parameters.

Note: The API token must be set here, including the user group.

In [3]:
from dtelib import DTEStreamer
from datetime import datetime
import xarray as xa
import rioxarray # noqa
from pyproj import CRS
from rasterio.transform import from_origin

FORMAT = '%Y-%m-%dT%H:%M'

# Parameters for stream access

Here we set the parameters used to access data from the service, e.g. ERA5 datasets.

*short_name*: an abbreviated name for the data<br>
*start_date*: the time and date to start the stream<br>
*end_date*: the time and date to end the stream<br>
<br>
ERA5 data is available for dates from 01-01-1940 00:00 to 31-12-2023 23:00
and for these datasets:

| short_name | name                           | compression ratio | mean relative error | SSIM  |
|------------|--------------------------------|-------------------|---------------------|-------|
| 2d         | 2 metre dewpoint temperature   | 1 : 13            | 0.1 %               | 0.992 |
| 2t         | 2 metre temperature            | 1 : 13            | 0.01 %              | 0.996 |
| 10u        | 10 metre U wind component      | 1 : 27            | 0.3 %               | 0.995 |
| 10v        | 10 metre V wind component      | 1 : 27            | 0.2 %               | 0.996 |


If you have a code snippet, use it to replace the code in the cell below:


In [4]:
category_name = 'Era5'
short_name = '2d'
start_date = datetime.strptime('1954-07-15T12:00', FORMAT)
end_date = datetime.strptime('1954-07-20T12:00', FORMAT)
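
Because the available ERA5 period is fixed, it can help to check the requested range before opening a stream. The following is a minimal sketch using only the standard library; `parse_stream_range`, `ERA5_FIRST` and `ERA5_LAST` are hypothetical names, and whether the service performs its own range check is not stated here.

```python
from datetime import datetime

FORMAT = '%Y-%m-%dT%H:%M'

# Availability window stated above (hypothetical constant names)
ERA5_FIRST = datetime(1940, 1, 1, 0, 0)
ERA5_LAST = datetime(2023, 12, 31, 23, 0)


def parse_stream_range(start, end):
    """Parse two date strings and check them against the ERA5 window."""
    start_date = datetime.strptime(start, FORMAT)
    end_date = datetime.strptime(end, FORMAT)
    if start_date > end_date:
        raise ValueError("start_date must not be after end_date")
    if start_date < ERA5_FIRST or end_date > ERA5_LAST:
        raise ValueError("requested range lies outside the available ERA5 period")
    return start_date, end_date


start_date, end_date = parse_stream_range('1954-07-15T12:00', '1954-07-20T12:00')
print(start_date, end_date)
```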

# Loading the stream

With the DTEStreamer class we can access the data stream through the API and load individual data frames.
First, we create a DTEStreamer object with the parameters defined in the step above. On initialization, the object calls the API to retrieve the stream's metadata and location. (You can inspect the API yourself in the [Swagger documentation](https://dev.destinestreamer.geoville.com/api/streaming/metadata).) The images() method opens the stream and returns a generator that can be used in a for loop to load the individual time steps. Each frame is loaded only when the loop reaches it.
The two loop variables hold the frame data and its time stamp, in that order.

In this example, only the 12:00 frames are kept: their data and time stamps are appended to lists, and a print statement tracks the progress.

Note: Adapt this example to your purpose, especially if you plan a long time series analysis, because it keeps every collected frame in memory.


In [None]:
streamer = DTEStreamer(category_name=category_name,
                       short_name=short_name,
                       start_date=start_date,
                       end_date=end_date,
                       token_loc=token_loc)

time_series = []
time_stamps = []

# Frames are fetched one at a time as the loop advances
for image, time_stamp in streamer.images():
    # Keep only the 12:00 frame of each day
    if time_stamp.hour != 12:
        continue

    time_stamps.append(time_stamp)
    time_series.append(image)
    print(time_stamp)
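
For long time series, one way to avoid holding every frame in memory is to reduce each frame as soon as it arrives and keep only the result. The sketch below illustrates the pattern with a hypothetical stand-in generator, `fake_images`, that yields `(image, time_stamp)` pairs in the same order as the loop above. The hourly time step and the nested-list frames are assumptions made for illustration; the real stream's frame type is not shown here.

```python
from datetime import datetime, timedelta


def fake_images(start, steps):
    """Hypothetical stand-in for streamer.images(): yields (image, time_stamp)."""
    for i in range(steps):
        time_stamp = start + timedelta(hours=i)
        # Tiny 2x2 "frame" whose values depend on the step index
        image = [[float(i), float(i) + 1.0],
                 [float(i) + 2.0, float(i) + 3.0]]
        yield image, time_stamp


def frame_mean(image):
    """Reduce one frame to a single number (its arithmetic mean)."""
    values = [v for row in image for v in row]
    return sum(values) / len(values)


# Only one float per kept frame stays in memory, not the frame itself
noon_means = {}
for image, time_stamp in fake_images(datetime(1954, 7, 15, 0, 0), 48):
    if time_stamp.hour != 12:
        continue
    noon_means[time_stamp] = frame_mean(image)

for time_stamp, value in noon_means.items():
    print(time_stamp, value)
```

The same loop structure should carry over to `streamer.images()` by swapping the generator and replacing `frame_mean` with whatever per-frame statistic the analysis needs.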

# Creating the xarray.DataArray

With time_series and time_stamps we can create a georeferenced object. The create_lon_lat_grid() method returns longitudes and latitudes that match our data. The DataArray is created with dimensions t, y and x and coordinates lat, lon and time, using the time_series and time_stamps from the previous cell. We also set its name with streamer.name() and the unit of the data with streamer.unit(). To georeference the object properly, the affine transform is written and the CRS EPSG:4326 is applied.

In [None]:
lon, lat = streamer.create_lon_lat_grid()

da = xa.DataArray(time_series,
                  dims=['t', 'y', 'x'],
                  coords={"lon": (("y", "x"), lon),
                          "lat": (("y", "x"), lat),
                          'time': ('t', time_stamps)},
                  name=streamer.name(),
                  attrs=dict(units=streamer.unit())
                  )

# Write the affine transform (origin at lon 0, lat -90; 0.25 degree pixels)
da.rio.write_transform(transform=from_origin(0, -90, 0.25, -0.25), inplace=True)

# Apply the crs
da.rio.write_crs(input_crs=CRS.from_string('EPSG:4326'), inplace=True)
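
To see what the transform written above encodes, the sketch below rebuilds its six coefficients in plain Python and maps a pixel index to a coordinate. It relies on `from_origin(west, north, xsize, ysize)` being a translation to `(west, north)` followed by a scale of `(xsize, -ysize)`. The helper names are hypothetical, and whether the streamed frames are stored south to north is an assumption the notebook does not state.

```python
def from_origin_coeffs(west, north, xsize, ysize):
    """Affine coefficients (a, b, c, d, e, f) equivalent to
    translation(west, north) * scale(xsize, -ysize)."""
    return (xsize, 0.0, west, 0.0, -ysize, north)


def pixel_center_to_lonlat(coeffs, col, row):
    """Map the centre of pixel (col, row) to (lon, lat)."""
    a, b, c, d, e, f = coeffs
    x = a * (col + 0.5) + b * (row + 0.5) + c
    y = d * (col + 0.5) + e * (row + 0.5) + f
    return x, y


# Same arguments as in the cell above
coeffs = from_origin_coeffs(0, -90, 0.25, -0.25)

first = pixel_center_to_lonlat(coeffs, 0, 0)
print(first)
```

With these arguments the y coefficient is positive, so the row index increases northward starting from latitude -90.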

# Displaying the data

With the isel() method we can select a particular image. 

In [None]:
da.isel(t=0).plot(x='lon', y='lat')

# Subselecting regions

Using the georeferencing, it is also possible to sub-select areas of interest (AOIs) with the rio.clip() method.

In [None]:
# Rectangular polygon (lon, lat) covering Italy and its surroundings
italy = [
    {
        'type': 'Polygon',
        'coordinates':     [[
            [-0.23850189831415491, 29.958695145158657],
            [26.009726989451195, 29.958695145158657],
            [26.009726989451195, 53.06909307850006],
            [-0.23850189831415491, 53.06909307850006],
            [-0.23850189831415491, 29.958695145158657]
        ]]
    }
]

da.isel(t=3).rio.clip(geometries=italy, drop=True).plot(x='lon', y='lat')
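
Since the polygon above is an axis-aligned rectangle, the selection is conceptually a bounding-box test on longitude and latitude. The sketch below derives the bounds from the same coordinates and checks two sample points. `polygon_bounds` and `in_bounds` are hypothetical helpers for illustration and are not how rio.clip() is implemented internally.

```python
def polygon_bounds(geometry):
    """Return (west, south, east, north) of a GeoJSON-like polygon's outer ring."""
    ring = geometry['coordinates'][0]
    lons = [point[0] for point in ring]
    lats = [point[1] for point in ring]
    return min(lons), min(lats), max(lons), max(lats)


def in_bounds(lon, lat, bounds):
    """True if the point lies inside the bounding box (edges included)."""
    west, south, east, north = bounds
    return west <= lon <= east and south <= lat <= north


aoi = {
    'type': 'Polygon',
    'coordinates': [[
        [-0.23850189831415491, 29.958695145158657],
        [26.009726989451195, 29.958695145158657],
        [26.009726989451195, 53.06909307850006],
        [-0.23850189831415491, 53.06909307850006],
        [-0.23850189831415491, 29.958695145158657]
    ]]
}

bounds = polygon_bounds(aoi)
rome_inside = in_bounds(12.5, 41.9, bounds)        # approximate location of Rome
new_york_inside = in_bounds(-74.0, 40.7, bounds)   # approximate location of New York
print(bounds, rome_inside, new_york_inside)
```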