## DestinE Data Streaming

This service offers compressed climate and ERA5 data and makes it available through a high-quality, memory-efficient streaming solution. The [SSIM](https://en.wikipedia.org/wiki/Structural_similarity_index_measure) and the mean relative error serve as quality measures.

<div style='white-space: nowrap', align='center'>

<div style='display:inline-block', align='center'>ERA5 2 metre dewpoint temperature (01-01-1940 09:00)<br>
<img src="../images/2d9_og_.jpeg" width="450px"><br><img src="../images/2d9_cp_.jpeg" width="450px"><br>Mean SSIM: 0.996<br>Compression rate 1:13<br>Mean relative error 0.1 %</div>

<div style='display:inline-block', align='center'>ERA5 10 metre U wind component (01-01-1940 09:00)<br>
<img src="../images/10u9_og_.jpeg" width="450px"><br><img src="../images/10u9_cp_.jpeg" width="450px"><br>Mean SSIM: 0.995<br>Compression rate 1:27<br>Mean relative error 0.3 %</div>

</div>
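
The two scalar figures quoted in the captions can be illustrated with a few lines of NumPy. The sketch below is not the service's implementation. It assumes that the mean relative error is the mean absolute difference divided by the mean absolute value of the original field, and that the compression rate is the ratio of uncompressed to compressed size. The function names and the synthetic arrays are hypothetical and serve only as an illustration.

```python
import numpy as np


def mean_relative_error(original, reconstructed):
    """Mean absolute error relative to the mean magnitude of the original.

    This definition is an assumption; the service may define it differently.
    """
    original = np.asarray(original, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    return np.mean(np.abs(original - reconstructed)) / np.mean(np.abs(original))


def compression_rate(uncompressed_bytes, compressed_bytes):
    """Ratio of uncompressed to compressed size (13.0 corresponds to 1:13)."""
    return uncompressed_bytes / compressed_bytes


# Synthetic, temperature-like field in kelvin plus a small reconstruction error
rng = np.random.default_rng(0)
original = 280.0 + rng.normal(0.0, 5.0, size=(64, 64))
reconstructed = original + rng.normal(0.0, 0.3, size=original.shape)

print(f"Mean relative error: {100 * mean_relative_error(original, reconstructed):.2f} %")
print(f"Compression rate 1:{compression_rate(13_000, 1_000):.0f}")
```

The SSIM is left out here because it needs a windowed comparison of local statistics; image-processing libraries provide ready-made implementations.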


## Prerequisites
### DestinE Platform Credentials

You need to have an account on the [Destination Earth Platform](https://auth.destine.eu/realms/desp/account).

#### ⚠️ Warning: Authorized Access Only
Use of this notebook and access to the data are reserved for authorized user groups.

## Access the Data
With a DESP account you can access the streamed data used in this tutorial.

# WIND MAGNITUDES ON THE COASTS OF FRANCE AND GREAT BRITAIN (LONG EXAMPLE)
In this example we load the eastward and northward wind components, calculate the total wind magnitude and plot it over the course of two years, 1941 and 2021.

A lot of data is downloaded and processed in this example. In total, it can take up to 100 minutes to complete.

In [None]:
%%capture cap
%run ./auth.py

In [None]:
output = cap.stdout.split('\n')
#refresh_token = output[1]
token = output[2]
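
Reading the token by line position fails with an unhelpful `IndexError` if the captured output is shorter than expected. The sketch below shows one way to make that failure explicit. `extract_token` is a hypothetical helper, not part of the notebook or of `auth.py`, and the example text only stands in for `cap.stdout`; the line index 2 is taken from the cell above.

```python
def extract_token(captured_stdout, line_index=2):
    """Return the stripped line at line_index, or raise a descriptive error."""
    lines = captured_stdout.split('\n')
    if line_index >= len(lines) or not lines[line_index].strip():
        raise ValueError(
            f"No token found on line {line_index} of the captured output; "
            "check that the authentication step succeeded."
        )
    return lines[line_index].strip()


# Placeholder text standing in for cap.stdout
example_stdout = "first line\nsecond line\nexample-token\n"
demo_token = extract_token(example_stdout)
print(demo_token)
```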

# Imports and general definitions
We start by importing the necessary packages and setting the definitions for the resolution and the endpoint of the streaming API.

Note: The API token, which includes the user group, must be set at this point. This happens in **Authentication**.

In [None]:
from dtelib_climate import DTEStreamer, get_stream_overview
import numpy as np
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
from IPython import display
from scipy.signal import savgol_filter

FORMAT = '%Y-%m-%dT%H:%M'

import time
startt = time.time()



# Parameters for stream access

Here the parameters are set to access the data from the service.

*program_subset*: ERA5, which has data from 1940 to 2023<br>
*variable*: 10u is the eastward component of the horizontal wind at 10 m; 10v is the northward component.<br>
*start_date*: 00:00 on the first day of 1941 or 2021<br>
*end_date*: 23:00 on the last day of 1941 or 2021<br>
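
As a quick check of how much data these dates imply, the following sketch builds the start and end datetimes for each year and counts the hourly frames between them. It assumes one frame per hour with both endpoints included; `year_bounds` and `hourly_frame_count` are hypothetical helper names used only for illustration.

```python
from datetime import datetime, timedelta

FORMAT = '%Y-%m-%dT%H:%M'


def year_bounds(year):
    """First and last hourly timestamp of the given year."""
    start = datetime.strptime(f"{year}-01-01T00:00", FORMAT)
    end = datetime.strptime(f"{year}-12-31T23:00", FORMAT)
    return start, end


def hourly_frame_count(start, end):
    """Number of hourly steps from start to end, both included."""
    return (end - start) // timedelta(hours=1) + 1


for year in (1941, 2021):
    start, end = year_bounds(year)
    # Neither year is a leap year, so each has 365 * 24 = 8760 frames
    print(year, hourly_frame_count(start, end))
```

With two variables and two years, this gives 4 × 8760 = 35040 frames in total.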

In [None]:
program_subset = "ERA5"
u_wind_variable = "10u"
v_wind_variable = "10v"

years = [1941, 2021]
dates = []
u_streamer = []
v_streamer = []

# Datetimes and streamer objects are initialized here.
# Therefore all API calls are done in this cell.
for year in years:
    # The first day of the year in the loop
    start_date = f"{year}-01-01T00:00"
    start_date = datetime.strptime(start_date, FORMAT)

    #  The last day of the year in the loop
    end_date = f"{year}-12-31T23:00"
    end_date = datetime.strptime(end_date, FORMAT)

    dates.append([start_date, end_date])

    # Both streamer objects are initialized here,
    # Eastward and northward wind.
    u_streamer.append(DTEStreamer(program_subset=program_subset,
                           variable=u_wind_variable,
                           start_date=start_date,
                           end_date=end_date,
                           token=token)
                     )
    
    v_streamer.append(DTEStreamer(program_subset=program_subset,
                           variable=v_wind_variable,
                           start_date=start_date,
                           end_date=end_date,
                           token=token)
                     )

# Data specific variables

As the area of interest (AOI), parts of the Atlantic Ocean, the British west coast and the French west coast are chosen. Plots will be made for two years. We set some colors to distinguish the two years and their fits.

In [None]:
# European west coast
aoi = [slice(100,190), slice(1350,1440)]

# Colors for the plot of the data and the fit curves
data_colors = ['#b4e7ff', '#ffc6c6']
fit_colors = ['#6157cc', '#bf4f4f']
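
To see which region the index slices select, the sketch below maps them to coordinates. It rests on an assumption that is not confirmed by this notebook: that the frames use a regular 0.25° grid of 721 × 1440 points, with row 0 at 90°N and column 0 at 0°E. If the service delivers a different grid, the numbers change accordingly.

```python
import numpy as np

aoi = [slice(100, 190), slice(1350, 1440)]

# Assumed grid: 0.25 degree spacing, rows run north to south, columns run eastward
lats = 90.0 - 0.25 * np.arange(721)
lons = 0.25 * np.arange(1440)
lons = np.where(lons > 180.0, lons - 360.0, lons)  # express as -180..180

aoi_lats = lats[aoi[0]]
aoi_lons = lons[aoi[1]]
print(f"latitude:  {aoi_lats[0]} to {aoi_lats[-1]}")   # 65.0 to 42.75
print(f"longitude: {aoi_lons[0]} to {aoi_lons[-1]}")   # -22.5 to -0.25

# Cropping a full frame to the AOI keeps a 90 x 90 window
frame = np.zeros((721, 1440))
print(frame[aoi[0], aoi[1]].shape)  # (90, 90)
```

Under this assumption the AOI spans roughly 43°N to 65°N and 22.5°W to the prime meridian, which matches the description above.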

# Working with the data

The following is the main cell of this notebook. Northward and eastward wind data is downloaded for every hour of 1941 and 2021, which amounts to around 35000 frames of data. Each frame is cropped to the AOI.

For each frame, the wind magnitude is calculated over the AOI using the Pythagorean theorem. The average wind speed over the area is then calculated, stored in a list and added to the plot.
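
The per-frame calculation can be tried out on a small array before running the full download. This minimal sketch uses made-up component values; `np.hypot` computes the same quantity as `(u**2 + v**2)**(1/2)`.

```python
import numpy as np

# Made-up eastward (u) and northward (v) wind components in m/s
u = np.array([[3.0, 0.0],
              [6.0, -5.0]])
v = np.array([[4.0, 2.0],
              [8.0, 12.0]])

# Element-wise magnitude: sqrt(u**2 + v**2)
magnitude = np.hypot(u, v)
print(magnitude)         # [[ 5.  2.] [10. 13.]]

# Area average, the single value stored per frame
print(magnitude.mean())  # 7.5
```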

Once a whole year has loaded, a fourth-degree polynomial is fitted to the data to show the trend through the heavily fluctuating values.
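
Fitting a polynomial directly against Unix timestamps can be numerically ill-conditioned, because the time values are very large compared with their spread over one year. The sketch below shows one possible remedy, rescaling the time axis to the range 0 to 1 before fitting. It is a variation for illustration and not the approach used in the cell below; `fit_trend` and the synthetic series are hypothetical.

```python
import numpy as np


def fit_trend(timestamps, values, degree=4):
    """Polynomial trend evaluated at the input times, fitted on a 0..1 time axis.

    Expects at least two distinct timestamps.
    """
    t = np.asarray(timestamps, dtype=float)
    y = np.asarray(values, dtype=float)
    t_norm = (t - t.min()) / (t.max() - t.min())
    coeffs = np.polyfit(t_norm, y, degree)
    return np.polyval(coeffs, t_norm)


# Synthetic hourly series for one year: seasonal cycle plus noise
rng = np.random.default_rng(1)
hours = np.arange(8760)
timestamps = 1_609_459_200.0 + 3600.0 * hours
values = 8.0 + 2.0 * np.cos(2 * np.pi * hours / 8760) + rng.normal(0.0, 1.5, hours.size)

trend = fit_trend(timestamps, values)
print(trend[:3])
```

Because the fitted curve is evaluated at the same normalized times, it can be plotted against the original timestamps without further conversion.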

In [None]:
# variable for data storage
magnitudes = [[] for _ in years]
t = []

# variables for the plotted figure
ticks = []
tick_labels = []
y_min = 1.0*10**10
y_max = 0.

# initialize the plot figure
hdisplay = display.display('', display_id=True)
fig, ax = plt.subplots(1, 1, figsize=(15, 6))

ax.set_ylabel('Wind Speed in m/s')
plt.title('Wind Speeds at Western European Coasts')

# For loop to evaluate the data of all specified years
for i, year in enumerate(years):
    m_avg_list = list()

    # adjust the x-axis of the plot to the data.
    # timestamp() makes dealing with datetime easier
    if i == 0:
        ax.set_xlim(dates[0][0].timestamp(), dates[0][1].timestamp())
    else:
        magnitudes[i] = [None]*len(magnitudes[0])

    # initialize the curve (x: time, y: wind speed)
    line, = ax.plot(t, magnitudes[i], color=data_colors[i])
    line.set_label(f'{year} data')
    line.set_zorder(i)

    # the index used for the current datapoint
    zi=-1

    # a for loop for loading data from both streamer objects
    for (v_image, v_time_stamp), (u_image, u_time_stamp) in zip(v_streamer[0].load_next_image(), u_streamer[0].load_next_image()):
    
        # isolate data to the specified aoi
        v_image = v_image[aoi[0], aoi[1]]
        u_image = u_image[aoi[0], aoi[1]]

        # calculate the wind magnitude with the pythagorean theorem
        wind_magnitude = (v_image**2 + u_image**2)**(1/2)

        # update the index for the current datapoint
        zi+=1

        if i == 0:
            # append wind magnitudes and timestamps on the first run
            magnitudes[i].append(np.average(wind_magnitude))
            t.append(v_time_stamp.timestamp())

            # create time axis labels
            if v_time_stamp.hour == 0 and v_time_stamp.day == 1:
                tick_labels.append(v_time_stamp.strftime('  %B'))
                ticks.append(v_time_stamp.timestamp())
        else:
            # insert wind magnitudes on the second run
            magnitudes[i][zi] = np.average(wind_magnitude)

        # add data to the plot
        line.set_ydata(magnitudes[i])
        line.set_xdata(t)
        
        # scale the data axis to the data so far, including the current point
        y_min = min(y_min, magnitudes[i][zi])
        y_max = max(y_max, magnitudes[i][zi])
        ax.set_ylim(y_min-1, y_max+1)

        # update the fig once on the 1st and the 15th of each month
        if v_time_stamp.hour != 0 or (v_time_stamp.day != 1 and v_time_stamp.day != 15):
            continue

        # plotting data, showing labels and line labels
        ax.set_xticks(ticks, tick_labels, ha='left')
        ax.grid(True, axis='x')
        #ax.legend()
        hdisplay.update(fig)

        # This refreshes the stream for this resource intensive loop
        v_streamer[0].seek_to_date(v_time_stamp + timedelta(hours=1))
        u_streamer[0].seek_to_date(u_time_stamp + timedelta(hours=1))

    # free streamer resources
    del v_streamer[0]
    del u_streamer[0]

    # plot data again after loading data is complete
    ax.set_xticks(ticks, tick_labels, ha='left')
    ax.grid(True, axis='x')

    # make a polynomial fit for the current data and plot it
    fit = np.polyfit(t, magnitudes[i], 4)
    fit = np.polyval(fit, t)
    line, = ax.plot(t, fit, color=fit_colors[i])
    line.set_zorder(i + len(years))

    # set a label for the plot and update the figure
    line.set_label(f'{year} fit')
    line.set_ydata(fit)
    line.set_xdata(t)
    ax.legend()
    hdisplay.update(fig)
        
plt.close(fig)
print(f'duration: {time.time()-startt}')