## Using dask for EODC items

In this notebook we take a look at STAC items retrieved from EODC and adapt a Dask workflow to them. The two core features of the Dask library we are going to use are parallel processing and lazy loading. Both fit smoothly into our workflow because STAC items are just metadata, so we can keep working with the items themselves. That means we can build a complete workflow without actually downloading any real data into local memory.
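To make the idea of lazy loading more concrete, here is a minimal sketch that is independent of the EODC data. It assumes `dask` (with its array module) is installed. The array size and chunk size are arbitrary values chosen for illustration, not settings used later in this notebook.

```python
import dask.array as da

# Describe a 4000 x 4000 array of ones, split into 1000 x 1000 chunks.
# Nothing is computed yet; Dask only records how to build each chunk.
x = da.ones((4000, 4000), chunks=(1000, 1000))

# Operations on the array are lazy as well: this only extends the task graph.
y = (x + 1).mean()

print(x.numblocks)  # number of chunks along each axis
print(y)            # a lazy Dask array, not a number

# Only compute() triggers the work; chunks can be processed in parallel.
result = y.compute()
print(result)
```

The same pattern applies to the satellite data: we describe the computation first and only pay for loading and processing when a result is explicitly requested.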

## Import required packages

In [2]:
#These are the important packages we first need to import 

import os
import json
from datetime import datetime
import numpy as np
import xarray as xr
import pandas as pd
import matplotlib.pyplot as plt

import pystac
from pystac_client import Client
from odc import stac as odc_stac
from odc.geo.geobox import GeoBox
from affine import Affine
import sys

import dask.dataframe as dd

In [3]:
#The code here adjusts the width of the notebook display container

from IPython.display import display, HTML
display(HTML("<style>.container { width:90% !important; }</style>"))

In [4]:
#This is how we access the AI4SAR collection via the stac_api

stac_api = "https://stac.eodc.eu/api/v1"
client = Client.open(stac_api)

collection_id="AI4SAR_SIG0"

In [6]:
#Next we search for certain items in the AI4SAR_SIG0 collection, defining the area and time period
collection_id="AI4SAR_SIG0"

bbox = [15.6, 47.7, 16.6, 48.7]  # [lon_min, lat_min, lon_max, lat_max]
start_date = "2023-01-01"
end_date = "2023-10-31"

query = client.search(bbox=bbox,
                        collections=[collection_id],
                        datetime=f"{start_date}/{end_date}",
                        )
q_items = sorted(query.items(), key=lambda x: x.id)

In [7]:
for item in q_items:
    print(item)

print("\n","The length of the list of items is",len(q_items))



<Item id=SIG0_20230418T050210_D124_EU020M_E051N015T3_S1AIWGRDH>
<Item id=SIG0_20230512T050211_D124_EU020M_E051N015T3_S1AIWGRDH>
<Item id=SIG0_20230617T050213_D124_EU020M_E051N015T3_S1AIWGRDH>

 The length of the list of items is 3
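The items were sorted by `id`, and in the IDs printed above the second underscore-separated field looks like an acquisition timestamp. Under that assumption, sorting by ID also sorts the items chronologically. The sketch below checks this using only the standard library. `acquisition_time` is a hypothetical helper, and both the ID layout and the use of UTC are inferred from these three IDs rather than taken from an official naming specification.

```python
from datetime import datetime, timezone

# Item IDs copied from the search output above.
item_ids = [
    "SIG0_20230418T050210_D124_EU020M_E051N015T3_S1AIWGRDH",
    "SIG0_20230512T050211_D124_EU020M_E051N015T3_S1AIWGRDH",
    "SIG0_20230617T050213_D124_EU020M_E051N015T3_S1AIWGRDH",
]


def acquisition_time(item_id):
    """Parse the second '_'-separated field of an ID as a timestamp.

    Hypothetical helper: the field position and the UTC time zone
    are assumptions based on the IDs listed above.
    """
    stamp = item_id.split("_")[1]
    parsed = datetime.strptime(stamp, "%Y%m%dT%H%M%S")
    return parsed.replace(tzinfo=timezone.utc)


times = [acquisition_time(item_id) for item_id in item_ids]
for item_id, time in zip(item_ids, times):
    print(time.isoformat(), item_id)

# True if the ID-sorted list is also in chronological order.
ids_are_chronological = times == sorted(times)
print(ids_are_chronological)
```

If the ID layout ever differs, sorting on each item's `datetime` attribute instead would be the safer choice.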


In [8]:
#Here we take a look at the metadata of one item.

q_items[0].properties

{'gsd': 20,
 'datetime': '2023-04-18T05:02:10Z',
 'proj:bbox': [5100000, 1500000, 5400000, 1800000],
 'proj:wkt2': 'PROJCS["Azimuthal_Equidistant",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0],UNIT["degree",0.0174532925199433],AUTHORITY["EPSG","4326"]],PROJECTION["Azimuthal_Equidistant"],PARAMETER["latitude_of_center",53],PARAMETER["longitude_of_center",24],PARAMETER["false_easting",5837287.81977],PARAMETER["false_northing",2121415.69617],UNIT["metre",1,AUTHORITY["EPSG","9001"]]]',
 'proj:shape': [15000, 15000],
 'constellation': 'sentinel-1',
 'proj:geometry': {'type': 'Polygon',
  'coordinates': [[[5100000.0, 1500000.0],
    [5100000.0, 1800000.0],
    [5400000.0, 1800000.0],
    [5400000.0, 1500000.0],
    [5100000.0, 1500000.0]]]},
 'proj:transform': [20, 0, 5100000, 0, -20, 1800000],
 'sat:orbit_state': 'descending',
 'sar:product_type': 'GRD',
 'sar:frequency_band': 'C',
 'sat:rel