In [2]:
%load_ext autoreload
%autoreload 2

%load_ext dotenv
%dotenv

# Download Sentinel-2 data

In this notebook we are going to use the EOTDL environment to download the Sentinel-2 imagery that will make up our dataset.

First of all, we need our AoI bounding box and the time interval for which to download images. If you missed how we obtained them, go to the [00_exploration](00_exploration.ipynb) notebook.

Let's load the AoI bounding box.

In [3]:
import geopandas as gpd

boadella_bbox_gdf = gpd.read_file('workshop_data/boadella_bbox.geojson', crs='EPSG:4326')

boadella_bbox = list(boadella_bbox_gdf.geometry.total_bounds)
boadella_bbox

[2.792027806635944, 42.33057868499878, 2.838021549182864, 42.36457137143556]
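`total_bounds` returns the bounds in `(minx, miny, maxx, maxy)` order, which for EPSG:4326 means `(min_lon, min_lat, max_lon, max_lat)`. Before passing the list to a download call, it can be worth checking that the order and ranges are what we expect. The following is a minimal sketch; `check_bbox` is a hypothetical helper written for this notebook, not part of EOTDL or GeoPandas.

```python
def check_bbox(bbox):
    """Validate a (min_lon, min_lat, max_lon, max_lat) box in EPSG:4326.

    Hypothetical helper: returns the box width and height in degrees,
    or raises ValueError if the values are out of order or out of range.
    """
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (-180 <= min_lon < max_lon <= 180):
        raise ValueError(f"Invalid longitudes: {min_lon}, {max_lon}")
    if not (-90 <= min_lat < max_lat <= 90):
        raise ValueError(f"Invalid latitudes: {min_lat}, {max_lat}")
    return max_lon - min_lon, max_lat - min_lat


# Values copied from the cell output above
bbox = [2.792027806635944, 42.33057868499878, 2.838021549182864, 42.36457137143556]
width_deg, height_deg = check_bbox(bbox)
```

If latitude and longitude were accidentally swapped for an area like this one, the check would still pass, so it only guards against reversed minimum and maximum values and out-of-range coordinates.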

And the range of dates.

In [4]:
import csv

dates = []
with open("workshop_data/dates.csv", "r") as file:
    reader = csv.reader(file)
    for row in reader:
        dates.append(row[0])
dates.sort()

dates[:5]

['2020-01-13', '2020-01-28', '2020-02-02', '2020-06-21', '2020-09-14']

As seen in the previous [notebook](00_exploration.ipynb), we have several available images for the given time interval. Let's download all of them! This is something that we can do in several ways. On the one hand, we can download image by image, as follows.

In [5]:
from eotdl.access import download_sentinel_imagery

first_date = dates[0]

# Uncomment to demonstrate
# download_sentinel_imagery('workshop_data/sentinel_2', first_date, boadella_bbox, 'sentinel-2-l2a')

On the other hand, we can search for and download all available images within a time interval, as follows. This is the recommended way to do a bulk download, but it has the drawback that we cannot control the quality of the images: for example, we do not know their cloud cover beforehand.

In [6]:
from eotdl.access import search_and_download_sentinel_imagery

# Uncomment to demonstrate
# search_and_download_sentinel_imagery(
#     output='workshop_data/sentinel_2',
#     time_interval=dates[:3],
#     bounding_box=boadella_bbox,
#     sensor='sentinel-2-l2a'
# )

However, the `workshop_data/dates.csv` file already contains the acquisition dates of valid, cloud-free, good-quality images. Downloading them date by date is slower but safer. So, let's download them!

In [7]:
for date in dates:
    download_sentinel_imagery('workshop_data/sentinel_2', date, boadella_bbox, 'sentinel-2-l2a')
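Running the loop above again requests every date again. If the cell has to be re-run after an interruption, one option is to first filter out the dates that already have a raster on disk. The sketch below assumes the files are named `<sensor>_<date>.tif`, as in the listing of the next cell; `dates_to_download` is a hypothetical helper, not an EOTDL function.

```python
from pathlib import Path


def dates_to_download(dates, output_dir, sensor="sentinel-2-l2a"):
    """Return the dates with no '<sensor>_<date>.tif' file in output_dir.

    Hypothetical helper: the naming pattern is an assumption based on
    the file names produced in this notebook.
    """
    output_dir = Path(output_dir)
    return [d for d in dates if not (output_dir / f"{sensor}_{d}.tif").exists()]


# Possible usage with the variables defined earlier in this notebook:
# for date in dates_to_download(dates, 'workshop_data/sentinel_2'):
#     download_sentinel_imagery('workshop_data/sentinel_2', date, boadella_bbox, 'sentinel-2-l2a')
```

This check only looks at file names, so a partially written raster would still count as downloaded.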

That's all! We have downloaded the images for our dataset. Let's check them!

In [8]:
from glob import glob

rasters = glob('workshop_data/sentinel_2/*.tif')
rasters[:5]

['workshop_data/sentinel_2/sentinel-2-l2a_2022-03-08.tif',
 'workshop_data/sentinel_2/sentinel-2-l2a_2020-09-14.tif',
 'workshop_data/sentinel_2/sentinel-2-l2a_2021-06-26.tif',
 'workshop_data/sentinel_2/sentinel-2-l2a_2022-06-01.tif',
 'workshop_data/sentinel_2/sentinel-2-l2a_2020-01-13.tif']

We can look for their metadata files, too.

In [9]:
jsons = glob('workshop_data/sentinel_2/*.json')
jsons[:5]

['workshop_data/sentinel_2/sentinel-2-l2a_2020-09-14.json',
 'workshop_data/sentinel_2/sentinel-2-l2a_2022-06-01.json',
 'workshop_data/sentinel_2/sentinel-2-l2a_2020-01-13.json',
 'workshop_data/sentinel_2/sentinel-2-l2a_2022-03-08.json',
 'workshop_data/sentinel_2/sentinel-2-l2a_2020-06-21.json']
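Since `glob` returns files in arbitrary order, the two listings above are hard to compare by eye. A quick way to confirm that every raster has a metadata file (and vice versa) is to compare the file stems. This is a minimal sketch; `unpaired_files` is a hypothetical helper, and it assumes a raster and its metadata share the same name apart from the extension.

```python
from pathlib import Path


def unpaired_files(folder):
    """Return (stems with a .tif but no .json, stems with a .json but no .tif).

    Hypothetical helper: assumes a raster and its metadata file differ
    only in their extension.
    """
    folder = Path(folder)
    tif_stems = {p.stem for p in folder.glob("*.tif")}
    json_stems = {p.stem for p in folder.glob("*.json")}
    return sorted(tif_stems - json_stems), sorted(json_stems - tif_stems)


missing_json, missing_tif = unpaired_files("workshop_data/sentinel_2")
```

Both lists should be empty if every download completed.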

It looks amazing! As one last step, we can rename the images to "label" them, so that they can be easily ingested by the EOTDL and used to generate STAC metadata in the next steps: we keep the acquisition date but replace the sensor type in the filename with `Boadella`. This is not mandatory, but it will be useful for our use case.

In [11]:
files = glob('workshop_data/sentinel_2/*')
for file in files:
    new_file_name = file.replace('sentinel-2-l2a', 'Boadella')
    ! mv $file $new_file_name
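The cell above relies on the `mv` shell command, so it only works where that command is available, and the unquoted `$file` would break on paths containing spaces. A pure-Python alternative could look like the sketch below; `rename_sensor_prefix` is a hypothetical helper, and unlike the `str.replace` on the full path above it only touches the file name, not the parent folders.

```python
from pathlib import Path


def rename_sensor_prefix(folder, old="sentinel-2-l2a", new="Boadella"):
    """Rename files in folder, replacing `old` with `new` in the file name.

    Hypothetical helper: returns the list of new paths.
    """
    renamed = []
    # sorted() lists the directory completely before any file is renamed
    for path in sorted(Path(folder).iterdir()):
        if path.is_file() and old in path.name:
            target = path.with_name(path.name.replace(old, new))
            path.rename(target)
            renamed.append(target)
    return renamed


# Possible usage, equivalent in intent to the cell above:
# rename_sensor_prefix('workshop_data/sentinel_2')
```

Running it a second time is harmless, since files that no longer contain the sensor name are skipped.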

To sum up this section, we have downloaded the Sentinel-2 images that will make up our dataset through the Sentinel Hub client, and we have formatted the folder structure in a way that will allow us both to label our dataset using SCANEO and to generate STAC metadata. With this, we have our [Q0 dataset](../00_eotdl.ipynb)!

Let's continue to the [02_labeling](./02_labeling.ipynb) notebook and label our dataset!