## Script for automated Download of Sentinel-3 LST Data


### Create Environment and select Kernel

Before you run the script, create the environment defined in the file sentinel_env.yml. To do so, use Anaconda Prompt and follow these steps:
- Open Anaconda Prompt
- Write: `conda env create -n sentinel_env -f <path to sentinel_env.yml>`, replacing the placeholder with the path to the environment file
- Press Enter
- In the next line, write: `conda activate sentinel_env`
- Press Enter
- If the next line starts with (sentinel_env), the environment has been activated successfully
- To create the kernel, write: `python -m ipykernel install --user --name sentinel_env --display-name sentinel_env_kernel`
- If you run into trouble, you can find help here: https://docs.conda.io/projects/conda/en/4.6.0/user-guide/troubleshooting.html
- ...and useful commands here: https://docs.conda.io/projects/conda/en/4.6.0/_downloads/52a95608c49671267e40c689e0bc00ca/conda-cheatsheet.pdf
- In Jupyter Notebook, click Kernel in the menu bar, select Change kernel, and choose sentinel_env_kernel
- If it says sentinel_env_kernel in the upper right corner, you are ready to run the script!

Please note that the environment works on Windows only.
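
Once the kernel is selected, a quick check from within the notebook can confirm that the packages imported by the script are available in the active environment. The following is a minimal sketch: the helper name `missing_packages` is hypothetical, and the package list simply mirrors the third-party imports of the first code cell.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level packages that cannot be found in the active environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Third-party packages imported by this script (assumed to be provided by sentinel_env.yml)
required = ["requests", "pandas", "sentinelsat"]
print("Missing packages:", missing_packages(required))
```

An empty list suggests that the kernel is running in the intended environment.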

In [1]:
# Import necessary libraries
import os
import re
import json
import requests
import pandas as pd
from sentinelsat import read_geojson
from datetime import date, datetime, timedelta

### Defining important Variables

The user needs to specify the following variables:
- username: The username for the Copernicus Data Space login.
- password: The password for the Copernicus Data Space login.
- start_date: The start date of the desired acquisition period.
- end_date: The end date of the desired acquisition period.
- start_time: The start time (UTC) for the LST daytime acquisitions.
- end_time: The end time (UTC) for the LST daytime acquisitions.
- collection_product: The name of the Sentinel collection product (for LST: "SL_2_LST", for SYN: "SY_2_SYN").
- kenya_aoi: The bounding box of the area of interest (AOI), given as a WKT polygon of its corner coordinates (longitude latitude).
- maxRecords: The maximum number of records returned by the search (at most 1000).
- output_dir: The path to the directory in which the downloaded Sentinel zip files are stored.
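
The AOI string can also be built from the four corner values of the bounding box instead of being typed by hand. This is a minimal sketch; the function name `bbox_to_wkt` and the rounded example coordinates are illustrative and not part of the original script.

```python
def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Build a closed WKT polygon (longitude latitude order) from a bounding box."""
    corners = [
        (min_lon, min_lat),
        (max_lon, min_lat),
        (max_lon, max_lat),
        (min_lon, max_lat),
        (min_lon, min_lat),  # repeat the first corner to close the ring
    ]
    coords = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON (({coords}))"


# Rounded example values for illustration only
print(bbox_to_wkt(33.9, -4.7, 41.9, 4.6))
```

Note that this helper returns plain WKT without the trailing single quote that the notebook's kenya_aoi string carries.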

In [2]:
# Copernicus Data Space credentials
# Replace the placeholders with your own login; never store real credentials in a shared notebook
username = "example@email.de"
password = "Password"
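
To keep credentials out of the notebook altogether, they can be read from environment variables, with an interactive prompt as a fallback. This is a sketch under assumptions: the variable names `CDSE_USERNAME` and `CDSE_PASSWORD` and the helper `load_credentials` are hypothetical.

```python
import os
from getpass import getpass


def load_credentials(env=None, ask_user=input, ask_password=getpass):
    """Read the login from environment variables, prompting for any missing value."""
    env = os.environ if env is None else env
    # CDSE_USERNAME and CDSE_PASSWORD are hypothetical variable names chosen for this sketch
    username = env.get("CDSE_USERNAME") or ask_user("Copernicus Data Space username: ")
    password = env.get("CDSE_PASSWORD") or ask_password("Copernicus Data Space password: ")
    return username, password


# Usage in the notebook:
# username, password = load_credentials()
```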

In [22]:
# Define search parameters
start_date = date(2022, 11, 1)
end_date = date(2022, 12, 31)

start_time = "06:00:00"
end_time = "15:00:00"

collection_product = "SL_2_LST"

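# Bounding box of the AOI as a WKT polygon (longitude latitude).
# The trailing single quote inside the string closes the geography'...' literal
# in the OData query further down, so it must not be removed.
kenya_aoi = "POLYGON ((33.9095878601073650 -4.7204170227049644, 41.8875236511230540 -4.7204170227049644, 41.8875236511230540 4.6338191032410290, 33.9095878601073650 4.6338191032410290, 33.9095878601073650 -4.7204170227049644))'"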

maxRecords = 1000 # number of records to search (maximum = 1000)

In [4]:
# Path to directory to store the downloaded data
output_dir = 'D:/Katharina/04_ADM_Kenya/Programming/Sentinel-3/Download/Sentinel_3_Data'

### Creating a Copernicus Data Space Access Token

In [5]:
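# Get access token
def get_access_token(username: str, password: str) -> str:
    """Request an access token from the Copernicus Data Space identity service."""
    data = {
        "client_id": "cdse-public",
        "username": username,
        "password": password,
        "grant_type": "password",
    }
    try:
        r = requests.post(
            "https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/token",
            data=data,
        )
        r.raise_for_status()
    except requests.exceptions.HTTPError as e:
        # The server answered with an error status; include its response in the message
        raise Exception(
            f"Access token creation failed. Response from server was: {r.text}"
        ) from e
    except requests.exceptions.RequestException as e:
        # No response was received (e.g. connection error), so there is no body to report
        raise Exception(f"Access token creation failed: {e}") from e
    return r.json()["access_token"]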

access_token = get_access_token(username, password)
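
Since access tokens expire, it can help to cache the token and request a new one only shortly before it runs out. The sketch below assumes that the token response contains the standard OAuth 2.0 fields `access_token` and `expires_in`; the class `TokenCache` is hypothetical, and it expects a fetch function that returns the full response dictionary (the notebook's `get_access_token` returns only the token string, so a variant returning `r.json()` would be needed).

```python
import time


class TokenCache:
    """Cache an access token and fetch a new one shortly before it expires."""

    def __init__(self, fetch_token, clock=time.monotonic, margin_seconds=30):
        self._fetch_token = fetch_token  # callable returning the token response dict
        self._clock = clock
        self._margin = margin_seconds
        self._token = None
        self._expires_at = 0.0

    def get(self):
        # Fetch a token on first use or once the cached one is about to expire
        if self._token is None or self._clock() >= self._expires_at:
            response = self._fetch_token()
            self._token = response["access_token"]
            self._expires_at = self._clock() + response["expires_in"] - self._margin
        return self._token
```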


### Searching the Catalogue with OData

Please note: The query returns at most 1000 products. If the number of products is exactly 1000, the maximum has been reached and the actual number of matching products may be larger. In that case, go back to the section on defining variables and shorten the search period by adapting the start and end date. Repeat the query until the number is below 1000. You can download the missing data in a separate run afterwards.
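
One way to stay below the limit without adjusting the dates by hand is to split the search period into shorter chunks and run one query per chunk. This is a minimal sketch; the helper `split_date_range` and the chunk length of 30 days are assumptions, not part of the original script.

```python
from datetime import date, timedelta


def split_date_range(start, end, days_per_chunk):
    """Split [start, end) into consecutive chunks of at most days_per_chunk days."""
    if days_per_chunk <= 0:
        raise ValueError("days_per_chunk must be positive")
    chunks = []
    chunk_start = start
    while chunk_start < end:
        chunk_end = min(chunk_start + timedelta(days=days_per_chunk), end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end
    return chunks


# Each pair could be used as start_date and end_date of a separate query
for chunk_start, chunk_end in split_date_range(date(2022, 11, 1), date(2022, 12, 31), 30):
    print(chunk_start, chunk_end)
```

With the filter used in this notebook, a product whose acquisition spans a chunk boundary would match neither chunk, so boundaries may need a small overlap.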

In [23]:
# Write the query (note: the result is stored in a variable named json, which shadows the imported json module)
json = requests.get(f"https://catalogue.dataspace.copernicus.eu/odata/v1/Products?$filter=contains(Name, '{collection_product}')%20and%20OData.CSC.Intersects(area=geography'SRID=4326;{kenya_aoi})%20and%20ContentDate/Start%20gt%20{start_date}T00:00:00.000Z%20and%20ContentDate/End%20lt%20{end_date}T00:00:00.000Z&$top={maxRecords}").json()
len(json['value'])

{'@odata.context': '$metadata#Products', 'value': [{'@odata.mediaContentType': 'application/octet-stream', 'Id': 'd3355d4c-5528-5350-8039-5ef9f02fa997', 'Name': 'S3A_SL_2_LST____20221101T190341_20221101T190540_20221103T024408_0119_091_312_5940_PS1_O_NT_004.SEN3', 'ContentType': 'application/octet-stream', 'ContentLength': 35536999, 'OriginDate': '2022-11-03T01:55:54.130Z', 'PublicationDate': '2022-11-03T02:56:34.004Z', 'ModificationDate': '2022-11-03T02:56:49.782Z', 'Online': True, 'EvictionDate': '', 'S3Path': '/eodata/Sentinel-3/SLSTR/SL_2_LST/2022/11/01/S3A_SL_2_LST____20221101T190341_20221101T190540_20221103T024408_0119_091_312_5940_PS1_O_NT_004.SEN3', 'Checksum': [{}], 'ContentDate': {'Start': '2022-11-01T19:03:40.687Z', 'End': '2022-11-01T19:05:39.987Z'}, 'Footprint': "geography'SRID=4326;POLYGON ((39.2594 -0.997254, 40.0136 -4.49086, 40.7662 -7.99144, 40.777 -7.99839, 40.9497 -7.95767, 41.4009 -7.86291, 41.8473 -7.76761, 42.3058 -7.67279, 42.7527 -7.57237, 43.2064 -7.47754, 43.6

852

In [7]:
# Convert dictionary to dataframe
products_df = pd.DataFrame.from_dict(json['value'])

# Print the first five products to see if the query worked
products_df.head(5)

| | @odata.mediaContentType | Id | Name | ContentType | ContentLength | OriginDate | PublicationDate | ModificationDate | Online | EvictionDate | S3Path | Checksum | ContentDate | Footprint | GeoFootprint |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | application/octet-stream | 1e73f92d-bc9a-52b6-854c-7496af1fc276 | S3B_SL_2_LST____20190901T184732_20190901T20283... | application/octet-stream | 0 | 2019-09-03T02:43:48.814Z | 2021-06-17T08:35:46.570Z | 2021-06-17T08:35:46.570Z | True | | /eodata/Sentinel-3/SLSTR/SL_2_LST/2019/09/01/S... | [] | {'Start': '2019-09-01T18:47:31.809Z', 'End': '... | geography'SRID=4326;MULTIPOLYGON (((0.00114589... | {'type': 'MultiPolygon', 'coordinates': [[[[0.... |
| 1 | application/octet-stream | be44cbc9-82da-5504-b5b7-a7307e6feeb9 | S3B_SL_2_LST____20190901T200117_20190901T20031... | application/octet-stream | 0 | 2020-10-09T08:29:13.528Z | 2020-10-11T09:37:12.178Z | 2020-10-11T09:37:12.178Z | True | | /eodata/Sentinel-3/SLSTR/SL_2_LST/2019/09/01/S... | [] | {'Start': '2019-09-01T20:01:16.809Z', 'End': '... | geography'SRID=4326;POLYGON ((37.9044 1.95846,... | {'type': 'Polygon', 'coordinates': [[[37.9044,... |
| 2 | application/octet-stream | 82c11118-b67c-545b-9eeb-9ccc326fe6b4 | S3B_SL_2_LST____20190901T070037_20190901T08413... | application/octet-stream | 0 | 2019-09-02T14:30:27.845Z | 2021-06-17T08:37:54.289Z | 2021-06-17T08:37:54.289Z | True | | /eodata/Sentinel-3/SLSTR/SL_2_LST/2019/09/01/S... | [] | {'Start': '2019-09-01T07:00:37.246Z', 'End': '... | geography'SRID=4326;MULTIPOLYGON (((0.00114589... | {'type': 'MultiPolygon', 'coordinates': [[[[0.... |
| 3 | application/octet-stream | 21d6d5b9-e0bd-52c8-adc7-d4087918110d | S3A_SL_2_LST____20190901T192702_20190901T21080... | application/octet-stream | 0 | 2019-09-03T02:25:21.105Z | 2021-06-17T08:35:58.917Z | 2021-06-17T08:35:58.917Z | True | | /eodata/Sentinel-3/SLSTR/SL_2_LST/2019/09/01/S... | [] | {'Start': '2019-09-01T19:27:02.289Z', 'End': '... | geography'SRID=4326;MULTIPOLYGON (((0.00114589... | {'type': 'MultiPolygon', 'coordinates': [[[[0.... |
| 4 | application/octet-stream | 8b671958-3822-5b88-ac50-38cb47d544d4 | S3A_SL_2_LST____20190901T074008_20190901T09210... | application/octet-stream | 0 | 2019-09-02T15:23:29.357Z | 2021-06-17T08:38:25.433Z | 2021-06-17T08:38:25.433Z | True | | /eodata/Sentinel-3/SLSTR/SL_2_LST/2019/09/01/S... | [] | {'Start': '2019-09-01T07:40:07.616Z', 'End': '... | geography'SRID=4326;MULTIPOLYGON (((0.00114589... | {'type': 'MultiPolygon', 'coordinates': [[[[0.... |


### Filter Dictionary

#### Filter Records by Daytime

The Sentinel-3 SLSTR Level-2 LST product provides a measure of how hot or cold the 'surface' of the Earth would feel to the touch. It is usually recorded during both daytime and nighttime. We are only interested in the daytime acquisitions, so we need to specify a time range to filter the acquisitions. The acquisition times in the catalogue are given in UTC. For Kenya, the time range is set to 06:00 to 15:00 UTC, which corresponds to 9 am to 6 pm local time, since Kenya (East Africa Time) is three hours ahead of UTC.

In [8]:
# Convert the strings for the start and end time to time objects
start_time = datetime.strptime(start_time, "%H:%M:%S").time()
end_time = datetime.strptime(end_time, "%H:%M:%S").time()

products_daytime = {}
products_list = json.get('value', [])

for product in products_list:
    contentDate = product['ContentDate']
    product_id = product['Id']

    # Parse the start and end timestamps (UTC) and keep only the time of day, truncated to whole seconds
    start_datetime = datetime.strptime(contentDate.get('Start'), "%Y-%m-%dT%H:%M:%S.%fZ").time().replace(microsecond=0)
    end_datetime = datetime.strptime(contentDate.get('End'), "%Y-%m-%dT%H:%M:%S.%fZ").time().replace(microsecond=0)

    # Keep products whose start or end time lies within the set range
    if start_time <= start_datetime <= end_time or start_time <= end_datetime <= end_time:
        products_daytime[product_id] = product

len(products_daytime)

397
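
The ContentDate timestamps end in "Z", meaning they are given in UTC. To check which local time a chosen UTC range corresponds to, a timestamp can be converted to East Africa Time (UTC+3). This is a minimal sketch; the helper `utc_string_to_local` is hypothetical, and a fixed offset is used instead of a time zone database.

```python
from datetime import datetime, timedelta, timezone

# East Africa Time as a fixed offset of three hours from UTC
EAT = timezone(timedelta(hours=3), "EAT")


def utc_string_to_local(timestamp, tz=EAT):
    """Parse a catalogue timestamp (UTC, trailing 'Z') and convert it to local time."""
    utc_dt = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    return utc_dt.astimezone(tz)


local = utc_string_to_local("2022-11-01T19:03:40.687Z")
print(local.strftime("%Y-%m-%d %H:%M:%S %Z"))  # prints 2022-11-01 22:03:40 EAT
```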

#### Filter for Non-Timecritical Frames

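The instrument data from Sentinel-3 SLSTR can be disseminated in 'stripes', 'frames', or 'tiles'. Since we are only interested in frames, we need to filter the data for products whose names follow a certain pattern according to the naming conventions. The pattern used here matches four underscore-separated groups of 4, 3, 3, and 4 digits in the product name, for example 0119_091_312_5940.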

In [9]:
# Filter for a specific structure of the instance id (part of the name)
products_nonTimeCritical = {}

pattern = r"\d{4}_\d{3}_\d{3}_\d{4}"

# The keys of products_daytime are the product ids
for product_id, product in products_daytime.items():
    name = product['Name']

    # Keep the product if its name contains the instance id pattern
    match = re.search(pattern, name)

    if match:
        products_nonTimeCritical[product_id] = product

len(products_nonTimeCritical)

244
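
To see what the matched part of a product name contains, the name can be split into named fields. The sketch below is based on one reading of the Sentinel-3 naming convention (duration, cycle, relative orbit, and frame coordinate, followed by generating centre, platform, timeliness, and collection); the field names and the helper `parse_frame_name` are assumptions and should be checked against the official naming documentation.

```python
import re

# Trailing part of a frame product name, e.g. _0119_091_312_5940_PS1_O_NT_004.SEN3
INSTANCE_PATTERN = re.compile(
    r"_(?P<duration>\d{4})_(?P<cycle>\d{3})_(?P<relative_orbit>\d{3})_(?P<frame>\d{4})"
    r"_(?P<centre>[A-Z0-9]{3})_(?P<platform>[A-Z])_(?P<timeliness>[A-Z]{2})"
    r"_(?P<collection>\d{3})\.SEN3$"
)


def parse_frame_name(name):
    """Return the trailing name fields as a dict, or None if the name is not a frame."""
    match = INSTANCE_PATTERN.search(name)
    return match.groupdict() if match else None


fields = parse_frame_name(
    "S3A_SL_2_LST____20221101T190341_20221101T190540_20221103T024408_0119_091_312_5940_PS1_O_NT_004.SEN3"
)
print(fields)
```

Under this reading, a timeliness value of NT would mark a non-time-critical product, which the digit pattern in the notebook does not check by itself.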

In [69]:
# Print the number of total products, filtered products by daytime, and non-time-critical frames
print("Total Number of Products: ", len(products_df))
print("Number of Daytime Products: ", len(products_daytime))
print("Number of Non-Timecritical Products: ", len(products_nonTimeCritical))

Total Number of Products:  704
Number of Daytime Products:  397
Number of Non-Timecritical Products:  244


### Download Data

In [10]:
# Get the list of existing files in the output directory
existing_files = os.listdir(output_dir)
existing_files

['S3A_SL_2_LST____20180503T080254_20180503T080554_20210118T142548_0179_030_363_2880_LR1_R_NT_004.zip',
 'S3A_SL_2_LST____20180503T080554_20180503T080854_20210118T142546_0180_030_363_3060_LR1_R_NT_004.zip',
 'S3A_SL_2_LST____20180504T073643_20180504T073943_20210118T143114_0179_030_377_2880_LR1_R_NT_004.zip',
 'S3A_SL_2_LST____20180504T073943_20180504T074243_20210118T143121_0179_030_377_3060_LR1_R_NT_004.zip',
 'S3A_SL_2_LST____20180507T075909_20180507T080209_20210118T144812_0179_031_035_2880_LR1_R_NT_004.zip',
 'S3A_SL_2_LST____20180507T080209_20180507T080509_20210118T144825_0179_031_035_3060_LR1_R_NT_004.zip',
 'S3A_SL_2_LST____20180508T073558_20180508T073858_20210118T145356_0179_031_049_3060_LR1_R_NT_004.zip',
 'S3A_SL_2_LST____20180511T075525_20180511T075825_20210118T151206_0179_031_092_2880_LR1_R_NT_004.zip',
 'S3A_SL_2_LST____20180511T075825_20180511T080125_20210118T151157_0179_031_092_3060_LR1_R_NT_004.zip',
 'S3A_SL_2_LST____20180512T073214_20180512T073514_20210118T151805_0179_03

In [11]:
# Iterate over the non-timecritical frames
download_frames = {}

for product_id, product_file in products_nonTimeCritical.items():
    # Extract the title of the product and strip the .SEN3 extension it already carries
    title = os.path.splitext(product_file['Name'])[0]

    # Append the .zip extension to the title
    zip_title = title + ".zip"

    # Append the .SEN3 extension to the title
    sen_title = title + ".SEN3"

    # Select the product for download if neither the zip nor the SEN3 file exists yet
    if zip_title not in existing_files and sen_title not in existing_files:
        download_frames[product_id] = product_file

# Print the number of non-time-critical frames that need to be downloaded
print('Non-Timecritical Frames to download:', len(download_frames))

Non-Timecritical Frames to download: 244
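
Product names in the catalogue end in .SEN3, while the downloaded files are stored with a .zip extension, so the comparison is easiest on the name without its extension. This is a minimal sketch; the helpers `downloaded_basenames` and `needs_download` are hypothetical.

```python
import os


def downloaded_basenames(file_names):
    """Return the set of file names with their extension (.zip or .SEN3) removed."""
    return {os.path.splitext(name)[0] for name in file_names}


def needs_download(product_name, existing_basenames):
    """Return True if no file with the same base name has been downloaded yet."""
    return os.path.splitext(product_name)[0] not in existing_basenames


# Shortened placeholder names for illustration only
existing = downloaded_basenames(["product_a.zip", "product_b.SEN3"])
print(needs_download("product_a.SEN3", existing))
print(needs_download("product_c.SEN3", existing))
```

Using a set also keeps the lookup fast when the output directory holds many files.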


In [3]:
# Prompt the user for input to confirm the download
user_input = input("Do you want to download the data? (yes/ no):")

Do you want to download the data? (yes/ no):no


In [13]:
if user_input.lower() == "yes":

    access_token = get_access_token(username, password)
    headers = {"Authorization": f"Bearer {access_token}"}
    session = requests.Session()

    # The keys of download_frames are the ids of the products selected for download
    for product, product_file in download_frames.items():
        title = product_file['Name']
        file_name = title.replace('SEN3', 'zip')
        url = f"https://zipper.dataspace.copernicus.eu/odata/v1/Products({product})/$value"

        response = session.get(url, headers=headers, stream=True)

        # Check if the access token is still valid
        while response.status_code == 401:

            # Token expired, generate a new one and repeat the request
            access_token = get_access_token(username, password)
            headers = {"Authorization": f"Bearer {access_token}"}
            response = session.get(url, headers=headers, stream=True)

        # If the request was successful, proceed with downloading
        if response.status_code == 200:
            # Write the file in chunks to avoid loading the whole product into memory
            with open(os.path.join(output_dir, file_name), "wb") as file:
                for chunk in response.iter_content(chunk_size=8192):
                    if chunk:
                        file.write(chunk)
        else:
            print(f"Error downloading product {product}, status code: {response.status_code}")