# Data Fetching
Week 3 materials can be found [here](https://drive.google.com/drive/folders/1pAYGeXcQSlTnS_hVQzE37Vg-MVXwq51M?usp=sharing).
In this section, we cover how to acquire the dataset that has been integral to our analyses: the OLCI data from the Sentinel-3 satellite, distributed through the Copernicus Dataspace. This segment guides you through accessing the dataset, understanding its structure, and efficiently retrieving the data you need for your work.

## Copernicus Data Space

**Overview**  
Copernicus Data Space is a cornerstone of the European Union's Earth observation program, providing a wealth of data from the Sentinel satellites. Aimed at monitoring the Earth's environment, it supports applications in areas like climate change, disaster response, and urban planning.

**Key Features**
- **Diverse Datasets**: Offers imagery, atmospheric measurements, and climate indicators.
- **Accessibility**: Data is freely accessible, fostering open science and research.

**Resources**  
For more information and data access, visit the [Copernicus Dataspace](https://dataspace.copernicus.eu).

---

## Set up Accounts

Before delving into the specifics of data retrieval, it's crucial to ensure you have access to the necessary platforms.

**Copernicus Dataspace:** Accessing data from the Copernicus Dataspace requires a separate registration. If you haven't done so, please take a moment to create an account. Simply visit the [Copernicus Dataspace registration page](https://dataspace.copernicus.eu) and follow the instructions to sign up.

## Data Fetching Logic

The logic underlying the data fetching process involves several key steps:

1. **Area and Time Specification:** Initially, we define the geographical scope and the specific time frame of interest. This precise specification allows us to target our data retrieval effectively.

2. **Retrieving Metadata from Copernicus Dataspace:** Once the area and time parameters are set, we proceed to fetch a list of relevant file names from Copernicus Dataspace.

3. **Option 1: Fetching raw data from Copernicus Dataspace for a given date and time:** With the metadata saved, we then access the Copernicus Dataspace to retrieve the raw data. Before initiating the download, you can preview a product in the [Copernicus Dataspace browser](https://browser.dataspace.copernicus.eu/) by searching for the filename you are interested in (for example, to check whether the scene is cloud free).

3. **Option 2: Browsing first, then downloading the raw data:** You can also go to the [Copernicus Dataspace browser](https://browser.dataspace.copernicus.eu/) first and select your images. With the filenames you are interested in, you can then initiate the download.
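The area and time specification in step 1 can be sketched with the standard library alone. The helper names `bbox_to_wkt` and `day_range_to_iso` are hypothetical and not part of the course code; the bounding box values are the ones used later in this notebook's query function.

```python
from datetime import date


def bbox_to_wkt(min_lon, min_lat, max_lon, max_lat):
    """Return a closed WKT polygon (lon lat order) for a bounding box."""
    corners = [
        (min_lon, min_lat),
        (min_lon, max_lat),
        (max_lon, max_lat),
        (max_lon, min_lat),
        (min_lon, min_lat),  # repeat the first corner to close the ring
    ]
    ring = ", ".join(f"{lon} {lat}" for lon, lat in corners)
    return f"POLYGON (({ring}))"


def day_range_to_iso(start_date, end_date):
    """Expand 'YYYY-MM-DD' strings to timestamps covering whole days (UTC)."""
    # date.fromisoformat raises ValueError on malformed input
    start = date.fromisoformat(start_date)
    end = date.fromisoformat(end_date)
    if end < start:
        raise ValueError("end_date must not be earlier than start_date")
    return f"{start}T00:00:00.000Z", f"{end}T23:59:59.999Z"


area = bbox_to_wkt(-81.7, 71.7, -75.1, 73.8)
window = day_range_to_iso("2018-06-01", "2018-06-02")
print(area)
print(window)
```

Validating the dates up front means a typo fails locally instead of producing an empty or rejected catalogue query.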






### Step 0: Set Up

Before we dive into the data fetching process, it's essential to lay the groundwork by setting up the necessary packages and ensuring proper authentication. Follow these preparatory steps to create a smooth and efficient workflow:
**Install Required Packages:** Make sure all the necessary packages are installed in your working environment. This includes libraries for data handling, geospatial analysis, and any other tools relevant to your project. On Google Colab you don't need to do this, but it is common practice when you execute the code on your local machine.


By completing this initial setup step, you ensure that your environment is equipped with the tools needed for data fetching and analysis.
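On a local machine, one way to see which packages still need installing is to ask Python whether each one can be found. This is a minimal sketch: the `REQUIRED` list mirrors the third-party imports used in this notebook, and `missing_packages` is a hypothetical helper.

```python
import importlib.util

# Third-party packages imported later in this notebook
REQUIRED = ["numpy", "pandas", "requests", "shapely"]


def missing_packages(names):
    """Return the top-level package names that cannot be imported here."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Install these before continuing:", ", ".join(missing))
else:
    print("All required packages are available.")
```

Anything reported as missing can then be installed with your usual package manager, for example `pip install <name>`.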




In [2]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [8]:
import os
from datetime import datetime, timedelta
from xml.etree import ElementTree as ET

import numpy as np
import pandas as pd
import requests
from shapely.geometry import Point, Polygon


### Step 1: Read in Functions Needed

To streamline our data fetching and processing, we'll first load the essential functions. These functions handle tasks such as authentication, metadata queries, and file download. Make sure you have run this cell before proceeding to the next steps of the workflow. The functions have docstrings, so please read them to get an idea of what each one does.


In [9]:
def make_api_request(url, method="GET", data=None, headers=None):
    """
    Send an authenticated request, refreshing the access token once on 401/403.

    Relies on the module-level `access_token` and `refresh_token` set in Step 2.
    """
    global access_token, refresh_token
    if not headers:
        headers = {"Authorization": f"Bearer {access_token}"}

    response = requests.request(method, url, json=data, headers=headers)
    if response.status_code in [401, 403]:
        # The token has probably expired: refresh it and retry the request once
        access_token = refresh_access_token(refresh_token)
        headers["Authorization"] = f"Bearer {access_token}"
        response = requests.request(method, url, json=data, headers=headers)
    return response


def query_sentinel3_olci_arctic_data(start_date, end_date, token):
    """
    Queries Sentinel-3 OLCI data within a specified time range from the Copernicus Data Space,
    targeting data collected over the Arctic region.

    Parameters:
    start_date (str): Start date in 'YYYY-MM-DD' format.
    end_date (str): End date in 'YYYY-MM-DD' format.
    token (str): Access token for authentication.

    Returns:
    DataFrame: Contains details about the Sentinel-3 OLCI images.
    """

    all_data = []
    # arctic_polygon = "POLYGON((-180 60, 180 60, 180 90, -180 90, -180 60))"
    arctic_polygon = (
        "POLYGON ((-81.7 71.7, -81.7 73.8, -75.1 73.8, -75.1 71.7, -81.7 71.7))"
    )

    filter_string = (
        f"Collection/Name eq 'SENTINEL-3' and "
        f"Attributes/OData.CSC.StringAttribute/any(att:att/Name eq 'productType' and att/Value eq 'OL_1_EFR___') and "
        f"ContentDate/Start gt {start_date}T00:00:00.000Z and ContentDate/Start lt {end_date}T23:59:59.999Z"
    )

    next_url = (
        f"https://catalogue.dataspace.copernicus.eu/odata/v1/Products?"
        f"$filter={filter_string} and "
        f"OData.CSC.Intersects(area=geography'SRID=4326;{arctic_polygon}')&"
        f"$top=1000"
    )

    headers = {"Authorization": f"Bearer {token}"}

    while next_url:
        response = make_api_request(next_url, headers=headers)
        if response.status_code == 200:
            data = response.json()["value"]
            all_data.extend(data)
            next_url = response.json().get("@odata.nextLink")
        else:
            print(f"Error fetching data: {response.status_code} - {response.text}")
            break

    return pd.DataFrame(all_data)


def get_access_and_refresh_token(username, password):
    """Retrieve both access and refresh tokens."""
    url = "https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/token"
    data = {
        "grant_type": "password",
        "username": username,
        "password": password,
        "client_id": "cdse-public",
    }
    response = requests.post(url, data=data)
    response.raise_for_status()
    tokens = response.json()
    return tokens["access_token"], tokens["refresh_token"]


def refresh_access_token(refresh_token):
    """Attempt to refresh the access token using the refresh token."""
    url = "https://identity.dataspace.copernicus.eu/auth/realms/CDSE/protocol/openid-connect/token"
    data = {
        "grant_type": "refresh_token",
        "refresh_token": refresh_token,
        "client_id": "cdse-public",
    }
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    try:
        response = requests.post(url, headers=headers, data=data)
        response.raise_for_status()  # This will throw an error for non-2xx responses
        return response.json()["access_token"]
    except requests.exceptions.HTTPError as e:
        print(f"Failed to refresh token: {e.response.status_code} - {e.response.text}")
        if e.response.status_code == 400:
            print("Refresh token invalid, attempting re-authentication...")
            # Re-authenticate with the module-level `username` and `password`
            # defined in Step 2. They are only read here; assigning to them
            # inside this function would raise UnboundLocalError.
            # This requires securely managing the credentials, which might not be feasible in all contexts
            access_token, new_refresh_token = get_access_and_refresh_token(
                username, password
            )
            # Store the new refresh token at module level so later calls use it
            globals()["refresh_token"] = new_refresh_token
            return access_token
        else:
            raise

def download_single_product(
    product_id, file_name, access_token, download_dir="downloaded_products"
):
    """
    Download a single product from the Copernicus Data Space.

    :param product_id: The unique identifier for the product.
    :param file_name: The name of the file to be downloaded.
    :param access_token: The access token for authorization.
    :param download_dir: The directory where the product will be saved.
    """
    # Ensure the download directory exists
    os.makedirs(download_dir, exist_ok=True)

    # Construct the download URL
    url = (
        f"https://zipper.dataspace.copernicus.eu/odata/v1/Products({product_id})/$value"
    )

    # Set up the session and headers
    headers = {"Authorization": f"Bearer {access_token}"}
    session = requests.Session()
    session.headers.update(headers)

    # Perform the request
    response = session.get(url, headers=headers, stream=True)

    # Check if the request was successful
    if response.status_code == 200:
        # Define the path for the output file
        output_file_path = os.path.join(download_dir, file_name + ".zip")

        # Stream the content to a file
        with open(output_file_path, "wb") as file:
            for chunk in response.iter_content(chunk_size=8192):
                if chunk:
                    file.write(chunk)
        print(f"Downloaded: {output_file_path}")
    else:
        print(
            f"Failed to download product {product_id}. Status Code: {response.status_code}"
        )
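The paging loop inside `query_sentinel3_olci_arctic_data` can be studied in isolation. The sketch below assumes only what that function already relies on: each response carries its records under `"value"` and, when more results exist, a follow-up URL under `"@odata.nextLink"`. The helper `collect_pages` and the fake pages are hypothetical stand-ins so the example runs offline.

```python
def collect_pages(first_url, fetch_json):
    """Follow '@odata.nextLink' until no further page is advertised."""
    records = []
    url = first_url
    while url:
        page = fetch_json(url)
        records.extend(page.get("value", []))
        # A missing key returns None, which ends the loop
        url = page.get("@odata.nextLink")
    return records


# Offline stand-in for the catalogue: two fake pages keyed by URL
fake_pages = {
    "page-1": {"value": [{"Name": "a"}, {"Name": "b"}], "@odata.nextLink": "page-2"},
    "page-2": {"value": [{"Name": "c"}]},
}

names = [record["Name"] for record in collect_pages("page-1", fake_pages.__getitem__)]
print(names)
```

Passing the fetch function in as an argument keeps the loop testable without network access; in the notebook the same role is played by `make_api_request`.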

### Step 2: Extract Metadata from Copernicus Dataspace

Once you have set up your environment and are authenticated with Copernicus Dataspace, the next step is to extract the filenames that meet your specific criteria.

In [10]:
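username = "your_email@example.com"  # Replace with your Copernicus Dataspace username
password = "your_password"  # Replace with your password; never share real credentials in a notebook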
access_token, refresh_token = get_access_and_refresh_token(username, password)
start_date = "2018-06-01"
end_date = "2018-06-02"

sentinel3_olci_data = query_sentinel3_olci_arctic_data(
    start_date, end_date, access_token
)

# You can also save the metadata
# sentinel3_olci_data.to_csv(
#     "/home/wch/data_colocation/Datasets-Co-location/Metadata/sentinel3_olci_metadata_2018_zara.csv",
#     index=False,
# )
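Credentials typed into a notebook cell are easy to share by accident. A minimal sketch of one alternative reads them from environment variables instead; the variable names `CDSE_USERNAME` and `CDSE_PASSWORD` and the helper `load_credentials` are assumptions made for illustration, not names defined by Copernicus or by this course.

```python
import os


def load_credentials(env=None):
    """Read the login from environment variables instead of the source code."""
    env = os.environ if env is None else env
    try:
        return env["CDSE_USERNAME"], env["CDSE_PASSWORD"]
    except KeyError as missing:
        raise RuntimeError(f"Environment variable {missing} is not set") from None


# Demonstration with an explicit mapping; in real use call load_credentials()
demo_user, demo_password = load_credentials(
    {"CDSE_USERNAME": "someone@example.com", "CDSE_PASSWORD": "not-a-real-password"}
)
print(demo_user)
```

The returned pair can be passed straight to `get_access_and_refresh_token`.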

Below you can display the metadata you have just retrieved. It contains several attributes of each S3 OLCI product, including the filename, Id, geographic footprint, and sensing date.

In [11]:
from IPython.display import display

display(sentinel3_olci_data)


Unnamed: 0,@odata.mediaContentType,Id,Name,ContentType,ContentLength,OriginDate,PublicationDate,ModificationDate,Online,EvictionDate,S3Path,Checksum,ContentDate,Footprint,GeoFootprint
0,application/octet-stream,728d46ab-cb77-3dc7-af04-939f1a26cf09,S3A_OL_1_EFR____20180601T013946_20180601T01403...,application/octet-stream,223173022,2024-08-31T03:03:51.248000Z,2025-05-08T21:55:21.039270Z,2025-05-08T21:55:21.039270Z,True,9999-12-31T23:59:59.999999Z,/eodata/Sentinel-3/OLCI/OL_1_EFR___/2018/06/01...,"[{'Value': 'b9642a5598caa117d51e912b53c349f6',...","{'Start': '2018-06-01T01:39:45.699411Z', 'End'...",geography'SRID=4326;POLYGON ((-53.3576 75.3638...,"{'type': 'Polygon', 'coordinates': [[[-53.3576..."
1,application/octet-stream,2adfd301-98b6-3bfb-9650-0f0d6c3055de,S3A_OL_1_EFR____20180601T032045_20180601T03213...,application/octet-stream,212502842,2024-08-31T03:06:44.991000Z,2025-05-08T21:56:46.074678Z,2025-05-08T21:56:46.074678Z,True,9999-12-31T23:59:59.999999Z,/eodata/Sentinel-3/OLCI/OL_1_EFR___/2018/06/01...,"[{'Value': '3f14c2876284117ee1fee1291cb556fa',...","{'Start': '2018-06-01T03:20:44.670886Z', 'End'...",geography'SRID=4326;POLYGON ((-78.5986 75.3664...,"{'type': 'Polygon', 'coordinates': [[[-78.5986..."
2,application/octet-stream,62385914-ddc1-3a0b-a2c2-81fefceb17a5,S3A_OL_1_EFR____20180601T151428_20180601T15172...,application/octet-stream,824106427,2024-08-31T03:29:19.076000Z,2025-05-08T22:14:18.310310Z,2025-05-08T22:14:18.310310Z,True,9999-12-31T23:59:59.999999Z,/eodata/Sentinel-3/OLCI/OL_1_EFR___/2018/06/01...,"[{'Value': '91e867aa18fcf4053c7907a9c0b474c2',...","{'Start': '2018-06-01T15:14:27.722766Z', 'End'...",geography'SRID=4326;POLYGON ((-81.1794 73.3643...,"{'type': 'Polygon', 'coordinates': [[[-81.1794..."
3,application/octet-stream,a6c0eca4-8ab2-380a-8f35-5145b7cf3a57,S3A_OL_1_EFR____20180602T011332_20180602T01142...,application/octet-stream,241201048,2024-08-31T03:45:53.668000Z,2025-05-08T22:24:13.865747Z,2025-05-08T22:24:13.865747Z,True,9999-12-31T23:59:59.999999Z,/eodata/Sentinel-3/OLCI/OL_1_EFR___/2018/06/02...,"[{'Value': 'b47c86f955a0262d2ba41de213e383bc',...","{'Start': '2018-06-02T01:13:31.651927Z', 'End'...",geography'SRID=4326;POLYGON ((-46.8112 75.3681...,"{'type': 'Polygon', 'coordinates': [[[-46.8112..."
4,application/octet-stream,1797d6c8-04c2-3cd2-807b-eb18f6808537,S3A_OL_1_EFR____20180602T011423_20180602T01172...,application/octet-stream,850643109,2024-08-31T03:46:29.165000Z,2025-05-08T22:25:48.631499Z,2025-05-08T22:25:48.631499Z,True,9999-12-31T23:59:59.999999Z,/eodata/Sentinel-3/OLCI/OL_1_EFR___/2018/06/02...,"[{'Value': 'f2844f6baa07a6031c5ae8ebd3c4903e',...","{'Start': '2018-06-02T01:14:23.117321Z', 'End'...",geography'SRID=4326;POLYGON ((-46.7814 85.7876...,"{'type': 'Polygon', 'coordinates': [[[-46.7814..."
5,application/octet-stream,1582ab36-0faf-344b-afa1-f748c01af97b,S3A_OL_1_EFR____20180602T025431_20180602T02552...,application/octet-stream,232786917,2024-08-31T03:48:48.157000Z,2025-05-08T22:25:18.279674Z,2025-05-08T22:25:18.279674Z,True,9999-12-31T23:59:59.999999Z,/eodata/Sentinel-3/OLCI/OL_1_EFR___/2018/06/02...,"[{'Value': '896d8417e6a886f4b0c64f640345688a',...","{'Start': '2018-06-02T02:54:30.711366Z', 'End'...","geography'SRID=4326;POLYGON ((-72.0572 75.368,...","{'type': 'Polygon', 'coordinates': [[[-72.0572..."
6,application/octet-stream,c7d0b80b-212d-3cba-94a4-986208dd61c6,S3A_OL_1_EFR____20180602T162916_20180602T16321...,application/octet-stream,871803953,2024-08-31T04:13:46.863000Z,2025-05-08T22:44:55.412998Z,2025-05-08T22:44:55.412998Z,True,9999-12-31T23:59:59.999999Z,/eodata/Sentinel-3/OLCI/OL_1_EFR___/2018/06/02...,"[{'Value': '4869f4517795a03aabef41c14cfa6baa',...","{'Start': '2018-06-02T16:29:16.176053Z', 'End'...","geography'SRID=4326;POLYGON ((-99.884 73.3635,...","{'type': 'Polygon', 'coordinates': [[[-99.884,..."
7,application/octet-stream,49b6369c-138f-3aa3-83e1-d6506227fc20,S3B_OL_1_EFR____20180601T013839_20180601T01392...,application/octet-stream,225625139,2024-09-15T08:46:03.194000Z,2025-05-15T18:36:46.407802Z,2025-05-15T18:36:46.407802Z,True,9999-12-31T23:59:59.999999Z,/eodata/Sentinel-3/OLCI/OL_1_EFR___/2018/06/01...,"[{'Value': '3d3d8554092e88fc36adddeaf3829889',...","{'Start': '2018-06-01T01:38:38.643859Z', 'End'...","geography'SRID=4326;POLYGON ((-53.254 75.3679,...","{'type': 'Polygon', 'coordinates': [[[-53.254,..."
8,application/octet-stream,ad355a84-f413-3c02-aa8f-1d272b2724b8,S3B_OL_1_EFR____20180601T031938_20180601T03202...,application/octet-stream,216419148,2024-09-15T08:49:07.684000Z,2025-05-15T18:39:20.921674Z,2025-05-15T18:39:20.921674Z,True,9999-12-31T23:59:59.999999Z,/eodata/Sentinel-3/OLCI/OL_1_EFR___/2018/06/01...,"[{'Value': '189601dda7f7eb3d496d65be83b2067b',...","{'Start': '2018-06-01T03:19:37.792316Z', 'End'...",geography'SRID=4326;POLYGON ((-78.4975 75.3697...,"{'type': 'Polygon', 'coordinates': [[[-78.4975..."
9,application/octet-stream,656cf750-47f6-32bb-ad34-ad05b7522c38,S3B_OL_1_EFR____20180601T013927_20180601T01422...,application/octet-stream,808437455,2024-09-15T08:46:36.170000Z,2025-05-15T18:40:05.046375Z,2025-05-15T18:40:05.046375Z,True,9999-12-31T23:59:59.999999Z,/eodata/Sentinel-3/OLCI/OL_1_EFR___/2018/06/01...,"[{'Value': '3ca5601f922c8a48cb8a9eeffb1ce804',...","{'Start': '2018-06-01T01:39:27.463271Z', 'End'...",geography'SRID=4326;POLYGON ((-53.3308 85.7903...,"{'type': 'Polygon', 'coordinates': [[[-53.3308..."
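The `Name` column already encodes the platform and sensing times, so candidates can be screened without downloading anything. The sketch below assumes the fixed-width layout of the Sentinel-3 file naming convention (platform in characters 0 to 2, sensing start and stop as `YYYYMMDDTHHMMSS` from characters 16 and 32); `parse_s3_name` is a hypothetical helper.

```python
from datetime import datetime


def parse_s3_name(name):
    """Split a Sentinel-3 product name into a few useful fields."""
    fmt = "%Y%m%dT%H%M%S"
    return {
        "platform": name[0:3],        # e.g. S3A or S3B
        "product_type": name[4:15],   # e.g. OL_1_EFR___
        "start": datetime.strptime(name[16:31], fmt),
        "end": datetime.strptime(name[32:47], fmt),
    }


example_name = (
    "S3A_OL_1_EFR____20180601T013946_20180601T014034_20240530T192600"
    "_0048_032_003_1080_MAR_R_NT_004.SEN3"
)
info = parse_s3_name(example_name)
print(info["platform"], info["start"], info["end"])
```

Applied to the whole table, for example with `sentinel3_olci_data["Name"].apply(parse_s3_name)`, this gives a quick way to keep only one platform or a narrower time window.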


### Step 3: Download

Once you have the correct filename in the Copernicus format, the final step is to download the data. This process involves authenticating with your Copernicus Dataspace credentials and sending a request to download the specified file, identified by its product Id. Below is an example code snippet demonstrating how to perform the download. Ensure that your username and password are accurate and up to date to avoid any authentication issues.


In [12]:
download_dir = "/content/drive/MyDrive/0069/week3"  # Replace with your desired download directory
product_id = sentinel3_olci_data["Id"][0]  # Replace with your desired file id
file_name = sentinel3_olci_data["Name"][0]  # Replace with your desired filename
# Download the single product
download_single_product(product_id, file_name, access_token, download_dir)

Downloaded: /content/drive/MyDrive/0069/week3/S3A_OL_1_EFR____20180601T013946_20180601T014034_20240530T192600_0048_032_003_1080_MAR_R_NT_004.SEN3.zip
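Large downloads occasionally arrive truncated, so it can be worth comparing a file digest with the `Checksum` entry in the metadata. This is a sketch under stated assumptions: the checksum entries are cut off in the table above, so treating the 32-character value as an MD5 digest is a guess, and whether the catalogue checksum applies to the zip delivered by the download endpoint is not confirmed here. `file_md5` is a hypothetical helper, demonstrated on a small temporary file.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=8192):
    """Compute the MD5 hex digest of a file without loading it into memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b""
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Demonstration on a throwaway file rather than a real product
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    demo_path = tmp.name

demo_digest = file_md5(demo_path)
os.remove(demo_path)
print(demo_digest)
```

For a real product you would pass the path of the downloaded `.zip` and compare the result with the value stored in the metadata row.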


In [None]:
# cd /content/drive/MyDrive/0069/week3/

In [None]:
# unzip /content/drive/MyDrive/0069/week3/S3A_OL_1_EFR____20180601T013946_20180601T014034_20240530T192600_0048_032_003_1080_MAR_R_NT_004.SEN3.zip

By this point, you should have the dataset downloaded to the directory you specified.
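If you prefer to unpack the archive from Python rather than the shell, the standard `zipfile` module is enough. A minimal sketch, demonstrated on a tiny archive created in a temporary directory; `extract_product` is a hypothetical helper and the demo file names are made up.

```python
import tempfile
import zipfile
from pathlib import Path


def extract_product(zip_path, target_dir):
    """Unpack a downloaded archive and return the names of its members."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        return sorted(archive.namelist())


# Build a small stand-in archive so the example runs without a real download
work_dir = Path(tempfile.mkdtemp())
demo_zip = work_dir / "demo.SEN3.zip"
with zipfile.ZipFile(demo_zip, "w") as archive:
    archive.writestr("demo.SEN3/readme.txt", "ok")

extracted = extract_product(demo_zip, work_dir / "out")
print(extracted)
```

For a real product, pass the path printed by `download_single_product` and the folder where the `.SEN3` directory should end up.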

### Another download option: download a single file directly using a known filename

In [None]:
def query_product_by_name(product_name, token):
    """
    Query a specific Sentinel-3 product by its name.

    Parameters:
    product_name (str): The exact name of the product to search for.
    token (str): Access token for authentication.

    Returns:
    dict: Metadata for the matching product.
    """
    url = (
        f"https://catalogue.dataspace.copernicus.eu/odata/v1/Products?"
        f"$filter=Name eq '{product_name}'"
    )
    headers = {"Authorization": f"Bearer {token}"}

    response = make_api_request(url, headers=headers)
    if response.status_code == 200:
        data = response.json().get("value", [])
        if data:
            return data[0]  # Return the first matching product (if any)
        else:
            print(f"No product found with name: {product_name}")
            return None
    else:
        print(f"Error fetching product: {response.status_code} - {response.text}")
        return None




# Step 1: Authenticate and retrieve tokens
access_token, refresh_token = get_access_and_refresh_token(username, password)

# Step 2: Provide the product name
product_name = "S3A_OL_1_EFR____20180601T013946_20180601T014034_20240530T192600_0048_032_003_1080_MAR_R_NT_004.SEN3"  # Replace with the specific product name you have

# Step 3: Query the product by name
product_metadata = query_product_by_name(product_name, access_token)

if product_metadata:
    product_id = product_metadata["Id"]  # Extract product ID from metadata
    file_name = product_metadata["Name"]  # Extract product name from metadata

    # Step 4: Download the product
    download_dir = "/content/drive/MyDrive/0069/week3"  # Replace with your desired directory
    download_single_product(product_id, file_name, access_token, download_dir)


Downloaded: /content/drive/MyDrive/PhD Year 3/GEOL0069_test_2026/Week 3/S3A_OL_1_EFR____20180601T013946_20180601T014034_20240530T192600_0048_032_003_1080_MAR_R_NT_004.SEN3.zip
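The `$filter` expression used for the name lookup contains spaces and quotes. A minimal sketch, assuming you want the query URL percent-encoded explicitly rather than relying on the HTTP client to do it; `build_name_query_url` is a hypothetical helper, and the base URL is the catalogue endpoint already used in this notebook.

```python
from urllib.parse import quote

CATALOGUE_URL = "https://catalogue.dataspace.copernicus.eu/odata/v1/Products"


def build_name_query_url(product_name):
    """Build a percent-encoded catalogue URL that filters on an exact name."""
    # Double any single quote so it cannot end the OData string literal early
    escaped = product_name.replace("'", "''")
    filter_expr = f"Name eq '{escaped}'"
    return f"{CATALOGUE_URL}?$filter={quote(filter_expr, safe='')}"


demo_url = build_name_query_url("X.SEN3")
print(demo_url)
```

The resulting string can be passed to `make_api_request` in place of the hand-built URL in `query_product_by_name`.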
