# STAC Item Search and Submission to Data Pipeline

This notebook allows operators to:

1. Define an area of interest (AOI) and time range
2. Search for STAC items from the EOPF STAC catalog
3. Submit selected items to the data pipeline for processing


## Setup and Imports


In [25]:
import getpass
import json
import os
from pathlib import Path

import pandas as pd
import pika
import requests
from pystac_client import Client

# Try to load .env file if available
try:
    from dotenv import load_dotenv

    dotenv_path = Path(".env")
    if dotenv_path.exists():
        load_dotenv(dotenv_path)
        print("✅ Loaded credentials from .env file")
    else:
        print("ℹ️  No .env file found, will prompt for credentials")
except ImportError:
    print("ℹ️  python-dotenv not installed, will prompt for credentials")
    print("   Install with: pip install python-dotenv")

✅ Loaded credentials from .env file


## Configuration


In [2]:
# STAC API Configuration
STAC_API_URL = "https://stac.core.eopf.eodc.eu/"

# RabbitMQ Configuration
RABBITMQ_HOST = "localhost"
RABBITMQ_PORT = 5672
RABBITMQ_VHOST = "/"
RABBITMQ_USER = "user"
RABBITMQ_EXCHANGE = "eopf_samples"
RABBITMQ_ROUTING_KEY = "eopf_samples.convert.v1"

# Load password from environment or prompt user
RABBITMQ_PASSWORD = os.getenv("RABBITMQ_PASSWORD")

if not RABBITMQ_PASSWORD:
    print("\n🔐 RabbitMQ credentials required")
    print("   Tip: Create a .env file with RABBITMQ_PASSWORD=your_password")
    print("   (The .env file is gitignored and won't be committed)\n")
    RABBITMQ_PASSWORD = getpass.getpass("Enter RabbitMQ password: ")
    if RABBITMQ_PASSWORD:
        print("✅ Password entered")
    else:
        print("⚠️  Warning: No password provided!")
else:
    print("✅ RabbitMQ password loaded")

✅ RabbitMQ password loaded
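
The RabbitMQ settings above are kept as separate values. If a client needs them as a single connection URL, a minimal sketch like the following could assemble one. `build_amqp_url` is a hypothetical helper, not part of this notebook; it percent-encodes the credentials and the virtual host, because the default vhost `/` has to appear as `%2F` in an AMQP URI.

```python
from urllib.parse import quote


def build_amqp_url(user: str, password: str, host: str, port: int, vhost: str) -> str:
    """Assemble an AMQP URI from separate settings (hypothetical helper)."""
    # safe="" forces every reserved character to be percent-encoded,
    # including "/" so that the default vhost "/" becomes "%2F".
    return (
        f"amqp://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(vhost, safe='')}"
    )


# Placeholder values for illustration only (not real credentials)
example_url = build_amqp_url("user", "p@ss/word", "localhost", 5672, "/")
print(example_url)
```

A URL of this shape can be passed to `pika.URLParameters`, for example with `RABBITMQ_USER`, `RABBITMQ_PASSWORD`, `RABBITMQ_HOST`, `RABBITMQ_PORT` and `RABBITMQ_VHOST` as arguments.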


## Define Area and Time of Interest


In [3]:
# Area of Interest (AOI) - Bounding box: [min_lon, min_lat, max_lon, max_lat]
# Example: Rome area
# aoi_bbox = [12.4, 41.8, 12.6, 42.0]
# Example 2: Majorca area
# aoi_bbox = [2.16, 39.21, 3.82, 39.78]
# Example 3: France Full
aoi_bbox = [-5.14, 41.33, 9.56, 51.09]

# Time range
start_date = "2025-08-01T00:00:00Z"
end_date = "2025-08-31T23:59:59Z"

print(f"Area of Interest: {aoi_bbox}")
print(f"Time Range: {start_date} to {end_date}")

Area of Interest: [-5.14, 41.33, 9.56, 51.09]
Time Range: 2025-08-01T00:00:00Z to 2025-08-31T23:59:59Z
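
Before querying the catalog it can help to sanity-check the bounding box and time range, since a swapped latitude or an inverted date range tends to yield an empty or misleading result. The sketch below is one possible check; `validate_search_window` is a hypothetical helper, and it assumes the timestamps use the exact `YYYY-MM-DDTHH:MM:SSZ` format shown above.

```python
from datetime import datetime


def validate_search_window(bbox: list[float], start: str, end: str) -> None:
    """Raise ValueError if the bbox or time range looks malformed (hypothetical helper)."""
    if len(bbox) != 4:
        raise ValueError("bbox must be [min_lon, min_lat, max_lon, max_lat]")
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (-180 <= min_lon <= 180 and -180 <= max_lon <= 180):
        raise ValueError("longitudes must be within [-180, 180]")
    if not (-90 <= min_lat < max_lat <= 90):
        raise ValueError("latitudes must satisfy -90 <= min_lat < max_lat <= 90")
    # min_lon > max_lon is deliberately not rejected: a box that crosses
    # the antimeridian is written that way.
    fmt = "%Y-%m-%dT%H:%M:%SZ"  # assumed timestamp format
    if datetime.strptime(start, fmt) >= datetime.strptime(end, fmt):
        raise ValueError("start_date must be earlier than end_date")


# Passes silently for the values used in this notebook
validate_search_window(
    [-5.14, 41.33, 9.56, 51.09],
    "2025-08-01T00:00:00Z",
    "2025-08-31T23:59:59Z",
)
```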


## Browse Available Collections


In [4]:
# Connect to STAC API
catalog = Client.open(STAC_API_URL)

# List available collections
collections = list(catalog.get_collections())

print(f"\n📚 Available Collections ({len(collections)} total):\n")
for col in collections:
    print(f"  - {col.id}")
    if col.description:
        print(
            f"    {col.description[:100]}..."
            if len(col.description) > 100
            else f"    {col.description}"
        )
    print()


📚 Available Collections (11 total):

  - sentinel-2-l2a
    The Sentinel-2 Level-2A Collection 1 product provides orthorectified Surface Reflectance (Bottom-Of-...

  - sentinel-3-olci-l2-lfr
    The Sentinel-3 OLCI L2 LFR product provides land and atmospheric geophysical parameters computed for...

  - sentinel-3-slstr-l2-lst
    The Sentinel-3 SLSTR Level-2 LST product provides land surface temperature.

  - sentinel-1-l2-ocn
    The Sentinel-1 Level-2 Ocean (OCN) products for wind, wave and currents applications may contain the...

  - sentinel-1-l1-grd
    The Sentinel-1 Level-1 Ground Range Detected (GRD) products consist of focused SAR data that has bee...

  - sentinel-2-l1c
    The Sentinel-2 Level-1C product is composed of 110x110 km2 tiles (ortho-images in UTM/WGS84 projecti...

  - sentinel-1-l1-slc
    The Sentinel-1 Level-1 Single Look Complex (SLC) products consist of focused SAR data, geo-reference...

  - sentinel-3-slstr-l1-rbt
    The Sentinel-3 SLSTR Level-1B RBT

## Select Collection and Search for Items


In [5]:
# Choose the source collection to search
source_collection = "sentinel-2-l2a"  # Change this to your desired collection

# Choose the target collection for processing
target_collection = "sentinel-2-l2a-staging"  # Change this to your target collection

print(f"🔍 Searching collection: {source_collection}")
print(f"🎯 Target collection for processing: {target_collection}")

🔍 Searching collection: sentinel-2-l2a
🎯 Target collection for processing: sentinel-2-l2a-staging


In [11]:
# Search for items
search = catalog.search(
    collections=[source_collection],
    bbox=aoi_bbox,
    datetime=f"{start_date}/{end_date}",  # Adjust as needed
)

# Collect items from the paginated results
items = []
for page in search.pages():
    items.extend(page.items)

print(f"\n✅ Found {len(items)} items matching criteria.\n")

# Display items in a table
if items:
    items_data = []
    for item in items:
        items_data.append(
            {
                "ID": item.id,
                "Collection": item.collection_id,
                "Datetime": item.datetime.isoformat() if item.datetime else "N/A",
                "Self Link": next((link.href for link in item.links if link.rel == "self"), "N/A"),
            }
        )

    df = pd.DataFrame(items_data)
    display(df)
else:
    print("No items found for the specified criteria.")


✅ Found 2739 items matching criteria.



| | ID | Collection | Datetime | Self Link |
|---|---|---|---|---|
| 0 | S2C_MSIL2A_20250831T112131_N0511_R037_T30UXB_2... | sentinel-2-l2a | 2025-08-31T11:21:31.025000+00:00 | https://stac.core.eopf.eodc.eu/collections/sen... |
| 1 | S2C_MSIL2A_20250831T112131_N0511_R037_T30UWV_2... | sentinel-2-l2a | 2025-08-31T11:21:31.025000+00:00 | https://stac.core.eopf.eodc.eu/collections/sen... |
| 2 | S2C_MSIL2A_20250831T112131_N0511_R037_T30UWU_2... | sentinel-2-l2a | 2025-08-31T11:21:31.025000+00:00 | https://stac.core.eopf.eodc.eu/collections/sen... |
| 3 | S2C_MSIL2A_20250831T112131_N0511_R037_T30UWU_2... | sentinel-2-l2a | 2025-08-31T11:21:31.025000+00:00 | https://stac.core.eopf.eodc.eu/collections/sen... |
| 4 | S2C_MSIL2A_20250831T112131_N0511_R037_T30UWB_2... | sentinel-2-l2a | 2025-08-31T11:21:31.025000+00:00 | https://stac.core.eopf.eodc.eu/collections/sen... |
| ... | ... | ... | ... | ... |
| 2734 | S2B_MSIL2A_20250801T102559_N0511_R108_T31TFK_2... | sentinel-2-l2a | 2025-08-01T10:25:59.032000+00:00 | https://stac.core.eopf.eodc.eu/collections/sen... |
| 2735 | S2B_MSIL2A_20250801T102559_N0511_R108_T31TFG_2... | sentinel-2-l2a | 2025-08-01T10:25:59.032000+00:00 | https://stac.core.eopf.eodc.eu/collections/sen... |
| 2736 | S2B_MSIL2A_20250801T102559_N0511_R108_T31TEJ_2... | sentinel-2-l2a | 2025-08-01T10:25:59.032000+00:00 | https://stac.core.eopf.eodc.eu/collections/sen... |
| 2737 | S2B_MSIL2A_20250801T102559_N0511_R108_T31TEH_2... | sentinel-2-l2a | 2025-08-01T10:25:59.032000+00:00 | https://stac.core.eopf.eodc.eu/collections/sen... |
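
With several thousand results, a per-tile count is often easier to review than the raw table. The sketch below reads the tile identifier from each item ID. It assumes the underscore-separated Sentinel-2 ID layout visible in the results, where the sixth field is the tile prefixed with `T`; `tile_of` and `count_by_tile` are hypothetical helpers, not part of this notebook.

```python
from collections import Counter
from typing import Iterable


def tile_of(item_id: str) -> str:
    """Return the tile field (e.g. 'T32UNB') from a Sentinel-2 item ID."""
    # Assumed layout: MISSION_PRODUCT_SENSINGTIME_BASELINE_ORBIT_TILE_DISCRIMINATOR
    fields = item_id.split("_")
    if len(fields) < 6 or not fields[5].startswith("T"):
        raise ValueError(f"Unexpected item ID layout: {item_id}")
    return fields[5]


def count_by_tile(item_ids: Iterable[str]) -> Counter:
    """Count how many item IDs fall on each tile."""
    return Counter(tile_of(item_id) for item_id in item_ids)


# Two IDs of the kind returned by the search in this notebook
sample_ids = [
    "S2B_MSIL2A_20250831T102559_N0511_R108_T32UNB_20250831T143825",
    "S2B_MSIL2A_20250831T102559_N0511_R108_T32UNA_20250831T143825",
]
print(count_by_tile(sample_ids))
```

In the notebook this could be called as `count_by_tile(item.id for item in items).most_common(10)` to list the tiles with the most acquisitions.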


## Submit Items to Pipeline


In [None]:
def submit_item_to_pipeline(item_url: str, target_collection: str) -> bool:
    """
    Submit a single STAC item to the data pipeline via an HTTP POST request.

    Args:
        item_url: The self-link URL of the STAC item
        target_collection: The target collection for processing

    Returns:
        True if the endpoint accepted the request, False otherwise
    """
    try:
        # Create payload
        payload = {
            "source_url": item_url,
            "collection": target_collection,
            "action": "convert-v1-s2",  # specify the action to use the V1 S2 trigger
        }

        # Publish the message with a simple HTTP POST request to localhost:12000/samples
        message = json.dumps(payload)
        response = requests.post(
            "http://localhost:12000/samples",
            data=message,
            headers={"Content-Type": "application/json"},
            timeout=30,  # assumed value; without a timeout a stalled endpoint blocks forever
        )
        # Treat 4xx/5xx responses as failures instead of silently ignoring them
        response.raise_for_status()

        return True

    except requests.RequestException as e:
        print(f"❌ Error submitting item: {e}")
        return False
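
Since `submit_item_to_pipeline` reports failure by returning `False`, transient errors such as a brief connection drop can be retried by a thin wrapper. The following is a minimal sketch under that assumption; `submit_with_retry`, its default of three attempts and its one-second base delay are illustrative choices, not part of the pipeline.

```python
import time
from typing import Callable


def submit_with_retry(
    submit: Callable[[], bool],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call `submit` until it returns True or `attempts` calls have been made."""
    for attempt in range(attempts):
        if submit():
            return True
        if attempt < attempts - 1:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2**attempt)
    return False
```

A call could look like `submit_with_retry(lambda: submit_item_to_pipeline(item_url, target_collection))`. The `sleep` parameter exists only so the delay can be replaced in tests.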

In [28]:
# Submit all found items to the pipeline
if items:
    print(f"\n📤 Submitting {len(items)} items to pipeline...\n")

    success_count = 0
    fail_count = 0

    # Skip the first 100 items
    for item in items[100:]:
        # Get the self link (canonical URL for the item)
        item_url = next((link.href for link in item.links if link.rel == "self"), None)

        if not item_url:
            print(f"⚠️  Skipping {item.id}: No self link found")
            fail_count += 1
            continue

        # Submit to pipeline
        if submit_item_to_pipeline(item_url, target_collection):
            print(f"✅ Submitted: {item.id}")
            success_count += 1
        else:
            print(f"❌ Failed: {item.id}")
            fail_count += 1

    print("\n📊 Summary:")
    print(f"  - Successfully submitted: {success_count}")
    print(f"  - Failed: {fail_count}")
    print(f"  - Total: {len(items)}")
else:
    print("No items to submit.")


📤 Submitting 2739 items to pipeline...

✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32UNB_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32UNA_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32UMV_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32UMU_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32UMB_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32UMA_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32ULV_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32ULU_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32ULB_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32ULA_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32TNT_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0511_R108_T32TNS_20250831T143825
✅ Submitted: S2B_MSIL2A_20250831T102559_N0

KeyboardInterrupt: 
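
The run above was interrupted partway through, and the loop relies on a hand-edited slice to skip items that were already handled. One alternative is to record each submitted item ID in a small progress file so that a rerun skips them automatically. The sketch below illustrates the idea; `submit_resumable` and the progress-file format (one ID per line) are assumptions, not part of the pipeline.

```python
from pathlib import Path
from typing import Callable, Iterable


def submit_resumable(
    ids_and_urls: Iterable[tuple[str, str]],
    submit: Callable[[str], bool],
    progress_file: Path,
) -> tuple[int, int]:
    """Submit items, skipping IDs already listed in progress_file.

    Returns (newly_submitted, skipped).
    """
    done = set(progress_file.read_text().split()) if progress_file.exists() else set()
    submitted = skipped = 0
    with progress_file.open("a") as log:
        for item_id, url in ids_and_urls:
            if item_id in done:
                skipped += 1
                continue
            if submit(url):
                log.write(item_id + "\n")
                log.flush()  # write progress out right away in case the run is interrupted
                done.add(item_id)
                submitted += 1
            # Failed items are not recorded, so the next run tries them again
    return submitted, skipped
```

In the notebook, `ids_and_urls` could be built from `items` and their self links, with `submit` wrapping `submit_item_to_pipeline` for the chosen `target_collection`.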

## Submit Specific Items (Optional)

To submit only specific items rather than every item found, select them manually by index:


In [None]:
# Example: Submit only specific items by index
# Uncomment and modify as needed

# selected_indices = [0, 1, 2]  # Select first 3 items
#
# for idx in selected_indices:
#     if idx < len(items):
#         item = items[idx]
#         item_url = next((link.href for link in item.links if link.rel == "self"), None)
#
#         if item_url:
#             if submit_item_to_pipeline(item_url, target_collection):
#                 print(f"✅ Submitted: {item.id}")
#             else:
#                 print(f"❌ Failed: {item.id}")
#     else:
#         print(f"⚠️  Index {idx} out of range")