# STAC Item Search and Submission to Data Pipeline

This notebook allows operators to:

1. Define an area of interest (AOI) and time range
2. Search for STAC items from the EOPF STAC catalog
3. Submit selected items to the data pipeline for processing via HTTP webhook

## Setup and Imports


In [4]:
import json

import requests
from pystac import Item
from pystac_client import Client

# Try to load .env file if available
# try:
#     from pathlib import Path
#
#     from dotenv import load_dotenv

#     dotenv_path = Path(".env")
#     if dotenv_path.exists():
#         load_dotenv(dotenv_path)
#         print("✅ Loaded credentials from .env file")
#     else:
#         print("ℹ️  No .env file found, will prompt for credentials")
# except ImportError:
#     print("ℹ️  python-dotenv not installed, will prompt for credentials")
#     print("   Install with: pip install python-dotenv")

## Configuration


In [5]:
# STAC API Configuration
STAC_API_URL = "https://stac.core.eopf.eodc.eu/"

# Webhook Configuration
WEBHOOK_URL = "http://localhost:12000/samples"

print("✅ Configuration loaded")

✅ Configuration loaded
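
If the endpoints differ between environments, the two constants could be read from environment variables, with the values above as defaults. The following is a minimal sketch, not part of the original notebook: the helper `load_config` and the variable names `STAC_API_URL` and `WEBHOOK_URL` as environment keys are assumptions made for illustration.

```python
import os


def load_config(env=None):
    """Return endpoint settings, falling back to the notebook's defaults.

    `env` is any mapping of variable names to values; it defaults to os.environ.
    The environment variable names are hypothetical.
    """
    if env is None:
        env = os.environ
    return {
        "STAC_API_URL": env.get("STAC_API_URL", "https://stac.core.eopf.eodc.eu/"),
        "WEBHOOK_URL": env.get("WEBHOOK_URL", "http://localhost:12000/samples"),
    }


config = load_config()
print(f"STAC API: {config['STAC_API_URL']}")
print(f"Webhook:  {config['WEBHOOK_URL']}")
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to check without touching the real environment.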


## Define Area and Time of Interest


In [14]:
# Area of Interest (AOI) - Bounding box: [min_lon, min_lat, max_lon, max_lat]
# Example: Rome area
# aoi_bbox = [12.4, 41.8, 12.6, 42.0]
# Example 2: Majorca area
# aoi_bbox = [2.16, 39.21, 3.82, 39.78]
# Example 3: France Full
# aoi_bbox = [-5.14, 41.33, 9.56, 51.09]
# Example 4: Lagoon From Venice to Trieste
aoi_bbox = [12.0, 44.4, 14.0, 45.0]

# Time range
start_date = "2025-04-01T00:00:00Z"
end_date = "2025-04-28T23:59:59Z"

print(f"Area of Interest: {aoi_bbox}")
print(f"Time Range: {start_date} to {end_date}")

Area of Interest: [12.0, 44.4, 14.0, 45.0]
Time Range: 2025-04-01T00:00:00Z to 2025-04-28T23:59:59Z
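
A swapped coordinate or an inverted time range tends to produce an empty or misleading search result rather than an obvious error, so it may be worth checking the inputs before querying. The sketch below is an optional addition; the helper names `validate_bbox` and `validate_time_range` are hypothetical, and the bounding-box check deliberately rejects boxes that cross the antimeridian.

```python
from datetime import datetime


def validate_bbox(bbox):
    """Check a [min_lon, min_lat, max_lon, max_lat] box given in degrees."""
    if len(bbox) != 4:
        raise ValueError(f"Expected 4 values, got {len(bbox)}")
    min_lon, min_lat, max_lon, max_lat = bbox
    if not (-180 <= min_lon <= 180 and -180 <= max_lon <= 180):
        raise ValueError("Longitudes must be within [-180, 180]")
    if not (-90 <= min_lat <= 90 and -90 <= max_lat <= 90):
        raise ValueError("Latitudes must be within [-90, 90]")
    # Simplification: boxes crossing the antimeridian (min_lon > max_lon) are rejected.
    if min_lon >= max_lon or min_lat >= max_lat:
        raise ValueError("Minimum values must be smaller than maximum values")
    return bbox


def validate_time_range(start, end):
    """Check that two ISO 8601 timestamps (with a trailing 'Z') are in order."""
    # Replace 'Z' so that fromisoformat also works on Python versions before 3.11.
    start_dt = datetime.fromisoformat(start.replace("Z", "+00:00"))
    end_dt = datetime.fromisoformat(end.replace("Z", "+00:00"))
    if start_dt >= end_dt:
        raise ValueError("start_date must be earlier than end_date")
    return start_dt, end_dt


validate_bbox([12.0, 44.4, 14.0, 45.0])
validate_time_range("2025-04-01T00:00:00Z", "2025-04-28T23:59:59Z")
print("AOI and time range look valid")
```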


## Browse Available Collections


In [7]:
# Connect to STAC API
catalog = Client.open(STAC_API_URL)

# List available collections
collections = list(catalog.get_collections())

print(f"\n📚 Available Collections ({len(collections)} total):\n")
for col in collections:
    print(f"  - {col.id}")
    if col.description:
        print(
            f"    {col.description[:100]}..."
            if len(col.description) > 100
            else f"    {col.description}"
        )
    print()


📚 Available Collections (12 total):

  - sentinel-2-l2a
    The Sentinel-2 Level-2A Collection 1 product provides orthorectified Surface Reflectance (Bottom-Of-...

  - sentinel-3-olci-l2-lrr
    The Sentinel-3 OLCI L2 LRR product provides land and atmospheric geophysical parameters computed for...

  - sentinel-3-olci-l2-lfr
    The Sentinel-3 OLCI L2 LFR product provides land and atmospheric geophysical parameters computed for...

  - sentinel-2-l1c
    The Sentinel-2 Level-1C product is composed of 110x110 km2 tiles (ortho-images in UTM/WGS84 projecti...

  - sentinel-3-olci-l1-efr
    The Sentinel-3 OLCI L1 EFR product provides TOA radiances at full resolution for each pixel in the i...

  - sentinel-1-l1-grd
    The Sentinel-1 Level-1 Ground Range Detected (GRD) products consist of focused SAR data that has bee...

  - sentinel-3-slstr-l2-frp
    The Sentinel-3 SLSTR Level-2 FRP product provides global (over land and water) fire radiative power.

  - sentinel-1-l1-slc
    The 



## Select Collection and Search for Items


In [10]:
# Choose the source collection to search
source_collection = "sentinel-2-l1c"  # Change this to your desired collection

# Choose the target collection for processing
target_collection = "sentinel-2-l1c"  # Change this to your target collection

print(f"🔍 Searching collection: {source_collection}")
print(f"🎯 Target collection for processing: {target_collection}")

🔍 Searching collection: sentinel-2-l1c
🎯 Target collection for processing: sentinel-2-l1c


In [15]:
# Search for items
search = catalog.search(
    collections=[source_collection],
    bbox=aoi_bbox,
    datetime=f"{start_date}/{end_date}",  # Adjust as needed
    limit=200,  # Adjust limit as needed
)

# Collect items from the paginated results and clean them (workaround for issue #26)
# Use pages_as_dicts() to get raw JSON before PySTAC parsing
items = []

for page_dict in search.pages_as_dicts():
    for feature in page_dict.get("features", []):
        # Clean assets with missing href before parsing
        if "assets" in feature:
            original_count = len(feature["assets"])
            feature["assets"] = {
                key: asset for key, asset in feature["assets"].items() if "href" in asset
            }
            removed_count = original_count - len(feature["assets"])
            if removed_count > 0:
                item_id = feature.get("id", "unknown")
                # print(f"⚠️  Item {item_id}: Removed {removed_count} asset(s) with missing href")

        # Now parse the cleaned item
        try:
            item = Item.from_dict(feature)
            items.append(item)
        except Exception as e:
            item_id = feature.get("id", "unknown")
            print(f"⚠️  Skipping item {item_id}: {e}")
            continue

print(f"\n✅ Found {len(items)} items (after filtering).\n")


✅ Found 34 items (after filtering).
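
The asset-cleaning step in the loop above works on plain dictionaries, so it can be pulled out into a small function and checked without contacting the STAC API. This is a sketch only: the helper name `drop_assets_without_href` and the sample feature are hypothetical, and unlike the loop above it returns a cleaned copy instead of modifying the feature in place.

```python
def drop_assets_without_href(feature):
    """Return (cleaned_feature, removed_keys) for a raw STAC item dictionary.

    Assets lacking an "href" key are dropped; the input is not mutated.
    """
    if "assets" not in feature:
        return dict(feature), []
    assets = feature["assets"]
    kept = {key: asset for key, asset in assets.items() if "href" in asset}
    removed = sorted(set(assets) - set(kept))
    return {**feature, "assets": kept}, removed


# Hypothetical feature with one valid asset and one asset missing its href
sample_feature = {
    "id": "example-item",
    "assets": {
        "product": {"href": "https://example.com/product.zarr"},
        "broken": {"title": "asset without href"},
    },
}

cleaned, removed = drop_assets_without_href(sample_feature)
print(f"Kept: {sorted(cleaned['assets'])}, removed: {removed}")
```

Returning the removed keys makes it straightforward to log which assets were dropped for each item.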



## Submit Items to Pipeline


In [16]:
def submit_item_to_pipeline(item_url: str, target_collection: str) -> bool:
    """
    Submit a single STAC item to the data pipeline via HTTP webhook.

    Args:
        item_url: The self-link URL of the STAC item
        target_collection: The target collection for processing

    Returns:
        True if successful, False otherwise
    """
    try:
        # Create payload
        payload = {
            "source_url": item_url,
            "collection": target_collection,
            "action": "convert-v1-s2",  # specify the action to use the V1 S2 trigger
        }

        # Submit via HTTP webhook endpoint
        message = json.dumps(payload)
        response = requests.post(
            WEBHOOK_URL,
            data=message,
            headers={"Content-Type": "application/json"},
            timeout=30,  # avoid hanging indefinitely if the webhook is unreachable
        )

        response.raise_for_status()
        return True

    except requests.RequestException as e:
        print(f"❌ Error submitting item: {e}")
        return False
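
Before posting anything, it can help to inspect the exact JSON body the webhook will receive. The sketch below rebuilds the same three-field payload in a standalone helper; `build_payload` and the example item URL are hypothetical and used only for illustration.

```python
import json


def build_payload(item_url, target_collection, action="convert-v1-s2"):
    """Build the webhook payload with the same fields as submit_item_to_pipeline."""
    return {
        "source_url": item_url,
        "collection": target_collection,
        "action": action,
    }


# Hypothetical item URL, for illustration only
example_url = "https://stac.example.com/collections/sentinel-2-l1c/items/EXAMPLE_ITEM"
print(json.dumps(build_payload(example_url, "sentinel-2-l1c"), indent=2))
```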

In [17]:
# Submit all found items to the pipeline
if items:
    print(f"\n📤 Submitting {len(items)} items to pipeline...\n")

    success_count = 0
    fail_count = 0

    for item in items:
        # Get the self link (canonical URL for the item)
        item_url = next((link.href for link in item.links if link.rel == "self"), None)

        if not item_url:
            print(f"⚠️  Skipping {item.id}: No self link found")
            fail_count += 1
            continue

        # Submit to pipeline
        if submit_item_to_pipeline(item_url, target_collection):
            print(f"✅ Submitted: {item.id}")
            success_count += 1
        else:
            print(f"❌ Failed: {item.id}")
            fail_count += 1

    print("\n📊 Summary:")
    print(f"  - Successfully submitted: {success_count}")
    print(f"  - Failed: {fail_count}")
    print(f"  - Total: {len(items)}")
else:
    print("No items to submit.")


📤 Submitting 34 items to pipeline...

✅ Submitted: S2B_MSIL1C_20250427T100559_N0511_R022_T33TUK_20250427T123603
✅ Submitted: S2B_MSIL1C_20250427T100559_N0511_R022_T32TQR_20250427T123603
✅ Submitted: S2B_MSIL1C_20250427T100559_N0511_R022_T32TQQ_20250427T123603
✅ Submitted: S2A_MSIL1C_20250424T101041_N0511_R022_T33TUK_20250424T153018
✅ Submitted: S2A_MSIL1C_20250424T101041_N0511_R022_T32TQR_20250424T153018
✅ Submitted: S2A_MSIL1C_20250424T101041_N0511_R022_T32TQQ_20250424T170815
✅ Submitted: S2A_MSIL1C_20250424T101041_N0511_R022_T32TQQ_20250424T153018
✅ Submitted: S2B_MSIL1C_20250424T100029_N0511_R122_T33TVK_20250424T122548
✅ Submitted: S2B_MSIL1C_20250424T100029_N0511_R122_T33TUK_20250424T122548
✅ Submitted: S2B_MSIL1C_20250424T100029_N0511_R122_T32TQR_20250424T122548
✅ Submitted: S2B_MSIL1C_20250424T100029_N0511_R122_T32TQQ_20250424T122548
✅ Submitted: S2C_MSIL1C_20250422T101051_N0511_R022_T33TUK_20250422T140109
✅ Submitted: S2C_MSIL1C_20250422T101051_N051

## Submit Specific Items (Optional)

If you want to submit only specific items instead of all found items, you can manually select them:


In [None]:
# Example: Submit only specific items by index
# Uncomment and modify as needed

# selected_indices = [0, 1, 2]  # Select first 3 items
#
# for idx in selected_indices:
#     if idx < len(items):
#         item = items[idx]
#         item_url = next((link.href for link in item.links if link.rel == "self"), None)
#
#         if item_url:
#             if submit_item_to_pipeline(item_url, target_collection):
#             print(f"✅ Submitted: {item.id}")
#             else:
#                 print(f"❌ Failed: {item.id}")
#     else:
#         print(f"⚠️  Index {idx} out of range")
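
Selecting by index depends on the order of the search results, which can change between runs. An alternative is to filter on part of the item ID. The sketch below assumes the ID layout seen in the submission output above, where the tile identifier (for example `T33TUK`) is one of the underscore-separated fields; `select_by_tile` is a hypothetical helper, and `SimpleNamespace` objects stand in for real `pystac.Item` instances so the example runs on its own.

```python
from types import SimpleNamespace


def select_by_tile(items, tile_id):
    """Return the items whose ID contains tile_id as an underscore-separated field."""
    return [item for item in items if tile_id in item.id.split("_")]


# Stand-ins for pystac.Item objects; only the `id` attribute is needed here.
demo_items = [
    SimpleNamespace(id="S2B_MSIL1C_20250427T100559_N0511_R022_T33TUK_20250427T123603"),
    SimpleNamespace(id="S2B_MSIL1C_20250427T100559_N0511_R022_T32TQR_20250427T123603"),
    SimpleNamespace(id="S2A_MSIL1C_20250424T101041_N0511_R022_T33TUK_20250424T153018"),
]

for item in select_by_tile(demo_items, "T33TUK"):
    print(item.id)
```

With the real search results, `select_by_tile(items, "T33TUK")` would return a list that can be passed through the same submission loop as before.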