# IBTrACS Tropical Cyclone Data with pystac-monty

This notebook demonstrates how to use pystac-monty to process IBTrACS (International Best Track Archive for Climate Stewardship) tropical cyclone data and visualize it using interactive maps. We'll:

1. Download the IBTrACS tropical cyclone dataset
2. Convert the data to STAC items using pystac-monty
3. Display events on an interactive map
4. Allow selection of events to view related hazards
5. Explore the Monty STAC model and its metadata

Let's begin by importing the necessary libraries.

In [10]:
# Standard library
import tempfile
from datetime import datetime, timedelta

# Third-party libraries (data handling, HTTP, time zones)
import pandas as pd
import pytz
import requests

# pystac-monty
from pystac_monty.extension import MontyExtension
from pystac_monty.geocoding import WorldAdministrativeBoundariesGeocoder
from pystac_monty.sources.ibtracs import IBTrACSDataSource, IBTrACSTransformer

# Import STAC helper functions
import sys
sys.path.append('.')
from stac_helpers import (
    check_stac_api_availability,
    check_collection_exists,
    create_collection_from_file,
    create_collection_fallback,
    add_items_to_collection,
    delete_collection
)

## 1. Download and Process IBTrACS Data

First, let's download the IBTrACS dataset and initialize the data source and transformer.

In [11]:
# Define the IBTrACS dataset URL (using the North Atlantic basin as an example)
ibtracs_url = "https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r00/access/csv/ibtracs.NA.list.v04r00.csv"

# Download the dataset
print(f"Downloading IBTrACS dataset from {ibtracs_url}...")
response = requests.get(ibtracs_url)
response.raise_for_status()  # stop early on an HTTP error instead of parsing an error page
print(f"Downloaded {len(response.content) / (1024*1024):.2f} MB")
csv_text = response.text

# Alternatively, read a previously downloaded copy from disk:
# with open('/home/emathot/Downloads/ibtracs.NA.list.v04r00.csv', 'r') as file:
#     csv_text = file.read()

# Initialize the data source from the CSV text
data_source = IBTrACSDataSource(ibtracs_url, csv_text)

# Initialize the geocoder and the transformer
geocoder = WorldAdministrativeBoundariesGeocoder("../tests/data-files/world-administrative-boundaries.fgb")
transformer = IBTrACSTransformer(data_source, geocoder)

## 2. Create STAC Items from IBTrACS Data

Now, let's transform the IBTrACS data into STAC items.

In [12]:
# Create STAC items
print("Creating STAC items from IBTrACS data...")
all_stac_items = list(transformer.get_stac_items())
print(f"Created {len(all_stac_items)} STAC items")

ERROR:pystac_monty.validators.ibtracs:Invalid SID
ERROR:pystac_monty.validators.ibtracs:Invalid basin code.
ERROR:pystac_monty.sources.ibtracs:Failed to process ibtracs
Traceback (most recent call last):
  File "/home/emathot/Workspace/IFRCGo/pystac-monty/pystac_monty/sources/ibtracs.py", line 79, in get_stac_items
    storm_data = parse_row_data(storm_data)
  File "/home/emathot/Workspace/IFRCGo/pystac-monty/pystac_monty/sources/ibtracs.py", line 75, in parse_row_data
    obj = IBTracsdataValidator(**row)
  File "/home/emathot/Workspace/IFRCGo/pystac-monty/.venv/lib/python3.10/site-packages/pydantic/main.py", line 214, in __init__
    validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
pydantic_core._pydantic_core.ValidationError: 4 validation errors for IBTracsdataValidator
ISO_TIME
  Input should be a valid datetime or date, input is too short [type=datetime_from_date_parsing, input_value=' ', input_type=str]
    For further information visit https

Creating STAC items from IBTrACS data...
Created 3654 STAC items
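
The validation errors logged above mention an invalid `SID`, an invalid basin code, and an `ISO_TIME` whose value is `' '`. One plausible cause, which is an assumption and not something the log confirms, is the units row that IBTrACS CSV files carry directly under the header. The sketch below uses pandas on a trimmed, made-up sample (the column subset and the coordinate values are illustrative) to list rows whose `ISO_TIME` cannot be parsed. `find_unparseable_rows` is a hypothetical helper, not part of pystac-monty.

```python
import io

import pandas as pd

# Trimmed, illustrative sample: a header, a units row, and two track points.
sample_csv = (
    "SID,SEASON,BASIN,NAME,ISO_TIME,LAT,LON\n"
    " ,Year, , , ,degrees_north,degrees_east\n"
    "2023249N12320,2023,NA,LEE,2023-09-05 12:00:00,12.0,-40.0\n"
    "2023249N12320,2023,NA,LEE,2023-09-05 18:00:00,12.5,-41.2\n"
)


def find_unparseable_rows(csv_text):
    """Return (all rows, rows whose ISO_TIME is not a valid timestamp)."""
    # keep_default_na=False keeps the basin code "NA" (North Atlantic) as text;
    # by default pandas would read it as a missing value.
    df = pd.read_csv(io.StringIO(csv_text), dtype=str, keep_default_na=False)
    parsed = pd.to_datetime(
        df["ISO_TIME"].str.strip(), format="%Y-%m-%d %H:%M:%S", errors="coerce"
    )
    return df, df[parsed.isna()]


all_rows, bad_rows = find_unparseable_rows(sample_csv)
good_rows = all_rows.drop(bad_rows.index)
print(bad_rows)
```

On the real file, the same check can be run on the CSV text passed to `IBTrACSDataSource` to see which rows the validator is likely to reject.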


In [13]:
# Separate the STAC items by role
event_items = []
hazard_items = []

for item in all_stac_items:
    # Filter for recent events (e.g., last 5 years)
    if item.datetime and item.datetime.year < datetime.now().year - 5:
        continue

    roles = item.properties.get("roles", [])
    if "event" in roles:
        event_items.append(item)
    elif "hazard" in roles:
        hazard_items.append(item)

print(f"Events: {len(event_items)}, Hazards: {len(hazard_items)}")


Events: 59, Hazards: 3595
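
Each event has many hazard items, one per track point. In the upload log later in this notebook, hazard item IDs look like `<event id>-hazard-<timestamp>`. Assuming that pattern holds for all items (it is inferred from the log, not from a specification), hazards can be grouped under their event with plain string handling. `group_hazards_by_event` is a hypothetical helper, and the third sample ID is made up for illustration.

```python
from collections import defaultdict


def group_hazards_by_event(hazard_ids, separator="-hazard-"):
    """Map each event ID to the hazard IDs that start with it."""
    grouped = defaultdict(list)
    for hazard_id in hazard_ids:
        event_id, found, _timestamp = hazard_id.partition(separator)
        if found:  # skip IDs that do not follow the assumed pattern
            grouped[event_id].append(hazard_id)
    return dict(grouped)


sample_ids = [
    "2021239N17281-hazard-20210826T120000Z",
    "2021239N17281-hazard-20210826T150000Z",
    "2023249N12320-hazard-20230905T120000Z",  # illustrative, not from the log
]
hazards_by_event = group_hazards_by_event(sample_ids)
for event_id, ids in hazards_by_event.items():
    print(event_id, len(ids))
```

With the real data, the input would be `[item.id for item in hazard_items]`.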


## 3. Loading STAC Items into a STAC API using the Transaction API

Now that we have created STAC items from the IBTrACS data, let's load them into a STAC API using the transaction API. The transaction API allows us to create, update, and delete STAC items in a STAC API.

We'll use the predefined collections from the monty-stac-extension examples folder:
- ibtracs-events: For event items
- ibtracs-hazards: For hazard items

If these collections don't exist in the STAC API, we'll create them based on the predefined collection definitions.
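
For orientation, the STAC API Transaction extension exposes item operations as plain HTTP calls: `POST /collections/{collectionId}/items` to create, `PUT /collections/{collectionId}/items/{itemId}` to replace, and `DELETE` on the same item path to remove. The sketch below only builds the requests with the standard library and sends nothing. The function names and the sample item are hypothetical, and the `stac_helpers` functions used in this notebook may be implemented differently.

```python
import json
from urllib.request import Request


def _items_url(api_url, collection_id):
    return f"{api_url.rstrip('/')}/collections/{collection_id}/items"


def build_create_item_request(api_url, collection_id, item_dict):
    """POST a new item to the collection's items endpoint."""
    body = json.dumps(item_dict).encode("utf-8")
    return Request(
        _items_url(api_url, collection_id),
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def build_update_item_request(api_url, collection_id, item_dict):
    """PUT a full replacement for an existing item."""
    body = json.dumps(item_dict).encode("utf-8")
    return Request(
        f"{_items_url(api_url, collection_id)}/{item_dict['id']}",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def build_delete_item_request(api_url, collection_id, item_id):
    """DELETE a single item by ID."""
    return Request(f"{_items_url(api_url, collection_id)}/{item_id}", method="DELETE")


# Illustrative item body; a real item comes from item.to_dict() on a pystac Item.
sample_item = {"type": "Feature", "stac_version": "1.0.0", "id": "example-item"}
api_url = "https://montandon-eoapi-stage.ifrc.org/stac"

create_req = build_create_item_request(api_url, "ibtracs-events", sample_item)
update_req = build_update_item_request(api_url, "ibtracs-events", sample_item)
delete_req = build_delete_item_request(api_url, "ibtracs-events", "example-item")
for req in (create_req, update_req, delete_req):
    print(req.get_method(), req.full_url)
```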

In [14]:
# Define the STAC API endpoint
# Replace with your actual STAC API endpoint
stac_api_url = "https://montandon-eoapi-stage.ifrc.org/stac"

# Define the collection IDs for each type of item
# These match the predefined collections in monty-stac-extension/examples
event_collection_id = "ibtracs-events"
hazard_collection_id = "ibtracs-hazards"

# Define paths to the predefined collection definitions
event_collection_path = "../monty-stac-extension/examples/ibtracs-events/ibtracs-events.json"
hazard_collection_path = "../monty-stac-extension/examples/ibtracs-hazards/ibtracs-hazards.json"

# Check if the STAC API is available
api_available = check_stac_api_availability(stac_api_url)

STAC API is available at https://montandon-eoapi-stage.ifrc.org/stac


In [15]:
# Check if the collections exist and create them if they don't
if api_available:
    # Check and create event collection if needed
    # delete_collection(stac_api_url, event_collection_id)
    event_collection_exists = check_collection_exists(stac_api_url, event_collection_id)
    if not event_collection_exists:
        print(f"\nAttempting to create collection '{event_collection_id}'...")
        event_collection_created = create_collection_from_file(stac_api_url, event_collection_path)
        if not event_collection_created:
            print("Trying fallback method to create event collection...")
            event_collection_created = create_collection_fallback(
                stac_api_url, 
                event_collection_id, 
                "IBTrACS tropical cyclone events processed with pystac-monty",
                ["event", "source"]
            )
        event_collection_exists = event_collection_created
    
    # Check and create hazard collection if needed
    # delete_collection(stac_api_url, hazard_collection_id)
    hazard_collection_exists = check_collection_exists(stac_api_url, hazard_collection_id)
    if not hazard_collection_exists:
        print(f"\nAttempting to create collection '{hazard_collection_id}'...")
        hazard_collection_created = create_collection_from_file(stac_api_url, hazard_collection_path)
        if not hazard_collection_created:
            print("Trying fallback method to create hazard collection...")
            hazard_collection_created = create_collection_fallback(
                stac_api_url, 
                hazard_collection_id, 
                "IBTrACS tropical cyclone hazards processed with pystac-monty",
                ["hazard", "source"]
            )
        hazard_collection_exists = hazard_collection_created
    
    if not (event_collection_exists and hazard_collection_exists):
        print("\nWarning: One or more collections could not be created in the STAC API.")
        print("Some items may not be added to the STAC API.")
else:
    print("STAC API is not available. Skipping collection checks and creation.")

Collection 'ibtracs-events' exists in the STAC API
Collection 'ibtracs-hazards' exists in the STAC API
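
When the predefined collection file is unavailable, the fallback path has to build a collection definition from just an ID, a description, and a list of roles. The following is a minimal sketch of what such a definition could look like, using only the fields a STAC Collection requires. `build_minimal_collection` is hypothetical, the placement of `roles` as a top-level field is an assumption, and the license and extent values are placeholders to be replaced with real ones.

```python
def build_minimal_collection(collection_id, description, roles):
    """Return a bare-bones STAC Collection as a plain dict."""
    return {
        "type": "Collection",
        "stac_version": "1.0.0",
        "id": collection_id,
        "description": description,
        "license": "proprietary",  # placeholder, not the dataset's actual license
        "extent": {
            # Placeholder extents: whole globe, open-ended time range.
            "spatial": {"bbox": [[-180.0, -90.0, 180.0, 90.0]]},
            "temporal": {"interval": [[None, None]]},
        },
        "links": [],
        "roles": roles,  # assumed location for the Monty roles
    }


collection = build_minimal_collection(
    "ibtracs-events",
    "IBTrACS tropical cyclone events processed with pystac-monty",
    ["event", "source"],
)
print(sorted(collection))
```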


In [16]:
%load_ext autoreload
%autoreload 2
# Import STAC helper functions
import sys
sys.path.append('.')
from stac_helpers import (
    check_stac_api_availability,
    check_collection_exists,
    create_collection_from_file,
    create_collection_fallback,
    add_items_to_collection,
    delete_collection
)

# Add the items to their respective collections if the API is available
if api_available:
    if event_collection_exists:
        print("Adding event items to the collection...")
        event_success, event_failed = add_items_to_collection(stac_api_url, event_collection_id, event_items, overwrite=True)
    else:
        print("Skipping adding event items because the collection doesn't exist")
        event_success, event_failed = 0, len(event_items)
    
    if hazard_collection_exists:
        print("\nAdding hazard items to the collection...")
        hazard_success, hazard_failed = add_items_to_collection(stac_api_url, hazard_collection_id, hazard_items, overwrite=True)
    else:
        print("Skipping adding hazard items because the collection doesn't exist")
        hazard_success, hazard_failed = 0, len(hazard_items)
    
    total_success = event_success + hazard_success
    total_failed = event_failed + hazard_failed
    
    print(f"\nSummary: Added {total_success} items successfully, {total_failed} items failed")
else:
    print("Skipping adding items to collections because the API is not available")

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
Adding event items to the collection...
Processing batch 1 of 1 (59 items)
Item 2021239N17281 already exists in the collection
Added 59 items successfully, 0 items failed

Adding hazard items to the collection...
Processing batch 1 of 4 (1000 items)
Item 2021239N17281-hazard-20210826T120000Z already exists in the collection
Item 2021239N17281-hazard-20210826T150000Z already exists in the collection
Item 2021239N17281-hazard-20210826T180000Z already exists in the collection
Item 2021239N17281-hazard-20210826T210000Z already exists in the collection
Item 2021239N17281-hazard-20210827T000000Z already exists in the collection
Item 2021239N17281-hazard-20210827T030000Z already exists in the collection
Item 2021239N17281-hazard-20210827T060000Z already exists in the collection
Item 2021239N17281-hazard-20210827T090000Z already exists in the collection
Item 2021239N17281-hazard-20210827T120000Z already exi
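
The log above shows the 3595 hazard items being sent in 4 batches, the first holding 1000 items. How `add_items_to_collection` batches internally is not shown here, so the sketch below is only an illustration of fixed-size batching that reproduces the same counts. `iter_batches` and the batch size of 1000 are assumptions inferred from the log.

```python
import math


def iter_batches(items, batch_size=1000):
    """Yield (batch number, total batches, batch) for fixed-size slices."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    total = math.ceil(len(items) / batch_size)
    for index in range(total):
        start = index * batch_size
        yield index + 1, total, items[start:start + batch_size]


# Stand-in IDs with the same count as the hazard items above.
hazard_ids = [f"hazard-{n}" for n in range(3595)]
batch_sizes = []
for number, total, batch in iter_batches(hazard_ids):
    print(f"Processing batch {number} of {total} ({len(batch)} items)")
    batch_sizes.append(len(batch))
```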

## 4. Analyze IBTrACS Data

Let's analyze the IBTrACS data to get a better understanding of the tropical cyclones.

In [17]:
# Extract relevant information from event items into a DataFrame
events_df = pd.DataFrame(
    [
        {
            "id": item.id,
            "title": item.properties.get("title", ""),
            "start_datetime": item.properties.get("start_datetime", ""),
            "end_datetime": item.properties.get("end_datetime", ""),
            "description": item.properties.get("description", ""),
            "keywords": ", ".join(item.properties.get("keywords", [])),
            "correlation_id": MontyExtension.ext(item).correlation_id,
            "country_codes": ", ".join(MontyExtension.ext(item).country_codes or []),
            "hazard_codes": ", ".join(MontyExtension.ext(item).hazard_codes or [])
        }
        for item in event_items
    ]
)

# Sort by start_datetime (descending)
events_df = events_df.sort_values(by="start_datetime", ascending=False)

# Display the DataFrame
events_df.head(10)

| | id | title | start_datetime | end_datetime | description | keywords | correlation_id | country_codes | hazard_codes |
|---|---|---|---|---|---|---|---|---|---|
| 58 | 2023321N15278 | Tropical Cyclone NOT_NAMED | 2023-11-16T18:00:00+00:00 | 2023-11-18T00:00:00+00:00 | Tropical Cyclone NOT_NAMED (2023) in the North... | tropical cyclone, tropical depression, NOT_NAM... | 20231116T180000-XYZ-NAT-MET-STO-TRO-001-GCDB | XYZ | MH0057, nat-met-sto-tro, TC |
| 57 | 2023297N11277 | Tropical Cyclone NOT_NAMED | 2023-10-23T12:00:00+00:00 | 2023-10-24T12:00:00+00:00 | Tropical Cyclone NOT_NAMED (2023) in the North... | tropical cyclone, tropical depression, NOT_NAM... | 20231023T120000-NIC-NAT-MET-STO-TRO-001-GCDB | NIC | MH0057, nat-met-sto-tro, TC |
| 56 | 2023292N13309 | Tropical Cyclone TAMMY | 2023-10-18T18:00:00+00:00 | 2023-10-31T18:00:00+00:00 | Tropical Cyclone TAMMY (2023) in the North Atl... | tropical cyclone, hurricane, TAMMY, 2023, Nort... | 20231018T180000-ATG-NAT-MET-STO-TRO-001-GCDB | ATG | MH0057, nat-met-sto-tro, TC |
| 55 | 2023284N10330 | Tropical Cyclone SEAN | 2023-10-10T18:00:00+00:00 | 2023-10-16T18:00:00+00:00 | Tropical Cyclone SEAN (2023) in the North Atla... | tropical cyclone, tropical storm, SEAN, 2023, ... | 20231010T180000-XYZ-NAT-MET-STO-TRO-001-GCDB | XYZ | MH0057, nat-met-sto-tro, TC |
| 54 | 2023271N16316 | Tropical Cyclone RINA | 2023-09-28T06:00:00+00:00 | 2023-10-02T00:00:00+00:00 | Tropical Cyclone RINA (2023) in the North Atla... | tropical cyclone, tropical storm, RINA, 2023, ... | 20230928T060000-XYZ-NAT-MET-STO-TRO-001-GCDB | XYZ | MH0057, nat-met-sto-tro, TC |
| 53 | 2023266N16323 | Tropical Cyclone PHILIPPE | 2023-09-23T06:00:00+00:00 | 2023-10-06T12:00:00+00:00 | Tropical Cyclone PHILIPPE (2023) in the North ... | tropical cyclone, tropical storm, PHILIPPE, 20... | 20230923T060000-XYZ-NAT-MET-STO-TRO-001-GCDB | XYZ | MH0057, nat-met-sto-tro, TC |
| 52 | 2023265N29284 | Tropical Cyclone OPHELIA | 2023-09-21T12:00:00+00:00 | 2023-09-24T18:00:00+00:00 | Tropical Cyclone OPHELIA (2023) in the North A... | tropical cyclone, tropical storm, OPHELIA, 202... | 20230921T120000-USA-NAT-MET-STO-TRO-001-GCDB | USA | MH0057, nat-met-sto-tro, TC |
| 51 | 2023258N14318 | Tropical Cyclone NIGEL | 2023-09-15T06:00:00+00:00 | 2023-09-26T12:00:00+00:00 | Tropical Cyclone NIGEL (2023) in the North Atl... | tropical cyclone, hurricane, NIGEL, 2023, Nort... | 20230915T060000-XYZ-NAT-MET-STO-TRO-001-GCDB | XYZ | MH0057, nat-met-sto-tro, TC |
| 50 | 2023251N15334 | Tropical Cyclone MARGOT | 2023-09-07T12:00:00+00:00 | 2023-09-18T18:00:00+00:00 | Tropical Cyclone MARGOT (2023) in the North At... | tropical cyclone, hurricane, MARGOT, 2023, Nor... | 20230907T120000-XYZ-NAT-MET-STO-TRO-001-GCDB | XYZ | MH0057, nat-met-sto-tro, TC |
| 49 | 2023249N12320 | Tropical Cyclone LEE | 2023-09-05T12:00:00+00:00 | 2023-09-18T18:00:00+00:00 | Tropical Cyclone LEE (2023) in the North Atlan... | tropical cyclone, hurricane, LEE, 2023, North ... | 20230905T120000-CAN-NAT-MET-STO-TRO-001-GCDB | CAN | MH0057, nat-met-sto-tro, TC |
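
The `correlation_id` values in the output above appear to pack several fields into one string: a start timestamp, a country code, an upper-cased hazard code, a sequence number, and a trailing suffix. That reading is inferred from how the segments line up with the `start_datetime`, `country_codes`, and `hazard_codes` columns, not from a specification, so treat the sketch below as an assumption. `split_correlation_id` is a hypothetical helper.

```python
from datetime import datetime, timezone


def split_correlation_id(correlation_id):
    """Split a correlation ID into its apparent components."""
    parts = correlation_id.split("-")
    if len(parts) < 5:
        raise ValueError(f"Unexpected correlation ID layout: {correlation_id!r}")
    # The sample rows show UTC start times, so UTC is assumed here.
    start = datetime.strptime(parts[0], "%Y%m%dT%H%M%S").replace(tzinfo=timezone.utc)
    return {
        "start": start,
        "country_code": parts[1],
        "hazard_code": "-".join(parts[2:-2]).lower(),
        "sequence": parts[-2],
        "suffix": parts[-1],
    }


parsed = split_correlation_id("20231018T180000-ATG-NAT-MET-STO-TRO-001-GCDB")
print(parsed)
```

Applied to the TAMMY row, the pieces match the other columns of that row: the start time, the country code `ATG`, and the hazard code `nat-met-sto-tro`.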


In [18]:
# Extract relevant information from hazard items into a DataFrame
def hazard_row(item):
    """Flatten one hazard item into a dict of DataFrame columns."""
    monty = MontyExtension.ext(item)
    detail = monty.hazard_detail
    return {
        "id": item.id,
        "title": item.properties.get("title", ""),
        "datetime": item.datetime.isoformat() if item.datetime else "",
        "description": item.properties.get("description", ""),
        "correlation_id": monty.correlation_id,
        "country_codes": ", ".join(monty.country_codes or []),
        "hazard_codes": ", ".join(monty.hazard_codes or []),
        "severity_value": detail.severity_value if detail else None,
        "severity_unit": detail.severity_unit if detail else None,
        # In the pystac-monty version used here, accessing detail.pressure raised
        # "AttributeError: 'HazardDetail' object has no attribute 'pressure'",
        # so fall back to None when the attribute is missing.
        "pressure": getattr(detail, "pressure", None),
        "pressure_unit": getattr(detail, "pressure_unit", None),
    }


hazards_df = pd.DataFrame([hazard_row(item) for item in hazard_items])

# Sort by datetime (descending)
hazards_df = hazards_df.sort_values(by="datetime", ascending=False)

# Display the DataFrame
hazards_df.head(10)