# Test Case 3: Earthquake Catalog Data

**Use Case:** Seismologist analyzing earthquake patterns and epicenter distributions

**Dataset:** Preliminary Determination of Epicenters (PDE) - ISF format

**Goals:**
- Explore the ISF format structure
- Test ISF extractor and data profile extraction
- Create visualizations for earthquake analysis


In [None]:
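# Setup
import os
import sys
from pathlib import Path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Add the working directory to the import path so vibe_widget can be imported from it
sys.path.insert(0, str(Path.cwd()))

import vibe_widget as vw
from vibe_widget.data_parser.preprocessor import preprocess_data

# Read the API key from the environment; this is None if the variable is unset
API_KEY = os.getenv("ANTHROPIC_API_KEY")

print("Setup complete!")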


Setup complete!


## Step 1: Exploratory Data Analysis (EDA)


In [7]:
# Load the ISF file
from itertools import islice

data_path = Path("testdata/202501_cat.isf")

print(f"Loading: {data_path}")
print(f"File exists: {data_path.exists()}")
print(f"File size: {data_path.stat().st_size / (1024*1024):.2f} MB")

# Preview the first 30 lines; errors='ignore' drops bytes that are not valid UTF-8
print("\n" + "="*60)
print("File Preview (first 30 lines):")
print("="*60)
with open(data_path, 'r', encoding='utf-8', errors='ignore') as f:
    for line in islice(f, 30):
        print(line.rstrip())


Loading: testdata/202501_cat.isf
File exists: True
File size: 10.10 MB

File Preview (first 30 lines):
BEGIN IMS2.0
MSG_TYPE DATA
MSG_ID 6000PJR6_1 HYDRA
DATA_TYPE BULLETIN IMS1.0:short with ISF2.0 extensions
DATA_TYPE BULLETIN IMS1.0:short with ISF2.0 extensions
Event 6000PJR6 Guam region

   Date       Time        Err   RMS Latitude Longitude  Smaj  Smin  Az Depth   Err Ndef Nst  Gap  mdist  Mdist Qual   Author      OrigID
2025/01/01 00:13:22.65   1.54  0.76  12.7441  143.6244  29.1  12.6 140  35.0f        13   12 160   1.47  76.35 m i se NEIC       2805024

Magnitude  Err Nsta Author      OrigID
mb     4.2 0.2    7 NEIC       2805024

Sta     Dist  EvAz Phase        Time      Tres  Azim AzRes   Slow   SRes Def   SNR       Amp   Per Qual Magnitude    ArrID    Agy   Deploy   Ln Auth  Rep   PCh ACh L
GUMO    1.47  55.1 Pn       00:13:44.500  -2.2                           T__                       m__            BHZIU00     FDSN  IU       00 NEIC  NEIC  BHZ ??? -
GUMO    1.47  55.1 Sn 
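
The preview suggests a simple block structure: an `Event <id> <region>` line, then a fixed-width origin line under the `Date Time ... OrigID` header. The following is a minimal sketch of reading those two line types by hand. The function names are hypothetical and this is not the `vibe_widget` ISF extractor. The column offsets were read off the single origin line shown in the preview, so they are an assumption that should be checked against the ISF specification before use on the whole file.

```python
import re
from datetime import datetime

# Origin line copied from the file preview (split in two only for line length).
SAMPLE_ORIGIN = (
    "2025/01/01 00:13:22.65   1.54  0.76  12.7441  143.6244  29.1  12.6 140"
    "  35.0f        13   12 160   1.47  76.35 m i se NEIC       2805024"
)

# Origin lines start with a date and a time; header and phase lines do not.
ORIGIN_START = re.compile(r"^\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}")


def _to_float(text):
    """Convert a fixed-width field to float, or None if the field is blank."""
    text = text.strip()
    return float(text) if text else None


def parse_event_header(line):
    """Split an 'Event <id> <region>' line into (event_id, region)."""
    parts = line.split(maxsplit=2)
    if len(parts) < 2 or parts[0] != "Event":
        return None
    region = parts[2].strip() if len(parts) == 3 else ""
    return parts[1], region


def parse_origin_line(line):
    """Parse time, epicenter and depth from one origin line.

    Assumed 0-based column slices (taken from the sample line):
    time 0-22, latitude 36-44, longitude 45-54, depth 71-76, fixed-depth flag 76.
    """
    if not ORIGIN_START.match(line):
        return None
    time_text = line[0:22].strip()
    # Tolerate origin times written without a fractional-second part.
    fmt = "%Y/%m/%d %H:%M:%S.%f" if "." in time_text else "%Y/%m/%d %H:%M:%S"
    return {
        "time": datetime.strptime(time_text, fmt),
        "latitude": _to_float(line[36:44]),
        "longitude": _to_float(line[45:54]),
        "depth_km": _to_float(line[71:76]),
        "depth_fixed": line[76:77] == "f",
    }


print(parse_event_header("Event 6000PJR6 Guam region"))
print(parse_origin_line(SAMPLE_ORIGIN))
```

Slicing by column rather than splitting on whitespace keeps the fields aligned when an optional field is blank, as the depth error is in the sample line.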

## Step 2: Test ISF Extractor and Data Profile Extraction


In [8]:
# Create data profile with LLM augmentation
print("Creating data profile...")
profile = preprocess_data(
    data_path,
    api_key=API_KEY,
    augment=True,
)

print("\n" + "="*60)
print("Data Profile Summary")
print("="*60)
print(profile.to_markdown())

Creating data profile...

Data Profile Summary
# Dataset Profile: isf
**Source:** `testdata/202501_cat.isf`

## Overview
- **Shape:** 1230 × 8
- **Completeness:** 57.5%
- **Domain:** Seismology / Earthquake Monitoring
- **Purpose:** This dataset records seismic events (earthquakes) detected globally, likely from an international seismological monitoring network (ISF - International Seismological Federation). It is used for earthquake cataloging, hazard assessment, seismic research, and monitoring tectonic activity patterns.
- **Characteristics:** Time series (frequency: irregular/event-based (continuous monitoring with events occurring at irregular intervals, sub-second precision)), Geospatial (CRS: EPSG:4326)

## Fields

### `event_id` (text)
*Unique identifier assigned to each seismic event by the monitoring network*
- Count: 1,230
- **Potential uses:** Primary key for tracking and referencing specific earthquakes, Linking multiple observations or reports of the same event, Cross-ref
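
The profile reports a completeness of 57.5% but does not show how that figure is computed. One plausible reading, sketched below as an assumption, is the share of non-missing cells across the whole table. The `completeness` helper and the toy DataFrame are hypothetical and are not taken from `vibe_widget` or from the dataset.

```python
import numpy as np
import pandas as pd


def completeness(df):
    """Return the percentage of non-missing cells in a DataFrame."""
    if df.size == 0:
        return 0.0
    # notna() gives a boolean table; its mean is the fraction of filled cells.
    return float(df.notna().to_numpy().mean() * 100)


# Toy table (invented values): 8 cells, 2 of them missing.
toy = pd.DataFrame(
    {
        "event_id": ["a", "b", "c", "d"],
        "magnitude": [4.2, np.nan, 5.1, np.nan],
    }
)
print(f"Completeness: {completeness(toy):.1f}%")  # Completeness: 75.0%
```

Applied per column (`df.notna().mean() * 100`), the same idea shows which fields pull the overall figure down.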

## Step 3: Create Visualization Widget


In [9]:
description = """
Create an interactive earthquake visualization that shows:
- World map with earthquake epicenters colored by magnitude
- Time slider to explore temporal patterns
- Depth visualization
"""

widget = vw.create(
    description,
    data_path,
    api_key=API_KEY,
    context=profile
)


Create an interactive earthquake visualization that shows:
- World map with earthquake epicenters colored by magnitude
- Time slider to explore temporal patterns
- Depth visualization


Data Profile: # Dataset Profile: isf
**Source:** `testdata/202501_cat.isf`

## Overview
- **Shape:** 1230 × 8
- **Completeness:** 57.5%
- **Domain:** Seismology / Earthquake Monitoring
- **Purpose:** This dataset records seismic events (earthquakes) detected globally, likely from an international seismological monitoring network (ISF - International Seismological Federation). It is used for earthquake cataloging, hazard assessment, seismic research, and monitoring tectonic activity patterns.
- **Characteristics:** Time series (frequency: irregular/event-based (continuous monitoring with events occurring at irregular intervals, sub-second precision)), Geospatial (CRS: EPSG:4326)

## Fields

### `event_id` (text)
*Unique identifier assigned to each seismic event by the monitoring network*
- Count: 1,230


<vibe_widget.core.VibeWidget object at 0x118fb37d0>
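
The cell above only echoes the widget's repr, so a static plot is a useful cross-check of the same request. The sketch below draws epicenters colored by magnitude with matplotlib. The `plot_epicenters` helper and the column names `latitude`, `longitude` and `magnitude` are assumptions, since the profile output shown here lists only `event_id`. The single row uses the origin and `mb` values from the file preview.

```python
import matplotlib.pyplot as plt
import pandas as pd


def plot_epicenters(events):
    """Scatter epicenters on a longitude/latitude grid, colored by magnitude.

    `events` is assumed to have 'longitude', 'latitude' and 'magnitude' columns.
    """
    fig, ax = plt.subplots(figsize=(10, 5))
    points = ax.scatter(
        events["longitude"],
        events["latitude"],
        c=events["magnitude"],
        cmap="viridis",
        s=30,
        edgecolors="k",
        linewidths=0.3,
    )
    fig.colorbar(points, ax=ax, label="Magnitude")
    # Plain lat/lon axes (consistent with EPSG:4326); no map projection applied.
    ax.set_xlim(-180, 180)
    ax.set_ylim(-90, 90)
    ax.set_xlabel("Longitude (deg)")
    ax.set_ylabel("Latitude (deg)")
    ax.set_title("Earthquake epicenters")
    return fig, ax


# One event taken from the file preview (Guam region, mb 4.2, depth 35 km).
events = pd.DataFrame(
    {
        "latitude": [12.7441],
        "longitude": [143.6244],
        "depth_km": [35.0],
        "magnitude": [4.2],
    }
)
fig, ax = plot_epicenters(events)
```

Passing `events["depth_km"]` as `c` instead of magnitude gives a rough depth view from the same function.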