# Tornado Explorer — Jupyter Starter

**Interactive tornado map with historical/sample data + live reports.**  
Built with **Python, Pandas, Folium** (`TimestampedGeoJson`) in Jupyter.

---

## What this notebook does
- Loads a **sample dataset** of tornado events for instant preview.
- Optionally loads your **historical CSV** (`tornado_events.csv`) if present.
- Fetches **live tornado reports** from the National Weather Service (NWS) **Local Storm Reports (LSR)** feed via the **Iowa Environmental Mesonet (IEM)**.
- Generates an interactive map with:
  - **Clustered markers** colored by **EF scale**
  - **Time slider** to play events over time
  - **Live layer** (red points) for the most recent tornado LSRs

---

## Data sources
- **Sample / historical data**
  - `tornado_events_sample.csv` (included)
  - `tornado_events.csv` (optional – drop in your NOAA/SPC CSV)
- **Live data (LSR)**
  - IEM Local Storm Reports endpoint:  
    `https://mesonet.agron.iastate.edu/cgi-bin/request/gis/lsr.py`  
    *(We currently fetch the last **7 days**; you can change to 24h by adjusting the `recent` param.)*

> **Attribution:** Contains information from the National Weather Service (NWS) Local Storm Reports distributed by the Iowa Environmental Mesonet (IEM). IEM/NWS are not responsible for any errors or omissions in this derived work.

---

## How to run
1. Run the **installs** cell (safe to re-run).
2. Run the **imports** and **loader** cells.  
   - If `tornado_events.csv` exists, it will be used; otherwise the sample is used.
3. Run **GeoJSON** → **Map** to build the interactive map and save **`tornado_map.html`**.
4. For **live reports**, run the **LSR (live)** cells:
   - `recent = 7*24*3600` → last 7 days (good for demos)  
   - `recent = 86400` → last 24 hours (near real-time)
5. Open **`tornado_map.html`** to view the map outside Jupyter.

---

## Tech
- Python • Pandas • Folium (+ MarkerCluster, TimestampedGeoJson)
- JupyterLab

In [1]:
# One-time installs (safe to re-run)
%pip install folium pandas ipywidgets geopy

Note: you may need to restart the kernel to use updated packages.


In [19]:
import json
from datetime import datetime
import pandas as pd
import folium
from folium.plugins import MarkerCluster, TimestampedGeoJson
from IPython.display import display, HTML

def to_float(x):
    try:
        return float(x)
    except:
        return None

def color_by_mag(mag):
    """Return a color by EF (0–5)."""
    if mag is None: return '#9e9e9e'
    try:
        m = float(str(mag).upper().replace('EF',''))
    except:
        return '#9e9e9e'
    if m <= 0: return '#bde6ff'   # EF0
    if m <= 1: return '#8fd3ff'   # EF1
    if m <= 2: return '#5db9ff'   # EF2
    if m <= 3: return '#ff8c66'   # EF3
    if m <= 4: return '#ff5a36'   # EF4
    return '#d00000'              # EF5+


In [20]:
CSV_PATH = "tornado_events.csv"          
FALLBACK = "tornado_events_sample.csv"   

import os
path = CSV_PATH if os.path.exists(CSV_PATH) else FALLBACK
print("Using CSV:", path)

df = pd.read_csv(path)

# Flexible column picking
cols = {c.lower(): c for c in df.columns}
def pick(*cands):
    for c in cands:
        if c in cols: return cols[c]
    return None

lat_c  = pick('lat','begin_lat','latitude')
lon_c  = pick('lon','begin_lon','longitude')
time_c = pick('time','begin_date_time','event_date','date')
ef_c   = pick('ef','tor_f_scale','ef_scale','mag','magnitude')
state_c= pick('state','state_name')
place_c= pick('place','cz_name','location','city')
id_c   = pick('event_id','objectid','id')

for need, val in [('lat',lat_c),('lon',lon_c),('time',time_c)]:
    if not val:
        raise ValueError(f"Missing required column for {need}. Your CSV needs at least time/lat/lon.")

df = df.rename(columns={lat_c:'lat', lon_c:'lon', time_c:'time'})
if ef_c:    df = df.rename(columns={ef_c:'ef'})
if state_c: df = df.rename(columns={state_c:'state'})
if place_c: df = df.rename(columns={place_c:'place'})
if id_c:    df = df.rename(columns={id_c:'event_id'})

# Clean types
df['lat'] = df['lat'].apply(to_float)
df['lon'] = df['lon'].apply(to_float)
df = df.dropna(subset=['lat','lon'])

df['time'] = pd.to_datetime(df['time'], errors='coerce')
df = df.dropna(subset=['time'])

def ef_to_num(x):
    if pd.isna(x): return None
    s = str(x).strip().upper().replace('EF','')
    try:
        return float(s)
    except:
        return None

df['ef_num'] = df['ef'].apply(ef_to_num) if 'ef' in df.columns else None

print(f"Loaded {len(df)} tornado events:")
display(df.head())


Using CSV: tornado_events_sample.csv
Loaded 20 tornado events:


Unnamed: 0,time,lat,lon,ef,state,place,event_id,ef_num
0,2024-04-27 15:30:00,33.15,-96.82,EF0,TX,Frisco,TX-0001,0.0
1,2024-04-27 16:10:00,33.019,-96.699,EF1,TX,Plano,TX-0002,1.0
2,2024-04-27 17:05:00,32.776,-96.797,EF2,TX,Dallas,TX-0003,2.0
3,2024-05-01 18:20:00,35.467,-97.516,EF3,OK,Oklahoma City,OK-0004,3.0
4,2024-05-02 19:00:00,36.154,-95.992,EF1,OK,Tulsa,OK-0005,1.0


In [21]:
from collections import OrderedDict

features = []
for _, r in df.iterrows():
    iso_time = pd.to_datetime(r['time']).isoformat()
    popup_lines = []
    if 'event_id' in r and pd.notna(r['event_id']):
        popup_lines.append(f"<b>ID:</b> {r['event_id']}")
    if 'place' in r and pd.notna(r['place']):
        popup_lines.append(f"<b>Place:</b> {r['place']}")
    if 'state' in r and pd.notna(r['state']):
        popup_lines.append(f"<b>State:</b> {r['state']}")
    if 'ef' in r and pd.notna(r['ef']):
        popup_lines.append(f"<b>EF:</b> {r['ef']}")
    popup_lines.append(f"<b>Time:</b> {iso_time}")
    popup_html = '<br>'.join(popup_lines)

    feat = OrderedDict([
        ('type','Feature'),
        ('properties', {
            'time': iso_time,
            'popup': popup_html,
            'ef': r.get('ef'),
            'ef_num': r.get('ef_num')
        }),
        ('geometry', {
            'type': 'Point',
            'coordinates': [float(r['lon']), float(r['lat'])]
        })
    ])
    features.append(feat)

geojson = {'type':'FeatureCollection', 'features':features}
print("GeoJSON features:", len(features))


GeoJSON features: 20


In [22]:
center_lat = df['lat'].median()
center_lon = df['lon'].median()
m = folium.Map(location=[center_lat, center_lon], zoom_start=5, tiles='CartoDB positron')

cluster = MarkerCluster().add_to(m)

# Static clustered markers
for feat in geojson['features']:
    lon, lat = feat['geometry']['coordinates']
    p = feat['properties']
    size = 6 + (p['ef_num'] or 0) * 4
    folium.CircleMarker(
        [lat, lon],
        radius=size,
        color=color_by_mag(p['ef_num']),
        fill=True,
        fill_opacity=0.75,
        popup=folium.Popup(p['popup'], max_width=320)
    ).add_to(cluster)

# Time slider
TimestampedGeoJson(
    data=geojson,
    transition_time=200,
    add_last_point=True,
    period='P1D',     # daily steps
    duration='PT1H'   # point visible duration
).add_to(m)

# Simple legend
legend_html = '''
<div style="position: fixed; bottom: 20px; left: 20px; z-index: 9999; background: white; padding: 10px; border: 1px solid #888; font-size: 12px;">
<b>EF Scale</b><br>
<span style="display:inline-block;width:12px;height:12px;background:#bde6ff;border:1px solid #999;"></span> EF0<br>
<span style="display:inline-block;width:12px;height:12px;background:#8fd3ff;border:1px solid #999;"></span> EF1<br>
<span style="display:inline-block;width:12px;height:12px;background:#5db9ff;border:1px solid #999;"></span> EF2<br>
<span style="display:inline-block;width:12px;height:12px;background:#ff8c66;border:1px solid #999;"></span> EF3<br>
<span style="display:inline-block;width:12px;height:12px;background:#ff5a36;border:1px solid #999;"></span> EF4<br>
<span style="display:inline-block;width:12px;height:12px;background:#d00000;border:1px solid #999;"></span> EF5+
</div>
'''
m.get_root().html.add_child(folium.Element(legend_html))

m.save("tornado_samplemap.html")
display(HTML("<b>Saved:</b> tornado_samplemap.html"))
m


In [23]:
# Last 7 days of Local Storm Reports from IEM
import requests, io, pandas as pd

lsr_url = "https://mesonet.agron.iastate.edu/cgi-bin/request/gis/lsr.py"
params = {
    "wfo": "ALL",
    "recent": 7 * 24 * 3600,   # 7 days in seconds
    "fmt": "csv"
}
resp = requests.get(lsr_url, params=params, timeout=60)
resp.raise_for_status()

lsr = pd.read_csv(io.StringIO(resp.text))
print("Fetched LSR rows:", len(lsr))

# --- normalize cols (case-insensitive) ---
cols = {c.lower(): c for c in lsr.columns}
def COL(name): return cols.get(name)

# tornado mask using PHENOMENA=='TO' OR TYPETEXT contains 'tornado'
has_phen = COL('phenomena') is not None
has_type = COL('typetext') is not None
mask = pd.Series(True, index=lsr.index)
if has_phen:
    mask &= lsr[COL('phenomena')].astype(str).str.upper().eq('TO')
if has_type:
    mask |= lsr[COL('typetext')].astype(str).str.contains('TORNADO', case=False, na=False)

lsr_to = lsr[mask].copy()

# time, lat, lon
time_c = COL('valid') or COL('time') or COL('valid_time') or COL('valid_utc')
lat_c  = COL('lat')
lon_c  = COL('lon')
if not (time_c and lat_c and lon_c):
    raise ValueError(f"Missing required columns. Have: {list(lsr.columns)[:20]}")

lsr_to['time'] = pd.to_datetime(lsr_to[time_c], errors='coerce', utc=True)
lsr_to = lsr_to.rename(columns={lat_c: 'lat', lon_c: 'lon'})
lsr_to['lat'] = pd.to_numeric(lsr_to['lat'], errors='coerce')
lsr_to['lon'] = pd.to_numeric(lsr_to['lon'], errors='coerce')
lsr_to = lsr_to.dropna(subset=['lat','lon','time'])

# optional popup fields
city_c   = COL('city')
county_c = COL('county')
state_c  = COL('state')
id_c     = COL('eventid') or COL('id')
remark_c = COL('remark')

print("Tornado LSR rows (7d):", len(lsr_to))
display(lsr_to.head())


Fetched LSR rows: 1571
Tornado LSR rows (7d): 1571


Unnamed: 0,VALID,VALID2,lat,lon,MAG,WFO,TYPECODE,TYPETEXT,CITY,COUNTY,STATE,SOURCE,REMARK,UGC,UGCNAME,QUALIFIER,time
0,202509210230,2025/09/21 02:30,38.0,-81.16,,RLX,F,FLASH FLOOD,2 NNW Oak Hill,Fayette,WV,Public,18 to 20 inches of swift moving water reported...,WVC019,Fayette,,1970-01-01 00:03:22.509210230+00:00
1,202509210309,2025/09/21 03:09,36.11,-95.91,0.75,TSA,H,HAIL,2 SE Tulsa,Tulsa,OK,Public,Report from mPING: Dime (0.75 in.).,OKC143,Tulsa,E,1970-01-01 00:03:22.509210309+00:00
2,202509210311,2025/09/21 03:11,36.16,-95.8,63.0,TSA,G,TSTM WND GST,4 SW Catoosa,Tulsa,OK,Public,,OKC143,Tulsa,M,1970-01-01 00:03:22.509210311+00:00
3,202509210315,2025/09/21 03:15,36.71,-93.62,,SGF,D,TSTM WND DMG,5 N Shell Knob,Barry,MO,Fire Dept/Rescue,A tree was down due to thunderstorm winds on H...,MOC009,Barry,,1970-01-01 00:03:22.509210315+00:00
4,202509210332,2025/09/21 03:32,36.66,-93.85,,SGF,F,FLASH FLOOD,2 SSE Cassville,Barry,MO,Dept of Highways,Highway 86 is closed due to flooding.,MOC009,Barry,,1970-01-01 00:03:22.509210332+00:00


In [24]:
from collections import OrderedDict
import folium
from folium import FeatureGroup
from folium.plugins import TimestampedGeoJson

# Build GeoJSON features
lsr_features = []
for _, r in lsr_to.iterrows():
    iso = pd.to_datetime(r['time']).isoformat()
    bits = []
    if id_c and id_c in r and pd.notna(r[id_c]):         bits.append(f"<b>ID:</b> {r[id_c]}")
    if city_c and city_c in r and pd.notna(r[city_c]):    bits.append(f"<b>City:</b> {r[city_c]}")
    if county_c and county_c in r and pd.notna(r[county_c]): bits.append(f"<b>County:</b> {r[county_c]}")
    if state_c and state_c in r and pd.notna(r[state_c]): bits.append(f"<b>State:</b> {r[state_c]}")
    if remark_c and remark_c in r and pd.notna(r[remark_c]): bits.append(f"<b>Remark:</b> {r[remark_c]}")
    bits.append(f"<b>Time (UTC):</b> {iso}")
    popup_html = "<br>".join(bits)
    lsr_features.append(OrderedDict({
        "type": "Feature",
        "properties": {"time": iso, "popup": popup_html},
        "geometry": {"type": "Point", "coordinates": [float(r["lon"]), float(r["lat"])]}
    }))

lsr_geojson = {"type": "FeatureCollection", "features": lsr_features}
print("Live LSR features:", len(lsr_features))

# Ensure there is a base map 'm'
if "m" not in globals():
    # try to center on LSR data; else center USA
    if len(lsr_to):
        c_lat, c_lon = lsr_to["lat"].median(), lsr_to["lon"].median()
    elif "df" in globals() and len(df):
        c_lat, c_lon = df["lat"].median(), df["lon"].median()
    else:
        c_lat, c_lon = 39.5, -98.35
    m = folium.Map(location=[c_lat, c_lon], zoom_start=5, tiles="CartoDB positron")

# Add a distinct layer
lsr_group = FeatureGroup(name="Live Tornado Reports (last 7 days)").add_to(m)

# Red dots for live reports
for f in lsr_geojson["features"]:
    lon, lat = f["geometry"]["coordinates"]
    folium.CircleMarker(
        [lat, lon],
        radius=6,
        color="#ff0000",
        fill=True,
        fill_opacity=0.9,
        popup=folium.Popup(f["properties"]["popup"], max_width=320),
    ).add_to(lsr_group)

# Time slider for the live layer
if len(lsr_geojson["features"]):
    TimestampedGeoJson(
        data=lsr_geojson,
        transition_time=200,
        add_last_point=True,
        period="PT1H",     # hourly steps
        duration="PT30M",
    ).add_to(m)

folium.LayerControl(collapsed=False).add_to(m)
m.save("tornado_map.html")
m


Live LSR features: 1571
