# 🛩️ Building a Historical Flight Tracker with Python and CesiumJS

## Overview and Motivation

This notebook demonstrates how to build an interactive flight-history visualization, similar to [FlightRadar24](https://www.flightradar24.com/), directly in a Jupyter notebook, using real historical [ADS-B (Automatic Dependent Surveillance-Broadcast)](https://en.wikipedia.org/wiki/Automatic_Dependent_Surveillance%E2%80%93Broadcast) flight data.

## What You'll Build

- 📊 **Process large aviation datasets** using PyArrow (columnar data format)
- 🗺️ **Create 3D visualizations** with CesiumJS and the cesiumjs_anywidget library
- 🎬 **Build time-dynamic animations** using CZML (Cesium Markup Language)

An interactive 3D globe showing:
- ✈️ Real flight trajectories with animated paths
- 🌍 Time-based filtering (select specific dates/times)
- ⏱️ Timeline controls for playback
- 💾 Efficient local caching to minimize downloads

## What is already built

The core visualization and interaction components are provided by the [cesiumjs_anywidget](https://github.com/Alex-PLACET/cesiumjs_anywidget) library, which wraps [CesiumJS](https://cesium.com/platform/cesiumjs/) for use in Jupyter notebooks.
As of February 2026, the widget is still in active development and the code is not very clean (think dirty, vibe-coded JavaScript). The available features don't yet cover everything CesiumJS provides; they are the ones I needed for this demo and other projects, such as:
- CZML data source loading and time-dynamic visualization
- Timeline and clock controls
- Basic camera controls and synchronization with notebook widgets
- Measurement tools
- Data synchronization between Python <=> JavaScript
- ...

## Data Source

For this example, I used [ADS-B flight data](https://en.wikipedia.org/wiki/Automatic_Dependent_Surveillance%E2%80%93Broadcast) from [ADS-B Global History](https://huggingface.co/datasets/alexisplacet/adsblol_globe_history).
The adsb.lol project has its own live map here: https://adsb.lol
Don't hesitate to support them: they are doing great work collecting and sharing aviation data as open data!

The drawback is that they share the data in ~2GB archives composed of compressed JSON files per day, which are not optimal for performance. Therefore, I converted the relevant data into Parquet format and pushed it to HuggingFace Hub for easy access: https://huggingface.co/datasets/alexisplacet/adsblol_globe_history

### The HuggingFace Dataset Details:

**Coverage**: March 3, 2023 to present  
**Format**: Parquet files (optimized columnar storage)  

Each folder at the root level corresponds to one day of data and is named after the ADSBLOL release it was built from (e.g., "v2024.12.28-planes-readsb-prod-0").
Each of these folders contains trace Parquet files, which provide high-precision flight positions and information. Each file groups the flights whose [ICAO 24-bit address](https://en.wikipedia.org/wiki/Aviation_transponder_interrogation_modes#ICAO_24-bit_address) shares the same last 8 bits (e.g., all ICAO addresses ending with 0x1B are in the same file: traces_1B.parquet).
Each folder also contains heatmaps, which provide flight positions at a coarser time granularity (every 10 seconds). They only have the ICAO and the positions of the aircraft at these time intervals.
Each heatmap file covers 30 minutes (e.g., 00_positions.parquet has the positions between 00:00:00 and 00:29:59 UTC, 01_positions.parquet has the positions between 00:30:00 and 00:59:59 UTC, etc.).

**Directory Structure**:
```
v2023.03.15-planes-readsb-prod-0/
├── heatmaps/
│   ├── callsigns.parquet      # Mapping of ICAO to callsigns
│   ├── 00_positions.parquet   # 00:00-00:30 UTC
│   ├── 01_positions.parquet   # 00:30-01:00 UTC
│   └── ...                    # (48 files, one per half-hour)
├── aircraft.parquet           # Static aircraft information (e.g., type, model, owner)
├── traces_00.parquet          # Full traces for ICAOs ending in 00
├── traces_01.parquet          # Full traces for ICAOs ending in 01
└── ...                        # (256 files, partitioned by ICAO)
```


Example:
- v2024.12.28-planes-readsb-prod-0
    - aircraft.parquet
    - traces_00.parquet
    - traces_01.parquet
    - traces_02.parquet
    - ...
    - traces_FF.parquet
    - heatmaps
        - callsigns.parquet
        - 00_positions.parquet
        - 01_positions.parquet
        - ...
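
The naming rules above can be sketched as two small functions. This is only an illustration, not code from the dataset tooling: `heatmap_filename` and `traces_filename` are hypothetical names, and the lowercase hex suffix is an assumption (check the repository listing for the actual casing).

```python
from datetime import datetime


def heatmap_filename(ts: datetime) -> str:
    """Return the half-hour heatmap file covering a UTC time (index 00..47)."""
    index = ts.hour * 2 + (1 if ts.minute >= 30 else 0)
    return f"{index:02d}_positions.parquet"


def traces_filename(icao: bytes) -> str:
    """Return the traces file for a 3-byte ICAO address, keyed by its last 8 bits."""
    # Assumption: lowercase hex suffix; the real files may use uppercase.
    return f"traces_{icao[-1:].hex()}.parquet"


# Example: 00:45 UTC falls in the second half-hour file of the day
example_heatmap = heatmap_filename(datetime(2023, 3, 15, 0, 45))
example_traces = traces_filename(bytes.fromhex("400a3f"))
```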

## Setup and Imports

In [None]:
import math
import os
import random
import time
from datetime import datetime, timedelta
from pathlib import Path
from typing import List, Dict, Final

# Set HuggingFace cache directory next to this notebook.
# HF_HOME is read when huggingface_hub is imported, so it must be set first
# (this only takes effect if huggingface_hub has not been imported yet in this kernel).
notebook_dir = Path(__file__).parent if '__file__' in globals() else Path.cwd()
hf_cache_dir = notebook_dir / "hf_cache"
hf_cache_dir.mkdir(exist_ok=True)
os.environ['HF_HOME'] = str(hf_cache_dir)
print(f"HuggingFace cache directory: {hf_cache_dir}")

import numpy as np
import pyarrow as pa
import pyarrow.compute as pc
import pyarrow.parquet as pq
from huggingface_hub import hf_hub_download
from cesiumjs_anywidget import CesiumWidget

# Hugging Face dataset repository
REPO_ID = "alexisplacet/adsblol_globe_history"
REPO_TYPE = "dataset"

HuggingFace cache directory: /home/alexisp/Dev/cesiumjs_anywidget/examples/hf_cache


## ⚙️ Performance Configuration

Before loading data, let's configure performance parameters:

### Understanding Performance Trade-offs

When working with large aviation datasets, we need to balance:

- **Data Volume** vs **Load Time**: More flights = richer visualization but slower loading
- **Spatial Coverage** vs **Detail**: Wider area = more context but more data to process
- **Temporal Resolution** vs **Smoothness**: More points per flight = smoother animation but higher memory usage
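
To make the last trade-off concrete, one simple way to cap memory usage is to thin each track to a maximum number of points before converting it. The sketch below is an assumption for illustration only: `thin_track` is a hypothetical helper and is not used elsewhere in this notebook.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def thin_track(points: Sequence[T], max_points: int) -> List[T]:
    """Keep at most max_points evenly spaced samples, including first and last."""
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    n = len(points)
    if n <= max_points:
        return list(points)
    # Evenly spaced indices from 0 to n - 1 inclusive
    step = (n - 1) / (max_points - 1)
    return [points[round(i * step)] for i in range(max_points)]


thinned = thin_track(list(range(10)), 4)
```

Fewer points mean a lighter CZML payload, at the cost of straighter, less faithful paths between the remaining samples.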

### Configuration Parameters

In [None]:
# Performance Configuration
class PerformanceConfig:
    """Configuration for performance optimization.
    
    Adjust these parameters based on your use case:
    - Presentation/Demo: MAX_FLIGHTS=1000, USE_SPATIAL_FILTER=True
    - Fast Exploration: MAX_FLIGHTS=200, RADIUS_MULTIPLIER=0.8  
    - Global View: MAX_FLIGHTS=500, USE_SPATIAL_FILTER=False
    """
    
    # Maximum number of flights to display at once
    # Lower = faster rendering, Higher = more comprehensive view
    MAX_FLIGHTS = 500  # Recommended: 100-1000
    
    # Whether to use spatial filtering (limit to visible area)
    USE_SPATIAL_FILTER = True  # Set False to see all flights globally
    
    # Radius multiplier for spatial filtering
    # Higher = larger search area
    RADIUS_MULTIPLIER = 1.0  # Default: 1.0, Larger view: 2.0
    
    # Minimum points required for a flight path
    MIN_PATH_POINTS = 2

config = PerformanceConfig()
print("⚙️  Performance config loaded:")
print(f"   Max flights: {config.MAX_FLIGHTS}")
print(f"   Spatial filtering: {'Enabled' if config.USE_SPATIAL_FILTER else 'Disabled'}")
print(f"   Radius multiplier: {config.RADIUS_MULTIPLIER}x")

⚙️  Performance config loaded:
   Max flights: 500
   Spatial filtering: Enabled
   Radius multiplier: 1.0x
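
As a rough sketch of how these parameters could be applied to a list of candidate aircraft, assuming hypothetical helpers named `limit_flights` and `effective_radius` (they are not part of the notebook or of any library):

```python
import random
from typing import List


def limit_flights(icaos: List[bytes], max_flights: int, seed: int = 0) -> List[bytes]:
    """Deduplicate ICAO codes and keep at most max_flights of them."""
    unique = list(dict.fromkeys(icaos))  # preserves first-seen order
    if len(unique) <= max_flights:
        return unique
    # Random sample so the cap does not always favor the first rows of the file
    return random.Random(seed).sample(unique, max_flights)


def effective_radius(base_radius_m: float, multiplier: float) -> float:
    """Scale the search radius by the configured multiplier."""
    return base_radius_m * multiplier


capped = limit_flights([b"\x40\x0a\x3f", b"\x40\x0a\x3f", b"\x3c\x4b\x1b"], max_flights=500)
radius_m = effective_radius(50_000, 2.0)
```

A fixed seed keeps the selection reproducible between runs; drop it if a different sample on each update is preferable.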


## Helper functions

In [None]:
def get_day_folder(date: datetime) -> str:
    """Get the folder name for a given date.
    
    Example: 2023-03-15 → 'v2023.03.15-planes-readsb-prod-0'
    """
    return f"v{date.strftime('%Y.%m.%d')}-planes-readsb-prod-0"


def icao_bytes_to_hex(icao_bytes: bytes) -> str:
    """Convert ICAO binary (3 bytes) to hex string.
    
    ICAO codes are stored as 3-byte binary for efficiency.
    Example: b'\x40\x0a\x3f' → '400a3f'
    """
    return icao_bytes.hex()

def calculate_view_radius(altitude_m: float) -> float:
    """Calculate reasonable view radius based on camera altitude.
    
    Uses a simple heuristic:
    - Low altitude (<10km): ~50km radius
    - Medium altitude (10-100km): ~200km radius
    - High altitude (>100km): ~500km radius
    """
    if altitude_m < 10000:
        return 50000  # 50 km
    elif altitude_m < 100000:
        return 200000  # 200 km
    else:
        return 500000  # 500 km

def find_nearby_icaos(date: datetime, time_of_day: datetime, center_lat: float, center_lon: float, radius_m: float) -> List[bytes]:
    """Find ICAO codes (as bytes) of flights near a location at a specific time.
    
    Pure PyArrow + NumPy implementation with HuggingFace Hub caching.
    Returns list of ICAO codes as bytes (fixed_size_binary[3]).
    """
    day_folder = get_day_folder(date)
    
    # Calculate half-hour index (0-47)
    half_hour_index = time_of_day.hour * 2 + (1 if time_of_day.minute >= 30 else 0)
    half_hour_str = f"{half_hour_index:02d}"
    
    print("🔍 Looking for flights:")
    print(f"   Date folder: {day_folder}")
    print(f"   Time: {time_of_day.strftime('%Y-%m-%d %H:%M:%S')}")
    print(f"   Half-hour index: {half_hour_index} (file: {half_hour_str}_positions.parquet)")
    
    
    filename: Final[str] = f"{day_folder}/heatmaps/{half_hour_str}_positions.parquet"
    
    try:
        print(f"   Downloading from HuggingFace: {filename}")
        # Download position file using HuggingFace Hub (with automatic caching)
        local_path = hf_hub_download(
            repo_id=REPO_ID,
            filename=filename,
            repo_type=REPO_TYPE
        )
        print(f"✓ File ready: {local_path}")
        
        # Read with PyArrow
        table = pq.read_table(local_path, columns=['icao', 'timestamp', 'lat', 'lon', 'alt'])
        print(f"✓ Loaded position file: {table.num_rows} total positions")
    except Exception as e:
        print(f"❌ Error loading position data: {e}")
        import traceback
        traceback.print_exc()
        return []
    
    # Extract columns as NumPy arrays for vectorized operations
    # ICAO is fixed_size_binary[3] - 3 bytes
    icao_bytes = table['icao'].to_pylist()  # List of bytes objects
    lats = table['lat'].to_numpy()
    lons = table['lon'].to_numpy()
    
    print(f"   Position range: lat [{lats.min():.2f}, {lats.max():.2f}], lon [{lons.min():.2f}, {lons.max():.2f}]")
    print(f"   Search center: lat {center_lat:.2f}, lon {center_lon:.2f}")
    print(f"   Search radius: {radius_m/1000:.1f} km")
    
    # Vectorized distance calculation
    lat_rad = np.radians(lats)
    lon_rad = np.radians(lons)
    center_lat_rad = math.radians(center_lat)
    center_lon_rad = math.radians(center_lon)
    
    delta_lat = lat_rad - center_lat_rad
    delta_lon = lon_rad - center_lon_rad
    
    a = np.sin(delta_lat/2)**2 + np.cos(center_lat_rad) * np.cos(lat_rad) * np.sin(delta_lon/2)**2
    c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1-a))
    distances = 6371000 * c
    
    # Filter by distance
    nearby_mask = distances <= radius_m
    nearby_count = nearby_mask.sum()
    
    # Get nearby ICAOs as bytes, deduplicated (an aircraft usually has several
    # positions in the same half-hour file) while preserving first-seen order
    nearby_icaos = list(dict.fromkeys(
        icao for icao, is_near in zip(icao_bytes, nearby_mask) if is_near
    ))
    
    if nearby_count > 0:
        min_dist = distances[nearby_mask].min()
        max_dist = distances[nearby_mask].max()
        print(f"✓ Found {len(nearby_icaos)} flights ({nearby_count} positions) within radius")
        print(f"   Distance range: {min_dist/1000:.1f} km to {max_dist/1000:.1f} km")
        # Show sample ICAOs as hex strings
        sample_hex = [icao_bytes_to_hex(icao) for icao in nearby_icaos[:5]]
        print(f"   Sample ICAOs: {sample_hex}")
    else:
        print(f"❌ No flights found within {radius_m/1000:.1f} km")
        print(f"   Closest flight: {distances.min()/1000:.1f} km away")
    
    return nearby_icaos


def load_flight_traces(date: datetime, icao_codes: List[bytes], time_window: timedelta = timedelta(minutes=30)) -> pa.Table:
    """Load flight trace data for specific ICAO codes.
    
    100% PyArrow implementation with HuggingFace Hub caching.
    Note: Traces files contain position data but NOT aircraft metadata.
    Aircraft metadata (registration, type, operator) is in separate aircraft.parquet file.
    
    Args:
        icao_codes: List of ICAO codes as bytes (fixed_size_binary[3])
    """
    if not icao_codes:
        print("❌ No ICAO codes provided")
        # Return an empty table with the expected schema
        return pa.schema([
            ('icao', pa.binary(3)),
            ('timestamp', pa.timestamp('us', tz='UTC')),
            ('lat', pa.float32()),
            ('lon', pa.float32()),
            ('altitude', pa.int32()),
        ]).empty_table()
    
    day_folder = get_day_folder(date)
    
    print(f"📊 Loading flight traces for {len(icao_codes)} ICAOs...")
    
    # Group ICAOs by last 2 hex chars
    icao_groups = {}
    for icao_bytes in icao_codes:
        icao_hex = icao_bytes_to_hex(icao_bytes)
        suffix = icao_hex[-2:].lower()
        if suffix not in icao_groups:
            icao_groups[suffix] = []
        icao_groups[suffix].append(icao_bytes)
    
    print(f"   Grouped into {len(icao_groups)} trace files: {list(icao_groups.keys())}")
    
    all_traces = []
    
    for suffix, icao_list in icao_groups.items():
        filename = f"{day_folder}/traces_{suffix}.parquet"
        
        print(f"   Loading traces_{suffix}.parquet ({len(icao_list)} ICAOs)...")
        
        try:
            # Download using HuggingFace Hub (with automatic caching)
            local_path = hf_hub_download(
                repo_id=REPO_ID,
                filename=filename,
                repo_type=REPO_TYPE
            )
            print(f"   ✓ File ready: {local_path}")
            
            # Read with PyArrow - column is 'altitude' not 'alt'
            table = pq.read_table(
                local_path,
                columns=['icao', 'timestamp', 'lat', 'lon', 'altitude']
            )
            print(f"   ✓ File has {table.num_rows} rows")
            
            # Filter using PyArrow compute - convert bytes list to PyArrow array
            icao_array = pa.array(icao_list, type=pa.binary(3))
            mask = pc.is_in(table['icao'], value_set=icao_array)
            filtered_table = table.filter(mask)
            
            print(f"   ✓ Filtered to {filtered_table.num_rows} rows for our ICAOs")
            
            if filtered_table.num_rows > 0:
                all_traces.append(filtered_table)
        except Exception as e:
            print(f"   ❌ Error loading traces_{suffix}.parquet: {e}")
            import traceback
            traceback.print_exc()
    
    if not all_traces:
        print("❌ No trace data loaded")
        # Return an empty table with the expected schema
        return pa.schema([
            ('icao', pa.binary(3)),
            ('timestamp', pa.timestamp('us', tz='UTC')),
            ('lat', pa.float32()),
            ('lon', pa.float32()),
            ('altitude', pa.int32()),
        ]).empty_table()
    
    # Concatenate PyArrow tables
    combined_table = pa.concat_tables(all_traces)
    print(f"✓ Combined {len(all_traces)} files into {combined_table.num_rows} total rows")

    # Show timestamp range
    timestamps = combined_table['timestamp'].to_pylist()
    min_dt = min(timestamps)
    max_dt = max(timestamps)
    print(f"   Timestamp range: {min_dt} to {max_dt}")

    return combined_table
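
The spatial filter in `find_nearby_icaos` relies on the haversine formula, vectorized with NumPy. A scalar version using only the standard library can be handy for spot-checking individual distances. This is a minimal sketch: `haversine_m` is a hypothetical helper, and it uses the same mean Earth radius of 6,371 km as the code above.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, as used in find_nearby_icaos


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    delta_phi = phi2 - phi1
    delta_lambda = math.radians(lon2 - lon1)
    a = (math.sin(delta_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(delta_lambda / 2) ** 2)
    a = min(1.0, a)  # guard against rounding slightly above 1
    return 2 * EARTH_RADIUS_M * math.atan2(math.sqrt(a), math.sqrt(1 - a))


# One degree of latitude is roughly 111 km
one_degree_m = haversine_m(0.0, 0.0, 1.0, 0.0)
```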

## CZML Conversion Functions

### Understanding CZML

CZML (Cesium Markup Language) is a JSON format for describing time-dynamic 3D scenes: https://github.com/AnalyticalGraphicsInc/czml-writer/wiki/CZML-Guid

CZML is a good format to represent time-dynamic entities like aircraft trajectories, as it allows specifying positions, orientations, and properties over time. We will convert our flight data into CZML format for visualization in CesiumJS.
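
To give an idea of the structure, here is a minimal hand-written CZML document as a Python list: a `document` packet followed by one entity whose position is sampled at two times. The entity id, coordinates, and times are made up for illustration.

```python
import json

czml = [
    # The first packet must be the document packet
    {"id": "document", "name": "Minimal example", "version": "1.0"},
    {
        "id": "flight_demo",  # hypothetical entity id
        "availability": "2023-03-15T12:00:00Z/2023-03-15T12:01:00Z",
        "position": {
            # Flat array of samples: [time, lon, lat, height_m, time, lon, lat, height_m, ...]
            "cartographicDegrees": [
                "2023-03-15T12:00:00Z", 2.35, 48.85, 10000.0,
                "2023-03-15T12:01:00Z", 2.45, 48.90, 10100.0,
            ]
        },
        "point": {"pixelSize": 8, "color": {"rgba": [255, 200, 0, 255]}},
    },
]

# CZML is plain JSON, so it serializes directly
czml_json = json.dumps(czml, indent=2)
```

CesiumJS interpolates between the samples, so the point moves smoothly from the first position to the second as the clock advances.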

In [None]:
def generate_random_color() -> List[int]:
    """Generate a random bright color in RGBA format."""
    # Generate bright, saturated colors by ensuring at least one channel is high
    colors = [
        [random.randint(150, 255), random.randint(50, 150), random.randint(50, 150)],
        [random.randint(50, 150), random.randint(150, 255), random.randint(50, 150)],
        [random.randint(50, 150), random.randint(50, 150), random.randint(150, 255)],
        [random.randint(150, 255), random.randint(150, 255), random.randint(50, 150)],
        [random.randint(150, 255), random.randint(50, 150), random.randint(150, 255)],
        [random.randint(50, 150), random.randint(150, 255), random.randint(150, 255)]
    ]
    color = random.choice(colors)
    random.shuffle(color)
    return color + [255]  # Add alpha channel


def traces_to_czml(table: pa.Table) -> List[Dict]:
    """Convert flight trace data to CZML format with a static path and a moving point per flight.
    
    100% PyArrow implementation - creates polyline paths for each flight.
    Each flight gets a random color.
    
    Note: Traces only have position data. For aircraft metadata (registration, type),
    you'd need to join with aircraft.parquet separately.
    
    Args:
        table: PyArrow Table with flight trace data
    
    Returns:
        List of CZML entities (never None, returns empty document if no data)
    """
    # Default empty CZML document
    czml = [{
        "id": "document",
        "name": "Flight Trajectories",
        "version": "1.0"
    }]
    
    # Handle None or invalid input
    if table is None:
        print("⚠️  Warning: Received None table in traces_to_czml")
        return czml
    
    if table.num_rows == 0:
        print("ℹ️  No trace data to convert to CZML")
        return czml
    
    # Group by ICAO to create one entity per flight
    icao_col = table['icao'].to_pylist()  # List of bytes
    unique_icaos = list(set(icao_col))
    
    print(f"Creating CZML for {len(unique_icaos)} unique flights...")
    
    for icao_bytes in unique_icaos:
        icao_hex = icao_bytes_to_hex(icao_bytes)
        
        # Filter table for this ICAO
        mask = pc.equal(table['icao'], icao_bytes)
        flight_data = table.filter(mask)
        
        if flight_data.num_rows < 2:
            continue  # Need at least 2 points for a path
        
        # Sort by timestamp
        sorted_indices = pc.sort_indices(flight_data, sort_keys=[("timestamp", "ascending")])
        flight_data = pc.take(flight_data, sorted_indices)
        
        # Extract data
        lats = flight_data['lat'].to_pylist()
        lons = flight_data['lon'].to_pylist()
        alts_feet = flight_data['altitude'].to_pylist()  # Altitude in feet
        timestamps = flight_data['timestamp'].to_pylist()
        
        # Convert altitude from feet to meters
        alts_meters = [alt * 0.3048 if alt is not None else 10000 for alt in alts_feet]
        
        # Build time-position array for CZML
        # Format: [time1, lon1, lat1, alt1, time2, lon2, lat2, alt2, ...]
        position_array = []
        for lat, lon, alt_m, ts in zip(lats, lons, alts_meters, timestamps):
            # Convert timestamp to an ISO 8601 UTC string
            if hasattr(ts, 'isoformat'):
                time_str = ts.isoformat().replace('+00:00', 'Z')
            else:
                # Numeric Unix timestamp in seconds: format it in UTC (not local time)
                time_str = time.strftime('%Y-%m-%dT%H:%M:%SZ', time.gmtime(ts))
            
            position_array.extend([time_str, lon, lat, alt_m])
        
        # Determine availability (time range)
        start_time = position_array[0]
        end_time = position_array[-4]  # Last timestamp in the array
        
        # Generate random color for this flight
        color = generate_random_color()
        
        # Calculate average altitude for display
        valid_alts = [a for a in alts_meters if a is not None]
        avg_alt = sum(valid_alts) / len(valid_alts) if valid_alts else 10000
        
        entity = {
            "id": f"flight_{icao_hex}",
            "name": icao_hex.upper(),
            "description": f"""<table>
                <tr><td>ICAO:</td><td>{icao_hex.upper()}</td></tr>
                <tr><td>Points:</td><td>{len(lats)}</td></tr>
                <tr><td>Avg Altitude:</td><td>{avg_alt:.0f} m ({avg_alt/0.3048:.0f} ft)</td></tr>
            </table>""",
            "availability": f"{start_time}/{end_time}",
            
            # Time-dynamic position for the moving point
            "position": {
                "epoch": start_time,
                "cartographicDegrees": position_array
            },
            
            # Static polyline showing the full path (polylines don't support time-dynamic positions)
            # Build static position array: [lon1, lat1, alt1, lon2, lat2, alt2, ...]
            "polyline": {
                "positions": {
                    "cartographicDegrees": [coord for lat, lon, alt_m in zip(lats, lons, alts_meters) for coord in (lon, lat, alt_m)]
                },
                "material": {
                    "polylineOutline": {
                        "color": {"rgba": color},
                        "outlineColor": {"rgba": [0, 0, 0, 100]},
                        "outlineWidth": 1
                    }
                },
                "width": 3,
                "clampToGround": False
            },
            
            # Point at current position
            "point": {
                "color": {"rgba": color},
                "pixelSize": 8,
                "outlineColor": {"rgba": [0, 0, 0, 255]},
                "outlineWidth": 2
            }
        }
        
        czml.append(entity)
    
    print(f"✓ Generated CZML with {len(czml)-1} flight trajectories")
    return czml


def positions_to_czml(table: pa.Table) -> List[Dict]:
    """Convert position data from heatmap to CZML format with time-dynamic polyline paths.
    
    Heatmap contains snapshot positions at a specific half-hour.
    Groups positions by ICAO and creates a time-animated path for each aircraft.
    ICAO is fixed_size_binary[3], altitude is in feet.
    100% PyArrow implementation - no pandas!
    """
    from datetime import timezone
    
    if table.num_rows == 0:
        return [{
            "id": "document",
            "name": "Flight Positions",
            "version": "1.0"
        }]
    
    # First, find the global time range from all timestamps
    if 'timestamp' in table.column_names:
        all_timestamps = table['timestamp'].to_pylist()
        valid_timestamps = [ts for ts in all_timestamps if ts is not None]
        if valid_timestamps:
            min_ts = min(valid_timestamps)
            max_ts = max(valid_timestamps)
            # Convert to ISO strings in UTC; timestamps may be Unix seconds or datetime objects
            global_start, global_end = (
                ts.isoformat().replace('+00:00', 'Z') if hasattr(ts, 'isoformat')
                else datetime.fromtimestamp(ts, tz=timezone.utc).isoformat().replace('+00:00', 'Z')
                for ts in (min_ts, max_ts)
            )
        else:
            global_start = None
            global_end = None
    else:
        global_start = None
        global_end = None
    
    # Create document with clock settings for animation
    czml = [{
        "id": "document",
        "name": "Flight Positions",
        "version": "1.0"
    }]
    
    # Add clock settings if we have time data
    if global_start and global_end:
        czml[0]["clock"] = {
            "interval": f"{global_start}/{global_end}",
            "currentTime": global_start,
            "multiplier": 60,  # 60x speed (1 second = 1 minute)
            "range": "LOOP_STOP",
            "step": "SYSTEM_CLOCK_MULTIPLIER"
        }
        print(f"Time range: {global_start} to {global_end}")
    
    # Group by ICAO to create one entity per aircraft
    icao_col = table['icao'].to_pylist()  # List of bytes
    unique_icaos = list(set(icao_col))
    
    print(f"Creating time-dynamic CZML paths for {len(unique_icaos)} unique flights from heatmap...")
    
    for icao_bytes in unique_icaos:
        icao_hex = icao_bytes_to_hex(icao_bytes)
        
        # Filter table for this ICAO
        mask = pc.equal(table['icao'], icao_bytes)
        flight_positions = table.filter(mask)
        
        # Skip if no valid data
        if flight_positions.num_rows == 0:
            continue
        
        # Sort by timestamp if available
        if 'timestamp' in flight_positions.column_names:
            sorted_indices = pc.sort_indices(flight_positions, sort_keys=[("timestamp", "ascending")])
            flight_positions = pc.take(flight_positions, sorted_indices)
        
        # Extract data for this flight
        lats = flight_positions['lat'].to_pylist()
        lons = flight_positions['lon'].to_pylist()
        alts_feet = flight_positions['alt'].to_pylist() if 'alt' in flight_positions.column_names else [None] * flight_positions.num_rows
        timestamps = flight_positions['timestamp'].to_pylist() if 'timestamp' in flight_positions.column_names else [None] * flight_positions.num_rows
        
        # Convert altitudes from feet to meters
        alts_meters = [(alt * 0.3048) if alt is not None and alt > 0 else 10000 for alt in alts_feet]
        
        # Filter out invalid positions and build position array
        valid_data = []
        for lon, lat, alt_m, ts in zip(lons, lats, alts_meters, timestamps):
            # Filter out invalid positions (0, 0)
            if lat != 0 or lon != 0:
                valid_data.append((lon, lat, alt_m, ts))
        
        # Skip if not enough valid positions
        if len(valid_data) < 2:
            continue
        
        # Check if we have timestamps
        has_timestamps = all(ts is not None for _, _, _, ts in valid_data)
        
        if has_timestamps:
            # Build time-dynamic position array
            # Format: [time1, lon1, lat1, alt1, time2, lon2, lat2, alt2, ...]
            position_array = []
            for lon, lat, alt_m, ts in valid_data:
                # Convert timestamp to ISO string (use UTC timezone-aware)
                if isinstance(ts, (int, float)):
                    # Unix timestamp in seconds
                    time_str = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat().replace('+00:00', 'Z')
                elif hasattr(ts, 'isoformat'):
                    # Already a datetime object
                    time_str = ts.isoformat().replace('+00:00', 'Z')
                else:
                    time_str = str(ts)
                
                position_array.extend([time_str, lon, lat, alt_m])
            
            # Get time range for availability
            start_time = position_array[0]
            end_time = position_array[-4]  # Last timestamp in the array
        else:
            # No timestamps - create static positions
            position_array = []
            for lon, lat, alt_m, _ in valid_data:
                position_array.extend([lon, lat, alt_m])
        
        # Calculate average altitude for display
        valid_alts = [a for a in alts_feet if a is not None and a > 0]
        avg_alt_feet = sum(valid_alts) / len(valid_alts) if valid_alts else 0
        avg_alt_meters = avg_alt_feet * 0.3048
        
        # Generate random color for this flight
        color = generate_random_color()
        
        # Build entity
        entity = {
            "id": f"flight_{icao_hex}",
            "name": icao_hex.upper(),
            "description": f"""<table>
                <tr><td>ICAO:</td><td>{icao_hex.upper()}</td></tr>
                <tr><td>Positions:</td><td>{len(valid_data)} points</td></tr>
                <tr><td>Avg Altitude:</td><td>{avg_alt_meters:.0f} m ({avg_alt_feet:.0f} ft)</td></tr>
            </table>"""
        }
        
        if has_timestamps:
            # Time-dynamic entity
            entity["availability"] = f"{start_time}/{end_time}"
            
            # Time-dynamic position for the moving point
            entity["position"] = {
                "epoch": start_time,
                "cartographicDegrees": position_array
            }
            
            # Static polyline showing the full path (polylines don't support time-dynamic positions)
            # Build static position array: [lon1, lat1, alt1, lon2, lat2, alt2, ...]
            static_positions = []
            for lon, lat, alt_m, _ in valid_data:
                static_positions.extend([lon, lat, alt_m])
            
            entity["polyline"] = {
                "positions": {
                    "cartographicDegrees": static_positions
                },
                "material": {
                    "polylineOutline": {
                        "color": {"rgba": color},
                        "outlineColor": {"rgba": [0, 0, 0, 100]},
                        "outlineWidth": 1
                    }
                },
                "width": 3,
                "clampToGround": False
            }
            
            # Moving point at current position
            entity["point"] = {
                "color": {"rgba": color},
                "pixelSize": 8,
                "outlineColor": {"rgba": [0, 0, 0, 255]},
                "outlineWidth": 2
            }
        else:
            # Static entity (no timestamps)
            last_lon, last_lat, last_alt_m, _ = valid_data[-1]
            
            entity["position"] = {
                "cartographicDegrees": [last_lon, last_lat, last_alt_m]
            }
            
            entity["polyline"] = {
                "positions": {
                    "cartographicDegrees": position_array
                },
                "material": {
                    "polylineOutline": {
                        "color": {"rgba": color},
                        "outlineColor": {"rgba": [0, 0, 0, 100]},
                        "outlineWidth": 1
                    }
                },
                "width": 3,
                "clampToGround": False
            }
            
            entity["point"] = {
                "color": {"rgba": color},
                "pixelSize": 8,
                "outlineColor": {"rgba": [0, 0, 0, 255]},
                "outlineWidth": 2
            }
        
        czml.append(entity)
    
    print(f"✓ Generated CZML with {len(czml)-1} flight paths")
    return czml
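
Both conversion functions turn timestamps into ISO 8601 strings ending in `Z`. That logic could be factored into one helper that accepts either Unix seconds or `datetime` objects. The sketch below is one possible shape for it: `to_czml_time` is a hypothetical name, and treating naive datetimes as UTC is an assumption.

```python
from datetime import datetime, timedelta, timezone
from typing import Union


def to_czml_time(ts: Union[int, float, datetime]) -> str:
    """Format Unix seconds or a datetime as an ISO 8601 UTC string ending in 'Z'."""
    if isinstance(ts, datetime):
        if ts.tzinfo is None:
            # Assumption: naive datetimes are already expressed in UTC
            dt = ts.replace(tzinfo=timezone.utc)
        else:
            dt = ts.astimezone(timezone.utc)
    else:
        dt = datetime.fromtimestamp(ts, tz=timezone.utc)
    return dt.isoformat().replace("+00:00", "Z")


# A time given in UTC+1 is normalized to UTC
paris_winter = timezone(timedelta(hours=1))
example = to_czml_time(datetime(2023, 3, 15, 13, 0, tzinfo=paris_winter))
```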


## Flight Data Manager

In [None]:
class FlightDataManager:
    """Manages flight data loading and updates based on camera position and time.
    
    100% PyArrow implementation - stores position data as PyArrow Table.
    """
    
    def __init__(self, widget: CesiumWidget, initial_date: datetime):
        self.widget = widget
        self.current_date = initial_date
        self.last_update_time = 0
        self.current_positions = None  # Will be a PyArrow Table
        
    def update_data(self, camera_lat: float, camera_lon: float, camera_alt: float, current_time: datetime | None = None):
        """Update flight data based on camera position and time."""

        # Use provided time or default to current date at noon
        if current_time is None:
            current_time = self.current_date.replace(hour=12, minute=0)
        
        # Calculate view radius
        radius = calculate_view_radius(camera_alt)
        
        print("\nUpdating data...")
        print(f"  Location: ({camera_lat:.2f}, {camera_lon:.2f})")
        print(f"  Altitude: {camera_alt/1000:.1f} km")
        print(f"  Radius: {radius/1000:.1f} km")
        print(f"  Time: {current_time}")
        
        # Get position data from heatmap
        try:
            # Load heatmap data for the current time
            day_folder = get_day_folder(self.current_date)
            half_hour_index = current_time.hour * 2 + (1 if current_time.minute >= 30 else 0)
            half_hour_str = f"{half_hour_index:02d}"
            filename = f"{day_folder}/heatmaps/{half_hour_str}_positions.parquet"
            
            print(f"  Loading heatmap: {filename}")
            local_path = hf_hub_download(
                repo_id=REPO_ID,
                filename=filename,
                repo_type=REPO_TYPE
            )
            
            # Read position data INCLUDING timestamp for animation
            table = pq.read_table(local_path, columns=['icao', 'timestamp', 'lat', 'lon', 'alt'])
            print(f"  ✓ Loaded {table.num_rows} positions")
            
            # Filter by radius (the binary(3) 'icao' column is not needed for this step)
            lats = table['lat'].to_numpy()
            lons = table['lon'].to_numpy()
            
            # Vectorized distance calculation
            lat_rad = np.radians(lats)
            lon_rad = np.radians(lons)
            center_lat_rad = math.radians(camera_lat)
            center_lon_rad = math.radians(camera_lon)
            
            delta_lat = lat_rad - center_lat_rad
            delta_lon = lon_rad - center_lon_rad
            
            a = np.sin(delta_lat/2)**2 + np.cos(center_lat_rad) * np.cos(lat_rad) * np.sin(delta_lon/2)**2
            c = 2 * np.arctan2(np.sqrt(a), np.sqrt(1-a))
            distances = 6371000 * c
            
            # Filter by distance
            nearby_mask = distances <= radius
            filtered_table = table.filter(pa.array(nearby_mask))
            
            print(f"  ✓ Found {filtered_table.num_rows} flights within {radius/1000:.1f} km")
            
            if filtered_table.num_rows > 0:
                self.current_positions = filtered_table
                
                # Convert to CZML and update widget
                czml = positions_to_czml(filtered_table)
                
                print(f"  Loading {len(czml)-1} flight positions into viewer...")
                self.widget.load_czml(czml)
                print("  ✓ Update complete")
            else:
                print("  No flights found in this area/time")
                
        except Exception as e:
            print(f"  Error updating data: {e}")
            import traceback
            traceback.print_exc()
    
    def clear_data(self):
        """Clear all loaded flight data."""
        self.current_positions = None
        self.widget.clear_czml()
        print("Cleared all flight data")
    
    def change_date(self, new_date: datetime):
        """Change the active date and reload data."""
        self.current_date = new_date
        self.clear_data()
        print(f"Changed date to {new_date.date()}")
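
The radius filter inside `update_data` can be tried on its own. Below is a minimal sketch, assuming the same mean Earth radius of 6,371 km and NumPy arrays of latitudes and longitudes in degrees. `within_radius_mask` is a hypothetical helper name, not part of the notebook or of cesiumjs_anywidget.

```python
import math

import numpy as np

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, same constant as in update_data


def within_radius_mask(lats, lons, center_lat, center_lon, radius_m):
    """Return a boolean mask of points within radius_m of the center (hypothetical helper)."""
    lat_rad = np.radians(np.asarray(lats, dtype=float))
    lon_rad = np.radians(np.asarray(lons, dtype=float))
    center_lat_rad = math.radians(center_lat)
    center_lon_rad = math.radians(center_lon)

    # Haversine formula, vectorized over all points
    a = (
        np.sin((lat_rad - center_lat_rad) / 2) ** 2
        + np.cos(center_lat_rad) * np.cos(lat_rad)
        * np.sin((lon_rad - center_lon_rad) / 2) ** 2
    )
    # Clipping guards against floating-point overshoot outside [0, 1]
    a = np.clip(a, 0.0, 1.0)
    distances = EARTH_RADIUS_M * 2 * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return distances <= radius_m


# Toy data: Paris, Versailles (under 20 km away), London (over 300 km away)
lats = [48.8566, 48.8049, 51.5074]
lons = [2.3522, 2.1204, -0.1278]
mask = within_radius_mask(lats, lons, 48.8566, 2.3522, radius_m=200_000)
print(mask)
```

The resulting mask can be passed to `table.filter(pa.array(mask))`, as `update_data` does, so only the distance computation leaves PyArrow.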


## Initialize Widget

In [None]:
# Initial date: March 15, 2023 (known to have data)
initial_date = datetime(2023, 3, 15)

# Create widget centered on Paris
widget = CesiumWidget(
    latitude=48.8566,
    longitude=2.3522,
    altitude=50000,  # 50km altitude for good overview
    heading=0,
    pitch=-45,
    roll=0,
    height="800px",
    enable_terrain=False,
    enable_lighting=True,
    show_timeline=True,  # Enable timeline for playback
    animation=True,
    current_time=initial_date.isoformat() + 'Z'
)

print("Widget created. View centered on Paris.")
print(f"Date: {initial_date.date()}")
print("\nControls:")
print("  - Pan/zoom to explore different regions")
print("  - Use timeline to scrub through time")

Widget created. View centered on Paris.
Date: 2023-03-15

Controls:
  - Pan/zoom to explore different regions
  - Use timeline to scrub through time
  - Data will load automatically based on your view
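
The widget's `current_time` is built by appending `'Z'` to a naive `datetime`, which is only correct if that value already represents UTC. Here is a minimal sketch of a more defensive conversion, assuming naive datetimes in this notebook are meant as UTC. `to_iso_utc` is a hypothetical helper, not part of cesiumjs_anywidget.

```python
from datetime import datetime, timedelta, timezone


def to_iso_utc(moment: datetime) -> str:
    """Format a datetime as ISO 8601 with a trailing 'Z' (hypothetical helper)."""
    if moment.tzinfo is None:
        # Assumption: naive datetimes already represent UTC
        moment = moment.replace(tzinfo=timezone.utc)
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


naive = datetime(2023, 3, 15)
paris_winter = datetime(2023, 3, 15, 1, 0, tzinfo=timezone(timedelta(hours=1)))

print(to_iso_utc(naive))         # naive input, treated as UTC
print(to_iso_utc(paris_winter))  # aware input, converted to UTC first
```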


## Setup Data Manager

In [None]:
# Create data manager
data_manager = FlightDataManager(widget, initial_date)



## Display widget on a side panel

In [35]:
from sidecar import Sidecar
sc = Sidecar(title='Flight radar')
with sc:
    display(widget)

## Data Loading

In [None]:

data_manager.update_data(
    widget.latitude,
    widget.longitude,
    widget.altitude,
    initial_date.replace(hour=15, minute=0)  # 15:00 UTC
)



Updating data...
  Location: (48.86, 2.35)
  Altitude: 50.0 km
  Radius: 200.0 km
  Time: 2023-03-15 15:00:00
  Loading heatmap: v2023.03.15-planes-readsb-prod-0/heatmaps/30_positions.parquet
  ✓ Loaded 281813 positions
  ✓ Found 3488 flights within 200.0 km
Time range: 2023-03-15T15:00:00Z to 2023-03-15T15:29:30Z
Creating time-dynamic CZML paths for 161 unique flights from heatmap...
✓ Generated CZML with 153 flight paths
  Loading 153 flight positions into viewer...
  ✓ Update complete
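
In the log above, 15:00 resolves to `30_positions.parquet`. The indexing in `update_data` implies 48 half-hour slots per day, numbered from `00`. Here is a minimal sketch of that mapping, mirroring the same arithmetic. `half_hour_filename` is a hypothetical helper name, and the day folder is passed in rather than derived.

```python
from datetime import datetime


def half_hour_filename(day_folder: str, moment: datetime) -> str:
    """Build the heatmap path for the half-hour slot containing `moment` (hypothetical helper)."""
    index = moment.hour * 2 + (1 if moment.minute >= 30 else 0)  # 0..47
    return f"{day_folder}/heatmaps/{index:02d}_positions.parquet"


folder = "v2023.03.15-planes-readsb-prod-0"  # folder name taken from the log output
print(half_hour_filename(folder, datetime(2023, 3, 15, 15, 0)))
print(half_hour_filename(folder, datetime(2023, 3, 15, 0, 45)))
```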


## Interactive Date Selector

Let's create a date picker and an hour slider to select which day and time of day to load:

In [11]:
import ipywidgets as widgets
from IPython.display import display

# Date picker
date_picker = widgets.DatePicker(
    description='Flight Date:',
    value=initial_date.date(),
    disabled=False
)

# Time slider (hour of day)
time_slider = widgets.IntSlider(
    value=12,
    min=0,
    max=23,
    step=1,
    description='Hour (UTC):',
    continuous_update=False
)

# Update button
update_button = widgets.Button(
    description='Load Data',
    button_style='primary',
    icon='download'
)

# Clear button
clear_button = widgets.Button(
    description='Clear All',
    button_style='warning',
    icon='trash'
)

status_label = widgets.Label(value=f'Ready. Current date: {initial_date.date()}')

def on_update_clicked(b):
    # The picker's value is None when no date is selected
    if date_picker.value is None:
        status_label.value = 'Please select a date first'
        return

    new_date = datetime.combine(date_picker.value, datetime.min.time())
    new_time = new_date.replace(hour=time_slider.value)
    
    if new_date.date() != data_manager.current_date.date():
        data_manager.change_date(new_date)
    
    status_label.value = f'Loading data for {new_time}...'
    data_manager.update_data(
        widget.latitude,
        widget.longitude,
        widget.altitude,
        new_time
    )
    # current_positions holds one row per position report, not per aircraft
    positions = data_manager.current_positions
    position_count = positions.num_rows if positions is not None else 0
    status_label.value = f'Loaded {position_count} positions'

def on_clear_clicked(b):
    data_manager.clear_data()
    status_label.value = 'Cleared all data'

update_button.on_click(on_update_clicked)
clear_button.on_click(on_clear_clicked)

controls = widgets.VBox([
    widgets.HBox([date_picker, time_slider]),
    widgets.HBox([update_button, clear_button]),
    status_label
])

display(controls)

VBox(children=(HBox(children=(DatePicker(value=datetime.date(2023, 3, 15), description='Flight Date:', step=1)…
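
The click handler turns the picker's date and the slider's hour into a single `datetime`. That step can be sketched and tested without any UI. This is a minimal illustration, assuming the picker yields `None` when no date is selected. `selection_to_datetime` is a hypothetical helper name.

```python
from datetime import date, datetime, time
from typing import Optional


def selection_to_datetime(selected: Optional[date], hour: int) -> Optional[datetime]:
    """Combine a picked date and an hour of day; None if no date is picked (hypothetical helper)."""
    if selected is None:
        return None
    return datetime.combine(selected, time(hour=hour))


print(selection_to_datetime(date(2023, 3, 15), 15))
print(selection_to_datetime(None, 15))
```

Keeping this logic in a plain function makes it easy to check edge cases, such as hour 0 or hour 23, before wiring it to the widgets.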

# About the author

My name is [Alexis Placet](https://www.linkedin.com/in/alexisplacet/) and I'm a software engineer and open-source enthusiast. I'm currently working at [QuantStack](https://www.quantstack.com/), where we build tools for data science and scientific computing.
