# Hurricane Tweet Intensity â†’ Time-Aware Rasters

**Purpose**: Convert geolocated tweets for Francine & Helene into ArcGIS-native time-enabled rasters (CRF) and Space-Time Cubes (.nc) for animation and Space-Time Pattern Mining.

**Strategy**: Fuse tweet point events with geographic context layers (states, counties, cities) using spatial proximity and density-based weighting to create continuous intensity surfaces. Each time bin produces a raster slice; iterative mode shows per-bin activity, cumulative mode shows accumulated activity from start to current bin.

**Environment**: ArcGIS Pro Python (arcpy only, no external dependencies)

**Outputs per hurricane**:
- Iterative CRF + Space-Time Cube
- Cumulative CRF + Space-Time Cube
- GeoTIFF stack + manifest table

## 1. Configuration (Centralized)

In [25]:
import os
import json
from datetime import datetime, timezone, timedelta
from pathlib import Path

# ========== PROJECT PATHS ==========
PROJECT_ROOT = Path(r"C:\Users\colto\Documents\GitHub\Tweet_project")
DATA_ROOT = PROJECT_ROOT / "data"

INPUTS = {
    "cities": DATA_ROOT / "tables" / "cities1000.csv",
    "counties": DATA_ROOT / "shape_files" / "cb_2023_us_county_20m.shp",
    "states": DATA_ROOT / "shape_files" / "cb_2023_us_state_20m.shp",
    "francine": DATA_ROOT / "geojson" / "francine.geojson",
    "helene": DATA_ROOT / "geojson" / "helene.geojson"
}

# ========== RASTER PARAMETERS ==========
CELL_SIZE_KM = 10  # Spatial resolution (10 km)
TIME_BIN_HOURS = 6  # Temporal resolution (6-hour bins)
CRS_EPSG = 5070  # NAD83 Conus Albers (meters)
NODATA_VALUE = -9999

# ========== OUTPUT PARAMETERS ==========
OUTPUT_ROOT = PROJECT_ROOT / "outputs"
OUTPUT_GDB = OUTPUT_ROOT / "rasters.gdb"
EVENTS = ["francine", "helene"]
MODES = ["iter", "cum"]  # iterative, cumulative

# ========== SAMPLE/DEBUG MODE ==========
# Set SAMPLE_MODE = True to run a quick test with limited data
SAMPLE_MODE = True  # Change to False for full pipeline
SAMPLE_NUM_SLICES = 4  # Number of time slices to generate in sample mode
SAMPLE_EVENT = "francine"  # Which event to use for sample run

# ========== PROCESSING PARAMETERS ==========
# Fusion weights: balance tweet density with geographic context
WEIGHTS = {
    "tweet_density": 0.6,  # Primary signal from tweet points
    "city_proximity": 0.2,  # Secondary: distance to populated places
    "admin_context": 0.2   # Tertiary: state/county boundaries
}

# Search radius for density calculation (meters in EPSG:5070)
SEARCH_RADIUS_M = CELL_SIZE_KM * 1000 * 2  # 2x cell size

print(f"Configuration loaded:")
print(f"  Cell size: {CELL_SIZE_KM} km")
print(f"  Time bin: {TIME_BIN_HOURS} hours")
print(f"  CRS: EPSG:{CRS_EPSG}")
print(f"  Output GDB: {OUTPUT_GDB}")
if SAMPLE_MODE:
    print(f"\nâš¡ SAMPLE MODE ENABLED âš¡")
    print(f"  Running {SAMPLE_EVENT} only with {SAMPLE_NUM_SLICES} slices per mode")
    print(f"  Set SAMPLE_MODE = False for full pipeline")

Configuration loaded:
  Cell size: 10 km
  Time bin: 6 hours
  CRS: EPSG:5070
  Output GDB: C:\Users\colto\Documents\GitHub\Tweet_project\outputs\rasters.gdb

âš¡ SAMPLE MODE ENABLED âš¡
  Running francine only with 4 slices per mode
  Set SAMPLE_MODE = False for full pipeline


## 2. Imports & Environment Setup

In [26]:
import arcpy
from arcpy import env
from arcpy.sa import *
import arcpy.stpm as stpm

# Enable spatial analyst
arcpy.CheckOutExtension("Spatial")

# Configure environment
env.overwriteOutput = True
env.outputCoordinateSystem = arcpy.SpatialReference(CRS_EPSG)
env.cellSize = CELL_SIZE_KM * 1000  # Convert km to meters

# Create output directories
OUTPUT_ROOT.mkdir(parents=True, exist_ok=True)
if not arcpy.Exists(str(OUTPUT_GDB)):
    arcpy.management.CreateFileGDB(str(OUTPUT_ROOT), OUTPUT_GDB.name)
    print(f"Created geodatabase: {OUTPUT_GDB}")
else:
    print(f"Using existing geodatabase: {OUTPUT_GDB}")

# Set workspace
env.workspace = str(OUTPUT_GDB)

print("ArcPy environment configured")
print(f"  Spatial Analyst: {arcpy.CheckExtension('Spatial')}")
print(f"  Overwrite: {env.overwriteOutput}")

Using existing geodatabase: C:\Users\colto\Documents\GitHub\Tweet_project\outputs\rasters.gdb
ArcPy environment configured
  Spatial Analyst: Available
  Overwrite: True


## 3. I/O Validation & Utilities

In [27]:
def validate_inputs():
    """Assert all input files exist and are readable."""
    print("Validating inputs...")
    for name, path in INPUTS.items():
        assert path.exists(), f"NEED_INFO: Missing input file: {path}"
        if path.suffix == ".shp":
            desc = arcpy.Describe(str(path))
            assert desc.shapeType == "Polygon", f"Expected Polygon, got {desc.shapeType} for {name}"
        print(f"  âœ“ {name}: {path.name}")
    print("All inputs valid\n")

def parse_iso_time(time_str):
    """Parse ISO8601 timestamp with timezone."""
    # Handle formats: "2024-09-10 23:58:43+00:00" or "2024-09-10T23:58:43+00:00"
    time_str = time_str.replace(" ", "T")
    if "+" in time_str or time_str.endswith("Z"):
        # Python 3.7+ fromisoformat handles this
        try:
            return datetime.fromisoformat(time_str.replace("Z", "+00:00"))
        except:
            # Fallback for edge cases
            from dateutil import parser
            return parser.isoparse(time_str)
    else:
        return datetime.fromisoformat(time_str).replace(tzinfo=timezone.utc)

def make_safe_name(name):
    """Create filesystem/geodatabase safe name."""
    return name.replace("-", "_").replace(" ", "_").replace(":", "")

validate_inputs()

Validating inputs...
  âœ“ cities: cities1000.csv
  âœ“ counties: cb_2023_us_county_20m.shp
  âœ“ states: cb_2023_us_state_20m.shp
  âœ“ francine: francine.geojson
  âœ“ helene: helene.geojson
All inputs valid



## 4. GeoJSON Parsing & Feature Loading

In [28]:
def load_geojson_features(geojson_path):
    """Load and validate GeoJSON features. Returns list of (lon, lat, timestamp, properties)."""
    print(f"Loading {geojson_path.name}...")
    
    with open(geojson_path, 'r', encoding='utf-8') as f:
        data = json.load(f)
    
    features = data.get('features', [])
    assert len(features) > 0, f"NEED_INFO: No features found in {geojson_path}"
    
    parsed = []
    for i, feat in enumerate(features):
        geom = feat.get('geometry', {})
        props = feat.get('properties', {})
        
        # Assert Point geometry
        assert geom.get('type') == 'Point', f"Feature {i}: Expected Point geometry, got {geom.get('type')}"
        
        # Extract coordinates (authoritative)
        coords = geom.get('coordinates', [])
        assert len(coords) >= 2, f"Feature {i}: Invalid coordinates {coords}"
        lon, lat = coords[0], coords[1]
        
        # Validate required properties
        required_keys = ['FAC', 'LOC', 'GPE', 'time', 'Latitude', 'Longitude', 'make_polygon']
        for key in required_keys:
            assert key in props, f"Feature {i}: Missing required property '{key}'"
        
        # Parse timestamp
        timestamp = parse_iso_time(props['time'])
        
        # QA: compare geometry vs. properties (tolerance 0.001Â°)
        lat_prop = float(props['Latitude'])
        lon_prop = float(props['Longitude'])
        if abs(lat - lat_prop) > 0.001 or abs(lon - lon_prop) > 0.001:
            print(f"  âš  Feature {i}: Coordinate mismatch (geom vs props) - using geometry")
        
        parsed.append({
            'lon': lon,
            'lat': lat,
            'timestamp': timestamp,
            'FAC': props['FAC'],
            'LOC': props['LOC'],
            'GPE': props['GPE'],
            'make_polygon': int(props['make_polygon'])
        })
    
    print(f"  Loaded {len(parsed)} features")
    print(f"  Time range: {min(f['timestamp'] for f in parsed)} â†’ {max(f['timestamp'] for f in parsed)}")
    return parsed

# Test load one event
test_features = load_geojson_features(INPUTS['francine'])
print(f"Sample feature: {test_features[0]}\n")

Loading francine.geojson...
  Loaded 2303 features
  Time range: 2024-09-09 11:00:36+00:00 â†’ 2024-09-16 15:24:14+00:00
Sample feature: {'lon': -92.007126, 'lat': 30.8703881, 'timestamp': datetime.datetime(2024, 9, 10, 23, 58, 43, tzinfo=datetime.timezone.utc), 'FAC': '', 'LOC': '', 'GPE': 'Louisiana', 'make_polygon': 1}



## 5. Time Binning

In [29]:
def create_time_bins(features, bin_hours):
    """Create contiguous time bins, fill empty bins. Returns list of (bin_start, bin_end, feature_indices)."""
    if not features:
        return []
    
    timestamps = [f['timestamp'] for f in features]
    min_time = min(timestamps)
    max_time = max(timestamps)
    
    # Round down to hour boundary
    start_time = min_time.replace(minute=0, second=0, microsecond=0)
    
    bins = []
    current = start_time
    bin_delta = timedelta(hours=bin_hours)
    
    while current <= max_time:
        bin_end = current + bin_delta
        
        # Find features in this bin
        indices = [i for i, f in enumerate(features) 
                  if current <= f['timestamp'] < bin_end]
        
        bins.append({
            'start': current,
            'end': bin_end,
            'indices': indices,
            'count': len(indices)
        })
        
        current = bin_end
    
    print(f"Created {len(bins)} time bins ({bin_hours}h each)")
    print(f"  Range: {bins[0]['start']} â†’ {bins[-1]['end']}")
    print(f"  Empty bins: {sum(1 for b in bins if b['count'] == 0)}")
    return bins

# Test binning
test_bins = create_time_bins(test_features, TIME_BIN_HOURS)
print(f"Sample bins: {test_bins[:3]}\n")

Created 29 time bins (6h each)
  Range: 2024-09-09 11:00:00+00:00 â†’ 2024-09-16 17:00:00+00:00
  Empty bins: 1
Sample bins: [{'start': datetime.datetime(2024, 9, 9, 11, 0, tzinfo=datetime.timezone.utc), 'end': datetime.datetime(2024, 9, 9, 17, 0, tzinfo=datetime.timezone.utc), 'indices': [49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98], 'count': 26}, {'start': datetime.datetime(2024, 9, 9, 17, 0, tzinfo=datetime.timezone.utc), 'end': datetime.datetime(2024, 9, 9, 23, 0, tzinfo=datetime.timezone.utc), 'indices': [62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123], 'count': 49}, {'start': datetime.datetime(2024, 9, 9, 23, 0, tzinfo=datetime.timezone.utc), 'end': datetime.datetime(2024, 9, 10, 5, 0, tzinfo=datetime.timezone.utc), 'indices': [124, 125, 126, 127, 128, 

## 6. Spatial Context Loading

In [30]:
def load_context_layers():
    """Load and prepare reference layers (states, counties) for fusion."""
    print("Loading context layers...")
    
    context = {}
    
    # States (already polygon)
    context['states'] = str(INPUTS['states'])
    print(f"  âœ“ States: {arcpy.management.GetCount(context['states'])[0]} features")
    
    # Counties (already polygon)
    context['counties'] = str(INPUTS['counties'])
    print(f"  âœ“ Counties: {arcpy.management.GetCount(context['counties'])[0]} features")
    
    # Note: Cities not loaded - tweet points provide direct location signal
    print("Context layers ready\n")
    return context

CONTEXT = load_context_layers()

Loading context layers...
  âœ“ States: 52 features
  âœ“ Counties: 3222 features
Context layers ready



## 7. Raster Fusion & Slice Generation

In [31]:
def create_tweet_feature_class(features, indices, output_fc):
    """Convert tweet features to point feature class."""
    # Create feature class
    sr = arcpy.SpatialReference(4326)  # Input is WGS84
    arcpy.management.CreateFeatureclass(
        os.path.dirname(output_fc),
        os.path.basename(output_fc),
        "POINT",
        spatial_reference=sr
    )
    
    # Add fields
    arcpy.management.AddField(output_fc, "tweet_id", "LONG")
    arcpy.management.AddField(output_fc, "FAC", "TEXT", field_length=100)
    arcpy.management.AddField(output_fc, "LOC", "TEXT", field_length=100)
    arcpy.management.AddField(output_fc, "GPE", "TEXT", field_length=100)
    
    # Insert features
    with arcpy.da.InsertCursor(output_fc, ["SHAPE@XY", "tweet_id", "FAC", "LOC", "GPE"]) as cursor:
        for idx in indices:
            f = features[idx]
            cursor.insertRow([(f['lon'], f['lat']), idx, f['FAC'], f['LOC'], f['GPE']])
    
    # Project to working CRS
    output_proj = output_fc + "_proj"
    arcpy.management.Project(output_fc, output_proj, arcpy.SpatialReference(CRS_EPSG))
    return output_proj

def create_intensity_raster(features, indices, output_raster, extent_fc=None):
    """
    Create intensity raster from tweet points using kernel density.
    
    Fusion strategy: Compute point density of tweets weighted by spatial distribution.
    The resulting surface represents tweet activity intensity based on the geographic
    concentration of events, producing a continuous field suitable for time-series analysis.
    """
    if len(indices) == 0:
        # Empty bin: create zero-filled raster with consistent extent
        if extent_fc and arcpy.Exists(extent_fc):
            # Use previous extent
            desc = arcpy.Describe(extent_fc)
            extent = desc.extent
            
            # Create constant raster
            cell_size = CELL_SIZE_KM * 1000
            width = int((extent.XMax - extent.XMin) / cell_size)
            height = int((extent.YMax - extent.YMin) / cell_size)
            
            # Use CreateConstantRaster
            const_raster = CreateConstantRaster(0, "FLOAT", cell_size, 
                                               arcpy.Extent(extent.XMin, extent.YMin, extent.XMax, extent.YMax))
            const_raster.save(output_raster)
        else:
            # No extent reference: skip
            return None
    else:
        # Create point feature class
        temp_fc = os.path.join(str(OUTPUT_GDB), f"temp_tweets_{make_safe_name(os.path.basename(output_raster))}")
        tweet_fc = create_tweet_feature_class(features, indices, temp_fc)
        
        # Kernel density estimation
        density_raster = KernelDensity(
            tweet_fc,
            population_field="NONE",
            cell_size=CELL_SIZE_KM * 1000,
            search_radius=SEARCH_RADIUS_M,
            area_unit_scale_factor="SQUARE_KILOMETERS"
        )
        
        # Save
        density_raster.save(output_raster)
        
        # Cleanup temp
        arcpy.management.Delete(temp_fc)
        if arcpy.Exists(tweet_fc):
            arcpy.management.Delete(tweet_fc)
    
    return output_raster

print("Fusion functions defined")

Fusion functions defined


## 8. Generate Raster Slices (Iterative & Cumulative)

In [32]:
def generate_raster_slices(event_name, features, bins, mode):
    """
    Generate raster slices for all time bins.
    
    mode='iter': Each slice shows only that bin's tweets
    mode='cum': Each slice shows all tweets from start to current bin
    """
    print(f"\nGenerating {mode} slices for {event_name}...")
    
    # Apply sample mode limit
    bins_to_process = bins[:SAMPLE_NUM_SLICES] if SAMPLE_MODE else bins
    if SAMPLE_MODE:
        print(f"  âš¡ Sample mode: processing {len(bins_to_process)}/{len(bins)} bins")
    
    slices = []
    cumulative_indices = []
    extent_ref = None
    
    for i, bin_info in enumerate(bins_to_process):
        bin_start = bin_info['start'].strftime("%Y%m%d_%H%M")
        
        # Determine which indices to include
        if mode == 'iter':
            indices = bin_info['indices']
        elif mode == 'cum':
            cumulative_indices.extend(bin_info['indices'])
            indices = cumulative_indices
        else:
            raise ValueError(f"Unknown mode: {mode}")
        
        # Create output path
        slice_name = f"{event_name}_{mode}_slice_{i:03d}_{bin_start}"
        slice_path = os.path.join(str(OUTPUT_GDB), slice_name)
        
        # Generate raster
        try:
            result = create_intensity_raster(features, indices, slice_path, extent_ref)
            if result and arcpy.Exists(result):
                extent_ref = result  # Use for subsequent empty bins
                
                slices.append({
                    'path': slice_path,
                    'name': slice_name,
                    'bin_index': i,
                    'start_time': bin_info['start'],
                    'end_time': bin_info['end'],
                    'tweet_count': len(indices)
                })
                
                print(f"  Created slice {i+1}/{len(bins_to_process)}: {slice_name} ({len(indices)} tweets)")
        except Exception as e:
            print(f"  âš  Error creating slice {i}: {e}")
            continue
    
    print(f"  âœ“ Created {len(slices)} slices")
    return slices

print("Slice generation function ready")

Slice generation function ready


## 9. Mosaic Dataset & Multidimensional Workflow

In [33]:
def create_mosaic_dataset(event_name, mode, slices):
    """Create mosaic dataset and populate with raster slices."""
    print(f"\nCreating mosaic dataset: {event_name}_{mode}...")
    
    mosaic_name = f"{event_name}_{mode}_mosaic"
    mosaic_path = os.path.join(str(OUTPUT_GDB), mosaic_name)
    
    # Delete if exists
    if arcpy.Exists(mosaic_path):
        arcpy.management.Delete(mosaic_path)
    
    # Create mosaic dataset
    arcpy.management.CreateMosaicDataset(
        str(OUTPUT_GDB),
        mosaic_name,
        arcpy.SpatialReference(CRS_EPSG),
        num_bands=1,
        pixel_type="32_BIT_FLOAT"
    )
    
    # Create table for AddRasters (Table raster type)
    table_name = f"{event_name}_{mode}_manifest"
    table_path = os.path.join(str(OUTPUT_GDB), table_name)
    
    if arcpy.Exists(table_path):
        arcpy.management.Delete(table_path)
    
    arcpy.management.CreateTable(str(OUTPUT_GDB), table_name)
    # Field MUST be named "Raster" for Table raster type
    arcpy.management.AddField(table_path, "Raster", "TEXT", field_length=512)
    arcpy.management.AddField(table_path, "Date", "DATE")
    arcpy.management.AddField(table_path, "Variable", "TEXT", field_length=50)
    
    # Populate table
    with arcpy.da.InsertCursor(table_path, ["Raster", "Date", "Variable"]) as cursor:
        for s in slices:
            cursor.insertRow([s['path'], s['start_time'], "intensity"])
    
    print(f"  âœ“ Manifest table: {table_name} ({len(slices)} records)")
    
    # Add rasters to mosaic using Table raster type
    arcpy.management.AddRastersToMosaicDataset(
        mosaic_path,
        "Table",
        table_path,
        update_cellsize_ranges="UPDATE_CELL_SIZES",
        update_boundary="UPDATE_BOUNDARY"
    )
    
    print(f"  âœ“ Added {len(slices)} rasters to mosaic")
    
    # Build multidimensional info
    arcpy.management.BuildMultidimensionalInfo(
        mosaic_path,
        variable_field="Variable",
        dimension_fields="Date"
    )
    
    print(f"  âœ“ Built multidimensional info (Date Ã— Variable)")
    
    # Optional: Build pyramids and statistics
    try:
        arcpy.management.BuildPyramidsandStatistics(
            mosaic_path,
            build_pyramids=True,
            calculate_statistics=True
        )
        print(f"  âœ“ Built pyramids and statistics")
    except Exception as e:
        print(f"  âš  Pyramid/stats build warning: {e}")
    
    return mosaic_path, table_path

print("Mosaic functions ready")

Mosaic functions ready


## 10. Export CRF & Space-Time Cube

In [34]:
def export_crf(mosaic_path, output_crf):
    """Export mosaic dataset to Cloud Raster Format (CRF) with multidimensional support."""
    print(f"  Exporting to CRF: {output_crf}...")
    
    arcpy.management.CopyRaster(
        mosaic_path,
        output_crf,
        pixel_type="32_BIT_FLOAT",
        format="CRF",
        process_as_multidimensional="PROCESS_AS_MULTIDIMENSIONAL",
        build_multidimensional_transpose="TRANSPOSE"
    )
    
    print(f"  âœ“ CRF created: {output_crf}")
    return output_crf

def export_space_time_cube(crf_path, output_nc):
    """Create Space-Time Cube (.nc) from multidimensional raster."""
    print(f"  Creating Space-Time Cube: {output_nc}...")
    
    try:
        stpm.CreateSpaceTimeCubeFromMultidimensionalRasterLayer(
            crf_path,
            output_nc,
            "intensity"
        )
        print(f"  âœ“ Space-Time Cube created: {output_nc}")
        return output_nc
    except Exception as e:
        print(f"  âš  Space-Time Cube creation failed: {e}")
        print(f"     This may be due to data extent or toolbox version issues.")
        return None

print("Export functions ready")

Export functions ready


## 11. QA Checks & Reporting

In [35]:
def qa_check_slices(slices):
    """Validate that all slices have consistent CRS, extent, and pixel size."""
    if not slices:
        return {"status": "FAIL", "reason": "No slices to check"}
    
    print("  Running QA checks...")
    
    # Get reference properties from first slice
    ref = arcpy.Describe(slices[0]['path'])
    ref_sr = ref.spatialReference.factoryCode
    ref_extent = ref.extent
    ref_cell = (ref.meanCellWidth, ref.meanCellHeight)
    
    issues = []
    
    for i, s in enumerate(slices[1:], start=1):
        desc = arcpy.Describe(s['path'])
        
        # Check CRS
        if desc.spatialReference.factoryCode != ref_sr:
            issues.append(f"Slice {i}: CRS mismatch ({desc.spatialReference.factoryCode} vs {ref_sr})")
        
        # Check extent (tolerance 1m)
        ext = desc.extent
        if (abs(ext.XMin - ref_extent.XMin) > 1 or 
            abs(ext.YMin - ref_extent.YMin) > 1 or
            abs(ext.XMax - ref_extent.XMax) > 1 or
            abs(ext.YMax - ref_extent.YMax) > 1):
            issues.append(f"Slice {i}: Extent mismatch")
        
        # Check pixel size (tolerance 0.1m)
        if (abs(desc.meanCellWidth - ref_cell[0]) > 0.1 or
            abs(desc.meanCellHeight - ref_cell[1]) > 0.1):
            issues.append(f"Slice {i}: Pixel size mismatch")
    
    if issues:
        print(f"    âš  Found {len(issues)} issues:")
        for issue in issues[:5]:  # Show first 5
            print(f"      - {issue}")
        return {"status": "WARN", "issues": issues}
    else:
        print(f"    âœ“ All {len(slices)} slices consistent")
        return {
            "status": "PASS",
            "slice_count": len(slices),
            "crs_epsg": ref_sr,
            "extent": [ref_extent.XMin, ref_extent.YMin, ref_extent.XMax, ref_extent.YMax],
            "cell_size_m": ref_cell
        }

def create_qa_report(results):
    """Generate JSON QA report."""
    report_path = OUTPUT_ROOT / "qa_report.json"
    
    with open(report_path, 'w', encoding='utf-8') as f:
        json.dump(results, f, indent=2, default=str)
    
    print(f"\nðŸ“Š QA Report saved: {report_path}")
    return report_path

print("QA functions ready")

QA functions ready


## 12. Main Pipeline Execution

In [36]:
def run_pipeline():
    """Execute full pipeline for all events and modes."""
    print("=" * 80)
    if SAMPLE_MODE:
        print("SAMPLE MODE: QUICK TEST PIPELINE")
    else:
        print("STARTING FULL RASTER PIPELINE")
    print("=" * 80)
    
    results = {}
    
    # Use sample event list or full event list
    events_to_process = [SAMPLE_EVENT] if SAMPLE_MODE else EVENTS
    
    for event in events_to_process:
        print(f"\n{'=' * 80}")
        print(f"PROCESSING EVENT: {event.upper()}")
        print(f"{'=' * 80}")
        
        # Load features
        features = load_geojson_features(INPUTS[event])
        
        # Create time bins
        bins = create_time_bins(features, TIME_BIN_HOURS)
        
        results[event] = {}
        
        for mode in MODES:
            print(f"\n{'-' * 80}")
            print(f"MODE: {mode.upper()}")
            print(f"{'-' * 80}")
            
            # Generate raster slices
            slices = generate_raster_slices(event, features, bins, mode)
            
            if not slices:
                print(f"  âš  No slices created for {event}_{mode}")
                continue
            
            # QA check slices
            qa_result = qa_check_slices(slices)
            
            # Create mosaic dataset
            mosaic_path, table_path = create_mosaic_dataset(event, mode, slices)
            
            # Export CRF
            crf_name = f"{event}_{mode}.crf"
            crf_path = str(OUTPUT_ROOT / crf_name)
            export_crf(mosaic_path, crf_path)
            
            # Export Space-Time Cube
            nc_name = f"{event}_{mode}.nc"
            nc_path = str(OUTPUT_ROOT / nc_name)
            export_space_time_cube(crf_path, nc_path)
            
            # Store results
            results[event][mode] = {
                "slice_count": len(slices),
                "time_range": [slices[0]['start_time'], slices[-1]['end_time']],
                "mosaic_dataset": mosaic_path,
                "manifest_table": table_path,
                "crf_output": crf_path,
                "nc_output": nc_path,
                "qa_status": qa_result
            }
            
            print(f"\n  âœ… {event.upper()} {mode.upper()} complete")
    
    # Generate QA report
    report = create_qa_report(results)
    
    print("\n" + "=" * 80)
    print("PIPELINE COMPLETE")
    print("=" * 80)
    print(f"\nOutputs saved to: {OUTPUT_ROOT}")
    print(f"\nProducts per event:")
    for event in events_to_process:
        print(f"  {event.upper()}:")
        for mode in MODES:
            if mode in results.get(event, {}):
                r = results[event][mode]
                print(f"    {mode}: {r['slice_count']} slices â†’ CRF + NC")
    
    if SAMPLE_MODE:
        print(f"\nâš¡ Sample mode was enabled. To run full pipeline:")
        print(f"   Set SAMPLE_MODE = False in the configuration cell")
    
    return results

# Execute pipeline
pipeline_results = run_pipeline()

SAMPLE MODE: QUICK TEST PIPELINE

PROCESSING EVENT: FRANCINE
Loading francine.geojson...
  Loaded 2303 features
  Time range: 2024-09-09 11:00:36+00:00 â†’ 2024-09-16 15:24:14+00:00
Created 29 time bins (6h each)
  Range: 2024-09-09 11:00:00+00:00 â†’ 2024-09-16 17:00:00+00:00
  Empty bins: 1

--------------------------------------------------------------------------------
MODE: ITER
--------------------------------------------------------------------------------

Generating iter slices for francine...
  âš¡ Sample mode: processing 4/29 bins
  Created slice 1/4: francine_iter_slice_000_20240909_1100 (26 tweets)
  Created slice 2/4: francine_iter_slice_001_20240909_1700 (49 tweets)
  Created slice 3/4: francine_iter_slice_002_20240909_2300 (55 tweets)
  Created slice 4/4: francine_iter_slice_003_20240910_0500 (25 tweets)
  âœ“ Created 4 slices
  Running QA checks...
    âš  Found 3 issues:
      - Slice 1: Extent mismatch
      - Slice 2: Extent mismatch
      - Slice 3: Extent mismatch

<class 'arcgisscripting.ExecuteError'>: ERROR 000622: Failed to execute (Copy Raster). Parameters are not valid.
ERROR 000800: The value is not a member of ALL_SLICES | CURRENT_SLICE.


## 13. Verification & Next Steps

In [None]:
print("\nðŸ“‹ VERIFICATION CHECKLIST:")
print("\n1. Add CRF files to ArcGIS Pro map:")
for event in EVENTS:
    for mode in MODES:
        crf_path = OUTPUT_ROOT / f"{event}_{mode}.crf"
        print(f"   - {crf_path}")

print("\n2. Enable Time on the map and verify:")
print("   - Time slider appears")
print("   - Raster updates as you scrub through time")
print("   - Check Layer Properties â†’ Time for multidimensional info")

print("\n3. Test Space-Time Cubes with STPM tools:")
print("   - Emerging Hot Spot Analysis")
print("   - Local Outlier Analysis")
print("   - 3D visualization in Scene")

print("\n4. Verify consistency:")
print("   - All slices same extent/CRS/pixel size (see QA report)")
print("   - Mosaic footprints show Standard Time dimension")
print(f"   - Check: {OUTPUT_ROOT / 'qa_report.json'}")

print("\n5. Compare iterative vs cumulative:")
print("   - Iterative: shows per-bin activity (changes over time)")
print("   - Cumulative: shows accumulated activity (grows over time)")

print("\nâœ… Pipeline complete. Review outputs and QA report.")