<a href="https://colab.research.google.com/github/derek881107/Real-Time-Disaster-Detection-System/blob/main/historical_dataset_processing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

===============================================================================
Historical Disaster–Client Proximity Analyzer (GDACS + GeoSpatial)
===============================================================================


Overview
--------
This script builds a geometry-aware pipeline to detect which client locations
may have been affected by historical disaster events reported by GDACS.
Instead of relying on coarse bounding boxes, it fetches per-episode polygons
from the GDACS API, unions them into one geometry per event, optionally buffers
that geometry, and computes the shortest distance from each client point to it
(an approximation; see "Distance & Buffer Notes"). Results are exported to both
JSON (summary + matches) and a multi-sheet Excel workbook for analysis and
reporting.

Typical Use Cases
-----------------
- Quantify which client sites fell within X km of historical disasters.
- Produce decision-ready spreadsheets and JSON summaries for risk reviews.
- Merge multiple GDACS GeoJSON files and de-duplicate events by ID.
- Convert merged GeoJSON to a flat CSV for quick inspection.

Key Inputs
----------
1) Client CSV (required)
   - A table of client/vendor/facility locations.
   - Expected columns (lowercase names, matched exactly; extra columns are OK):
     - id, name, address, country, type, latitude, longitude
     - latitude and longitude must be numeric (WGS84).

2) GDACS GeoJSON (required)
   - A merged GeoJSON of disaster events (features list) with rich `properties`,
     including `eventid`, `eventtype`, `name`, `description`, `fromdate`,
     `todate`, `alertlevel`, and `url.details`.
   - The script can help you:
       • merge many *.geojson/*.json files: `merge_geojson_files(...)`
       • auto-discover & merge by pattern: `auto_merge_json_files(...)`
       • remove duplicate events by a key (default `eventid`)
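
If a client CSV arrives with differently cased headers or a few unusable
coordinates, it can help to normalise it before running the analysis. The
sketch below is one hedged way to do that with the standard library only;
`load_clients`, `REQUIRED`, and the sample rows are hypothetical and are not
part of the script.

```python
import csv
import io

# Column names listed under "Client CSV" above.
REQUIRED = ["id", "name", "address", "country", "type", "latitude", "longitude"]


def load_clients(csv_text):
    """Parse client rows, lower-casing header names and checking coordinates.

    Returns (clients, skipped), where skipped holds the 1-based line numbers
    of rows whose latitude/longitude could not be used.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    headers = [h.strip().lower() for h in (reader.fieldnames or [])]
    missing = [c for c in REQUIRED if c not in headers]
    if missing:
        raise ValueError(f"missing columns: {missing}")

    clients, skipped = [], []
    for line_no, raw in enumerate(reader, start=2):  # line 1 is the header
        row = {k.strip().lower(): v for k, v in raw.items() if k is not None}
        try:
            lat, lon = float(row["latitude"]), float(row["longitude"])
        except (TypeError, ValueError):
            skipped.append(line_no)
            continue
        # NaN fails both range checks, so it is skipped here as well.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            skipped.append(line_no)
            continue
        row["latitude"], row["longitude"] = lat, lon
        clients.append(row)
    return clients, skipped


sample = """ID,Name,Address,Country,Type,Latitude,Longitude
C1,Plant A,1 Main St,NL,facility,52.37,4.90
C2,Depot B,2 Side St,MX,vendor,not-a-number,-99.13
C3,Office C,3 High St,IN,client,95.0,77.2
"""
clients, skipped = load_clients(sample)
print(len(clients), "usable rows; skipped lines:", skipped)
```

The cleaned rows could then be written back to a CSV with lowercase headers
and passed to the analysis functions as usual.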

Core Workflow (What the Script Does)
------------------------------------
1) Data Prep
   - Optionally merge multiple GDACS GeoJSON files and drop duplicates.
   - (Optional) Convert merged GeoJSON to CSV for a flat preview.

2) Geometry Enrichment (per disaster event)
   - Use `properties.url.details` to fetch the event’s episode list.
   - For each episode, call the GDACS polygons API to retrieve geometries.
   - Validate and fix geometries if needed; unify (union) all episode shapes
     into one “event geometry.” Optionally apply a buffer (km → degrees).

3) Proximity Analysis (per client point)
   - Compute the shortest distance from the client’s (lon, lat) to the unified
     event geometry. If the point lies inside the polygon, distance = 0.
   - Keep matches within a user-defined threshold (e.g., 50, 100, 200 km).

4) Reporting & Export
   - JSON: overall stats + geometry summary + full match records.
   - Excel: a comprehensive workbook with multiple analysis sheets, including:
       • Matches (row-level results with client + disaster context)
       • Summary_Statistics (totals, uniques, distance stats)
       • Disaster_Types (counts + percentages)
       • Alert_Levels (counts + percentages)
       • Countries_Affected (client-country perspective)
       • Distance_Analysis (0–10, 10–50, 50–100, 100–200, >200 km bins)
       • All_Disaster_Events (complete GDACS properties per event)
       • Geometry_Summary (event-level geometry metadata)
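
The Distance_Analysis bands listed above can be illustrated with a small
standalone sketch. It is not the script's own implementation: the helper names
are hypothetical, and treating each band's upper edge as inclusive (so exactly
10 km lands in "0-10 km") is an assumption, since the source does not say which
band a boundary value belongs to.

```python
from bisect import bisect_left
from collections import Counter

# Upper edges of the first four bands; anything larger falls in ">200 km".
BIN_EDGES_KM = [10, 50, 100, 200]
BIN_LABELS = ["0-10 km", "10-50 km", "50-100 km", "100-200 km", ">200 km"]


def distance_band(distance_km):
    """Return the band label for a distance (upper edge inclusive, by assumption)."""
    return BIN_LABELS[bisect_left(BIN_EDGES_KM, distance_km)]


def band_summary(distances_km):
    """Count matches per band and express each count as a percentage."""
    counts = Counter(distance_band(d) for d in distances_km)
    total = len(distances_km)
    return [
        {
            "band": label,
            "count": counts[label],
            "percent": round(100 * counts[label] / total, 1) if total else 0.0,
        }
        for label in BIN_LABELS
    ]


# Illustrative distances only; a distance of 0 means the point is inside the polygon.
sample = [0.0, 4.2, 10.0, 37.5, 120.0, 250.0]
for row in band_summary(sample):
    print(row)
```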

Primary Functions (What to Call)
--------------------------------
- merge_geojson_files(file_paths, output_filename)
    Merge many GeoJSON/JSON files into one FeatureCollection.

- geojson_to_csv(geojson_data, output_filename)
    Flatten a FeatureCollection to CSV (geometry basics + properties).

- remove_duplicates(geojson_data, key_field='eventid')
    Drop duplicate features by a chosen properties key.

- fetch_event_episodes(event_data)
    From the event details URL, retrieve all associated episode IDs.

- fetch_episode_geometry(event_type, event_id, episode_id)
    Download the polygon(s)/point(s) for a specific episode from GDACS.

- create_unified_event_geometry(event_data, buffer_km=0)
    Union all episode geometries; optional km-buffer; return shape + bbox + GeoJSON.

- point_to_geometry_distance(lon, lat, geometry)
    Compute the shortest distance (km) from a point to a Shapely geometry.

- analyze_disaster_proximity_enhanced(csv_file_path, json_file_path,
                                      max_distance_km, buffer_km=0,
                                      cache_geometries=True)
    End-to-end engine: read inputs, build and cache event geometries, compute
    matches. Returns (matches, geometry_cache).

- save_enhanced_results(results, geometry_cache,
                        output_file='enhanced_disaster_analysis_results.json')
    Persist JSON with analysis summary, geometry summary, and all matches.

- save_results_to_excel(results, geometry_cache,
                        output_file='enhanced_disaster_analysis_results.xlsx')
    Produce the multi-sheet Excel workbook described above.

- export_geometries_geojson(geometry_cache, output_file='.geojson')
    Write unified event geometries to a standalone GeoJSON.

- interactive_enhanced_setup() / main_enhanced()
    Guided prompts (useful in Colab) to choose files and distances.

- quick_enhanced_analysis(csv_path=None, json_path=None)
    Fast path for running the enhanced analysis with minimal prompts.

Outputs & File Naming
---------------------
- JSON:  enhanced_disaster_analysis_{DIST}km_buffer{BUFFER}km.json
- Excel: enhanced_disaster_analysis_{DIST}km_buffer{BUFFER}km.xlsx
- Merged GeoJSON: user-specified name (e.g., merged_geojson.json)
- Optional CSV from GeoJSON: user-specified name (e.g., merged_data.csv)
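
As a minimal sketch of the naming pattern above, the two analysis file names
can be derived from the distance settings like this. `output_names` is a
hypothetical helper, and how the script itself formats non-integer distances
is not shown here, so the plain f-string formatting is an assumption.

```python
def output_names(max_distance_km, buffer_km):
    """Build the JSON and Excel file names from the two distance settings."""
    stem = f"enhanced_disaster_analysis_{max_distance_km}km_buffer{buffer_km}km"
    return f"{stem}.json", f"{stem}.xlsx"


json_name, excel_name = output_names(100, 10)
print(json_name)
print(excel_name)
```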

Distance & Buffer Notes
-----------------------
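- Distances are computed as Shapely planar distances in degrees, then converted
  to kilometers with the factor ≈ 111.32 km/degree. That factor is a reasonable
  fit for degrees of latitude and for degrees of longitude at the equator; away
  from the equator a degree of longitude spans fewer kilometers (roughly
  111.32 × cos(latitude)), so east-west distances are overstated. For high
  latitudes or very large distances, consider swapping in a true geodesic
  approach if you need higher fidelity.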
- Buffering uses the same conversion (km → degrees) before applying
  `geometry.buffer(...)`.
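
To get a feel for the size of the approximation, the sketch below compares the
fixed-factor conversion with a haversine great-circle distance for two points
one degree of longitude apart. The function names are hypothetical, and the
spherical radius of 6371 km is itself an approximation of the Earth.

```python
from math import asin, cos, hypot, radians, sin, sqrt

KM_PER_DEGREE = 111.32  # fixed factor quoted in the notes above


def flat_degree_km(lon1, lat1, lon2, lat2):
    """Planar distance in degrees, scaled by the fixed km/degree factor."""
    return hypot(lon2 - lon1, lat2 - lat1) * KM_PER_DEGREE


def haversine_km(lon1, lat1, lon2, lat2, radius_km=6371.0):
    """Great-circle distance on a sphere of the given radius."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    # Clamp so rounding error cannot push asin() out of its domain.
    return 2 * radius_km * asin(min(1.0, sqrt(a)))


# One degree of longitude apart, first on the equator and then at 60 degrees north.
for lat in (0.0, 60.0):
    flat = flat_degree_km(10.0, lat, 11.0, lat)
    sphere = haversine_km(10.0, lat, 11.0, lat)
    print(f"lat {lat:4.1f}: flat={flat:7.2f} km  haversine={sphere:7.2f} km")
```

On the equator the two figures nearly agree; at 60 degrees north the fixed
factor reports roughly twice the great-circle distance for an east-west offset,
which matters when choosing thresholds and buffers for high-latitude sites.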

Dependencies
------------
- Python 3.9+
- Packages: pandas, numpy, shapely, requests, openpyxl
  Install example: `pip install pandas numpy shapely requests openpyxl`

Runtime & Environment
---------------------
- Internet access is required to call GDACS APIs when constructing geometries.
- Works in local Python and Google Colab; Colab file-upload helpers are included.
- A geometry cache prevents refetching for the same event during a run.

Typical Usage
-------------
A) Quick run with saved files:
   1) Prepare: a client CSV + a merged GDACS GeoJSON.
   2) Call:    `results, cache = quick_enhanced_analysis('/path/clients.csv',
                                                         '/path/merged_geojson.json')`
   3) Choose a maximum distance (e.g., 100 km) and an optional buffer (e.g., 10 km).
   4) Save JSON/Excel when prompted.

B) Interactive (helpful in Colab):
   1) Run `main_enhanced()`
   2) Follow prompts to pick files, set distances, and export artifacts.

Error Handling & Validation
---------------------------
- Invalid polygons are repaired with `buffer(0)` when possible.
- Episode fetch failures fall back to the current episode ID.
- Client coordinate parsing is guarded; malformed rows are skipped with a warning.
- If required columns are missing, the script reports a clear message and aborts.

Assumptions & Limitations
-------------------------
- Client CSV must include valid latitude/longitude.
- GDACS GeoJSON must include the properties needed to discover details/episodes.
- Distance conversion uses a fixed km/degree factor; use geodesic libraries if
  you require sub-kilometer accuracy at high latitudes or over large spans.

License & Attribution
---------------------
This script consumes publicly available GDACS data/APIs. Please respect their
usage guidelines and attribute GDACS in downstream analyses or reports.

Last updated: 2025-09-06


In [1]:
import json
import pandas as pd
import os
from collections import defaultdict
import glob

def merge_geojson_files(file_paths, output_filename='merged_geojson.json'):
    """
    Merge multiple GeoJSON files into one large GeoJSON file
    """
    merged_features = []
    all_bbox = []

    for file_path in file_paths:
        try:
            with open(file_path, 'r', encoding='utf-8') as f:
                data = json.load(f)

            if 'features' in data:
                merged_features.extend(data['features'])
                print(f"Processed {file_path}: {len(data['features'])} features")

                if data.get('bbox'):
                    all_bbox.append(data['bbox'])

        except Exception as e:
            print(f"Error processing {file_path}: {e}")

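    merged_geojson = {
        "type": "FeatureCollection",
        "features": merged_features
    }

    # Combine the per-file 2D bounding boxes ([minx, miny, maxx, maxy]).
    # The member is omitted when no file supplies one, since a null bbox
    # is not valid GeoJSON.
    boxes = [b for b in all_bbox if len(b) == 4]
    if boxes:
        merged_geojson["bbox"] = [
            min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes)
        ]
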

    with open(output_filename, 'w', encoding='utf-8') as f:
        json.dump(merged_geojson, f, ensure_ascii=False, indent=2)

    print(f"Merge completed! Total {len(merged_features)} features")
    print(f"Saved as: {output_filename}")

    return merged_geojson


def geojson_to_csv(geojson_data, output_filename='merged_data.csv'):
    """
    Convert GeoJSON data to CSV format
    """
    rows = []

    for feature in geojson_data['features']:
        row = {}

        # A feature's geometry or properties may be null in GeoJSON
        geometry = feature.get('geometry') or {}
        if geometry.get('type') == 'Point':
            coordinates = geometry.get('coordinates', [])
            if len(coordinates) >= 2:
                row['longitude'] = coordinates[0]
                row['latitude'] = coordinates[1]

        properties = feature.get('properties') or {}
        for key, value in properties.items():
            if isinstance(value, (str, int, float, bool)):
                row[key] = value
            elif isinstance(value, list) and key == 'affectedcountries':
                country_names = [country.get('countryname', '') for country in value]
                row[f'{key}_names'] = '; '.join(country_names)
                country_codes = [country.get('iso3', '') for country in value]
                row[f'{key}_codes'] = '; '.join(country_codes)
            elif isinstance(value, dict):
                for sub_key, sub_value in value.items():
                    if isinstance(sub_value, (str, int, float, bool)):
                        row[f'{key}_{sub_key}'] = sub_value
            else:
                row[key] = str(value)

        rows.append(row)

    df = pd.DataFrame(rows)
    df.to_csv(output_filename, index=False, encoding='utf-8')

    print(f"CSV file saved as: {output_filename}")
    print(f"Contains {len(df)} rows of data, {len(df.columns)} columns")

    return df

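def auto_merge_json_files(folder_path='.', pattern='*.json'):
    """
    Find files in folder_path that match pattern and merge them.

    The default pattern matches only names ending in '.json'; pass
    pattern='*.geojson' to pick up files with that extension instead.
    Returns the merged FeatureCollection, or None if nothing matched.
    """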
    json_files = glob.glob(os.path.join(folder_path, pattern))

    if not json_files:
        print("No JSON files found")
        return None

    print(f"Found {len(json_files)} JSON files:")
    for file in json_files:
        print(f"  - {file}")

    return merge_geojson_files(json_files)

# Main execution program
def main():
    print("=== JSON File Merger Program ===\n")

    # Option 1: Manually specify file paths
    # Please replace these paths with your actual file paths
    file_paths = [
        'China.geojson',
        'India.geojson',
        'Malaysia.geojson',
        'Mexico.geojson',
        'Netherlands.geojson',
        'Romania.geojson',
        'United States.geojson'
    ]

    # Check if files exist
    existing_files = [f for f in file_paths if os.path.exists(f)]

    if not existing_files:
        print("Manually specified files do not exist, trying automatic search...")
        # Option 2: Automatically search for all JSON files in current directory
        merged_data = auto_merge_json_files()
    else:
        print(f"Found {len(existing_files)} files, starting merge...")
        merged_data = merge_geojson_files(existing_files)

    if merged_data:
        # Also generate CSV file
        print("\nConverting to CSV format...")
        df = geojson_to_csv(merged_data)

        # Display data summary
        print(f"\n=== Data Summary ===")
        print(f"Total number of features: {len(merged_data['features'])}")
        if not df.empty:
            print(f"Number of CSV columns: {len(df.columns)}")
            print(f"Main columns: {list(df.columns[:10])}")

            # Display event type statistics
            if 'eventtype' in df.columns:
                event_counts = df['eventtype'].value_counts()
                print(f"\nEvent type statistics:")
                for event_type, count in event_counts.items():
                    print(f"  {event_type}: {count}")

            # Display country statistics
            if 'country' in df.columns:
                country_counts = df['country'].value_counts().head(10)
                print(f"\nTop 10 countries/regions:")
                for country, count in country_counts.items():
                    print(f"  {country}: {count}")

# Execute program
if __name__ == "__main__":
    main()

# Additional feature: Remove duplicate data
def remove_duplicates(geojson_data, key_field='eventid'):
    """
    Remove duplicate features based on specified field
    """
    seen_keys = set()
    unique_features = []

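    for feature in geojson_data['features']:
        key_value = (feature.get('properties') or {}).get(key_field)
        if key_value is None:
            # No key to compare on: keep the feature rather than treating
            # every key-less feature as a duplicate of the first one
            unique_features.append(feature)
        elif key_value not in seen_keys:
            seen_keys.add(key_value)
            unique_features.append(feature)
        else:
            print(f"Found duplicate data: {key_field} = {key_value}")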

    geojson_data['features'] = unique_features
    print(f"After removing duplicates, {len(unique_features)} features remaining")

    return geojson_data

# Usage example:
# If you want to remove duplicate data, you can use it like this:
# merged_data = remove_duplicates(merged_data, 'eventid')

=== JSON File Merger Program ===

Manually specified files do not exist, trying automatic search...
No JSON files found
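
The `remove_duplicates` helper in the cell above modifies its input in place
and keys on a single property. As a hedged variation, the sketch below returns
a new collection and keys on a tuple of properties. Whether GDACS event IDs can
repeat across event types is not established here, so the composite
(`eventtype`, `eventid`) key is an assumption, and `dedupe_features` and the
sample features are hypothetical.

```python
def dedupe_features(collection, key_fields=("eventtype", "eventid")):
    """Return a new FeatureCollection without repeated key tuples.

    Features that lack any of the key fields are kept, since there is
    nothing reliable to compare them on.
    """
    seen = set()
    kept = []
    for feature in collection.get("features", []):
        props = feature.get("properties") or {}
        key = tuple(props.get(field) for field in key_fields)
        if None in key:
            kept.append(feature)
        elif key not in seen:
            seen.add(key)
            kept.append(feature)
    # Copy the top-level members so the caller's dict is left untouched.
    return {**collection, "features": kept}


# Made-up features for illustration; geometry is irrelevant to the comparison.
sample = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"eventtype": "FL", "eventid": 1}},
        {"type": "Feature", "geometry": None, "properties": {"eventtype": "FL", "eventid": 1}},
        {"type": "Feature", "geometry": None, "properties": {"eventtype": "TC", "eventid": 1}},
        {"type": "Feature", "geometry": None, "properties": {"name": "no id"}},
    ],
}
print(len(dedupe_features(sample)["features"]), "features kept")
```

Passing `key_fields=("eventid",)` reproduces single-key behaviour while still
keeping features that have no ID at all.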


In [None]:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Enhanced Disaster Location Analysis Tool
Compare CSV client locations with precise GeoJSON disaster geometries
Uses actual polygon shapes instead of simple bounding boxes
"""

import pandas as pd
import json
import numpy as np
from math import radians, cos, sin, asin, sqrt
import sys
import os
import requests
import time
from shapely.geometry import Point, Polygon, MultiPolygon, shape
from shapely.ops import unary_union
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')

def haversine_distance(lon1, lat1, lon2, lat2):
    """
    Calculate the great-circle distance between two points (in kilometers)
    Uses the Haversine formula, which gives the shortest path along the surface
    of a sphere with Earth's mean radius
    """
    # Convert decimal degrees to radians
    lon1, lat1, lon2, lat2 = map(radians, [lon1, lat1, lon2, lat2])

    # Haversine formula
    dlon = lon2 - lon1
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2
    # Clamp to 1.0 so rounding error cannot push asin() out of its domain
    c = 2 * asin(min(1.0, sqrt(a)))

    # Earth's radius in kilometers
    r = 6371
    return c * r

def fetch_event_episodes(event_data):
    """
    Fetch all episodes for a given event using the details URL

    Parameters:
    event_data (dict): Event data containing URL information

    Returns:
    list: List of episode IDs for the event
    """
    try:
        details_url = event_data['properties']['url']['details']
        print(f"🔍 Fetching episodes for event {event_data['properties']['eventid']}...")

        response = requests.get(details_url, timeout=30)
        response.raise_for_status()

        event_details = response.json()
        episodes = []

        # Extract episode IDs from the response, skipping entries without one
        for episode in event_details.get('episodelist') or []:
            episode_id = episode.get('episodeid')
            if episode_id is not None:
                episodes.append(episode_id)

        if not episodes:
            # Fallback: use the current episode ID
            episodes.append(event_data['properties']['episodeid'])

        print(f"✅ Found {len(episodes)} episodes: {episodes}")
        return episodes

    except Exception as e:
        print(f"⚠️  Error fetching episodes: {e}")
        # Fallback: use the current episode ID
        return [event_data['properties']['episodeid']]

def fetch_episode_geometry(event_type, event_id, episode_id):
    """
    Fetch geometry for a specific episode

    Parameters:
    event_type (str): Type of event (TC, FL, etc.)
    event_id (int): Event ID
    episode_id (int): Episode ID

    Returns:
    list: List of Shapely geometry objects
    """
    try:
        geometry_url = f"https://www.gdacs.org/gdacsapi/api/polygons/getgeometry?eventtype={event_type}&eventid={event_id}&episodeid={episode_id}"
        print(f"📡 Fetching geometry for episode {episode_id}...")

        response = requests.get(geometry_url, timeout=30)
        response.raise_for_status()

        geojson_data = response.json()
        geometries = []

        # Handle different GeoJSON structures
        if geojson_data.get('type') == 'FeatureCollection':
            for feature in geojson_data.get('features', []):
                if feature.get('geometry'):  # skip features whose geometry is null
                    geom = shape(feature['geometry'])
                    if geom.is_valid:
                        geometries.append(geom)
                    else:
                        # Try to fix invalid geometry
                        fixed_geom = geom.buffer(0)
                        if fixed_geom.is_valid:
                            geometries.append(fixed_geom)

        elif geojson_data.get('type') == 'Feature':
            if geojson_data.get('geometry'):  # skip a null geometry
                geom = shape(geojson_data['geometry'])
                if geom.is_valid:
                    geometries.append(geom)
                else:
                    fixed_geom = geom.buffer(0)
                    if fixed_geom.is_valid:
                        geometries.append(fixed_geom)

        elif 'type' in geojson_data and geojson_data['type'] in ['Polygon', 'MultiPolygon', 'Point']:
            geom = shape(geojson_data)
            if geom.is_valid:
                geometries.append(geom)
            else:
                fixed_geom = geom.buffer(0)
                if fixed_geom.is_valid:
                    geometries.append(fixed_geom)

        print(f"✅ Retrieved {len(geometries)} geometry objects")
        return geometries

    except Exception as e:
        print(f"⚠️  Error fetching geometry for episode {episode_id}: {e}")
        return []

def create_unified_event_geometry(event_data, buffer_km=0):
    """
    Create a unified geometry for an event by combining all its episodes

    Parameters:
    event_data (dict): Event data from JSON
    buffer_km (float): Buffer distance in kilometers (optional)

    Returns:
    tuple: (unified_geometry, bbox, geojson_dict)
    """
    try:
        event_id = event_data['properties']['eventid']
        event_type = event_data['properties']['eventtype']

        print(f"\n🔄 Processing event {event_id} ({event_type})...")

        # Get all episodes for this event
        episodes = fetch_event_episodes(event_data)

        all_geometries = []

        # Fetch geometry for each episode
        for episode_id in episodes:
            episode_geometries = fetch_episode_geometry(event_type, event_id, episode_id)
            all_geometries.extend(episode_geometries)

            # Add small delay to avoid overwhelming the API
            time.sleep(0.5)

        if not all_geometries:
            print(f"❌ No valid geometries found for event {event_id}")
            return None, None, None

        print(f"🔧 Unifying {len(all_geometries)} geometries...")

        # Union all geometries for this event
        if len(all_geometries) == 1:
            unified_geometry = all_geometries[0]
        else:
            unified_geometry = unary_union(all_geometries)

        # Ensure the geometry is valid
        if not unified_geometry.is_valid:
            print("🔧 Fixing invalid geometry...")
            unified_geometry = unified_geometry.buffer(0)

        # Apply buffer if requested
        if buffer_km > 0:
            print(f"🔧 Applying {buffer_km}km buffer...")
            # Convert km to degrees (rough approximation)
            buffer_degrees = buffer_km / 111.32  # 1 degree ≈ 111.32 km
            unified_geometry = unified_geometry.buffer(buffer_degrees)

        # Get bounding box
        bounds = unified_geometry.bounds
        bbox = [bounds[0], bounds[1], bounds[2], bounds[3]]  # [minx, miny, maxx, maxy]

        # Create GeoJSON representation
        if hasattr(unified_geometry, '__geo_interface__'):
            geojson_dict = unified_geometry.__geo_interface__
        else:
            geojson_dict = None

        print(f"✅ Event {event_id} geometry unified successfully")
        return unified_geometry, bbox, geojson_dict

    except Exception as e:
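        # Look the ID up defensively so the handler itself cannot raise KeyError
        event_id = (event_data.get('properties') or {}).get('eventid', 'unknown')
        print(f"❌ Error creating unified geometry for event {event_id}: {e}")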
        return None, None, None

def point_to_geometry_distance(point_lon, point_lat, geometry):
    """
    Calculate the shortest distance from a point to a Shapely geometry

    Parameters:
    point_lon (float): Point longitude
    point_lat (float): Point latitude
    geometry: Shapely geometry object

    Returns:
    float: Distance in kilometers
    """
    try:
        point = Point(point_lon, point_lat)

        # If point is inside the geometry, distance is 0
        if geometry.contains(point):
            return 0.0

        # Calculate distance to geometry boundary
        distance_degrees = point.distance(geometry)

        # Convert degrees to kilometers (rough approximation)
        distance_km = distance_degrees * 111.32  # 1 degree ≈ 111.32 km

        return distance_km

    except Exception as e:
        print(f"⚠️  Error calculating distance: {e}")
        return float('inf')

def analyze_disaster_proximity_enhanced(csv_file_path, json_file_path, max_distance_km, buffer_km=0, cache_geometries=True):
    """
    Enhanced analysis function using precise geometries instead of bounding boxes

    Parameters:
    csv_file_path (str): Path to CSV file
    json_file_path (str): Path to JSON file
    max_distance_km (float): Maximum search distance (kilometers)
    buffer_km (float): Buffer distance to add to disaster geometries (kilometers)
    cache_geometries (bool): Reserved flag, currently unused; geometries are always
        cached in memory for the duration of the run

    Returns:
    tuple: (matches_list, geometry_cache)
    """

    print(f"🔍 Starting enhanced disaster location analysis...")
    print(f"📊 Maximum search distance: {max_distance_km} km")
    print(f"🛡️  Geometry buffer: {buffer_km} km")
    print("🎯 Using precise polygon geometries instead of bounding boxes\n")

    # Read CSV file
    try:
        df_csv = pd.read_csv(csv_file_path)
        print(f"✅ Successfully loaded CSV file with {len(df_csv)} client records")

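        # Check expected columns
        required_columns = ['id', 'name', 'address', 'country', 'type', 'latitude', 'longitude']
        missing_columns = [col for col in required_columns if col not in df_csv.columns]
        if missing_columns:
            print(f"⚠️  Warning: CSV file missing columns: {missing_columns}")
            print("Available columns:", list(df_csv.columns))
            # Coordinates are essential: without them each row would default to 0
            if 'latitude' in missing_columns or 'longitude' in missing_columns:
                print("❌ Cannot continue without latitude and longitude columns")
                return [], {}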

    except Exception as e:
        print(f"❌ Failed to read CSV file: {e}")
        return [], {}

    # Read JSON file
    try:
        with open(json_file_path, 'r', encoding='utf-8') as f:
            json_data = json.load(f)

        features = json_data.get('features', [])
        print(f"✅ Successfully loaded JSON file with {len(features)} disaster events\n")

    except Exception as e:
        print(f"❌ Failed to read JSON file: {e}")
        return [], {}

    # Process each disaster event to create unified geometries
    geometry_cache = {}
    print("🔄 Creating unified geometries for disaster events...")
    print("=" * 60)

    for i, feature in enumerate(features):
        event_id = feature['properties']['eventid']

        if event_id not in geometry_cache:
            print(f"\n📍 Processing event {i+1}/{len(features)}: {event_id}")

            unified_geometry, bbox, geojson_dict = create_unified_event_geometry(feature, buffer_km)

            if unified_geometry is not None:
                geometry_cache[event_id] = {
                    'geometry': unified_geometry,
                    'bbox': bbox,
                    'geojson': geojson_dict,
                    'properties': feature['properties']
                }
                print(f"✅ Event {event_id} processed successfully")
            else:
                print(f"❌ Failed to process event {event_id}")
        else:
            print(f"📋 Event {event_id} already cached")

    print(f"\n🎯 Successfully processed {len(geometry_cache)} unique events")
    print("=" * 60)

    # Analyze matches using precise geometries
    matches = []

    for idx, client in df_csv.iterrows():
        try:
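            client_lat = float(client.get('latitude', 0))
            client_lon = float(client.get('longitude', 0))

            # Skip missing coordinates (empty CSV cells are read as NaN)
            if pd.isna(client_lat) or pd.isna(client_lon):
                print(f"⚠️  Skipping client {client.get('id', idx)}: missing coordinates")
                continue

            # Skip placeholder coordinates
            if client_lat == 0 and client_lon == 0:
                continue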

            for event_id, event_cache in geometry_cache.items():
                geometry = event_cache['geometry']
                properties = event_cache['properties']

                # Calculate precise distance to geometry
                distance = point_to_geometry_distance(client_lon, client_lat, geometry)

                # If within specified distance
                if distance <= max_distance_km:
                    match_info = {
                        # Client information
                        "client_info": {
                            "id": str(client.get('id', '')),
                            "name": str(client.get('name', '')),
                            "address": str(client.get('address', '')),
                            "country": str(client.get('country', '')),
                            "type": str(client.get('type', '')),
                            "latitude": float(client_lat),
                            "longitude": float(client_lon),
                            "distance": f"Distance: {round(distance, 1)}km from disaster"
                        },

                        # Disaster information (enhanced)
                        "disaster_info": {
                            "distance_km": round(distance, 2),
                            "distance_calculation": "precise_geometry",
                            "event_type": properties.get('eventtype', ''),
                            "event_name": properties.get('name', ''),
                            "description": properties.get('description', ''),
                            "country": properties.get('country', ''),
                            "alert_level": properties.get('alertlevel', ''),
                            "from_date": properties.get('fromdate', ''),
                            "to_date": properties.get('todate', ''),
                            # severitydata may be present but null, so guard before .get()
                            "severity": (properties.get('severitydata') or {}).get('severitytext', ''),
                            "unified_bbox": event_cache['bbox'],
                            "full_properties": properties
                        },

                        # Enhanced analysis metadata
                        "analysis_metadata": {
                            "method": "enhanced_geometry_analysis",
                            "search_distance_km": max_distance_km,
                            "buffer_applied_km": buffer_km,
                            "coordinates": {
                                "client_lat": float(client_lat),
                                "client_lon": float(client_lon),
                                "unified_bbox": event_cache['bbox']
                            },
                            "geometry_available": True
                        },

                        # Geometry data (optional, for advanced users)
                        "geometry_data": {
                            "geojson": event_cache.get('geojson'),
                            "bbox": event_cache['bbox']
                        }
                    }
                    matches.append(match_info)

        except (ValueError, TypeError) as e:
            print(f"⚠️  Error processing client {client.get('id', idx)}: {e}")
            continue

    print(f"\n🎯 Found {len(matches)} matching items using precise geometries")
    return matches, geometry_cache

def save_enhanced_results(results, geometry_cache, output_file='enhanced_disaster_analysis_results.json'):
    """
    Save enhanced results including geometry information
    """
    if results:
        # Create enhanced summary statistics
        summary = {
            "analysis_summary": {
                "analysis_method": "enhanced_geometry_analysis",
                "total_matches": len(results),
                "unique_clients": len(set([r['client_info']['id'] for r in results])),
                "unique_events": len(geometry_cache),
                "disaster_types": {},
                "alert_levels": {},
                "countries_affected": {},
                "distance_stats": {
                    "avg_distance": round(sum([r['disaster_info']['distance_km'] for r in results]) / len(results), 2),
                    "max_distance": max([r['disaster_info']['distance_km'] for r in results]),
                    "min_distance": min([r['disaster_info']['distance_km'] for r in results])
                }
            },
            "geometry_summary": {
                "total_events_processed": len(geometry_cache),
                "geometry_types": {},
                "bbox_coverage": []
            },
            "matches": results
        }

        # Calculate statistics
        for result in results:
            disaster_type = result['disaster_info']['event_type']
            alert_level = result['disaster_info']['alert_level']
            country = result['client_info']['country']

            summary["analysis_summary"]["disaster_types"][disaster_type] = summary["analysis_summary"]["disaster_types"].get(disaster_type, 0) + 1
            summary["analysis_summary"]["alert_levels"][alert_level] = summary["analysis_summary"]["alert_levels"].get(alert_level, 0) + 1
            summary["analysis_summary"]["countries_affected"][country] = summary["analysis_summary"]["countries_affected"].get(country, 0) + 1

        # Add geometry statistics
        for event_id, cache_data in geometry_cache.items():
            if cache_data['geometry']:
                geom_type = cache_data['geometry'].geom_type
                summary["geometry_summary"]["geometry_types"][geom_type] = summary["geometry_summary"]["geometry_types"].get(geom_type, 0) + 1
                summary["geometry_summary"]["bbox_coverage"].append(cache_data['bbox'])

        # Save to JSON file
        with open(output_file, 'w', encoding='utf-8') as f:
            json.dump(summary, f, indent=2, ensure_ascii=False)

        print(f"💾 Enhanced results saved to: {output_file}")
        return summary
    else:
        print("❌ No results to save")
        return None

def save_results_to_excel(results, geometry_cache, output_file='enhanced_disaster_analysis_results.xlsx'):
    """
    Save enhanced results to Excel file with multiple sheets

    Parameters:
    results (list): List of match results
    geometry_cache (dict): Geometry cache data
    output_file (str): Output Excel file path

    Returns:
    str: Output file path if successful, None otherwise
    """
    try:
        if not results:
            print("❌ No results to save to Excel")
            return None

        print(f"📊 Creating Excel file: {output_file}")

        # Create Excel writer object
        with pd.ExcelWriter(output_file, engine='openpyxl') as writer:

            # Sheet 1: Main Results - Detailed matches
            print("📝 Creating 'Detailed_Matches' sheet...")
            detailed_data = []

            for result in results:
                client_info = result['client_info']
                disaster_info = result['disaster_info']
                analysis_meta = result['analysis_metadata']

                # Get full disaster properties from the original JSON
                full_props = disaster_info.get('full_properties', {})

                row = {
                    # === DISASTER/EVENT INFORMATION (place at the front) ===
                    # Basic event information
                    'Event_ID': full_props.get('eventid', ''),
                    'Episode_ID': full_props.get('episodeid', ''),
                    'Event_Type': full_props.get('eventtype', ''),
                    'Event_Name': full_props.get('eventname', ''),
                    'Event_Full_Name': full_props.get('name', ''),
                    'Event_Description': full_props.get('description', ''),
                    'HTML_Description': full_props.get('htmldescription', ''),

                    # Time information
                    'From_Date': full_props.get('fromdate', ''),
                    'To_Date': full_props.get('todate', ''),
                    'Date_Modified': full_props.get('datemodified', ''),

                    # Geographic information
                    'Disaster_Country': full_props.get('country', ''),
                    'ISO3_Code': full_props.get('iso3', ''),
                    'Country_On_Land': full_props.get('countryonland', ''),

                    # Alert information
                    'Alert_Level': full_props.get('alertlevel', ''),
                    'Alert_Score': full_props.get('alertscore', ''),
                    'Episode_Alert_Level': full_props.get('episodealertlevel', ''),
                    'Episode_Alert_Score': full_props.get('episodealertscore', ''),

                    # Severity information
                    'Severity_Value': (full_props.get('severitydata') or {}).get('severity', ''),
                    'Severity_Text': (full_props.get('severitydata') or {}).get('severitytext', ''),
                    'Severity_Unit': (full_props.get('severitydata') or {}).get('severityunit', ''),

                    # Status information
                    'Is_Temporary': full_props.get('istemporary', ''),
                    'Is_Current': full_props.get('iscurrent', ''),

                    # Data source
                    'Data_Source': full_props.get('source', ''),
                    'Source_ID': full_props.get('sourceid', ''),
                    'GLIDE_Number': full_props.get('glide', ''),

                    # URL information
                    'Report_URL': (full_props.get('url') or {}).get('report', ''),
                    'Details_URL': (full_props.get('url') or {}).get('details', ''),
                    'Geometry_URL': (full_props.get('url') or {}).get('geometry', ''),

                    # Icon information
                    'Icon_URL': full_props.get('icon', ''),
                    'Overall_Icon_URL': full_props.get('iconoverall', ''),

                    # Geometry classification
                    'Polygon_Label': full_props.get('polygonlabel', ''),
                    'Geometry_Class': full_props.get('Class', ''),

                    # Affected countries (if multiple)
                    'Affected_Countries_Count': len(full_props.get('affectedcountries') or []),
                    'Affected_Countries_List': ', '.join([country.get('countryname', '') for country in (full_props.get('affectedcountries') or [])]),
                    'Affected_ISO2_Codes': ', '.join([country.get('iso2', '') for country in (full_props.get('affectedcountries') or [])]),
                    'Affected_ISO3_Codes': ', '.join([country.get('iso3', '') for country in (full_props.get('affectedcountries') or [])]),

                    # Unified bounding box
                    'Unified_Bbox_MinLon': disaster_info.get('unified_bbox', [None, None, None, None])[0],
                    'Unified_Bbox_MinLat': disaster_info.get('unified_bbox', [None, None, None, None])[1],
                    'Unified_Bbox_MaxLon': disaster_info.get('unified_bbox', [None, None, None, None])[2],
                    'Unified_Bbox_MaxLat': disaster_info.get('unified_bbox', [None, None, None, None])[3],

                    # === Distance calculation results ===
                    'Distance_KM': disaster_info.get('distance_km', ''),
                    'Distance_Calculation_Method': disaster_info.get('distance_calculation', ''),

                    # === Analysis parameters ===
                    'Search_Distance_KM': analysis_meta.get('search_distance_km', ''),
                    'Buffer_Applied_KM': analysis_meta.get('buffer_applied_km', ''),
                    'Analysis_Method': analysis_meta.get('method', ''),

                    # === CLIENT INFORMATION (place at the end) ===
                    'Client_ID': client_info.get('id', ''),
                    'Client_Name': client_info.get('name', ''),
                    'Client_Address': client_info.get('address', ''),
                    'Client_Country': client_info.get('country', ''),
                    'Client_Type': client_info.get('type', ''),
                    'Client_Latitude': client_info.get('latitude', ''),
                    'Client_Longitude': client_info.get('longitude', ''),
                    'Client_Distance_Summary': client_info.get('distance', ''),
                }
                detailed_data.append(row)

            df_detailed = pd.DataFrame(detailed_data)
            df_detailed.to_excel(writer, sheet_name='Detailed_Matches', index=False)

            # Sheet 2: Summary Statistics
            print("📊 Creating 'Summary_Statistics' sheet...")
            summary_data = []

            # Basic statistics
            total_matches = len(results)
            unique_clients = len(set([r['client_info']['id'] for r in results]))
            unique_events = len(geometry_cache)

            if results:
                distances = [r['disaster_info']['distance_km'] for r in results]
                avg_distance = round(sum(distances) / len(distances), 2)
                max_distance = max(distances)
                min_distance = min(distances)
            else:
                avg_distance = max_distance = min_distance = 0

            summary_stats = [
                ['Metric', 'Value'],
                ['Analysis Method', 'Enhanced Geometry Analysis'],
                ['Analysis Date', datetime.now().strftime('%Y-%m-%d %H:%M:%S')],
                ['Total Matches Found', total_matches],
                ['Unique Clients Affected', unique_clients],
                ['Unique Disaster Events', unique_events],
                ['Average Distance (km)', avg_distance],
                ['Maximum Distance (km)', max_distance],
                ['Minimum Distance (km)', min_distance],
            ]

            df_summary = pd.DataFrame(summary_stats[1:], columns=summary_stats[0])
            df_summary.to_excel(writer, sheet_name='Summary_Statistics', index=False)

            # Sheet 3: Disaster Types Analysis
            print("📈 Creating 'Disaster_Types' sheet...")
            disaster_types = {}
            alert_levels = {}
            countries_affected = {}

            for result in results:
                dtype = result['disaster_info']['event_type']
                alert = result['disaster_info']['alert_level']
                country = result['client_info']['country']

                disaster_types[dtype] = disaster_types.get(dtype, 0) + 1
                alert_levels[alert] = alert_levels.get(alert, 0) + 1
                countries_affected[country] = countries_affected.get(country, 0) + 1

            # Create disaster types dataframe
            disaster_types_data = [['Disaster_Type', 'Count', 'Percentage']]
            for dtype, count in disaster_types.items():
                percentage = round((count / total_matches) * 100, 1) if total_matches > 0 else 0
                disaster_types_data.append([dtype, count, f"{percentage}%"])

            df_disaster_types = pd.DataFrame(disaster_types_data[1:], columns=disaster_types_data[0])
            df_disaster_types.to_excel(writer, sheet_name='Disaster_Types', index=False)

            # Sheet 4: Alert Levels Analysis
            print("🚨 Creating 'Alert_Levels' sheet...")
            alert_levels_data = [['Alert_Level', 'Count', 'Percentage']]
            for alert, count in alert_levels.items():
                percentage = round((count / total_matches) * 100, 1) if total_matches > 0 else 0
                alert_levels_data.append([alert, count, f"{percentage}%"])

            df_alert_levels = pd.DataFrame(alert_levels_data[1:], columns=alert_levels_data[0])
            df_alert_levels.to_excel(writer, sheet_name='Alert_Levels', index=False)

            # Sheet 5: Countries Analysis
            print("🌍 Creating 'Countries_Affected' sheet...")
            # Note: counts are per match, so a client matched to several events is counted once per event
            countries_data = [['Country', 'Affected_Clients', 'Percentage']]
            for country, count in countries_affected.items():
                percentage = round((count / total_matches) * 100, 1) if total_matches > 0 else 0
                countries_data.append([country, count, f"{percentage}%"])

            df_countries = pd.DataFrame(countries_data[1:], columns=countries_data[0])
            df_countries.to_excel(writer, sheet_name='Countries_Affected', index=False)

            # Sheet 6: Distance Analysis
            print("📏 Creating 'Distance_Analysis' sheet...")
            if results:
                # Create distance bins
                distances = [r['disaster_info']['distance_km'] for r in results]
                distance_ranges = ['0-10km', '10-25km', '25-50km', '50-100km', '100-200km', '200km+']
                distance_counts = [0, 0, 0, 0, 0, 0]

                for dist in distances:
                    if dist <= 10:
                        distance_counts[0] += 1
                    elif dist <= 25:
                        distance_counts[1] += 1
                    elif dist <= 50:
                        distance_counts[2] += 1
                    elif dist <= 100:
                        distance_counts[3] += 1
                    elif dist <= 200:
                        distance_counts[4] += 1
                    else:
                        distance_counts[5] += 1

                distance_analysis_data = []
                for range_name, count in zip(distance_ranges, distance_counts):
                    percentage = round((count / total_matches) * 100, 1) if total_matches > 0 else 0
                    distance_analysis_data.append([range_name, count, f"{percentage}%"])

                df_distance = pd.DataFrame(distance_analysis_data, columns=['Distance_Range', 'Count', 'Percentage'])
                df_distance.to_excel(writer, sheet_name='Distance_Analysis', index=False)

            # Sheet 7: All Disaster Events (complete disaster event data)
            print("🌪️  Creating 'All_Disaster_Events' sheet...")
            all_events_data = []

            for event_id, cache_data in geometry_cache.items():
                if cache_data.get('properties'):
                    props = cache_data['properties']
                    # Fall back to an empty bbox when the key is missing or its value is None
                    bbox = cache_data.get('bbox') or [None, None, None, None]
                    geom_type = cache_data['geometry'].geom_type if cache_data.get('geometry') else 'Unknown'

                    row = {
                        # Basic event information
                        'Event_ID': props.get('eventid', ''),
                        'Episode_ID': props.get('episodeid', ''),
                        'Event_Type': props.get('eventtype', ''),
                        'Event_Name': props.get('eventname', ''),
                        'Full_Name': props.get('name', ''),
                        'Description': props.get('description', ''),
                        'HTML_Description': props.get('htmldescription', ''),

                        # Time information
                        'From_Date': props.get('fromdate', ''),
                        'To_Date': props.get('todate', ''),
                        'Date_Modified': props.get('datemodified', ''),

                        # Geographic information
                        'Country': props.get('country', ''),
                        'ISO3_Code': props.get('iso3', ''),
                        'Country_On_Land': props.get('countryonland', ''),

                        # Alert information
                        'Alert_Level': props.get('alertlevel', ''),
                        'Alert_Score': props.get('alertscore', ''),
                        'Episode_Alert_Level': props.get('episodealertlevel', ''),
                        'Episode_Alert_Score': props.get('episodealertscore', ''),

                        # Severity information
                        'Severity_Value': (props.get('severitydata') or {}).get('severity', ''),
                        'Severity_Text': (props.get('severitydata') or {}).get('severitytext', ''),
                        'Severity_Unit': (props.get('severitydata') or {}).get('severityunit', ''),

                        # Status information
                        'Is_Temporary': props.get('istemporary', ''),
                        'Is_Current': props.get('iscurrent', ''),

                        # Data source
                        'Data_Source': props.get('source', ''),
                        'Source_ID': props.get('sourceid', ''),
                        'GLIDE_Number': props.get('glide', ''),

                        # URL information
                        'Report_URL': (props.get('url') or {}).get('report', ''),
                        'Details_URL': (props.get('url') or {}).get('details', ''),
                        'Geometry_URL': (props.get('url') or {}).get('geometry', ''),

                        # Icon information
                        'Icon_URL': props.get('icon', ''),
                        'Overall_Icon_URL': props.get('iconoverall', ''),

                        # Geometry classification
                        'Polygon_Label': props.get('polygonlabel', ''),
                        'Geometry_Class': props.get('Class', ''),
                        'Processed_Geometry_Type': geom_type,

                        # Affected countries
                        'Affected_Countries_Count': len(props.get('affectedcountries') or []),
                        'Affected_Countries_List': ', '.join([country.get('countryname', '') for country in (props.get('affectedcountries') or [])]),
                        'Affected_ISO2_Codes': ', '.join([country.get('iso2', '') for country in (props.get('affectedcountries') or [])]),
                        'Affected_ISO3_Codes': ', '.join([country.get('iso3', '') for country in (props.get('affectedcountries') or [])]),

                        # Unified bounding box
                        'Unified_Bbox_MinLon': bbox[0],
                        'Unified_Bbox_MinLat': bbox[1],
                        'Unified_Bbox_MaxLon': bbox[2],
                        'Unified_Bbox_MaxLat': bbox[3],

                        # Whether there are matching clients
                        'Has_Affected_Clients': any(result['disaster_info']['event_type'] == props.get('eventtype', '') and
                                                  str(result['disaster_info'].get('full_properties', {}).get('eventid', '')) == str(props.get('eventid', ''))
                                                  for result in results),
                        'Affected_Clients_Count': sum(1 for result in results
                                                    if str(result['disaster_info'].get('full_properties', {}).get('eventid', '')) == str(props.get('eventid', '')))
                    }
                    all_events_data.append(row)

            if all_events_data:
                df_all_events = pd.DataFrame(all_events_data)
                df_all_events.to_excel(writer, sheet_name='All_Disaster_Events', index=False)

            # Sheet 8: Geometry Information (simplified version, detailed data already above)
            print("🗺️  Creating 'Geometry_Summary' sheet...")
            geometry_data = []

            for event_id, cache_data in geometry_cache.items():
                if cache_data.get('properties'):
                    props = cache_data['properties']
                    # Fall back to an empty bbox when the key is missing or its value is None
                    bbox = cache_data.get('bbox') or [None, None, None, None]
                    geom_type = cache_data['geometry'].geom_type if cache_data.get('geometry') else 'Unknown'

                    # Calculate geometry statistics
                    geometry_area = 0
                    geometry_perimeter = 0
                    if cache_data.get('geometry'):
                        try:
                            # Planar area in square degrees (not a true ground area)
                            geometry_area = cache_data['geometry'].area
                            # Planar perimeter in degrees
                            geometry_perimeter = cache_data['geometry'].length
                        except Exception:
                            # Leave both values at 0 if the geometry cannot be measured
                            pass

                    row = {
                        'Event_ID': event_id,
                        'Event_Type': props.get('eventtype', ''),
                        'Event_Name': props.get('name', ''),
                        'Alert_Level': props.get('alertlevel', ''),
                        'Country': props.get('country', ''),
                        'From_Date': props.get('fromdate', ''),
                        'To_Date': props.get('todate', ''),
                        'Geometry_Type': geom_type,
                        'Geometry_Area_Deg2': round(geometry_area, 6) if geometry_area > 0 else '',
                        'Geometry_Perimeter_Deg': round(geometry_perimeter, 6) if geometry_perimeter > 0 else '',
                        'Bbox_MinLon': bbox[0],
                        'Bbox_MinLat': bbox[1],
                        'Bbox_MaxLon': bbox[2],
                        'Bbox_MaxLat': bbox[3],
                        'Bbox_Width_Deg': round(bbox[2] - bbox[0], 6) if bbox[0] is not None and bbox[2] is not None else '',
                        'Bbox_Height_Deg': round(bbox[3] - bbox[1], 6) if bbox[1] is not None and bbox[3] is not None else '',
                        'Has_Affected_Clients': any(str(result['disaster_info'].get('full_properties', {}).get('eventid', '')) == str(event_id) for result in results),
                        'Affected_Clients_Count': sum(1 for result in results if str(result['disaster_info'].get('full_properties', {}).get('eventid', '')) == str(event_id))
                    }
                    geometry_data.append(row)

            if geometry_data:
                df_geometry = pd.DataFrame(geometry_data)
                df_geometry.to_excel(writer, sheet_name='Geometry_Summary', index=False)

        print(f"✅ Excel file successfully created: {output_file}")
        # Sheets 7 and 8 are written only when at least one cached event has properties
        print(f"📊 Created {8 if geometry_data else 6} sheets with comprehensive analysis")
        return output_file

    except Exception as e:
        print(f"❌ Error creating Excel file: {e}")
        return None

def export_geometries_geojson(geometry_cache, output_file='disaster_geometries.geojson'):
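    """
    Export unified geometries as a GeoJSON FeatureCollection

    Parameters:
    geometry_cache (dict): Geometry cache data keyed by event ID
    output_file (str): Output GeoJSON file path

    Returns:
    bool: True if the file was written, False otherwise
    """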
    try:
        features = []

        for event_id, cache_data in geometry_cache.items():
            if cache_data.get('geojson'):
                # Tolerate cache entries with missing or null properties
                props = cache_data.get('properties') or {}
                feature = {
                    "type": "Feature",
                    "properties": {
                        "event_id": event_id,
                        "event_type": props.get('eventtype', ''),
                        "event_name": props.get('name', ''),
                        "alert_level": props.get('alertlevel', ''),
                        "country": props.get('country', ''),
                        "from_date": props.get('fromdate', ''),
                        "to_date": props.get('todate', '')
                    },
                    "geometry": cache_data['geojson']
                }
                features.append(feature)

        geojson_output = {
            "type": "FeatureCollection",
            "features": features
        }

        with open(output_file, 'w', encoding='utf-8') as f:
            json.dump(geojson_output, f, indent=2, ensure_ascii=False)

        print(f"🗺️  Geometries exported to: {output_file}")
        return True

    except Exception as e:
        print(f"❌ Error exporting geometries: {e}")
        return False

def interactive_enhanced_setup():
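    """
    Interactive setup for enhanced analysis

    Prompts for the client CSV and the disaster JSON file. Each file can be
    uploaded (Google Colab only), picked from /content, or entered as a path.

    Returns:
    tuple: (csv_file_path, json_file_path)
    """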
    print("🌍 Enhanced Disaster Location Analysis Tool")
    print("=" * 60)
    print("🎯 This version uses precise polygon geometries instead of bounding boxes")
    print("📁 File Setup")
    print("=" * 60)

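    # Check existing files (the /content directory only exists on Google Colab)
    content_files = []
    if os.path.isdir('/content'):
        content_files = [f for f in os.listdir('/content') if f.endswith(('.csv', '.json', '.txt'))]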

    if content_files:
        print("📂 Found the following files in /content directory:")
        for i, file in enumerate(content_files, 1):
            print(f"   {i}. {file}")
        print()

    # CSV file setup
    print("🗂️ CSV File Setup:")
    csv_option = input("Choose option [1]Upload new file [2]Use existing file [3]Manual path: ").strip()

    if csv_option == "1":
        print("Please upload your CSV file...")
        try:
            from google.colab import files
            uploaded_csv = files.upload()
            csv_file_path = f"/content/{list(uploaded_csv.keys())[0]}"
        except ImportError:
            csv_file_path = input("Google Colab not detected. Enter CSV file path: ").strip()
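    elif csv_option == "2" and any(f.endswith('.csv') for f in content_files):
        print("Select CSV file:")
        csv_files = [f for f in content_files if f.endswith('.csv')]
        for i, file in enumerate(csv_files, 1):
            print(f"   {i}. {file}")
        # Re-prompt until the number maps to one of the listed files
        while True:
            try:
                csv_choice = int(input("Enter file number: ")) - 1
                if 0 <= csv_choice < len(csv_files):
                    break
                print(f"❌ Enter a number between 1 and {len(csv_files)}")
            except ValueError:
                print("❌ Please enter a valid number")
        csv_file_path = f"/content/{csv_files[csv_choice]}"
    else:
        csv_file_path = input("Enter CSV file full path: ").strip()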

    # JSON file setup
    print("\n🗂️ JSON Disaster Data File Setup:")
    json_option = input("Choose option [1]Upload new file [2]Use existing file [3]Manual path: ").strip()

    if json_option == "1":
        print("Please upload your JSON file...")
        try:
            from google.colab import files
            uploaded_json = files.upload()
            json_file_path = f"/content/{list(uploaded_json.keys())[0]}"
        except ImportError:
            json_file_path = input("Google Colab not detected. Enter JSON file path: ").strip()
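    elif json_option == "2" and any(f.endswith(('.json', '.txt')) for f in content_files):
        print("Select JSON file:")
        json_files = [f for f in content_files if f.endswith(('.json', '.txt'))]
        for i, file in enumerate(json_files, 1):
            print(f"   {i}. {file}")
        # Re-prompt until the number maps to one of the listed files
        while True:
            try:
                json_choice = int(input("Enter file number: ")) - 1
                if 0 <= json_choice < len(json_files):
                    break
                print(f"❌ Enter a number between 1 and {len(json_files)}")
            except ValueError:
                print("❌ Please enter a valid number")
        json_file_path = f"/content/{json_files[json_choice]}"
    else:
        json_file_path = input("Enter JSON file full path: ").strip()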

    print(f"\n✅ CSV file path: {csv_file_path}")
    print(f"✅ JSON file path: {json_file_path}")

    return csv_file_path, json_file_path

def main_enhanced():
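    """
    Main enhanced function with precise geometry analysis

    Runs the interactive file setup, prompts for the search and buffer
    distances, executes the analysis, and optionally saves the results.

    Returns:
    tuple: (results, geometry_cache), or ([], {}) if nothing matched,
           the run was interrupted, or an error occurred
    """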
    try:
        # Interactive file setup
        csv_file_path, json_file_path = interactive_enhanced_setup()

        # Get search distance
        print("\n🎯 Search Distance Configuration")
        print("=" * 40)
        print("💡 Recommended distance ranges:")
        print("   • 10-50km: Close proximity analysis")
        print("   • 50-200km: Medium range analysis")
        print("   • 200km+: Wide area analysis")

        while True:
            try:
                distance_input = input("\nEnter maximum search distance (km): ").strip()
                max_distance_km = float(distance_input)
                if max_distance_km > 0:
                    break
                else:
                    print("❌ Distance must be positive")
            except ValueError:
                print("❌ Please enter a valid number")

        # Get buffer distance
        print("\n🛡️  Buffer Configuration (Optional)")
        print("=" * 40)
        print("💡 Buffer adds safety margin around disaster areas:")
        print("   • 0km: No buffer (exact geometry)")
        print("   • 5-20km: Small safety margin")
        print("   • 50km+: Large safety zone")

        while True:
            try:
                buffer_input = input("Enter buffer distance (km) [0 for no buffer]: ").strip()
                if not buffer_input:
                    # Use a float so default filenames match an explicitly entered 0
                    buffer_km = 0.0
                    break
                buffer_km = float(buffer_input)
                if buffer_km >= 0:
                    break
                else:
                    print("❌ Buffer must be non-negative")
            except ValueError:
                print("❌ Please enter a valid number")

        print("\n🚀 Starting enhanced analysis...")
        print("=" * 60)

        # Execute enhanced analysis
        results, geometry_cache = analyze_disaster_proximity_enhanced(
            csv_file_path, json_file_path, max_distance_km, buffer_km
        )

        if results:
            # Display summary
            print("\n📋 Enhanced Analysis Summary")
            print("=" * 40)
            print(f"🎯 Total matches found: {len(results)}")
            print(f"👥 Unique clients affected: {len(set([r['client_info']['id'] for r in results]))}")
            print(f"🌪️  Unique events processed: {len(geometry_cache)}")

            # Show disaster types
            disaster_types = {}
            for result in results:
                dtype = result['disaster_info']['event_type']
                disaster_types[dtype] = disaster_types.get(dtype, 0) + 1

            print("\n🌪️ Disaster types found:")
            for dtype, count in disaster_types.items():
                print(f"   • {dtype}: {count}")

            # Save options
            print("\n💾 Save Results")
            print("=" * 30)
            save_option = input("Save results to files? [Y/n]: ").lower().strip()

            if save_option in ['', 'y', 'yes']:
                # JSON output
                default_json_name = f"enhanced_disaster_analysis_{max_distance_km}km_buffer{buffer_km}km.json"
                json_filename = input(f"JSON filename (default: {default_json_name}): ").strip()
                if not json_filename:
                    json_filename = default_json_name

                summary = save_enhanced_results(results, geometry_cache, json_filename)

                # Excel output
                excel_option = input("Also save as Excel file? [Y/n]: ").lower().strip()
                if excel_option in ['', 'y', 'yes']:
                    default_excel_name = f"enhanced_disaster_analysis_{max_distance_km}km_buffer{buffer_km}km.xlsx"
                    excel_filename = input(f"Excel filename (default: {default_excel_name}): ").strip()
                    if not excel_filename:
                        excel_filename = default_excel_name

                    excel_path = save_results_to_excel(results, geometry_cache, excel_filename)

                # Option to export geometries
                geojson_exported = False
                export_option = input("Export unified geometries as GeoJSON? [y/N]: ").lower().strip()
                if export_option in ['y', 'yes']:
                    geojson_filename = f"unified_geometries_{max_distance_km}km.geojson"
                    geojson_exported = export_geometries_geojson(geometry_cache, geojson_filename)

                # Download option for Google Colab
                try:
                    from google.colab import files
                    download_option = input("Download files to local machine? [y/N]: ").lower().strip()
                    if download_option in ['y', 'yes']:
                        # Only download files that were actually written
                        if summary:
                            files.download(json_filename)
                        if excel_option in ['', 'y', 'yes'] and excel_path:
                            files.download(excel_filename)
                        if geojson_exported:
                            files.download(geojson_filename)
                        print("📥 File download completed!")
                except ImportError:
                    print("💡 Files saved locally. Google Colab download not available.")
                except Exception as e:
                    print(f"⚠️ Download failed: {e}")

            print("\n🎉 Enhanced analysis completed successfully!")
            print("💡 This analysis used precise polygon geometries for accurate distance calculations")
            return results, geometry_cache

        else:
            print("\n❌ No matches found.")
            print("💡 Suggestions:")
            print("   • Increase search distance")
            print("   • Add buffer distance")
            print("   • Check coordinate data accuracy")
            print("   • Verify disaster data format")
            return [], {}

    except KeyboardInterrupt:
        print("\n\n⏹️ Analysis interrupted by user")
        return [], {}
    except Exception as e:
        print(f"\n❌ Error during execution: {e}")
        print("💡 Please check file paths and internet connection")
        return [], {}

def quick_enhanced_analysis(csv_path=None, json_path=None):
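    """
    Quick enhanced analysis function

    Skips the interactive file setup but still prompts for the search and
    buffer distances.

    Parameters:
    csv_path (str): Client CSV path (defaults to a file under /content)
    json_path (str): Disaster GeoJSON path (defaults to a file under /content)

    Returns:
    tuple: (results, geometry_cache), or ([], {}) if no matches were found
    """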
    # Use default paths or provided paths
    if csv_path is None:
        csv_path = "/content/merged_full_dataset_with_coordinates.csv"
    if json_path is None:
        json_path = "/content/merged_geojson.json"

    print("🚀 Quick Enhanced Analysis Mode")
    print(f"📁 CSV: {csv_path}")
    print(f"📁 JSON: {json_path}")
    print("=" * 50)

    # Get parameters
    while True:
        try:
            distance_input = input("Enter search distance (km): ").strip()
            distance = float(distance_input)
            if distance > 0:
                break
            else:
                print("❌ Distance must be positive")
        except ValueError:
            print("❌ Please enter a valid number")

    while True:
        try:
            buffer_input = input("Enter buffer distance (km) [0]: ").strip()
            if not buffer_input:
                # Use a float so default filenames match an explicitly entered 0
                buffer = 0.0
                break
            buffer = float(buffer_input)
            if buffer >= 0:
                break
            else:
                print("❌ Buffer must be non-negative")
        except ValueError:
            print("❌ Please enter a valid number")

    results, geometry_cache = analyze_disaster_proximity_enhanced(csv_path, json_path, distance, buffer)

    if results:
        print(f"\n📊 Found {len(results)} matches")

        # Save results
        save_option = input("Save to files? [Y/n]: ").lower().strip()
        if save_option in ['', 'y', 'yes']:
            # JSON output
            json_output_file = f"enhanced_disaster_analysis_{distance}km_buffer{buffer}km.json"
            save_enhanced_results(results, geometry_cache, json_output_file)

            # Excel output
            excel_option = input("Also save as Excel? [Y/n]: ").lower().strip()
            if excel_option in ['', 'y', 'yes']:
                excel_output_file = f"enhanced_disaster_analysis_{distance}km_buffer{buffer}km.xlsx"
                save_results_to_excel(results, geometry_cache, excel_output_file)

        return results, geometry_cache
    else:
        print("❌ No matches found")
        return [], {}

# Execute in Google Colab
if __name__ == "__main__":
    print("🌍 Enhanced Disaster Location Analysis Tool - Google Colab Edition")
    print("=" * 70)
    print("🎯 This version uses precise polygon geometries instead of simple bounding boxes")
    print("📋 Execution Options:")
    print("   1. 🎮 Interactive Enhanced Analysis (Recommended)")
    print("   2. ⚡ Quick Enhanced Analysis")
    print("   3. 📖 View Instructions")
    print("=" * 70)

    try:
        mode = input("Select execution mode [1/2/3]: ").strip()

        if mode == "1":
            print("\n🎮 Starting Interactive Enhanced Analysis...")
            results, geometry_cache = main_enhanced()
        elif mode == "2":
            print("\n⚡ Quick Enhanced Analysis Mode")
            print("💡 Using default file paths; switch to interactive mode to change them")
            results, geometry_cache = quick_enhanced_analysis()
        else:
            # Option 3 and any unrecognized input both show the instructions
            print("\n📖 Instructions:")
            print("=" * 40)
            print("🔧 Manual Function Calls:")
            print("   • main_enhanced() - Full interactive enhanced analysis")
            print("   • quick_enhanced_analysis() - Quick enhanced analysis")
            print("   • analyze_disaster_proximity_enhanced(csv_path, json_path, distance, buffer) - Core enhanced analysis")
            print("\n📁 File Requirements:")
            print("   • CSV: id, name, address, country, type, latitude, longitude")
            print("   • JSON: GeoJSON format with disaster features and URL information for geometry fetching")
            print("\n🆕 Enhanced Features:")
            print("   • Fetches actual polygon geometries from GDACS API")
            print("   • Unifies multiple episodes into single event geometry")
            print("   • Calculates precise distance to polygon boundaries")
            print("   • Optional buffer zones around disaster areas")
            print("   • Exports unified geometries as GeoJSON")
            print("   • Comprehensive Excel reports with multiple analysis sheets")
            print("\n💡 Example:")
            print("   results, cache = quick_enhanced_analysis('/content/data.csv', '/content/disasters.json')")
            print("\n⚠️  Requirements:")
            print("   • Internet connection (for API calls)")
            print("   • Python packages: shapely, requests, openpyxl")
            print("   • Install with: !pip install shapely requests openpyxl")

    except KeyboardInterrupt:
        print("\n👋 Program terminated")
    except Exception as e:
        print(f"\n❌ Error occurred: {e}")
        print("💡 Please restart, or call main_enhanced() directly")
        print("💡 Make sure to install required packages: !pip install shapely requests openpyxl")

🌍 Enhanced Disaster Location Analysis Tool - Google Colab Edition
🎯 This version uses precise polygon geometries instead of simple bounding boxes
📋 Execution Options:
   1. 🎮 Interactive Enhanced Analysis (Recommended)
   2. ⚡ Quick Enhanced Analysis
   3. 📖 View Instructions
Select execution mode [1/2/3]: 1

🎮 Starting Interactive Enhanced Analysis...
🌍 Enhanced Disaster Location Analysis Tool
🎯 This version uses precise polygon geometries instead of bounding boxes
📁 File Setup
📂 Found the following files in /content directory:
   1. merged_full_dataset_with_coordinates.csv
   2. merged_data.csv
   3. merged_geojson.json

🗂️ CSV File Setup:
Choose option [1]Upload new file [2]Use existing file [3]Manual path: 2
Select CSV file:
   1. merged_full_dataset_with_coordinates.csv
   2. merged_data.csv
Enter file number: 1

🗂️ JSON Disaster Data File Setup:
Choose option [1]Upload new file [2]Use existing file [3]Manual path: 2
Select JSON file:
   1. merged_geojson.json
Enter file number: 1

📥 File download completed!

🎉 Enhanced analysis completed successfully!
💡 This analysis used precise polygon geometries for accurate distance calculations
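
The two prompt loops in `quick_enhanced_analysis()` (search distance and buffer) follow the same read, parse, validate pattern. A minimal sketch of how they might be folded into one helper is shown below. The name `prompt_float` and its `read` parameter are hypothetical and not part of the script; `read` defaults to the built-in `input` and exists only so the helper can be driven by scripted answers. Unlike the original loops, this sketch also rejects `nan` and `inf`, which is an added assumption about the desired behavior.

```python
import math


def prompt_float(prompt, *, minimum=0.0, allow_equal=True, default=None, read=input):
    """Ask repeatedly until the user enters a finite float that satisfies the bound.

    minimum      lower bound for the value
    allow_equal  if True the value may equal `minimum`, otherwise it must exceed it
    default      value returned when the user just presses Enter (None = no default)
    read         callable used to read a line (hypothetical hook, defaults to input)
    """
    while True:
        raw = read(prompt).strip()
        if not raw and default is not None:
            return default
        try:
            value = float(raw)
        except ValueError:
            print("Please enter a valid number")
            continue
        if not math.isfinite(value):
            print("Please enter a finite number")
            continue
        if value > minimum or (allow_equal and value == minimum):
            return value
        bound = "at least" if allow_equal else "greater than"
        print(f"Value must be {bound} {minimum}")
```

With such a helper, the two loops would reduce to `distance = prompt_float("Enter search distance (km): ", allow_equal=False)` and `buffer = prompt_float("Enter buffer distance (km) [0]: ", default=0.0)`.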
