# üåç Geospatial Analysis

Where do the rides actually go? In this notebook, we calculate the physical distance of each trip using the Haversine formula. We'll also identify 'Leisure Loops'‚Äîtrips that start and end at the same location‚Äîwhich are often a key indicator of recreational rather than utility-based riding.

### 1. Tools and Data
We use NumPy for the heavy-duty trigonometric calculations needed for distance mapping on a sphere. We'll be working with the trip data enriched with coordinates from our pipeline.

In [1]:
import pandas as pd
import numpy as np
from pathlib import Path

In [2]:
DATA_DIR = Path("../data/processed")
input_path = DATA_DIR / "fact_trips.csv"
output_path = DATA_DIR / "fact_trips_geo.csv"

df = pd.read_csv(input_path)
df = df.dropna(subset=['start_lat', 'start_lng', 'end_lat', 'end_lng'])

### 2. Calculating Trip Distance
We implement the Haversine formula to find the 'as-the-crow-flies' distance between start and end stations. We also flag any trips that return to the starting station as potential leisure rides.

In [3]:
def haversine(lat1, lon1, lat2, lon2):
    lat1, lon1, lat2, lon2 = map(np.radians, [lat1, lon1, lat2, lon2])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2
    c = 2 * np.arcsin(np.sqrt(a))
    return 6371 * c

df['dist_km'] = haversine(df['start_lat'], df['start_lng'], df['end_lat'], df['end_lng'])
df['is_leisure_loop'] = (df['dist_km'] < 0.05) & (df['start_station_name'] == df['start_station_name'])

### 3. Saving Geospatial Insights
Finally, we save these new metrics. This data will be crucial for understanding the behavioral differences between commuters (who usually have a clear destination) and casual riders (who might be exploring).

In [4]:
df[['ride_id', 'dist_km', 'is_leisure_loop']].to_csv(output_path, index=False)

print("-" * 40)
print(f"\u2705 SUCCESS: Geospatial metrics saved to {output_path}")
print(f"Leisure Loops identified: {df['is_leisure_loop'].sum():,}")

----------------------------------------
‚úÖ SUCCESS: Geospatial metrics saved to ..\data\processed\fact_trips_geo.csv
Leisure Loops identified: 442,193
