# A3: Geo-Python Notebook: Project Outline

### This notebook processes a GPX file to extract basic activity statistics and draws a heat map of the route weighted by speed.
### It uses gpxpy, pandas, and folium with its HeatMap plugin. Links to the documentation for each library are listed below.

#### Links to the libraries used in this notebook
- **gpxpy**: [https://github.com/tkrajina/gpxpy](https://github.com/tkrajina/gpxpy) - Python library for parsing and manipulating GPX files
- **pandas**: [https://pandas.pydata.org/docs/](https://pandas.pydata.org/docs/) - Data manipulation and analysis
- **folium**: [https://python-visualization.github.io/folium/](https://python-visualization.github.io/folium/) - Creates interactive Leaflet maps from Python
- **HeatMap plugin**: [https://python-visualization.github.io/folium/plugins.html#folium.plugins.HeatMap](https://python-visualization.github.io/folium/plugins.html#folium.plugins.HeatMap) - Visualises the heat map data

In [16]:
# Install and Import the required libraries
!pip install gpxpy folium
import gpxpy
import pandas as pd
import folium
from folium.plugins import HeatMap
import math



### The next step is to load the GPX file
#### **gpxpy.parse()** ([documentation here](https://gpxpy.readthedocs.io/en/latest/gpx.html#gpxpy.gpx.GPX.parse)) reads the GPX file and returns a GPX object holding its tracks, segments and points.

### Here I am testing with a hike I recently completed, but `gpx_file` can point to any GPX file that contains a track

In [17]:
gpx_file = "gaisberg.gpx"

with open(gpx_file, 'r') as f:
    gpx = gpxpy.parse(f)

print(f"Loaded {gpx_file}")

Loaded gaisberg.gpx
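
To make it clearer what is being loaded, here is a minimal sketch of the track data inside a GPX file, read with Python's standard `xml.etree.ElementTree` instead of gpxpy. The sample document, its coordinates and the helper `read_track_points` are made up for illustration, and this is not how gpxpy works internally; it only shows the `trk` / `trkseg` / `trkpt` nesting that the extraction loop in the next cell walks through.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-point GPX 1.1 document (invented values, not taken from gaisberg.gpx)
SAMPLE_GPX = """<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk>
    <name>Sample track</name>
    <trkseg>
      <trkpt lat="47.8000" lon="13.1000">
        <ele>650.0</ele>
        <time>2024-01-01T10:00:00Z</time>
      </trkpt>
      <trkpt lat="47.8005" lon="13.1005">
        <ele>655.5</ele>
        <time>2024-01-01T10:01:00Z</time>
      </trkpt>
    </trkseg>
  </trk>
</gpx>
"""

# GPX 1.1 elements live in this XML namespace
NS = {"gpx": "http://www.topografix.com/GPX/1/1"}


def read_track_points(xml_text):
    """Return one dict per <trkpt> with lat, lon, elevation and time (illustrative helper)."""
    root = ET.fromstring(xml_text)
    points = []
    for pt in root.findall(".//gpx:trkpt", NS):
        ele = pt.findtext("gpx:ele", default=None, namespaces=NS)
        points.append({
            "lat": float(pt.get("lat")),
            "lon": float(pt.get("lon")),
            "elevation": float(ele) if ele is not None else None,  # <ele> is optional
            "time": pt.findtext("gpx:time", default=None, namespaces=NS),  # kept as text here
        })
    return points


for p in read_track_points(SAMPLE_GPX):
    print(p)
```

gpxpy does the same job more completely (for example, it converts the timestamps to `datetime` objects), which is why the notebook uses it rather than parsing the XML by hand.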


### Next, the GPS data will be extracted and some stats can then be calculated from it
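
The distance between two consecutive points is computed with the haversine formula. Written out to match `haversine_distance` in the next cell, with latitudes $\varphi$ and longitudes $\lambda$ in radians and $R = 6\,371\,000$ m:

$$
a = \sin^2\!\left(\frac{\varphi_2 - \varphi_1}{2}\right) + \cos\varphi_1 \, \cos\varphi_2 \, \sin^2\!\left(\frac{\lambda_2 - \lambda_1}{2}\right), \qquad d = 2R \, \arcsin\!\left(\sqrt{a}\right)
$$

The speed for each step then follows from the distance $d$ (in metres) and the time difference $\Delta t$ (in seconds) between the two points:

$$
v_{\text{km/h}} = 3.6 \cdot \frac{d}{\Delta t}
$$

Note that this treats the Earth as a sphere and measures horizontal distance only, so the climb between two points does not add to the distance.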

In [7]:
def haversine_distance(lat1, lon1, lat2, lon2):
    R = 6371000  # Earth radius in meters
    lat1, lon1, lat2, lon2 = map(math.radians, [lat1, lon1, lat2, lon2])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat/2)**2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon/2)**2
    return R * 2 * math.asin(math.sqrt(a))

# Extract all points
points = []
for track in gpx.tracks:
    for segment in track.segments:
        for point in segment.points:
            points.append({
                'lat': point.latitude,
                'lon': point.longitude,
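                # Keep missing elevation as NaN; defaulting to 0 m would distort min/gain/loss
                'elevation': float('nan') if point.elevation is None else point.elevation,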
                'time': point.time
            })

# Create DataFrame and calculate distances/speeds
df = pd.DataFrame(points)
df['distance'] = 0.0   # float columns, so float values can be assigned without dtype warnings
df['speed_kmh'] = 0.0
total_distance = 0

for i in range(1, len(df)):
    dist = haversine_distance(df.iloc[i-1]['lat'], df.iloc[i-1]['lon'],
                             df.iloc[i]['lat'], df.iloc[i]['lon'])
    df.iloc[i, df.columns.get_loc('distance')] = dist
    total_distance += dist
    
    # Calculate speed if we have time data
    if pd.notna(df.iloc[i]['time']) and pd.notna(df.iloc[i-1]['time']):
        time_diff = (df.iloc[i]['time'] - df.iloc[i-1]['time']).total_seconds()
        if time_diff > 0:
            speed = (dist / time_diff) * 3.6  # Convert to km/h
            df.iloc[i, df.columns.get_loc('speed_kmh')] = speed

# Calculate statistics
distance_km = total_distance / 1000

# Duration
if df['time'].notna().any():
    start_time = df['time'].dropna().iloc[0]
    end_time = df['time'].dropna().iloc[-1]
    duration = end_time - start_time
    duration_hours = duration.total_seconds() / 3600
else:
    duration_hours = 0

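# Speed stats
# Note: this is the mean of the per-point speeds while moving (speed > 0),
# not total distance divided by total duration
avg_speed = df[df['speed_kmh'] > 0]['speed_kmh'].mean()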
max_speed = df['speed_kmh'].max()

# Elevation stats
min_elevation = df['elevation'].min()
max_elevation = df['elevation'].max()
elevation_diff = df['elevation'].diff()  # change in elevation between consecutive points
elevation_gain = elevation_diff[elevation_diff > 0].sum()
elevation_loss = abs(elevation_diff[elevation_diff < 0].sum())

# Print results

print("\n=== HIKE STATISTICS ===")
print(f"Total points: {len(df)}")
print(f"Distance: {distance_km:.2f} km")
print(f"Duration: {duration_hours:.2f} hours")
print(f"Average Speed: {avg_speed:.2f} km/h")
print(f"Max Speed: {max_speed:.2f} km/h")
print(f"Elevation: {min_elevation:.0f}m - {max_elevation:.0f}m")
print(f"Elevation Gain: {elevation_gain:.0f}m")
print(f"Elevation Loss: {elevation_loss:.0f}m")


=== HIKE STATISTICS ===
Total points: 2660
Distance: 10.78 km
Duration: 2.39 hours
Average Speed: 4.98 km/h
Max Speed: 14.20 km/h
Elevation: 632m - 1285m
Elevation Gain: 820m
Elevation Loss: 816m
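
As a small check on the calculations above, here is a sketch that applies the same steps (haversine distance, speed in km/h, elevation gain and loss) to a three-point toy track using only the standard library. The coordinates, elevations and timestamps are invented, and `haversine_m` is just a renamed copy of the function above so the block runs on its own.

```python
import math
from datetime import datetime, timedelta


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6371000):
    """Great-circle distance in metres between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, [lat1, lon1, lat2, lon2])
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return radius_m * 2 * math.asin(math.sqrt(a))


# Hypothetical track: (lat, lon, elevation in m, time), one point per minute
start = datetime(2024, 1, 1, 10, 0, 0)
track = [
    (47.8000, 13.1000, 650.0, start),
    (47.8010, 13.1000, 660.0, start + timedelta(seconds=60)),
    (47.8020, 13.1000, 655.0, start + timedelta(seconds=120)),
]

total_m = 0.0
gain_m = 0.0
loss_m = 0.0
speeds_kmh = []

# Walk over consecutive pairs of points, as the DataFrame loop above does
for (lat1, lon1, ele1, t1), (lat2, lon2, ele2, t2) in zip(track, track[1:]):
    dist = haversine_m(lat1, lon1, lat2, lon2)
    total_m += dist

    seconds = (t2 - t1).total_seconds()
    if seconds > 0:
        speeds_kmh.append(dist / seconds * 3.6)  # m/s to km/h

    delta = ele2 - ele1
    if delta > 0:
        gain_m += delta
    else:
        loss_m += -delta

print(f"Distance: {total_m:.1f} m")
print(f"Speeds: {[round(s, 2) for s in speeds_kmh]} km/h")
print(f"Gain: {gain_m:.0f} m, loss: {loss_m:.0f} m")
```

Each step here is 0.001° of latitude, which is roughly 111 m, so the total should come out close to 222 m. One thing worth keeping in mind: summing every positive elevation step also counts GPS noise as climbing, so the gain and loss reported for a real track are probably somewhat overestimated unless the elevation is smoothed first.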


### Finally, a heat map can be created using folium and its HeatMap plugin

In [18]:
# Set the map to centre on the location of the GPS file
center_lat = df['lat'].mean()
center_lon = df['lon'].mean()

m = folium.Map(location=[center_lat, center_lon], zoom_start=15)

# Add track line
track_points = df[['lat', 'lon']].values.tolist()
folium.PolyLine(track_points, color='black', weight=2).add_to(m)

# Add speed heatmap
speed_data = []
speeds = df[df['speed_kmh'] > 0]['speed_kmh']
if not speeds.empty:
    min_speed, max_speed = speeds.min(), speeds.max()
    for _, row in df.iterrows():
        if row['speed_kmh'] > 0:
            intensity = (row['speed_kmh'] - min_speed) / (max_speed - min_speed) if max_speed > min_speed else 0.5
            speed_data.append([row['lat'], row['lon'], intensity])

if speed_data:
    HeatMap(speed_data, radius=12, blur=10).add_to(m)

# Add start/end markers
folium.Marker([df.iloc[0]['lat'], df.iloc[0]['lon']], 
              popup='Start', icon=folium.Icon(color='green')).add_to(m)
folium.Marker([df.iloc[-1]['lat'], df.iloc[-1]['lon']], 
              popup='End', icon=folium.Icon(color='red')).add_to(m)

# Display map
m
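
The heat map weights come from min-max scaling of the speeds. Here is a minimal standalone sketch of that step, using a hypothetical helper `min_max_intensity` and made-up speed values, to show how the weights behave:

```python
def min_max_intensity(values, fallback=0.5):
    """Scale values linearly to the range 0..1 (illustrative helper, not part of folium).

    If all values are equal there is no range to scale over, so every
    value gets the fallback weight, as in the cell above.
    """
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        return [fallback for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical speeds in km/h: mostly walking pace plus one fast outlier
speeds = [2.0, 4.0, 5.0, 14.0]
for speed, weight in zip(speeds, min_max_intensity(speeds)):
    print(f"{speed:5.1f} km/h -> intensity {weight:.2f}")
```

Two things follow from this scaling. The slowest point always gets a weight of 0, and a single fast outlier pushes all the typical walking speeds towards the low end of the scale. It is also worth noting that, as far as I understand the plugin, the heat map adds up the weights of nearby points, so a stretch with many closely spaced points can look "hot" even if the speed there was low. Clipping the speeds to a percentile range before scaling might give a more even colour spread, but I have not tested that here.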