<a href="https://colab.research.google.com/github/karlbuscheck/ode-to-gravel/blob/main/ode_to_gravel.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# An Ode to Gravel

There's nothing better than the sound of gravel crunching under your tires.

My gravel bike, which I bought over the summer, is undoubtedly the best purchase I've ever made. This weekend, while riding in the Marin Headlands, I had an idea: I wonder what I'd find if I downloaded all my performance data from Strava, dumped it into a Colab notebook and started digging around?

And so, later that night when I got home, I did.

What follows is a look back at eight months of gravel riding, 90 rides to be exact. I asked a few simple questions: how far did I ride, how much time did I spend on my bike, and how high did I climb? And then I ended with a fun one: what day of the week was my most prolific?

## Load and explore the data

In [5]:
# Import the file -- the "activities" CSV from Strava
from google.colab import files
uploaded = files.upload()

Saving activities.csv to activities.csv


In [6]:
# Load the CSV as a DataFrame
import pandas as pd
activities = pd.read_csv('activities.csv')
activities.head()

Unnamed: 0,Activity ID,Activity Date,Activity Name,Activity Type,Activity Description,Elapsed Time,Distance,Max Heart Rate,Relative Effort,Commute,...,Timer Time,Total Cycles,Recovery,With Pet,Competition,Long Run,For a Cause,With Kid,Downhill Distance,Media
0,502826513,"Feb 26, 2016, 2:59:42 PM",The run that came before all the other ones,Run,,9057,18.04,,,False,...,,,,,,,,,,media/59D3D043-97E2-446C-B0E2-966496F2B7A8.jpg
1,503758481,"Feb 27, 2016, 6:35:59 PM",Saturday Morning Gym Cool Down,Run,,1528,4.0,,,False,...,,,,,,,,,,
2,505587866,"Feb 29, 2016, 5:01:00 PM",Out and back to King's Landing with a lake lap...,Run,,8900,15.9,,,False,...,,,,,,,,,,
3,507324371,"Mar 3, 2016, 12:31:16 AM",Down to the harbor,Run,,3038,8.54,,,False,...,,,,,,,,,,
4,508549940,"Mar 4, 2016, 5:18:00 PM",Through the Port,Run,,6080,13.52,,,False,...,,,,,,,,,,


In [8]:
# Check the shape of the DataFrame
activities.shape

(784, 101)

So, we have 784 activities and 101 columns. Let's take a closer look at those columns.

In [9]:
# List the column names
activities.columns

Index(['Activity ID', 'Activity Date', 'Activity Name', 'Activity Type',
       'Activity Description', 'Elapsed Time', 'Distance', 'Max Heart Rate',
       'Relative Effort', 'Commute',
       ...
       'Timer Time', 'Total Cycles', 'Recovery', 'With Pet', 'Competition',
       'Long Run', 'For a Cause', 'With Kid', 'Downhill Distance', 'Media'],
      dtype='object', length=101)

To pare things down to gravel rides, let's take a look at the `Activity Type` column.

In [10]:
# Check the 'Activity Type' column
activities['Activity Type'].value_counts()

Activity Type,count
Run,292
Walk,240
Hike,148
Ride,90
Snowboard,8
Sail,4
Kayaking,1
Snowshoe,1


Now, to isolate those rides.

In [12]:
gravel = activities[activities['Activity Type'] == 'Ride'].copy()
gravel.shape

(90, 101)

As expected, there are 90 rides. This dataset goes way back in time, about 10 years. So, let's make sure we just grab the rides that occurred on or after June 25, 2025, the day I logged my first ride on my new gravel bike.

In [15]:
# Set the date column as datetime
gravel['Activity Date'] = pd.to_datetime(gravel['Activity Date'])



In [16]:
# Now, filter for the date
gravel = gravel[gravel['Activity Date'] >= '2025-06-25']
gravel.shape

(90, 101)

Looks like there were no rides to filter out, after all. Let's do a double check.

In [17]:
# Double check that there are no rides to filter out
gravel["Activity Date"].min(), gravel["Activity Date"].max()

(Timestamp('2025-06-25 23:57:00'), Timestamp('2026-02-21 22:53:26'))
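
The `pd.to_datetime` call above leaves pandas to guess the layout of the date strings. As a minimal sketch, here is how the same parsing might look with an explicit format. The format string is my assumption, based only on the sample values shown in the first `head()` output, and the two sample strings are copied from that table.

```python
import pandas as pd

# Sample strings copied from the 'Activity Date' column shown earlier
raw_dates = pd.Series([
    "Feb 26, 2016, 2:59:42 PM",
    "Mar 3, 2016, 12:31:16 AM",
])

# Assumed layout: abbreviated month, day, year, then a 12-hour clock with AM/PM
strava_format = "%b %d, %Y, %I:%M:%S %p"

# Parse with the explicit format instead of letting pandas infer it
parsed = pd.to_datetime(raw_dates, format=strava_format)
print(parsed)
```

If the export ever used a different layout, this call would raise an error instead of quietly guessing, which may or may not be what you want.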

Now, to dig into the `gravel` DataFrame.

In [19]:
# Display the first few rows
gravel.head()

Unnamed: 0,Activity ID,Activity Date,Activity Name,Activity Type,Activity Description,Elapsed Time,Distance,Max Heart Rate,Relative Effort,Commute,...,Timer Time,Total Cycles,Recovery,With Pet,Competition,Long Run,For a Cause,With Kid,Downhill Distance,Media
538,14917564412,2025-06-25 23:57:00,First official ride on the new gravel bike,Ride,,5031,16.43,,,False,...,,,,,,,,,,media/FC371489-4245-47F1-BDAC-C468700FA19E.jpg
539,14923903448,2025-06-26 16:00:49,Morning ride to Crissy Field by way of the Wav...,Ride,,3966,10.02,,,False,...,,,,,,,,,,media/5B29C802-96C1-4037-BA1A-E471A4925E4C.jpg...
540,14933777181,2025-06-27 15:32:46,Riding past Crissy Field to the start of the b...,Ride,,4681,11.61,,,False,...,,,,,,,,,,media/D6E87780-0214-4F48-A326-88F7E3864465.jpg...
541,14936000026,2025-06-27 20:38:33,Quick spin to Sports Basement,Ride,,5856,10.22,,,False,...,,,,,,,,,,media/16F18F41-1D57-4321-8535-94054103A303.jpg
543,14957039453,2025-06-29 19:38:45,Over the bridge and through the fog: Riding to...,Ride,,4937,19.69,,,False,...,,,,,,,,,,media/17199BF1-B6A8-42C6-9CC3-6CA3156D84FB.jpg...


Let's begin by seeing how many miles I've ridden. The `Distance` column immediately presents a challenge: it's logged in kilometers, not miles.

## Dig into the stats: How many miles have I ridden?

We'll turn kilometers into miles with a quick line of code.

In [20]:
# Change kilometers to miles
gravel['Distance_miles'] = gravel['Distance'] * 0.621371

In [21]:
# Check to make sure we have the new column
gravel.head()

Unnamed: 0,Activity ID,Activity Date,Activity Name,Activity Type,Activity Description,Elapsed Time,Distance,Max Heart Rate,Relative Effort,Commute,...,Total Cycles,Recovery,With Pet,Competition,Long Run,For a Cause,With Kid,Downhill Distance,Media,Distance_miles
538,14917564412,2025-06-25 23:57:00,First official ride on the new gravel bike,Ride,,5031,16.43,,,False,...,,,,,,,,,media/FC371489-4245-47F1-BDAC-C468700FA19E.jpg,10.209126
539,14923903448,2025-06-26 16:00:49,Morning ride to Crissy Field by way of the Wav...,Ride,,3966,10.02,,,False,...,,,,,,,,,media/5B29C802-96C1-4037-BA1A-E471A4925E4C.jpg...,6.226137
540,14933777181,2025-06-27 15:32:46,Riding past Crissy Field to the start of the b...,Ride,,4681,11.61,,,False,...,,,,,,,,,media/D6E87780-0214-4F48-A326-88F7E3864465.jpg...,7.214117
541,14936000026,2025-06-27 20:38:33,Quick spin to Sports Basement,Ride,,5856,10.22,,,False,...,,,,,,,,,media/16F18F41-1D57-4321-8535-94054103A303.jpg,6.350412
543,14957039453,2025-06-29 19:38:45,Over the bridge and through the fog: Riding to...,Ride,,4937,19.69,,,False,...,,,,,,,,,media/17199BF1-B6A8-42C6-9CC3-6CA3156D84FB.jpg...,12.234795


There it is! And now to quickly sum it up.

In [25]:
# Sum up the total number of miles
gravel['Distance_miles'].sum()

np.float64(997.8410477699999)

Look at that -- right on the doorstep of **1K miles in eight months**. Not bad considering I was off my bike for *a while* when I was recovering from thumb surgery.

In [26]:
# Display the total number of gravel miles
total_miles = gravel["Distance_miles"].sum()
print(f"{total_miles:,.1f} total gravel miles")

997.8 total gravel miles


Let's figure out just how much time I missed.

In [32]:
# Sort chronologically
gravel = gravel.sort_values("Activity Date")

In [33]:
# Compute the days between rides
gravel["Days_since_last_ride"] = gravel["Activity Date"].diff().dt.days

In [34]:
# Find the biggest gap
gravel["Days_since_last_ride"].max()

76.0

Whoa. 76 days. That's two-and-a-half months.
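
One detail worth knowing about that number: `.dt.days` keeps only the whole-day part of each gap, so "76 days" means at least 76 full days and less than 77. Here is a small sketch of that behavior on three made-up timestamps (hypothetical values, not from my Strava data).

```python
import pandas as pd

# Hypothetical ride timestamps, invented for illustration
rides = pd.DataFrame({
    "Activity Date": pd.to_datetime([
        "2025-01-01 08:00:00",
        "2025-01-02 20:00:00",  # 1 day 12 hours after the first ride
        "2025-01-10 07:00:00",  # 7 days 11 hours after the second ride
    ])
}).sort_values("Activity Date")

# diff() gives the time since the previous row; the first row has no previous ride
gaps = rides["Activity Date"].diff()

# .dt.days drops the leftover hours and keeps only whole days
rides["Days_since_last_ride"] = gaps.dt.days

print(rides)
```

The first row comes back as `NaN` because there is nothing to subtract from, which is also why the column is stored as floats.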

In [35]:
# Confirm the exact timing
gravel.loc[
    gravel["Days_since_last_ride"].idxmax(),
    ["Activity Date", "Days_since_last_ride"]
]

Unnamed: 0,696
Activity Date,2025-12-09 21:25:22
Days_since_last_ride,76.0


So, as it turns out, it was **997.8 miles in about 5.5 months** of actual riding, once the 76-day layoff is taken out.

In [36]:
# Determine how many miles I averaged per (active) month
total_days = (gravel["Activity Date"].max() - gravel["Activity Date"].min()).days
gap = gravel["Days_since_last_ride"].max()

active_days = total_days - gap
active_months = active_days / 30

total_miles = gravel["Distance_miles"].sum()

miles_per_active_month = total_miles / active_months

print(f"Active riding months: {active_months:.1f}")
print(f"Miles per active month: {miles_per_active_month:.1f}")

Active riding months: 5.5
Miles per active month: 182.5
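
To show where the 5.5 comes from, here is the same arithmetic as a standalone sketch using only the standard library. The timestamps, the 76-day gap and the mileage are copied from the outputs above; the 30-day month is the same approximation the cell above uses.

```python
from datetime import datetime

# Values copied from the notebook outputs above
first_ride = datetime(2025, 6, 25, 23, 57, 0)
last_ride = datetime(2026, 2, 21, 22, 53, 26)
longest_gap_days = 76
total_miles = 997.8410477699999

# Whole days between the first and last ride
total_days = (last_ride - first_ride).days

# Remove the layoff, then convert to months (30-day month is an approximation)
active_days = total_days - longest_gap_days
active_months = active_days / 30

miles_per_active_month = total_miles / active_months

print(f"Total span: {total_days} days")
print(f"Active days: {active_days}")
print(f"Active months: {active_months:.1f}")
print(f"Miles per active month: {miles_per_active_month:.1f}")
```

Only the single longest gap is subtracted here, so shorter breaks still count as active time.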


182.5 miles per *active* month. Next, let's check how much time I spent on the bike.

## How many hours?

First, we need to consider the difference between elapsed time, which includes every stop, and moving time, which only counts the time the bike was actually rolling.

In [27]:
# Check for all the time columns
[col for col in gravel.columns if "Time" in col]

['Elapsed Time',
 'Elapsed Time.1',
 'Moving Time',
 'Uphill Time',
 'Downhill Time',
 'Other Time',
 'Start Time',
 'Weather Observation Time',
 'Sunrise Time',
 'Sunset Time',
 'Timer Time']

We'll focus on elapsed time and moving time and change them into hours.

In [28]:
# Up first, moving time
gravel['Hours_moving'] = gravel['Moving Time'] / 3600
gravel['Hours_moving'].sum()

np.float64(109.465)

In [30]:
# Now, elapsed time
gravel['Hours_elapsed'] = gravel['Elapsed Time'] / 3600
gravel['Hours_elapsed'].sum()

np.float64(136.86499999999998)

Finally, let's print it all out neatly and find the difference, or how much time I spent sitting at Equator drinking iced coffee or catching my breath in the Marin Headlands.

In [31]:
# Display the clean outputs
moving = gravel['Moving Time'].sum() / 3600
elapsed = gravel['Elapsed Time'].sum() / 3600

print(f'Moving hours: {moving:,.1f}')
print(f'Elapsed hours: {elapsed:,.1f}')
print(f'Stopped time difference: {elapsed - moving:,.1f} hours')

Moving hours: 109.5
Elapsed hours: 136.9
Stopped time difference: 27.4 hours
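
Another way to read that gap is as a share of total time out. This is a quick sketch using the two totals printed above, typed in by hand instead of read from the DataFrame.

```python
# Totals copied from the outputs above, in hours
moving_hours = 109.465
elapsed_hours = 136.865

# Time spent stopped, and what fraction of the total outing that represents
stopped_hours = elapsed_hours - moving_hours
stopped_share = stopped_hours / elapsed_hours

print(f"Stopped: {stopped_hours:.1f} hours ({stopped_share:.0%} of elapsed time)")
```

That works out to roughly a fifth of my time out the door spent not pedaling.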


To wrap up this section, we'll break it down on a weekly and monthly basis -- accounting for that 76-day layoff.

In [41]:
# Display the weekly and monthly counts
active_weeks = active_days / 7

moving_hours  = gravel['Moving Time'].sum() / 3600
elapsed_hours = gravel['Elapsed Time'].sum() / 3600

print(f'Avg MOVING hours / active week:  {moving_hours/active_weeks:.2f}')
print(f'Avg ELAPSED hours / active week: {elapsed_hours/active_weeks:.2f}')

print(f'Avg MOVING hours / active month:  {moving_hours/active_months:.1f}')
print(f'Avg ELAPSED hours / active month: {elapsed_hours/active_months:.1f}')

Avg MOVING hours / active week:  4.67
Avg ELAPSED hours / active week: 5.84
Avg MOVING hours / active month:  20.0
Avg ELAPSED hours / active month: 25.0


20 hours a month on the bike (plus an extra five hours for iced coffee and the Headlands views). Not too bad!

## How many feet did I climb?

First, let's find the proper elevation column.

In [45]:
# Search for all the elevation columns
[col for col in gravel.columns if 'Elevation' in col]

['Elevation Gain', 'Elevation Loss', 'Elevation Low', 'Elevation High']

In [51]:
# Take a closer look at 'Elevation Gain'
gravel['Elevation Gain'].sort_values()

Unnamed: 0,Elevation Gain
584,9.2
607,11.9
744,14.4
554,22.5
595,26.0
...,...
570,507.4
580,533.3
573,590.4
583,692.0


The table above confirms that Strava is measuring `Elevation Gain` in meters -- not feet. 1 meter = 3.28084 feet. So, we can quickly change the units and sum this up.

In [53]:
# Total elevation gain in meters
total_gain_m = gravel['Elevation Gain'].sum()

# Change the gain from meters to feet
total_gain_ft = total_gain_m * 3.28084

# Display the final output
print(f'Total elevation gain: {total_gain_ft:,.0f} feet')

Total elevation gain: 59,789 feet


To put that into context, that's like climbing from sea level to the top of Mount Everest... twice.
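
Here is the arithmetic behind that comparison as a rough sketch. The Everest height (8,848.86 meters) is an outside figure I'm bringing in, not something from the Strava data, and the total gain is the rounded value printed above.

```python
# Rounded total from the output above, in feet
total_gain_ft = 59_789

# Outside figure (assumption, not from the Strava export): Everest's height in meters
everest_m = 8_848.86
everest_ft = everest_m * 3.28084

# How many sea-level-to-summit climbs the total gain represents
everests = total_gain_ft / everest_ft

print(f"Everest height: {everest_ft:,.0f} ft")
print(f"Equivalent climbs: {everests:.2f}")
```

So "twice" is a fair summary, with a little left over.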

Finally, let's figure out the averages per ride, week and month.

In [59]:
# Let's compute the elevation averages

# Total elevation gain (already computed earlier)
# total_gain_ft
# active_weeks
# active_months

# Define the ride count
num_rides = len(gravel)

# Define the averages
avg_gain_per_ride = total_gain_ft / num_rides
avg_gain_per_active_week = total_gain_ft / active_weeks
avg_gain_per_active_month = total_gain_ft / active_months

# Display the final results
print(f'Avg elevation gain per ride: {avg_gain_per_ride:,.0f} ft')
print(f'Avg elevation gain per active week: {avg_gain_per_active_week:,.0f} ft')
print(f'Avg elevation gain per active month: {avg_gain_per_active_month:,.0f} ft')

Avg elevation gain per ride: 664 ft
Avg elevation gain per active week: 2,552 ft
Avg elevation gain per active month: 10,937 ft


## What day of the week is my most prolific?

Let's conclude this on a fun note and calculate which day of the week is my best in terms of mileage, elevation gain and moving time.

In [62]:
# Extract weekday name from 'Activity Date' timestamp
gravel['Weekday'] = pd.to_datetime(gravel['Activity Date']).dt.day_name()

In [72]:
# Define weekday order
weekday_order = [
    "Monday", "Tuesday", "Wednesday",
    "Thursday", "Friday", "Saturday", "Sunday"
]

# Change to ordered categorical
gravel["Weekday"] = pd.Categorical(
    gravel["Weekday"],
    categories=weekday_order,
    ordered=True
)

In [73]:
# Group totals by weekday
weekday_summary = (
    gravel
    .groupby('Weekday', observed=False)  # keep all seven weekday categories
    .agg({
        'Distance_miles': 'sum',
        'Elevation Gain': 'sum',
        'Moving Time': 'sum'
    })
)

# Change moving time to hours
weekday_summary['Moving_Hours'] = weekday_summary['Moving Time'] / 3600

# Change elevation gain to feet in summary table
weekday_summary["Elevation_Gain_ft"] = weekday_summary["Elevation Gain"] * 3.28084

# Reorder columns for clarity
weekday_summary = weekday_summary[[
    "Distance_miles",
    "Elevation_Gain_ft",
    "Moving_Hours"
]]

# Display the day-by-day table
weekday_summary



Weekday,Distance_miles,Elevation_Gain_ft,Moving_Hours
Monday,215.236701,14058.727484,24.226111
Tuesday,104.384114,7266.404432,11.243333
Wednesday,124.690519,8728.346736,15.045278
Thursday,138.851564,5226.37812,13.399167
Friday,158.69194,9123.687956,17.350556
Saturday,167.98765,11528.87176,18.973056
Sunday,87.998561,3856.955504,9.2275


In [71]:
# Find the winners
most_miles_day = weekday_summary["Distance_miles"].idxmax()
most_gain_day  = weekday_summary["Elevation_Gain_ft"].idxmax()
most_hours_day = weekday_summary["Moving_Hours"].idxmax()

# Display the final results
print(f"Most miles ridden on: {most_miles_day} ({weekday_summary.loc[most_miles_day,'Distance_miles']:.1f} mi)")
print(f"Most elevation gained on: {most_gain_day} ({weekday_summary.loc[most_gain_day,'Elevation_Gain_ft']:,.0f} ft)")
print(f"Most hours ridden on: {most_hours_day} ({weekday_summary.loc[most_hours_day,'Moving_Hours']:.1f} hrs)")

Most miles ridden on: Monday (215.2 mi)
Most elevation gained on: Monday (14,059 ft)
Most hours ridden on: Monday (24.2 hrs)


My most prolific day? It's Monday across the board. And that's no accident. I always set out to start my week on the right note: a long ride and a big climb.
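
One caveat on totals: they reward both how often I ride on a given day and how far I go each time. As a possible follow-up, here is a sketch of splitting those apart with a ride count and a per-ride average. The mini-dataset is entirely made up for illustration; the column names mirror the ones used above.

```python
import pandas as pd

# Hypothetical mini-dataset, invented for illustration only
toy = pd.DataFrame({
    "Weekday": ["Monday", "Monday", "Monday", "Saturday"],
    "Distance_miles": [10.0, 12.0, 8.0, 25.0],
})

# Total miles, number of rides, and average miles per ride for each weekday
per_day = toy.groupby("Weekday")["Distance_miles"].agg(
    total="sum",
    rides="count",
    avg_per_ride="mean",
)

print(per_day)
```

In this toy example Monday wins on total miles while Saturday wins on miles per ride, so the two views can disagree. Running the same grouping on `gravel` would show which one applies to my Mondays.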