# Workout Data Analysis Notebook

This notebook lets you query your workout database interactively with natural language questions.

It uses the `workout_assistant.py` module we created to convert each question into a SQL query, run it, and display the results.

In [30]:
# Import required libraries
import sys
import os
import duckdb
import pandas as pd
import importlib

# Import our workout assistant functions
import workout_assistant
importlib.reload(workout_assistant)  # Force reload to pick up changes
from workout_assistant import generate_sql_fallback, execute_query, format_results

print("Workout Assistant loaded successfully!")
print("You can now ask natural language questions about your workout data.")

Using enhanced fallback SQL generation (no AI API key found)
Tip: Set OPENAI_API_KEY or ANTHROPIC_API_KEY environment variable for smarter queries
Workout Assistant loaded successfully!
You can now ask natural language questions about your workout data.
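
The startup message above suggests that the module picks between an AI-backed generator and the rule-based fallback depending on which API keys are set. How `workout_assistant.py` makes that choice is not shown in this notebook, so the following is only a minimal sketch of such a check. The helper name `choose_sql_backend` and the returned labels are hypothetical.

```python
import os

def choose_sql_backend(env=None):
    """Hypothetical helper: pick a SQL-generation backend from environment variables."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    if env.get("OPENAI_API_KEY"):
        return "openai"
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"
    # No key set (or set to an empty string): use the rule-based fallback.
    return "fallback"

print(choose_sql_backend())
```

With neither variable set, this sketch reports the fallback, which matches the situation in the output above.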


## Quick Database Overview

Let's first look at the size of your workout database, the date range it covers, and the breakdown by equipment type.

In [2]:
# Connect to database and show basic info
con = duckdb.connect('c2_data.db')

# Show total number of workouts
total_workouts = con.execute("SELECT COUNT(*) as total FROM v_summary_workouts").fetchone()[0]
print(f"Total workouts in database: {total_workouts}")

# Show date range
date_range = con.execute("SELECT MIN(Date) as earliest, MAX(Date) as latest FROM v_summary_workouts").fetchone()
print(f"Date range: {date_range[0]} to {date_range[1]}")

# Show equipment types
equipment_counts = con.execute("SELECT Type, COUNT(*) as count FROM v_summary_workouts GROUP BY Type").df()
print("\nWorkouts by equipment type:")
print(equipment_counts)

con.close()

Total workouts in database: 221
Date range: 2024-04-03 05:48:00 to 2025-09-04 06:04:00

Workouts by equipment type:
     Type  count
0  SkiErg     44
1  RowErg    177


## Interactive Question Function

This function takes a natural language question about your workout data, prints the SQL generated for it, and displays the results.

In [31]:
def ask_workout_question(question):
    """
    Ask a natural language question about your workout data.
    
    Args:
        question (str): Your question in natural language
    
    Returns:
        None. The generated SQL and the query results are printed.
    """
    print(f"Question: {question}")
    print("=" * (len(question) + 10))
    
    # Generate SQL using the fallback method
    sql = generate_sql_fallback(question)
    print("\nGenerated SQL:")
    print(sql)
    print()
    
    # Execute the query
    result = execute_query(sql)
    
    if result is not None:
        if len(result) == 0:
            print("No results found.")
        elif len(result) == 1:
            # Single result - show as key-value pairs
            for col in result.columns:
                value = result[col].iloc[0]
                if pd.isna(value):
                    continue
                if isinstance(value, float):
                    if 'distance' in col.lower():
                        print(f"{col}: {value:,.0f} meters")
                    elif 'avg' in col.lower():
                        print(f"{col}: {value:,.1f}")
                    else:
                        print(f"{col}: {value:.2f}")
                else:
                    print(f"{col}: {value}")
        else:
            # Multiple results - show as dataframe
            print(result.to_string(index=False))
    else:
        print("Error executing query.")
    
    print("\n" + "-"*50 + "\n")

# Confirm the function is defined
print("Function ready! Use ask_workout_question('your question here')")

Function ready! Use ask_workout_question('your question here')
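
The single-row formatting rules inside `ask_workout_question` can be easier to follow in isolation. The sketch below pulls them into a hypothetical helper, `format_value`, which is not part of `workout_assistant.py`. It is simplified: it only treats float `NaN` as missing, whereas the notebook uses `pd.isna`.

```python
import math

def format_value(col, value):
    """Hypothetical helper mirroring the single-row formatting rules above."""
    if isinstance(value, float):
        if math.isnan(value):
            return None  # missing values are skipped
        name = col.lower()
        if "distance" in name:
            # Checked first, so a column like avg_distance is shown in meters
            return f"{col}: {value:,.0f} meters"
        if "avg" in name:
            return f"{col}: {value:,.1f}"
        return f"{col}: {value:.2f}"
    # Integers, strings, and timestamps are printed unchanged
    return f"{col}: {value}"

for col, value in [("total_distance", 1257978.0), ("workouts", 221)]:
    print(format_value(col, value))
```

Because the `distance` check runs before the `avg` check, a column named `avg_distance` is rounded to whole meters rather than to one decimal place.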


## Example Questions

Here are some example questions you can ask. Run the cells below to see the results!

In [4]:
# Example 1: Find your longest workout
ask_workout_question("What was my longest workout?")

Question: What was my longest workout?

Generated SQL:
SELECT "Date", "Description", "Work Distance", "Work Time (Formatted)", "Type" FROM v_summary_workouts ORDER BY "Work Distance" DESC LIMIT 1;

Date: 2025-02-04 05:59:00
Description: 40:44 row
Work Distance: 10212
Work Time (Formatted): 40:44.7
Type: RowErg

--------------------------------------------------



In [5]:
# Example 2: Find your fastest pace for workouts longer than 5k
ask_workout_question("What's my fastest pace for 5k workouts?")

Question: What's my fastest pace for 5k workouts?

Generated SQL:
SELECT "Date", "Description", "Pace", "Work Distance", "Type" FROM v_summary_workouts WHERE "Work Distance" > 5000 ORDER BY "Pace" ASC LIMIT 1;

Date: 2025-03-13 05:49:00
Description: 6000m row
Pace: 1:52.3
Work Distance: 6000
Type: RowErg

--------------------------------------------------
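
The generated query sorts on `"Pace"` directly. If that column is stored as text such as `1:52.3` (an assumption, since the view's schema is not shown here), text ordering agrees with numeric ordering only while every pace has the same number of minute digits. A minimal sketch of converting a pace string to seconds before comparing, using a hypothetical `pace_to_seconds` helper and made-up sample values:

```python
def pace_to_seconds(pace):
    """Convert a pace string like '1:52.3' (minutes:seconds) to seconds."""
    minutes, seconds = pace.split(":")
    return int(minutes) * 60 + float(seconds)

# Illustrative values only, not taken from the workout database
paces = ["9:59.9", "10:00.0", "1:52.3"]
print(sorted(paces))                       # text order
print(sorted(paces, key=pace_to_seconds))  # numeric order
```

In text order, `10:00.0` sorts ahead of `1:52.3`, while the numeric key puts the fastest pace first.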



In [None]:
# Example 3: Count your workouts in 2025
ask_workout_question("How many workouts did I do in 2025?")

In [None]:
# Example 4: Average distance by equipment type
ask_workout_question("What's my average distance by equipment type?")

In [None]:
# Example 5: Recent workouts
ask_workout_question("Show me my recent workouts")

## Ask Your Own Questions

Edit the questions in the cells below to ask your own questions about your workout data!

In [28]:
# Ask about a multi-year range (the data starts in April 2024, so 2023 adds nothing)
your_question = "What was my total distance in years 2023 - 2025?"
ask_workout_question(your_question)

Question: What was my total distance in years 2023 - 2025?

Generated SQL:
SELECT SUM("Work Distance") AS total_distance, COUNT(*) AS workouts FROM v_summary_workouts WHERE strftime('%Y', "Date") BETWEEN '2023' AND '2025';

total_distance: 1,257,978 meters
workouts: 221

--------------------------------------------------



In [32]:
# Test different date range format
ask_workout_question("Show me workout stats from 2024 to 2025")

Question: Show me workout stats from 2024 to 2025

Generated SQL:
SELECT strftime('%Y', "Date") AS year, COUNT(*) AS workouts, AVG("Work Distance") AS avg_distance, SUM("Work Distance") AS total_distance FROM v_summary_workouts WHERE strftime('%Y', "Date") BETWEEN '2024' AND '2025' GROUP BY year ORDER BY year;

year  workouts  avg_distance  total_distance
2024        90   6611.044444        594994.0
2025       131   5060.946565        662984.0

--------------------------------------------------
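
Both date-range questions above were turned into a `BETWEEN` filter on the year. The logic inside `generate_sql_fallback` is not shown in this notebook, so the following is only a sketch of how a rule-based generator might pull a year range out of a question. The `extract_year_range` helper and its regular expressions are hypothetical.

```python
import re

# A four-digit year in the 2000s; \b keeps "2000m" from matching as a year
YEAR = r"\b(20\d{2})\b"
# Matches "2023 - 2025", "2024 to 2025", "2024 through 2025"
RANGE_RE = re.compile(YEAR + r"\s*(?:-|to|through)\s*" + YEAR)

def extract_year_range(question):
    """Return (start, end) years mentioned in the question, or None."""
    match = RANGE_RE.search(question)
    if match:
        start, end = sorted(int(y) for y in match.groups())
        return start, end
    single = re.search(YEAR, question)
    if single:
        year = int(single.group(1))
        return year, year  # a single year is treated as a one-year range
    return None

print(extract_year_range("What was my total distance in years 2023 - 2025?"))
print(extract_year_range("Show me workout stats from 2024 to 2025"))
```

A pattern like this stays ambiguous for inputs such as `2000 m`, where a distance looks like a year, so a real implementation would need extra context checks.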

