# Training with Zwift

> "Every time I see an adult on a bicycle, I no longer despair for the future of the human race." 
<br>**H. G. Wells**<br><br>
“Learn to ride a bicycle. You will not regret it if you live.” 
<br>**Mark Twain**

Biking is the only childhood hobby I've continued to enjoy throughout my life. I remember riding in the local Fourth of July parade in elementary school. As a college freshman, I salvaged a '70s Univega ("Maria"), building it up from the frame and customizing it for the next few years. I rode that bike everywhere. It was stolen while I was in grad school, plucked off the porch at a BBQ. Since then, I've had a [2017 Fuji Touring Bike](https://www.cyclingabout.com/2017-fuji-touring-bike/) ("björn"). björn has never failed me. 

Most of my riding over the past few years has been indoors due to COVID and a shoulder injury. I joined [Zwift](https://us.zwift.com/) last year on the recommendation of a friend. Zwift gamifies bike training by offering an array of virtual routes that vary in difficulty. I'm currently using the app to train for a bikepacking trip of [~1400 miles along the Pacific Coast](https://www.adventurecycling.org/routes-and-maps/adventure-cycling-route-network/interactive-network-map/) next summer. 

One of Zwift's best features is its ability to create insightful and intuitive visualizations from user data collected from a Bluetooth-enabled bike trainer. I recently learned that users can download their ride data, which gives me an opportunity to further characterize my training experience.<br><br>

**GOAL:** *(1)* Characterize bike training patterns over time, starting from March 1, 2022; and *(2)* describe anticipated level of functioning on May 1, 2023.<br>
**DATA:** Ride data downloaded from personal Zwift (v5.62) account.<br>
**ANALYSIS:** Exploratory data analysis; Bayesian modeling.<br>
**ETHICAL CONSIDERATIONS:** There are no apparent issues with transparency, accountability, or equity in terms of the available data. To avoid any unforeseen privacy issues, I will not post the raw ride data on GitHub. I will do my best to characterize the data in this notebook to justify any insights drawn from the analyses.<br>
**ADDITIONAL CONSIDERATIONS:** None.<br>

## Load libraries

In [1]:
import os

# data wrangling/analysis
import fitdecode
import pandas as pd
import numpy as np

# data visualization
import matplotlib.pyplot as plt
import seaborn as sns

I am going to follow [this tutorial](https://towardsdatascience.com/parsing-fitness-tracker-data-with-python-a59e7dc17418) to parse .fit files (downloaded from Zwift) using the [`fitdecode`](https://pypi.org/project/fitdecode/) library.

In [2]:
with fitdecode.FitReader('2022-07-12-07-31-42.fit') as fit:
    for frame in fit:
        if isinstance(frame, fitdecode.records.FitDataMessage):
            
            #if frame.name == 'file_id':
            #    for field in frame.fields:
            #        print(field.name) 
            
            if frame.name == 'session':
                print(frame.get_value('total_timer_time'))
                print(frame.get_value('total_distance'))
                print(frame.get_value('avg_speed'))
                print(frame.get_value('max_speed'))
                print(frame.get_value('avg_power'))
                print(frame.get_value('max_power'))
                print(frame.get_value('total_ascent'))
                print(frame.get_value('total_descent'))
                print(frame.get_value('avg_cadence'))
                print(frame.get_value('max_cadence'))
                        
            #if frame.name == 'record':
            #    print(frame.get_value('distance'))
            #    print(frame.get_value('speed'))
            #    print(frame.get_value('altitude'))
            #    print(frame.get_value('power'))
            #    print(frame.get_value('grade'))
            #    print(frame.get_value('cadence'))

1244.0
10984.27
8.829
13.783
156
191
61
0
88
109
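
The values above are printed without units. Assuming they follow the usual FIT convention of seconds, meters, and meters per second (an assumption on my part, although the numbers are self-consistent: 10984.27 m over 1244 s is about 8.83 m/s, which matches the reported average speed), a small conversion sketch makes them easier to read. The variable names and the `METERS_PER_MILE` constant are illustrative only.

```python
# Values copied from the session output above; units are assumed (s, m, m/s).
total_timer_time = 1244.0   # seconds
total_distance = 10984.27   # meters
avg_speed = 8.829           # meters per second

METERS_PER_MILE = 1609.344

minutes = total_timer_time / 60
kilometers = total_distance / 1000
miles = total_distance / METERS_PER_MILE
avg_kmh = avg_speed * 3.6
avg_mph = avg_speed * 3600 / METERS_PER_MILE

# Sanity check: distance / time should roughly reproduce the reported average speed.
implied_speed = total_distance / total_timer_time

print(f"{minutes:.1f} min, {kilometers:.2f} km ({miles:.2f} mi)")
print(f"avg speed: {avg_kmh:.1f} km/h ({avg_mph:.1f} mph); implied: {implied_speed:.3f} m/s")
```

Under those assumptions, this ride works out to roughly 21 minutes and 11 km at just under 32 km/h.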


> ***
`file_id` serial_number, time_created, manufacturer, product, number, type
> ***
`device_info` timestamp, serial_number, cum_operating_time, manufacturer, product, software_version, battery_voltage, device_index, device_type, hardware_version, battery_status
> ***
`event` timestamp, timer_trigger, timer_trigger, data16, event, event_type, event_group

**Note:** There is an `event` frame at the beginning and end of the second-by-second `record` frames.
> ***
`record`  timestamp, position_lat, position_long, ***distance***, time_from_course, ***speed***, distance, compressed_speed_distance, heart_rate, enhanced_altitude, ***altitude***, enhanced_speed, speed, ***power, grade, cadence***, resistance, cycle_length, temperature

**Note:** There is a `record` frame for each second of data recorded.
> ***
`lap` timestamp, ***start_time***, start_position_lat, start_position_long, end_position_lat, end_position_long, total_elapsed_time, ***total_timer_time, total_distance***, total_strokes, message_index, total_calories, total_fat_calories, enhanced_avg_speed, ***avg_speed***, enhanced_max_speed, ***max_speed, avg_power, max_power, total_ascent, total_descent***, event, event_type, avg_heart_rate, max_heart_rate, ***avg_cadence, max_cadence***, intensity, lap_trigger, sport, event_group
> ***
`session` timestamp, ***start_time***, start_position_lat, start_position_long, total_elapsed_time, ***total_timer_time, total_distance***, total_strokes, nec_lat, nec_long, swc_lat, swc_long, message_index, total_calories, total_fat_calories, enhanced_avg_speed, ***avg_speed***, enhanced_max_speed, ***max_speed, avg_power, max_power, total_ascent, total_descent***, first_lap_index, num_laps, event, event_type, sport, sub_sport, avg_heart_rate, max_heart_rate, ***avg_cadence, max_cadence***, total_training_effect, event_group, trigger
> ***
`activity` timestamp, total_timer_time, local_timestamp, num_sessions, type, event, event_type, event_group
***

Most of the data in these frames are either redundant or irrelevant to this project. Summary data from the training session are located in the `lap` and `session` frames, which contain many of the same variables. They can largely be used interchangeably depending on preference. I'll use the `session` frame for now. Second-by-second data on a number of variables (e.g., speed, cadence) are available in the `record` frames. 

Data in these frames can be accessed by calling the `get_value` method on a `fitdecode` data message frame, passing the name of the field (e.g., `frame.get_value('avg_power')`).
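
As a rough sketch of how this access pattern could be wrapped up for reuse, the helpers below gather the `session` summary into a dictionary and the second-by-second `record` values into a list of dictionaries. The names `SESSION_FIELDS`, `RECORD_FIELDS`, `extract`, `summarize_session`, `collect_records`, and `read_ride` are hypothetical, chosen for illustration; the field names come from the frame listings above. The `has_field` guard assumes that some files may omit a field, in which case the value is recorded as `None`.

```python
# Field names are taken from the frame listings above.
SESSION_FIELDS = ['start_time', 'total_timer_time', 'total_distance',
                  'avg_speed', 'max_speed', 'avg_power', 'max_power',
                  'total_ascent', 'total_descent', 'avg_cadence', 'max_cadence']
RECORD_FIELDS = ['timestamp', 'distance', 'speed', 'altitude',
                 'power', 'grade', 'cadence']


def extract(frame, fields):
    """Return {field: value} for one data frame; missing fields become None."""
    return {f: frame.get_value(f) if frame.has_field(f) else None
            for f in fields}


def summarize_session(frames):
    """Return the summary dict for the first `session` frame, or None."""
    for frame in frames:
        if getattr(frame, 'name', None) == 'session':
            return extract(frame, SESSION_FIELDS)
    return None


def collect_records(frames):
    """Return one dict per `record` frame (one per second of riding)."""
    return [extract(frame, RECORD_FIELDS) for frame in frames
            if getattr(frame, 'name', None) == 'record']


def read_ride(path):
    """Read one .fit file and return (session_summary, record_rows)."""
    import fitdecode  # imported here so the helpers above work without it
    with fitdecode.FitReader(path) as fit:
        frames = [frame for frame in fit
                  if isinstance(frame, fitdecode.records.FitDataMessage)]
        return summarize_session(frames), collect_records(frames)
```

With a ride file on disk, `summary, rows = read_ride('2022-07-12-07-31-42.fit')` would return both pieces, and `pd.DataFrame(rows)` would turn the per-second rows into a table for plotting. Returning plain dictionaries keeps the parsing step separate from any later pandas work.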