# Data Science for Cycling #1 - How To Read GPX Strava Routes With Python
- Notebook 1/6
- Make sure to have `gpxpy` installed:
<br>

```
pip install gpxpy
```

- Let's import the libraries and tweak Matplotlib's default stylings:

In [None]:
%pip install gpxpy

In [None]:
import gpxpy
import gpxpy.gpx

import pandas as pd
import matplotlib.pyplot as plt
plt.rcParams['axes.spines.top'] = False
plt.rcParams['axes.spines.right'] = False

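- You can read GPX files with Python's context manager syntax
- Each cell below assigns to the same `gpx` variable, so the rest of the notebook uses whichever file was loaded last: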

In [None]:
!pwd

In [None]:
with open('../4090825616.gpx', 'r') as gpx_file:
    gpx = gpxpy.parse(gpx_file)

In [None]:
with open('../2875303142.gpx', 'r') as gpx_file:
    gpx = gpxpy.parse(gpx_file)

In [None]:
with open('../3752723542.gpx', 'r') as gpx_file:
    gpx = gpxpy.parse(gpx_file)

- It's a specific GPX object:

In [None]:
gpx

- Get the number of data points (number of times geolocation was taken):

In [None]:
gpx.get_track_points_no()

- Get the minimum and maximum altitudes:

In [None]:
gpx.get_elevation_extremes()

- Get the number of meters of uphill and downhill riding
- It's a round trip, so the two numbers should be almost identical:

In [None]:
gpx.get_uphill_downhill()
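
- To see where numbers like these come from, here is a minimal sketch that sums the positive and negative elevation changes between consecutive points
- The function name `uphill_downhill` and the sample elevations are made up for illustration; `gpxpy` may filter or smooth elevations internally, so its totals need not match this naive sum:

```python
def uphill_downhill(elevations):
    """Sum positive and negative elevation changes between consecutive points."""
    uphill = 0.0
    downhill = 0.0
    for previous, current in zip(elevations, elevations[1:]):
        delta = current - previous
        if delta > 0:
            uphill += delta      # climbing
        else:
            downhill += -delta   # descending (stored as a positive number)
    return uphill, downhill


# Hypothetical elevation profile in meters (not taken from the GPX file)
sample_elevations = [100.0, 105.0, 103.0, 110.0, 100.0]
uphill, downhill = uphill_downhill(sample_elevations)
print(uphill, downhill)
```

- The sample profile starts and ends at the same elevation, which is why a round trip should give nearly equal uphill and downhill totals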

- You can dump the entire GPX file to XML
- Here are the first 1000 characters:

In [None]:
gpx.to_xml()[:1000]
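
- GPX is plain XML: tracks (`trk`) contain segments (`trkseg`), which contain points (`trkpt`)
- The sketch below walks that structure with Python's standard library only, as a rough picture of what `gpxpy` handles for you
- The GPX snippet, its coordinates, and the variable names are hand-written assumptions, not data from the Strava file:

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written GPX 1.1 snippet (not from the Strava export)
SAMPLE_GPX = """
<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk>
    <name>Sample ride</name>
    <trkseg>
      <trkpt lat="45.7720" lon="15.9580">
        <ele>113.0</ele>
        <time>2021-01-01T08:00:00Z</time>
      </trkpt>
      <trkpt lat="45.7721" lon="15.9582">
        <ele>113.5</ele>
        <time>2021-01-01T08:00:01Z</time>
      </trkpt>
    </trkseg>
  </trk>
</gpx>
"""

# GPX 1.1 elements live in this XML namespace
NS = {"gpx": "http://www.topografix.com/GPX/1/1"}
root = ET.fromstring(SAMPLE_GPX)

points = []
for trk in root.findall("gpx:trk", NS):              # tracks
    for seg in trk.findall("gpx:trkseg", NS):        # segments in a track
        for pt in seg.findall("gpx:trkpt", NS):      # points in a segment
            points.append({
                "latitude": float(pt.get("lat")),
                "longitude": float(pt.get("lon")),
                "elevation": float(pt.find("gpx:ele", NS).text),
                "time": pt.find("gpx:time", NS).text,  # kept as a raw string here
            })

print(len(points))
print(points[0])
```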

<br>

## Basic analysis
- There's only one track available in the file
- Access it with Python's list indexing syntax:

In [None]:
len(gpx.tracks)

In [None]:
gpx.tracks[0]

- The track has only one segment - access it the same way:

In [None]:
len(gpx.tracks[0].segments)

In [None]:
gpx.tracks[0].segments[0]

In [None]:
len(gpx.tracks[0].segments[0].points)

- The segment has 31202 data points
- Here are the first 10:

In [None]:
gpx.tracks[0].segments[0].points[:10]

- Let's now extract all data points
- Store latitude, longitude, elevation, and time as a list of dicts:

In [None]:
route_info = []

for track in gpx.tracks:
    for segment in track.segments:
        for point in segment.points:
            route_info.append({
                'latitude': point.latitude,
                'longitude': point.longitude,
                'elevation': point.elevation,
                'time': point.time
            })

In [None]:
route_info[:3]

- Convert it to a pandas DataFrame for faster and easier analysis:

In [None]:
route_df = pd.DataFrame(route_info)
route_df.head()

In [None]:
route_df.dtypes

- Save it to CSV for later use:

In [None]:
route_df.to_csv('../200km_route_df.csv', index=False)

<br>

## Basic visualization
- You can use Matplotlib to visualize all data points
- It won't show the map, but you should still see what the route looks like:

In [None]:
plt.figure(figsize=(14, 8))
plt.scatter(route_df['longitude'], route_df['latitude'], color='#101010')
plt.title('Route latitude and longitude points', size=20);
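
- Plotting raw longitude against latitude can stretch the route, because a degree of longitude covers less ground than a degree of latitude away from the equator
- A minimal sketch of one common correction, assuming the route is small enough that a single mean latitude is a reasonable approximation; the helper name `lonlat_aspect` and the sample latitudes are hypothetical:

```python
import math


def lonlat_aspect(latitudes):
    """Return the y/x aspect ratio that makes one degree of longitude and
    one degree of latitude cover roughly the same distance on screen."""
    mean_lat = sum(latitudes) / len(latitudes)
    return 1.0 / math.cos(math.radians(mean_lat))


# Hypothetical latitudes in degrees (not taken from the route)
sample_latitudes = [60.0, 60.0]
aspect = lonlat_aspect(sample_latitudes)
print(aspect)

# In the notebook this could be applied to the scatter plot with:
#   plt.gca().set_aspect(lonlat_aspect(route_df['latitude'].tolist()))
```

- At the equator the ratio is 1, and it grows with latitude, so the correction matters more for routes far from the equator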