# Week 3 - Line Charts

## Goal

Create a Line Chart to visualize the trend of Trailing 365-Day Distance over time.

## Imports

In [1]:
import pandas as pd
file_id = "1ymbNqfv9s6YGZzN93HFKAhjg0Z5xZXV1"
url = f"https://drive.google.com/uc?id={file_id}"
df = pd.read_csv(url)

## Preprocessing

The trailing 365-day chart is a line chart showing, for each date, the total distance (in km) run over the preceding 365 days.

### 1. Create a suitable date field

Although we can start from the code used in Week 1, there may be multiple runs on the same day, each recorded at a different time.

To group runs by day correctly, we need a date field where all runs on the same calendar day share the same value.

One simple approach is to normalise the time component to midnight. This keeps the date while removing the time.

You can explore the pandas method:

```
.dt.normalize()
```
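
As a minimal sketch of what normalising does, the toy example below uses made-up timestamps (the `runs` frame and its values are illustrative, not taken from the course data). Two runs on the same calendar day end up with an identical `Date` value once the time is reset to midnight.

```python
import pandas as pd

# Hypothetical timestamps: two runs on 21 July, one on 19 July
runs = pd.DataFrame({
    "start_date": [
        "2025-07-21 06:30:00",
        "2025-07-21 18:05:00",
        "2025-07-19 07:15:00",
    ]
})

# Parse the strings to datetimes, then set the time component to 00:00:00
runs["Date"] = pd.to_datetime(runs["start_date"]).dt.normalize()

print(runs)
print(runs["Date"].nunique())  # number of distinct calendar days
```

Because the result is still a datetime column (rather than a string or a Python `date`), it can be used later as a datetime index for resampling and rolling windows.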

In [2]:
#Your code here

| | Date |
|---|---|
| 0 | 2025-07-21 |
| 1 | 2025-07-19 |
| 2 | 2025-07-16 |
| 3 | 2025-07-15 |
| 4 | 2025-07-12 |


### 2. Add a distance column in km

The raw `distance` column appears to be recorded in metres (the solution divides it by 1000), so convert it to kilometres and store the result in a new `Distance` column.

In [4]:
#Your code here

| | Date | Distance |
|---|---|---|
| 0 | 2025-07-21 | 8.702 |
| 1 | 2025-07-19 | 8.3593 |
| 2 | 2025-07-16 | 8.3725 |
| 3 | 2025-07-15 | 6.82 |
| 4 | 2025-07-12 | 2.7581 |


### 3. Group

Next, we need to group runs by day and sum the total distance run on each day.

At this stage, it is also helpful to fill in any missing days with a distance of 0 km. This ensures we have one data point for every day, which is important for later rolling calculations.

For now, leave `Date` as the index after grouping: `.asfreq()` works on a datetime index, and so does the time-based rolling window used in the next step.

> **Hint**
>
> - Investigate the pandas method `.asfreq()`
> - `"D"` can be used as a frequency code for daily time series

In [6]:
#Your code here

| Date | Distance |
|---|---|
| 2011-01-05 | 2.1306 |
| 2011-01-06 | 0.0 |
| 2011-01-07 | 0.0 |
| 2011-01-08 | 0.0 |
| 2011-01-09 | 0.0 |


### 4. Calculate running 365-day total

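Next, calculate the running total distance over a trailing 365-day window. With daily data, this window covers the current day plus the 364 days before it.

Once the rolling total has been calculated, reset the index so that `Date` becomes a regular column again and can be accessed directly when plotting.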

> **Hint**
>
> - Investigate the pandas method `.rolling()`
> - As before, `"D"` can be used for daily windows  
> - `"365D"` represents a rolling window of 365 days
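
The behaviour of a time-based window is easier to see on a short series. The sketch below uses a 3-day window on invented values (the `dist` series is an assumption for illustration); the same pattern applies with `"365D"`.

```python
import pandas as pd

# Hypothetical daily distances on five consecutive days
idx = pd.date_range("2025-01-01", periods=5, freq="D")
dist = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0], index=idx)

# Each value is the sum of the current day and the two days before it.
# Early rows use whatever data is available, since time-based windows
# default to min_periods=1.
trailing3 = dist.rolling("3D").sum()

print(trailing3)
```

Because the window includes the current row, the first value is simply the first day's distance, and the total grows until a full window of data is available. This matches the sample output above, where the early `Trailing365` values equal the first run's distance.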


In [16]:
#Your code here

| | Date | Distance | Trailing365 |
|---|---|---|---|
| 0 | 2011-01-05 | 2.1306 | 2.1306 |
| 1 | 2011-01-06 | 0.0 | 2.1306 |
| 2 | 2011-01-07 | 0.0 | 2.1306 |
| 3 | 2011-01-08 | 0.0 | 2.1306 |
| 4 | 2011-01-09 | 0.0 | 2.1306 |


### 5. Save to CSV

In [9]:
#Your code here

## Solution

In [15]:
# 1. Date field
trailing365 = pd.DataFrame()  # Create a new DataFrame for clarity
trailing365["Date"] = (
    pd.to_datetime(df["start_date"])
      .dt.tz_localize(None)
      .dt.normalize()  # Set time to midnight for correct daily grouping
)

# 2. Distance in km
trailing365["Distance"] = df["distance"] / 1000.0

# 3. Group by day, sum distance, and fill missing days
trailing365 = (
    trailing365
    .groupby("Date")
    .agg(Distance=("Distance", "sum"))  # Total distance per day
    .asfreq("D", fill_value=0)          # Ensure one row per day; missing days set to 0 km
)

# 4. Calculate trailing 365-day total
trailing365["Trailing365"] = (
    trailing365["Distance"]
    .rolling("365D")  # Rolling window covering the previous 365 days
    .sum()
)
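# Return Date to a regular column so it can be used when plotting
trailing365 = trailing365.reset_index()

# 5. Save to CSV (index=False avoids writing the row numbers as an extra column)
trailing365.to_csv("trailing365.csv", index=False)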
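
When the saved file is loaded again later, the `Date` column comes back as plain text unless it is parsed explicitly. The following sketch shows a round trip on a small invented frame written to a temporary location (the `toy` data and the `trailing365_demo.csv` file name are assumptions, chosen so the real output file is not overwritten).

```python
import os
import tempfile

import pandas as pd

# Hypothetical result frame with the same columns as the solution output
toy = pd.DataFrame({
    "Date": pd.date_range("2025-01-01", periods=3, freq="D"),
    "Distance": [8.0, 0.0, 4.0],
    "Trailing365": [8.0, 8.0, 12.0],
})

# Write to a temporary directory, without the index
path = os.path.join(tempfile.mkdtemp(), "trailing365_demo.csv")
toy.to_csv(path, index=False)

# parse_dates turns the Date strings back into datetimes on load
reloaded = pd.read_csv(path, parse_dates=["Date"])

print(reloaded.dtypes)
```

With `index=False` on write and `parse_dates` on read, the reloaded frame has the same three columns and a proper datetime `Date` column, ready to use as the x-axis of a line chart.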