# Libraries & Helpers

In [6]:
# Libraries
import pandas as pd
import numpy as np
import gspread
import datetime
import matplotlib.pyplot as plt
import seaborn as sns

import os
os.chdir("..")  
from src import config
from src import help_functions as hf

# Configs
pd.set_option("display.max_columns", None)
pd.set_option("display.max_colwidth", None)

# Data

In [7]:
# Import and quick check Training data 
googleDrive_client = gspread.authorize(config.DRIVE_CREDENTIALS)
training_data, _ = hf.import_google_sheet(googleDrive_client=googleDrive_client, filename=config.DRIVE_TP_LOG_FILENAMES[0], sheet_index=0)

# "Clean" data
for col in training_data.columns:
    try:
        training_data[col] = training_data[col].apply(hf.safe_convert_to_numeric)
    except ValueError:
        pass 

# Date & Datetime
training_data["Date"] = pd.to_datetime(training_data[["Year", "Month", "Day"]]).dt.date
training_data["Datetime"] = pd.to_datetime(training_data[["Year", "Month", "Day"]])
training_data = training_data.sort_values(by="Date").reset_index(drop=True)

# About
print("Training data about:")
print("-----------------------------------------------------")
print("Todays date: {}".format(datetime.datetime.today().date()))
print("Date range: {} to {}".format(training_data["Date"].min(), training_data["Date"].max()))
print("Duplicated rows = {}".format(training_data[training_data.duplicated(keep=False)].shape[0]))
print("Missing dates = {}".format([d for d in pd.date_range(start=training_data["Date"].min(), end=training_data["Date"].max()).date if d not in training_data["Date"].values]))

print("\nDifferent activities and their counts:")
print("-------------------------------------")
activities_count_time = (
    training_data
    .groupby("Activity type")[["Duration [h]"]]
    .agg(
        count=("Duration [h]", "count"),
        total_duration=("Duration [h]", "sum")
        )
    .reset_index()
    .sort_values(by="total_duration", ascending=False)
    )

for _, row in activities_count_time.iterrows():
    print("{} ~> {:.2f} hours ({} act.)".format(row["Activity type"], row["total_duration"], row["count"]))

Training data about:
-----------------------------------------------------
Todays date: 2025-08-28
Date range: 2024-09-13 to 2025-08-27
Duplicated rows = 0
Missing dates = []

Different activities and their counts:
-------------------------------------
Trail Running ~> 283.77 hours (153 act.)
Road Biking ~> 80.82 hours (33 act.)
Running ~> 78.25 hours (76 act.)
Indoor Biking ~> 71.04 hours (51 act.)
Mountain Biking ~> 16.99 hours (9 act.)
Hiking ~> 15.35 hours (6 act.)
Road biking ~> 3.74 hours (2 act.)
Lap Swimming ~> 0.21 hours (1 act.)


In [8]:
# Import and quick check Daily data
googleDrive_client = gspread.authorize(config.DRIVE_CREDENTIALS)
daily_data, _ = hf.import_google_sheet(googleDrive_client=googleDrive_client, filename=config.DRIVE_TP_LOG_FILENAMES[1], sheet_index=0)

# "Clean" data
for col in daily_data.columns:
    try:
        daily_data[col] = daily_data[col].apply(hf.safe_convert_to_numeric)
    except ValueError:
        pass 

# Date & Datetime
daily_data["Date"] = pd.to_datetime(daily_data[["Year", "Month", "Day"]]).dt.date
daily_data["Datetime"] = pd.to_datetime(daily_data[["Year", "Month", "Day"]])
daily_data = daily_data.sort_values(by="Date").reset_index(drop=True)

# About
print("Daily data about:")
print("-----------------------------------------------------")
print("Todays date: {}".format(datetime.datetime.today().date()))
print("Date range: {} to {}".format(daily_data["Date"].min(), daily_data["Date"].max()))
print("Duplicated rows = {}".format(daily_data[daily_data.duplicated(keep=False)].shape[0]))
print("Missing dates = {}".format([d for d in pd.date_range(start=daily_data["Date"].min(), end=daily_data["Date"].max()).date if d not in daily_data["Date"].values]))

Daily data about:
-----------------------------------------------------
Todays date: 2025-08-28
Date range: 2024-04-15 to 2025-08-27
Duplicated rows = 0
Missing dates = []


# Development

We have two goals:

1. **Recent Adjusted Relative Training Load (ARTL)**

Define a simple metric that shows where our current training stands compared to what we have been doing in the recent past. The purpose is to see whether we should reduce our training load, so that we don't overreach or risk injury, or increase it to match what our body has adapted to in the recent past. It can also simply tell us where in the training cycle we are when the larger picture is taken into account. *How are we positioned relative to our recent overall training load?*

2. **Recent Load Relative Percentile (RLRP)**

Quantify how hard today's training was compared to what we are used to. In other words, show where this session falls within the distribution of our recent sessions: is it an average day, a light day, or a clear spike? This also helps guide the training choice for tomorrow (or the next few days), since we know whether today was relatively light, normal, or heavy.
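As a rough illustration of the percentile idea, the sketch below ranks one day's load against a short list of recent daily loads. The function name `load_percentile`, the example values, and the choice to count ties as "at or below" are assumptions made for this sketch, not the final RLRP definition.

```python
import numpy as np

def load_percentile(recent_loads, today_load):
    """Percentage of recent daily loads that are at or below today's load.

    Hypothetical helper: the window length and tie handling are assumptions.
    """
    recent = np.asarray(recent_loads, dtype=float)
    return 100.0 * np.mean(recent <= today_load)

# Made-up daily loads for the last 10 days (not taken from the training log).
recent_loads = [0, 40, 60, 80, 100, 120, 150, 90, 70, 50]
print(load_percentile(recent_loads, today_load=100))
```

A value near 50 would indicate a typical day, while values close to 100 would flag a spike relative to the recent distribution.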

Dataset:

- All activities, regardless of whether they were real training or not (including hiking, swimming, easy cycling, etc.).
- Let's assume we have only one "real" workout per day and take the total (summed) daily training load, so one sample is one day.

Let $TL_i$ be the training load of day $i$.
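A minimal sketch of how one sample per day could be built from an activity log. The column name `"Training load"` and the toy values are hypothetical, and treating days without any activity as $TL_i = 0$ is an assumption of this sketch.

```python
import pandas as pd

# Toy activity log; "Training load" is a hypothetical column name.
activities = pd.DataFrame({
    "Date": pd.to_datetime(["2025-08-25", "2025-08-25", "2025-08-27"]),
    "Activity type": ["Trail Running", "Hiking", "Road Biking"],
    "Training load": [120.0, 30.0, 80.0],
})

# Sum all activities of the same day into a single daily sample.
daily_tl = activities.groupby("Date")["Training load"].sum()

# Reindex to a continuous calendar so that rest days appear with a load of 0
# (assumption: a day without logged activity means no training load).
full_range = pd.date_range(daily_tl.index.min(), daily_tl.index.max(), freq="D")
daily_tl = daily_tl.reindex(full_range, fill_value=0.0)

print(daily_tl)
```

Keeping rest days in the series matters for any window-based metric, because dropping them would make every window look busier than it was.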

### Recent Adjusted Relative Training Load (ARTL)

A history-aware measure of current relative load. We have to balance:
- Baseline (long-term) adaptation
    - What your body has been used to, and has successfully handled, over a longer period, e.g., 3 months.
    - Captures what your body is adapted to before the most recent training period.
    - Reflects true adaptation, not recent acute changes.

- Recent training pattern
    - What our last days/weeks looked like, weighted by recency so that very recent sessions count more (what we did, e.g., 3 weeks ago, our body has probably already handled and adapted to).
    - Captures what your body has been exposed to in the most recent period.
    - Reflects acute training load.

Adjusted Relative Training Load (ARTL) is defined as the ratio between the recent weighted training load and the baseline training load:
- $ARTL = \frac{\text{Recent Weighted Load}}{\text{Baseline Load}}$
- ARTL > 1 -> the recent 3-week load is above what you were adapted to in the baseline window that precedes it - "going out of the body's comfort zone, risky"
- ARTL < 1 -> the recent 3-week load is below what you were adapted to in the baseline window that precedes it - "safe to push"

Where the historical windows are defined as:
- Baseline window: the last 90 days (3 months), excluding the most recent 21 days (3 weeks).
- Recent weighted window: the last 21 days (3 weeks).
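The ratio and the two windows can be sketched as follows. The text above does not fix the weighting scheme, so the exponential recency weights with a 7-day half-life are an assumption of this sketch, as are the function name `artl` and its parameters. Both loads are expressed as mean daily load so that the 21-day and 69-day windows are comparable.

```python
import numpy as np

def artl(daily_tl, recent_days=21, baseline_days=90, half_life=7.0):
    """Sketch of ARTL for the last day of a daily training load series.

    daily_tl: daily loads in chronological order, one value per day.
    half_life: hypothetical parameter, days after which a weight halves.
    """
    values = np.asarray(daily_tl, dtype=float)
    if len(values) < baseline_days:
        raise ValueError("need at least baseline_days of history")

    recent = values[-recent_days:]                  # last 21 days
    baseline = values[-baseline_days:-recent_days]  # days 22..90 back

    # Exponential recency weights (assumption): the newest day has weight 1.
    age = np.arange(recent_days - 1, -1, -1)        # oldest first, newest = 0
    weights = 0.5 ** (age / half_life)

    recent_load = np.average(recent, weights=weights)  # weighted mean per day
    baseline_load = baseline.mean()                    # plain mean per day
    if baseline_load == 0:
        return float("nan")  # ratio is undefined without baseline load
    return recent_load / baseline_load

# Made-up example: steady load of 100 per day, then 150 per day for 3 weeks.
steady = [100.0] * 90
ramped = [100.0] * 69 + [150.0] * 21
print(artl(steady), artl(ramped))
```

With a constant load the ratio stays at 1, and a sustained 50% increase over the recent window pushes it to about 1.5, which matches the interpretation of ARTL > 1 given above.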