# Time Series Forecasting for Patient Mobility
## ML Internship Assignment - Liberdat B.V.

**Objective**: Forecast daily step counts for the next 365 days

**Models**:
1. Baseline: Prophet (univariate)
2. Advanced: Explainable Boosting Machine (EBM), multivariate with clinical features


## Setup and Installation

In [None]:
# Install packages
!pip install -q pandas numpy matplotlib seaborn
!pip install -q prophet interpret scikit-learn

import warnings
warnings.filterwarnings('ignore')
print("✓ Packages installed!")

## Upload Data Files

In [None]:
from google.colab import files
uploaded = files.upload()
print("✓ Files uploaded!")

## Part A: Data Pipeline

In [None]:
import json
import pandas as pd
import numpy as np

# Load the step-count time series
with open('timeseries-data.json', 'r') as f:
    ts_data = json.load(f)
# Load the categorical attributes (not used by the models below)
with open('categorical-data.json', 'r') as f:
    cat_data = json.load(f)

ts_df = pd.DataFrame(ts_data)
print(f"Loaded {len(ts_df)} records")
ts_df.head()

In [None]:
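# Preprocess: parse timestamps and aggregate raw records to one row per day
ts_df['start'] = pd.to_datetime(ts_df['start'])
ts_df['date'] = ts_df['start'].dt.date
# Sum the step counts of all records that start on the same calendar date
daily = ts_df.groupby('date')['count'].sum().reset_index()
daily.columns = ['Date', 'Daily_Step_Count']
daily['Date'] = pd.to_datetime(daily['Date'])
# Note: dates with no records get no row here, so the series may contain gaps
print(f"Daily records: {len(daily)}")
daily.head()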
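The grouping step yields one row per date that has at least one record, so days without data are absent rather than empty. Row-based lags such as `shift(7)` only mean "7 days ago" on a gap-free daily calendar. A minimal sketch of one way to restore that calendar, assuming missing days should show up as `NaN` rather than zero (whether a missing day means "no steps" or "device not worn" is a judgment call the data alone cannot settle); `fill_calendar_gaps` is a hypothetical helper name:

```python
import pandas as pd

def fill_calendar_gaps(daily: pd.DataFrame) -> pd.DataFrame:
    """Reindex to a continuous daily calendar; absent days become NaN."""
    full_range = pd.date_range(daily["Date"].min(), daily["Date"].max(), freq="D")
    return (
        daily.set_index("Date")
        .reindex(full_range)      # inserts a NaN row for every missing date
        .rename_axis("Date")
        .reset_index()
    )

# Toy example (illustration only): 3 January is missing from the input
toy_daily = pd.DataFrame({
    "Date": pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04"]),
    "Daily_Step_Count": [4000, 5200, 3100],
})
toy_filled = fill_calendar_gaps(toy_daily)
print(toy_filled)
```

Prophet accepts missing `y` values, while the lag-based features would need the `NaN` rows dropped or imputed before training.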

### Feature Engineering

In [None]:
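# Add calendar and lag features (row-based lags assume one row per calendar day)
features = daily.copy()
features['day_of_week'] = features['Date'].dt.dayofweek  # 0 = Monday
features['week_of_year'] = features['Date'].dt.isocalendar().week.astype(int)
features['steps_t_minus_1'] = features['Daily_Step_Count'].shift(1)
features['steps_t_minus_7'] = features['Daily_Step_Count'].shift(7)
# Shift before rolling so the average covers the 7 previous days only;
# including the current day would leak the target into its own feature
features['rolling_avg_7d'] = features['Daily_Step_Count'].shift(1).rolling(7, min_periods=1).mean()
print(f"Columns (incl. date and target): {len(features.columns)}")
features.head()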
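A feature used to predict day t should be computable from data up to day t-1. One way to check this is to change only the final day's target and confirm that the final row's features do not move. The sketch below does that on a toy series, assuming a gap-free daily calendar; `build_lag_features` is a hypothetical helper that mirrors the lag and rolling columns used in this notebook:

```python
import numpy as np
import pandas as pd

def build_lag_features(steps: pd.Series) -> pd.DataFrame:
    """Lag and rolling features that only use values before the current row."""
    past = steps.shift(1)  # values from t-1 and earlier
    return pd.DataFrame({
        "steps_t_minus_1": past,
        "steps_t_minus_7": steps.shift(7),
        "rolling_avg_7d": past.rolling(7, min_periods=1).mean(),
    })

toy_steps = pd.Series(np.arange(1.0, 15.0))  # toy series: 1, 2, ..., 14
before = build_lag_features(toy_steps)

perturbed = toy_steps.copy()
perturbed.iloc[-1] = 1e6                      # change only the final day's target
after = build_lag_features(perturbed)

# If the final row's features are unchanged, they do not depend on its target
leak_free = before.iloc[-1].equals(after.iloc[-1])
print(f"Leak-free: {leak_free}")
```

If the rolling window were computed without the `shift(1)`, the same check would fail, because the window would then include the day being predicted.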

## Part B: Model 1 - Prophet

In [None]:
from prophet import Prophet

# Prepare data
prophet_df = daily[['Date', 'Daily_Step_Count']].copy()
prophet_df.columns = ['ds', 'y']

# Train
model = Prophet()
model.fit(prophet_df)
print("✓ Prophet trained")

In [None]:
# Forecast 365 days
future = model.make_future_dataframe(periods=365)
forecast = model.predict(future)
forecast_365 = forecast.tail(365)[['ds', 'yhat']]
forecast_365.columns = ['Date', 'Predicted_Steps']
print(f"Avg predicted: {forecast_365['Predicted_Steps'].mean():.0f}")
forecast_365.head()
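A 365-day forecast is hard to judge without a backtest. A minimal sketch, assuming the most recent rows are held out chronologically and scored with mean absolute error (MAE). The seasonal-naive reference (repeat the last observed week) and the names `chronological_split`, `seasonal_naive`, and `mae` are illustrative assumptions, not part of the assignment:

```python
import numpy as np
import pandas as pd

def chronological_split(df: pd.DataFrame, test_days: int):
    """Hold out the final `test_days` rows; a time series is never shuffled."""
    return df.iloc[:-test_days], df.iloc[-test_days:]

def seasonal_naive(train_values: np.ndarray, horizon: int, season: int = 7) -> np.ndarray:
    """Repeat the last `season` observed values across the horizon."""
    last_season = train_values[-season:]
    reps = int(np.ceil(horizon / season))
    return np.tile(last_season, reps)[:horizon]

def mae(actual: np.ndarray, predicted: np.ndarray) -> float:
    """Mean absolute error between two equally long arrays."""
    return float(np.mean(np.abs(actual - predicted)))

# Synthetic history with an exact weekly pattern (illustration only)
toy_dates = pd.date_range("2024-01-01", periods=56, freq="D")
weekly = np.array([5000, 5200, 5100, 5300, 5600, 3000, 2800])
toy_history = pd.DataFrame({"Date": toy_dates, "Daily_Step_Count": np.tile(weekly, 8)})

train_part, test_part = chronological_split(toy_history, test_days=14)
naive_pred = seasonal_naive(train_part["Daily_Step_Count"].to_numpy(), horizon=len(test_part))
naive_mae = mae(test_part["Daily_Step_Count"].to_numpy(), naive_pred)
print(f"Seasonal-naive MAE: {naive_mae:.1f}")
```

Any model can be scored the same way: fit it on the training part only, predict the held-out dates, and compare its MAE with the seasonal-naive figure.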

## Part B: Model 2 - EBM

In [None]:
from interpret.glassbox import ExplainableBoostingRegressor

# Prepare: drop rows with missing values (the first rows, where lags are undefined)
feature_cols = ['day_of_week', 'steps_t_minus_1', 'steps_t_minus_7', 'rolling_avg_7d']
train = features.dropna()
X = train[feature_cols]
y = train['Daily_Step_Count']

# Train
ebm = ExplainableBoostingRegressor()
ebm.fit(X, y)
print("✓ EBM trained")
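Because the EBM's inputs include lagged step counts, it can predict only one day ahead directly. Reaching a 365-day horizon means feeding each prediction back in as the next day's lag. A minimal sketch of that recursive loop, assuming the four features used in this notebook, a rolling average taken over the 7 days before the target day, and at least 7 days of gap-free history. `recursive_forecast` and the stand-in `MeanOfLags` model are hypothetical; any fitted regressor with a `predict` method could be passed instead:

```python
import numpy as np
import pandas as pd

FEATURES = ["day_of_week", "steps_t_minus_1", "steps_t_minus_7", "rolling_avg_7d"]

def recursive_forecast(model, history: pd.Series, horizon: int) -> pd.Series:
    """Forecast `horizon` days by feeding each prediction back in as a lag.

    `history` is a daily series indexed by date with at least 7 values.
    """
    values = list(history.to_numpy(dtype=float))
    dates = pd.date_range(history.index[-1] + pd.Timedelta(days=1),
                          periods=horizon, freq="D")
    preds = []
    for day in dates:
        row = pd.DataFrame([{
            "day_of_week": day.dayofweek,
            "steps_t_minus_1": values[-1],
            "steps_t_minus_7": values[-7],
            "rolling_avg_7d": float(np.mean(values[-7:])),
        }], columns=FEATURES)
        y_hat = float(model.predict(row)[0])
        preds.append(y_hat)
        values.append(y_hat)  # the prediction becomes history for the next step
    return pd.Series(preds, index=dates, name="Predicted_Steps")

class MeanOfLags:
    """Hypothetical stand-in model: averages the two lag features."""
    def predict(self, X: pd.DataFrame) -> np.ndarray:
        return ((X["steps_t_minus_1"] + X["steps_t_minus_7"]) / 2).to_numpy()

# Toy history (illustration only): two weeks of constant step counts
toy_history = pd.Series(
    [5000.0] * 14,
    index=pd.date_range("2024-01-01", periods=14, freq="D"),
)
toy_forecast = recursive_forecast(MeanOfLags(), toy_history, horizon=30)
print(toy_forecast.head())
```

Errors compound in a recursive forecast, since every step builds on earlier predictions, so accuracy at day 365 is likely to be much lower than at day 1.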

## Part C: Explainability

In [None]:
# Feature importance
exp = ebm.explain_global()
importance = pd.DataFrame({
    'Feature': exp.data()['names'],
    'Importance': exp.data()['scores']
}).sort_values('Importance', ascending=False)
print("Top Features:")
print(importance)
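The global explanation reports the importances the EBM learned internally. A model-agnostic cross-check is permutation importance: shuffle one feature column at a time and measure how much the error grows. The sketch below is a plain NumPy version, assuming MAE is the metric of interest; `permutation_importance_mae` and the stand-in `OnlyFirstFeature` model are hypothetical, and a fitted model with held-out features and targets could be passed instead:

```python
import numpy as np
import pandas as pd

def permutation_importance_mae(model, X: pd.DataFrame, y: np.ndarray,
                               n_repeats: int = 5, seed: int = 0) -> pd.Series:
    """Mean increase in MAE when each column is shuffled on its own."""
    rng = np.random.default_rng(seed)
    base = np.mean(np.abs(y - model.predict(X)))
    increases = {}
    for col in X.columns:
        deltas = []
        for _ in range(n_repeats):
            shuffled = X.copy()
            shuffled[col] = rng.permutation(shuffled[col].to_numpy())
            deltas.append(np.mean(np.abs(y - model.predict(shuffled))) - base)
        increases[col] = float(np.mean(deltas))
    return pd.Series(increases).sort_values(ascending=False)

class OnlyFirstFeature:
    """Hypothetical stand-in model that ignores every column but the first."""
    def predict(self, X: pd.DataFrame) -> np.ndarray:
        return X.iloc[:, 0].to_numpy(dtype=float)

# Toy data (illustration only): the target equals the first feature
toy_X = pd.DataFrame({
    "steps_t_minus_1": np.arange(100, dtype=float),
    "day_of_week": np.arange(100) % 7,
})
toy_y = toy_X["steps_t_minus_1"].to_numpy()
ranking = permutation_importance_mae(OnlyFirstFeature(), toy_X, toy_y)
print(ranking)
```

A feature that ranks high in both views is a safer basis for interpretation than one that appears in only one of them.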

## Summary

- ✅ Data loaded and preprocessed
- ✅ Features engineered
- ✅ Prophet model trained
- ✅ EBM model trained
- ✅ Explainability analyzed
- ✅ 365-day forecast generated (Prophet only)

**Project Complete!**
