# Time Series Analysis in Medicine and Biology
## Practical Course 2026 – University of Tübingen

---
[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/ShamsaraE/time-series-medicine-biology-2026/blob/main/notebooks/02_malaria_cases_multiplicative_model.ipynb)

---

# Case Study 2: Malaria (Kericho) – Multiplicative Time Series Modeling

This notebook explores multiplicative structure in malaria case data using `statsmodels`.

We include:
1. Classical additive decomposition on log scale
2. Classical multiplicative decomposition
3. STL decomposition
4. Log-linear regression using OLS

## 1. Load and prepare data

If needed, the data can also be downloaded from:
https://sites.google.com/view/tsbiostat/home


CSV: `../data/kericho_data.csv` with columns `Year, Month, Cases, Rain, minT, maxT, VCAP`.

We convert `Year+Month` to a monthly datetime index and compute `log_cases`.

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

# Load dataset
df = pd.read_csv('../data/kericho_data.csv')

# Map three-letter month abbreviations to month numbers
month_map = {'Jan': 1, 'Feb': 2, 'Mar': 3, 'Apr': 4, 'May': 5, 'Jun': 6,
             'Jul': 7, 'Aug': 8, 'Sep': 9, 'Oct': 10, 'Nov': 11, 'Dec': 12}
df['Month_num'] = df['Month'].map(month_map)

# Create a month-start datetime index; asfreq('MS') inserts NaN rows for any missing months
df['date'] = pd.to_datetime(dict(year=df['Year'], month=df['Month_num'], day=1))
df = df.sort_values('date').set_index('date').asfreq('MS')

# Safe log transform: replace zero counts with 1 so the log is defined (log(1) = 0)
df['Cases_adj'] = df['Cases'].replace(0, 1)
df['log_cases'] = np.log(df['Cases_adj'])

df.head()
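
Before decomposing, it can help to check what the preparation step did to the series. `asfreq('MS')` adds a NaN row for every month absent from the file, and `seasonal_decompose` raises an error on missing values. The sketch below uses a small hypothetical frame (`toy`, not the Kericho data) to show one way to list missing months and count the zero values that the log transform replaces.

```python
import pandas as pd

# Hypothetical toy frame standing in for the real data: March 2000 is absent
toy = pd.DataFrame(
    {"Cases": [12, 0, 30, 25]},
    index=pd.to_datetime(["2000-01-01", "2000-02-01", "2000-04-01", "2000-05-01"]),
)

# Reindex to every month start; absent months become NaN rows
toy = toy.asfreq("MS")

# Months with no observation after reindexing
missing_months = toy.index[toy["Cases"].isna()]

# Zero counts that would be replaced by 1 before taking the log
n_zero = int((toy["Cases"] == 0).sum())

print("Missing months:", list(missing_months.strftime("%Y-%m")))
print("Zero counts:", n_zero)
```

The same two checks can be run on `df` directly. How to treat gaps (interpolation, dropping incomplete years) depends on the data and is not prescribed here.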

## 2. Raw vs Log Series

In [None]:
plt.figure(figsize=(10,4))
plt.plot(df['Cases'])
plt.title('Malaria Cases (raw)')
plt.show()

plt.figure(figsize=(10,4))
plt.plot(df['log_cases'])
plt.title('Malaria Cases (log scale)')
plt.show()

## 3. Classical Additive Decomposition (on Log Scale)

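A multiplicative structure becomes additive after a log transform: if $Y_t = T_t \times S_t \times R_t$ (trend, seasonal, and residual components), then $\log Y_t = \log T_t + \log S_t + \log R_t$, so an additive decomposition can be applied to `log_cases`.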

In [None]:
from statsmodels.tsa.seasonal import seasonal_decompose

result_add = seasonal_decompose(df['log_cases'], model='additive', period=12)
result_add.plot()
plt.show()
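
The effect of the log transform can be illustrated on a synthetic series. The sketch below builds a noise-free multiplicative series from made-up components (the growth rate and seasonal amplitude are arbitrary assumptions) and compares the seasonal swing in the first and last year on both scales.

```python
import numpy as np

# Synthetic components (arbitrary illustrative values, not fitted to any data)
months = np.arange(48)
trend = 100.0 * 1.01 ** months                           # 1% growth per month
seasonal = 1.0 + 0.5 * np.sin(2 * np.pi * months / 12)   # always positive
rng = np.random.default_rng(0)
noise = np.exp(rng.normal(0.0, 0.1, size=months.size))   # multiplicative noise

# Multiplicative series and the sum of its log components
y = trend * seasonal * noise
log_sum = np.log(trend) + np.log(seasonal) + np.log(noise)
print("log(y) equals sum of log components:", np.allclose(np.log(y), log_sum))

# Peak-to-trough range within the first and last year (noise-free part)
signal = trend * seasonal
raw_swing_first = np.ptp(signal[:12])
raw_swing_last = np.ptp(signal[36:])
log_swing_first = np.ptp(np.log(signal[:12]))
log_swing_last = np.ptp(np.log(signal[36:]))

print("Raw swing grows with the level:", raw_swing_last > raw_swing_first)
print("Log swing is constant:", np.isclose(log_swing_first, log_swing_last))
```

On the raw scale the seasonal swing scales with the trend, while on the log scale it has the same size every year, which is what an additive model assumes.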

## 4. Classical Multiplicative Decomposition (Original Scale)

This directly assumes:
$Y_t = T_t \times S_t \times R_t$

where $T_t$ is the trend, $S_t$ the seasonal component, and $R_t$ the residual.

In [None]:
result_mult = seasonal_decompose(df['Cases_adj'], model='multiplicative', period=12)
result_mult.plot()
plt.show()

## 5. STL Decomposition (Flexible Seasonality)

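STL (Seasonal-Trend decomposition using LOESS) allows the seasonal pattern to change gradually over time, whereas the classical decompositions above assume a fixed seasonal pattern.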

In [None]:
from statsmodels.tsa.seasonal import STL

stl = STL(df['log_cases'], period=12)
res_stl = stl.fit()
res_stl.plot()
plt.show()