# Time series components

## Data

In [15]:
import pandas as pd

df = pd.read_parquet("../../../data/EIA/fuel_type_data_california.parquet")
df

Unnamed: 0,period,respondent,respondent-name,fueltype,type-name,value,value-units
0,2025-04-22 20:00:00-07:00,CAL,California,BAT,Battery storage,179,megawatthours
1,2025-04-22 20:00:00-07:00,CAL,California,COL,Coal,216,megawatthours
...,...,...,...,...,...,...,...
452546,2018-12-31 17:00:00-07:00,CAL,California,WAT,Hydro,2880,megawatthours
452547,2018-12-31 17:00:00-07:00,CAL,California,WND,Wind,2628,megawatthours


In [16]:
df = (df
 .query('fueltype == "SUN"')
 .set_index('period')['value'].to_frame()
 .sort_index()
 .loc['2019':'2024']
 .resample('4W')
 .mean()
)

df

period,value
2019-01-06 00:00:00-07:00,2201.500000
2019-02-03 00:00:00-07:00,2000.586310
...,...
2024-12-29 00:00:00-07:00,3531.580357
2025-01-26 00:00:00-07:00,4222.479167
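
The `'4W'` rule above groups the series into four-week bins, so it may help to check how many of those bins fall in one calendar year before choosing a seasonal period later on. The sketch below uses only the standard library and assumes the bin labels advance in fixed 28-day steps from the first label shown in the output; it is an illustration, not a replacement for inspecting the resampled index directly.

```python
from datetime import date, timedelta

# First bin label taken from the output above; the 28-day step mirrors the '4W' rule.
first_bin = date(2019, 1, 6)
step = timedelta(weeks=4)

# Collect every bin label that falls in calendar year 2019.
bins_2019 = []
current = first_bin
while current.year == 2019:
    bins_2019.append(current)
    current += step

print(len(bins_2019))  # number of 4-week bins in 2019
print(bins_2019[1])    # second label, to compare with the table above
```

Under these assumptions a year holds 13 four-week bins, not 12, which is worth keeping in mind when a `period` argument is chosen for the decomposition.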


In [17]:
df.columns = ['values']
df

period,values
2019-01-06 00:00:00-07:00,2201.500000
2019-02-03 00:00:00-07:00,2000.586310
...,...
2024-12-29 00:00:00-07:00,3531.580357
2025-01-26 00:00:00-07:00,4222.479167


## Individual component behaviour based on model

Components:

- Trend (T)
- Seasonality (S)
- Residual or Irregular Component (I)

Models:

- Additive model: $y_t = T_t + S_t + I_t$
- Multiplicative model: $y_t = T_t \times S_t \times I_t$
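
The two models are closely related: taking logarithms of a multiplicative series gives an additive one, because the log of a product is the sum of the logs. The short check below illustrates this with made-up component values (they are hypothetical and not taken from the dataset).

```python
import math

# Hypothetical component values for a single time step (not from the data above).
trend, seasonal, irregular = 4000.0, 1.25, 0.98

# Multiplicative model: the observation is the product of the components.
y = trend * seasonal * irregular

# In log space the same observation is a sum of log-components.
log_sum = math.log(trend) + math.log(seasonal) + math.log(irregular)

print(math.isclose(math.log(y), log_sum))
```

This is why a multiplicative decomposition is sometimes carried out as an additive decomposition of the log-transformed series, provided all values are strictly positive.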

### Additive model

In [18]:
import statsmodels.api as sm

In [19]:
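data = df['values'].values
# NOTE: '4W' resampling gives about 13 observations per year (52 / 4),
# so period=12 does not span a full annual cycle; period=13 would.
result = sm.tsa.seasonal_decompose(data, model='additive', period=12)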

r = (df
 .assign(
    trend = result.trend,
    seasonal = result.seasonal,
    residual = result.resid)
 .dropna())

r

period,values,trend,seasonal,residual
2019-06-23 00:00:00-07:00,4854.479167,3746.314298,8.142827,1100.022041
2019-07-21 00:00:00-07:00,5131.361607,3729.616815,222.545419,1179.199373
...,...,...,...,...
2024-07-14 00:00:00-07:00,7498.229167,5862.769655,163.530373,1471.929139
2024-08-11 00:00:00-07:00,7324.133929,5902.322359,-91.339450,1513.151020


In [20]:
r['model_result'] = r.trend + r.seasonal + r.residual
r

period,values,trend,seasonal,residual,model_result
2019-06-23 00:00:00-07:00,4854.479167,3746.314298,8.142827,1100.022041,4854.479167
2019-07-21 00:00:00-07:00,5131.361607,3729.616815,222.545419,1179.199373,5131.361607
...,...,...,...,...,...
2024-07-14 00:00:00-07:00,7498.229167,5862.769655,163.530373,1471.929139,7498.229167
2024-08-11 00:00:00-07:00,7324.133929,5902.322359,-91.339450,1513.151020,7324.133929


In [21]:
dfs = {}
dfs['additive'] = r

### Multiplicative model

In [8]:
data = df['values'].values
result = sm.tsa.seasonal_decompose(data, model='multiplicative', period=12)

r = (df
 .assign(
    trend = result.trend,
    seasonal = result.seasonal,
    residual = result.resid)
 .dropna())

r

period,values,trend,seasonal,residual
2019-06-23 00:00:00-07:00,4854.479167,3746.314298,1.036189,1.250546
2019-07-21 00:00:00-07:00,5131.361607,3729.616815,1.083876,1.269372
...,...,...,...,...
2024-07-14 00:00:00-07:00,7498.229167,5862.769655,1.002435,1.275850
2024-08-11 00:00:00-07:00,7324.133929,5902.322359,0.948840,1.307796


In [9]:
r['model_result'] = r.trend * r.seasonal * r.residual
r

period,values,trend,seasonal,residual,model_result
2019-06-23 00:00:00-07:00,4854.479167,3746.314298,1.036189,1.250546,4854.479167
2019-07-21 00:00:00-07:00,5131.361607,3729.616815,1.083876,1.269372,5131.361607
...,...,...,...,...,...
2024-07-14 00:00:00-07:00,7498.229167,5862.769655,1.002435,1.275850,7498.229167
2024-08-11 00:00:00-07:00,7324.133929,5902.322359,0.948840,1.307796,7324.133929


In [10]:
dfs['multiplicative'] = r

## Model comparison

### Numerical

In [11]:
df = pd.concat(dfs, axis=1).melt(ignore_index=False).reset_index()
df.columns = ["month", "model", "component", "value"]

In [12]:
df

Unnamed: 0,month,model,component,value
0,2019-06-23 00:00:00-07:00,additive,values,4854.479167
1,2019-07-21 00:00:00-07:00,additive,values,5131.361607
...,...,...,...,...
678,2024-07-14 00:00:00-07:00,multiplicative,model_result,7498.229167
679,2024-08-11 00:00:00-07:00,multiplicative,model_result,7324.133929


### Visual

In [13]:

from modules import utils
utils.configure_plotly_template(showlegend=True)

In [14]:
import plotly.express as px

fig = px.line(
    data_frame=df,
    x='month',
    y='value',
    color='component',
    facet_col='model',
    facet_row='component',
    width=1500,
    height=1000,
    facet_col_spacing=0.1,
)

# Give every facet its own y-axis range and show tick labels on all of them.
fig.update_yaxes(matches=None, showticklabels=True)

fig

## Interpretation 

https://chatgpt.com/c/680a1854-fe68-800c-9642-8544ba7e471b

Use **additive** or **multiplicative** decomposition based on how the seasonal fluctuations behave relative to the trend:

---

### ✅ Use **additive** when:
- The **magnitude** of seasonal changes stays **constant** over time.
- The seasonal pattern does **not scale** with the trend.
- Example: sales increase over time, but holiday peaks remain around +20 units consistently.

### ✅ Use **multiplicative** when:
- The **magnitude** of seasonality **grows or shrinks** with the trend.
- The seasonal pattern **scales proportionally** to the level of the series.
- Example: if overall sales double, holiday peaks also double.
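
The two sales examples above can be put into numbers. This is a minimal sketch with hypothetical figures: a fixed offset of 20 units for the additive case and a factor of 1.2 for the multiplicative case, chosen so that both give the same peak at a base level of 100.

```python
def additive_peak(level, seasonal_offset=20):
    """Holiday peak when the seasonal effect is a fixed number of units."""
    return level + seasonal_offset


def multiplicative_peak(level, seasonal_factor=1.2):
    """Holiday peak when the seasonal effect is a fixed proportion of the level."""
    return level * seasonal_factor


# Compare the peaks before and after overall sales double.
for level in (100, 200):
    print(level, additive_peak(level), multiplicative_peak(level))
```

When the level doubles from 100 to 200, the additive peak still sits 20 units above the level, while the multiplicative peak sits 40 units above it.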

---

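### Quick visual test:
- If the seasonal swings in the **original series** have **roughly equal amplitude** throughout → **additive**.
- If the **amplitude** of those swings **increases or decreases with the trend** → **multiplicative**.
- Judge this on the original series, not on the extracted seasonal component: classical decomposition repeats the same seasonal pattern in every cycle by construction.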
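
The visual test can also be approximated numerically. The sketch below is one possible heuristic, not a standard library routine: it splits a series into complete cycles, measures each cycle's level (mean) and amplitude (max minus min), and asks whether the raw amplitude or the amplitude-to-level ratio is steadier across cycles. The helper names, the cycle length of 13, and both synthetic series are assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def cycle_stats(series, period):
    """Return (level, amplitude) for each complete cycle of `period` points."""
    stats = []
    for start in range(0, len(series) - period + 1, period):
        cycle = series[start:start + period]
        stats.append((mean(cycle), max(cycle) - min(cycle)))
    return stats


def relative_spread(values):
    """Coefficient of variation: standard deviation divided by the mean."""
    return pstdev(values) / mean(values)


def suggest_model(series, period):
    """Heuristic: steadier amplitude/level ratio points to a multiplicative model."""
    stats = cycle_stats(series, period)
    amplitudes = [amp for _, amp in stats]
    ratios = [amp / level for level, amp in stats]
    if relative_spread(ratios) < relative_spread(amplitudes):
        return "multiplicative"
    return "additive"


period = 13  # hypothetical cycle length
steps = range(5 * period)

# Synthetic series: seasonal swing proportional to the level vs. fixed in size.
scaling = [(100 + 2 * t) * (1 + 0.3 * math.sin(2 * math.pi * t / period)) for t in steps]
fixed = [(100 + 2 * t) + 30 * math.sin(2 * math.pi * t / period) for t in steps]

print(suggest_model(scaling, period))
print(suggest_model(fixed, period))
```

On real data the result is sensitive to noise and to the chosen cycle length, so it is best treated as a complement to the plot, not a decision rule.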

---

In the plot above:
- The **seasonal amplitude** of the `values` series appears to **increase** over time (especially toward the right-hand side).
- So the **multiplicative model** is likely the better fit.


Based on the decomposition plots of California solar generation (the `SUN` fuel type in the EIA data, in megawatthours), these are the **key insights** and **conclusions**:

---

### 📈 **1. Strong Upward Trend**
- Both additive and multiplicative models show a **clear long-term increase** in solar generation.
- The trend component rises from about 3,746 MWh in June 2019 to about 5,902 MWh in August 2024, which points to **growing solar output** over the period.

---

### 🔁 **2. Clear Seasonality**
- There is a **repeating annual pattern**:
  - Peaks in mid-year (summer, when days are longer).
  - Troughs around the turn of the year (winter).
- This regular pattern can be used for **forecasting** future seasonality.

---

### 📊 **3. Additive vs. Multiplicative**
- In the **additive model**, seasonal effects are **constant in magnitude**.
- In the **multiplicative model**, seasonal effects **grow with the trend**.
- Visually, the **multiplicative model better fits** the growing amplitude of the original data, especially toward the end of the series.

---

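### 📉 **4. Residuals Show Model Fit**
- Residuals measure what trend and seasonality leave unexplained: values near 0 (additive) or near 1 (multiplicative) indicate a good fit.
- In the plot, residuals in the **multiplicative model appear tighter** and more stable, especially in later years where the additive model leaves larger spikes.
- The residuals are nevertheless large in both models. On 2019-06-23, for example, the additive residual is about 1,100 MWh against a seasonal term of about 8 MWh, and the multiplicative residual is about 1.25. This suggests the seasonal component misses much of the annual cycle, so check that `period` matches the number of observations per year before preferring either model on residuals alone.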

---

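### 🎯 **5. Model Result Reproduces the Original**
- In both models, `model_result` matches the original `values` series exactly, as the tables above show.
- This holds by construction: the residual is whatever remains after removing trend and seasonality, so recombining the three components always returns the observation. The `model_result` row is therefore a consistency check, not a measure of fit.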
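
The relationship between `values`, the three components, and `model_result` can be checked by hand. The sketch below copies the 2019-06-23 row from the tables above and assumes the residual is defined as the remainder once trend and seasonality are removed (subtracted in the additive model, divided out in the multiplicative one).

```python
# Values copied from the 2019-06-23 rows of the tables above.
observed = 4854.479167
trend = 3746.314298
additive_seasonal = 8.142827
multiplicative_seasonal = 1.036189

# Residual as the remainder after removing trend and seasonality.
additive_residual = observed - trend - additive_seasonal
multiplicative_residual = observed / (trend * multiplicative_seasonal)

# Recombining the components returns the observation in both models.
additive_result = trend + additive_seasonal + additive_residual
multiplicative_result = trend * multiplicative_seasonal * multiplicative_residual

print(round(additive_residual, 3), round(multiplicative_residual, 3))
print(round(additive_result, 6), round(multiplicative_result, 6))
```

Both recombined values equal the observation up to floating-point rounding, so `model_result` cannot distinguish the two models; the size and structure of the residuals can.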

---

### 🧠 Strategic conclusions:
- **Forecasting models** (like Holt-Winters or SARIMA) should likely use a **multiplicative seasonal component** for this dataset.
- This time series is driven by **long-term growth** and **scaling seasonal effects**, not just fixed monthly deviations.
- **Residual diagnostics** (e.g., randomness, ACF) would help confirm this statistically.
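
As a rough illustration of the residual diagnostics mentioned above, the sample autocorrelation at the seasonal lag shows whether a cycle is still present in the residuals. This is a minimal pure-Python sketch: the `autocorrelation` helper, the cycle length of 13, and the synthetic residual series are all assumptions made for the example.

```python
import math
from statistics import mean


def autocorrelation(values, lag):
    """Sample autocorrelation of `values` at the given positive lag."""
    centre = mean(values)
    deviations = [v - centre for v in values]
    denominator = sum(d * d for d in deviations)
    numerator = sum(
        deviations[t] * deviations[t + lag] for t in range(len(values) - lag)
    )
    return numerator / denominator


period = 13  # hypothetical cycle length

# Synthetic "residuals" that still contain a full seasonal cycle.
leftover_cycle = [math.sin(2 * math.pi * t / period) for t in range(5 * period)]

print(round(autocorrelation(leftover_cycle, period), 3))
```

A value this far from zero at the seasonal lag would indicate that the decomposition left seasonality in the residuals; for residuals that look like noise, the autocorrelation at every lag should stay close to zero.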

---
