# Cell 1: Introduction to Change Point Modeling

**Purpose**: This notebook fits a Bayesian change point model with PyMC3 to detect structural breaks in Brent oil prices, then associates those breaks with major events. It is part of the 10 Academy Week 10 Challenge (Task 2).

**Objectives**:
- Load and prepare Brent oil price data.
- Build and run a Bayesian change point model.
- Interpret results to identify change points and quantify impacts.
- Associate change points with events from `major_events.csv`.

**Input**: `data/processed/cleaned_oil_data.csv` and `data/events/major_events.csv`, read with `../`-relative paths from the notebook's directory.

**Output**: Plots in `results/figures/`, model results in `results/models/`, and an interpretation dictionary.
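
The notebook imports `build_changepoint_model` from `src.models.changepoint_model`, whose internals are not shown here. As a rough illustration of what a posterior over a change point looks like, the sketch below assumes a single shift in the mean, a known noise level `sigma`, flat priors on the two segment means, and a uniform prior on the change index `tau`. Under those assumptions the posterior over `tau` can be computed exactly on a grid, so no MCMC is needed. The function name `changepoint_posterior` and the synthetic series are hypothetical, and the code uses only the standard library so it runs without PyMC3.

```python
import math
import random


def changepoint_posterior(y, sigma):
    """Posterior over a single mean-shift index tau, for tau = 1 .. len(y) - 1.

    Assumes y[:tau] ~ Normal(mu1, sigma) and y[tau:] ~ Normal(mu2, sigma),
    a known sigma, flat priors on mu1 and mu2, and a uniform prior on tau.
    """
    n = len(y)
    log_post = []
    for tau in range(1, n):
        left, right = y[:tau], y[tau:]
        mean_l = sum(left) / len(left)
        mean_r = sum(right) / len(right)
        sse = (sum((v - mean_l) ** 2 for v in left)
               + sum((v - mean_r) ** 2 for v in right))
        # Integrating out the two means leaves the residual sum of squares
        # plus a -0.5 * log(segment length) term for each segment.
        log_post.append(-sse / (2 * sigma ** 2)
                        - 0.5 * math.log(len(left))
                        - 0.5 * math.log(len(right)))
    # Normalise in a numerically stable way (subtract the maximum first).
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


# Synthetic series with a known shift at index 60 (not real Brent prices).
rng = random.Random(0)
y = ([rng.gauss(50.0, 2.0) for _ in range(60)]
     + [rng.gauss(58.0, 2.0) for _ in range(40)])

posterior = changepoint_posterior(y, sigma=2.0)
# posterior[i] is the probability of tau = i + 1.
tau_map = 1 + max(range(len(posterior)), key=posterior.__getitem__)
print(f"Most probable change point index: {tau_map}")
```

A sampler such as the one wrapped by `run_mcmc` becomes necessary once `sigma` and the segment means get their own priors, because the posterior over `tau` then no longer has this simple closed form.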

In [None]:
# Cell 2: Import Required Libraries
# Description: Import Python libraries for modeling, data handling, and visualization.

import pandas as pd
import numpy as np
import pymc3 as pm
import matplotlib.pyplot as plt
from src.models.changepoint_model import (
    build_changepoint_model,
    run_mcmc,
    plot_changepoint_results,
    interpret_changepoint,
)

# Ensure plots are displayed inline
%matplotlib inline

In [None]:
# Cell 3: Load Data
# Description: Load cleaned Brent oil price data and major events dataset.
# Input: Cleaned oil data and events CSV files
# Output: DataFrames for prices and events

# Define file paths
PRICE_PATH = '../data/processed/cleaned_oil_data.csv'
EVENTS_PATH = '../data/events/major_events.csv'

# Load price data
price_df = pd.read_csv(PRICE_PATH, parse_dates=['Date'])

# Load events data
events_df = pd.read_csv(EVENTS_PATH)

# Display first few rows to verify
print("Price Data Head:")
print(price_df.head())
print("\nEvents Data Head:")
print(events_df.head())

In [None]:
# Cell 4: Prepare Data for Modeling
# Description: Extract price array and dates for change point modeling.
# Input: Price DataFrame
# Output: Numpy array of prices and Series of dates

# Extract price data as a NumPy array
prices = price_df['Price'].to_numpy()

# Extract dates
dates = price_df['Date']

# Number of observations (one per row of price data)
n_days = len(prices)

# Print data summary
print(f"Number of days: {n_days}")
print(f"Price range: {prices.min():.2f} to {prices.max():.2f} USD/barrel")

In [None]:
# Cell 5: Build and Run Change Point Model
# Description: Build the Bayesian model and run MCMC sampling.
# Input: Price array and number of days
# Output: PyMC3 model and MCMC trace

# Build model
model = build_changepoint_model(prices, n_days)

# Run MCMC sampling
trace = run_mcmc(model, draws=1000, tune=1000)

# Check convergence: R-hat values close to 1.0 suggest the chains agree
print("MCMC Summary:")
print(pm.summary(trace))
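
The summary table includes an R-hat diagnostic for each parameter. To show what that number measures, here is a minimal sketch of the classic (non-split) Gelman-Rubin statistic, which compares the variance between chains with the variance within them. Recent PyMC3 releases delegate the summary to ArviZ, which uses a rank-normalized split variant, so values from this sketch will not match the table exactly. The function name `gelman_rubin` and the synthetic chains are hypothetical.

```python
import random
import statistics


def gelman_rubin(chains):
    """Classic (non-split) R-hat for equal-length chains of one parameter."""
    n = len(chains[0])
    chain_means = [statistics.mean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.mean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5


rng = random.Random(1)

# Four chains drawn from the same distribution: R-hat should sit near 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]

# Two chains stuck around a different value: R-hat should be well above 1.
stuck = [[rng.gauss(m, 1.0) for _ in range(500)] for m in (0.0, 0.0, 3.0, 3.0)]

print(f"R-hat, well-mixed chains: {gelman_rubin(mixed):.3f}")
print(f"R-hat, stuck chains:      {gelman_rubin(stuck):.3f}")
```

For a discrete change point such as `tau`, chains that settle on different break dates would show up as a large R-hat in the same way as the "stuck" chains here.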

In [None]:
# Cell 6: Visualize Model Results
# Description: Plot data with estimated change point and posterior distribution of tau.
# Input: MCMC trace, price data, and dates
# Output: Plot saved to '../results/figures/changepoint_results.png'

# Define output path (relative to the notebook's directory, like the input paths)
OUTPUT_PATH = '../results/figures/changepoint_results.png'

# Plot results
plot_changepoint_results(trace, prices, dates, OUTPUT_PATH)

# Display the figure (plot_changepoint_results has already saved it to OUTPUT_PATH)
plt.show()

In [None]:
# Cell 7: Interpret Change Point Results
# Description: Associate change point with events and quantify price impact.
# Input: MCMC trace, dates, and events DataFrame
# Output: Dictionary with interpretation results

# Interpret results
results = interpret_changepoint(trace, dates, events_df)

# Print interpretation
print("Change Point Interpretation:")
for key, value in results.items():
    print(f"{key}: {value}")
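
How `interpret_changepoint` matches a change point to an event is not shown in this notebook. One plausible approach, sketched below under that caveat, is to pick the event closest in time to the estimated change date within a fixed window, and to report the percentage change in mean price across the break. The helper names `nearest_event` and `pct_change_in_mean`, the 30-day window, and all dates and prices are invented for illustration only.

```python
from datetime import date


def nearest_event(change_date, events, max_days=30):
    """Return (name, offset_days) for the closest event within max_days, else None.

    offset_days is positive when the change point falls after the event.
    """
    best = None
    for name, event_date in events:
        offset = (change_date - event_date).days
        if abs(offset) <= max_days and (best is None or abs(offset) < abs(best[1])):
            best = (name, offset)
    return best


def pct_change_in_mean(prices, tau):
    """Percentage change from the mean of prices[:tau] to the mean of prices[tau:]."""
    before, after = prices[:tau], prices[tau:]
    mean_before = sum(before) / len(before)
    mean_after = sum(after) / len(after)
    return 100.0 * (mean_after - mean_before) / mean_before


# Placeholder events and change date (not taken from major_events.csv).
events = [
    ("Event A", date(2020, 1, 15)),
    ("Event B", date(2020, 3, 9)),
    ("Event C", date(2020, 6, 1)),
]
match = nearest_event(date(2020, 3, 12), events)
print(f"Closest event: {match}")

# Placeholder prices with a break at index 4.
toy_prices = [50.0, 52.0, 48.0, 50.0, 60.0, 62.0, 58.0, 60.0]
impact = pct_change_in_mean(toy_prices, tau=4)
print(f"Change in mean price: {impact:.1f}%")
```

Proximity in time is only suggestive: an event near a change point is a candidate explanation, not evidence of causation.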