## 1. Data Analysis Workflow

The analysis follows a structured workflow from raw data to actionable insights.

---

### Step 1: Data Collection

- Load the Brent oil price dataset (1987–2022).
- Collect external event data from trusted sources such as:
  - OPEC reports  
  - News archives  
  - World Bank  
  - IMF  

---

### Step 2: Data Cleaning

- Convert the `Date` column to datetime format.
- Handle missing or duplicate values.
- Sort data chronologically.
- Check for outliers.
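
The cleaning steps above can be sketched with pandas as follows. This is a minimal illustration, not the project's actual loader: the `Price` column name, the date format string, the toy rows, and the 25% outlier threshold are all assumptions made for the example.

```python
import io

import pandas as pd

# Toy stand-in for the raw file: unsorted, one duplicate row, one missing price.
# The values are made up for illustration only.
raw_csv = io.StringIO(
    "Date,Price\n"
    "2020-01-03,11.0\n"
    "2020-01-02,10.0\n"
    "2020-01-02,10.0\n"
    "2020-01-06,\n"
    "2020-01-07,12.0\n"
)

df = pd.read_csv(raw_csv)

# Convert the Date column to datetime (the format string is an assumption).
df["Date"] = pd.to_datetime(df["Date"], format="%Y-%m-%d")

# Drop repeated dates and rows without a price, then sort chronologically.
df = (
    df.drop_duplicates(subset="Date")
    .dropna(subset=["Price"])
    .sort_values("Date")
    .reset_index(drop=True)
)

# Flag candidate outliers: moves between consecutive rows beyond a
# hypothetical 25% threshold. Flagged rows are inspected, not deleted.
df["pct_change"] = df["Price"].pct_change()
df["is_outlier"] = df["pct_change"].abs() > 0.25
```

Flagging rather than dropping outliers is a deliberate choice here: in oil markets, extreme moves are often the very shocks the analysis is trying to detect.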

---

### Step 3: Exploratory Data Analysis (EDA)

- Plot historical price trends.
- Analyze long-term and short-term movements.
- Examine price volatility.
- Identify visually obvious breakpoints.

---

### Step 4: Time Series Diagnostics

- Test stationarity using the Augmented Dickey-Fuller (ADF) test.
- Analyze autocorrelation and partial autocorrelation.
- Examine variance stability.
- Detect seasonality or cycles.

---

### Step 5: Change Point Modeling

- Apply a Bayesian Change Point model using PyMC.
- Estimate break dates.
- Estimate regime-level parameters (mean and variance).
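
One common way to write down a single change point model with regime-level mean and variance is shown below. It is a hedged illustration, not necessarily the specification used in this project; the symbols $\tau$, $\mu_k$, $\sigma_k$ and the hyperparameters $m$, $s$, $s_\sigma$ are assumptions introduced for the sketch, with $T$ the number of observations.

$$
\begin{aligned}
\tau &\sim \text{DiscreteUniform}(1,\, T-1) \\
\mu_1, \mu_2 &\sim \mathcal{N}(m,\, s^2) \\
\sigma_1, \sigma_2 &\sim \text{HalfNormal}(s_\sigma) \\
y_t \mid \tau, \mu, \sigma &\sim
\begin{cases}
\mathcal{N}(\mu_1,\, \sigma_1^2) & t \le \tau \\
\mathcal{N}(\mu_2,\, \sigma_2^2) & t > \tau
\end{cases}
\end{aligned}
$$

The posterior of $\tau$ gives the estimated break date with its uncertainty, and the posteriors of $\mu_k$ and $\sigma_k$ describe each regime.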

---

### Step 6: Event Mapping

- Match detected change points with known events.
- Analyze alignment between statistical breaks and real-world shocks.
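
Matching detected change points to events can be sketched with `pandas.merge_asof`, which pairs each change point with the nearest event date within a tolerance. The change point dates below are hypothetical placeholders, and the 45-day window is an assumption; the three events are taken from the event table in this document.

```python
import pandas as pd

# A few rows from the event table.
events = pd.DataFrame(
    {
        "Event_ID": ["E05", "E08", "E12"],
        "Start_Date": pd.to_datetime(["2008-09-15", "2014-11-27", "2020-03-11"]),
    }
)

# Hypothetical change point dates (placeholders, not model output).
change_points = pd.DataFrame(
    {"change_date": pd.to_datetime(["2008-10-01", "2012-06-15", "2020-03-05"])}
)

# Pair each change point with the nearest event within 45 days (assumed window).
# Both frames must be sorted on their key columns.
matched = pd.merge_asof(
    change_points.sort_values("change_date"),
    events.sort_values("Start_Date"),
    left_on="change_date",
    right_on="Start_Date",
    direction="nearest",
    tolerance=pd.Timedelta(days=45),
)

# Positive lag: the break follows the event; negative: it precedes it.
matched["lag_days"] = (matched["change_date"] - matched["Start_Date"]).dt.days
```

Change points with no event inside the window keep missing values in the event columns, which makes unexplained breaks easy to list separately.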

---

### Step 7: Interpretation

- Quantify impact size.
- Compare regimes.
- Assess uncertainty using posterior distributions.
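
As a rough sketch of how impact size and uncertainty might be summarized, the example below works on posterior draws of the mean price before and after a break. The draws are simulated stand-ins (in practice they would come from the sampler's trace), and the names `mu_before` and `mu_after` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated stand-ins for posterior draws of the regime means (illustrative only).
mu_before = rng.normal(60.0, 1.5, size=4000)
mu_after = rng.normal(45.0, 2.0, size=4000)

# Impact size: absolute and relative shift in the mean, draw by draw.
shift = mu_after - mu_before
pct_change = 100.0 * shift / mu_before

# Equal-tailed 95% credible interval for the shift.
ci_low, ci_high = np.quantile(shift, [0.025, 0.975])

# Posterior probability that the mean price fell after the break.
prob_drop = float((shift < 0).mean())
```

Working draw by draw keeps the uncertainty in both regimes in the comparison, instead of comparing two point estimates.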

---

### Step 8: Reporting and Visualization

- Prepare dashboards and analytical reports.
- Communicate findings clearly to stakeholders.


## 2. Major Events Affecting Brent Oil Prices (1987–2022)

| Event_ID | Start_Date | Event_Name                              | Category        | Description |
|----------|------------|------------------------------------------|-----------------|-------------|
| E01 | 1990-08-02 | Gulf War Begins | Geopolitical | Iraq invades Kuwait, disrupting oil supply |
| E02 | 1997-07-02 | Asian Financial Crisis | Economic | Financial crisis reduces Asian oil demand |
| E03 | 2001-09-11 | 9/11 Attacks | Geopolitical | Terrorist attacks impact global markets |
| E04 | 2003-03-20 | Iraq War | Geopolitical | US-led invasion disrupts Middle East supply |
| E05 | 2008-09-15 | Global Financial Crisis | Economic | Banking collapse leads to demand crash |
| E06 | 2010-04-20 | Deepwater Horizon Spill | Environmental | Major oil spill affects production |
| E07 | 2011-02-15 | Arab Spring | Geopolitical | Political unrest in oil-producing nations |
| E08 | 2014-11-27 | OPEC Production Decision | OPEC Policy | OPEC refuses to cut output |
| E09 | 2016-01-16 | Iran Sanctions Lifted | Political | Iran re-enters oil market |
| E10 | 2016-11-30 | OPEC Production Cut | OPEC Policy | Coordinated output reduction |
| E11 | 2018-05-08 | US Sanctions on Iran | Political | US withdraws from nuclear deal |
| E12 | 2020-03-11 | COVID-19 Pandemic | Economic | Global lockdown reduces oil demand |
| E13 | 2020-04-20 | Oil Price Crash | Market Shock | WTI front-month futures settle below zero for the first time |
| E14 | 2022-02-24 | Russia–Ukraine War | Geopolitical | War disrupts energy supply |
| E15 | 2022-06-01 | EU Ban on Russian Oil | Political | Sanctions reduce Russian exports |


## 3. Assumptions and Limitations

### Assumptions

  - Price changes reflect market reactions to major events.
  - Structural breaks indicate regime changes.
  - External shocks affect prices within short time windows.
  - Data is accurate and representative.

### Limitations

#### 1. Correlation ≠ Causation
  - A statistical change near an event does not prove the event caused the change.
#### 2. Overlapping Events
  - Multiple events may occur close together, making it hard to attribute a break to any single one.
#### 3. Data Granularity
  - Daily prices may miss intraday dynamics.
#### 4. Market Expectations
  - Prices may react before official events, because markets trade on anticipated outcomes.
#### 5. Model Simplification
  - Bayesian models assume simplified regime structures.

## 4. Correlation vs Causation

This study identifies statistical associations between events and price changes. However:

- Correlation means two variables move together.
- Causation means one directly causes the other.

Change point detection shows timing coincidence, not proof of causality.

Causal claims require:

- Control variables
- Structural models
- Natural experiments
  
Therefore, conclusions are probabilistic, not deterministic.

## 5. Communication Channels

Results will be communicated through:

| Audience     | Channel          | Format           |
| ------------ | ---------------- | ---------------- |
| Investors    | Dashboard        | Web App          |
| Policymakers | Policy Brief     | PDF              |
| Analysts     | Technical Report | Jupyter/Markdown |
| Management   | Presentation     | Slides           |


## 6. Understanding Time Series Properties

#### 6.1 Trend Analysis

A trend is a long-term upward or downward movement in prices, driven by shifts in demand, supply, and technology.

  - Observed via line plots and moving averages.

#### 6.2 Stationarity

A (weakly) stationary series has a constant mean and variance, and an autocovariance that depends only on the lag between observations, not on time.

Tested using:

- Augmented Dickey-Fuller (ADF) Test

Oil price levels are usually non-stationary; log returns are typically much closer to stationary.
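
For reference, the ADF test is based on a regression of the following form, where $y_t$ is the series being tested. The constant $\alpha$ and trend term $\beta t$ are optional, and the lag order $p$ is a modeling choice (often selected with an information criterion), so the exact specification here should be read as one common variant rather than the one fixed for this project.

$$
\Delta y_t = \alpha + \beta t + \gamma\, y_{t-1} + \sum_{i=1}^{p} \delta_i\, \Delta y_{t-i} + \varepsilon_t
$$

The null hypothesis is $\gamma = 0$ (a unit root, so the series is non-stationary) against the alternative $\gamma < 0$. A small p-value is evidence against the unit root.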

#### 6.3 Volatility Patterns

Periods of high uncertainty show large price swings.

Measured using:

- Rolling standard deviation

- GARCH-style analysis
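
Rolling standard deviation can be sketched as below on a synthetic price series with a calm period followed by a turbulent one. The 20-day window, the 252-trading-day annualization factor, and the simulated data are assumptions for the example; GARCH modeling is not shown.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic daily log returns: 60 calm days, then 60 turbulent days.
dates = pd.bdate_range("2020-01-01", periods=120)
simulated_returns = np.concatenate(
    [rng.normal(0.0, 0.01, 60), rng.normal(0.0, 0.04, 60)]
)
prices = pd.Series(50.0 * np.exp(np.cumsum(simulated_returns)), index=dates)

# Daily log returns computed from the price series.
log_returns = np.log(prices).diff().dropna()

# 20-day rolling standard deviation (the first 19 values are NaN).
rolling_vol = log_returns.rolling(window=20).std()

# Annualize, assuming roughly 252 trading days per year.
annualized_vol = rolling_vol * np.sqrt(252)
```

Volatility is computed on log returns rather than price levels so that the measure does not simply scale with the price.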

#### 6.4 Implications for Modeling

Oil prices are:

- Non-stationary

- Volatile

- Regime-dependent

Change point models are therefore a suitable choice.

## 7. Change Point Models

#### Purpose

Change point models detect moments where the statistical behavior of a time series changes.

They identify:

  - Sudden jumps

  - Trend shifts

  - Volatility regime changes

#### In This Project

They help:

  - Detect market regime changes

  - Link regimes to real events

  - Estimate impact size

#### Bayesian Approach (PyMC)

Bayesian change point models:

  - Treat change points as random variables

  - Provide probability distributions

  - Quantify uncertainty

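Unlike methods that return a single point estimate for each break date, this approach yields a full posterior distribution, so uncertainty about the timing and size of each change can be reported directly.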
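
To make the idea of a distribution over the change point concrete, here is a small NumPy stand-in. It is not the PyMC model: it enumerates every candidate break index, uses a uniform prior, and plugs in each segment's maximum-likelihood mean and variance, so it only approximates a full Bayesian posterior. The function name `change_point_posterior`, the `min_seg` parameter, and the synthetic data are hypothetical.

```python
import numpy as np


def change_point_posterior(y, min_seg=5):
    """Approximate posterior over a single change point index.

    Uniform prior over candidates; Gaussian likelihood per segment with
    plug-in (maximum-likelihood) mean and variance.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    # Candidate index tau splits the series into y[:tau] and y[tau:].
    taus = np.arange(min_seg, n - min_seg + 1)
    log_lik = np.empty(taus.size)
    for i, tau in enumerate(taus):
        total = 0.0
        for segment in (y[:tau], y[tau:]):
            var = segment.var() + 1e-12  # guard against zero variance
            # Gaussian log-likelihood evaluated at the segment's MLE.
            total += -0.5 * segment.size * (np.log(2.0 * np.pi * var) + 1.0)
        log_lik[i] = total
    # Normalize in log space for numerical stability.
    weights = np.exp(log_lik - log_lik.max())
    return taus, weights / weights.sum()


# Synthetic series with a mean shift at index 100.
rng = np.random.default_rng(0)
y = np.concatenate([rng.normal(20.0, 1.0, 100), rng.normal(30.0, 1.0, 100)])

taus, posterior = change_point_posterior(y)
most_probable_tau = int(taus[posterior.argmax()])
```

A sampler-based model would additionally integrate over the regime parameters and their priors, which this enumeration does not do.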

## 8. Expected Outputs

The change point analysis produces:
1. Change Point Dates

   - Probable dates where regimes changed.

2. Regime Parameters

   For each regime:

   - Mean price

   - Variance

   - Trend

3. Posterior Distributions

   Uncertainty around estimates.

4. Event Alignment Report

   Comparison of breakpoints with events.

5. Credible Intervals

   Ranges that contain each parameter with a stated posterior probability (for example, 95%).

## 9. Limitations of Outputs

   - Break dates are probabilistic.

   - Some changes may be missed.

   - Results depend on priors.

   - Long-term gradual changes may not be detected.