# Data Analysis for Hybrid GARCH-ML Model

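This notebook walks through the exploratory data analysis behind our hybrid GARCH-ML volatility prediction model: downloading price data, engineering return and volatility features, and summarizing their statistics.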

In [None]:
import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline
# The 'seaborn' style name was deprecated in matplotlib 3.6; use seaborn's own theme instead
sns.set_theme()

## 1. Data Collection

We fetch daily historical price data for a single ticker from Yahoo Finance using yfinance.

In [None]:
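# Download daily data for AAPL; auto_adjust=False keeps the 'Adj Close' column used below
ticker = "AAPL"
data = yf.download(ticker, start="2020-01-01", auto_adjust=False)
# Newer yfinance versions may return MultiIndex columns; flatten to field names
if isinstance(data.columns, pd.MultiIndex):
    data.columns = data.columns.get_level_values(0)
data.head()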
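
Before engineering features, it can help to confirm that the downloaded frame is well formed. The following is a minimal sketch, not part of the original notebook: `summarize_price_frame` is a hypothetical helper, and the small `sample` frame is made-up data standing in for the yfinance result.

```python
import numpy as np
import pandas as pd


def summarize_price_frame(df: pd.DataFrame) -> dict:
    """Hypothetical helper: report basic quality checks for a date-indexed price frame."""
    return {
        "rows": len(df),
        "start": df.index.min(),
        "end": df.index.max(),
        "index_is_sorted": bool(df.index.is_monotonic_increasing),
        "duplicate_dates": int(df.index.duplicated().sum()),
        "missing_values": int(df.isna().sum().sum()),
    }


# Made-up sample data (one missing close) used only to exercise the helper
dates = pd.bdate_range("2020-01-01", periods=5)
sample = pd.DataFrame(
    {
        "Close": [10.0, 10.5, np.nan, 10.2, 10.4],
        "Volume": [100, 120, 90, 110, 105],
    },
    index=dates,
)

summary = summarize_price_frame(sample)
print(summary)
```

Calling `summarize_price_frame(data)` on the downloaded frame would report the same fields for the real series.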

## 2. Feature Engineering

Compute daily returns and a 21-day rolling annualized volatility, which serves as the realized-volatility proxy.

In [None]:
# Calculate returns
data['Returns'] = data['Adj Close'].pct_change()

# Realized volatility proxy: 21-day rolling std of daily returns, annualized with sqrt(252) trading days
data['RealizedVol'] = data['Returns'].rolling(window=21).std() * np.sqrt(252)

# Plot volatility
plt.figure(figsize=(12, 6))
data['RealizedVol'].plot()
plt.title(f'{ticker} 21-Day Realized Volatility (Annualized)')
plt.ylabel('Annualized volatility')
plt.show()
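
The cell above builds two features. As a hedged sketch of how the feature set might be extended for a volatility model, the block below derives log returns, squared returns, and a one-day-lagged volatility. The column names and the choice of features are assumptions, not part of the original notebook, and the prices are synthetic so the block runs without a network connection.

```python
import numpy as np
import pandas as pd

# Synthetic prices (illustrative only, not market data)
rng = np.random.default_rng(0)
dates = pd.bdate_range("2020-01-01", periods=300)
simulated_log_returns = rng.normal(loc=0.0, scale=0.01, size=len(dates))
prices = pd.Series(100 * np.exp(np.cumsum(simulated_log_returns)), index=dates)

features = pd.DataFrame({"price": prices})

# Log returns: difference of log prices
features["log_return"] = np.log(features["price"]).diff()

# Squared returns: a noisy daily proxy for variance
features["sq_return"] = features["log_return"] ** 2

# 21-day rolling standard deviation, annualized with sqrt(252)
features["vol_21d"] = features["log_return"].rolling(window=21).std() * np.sqrt(252)

# Lag the volatility by one day so it only uses information available before day t
features["vol_21d_lag1"] = features["vol_21d"].shift(1)

# Drop the warm-up rows that contain NaN from differencing, rolling, and shifting
features = features.dropna()
print(features.head())
```

Lagging the volatility column is one way to avoid look-ahead when the features later feed a predictive model; whether the original pipeline does this is not stated in the notebook.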

## 3. Data Analysis

Summarize the distribution of each column (count, mean, standard deviation, and quartiles) to check scale and spot outliers or missing values.

In [None]:
# Basic statistics
data.describe()
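
Summary statistics alone do not show volatility clustering, the property that GARCH-type models are designed to capture. The sketch below is an illustrative assumption rather than part of the original analysis: it simulates returns from a GARCH(1,1) process with made-up parameters, then computes skewness, excess kurtosis, and the lag-1 autocorrelation of raw and squared returns with pandas.

```python
import numpy as np
import pandas as pd

# Simulate GARCH(1,1) returns; omega, alpha, beta are illustrative values, not fitted
rng = np.random.default_rng(42)
n = 2000
omega, alpha, beta = 1e-6, 0.10, 0.85

sigma2 = np.empty(n)  # conditional variance
r = np.empty(n)       # simulated returns
sigma2[0] = omega / (1 - alpha - beta)  # start at the unconditional variance
r[0] = np.sqrt(sigma2[0]) * rng.standard_normal()
for t in range(1, n):
    # sigma2_t = omega + alpha * r_{t-1}^2 + beta * sigma2_{t-1}
    sigma2[t] = omega + alpha * r[t - 1] ** 2 + beta * sigma2[t - 1]
    r[t] = np.sqrt(sigma2[t]) * rng.standard_normal()

returns = pd.Series(r, name="returns")

diagnostics = pd.Series(
    {
        "skewness": returns.skew(),
        "excess_kurtosis": returns.kurt(),  # pandas reports excess (Fisher) kurtosis
        "acf_returns_lag1": returns.autocorr(lag=1),
        "acf_squared_returns_lag1": (returns ** 2).autocorr(lag=1),
    }
)
print(diagnostics.round(3))
```

To run the same diagnostics on the notebook's data, replace `returns` with `data['Returns'].dropna()`. An autocorrelation in squared returns that is clearly larger than in raw returns is the usual sign of volatility clustering.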