# Exploratory Data Analysis Notebook
The purpose of this notebook is to load, preprocess, and explore raw data from the OpenBB Terminal.

## Key Activities:
    1. Load raw data from OpenBB using src/hedging_engine/data_loader.py
    2. Explore the data for volatility and momentum
    3. Save processed data to data/interim for further feature engineering

In [1]:
import sys, os

# Get the project root directory (one level up from notebooks folder)
project_root = os.path.abspath(os.path.join(os.getcwd(), ".."))
src_path = os.path.join(project_root, "src")

# Add both project root and src to Python path
if project_root not in sys.path:
    sys.path.insert(0, project_root)
if src_path not in sys.path:
    sys.path.insert(0, src_path)

print(f"Project root: {project_root}")
print(f"Source path: {src_path}")
print("Python path updated successfully!")

# Verify the paths exist
print(f"Project root exists: {os.path.exists(project_root)}")
print(f"Source path exists: {os.path.exists(src_path)}")
print(f"Data loader exists: {os.path.exists(os.path.join(src_path, 'hedging_engine', 'data_loader.py'))}")

Project root: /workspaces/Systematic-Options-Auto-Hedging-Engine
Source path: /workspaces/Systematic-Options-Auto-Hedging-Engine/src
Python path updated successfully!
Project root exists: True
Source path exists: True
Data loader exists: True
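
The same path setup can be sketched with `pathlib`. This is only a minimal sketch, assuming the notebook runs from a folder one level below the project root and that the source code sits in `src/`; `add_project_paths` is a hypothetical helper and is not part of the repository.

```python
import sys
from pathlib import Path


def add_project_paths(notebook_dir=None):
    """Return (project_root, src_path) and put both at the front of sys.path.

    Hypothetical helper: assumes the notebook lives one level below the
    project root and that source code sits in <root>/src.
    """
    notebook_dir = Path(notebook_dir) if notebook_dir else Path.cwd()
    project_root = notebook_dir.resolve().parent
    src_path = project_root / "src"
    for path in (project_root, src_path):
        if str(path) not in sys.path:
            sys.path.insert(0, str(path))
    return project_root, src_path


project_root, src_path = add_project_paths()
print(f"Project root: {project_root}")
print(f"Source path exists: {src_path.exists()}")
```

Returning `Path` objects keeps later joins short, for example `project_root / "data" / "raw"`.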


In [2]:
# Import the data_loader module
try:
    from hedging_engine import data_loader
    print("✓ Successfully imported data_loader from hedging_engine")
except ImportError as e:
    print(f"✗ Failed to import from hedging_engine: {e}")
    try:
        # Alternative import method
        import hedging_engine.data_loader as data_loader
        import hedging_engine.save_raw_data as save_raw_data
        print("✓ Successfully imported data_loader using alternative method")
    except ImportError as e2:
        print(f"✗ Alternative import also failed: {e2}")
        # Direct import as last resort
        sys.path.append(os.path.join(src_path, 'hedging_engine'))
        import data_loader
        print("✓ Successfully imported data_loader using direct method")

✓ Successfully imported data_loader from hedging_engine


In [3]:
# Import additional libraries for EDA
import pandas as pd
import numpy as np
import matplotlib
matplotlib.use('Agg')  # Use non-interactive backend to prevent issues
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Set plotting style (fall back to the default if this style is unavailable)
try:
    plt.style.use('seaborn-v0_8')
except OSError:
    plt.style.use('default')  # Style name not found in this matplotlib version
    
sns.set_palette("husl")

# Configure matplotlib for notebook
plt.rcParams['figure.figsize'] = (10, 6)
plt.rcParams['font.size'] = 10

print("Libraries imported successfully!")
print(f"Matplotlib backend: {matplotlib.get_backend()}")
print(f"Pandas version: {pd.__version__}")
print(f"NumPy version: {np.__version__}")

Libraries imported successfully!
Matplotlib backend: Agg
Pandas version: 2.3.1
NumPy version: 2.3.1


## 1. Load Raw Data
Load historical stock data using the updated data_loader module

In [4]:
# Load historical data for analysis
tickers = ["AAPL", "MSFT", "GOOGL", "AMZN", "TSLA"]  # Multiple stocks for analysis
start_date = "2020-01-01"
end_date = "2023-12-31"

print("Loading historical data...")
print(f"Tickers: {tickers}")
print(f"Date range: {start_date} to {end_date}")

# Create directories for saving data
raw_dir = os.path.join(project_root, "data", "raw")
os.makedirs(raw_dir, exist_ok=True)

# Load data for each ticker
stock_data = {}
for ticker in tickers:
    try:
        print(f"\nLoading data for {ticker}...")
        
        # Load raw data
        df_raw = data_loader.load_data(ticker, start_date, end_date)
        
        # Save raw data before processing
        raw_filename = os.path.join(raw_dir, f"{ticker}_raw_{datetime.now().strftime('%Y%m%d')}.csv")
        data_loader.save_raw_data(df_raw, raw_filename)
        
        # Preprocess the data
        df_processed = data_loader.preprocess_data(df_raw)
        stock_data[ticker] = df_processed
        
        print(f"✓ Successfully loaded and saved {len(df_processed)} records for {ticker}")
    except Exception as e:
        print(f"✗ Failed to load {ticker}: {e}")

print(f"\nSuccessfully loaded data for {len(stock_data)} stocks")

Loading historical data...
Tickers: ['AAPL', 'MSFT', 'GOOGL', 'AMZN', 'TSLA']
Date range: 2020-01-01 to 2023-12-31

Loading data for AAPL...
Attempting to load data with provider: yfinance
Successfully loaded data with provider: yfinance
Raw data saved to /workspaces/Systematic-Options-Auto-Hedging-Engine/data/raw/AAPL_raw_20251001.csv
Input data shape: (1006, 7)
Final preprocessed data shape: (1006, 7)
✓ Successfully loaded and saved 1006 records for AAPL

Loading data for MSFT...
Attempting to load data with provider: yfinance
Successfully loaded data with provider: yfinance
Raw data saved to /workspaces/Systematic-Options-Auto-Hedging-Engine/d
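
Before moving on to statistics, it can help to run a few structural checks on each loaded frame. The sketch below is an assumption about what such a check might look like, not the behavior of `data_loader.preprocess_data`; `check_ohlcv` is a hypothetical helper and the sample frame is synthetic.

```python
import pandas as pd


def check_ohlcv(df):
    """Return a list of human-readable problems found in an OHLCV frame."""
    problems = []
    required = ["open", "high", "low", "close", "volume"]
    missing = [col for col in required if col not in df.columns]
    if missing:
        problems.append(f"missing columns: {missing}")
        return problems
    if not df.index.is_monotonic_increasing:
        problems.append("index is not sorted in ascending order")
    if df.index.has_duplicates:
        problems.append("index contains duplicate dates")
    if (df["high"] < df["low"]).any():
        problems.append("high below low on at least one row")
    if (df[["open", "high", "low", "close"]] <= 0).any().any():
        problems.append("non-positive price found")
    return problems


# Synthetic example data (not real market data)
sample = pd.DataFrame(
    {
        "open": [10.0, 10.5, 11.0],
        "high": [10.8, 11.2, 11.5],
        "low": [9.9, 10.4, 10.7],
        "close": [10.5, 11.0, 11.2],
        "volume": [1000, 1200, 900],
    },
    index=pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"]),
)
print(check_ohlcv(sample) or "No problems found")
```

In the notebook, a call such as `check_ohlcv(stock_data["AAPL"])` could run right after the loading loop.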

## 2. Data Overview and Basic Statistics
Examine the structure and basic properties of the loaded data

In [5]:
# Examine data structure and basic statistics
for ticker, df in stock_data.items():
    print(f"\n{'='*50}")
    print(f"DATA OVERVIEW FOR {ticker}")
    print(f"{'='*50}")
    
    print(f"Shape: {df.shape}")
    print(f"Date range: {df.index.min()} to {df.index.max()}")
    print(f"Columns: {list(df.columns)}")
    
    print(f"\nBasic Statistics:")
    print(df[['open', 'high', 'low', 'close', 'volume']].describe())
    
    print(f"\nMissing Values:")
    missing = df.isnull().sum()
    print(missing[missing > 0] if missing.sum() > 0 else "No missing values")


DATA OVERVIEW FOR AAPL
Shape: (1006, 7)
Date range: 2020-01-02 to 2023-12-29
Columns: ['open', 'high', 'low', 'close', 'volume', 'split_ratio', 'dividend']

Basic Statistics:
              open         high          low        close        volume
count  1006.000000  1006.000000  1006.000000  1006.000000  1.006000e+03
mean    140.675507   142.321389   139.143536   140.808131  9.895373e+07
std      33.310018    33.430571    33.179199    33.313857  5.439610e+07
min      57.020000    57.125000    53.152500    56.092499  2.404830e+07
25%     123.682503   125.030003   122.157499   123.592501  6.407675e+07
50%     145.540001   147.264999   144.120003   145.860001  8.467540e+07
75%     166.302498   168.147503   164.815002   166.214996  1.155069e+08
max     198.020004   199.619995   197.000000   198.110001  4.265100e+08

Missing Values:
No missing values

DATA OVERVIEW FOR MSFT
Shape: (1006, 6)
Date range: 2020-01-02 to 2023-12-29
Columns: ['open', 'high', 'low', 'close', 'volume', 'dividend']
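
The overview shows that AAPL has seven columns while MSFT has six (no `split_ratio`). A small schema comparison makes such differences explicit before frames are combined. This is a sketch with a hypothetical helper, `column_differences`, and empty toy frames that only mimic the printed column lists.

```python
import pandas as pd


def column_differences(frames):
    """Map each ticker to the columns it lacks relative to the union of all columns."""
    all_columns = set()
    for df in frames.values():
        all_columns.update(df.columns)
    return {
        ticker: sorted(all_columns - set(df.columns))
        for ticker, df in frames.items()
    }


# Toy frames that mimic the column lists printed above (no data rows)
frames = {
    "AAPL": pd.DataFrame(
        columns=["open", "high", "low", "close", "volume", "split_ratio", "dividend"]
    ),
    "MSFT": pd.DataFrame(
        columns=["open", "high", "low", "close", "volume", "dividend"]
    ),
}
print(column_differences(frames))
```

Applied to `stock_data`, the same function would list which tickers need a column filled in or dropped before a uniform schema can be assumed.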

## 3. Price Movement Analysis
Analyze price movements, returns, and basic momentum indicators

In [6]:
# Calculate returns and basic momentum indicators
enhanced_data = {}

for ticker, df in stock_data.items():
    # Create a copy for feature engineering
    enhanced_df = df.copy()
    
    # Calculate daily returns
    enhanced_df['daily_return'] = enhanced_df['close'].pct_change()
    enhanced_df['log_return'] = np.log(enhanced_df['close'] / enhanced_df['close'].shift(1))
    
    # Calculate moving averages
    enhanced_df['ma_5'] = enhanced_df['close'].rolling(window=5).mean()
    enhanced_df['ma_20'] = enhanced_df['close'].rolling(window=20).mean()
    enhanced_df['ma_50'] = enhanced_df['close'].rolling(window=50).mean()
    
    # Calculate volatility (rolling 20-day)
    enhanced_df['volatility_20'] = enhanced_df['daily_return'].rolling(window=20).std() * np.sqrt(252)
    
    # Calculate momentum indicators
    enhanced_df['momentum_10'] = enhanced_df['close'] / enhanced_df['close'].shift(10) - 1
    enhanced_df['momentum_20'] = enhanced_df['close'] / enhanced_df['close'].shift(20) - 1
    
    # Calculate price range and true range
    enhanced_df['price_range'] = (enhanced_df['high'] - enhanced_df['low']) / enhanced_df['close']
    enhanced_df['true_range'] = np.maximum(
        enhanced_df['high'] - enhanced_df['low'],
        np.maximum(
            abs(enhanced_df['high'] - enhanced_df['close'].shift(1)),
            abs(enhanced_df['low'] - enhanced_df['close'].shift(1))
        )
    )
    
    enhanced_data[ticker] = enhanced_df
    
    # Print summary statistics for returns
    # Note: the volatility printed below is the latest rolling 20-day value, not a full-sample figure
    print(f"\n{ticker} - Return Statistics:")
    print(f"Daily Return - Mean: {enhanced_df['daily_return'].mean():.4f}, Std: {enhanced_df['daily_return'].std():.4f}")
    print(f"Annualized Volatility: {enhanced_df['volatility_20'].iloc[-1]:.4f}")
    print(f"10-day Momentum: {enhanced_df['momentum_10'].iloc[-1]:.4f}")
    print(f"20-day Momentum: {enhanced_df['momentum_20'].iloc[-1]:.4f}")


AAPL - Return Statistics:
Daily Return - Mean: 0.0012, Std: 0.0211
Annualized Volatility: 0.1441
10-day Momentum: -0.0282
20-day Momentum: 0.0136

MSFT - Return Statistics:
Daily Return - Mean: 0.0011, Std: 0.0206
Annualized Volatility: 0.1462
10-day Momentum: 0.0276
20-day Momentum: -0.0076

GOOGL - Return Statistics:
Daily Return - Mean: 0.0009, Std: 0.0211
Annualized Volatility: 0.2549
10-day Momentum: 0.0587
20-day Momentum: 0.0540

AMZN - Return Statistics:
Daily Return - Mean: 0.0007, Std: 0.0237
Annualized Volatility: 0.1896
10-day Momentum: 0.0307
20-day Momentum: 0.0400

TSLA - Return Statistics:
Daily Return - Mean: 0.0031, Std: 0.0429
Annualized Volatility: 0.3345
10-day Momentum: -0.0102
20-day Momentum: 0.0350
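
The cell above computes both `daily_return` (simple) and `log_return`. The sketch below illustrates how the two relate on a tiny synthetic price series: a log return equals `log(1 + simple return)`, simple returns compound by multiplication, and log returns add.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices (not real market data)
close = pd.Series([100.0, 102.0, 99.96, 104.958])

simple = close.pct_change().dropna()
log_ret = np.log(close / close.shift(1)).dropna()

# Log returns are log(1 + simple return)
assert np.allclose(log_ret, np.log1p(simple))

# Simple returns compound; log returns add
total_simple = (1 + simple).prod() - 1
total_log = log_ret.sum()
print(f"Compounded simple return: {total_simple:.6f}")
print(f"Summed log return:        {total_log:.6f}")
print(f"exp(sum of log) - 1:      {np.expm1(total_log):.6f}")
```

The first and third printed values agree, which is why log returns are convenient for multi-period aggregation.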


## 4. Volatility Analysis
Detailed analysis of volatility patterns and clustering

In [7]:
# Analyze volatility patterns
fig, axes = plt.subplots(len(enhanced_data), 2, figsize=(15, 4*len(enhanced_data)))
if len(enhanced_data) == 1:
    axes = axes.reshape(1, -1)

for i, (ticker, df) in enumerate(enhanced_data.items()):
    # Plot 1: Price and Moving Averages
    axes[i, 0].plot(df.index, df['close'], label='Close Price', alpha=0.7)
    axes[i, 0].plot(df.index, df['ma_20'], label='MA 20', alpha=0.8)
    axes[i, 0].plot(df.index, df['ma_50'], label='MA 50', alpha=0.8)
    axes[i, 0].set_title(f'{ticker} - Price and Moving Averages')
    axes[i, 0].legend()
    axes[i, 0].grid(True, alpha=0.3)
    
    # Plot 2: Volatility over time
    axes[i, 1].plot(df.index, df['volatility_20'], label='20-day Volatility', color='red', alpha=0.7)
    axes[i, 1].set_title(f'{ticker} - Rolling 20-day Volatility')
    axes[i, 1].set_ylabel('Annualized Volatility')
    axes[i, 1].legend()
    axes[i, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Volatility statistics
print("\nVolatility Statistics Summary:")
print("="*60)
for ticker, df in enhanced_data.items():
    vol_stats = df['volatility_20'].describe()
    print(f"\n{ticker}:")
    print(f"  Mean Volatility: {vol_stats['mean']:.4f}")
    print(f"  Min Volatility:  {vol_stats['min']:.4f}")
    print(f"  Max Volatility:  {vol_stats['max']:.4f}")
    print(f"  Std Volatility:  {vol_stats['std']:.4f}")


Volatility Statistics Summary:

AAPL:
  Mean Volatility: 0.3052
  Min Volatility:  0.1202
  Max Volatility:  1.0795
  Std Volatility:  0.1498

MSFT:
  Mean Volatility: 0.2974
  Min Volatility:  0.1133
  Max Volatility:  1.1314
  Std Volatility:  0.1477

GOOGL:
  Mean Volatility: 0.3158
  Min Volatility:  0.1210
  Max Volatility:  0.8958
  Std Volatility:  0.1250

AMZN:
  Mean Volatility: 0.3561
  Min Volatility:  0.1372
  Max Volatility:  0.7411
  Std Volatility:  0.1344

TSLA:
  Mean Volatility: 0.6327
  Min Volatility:  0.2094
  Max Volatility:  1.5917
  Std Volatility:  0.2454
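
One common, informal check for volatility clustering is the autocorrelation of absolute returns: values well above zero suggest that large moves tend to follow large moves. The sketch below uses synthetic data only; `abs_return_autocorr` is a hypothetical helper, and the regime-switching series is a crude stand-in for real clustering.

```python
import numpy as np
import pandas as pd


def abs_return_autocorr(returns, lag=1):
    """Autocorrelation of absolute returns at the given lag."""
    return returns.abs().autocorr(lag=lag)


rng = np.random.default_rng(0)
n = 2000

# Synthetic returns with constant volatility (no clustering expected)
iid = pd.Series(rng.normal(0.0, 0.02, size=n))

# Synthetic returns whose volatility alternates between calm and turbulent
# 100-day blocks, a crude stand-in for volatility clustering
block_vol = np.where((np.arange(n) // 100) % 2 == 0, 0.005, 0.04)
clustered = pd.Series(rng.normal(0.0, 1.0, size=n) * block_vol)

print(f"Constant-vol series:  {abs_return_autocorr(iid):.3f}")
print(f"Regime-switch series: {abs_return_autocorr(clustered):.3f}")
```

In the notebook, the same function could be applied to `enhanced_data[ticker]['daily_return'].dropna()` for each ticker.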


## 5. Returns Distribution and Risk Metrics
Analyze return distributions and calculate risk metrics

In [8]:
# Analyze return distributions
from scipy import stats  # imported once, outside the plotting loop

fig, axes = plt.subplots(2, len(enhanced_data), figsize=(4*len(enhanced_data), 8))
if len(enhanced_data) == 1:
    axes = axes.reshape(-1, 1)

for i, (ticker, df) in enumerate(enhanced_data.items()):
    # Histogram of daily returns
    axes[0, i].hist(df['daily_return'].dropna(), bins=50, alpha=0.7, density=True)
    axes[0, i].axvline(df['daily_return'].mean(), color='red', linestyle='--', label='Mean')
    axes[0, i].set_title(f'{ticker} - Daily Returns Distribution')
    axes[0, i].set_xlabel('Daily Return')
    axes[0, i].set_ylabel('Density')
    axes[0, i].legend()
    axes[0, i].grid(True, alpha=0.3)
    
    # Q-Q plot for normality check
    stats.probplot(df['daily_return'].dropna(), dist="norm", plot=axes[1, i])
    axes[1, i].set_title(f'{ticker} - Q-Q Plot (Normality Check)')
    axes[1, i].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Calculate risk metrics
print("\nRisk Metrics Summary:")
print("="*80)
for ticker, df in enhanced_data.items():
    returns = df['daily_return'].dropna()
    
    # Basic statistics
    mean_return = returns.mean()
    std_return = returns.std()
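    skewness = returns.skew()
    kurtosis = returns.kurtosis()  # pandas returns excess kurtosis (0 for a normal distribution)
    
    # Risk metrics: historical VaR, read off as empirical quantiles of daily returns
    var_95 = returns.quantile(0.05)  # 5% VaR (95% confidence)
    var_99 = returns.quantile(0.01)  # 1% VaR (99% confidence)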
    
    # Sharpe ratio (assuming 0% risk-free rate)
    sharpe_ratio = mean_return / std_return * np.sqrt(252) if std_return != 0 else 0
    
    # Maximum drawdown calculation
    cumulative_returns = (1 + returns).cumprod()
    rolling_max = cumulative_returns.expanding().max()
    drawdown = (cumulative_returns - rolling_max) / rolling_max
    max_drawdown = drawdown.min()
    
    print(f"\n{ticker}:")
    print(f"  Mean Daily Return:     {mean_return:.6f}")
    print(f"  Daily Volatility:      {std_return:.6f}")
    print(f"  Annualized Volatility: {std_return * np.sqrt(252):.4f}")
    print(f"  Skewness:              {skewness:.4f}")
    print(f"  Kurtosis:              {kurtosis:.4f}")
    print(f"  5% VaR:                {var_95:.6f}")
    print(f"  1% VaR:                {var_99:.6f}")
    print(f"  Sharpe Ratio:          {sharpe_ratio:.4f}")
    print(f"  Maximum Drawdown:      {max_drawdown:.6f}")


Risk Metrics Summary:

AAPL:
  Mean Daily Return:     0.001161
  Daily Volatility:      0.021147
  Annualized Volatility: 0.3357
  Skewness:              0.0790
  Kurtosis:              4.8794
  5% VaR:                -0.032406
  1% VaR:                -0.055592
  Sharpe Ratio:          0.8712
  Maximum Drawdown:      -0.314273

MSFT:
  Mean Daily Return:     0.001058
  Daily Volatility:      0.020555
  Annualized Volatility: 0.3263
  Skewness:              0.0266
  Kurtosis:              6.4922
  5% VaR:                -0.029482
  1% VaR:                -0.049500
  Sharpe Ratio:          0.8170
  Maximum Drawdown:      -0.375565

GOOGL:
  Mean Daily Return:     0.000934
  Daily Volatility:      0.021124
  Annualized Volatility: 0.3353
  Skewness:              -0.0947
  Kurtosis:              3.2273
  5% VaR:                -0.032451
  1% VaR:                -0.054304
  Sharpe Ratio:          0.7015
  Maximum Drawdown:      -0.443201

AMZN:
  Mean Daily Return:     0.000750
  Daily Vo
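
As a worked check on the annualization used above, the AAPL figures can be reproduced from the printed daily mean and standard deviation. The inputs below are copied from the output and are already rounded, so the Sharpe ratio may differ from the printed 0.8712 in the last digits.

```python
import math

# Values copied from the AAPL output above (rounded to six decimals)
mean_daily = 0.001161
std_daily = 0.021147
trading_days = 252  # same annualization factor as the notebook

annualized_vol = std_daily * math.sqrt(trading_days)
sharpe = mean_daily / std_daily * math.sqrt(trading_days)

print(f"Annualized volatility: {annualized_vol:.4f}")
print(f"Sharpe ratio (rf = 0): {sharpe:.4f}")
```

Both metrics scale the daily figures by the square root of 252, which assumes returns are independent from one day to the next.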

## 6. Correlation Analysis
Analyze correlations between different stocks for portfolio risk assessment

In [9]:
# Create correlation analysis
if len(enhanced_data) > 1:
    # Create a combined dataframe of returns
    returns_df = pd.DataFrame()
    prices_df = pd.DataFrame()
    volatility_df = pd.DataFrame()
    
    for ticker, df in enhanced_data.items():
        returns_df[ticker] = df['daily_return']
        prices_df[ticker] = df['close']
        volatility_df[ticker] = df['volatility_20']
    
    # Calculate correlation matrices
    returns_corr = returns_df.corr()
    vol_corr = volatility_df.corr()
    
    # Plot correlation heatmaps
    fig, axes = plt.subplots(1, 2, figsize=(15, 6))
    
    # Returns correlation
    sns.heatmap(returns_corr, annot=True, cmap='coolwarm', center=0, 
                square=True, ax=axes[0], cbar_kws={'label': 'Correlation'})
    axes[0].set_title('Daily Returns Correlation Matrix')
    
    # Volatility correlation
    sns.heatmap(vol_corr, annot=True, cmap='coolwarm', center=0, 
                square=True, ax=axes[1], cbar_kws={'label': 'Correlation'})
    axes[1].set_title('Volatility Correlation Matrix')
    
    plt.tight_layout()
    plt.show()
    
    # Print correlation statistics
    print("Correlation Analysis:")
    print("="*50)
    print("\nReturns Correlation Matrix:")
    print(returns_corr.round(4))
    print("\nVolatility Correlation Matrix:")
    print(vol_corr.round(4))
    
    # Calculate average correlations
    returns_avg_corr = returns_corr.values[np.triu_indices_from(returns_corr.values, k=1)].mean()
    vol_avg_corr = vol_corr.values[np.triu_indices_from(vol_corr.values, k=1)].mean()
    
    print(f"\nAverage Returns Correlation: {returns_avg_corr:.4f}")
    print(f"Average Volatility Correlation: {vol_avg_corr:.4f}")
    
else:
    print("Correlation analysis requires multiple stocks. Only one stock loaded.")

Correlation Analysis:

Returns Correlation Matrix:
         AAPL    MSFT   GOOGL    AMZN    TSLA
AAPL   1.0000  0.7768  0.6911  0.6239  0.5111
MSFT   0.7768  1.0000  0.7723  0.6793  0.4716
GOOGL  0.6911  0.7723  1.0000  0.6643  0.4282
AMZN   0.6239  0.6793  0.6643  1.0000  0.4542
TSLA   0.5111  0.4716  0.4282  0.4542  1.0000

Volatility Correlation Matrix:
         AAPL    MSFT   GOOGL    AMZN    TSLA
AAPL   1.0000  0.8960  0.7377  0.6007  0.7372
MSFT   0.8960  1.0000  0.8727  0.6845  0.6809
GOOGL  0.7377  0.8727  1.0000  0.7765  0.5475
AMZN   0.6007  0.6845  0.7765  1.0000  0.4502
TSLA   0.7372  0.6809  0.5475  0.4502  1.0000

Average Returns Correlation: 0.6073
Average Volatility Correlation: 0.6984
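
To connect these correlations to portfolio risk, the sketch below recomputes the average pairwise correlation from the printed returns matrix and then estimates how much an equal-weight portfolio would reduce volatility. The second step rests on a simplifying assumption that is not true of the data: every stock is treated as having the same volatility.

```python
import numpy as np

# Returns correlation matrix copied from the output above
tickers = ["AAPL", "MSFT", "GOOGL", "AMZN", "TSLA"]
corr = np.array([
    [1.0000, 0.7768, 0.6911, 0.6239, 0.5111],
    [0.7768, 1.0000, 0.7723, 0.6793, 0.4716],
    [0.6911, 0.7723, 1.0000, 0.6643, 0.4282],
    [0.6239, 0.6793, 0.6643, 1.0000, 0.4542],
    [0.5111, 0.4716, 0.4282, 0.4542, 1.0000],
])

# Average of the off-diagonal (upper-triangle) entries
avg_corr = corr[np.triu_indices_from(corr, k=1)].mean()

# Equal-weight portfolio volatility relative to a single stock, under the
# simplifying assumption that every stock has the same volatility
n = len(tickers)
weights = np.full(n, 1.0 / n)
vol_ratio = np.sqrt(weights @ corr @ weights)

print(f"Average pairwise correlation: {avg_corr:.4f}")
print(f"Portfolio vol / single-stock vol: {vol_ratio:.4f}")
```

A ratio close to 1 means little diversification benefit. Replacing `corr` with the covariance matrix `returns_df.cov()` would drop the equal-volatility assumption.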


## 7. Save Processed Data
Save the enhanced dataset to interim folder for further feature engineering

In [10]:
# Save processed data to interim folder
interim_dir = os.path.join(project_root, "data", "interim")
os.makedirs(interim_dir, exist_ok=True)

print("Saving processed data to interim folder...")
print(f"Directory: {interim_dir}")

for ticker, df in enhanced_data.items():
    filename = f"{ticker}_processed_{datetime.now().strftime('%Y%m%d')}.csv"
    filepath = os.path.join(interim_dir, filename)
    
    # Save to CSV
    df.to_csv(filepath)
    print(f"✓ Saved {ticker} data to {filename} ({len(df)} records)")

# Create a summary report
summary_data = []
for ticker, df in enhanced_data.items():
    # Growth of one unit invested, treating the first (NaN) return as 0
    growth = (1 + df['daily_return'].fillna(0)).cumprod()
    summary_data.append({
        'ticker': ticker,
        'records': len(df),
        'start_date': df.index.min().strftime('%Y-%m-%d'),
        'end_date': df.index.max().strftime('%Y-%m-%d'),
        'mean_return': df['daily_return'].mean(),
        'volatility': df['volatility_20'].iloc[-1],  # latest rolling 20-day value
        # Largest absolute gap between the running peak and the growth curve;
        # unlike Section 5, this is not divided by the peak
        'max_drawdown': (growth.cummax() - growth).max()
    })

summary_df = pd.DataFrame(summary_data)
summary_filepath = os.path.join(interim_dir, f"eda_summary_{datetime.now().strftime('%Y%m%d')}.csv")
summary_df.to_csv(summary_filepath, index=False)

print(f"\n✓ Saved EDA summary to eda_summary_{datetime.now().strftime('%Y%m%d')}.csv")
print("\nProcessed Data Summary:")
print(summary_df.round(6))

Saving processed data to interim folder...
Directory: /workspaces/Systematic-Options-Auto-Hedging-Engine/data/interim
✓ Saved AAPL data to AAPL_processed_20251001.csv (1006 records)
✓ Saved MSFT data to MSFT_processed_20251001.csv (1006 records)
✓ Saved GOOGL data to GOOGL_processed_20251001.csv (1006 records)
✓ Saved AMZN data to AMZN_processed_20251001.csv (1006 records)
✓ Saved TSLA data to TSLA_processed_20251001.csv (1006 records)

✓ Saved EDA summary to eda_summary_20251001.csv

Processed Data Summary:
  ticker  records  start_date    end_date  mean_return  volatility  \
0   AAPL     1006  2020-01-02  2023-12-29     0.001161    0.144101   
1   MSFT     1006  2020-01-02  2023-12-29     0.001058    0.146163   
2  GOOGL     1006  2020-01-02  2023-12-29     0.000934    0.254851   
3   AMZN     1006  2020-01-02  2023-12-29     0.000750    0.189638   
4   TSLA     1006  2020-01-02  2023-12-29     0.003070    0.334480   

   max_drawdown  
0      0.758981  
1      0.802266  
2      0.97
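
The `max_drawdown` column in this summary is not on the same scale as the Maximum Drawdown in Section 5 (for AAPL, 0.758981 here versus -0.314273 there). A likely explanation, judging from the code, is that the summary takes the largest absolute gap between the running peak and the cumulative growth curve, while Section 5 divides that gap by the running peak. The synthetic example below shows the two definitions side by side.

```python
import pandas as pd

# Synthetic daily returns: +100%, -50%, +50% (not real market data)
returns = pd.Series([1.0, -0.5, 0.5])

growth = (1 + returns).cumprod()   # value of 1 unit invested: 2.0, 1.0, 1.5
running_peak = growth.cummax()     # 2.0, 2.0, 2.0

# Largest gap below the peak, in units of initial capital
absolute_drop = (running_peak - growth).max()

# Largest loss as a fraction of the peak (the usual drawdown definition)
relative_drawdown = ((growth - running_peak) / running_peak).min()

print(f"Largest absolute drop:    {absolute_drop:.4f}")
print(f"Maximum drawdown (ratio): {relative_drawdown:.4f}")
```

The absolute figure grows with the level of the growth curve, so the relative version is the one that is comparable across tickers.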

## Summary and Next Steps

This EDA notebook has successfully:

1. **Loaded raw data** using the improved data_loader module with fallback providers
2. **Analyzed basic statistics** and data quality for multiple stocks
3. **Calculated returns and momentum indicators** for risk assessment
4. **Examined volatility patterns** and clustering behavior
5. **Analyzed return distributions** and calculated comprehensive risk metrics
6. **Performed correlation analysis** between different assets
7. **Saved processed data** to interim folder for further analysis

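### Key Findings:
- Rolling 20-day annualized volatility clusters over time, with means from about 0.30 (MSFT) to 0.63 (TSLA) and peaks above 1.0 for AAPL, MSFT, and TSLA
- Return distributions show fat tails: excess kurtosis is between 3.2 and 6.5 for AAPL, MSFT, and GOOGL
- Pairwise return correlations are all positive, ranging from 0.43 (GOOGL-TSLA) to 0.78 (AAPL-MSFT), with an average of 0.61
- Full-sample annualized volatility of roughly 0.33 and Sharpe ratios of 0.70 to 0.87 for AAPL, MSFT, and GOOGL are plausible for large-cap equities over this period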

### Next Steps:
1. Use the processed data in `data/interim/` for options Greeks validation (notebook 02)
2. Implement hedge performance analysis (notebook 03)
3. Develop dynamic hedging strategies based on volatility regimes
4. Backtest the systematic hedging engine with real market data

---
## Troubleshooting Section
Quick tests to verify the notebook environment is working correctly

In [11]:
# Quick environment test
print("🔧 Testing Notebook Environment")
print("=" * 40)

# Test 1: Basic imports
try:
    import pandas as pd
    import numpy as np
    print("✓ Core data science libraries imported")
except Exception as e:
    print(f"✗ Core libraries failed: {e}")

# Test 2: Data loader
try:
    # Confirm the module is loaded and list a few of its public names
    print("✓ Data loader module accessible")
    print(f"  Available functions: {[f for f in dir(data_loader) if not f.startswith('_')][:5]}...")
except Exception as e:
    print(f"✗ Data loader test failed: {e}")

# Test 3: Simple data operation
try:
    test_df = pd.DataFrame({'test': [1, 2, 3]})
    result = test_df.mean()
    print("✓ Basic pandas operations working")
except Exception as e:
    print(f"✗ Pandas operations failed: {e}")

print("\n🎉 Environment test completed!")
print("If all tests passed, the notebook should work correctly.")

🔧 Testing Notebook Environment
✓ Core data science libraries imported
✓ Data loader module accessible
  Available functions: ['EmptyDataError', 'load_data', 'main', 'obb', 'pd']...
✓ Basic pandas operations working

🎉 Environment test completed!
If all tests passed, the notebook should work correctly.
