# Ethereum Bayesian Network Pipeline

## Overview
This notebook builds a discrete Bayesian Network for next-day Ethereum price prediction using:
- **Data**: ETH-USD OHLCV from 2017-01-01 to present
- **Targets**: 
  - Next-day return category (5 classes: BIG_DOWN, SMALL_DOWN, NEUTRAL, SMALL_UP, BIG_UP)
  - Fat-tail event detection (EXTREME_DROP, EXTREME_SPIKE, NONE)
- **Features**: Technical indicators, volume, volatility, and candlestick patterns
- **Model**: pgmpy Bayesian Network with structure learning

## Pipeline Steps
1. Data Collection & Preprocessing
2. Feature Engineering (Technical Indicators)
3. Target Label Creation
4. Bayesian Network Structure Learning
5. Inference Engine
6. Evaluation & Visualization

## 1. Setup and Data Collection

In [1]:
# Core libraries
import pandas as pd
import numpy as np
import yfinance as yf
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# Bayesian Network (pgmpy)
# Newer pgmpy releases renamed BayesianNetwork to DiscreteBayesianNetwork and
# BicScore to BIC; try the new names first and fall back to the old ones.
try:
    from pgmpy.models import DiscreteBayesianNetwork as BayesianNetwork
except ImportError:
    from pgmpy.models import BayesianNetwork
from pgmpy.estimators import HillClimbSearch, MaximumLikelihoodEstimator
try:
    from pgmpy.estimators import BIC as BicScore
except ImportError:
    from pgmpy.estimators import BicScore
from pgmpy.inference import VariableElimination

# Technical indicators
import talib

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns
import networkx as nx

# ML evaluation
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix, log_loss
from sklearn.preprocessing import LabelEncoder

# Configure display
plt.style.use('dark_background')
pd.set_option('display.max_columns', None)
pd.set_option('display.width', 1000)

print("✅ Libraries imported successfully")

In [None]:
# Data Collection Parameters
TICKER = 'ETH-USD'
START_DATE = '2017-01-01'
END_DATE = datetime.now().strftime('%Y-%m-%d')

print(f"📊 Fetching {TICKER} data from {START_DATE} to {END_DATE}...")

# Fetch OHLCV data
try:
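    eth_data = yf.download(TICKER, start=START_DATE, end=END_DATE, progress=False)
    
    # Recent yfinance versions may return MultiIndex columns (field, ticker);
    # flatten them to plain 'Open', 'High', ... so the TA-Lib calls get 1-D arrays
    if isinstance(eth_data.columns, pd.MultiIndex):
        eth_data.columns = eth_data.columns.get_level_values(0)
    
    # Clean data
    eth_data = eth_data.dropna()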
    
    # Basic validation
    if eth_data.empty:
        raise ValueError("No data fetched. Check ticker or date range.")
    
    print(f"✅ Data fetched successfully:")
    print(f"   - Date range: {eth_data.index.min().strftime('%Y-%m-%d')} to {eth_data.index.max().strftime('%Y-%m-%d')}")
    print(f"   - Total observations: {len(eth_data):,}")
    print(f"   - Columns: {list(eth_data.columns)}")
    
    # Display sample
    print("\n📋 Sample Data:")
    display(eth_data.head().round(2))
    
except Exception as e:
    print(f"❌ Error fetching data: {e}")

## 2. Feature Engineering

This step computes technical indicators with TA-Lib and discretizes each one into Low/Medium/High terciles so it can serve as a categorical node in the Bayesian Network.

In [None]:
def create_technical_features(df):
    """
    Create comprehensive technical indicators for Bayesian Network.
    All features will be discretized into 3 categories: Low, Medium, High
    """
    features_df = df.copy()
    
    # Extract OHLCV arrays for talib
    high = df['High'].values
    low = df['Low'].values
    close = df['Close'].values
    open_price = df['Open'].values
    volume = df['Volume'].values
    
    print("🔧 Creating technical indicators...")
    
    # 1. Technical Indicators
    features_df['RSI'] = talib.RSI(close, timeperiod=14)
    
    # MACD
    macd, macd_signal, macd_hist = talib.MACD(close, fastperiod=12, slowperiod=26, signalperiod=9)
    features_df['MACD'] = macd
    features_df['MACD_Signal'] = macd_signal
    features_df['MACD_Hist'] = macd_hist
    
    # Moving averages
    features_df['SMA_14'] = talib.SMA(close, timeperiod=14)
    features_df['SMA_50'] = talib.SMA(close, timeperiod=50)
    
    # Bollinger Bands
    bb_upper, bb_middle, bb_lower = talib.BBANDS(close, timeperiod=20, nbdevup=2, nbdevdn=2, matype=0)
    features_df['BB_Upper'] = bb_upper
    features_df['BB_Lower'] = bb_lower
    features_df['BB_Width'] = (bb_upper - bb_lower) / bb_middle
    features_df['BB_Position'] = (close - bb_lower) / (bb_upper - bb_lower)
    
    # ATR (Average True Range)
    features_df['ATR'] = talib.ATR(high, low, close, timeperiod=14)
    features_df['ATR_Pct'] = features_df['ATR'] / close
    
    # 2. Return-based features
    features_df['Return_1d'] = df['Close'].pct_change(1)
    features_df['Return_3d'] = df['Close'].pct_change(3)
    features_df['Open_Close_Pct'] = (close - open_price) / open_price
    
    # 3. Candlestick features
    features_df['Body_Size'] = abs(close - open_price) / open_price
    features_df['Upper_Wick'] = (high - np.maximum(close, open_price)) / np.maximum(close, open_price)
    features_df['Lower_Wick'] = (np.minimum(close, open_price) - low) / np.minimum(close, open_price)
    
    # 4. Volume features
    features_df['Volume_Change'] = df['Volume'].pct_change(1)
    features_df['Volume_SMA'] = talib.SMA(volume.astype(float), timeperiod=20)
    features_df['Volume_Surge'] = volume / features_df['Volume_SMA']
    
    # 5. Price momentum
    features_df['Price_vs_SMA14'] = close / features_df['SMA_14'] - 1
    features_df['Price_vs_SMA50'] = close / features_df['SMA_50'] - 1
    
    print(f"✅ Created {len([c for c in features_df.columns if c not in df.columns])} new technical features")
    
    return features_df

def discretize_features(df, feature_cols):
    """
    Discretize continuous features into Low, Medium, High categories using quantiles
    """
    df_discrete = df.copy()
    
    print("🔢 Discretizing features into categorical variables...")
    
    for col in feature_cols:
        if col in df.columns and df[col].notna().any():
            # Use quantile-based binning
            df_discrete[f"{col}_Cat"] = pd.qcut(
                df[col], 
                q=3, 
                labels=['Low', 'Medium', 'High'],
                duplicates='drop'
            )
    
    categorical_cols = [c for c in df_discrete.columns if c.endswith('_Cat')]
    print(f"✅ Created {len(categorical_cols)} discretized features")
    
    return df_discrete, categorical_cols

# Create features
eth_features = create_technical_features(eth_data)

# Define features to discretize
TECHNICAL_FEATURES = [
    'RSI', 'MACD', 'MACD_Hist', 'BB_Width', 'BB_Position', 'ATR_Pct',
    'Return_1d', 'Return_3d', 'Open_Close_Pct', 'Body_Size', 'Upper_Wick', 'Lower_Wick',
    'Volume_Change', 'Volume_Surge', 'Price_vs_SMA14', 'Price_vs_SMA50'
]

# Discretize features
eth_discrete, categorical_features = discretize_features(eth_features, TECHNICAL_FEATURES)

print(f"\n📊 Feature Summary:")
print(f"   - Original OHLCV columns: {len(eth_data.columns)}")
print(f"   - Technical features: {len(TECHNICAL_FEATURES)}")
print(f"   - Categorical features: {len(categorical_features)}")
print(f"   - Total columns: {len(eth_discrete.columns)}")
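The terciles above are computed over the full history, so the bin edges applied to early rows depend on later prices. The following is a minimal sketch of one alternative, assuming a chronological train/test split: learn the cut points on the training window only and reuse them on later rows. The helper names `fit_tercile_edges` and `apply_tercile_edges` and the toy values are hypothetical and not part of the notebook.

```python
import numpy as np
import pandas as pd

def fit_tercile_edges(train_values: pd.Series) -> np.ndarray:
    """Learn Low/Medium/High cut points from the training window only."""
    inner_edges = train_values.quantile([1 / 3, 2 / 3]).to_numpy()
    # Open-ended outer bins so unseen extremes in later data still get a label
    return np.concatenate(([-np.inf], inner_edges, [np.inf]))

def apply_tercile_edges(values: pd.Series, edges: np.ndarray) -> pd.Series:
    """Label values using edges that were fitted elsewhere."""
    return pd.cut(values, bins=edges, labels=['Low', 'Medium', 'High'])

# Toy series standing in for an indicator such as RSI (values are made up)
toy = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 95.0])
train, test = toy.iloc[:6], toy.iloc[6:]

edges = fit_tercile_edges(train)
train_labels = apply_tercile_edges(train, edges)
test_labels = apply_tercile_edges(test, edges)

print(train_labels.tolist())
print(test_labels.tolist())
```

With heavily tied data the two inner edges can coincide, in which case `pd.cut` raises an error about non-unique bin edges; such a feature would need fewer bins or a different binning rule.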

## 3. Target Label Creation

Create two prediction targets:
1. **Return Category**: 5-class next-day return prediction
2. **Fat-Tail Events**: Extreme price movements (top/bottom 5%)

In [None]:
def create_target_labels(df):
    """
    Create target variables for Bayesian Network:
    1. Return_Category: 5-class next-day return (BIG_DOWN, SMALL_DOWN, NEUTRAL, SMALL_UP, BIG_UP)
    2. FatTailEvent: Extreme events (EXTREME_DROP, EXTREME_SPIKE, NONE)
    """
    targets_df = df.copy()
    
    print("🎯 Creating target labels...")
    
    # Calculate next-day returns
    next_day_return = df['Close'].pct_change(1).shift(-1)  # Shift -1 to get next day
    
    # Target 1: Return Category (5 classes using quantiles)
    targets_df['Return_Category'] = pd.qcut(
        next_day_return,
        q=5,
        labels=['BIG_DOWN', 'SMALL_DOWN', 'NEUTRAL', 'SMALL_UP', 'BIG_UP']
    )
    
    # Target 2: Fat-Tail Events (extreme 5% tails)
    extreme_low = next_day_return.quantile(0.05)
    extreme_high = next_day_return.quantile(0.95)
    
    def classify_fat_tail(ret):
        if pd.isna(ret):
            return None
        elif ret <= extreme_low:
            return 'EXTREME_DROP'
        elif ret >= extreme_high:
            return 'EXTREME_SPIKE'
        else:
            return 'NONE'
    
    targets_df['FatTailEvent'] = next_day_return.apply(classify_fat_tail)
    
    # Statistics
    print(f"\n📈 Target Statistics:")
    print(f"\n1. Return Category Distribution:")
    print(targets_df['Return_Category'].value_counts().sort_index())
    
    print(f"\n2. Fat-Tail Event Distribution:")
    print(targets_df['FatTailEvent'].value_counts())
    
    print(f"\n📊 Extreme Thresholds:")
    print(f"   - Extreme Drop: {extreme_low:.2%} (bottom 5%)")
    print(f"   - Extreme Spike: {extreme_high:.2%} (top 5%)")
    
    return targets_df

# Create targets
eth_with_targets = create_target_labels(eth_discrete)

# Add target columns to our categorical features list
TARGET_FEATURES = ['Return_Category', 'FatTailEvent']
ALL_FEATURES = categorical_features + TARGET_FEATURES

print(f"\n✅ Total features for Bayesian Network: {len(ALL_FEATURES)}")
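As a sanity check on the `shift(-1)` alignment used for the targets, here is a toy example with made-up prices. It shows that each row's label is the return realised on the following day, and that the final row has no label (it is later removed by `dropna`).

```python
import pandas as pd

# Four made-up daily closes
close = pd.Series(
    [100.0, 110.0, 99.0, 99.0],
    index=pd.date_range('2024-01-01', periods=4, freq='D'),
)

same_day_return = close.pct_change(1)        # return realised on each date
next_day_return = same_day_return.shift(-1)  # return realised the day after

aligned = pd.DataFrame({
    'Close': close,
    'Return_1d': same_day_return,      # usable as a feature
    'Next_Return': next_day_return,    # usable only as a target
})
print(aligned)
```

Keeping `Return_1d` as a feature and `Next_Return` as the target source means no row sees information from its own future.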

## 4. Data Preparation for Bayesian Network

Clean and prepare the final dataset for modeling.

In [None]:
def prepare_bayesian_data(df, feature_cols):
    """
    Prepare clean dataset for Bayesian Network training
    """
    print("🧹 Preparing data for Bayesian Network...")
    
    # Select only the features we need
    bn_data = df[feature_cols].copy()
    
    # Remove rows with any missing values (Bayesian Networks need complete data)
    initial_rows = len(bn_data)
    bn_data = bn_data.dropna()
    final_rows = len(bn_data)
    
    print(f"   - Initial rows: {initial_rows:,}")
    print(f"   - Rows after removing NaN: {final_rows:,}")
    print(f"   - Rows dropped: {initial_rows - final_rows:,}")
    
    # Cast categoricals to plain strings so pgmpy sees consistent state names
    for col in bn_data.columns:
        bn_data[col] = bn_data[col].astype(str)
    
    # Display sample
    print(f"\n📋 Final Dataset Sample:")
    display(bn_data.head())
    
    # Show feature distributions
    print(f"\n📊 Target Variable Distributions:")
    for target in TARGET_FEATURES:
        if target in bn_data.columns:
            print(f"\n{target}:")
            print(bn_data[target].value_counts())
    
    return bn_data

# Prepare final dataset
bayesian_data = prepare_bayesian_data(eth_with_targets, ALL_FEATURES)

print(f"\n✅ Bayesian Network dataset ready with {len(bayesian_data)} samples and {len(bayesian_data.columns)} features")

## 5. Bayesian Network Structure Learning

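Use Hill Climb Search with BIC scoring to learn the network structure. Hill climbing is a greedy local search, so the result is a locally optimal structure rather than a guaranteed global optimum.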

In [None]:
def learn_bayesian_structure(data, max_iter=1_000_000):
    """
    Learn Bayesian Network structure using Hill Climb Search
    """
    print("🧠 Learning Bayesian Network structure...")
    print(f"   - Using Hill Climb Search with BIC scoring")
    print(f"   - Maximum iterations: {max_iter:,}")
    print(f"   - Variables: {len(data.columns)}")
    
    # Initialize structure learning
    scoring_method = BicScore(data)
    est = HillClimbSearch(data)
    
    # Learn structure (this may take a while)
    try:
        print("\n⏳ Structure learning in progress...")
        # HillClimbSearch.estimate() takes no time limit; bound the search with max_iter instead
        best_model = est.estimate(scoring_method=scoring_method, max_iter=max_iter)
        
        print(f"\n✅ Structure learning completed!")
        print(f"   - Nodes: {len(best_model.nodes())}")
        print(f"   - Edges: {len(best_model.edges())}")
        
        # Show edges involving target variables
        target_edges = []
        for edge in best_model.edges():
            if edge[0] in TARGET_FEATURES or edge[1] in TARGET_FEATURES:
                target_edges.append(edge)
        
        print(f"\n🎯 Edges involving target variables ({len(target_edges)}):")
        for edge in target_edges:
            print(f"   - {edge[0]} -> {edge[1]}")
        
        return best_model
        
    except Exception as e:
        print(f"❌ Error in structure learning: {e}")
        return None

# Learn structure
bn_model = learn_bayesian_structure(bayesian_data, max_iter=1000)  # cap the number of search steps
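To make the scoring step less of a black box, here is a hand-rolled sketch of a BIC local score for one discrete node: the log-likelihood of the child given its parents, minus a penalty that grows with the number of free parameters. It is an illustration under simplifying assumptions, not pgmpy's implementation; in particular it counts only the parent configurations observed in the data. The function name `bic_local_score` and the toy data are hypothetical.

```python
import numpy as np
import pandas as pd

def bic_local_score(data: pd.DataFrame, child: str, parents: list[str]) -> float:
    """Log-likelihood of `child` given `parents` minus the BIC complexity penalty."""
    n = len(data)
    if parents:
        # Rows: observed parent configurations, columns: child states
        counts = (data.groupby(parents + [child]).size()
                      .unstack(child, fill_value=0)
                      .to_numpy(dtype=float))
    else:
        counts = data[child].value_counts().to_numpy(dtype=float).reshape(1, -1)
    row_totals = counts.sum(axis=1, keepdims=True)
    # Empty cells contribute nothing to the log-likelihood
    with np.errstate(divide='ignore', invalid='ignore'):
        log_probs = np.where(counts > 0, np.log(counts / row_totals), 0.0)
    log_likelihood = float((counts * log_probs).sum())
    q, r = counts.shape                      # parent configurations, child states
    penalty = 0.5 * np.log(n) * q * (r - 1)  # one free parameter set per configuration
    return float(log_likelihood - penalty)

# Deterministic toy data: B mostly copies A, while C is unrelated to B
blocks = [('Low', 'Low', 40), ('Low', 'High', 10), ('High', 'High', 40), ('High', 'Low', 10)]
rows = []
for a_state, b_state, count in blocks:
    for i in range(count):
        rows.append({'A': a_state, 'B': b_state, 'C': 'Low' if i % 2 == 0 else 'High'})
toy = pd.DataFrame(rows)

score_no_parent = bic_local_score(toy, 'B', [])
score_parent_a = bic_local_score(toy, 'B', ['A'])
score_parent_c = bic_local_score(toy, 'B', ['C'])
print(score_no_parent, score_parent_a, score_parent_c)
```

An informative parent (`A`) raises the score despite the extra parameters, while an uninformative one (`C`) only adds penalty. This is why a target such as `Return_Category` can end up with few or no edges if the indicators carry little signal about it.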

## 6. Parameter Learning and Model Fitting

In [None]:
def fit_bayesian_network(model, data):
    """
    Fit Bayesian Network parameters using Maximum Likelihood Estimation
    """
    if model is None:
        print("❌ No model to fit")
        return None
        
    print("📊 Fitting Bayesian Network parameters...")
    
    try:
        # Create Bayesian Network with learned structure
        bn = BayesianNetwork(model.edges())
        # Keep nodes the search left without edges (e.g. an isolated target)
        bn.add_nodes_from(model.nodes())
        
        # Fit parameters using Maximum Likelihood Estimation
        bn.fit(data, estimator=MaximumLikelihoodEstimator)
        
        print(f"✅ Model fitted successfully!")
        print(f"   - Nodes: {len(bn.nodes())}")
        print(f"   - Edges: {len(bn.edges())}")
        
        # Check model validity
        print(f"   - Model is valid: {bn.check_model()}")
        
        return bn
        
    except Exception as e:
        print(f"❌ Error fitting model: {e}")
        return None

# Fit the model
fitted_bn = fit_bayesian_network(bn_model, bayesian_data)
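For intuition, maximum likelihood estimation of a conditional probability table amounts to counting and normalising. The sketch below reproduces that idea with pandas on a made-up two-column dataset; the two-state `Return_Category` labels are a simplification for illustration and differ from the five classes used in the notebook.

```python
import pandas as pd

# Made-up observations: one parent (RSI_Cat) and one child (Return_Category)
toy = pd.DataFrame({
    'RSI_Cat':         ['High', 'High', 'High', 'High', 'Low', 'Low', 'Low', 'Low'],
    'Return_Category': ['UP', 'UP', 'UP', 'DOWN', 'UP', 'DOWN', 'DOWN', 'DOWN'],
})

# P(Return_Category | RSI_Cat): each column is normalised to sum to 1
cpt = pd.crosstab(toy['Return_Category'], toy['RSI_Cat'], normalize='columns')
print(cpt)
```

A state combination that never occurs gets probability zero under plain counting. pgmpy also provides a `BayesianEstimator` that adds prior pseudo-counts, which may be worth trying when some parent configurations are rare.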

## 7. Inference Engine

Create an inference engine to make predictions given evidence.

In [None]:
def create_inference_engine(model):
    """
    Create inference engine for predictions
    """
    if model is None:
        print("❌ No model available for inference")
        return None
        
    print("🔮 Creating inference engine...")
    
    try:
        inference = VariableElimination(model)
        print("✅ Inference engine created successfully!")
        return inference
        
    except Exception as e:
        print(f"❌ Error creating inference engine: {e}")
        return None

def predict_probabilities(inference_engine, evidence, target_vars):
    """
    Predict probabilities for target variables given evidence
    """
    if inference_engine is None:
        return None
        
    try:
        results = {}
        
        for target in target_vars:
            if target in fitted_bn.nodes():
                # Query the target variable
                query_result = inference_engine.query(variables=[target], evidence=evidence)
                
                # Convert to dictionary
                probs = {}
                for i, state in enumerate(query_result.state_names[target]):
                    probs[state] = query_result.values[i]
                
                results[target] = probs
        
        return results
        
    except Exception as e:
        print(f"❌ Error in prediction: {e}")
        return None

# Create inference engine
inference = create_inference_engine(fitted_bn)

# Example inference
if inference is not None:
    print("\n🧪 Example Inference:")
    
    # Create example evidence
    example_evidence = {
        'RSI_Cat': 'High',
        'Volume_Surge_Cat': 'High',
        'BB_Position_Cat': 'High'
    }
    
    # Only use evidence that exists in the model
    valid_evidence = {k: v for k, v in example_evidence.items() if k in fitted_bn.nodes()}
    
    if valid_evidence:
        print(f"\nEvidence: {valid_evidence}")
        
        predictions = predict_probabilities(inference, valid_evidence, TARGET_FEATURES)
        
        if predictions:
            for target, probs in predictions.items():
                print(f"\n{target} Probabilities:")
                for state, prob in probs.items():
                    print(f"   {state}: {prob:.3f}")
    else:
        print("No valid evidence variables found in the model")
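To illustrate what a query computes, here is a hand-worked sketch on a hypothetical three-node chain `Volume_Surge -> RSI -> Return`. The unobserved middle node is summed out, which is the core step of variable elimination. All probabilities are made up, and the chain structure is an assumption for illustration rather than the structure the notebook learns.

```python
# Hypothetical CPTs (numbers are made up)
# p_rsi_given_vol[volume_state][rsi_state] = P(RSI = rsi_state | Volume_Surge = volume_state)
p_rsi_given_vol = {
    'High': {'High': 0.7, 'Low': 0.3},
    'Low':  {'High': 0.4, 'Low': 0.6},
}
# p_ret_given_rsi[rsi_state][return_state] = P(Return = return_state | RSI = rsi_state)
p_ret_given_rsi = {
    'High': {'UP': 0.6, 'DOWN': 0.4},
    'Low':  {'UP': 0.45, 'DOWN': 0.55},
}

def query_return_given_volume(volume_state):
    """P(Return | Volume_Surge) obtained by summing over the hidden RSI state."""
    posterior = {'UP': 0.0, 'DOWN': 0.0}
    for rsi_state, p_rsi in p_rsi_given_vol[volume_state].items():
        for return_state, p_ret in p_ret_given_rsi[rsi_state].items():
            posterior[return_state] += p_rsi * p_ret
    return posterior

result = query_return_given_volume('High')
print(result)
```

The returned dictionary has the same `{state: probability}` shape that `predict_probabilities` builds for each target.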

## 8. Visualization

Visualize the learned Bayesian Network structure.

In [None]:
def visualize_bayesian_network(model, title="Ethereum Bayesian Network", figsize=(15, 10)):
    """
    Visualize the Bayesian Network structure
    """
    if model is None:
        print("❌ No model to visualize")
        return
        
    print("🎨 Creating network visualization...")
    
    try:
        plt.figure(figsize=figsize)
        
        # Create NetworkX graph
        G = nx.DiGraph()
        G.add_edges_from(model.edges())
        
        # Create layout
        pos = nx.spring_layout(G, k=2, iterations=50, seed=42)
        
        # Color nodes by type
        node_colors = []
        for node in G.nodes():
            if node in TARGET_FEATURES:
                node_colors.append('#ff6b6b')  # Red for targets
            elif 'RSI' in node or 'MACD' in node:
                node_colors.append('#4ecdc4')  # Teal for momentum indicators
            elif 'Volume' in node:
                node_colors.append('#45b7d1')  # Blue for volume
            elif 'BB_' in node or 'ATR' in node:
                node_colors.append('#96ceb4')  # Green for volatility
            else:
                node_colors.append('#feca57')  # Yellow for others
        
        # Draw the network
        nx.draw(G, pos, 
                node_color=node_colors,
                node_size=1500,
                font_size=8,
                font_weight='bold',
                arrows=True,
                arrowsize=20,
                edge_color='gray',
                alpha=0.7,
                with_labels=True)
        
        plt.title(title, fontsize=16, fontweight='bold', pad=20)
        
        # Add legend
        legend_elements = [
            plt.Line2D([0], [0], marker='o', color='w', markerfacecolor='#ff6b6b', markersize=10, label='Target Variables'),
            plt.Line2D([0], [0], marker='o', color='w', markerfacecolor='#4ecdc4', markersize=10, label='Momentum Indicators'),
            plt.Line2D([0], [0], marker='o', color='w', markerfacecolor='#45b7d1', markersize=10, label='Volume Features'),
            plt.Line2D([0], [0], marker='o', color='w', markerfacecolor='#96ceb4', markersize=10, label='Volatility Features'),
            plt.Line2D([0], [0], marker='o', color='w', markerfacecolor='#feca57', markersize=10, label='Other Features')
        ]
        plt.legend(handles=legend_elements, loc='upper left', bbox_to_anchor=(1, 1))
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Network visualization created!")
        
    except Exception as e:
        print(f"❌ Error creating visualization: {e}")

# Visualize the network
if fitted_bn is not None:
    visualize_bayesian_network(fitted_bn)

## 9. Model Evaluation

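Score a Logistic Regression baseline on a chronological hold-out (the last 20% of rows). The cell below does not yet score the Bayesian Network itself, and the network fitted above has already seen the hold-out rows, so a fair comparison would require refitting it on the training window only.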

In [None]:
def evaluate_model(data, bn_model, inference_engine, test_size=0.2):
    """
    Evaluate Bayesian Network performance and compare to baseline
    """
    if bn_model is None or inference_engine is None:
        print("❌ No model available for evaluation")
        return
        
    print("📈 Evaluating model performance...")
    
    try:
        # Time-aware split (last 20% for testing)
        split_idx = int(len(data) * (1 - test_size))
        train_data = data.iloc[:split_idx]
        test_data = data.iloc[split_idx:]
        
        print(f"   - Training samples: {len(train_data):,}")
        print(f"   - Testing samples: {len(test_data):,}")
        
        # Get feature columns (exclude targets)
        feature_cols = [c for c in data.columns if c not in TARGET_FEATURES]
        
        results = {}
        
        # Evaluate each target
        for target in TARGET_FEATURES:
            if target not in data.columns:
                continue
                
            print(f"\n🎯 Evaluating {target}...")
            
            # Prepare data for baseline model
            # Encode categorical features for logistic regression
            train_features_encoded = pd.get_dummies(train_data[feature_cols])
            test_features_encoded = pd.get_dummies(test_data[feature_cols])
            
            # Align columns
            common_cols = train_features_encoded.columns.intersection(test_features_encoded.columns)
            train_features_encoded = train_features_encoded[common_cols]
            test_features_encoded = test_features_encoded[common_cols]
            
            # Baseline: Logistic Regression
            lr_model = LogisticRegression(max_iter=1000, random_state=42)
            lr_model.fit(train_features_encoded, train_data[target])
            lr_pred = lr_model.predict(test_features_encoded)
            
            print(f"\n📊 {target} - Baseline (Logistic Regression):")
            print(classification_report(test_data[target], lr_pred, zero_division=0))
            
            # Store results
            results[target] = {
                'baseline_accuracy': (lr_pred == test_data[target]).mean(),
                'test_distribution': test_data[target].value_counts().to_dict()
            }
        
        return results
        
    except Exception as e:
        print(f"❌ Error in evaluation: {e}")
        return None

# Evaluate the model
if fitted_bn is not None and inference is not None:
    evaluation_results = evaluate_model(bayesian_data, fitted_bn, inference)
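One way the network could be scored on held-out rows is sketched below, assuming predictions arrive as `{state: probability}` dictionaries like those built by `predict_probabilities`. The helper `score_predictions` and the toy inputs are hypothetical; in the notebook one would first refit the network on the training window, then query it once per test row using that row's feature values as evidence.

```python
import math

def score_predictions(prob_dicts, true_labels, eps=1e-12):
    """Accuracy and mean log loss for a list of {state: probability} dicts."""
    correct = 0
    total_log_loss = 0.0
    for probs, truth in zip(prob_dicts, true_labels):
        predicted = max(probs, key=probs.get)   # most probable state
        correct += int(predicted == truth)
        # Clip so a zero-probability truth does not produce an infinite loss
        total_log_loss += -math.log(max(probs.get(truth, 0.0), eps))
    n = len(true_labels)
    return {'accuracy': correct / n, 'log_loss': total_log_loss / n}

# Two made-up predictions for the FatTailEvent target
toy_probs = [
    {'NONE': 0.9, 'EXTREME_DROP': 0.05, 'EXTREME_SPIKE': 0.05},
    {'NONE': 0.5, 'EXTREME_DROP': 0.4, 'EXTREME_SPIKE': 0.1},
]
toy_truth = ['NONE', 'EXTREME_DROP']

scores = score_predictions(toy_probs, toy_truth)
print(scores)
```

Log loss is worth reporting alongside accuracy here because `FatTailEvent` is dominated by the `NONE` class, so always predicting `NONE` already yields high accuracy.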

## 10. Scenario Analysis

Demonstrate how different market conditions affect predictions.

In [None]:
def scenario_analysis(inference_engine, model):
    """
    Perform scenario analysis with different market conditions
    """
    if inference_engine is None or model is None:
        print("❌ No model available for scenario analysis")
        return
        
    print("🎭 Performing Scenario Analysis...")
    
    # Define scenarios
    scenarios = {
        "Bullish Momentum": {
            'RSI_Cat': 'High',
            'MACD_Hist_Cat': 'High',
            'BB_Position_Cat': 'High',
            'Volume_Surge_Cat': 'High'
        },
        "Bearish Momentum": {
            'RSI_Cat': 'Low',
            'MACD_Hist_Cat': 'Low',
            'BB_Position_Cat': 'Low',
            'Volume_Surge_Cat': 'High'
        },
        "Neutral Market": {
            'RSI_Cat': 'Medium',
            'MACD_Hist_Cat': 'Medium',
            'BB_Position_Cat': 'Medium',
            'Volume_Surge_Cat': 'Medium'
        },
        "High Volatility": {
            'ATR_Pct_Cat': 'High',
            'BB_Width_Cat': 'High',
            'Volume_Surge_Cat': 'High'
        }
    }
    
    for scenario_name, evidence in scenarios.items():
        print(f"\n🎯 Scenario: {scenario_name}")
        print(f"Evidence: {evidence}")
        
        # Filter evidence to only include variables in the model
        valid_evidence = {k: v for k, v in evidence.items() if k in model.nodes()}
        
        if not valid_evidence:
            print("   No valid evidence variables found in model")
            continue
            
        print(f"Valid evidence: {valid_evidence}")
        
        # Make predictions
        predictions = predict_probabilities(inference_engine, valid_evidence, TARGET_FEATURES)
        
        if predictions:
            for target, probs in predictions.items():
                print(f"\n   {target}:")
                sorted_probs = sorted(probs.items(), key=lambda x: x[1], reverse=True)
                for state, prob in sorted_probs:
                    print(f"     {state}: {prob:.3f}")
        else:
            print("   Could not generate predictions")

# Run scenario analysis
if fitted_bn is not None and inference is not None:
    scenario_analysis(inference, fitted_bn)

## 11. Summary and Next Steps

### 🎯 Pipeline Summary

This notebook implements a Bayesian Network pipeline for Ethereum price prediction with the following components:

**Data & Features:**
- ✅ ETH-USD OHLCV data from 2017 to present
- ✅ 16 technical indicators (RSI, MACD, Bollinger Bands, ATR, etc.)
- ✅ Discretized into categorical variables (Low/Medium/High)

**Prediction Targets:**
- ✅ **Return Category**: 5-class next-day return prediction
- ✅ **Fat-Tail Events**: Extreme price movements (top/bottom 5%)

**Model:**
- ✅ Bayesian Network with Hill Climb structure learning
- ✅ BIC scoring to rank candidate structures (greedy search, so locally optimal)
- ✅ Maximum Likelihood parameter estimation

**Analysis:**
- ✅ Inference engine for probabilistic predictions
- ✅ Scenario analysis for different market conditions
- ✅ Logistic Regression baseline on a chronological hold-out (the Bayesian Network itself is not yet scored)
- ✅ Network visualization

### 🚀 Key Insights

1. **Probabilistic Predictions**: The model provides probability distributions rather than point predictions
2. **Interpretability**: The learned edges show statistical dependencies between technical indicators, not causal effects
3. **Scenario Analysis**: Allows testing "what-if" scenarios for different market conditions
4. **Fat-Tail Detection**: Specifically targets extreme events for risk management

### 📈 Next Steps

**Model Improvements:**
- Add more sophisticated features (e.g., regime detection, market microstructure)
- Experiment with different discretization strategies
- Include external factors (macro indicators, sentiment)

**Evaluation Enhancements:**
- Implement proper Bayesian Network accuracy metrics
- Add cross-validation with time series constraints
- Compare to more sophisticated baselines (Random Forest, XGBoost)
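
As a starting point for time-series-aware cross-validation, here is a minimal expanding-window splitter in plain Python. Every test fold lies strictly after its training window, so no fold trains on future rows. The function name and parameters are hypothetical; scikit-learn's `TimeSeriesSplit` offers a similar scheme.

```python
def expanding_window_splits(n_samples, n_folds, min_train):
    """Yield (train_indices, test_indices); each test fold follows its training window."""
    fold_size = (n_samples - min_train) // n_folds
    if fold_size < 1:
        raise ValueError("Not enough samples for the requested number of folds")
    for fold in range(n_folds):
        train_end = min_train + fold * fold_size
        # The last fold absorbs any leftover rows
        test_end = train_end + fold_size if fold < n_folds - 1 else n_samples
        yield range(0, train_end), range(train_end, test_end)

splits = list(expanding_window_splits(n_samples=10, n_folds=3, min_train=4))
for train_idx, test_idx in splits:
    print(list(train_idx), list(test_idx))
```

Discretization edges and network parameters would need to be refitted inside each fold for the evaluation to stay free of look-ahead.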

**Production Considerations:**
- Model updating pipeline for new data
- Real-time inference API
- Risk management integration
- Portfolio optimization using probability outputs

### 📋 Dependencies

```bash
pip install pandas numpy yfinance pgmpy TA-Lib matplotlib seaborn networkx scikit-learn
```

### 🎯 Usage

1. Run all cells in sequence
2. The model will automatically fetch data, learn structure, and fit parameters
3. Use the inference engine for predictions with custom evidence
4. Analyze scenarios by modifying the evidence dictionary

**Example Prediction:**
```python
evidence = {'RSI_Cat': 'High', 'Volume_Surge_Cat': 'High'}
predictions = predict_probabilities(inference, evidence, TARGET_FEATURES)
```

This pipeline provides a solid foundation for probabilistic cryptocurrency price prediction using Bayesian Networks! 🚀