# Cryptocurrency Trading Algorithm

This notebook implements a complete cryptocurrency trading strategy using machine learning. The implementation includes:

1. Data Collection and Preprocessing
2. Technical Analysis
3. Feature Engineering
4. Model Training and Optimization
5. Backtesting
6. Performance Visualization

The strategy uses XGBoost for prediction and incorporates various technical indicators for feature generation.

## 1. Data Collection and Preprocessing

In this section, we:
1. Load historical cryptocurrency price data
2. Handle missing values
3. Ensure data integrity and chronological order

## 2. Technical Analysis

Calculate essential technical indicators:
- RSI (Relative Strength Index)
- Bollinger Bands
- Moving Averages
- ATR (Average True Range)
- Volume Metrics
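
To make the Bollinger Band arithmetic concrete, here is a minimal sketch in plain Python rather than the `ta` library. The helper name `bollinger_last`, the synthetic prices, and the use of a population standard deviation over the window are assumptions made for illustration.

```python
from statistics import mean, pstdev

def bollinger_last(closes, window=20, num_std=2.0):
    """Return (lower, mid, upper, pct_b) for the most recent bar."""
    if len(closes) < window:
        raise ValueError("need at least `window` closing prices")
    recent = closes[-window:]
    mid = mean(recent)                 # middle band: simple moving average
    band = num_std * pstdev(recent)    # population std over the window (assumption)
    lower, upper = mid - band, mid + band
    # %B is 0 at the lower band and 1 at the upper band; undefined for a flat window
    pct_b = (closes[-1] - lower) / (upper - lower) if upper > lower else float("nan")
    return lower, mid, upper, pct_b

closes = [100 + (i % 5) for i in range(20)]  # synthetic prices, not market data
lower, mid, upper, pct_b = bollinger_last(closes)
print(f"lower={lower:.2f} mid={mid:.2f} upper={upper:.2f} %B={pct_b:.2f}")
```

A %B value above 1 means the close is above the upper band, and a value below 0 means it is below the lower band.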

## 3. Feature Engineering

In this section, we:
1. Generate trading signals from technical indicators
2. Handle feature scaling and normalization
3. Create derived features
4. Analyze feature correlations
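
As a rough illustration of the scaling step, the sketch below standardizes a continuous feature to zero mean and unit variance while leaving a binary flag untouched. It uses plain Python lists instead of `StandardScaler`; the `zscore` helper and the toy values are hypothetical.

```python
from statistics import mean, pstdev

def zscore(values):
    """Standardize to zero mean and unit variance (population std)."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant column carries no information
    return [(v - mu) / sigma for v in values]

features = {
    "rsi": [30.0, 50.0, 70.0, 50.0],  # continuous: gets scaled
    "ma_cross": [0, 1, 1, 0],         # binary flag: left as-is
}
binary_cols = {"ma_cross"}
scaled = {
    name: (col if name in binary_cols else zscore(col))
    for name, col in features.items()
}
print(scaled)
```

Scaling a 0/1 flag would destroy its interpretation as an indicator, which is why binary columns are passed through unchanged.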

## 4. Model Training and Optimization

This section covers:
1. Data splitting (train/test)
2. Class imbalance handling with SMOTE
3. XGBoost model training
4. Hyperparameter optimization
5. Model evaluation
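
The evaluation step turns class probabilities back into trading signals. A minimal sketch of that decoding, assuming labels were shifted from [-1, 0, 1] to [0, 1, 2] for training and that low-confidence rows fall back to hold, might look like this. The function name `decode_predictions` and the probability rows are made up for the example.

```python
def decode_predictions(probabilities, confidence_threshold=0.4):
    """Map class probabilities over [0, 1, 2] back to signals in [-1, 0, 1]."""
    signals = []
    for row in probabilities:
        best_class = max(range(len(row)), key=row.__getitem__)  # argmax
        if row[best_class] < confidence_threshold:
            best_class = 1  # low confidence: fall back to 'hold'
        signals.append(best_class - 1)  # undo the +1 label shift
    return signals

probs = [
    [0.70, 0.20, 0.10],  # confident sell
    [0.36, 0.30, 0.34],  # top class is below the threshold, so hold
    [0.10, 0.15, 0.75],  # confident buy
]
print(decode_predictions(probs))
```

Raising `confidence_threshold` produces fewer trades, since more rows are overridden to hold.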

## 5. Backtesting

Implement and evaluate the trading strategy:
1. Execute trades based on model predictions
2. Calculate performance metrics
3. Track portfolio value
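
Two metrics commonly derived from a tracked portfolio value series are total return and maximum drawdown. The sketch below computes both in plain Python; the helper names and the portfolio values are illustrative assumptions, not output from this notebook.

```python
def total_return(values):
    """Fractional gain from the first to the last portfolio value."""
    return values[-1] / values[0] - 1.0

def max_drawdown(values):
    """Largest peak-to-trough decline, returned as a negative fraction."""
    peak = values[0]
    worst = 0.0
    for v in values:
        peak = max(peak, v)                 # highest value seen so far
        worst = min(worst, v / peak - 1.0)  # decline relative to that peak
    return worst

portfolio = [10000.0, 10500.0, 9450.0, 9900.0, 11000.0]  # made-up values
print(f"total return: {total_return(portfolio):.2%}")
print(f"max drawdown: {max_drawdown(portfolio):.2%}")
```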

## 6. Trading Performance Visualization

Create interactive visualizations of:
1. Price action with buy/sell signals
2. Volume analysis
3. Portfolio performance
4. Trading metrics

In [None]:
# --- Core Python & Environment ---
import logging
import warnings

# --- Data Handling & Numerics ---
import pandas as pd
import numpy as np

# --- Technical Analysis ---
from ta.momentum import RSIIndicator # Relative Strength Index
from ta.volatility import AverageTrueRange, BollingerBands
from ta.trend import SMAIndicator

# --- Machine Learning ---
import xgboost as xgb # ML Model
from sklearn.model_selection import train_test_split, StratifiedKFold, RandomizedSearchCV
from sklearn.metrics import accuracy_score, classification_report
from sklearn.exceptions import UndefinedMetricWarning
from imblearn.over_sampling import SMOTE # Synthetic Minority Over-sampling Technique

# --- Plotting ---
from plotly.subplots import make_subplots
import plotly.graph_objects as go

# --- Configure Logging ---
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - [%(funcName)s] - %(message)s')

# --- Suppress Specific Warnings ---
warnings.filterwarnings('ignore', category=UserWarning, module='pandas')
warnings.filterwarnings('ignore', category=UserWarning, module='sklearn')
warnings.filterwarnings('ignore', category=UserWarning, module='xgboost')
warnings.filterwarnings('ignore', category=FutureWarning)
warnings.filterwarnings("ignore", category=UndefinedMetricWarning)
warnings.filterwarnings("ignore", category=DeprecationWarning)
warnings.filterwarnings("ignore", category=RuntimeWarning)

print("✅ Setup Complete: Libraries imported and configuration set.")

## Data Loading and Preprocessing

In this section, we'll:
1. Load our historical price data
2. Apply initial data cleaning
3. Handle missing values and outliers
4. Calculate basic technical indicators
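
The missing-value policy used here is forward fill first, then a median fallback for gaps that have no earlier value. The sketch below applies that order to a single column represented as a plain list, with `None` standing in for NaN; the helper `ffill_then_median` is hypothetical.

```python
from statistics import median

def ffill_then_median(values):
    """Forward-fill gaps, then fill any leading gaps with the column median."""
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)  # stays None until the first observed value
    known = [v for v in filled if v is not None]
    if not known:
        raise ValueError("column has no observed values")
    fallback = median(known)  # median of the forward-filled column
    return [fallback if v is None else v for v in filled]

closes = [None, 101.0, None, 103.0, None]  # toy data
print(ffill_then_median(closes))
```

Forward fill never looks ahead in time, so only the leading gap depends on the median, which is computed over the whole column.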

In [None]:
def load_and_preprocess_data(file_path='test_df_features.csv'):
    try:
        # Load the data
        df = pd.read_csv(file_path)
        
        # Convert timestamp to datetime if needed
        if 'timestamp' in df.columns:
            df['timestamp'] = pd.to_datetime(df['timestamp'])
            df.set_index('timestamp', inplace=True)
        
        # Sort by index to ensure chronological order
        df = df.sort_index()
        
        # Handle missing values
        missing_count = df.isnull().sum()
        if missing_count.any():
            logging.warning(f'Found missing values:\n{missing_count[missing_count > 0]}')
            # Forward fill for technical indicators
            df = df.ffill()  # fillna(method='ffill') is deprecated in recent pandas
            # Any remaining NaNs (e.g. at the start of a column) are filled with the column median
            df = df.fillna(df.median(numeric_only=True))
        
        return df
    except Exception as e:
        logging.error(f'Error loading data: {str(e)}')
        raise

# Load the data
try:
    df = load_and_preprocess_data()
    print(f'✅ Data loaded successfully. Shape: {df.shape}')
    print('\nFirst few rows of the data:')
    display(df.head())
    
    print('\nDataset Info:')
    df.info()  # prints directly and returns None, so display() is not needed
except Exception as e:
    print(f'❌ Failed to load data: {str(e)}')

In [None]:
def calculate_technical_indicators(df):
    if not isinstance(df, pd.DataFrame):
        raise TypeError('Input must be a pandas DataFrame')
    
    if 'close' not in df.columns:
        raise ValueError("DataFrame must contain 'close' price column")
    
    try:
        # RSI
        rsi = RSIIndicator(close=df['close'], window=14)
        df['rsi'] = rsi.rsi()
        
        # Bollinger Bands
        bb = BollingerBands(close=df['close'], window=20, window_dev=2)
        df['bb_upper'] = bb.bollinger_hband()
        df['bb_lower'] = bb.bollinger_lband()
        df['bb_mid'] = bb.bollinger_mavg()
        df['bb_pct_b'] = (df['close'] - df['bb_lower']) / (df['bb_upper'] - df['bb_lower'])
        
        # Moving Averages
        df['sma_20'] = SMAIndicator(close=df['close'], window=20).sma_indicator()
        df['sma_50'] = SMAIndicator(close=df['close'], window=50).sma_indicator()
        df['ma_cross'] = (df['sma_20'] > df['sma_50']).astype(int)
        
        # Price momentum
        df['price_momentum'] = df['close'].pct_change(5)
        
        # Average True Range
        if all(col in df.columns for col in ['high', 'low']):
            atr = AverageTrueRange(high=df['high'], low=df['low'], close=df['close'], window=14)
            df['atr'] = atr.average_true_range()
            df['atr_pct'] = df['atr'] / df['close']  # ATR as percentage of price
            
        # Volume metrics
        if 'volume' in df.columns:
            df['volume_pct_change'] = df['volume'].pct_change()
        
        # Remove old duplicate columns
        columns_to_drop = ['BB_hband', 'BB_lband', 'BB_mavg', 'RSI', 'SMA_long', 'SMA_short']
        df = df.drop(columns=[col for col in columns_to_drop if col in df.columns])
        
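        # Remove NaN and infinite values left by indicator warm-up and zero denominators
        # (e.g. pct_change after a zero volume, or %B when the bands coincide)
        df = df.replace([np.inf, -np.inf], np.nan).dropna()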
        
        return df
    except Exception as e:
        logging.error(f'Error calculating technical indicators: {str(e)}')
        raise

# Calculate technical indicators
try:
    df = calculate_technical_indicators(df)
    print('✅ Technical indicators calculated successfully')
    print(f'Final dataset shape: {df.shape}')
    
    # Display summary statistics
    print('\nSummary statistics of technical indicators:')
    display(df[['rsi', 'bb_upper', 'bb_lower', 'bb_mid', 'atr']].describe())
except Exception as e:
    print(f'❌ Failed to calculate technical indicators: {str(e)}')

## Feature Engineering

In this section, we'll create trading signals and labels based on our technical indicators. We'll focus on:
1. RSI-based signals with dynamic thresholds
2. Bollinger Bands signals
3. Feature normalization and scaling
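
To illustrate how dynamic thresholds become labels, the sketch below takes the 20th and 80th percentiles of a small RSI sample and assigns buy (1), hold (0), or sell (-1). It is written in plain Python with linear interpolation between order statistics; the `quantile` helper and the RSI values are assumptions for the example.

```python
def quantile(sorted_vals, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def label_rsi(rsi_values, lower_q=0.2, upper_q=0.8):
    """Return (lower, upper, labels) using quantile-based thresholds."""
    ordered = sorted(rsi_values)
    lower, upper = quantile(ordered, lower_q), quantile(ordered, upper_q)
    labels = [1 if v <= lower else -1 if v >= upper else 0 for v in rsi_values]
    return lower, upper, labels

rsi_values = [50, 20, 70, 35, 90, 45, 30, 60, 80, 40, 55]  # toy RSI readings
lower, upper, labels = label_rsi(rsi_values)
print(lower, upper, labels)
```

Because the thresholds come from the data's own distribution, roughly 20% of rows land in each of the buy and sell classes regardless of how the RSI is skewed.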

In [None]:
def get_rsi_quantile_thresholds(rsi_series, lower_quantile=0.2, upper_quantile=0.8):
    """Calculate RSI thresholds based on historical distribution.
    
    Args:
        rsi_series (pd.Series): Series containing RSI values
        lower_quantile (float): Quantile for oversold threshold (default: 0.2)
        upper_quantile (float): Quantile for overbought threshold (default: 0.8)
    
    Returns:
        tuple: (lower_threshold, upper_threshold)
    """
    if not isinstance(rsi_series, pd.Series):
        raise TypeError('rsi_series must be a pandas Series')
    
    if rsi_series.empty:
        raise ValueError('RSI series is empty')
    
    # Validate quantile values
    if not (0 < lower_quantile < upper_quantile < 1):
        raise ValueError('Quantiles must be between 0 and 1, and lower must be less than upper')
    
    # Handle NaN values
    rsi_clean = rsi_series.dropna()
    if rsi_clean.empty:
        raise ValueError('RSI series contains only NaN values')
    
    try:
        lower_threshold = rsi_clean.quantile(lower_quantile)
        upper_threshold = rsi_clean.quantile(upper_quantile)
        
        # Ensure thresholds are within valid RSI range
        lower_threshold = max(0, min(100, lower_threshold))
        upper_threshold = max(0, min(100, upper_threshold))
        
        return lower_threshold, upper_threshold
    except Exception as e:
        logging.error(f'Error calculating RSI thresholds: {str(e)}')
        raise

def apply_rsi_labels(df, rsi_col='rsi', lower_threshold=30, upper_threshold=70):
    """Apply trading labels based on RSI values.
    
    Args:
        df (pd.DataFrame): DataFrame containing RSI values
        rsi_col (str): Name of the RSI column
        lower_threshold (float): RSI oversold threshold
        upper_threshold (float): RSI overbought threshold
    
    Returns:
        pd.DataFrame: DataFrame with new 'signal' column
    """
    if not isinstance(df, pd.DataFrame):
        raise TypeError('df must be a pandas DataFrame')
    
    if rsi_col not in df.columns:
        raise ValueError(f'Column {rsi_col} not found in DataFrame')
    
    # Create a copy to avoid modifying original
    result = df.copy()
    
    try:
        # Initialize signals column with 0 (hold)
        result['signal'] = 0
        
        # Apply signals based on RSI thresholds
        # Using boolean indexing to avoid ambiguity and ensure integer types
        result.loc[result[rsi_col] <= lower_threshold, 'signal'] = 1  # Buy signal
        result.loc[result[rsi_col] >= upper_threshold, 'signal'] = -1 # Sell signal
        
        # Ensure signal is integer type
        result['signal'] = result['signal'].astype(int)
        
        # Verify signal values are discrete
        unique_signals = set(result['signal'].unique())
        expected_signals = {-1, 0, 1}
        if not unique_signals.issubset(expected_signals):
            raise ValueError(f'Invalid signal values found: {unique_signals}')
        
        return result
    except Exception as e:
        logging.error(f'Error applying RSI labels: {str(e)}')
        raise

# Calculate dynamic RSI thresholds
try:
    lower_thresh, upper_thresh = get_rsi_quantile_thresholds(df['rsi'])
    print(f'✅ Dynamic RSI thresholds calculated:')
    print(f'Lower (oversold) threshold: {lower_thresh:.2f}')
    print(f'Upper (overbought) threshold: {upper_thresh:.2f}')
    
    # Apply trading signals
    df = apply_rsi_labels(df, lower_threshold=lower_thresh, upper_threshold=upper_thresh)
    
    # Display signal distribution
    signal_dist = df['signal'].value_counts()
    print('\nSignal distribution:')
    display(signal_dist)
    
    # Basic signal statistics
    print('\nSignal transition matrix:')
    display(pd.crosstab(df['signal'].shift(), df['signal']))
except Exception as e:
    print(f'❌ Error in feature engineering: {str(e)}')

In [None]:
from sklearn.preprocessing import StandardScaler

def normalize_features(df):
    """
    Normalize features while preserving binary columns and handling categorical data
    """
    if not isinstance(df, pd.DataFrame):
        raise TypeError('Input must be a pandas DataFrame')
        
    try:
        # Identify binary columns
        binary_cols = ['ma_cross']  # Add any other binary columns here
        binary_cols = [col for col in binary_cols if col in df.columns]
        
        # The discrete target must stay in {-1, 0, 1}, so it is passed through unscaled as well
        passthrough_cols = binary_cols + [col for col in ['signal'] if col in df.columns]
        
        # Store unscaled values
        passthrough_data = df[passthrough_cols] if passthrough_cols else None
        
        # Get numeric columns excluding the unscaled ones
        numeric_cols = df.select_dtypes(include=[np.number]).columns.difference(passthrough_cols)
        
        # Normalize numeric features
        scaler = StandardScaler()
        df_normalized = pd.DataFrame(
            scaler.fit_transform(df[numeric_cols]),
            columns=numeric_cols,
            index=df.index
        )
        
        # Restore unscaled columns
        if passthrough_data is not None:
            df_normalized = pd.concat([df_normalized, passthrough_data], axis=1)
            
        # Verify feature ranges
        for col in df_normalized.columns:
            if col in binary_cols:
                assert df_normalized[col].isin([0, 1]).all(), f"Binary column {col} contains non-binary values"
            elif col not in passthrough_cols:
                assert -10 < df_normalized[col].mean() < 10, f"Column {col} may not be properly normalized"
                
        return df_normalized
        
    except Exception as e:
        logging.error(f'Error normalizing features: {str(e)}')
        raise

# Normalize features
try:
    df_normalized = normalize_features(df)
    print('✅ Features normalized successfully')
    print('\nSummary statistics of normalized features:')
    display(df_normalized.describe())
except Exception as e:
    print(f'❌ Error normalizing features: {str(e)}')

In [None]:
# Analyze feature correlations
try:
    # Select only numeric columns for correlation analysis
    numeric_cols = df_normalized.select_dtypes(include=['int64', 'float64']).columns
    correlation_matrix = df_normalized[numeric_cols].corr()
    
    # Create correlation heatmap
    fig = go.Figure(data=go.Heatmap(
        z=correlation_matrix,
        x=correlation_matrix.index,
        y=correlation_matrix.index,
        colorscale='RdBu',
        zmin=-1,
        zmax=1
    ))
    
    fig.update_layout(
        title='Feature Correlation Matrix',
        width=800,
        height=800
    )
    
    fig.show()
    
    # Find highly correlated features
    threshold = 0.8
    high_corr = np.where(np.abs(correlation_matrix) > threshold)
    high_corr = [(correlation_matrix.index[x], correlation_matrix.columns[y], correlation_matrix.iloc[x, y]) 
                 for x, y in zip(*high_corr) if x < y]  # x < y skips the diagonal and reports each pair once
    
    if high_corr:
        print('\nHighly correlated features (|correlation| > 0.8):')
        for feat1, feat2, corr in high_corr:
            print(f'{feat1} - {feat2}: {corr:.3f}')
    else:
        print('\nNo highly correlated features found')
        
    # Print the correlation with the target variable (signal)
    if 'signal' in df_normalized.columns:
        print('\nCorrelation with trading signal:')
        signal_corr = df_normalized[numeric_cols].corrwith(df_normalized['signal']).sort_values(ascending=False)
        display(signal_corr)
except Exception as e:
    print(f'❌ Error analyzing correlations: {str(e)}')

## Model Preparation and Training

In this section, we'll:
1. Prepare features and target variables
2. Split the data into training and testing sets
3. Handle class imbalance using SMOTE
4. Train and optimize an XGBoost model
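
SMOTE balances classes by creating synthetic minority samples on the line segment between a minority point and one of its minority-class neighbours. The sketch below shows only that interpolation idea in plain Python, using the single nearest neighbour. It is a simplification and not how `imblearn` implements it; the function `smote_like_sample` and the points are hypothetical.

```python
import random

def smote_like_sample(minority, rng):
    """Create one synthetic point between a minority sample and its nearest neighbour."""
    base = rng.choice(minority)
    others = [p for p in minority if p is not base]
    # nearest minority neighbour by squared Euclidean distance
    neighbour = min(others, key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
    lam = rng.random()  # interpolation weight in [0, 1)
    return tuple(a + lam * (b - a) for a, b in zip(base, neighbour))

rng = random.Random(42)  # fixed seed for repeatability
minority = [(1.0, 1.0), (1.2, 0.9), (3.0, 3.1)]  # toy minority-class points
synthetic = [smote_like_sample(minority, rng) for _ in range(3)]
print(synthetic)
```

Resampling is applied to the training split only, so the test set keeps its original class distribution.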

In [None]:
def prepare_model_data(df, feature_cols=None, target_col='signal', test_size=0.2, random_state=42):
    """Prepare data for model training with proper validation.
    
    Args:
        df (pd.DataFrame): Input DataFrame
        feature_cols (list): List of feature columns to use
        target_col (str): Name of the target column
        test_size (float): Proportion of data to use for testing
        random_state (int): Random seed for reproducibility
    
    Returns:
        tuple: (X_train, X_test, y_train, y_test)
    """
    if not isinstance(df, pd.DataFrame):
        raise TypeError('df must be a pandas DataFrame')
    
    if target_col not in df.columns:
        raise ValueError(f'Target column {target_col} not found in DataFrame')
    
    # If feature columns not specified, use all except target
    if feature_cols is None:
        feature_cols = [col for col in df.columns if col != target_col]
    
    # Validate feature columns
    missing_cols = [col for col in feature_cols if col not in df.columns]
    if missing_cols:
        raise ValueError(f'Feature columns not found: {missing_cols}')
    
    try:
        # Select features and target
        X = df[feature_cols]
        y = df[target_col]
        
        # Ensure target is integer type
        y = y.astype(int)
        
        # Verify unique classes
        unique_classes = set(y.unique())
        expected_classes = {-1, 0, 1}
        if not unique_classes.issubset(expected_classes):
            raise ValueError(f'Invalid target values found: {unique_classes}. Expected: {expected_classes}')
        
        # Split data
        X_train, X_test, y_train, y_test = train_test_split(
            X, y, test_size=test_size, random_state=random_state, stratify=y
        )
        
        return X_train, X_test, y_train, y_test
    except Exception as e:
        logging.error(f'Error preparing model data: {str(e)}')
        raise

# Prepare model data
try:
    # Select features (excluding signal and any unwanted columns)
    feature_cols = [
        'rsi', 'bb_upper', 'bb_lower', 'bb_mid', 'bb_pct_b',
        'sma_20', 'sma_50', 'ma_cross', 'price_momentum',
        'atr', 'atr_pct'
    ]
    
    # Verify features exist in normalized data
    print('\nChecking for features in normalized data...')
    for col in feature_cols:
        if col not in df_normalized.columns:
            print(f'Missing feature: {col}')
    
    X_train, X_test, y_train, y_test = prepare_model_data(
        df_normalized, feature_cols=feature_cols
    )
    
    print('✅ Data split successfully:')
    print(f'Training set shape: {X_train.shape}')
    print(f'Testing set shape: {X_test.shape}')
    
    # Handle class imbalance with SMOTE
    smote = SMOTE(random_state=42)
    X_train_balanced, y_train_balanced = smote.fit_resample(X_train, y_train)
    
    print('\nClass distribution before SMOTE:')
    display(pd.Series(y_train).value_counts())
    print('\nClass distribution after SMOTE:')
    display(pd.Series(y_train_balanced).value_counts())
except Exception as e:
    print(f'❌ Error preparing model data: {str(e)}')

In [None]:
def train_xgboost_model(X_train, y_train, X_test, y_test, random_state=42):
    """Train and optimize XGBoost model with cross-validation.
    
    Args:
        X_train (pd.DataFrame): Training features
        y_train (pd.Series): Training targets
        X_test (pd.DataFrame): Test features
        y_test (pd.Series): Test targets
        random_state (int): Random seed for reproducibility
    
    Returns:
        tuple: (best_model, best_params)
    """
    global reverse_mapper  # Make reverse_mapper accessible globally
    try:
        # Since our signals are already correctly labeled as [-1, 0, 1]
        # and XGBoost needs labels starting from 0, we add 1 to shift the range
        y_train_mapped = y_train + 1  # Converts [-1, 0, 1] to [0, 1, 2]
        
        # Create reverse mapper for predictions
        reverse_mapper = {0: -1, 1: 0, 2: 1}
        
        # Define parameter grid for optimization
        param_grid = {
            'max_depth': [3, 4, 5],
            'learning_rate': [0.01, 0.1],
            'n_estimators': [100, 200],
            'min_child_weight': [1, 3],
            'gamma': [0, 0.1],
            'subsample': [0.8, 0.9],
            'colsample_bytree': [0.8, 0.9]
        }
        
        # Initialize model with probability output
        model = xgb.XGBClassifier(
            objective='multi:softprob',  # Changed to output probabilities
            num_class=3,  # [0, 1, 2]
            random_state=random_state
        )
        
        # Perform randomized search with cross-validation
        random_search = RandomizedSearchCV(
            model,
            param_distributions=param_grid,
            n_iter=10,
            scoring='accuracy',
            cv=StratifiedKFold(n_splits=5),
            random_state=random_state,
            n_jobs=-1
        )
        
        # Fit the model with mapped labels
        random_search.fit(X_train, y_train_mapped)
        
        # Get best model and parameters
        best_model = random_search.best_estimator_
        best_params = random_search.best_params_
        
        # Get probability predictions
        y_pred_proba = best_model.predict_proba(X_test)
        # Convert to class predictions with confidence threshold
        confidence_threshold = 0.4  # Adjust this threshold as needed
        y_pred_mapped = np.argmax(y_pred_proba, axis=1)
        max_probs = np.max(y_pred_proba, axis=1)
        
        # Only make non-hold predictions when confidence is high enough
        # y_pred_mapped default is the max prob class. If confidence < threshold,
        # we override it to 'hold' (which is mapped to 1).
        y_pred_mapped[max_probs < confidence_threshold] = 1  
        
        # Convert predictions back to original labels
        y_pred = pd.Series(y_pred_mapped).map(reverse_mapper)
        accuracy = accuracy_score(y_test, y_pred)
        
        print(f'✅ Model training completed:')
        print(f'Best parameters: {best_params}')
        print(f'Test accuracy: {accuracy:.4f}')
        
        # Print classification report
        print('\nClassification Report:')
        print(classification_report(y_test, y_pred))
        
        return best_model, best_params
    except Exception as e:
        logging.error(f'Error training model: {str(e)}')
        raise


## Backtesting and Evaluation

In this section, we'll:
1. Implement a simple backtesting framework
2. Calculate trading metrics
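
The position-sizing arithmetic in the backtest can be checked by hand. The sketch below works through one buy, assuming the fee is charged as a percentage on top of the asset value so that asset cost plus fee equals the allocated cash. The helper `size_buy` and the example numbers are illustrative.

```python
def size_buy(balance, price, position_sizing_pct=0.95, fee_pct=0.001):
    """Return (shares, fee, remaining_cash) for a single fee-inclusive buy."""
    cash = balance * position_sizing_pct
    shares = cash / (price * (1 + fee_pct))  # fee is paid out of the allocated cash
    fee = shares * price * fee_pct
    return shares, fee, balance - cash

shares, fee, remaining = size_buy(balance=10000.0, price=50000.0)
print(f"shares={shares:.6f} fee=${fee:.2f} remaining=${remaining:.2f}")
```

Dividing by `price * (1 + fee_pct)` rather than by `price` alone ensures the trade never costs more than the cash set aside for it.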

In [None]:
def backtest_strategy(df, predictions, initial_balance=10000, transaction_fee_pct=0.001,
                      stop_loss_pct=None, take_profit_pct=None, position_sizing_pct=0.95):
    """Backtest the trading strategy with Stop-Loss, Take-Profit, and Position Sizing.
    
    Args:
        df (pd.DataFrame): DataFrame with price data (original, containing OHLCV).
        predictions (pd.Series): Model predictions (-1: sell, 0: hold, 1: buy).
        initial_balance (float): Initial trading balance.
        transaction_fee_pct (float): Transaction fee as percentage.
        stop_loss_pct (float, optional): Percentage below entry price to trigger stop-loss (e.g., 0.02 for 2%). Defaults to None.
        take_profit_pct (float, optional): Percentage above entry price to trigger take-profit (e.g., 0.05 for 5%). Defaults to None.
        position_sizing_pct (float): Percentage of current balance to allocate per trade (e.g., 0.95 for 95%).
                                     Must be between 0.01 and 1.0. Defaults to 0.95.
                                     
    Returns:
        tuple: (pd.DataFrame: DataFrame with individual trade records, pd.DataFrame: DataFrame with daily total portfolio value, dict: Summary metrics)
    """
    if len(df) != len(predictions):
        raise ValueError('Length of price data and predictions must match')
    
    results = df.copy() # Use a copy to add predictions
    results['prediction'] = predictions.values # Ensure predictions are aligned
    
    # Initialize trading variables
    balance = initial_balance
    position = 0  # 0: no position, 1: long position
    shares = 0
    entry_price = 0 # Price at which the current position was opened
    trades = [] # Stores individual buy/sell events
    
    # Track daily portfolio value for continuous plotting
    daily_portfolio_values = [] 

    # Input validation for SL/TP/Position Sizing
    if stop_loss_pct is not None and (not isinstance(stop_loss_pct, (int, float)) or stop_loss_pct <= 0):
        raise ValueError("stop_loss_pct must be a positive number or None.")
    if take_profit_pct is not None and (not isinstance(take_profit_pct, (int, float)) or take_profit_pct <= 0):
        raise ValueError("take_profit_pct must be a positive number or None.")
    if not isinstance(position_sizing_pct, (int, float)) or not (0.01 <= position_sizing_pct <= 1.0):
        raise ValueError("position_sizing_pct must be a float between 0.01 and 1.0.")

    try:
        for i in range(len(results)):
            current_price = results.iloc[i]['close']
            current_high = results.iloc[i]['high'] # Used for TP check
            current_low = results.iloc[i]['low']   # Used for SL check
            signal = results.iloc[i]['prediction']
            current_date = results.index[i]
            
            # --- Handle existing position first (SL/TP or Model Exit) ---
            exit_today = False
            exit_reason = None
            exit_price_actual = current_price # Default exit price is current_close

            if position == 1: # We are currently in a long position
                sl_triggered = False
                tp_triggered = False
                
                # Calculate SL/TP levels based on entry price
                sl_level = entry_price * (1 - stop_loss_pct) if stop_loss_pct is not None else None
                tp_level = entry_price * (1 + take_profit_pct) if take_profit_pct is not None else None

                # Check for Stop Loss trigger (prioritize SL if both hit on the same day)
                if sl_level is not None and current_low <= sl_level:
                    sl_triggered = True
                    # Simplification: fill exactly at the SL level, even if the bar gapped below it
                    exit_price_actual = sl_level # Assume execution at the stop-loss level
                    exit_reason = 'stop_loss'
                    exit_today = True
                    logging.info(f"SL HIT: {current_date} @ ${exit_price_actual:.2f} (Entry: ${entry_price:.2f})")
                
                # Check for Take Profit trigger (only if SL NOT triggered)
                if not sl_triggered and tp_level is not None and current_high >= tp_level:
                    tp_triggered = True
                    exit_price_actual = tp_level # Assume execution at the take-profit level
                    exit_reason = 'take_profit'
                    exit_today = True
                    logging.info(f"TP HIT: {current_date} @ ${exit_price_actual:.2f} (Entry: ${entry_price:.2f})")

                # If no SL/TP triggered, check model signal to exit
                if not exit_today and (signal == -1 or signal == 0): # Model says sell or hold
                    exit_today = True
                    exit_reason = 'model_signal_exit'
                    exit_price_actual = current_price # Exit at close price
                    logging.info(f"MODEL EXIT: {current_date} @ ${exit_price_actual:.2f} (Signal: {signal})")
            
            if exit_today:
                # Execute sell trade
                value_of_shares_sold = shares * exit_price_actual
                fee = value_of_shares_sold * transaction_fee_pct
                balance += (value_of_shares_sold - fee)
                
                trades.append({
                    'date': current_date,
                    'type': 'sell',
                    'price': exit_price_actual,
                    'shares': shares,
                    'fee': fee,
                    'balance': balance,
                    'total_value': balance, # After selling, total value is just cash
                    'reason': exit_reason
                })
                shares = 0
                position = 0 # No longer in a position
                # After exiting, we do not re-enter on the same day in this daily bar model.

            # --- Handle Buy Signal (only if not already exited today, and not in position) ---
            # Using 'elif' ensures that if a position was exited (due to SL/TP/model), we don't try to re-enter on the same day.
            elif position == 0 and signal == 1: # No position and model says buy
                # Calculate the amount of cash to allocate for this trade, based on position_sizing_pct
                cash_to_allocate = balance * position_sizing_pct

                # Calculate the number of shares we can buy with this allocated cash, considering fees
                # Shares = Cash_Allocated / (Price * (1 + Fee_Rate))
                shares_to_buy = cash_to_allocate / (current_price * (1 + transaction_fee_pct))

                # Calculate the total cost including fees for the allocated shares
                total_trade_cost = shares_to_buy * current_price * (1 + transaction_fee_pct)

                # Ensure we have enough balance for this trade and that we can buy at least a tiny amount
                if balance >= total_trade_cost and shares_to_buy * current_price > 0.01: # Small threshold for meaningful trades (e.g., > $0.01)
                    # Execute the buy
                    balance -= total_trade_cost
                    position = 1
                    shares = shares_to_buy
                    entry_price = current_price # Store entry price for SL/TP
                    
                    trades.append({
                        'date': current_date,
                        'type': 'buy',
                        'price': current_price,
                        'shares': shares,
                        'fee': total_trade_cost - (shares * current_price), # Fee is total_cost - actual_asset_value
                        'balance': balance,
                        'total_value': balance + (shares * current_price), # Total value including new asset
                        'reason': 'buy_signal'
                    })
                    logging.info(f"BUY: {current_date} @ ${current_price:.2f}, Shares: {shares:.4f}, Balance: ${balance:.2f} (Allocated: {position_sizing_pct*100:.0f}%)")
                else:
                    # Log why a buy might have failed (e.g., insufficient balance, too small an amount)
                    if shares_to_buy * current_price <= 0.01:
                        logging.debug(f"Skipped BUY on {current_date}: Trade amount too small. Value: ${shares_to_buy * current_price:.2f}")
                    else:
                        logging.warning(f"Failed to BUY on {current_date}: Insufficient funds. Req: ${total_trade_cost:.2f}, Has: ${balance:.2f}")


            # --- Daily Portfolio Value Tracking (end of day snapshot) ---
            current_portfolio_value = balance + (shares * current_price if position == 1 else 0)
            daily_portfolio_values.append({
                'date': current_date,
                'total_value': current_portfolio_value
            })

        # Close any remaining position at the very end of the backtest period
        if position == 1 and shares > 0:
            final_price = results.iloc[-1]['close']
            value_of_shares_sold = shares * final_price
            fee = value_of_shares_sold * transaction_fee_pct
            balance += (value_of_shares_sold - fee)
            trades.append({
                'date': results.index[-1],
                'type': 'sell',
                'price': final_price,
                'shares': shares,
                'fee': fee,
                'balance': balance,
                'total_value': balance,
                'reason': 'end_of_backtest_close' # Clear reason
            })
            logging.info(f"FINAL SELL: {results.index[-1]} @ ${final_price:.2f}, Shares: {shares:.4f}, Balance: ${balance:.2f}")
        
        # Convert daily portfolio values to DataFrame
        daily_portfolio_df = pd.DataFrame(daily_portfolio_values)
        if not daily_portfolio_df.empty:
            daily_portfolio_df.set_index('date', inplace=True)
        else:
            # Fallback: If no data or portfolio tracking failed, initialize with initial balance
            daily_portfolio_df = pd.DataFrame([{'date': df.index[0], 'total_value': initial_balance}]).set_index('date')

        # Convert individual trade records to DataFrame
        trades_df = pd.DataFrame(trades)
        if not trades_df.empty:
            trades_df.set_index('date', inplace=True)
            
            # --- Calculate round-trip trade metrics ---
            round_trip_trades = []
            # We need to iterate through trades to match buys and sells
            # This logic assumes sequential buy-sell pairs. If a buy is followed by another buy without an intervening sell,
            # it implies being in a position and not exiting, which current logic doesn't support (only one position at a time).
            # The current backtest only allows one open position at a time, so this pairing logic is suitable.
            
            buy_trade_info = None # Stores the buy trade record until a sell is found

            for idx, row in trades_df.iterrows():
                if row['type'] == 'buy':
                    # If we encounter a buy before a previous one was closed, it's an error in logic or re-entering.
                    # Given the current simple strategy, this implies the previous position was implicitly closed.
                    # For more complex strategies, this might need more robust handling (e.g., closing previous position).
                    if buy_trade_info is not None:
                        logging.warning(f"Unexpected BUY signal at {idx} while a BUY from {buy_trade_info.name} was still open. "
                                        "This implies strategy re-entered without explicit sell. Closing previous trade for calculation.")
                        # Treat the previous open buy as implicitly closed at current price for metrics
                        # (This is a workaround; ideally, the backtest logic ensures clear buy-sell pairs)
                        round_trip_trades.append({
                            'buy_date': buy_trade_info.name,
                            'sell_date': idx, # Effectively "sold" at this date
                            'buy_price': buy_trade_info['price'],
                            'sell_price': row['price'], # Sold at the current price
                            'shares': buy_trade_info['shares'],
                            # Net sell proceeds over cost basis; the buy fee is already part of cost_basis, so it is not subtracted again
                            'return_pct': (row['price'] * buy_trade_info['shares'] * (1 - transaction_fee_pct)) / buy_trade_info['cost_basis'] - 1 if buy_trade_info['cost_basis'] != 0 else 0,
                            'reason_exit': 'forced_re_entry_close'
                        })
                        buy_trade_info = None # Reset after forced close

                    buy_trade_info = row.copy() # Store the entire buy record
                    buy_trade_info['cost_basis'] = (buy_trade_info['price'] * buy_trade_info['shares']) + buy_trade_info['fee'] # Calculate initial cost basis

                elif row['type'] == 'sell' and buy_trade_info is not None:
                    sell_value_gross = row['price'] * row['shares']
                    sell_value_net = sell_value_gross - row['fee']

                    net_profit = sell_value_net - buy_trade_info['cost_basis']
                    
                    return_pct = net_profit / buy_trade_info['cost_basis'] if buy_trade_info['cost_basis'] != 0 else 0

                    round_trip_trades.append({
                        'buy_date': buy_trade_info.name, # Use index as date
                        'sell_date': idx,
                        'buy_price': buy_trade_info['price'],
                        'sell_price': row['price'],
                        'shares': row['shares'],
                        'return_pct': return_pct,
                        'reason_exit': row['reason'] # Reason for selling
                    })
                    buy_trade_info = None # Reset for next trade
            
            # Print performance metrics based on the *overall* portfolio value and round-trip trades
            final_balance_overall = daily_portfolio_df.iloc[-1]['total_value'] if not daily_portfolio_df.empty else initial_balance
            total_return_overall = (final_balance_overall - initial_balance) / initial_balance * 100
            
            print('\n=== Trading Performance ===')
            print(f'Initial Balance: ${initial_balance:.2f}')
            print(f'Final Balance: ${final_balance_overall:.2f}')
            print(f'Total Return: {total_return_overall:.2f}%')
            
            num_trades = len(round_trip_trades)
            print(f'Number of Trades Executed: {num_trades}')

            metrics_summary = { # Prepare metrics summary dict
                'initial_balance': initial_balance,
                'final_balance': final_balance_overall,
                'total_return': total_return_overall,
                'num_trades': num_trades,
                'win_rate': 0.0, # Will be updated if trades exist
                'avg_return_per_trade': 0.0,
                'best_trade_return': 0.0,
                'worst_trade_return': 0.0,
            }

            if num_trades > 0:
                round_trip_df = pd.DataFrame(round_trip_trades)
                profitable_trades = (round_trip_df['return_pct'] > 0).sum()
                win_rate = (profitable_trades / num_trades) * 100
                print(f'Win Rate (closed trades): {win_rate:.2f}%')
                
                print(f'Average Return per Trade: {round_trip_df["return_pct"].mean()*100:.2f}%')
                print(f'Best Trade Return: {round_trip_df["return_pct"].max()*100:.2f}%')
                print(f'Worst Trade Return: {round_trip_df["return_pct"].min()*100:.2f}%')

                # Update metrics_summary
                metrics_summary['win_rate'] = win_rate
                metrics_summary['avg_return_per_trade'] = round_trip_df["return_pct"].mean()*100
                metrics_summary['best_trade_return'] = round_trip_df["return_pct"].max()*100
                metrics_summary['worst_trade_return'] = round_trip_df["return_pct"].min()*100
            else:
                print('No completed trades to calculate win rate or average returns.')

        else: # trades_df is empty, meaning no buy trades were ever made
            print('No trades were executed during the backtesting period.')
            # Still provide overall balance
            final_balance_overall = daily_portfolio_df.iloc[-1]['total_value'] if not daily_portfolio_df.empty else initial_balance
            total_return_overall = (final_balance_overall - initial_balance) / initial_balance * 100
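            print('\n=== Trading Performance (No Trades) ===')
            print(f'Initial Balance: ${initial_balance:.2f}')
            print(f'Final Balance: ${final_balance_overall:.2f}')
            print(f'Total Return: {total_return_overall:.2f}%')
            # metrics_summary is built in the trades branch above, so define the no-trade defaults here too
            metrics_summary = {
                'initial_balance': initial_balance,
                'final_balance': final_balance_overall,
                'total_return': total_return_overall,
                'num_trades': 0,
                'win_rate': 0.0,
                'avg_return_per_trade': 0.0,
                'best_trade_return': 0.0,
                'worst_trade_return': 0.0,
            }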

        return trades_df, daily_portfolio_df, metrics_summary # <-- Now also returning the metrics summary
    
    except Exception as e:
        logging.error(f'Error in backtesting: {str(e)}')
        raise
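
The round-trip metrics computed at the end of the backtest can be looked at in isolation. The sketch below is a simplified, pure-Python version of that pairing logic, not the notebook's implementation: it assumes the fee is a flat percentage of traded value on both legs, and the names `round_trip_return`, `pair_trades`, the sample prices, and `fee_pct=0.001` are all hypothetical.

```python
def round_trip_return(buy_price, sell_price, shares, fee_pct):
    """Net fractional return of one buy/sell pair, with fees on both legs."""
    buy_fee = buy_price * shares * fee_pct
    cost_basis = buy_price * shares + buy_fee        # cash paid to enter
    sell_fee = sell_price * shares * fee_pct
    net_proceeds = sell_price * shares - sell_fee    # cash received on exit
    if cost_basis == 0:
        return 0.0
    return net_proceeds / cost_basis - 1


def pair_trades(trades, fee_pct):
    """Match each buy with the next sell; `trades` is a chronological list of dicts."""
    open_buy = None
    returns = []
    for trade in trades:
        if trade["type"] == "buy":
            open_buy = trade
        elif trade["type"] == "sell" and open_buy is not None:
            returns.append(
                round_trip_return(open_buy["price"], trade["price"], trade["shares"], fee_pct)
            )
            open_buy = None  # position is flat again
    return returns


# Made-up trade log: one winning and one losing round trip
example_trades = [
    {"type": "buy", "price": 100.0, "shares": 2.0},
    {"type": "sell", "price": 110.0, "shares": 2.0},
    {"type": "buy", "price": 120.0, "shares": 1.0},
    {"type": "sell", "price": 114.0, "shares": 1.0},
]
returns = pair_trades(example_trades, fee_pct=0.001)
win_rate = 100 * sum(r > 0 for r in returns) / len(returns)
print([f"{r:.4f}" for r in returns], f"win rate: {win_rate:.1f}%")
```

Because the buy fee is folded into the cost basis, it is counted exactly once; only the sell fee is deducted from the proceeds.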

## Data Visualization

In this section, we:
1. Visualize trading performance
2. Visualize volume
3. Visualize the daily portfolio value
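
The buy and sell markers on the price panel come from the trade log. A minimal sketch of that preparation step, assuming a trade log with `type` and `price` columns indexed by date, as the backtest produces; `split_trade_markers` and `hover_labels` are hypothetical helper names and the sample prices are made up.

```python
import pandas as pd


def split_trade_markers(trades_df):
    """Return (buys, sells) frames; both are empty when no trades were made."""
    if trades_df.empty or "type" not in trades_df.columns:
        empty = pd.DataFrame(columns=["type", "price"])
        return empty, empty.copy()
    buys = trades_df[trades_df["type"] == "buy"]
    sells = trades_df[trades_df["type"] == "sell"]
    return buys, sells


def hover_labels(frame, side):
    """Build one hover label per marker, e.g. 'Buy @ $100.00'."""
    return [f"{side} @ ${p:.2f}" for p in frame["price"]]


trades_df = pd.DataFrame(
    {"type": ["buy", "sell"], "price": [27150.5, 28010.0]},
    index=pd.to_datetime(["2023-01-03", "2023-01-09"]),
)
trades_df.index.name = "date"

buys, sells = split_trade_markers(trades_df)
buy_text = hover_labels(buys, "Buy")

# An empty trade log (no trades executed) yields two empty frames instead of a KeyError
no_buys, no_sells = split_trade_markers(pd.DataFrame())
```

Checking for the `type` column first matters because a DataFrame built from an empty list of trades has no columns at all.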


In [None]:
def visualize_trading_results(df_with_indicators, trades_df, daily_portfolio_df, backtest_metrics=None):
    """Create interactive candlestick chart with trading signals, technical indicators, and performance metrics.
    
    Args:
        df_with_indicators (pd.DataFrame): DataFrame with OHLCV and calculated technical indicators (RSI, BB, SMA, ATR).
        trades_df (pd.DataFrame): DataFrame with trade information (buy/sell points).
        daily_portfolio_df (pd.DataFrame): DataFrame with daily total portfolio value.
        backtest_metrics (dict, optional): Dictionary containing pre-calculated metrics from backtest_strategy,
                                         e.g., {'initial_balance', 'final_balance', 'total_return', 'num_trades', 'win_rate'}.
    """
    try:
        # --- DEBUGGING PRINTS ---
        print("\n--- Debugging Visualization Data ---")
        print("df_with_indicators columns:", df_with_indicators.columns.tolist())
        print("df_with_indicators head:\n", df_with_indicators.head())
        
        # Check for presence and NaNs in key indicator columns
        indicator_cols_to_check = ['rsi', 'sma_20', 'sma_50', 'bb_upper', 'bb_mid', 'bb_lower', 'atr']
        for col in indicator_cols_to_check:
            if col not in df_with_indicators.columns:
                print(f"ERROR: Column '{col}' is NOT found in df_with_indicators for plotting!")
            else:
                nan_count = df_with_indicators[col].isnull().sum()
                if nan_count > 0:
                    print(f"WARNING: Column '{col}' has {nan_count} NaN values.")
                    # Drop NaNs from these specific columns for plotting robustness, might lose rows but ensures line draws
                    df_with_indicators = df_with_indicators.dropna(subset=[col])
                else:
                    print(f"Column '{col}' is present and has no NaNs.")
                print(f"'{col}' stats: {df_with_indicators[col].describe()}")
        print("--- End Debugging Visualization Data ---\n")

        # Define subplot layout: 5 rows, 1 column
        # Row 1: Price Action, Bollinger Bands & SMAs
        # Row 2: Volume
        # Row 3: ATR
        # Row 4: RSI
        # Row 5: Account Value
        fig = make_subplots(rows=5, cols=1,
                            shared_xaxes=True, # All X-axes will be linked
                            vertical_spacing=0.02, # Slightly reduced spacing for compactness
                            subplot_titles=('Price Action, Bollinger Bands & SMAs', 'Volume', 'ATR', 'RSI', 'Account Value'),
                            row_heights=[0.45, 0.10, 0.15, 0.15, 0.15]) # Adjusted relative heights
        
        # --- Row 1, Column 1: Candlestick chart + Bollinger Bands + SMAs + Buy/Sell Signals ---
        fig.add_trace(
            go.Candlestick(x=df_with_indicators.index, 
                           open=df_with_indicators['open'],
                           high=df_with_indicators['high'],
                           low=df_with_indicators['low'],
                           close=df_with_indicators['close'],
                           name='OHLC'),
            row=1, col=1
        )
        
        # Bollinger Bands (overlayed on price chart)
        if all(col in df_with_indicators.columns for col in ['bb_upper', 'bb_mid', 'bb_lower']):
            fig.add_trace(go.Scatter(x=df_with_indicators.index, y=df_with_indicators['bb_upper'], 
                                     line=dict(color='orange', width=1), name='BB Upper', showlegend=True), 
                          row=1, col=1)
            fig.add_trace(go.Scatter(x=df_with_indicators.index, y=df_with_indicators['bb_mid'], 
                                     line=dict(color='red', width=1, dash='dash'), name='BB Mid', showlegend=True), 
                          row=1, col=1)
            fig.add_trace(go.Scatter(x=df_with_indicators.index, y=df_with_indicators['bb_lower'], 
                                     line=dict(color='green', width=1), name='BB Lower', showlegend=True), 
                          row=1, col=1)
        else:
            print("WARNING: Bollinger Bands columns missing, skipping plot.")
        
        # Simple Moving Averages (overlayed on price chart)
        if 'sma_20' in df_with_indicators.columns:
            fig.add_trace(go.Scatter(x=df_with_indicators.index, y=df_with_indicators['sma_20'], 
                                     line=dict(color='blue', width=1.5), name='SMA 20', showlegend=True), 
                          row=1, col=1)
        else:
            print("WARNING: SMA 20 column missing, skipping plot.")
        if 'sma_50' in df_with_indicators.columns:
            fig.add_trace(go.Scatter(x=df_with_indicators.index, y=df_with_indicators['sma_50'], 
                                     line=dict(color='white', width=1.5), name='SMA 50', showlegend=True), 
                          row=1, col=1)
        else:
            print("WARNING: SMA 50 column missing, skipping plot.")

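        # Add buy markers (an empty trades_df has no 'type' column, so guard the lookup)
        buys = trades_df[trades_df['type'] == 'buy'] if 'type' in trades_df.columns else pd.DataFrame()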
        if not buys.empty:
            fig.add_trace(
                go.Scatter(x=buys.index,
                           y=buys['price'],
                           mode='markers',
                           name='Buy Signals',
                           marker=dict(color='green', size=10, symbol='triangle-up'),
                           text=[f'Buy @ ${p:.2f}' for p in buys['price']],
                           hoverinfo='text'),
                row=1, col=1
            )
        
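        # Add sell markers (an empty trades_df has no 'type' column, so guard the lookup)
        sells = trades_df[trades_df['type'] == 'sell'] if 'type' in trades_df.columns else pd.DataFrame()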
        if not sells.empty:
            fig.add_trace(
                go.Scatter(x=sells.index,
                           y=sells['price'],
                           mode='markers',
                           name='Sell Signals',
                           marker=dict(color='red', size=10, symbol='triangle-down'),
                           text=[f'Sell @ ${p:.2f}' for p in sells['price']],
                           hoverinfo='text'),
                row=1, col=1
            )
        
        # --- Row 2, Column 1: Volume chart ---
        fig.add_trace(
            go.Bar(x=df_with_indicators.index, 
                   y=df_with_indicators['volume'],
                   name='Volume',
                   marker_color='darkblue',
                   showlegend=True),
            row=2, col=1
        )
        
        # --- Row 3, Column 1: ATR chart ---
        if 'atr' in df_with_indicators.columns:
            fig.add_trace(go.Scatter(x=df_with_indicators.index, y=df_with_indicators['atr'], 
                                     name='ATR', line=dict(color='magenta')), 
                          row=3, col=1)
        else:
            print("WARNING: ATR column missing, skipping plot.")

        # --- Row 4, Column 1: RSI chart ---
        if 'rsi' in df_with_indicators.columns:
            fig.add_trace(go.Scatter(x=df_with_indicators.index, y=df_with_indicators['rsi'], 
                                     name='RSI', line=dict(color='cyan')), 
                          row=4, col=1)
            # Add RSI overbought/oversold lines for context
            fig.add_hline(y=70, line_dash="dash", line_color="red", row=4, col=1, annotation_text="Overbought", annotation_position="top left", annotation_font_color="red")
            fig.add_hline(y=30, line_dash="dash", line_color="green", row=4, col=1, annotation_text="Oversold", annotation_position="bottom left", annotation_font_color="green")
            fig.update_yaxes(range=[0, 100], row=4, col=1) # Fixed RSI range
        else:
            print("WARNING: RSI column missing, skipping plot.")
        
        # --- Row 5, Column 1: Account Value ---
        if not daily_portfolio_df.empty:
            fig.add_trace(
                go.Scatter(x=daily_portfolio_df.index,
                           y=daily_portfolio_df['total_value'],
                           name='Account Value',
                           line=dict(color='purple')),
                row=5, col=1
            )
        
        # --- Update layout ---
        fig.update_layout(
            title_text='Trading Strategy Performance',
            xaxis_rangeslider_visible=False, # Hide the default rangeslider that appears on the first subplot
            hovermode='x unified', # Unify hover info across subplots by x-coordinate
            height=1200, # Sufficient overall height for 5 rows
            width=1200,
            showlegend=True,
            template='plotly_dark', # Set dark theme here!
            
            # Assign proper Y-axis titles for each row/subplot
            yaxis=dict(title='Price'),  # Y-axis for Price (row=1)
            yaxis2=dict(title='Volume'),# Y-axis for Volume (row=2)
            yaxis3=dict(title='ATR'),   # Y-axis for ATR (row=3)
            yaxis4=dict(title='RSI'),   # Y-axis for RSI (row=4)
            yaxis5=dict(title='Value ($)'), # Y-axis for Account Value (row=5)

            # Link X-axes (all are shared by default now) and ensure range slider is on the bottom one
            xaxis5=dict(title='Date', rangeslider_visible=True) # Ensure range slider is visible on the very bottom axis
        )
        
        # Add performance metrics as annotations
        if backtest_metrics and not daily_portfolio_df.empty:
            metrics_text = (
                f"Initial Balance: ${backtest_metrics['initial_balance']:,.2f}<br>"
                f"Final Balance: ${backtest_metrics['final_balance']:,.2f}<br>"
                f"Total Return: {backtest_metrics['total_return']:.2f}%<br>"
                f"Number of Trades: {backtest_metrics['num_trades']}<br>"
                f"Win Rate: {backtest_metrics['win_rate']:.2f}%"
            )
            
            fig.add_annotation(
                xref='paper',
                yref='paper',
                x=1.0,
                y=1.0, # Top right of the plot
                text=metrics_text,
                showarrow=False,
                font=dict(size=12, color='white'), # Make text white for dark background
                align='left',
                bgcolor='rgba(0,0,0,0.6)', # Darker background for annotation
                bordercolor='white', # White border for dark background
                borderwidth=1,
                borderpad=4
            )
        
        fig.show()
        return fig
    
    except Exception as e:
        print(f'❌ Error in visualization: {str(e)}')
        raise

## 🤖 Run Main Application!

In this section, we run the full pipeline and visualize the results of the crypto algorithmic trading strategy.

In [None]:
def main():
    """Execute the complete trading algorithm pipeline."""
    try:
        # 1. Load and preprocess data
        # df_original keeps the data as loaded and is returned unchanged at the end of the pipeline.
        df_original = load_and_preprocess_data('test_df_features.csv')
        # Create a working copy for feature engineering and transformations.
        df_working = df_original.copy()
        print(f'✅ Data loaded: {df_working.shape[0]} records, {df_working.shape[1]} columns')
        
        # 2. Calculate technical indicators
        df_working = calculate_technical_indicators(df_working)
        print('✅ Technical indicators calculated')
        
        # Verify all technical indicators were created and present after NaN drops
        # Ensure this list matches the new column names generated by calculate_technical_indicators
        expected_features = [
            'rsi', 'bb_upper', 'bb_lower', 'bb_mid', 'bb_pct_b',
            'sma_20', 'sma_50', 'ma_cross', 'price_momentum',
            'atr', 'atr_pct'
        ]

        # Filter expected_features to only include those that are actually in df_working AFTER calculation
        # This makes the list more robust if some indicators aren't generated due to data issues or old TA libs
        feature_cols_for_model = [col for col in expected_features if col in df_working.columns]

        missing_features = [col for col in expected_features if col not in df_working.columns]
        if missing_features:
            # Missing indicators are only logged; raise here instead if they should halt the pipeline
            logging.warning(f'Missing expected technical indicators after calculation: {missing_features}. Model will proceed with available features.')
            
        print('\nAvailable features (after TI calc):', sorted(df_working.columns))
        
        # 3. Generate trading signals based on technical indicators
        lower_thresh, upper_thresh = get_rsi_quantile_thresholds(df_working['rsi'])
        df_working = apply_rsi_labels(df_working, lower_threshold=lower_thresh, upper_threshold=upper_thresh)
        print('✅ Trading signals generated')
        
        # Store a copy of the df with TIs and signals *BEFORE* normalization for plotting
        # This is the DataFrame with the correct new column names and values.
        df_for_plotting = df_working.copy() 

        # 4. Normalize numerical features
        df_normalized = normalize_features(df_working) # Pass df_working for normalization
        print('✅ Features normalized')
        
        # Verify features after normalization (using the list of features actually passed to the model)
        missing_normalized = [col for col in feature_cols_for_model if col not in df_normalized.columns]
        if missing_normalized:
            raise ValueError(f'Features missing after normalization for model training: {missing_normalized}')
        
        # 5. Prepare model data (split into train/test)
        # Use the confirmed feature_cols_for_model list here
        X_train, X_test, y_train, y_test = prepare_model_data(
            df_normalized, feature_cols=feature_cols_for_model
        )
        
        # 6. Handle class imbalance in training data using SMOTE
        smote = SMOTE(random_state=42)
        X_train_balanced, y_train_balanced = smote.fit_resample(X_train, y_train)
        print('✅ Data prepared and balanced')
        
        # 7. Train and optimize the XGBoost model
        model = train_xgboost_model(
            X_train_balanced, y_train_balanced,
            X_test, y_test
        )
        print('✅ Model trained')
        
        # 8. Generate predictions for backtesting on the full dataset
        # Note: X_full also contains the rows used for training, so the backtest is partly in-sample.
        # Ensure X_full contains only the features actually used by the model
        X_full = df_normalized[feature_cols_for_model] # Use the refined feature list
        if X_full.empty:
            raise ValueError("X_full is empty after feature selection/normalization, cannot generate predictions for backtesting.")

        pred_proba = model.predict_proba(X_full)
        
        # Apply a confidence threshold: if the top class probability is too low, fall back to 'hold'
        # (signal 0, which is class index 1 in the model's encoding).
        # With three classes the top probability is always at least 1/3, so 0.35 only filters near-uniform predictions.
        confidence_threshold = 0.35
        predictions_mapped = np.argmax(pred_proba, axis=1)
        max_probs = np.max(pred_proba, axis=1)

        # If the highest probability is below the threshold, force the prediction to 'hold' (class index 1)
        predictions_mapped[max_probs < confidence_threshold] = 1
        
        # Map numerical predictions back to original trading signals [-1, 0, 1]
        predictions = pd.Series(predictions_mapped).map(reverse_mapper)
        
        # Ensure predictions series has the same index as X_full for correct alignment with original data
        predictions.index = X_full.index 
        
        # Debugging: Print prediction distribution and confidence levels
        print("\nPrediction Distribution:")
        print(predictions.value_counts())
        print("\nConfidence Distribution:")
        print(f"Mean confidence: {max_probs.mean():.3f}")
        print(f"Min confidence: {max_probs.min():.3f}")
        print(f"Max confidence: {max_probs.max():.3f}")
        print(f"Predictions above threshold: {(max_probs >= confidence_threshold).sum()}")
        
        print("\nOriginal Signal Distribution:")
        print(df_working['signal'].value_counts()) 

        # 9. Run backtesting strategy
        my_stop_loss_pct = 0.03        # 3% stop loss
        my_take_profit_pct = 0.10      # 10% take profit
        my_position_sizing_pct = 0.95  # 95% position sizing

        # Use the pre-normalization frame, aligned to the prediction index, so the backtest sees real prices
        trades_df, daily_portfolio_df, metrics_summary = backtest_strategy(
            df_for_plotting.loc[predictions.index], # This DataFrame has the correct new TI column names
            predictions,
            stop_loss_pct=my_stop_loss_pct,
            take_profit_pct=my_take_profit_pct,
            position_sizing_pct=my_position_sizing_pct 
        )
        
        # 10. Visualize trading results
        if not daily_portfolio_df.empty:
            # Plot the same pre-normalization frame that was backtested
            visualize_trading_results(
                df_for_plotting.loc[predictions.index], # This DataFrame has the correct new TI column names
                trades_df, 
                daily_portfolio_df, 
                backtest_metrics=metrics_summary 
            )
        else:
            print('⚠️ No daily portfolio data generated for visualization. This usually means no trades were executed or an error occurred in backtesting.')

        return df_original, model, trades_df, daily_portfolio_df, metrics_summary 
        
    except Exception as e:
        logging.error(f'Error in main pipeline: {str(e)}')
        raise

# Execute the pipeline when the script is run
if __name__ == "__main__":
    try:
        df_final, model_final, trades_df_final, daily_portfolio_df_final, metrics_summary_final = main()
        print('✅ Trading pipeline completed successfully')
    except Exception as e:
        print(f'❌ Error executing trading pipeline: {str(e)}')
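
The confidence-threshold step in `main()` can be tried on its own. This is a minimal sketch under stated assumptions: the class encoding (0 for sell, 1 for hold, 2 for buy) is inferred from the comments in `main()` rather than taken from the notebook's `reverse_mapper`, and `threshold_predictions`, `REVERSE_MAPPER`, `HOLD_CLASS`, and the sample probabilities are hypothetical.

```python
import numpy as np

# Assumed encoding: class 0 -> sell (-1), class 1 -> hold (0), class 2 -> buy (1)
REVERSE_MAPPER = {0: -1, 1: 0, 2: 1}
HOLD_CLASS = 1


def threshold_predictions(pred_proba, confidence_threshold):
    """Pick the most likely class per row, falling back to hold when confidence is low."""
    pred_proba = np.asarray(pred_proba, dtype=float)
    classes = np.argmax(pred_proba, axis=1)
    max_probs = np.max(pred_proba, axis=1)
    classes[max_probs < confidence_threshold] = HOLD_CLASS
    signals = np.array([REVERSE_MAPPER[int(c)] for c in classes])
    return signals, max_probs


# Made-up class probabilities for four rows (each row sums to 1)
pred_proba = [
    [0.70, 0.20, 0.10],  # confident sell
    [0.34, 0.33, 0.33],  # near-uniform, falls back to hold
    [0.10, 0.30, 0.60],  # confident buy
    [0.40, 0.35, 0.25],  # weak sell, falls back to hold
]
signals, max_probs = threshold_predictions(pred_proba, confidence_threshold=0.5)
print(signals.tolist(), max_probs.min())
```

With three classes the largest probability can never drop below 1/3, so a threshold close to that value filters almost nothing; the sketch uses 0.5 to make the fallback visible.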