# Production ML Prediction for LLM Stock Recommendations

## Purpose
Get ML predictions for ALL LLM-extracted stock recommendations using the trained XGBoost model (70.08% accuracy).

## Workflow
1. **Input**: List of stock symbols from LLM (e.g., `['RELIANCE', 'TCS', 'INFY']`)
2. **Fetch**: Get about 250 calendar days of daily candles for each stock, so that at least 60 trading days remain after feature engineering
3. **Engineer**: Calculate all 47 technical features
4. **Predict**: Use trained XGBoost model to predict 7-day direction
5. **Display**: Show predictions for ALL stocks (UP/DOWN with confidence)

## Model Performance
- **Accuracy**: 70.08%
- **Precision**: 73.10% (when it says UP, it's right 73% of the time)
- **Win Rate**: 70.81% in trading simulation
- **Sharpe Ratio**: 0.4992
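
As a reminder of how the first two figures are defined, the sketch below computes accuracy and UP-class precision from a handful of made-up labels (1 = UP, 0 = DOWN). The labels and the helper name `accuracy_and_precision` are illustrative assumptions and are not taken from the model's evaluation set.

```python
def accuracy_and_precision(y_true, y_pred):
    """Return (accuracy, precision for the UP class) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    correct = sum(1 for t, p in pairs if t == p)
    predicted_up = sum(1 for _, p in pairs if p == 1)
    true_up = sum(1 for t, p in pairs if t == 1 and p == 1)
    accuracy = correct / len(pairs)
    # Precision is undefined when the model never predicts UP
    precision = true_up / predicted_up if predicted_up else float("nan")
    return accuracy, precision

# Made-up example labels, not real evaluation data
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
acc, prec = accuracy_and_precision(y_true, y_pred)
print(f"accuracy={acc:.2%}, precision={prec:.2%}")
```

Precision only looks at the rows where the model said UP, which is why it can differ from overall accuracy.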

---
## 1. Imports and Setup

In [1]:
# Standard libraries
import os
import pickle
import warnings
from pathlib import Path
from datetime import datetime, timedelta

import numpy as np
import pandas as pd

# API requests
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Suppress warnings
warnings.filterwarnings('ignore')

print("✅ All libraries imported successfully")

✅ All libraries imported successfully


In [2]:
# Configuration
WINDOW_SIZE = 60          # Need 60 days of history
HORIZON = 7               # Predicting 7-day ahead direction

# Paths
MODELS_DIR = Path("models")
MODEL_PATH = MODELS_DIR / "xgboost_stock_direction_predictor.pkl"
SCALER_PATH = MODELS_DIR / "feature_scaler.pkl"

# Upstox API
UPSTOX_ACCESS_TOKEN = os.getenv("UPSTOX_ACCESS_TOKEN")

# Feature list (47 features in the exact order used during training)
FEATURE_COLS = [
    'Open', 'High', 'Low', 'Close', 'Volume', 'OI',
    'SMA_5', 'SMA_10', 'SMA_20', 'SMA_50',
    'EMA_12', 'EMA_26', 'MACD', 'MACD_Signal', 'MACD_Hist',
    'RSI', 'BB_Middle', 'BB_Upper', 'BB_Lower',
    'Volume_SMA_20', 'Volume_Ratio', 'Daily_Return', 'Price_Range', 'Price_Change',
    'Return_3d', 'Return_5d', 'Return_10d', 'Log_Return',
    'Volatility_5d', 'Volatility_20d', 'Momentum_10d', 'Momentum_20d',
    # Advanced features (15)
    'relative_strength_to_nifty50', 'correlation_to_nifty50_20d', 'market_regime',
    'rsi_divergence', 'macd_crossover_signal', 'bb_squeeze',
    'price_vs_sma50_pct', 'momentum_strength', 'support_resistance_distance',
    'volume_price_trend', 'on_balance_volume', 'volume_breakout',
    'returns_skewness_20d', 'returns_kurtosis_20d', 'hurst_exponent'
]

print(f"Configuration loaded")
print(f"Model path: {MODEL_PATH}")
print(f"Scaler path: {SCALER_PATH}")
print(f"Features: {len(FEATURE_COLS)} features")

Configuration loaded
Model path: models/xgboost_stock_direction_predictor.pkl
Scaler path: models/feature_scaler.pkl
Features: 47 features
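
Because the feature order must match training, it can help to check a prepared DataFrame against the expected column list before predicting. The following is a minimal sketch; `missing_feature_columns` is a hypothetical helper that is not part of this notebook, and the four-column list is a short stand-in for `FEATURE_COLS`.

```python
import pandas as pd

def missing_feature_columns(df, expected_cols):
    """Return the expected columns that are absent from df, in expected order."""
    return [col for col in expected_cols if col not in df.columns]

expected = ['Open', 'Close', 'SMA_5', 'RSI']  # short stand-in for FEATURE_COLS
frame = pd.DataFrame({'Open': [1.0], 'Close': [1.1], 'SMA_5': [1.05]})

# Reports the columns that still need to be engineered
print(missing_feature_columns(frame, expected))
```

An empty list means every expected column is present; selecting `df[expected_cols]` afterwards also fixes the column order.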


In [3]:
# Load stock lookup data for symbol to ISIN conversion
import json

STOCK_LOOKUP_PATH = Path("data/processed/stock_lookup.json")

if STOCK_LOOKUP_PATH.exists():
    with open(STOCK_LOOKUP_PATH, 'r') as f:
        stock_lookup = json.load(f)
    print(f"✅ Stock lookup data loaded ({len(stock_lookup['by_symbol'])} stocks)")
else:
    print("⚠️  Warning: stock_lookup.json not found. API calls will fail.")
    stock_lookup = {'by_symbol': {}}

def get_isin_for_symbol(symbol):
    """Convert stock symbol to ISIN code"""
    if symbol in stock_lookup['by_symbol']:
        return stock_lookup['by_symbol'][symbol]['isin']
    else:
        print(f"   ⚠️  ISIN not found for {symbol}")
        return None

print("✅ Stock lookup functions defined")

✅ Stock lookup data loaded (2252 stocks)
✅ Stock lookup functions defined
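
The lookup function only relies on a `by_symbol` mapping whose entries carry an `isin` field. The sketch below shows that minimal shape with one placeholder entry. The symbol `EXAMPLECO` and its ISIN are invented for illustration, `lookup_isin` is a hypothetical helper, and the real file may contain further fields.

```python
import json

# Minimal stand-in for stock_lookup.json (placeholder values only)
sample_lookup = json.loads("""
{
  "by_symbol": {
    "EXAMPLECO": {"isin": "INE000000000"}
  }
}
""")

def lookup_isin(symbol, lookup):
    """Return the ISIN for symbol, or None when the symbol is unknown."""
    entry = lookup.get('by_symbol', {}).get(symbol)
    return entry['isin'] if entry else None

print(lookup_isin('EXAMPLECO', sample_lookup))
print(lookup_isin('UNKNOWN', sample_lookup))
```

Returning `None` for unknown symbols lets the caller skip the stock instead of raising, which matches how `get_isin_for_symbol()` behaves.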


---
## 2. Load Trained Model and Scaler

In [4]:
# Note: Run notebook 09 first to train and save the model
# For now, we'll handle the case where model doesn't exist yet

if MODEL_PATH.exists() and SCALER_PATH.exists():
    with open(MODEL_PATH, 'rb') as f:
        model = pickle.load(f)
    with open(SCALER_PATH, 'rb') as f:
        scaler = pickle.load(f)
    print("✅ Model and scaler loaded successfully")
else:
    print("⚠️  Model files not found!")
    print("   Please run notebook 09 first to train and save the model.")
    model = None
    scaler = None

✅ Model and scaler loaded successfully


---
## 3. Data Fetching Functions

In [5]:
def create_session_with_retries(retries=3, backoff_factor=0.3):
    """Create a requests session with retry logic"""
    session = requests.Session()
    retry = Retry(
        total=retries,
        read=retries,
        connect=retries,
        backoff_factor=backoff_factor,
        status_forcelist=(500, 502, 504),
    )
    adapter = HTTPAdapter(max_retries=retry)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session

session = create_session_with_retries()
print("✅ HTTP session created")

✅ HTTP session created


In [6]:
def fetch_stock_data(symbol, days=200, access_token=None):
    """
    Fetch historical stock data from Upstox API using ISIN
    
    Args:
        symbol: Stock symbol (e.g., 'RELIANCE')
        days: Number of days to fetch (default 200 to ensure 60+ after feature engineering)
        access_token: Upstox API token
    
    Returns:
        DataFrame with OHLCV data, or None if failed
    """
    if access_token is None:
        access_token = UPSTOX_ACCESS_TOKEN
    
    if not access_token:
        print(f"⚠️  Error: UPSTOX_ACCESS_TOKEN not set")
        return None
    
    # ✅ FIX: Convert symbol to ISIN
    isin = get_isin_for_symbol(symbol)
    if isin is None:
        return None
    
    # Calculate date range
    end_date = datetime.now().strftime('%Y-%m-%d')
    start_date = (datetime.now() - timedelta(days=days)).strftime('%Y-%m-%d')
    
    # ✅ FIX: Build URL using ISIN (same as notebook 02)
    instrument_key = f"NSE_EQ%7C{isin}"  # Use ISIN, not symbol
    url = f"https://api.upstox.com/v3/historical-candle/{instrument_key}/days/1/{end_date}/{start_date}"
    
    headers = {
        'Accept': 'application/json',
        'Authorization': f'Bearer {access_token}'
    }
    
    try:
        response = session.get(url, headers=headers, timeout=15)
        
        if response.status_code == 200:
            data = response.json()
            
            if data.get('status') == 'success' and 'data' in data:
                candles = data['data'].get('candles', [])
                
                if not candles:
                    print(f"⚠️  No data for {symbol}")
                    return None
                
                # Convert to DataFrame
                df = pd.DataFrame(candles, columns=['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'OI'])
                df['Date'] = pd.to_datetime(df['Date'])
                df = df.sort_values('Date').reset_index(drop=True)
                df['Symbol'] = symbol
                
                return df
            else:
                print(f"⚠️  API error for {symbol}: {data.get('message', 'Unknown')}")
                return None
        else:
            print(f"⚠️  HTTP {response.status_code} for {symbol}")
            return None
    
    except Exception as e:
        print(f"⚠️  Exception fetching {symbol}: {str(e)}")
        return None

print("✅ fetch_stock_data() defined (using ISIN)")

✅ fetch_stock_data() defined (using ISIN)
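
The instrument key in the URL is percent-encoded: `%7C` stands for the `|` separator. Rather than typing the escapes by hand, the key can be encoded with the standard library. This is a small sketch; `encode_instrument_key` is a hypothetical helper and the ISIN shown is a placeholder.

```python
from urllib.parse import quote

def encode_instrument_key(raw_key):
    """Percent-encode every reserved character, including '|' and spaces."""
    return quote(raw_key, safe='')

# Placeholder ISIN, for illustration only
print(encode_instrument_key("NSE_EQ|INE000000000"))
# The NIFTY 50 index key used later in this notebook contains a space as well
print(encode_instrument_key("NSE_INDEX|Nifty 50"))
```

Passing `safe=''` makes `quote` escape characters it would otherwise leave alone, so the result can be placed directly into the URL path segment.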


In [7]:
def fetch_nifty50_data(days=200, access_token=None):
    """
    Fetch NIFTY 50 index data
    """
    if access_token is None:
        access_token = UPSTOX_ACCESS_TOKEN
    
    if not access_token:
        print("⚠️  Error: UPSTOX_ACCESS_TOKEN not set")
        return None
    
    end_date = datetime.now().strftime('%Y-%m-%d')
    start_date = (datetime.now() - timedelta(days=days)).strftime('%Y-%m-%d')
    
    instrument_key_encoded = "NSE_INDEX%7CNifty%2050"
    url = f"https://api.upstox.com/v3/historical-candle/{instrument_key_encoded}/days/1/{end_date}/{start_date}"
    
    headers = {
        'Accept': 'application/json',
        'Authorization': f'Bearer {access_token}'
    }
    
    try:
        response = session.get(url, headers=headers, timeout=15)
        
        if response.status_code == 200:
            data = response.json()
            
            if data.get('status') == 'success' and 'data' in data:
                candles = data['data'].get('candles', [])
                
                if not candles:
                    return None
                
                df = pd.DataFrame(candles, columns=['Date', 'Open', 'High', 'Low', 'Close', 'Volume', 'OI'])
                df['Date'] = pd.to_datetime(df['Date'])
                df = df.sort_values('Date').reset_index(drop=True)
                df = df[['Date', 'Close']]
                df.rename(columns={'Close': 'Nifty50_Close'}, inplace=True)
                
                return df
            else:
                return None
        else:
            return None
    
    except Exception as e:
        print(f"⚠️  Exception fetching NIFTY 50: {str(e)}")
        return None

print("✅ fetch_nifty50_data() defined")

✅ fetch_nifty50_data() defined


---
## 4. Feature Engineering Functions

In [8]:
def calculate_basic_features(df):
    """
    Calculate basic technical indicators (first 32 features)
    Assumes df has OHLCV columns
    """
    df = df.copy()
    
    # Simple Moving Averages
    df['SMA_5'] = df['Close'].rolling(5).mean()
    df['SMA_10'] = df['Close'].rolling(10).mean()
    df['SMA_20'] = df['Close'].rolling(20).mean()
    df['SMA_50'] = df['Close'].rolling(50).mean()
    
    # Exponential Moving Averages
    df['EMA_12'] = df['Close'].ewm(span=12, adjust=False).mean()
    df['EMA_26'] = df['Close'].ewm(span=26, adjust=False).mean()
    
    # MACD
    df['MACD'] = df['EMA_12'] - df['EMA_26']
    df['MACD_Signal'] = df['MACD'].ewm(span=9, adjust=False).mean()
    df['MACD_Hist'] = df['MACD'] - df['MACD_Signal']
    
    # RSI
    delta = df['Close'].diff()
    gain = delta.where(delta > 0, 0).rolling(14).mean()
    loss = -delta.where(delta < 0, 0).rolling(14).mean()
    rs = gain / (loss + 1e-8)
    df['RSI'] = 100 - (100 / (1 + rs))
    
    # Bollinger Bands
    df['BB_Middle'] = df['Close'].rolling(20).mean()
    bb_std = df['Close'].rolling(20).std()
    df['BB_Upper'] = df['BB_Middle'] + (2 * bb_std)
    df['BB_Lower'] = df['BB_Middle'] - (2 * bb_std)
    
    # Volume indicators
    df['Volume_SMA_20'] = df['Volume'].rolling(20).mean()
    df['Volume_Ratio'] = df['Volume'] / (df['Volume_SMA_20'] + 1e-8)
    
    # Price-based features
    df['Daily_Return'] = df['Close'].pct_change() * 100
    df['Price_Range'] = df['High'] - df['Low']
    df['Price_Change'] = df['Close'] - df['Open']
    
    # Historical returns
    df['Return_3d'] = df['Close'].pct_change(3) * 100
    df['Return_5d'] = df['Close'].pct_change(5) * 100
    df['Return_10d'] = df['Close'].pct_change(10) * 100
    df['Log_Return'] = np.log(df['Close'] / df['Close'].shift(1)) * 100
    
    # Volatility
    df['Volatility_5d'] = df['Daily_Return'].rolling(5).std()
    df['Volatility_20d'] = df['Daily_Return'].rolling(20).std()
    
    # Momentum
    df['Momentum_10d'] = df['Close'] - df['Close'].shift(10)
    df['Momentum_20d'] = df['Close'] - df['Close'].shift(20)
    
    return df

print("✅ calculate_basic_features() defined")

✅ calculate_basic_features() defined
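
To make the RSI step concrete, the sketch below isolates the same rolling-mean formula and applies it to two synthetic price series. Note that this is the simple rolling-mean variant used in `calculate_basic_features()`, not Wilder's exponentially smoothed RSI; `simple_rsi` and both series are illustrative assumptions.

```python
import pandas as pd

def simple_rsi(close, window=14):
    """RSI from rolling means of gains and losses (simple-average variant)."""
    delta = close.diff()
    gain = delta.where(delta > 0, 0).rolling(window).mean()
    loss = -delta.where(delta < 0, 0).rolling(window).mean()
    rs = gain / (loss + 1e-8)  # small constant avoids division by zero
    return 100 - (100 / (1 + rs))

# Synthetic series: one rises by 1 every day, the other falls by 1 every day
rising = pd.Series([100.0 + i for i in range(30)])
falling = pd.Series([100.0 - i for i in range(30)])

print(round(simple_rsi(rising).iloc[-1], 2))
print(round(simple_rsi(falling).iloc[-1], 2))
```

A series with only gains pushes the RSI towards 100 and a series with only losses pushes it towards 0, which is a quick way to confirm the sign conventions are right.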


In [9]:
def calculate_advanced_features(df, nifty50_df=None):
    """
    Calculate advanced technical features (15 features)
    This is the same function from notebook 09
    """
    df = df.copy()
    
    # ========== MARKET CONTEXT FEATURES (3) ==========
    if nifty50_df is not None:
        df = df.merge(nifty50_df, on='Date', how='left')
        df['Nifty50_Return'] = df['Nifty50_Close'].pct_change() * 100
        df['relative_strength_to_nifty50'] = df['Daily_Return'] - df['Nifty50_Return']
        df['correlation_to_nifty50_20d'] = df['Daily_Return'].rolling(20).corr(df['Nifty50_Return'])
        
        nifty_sma_20 = df['Nifty50_Close'].rolling(20).mean()
        nifty_sma_50 = df['Nifty50_Close'].rolling(50).mean()
        df['market_regime'] = 0
        df.loc[nifty_sma_20 > nifty_sma_50, 'market_regime'] = 1
        df.loc[nifty_sma_20 < nifty_sma_50, 'market_regime'] = -1
        
        df.drop(columns=['Nifty50_Close', 'Nifty50_Return'], inplace=True)
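    else:
        # Without NIFTY 50 data the two market-context columns below are all NaN,
        # so the dropna() at the end of this function removes every row and the
        # stock is later reported as failed.
        df['relative_strength_to_nifty50'] = np.nan
        df['correlation_to_nifty50_20d'] = np.nan
        df['market_regime'] = 0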
    
    # ========== MOMENTUM & MEAN REVERSION (6) ==========
    rsi_change = df['RSI'].diff(5)
    price_change = df['Close'].pct_change(5)
    df['rsi_divergence'] = np.sign(rsi_change) - np.sign(price_change)
    
    df['macd_crossover_signal'] = 0
    macd_cross_up = (df['MACD'] > df['MACD_Signal']) & (df['MACD'].shift(1) <= df['MACD_Signal'].shift(1))
    macd_cross_down = (df['MACD'] < df['MACD_Signal']) & (df['MACD'].shift(1) >= df['MACD_Signal'].shift(1))
    df.loc[macd_cross_up, 'macd_crossover_signal'] = 1
    df.loc[macd_cross_down, 'macd_crossover_signal'] = -1
    
    bb_width = (df['BB_Upper'] - df['BB_Lower']) / df['BB_Middle']
    df['bb_squeeze'] = bb_width.rolling(20).apply(lambda x: (x.iloc[-1] - x.min()) / (x.max() - x.min() + 1e-8), raw=False)
    
    df['price_vs_sma50_pct'] = ((df['Close'] - df['SMA_50']) / df['SMA_50']) * 100
    df['momentum_strength'] = df['Momentum_10d'].diff(5)
    
    high_20 = df['High'].rolling(20).max()
    low_20 = df['Low'].rolling(20).min()
    df['support_resistance_distance'] = np.where(
        df['Close'] > df['Close'].shift(1),
        (high_20 - df['Close']) / df['Close'],
        (df['Close'] - low_20) / df['Close']
    )
    
    # ========== VOLUME & LIQUIDITY (3) ==========
    price_direction = np.sign(df['Close'].diff())
    df['volume_price_trend'] = (df['Volume'] * price_direction).cumsum()
    
    obv = np.where(df['Close'] > df['Close'].shift(1), df['Volume'],
                   np.where(df['Close'] < df['Close'].shift(1), -df['Volume'], 0))
    df['on_balance_volume'] = obv.cumsum()
    
    df['volume_breakout'] = (df['Volume'] > 2 * df['Volume_SMA_20']).astype(int)
    
    # ========== STATISTICAL (3) ==========
    df['returns_skewness_20d'] = df['Daily_Return'].rolling(20).skew()
    df['returns_kurtosis_20d'] = df['Daily_Return'].rolling(20).kurt()
    
    def calculate_hurst(ts, lags=range(2, 20)):
        if len(ts) < max(lags):
            return 0.5
        tau = []
        lagvec = []
        for lag in lags:
            pp = np.subtract(ts[lag:], ts[:-lag])
            lagvec.append(lag)
            tau.append(np.std(pp))
        try:
            poly = np.polyfit(np.log(lagvec), np.log(tau), 1)
            return poly[0]
        except Exception:
            return 0.5
    
    df['hurst_exponent'] = df['Close'].rolling(60).apply(lambda x: calculate_hurst(x.values), raw=False)
    
    df = df.dropna()
    return df

print("✅ calculate_advanced_features() defined")

✅ calculate_advanced_features() defined
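
The `hurst_exponent` feature is the slope of log(standard deviation of lagged differences) against log(lag). As a sanity check of that estimator, the sketch below applies the same idea to a long synthetic random walk, for which the slope should land near 0.5. The function name, seed, and series length are assumptions chosen for illustration.

```python
import numpy as np

def hurst_from_lagged_std(ts, lags=range(2, 20)):
    """Slope of log(std of lagged differences) against log(lag)."""
    lag_list = list(lags)
    tau = [np.std(ts[lag:] - ts[:-lag]) for lag in lag_list]
    slope, _ = np.polyfit(np.log(lag_list), np.log(tau), 1)
    return slope

# Synthetic random walk: cumulative sum of independent normal steps
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.standard_normal(5000))

print(round(hurst_from_lagged_std(random_walk), 2))
```

The notebook applies the estimator to 60-row windows, so the per-row values are expected to be considerably noisier than on a series of this length.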


In [10]:
def prepare_stock_for_prediction(symbol, nifty50_df=None):
    """
    Full pipeline: Fetch data → Calculate features → Prepare for model
    
    Returns:
        tuple: (features_array, latest_close_price, symbol) or (None, None, symbol) if failed
    """
    # 1. Fetch stock data
    df = fetch_stock_data(symbol, days=250)  # Extra buffer for feature calculation
    
    if df is None or len(df) < 70:
        print(f"   ⚠️  {symbol}: Insufficient data")
        return None, None, symbol
    
    # 2. Calculate basic features
    df = calculate_basic_features(df)
    
    # 3. Calculate advanced features
    df = calculate_advanced_features(df, nifty50_df)
    
    if len(df) < WINDOW_SIZE:
        print(f"   ⚠️  {symbol}: Not enough rows after feature engineering")
        return None, None, symbol
    
    # 4. Extract last 60 days of features
    features_df = df[FEATURE_COLS].tail(WINDOW_SIZE)
    
    if features_df.isnull().any().any():
        print(f"   ⚠️  {symbol}: NaN values in features")
        return None, None, symbol
    
    # 5. Convert to array (shape: 60, 47)
    features_array = features_df.values
    latest_close = df['Close'].iloc[-1]
    
    return features_array, latest_close, symbol

print("✅ prepare_stock_for_prediction() defined")

✅ prepare_stock_for_prediction() defined
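
The second length check exists because rolling features leave NaN values in their warm-up rows, which `dropna()` then removes. The sketch below counts those rows on synthetic prices, using a 50-row and a 60-row rolling mean as stand-ins for `SMA_50` and the 60-row window behind `hurst_exponent`. The price series is made up.

```python
import numpy as np
import pandas as pd

close = pd.Series(np.linspace(100.0, 120.0, 150))  # synthetic prices, 150 rows

sma_50 = close.rolling(50).mean()
window_60 = close.rolling(60).mean()  # stands in for any 60-row rolling feature

# A row is dropped if any feature is NaN, so the longest window decides
features = pd.concat([sma_50, window_60], axis=1)
warmup_rows = int(features.isna().any(axis=1).sum())
rows_left = len(close) - warmup_rows

print(warmup_rows, rows_left)
```

With a 60-row window at least 59 leading rows are lost, so a stock needs roughly `WINDOW_SIZE + 59 = 119` raw rows to pass the post-engineering check, assuming no other gaps in the data.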


---
## 5. Prediction Functions

In [11]:
def predict_stock_direction(features_array, model, scaler):
    """
    Predict 7-day direction for a single stock
    
    Args:
        features_array: numpy array of shape (60, 47)
        model: Trained XGBoost model
        scaler: Fitted StandardScaler
    
    Returns:
        dict with prediction, probability, and confidence
    """
    # 1. Reshape to (1, 60, 47) for single sample
    X = features_array.reshape(1, WINDOW_SIZE, -1)
    
    # 2. Scale features (reshape to 2D first to match scaler training)
    # Scaler was trained on (samples*60, 47), so each row must hold the 47 features
    X_2d = X.reshape(-1, len(FEATURE_COLS))  # Shape: (60, 47)
    X_scaled_2d = scaler.transform(X_2d)  # Scale each of the 47 features
    
    # 3. Reshape back to 3D then flatten for XGBoost
    X_scaled = X_scaled_2d.reshape(1, WINDOW_SIZE, -1)  # (1, 60, 47)
    X_flat = X_scaled.reshape(1, -1)  # (1, 2820)
    
    # 4. Predict
    prediction = model.predict(X_flat)[0]  # 0 or 1
    probabilities = model.predict_proba(X_flat)[0]  # [prob_down, prob_up]
    
    return {
        'prediction': int(prediction),
        'direction': 'UP' if prediction == 1 else 'DOWN',
        'probability_up': probabilities[1],
        'probability_down': probabilities[0],
        'confidence': max(probabilities)
    }

print("✅ predict_stock_direction() defined")

✅ predict_stock_direction() defined
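
The reshape steps are easier to follow with concrete shapes. The sketch below uses NumPy only: a synthetic 60×47 window whose values encode their own (day, feature) position, and a hand-rolled per-feature standardisation as a stand-in for `scaler.transform`. The statistics are computed from the synthetic window itself and are not the fitted scaler's values.

```python
import numpy as np

window_size, n_features = 60, 47

# Synthetic window: each value encodes its position as day * 100 + feature
window = np.arange(window_size)[:, None] * 100.0 + np.arange(n_features)[None, :]

# Stand-in for scaler.transform: standardise each feature column
mean = window.mean(axis=0)
std = window.std(axis=0)
scaled = (window - mean) / std

# Row-major flattening: day 0's 47 features come first, then day 1's, and so on
flat = scaled.reshape(1, -1)
print(flat.shape)
print(flat[0, 1 * n_features + 3] == scaled[1, 3])
```

Because the flattening is row-major, feature `f` of day `d` ends up at index `d * 47 + f`, so the model sees the window oldest day first as long as the rows are sorted by date.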


---
## 6. Main Prediction Function

In [12]:
def predict_llm_recommendations(stock_symbols, model, scaler):
    """
    Get ML predictions for ALL LLM stock recommendations
    
    Args:
        stock_symbols: List of stock symbols (e.g., ['RELIANCE', 'TCS', 'INFY'])
        model: Trained XGBoost model
        scaler: Fitted StandardScaler
    
    Returns:
        DataFrame with predictions for all stocks
    """
    print(f"\n{'='*80}")
    print(f"ML PREDICTIONS FOR {len(stock_symbols)} LLM-RECOMMENDED STOCKS")
    print(f"{'='*80}\n")
    
    # Fetch NIFTY 50 data once (for market context features)
    print("Fetching NIFTY 50 data...")
    nifty50_df = fetch_nifty50_data(days=250)
    
    if nifty50_df is not None:
        print(f"✅ NIFTY 50 data fetched ({len(nifty50_df)} days)\n")
    else:
        print("⚠️  NIFTY 50 data not available, using defaults\n")
    
    # Process each stock
    results = []
    
    for i, symbol in enumerate(stock_symbols, 1):
        print(f"[{i}/{len(stock_symbols)}] Processing {symbol}...")
        
        # Prepare features
        features, latest_close, _ = prepare_stock_for_prediction(symbol, nifty50_df)
        
        if features is None:
            results.append({
                'symbol': symbol,
                'status': 'FAILED',
                'direction': 'N/A',
                'confidence': None,
                'probability_up': None,
                'probability_down': None,
                'latest_close': None,
                'error': 'Insufficient data or feature engineering failed'
            })
            print(f"   ❌ {symbol}: FAILED (insufficient data)\n")
            continue
        
        # Predict
        pred = predict_stock_direction(features, model, scaler)
        
        results.append({
            'symbol': symbol,
            'status': 'SUCCESS',
            'direction': pred['direction'],
            'confidence': pred['confidence'],
            'probability_up': pred['probability_up'],
            'probability_down': pred['probability_down'],
            'latest_close': latest_close,
            'error': None
        })
        
        # Display direction with appropriate emoji
        direction_emoji = '📈' if pred['direction'] == 'UP' else '📉'
        print(f"   {direction_emoji} {symbol}: {pred['direction']} (confidence: {pred['confidence']:.1%})")
        print(f"      Prob UP: {pred['probability_up']:.1%} | Prob DOWN: {pred['probability_down']:.1%}")
        print(f"      Latest Price: ₹{latest_close:,.2f}\n")
    
    # Convert to DataFrame
    results_df = pd.DataFrame(results)
    
    # Summary
    print(f"\n{'='*80}")
    print("PREDICTION SUMMARY")
    print(f"{'='*80}")
    print(f"Total stocks:            {len(stock_symbols)}")
    print(f"Successful predictions:  {len(results_df[results_df['status'] == 'SUCCESS'])}")
    print(f"Failed predictions:      {len(results_df[results_df['status'] == 'FAILED'])}")
    
    if len(results_df[results_df['status'] == 'SUCCESS']) > 0:
        print(f"\nDirection Breakdown:")
        print(f"  📈 Predicted UP:         {len(results_df[results_df['direction'] == 'UP'])}")
        print(f"  📉 Predicted DOWN:       {len(results_df[results_df['direction'] == 'DOWN'])}")
        
        # Calculate average confidence
        avg_conf = results_df[results_df['status'] == 'SUCCESS']['confidence'].mean()
        print(f"\nAverage Confidence:      {avg_conf:.1%}")
    
    print(f"{'='*80}\n")
    
    return results_df

print("✅ predict_llm_recommendations() defined")

✅ predict_llm_recommendations() defined


---
## 7. Example Usage

In [13]:
# Example: LLM extracted these stocks from news
llm_recommendations = [
    'RELIANCE',
    'TCS',
    'INFY',
    'HDFCBANK',
    'ICICIBANK'
]

print(f"LLM Recommendations: {llm_recommendations}")

LLM Recommendations: ['RELIANCE', 'TCS', 'INFY', 'HDFCBANK', 'ICICIBANK']


In [14]:
# Get predictions for all LLM-recommended stocks
if model is not None and scaler is not None:
    results = predict_llm_recommendations(
        stock_symbols=llm_recommendations,
        model=model,
        scaler=scaler
    )
    
    print("\n📊 DETAILED RESULTS TABLE:")
    print("=" * 100)
    
    # Display formatted table
    display_cols = ['symbol', 'direction', 'confidence', 'probability_up', 'probability_down', 'latest_close', 'status']
    display_df = results[display_cols].copy()
    
    # Format columns (cast to object first so formatted strings can be stored)
    success = display_df['status'] == 'SUCCESS'
    if success.any():
        pct_cols = ['confidence', 'probability_up', 'probability_down']
        for col in pct_cols + ['latest_close']:
            display_df[col] = display_df[col].astype(object)
        for col in pct_cols:
            display_df.loc[success, col] = display_df.loc[success, col].apply(lambda x: f"{x:.1%}" if pd.notnull(x) else 'N/A')
        display_df.loc[success, 'latest_close'] = display_df.loc[success, 'latest_close'].apply(lambda x: f"₹{x:,.2f}" if pd.notnull(x) else 'N/A')
    
    print(display_df.to_string(index=False))
    print("=" * 100)
    
    # Separate UP and DOWN predictions
    up_stocks = results[results['direction'] == 'UP'].sort_values('confidence', ascending=False)
    down_stocks = results[results['direction'] == 'DOWN'].sort_values('confidence', ascending=False)
    
    if len(up_stocks) > 0:
        print("\n\n📈 STOCKS PREDICTED TO GO UP (sorted by confidence):")
        print("=" * 80)
        for _, row in up_stocks.iterrows():
            print(f"  {row['symbol']:<12} | Price: ₹{row['latest_close']:>8,.2f} | Confidence: {row['confidence']:>5.1%} | Prob UP: {row['probability_up']:.1%}")
        print("=" * 80)
    
    if len(down_stocks) > 0:
        print("\n\n📉 STOCKS PREDICTED TO GO DOWN (sorted by confidence):")
        print("=" * 80)
        for _, row in down_stocks.iterrows():
            print(f"  {row['symbol']:<12} | Price: ₹{row['latest_close']:>8,.2f} | Confidence: {row['confidence']:>5.1%} | Prob DOWN: {row['probability_down']:.1%}")
        print("=" * 80)
    
    # Recommendation based on high confidence UP predictions
    high_conf_up = results[(results['direction'] == 'UP') & (results['confidence'] >= 0.70)].sort_values('confidence', ascending=False)
    
    if len(high_conf_up) > 0:
        print("\n\n💡 HIGH CONFIDENCE UP PREDICTIONS (≥70%):")
        print("=" * 80)
        print("These stocks have the strongest bullish signals:")
        for _, row in high_conf_up.iterrows():
            print(f"  {row['symbol']:<12} | Price: ₹{row['latest_close']:>8,.2f} | Confidence: {row['confidence']:>5.1%}")
        print("=" * 80)
    else:
        print("\n\n⚠️  No stocks with ≥70% confidence for UP prediction")
        
else:
    print("⚠️  Model not loaded. Please run notebook 09 first and save the model.")


ML PREDICTIONS FOR 5 LLM-RECOMMENDED STOCKS

Fetching NIFTY 50 data...
✅ NIFTY 50 data fetched (171 days)

[1/5] Processing RELIANCE...
   📉 RELIANCE: DOWN (confidence: 52.6%)
      Prob UP: 47.4% | Prob DOWN: 52.6%
      Latest Price: ₹1,565.10

[2/5] Processing TCS...
   📉 TCS: DOWN (confidence: 70.8%)
      Prob UP: 29.2% | Prob DOWN: 70.8%
      Latest Price: ₹3,282.00

[3/5] Processing INFY...
   📉 INFY: DOWN (confidence: 81.5%)
      Prob UP: 18.5% | Prob DOWN: 81.5%
      Latest Price: ₹1,638.70

[4/5] Processing HDFCBANK...
   📈 HDFCBANK: UP (confidence: 54.9%)
      Prob UP: 54.9% | Prob DOWN: 45.1%
      Latest Price: ₹985.50

[5/5] Processing ICICIBANK...
   📉 ICICIBANK: DOWN (confidence: 55.0%)
      Prob UP: 45.0% | Prob DOWN: 55.0%
      Latest Price: ₹1,354.10


PREDICTION SUMMARY
Total stocks:            5
Successful predictions:  5
Failed predictions:      0

Direction Breakdown:
  📈 Predicted UP:         1
  📉 Predicted DOWN:       4
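
To see how the high-confidence filter in this cell treats the run above, the rounded confidences from the printed output can be re-entered by hand. This is only a sketch on the displayed values (rounded to one decimal place), not on the original `results` DataFrame.

```python
import pandas as pd

# Values copied from the printed output above (rounded as displayed)
printed = pd.DataFrame({
    'symbol': ['RELIANCE', 'TCS', 'INFY', 'HDFCBANK', 'ICICIBANK'],
    'direction': ['DOWN', 'DOWN', 'DOWN', 'UP', 'DOWN'],
    'confidence': [0.526, 0.708, 0.815, 0.549, 0.550],
})

# Same condition as the notebook: UP direction and confidence of at least 70%
high_conf_up = printed[(printed['direction'] == 'UP') & (printed['confidence'] >= 0.70)]
# For comparison: the same threshold regardless of direction
high_conf_any = printed[printed['confidence'] >= 0.70]

print(len(high_conf_up))
print(high_conf_any['symbol'].tolist())
```

On these values no stock clears the threshold in the UP direction; the only predictions at or above 70% confidence are DOWN calls.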


---
## 8. Save Results

In [15]:
# Save all predictions to CSV
if model is not None and scaler is not None:
    output_dir = Path("data/predictions")
    output_dir.mkdir(parents=True, exist_ok=True)  # also creates "data" if it is missing
    
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    output_file = output_dir / f"ml_predictions_{timestamp}.csv"
    
    results.to_csv(output_file, index=False)
    print(f"\n✅ All predictions saved to: {output_file}")


✅ All predictions saved to: data/predictions/ml_predictions_20251222_180522.csv
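
When the CSV is read back later, rows for failed stocks carry empty numeric fields, which pandas loads as NaN. The sketch below shows that round trip in a temporary directory; the one-row DataFrame and the symbol `EXAMPLECO` are placeholders, and the file name is hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Placeholder result row shaped like a failed prediction
results_sketch = pd.DataFrame({
    'symbol': ['EXAMPLECO'],
    'status': ['FAILED'],
    'confidence': [None],
})

with tempfile.TemporaryDirectory() as tmp:
    out_dir = Path(tmp) / "data" / "predictions"
    out_dir.mkdir(parents=True, exist_ok=True)  # creates intermediate folders too
    out_file = out_dir / "ml_predictions_example.csv"
    results_sketch.to_csv(out_file, index=False)
    reloaded = pd.read_csv(out_file)

# The missing confidence comes back as NaN
print(reloaded['confidence'].isna().all())
```

Filtering on `status == 'SUCCESS'` before any numeric work avoids having those NaN rows affect averages or sorts.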
