### Stock Price Prediction System
##### Accepts stock symbols like AAPL, TSLA, RELIANCE.NS, INFY.NS.
##### Uses Yahoo Finance (yfinance) to fetch historical stock data (last 3 years by default).
##### Handles multiple exchanges: US Stocks, Indian Stocks, Etc.

In [13]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import warnings
warnings.filterwarnings('ignore')

from datetime import datetime, timedelta
import yfinance as yf
import ta

#### Machine Learning Models

In [14]:
from sklearn.preprocessing import MinMaxScaler, StandardScaler
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
from sklearn.model_selection import train_test_split


#### Deep Learning Models

In [15]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout
from tensorflow.keras.callbacks import EarlyStopping

#### Visualization Libraries

In [16]:
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.express as px

#### Suppress TensorFlow warnings

In [17]:
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
tf.get_logger().setLevel('ERROR')

print("Focused Stock Prediction System - RF | LSTM | SVR")
print("=" * 60)

Focused Stock Prediction System - RF | LSTM | SVR


#### Stock Handling (Symbols & Currencies)
#### Data Fetching & Preparation
#### Feature Engineering
#### Model Training
#### Model Evaluation
#### Future Predictions (7-Day Forecast)
#### Visualization Dashboards


In [18]:
class FocusedStockPredictor:
    def __init__(self):
        self.models = {}
        self.scalers = {}
        self.results = {}
        self.lstm_scaler = None
        self.lstm_model = None
        self.currency_info = {}
        
        # Enhanced stock symbol mapping with currency info
        self.STOCK_SYMBOL_MAP = {
            # US Stocks (USD)
            'AAPL': 'AAPL', 'APPLE': 'AAPL',
            'MSFT': 'MSFT', 'MICROSOFT': 'MSFT',
            'GOOGL': 'GOOGL', 'GOOGLE': 'GOOGL', 'ALPHABET': 'GOOGL',
            'AMZN': 'AMZN', 'AMAZON': 'AMZN',
            'TSLA': 'TSLA', 'TESLA': 'TSLA',
            'META': 'META', 'FACEBOOK': 'META',
            'NFLX': 'NFLX', 'NETFLIX': 'NFLX',
            'NVDA': 'NVDA', 'NVIDIA': 'NVDA',
            'JPM': 'JPM', 'JPMORGAN': 'JPM',
            'V': 'V', 'VISA': 'V',
            'JNJ': 'JNJ', 'JOHNSON': 'JNJ',
            'WMT': 'WMT', 'WALMART': 'WMT',
            'PG': 'PG', 'PROCTER': 'PG',
            'UNH': 'UNH', 'UNITEDHEALTH': 'UNH',
            'HD': 'HD', 'HOMEDEPOT': 'HD',
            'MA': 'MA', 'MASTERCARD': 'MA',
            'BAC': 'BAC', 'BANKOFAMERICA': 'BAC',
            'DIS': 'DIS', 'DISNEY': 'DIS',
            'ADBE': 'ADBE', 'ADOBE': 'ADBE',
            'CRM': 'CRM', 'SALESFORCE': 'CRM',
            'PYPL': 'PYPL', 'PAYPAL': 'PYPL',

            # Indian Stocks (NSE) - INR
            'RELIANCE': 'RELIANCE.NS', 
            'TCS': 'TCS.NS', 
            'INFY': 'INFY.NS', 'INFOSYS': 'INFY.NS',
            'HDFC': 'HDFCBANK.NS', 'HDFCBANK': 'HDFCBANK.NS',
            'ICICI': 'ICICIBANK.NS', 'ICICIBANK': 'ICICIBANK.NS',
            'SBIN': 'SBIN.NS', 'SBI': 'SBIN.NS',
            'WIPRO': 'WIPRO.NS',
            'LT': 'LT.NS', 'L&T': 'LT.NS', 'LARSEN': 'LT.NS',
            'AXISBANK': 'AXISBANK.NS',
            'KOTAKBANK': 'KOTAKBANK.NS', 'KOTAK': 'KOTAKBANK.NS',
            'BHARTI': 'BHARTIARTL.NS', 'AIRTEL': 'BHARTIARTL.NS',
            'ITC': 'ITC.NS',
            'HINDUNILVR': 'HINDUNILVR.NS', 'HUL': 'HINDUNILVR.NS',
            'MARUTI': 'MARUTI.NS',
            'ASIANPAINT': 'ASIANPAINT.NS',
            'BAJFINANCE': 'BAJFINANCE.NS',
            'TITAN': 'TITAN.NS',
            'ADANIPORTS': 'ADANIPORTS.NS',
            'HCLTECH': 'HCLTECH.NS',
            'ONGC': 'ONGC.NS',
            'NTPC': 'NTPC.NS',
            'COALINDIA': 'COALINDIA.NS',

            # International Stocks
            'SAMSUNG': '005930.KS',  # Korean Won (KRW)
            'TOYOTA': 'TM',  # USD
            'SONY': 'SONY',  # USD
            'SHELL': 'SHEL',  # USD
            'BP': 'BP',  # USD
            'TOTAL': 'TTE',  # USD
            'NESTLE': 'NESN.SW',  # Swiss Franc (CHF)
            'ASML': 'ASML',  # USD
            'SAP': 'SAP',  # USD
            'BABA': 'BABA', 'ALIBABA': 'BABA'  # USD
        }
        
        # Currency mapping
        self.CURRENCY_MAP = {
            # US stocks and most international
            '.NS': '₹',  # Indian Rupee
            '.KS': '₩',  # Korean Won
            '.SW': 'CHF ',  # Swiss Franc
            '.L': '£',    # British Pound
            '.DE': '€',   # Euro (Germany)
            '.PA': '€',   # Euro (France)
            '.AS': '€',   # Euro (Netherlands)
            '.BR': '€',   # Euro (Belgium)
            '.MI': '€',   # Euro (Italy)
            '.MC': '€',   # Euro (Spain)
            '.VI': '€',   # Euro (Austria)
            '.ST': 'kr',  # Swedish Krona
            '.CO': 'kr',  # Danish Krone
            '.OL': 'kr',  # Norwegian Krone
            '.HE': '€',   # Euro (Finland)
            '.TL': '€',   # Euro (Portugal)
            '.IR': '€',   # Euro (Ireland)
            '.SG': 'S$',  # Singapore Dollar
            '.HK': 'HK$', # Hong Kong Dollar
            '.TW': 'NT$', # Taiwan Dollar
            '.T': '¥',    # Japanese Yen
            '.AX': 'A$',  # Australian Dollar
            '.NZ': 'NZ$', # New Zealand Dollar
            '.V': 'C$',   # Canadian Dollar
            '.TO': 'C$',  # Canadian Dollar
            '.SA': 'SR',  # Saudi Riyal
            '.SR': 'SAR', # Saudi Riyal
        }
        
        # Default currency for symbols without specific mapping
        self.DEFAULT_CURRENCY = '$'
    
    def get_currency_symbol(self, symbol):
        """Get currency symbol based on stock symbol"""
        # Check for country-specific suffixes
        for suffix, currency in self.CURRENCY_MAP.items():
            if symbol.endswith(suffix):
                return currency
        
        # For US stocks and others without specific suffixes, default to USD
        return self.DEFAULT_CURRENCY
    
    def format_currency(self, value, currency_symbol):
        """Format currency value with appropriate symbol"""
        if currency_symbol in ['₹', '₩', '¥']:
            # These symbols typically come before the number
            return f"{currency_symbol}{value:,.2f}"
        elif currency_symbol in ['$', '£', '€', 'C$', 'A$', 'NZ$', 'S$', 'HK$', 'NT$']:
            return f"{currency_symbol}{value:,.2f}"
        else:
            # For others like 'kr', 'CHF', 'SR' - symbol after the number
            return f"{value:,.2f} {currency_symbol}"
    
    def get_correct_symbol(self, symbol_input):
        """Convert user input to correct Yahoo Finance symbol"""
        symbol_input = symbol_input.upper().strip()
        
        # Check if input is in our mapping
        if symbol_input in self.STOCK_SYMBOL_MAP:
            return self.STOCK_SYMBOL_MAP[symbol_input]
        
        # Check if it's already a valid symbol format
        if '.' in symbol_input or symbol_input in ['V', 'MA']:  # Handle single letter symbols
            return symbol_input
        
        # Try with .NS for Indian stocks (common pattern)
        if symbol_input + '.NS' in self.STOCK_SYMBOL_MAP.values():
            return symbol_input + '.NS'
        
        # If no mapping found, return as-is (user might know exact symbol)
        return symbol_input
    
    def get_symbol_suggestions(self, user_input):
        """Get symbol suggestions based on user input"""
        user_input = user_input.upper().strip()
        suggestions = []
        
        for name, symbol in self.STOCK_SYMBOL_MAP.items():
            if user_input in name or user_input in symbol:
                suggestions.append((name, symbol))
        
        return suggestions[:10]  # Return top 10 suggestions
    
    def fetch_and_prepare_data(self, symbol, period='3y'):
        
        # Convert to correct symbol format
        correct_symbol = self.get_correct_symbol(symbol)
        print(f"Fetching data for {symbol} -> {correct_symbol}...")
        
        try:
            # Fetch data
            df = yf.download(correct_symbol, period=period, progress=False)
            
            if df.empty:
                # Try alternative approach
                print("Trying alternative data fetch method...")
                ticker = yf.Ticker(correct_symbol)
                df = ticker.history(period=period)
                
            if df.empty:
                raise ValueError(f"No data found for symbol: {symbol} (tried: {correct_symbol})")
            
            # Handle MultiIndex columns
            if isinstance(df.columns, pd.MultiIndex):
                df = df.xs(correct_symbol, axis=1, level=1)
            
            print(f"Fetched {len(df)} records from {df.index[0].date()} to {df.index[-1].date()}")
            
            # Store currency info for this symbol
            self.currency_info[symbol] = self.get_currency_symbol(correct_symbol)
            print(f"Currency detected: {self.currency_info[symbol]}")
            
            # Add technical indicators
            print("Adding technical indicators...")
            
            # Moving averages
            df['SMA_10'] = df['Close'].rolling(10).mean()
            df['SMA_20'] = df['Close'].rolling(20).mean()
            df['SMA_50'] = df['Close'].rolling(50).mean()
            df['EMA_12'] = df['Close'].ewm(span=12).mean()
            df['EMA_26'] = df['Close'].ewm(span=26).mean()
            
            # Technical indicators using ta library
            close_series = df['Close'].squeeze()
            high_series = df['High'].squeeze()
            low_series = df['Low'].squeeze()
            volume_series = df['Volume'].squeeze()
            
            df['RSI'] = ta.momentum.rsi(close_series, window=14)
            
            # MACD
            macd = ta.trend.MACD(close_series)
            df['MACD'] = macd.macd()
            df['MACD_Signal'] = macd.macd_signal()
            
            # Bollinger Bands
            bb = ta.volatility.BollingerBands(close_series, window=20, window_dev=2)
            df['BB_High'] = bb.bollinger_hband()
            df['BB_Low'] = bb.bollinger_lband()
            df['BB_Mid'] = bb.bollinger_mavg()
            
            # Additional features
            df['Price_Change'] = df['Close'].pct_change()
            df['High_Low_Ratio'] = df['High'] / df['Low']
            df['Volume_MA'] = df['Volume'].rolling(20).mean()
            
            # Lagged features
            for lag in [1, 2, 3, 5]:
                df[f'Close_Lag_{lag}'] = df['Close'].shift(lag)
            
            # Remove NaN values
            df = df.dropna()
            
            if len(df) < 100:
                raise ValueError(f"Insufficient data after cleaning: {len(df)} records")
            
            print(f"Technical indicators added. Final dataset: {len(df)} records")
            return df
            
        except Exception as e:
            # Provide helpful error message with suggestions
            suggestions = self.get_symbol_suggestions(symbol)
            if suggestions:
                print(f"Did you mean one of these?")
                for i, (name, sym) in enumerate(suggestions[:5], 1):
                    print(f"   {i}. {name} -> {sym}")
            raise ValueError(f"Error fetching data for {symbol}: {str(e)}")
    
    def prepare_features_for_ml(self, df, target_col='Close', n_ahead=1):
        
        print(f"Preparing features for ML models (predicting {n_ahead} day ahead)...")
        
        # Select features
        feature_cols = [
            'Close', 'Volume', 'SMA_10', 'SMA_20', 'SMA_50', 'EMA_12', 'EMA_26',
            'RSI', 'MACD', 'MACD_Signal', 'BB_High', 'BB_Low', 'BB_Mid',
            'Price_Change', 'High_Low_Ratio', 'Volume_MA',
            'Close_Lag_1', 'Close_Lag_2', 'Close_Lag_3', 'Close_Lag_5'
        ]
        
        # Ensure all features exist
        available_features = [col for col in feature_cols if col in df.columns]
        
        # Create target variable
        df_ml = df.copy()
        df_ml['Target'] = df_ml[target_col].shift(-n_ahead)
        df_ml = df_ml.dropna()
        
        X = df_ml[available_features]
        y = df_ml['Target']
        
        print(f"Features: {len(available_features)}, Samples: {len(X)}")
        return X, y, available_features
    
    def prepare_lstm_data(self, df, sequence_length=60):

        print(f"Preparing LSTM data with sequence length: {sequence_length}")
        
        # Use Close and Volume for LSTM
        lstm_features = ['Close', 'Volume']
        lstm_data = df[lstm_features].values
        
        # Scale the data
        self.lstm_scaler = MinMaxScaler()
        scaled_data = self.lstm_scaler.fit_transform(lstm_data)
        
        # Create sequences
        X, y = [], []
        for i in range(sequence_length, len(scaled_data)):
            X.append(scaled_data[i-sequence_length:i])
            y.append(scaled_data[i, 0])  # Predict Close price (index 0)
        
        X, y = np.array(X), np.array(y)
        
        print(f"LSTM sequences created: {len(X)} samples")
        return X, y
    
    def train_models(self, X_ml, y_ml, X_lstm, y_lstm):
    
        print("Training models...")
        
        # Split ML data
        X_train_ml, X_test_ml, y_train_ml, y_test_ml = train_test_split(
            X_ml, y_ml, test_size=0.2, shuffle=False, random_state=42
        )
        
        # Split LSTM data
        split_idx = int(len(X_lstm) * 0.8)
        X_train_lstm = X_lstm[:split_idx]
        X_test_lstm = X_lstm[split_idx:]
        y_train_lstm = y_lstm[:split_idx]
        y_test_lstm = y_lstm[split_idx:]
        
        print(f"ML Training: {len(X_train_ml)}, Testing: {len(X_test_ml)}")
        print(f"LSTM Training: {len(X_train_lstm)}, Testing: {len(X_test_lstm)}")
        
        # === Random Forest ===
        print("Training Random Forest...")
        rf_scaler = StandardScaler()
        X_train_rf_scaled = rf_scaler.fit_transform(X_train_ml)
        X_test_rf_scaled = rf_scaler.transform(X_test_ml)
        
        rf_model = RandomForestRegressor(
            n_estimators=200, 
            max_depth=20, 
            min_samples_split=5,
            random_state=42
        )
        rf_model.fit(X_train_rf_scaled, y_train_ml)
        rf_pred = rf_model.predict(X_test_rf_scaled)
        
        self.models['Random_Forest'] = rf_model
        self.scalers['Random_Forest'] = rf_scaler
        
        # === SVR ===
        print("Training SVR...")
        svr_scaler = StandardScaler()
        X_train_svr_scaled = svr_scaler.fit_transform(X_train_ml)
        X_test_svr_scaled = svr_scaler.transform(X_test_ml)
        
        svr_model = SVR(kernel='rbf', C=100, gamma='scale', epsilon=0.01)
        svr_model.fit(X_train_svr_scaled, y_train_ml)
        svr_pred = svr_model.predict(X_test_svr_scaled)
        
        self.models['SVR'] = svr_model
        self.scalers['SVR'] = svr_scaler
        
        # === LSTM ===
        print("Training LSTM...")
        
        # Build LSTM model
        self.lstm_model = Sequential([
            LSTM(100, return_sequences=True, input_shape=(X_train_lstm.shape[1], X_train_lstm.shape[2])),
            Dropout(0.2),
            LSTM(50, return_sequences=False),
            Dropout(0.2),
            Dense(25),
            Dense(1)
        ])
        
        self.lstm_model.compile(optimizer='adam', loss='mse', metrics=['mae'])
        
        early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
        
        history = self.lstm_model.fit(
            X_train_lstm, y_train_lstm,
            validation_data=(X_test_lstm, y_test_lstm),
            epochs=100,
            batch_size=32,
            callbacks=[early_stopping],
            verbose=0
        )
        
        lstm_pred_scaled = self.lstm_model.predict(X_test_lstm, verbose=0)
        
        # Inverse transform LSTM predictions
        lstm_pred_full = np.zeros((len(lstm_pred_scaled), 2))
        lstm_pred_full[:, 0] = lstm_pred_scaled.flatten()
        lstm_pred = self.lstm_scaler.inverse_transform(lstm_pred_full)[:, 0]
        
        # Inverse transform actual values for LSTM
        y_test_lstm_full = np.zeros((len(y_test_lstm), 2))
        y_test_lstm_full[:, 0] = y_test_lstm
        y_test_lstm_actual = self.lstm_scaler.inverse_transform(y_test_lstm_full)[:, 0]
        
        # Calculate metrics for all models
        self.results = {
            'Random_Forest': self._calculate_metrics(y_test_ml, rf_pred, 'Random Forest'),
            'SVR': self._calculate_metrics(y_test_ml, svr_pred, 'SVR'),
            'LSTM': self._calculate_metrics(y_test_lstm_actual, lstm_pred, 'LSTM')
        }
        
        # Store predictions and actual values
        self.results['Random_Forest']['predictions'] = rf_pred
        self.results['Random_Forest']['actual'] = y_test_ml.values
        
        self.results['SVR']['predictions'] = svr_pred
        self.results['SVR']['actual'] = y_test_ml.values
        
        self.results['LSTM']['predictions'] = lstm_pred
        self.results['LSTM']['actual'] = y_test_lstm_actual
        
        print("All models trained successfully!")
        
    def _calculate_metrics(self, y_true, y_pred, model_name):
        mae = mean_absolute_error(y_true, y_pred)
        mse = mean_squared_error(y_true, y_pred)
        rmse = np.sqrt(mse)
        r2 = r2_score(y_true, y_pred)
        
        # Directional accuracy
        if len(y_true) > 1:
            y_true_direction = np.diff(y_true) > 0
            y_pred_direction = np.diff(y_pred) > 0
            directional_accuracy = np.mean(y_true_direction == y_pred_direction) * 100
        else:
            directional_accuracy = 0
        
        print(f"{model_name:12} - MAE: {mae:6.2f} | RMSE: {rmse:6.2f} | R²: {r2:6.3f} | Dir.Acc: {directional_accuracy:5.1f}%")
        
        return {
            'mae': mae,
            'mse': mse,
            'rmse': rmse,
            'r2': r2,
            'directional_accuracy': directional_accuracy
        }
    
    def predict_future_7_days(self, df, feature_cols):
        
        print("Generating 7-day future predictions...")
        
        future_predictions = {'Random_Forest': [], 'SVR': [], 'LSTM': []}
        
        # Generate future dates (excluding weekends)
        last_date = df.index[-1]
        future_dates = []
        current_date = last_date + timedelta(days=1)
        
        while len(future_dates) < 7:
            if current_date.weekday() < 5:  # Monday to Friday
                future_dates.append(current_date)
            current_date += timedelta(days=1)
        
        # === Random Forest & SVR Predictions ===
        current_data = df.iloc[-1:][feature_cols].copy()
        
        for day in range(7):
            # Scale current data
            rf_scaled = self.scalers['Random_Forest'].transform(current_data)
            svr_scaled = self.scalers['SVR'].transform(current_data)
            
            # Make predictions
            rf_pred = self.models['Random_Forest'].predict(rf_scaled)[0]
            svr_pred = self.models['SVR'].predict(svr_scaled)[0]
            
            future_predictions['Random_Forest'].append(rf_pred)
            future_predictions['SVR'].append(svr_pred)
            
            # Update current_data for next prediction
            if day < 6:
                # Update Close price and some dependent features
                current_data.iloc[0, current_data.columns.get_loc('Close')] = rf_pred
                
                # Update some lagged features (simplified)
                if 'Close_Lag_1' in current_data.columns:
                    current_data.iloc[0, current_data.columns.get_loc('Close_Lag_1')] = rf_pred
        
        # === LSTM Predictions ===
        lstm_features = ['Close', 'Volume']
        last_sequence = df[lstm_features].values[-60:]  # Last 60 days
        last_sequence_scaled = self.lstm_scaler.transform(last_sequence)
        
        current_sequence = last_sequence_scaled.copy()
        
        for day in range(7):
            # Reshape for LSTM
            X_pred = current_sequence.reshape(1, 60, 2)
            
            # Make prediction
            pred_scaled = self.lstm_model.predict(X_pred, verbose=0)[0, 0]
            
            # Inverse transform
            pred_full = np.zeros((1, 2))
            pred_full[0, 0] = pred_scaled
            pred_actual = self.lstm_scaler.inverse_transform(pred_full)[0, 0]
            
            future_predictions['LSTM'].append(pred_actual)
            
            # Update sequence for next prediction
            if day < 6:
                new_row = np.array([[pred_scaled, current_sequence[-1, 1]]])  # Use last volume
                current_sequence = np.vstack([current_sequence[1:], new_row])
        
        return future_dates, future_predictions
    
    def create_accuracy_comparison_dashboard(self):
    
        print("Creating accuracy comparison dashboard...")
        
        # Prepare comparison data
        comparison_data = []
        for model_name, metrics in self.results.items():
            comparison_data.append({
                'Model': model_name,
                'MAE': metrics['mae'],
                'RMSE': metrics['rmse'],
                'R²': metrics['r2'],
                'Directional_Accuracy': metrics['directional_accuracy']
            })
        
        df_comp = pd.DataFrame(comparison_data)
        
        # Create subplots
        fig = make_subplots(
            rows=2, cols=2,
            subplot_titles=('Mean Absolute Error (Lower = Better)', 
                          'R² Score (Higher = Better)',
                          'Directional Accuracy (%)', 
                          'Performance Ranking'),
            specs=[[{"type": "bar"}, {"type": "bar"}],
                   [{"type": "bar"}, {"type": "table"}]]
        )
        
        # Colors for each model
        colors = ['#FF6B6B', '#4ECDC4', '#45B7D1']
        
        # MAE comparison
        fig.add_trace(
            go.Bar(x=df_comp['Model'], y=df_comp['MAE'],
                   name='MAE', marker_color=colors),
            row=1, col=1
        )
        
        # R² comparison
        fig.add_trace(
            go.Bar(x=df_comp['Model'], y=df_comp['R²'],
                   name='R²', marker_color=colors),
            row=1, col=2
        )
        
        # Directional accuracy
        fig.add_trace(
            go.Bar(x=df_comp['Model'], y=df_comp['Directional_Accuracy'],
                   name='Directional Accuracy', marker_color=colors),
            row=2, col=1
        )
        
        # Ranking table
        df_comp_sorted = df_comp.sort_values('MAE')  # Sort by MAE (lower is better)
        
        # Assign rankings
        rankings = ['1st', '2nd', '3rd']
        
        fig.add_trace(
            go.Table(
                header=dict(
                    values=['Rank', 'Model', 'MAE', 'R²', 'Dir.Acc%'],
                    fill_color='#2C3E50',
                    font=dict(color='white', size=14, family="Arial Black"),
                    align='center',
                    height=40
                ),
                cells=dict(
                    values=[
                        rankings,
                        df_comp_sorted['Model'].tolist(),
                        [f"{x:.2f}" for x in df_comp_sorted['MAE']],
                        [f"{x:.3f}" for x in df_comp_sorted['R²']],
                        [f"{x:.1f}%" for x in df_comp_sorted['Directional_Accuracy']]
                    ],
                    fill_color=[['#FFD700', '#C0C0C0', '#CD7F32']],  # Gold, Silver, Bronze
                    font=dict(size=12),
                    align='center',
                    height=35
                )
            ),
            row=2, col=2
        )
        
        fig.update_layout(
            title_text="Model Performance Comparison Dashboard",
            title_x=0.5,
            title_font_size=20,
            height=700,
            showlegend=False,
            template='plotly_white'
        )
        
        fig.show()
        
        return df_comp_sorted
    
    def create_predictions_comparison_chart(self, symbol, current_price, future_dates, future_predictions):
       
        print("Creating predictions comparison chart...")
        
        currency_symbol = self.currency_info.get(symbol, self.DEFAULT_CURRENCY)
        
        fig = go.Figure()
        
        # Add current price as starting point
        all_dates = [future_dates[0] - timedelta(days=1)] + future_dates
        
        # Colors for each model
        colors = {'Random_Forest': '#FF6B6B', 'SVR': '#4ECDC4', 'LSTM': '#45B7D1'}
        
        for model_name, predictions in future_predictions.items():
            all_prices = [current_price] + predictions
            
            fig.add_trace(
                go.Scatter(
                    x=all_dates,
                    y=all_prices,
                    mode='lines+markers',
                    name=f'{model_name}',
                    line=dict(color=colors[model_name], width=3),
                    marker=dict(size=8)
                )
            )
        
        # Add current price marker
        fig.add_trace(
            go.Scatter(
                x=[all_dates[0]],
                y=[current_price],
                mode='markers',
                name='Current Price',
                marker=dict(size=15, color='black', symbol='star')
            )
        )
        
        # Add horizontal line for current price reference
        current_price_formatted = self.format_currency(current_price, currency_symbol)
        fig.add_hline(
            y=current_price,
            line_dash="dash",
            line_color="gray",
            annotation_text=f"Current: {current_price_formatted}"
        )
        
        fig.update_layout(
            title=f'{symbol} - 7-Day Price Predictions Comparison',
            xaxis_title='Date',
            yaxis_title=f'Price ({currency_symbol})',
            template='plotly_white',
            height=600,
            hovermode='x unified',
            legend=dict(
                orientation="h",
                yanchor="bottom",
                y=1.02,
                xanchor="right",
                x=1
            )
        )
        
        fig.show()
        
    def create_model_predictions_vs_actual(self, df, symbol):
        
        print("Creating actual vs predicted comparison...")
        
        currency_symbol = self.currency_info.get(symbol, self.DEFAULT_CURRENCY)
        
        fig = make_subplots(
            rows=1, cols=3,
            subplot_titles=('Random Forest', 'SVR', 'LSTM')
        )
        
        colors = ['#FF6B6B', '#4ECDC4', '#45B7D1']
        
        # Calculate test date ranges
        total_samples = len(df)
        test_size_ml = int(total_samples * 0.2)
        
        # ML models use last 20% of data
        ml_test_dates = df.index[-test_size_ml:]
        
        # LSTM uses last 20% of sequences (after removing first 60 for sequence creation)
        sequence_length = 60
        available_sequences = total_samples - sequence_length
        test_size_lstm = int(available_sequences * 0.2)
        lstm_test_dates = df.index[sequence_length:][-test_size_lstm:]
        
        for i, (model_name, result) in enumerate(self.results.items()):
            actual = result['actual']
            predicted = result['predictions']
            
            # Assign dates based on model type
            if model_name in ['Random_Forest', 'SVR']:
                dates = ml_test_dates[:len(actual)]
            else:  # LSTM
                dates = lstm_test_dates[:len(actual)]
            
            # Ensure we have matching lengths
            min_length = min(len(dates), len(actual), len(predicted))
            dates = dates[:min_length]
            actual = actual[:min_length]
            predicted = predicted[:min_length]
            
            # Add traces
            fig.add_trace(
                go.Scatter(
                    x=dates, 
                    y=actual, 
                    name=f'{model_name} Actual',
                    line=dict(color='black', width=3),
                    hovertemplate=f'Date: %{{x|%Y-%m-%d}}<br>Actual Price: {currency_symbol}%{{y:.2f}}<extra></extra>',
                    legendgroup=model_name
                ),
                row=1, col=i+1
            )
            
            fig.add_trace(
                go.Scatter(
                    x=dates, 
                    y=predicted, 
                    name=f'{model_name} Predicted',
                    line=dict(color=colors[i], width=2, dash='dot'),
                    hovertemplate=f'Date: %{{x|%Y-%m-%d}}<br>Predicted Price: {currency_symbol}%{{y:.2f}}<extra></extra>',
                    legendgroup=model_name
                ),
                row=1, col=i+1
            )
            
            # Update subplot titles with date range
            if len(dates) > 0:
                start_date = dates[0].strftime('%d %b %Y')
                end_date = dates[-1].strftime('%d %b %Y')
                fig.layout.annotations[i].text = f'{model_name}<br>{start_date} to {end_date}'
        
        fig.update_layout(
            title='Model Predictions vs Actual Prices',
            height=500,
            template='plotly_white',
            showlegend=True,
            hovermode='x unified',
            legend=dict(
                orientation="h",
                yanchor="bottom",
                y=1.02,
                xanchor="center",
                x=0.5
            )
        )
        
        # Update axes
        fig.update_xaxes(
            title_text="Date",
            tickformat='%b %d',
            tickangle=45,
            tickmode='auto',
            nticks=6
        )
        
        fig.update_yaxes(
            title_text=f"Price ({currency_symbol})",
        )
        
        fig.show()

### Main Function

In [19]:
def main():
    
    print("Welcome to Focused Stock Prediction System!")
    print("Models: Random Forest | LSTM | SVR")
    print("Supports: US, Indian & International Stocks")
    print("=" * 60)
    
    # Get user input
    symbol = input("Enter stock symbol (e.g., AAPL, INFOSYS, NVDA, TCS): ").strip().upper()
    
    if not symbol:
        symbol = "AAPL"  # Default
        print(f"Using default symbol: {symbol}")
    
    try:
        # Initialize predictor
        predictor = FocusedStockPredictor()
        
        # Get symbol suggestions if needed
        suggestions = predictor.get_symbol_suggestions(symbol)
        if suggestions and symbol not in [s[1] for s in suggestions]:
            print(f"Did you mean one of these?")
            for i, (name, sym) in enumerate(suggestions[:5], 1):
                print(f"   {i}. {name} -> {sym}")
        
        # Step 1: Fetch and prepare data
        df = predictor.fetch_and_prepare_data(symbol)
        current_price = float(df['Close'].iloc[-1])
        
        currency_symbol = predictor.currency_info.get(symbol, predictor.DEFAULT_CURRENCY)
        current_price_formatted = predictor.format_currency(current_price, currency_symbol)
        
        print(f"Current {symbol} Price: {current_price_formatted}")
        
        # Step 2: Prepare features
        X_ml, y_ml, feature_cols = predictor.prepare_features_for_ml(df)
        X_lstm, y_lstm = predictor.prepare_lstm_data(df)
        
        # Step 3: Train models
        predictor.train_models(X_ml, y_ml, X_lstm, y_lstm)
        
        # Step 4: Generate 7-day predictions
        future_dates, future_predictions = predictor.predict_future_7_days(df, feature_cols)
        
        # Step 5: Create visualizations
        comparison_df = predictor.create_accuracy_comparison_dashboard()
        predictor.create_predictions_comparison_chart(symbol, current_price, future_dates, future_predictions)
        predictor.create_model_predictions_vs_actual(df, symbol)
        
        # Step 6: Print detailed results
        print("\n" + "="*80)
        print("DETAILED RESULTS SUMMARY")
        print("="*80)
        
        print(f"Stock: {symbol}")
        print(f"Current Price: {current_price_formatted}")
        print(f"Currency: {currency_symbol}")
        print(f"Analysis Date: {datetime.now().strftime('%Y-%m-%d %H:%M')}")
        
        print(f"MODEL ACCURACY RANKING:")
        for i, (_, row) in enumerate(comparison_df.iterrows(), 1):
            rank_symbol = "1st" if i == 1 else "2nd" if i == 2 else "3rd"
            print(f"{rank_symbol:3} {row['Model']:12} - MAE: {currency_symbol}{row['MAE']:6.2f} | R²: {row['R²']:6.3f} | Dir.Acc: {row['Directional_Accuracy']:5.1f}%")
        
        print(f"7-DAY FUTURE PREDICTIONS:")
        print("-" * 60)
        
        for i, date in enumerate(future_dates):
            rf_pred = future_predictions['Random_Forest'][i]
            svr_pred = future_predictions['SVR'][i]
            lstm_pred = future_predictions['LSTM'][i]
            
            rf_change = ((rf_pred - current_price) / current_price) * 100
            svr_change = ((svr_pred - current_price) / current_price) * 100
            lstm_change = ((lstm_pred - current_price) / current_price) * 100
            
            rf_formatted = predictor.format_currency(rf_pred, currency_symbol)
            svr_formatted = predictor.format_currency(svr_pred, currency_symbol)
            lstm_formatted = predictor.format_currency(lstm_pred, currency_symbol)
            
            print(f"{date.strftime('%Y-%m-%d (%a)')}:")
            print(f"RF: {rf_formatted:>12} ({rf_change:+5.1f}%)")
            print(f"SVR: {svr_formatted:>12} ({svr_change:+5.1f}%)")
            print(f"LSTM: {lstm_formatted:>12} ({lstm_change:+5.1f}%)")
            print()
        
        # Average predictions
        rf_avg = np.mean(future_predictions['Random_Forest'])
        svr_avg = np.mean(future_predictions['SVR'])
        lstm_avg = np.mean(future_predictions['LSTM'])
        
        rf_avg_formatted = predictor.format_currency(rf_avg, currency_symbol)
        svr_avg_formatted = predictor.format_currency(svr_avg, currency_symbol)
        lstm_avg_formatted = predictor.format_currency(lstm_avg, currency_symbol)
        
        print(f"7-DAY AVERAGE PREDICTIONS:")
        print(f"Random Forest: {rf_avg_formatted} ({((rf_avg-current_price)/current_price)*100:+.1f}%)")
        print(f"SVR: {svr_avg_formatted} ({((svr_avg-current_price)/current_price)*100:+.1f}%)")
        print(f"LSTM: {lstm_avg_formatted} ({((lstm_avg-current_price)/current_price)*100:+.1f}%)")
        
        print(f"Analysis completed successfully for {symbol}!")
        
    except Exception as e:
        print(f"Error: {str(e)}")
        print("TIPS:")
        print("• For Indian stocks, use company names like 'INFOSYS', 'TCS', 'RELIANCE'")
        print("• For US stocks, use symbols like 'AAPL', 'TSLA', 'NVDA'")
        print("• Make sure the stock is actively traded")
        import traceback
        traceback.print_exc()

if __name__ == "__main__":
    main()

Welcome to Focused Stock Prediction System!
Models: Random Forest | LSTM | SVR
Supports: US, Indian & International Stocks
Did you mean one of these?
   1. INFOSYS -> INFY.NS
Fetching data for INFOSYS -> INFY.NS...
Fetched 743 records from 2022-10-03 to 2025-10-03
Currency detected: ₹
Adding technical indicators...
Technical indicators added. Final dataset: 694 records
Current INFOSYS Price: ₹1,441.50
Preparing features for ML models (predicting 1 day ahead)...
Features: 20, Samples: 693
Preparing LSTM data with sequence length: 60
LSTM sequences created: 634 samples
Training models...
ML Training: 554, Testing: 139
LSTM Training: 507, Testing: 127
Training Random Forest...
Training SVR...
Training LSTM...
Random Forest - MAE:  22.39 | RMSE:  28.88 | R²:  0.815 | Dir.Acc:  43.5%
SVR          - MAE:  19.33 | RMSE:  28.99 | R²:  0.814 | Dir.Acc:  49.3%
LSTM         - MAE:  20.23 | RMSE:  27.47 | R²:  0.839 | Dir.Acc:  53.2%
All models trained successfully!
Generating 7-day future predict

Creating predictions comparison chart...


Creating actual vs predicted comparison...



DETAILED RESULTS SUMMARY
Stock: INFOSYS
Current Price: ₹1,441.50
Currency: ₹
Analysis Date: 2025-10-03 09:51
MODEL ACCURACY RANKING:
1st SVR          - MAE: ₹ 19.33 | R²:  0.814 | Dir.Acc:  49.3%
2nd LSTM         - MAE: ₹ 20.23 | R²:  0.839 | Dir.Acc:  53.2%
3rd Random_Forest - MAE: ₹ 22.39 | R²:  0.815 | Dir.Acc:  43.5%
7-DAY FUTURE PREDICTIONS:
------------------------------------------------------------
2025-10-06 (Mon):
RF:    ₹1,434.17 ( -0.5%)
SVR:    ₹1,448.83 ( +0.5%)
LSTM:    ₹1,446.94 ( +0.4%)

2025-10-07 (Tue):
RF:    ₹1,427.63 ( -1.0%)
SVR:    ₹1,446.27 ( +0.3%)
LSTM:    ₹1,449.93 ( +0.6%)

2025-10-08 (Wed):
RF:    ₹1,424.46 ( -1.2%)
SVR:    ₹1,444.51 ( +0.2%)
LSTM:    ₹1,452.47 ( +0.8%)

2025-10-09 (Thu):
RF:    ₹1,418.73 ( -1.6%)
SVR:    ₹1,443.67 ( +0.2%)
LSTM:    ₹1,454.29 ( +0.9%)

2025-10-10 (Fri):
RF:    ₹1,413.39 ( -1.9%)
SVR:    ₹1,442.14 ( +0.0%)
LSTM:    ₹1,455.61 ( +1.0%)

2025-10-13 (Mon):
RF:    ₹1,410.57 ( -2.1%)
SVR:    ₹1,440.73 ( -0.1%)
LSTM:    ₹1,456.71