# Stock Market Prediction Model Prototyping

This notebook focuses on prototyping different machine learning models for stock price prediction. We'll implement and compare various approaches, from simple baseline models to more sophisticated deep learning architectures.

**Contents:**
1. Data Loading and Preparation
2. Baseline Models
3. Deep Learning Models
4. Model Comparison
5. Ablation Studies

In [1]:
# Import necessary libraries
import os
import sys
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import time

# Machine learning libraries
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.model_selection import train_test_split, TimeSeriesSplit
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score
import statsmodels.api as sm
import tensorflow as tf
from tensorflow.keras.models import Sequential, Model
from tensorflow.keras.layers import Dense, LSTM, Dropout, Input, MultiHeadAttention
from tensorflow.keras.layers import LayerNormalization, GlobalAveragePooling1D
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

# Add the src directory to the path so we can import our modules
sys.path.append(os.path.abspath('../src'))

# Set display options
pd.set_option('display.max_columns', None)
pd.set_option('display.float_format', '{:.4f}'.format)

# Set plotting style
plt.style.use('seaborn-v0_8-whitegrid')
sns.set_palette('viridis')
plt.rcParams['figure.figsize'] = (12, 6)

ModuleNotFoundError: No module named 'tensorflow'

## 1. Data Loading and Preparation

We'll load processed stock data that has already been prepared with feature engineering.

In [None]:
# Data paths
processed_data_path = '../data/processed/'

# List available processed data files
npz_files = [f for f in os.listdir(processed_data_path) if f.endswith('_ml_ready.npz')]
print(f"Found {len(npz_files)} processed data files.")

# If no processed files are found, we might need to run data preparation
if len(npz_files) == 0:
    print("No processed data files found. You may need to run the data preprocessing script first.")
    print("Run: python -m src.preprocess")
else:
    # Select a stock symbol for prototyping (e.g., AAPL if available)
    preferred_symbols = ['AAPL', 'MSFT', 'GOOG', 'AMZN']
    available_symbols = [os.path.splitext(f)[0].replace('_ml_ready', '') for f in npz_files]
    
    # Match against the list itself; a substring test on str(list) would let 'GOOG' match 'GOOGL'
    selected_symbol = next((symbol for symbol in preferred_symbols if symbol in available_symbols), available_symbols[0])
    print(f"Selected symbol for prototyping: {selected_symbol}")

In [None]:
# Load the data for the selected symbol
def load_data(symbol):
    """Load processed data for a specific stock symbol."""
    data_file = os.path.join(processed_data_path, f"{symbol}_ml_ready.npz")
    
    if not os.path.exists(data_file):
        raise FileNotFoundError(f"Processed data file not found: {data_file}")
    
    data = np.load(data_file)
    X = data['X']
    y = data['y']
    dates = data['dates']
    
    return X, y, dates

# Load the data
try:
    X, y, dates = load_data(selected_symbol)
    print(f"Data loaded successfully. X shape: {X.shape}, y shape: {y.shape}, dates shape: {dates.shape}")
    
    # Convert dates to a DatetimeIndex (decode byte strings first if needed)
    if isinstance(dates[0], bytes):
        dates = [d.decode('utf-8') for d in dates]
    dates = pd.to_datetime(dates)
except Exception as e:
    print(f"Error loading data: {e}")
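If no `_ml_ready.npz` file is available, synthetic data in the same layout lets the remaining cells run. The sketch below assumes the layout implied by `load_data`: `X` has shape `(n_samples, n_timesteps, n_features)`, `y` holds the next closing price, and `dates` holds one date string per sample. The helper name `make_synthetic_windows`, the random-walk price process, and the two features are illustrative assumptions, not the project's preprocessing.

```python
import numpy as np

def make_synthetic_windows(n_days=300, n_timesteps=20, seed=0):
    """Build sliding windows from a random-walk price series (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Random walk in log-price keeps the closing prices strictly positive
    close = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.01, size=n_days)))
    # Two illustrative features: daily return and closing price (close is last)
    returns = np.concatenate([[0.0], np.diff(close) / close[:-1]])
    features = np.column_stack([returns, close])

    n_samples = n_days - n_timesteps
    X = np.stack([features[i:i + n_timesteps] for i in range(n_samples)])
    y = close[n_timesteps:]  # target: the close on the day after each window
    # One calendar-day date string per target (weekends are not skipped here)
    start = np.datetime64("2020-01-01")
    dates = (start + np.arange(n_timesteps, n_days)).astype(str)
    return X, y, dates

X_demo, y_demo, dates_demo = make_synthetic_windows()
print(X_demo.shape, y_demo.shape, dates_demo.shape)
```

Saving these arrays with `np.savez` under the keys `X`, `y`, and `dates` would produce a file that `load_data` can read.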

In [None]:
# Split the data into training, validation, and test sets
def split_data(X, y, dates, test_size=0.2, val_size=0.1):
    """Split data into training, validation, and test sets using time-based split."""
    # Time-based split (no shuffling)
    n_samples = len(X)
    test_idx = int(n_samples * (1 - test_size))
    
    X_temp, X_test = X[:test_idx], X[test_idx:]
    y_temp, y_test = y[:test_idx], y[test_idx:]
    dates_temp, dates_test = dates[:test_idx], dates[test_idx:]
    
    val_idx = int(len(X_temp) * (1 - val_size))
    
    X_train, X_val = X_temp[:val_idx], X_temp[val_idx:]
    y_train, y_val = y_temp[:val_idx], y_temp[val_idx:]
    dates_train, dates_val = dates_temp[:val_idx], dates_temp[val_idx:]
    
    return X_train, X_val, X_test, y_train, y_val, y_test, dates_train, dates_val, dates_test

# Split the data
X_train, X_val, X_test, y_train, y_val, y_test, dates_train, dates_val, dates_test = split_data(X, y, dates)

print(f"Training set: {X_train.shape}, {y_train.shape}")
print(f"Validation set: {X_val.shape}, {y_val.shape}")
print(f"Test set: {X_test.shape}, {y_test.shape}")
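A single chronological split gives only one estimate of out-of-sample error. An expanding-window scheme, similar in spirit to the `TimeSeriesSplit` class imported earlier, gives several while still keeping every training sample earlier than every test sample. The sketch below is a NumPy-only illustration; the function name `expanding_window_splits` and the equal-sized folds are assumptions made for this example.

```python
import numpy as np

def expanding_window_splits(n_samples, n_splits=3):
    """Yield (train_idx, test_idx) pairs where training always precedes testing."""
    fold_size = n_samples // (n_splits + 1)
    for k in range(1, n_splits + 1):
        train_end = k * fold_size
        # The last fold absorbs any remainder so no sample is dropped
        test_end = train_end + fold_size if k < n_splits else n_samples
        yield np.arange(0, train_end), np.arange(train_end, test_end)

for train_idx, test_idx in expanding_window_splits(n_samples=100, n_splits=3):
    print(f"train: 0..{train_idx[-1]}, test: {test_idx[0]}..{test_idx[-1]}")
```

Each fold trains on everything observed so far and tests on the block that follows, so no future data leaks into training.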

In [None]:
# Plot the data split
plt.figure(figsize=(15, 6))

# Concatenate all target values and dates
all_dates = np.concatenate([dates_train, dates_val, dates_test])
all_prices = np.concatenate([y_train, y_val, y_test])

# Plot the full price series
plt.plot(all_dates, all_prices, color='gray', alpha=0.3, label='All Data')

# Plot the training, validation, and test sets
plt.plot(dates_train, y_train, color='blue', label='Training Set')
plt.plot(dates_val, y_val, color='green', label='Validation Set')
plt.plot(dates_test, y_test, color='red', label='Test Set')

# Add vertical lines to separate the sets
plt.axvline(x=dates_train[-1], color='black', linestyle='--', alpha=0.5)
plt.axvline(x=dates_val[-1], color='black', linestyle='--', alpha=0.5)

plt.title(f'{selected_symbol} - Data Split', fontsize=16)
plt.xlabel('Date', fontsize=14)
plt.ylabel('Price', fontsize=14)
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()

## 2. Baseline Models

We'll implement and evaluate simple baseline models for comparison:
- Last value prediction (naive approach)
- Linear Regression
- ARIMA time series model

In [None]:
# Define a function to evaluate model performance
def evaluate_model(y_true, y_pred, model_name):
    """Evaluate model predictions using various metrics."""
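    # Flatten inputs so arrays, Series, and (n, 1) predictions are handled alike
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    
    mse = mean_squared_error(y_true, y_pred)
    rmse = np.sqrt(mse)
    mae = mean_absolute_error(y_true, y_pred)
    r2 = r2_score(y_true, y_pred)
    # MAPE is undefined where y_true is zero, so exclude those points
    nonzero = y_true != 0
    mape = np.mean(np.abs((y_true[nonzero] - y_pred[nonzero]) / y_true[nonzero])) * 100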
    
    print(f"\n{model_name} Evaluation:")
    print(f"MSE: {mse:.4f}")
    print(f"RMSE: {rmse:.4f}")
    print(f"MAE: {mae:.4f}")
    print(f"R²: {r2:.4f}")
    print(f"MAPE: {mape:.4f}%")
    
    return {
        'model': model_name,
        'mse': mse,
        'rmse': rmse,
        'mae': mae,
        'r2': r2,
        'mape': mape,
        'y_pred': y_pred
    }
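To make the metrics concrete, here is a small worked example that computes them by hand with NumPy on three made-up prices. The numbers are illustrative only. The errors are -1, 1, and -2, so the MSE is 2 and R² is 0.25.

```python
import numpy as np

y_true = np.array([100.0, 102.0, 104.0])
y_pred = np.array([101.0, 101.0, 106.0])

errors = y_true - y_pred
mse = np.mean(errors ** 2)
rmse = np.sqrt(mse)
mae = np.mean(np.abs(errors))
# R² compares the squared error with that of always predicting the mean
r2 = 1.0 - np.sum(errors ** 2) / np.sum((y_true - y_true.mean()) ** 2)
# MAPE divides by y_true, so zero targets are masked out
nonzero = y_true != 0
mape = np.mean(np.abs(errors[nonzero] / y_true[nonzero])) * 100

print(f"MSE={mse:.4f} RMSE={rmse:.4f} MAE={mae:.4f} R2={r2:.4f} MAPE={mape:.4f}%")
```

RMSE and MAE are in price units, while MAPE and R² are scale-free, which matters when comparing results across stocks with different price levels.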

In [None]:
# 1. Naive Model (predict the last observed value)
def naive_forecast(X_test):
    """Use the last observed price as the prediction for the next day."""
    # Get the last closing price from each window
    # Assumption: the closing price is the last feature and is stored unscaled
    close_price_column = -1
    last_observed_prices = X_test[:, -1, close_price_column]
    return last_observed_prices

# Make naive predictions
naive_preds = naive_forecast(X_test)

# Evaluate the naive model
naive_results = evaluate_model(y_test, naive_preds, "Naive Last Value Model")
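The naive forecast is only meaningful if the chosen column really holds the unscaled closing price. One way to test that is sketched below, under the further assumptions that windows advance one step at a time and that `y` is the next closing price. In that case the last value of window `i + 1` equals target `i`. The helper name `find_close_column` and the toy data are hypothetical.

```python
import numpy as np

def find_close_column(X, y, atol=1e-8):
    """Return feature indices whose last window value matches the previous target."""
    matches = []
    for col in range(X.shape[2]):
        # With stride-1 windows, X[i + 1, -1, col] should equal y[i] for the close column
        if np.allclose(X[1:, -1, col], y[:-1], atol=atol):
            matches.append(col)
    return matches

# Toy data: feature 0 is a constant, feature 1 is the price
prices = np.arange(10.0, 20.0)
window = 3
X_toy = np.stack([
    np.column_stack([np.zeros(window), prices[i:i + window]])
    for i in range(len(prices) - window)
])
y_toy = prices[window:]
print(find_close_column(X_toy, y_toy))  # -> [1]
```

An empty result on the real data would suggest that the features are scaled, that the target is not the next close, or that the windows use a different stride.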

In [2]:
# 2. Linear Regression Model
def fit_linear_model(X_train, y_train, X_test):
    """Fit a linear regression model and make predictions."""
    # Reshape 3D data to 2D for sklearn
    n_samples_train, n_timesteps, n_features = X_train.shape
    X_train_2d = X_train.reshape(n_samples_train, n_timesteps * n_features)
    
    # Scale the data
    scaler_X = StandardScaler()
    scaler_y = StandardScaler()
    
    X_train_scaled = scaler_X.fit_transform(X_train_2d)
    y_train_scaled = scaler_y.fit_transform(y_train.reshape(-1, 1)).flatten()
    
    # Fit the linear model
    lr_model = LinearRegression()
    lr_model.fit(X_train_scaled, y_train_scaled)
    
    # Prepare test data
    n_samples_test, _, _ = X_test.shape
    X_test_2d = X_test.reshape(n_samples_test, n_timesteps * n_features)
    X_test_scaled = scaler_X.transform(X_test_2d)
    
    # Make predictions
    y_pred_scaled = lr_model.predict(X_test_scaled)
    y_pred = scaler_y.inverse_transform(y_pred_scaled.reshape(-1, 1)).flatten()
    
    return y_pred, lr_model

# Use the linear regression model
start_time = time.time()
lr_preds, lr_model = fit_linear_model(X_train, y_train, X_test)
lr_time = time.time() - start_time

print(f"Linear Regression training time: {lr_time:.2f} seconds")

# Evaluate the linear regression model
lr_results = evaluate_model(y_test, lr_preds, "Linear Regression Model")

NameError: name 'X_train' is not defined

In [3]:
# 3. ARIMA Model
def fit_arima_model(y_train, test_size):
    """Fit an ARIMA model to the training data and make predictions."""
    # Convert to pandas Series for statsmodels
    train_series = pd.Series(y_train)
    
    # Fit ARIMA model
    order = (5, 1, 0)  # AR(5), I(1), MA(0) - common for stock prices
    model = sm.tsa.ARIMA(train_series, order=order)
    
    # This can be slow, so we'll show a message
    print("Fitting ARIMA model... (this may take a few minutes)")
    fit_model = model.fit()
    print("ARIMA model fit complete.")
    
    # Multi-step forecast from the end of the fitted series, returned as a NumPy array
    y_pred = np.asarray(fit_model.forecast(steps=test_size))
    
    return y_pred, fit_model

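# Use the ARIMA model
# Fit on training + validation targets so the forecast starts where the test set begins
try:
    start_time = time.time()
    arima_preds, arima_model = fit_arima_model(np.concatenate([y_train, y_val]), len(y_test))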
    arima_time = time.time() - start_time

    print(f"ARIMA training and prediction time: {arima_time:.2f} seconds")

    # Evaluate the ARIMA model
    arima_results = evaluate_model(y_test, arima_preds, "ARIMA Model")
except Exception as e:
    print(f"Error fitting ARIMA model: {e}")
    arima_results = None

Error fitting ARIMA model: name 'y_train' is not defined
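The `forecast(steps=...)` call in the ARIMA cell produces a multi-step forecast: every prediction is made from the end of the fitted series, without seeing any later prices. The other models predict one step ahead from a window of recent data, so a rolling one-step forecast is a closer comparison. The sketch below illustrates the idea with a least-squares AR(p) fit on first differences. This is a lightweight stand-in for ARIMA(p, 1, 0), not the statsmodels estimator, and the function names and synthetic random-walk series are assumptions for this example.

```python
import numpy as np

def fit_ar_on_differences(series, p=5):
    """Least-squares AR(p) coefficients for the first differences of `series`."""
    d = np.diff(series)
    # Column k holds the difference at lag k + 1 (most recent lag first)
    lagged = np.column_stack([d[p - k - 1:len(d) - k - 1] for k in range(p)])
    target = d[p:]
    coefs, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    return coefs

def rolling_one_step_forecast(history, future, coefs):
    """Forecast each value in `future` using only the values observed before it."""
    p = len(coefs)
    observed = list(history)
    preds = []
    for actual in future:
        recent_diffs = np.diff(observed[-(p + 1):])[::-1]  # most recent first
        preds.append(observed[-1] + float(recent_diffs @ coefs))
        observed.append(actual)  # reveal the true value before the next step
    return np.array(preds)

rng = np.random.default_rng(0)
series = 100.0 + np.cumsum(rng.normal(size=300))
train, test = series[:250], series[250:]

coefs = fit_ar_on_differences(train, p=5)
preds = rolling_one_step_forecast(train, test, coefs)
print(preds.shape, float(np.sqrt(np.mean((test - preds) ** 2))))
```

The coefficients are fitted once on the training portion and never refit, so the test values are used only as inputs for later steps, never for estimation.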


## 3. Deep Learning Models

Now we'll implement and evaluate more sophisticated models:
- LSTM (Long Short-Term Memory)
- Transformer (with self-attention mechanism)

In [4]:
# Common preprocessing for deep learning models
def preprocess_for_dl(X_train, y_train, X_val, y_val, X_test):
    """Preprocess data for deep learning models (scaling)."""
    # Reshape X data for scaling
    n_samples_train, n_timesteps, n_features = X_train.shape
    X_train_reshaped = X_train.reshape(-1, n_features)
    X_val_reshaped = X_val.reshape(-1, n_features)
    X_test_reshaped = X_test.reshape(-1, n_features)
    
    # Scale features
    X_scaler = MinMaxScaler(feature_range=(0, 1))
    X_train_scaled = X_scaler.fit_transform(X_train_reshaped)
    X_val_scaled = X_scaler.transform(X_val_reshaped)
    X_test_scaled = X_scaler.transform(X_test_reshaped)
    
    # Reshape back to 3D
    X_train_scaled = X_train_scaled.reshape(n_samples_train, n_timesteps, n_features)
    X_val_scaled = X_val_scaled.reshape(X_val.shape[0], n_timesteps, n_features)
    X_test_scaled = X_test_scaled.reshape(X_test.shape[0], n_timesteps, n_features)
    
    # Scale targets
    y_scaler = StandardScaler()
    y_train_scaled = y_scaler.fit_transform(y_train.reshape(-1, 1)).flatten()
    y_val_scaled = y_scaler.transform(y_val.reshape(-1, 1)).flatten()
    
    return X_train_scaled, y_train_scaled, X_val_scaled, y_val_scaled, X_test_scaled, y_scaler

# Scale the data
X_train_scaled, y_train_scaled, X_val_scaled, y_val_scaled, X_test_scaled, y_scaler = preprocess_for_dl(
    X_train, y_train, X_val, y_val, X_test
)

NameError: name 'X_train' is not defined
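The reshape in `preprocess_for_dl` exists so that each feature is scaled with one minimum and one range, shared across all samples and time steps and computed from the training set only. The NumPy-only sketch below shows the same idea without scikit-learn; the function names and random data are assumptions for illustration. It also shows a consequence for trending series: test values beyond the training range fall outside [0, 1].

```python
import numpy as np

def fit_minmax(X_train):
    """Per-feature minimum and range over all training samples and time steps."""
    flat = X_train.reshape(-1, X_train.shape[-1])
    lo = flat.min(axis=0)
    span = flat.max(axis=0) - lo
    span[span == 0] = 1.0  # constant features map to 0 instead of dividing by zero
    return lo, span

def apply_minmax(X, lo, span):
    """Scale a 3D array with training statistics; broadcasting keeps the shape."""
    return (X - lo) / span

rng = np.random.default_rng(0)
X_tr = rng.normal(size=(50, 10, 3))
X_te = rng.normal(size=(20, 10, 3)) + 5.0  # shifted on purpose, like a price trend

lo, span = fit_minmax(X_tr)
X_tr_s = apply_minmax(X_tr, lo, span)
X_te_s = apply_minmax(X_te, lo, span)
print(X_tr_s.min(), X_tr_s.max(), X_te_s.max())
```

If the test-set maximum lands well above 1, features based on returns or other stationary transforms may be easier for the network to handle than raw prices.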

In [5]:
# 4. LSTM Model
def build_lstm_model(input_shape, units=64, layers=2, dropout_rate=0.2):
    """Build an LSTM model with the specified architecture."""
    model = Sequential()
    
    # First LSTM layer
    model.add(LSTM(units=units, 
                   return_sequences=(layers > 1),
                   input_shape=input_shape))
    model.add(Dropout(dropout_rate))
    
    # Additional LSTM layers
    for i in range(layers - 1):
        return_seq = (i < layers - 2)  # Return sequences for all but the last layer
        model.add(LSTM(units=units, return_sequences=return_seq))
        model.add(Dropout(dropout_rate))
    
    # Output layer
    model.add(Dense(1))
    
    # Compile model
    model.compile(optimizer='adam', loss='mean_squared_error', metrics=['mae'])
    
    return model

# Set up early stopping and model checkpoint
early_stopping = EarlyStopping(monitor='val_loss', patience=10, restore_best_weights=True)
checkpoint_path = "../models/lstm/lstm_checkpoint.h5"
os.makedirs(os.path.dirname(checkpoint_path), exist_ok=True)
model_checkpoint = ModelCheckpoint(checkpoint_path, save_best_only=True, monitor='val_loss')

# Build and train the LSTM model
input_shape = (X_train.shape[1], X_train.shape[2])
lstm_model = build_lstm_model(input_shape, units=64, layers=2)

# Show model summary
lstm_model.summary()

# Train the model
start_time = time.time()
lstm_history = lstm_model.fit(
    X_train_scaled, y_train_scaled,
    validation_data=(X_val_scaled, y_val_scaled),
    epochs=50,
    batch_size=32,
    callbacks=[early_stopping, model_checkpoint],
    verbose=1
)
lstm_time = time.time() - start_time
print(f"LSTM training time: {lstm_time:.2f} seconds")

# Make predictions
lstm_preds_scaled = lstm_model.predict(X_test_scaled)
lstm_preds = y_scaler.inverse_transform(lstm_preds_scaled).flatten()

# Evaluate the LSTM model
lstm_results = evaluate_model(y_test, lstm_preds, "LSTM Model")

NameError: name 'EarlyStopping' is not defined

In [6]:
# 5. Transformer Model
def transformer_encoder_block(inputs, head_size, num_heads, ff_dim, dropout_rate=0.2):
    """Create a transformer encoder block."""
    # Multi-head self-attention
    attention_output = MultiHeadAttention(
        num_heads=num_heads, key_dim=head_size
    )(inputs, inputs)
    attention_output = Dropout(dropout_rate)(attention_output)
    attention_output = LayerNormalization(epsilon=1e-6)(inputs + attention_output)
    
    # Feed-forward network
    ffn_output = Dense(ff_dim, activation="relu")(attention_output)
    ffn_output = Dense(inputs.shape[-1])(ffn_output)
    ffn_output = Dropout(dropout_rate)(ffn_output)
    return LayerNormalization(epsilon=1e-6)(attention_output + ffn_output)

def build_transformer_model(input_shape, head_size=256, num_heads=4, ff_dim=512, 
                            num_transformer_blocks=2, mlp_units=(128, 64), dropout_rate=0.2):
    """Build a transformer model for time series forecasting."""
    inputs = Input(shape=input_shape)
    x = inputs
    
    # Transformer blocks
    for _ in range(num_transformer_blocks):
        x = transformer_encoder_block(x, head_size, num_heads, ff_dim, dropout_rate)
    
    # Global average pooling
    x = GlobalAveragePooling1D()(x)
    
    # MLP layers
    for dim in mlp_units:
        x = Dense(dim, activation="relu")(x)
        x = Dropout(dropout_rate)(x)
    
    # Output layer
    outputs = Dense(1)(x)
    
    # Create and compile model
    model = Model(inputs=inputs, outputs=outputs)
    model.compile(optimizer="adam", loss="mean_squared_error", metrics=["mae"])
    
    return model

# Build and train the Transformer model
transformer_model = build_transformer_model(input_shape)

# Show model summary
transformer_model.summary()

# Set up early stopping and model checkpoint
checkpoint_path = "../models/transformer/transformer_checkpoint.h5"
os.makedirs(os.path.dirname(checkpoint_path), exist_ok=True)
model_checkpoint = ModelCheckpoint(checkpoint_path, save_best_only=True, monitor='val_loss')

# Train the model
start_time = time.time()
transformer_history = transformer_model.fit(
    X_train_scaled, y_train_scaled,
    validation_data=(X_val_scaled, y_val_scaled),
    epochs=50,
    batch_size=32,
    callbacks=[early_stopping, model_checkpoint],
    verbose=1
)
transformer_time = time.time() - start_time
print(f"Transformer training time: {transformer_time:.2f} seconds")

# Make predictions
transformer_preds_scaled = transformer_model.predict(X_test_scaled)
transformer_preds = y_scaler.inverse_transform(transformer_preds_scaled).flatten()

# Evaluate the Transformer model
transformer_results = evaluate_model(y_test, transformer_preds, "Transformer Model")

NameError: name 'input_shape' is not defined
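The encoder blocks in the Transformer cell apply self-attention without any positional information, and `GlobalAveragePooling1D` then averages over time, so the model as written produces the same output if the time steps in a window are shuffled. A common remedy is to add a positional encoding to the inputs before the first block. The sketch below builds the standard sinusoidal encoding with NumPy; the function name and the shapes in the example are assumptions, and whether this helps on this dataset would need to be tested.

```python
import numpy as np

def sinusoidal_positional_encoding(n_timesteps, n_features):
    """Return an (n_timesteps, n_features) array of sine/cosine position codes."""
    positions = np.arange(n_timesteps)[:, np.newaxis]
    # Each sine/cosine pair of columns shares one frequency
    pair_index = np.arange(n_features)[np.newaxis, :] // 2
    angles = positions / np.power(10000.0, 2.0 * pair_index / n_features)
    encoding = np.zeros((n_timesteps, n_features))
    encoding[:, 0::2] = np.sin(angles[:, 0::2])  # even columns
    encoding[:, 1::2] = np.cos(angles[:, 1::2])  # odd columns
    return encoding

pe = sinusoidal_positional_encoding(n_timesteps=20, n_features=6)
X_batch = np.zeros((4, 20, 6))
X_with_position = X_batch + pe  # broadcasts over the batch axis
print(X_with_position.shape)
```

Because the encoding lies in [-1, 1] while the scaled features lie roughly in [0, 1], it may be worth projecting the inputs with a `Dense` layer first so the position signal does not dominate.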

## 4. Model Comparison

Now we'll compare all the models to see which performs best.

In [7]:
# Collect all results
all_results = [
    naive_results,
    lr_results,
    lstm_results,
    transformer_results
]

# Add ARIMA results if available
if 'arima_results' in locals() and arima_results is not None:
    all_results.append(arima_results)

# Create a summary DataFrame
summary_metrics = pd.DataFrame([
    {
        'Model': r['model'],
        'MSE': r['mse'],
        'RMSE': r['rmse'],
        'MAE': r['mae'],
        'R²': r['r2'],
        'MAPE (%)': r['mape']
    }
    for r in all_results
])

# Display the summary
print("\nModel Performance Summary:")
display(summary_metrics.sort_values('RMSE'))

NameError: name 'naive_results' is not defined

In [8]:
# Plot model comparison
def plot_model_comparison(results_list, metric_name):
    """Create a bar chart comparing models on a specific metric."""
    models = [r['model'] for r in results_list]
    metric_values = [r[metric_name.lower()] for r in results_list]
    
    plt.figure(figsize=(12, 6))
    bars = plt.bar(models, metric_values)
    
    # Add value labels on top of each bar
    for bar, value in zip(bars, metric_values):
        plt.text(bar.get_x() + bar.get_width()/2., 
                 value + 0.002*max(metric_values), 
                 f"{value:.4f}", 
                 ha='center', va='bottom', fontsize=10)
    
    plt.title(f'Model Comparison - {metric_name}', fontsize=16)
    plt.ylabel(metric_name, fontsize=14)
    plt.grid(axis='y', alpha=0.3)
    plt.tight_layout()
    plt.show()

# Plot comparisons for key metrics
plot_model_comparison(all_results, "RMSE")
plot_model_comparison(all_results, "MAE")
plot_model_comparison(all_results, "R2")

NameError: name 'all_results' is not defined

In [9]:
# Plot predictions vs actual for the best model
def plot_predictions(y_true, y_pred, dates, model_name):
    """Plot model predictions against actual values."""
    plt.figure(figsize=(15, 7))
    
    plt.plot(dates, y_true, label='Actual', linewidth=2)
    plt.plot(dates, y_pred, label=f'{model_name} Prediction', linewidth=2, linestyle='--')
    
    # Add shaded region for prediction error
    plt.fill_between(dates, y_true, y_pred, color='gray', alpha=0.3, label='Error')
    
    plt.title(f'{selected_symbol} Stock Price Prediction - {model_name}', fontsize=16)
    plt.xlabel('Date', fontsize=14)
    plt.ylabel('Price', fontsize=14)
    plt.legend(fontsize=12)
    plt.grid(True, alpha=0.3)
    plt.tight_layout()
    plt.show()

# Find the best model based on RMSE
best_model_idx = summary_metrics['RMSE'].idxmin()
best_model = summary_metrics.loc[best_model_idx, 'Model']
best_results = next(r for r in all_results if r['model'] == best_model)

# Plot the best model's predictions
plot_predictions(y_test, best_results['y_pred'], dates_test, best_model)

NameError: name 'summary_metrics' is not defined

In [None]:
# Plot predictions for all models
plt.figure(figsize=(15, 8))

# Plot actual values
plt.plot(dates_test, y_test, label='Actual', linewidth=2.5, color='black')

# Plot predictions for each model with different colors
colors = ['blue', 'green', 'red', 'purple', 'orange']
for i, result in enumerate(all_results):
    plt.plot(dates_test, result['y_pred'], label=f"{result['model']} Prediction", 
             linestyle='--', linewidth=1.5, alpha=0.8, color=colors[i % len(colors)])

plt.title(f'{selected_symbol} Stock Price Predictions - Model Comparison', fontsize=16)
plt.xlabel('Date', fontsize=14)
plt.ylabel('Price', fontsize=14)
plt.legend(fontsize=12)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

## 5. Ablation Studies

Let's look at training behavior and model complexity for the deep learning models, then study how LSTM hyperparameters affect performance.

In [None]:
# Plot training and validation loss curves
def plot_loss_curves(history, model_name):
    """Plot the training and validation loss curves."""
    plt.figure(figsize=(15, 6))
    
    # Plot loss
    plt.subplot(1, 2, 1)
    plt.plot(history.history['loss'], label='Training Loss')
    plt.plot(history.history['val_loss'], label='Validation Loss')
    plt.title(f'{model_name} - Loss Curves', fontsize=14)
    plt.xlabel('Epoch', fontsize=12)
    plt.ylabel('Loss (MSE)', fontsize=12)
    plt.legend(fontsize=10)
    plt.grid(True, alpha=0.3)
    
    # Plot MAE
    plt.subplot(1, 2, 2)
    plt.plot(history.history['mae'], label='Training MAE')
    plt.plot(history.history['val_mae'], label='Validation MAE')
    plt.title(f'{model_name} - MAE Curves', fontsize=14)
    plt.xlabel('Epoch', fontsize=12)
    plt.ylabel('Mean Absolute Error', fontsize=12)
    plt.legend(fontsize=10)
    plt.grid(True, alpha=0.3)
    
    plt.tight_layout()
    plt.show()
    
    # Check for overfitting
    min_val_loss_idx = np.argmin(history.history['val_loss'])
    min_val_loss = history.history['val_loss'][min_val_loss_idx]
    min_train_loss = history.history['loss'][min_val_loss_idx]
    
    print(f"Best epoch: {min_val_loss_idx+1}")
    print(f"Training loss: {min_train_loss:.4f}")
    print(f"Validation loss: {min_val_loss:.4f}")
    print(f"Difference: {(min_train_loss - min_val_loss):.4f}")
    
    if min_train_loss < min_val_loss:
        gap_percent = (min_val_loss - min_train_loss) / min_val_loss * 100
        if gap_percent > 10:
            print(f"Warning: Possible overfitting detected. Train-Val gap: {gap_percent:.2f}%")
        else:
            print("Model seems well-fit (no significant overfitting detected)")
    else:
        print("Interesting: Validation loss is lower than training loss.")

In [10]:
# Plot loss curves for the models
if 'lstm_history' in locals():
    plot_loss_curves(lstm_history, 'LSTM')

if 'transformer_history' in locals():
    plot_loss_curves(transformer_history, 'Transformer')

In [11]:
# Analyze model complexity
def print_model_complexity(models_dict):
    """Analyze and print the complexity of each model."""
    complexity_data = []
    
    for name, model in models_dict.items():
        if hasattr(model, 'count_params'):
            params = model.count_params()
            trainable_params = np.sum([K.count_params(w) for w in model.trainable_weights])
            non_trainable_params = params - trainable_params
            
            complexity_data.append({
                'Model': name,
                'Total Parameters': params,
                'Trainable Parameters': trainable_params,
                'Non-trainable Parameters': non_trainable_params
            })
    
    if complexity_data:
        complexity_df = pd.DataFrame(complexity_data)
        display(complexity_df)
    else:
        print("No compatible models found for complexity analysis.")

# Analyze model complexity
from tensorflow.keras import backend as K

models_dict = {
    "LSTM": lstm_model,
    "Transformer": transformer_model
}

print_model_complexity(models_dict)

ModuleNotFoundError: No module named 'tensorflow'
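The LSTM parameter count can also be worked out by hand, which is useful when TensorFlow is unavailable. Each LSTM layer has four gates, and each gate has an input weight matrix, a recurrent weight matrix, and a bias, giving `4 * units * (input_dim + units + 1)` parameters. The sketch below applies this to the architecture from `build_lstm_model`; the choice of 10 input features is an assumption for the example.

```python
def lstm_layer_params(input_dim, units):
    """Parameters in one LSTM layer: 4 gates, each with W, U, and a bias."""
    return 4 * units * (input_dim + units + 1)

def stacked_lstm_params(n_features, units=64, layers=2):
    """Total parameters for `layers` LSTM layers followed by Dense(1)."""
    total = lstm_layer_params(n_features, units)
    # Later LSTM layers take the previous layer's `units` outputs as input
    total += (layers - 1) * lstm_layer_params(units, units)
    total += units + 1  # Dense(1): one weight per unit plus a bias
    return total

print(stacked_lstm_params(n_features=10, units=64, layers=2))  # -> 52289
```

Dropout layers add no parameters. The count grows roughly with the square of `units`, so doubling the units roughly quadruples the size of each recurrent layer.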

In [12]:
# Ablation study on LSTM hyperparameters
def run_lstm_ablation(X_train, y_train, X_val, y_val, X_test, y_test, y_scaler, param_grid):
    """Run an ablation study on LSTM hyperparameters."""
    results = []
    input_shape = (X_train.shape[1], X_train.shape[2])
    
    for units in param_grid['units']:
        for layers in param_grid['layers']:
            for dropout in param_grid['dropout_rate']:
                print(f"\nTesting LSTM with units={units}, layers={layers}, dropout={dropout}")
                
                # Build and compile the model
                model = build_lstm_model(
                    input_shape=input_shape,
                    units=units,
                    layers=layers,
                    dropout_rate=dropout
                )
                
                # Set up callbacks
                early_stopping = EarlyStopping(
                    monitor='val_loss',
                    patience=5,
                    restore_best_weights=True
                )
                
                # Train the model
                start_time = time.time()
                history = model.fit(
                    X_train, y_train,
                    validation_data=(X_val, y_val),
                    epochs=30,  # Reduced for speed
                    batch_size=32,
                    callbacks=[early_stopping],
                    verbose=0
                )
                training_time = time.time() - start_time
                
                # Make predictions
                preds_scaled = model.predict(X_test, verbose=0)
                preds = y_scaler.inverse_transform(preds_scaled).flatten()
                
                # Evaluate
                mse = mean_squared_error(y_test, preds)
                rmse = np.sqrt(mse)
                mae = mean_absolute_error(y_test, preds)
                
                # Get parameter count
                params = model.count_params()
                
                # Add to results
                results.append({
                    'Units': units,
                    'Layers': layers,
                    'Dropout': dropout,
                    'Parameters': params,
                    'Training Time': training_time,
                    'MSE': mse,
                    'RMSE': rmse,
                    'MAE': mae,
                    'Final Train Loss': history.history['loss'][-1],
                    'Final Val Loss': history.history['val_loss'][-1],
                    'Epochs': len(history.history['loss'])
                })
                
                print(f"RMSE: {rmse:.4f}, Training Time: {training_time:.2f}s")
                
    return pd.DataFrame(results)

# Define the parameter grid for the ablation study
lstm_param_grid = {
    'units': [32, 64, 128],
    'layers': [1, 2],
    'dropout_rate': [0.2, 0.5]
}

# Run the ablation study (disabled by default to save time)
# Set RUN_ABLATION = True to train every configuration in the grid
RUN_ABLATION = False

if RUN_ABLATION:
    ablation_results = run_lstm_ablation(
        X_train_scaled, y_train_scaled,
        X_val_scaled, y_val_scaled,
        X_test_scaled, y_test, y_scaler,
        lstm_param_grid
    )

    # Display results
    print("\nAblation Study Results:")
    display(ablation_results.sort_values('RMSE'))
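Before running the study, it helps to know how many models the grid will train. The sketch below expands the same grid with `itertools.product`, an alternative to the nested loops in `run_lstm_ablation`. With three unit sizes, two depths, and two dropout rates, it yields 12 configurations.

```python
from itertools import product

lstm_param_grid = {
    'units': [32, 64, 128],
    'layers': [1, 2],
    'dropout_rate': [0.2, 0.5],
}

# Expand the grid into one dict per configuration
names = list(lstm_param_grid)
configs = [dict(zip(names, values)) for values in product(*lstm_param_grid.values())]

print(len(configs))  # -> 12
print(configs[0])    # -> {'units': 32, 'layers': 1, 'dropout_rate': 0.2}
```

Each dict can be passed to `build_lstm_model` with `**config`, since the keys match its keyword arguments.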

## Conclusion

In this notebook, we prototyped several different models for stock price prediction:

1. **Baseline Models**:
   - Naive Last Value Model
   - Linear Regression
   - ARIMA

2. **Advanced Models**:
   - LSTM
   - Transformer
   
We set up a shared evaluation with several metrics (MSE, RMSE, MAE, R², MAPE) so that all models can be compared on the same test set. The recorded run stopped at the imports because TensorFlow was not installed, so this notebook does not yet show whether the LSTM and Transformer models outperform the baselines.

The notebook also includes tools for analyzing model complexity, training behavior, and potential overfitting, along with an ablation framework for measuring how LSTM hyperparameters affect performance.

In the next steps, we could:
1. Further tune the best-performing model
2. Explore ensemble methods
3. Investigate different feature engineering approaches
4. Test on a broader range of stocks