# Author of project: Akinmade Faruq
# Contact information: www.linkedin.com/in/faruqakinmade
# Email: Fharuk147@gmail.com
# X website: https://x.com/EngrrrAkinmade


# Stock Movement Prediction for Alphabet Inc. (GOOGL)

## **Project Overview**

This project aims to develop a robust machine learning system capable of predicting daily stock movements of Alphabet Inc. (GOOGL) by leveraging both **historical stock market data** and **public sentiment extracted from news articles**. The system integrates techniques from **Natural Language Processing (NLP)**, **time-series analysis**, and **financial modeling** to provide actionable insights for trading strategies. 

The predictions generated by this system are designed to be **accurate, interpretable, and compliant with regulatory standards**, serving as a benchmark for industry applications.

---

## **Datasets**

1. **News Articles Dataset**
   - Contains daily news headlines and summaries about Alphabet Inc. from January 1, 2000.
   - Key columns: `category`, `datetime`, `headline`, `summary`, `image`, `source`, `related`, `url`.
   - Purpose: Extract sentiment and event information to understand market reactions.

2. **Stock Market Dataset**
   - Historical daily stock prices for GOOGL, including high, low, open, close, and volume.
   - Key columns: `Date`, `Open`, `High`, `Low`, `Close`, `Volume`.
   - Purpose: Capture historical price movements and volatility for prediction modeling.

---

## **Project Workflow**

1. **Data Preprocessing**
   - Clean and normalize stock and news data.
   - Handle missing values, outliers, and duplicate records.
   - Standardize date formats and align news with trading days.

2. **Exploratory Data Analysis (EDA)**
   - Visualize stock price trends, returns, and volatility.
   - Analyze sentiment trends and correlate with stock movements.
   - Identify patterns, anomalies, and potential event-driven effects.

3. **Feature Engineering**
   - Generate sentiment scores, topic distributions, and event indicators from news.
   - Compute technical indicators and lagged features from stock data.
   - Combine datasets to create enriched features for modeling.

4. **Model Selection**
   - Evaluate machine learning algorithms such as Logistic Regression, Random Forest, XGBoost, LSTM, and hybrid models.
   - Balance predictive performance with interpretability and regulatory compliance.

5. **Training and Validation**
   - Train models using time-series aware splits and walk-forward cross-validation.
   - Evaluate performance using accuracy, precision, recall, F1-score, and financial metrics.

6. **Explanation and Interpretability**
   - Apply SHAP and LIME for local and global interpretability.
   - Document feature contributions and model decisions for transparency.

7. **Deployment**
   - Deploy as an API service for real-time prediction.
   - Monitor performance, detect model drift, and periodically update the model.
   - Ensure compliance through logging and audit trails.

---

By following this workflow, the system aims to provide **accurate predictions** of stock movements while maintaining **transparency, interpretability, and reliability**, with the goal of supporting informed investment decisions.
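
The time-series aware validation named in step 5 could look like the following. This is a minimal sketch on synthetic data, not the notebook's own pipeline: `X_demo`, `y_demo`, and the choice of logistic regression are assumptions made purely for illustration. `TimeSeriesSplit` produces expanding training windows, and the scaler is refit inside each fold so that statistics from the test period do not leak into training.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import TimeSeriesSplit
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-ins for the real feature matrix and up/down target (assumption)
rng = np.random.default_rng(42)
X_demo = rng.normal(size=(300, 4))
y_demo = (rng.random(300) > 0.5).astype(int)

tscv = TimeSeriesSplit(n_splits=5)
fold_scores = []
fold_bounds = []

for train_idx, test_idx in tscv.split(X_demo):
    # The pipeline fits the scaler on the training fold only
    model = make_pipeline(StandardScaler(), LogisticRegression())
    model.fit(X_demo[train_idx], y_demo[train_idx])
    fold_scores.append(model.score(X_demo[test_idx], y_demo[test_idx]))
    # Record the last training row and first test row to confirm ordering
    fold_bounds.append((train_idx.max(), test_idx.min()))

print("Accuracy per fold:", np.round(fold_scores, 3))
print("Mean accuracy    :", round(float(np.mean(fold_scores)), 3))
```

Each fold trains strictly on rows that precede its test rows, which is the property a shuffled `cross_val_score` would not guarantee.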


In [6]:
# Importing Necessary Libraries

# Data manipulation
import pandas as pd
import numpy as np
import datetime as dt

# Data visualization
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

# Natural Language Processing (NLP)
import re
import nltk
from nltk.corpus import stopwords
from nltk.stem import WordNetLemmatizer
from textblob import TextBlob
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

# Machine Learning
from sklearn.model_selection import train_test_split, TimeSeriesSplit, cross_val_score
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, roc_auc_score, mean_squared_error
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from xgboost import XGBClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

# Deep Learning (if using LSTM/GRU)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout
from tensorflow.keras.callbacks import EarlyStopping

# Model interpretability
import shap
import lime
import lime.lime_tabular

# Ignore warnings
import warnings
warnings.filterwarnings('ignore')

# Download necessary NLTK data
nltk.download('stopwords')
nltk.download('wordnet')
nltk.download('punkt')

print("All libraries imported successfully!")


[nltk_data] Downloading package stopwords to C:\Users\AKINMADE
[nltk_data]     FARUQ\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!
[nltk_data] Downloading package wordnet to C:\Users\AKINMADE
[nltk_data]     FARUQ\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package punkt to C:\Users\AKINMADE
[nltk_data]     FARUQ\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!


All libraries imported successfully!


In [7]:
# Load the Datasets

# File paths
stock_file = r"C:\Users\AKINMADE FARUQ\Downloads\PROJECT MATERIALS\My Projects\GITHUB\Google Stocks Prediction Analysis\GOOGLE.csv"
news_file  = r"C:\Users\AKINMADE FARUQ\Downloads\PROJECT MATERIALS\My Projects\GITHUB\Google Stocks Prediction Analysis\Google_Daily_News.csv"

# Load datasets
stock_df = pd.read_csv(stock_file)
news_df  = pd.read_csv(news_file)

# Display first few rows of each dataset
print("Stock Dataset Preview:")
display(stock_df.head())

print("\nNews Dataset Preview:")
display(news_df.head())

# Check basic info and missing values
print("\nStock Dataset Info:")
print(stock_df.info())
print("\nMissing values in Stock Dataset:")
print(stock_df.isnull().sum())

print("\nNews Dataset Info:")
print(news_df.info())
print("\nMissing values in News Dataset:")
print(news_df.isnull().sum())


Stock Dataset Preview:


Unnamed: 0,Date,Open,High,Low,Close,Volume
0,2004-08-19,2.502503,2.604104,2.401401,2.511011,893181924
1,2004-08-20,2.527778,2.72973,2.515015,2.71046,456686856
2,2004-08-23,2.771522,2.83984,2.728979,2.737738,365122512
3,2004-08-24,2.783784,2.792793,2.591842,2.624374,304946748
4,2004-08-25,2.626627,2.702703,2.5996,2.652653,183772044



News Dataset Preview:


Unnamed: 0,category,datetime,headline,id,image,related,source,summary,url
0,company,1745449200,"Alphabet earnings, Fed comments, Nintendo Swit...",134059226,https://s.yimg.com/rz/stage/p/yahoo_finance_en...,GOOGL,Yahoo,Here's what investors are watching on Thursday...,https://finnhub.io/api/news?id=5381fda0f641074...
1,company,1745446095,Is Alphabet Inc. (GOOGL) the Best Stock to Buy...,134059227,https://s.yimg.com/rz/stage/p/yahoo_finance_en...,GOOGL,Yahoo,We recently published a list of 10 Best Stocks...,https://finnhub.io/api/news?id=bdc5b5103ae73db...
2,company,1745442355,Is Alphabet Inc. (GOOG) the Best Stock to Buy ...,134059228,https://s.yimg.com/rz/stage/p/yahoo_finance_en...,GOOGL,Yahoo,We recently published a list of 20 Best Stocks...,https://finnhub.io/api/news?id=8cdf3969c1ec9e3...
3,company,1745440328,Google earnings are coming today. Here's what ...,134059229,https://s.yimg.com/rz/stage/p/yahoo_finance_en...,GOOGL,Yahoo,Google (GOOGL) will report first-quarter 2025 ...,https://finnhub.io/api/news?id=ed468a233b607bd...
4,company,1745439372,Equity Markets Close Higher Over Potential Red...,134059230,https://s.yimg.com/rz/stage/p/yahoo_finance_en...,GOOGL,Yahoo,US benchmark equity indexes closed higher on W...,https://finnhub.io/api/news?id=54bdad840d13d87...



Stock Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4717 entries, 0 to 4716
Data columns (total 6 columns):
 #   Column  Non-Null Count  Dtype  
---  ------  --------------  -----  
 0   Date    4717 non-null   object 
 1   Open    4717 non-null   float64
 2   High    4717 non-null   float64
 3   Low     4717 non-null   float64
 4   Close   4717 non-null   float64
 5   Volume  4717 non-null   int64  
dtypes: float64(4), int64(1), object(1)
memory usage: 221.2+ KB
None

Missing values in Stock Dataset:
Date      0
Open      0
High      0
Low       0
Close     0
Volume    0
dtype: int64

News Dataset Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 231 entries, 0 to 230
Data columns (total 9 columns):
 #   Column    Non-Null Count  Dtype 
---  ------    --------------  ----- 
 0   category  231 non-null    object
 1   datetime  231 non-null    int64 
 2   headline  231 non-null    object
 3   id        231 non-null    int64 
 4   image     158 non-null    object

In [8]:
# Data Preprocessing

# ---------- Stock Dataset ----------
# Convert 'Date' to datetime format
stock_df['Date'] = pd.to_datetime(stock_df['Date'])

# Sort by date
stock_df = stock_df.sort_values('Date').reset_index(drop=True)

# Create the daily 'Return' column (percentage change in the closing price)
stock_df['Return'] = stock_df['Close'].pct_change()

# ---------- News Dataset ----------
# Convert 'datetime' from Unix timestamp to datetime
news_df['datetime'] = pd.to_datetime(news_df['datetime'], unit='s')

# Handle missing summaries by filling with empty string
news_df['summary'] = news_df['summary'].fillna('')

# Optional: fill missing images with placeholder
news_df['image'] = news_df['image'].fillna('No Image')

# Combine headline and summary into a single text column for NLP
news_df['text'] = news_df['headline'] + '. ' + news_df['summary']

# Sort news by datetime
news_df = news_df.sort_values('datetime').reset_index(drop=True)

# Preview processed datasets
print("Processed Stock Dataset:")
display(stock_df.head())

print("\nProcessed News Dataset:")
display(news_df[['datetime', 'headline', 'summary', 'text']].head())


Processed Stock Dataset:


Unnamed: 0,Date,Open,High,Low,Close,Volume,Return
0,2004-08-19,2.502503,2.604104,2.401401,2.511011,893181924,
1,2004-08-20,2.527778,2.72973,2.515015,2.71046,456686856,0.07943
2,2004-08-23,2.771522,2.83984,2.728979,2.737738,365122512,0.010064
3,2004-08-24,2.783784,2.792793,2.591842,2.624374,304946748,-0.041408
4,2004-08-25,2.626627,2.702703,2.5996,2.652653,183772044,0.010776



Processed News Dataset:


Unnamed: 0,datetime,headline,summary,text
0,2025-04-16 15:21:00,Big Tech’s China Risks Go Far Beyond Nvidia,Big Tech’s China Risks Go Far Beyond Nvidia,Big Tech’s China Risks Go Far Beyond Nvidia. B...
1,2025-04-16 17:00:24,Is Alphabet Inc. (NASDAQ:GOOGL) the Best Machi...,We recently published a list of the 10 Best Ma...,Is Alphabet Inc. (NASDAQ:GOOGL) the Best Machi...
2,2025-04-16 17:19:59,Prominent Investor Unloads His GOOG Stock,Well-known investor Josh Brown disclosed on CN...,Prominent Investor Unloads His GOOG Stock. Wel...
3,2025-04-16 17:37:07,Communications Services Slide on Flight From R...,Communications-services companies slid as trad...,Communications Services Slide on Flight From R...
4,2025-04-16 18:39:10,"Temu, Shein slash digital ads as tariffs end c...",Chinese online marketplace Temu and fast-fashi...,"Temu, Shein slash digital ads as tariffs end c..."


In [9]:
# Sentiment Analysis on News

from textblob import TextBlob

# Function to calculate polarity and subjectivity
def get_sentiment(text):
    blob = TextBlob(text)
    return pd.Series([blob.sentiment.polarity, blob.sentiment.subjectivity])

# Apply sentiment analysis
news_df[['polarity', 'subjectivity']] = news_df['text'].apply(get_sentiment)

# Aggregate sentiment by date
news_df['date_only'] = news_df['datetime'].dt.date
daily_sentiment = news_df.groupby('date_only').agg({
    'polarity': ['mean', 'max', 'min', 'std'],
    'subjectivity': ['mean', 'max', 'min', 'std'],
    'headline': 'count'  # number of news articles per day
})

# Flatten multi-level columns
daily_sentiment.columns = ['_'.join(col) for col in daily_sentiment.columns]
daily_sentiment = daily_sentiment.reset_index()
daily_sentiment['date_only'] = pd.to_datetime(daily_sentiment['date_only'])

# Preview daily sentiment
print("Daily Aggregated Sentiment:")
display(daily_sentiment.head())

Daily Aggregated Sentiment:


Unnamed: 0,date_only,polarity_mean,polarity_max,polarity_min,polarity_std,subjectivity_mean,subjectivity_max,subjectivity_min,subjectivity_std,headline_count
0,2025-04-16,0.155092,0.325,-0.027778,0.142606,0.465332,0.571493,0.277778,0.107247,7
1,2025-04-17,0.003625,0.5,-0.5,0.239106,0.373574,1.0,0.0,0.241037,44
2,2025-04-18,0.085505,0.316667,-0.275,0.149829,0.408563,0.833333,0.05,0.178624,28
3,2025-04-19,0.034643,0.5,-0.3125,0.358221,0.552976,1.0,0.0,0.361986,5
4,2025-04-20,0.119365,0.5,0.0,0.193224,0.270678,1.0,0.0,0.339756,11
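
The per-day aggregation above can also be written with pandas named aggregation, which yields flat column names directly and avoids the manual join over multi-level columns. The sketch below uses a tiny hand-made frame (`demo` and its values are hypothetical) and also shows one property worth knowing: a day with a single article has an undefined sample standard deviation, so `polarity_std` is `NaN` for that day.

```python
import pandas as pd

# Hypothetical polarity scores: two articles on one day, one on the next
demo = pd.DataFrame({
    "date_only": ["2025-04-16", "2025-04-16", "2025-04-17"],
    "polarity": [0.2, -0.1, 0.5],
})

# Named aggregation: output_column=(input_column, function)
daily = demo.groupby("date_only").agg(
    polarity_mean=("polarity", "mean"),
    polarity_std=("polarity", "std"),
    headline_count=("polarity", "size"),
).reset_index()

print(daily)
```

Whether a missing `polarity_std` should later be filled with 0 or kept as missing is a modelling choice; filling with 0 treats "one article" the same as "several articles in full agreement".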


In [10]:
# Merge Stock Data with Sentiment Features

# Build a midnight-normalized merge key that matches daily_sentiment['date_only']
stock_df['date_only'] = stock_df['Date'].dt.normalize()

# Merge with sentiment data
combined_df = pd.merge(stock_df, daily_sentiment, on='date_only', how='left')

# Fill missing sentiment values with 0 (no news days)
sentiment_cols = [col for col in combined_df.columns if 'polarity' in col or 'subjectivity' in col or 'headline_count' in col]
combined_df[sentiment_cols] = combined_df[sentiment_cols].fillna(0)

# Preview the combined dataset
print("Combined Stock + Sentiment Dataset:")
display(combined_df.head())

# Drop temporary 'date_only' if not needed
# combined_df = combined_df.drop(columns=['date_only'])


Combined Stock + Sentiment Dataset:


Unnamed: 0,Date,Open,High,Low,Close,Volume,Return,date_only,polarity_mean,polarity_max,polarity_min,polarity_std,subjectivity_mean,subjectivity_max,subjectivity_min,subjectivity_std,headline_count
0,2004-08-19,2.502503,2.604104,2.401401,2.511011,893181924,,2004-08-19,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
1,2004-08-20,2.527778,2.72973,2.515015,2.71046,456686856,0.07943,2004-08-20,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
2,2004-08-23,2.771522,2.83984,2.728979,2.737738,365122512,0.010064,2004-08-23,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
3,2004-08-24,2.783784,2.792793,2.591842,2.624374,304946748,-0.041408,2004-08-24,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
4,2004-08-25,2.626627,2.702703,2.5996,2.652653,183772044,0.010776,2004-08-25,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
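
A left merge on the calendar date keeps only dates that appear in `stock_df`, so sentiment for days without a stock row (weekends such as 2025-04-19 and 2025-04-20 in the daily sentiment preview, or market holidays) never reaches `combined_df`. One possible way to align news with trading days is to roll each article forward to the next available trading day before aggregating. The helper below is a sketch under that assumption; `roll_to_next_trading_day` is a hypothetical name, and the trading calendar in the usage lines is illustrative only.

```python
import numpy as np
import pandas as pd

def roll_to_next_trading_day(news_dates, trading_days):
    """Map each news timestamp to the first trading day on or after its date.

    Articles dated after the last known trading day map to NaT.
    """
    trading = np.sort(pd.DatetimeIndex(trading_days).values)
    dates = pd.DatetimeIndex(news_dates).normalize().values
    # Index of the first trading day >= each news date
    pos = np.searchsorted(trading, dates, side="left")
    out = np.full(len(dates), np.datetime64("NaT"), dtype=trading.dtype)
    valid = pos < len(trading)
    out[valid] = trading[pos[valid]]
    return pd.DatetimeIndex(out)

# Illustrative calendar and article timestamps (not taken from the datasets)
calendar = pd.to_datetime(["2025-04-17", "2025-04-21", "2025-04-22"])
articles = pd.to_datetime(["2025-04-17 15:00", "2025-04-19 10:00", "2025-04-23 09:00"])
aligned = roll_to_next_trading_day(articles, calendar)
print(aligned)
```

This sketch works at day granularity only. Whether articles published after the market close should also move to the following session is a separate decision that depends on the timezone of the `datetime` column.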


In [11]:
# Feature Engineering

# Moving Averages
combined_df['SMA_5'] = combined_df['Close'].rolling(window=5).mean()
combined_df['SMA_10'] = combined_df['Close'].rolling(window=10).mean()
combined_df['SMA_20'] = combined_df['Close'].rolling(window=20).mean()

# Exponential Moving Average
combined_df['EMA_10'] = combined_df['Close'].ewm(span=10, adjust=False).mean()
combined_df['EMA_20'] = combined_df['Close'].ewm(span=20, adjust=False).mean()

# Rolling Volatility (Standard Deviation of Returns)
combined_df['Volatility_5'] = combined_df['Return'].rolling(window=5).std()
combined_df['Volatility_10'] = combined_df['Return'].rolling(window=10).std()

# Lag Features (previous day returns)
combined_df['Return_1d'] = combined_df['Return'].shift(1)
combined_df['Return_2d'] = combined_df['Return'].shift(2)
combined_df['Return_3d'] = combined_df['Return'].shift(3)

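# Target variable: next-day direction (Up = 1, Down or flat = 0)
# Caveat: the final row has no next-day return, so the comparison is False and it is labelled 0
combined_df['Target'] = np.where(combined_df['Return'].shift(-1) > 0, 1, 0)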

# Drop rows with NaN values created by rolling and shift
combined_df = combined_df.dropna().reset_index(drop=True)

# Preview engineered dataset
print("Feature Engineered Dataset:")
display(combined_df.head())


Feature Engineered Dataset:


Unnamed: 0,Date,Open,High,Low,Close,Volume,Return,date_only,polarity_mean,polarity_max,...,SMA_10,SMA_20,EMA_10,EMA_20,Volatility_5,Volatility_10,Return_1d,Return_2d,Return_3d,Target
0,2004-09-16,2.811311,2.897898,2.794044,2.852102,185326488,0.017589,2004-09-16,0.0,0.0,...,2.647648,2.634647,2.696439,2.641854,0.012349,0.01485,0.004574,0.037116,0.020602,1
1,2004-09-17,2.863363,2.94019,2.841592,2.94019,189450360,0.030885,2004-09-17,0.0,0.0,...,2.687638,2.656106,2.740757,2.670267,0.012574,0.015884,0.017589,0.004574,0.037116,1
2,2004-09-20,2.926677,3.043043,2.922172,2.986987,212575212,0.015916,2004-09-20,0.0,0.0,...,2.736061,2.669932,2.785526,2.700431,0.012889,0.012021,0.030885,0.017589,0.004574,0
3,2004-09-21,2.998248,3.013514,2.940691,2.948949,144575280,-0.012735,2004-09-21,0.0,0.0,...,2.776752,2.680493,2.81524,2.7241,0.016335,0.015469,0.015916,0.030885,0.017589,1
4,2004-09-22,2.937938,2.994745,2.923173,2.962462,151624224,0.004582,2004-09-22,0.0,0.0,...,2.816992,2.697397,2.842007,2.746801,0.016334,0.015632,-0.012735,0.015916,0.030885,1
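
Because `np.where` turns a missing next-day return into a 0, the last row of the dataset receives a label even though its outcome is unknown. A small sketch of an alternative labelling step is shown below; the `prices` frame and the `NextReturn` column are hypothetical and exist only for this illustration. The idea is to materialize the next-day return first, drop rows where it is missing, and only then derive the label.

```python
import pandas as pd

# Hypothetical closing prices for five consecutive trading days
prices = pd.DataFrame({"Close": [100.0, 101.0, 100.5, 102.0, 101.0]})
prices["Return"] = prices["Close"].pct_change()

# Next-day return: the quantity the model is asked to predict the sign of
prices["NextReturn"] = prices["Return"].shift(-1)

# Drop rows whose outcome is unknown (here only the final row)
labelled = prices.dropna(subset=["NextReturn"]).copy()
labelled["Target"] = (labelled["NextReturn"] > 0).astype(int)

print(labelled[["Close", "NextReturn", "Target"]])
```

On the full dataset this would remove a single row, so the effect on training is small, but it keeps an unknown outcome from being counted as a "down" day.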


In [12]:
# Train-Test Split and Feature Scaling
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Define feature columns (exclude Date, date_only, Target, and the same-day Return)
feature_cols = [col for col in combined_df.columns if col not in ['Date', 'Target', 'date_only', 'Return']]

X = combined_df[feature_cols]
y = combined_df['Target']

# Split into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle=False, test_size=0.2)

# Feature scaling
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print("Training and testing data prepared!")
print(f"X_train shape: {X_train_scaled.shape}, X_test shape: {X_test_scaled.shape}")
print(f"y_train distribution:\n{y_train.value_counts()}")
print(f"y_test distribution:\n{y_test.value_counts()}")


Training and testing data prepared!
X_train shape: (3758, 24), X_test shape: (940, 24)
y_train distribution:
Target
1    1952
0    1806
Name: count, dtype: int64
y_test distribution:
Target
1    497
0    443
Name: count, dtype: int64


In [13]:
# Train Random Forest Classifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, classification_report

# Initialize the model
rf_model = RandomForestClassifier(n_estimators=200, random_state=42)

# Train the model
rf_model.fit(X_train_scaled, y_train)

# Make predictions
y_pred = rf_model.predict(X_test_scaled)

# Evaluate performance
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)

print("Random Forest Model Performance:")
print(f"Accuracy : {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall   : {recall:.4f}")
print(f"F1-Score : {f1:.4f}")
print("\nConfusion Matrix:")
print(cm)
print("\nClassification Report:")
print(classification_report(y_test, y_pred))


Random Forest Model Performance:
Accuracy : 0.4745
Precision: 0.5484
Recall   : 0.0342
F1-Score : 0.0644

Confusion Matrix:
[[429  14]
 [480  17]]

Classification Report:
              precision    recall  f1-score   support

           0       0.47      0.97      0.63       443
           1       0.55      0.03      0.06       497

    accuracy                           0.47       940
   macro avg       0.51      0.50      0.35       940
weighted avg       0.51      0.47      0.33       940
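
The headline metrics can be recomputed by hand from the confusion matrix printed above, which makes the failure mode easier to see: the model predicts class 1 for only 31 of 940 test rows. The sketch below uses only the four counts from that matrix; the majority-class baseline is an added comparison, computed from the same counts, showing what always predicting the more frequent class would score.

```python
import numpy as np

# Confusion matrix from the Random Forest evaluation: rows = actual, columns = predicted
cm = np.array([[429, 14],
               [480, 17]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()
precision = tp / (tp + fp)   # of the predicted "up" days, how many were up
recall = tp / (tp + fn)      # of the actual "up" days, how many were found
f1 = 2 * precision * recall / (precision + recall)

# Accuracy of always predicting the most frequent actual class
majority_baseline = cm.sum(axis=1).max() / cm.sum()

print(f"Accuracy : {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall   : {recall:.4f}")
print(f"F1-Score : {f1:.4f}")
print(f"Majority-class baseline accuracy: {majority_baseline:.4f}")
```

An accuracy below the majority-class baseline indicates that the classifier has not learned a usable signal on this test period.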



In [14]:
# Check and Handle Class Imbalance
from collections import Counter
from imblearn.over_sampling import SMOTE

# Check target distribution
print("Original training target distribution:")
print(Counter(y_train))

# Apply SMOTE to balance the classes
smote = SMOTE(random_state=42)
X_train_res, y_train_res = smote.fit_resample(X_train_scaled, y_train)

# Check the new distribution
print("\nAfter SMOTE, training target distribution:")
print(Counter(y_train_res))

# Shapes after resampling
print(f"\nX_train_res shape: {X_train_res.shape}, y_train_res shape: {y_train_res.shape}")


Original training target distribution:
Counter({1: 1952, 0: 1806})

After SMOTE, training target distribution:
Counter({1: 1952, 0: 1952})

X_train_res shape: (3904, 24), y_train_res shape: (3904,)


In [15]:
# Retrain Random Forest on Balanced Data

# Initialize the Random Forest model
rf_model_balanced = RandomForestClassifier(n_estimators=200, random_state=42)

# Train the model on SMOTE-resampled data
rf_model_balanced.fit(X_train_res, y_train_res)

# Predict on the original test set
y_pred_balanced = rf_model_balanced.predict(X_test_scaled)

# Evaluate performance
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, classification_report

accuracy = accuracy_score(y_test, y_pred_balanced)
precision = precision_score(y_test, y_pred_balanced)
recall = recall_score(y_test, y_pred_balanced)
f1 = f1_score(y_test, y_pred_balanced)
cm = confusion_matrix(y_test, y_pred_balanced)

print("Random Forest Model Performance After Balancing:")
print(f"Accuracy : {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall   : {recall:.4f}")
print(f"F1-Score : {f1:.4f}")
print("\nConfusion Matrix:")
print(cm)
print("\nClassification Report:")
print(classification_report(y_test, y_pred_balanced))


Random Forest Model Performance After Balancing:
Accuracy : 0.4777
Precision: 0.6154
Recall   : 0.0322
F1-Score : 0.0612

Confusion Matrix:
[[433  10]
 [481  16]]

Classification Report:
              precision    recall  f1-score   support

           0       0.47      0.98      0.64       443
           1       0.62      0.03      0.06       497

    accuracy                           0.48       940
   macro avg       0.54      0.50      0.35       940
weighted avg       0.55      0.48      0.33       940



In [16]:
# Prepare Data for LSTM / GRU
from tensorflow.keras.preprocessing.sequence import TimeseriesGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, GRU, Dense, Dropout

# Define the number of time steps for sequences
time_steps = 5  # you can adjust

# Features and target
X_lstm = combined_df[feature_cols].values
y_lstm = combined_df['Target'].values

# Split chronologically first so the scaler never sees test-period rows
train_size = int(len(X_lstm) * 0.8)
X_train_raw, X_test_raw = X_lstm[:train_size], X_lstm[train_size:]
y_train_lstm, y_test_lstm = y_lstm[:train_size], y_lstm[train_size:]

# Fit a separate scaler on the training rows only, then apply it to the test rows
lstm_scaler = StandardScaler()
X_train_lstm = lstm_scaler.fit_transform(X_train_raw)
X_test_lstm = lstm_scaler.transform(X_test_raw)

# Create sequences using TimeseriesGenerator
train_generator = TimeseriesGenerator(X_train_lstm, y_train_lstm, length=time_steps, batch_size=32)
test_generator = TimeseriesGenerator(X_test_lstm, y_test_lstm, length=time_steps, batch_size=32)

# len() of a TimeseriesGenerator is the number of batches, not individual sequences
print(f"Number of training batches: {len(train_generator)}")
print(f"Number of testing batches: {len(test_generator)}")


Number of training batches: 118
Number of testing batches: 30
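
The two counts printed above are batches of 32, not individual sequences: with 3758 training rows and `time_steps = 5` there are 3753 windows, which fill 118 batches. For readers who prefer explicit arrays, the windowing can be sketched with NumPy alone. `make_sequences` is a hypothetical helper, and it assumes the usual convention that a window of `time_steps` consecutive rows is paired with the target of the row immediately after the window.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_sequences(features, targets, time_steps):
    """Return (X, y) where X[i] holds rows i..i+time_steps-1 and y[i] is targets[i+time_steps]."""
    features = np.asarray(features)
    targets = np.asarray(targets)
    # Result shape: (n_rows - time_steps + 1, n_features, time_steps)
    windows = sliding_window_view(features, window_shape=time_steps, axis=0)
    # Drop the last window (no following target) and reorder to (samples, time_steps, features)
    windows = windows[:-1].transpose(0, 2, 1)
    return windows, targets[time_steps:]

# Tiny illustrative input: 10 rows, 2 features
features_demo = np.arange(20).reshape(10, 2)
targets_demo = np.arange(10)
X_seq, y_seq = make_sequences(features_demo, targets_demo, time_steps=3)

print(X_seq.shape, y_seq.shape)
```

The resulting `(samples, time_steps, features)` array is the input layout that Keras recurrent layers such as `LSTM` and `GRU` expect.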


In [17]:
# Train a Feedforward Neural Network (MLP)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam

# Define the model architecture
mlp_model = Sequential([
    Dense(64, activation='relu', input_shape=(X_train_scaled.shape[1],)),
    Dropout(0.2),
    Dense(32, activation='relu'),
    Dropout(0.2),
    Dense(1, activation='sigmoid')  # binary classification
])

# Compile the model
mlp_model.compile(optimizer=Adam(learning_rate=0.001),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])

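# Train the model (labels must have the same number of rows as X_train_scaled,
# so the original y_train is used here rather than the SMOTE-resampled y_train_res)
history = mlp_model.fit(X_train_scaled, y_train.values,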
                        validation_split=0.2,
                        epochs=50,
                        batch_size=32,
                        verbose=1)

# Evaluate on test set
y_pred_mlp = (mlp_model.predict(X_test_scaled) > 0.5).astype(int)

from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, classification_report

accuracy = accuracy_score(y_test, y_pred_mlp)
precision = precision_score(y_test, y_pred_mlp)
recall = recall_score(y_test, y_pred_mlp)
f1 = f1_score(y_test, y_pred_mlp)
cm = confusion_matrix(y_test, y_pred_mlp)

print("MLP Model Performance:")
print(f"Accuracy : {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall   : {recall:.4f}")
print(f"F1-Score : {f1:.4f}")
print("\nConfusion Matrix:")
print(cm)
print("\nClassification Report:")
print(classification_report(y_test, y_pred_mlp))


Epoch 1/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 4s 11ms/step - accuracy: 0.5027 - loss: 0.7058 - val_accuracy: 0.5359 - val_loss: 0.6912
Epoch 2/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - accuracy: 0.5126 - loss: 0.7006 - val_accuracy: 0.5080 - val_loss: 0.6946
Epoch 3/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - accuracy: 0.5213 - loss: 0.6939 - val_accuracy: 0.5279 - val_loss: 0.6957
Epoch 4/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 1s 8ms/step - accuracy: 0.5220 - loss: 0.6940 - val_accuracy: 0.5306 - val_loss: 0.6950
Epoch 5/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - accuracy: 0.5246 - loss: 0.6951 - val_accuracy: 0.5306 - val_loss: 0.6955
Epoch 6/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 1s 9ms/step - accuracy: 0.5296 - loss: 0.6896 - val_accuracy: 0.5266 - val_loss: 0.6974
Epoch 7/50
94/94 ━━━━━━━━━

In [20]:
# Hyperparameter Tuning with Keras Tuner
import keras_tuner as kt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam

# Define the model building function for Keras Tuner
def build_mlp_model(hp):
    model = Sequential()
    
    # Tune number of layers (1-3)
    for i in range(hp.Int('num_layers', 1, 3)):
        model.add(Dense(units=hp.Int(f'units_{i}', min_value=32, max_value=128, step=32),
                        activation='relu'))
        model.add(Dropout(rate=hp.Float(f'dropout_{i}', 0.1, 0.5, step=0.1)))
    
    model.add(Dense(1, activation='sigmoid'))
    
    # Tune learning rate
    lr = hp.Choice('learning_rate', [1e-2, 1e-3, 1e-4])
    model.compile(optimizer=Adam(learning_rate=lr),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    return model

# Initialize the tuner
tuner = kt.Hyperband(build_mlp_model,
                     objective='val_accuracy',
                     max_epochs=30,
                     factor=3,
                     directory='mlp_tuning',
                     project_name='google_stock_mlp')

# Perform hyperparameter search (features and labels must have matching row counts)
tuner.search(X_train_scaled, y_train.values, validation_split=0.2, epochs=30, batch_size=32, verbose=1)

# Get the best model and hyperparameters
best_model = tuner.get_best_models(num_models=1)[0]
best_hyperparameters = tuner.get_best_hyperparameters(num_trials=1)[0]

print("Best Hyperparameters:")
print(best_hyperparameters.values)


Reloading Tuner from mlp_tuning\google_stock_mlp\tuner0.json

Best Hyperparameters:
{'num_layers': 2, 'units_0': 96, 'dropout_0': 0.4, 'learning_rate': 0.001, 'units_1': 32, 'dropout_1': 0.2, 'units_2': 96, 'dropout_2': 0.1, 'tuner/epochs': 10, 'tuner/initial_epoch': 4, 'tuner/bracket': 2, 'tuner/round': 1, 'tuner/trial_id': '0065'}


In [21]:
# Retrain Best MLP Model and Evaluate
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam

# Define the best model architecture based on tuner
best_mlp = Sequential([
    Dense(96, activation='relu', input_shape=(X_train_scaled.shape[1],)),
    Dropout(0.4),
    Dense(32, activation='relu'),
    Dropout(0.2),
    Dense(1, activation='sigmoid')
])

# Compile the model with the best learning rate
best_mlp.compile(optimizer=Adam(learning_rate=0.001),
                 loss='binary_crossentropy',
                 metrics=['accuracy'])

# Train the model (labels must have the same number of rows as X_train_scaled)
history = best_mlp.fit(X_train_scaled, y_train.values,
                       validation_split=0.2,
                       epochs=50,
                       batch_size=32,
                       verbose=1)

# Predict on the original test set
y_pred_best = (best_mlp.predict(X_test_scaled) > 0.5).astype(int)

# Evaluate performance
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, classification_report

accuracy = accuracy_score(y_test, y_pred_best)
precision = precision_score(y_test, y_pred_best)
recall = recall_score(y_test, y_pred_best)
f1 = f1_score(y_test, y_pred_best)
cm = confusion_matrix(y_test, y_pred_best)

print("Best MLP Model Performance:")
print(f"Accuracy : {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall   : {recall:.4f}")
print(f"F1-Score : {f1:.4f}")
print("\nConfusion Matrix:")
print(cm)
print("\nClassification Report:")
print(classification_report(y_test, y_pred_best))


Epoch 1/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 4s 12ms/step - accuracy: 0.5000 - loss: 0.7120 - val_accuracy: 0.5160 - val_loss: 0.6909
Epoch 2/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - accuracy: 0.5273 - loss: 0.6963 - val_accuracy: 0.5253 - val_loss: 0.6939
Epoch 3/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 2s 9ms/step - accuracy: 0.5196 - loss: 0.6948 - val_accuracy: 0.5199 - val_loss: 0.6936
Epoch 4/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - accuracy: 0.5110 - loss: 0.6962 - val_accuracy: 0.5279 - val_loss: 0.6982
Epoch 5/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - accuracy: 0.5173 - loss: 0.6955 - val_accuracy: 0.5226 - val_loss: 0.6947
Epoch 6/50
94/94 ━━━━━━━━━━━━━━━━━━━━ 1s 7ms/step - accuracy: 0.5133 - loss: 0.6926 - val_accuracy: 0.5279 - val_loss: 0.6984
Epoch 7/50
94/94 ━━━━━━━━━

In [27]:
mlp_wrapper.fit(X_train_res_scaled, y_train_res)
# Do not put mlp_wrapper alone as the last line
print("MLP trained successfully")


MLP trained successfully


In [28]:
# 1. Generate probabilities for the positive class
rf_pred_prob = rf_model.predict_proba(X_test_scaled)[:,1]
xgb_pred_prob = xgb_model.predict_proba(X_test_scaled)[:,1]
mlp_pred_prob = mlp_wrapper.predict_proba(X_test_scaled)[:,1]  # MLP fitted

# 2. Manual soft voting (average probabilities)
ensemble_prob = (rf_pred_prob + xgb_pred_prob + mlp_pred_prob) / 3
y_pred_ensemble = (ensemble_prob >= 0.5).astype(int)

# 3. Evaluate ensemble
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, classification_report

accuracy = accuracy_score(y_test, y_pred_ensemble)
precision = precision_score(y_test, y_pred_ensemble)
recall = recall_score(y_test, y_pred_ensemble)
f1 = f1_score(y_test, y_pred_ensemble)
cm = confusion_matrix(y_test, y_pred_ensemble)

print("Manual Soft-Voting Ensemble Performance:")
print(f"Accuracy : {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall   : {recall:.4f}")
print(f"F1-Score : {f1:.4f}")
print("\nConfusion Matrix:")
print(cm)
print("\nClassification Report:")
print(classification_report(y_test, y_pred_ensemble))


Manual Soft-Voting Ensemble Performance:
Accuracy : 0.4713
Precision: 0.0000
Recall   : 0.0000
F1-Score : 0.0000

Confusion Matrix:
[[443   0]
 [497   0]]

Classification Report:
              precision    recall  f1-score   support

           0       0.47      1.00      0.64       443
           1       0.00      0.00      0.00       497

    accuracy                           0.47       940
   macro avg       0.24      0.50      0.32       940
weighted avg       0.22      0.47      0.30       940



In [29]:
print("RF prediction counts:", np.bincount(rf_model.predict(X_test_scaled)))
print("XGB prediction counts:", np.bincount(xgb_model.predict(X_test_scaled)))
print("MLP prediction counts:", np.bincount(mlp_wrapper.predict(X_test_scaled)))


RF prediction counts: [940]
XGB prediction counts: [940]
MLP prediction counts: [  6 934]


In [30]:
ensemble_prob = (0.2*rf_pred_prob + 0.2*xgb_pred_prob + 0.6*mlp_pred_prob)
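
The equal-weight and weighted averages used in these cells are two instances of the same operation, which can be sketched as one helper. `soft_vote` is a hypothetical function and the probabilities below are made-up values, chosen so that two models sit slightly below 0.5 while the third sits above it. With equal weights the ensemble follows the majority of the models; shifting weight to the third model flips the prediction.

```python
import numpy as np

def soft_vote(prob_list, weights=None, threshold=0.5):
    """Weighted average of per-model positive-class probabilities, then thresholding."""
    probs = np.column_stack(prob_list)            # shape: (n_samples, n_models)
    if weights is None:
        weights = np.ones(probs.shape[1])
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()             # normalize so the result stays in [0, 1]
    ensemble_prob = probs @ weights
    return ensemble_prob, (ensemble_prob >= threshold).astype(int)

# Made-up probabilities for two test rows from three models
rf_demo = np.array([0.40, 0.45])
xgb_demo = np.array([0.42, 0.44])
mlp_demo = np.array([0.60, 0.55])

equal_prob, equal_pred = soft_vote([rf_demo, xgb_demo, mlp_demo])
weighted_prob, weighted_pred = soft_vote([rf_demo, xgb_demo, mlp_demo], weights=[0.2, 0.2, 0.6])

print("Equal weights   :", np.round(equal_prob, 3), equal_pred)
print("Weights 0.2/0.2/0.6:", np.round(weighted_prob, 3), weighted_pred)
```

Reweighting changes which constant the ensemble outputs when the member models are themselves near-constant; it does not add information that none of the members has.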


In [33]:
threshold = 0.3  # experiment with 0.3, 0.35, 0.4, etc.
ensemble_pred = (ensemble_prob >= threshold).astype(int)

# Evaluate performance
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score, confusion_matrix, classification_report

print("Ensemble Performance with threshold =", threshold)
print("Accuracy :", accuracy_score(y_test, ensemble_pred))
print("Precision:", precision_score(y_test, ensemble_pred))
print("Recall   :", recall_score(y_test, ensemble_pred))
print("F1-Score :", f1_score(y_test, ensemble_pred))
print("Confusion Matrix:\n", confusion_matrix(y_test, ensemble_pred))
print("Classification Report:\n", classification_report(y_test, ensemble_pred))


Ensemble Performance with threshold = 0.3
Accuracy : 0.5287234042553192
Precision: 0.5287234042553192
Recall   : 1.0
F1-Score : 0.6917188587334725
Confusion Matrix:
 [[  0 443]
 [  0 497]]
Classification Report:
               precision    recall  f1-score   support

           0       0.00      0.00      0.00       443
           1       0.53      1.00      0.69       497

    accuracy                           0.53       940
   macro avg       0.26      0.50      0.35       940
weighted avg       0.28      0.53      0.37       940
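
At threshold 0.3 the ensemble predicts class 1 for every sample, the mirror image of the earlier all-zero result. A threshold-free measure such as ROC AUC can indicate whether the blended probabilities rank positive samples above negative ones at all. The sketch below computes AUC by direct pairwise comparison with numpy; the function name `pairwise_auc` and the demo arrays are hypothetical. In the notebook, `sklearn.metrics.roc_auc_score(y_test, ensemble_prob)` computes the same quantity.

```python
import numpy as np

def pairwise_auc(y_true, scores):
    """ROC AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    y_true = np.asarray(y_true)
    scores = np.asarray(scores, dtype=float)
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("both classes must be present")
    # Compare every positive score with every negative score
    diff = pos[:, None] - neg[None, :]
    wins = np.sum(diff > 0) + 0.5 * np.sum(diff == 0)
    return float(wins / (pos.size * neg.size))

# Small illustrative example (assumption: not the notebook's data)
y_demo = np.array([0, 0, 1, 1])
score_demo = np.array([0.10, 0.40, 0.35, 0.80])
auc_demo = pairwise_auc(y_demo, score_demo)

# Scores that carry no ranking information give an AUC of 0.5
auc_flat = pairwise_auc(y_demo, np.full(4, 0.42))
print(auc_demo, auc_flat)
```

An AUC close to 0.5 would mean that no threshold can separate the classes, so further threshold tuning would only flip which single class is predicted.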



In [35]:
# MLP predictions (probability for class 1 only)
mlp_pred_prob = mlp_wrapper.predict_proba(X_test_scaled)[:, 1]

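# Weighted soft-voting ensemble
# rf_pred_prob_cal and xgb_pred_prob_cal are assumed to be the calibrated RF and XGB
# class-1 probabilities produced in an earlier cell that is not shown here
ensemble_prob_cal = (0.3 * rf_pred_prob_cal +
                     0.3 * xgb_pred_prob_cal +
                     0.4 * mlp_pred_prob)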

# Apply threshold
threshold = 0.5
ensemble_pred_cal = (ensemble_prob_cal >= threshold).astype(int)

# Evaluate ensemble
print("Calibrated Ensemble Performance with threshold =", threshold)
print("Accuracy :", accuracy_score(y_test, ensemble_pred_cal))
print("Precision:", precision_score(y_test, ensemble_pred_cal))
print("Recall   :", recall_score(y_test, ensemble_pred_cal))
print("F1-Score :", f1_score(y_test, ensemble_pred_cal))
print("Confusion Matrix:\n", confusion_matrix(y_test, ensemble_pred_cal))
print("Classification Report:\n", classification_report(y_test, ensemble_pred_cal))


Calibrated Ensemble Performance with threshold = 0.5
Accuracy : 0.5287234042553192
Precision: 0.5287234042553192
Recall   : 1.0
F1-Score : 0.6917188587334725
Confusion Matrix:
 [[  0 443]
 [  0 497]]
Classification Report:
               precision    recall  f1-score   support

           0       0.00      0.00      0.00       443
           1       0.53      1.00      0.69       497

    accuracy                           0.53       940
   macro avg       0.26      0.50      0.35       940
weighted avg       0.28      0.53      0.37       940
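
The figures in this report follow directly from the confusion matrix, in which every sample is predicted as class 1 (TN = 0, FP = 443, FN = 0, TP = 497). As a worked check:

$$
\begin{aligned}
\text{Accuracy} &= \frac{TP + TN}{TP + TN + FP + FN} = \frac{497}{940} \approx 0.5287 \\
\text{Precision} &= \frac{TP}{TP + FP} = \frac{497}{940} \approx 0.5287 \\
\text{Recall} &= \frac{TP}{TP + FN} = \frac{497}{497} = 1 \\
F_1 &= \frac{2\,TP}{2\,TP + FP + FN} = \frac{994}{1437} \approx 0.6917
\end{aligned}
$$

Accuracy and precision coincide here because both reduce to the share of class 1 in the test set, which is what an always-positive classifier would score.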



In [36]:
import numpy as np
from sklearn.metrics import (accuracy_score, precision_score, recall_score, f1_score,
                             confusion_matrix, classification_report)

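# Test thresholds from 0.1 to 0.9 in steps of 0.05.
# Note: selecting the threshold on y_test makes the reported scores optimistic;
# a separate validation split is the safer place to tune it.
thresholds = np.arange(0.1, 0.91, 0.05)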
best_thresh, best_f1 = 0.5, 0

for t in thresholds:
    preds = (ensemble_prob_cal >= t).astype(int)
    f1 = f1_score(y_test, preds)
    if f1 > best_f1:
        best_f1, best_thresh = f1, t

print("Best threshold:", best_thresh)
print("Best F1-score:", best_f1)

# Recalculate metrics with best threshold
final_preds = (ensemble_prob_cal >= best_thresh).astype(int)
print("Accuracy :", accuracy_score(y_test, final_preds))
print("Precision:", precision_score(y_test, final_preds))
print("Recall   :", recall_score(y_test, final_preds))
print("F1-Score :", f1_score(y_test, final_preds))
print("Confusion Matrix:\n", confusion_matrix(y_test, final_preds))
print("Classification Report:\n", classification_report(y_test, final_preds))


Best threshold: 0.1
Best F1-score: 0.6917188587334725
Accuracy : 0.5287234042553192
Precision: 0.5287234042553192
Recall   : 1.0
F1-Score : 0.6917188587334725
Confusion Matrix:
 [[  0 443]
 [  0 497]]
Classification Report:
               precision    recall  f1-score   support

           0       0.00      0.00      0.00       443
           1       0.53      1.00      0.69       497

    accuracy                           0.53       940
   macro avg       0.26      0.50      0.35       940
weighted avg       0.28      0.53      0.37       940
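
The search above selects the threshold on the test set, and every threshold at or below the winning one yields the same all-positive prediction. A minimal sketch of the same search run on a held-out validation split is given here. The names `y_val` and `val_prob` are hypothetical, and the arrays are illustrative values rather than notebook output.

```python
import numpy as np

def f1_binary(y_true, y_pred):
    """F1 for the positive class; returns 0.0 when there are no positives at all."""
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def pick_threshold(y_val, val_prob, thresholds):
    """Return the first threshold that maximizes F1 on the validation data."""
    scores = [f1_binary(y_val, (val_prob >= t).astype(int)) for t in thresholds]
    best = int(np.argmax(scores))
    return float(thresholds[best]), float(scores[best])

# Illustrative validation labels and ensemble probabilities (assumption)
y_val = np.array([0, 0, 1, 1, 0, 1])
val_prob = np.array([0.22, 0.41, 0.47, 0.80, 0.31, 0.62])

thresholds = np.arange(0.1, 0.91, 0.05)
best_t, best_val_f1 = pick_threshold(y_val, val_prob, thresholds)
print("Threshold chosen on validation data:", round(best_t, 2))
print("Validation F1 at that threshold:", best_val_f1)
```

The chosen threshold would then be applied once to `ensemble_prob_cal` on the test set, so that the test metrics remain an unbiased estimate.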


