# Dividend Yield Prediction using Artificial Neural Network


---


This analysis explores the relationships between several financial metrics and dividend yield, and attempts to predict dividend yield with an artificial neural network. It covers data loading, a brief inspection of the data, feature scaling, model building, training, and evaluation.


---



## Import Libraries

In [1]:
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import tensorflow as tf
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error, r2_score
from tensorflow import keras
from tensorflow.keras import layers

## Import Dataset

In [2]:
df = pd.read_csv('/content/drive/MyDrive/datasets/stock_market_june2025.csv')

df.head()

,Date,Ticker,Open Price,Close Price,High Price,Low Price,Volume Traded,Market Cap,PE Ratio,Dividend Yield,EPS,52 Week High,52 Week Low,Sector
0,01-06-2025,SLH,34.92,34.53,35.22,34.38,2966611,57381360000.0,29.63,2.85,1.17,39.39,28.44,Industrials
1,01-06-2025,WGB,206.5,208.45,210.51,205.12,1658738,52747070000.0,13.03,2.73,16.0,227.38,136.79,Energy
2,01-06-2025,ZIN,125.1,124.03,127.4,121.77,10709898,55969490000.0,29.19,2.64,4.25,138.35,100.69,Healthcare
3,01-06-2025,YPY,260.55,265.28,269.99,256.64,14012358,79640890000.0,19.92,1.29,13.32,317.57,178.26,Industrials
4,01-06-2025,VKD,182.43,186.89,189.4,179.02,14758143,72714370000.0,40.18,1.17,4.65,243.54,165.53,Technology


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1762 entries, 0 to 1761
Data columns (total 14 columns):
 #   Column          Non-Null Count  Dtype  
---  ------          --------------  -----  
 0   Date            1762 non-null   object 
 1   Ticker          1762 non-null   object 
 2   Open Price      1762 non-null   float64
 3   Close Price     1762 non-null   float64
 4   High Price      1762 non-null   float64
 5   Low Price       1762 non-null   float64
 6   Volume Traded   1762 non-null   int64  
 7   Market Cap      1762 non-null   float64
 8   PE Ratio        1762 non-null   float64
 9   Dividend Yield  1762 non-null   float64
 10  EPS             1762 non-null   float64
 11  52 Week High    1762 non-null   float64
 12  52 Week Low     1762 non-null   float64
 13  Sector          1762 non-null   object 
dtypes: float64(10), int64(1), object(3)
memory usage: 192.8+ KB


## Define Features and Target

In [4]:
X = df.drop(['Date', 'Ticker', 'Sector', 'Dividend Yield'], axis=1).values
y = df['Dividend Yield'].values.reshape(-1, 1)
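
The cell above drops `Sector` together with the identifier columns, so the network never sees it. One possible way to keep it as a feature is one-hot encoding. The sketch below is an illustration only: it uses a toy frame built from the five rows shown by `df.head()`, and the names `demo`, `encoded`, and `X_demo` are hypothetical, not part of the notebook.

```python
import pandas as pd

# Toy frame copied from the df.head() output above (three columns only).
demo = pd.DataFrame({
    "PE Ratio": [29.63, 13.03, 29.19, 19.92, 40.18],
    "Sector": ["Industrials", "Energy", "Healthcare", "Industrials", "Technology"],
    "Dividend Yield": [2.85, 2.73, 2.64, 1.29, 1.17],
})

# Replace the categorical column with one 0/1 indicator column per sector.
encoded = pd.get_dummies(demo, columns=["Sector"], dtype=float)

# Feature matrix: every column except the target.
X_demo = encoded.drop(columns=["Dividend Yield"]).values

print(list(encoded.columns))
print(X_demo.shape)
```

Whether the extra columns help is an empirical question; the sketch only shows the mechanics of the encoding.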

## Split Train / Test

In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

## Normalise Features and Target

In [6]:
X_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()

X_train_scaled = X_scaler.fit_transform(X_train)
X_test_scaled = X_scaler.transform(X_test)

y_train_scaled = y_scaler.fit_transform(y_train)
y_test_scaled = y_scaler.transform(y_test)
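
With its default `feature_range=(0, 1)`, `MinMaxScaler` maps each column to `(x - min) / (max - min)`, where `min` and `max` come from the data passed to `fit`. Because the scalers above are fitted on the training split only, test values can fall slightly outside [0, 1]. The NumPy-only sketch below reproduces that arithmetic on synthetic stand-in arrays (the `*_demo` names and the random data are assumptions, not the notebook's data).

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for X_train / X_test (hypothetical data).
X_train_demo = rng.normal(50.0, 10.0, size=(100, 3))
X_test_demo = rng.normal(50.0, 10.0, size=(25, 3))

# Statistics are taken from the training split only.
col_min = X_train_demo.min(axis=0)
col_max = X_train_demo.max(axis=0)
col_range = col_max - col_min


def min_max(x):
    """Scale columns with the training min and range."""
    return (x - col_min) / col_range


X_train_scaled_demo = min_max(X_train_demo)
X_test_scaled_demo = min_max(X_test_demo)

# Inverting the transform recovers the original values.
X_test_restored = X_test_scaled_demo * col_range + col_min

print(X_train_scaled_demo.min(axis=0), X_train_scaled_demo.max(axis=0))
print(X_test_scaled_demo.min(), X_test_scaled_demo.max())
```

Fitting on the training split alone keeps information about the test set out of the preprocessing step.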

## Create the Neural Network

In [7]:
model = keras.Sequential([
    keras.Input(shape=(X_train_scaled.shape[1],)),  # one input per feature
    layers.Dense(64, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(1, activation='linear'),  # single regression output
])



## Train the Model

In [8]:
# Compile
model.compile(optimizer='adam', loss='mean_squared_error', metrics=['MSE'])

# Train
model.fit(X_train_scaled, y_train_scaled, batch_size=32, epochs=100)

Epoch 1/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - MSE: 0.1886 - loss: 0.1886
Epoch 2/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - MSE: 0.0619 - loss: 0.0619
Epoch 3/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - MSE: 0.0582 - loss: 0.0582
Epoch 4/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - MSE: 0.0550 - loss: 0.0550
Epoch 5/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step - MSE: 0.0536 - loss: 0.0536
Epoch 6/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - MSE: 0.0506 - loss: 0.0506
Epoch 7/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - MSE: 0.0492 - loss: 0.0492
Epoch 8/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 2ms/step - MSE: 0.0495 - loss: 0.0495
Epoch 9/100
45/45 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step - M

<keras.src.callbacks.history.History at 0x78ea509f39d0>

In [9]:
# Predict on test data (scaled)
y_pred_scaled = model.predict(X_test_scaled)

12/12 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step


In [10]:
# Inverse-transform predictions and targets back to the original scale
# (left commented out; the evaluation below uses the scaled values)
# y_pred = y_scaler.inverse_transform(y_pred_scaled)
# y_test_original = y_scaler.inverse_transform(y_test_scaled)

## Model Evaluation

In [11]:
# Evaluate the model on the scaled test set
mse = mean_squared_error(y_test_scaled, y_pred_scaled)
r2 = r2_score(y_test_scaled, y_pred_scaled)

print(f"Test MSE: {mse:.4f}")
print(f"Test R2: {r2:.4f}")

Test MSE: 0.0476
Test R2: 0.0452


## Conclusion

The model achieved an R-squared score of 0.0452 and an MSE of 0.0476 on the scaled test set. An R-squared this close to zero means the model explains only about 4.5% of the variance in dividend yield, which is barely better than always predicting the mean. The current features and architecture therefore have very limited predictive power.
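
To put the R-squared printed by the evaluation cell (0.0452) in context, it helps to recall that R-squared is `1 - SS_res / SS_tot`, so a model that always predicts the mean of the evaluated targets scores exactly 0. The sketch below checks this on synthetic data and then derives, from the reported R-squared alone, how much lower the model's RMSE is than that mean baseline. The helper `r2_manual` and the `*_demo` arrays are hypothetical names introduced for illustration.

```python
import numpy as np


def r2_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


rng = np.random.default_rng(42)

# Synthetic targets (hypothetical, not the notebook's data).
y_true_demo = rng.uniform(0.0, 5.0, size=300)

# Baseline: always predict the mean of the evaluated targets.
baseline_pred = np.full_like(y_true_demo, y_true_demo.mean())
baseline_r2 = r2_manual(y_true_demo, baseline_pred)

# Since MSE_model = (1 - R2) * Var(y), the RMSE ratio is sqrt(1 - R2).
reported_r2 = 0.0452  # value printed by the evaluation cell
rmse_reduction = 1.0 - np.sqrt(1.0 - reported_r2)

print(f"Baseline R2: {baseline_r2:.4f}")
print(f"RMSE reduction vs. mean baseline: {rmse_reduction:.2%}")
```

Read this way, the reported score corresponds to an RMSE only a few percent below that of the constant mean prediction.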

Further work could involve exploring additional features, trying different neural network architectures, or experimenting with other machine learning models to improve the prediction of dividend yield. Evaluating the model's performance on the unscaled data would also provide a more direct interpretation of the prediction errors in terms of actual dividend yield values.
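
As a rough illustration of the last point, the sketch below maps scaled predictions back to the original units with `MinMaxScaler.inverse_transform` and compares the metrics on both scales. It runs on synthetic stand-in data with a simulated "model output" (all `*_demo` names and the noise level are assumptions), so the printed numbers say nothing about the real model. What it does show is that R-squared is unchanged by the rescaling, while MSE grows by the square of the target range.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(0)

# Synthetic stand-ins for y_train / y_test (hypothetical yields in percent).
y_train_demo = rng.uniform(0.0, 5.0, size=(200, 1))
y_test_demo = rng.uniform(0.0, 5.0, size=(50, 1))

# Fit the target scaler on the training targets only, as in the notebook.
y_scaler_demo = MinMaxScaler().fit(y_train_demo)
y_test_scaled_demo = y_scaler_demo.transform(y_test_demo)

# Simulated model output on the scaled axis: truth plus noise.
noise = rng.normal(0.0, 0.2, size=y_test_scaled_demo.shape)
y_pred_scaled_demo = y_test_scaled_demo + noise

# Map predictions back to the original units.
y_pred_demo = y_scaler_demo.inverse_transform(y_pred_scaled_demo)

mse_scaled = mean_squared_error(y_test_scaled_demo, y_pred_scaled_demo)
mse_original = mean_squared_error(y_test_demo, y_pred_demo)
rmse_original = float(np.sqrt(mse_original))
r2_scaled = r2_score(y_test_scaled_demo, y_pred_scaled_demo)
r2_original = r2_score(y_test_demo, y_pred_demo)

# Width of the training target range (max - min) used by the scaler.
target_range = float(y_scaler_demo.data_range_[0])

print(f"MSE scaled:    {mse_scaled:.4f}")
print(f"MSE original:  {mse_original:.4f}")
print(f"RMSE original: {rmse_original:.4f}")
print(f"R2 scaled / original: {r2_scaled:.4f} / {r2_original:.4f}")
```

In the notebook itself, the same idea would amount to enabling the commented-out `inverse_transform` lines and passing the unscaled arrays to the metric functions.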