Long Short-Term Memory (LSTM) networks are a type of Recurrent Neural Network (RNN) designed to learn patterns in sequential data. Unlike plain RNNs, LSTMs can learn long-term dependencies, which makes them effective for tasks where context from earlier inputs strongly influences the output. They achieve this with a memory cell and three gates (input, forget, and output) that control the flow of information, allowing the network to retain or discard information over time. LSTMs are widely used in natural language processing, time series forecasting (such as stock price prediction), speech recognition, and anomaly detection, where sequential dependencies are crucial for accurate predictions. Because the gated cell state mitigates the vanishing gradient problem, LSTMs can carry information across long sequences, which makes them a powerful tool for sequence modeling.
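To make the role of the gates concrete, here is a minimal NumPy sketch of a single LSTM time step. It is an illustration only, not the Keras implementation used later in this notebook: the function name `lstm_step`, the weight layout, and the order in which the gates are packed into the weight matrices are assumptions made for this example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step.

    W: (4H, D) input weights, U: (4H, H) recurrent weights, b: (4H,) biases.
    The four blocks are packed as [input, forget, output, candidate];
    this packing order is a choice made for this sketch.
    """
    H = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:H])          # input gate: how much new information to write
    f = sigmoid(z[H:2 * H])      # forget gate: how much old cell state to keep
    o = sigmoid(z[2 * H:3 * H])  # output gate: how much of the cell to expose
    g = np.tanh(z[3 * H:4 * H])  # candidate values to write into the cell
    c_t = f * c_prev + i * g     # updated memory cell
    h_t = o * np.tanh(c_t)       # updated hidden state
    return h_t, c_t

# Run three scaled "prices" through a randomly initialised cell
rng = np.random.default_rng(0)
D, H = 1, 4
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)

h, c = np.zeros(H), np.zeros(H)
for x in [0.1, 0.2, 0.3]:
    h, c = lstm_step(np.array([x]), h, c, W, U, b)
```

The cell state update `c_t = f * c_prev + i * g` is additive, which is what lets information (and gradients) survive across many steps when the forget gate stays close to 1.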


The code below fetches historical stock prices, trains an LSTM on them, and forecasts future prices. It starts by downloading daily closing prices from Yahoo Finance for a user-specified ticker symbol and date range. The prices are scaled to the range 0 to 1, which helps the network train stably. Overlapping windows of 50 consecutive prices are then built as inputs, each paired with the following day's price as the target, and an LSTM is trained on them. After training, the model forecasts 365 steps ahead by repeatedly feeding its own predictions back in as input, and the forecasts are converted back to the original price scale. Finally, the historical and predicted prices are plotted in an interactive Plotly graph.
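The scaling step and its inverse can be sketched by hand in NumPy. This mirrors what `MinMaxScaler(feature_range=(0, 1))` computes for a single column; the helper names and the sample prices are made up for illustration.

```python
import numpy as np

def minmax_fit(values):
    """Return the (min, max) pair that defines the scaling."""
    return float(np.min(values)), float(np.max(values))

def minmax_transform(values, lo, hi):
    """Map values from [lo, hi] onto [0, 1]."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

def minmax_inverse(scaled, lo, hi):
    """Map scaled values back to the original price scale."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

prices = np.array([100.0, 120.0, 90.0, 150.0])  # made-up prices
lo, hi = minmax_fit(prices)
scaled = minmax_transform(prices, lo, hi)
restored = minmax_inverse(scaled, lo, hi)

# A scaled prediction above 1.0 maps to a price above the historical maximum
above_range = float(minmax_inverse(1.5, lo, hi))
print(above_range)  # 180.0
```

The inverse transform does not clip, so a model output outside the 0 to 1 range turns into a price outside the range seen during training.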


In [7]:
import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM
import plotly.graph_objects as go
import json

# Function to fetch data from Yahoo Finance
def fetch_data(ticker, start_date, end_date):
    df = yf.download(ticker, start=start_date, end=end_date)
    df = df[['Close']].rename(columns={'Close': 'price'})
    return df

# Function to preprocess data
def preprocess_data(df):
    scaler = MinMaxScaler(feature_range=(0, 1))
    df_scaled = scaler.fit_transform(df[['price']])
    return df_scaled, scaler

# Function to create sequences for LSTM
def create_sequences(data, time_step=50):
    X, y = [], []
    for i in range(len(data) - time_step):  # one window per available target, including the last row
        X.append(data[i:(i + time_step), 0])
        y.append(data[i + time_step, 0])
    return np.array(X), np.array(y)

# Function to train the LSTM model
def train_lstm_model(X, y, time_step=50):
    model = Sequential()
    model.add(LSTM(50, return_sequences=True, input_shape=(time_step, 1)))
    model.add(LSTM(50, return_sequences=False))
    model.add(Dense(25))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error')
    model.fit(X, y, epochs=20, batch_size=32, verbose=1)
    return model

# Function to predict future stock prices using LSTM model
def predict_future_prices(model, scaled_data, scaler, time_step=50, output_days=365):
    # Seed the rolling window with the last `time_step` scaled prices
    window = scaled_data[-time_step:, 0].tolist()
    predicted_prices = []

    for _ in range(output_days):
        # Shape the current window as (batch, time_step, features)
        x_input = np.array(window).reshape(1, time_step, 1)
        pred = float(model.predict(x_input, verbose=0)[0][0])

        # Slide the window: drop the oldest value, append the new prediction
        window = window[1:] + [pred]
        predicted_prices.append(pred)

    # Convert predictions back to original scale
    predicted_prices = scaler.inverse_transform(np.array(predicted_prices).reshape(-1, 1))
    return predicted_prices

# Function to visualize historical and predicted stock prices
def visualize_results(data, predicted_prices, stock_name, output_days=365):
    fig = go.Figure()
    fig.add_trace(go.Scatter(x=data.index, y=data['price'], mode='lines', name=f'Historical Prices for {stock_name}'))
    # Start the forecast on the day after the last historical date instead of a hard-coded 2024-01-01
    future_dates = pd.date_range(start=data.index[-1] + pd.Timedelta(days=1), periods=output_days)
    fig.add_trace(go.Scatter(x=future_dates, y=predicted_prices.flatten(), mode='lines', name=f'Predicted Prices for {stock_name}'))
    fig.update_layout(title=f'Stock Price Prediction for {stock_name} using LSTM', xaxis_title='Date', yaxis_title='Price')
    fig.show()

# Main function to accept user input and run the prediction
def main():
    # Ask user for input details
    ticker = input("Enter the stock ticker (e.g., AAPL): ")
    start_date = input("Enter the start date (YYYY-MM-DD) for historical data (e.g., 2021-01-01): ")
    end_date = input("Enter the end date (YYYY-MM-DD) for historical data (e.g., 2023-12-31): ")

    # Step 1: Fetch historical data
    data = fetch_data(ticker, start_date, end_date)

    # Step 2: Preprocess the data
    scaled_data, scaler = preprocess_data(data)

    # Step 3: Prepare data for LSTM
    time_step = 50
    X, y = create_sequences(scaled_data, time_step)
    X = X.reshape(X.shape[0], X.shape[1], 1)  # Reshape for LSTM

    # Step 4: Train LSTM model
    model = train_lstm_model(X, y, time_step)

    # Step 5: Predict future stock prices
    output_days = 365  # Forecast horizon: number of steps to predict after the last historical date
    predicted_prices = predict_future_prices(model, scaled_data, scaler, time_step, output_days)

    # Step 6: Visualize the results
    visualize_results(data, predicted_prices, ticker, output_days)

# Run the prediction
main()


Enter the stock ticker (e.g., AAPL): IOC.NS
Enter the start date (YYYY-MM-DD) for historical data (e.g., 2021-01-01): 2023-04-23
Enter the end date (YYYY-MM-DD) for historical data (e.g., 2023-12-31): 2024-04-23


[*********************100%***********************]  1 of 1 completed

Epoch 1/20

Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.

7/7 ━━━━━━━━━━━━━━━━━━━━ 4s 42ms/step - loss: 0.1179
Epoch 2/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 43ms/step - loss: 0.0452
Epoch 3/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 47ms/step - loss: 0.0214
Epoch 4/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 44ms/step - loss: 0.0102
Epoch 5/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 0s 44ms/step - loss: 0.0103
Epoch 6/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 71ms/step - loss: 0.0051
Epoch 7/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 77ms/step - loss: 0.0056
Epoch 8/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 74ms/step - loss: 0.0054
Epoch 9/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 75ms/step - loss: 0.0047
Epoch 10/20
7/7 ━━━━━━━━━━━━━━━━━━━━ 1s 81ms/step - loss: 0.0045
Epoch 11/20
7/7 ━

Stock Price Prediction using Recurrent Neural Networks (RNN)
In this project, I aim to predict stock prices for Indian Oil Corporation (IOC), listed on the National Stock Exchange (IOC.NS). The prediction task leverages historical stock data from April 2023 to April 2024.

Data Overview
The dataset consists of daily closing prices downloaded from Yahoo Finance for the one-year period from April 23, 2023 to April 23, 2024. Only the closing price is used, and it is scaled to the range 0 to 1 before training. The model learns temporal dependencies from this series, so the history is cut into overlapping windows that each serve as one training example.
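As a small illustration of how a price series becomes training examples, the sketch below cuts a toy series into sliding windows. The helper `make_windows` is hypothetical and written for this example; it follows the same idea as the notebook's `create_sequences` but is not a copy of it.

```python
import numpy as np

def make_windows(series, time_step):
    """Each X row holds `time_step` consecutive values; y is the value that follows."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + time_step] for i in range(len(series) - time_step)])
    y = series[time_step:]
    # LSTM layers expect input shaped (samples, time steps, features)
    return X.reshape(-1, time_step, 1), y

prices = np.arange(10, dtype=float)  # toy series 0, 1, ..., 9
X, y = make_windows(prices, time_step=3)
print(X.shape, y.shape)     # (7, 3, 1) (7,)
print(X[0].ravel(), y[0])   # [0. 1. 2.] 3.0
```

A series of length n yields n minus `time_step` windows, so a one-year history with a 50-day window leaves fewer than 200 training examples.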

Model Architecture
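I used an LSTM, a gated Recurrent Neural Network (RNN) architecture that is well suited to sequential data such as time series. The network stacks two LSTM layers of 50 units each, followed by dense layers of 25 units and 1 unit, and takes windows of 50 consecutive scaled closing prices as input. It is expected to capture the sequential patterns in the price history so that future values can be forecast from past ones.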
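The Keras warning printed during training recommends declaring the input with an `Input` object rather than passing `input_shape` to the first layer. A sketch of the same layer stack written that way is shown below, assuming TensorFlow 2.x with its bundled Keras; `build_model` is a name chosen for this example.

```python
from tensorflow.keras import Input
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

def build_model(time_step=50):
    """Stacked LSTM with the layer sizes used in this notebook."""
    model = Sequential([
        Input(shape=(time_step, 1)),      # explicit input: (time steps, features)
        LSTM(50, return_sequences=True),  # passes the full sequence to the next LSTM
        LSTM(50),                         # returns only the final hidden state
        Dense(25),
        Dense(1),                         # single output: the next scaled price
    ])
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model

model = build_model()
model.summary()
```

Declaring the input up front also means the model is built immediately, so `model.summary()` can be inspected before any data is passed in.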

The model is trained for 20 epochs with a batch size of 32, so each epoch passes over the training windows in small batches and updates the weights after every batch.
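The `7/7` counter in the training output is the number of batches per epoch. The arithmetic can be sketched as follows; the sample counts passed in are hypothetical and chosen only to illustrate it.

```python
import math

def steps_per_epoch(n_samples, batch_size=32):
    """Number of batches (weight updates) in one pass over the data."""
    return math.ceil(n_samples / batch_size)

# Hypothetical sample counts, for illustration only
print(steps_per_epoch(194))  # 7
print(steps_per_epoch(680))  # 22
```

With a batch size of 32, any sample count from 193 to 224 gives 7 steps per epoch, which is consistent with roughly one year of trading days minus a 50-day window.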

Training Process
The output above shows the first epochs of training, with the loss falling from 0.1179 in epoch 1 to 0.0045 by epoch 10. In each step the network predicts the next day's scaled price from the previous 50 values, and its weights are then updated to reduce the mean squared prediction error.
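The predict-then-update cycle can be illustrated with a deliberately tiny example. The sketch below uses a linear model and one step of plain gradient descent on the mean squared error; the notebook itself trains an LSTM with the Adam optimizer, so this is a simplified stand-in, and the data values are made up.

```python
import numpy as np

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

# Three toy windows of scaled prices and the value that follows each one
X = np.array([[0.1, 0.2, 0.3],
              [0.2, 0.3, 0.4],
              [0.3, 0.4, 0.5]])
y = np.array([0.4, 0.5, 0.6])

w = np.zeros(3)        # weights of a linear stand-in model
b = 0.0
learning_rate = 0.1

# 1. Predict with the current weights and measure the error
pred = X @ w + b
loss_before = mse(pred, y)

# 2. Compute the gradients of the mean squared error
error = pred - y
grad_w = 2.0 * X.T @ error / len(y)
grad_b = 2.0 * float(np.mean(error))

# 3. Move the weights a small step against the gradient
w -= learning_rate * grad_w
b -= learning_rate * grad_b

loss_after = mse(X @ w + b, y)
print(f"loss before: {loss_before:.4f}, after one update: {loss_after:.4f}")
```

Repeating this cycle over every batch and every epoch is what produces the falling loss values in the training log.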

In [5]:
import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LSTM, Dropout
import plotly.graph_objects as go

# Step 1: Data Collection
def fetch_data(ticker, start_date, end_date):
    df = yf.download(ticker, start=start_date, end=end_date)
    df = df[['Close']].rename(columns={'Close': 'price'})
    return df

# Step 2: Data Preprocessing
def preprocess_data(df):
    scaler = MinMaxScaler(feature_range=(0, 1))
    df_scaled = scaler.fit_transform(df[['price']])
    return df_scaled, scaler

# Step 3: Prepare Data for LSTM
def create_sequences(data, time_step=50):
    X, y = [], []
    for i in range(len(data) - time_step):  # one window per available target, including the last row
        X.append(data[i:(i + time_step), 0])
        y.append(data[i + time_step, 0])
    return np.array(X), np.array(y)

# Fetch historical data from 2021 to 2023
ticker = input("Enter the stock ticker for prediction (e.g., AAPL, MSFT): ")
data = fetch_data(ticker, '2021-01-01', '2023-12-31')  # Date range is fixed; the ticker comes from the prompt above
scaled_data, scaler = preprocess_data(data)

# Prepare data for LSTM
time_step = 60  # Increase time step to capture longer-term patterns
X, y = create_sequences(scaled_data, time_step)
X = X.reshape(X.shape[0], X.shape[1], 1)  # Reshape for LSTM

# Step 4: Define and Train LSTM Model with Improvements
model = Sequential()
model.add(LSTM(100, return_sequences=True, input_shape=(time_step, 1)))  # Increase LSTM units
model.add(Dropout(0.2))  # Add dropout to prevent overfitting
model.add(LSTM(100, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(50, activation='relu'))  # Add a Dense layer with activation
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')

model.fit(X, y, epochs=50, batch_size=32, verbose=1)  # Increase epochs for better training

# Step 5: Predict Future Stock Prices
window = scaled_data[-time_step:, 0].tolist()  # last `time_step` (60) scaled prices
output_days = 365  # Number of days to predict for 2024
predicted_prices = []

for _ in range(output_days):
    # Shape the current window as (batch, time_step, features)
    x_input = np.array(window).reshape(1, time_step, 1)
    pred = float(model.predict(x_input, verbose=0)[0][0])

    # Slide the window: drop the oldest value, append the new prediction
    window = window[1:] + [pred]
    predicted_prices.append(pred)

# Step 6: Convert Predictions Back to Original Scale
predicted_prices = scaler.inverse_transform(np.array(predicted_prices).reshape(-1, 1))

# Step 7: Visualize Results
fig = go.Figure()
fig.add_trace(go.Scatter(x=data.index, y=data['price'], mode='lines', name=f'Historical Prices (2021-2023) for {ticker}', line=dict(color='blue')))
future_dates = pd.date_range(start='2024-01-01', periods=output_days)
fig.add_trace(go.Scatter(x=future_dates, y=predicted_prices.flatten(), mode='lines', name=f'Predicted Prices (2024) for {ticker}', line=dict(color='red')))
fig.update_layout(title=f'Stock Price Prediction for 2024 using LSTM for {ticker}', xaxis_title='Date', yaxis_title='Price')
fig.show()


Enter the stock ticker for prediction (e.g., AAPL, MSFT): HINDPETRO.NS


[*********************100%***********************]  1 of 1 completed

Epoch 1/50

Do not pass an `input_shape`/`input_dim` argument to a layer. When using Sequential models, prefer using an `Input(shape)` object as the first layer in the model instead.

22/22 ━━━━━━━━━━━━━━━━━━━━ 7s 134ms/step - loss: 0.0339
Epoch 2/50
22/22 ━━━━━━━━━━━━━━━━━━━━ 4s 92ms/step - loss: 0.0051
Epoch 3/50
22/22 ━━━━━━━━━━━━━━━━━━━━ 3s 90ms/step - loss: 0.0038
Epoch 4/50
22/22 ━━━━━━━━━━━━━━━━━━━━ 3s 89ms/step - loss: 0.0036
Epoch 5/50
22/22 ━━━━━━━━━━━━━━━━━━━━ 2s 107ms/step - loss: 0.0035
Epoch 6/50
22/22 ━━━━━━━━━━━━━━━━━━━━ 3s 149ms/step - loss: 0.0033
Epoch 7/50
22/22 ━━━━━━━━━━━━━━━━━━━━ 2s 91ms/step - loss: 0.0036
Epoch 8/50
22/22 ━━━━━━━━━━━━━━━━━━━━ 2s 89ms/step - loss: 0.0034
Epoch 9/50
22/22 ━━━━━━━━━━━━━━━━━━━━ 2s 93ms/step - loss: 0.0031
Epoch 10/50
22/22 ━━━━━━━━━━━━━━━━━━━━ 2s 91ms/step - loss: 0.0030
Epoch 1
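The forecast loop in the cell above feeds every prediction back in as input for the next step. The sketch below shows why that matters, using a hypothetical stand-in model (not the trained LSTM) that overshoots the last value by just 1 percent; all names and numbers here are invented for illustration.

```python
import numpy as np

def recursive_forecast(predict_next, seed_window, steps):
    """Roll a one-step predictor forward by feeding its outputs back in."""
    window = list(seed_window)
    forecasts = []
    for _ in range(steps):
        nxt = float(predict_next(np.array(window)))
        forecasts.append(nxt)
        window = window[1:] + [nxt]  # drop the oldest value, append the prediction
    return np.array(forecasts)

# Hypothetical stand-in for a trained model: overshoots the last value by 1%
def slightly_biased_model(window):
    return 1.01 * window[-1]

seed = [0.5] * 60  # a flat scaled price history
forecast = recursive_forecast(slightly_biased_model, seed, steps=365)

one_step_error = abs(forecast[0] - 0.5)
final_drift = abs(forecast[-1] - 0.5)
print(f"one-step error: {one_step_error:.4f}, drift after 365 steps: {final_drift:.2f}")
```

A one-step error that looks negligible compounds into a large drift over 365 steps, so a low one-step training loss does not by itself guarantee a sensible long-horizon forecast.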

The graph displays the historical stock prices for Hindustan Petroleum Corporation Limited (HINDPETRO.NS) from January 2021 to December 2023, and the predicted stock prices for the year 2024 using a Long Short-Term Memory (LSTM) model.

Key Insights:

Historical Price Trend (Blue Line):
The blue line shows the actual closing prices from January 2021 to December 2023.
The price fluctuates between roughly ₹100 and ₹300, with noticeable ups and downs that reflect market volatility and company performance during this period.
There is a slight increase toward the end of 2023, suggesting a possible upward trend.

Predicted Price Trend (Red Line):
The LSTM model predicts a sharp rise in the stock price starting in early 2024.
The predicted prices escalate beyond ₹900 by mid-2024, after which they plateau.
This steep increase may indicate that the model expects significant growth in the company's stock value, but since it reaches about three times the highest historical price, it more plausibly reflects a modeling artifact. Each prediction is fed back in as input for the next step, so small errors can compound over the 365-step horizon, and the model may also be overfitting to the data.

Model Performance:
The training loss is very low (loss: 0.0012), which suggests that the model fits the historical data well. This loss is measured on the training data only, since no validation or test set is held out, so it says little about forecast accuracy.
However, the sudden and steep rise in predicted prices should be evaluated carefully, as it may signal that the model is overestimating future prices, which can happen when the model is insufficiently regularized or is trained on too few features.

Interpretation:
While the LSTM model predicts massive growth in Hindustan Petroleum's stock price in 2024, it’s important to validate these results with more domain-specific features or external factors that could substantiate the prediction (e.g., market news, company expansions, or macroeconomic factors).
Additionally, the sharp increase might suggest overfitting, where the model gives undue weight to recent trends in the historical data without considering the broader market context. Further fine-tuning and evaluation on held-out data (for example, time-series cross-validation) may be required to improve the predictions.
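One way to start that validation is to hold out the most recent part of the series and compare the model against a simple baseline. The sketch below shows a chronological split and a persistence baseline ("tomorrow equals today") on synthetic data; the helper names and the 80/20 split are assumptions for illustration, and the LSTM's own predictions would be scored with the same `rmse` function.

```python
import numpy as np

def chronological_split(X, y, test_fraction=0.2):
    """Keep time order: the earliest windows train, the latest ones test."""
    split = int(len(X) * (1 - test_fraction))
    return X[:split], y[:split], X[split:], y[split:]

def rmse(y_true, y_pred):
    diff = np.asarray(y_true) - np.asarray(y_pred)
    return float(np.sqrt(np.mean(diff ** 2)))

# Synthetic scaled series standing in for real prices
series = np.linspace(0.0, 1.0, 111)
time_step = 10
X = np.array([series[i:i + time_step] for i in range(len(series) - time_step)])
y = series[time_step:]

X_train, y_train, X_test, y_test = chronological_split(X, y)

# Persistence baseline: predict that the next value equals the last one seen
baseline_pred = X_test[:, -1]
baseline_rmse = rmse(y_test, baseline_pred)
print(len(X_train), len(X_test), round(baseline_rmse, 4))
```

Shuffling before splitting would leak future information into training, which is why the split is made by position in time. A model that cannot beat the persistence baseline on the held-out windows is unlikely to produce a trustworthy year-long forecast.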
