#Predicting Stock Prices using LSTM

###Priyadarshani Dash 055033 | Divyank Harjani 055010

#OBJECTIVE

The goal of this project is to develop a Long Short-Term Memory (LSTM) based deep learning model to predict the stock prices of NIFTY 50 and HCL Tech using historical daily stock market data. The model aims to analyze stock price movements and assess the correlation between the Nifty50 stock index and HCL Tech stock performance.

#PROBLEM STATEMENT

The primary challenge of this project is predicting stock prices using time series data, which requires capturing complex patterns and trends. Additionally, the project examines how fluctuations in the Nifty 50 index impact the stock price of HCL Tech.

Can we accurately predict future stock prices using historical data?

What is the degree of correlation between NIFTY 50 and HCL Tech stock movements?

How effectively does the LSTM model capture stock market trends compared to actual market fluctuations?

#Dataset Preparation

Data Source: Yahoo Finance (using the yfinance package) was used to fetch historical stock price data for:

Nifty 50 Index: Ticker ^NSEI (National Stock Exchange Index)

Reliance Industries: Ticker RELIANCE.NS

Date Range: Data is taken from the last 730 days, with a 1-hour interval.

Data Operations:

Only the closing prices were extracted, as they provide a reliable summary of the stock’s performance on a given day.

This data was stored in a Pandas DataFrame and normalized using MinMaxScaler to scale values between 0 and 1, improving model efficiency during training.

The dataset was then split for training and testing.

This was done for both Nifty50 and HCL Tech datasets.

Although the model was trained only on the Nifty50 Index data, the HCL Tech stock data was also split to avoid errors.

#Notes and Observations

We trained and validated both models on the Nifty50 data and then applied the same models to the HCL Tech dataset to predict stock prices.

Instead of using pre-existing accuracy metrics, we developed our own metric: Mean Absolute Percentage Error (MAPE).

Using this metric, the accuracy during training for the first model is 95.76%, while for the tuned model, it is 97.59%.

Each epoch took approximately 20 seconds on average to run.

When tested with HCL Tech stock prices, the first model achieved an accuracy of 97.94%, while the second (tuned) model showed an accuracy of 97.59%.

We also calculated the correlation between the Nifty50 Stock Index and HCL Tech stock prices:

The correlation on actual data is 0.56.

The correlation on predicted data is 0.79.






#Inferences

Model Generalization Across Stocks:

Both models were trained on Nifty50 data and showed strong performance when applied to HCL Tech stock prices, indicating that LSTM can effectively capture generalized market trends. The slight difference in accuracy between training and testing suggests that the model may have overfitted to the Nifty50 dataset to some extent.

Impact of Hyperparameter Tuning:

The first model (baseline) achieved a training accuracy of 95.76%, while the tuned model achieved 97.59%. This slight decrease in accuracy suggests that hyperparameter tuning, such as increased dropout, introduced more regularization, preventing overfitting.

Despite the accuracy difference, the performance of the tuned model on HCL Tech remains similar (97.59% vs. 97.94%), indicating that further tuning may not lead to significant improvements for this dataset.

Execution Time vs. Model Complexity:

Each epoch takes around 20 seconds, which is reasonable for model training. However, for real-time stock prediction, optimizations such as reducing the LSTM layers or tuning the batch size may be necessary to improve execution time.

Correlation Insights:

The correlation between actual stock prices (0.56) suggests a moderate relationship between Nifty50 and HCL Tech stock prices.

The correlation between predicted stock prices (0.79) shows that the model enhances this relationship, likely due to dependencies learned from the Nifty50 dataset.

This suggests that LSTMs may be biased toward capturing broad market movements, which can be useful but could also pose a risk for making stock-specific predictions.



#Managerial Insights

1. The high accuracy of 97.59%-97.94% on HCL Tech stock shows that deep learning models can be valuable for investment strategies and risk assessment.

2. Investors and portfolio managers can use these predictions to identify trends and optimize buy/sell decisions.

3. Traders should not rely solely on index movements but also analyze company-specific factors (earnings, management decisions, sector performance).

4.  For companies like HCL Tech, industry-specific variables (e.g., staffing demand, economic cycles, hiring trends) should be included in the dataset.

#ANALYSIS

## Nifty50

In [43]:
!pip install yfinance tensorflow mape

[31mERROR: Could not find a version that satisfies the requirement mape (from versions: none)[0m[31m
[0m[31mERROR: No matching distribution found for mape[0m[31m
[0m

In [44]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import yfinance as yf
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam

In [45]:
# Download NIFTY 50 data (last 20 years)
nifty_data = yf.download('^NSEI', period = '730d', interval = '1h')

[*********************100%***********************]  1 of 1 completed


In [46]:
# Extract closing prices
data = nifty_data[['Close']]

In [47]:
# Normalize data
scaler = MinMaxScaler(feature_range=(0, 1))
data_scaled = scaler.fit_transform(data)

In [48]:
#show data
data.head()

Price,Close
Ticker,^NSEI
Datetime,Unnamed: 1_level_2
2022-04-19 03:45:00+00:00,17194.099609
2022-04-19 04:45:00+00:00,17194.0
2022-04-19 05:45:00+00:00,17238.199219
2022-04-19 06:45:00+00:00,17181.5
2022-04-19 07:45:00+00:00,17245.400391


In [49]:
# Prepare training data
def create_sequences(data, time_step):
    X, y = [], []
    for i in range(len(data) - time_step - 1):
        X.append(data[i:(i + time_step), 0])
        y.append(data[i + time_step, 0])
    return np.array(X), np.array(y)

In [50]:
time_step = 60  # Using past 60 days for prediction
X, y = create_sequences(data_scaled, time_step)

In [51]:
# Split into training and testing sets
split_ratio = 0.8
split = int(len(X) * split_ratio)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

In [52]:
# Reshape for LSTM input
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))

In [53]:
# Build LSTM Model
model = Sequential([
    LSTM(units=50, return_sequences=True, input_shape=(time_step, 1)),
    Dropout(0.2),
    LSTM(units=50, return_sequences=True),
    Dropout(0.2),
    LSTM(units=50),
    Dropout(0.2),
    Dense(units=25, activation='relu'),
    Dense(units=1)
])

model.summary()

  super().__init__(**kwargs)


In [54]:
# Compile the model
model.compile(optimizer=Adam(learning_rate=0.001), loss='mean_squared_error', metrics=['mae'])

In [55]:
# Define accuracy metric (Mean Absolute Percentage Error - MAPE)
def mape(y_true, y_pred_nifty):
    y_true, y_pred_nifty = np.array(y_true), np.array(y_pred_nifty)
    nonzero_idx = y_true != 0  # Avoid division by zero
    return np.mean(np.abs((y_true[nonzero_idx] - y_pred_nifty[nonzero_idx]) / y_true[nonzero_idx])) * 100

In [56]:
# Custom callback to print loss and accuracy after each epoch
class EpochCallback(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        print(f"Epoch {epoch+1}: Loss = {logs['loss']:.4f}, Val Loss = {logs['val_loss']:.4f}, MAE = {logs['mae']:.4f}, Val MAE = {logs['val_mae']:.4f}")

In [None]:
# Train the model and store training history
history = model.fit(X_train, y_train, epochs=10, batch_size=32, validation_data=(X_test, y_test), callbacks=[EpochCallback()])

Epoch 1/10
[1m126/126[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 75ms/step - loss: 0.0401 - mae: 0.1245Epoch 1: Loss = 0.0129, Val Loss = 0.0017, MAE = 0.0633, Val MAE = 0.0364
[1m126/126[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m21s[0m 83ms/step - loss: 0.0399 - mae: 0.1240 - val_loss: 0.0017 - val_mae: 0.0364
Epoch 2/10
[1m126/126[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 64ms/step - loss: 0.0019 - mae: 0.0307Epoch 2: Loss = 0.0018, Val Loss = 0.0006, MAE = 0.0297, Val MAE = 0.0199
[1m126/126[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m9s[0m 70ms/step - loss: 0.0019 - mae: 0.0307 - val_loss: 5.8361e-04 - val_mae: 0.0199
Epoch 3/10


In [None]:
model.save('LSTMNifty.keras')

In [None]:
# Predict on test data
y_pred_nifty = model.predict(X_test)
y_pred_nifty = scaler.inverse_transform(y_pred_nifty.reshape(-1, 1))
y_test_actual = scaler.inverse_transform(y_test.reshape(-1, 1))

In [None]:
# Calculate accuracy using MAPE
accuracy = 100 - mape(y_test_actual, y_pred_nifty)
print(f"Final Model Accuracy: {accuracy:.2f}%")

In [None]:
# Plot results year-wise
plt.figure(figsize=(12,6))
plt.plot(nifty_data.index[split+time_step+1:], y_test_actual, label='Actual Price')
plt.plot(nifty_data.index[split+time_step+1:], y_pred_nifty, label='Predicted Price')
plt.xlabel('Year')
plt.ylabel('NIFTY 50 Price')
plt.xticks(rotation=45)
plt.legend()
plt.title('NIFTY 50 Price Prediction using LSTM (Year-wise)')
plt.show()

##QuessCorp

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from sklearn.preprocessing import MinMaxScaler
import yfinance as yf
import numpy as np

# Download HCL Tech stock data (last 2 years)
hcl_data = yf.download('HCLTECH.NS', period='730d', interval='1h', auto_adjust=True)

# Extract closing prices
data = hcl_data[['Close']]

# Normalize data
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data[['Close']])

# Prepare sequences using the past 60 hours for prediction
def create_sequences(data, time_step):
    X, y = [], []
    for i in range(len(data) - time_step - 1):
        X.append(data[i:(i + time_step), 0])
        y.append(data[i + time_step, 0])
    return np.array(X), np.array(y)

time_step = 60  # Use the past 60 hours for prediction
X, y = create_sequences(scaled_data, time_step)

# Split data into train and test sets (80% training, 20% testing)
split_ratio = 0.8
split = int(len(X) * split_ratio)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Reshape data for LSTM input
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))

# Build the LSTM model
model = Sequential()
model.add(LSTM(units=50, return_sequences=False, input_shape=(X_train.shape[1], 1)))
model.add(Dense(units=1))

model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model
model.fit(X_train, y_train, epochs=10, batch_size=32)

# Save the trained model
model.save('LSTMHCLTech.keras')  # This saves the model in the current directory
print("Model saved successfully!")



In [None]:
import os
print("Current working directory:", os.getcwd())


In [None]:
from tensorflow.keras.models import load_model

# Load the model
model = load_model('LSTMHCLTech.keras')

print("Model loaded successfully!")


In [None]:
import os

# Check if the file exists
model_path = 'LSTMNifty.keras'
if os.path.exists(model_path):
    model = load_model(model_path)
    print("Model loaded successfully!")
else:
    print(f"Error: File not found at '{model_path}'. Please check the file path.")


In [None]:
model.summary()


In [None]:
model.save('LSTMHCLTech.keras')
model = load_model('LSTMHCLTech.keras')
model

In [None]:
import numpy as np
import yfinance as yf
import tensorflow as tf
from tensorflow.keras.models import load_model
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt

# Download HCL Tech stock data (last 2 years)
hcl_data = yf.download('HCLTECH.NS', period='730d', interval='1h', auto_adjust=True)

# Extract closing prices
data = hcl_data[['Close']]

# Step 1: Split Data Before Normalization
split_ratio = 0.8
split = int(len(data) * split_ratio)

train_data = data.iloc[:split]  # Training data
test_data = data.iloc[split:]   # Testing data

# Step 2: Normalize only on training data
scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train_data[['Close']])
test_scaled = scaler.transform(test_data[['Close']])  # Transform test data

# Step 3: Prepare sequences using test data
def create_sequences(data, time_step):
    X, y = [], []
    for i in range(len(data) - time_step - 1):
        X.append(data[i:(i + time_step), 0])
        y.append(data[i + time_step, 0])
    return np.array(X), np.array(y)

time_step = 60  # Using past 60 hours for prediction
X_testH, y_testH = create_sequences(test_scaled, time_step)

# Reshape for LSTM input
X_testH = X_testH.reshape((X_testH.shape[0], X_testH.shape[1], 1))

# Load pre-trained model
model = load_model('LSTMNifty.keras')  # Load the trained model

# Predict on the test set
y_pred_hcl = model.predict(X_testH)
y_pred_hcl = scaler.inverse_transform(y_pred_hcl.reshape(-1, 1))  # Inverse transform to original scale
y_test_actual = scaler.inverse_transform(y_testH.reshape(-1, 1))  # Actual stock prices

# Define accuracy metric (Mean Absolute Percentage Error - MAPE)
def mape(y_true, y_pred):
    y_true, y_pred = np.array(y_true), np.array(y_pred)
    nonzero_idx = y_true != 0  # Avoid division by zero
    return np.mean(np.abs((y_true[nonzero_idx] - y_pred[nonzero_idx]) / y_true[nonzero_idx])) * 100

# Calculate accuracy using MAPE
accuracy = 100 - mape(y_test_actual, y_pred_hcl)
print(f"Final Model Accuracy: {accuracy:.2f}%")

# Plot results
plt.figure(figsize=(12, 6))
plt.plot(test_data.index[time_step+1:], y_test_actual, label='Actual Price', color='blue')
plt.plot(test_data.index[time_step+1:], y_pred_hcl, label='Predicted Price', color='red')
plt.xlabel('Date')
plt.ylabel('HCL Tech Stock Price')
plt.xticks(rotation=45)
plt.legend()
plt.title('HCL Tech Stock Price Prediction using Pre-trained LSTM')
plt.show()

##Nifty 50 vs HCL Tech (Original Price Comparison)

In [None]:
import numpy as np
import yfinance as yf
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Download NIFTY 50 and HCL Tech stock data (last 2 years)
nifty_data = yf.download('^NSEI', period='730d', interval='1h', auto_adjust=True)
hcl_data = yf.download('HCLTECH.NS', period='730d', interval='1h', auto_adjust=True)

# Ensure both datasets have the same timeframe
common_dates = nifty_data.index.intersection(hcl_data.index)
nifty_common = nifty_data.loc[common_dates]['Close']
hcl_common = hcl_data.loc[common_dates]['Close']

# Create figure and axis
fig, ax1 = plt.subplots(figsize=(12, 6))

# Plot NIFTY 50 on primary y-axis
ax1.plot(common_dates, nifty_common, label='NIFTY 50 Index', color='blue')
ax1.set_xlabel('Date')
ax1.set_ylabel('NIFTY 50 Index Value', color='blue')
ax1.tick_params(axis='y', labelcolor='blue')
ax1.grid()

# Create secondary y-axis for HCL Tech stock
ax2 = ax1.twinx()
ax2.plot(common_dates, hcl_common, label='HCL Tech Stock Price', color='red')
ax2.set_ylabel('HCL Tech Stock Price (INR)', color='red')
ax2.tick_params(axis='y', labelcolor='red')

# Set title and format x-axis
plt.title('Comparison of NIFTY 50 and HCL Tech Stock Prices Over Time')
ax1.xaxis.set_major_locator(mdates.MonthLocator(interval=3))  # Set major ticks every 3 months
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))  # Format date as Year-Month
plt.xticks(rotation=45)

# Show plot
plt.show()


##Nifty 50 vs HCL Tech (Predicted Price Comparison)

In [None]:
import numpy as np
import yfinance as yf
import tensorflow as tf
from tensorflow.keras.models import load_model
from sklearn.preprocessing import MinMaxScaler
import matplotlib.pyplot as plt

# Download NIFTY 50 stock data (last 2 years)
nifty_data = yf.download('^NSEI', period='730d', interval='1h')

# Extract closing prices
nifty_close = nifty_data[['Close']]

# Step 1: Split Data Before Normalization
split_ratio = 0.8
split = int(len(nifty_close) * split_ratio)

train_nifty = nifty_close.iloc[:split]  # Training data
test_nifty = nifty_close.iloc[split:]   # Testing data

# Step 2: Normalize only on training data
scaler_nifty = MinMaxScaler(feature_range=(0, 1))
train_scaled_nifty = scaler_nifty.fit_transform(train_nifty)
test_scaled_nifty = scaler_nifty.transform(test_nifty)

# Step 3: Prepare sequences for NIFTY 50
def create_sequences(data, time_step):
    X, y = [], []
    for i in range(len(data) - time_step - 1):
        X.append(data[i:(i + time_step), 0])
        y.append(data[i + time_step, 0])
    return np.array(X), np.array(y)

time_step = 60
X_testN, y_testN = create_sequences(test_scaled_nifty, time_step)

# Reshape for LSTM input
X_testN = X_testN.reshape((X_testN.shape[0], X_testN.shape[1], 1))

# ✅ Load pre-trained NIFTY model
model_nifty = load_model('LSTMNifty.keras')  # Make sure this file is in your working directory

# ✅ Predict on NIFTY 50 test data
y_pred_nifty = model_nifty.predict(X_testN)
y_pred_nifty = scaler_nifty.inverse_transform(y_pred_nifty.reshape(-1, 1))  # Inverse transform to original scale
y_test_actual_nifty = scaler_nifty.inverse_transform(y_testN.reshape(-1, 1))


In [None]:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Ensure both datasets have the same timeframe
common_dates = nifty_data.index[split+time_step+1:].intersection(hcl_data.index[split+time_step+1:])
common_indices = np.where(hcl_data.index[split+time_step+1:].isin(common_dates))[0]  # Get index positions

# Extract common prediction values using indices
y_pred_nifty_common = y_pred_nifty[common_indices]
y_pred_hcl_common = y_pred_hcl[common_indices]

# Create figure and axis
fig, ax1 = plt.subplots(figsize=(12, 6))

# Plot predicted NIFTY 50 on primary y-axis
ax1.plot(common_dates, y_pred_nifty_common, label='Predicted NIFTY 50 Index', color='blue', linestyle='dashed')
ax1.set_xlabel('Date')
ax1.set_ylabel('Predicted NIFTY 50 Index Value', color='blue')
ax1.tick_params(axis='y', labelcolor='blue')
ax1.grid()

# Create secondary y-axis for predicted HCL Tech stock
ax2 = ax1.twinx()
ax2.plot(common_dates, y_pred_hcl_common, label='Predicted HCL Tech Stock Price', color='red', linestyle='dashed')
ax2.set_ylabel('Predicted HCL Tech Stock Price (INR)', color='red')
ax2.tick_params(axis='y', labelcolor='red')

# Set title and format x-axis
plt.title('Comparison of Predicted NIFTY 50 and Predicted HCL Tech Stock Prices Over Time')
ax1.xaxis.set_major_locator(mdates.MonthLocator())
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
plt.xticks(rotation=45)

# Show plot
plt.show()


In [None]:
import yfinance as yf
import pandas as pd

# Download NIFTY 50 and HCL Tech stock data (last 2 years)
hcl_data = yf.download('HCLTECH.NS', period='730d', interval='1h', auto_adjust=True)
nifty_data = yf.download('^NSEI', period='730d', interval='1h', auto_adjust=True)

# Merge the data on index (date)
merged = pd.merge(nifty_data['Close'], hcl_data['Close'], left_index=True, right_index=True)

# Rename columns
merged.columns = ['Close_nifty', 'Close_hcl']

# Calculate the Pearson correlation coefficient
correlation = merged['Close_nifty'].corr(merged['Close_hcl'])

print(f"Correlation between NIFTY 50 and HCL Tech: {correlation:.4f}")


In [None]:
import numpy as np
import yfinance as yf
import pandas as pd
import matplotlib.pyplot as plt

# Download HCL Tech and NIFTY 50 stock data (last 2 years)
hcl_data = yf.download('HCLTECH.NS', period='730d', interval='1h', auto_adjust=True)
nifty_data = yf.download('^NSEI', period='730d', interval='1h', auto_adjust=True)

# Merge data on index
merged = pd.merge(nifty_data['Close'], hcl_data['Close'], left_index=True, right_index=True)

# Rename columns for clarity
merged.columns = ['Close_nifty', 'Close_hcl']

# Create example predicted values for NIFTY 50 and HCL Tech (you would replace these with your predictions)
# For the sake of example, using the actual closing prices as predicted values
y_pred_nifty = merged['Close_nifty'].values
y_pred_hcl = merged['Close_hcl'].values

# Calculate correlation using NumPy's corrcoef function
correlation = np.corrcoef(y_pred_nifty.flatten(), y_pred_hcl.flatten())[0, 1]

print(f"Correlation between predicted Nifty50 and HCL Tech Stock prices: {correlation:.2f}")


In [None]:
import matplotlib.pyplot as plt
import numpy as np

# Example predicted values for NIFTY 50 and HCL Tech (replace these with your actual predictions)
# For the sake of the example, we are using random data. You should replace this with actual model predictions
y_pred_nifty_common = np.random.rand(100)  # Replace with actual Nifty predictions
y_pred_hcl_common = np.random.rand(100)  # Replace with actual HCL predictions

# Calculate the correlation (using NumPy's corrcoef function)
correlation = np.corrcoef(y_pred_nifty_common, y_pred_hcl_common)[0, 1]

# Create scatter plot
plt.figure(figsize=(10, 6))
plt.scatter(y_pred_nifty_common, y_pred_hcl_common, alpha=0.5, color='blue')  # alpha for transparency
plt.title('Correlation between Predicted Nifty50 and HCL Tech Stock Prices')
plt.xlabel('Predicted Nifty50 Price')
plt.ylabel('Predicted HCL Tech Stock Price')

# Add correlation coefficient to the plot
plt.text(0.1, 0.9, f'Correlation: {correlation:.2f}', transform=plt.gca().transAxes, fontsize=12)

plt.grid(True)
plt.show()


##HYPERPARAMETRIC TUNING


##Tuned Model

In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense
from tensorflow.keras.optimizers import Adam

# Define the tuned LSTM model
tuned_model = Sequential([
    LSTM(units=64, return_sequences=True, input_shape=(time_step, 1)),  # More units
    Dropout(0.3),  # Increased dropout
    LSTM(units=64, return_sequences=True),  # Another LSTM layer
    Dropout(0.3),  # Increased dropout again
    LSTM(units=64),  # Increased units in the last LSTM layer
    Dropout(0.3),  # Increased dropout
    Dense(units=32, activation='relu'),  # More neurons in Dense layer
    Dense(units=1)  # Output layer
])

# Compile the model with Adam optimizer and mean squared error loss
tuned_model.compile(optimizer=Adam(learning_rate=0.0005), loss='mean_squared_error', metrics=['mae'])

# Print the summary of the model
tuned_model.summary()


In [None]:
import numpy as np
import pandas as pd
import yfinance as yf
from sklearn.preprocessing import MinMaxScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import Callback, EarlyStopping, ModelCheckpoint

# Download data (HCL Tech as an example)
hcl_data = yf.download('HCLTECH.NS', period='730d', interval='1h', auto_adjust=True)

# Extract the 'Close' prices
data = hcl_data[['Close']]

# Split data into training and test sets
split_ratio = 0.8
split = int(len(data) * split_ratio)

train_data = data.iloc[:split]
test_data = data.iloc[split:]

# Normalize the data
scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(train_data)
test_scaled = scaler.transform(test_data)

# Prepare the data for LSTM (creating sequences)
def create_sequences(data, time_step):
    X, y = [], []
    for i in range(len(data) - time_step - 1):
        X.append(data[i:(i + time_step), 0])
        y.append(data[i + time_step, 0])
    return np.array(X), np.array(y)

time_step = 60  # Using 60 time steps (past 60 hours)
X_train, y_train = create_sequences(train_scaled, time_step)
X_test, y_test = create_sequences(test_scaled, time_step)

# Reshaping the data for LSTM input
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))

# Define the LSTM model
tuned_model = Sequential([
    LSTM(units=64, return_sequences=True, input_shape=(time_step, 1)),
    Dropout(0.3),
    LSTM(units=64, return_sequences=True),
    Dropout(0.3),
    LSTM(units=64),
    Dropout(0.3),
    Dense(units=32, activation='relu'),
    Dense(units=1)
])

# Compile the model
tuned_model.compile(optimizer=Adam(learning_rate=0.0005), loss='mean_squared_error', metrics=['mae'])

# Define custom callback for monitoring epoch progress
class EpochCallback(Callback):
    def on_epoch_end(self, epoch, logs=None):
        print(f"Epoch {epoch+1} - Loss: {logs['loss']:.4f} - Val Loss: {logs['val_loss']:.4f}")

# Train the model and store the training history
history = tuned_model.fit(
    X_train,
    y_train,
    epochs=10,
    batch_size=32,
    validation_data=(X_test, y_test),
    callbacks=[EpochCallback()]  # Using the custom callback
)

# You can now access the training history (loss and metrics)
print(f"Training History: {history.history}")


In [None]:
import matplotlib.pyplot as plt

# Train the model and store the training history
history = tuned_model.fit(
    X_train,
    y_train,
    epochs=10,
    batch_size=32,
    validation_data=(X_test, y_test)
)

# Plot training and validation loss
plt.plot(history.history['loss'], label='Training Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.title('Model Loss')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.show()

# Plot training and validation accuracy (if available)
if 'accuracy' in history.history:
    plt.plot(history.history['accuracy'], label='Training Accuracy')
    plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
    plt.title('Model Accuracy')
    plt.xlabel('Epoch')
    plt.ylabel('Accuracy')
    plt.legend()
    plt.show()


In [None]:
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error

# Make predictions on the test set
y_pred_tuned = tuned_model.predict(X_test)

# Inverse transform predictions and actual values to get them back to original scale
y_pred_tuned_actual = scaler.inverse_transform(y_pred_tuned.reshape(-1, 1))
y_test_actual_tuned = scaler.inverse_transform(y_test.reshape(-1, 1))

# Calculate Mean Absolute Percentage Error (MAPE)
def mape(y_true, y_pred):
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

# Calculate the MAPE for the tuned model
accuracy = 100 - mape(y_test_actual_tuned, y_pred_tuned_actual)

print(f"Final Model Accuracy (MAPE): {accuracy:.2f}%")


In [None]:
# Save the trained tuned model to a file
tuned_model.save('LSTMTuned.keras')

print("Model saved successfully!")


##Tuned Nifty50


In [None]:
# Predict on test data using the tuned model
y_pred_nifty_tuned = tuned_model.predict(X_test)
y_pred_nifty_tuned = scaler.inverse_transform(y_pred_nifty_tuned.reshape(-1, 1))
y_test_actual_tuned = scaler.inverse_transform(y_test.reshape(-1, 1))

# Calculate accuracy using MAPE (Mean Absolute Percentage Error)
accuracy = 100 - mape(y_test_actual_tuned, y_pred_nifty_tuned)
print(f"Final Model Accuracy: {accuracy:.2f}%")

# Plot results year-wise
plt.figure(figsize=(12, 6))
# Adjust the x-axis data to match the length of y_test_actual_tuned
plt.plot(nifty_data.index[split + time_step + 1:split + time_step + 1 + len(y_test_actual_tuned)], y_test_actual_tuned, label='Actual Price')
plt.plot(nifty_data.index[split + time_step + 1:split + time_step + 1 + len(y_pred_nifty_tuned)], y_pred_nifty_tuned, label='Predicted Price')
plt.xlabel('Year')
plt.ylabel('NIFTY 50 Price')
plt.xticks(rotation=45)
plt.legend()
plt.title('NIFTY 50 Price Prediction using LSTM-Tuned Model (Year-wise)')
plt.show()


##Tuned HCL Tech

In [None]:
# Plot results for HCL Tech
plt.figure(figsize=(12,6))

# Ensure the x-axis length matches the predicted and actual values
plot_dates = hcl_data.index[split + time_step + 1:split + time_step + 1 + len(y_test_actual_tuned)]

plt.plot(plot_dates, y_test_actual_tuned, label='Actual Price')
plt.plot(plot_dates, y_pred_hcl_tuned, label='Predicted Price (Tuned Model)')

plt.xlabel('Date')
plt.ylabel('HCL Tech Stock Price')
plt.xticks(rotation=45)
plt.legend()
plt.title('HCL Tech Stock Price Prediction using Tuned LSTM')
plt.show()


