In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)

# Input data files are available in the read-only "../input/" directory
# For example, running this (by clicking run or pressing Shift+Enter) will list all files under the input directory

import os
for dirname, _, filenames in os.walk('/kaggle/input'):
    for filename in filenames:
        print(os.path.join(dirname, filename))

# You can write up to 20GB to the current directory (/kaggle/working/) that gets preserved as output when you create a version using "Save & Run All" 
# You can also write temporary files to /kaggle/temp/, but they won't be saved outside of the current session

### Long Short-Term Memory (LSTM) networks
LSTM networks stand as a powerful solution to one of the most persistent challenges in training recurrent neural networks (RNNs): the vanishing gradient problem.

In traditional RNNs, gradients can shrink exponentially as they propagate back through time, which impedes the network's ability to capture long-range dependencies in sequential data. LSTMs address this with memory cells and gating mechanisms. Each LSTM cell can retain or discard information over long sequences, which makes LSTMs well suited to sequences with long gaps between relevant pieces of information.

The core of the LSTM architecture is its three gating mechanisms: the input gate, the forget gate, and the output gate. The input gate controls what new information is written to the cell state, the forget gate controls what existing information is discarded, and the output gate controls how much of the cell state is exposed as the hidden state. This design lets LSTMs learn and maintain long-term dependencies in sequential data, making them a common choice for tasks such as machine translation, speech recognition, and sentiment analysis, where capturing context over long sequences is essential.
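
To make the gate description concrete, here is a minimal NumPy sketch of a single LSTM time step using the standard gate equations. The function name `lstm_step`, the stacked weight layout (input gate, forget gate, candidate, output gate), and the random weights are assumptions chosen for illustration; they do not reflect how Keras stores its parameters internally.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """Run one LSTM time step (illustrative layout, not Keras internals).

    W: (4 * hidden, input_dim), U: (4 * hidden, hidden), b: (4 * hidden,)
    Row blocks are ordered: input gate, forget gate, candidate, output gate.
    """
    hidden = h_prev.shape[0]
    z = W @ x_t + U @ h_prev + b
    i = sigmoid(z[0:hidden])                 # input gate: what to write
    f = sigmoid(z[hidden:2 * hidden])        # forget gate: what to keep
    g = np.tanh(z[2 * hidden:3 * hidden])    # candidate values to write
    o = sigmoid(z[3 * hidden:4 * hidden])    # output gate: what to expose
    c_t = f * c_prev + i * g                 # updated cell (memory) state
    h_t = o * np.tanh(c_t)                   # updated hidden state
    return h_t, c_t

# Hypothetical sizes and random weights, for illustration only
rng = np.random.default_rng(0)
input_dim, hidden = 1, 4
W = rng.normal(scale=0.1, size=(4 * hidden, input_dim))
U = rng.normal(scale=0.1, size=(4 * hidden, hidden))
b = np.zeros(4 * hidden)

# Feed a short sequence one value at a time, carrying the state forward
h = np.zeros(hidden)
c = np.zeros(hidden)
for value in [0.1, 0.2, 0.3]:
    h, c = lstm_step(np.array([value]), h, c, W, U, b)

print(h.shape, c.shape)
```

The cell state `c` is updated additively (`f * c_prev + i * g`), which is what lets gradients flow across many time steps without being repeatedly squashed.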

Exercise
Build an LSTM network to predict stock prices based on historical stock data. Show the model's ability to capture sequential dependencies.

In [None]:
!pip install yfinance

In [None]:
import yfinance as yf
import matplotlib.pyplot as plt
from keras.models import Sequential
from keras.layers import LSTM, Dense
from sklearn.metrics import mean_squared_error

In [None]:
# Fetch historical stock data using yfinance
stock_symbol = "AAPL"
start_date = "2020-01-01"
end_date = "2023-01-01"
stock_data = yf.download(stock_symbol, start=start_date, end=end_date, progress=False)

# Extract the 'Close' prices
stock_prices = stock_data["Close"].values
# Reshape the data to 2D
stock_prices = stock_prices.reshape(-1, 1)

In [None]:
# Split the data chronologically: hold out the last 120 trading days for testing
split_index = len(stock_prices) - 120
train_data = stock_prices[:split_index]
test_data = stock_prices[split_index:]


Model Architecture:

Create an LSTM model architecture. The architecture should include one or more LSTM layers, followed by one or more Dense layers for regression.
Explain the concept of input sequences and time steps, as well as how to reshape the data to fit the LSTM input format.
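
As a rough illustration of input sequences and time steps: an LSTM layer expects input of shape `(samples, time_steps, features)`, where each sample is a window of consecutive observations and the target is the value that follows the window. The helper below, `make_windows`, is a hypothetical name introduced for this sketch, and the toy series stands in for real prices.

```python
import numpy as np

def make_windows(series, time_steps):
    """Slice a 1D series into overlapping windows and next-step targets.

    Returns X with shape (samples, time_steps, 1) and y with shape (samples,).
    """
    series = np.asarray(series, dtype=float).reshape(-1)
    n_samples = len(series) - time_steps
    if n_samples <= 0:
        raise ValueError("series must be longer than time_steps")
    X = np.stack([series[i:i + time_steps] for i in range(n_samples)])
    y = series[time_steps:]
    # Add a trailing feature axis: one feature (the closing price) per time step
    return X[..., np.newaxis], y

# Toy series 0, 1, ..., 9 in place of real closing prices
prices = np.arange(10.0)
X, y = make_windows(prices, time_steps=3)

print(X.shape, y.shape)   # (7, 3, 1) (7,)
print(X[0].ravel(), y[0]) # [0. 1. 2.] 3.0
```

With `time_steps=3`, the first sample uses days 0 to 2 to predict day 3, the second uses days 1 to 3 to predict day 4, and so on. A larger window gives the model more context per prediction at the cost of fewer training samples.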

In [None]:
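# Create the LSTM model architecture
model = Sequential([
    # input_shape=(None, 1): sequences of any length with one feature per time step;
    # return_sequences=True passes the full sequence on to the next LSTM layer
    LSTM(128, activation="relu", input_shape=(None, 1), return_sequences=True),
    # The second LSTM layer returns only its final hidden state
    LSTM(64, activation="relu"),
    # A single linear output unit for the regression target
    Dense(1)
])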

Model Training:

Train the LSTM model using the training data. Explain the importance of setting appropriate hyperparameters, such as batch size and number of epochs.
Monitor the training progress by plotting loss curves and observing how the model's performance changes over epochs.
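
One way to reason about batch size and epochs is to count gradient updates: each epoch performs one update per batch, so the two hyperparameters together determine how many times the weights are adjusted. The sketch below is plain arithmetic; the sample count of 636 is a hypothetical figure for illustration, not a value taken from the data.

```python
import math

def updates_per_epoch(n_samples, batch_size):
    """Number of gradient updates in one pass over the training data."""
    # The last batch may be smaller than batch_size, hence the ceiling
    return math.ceil(n_samples / batch_size)

n_samples = 636   # hypothetical number of training samples
batch_size = 32
epochs = 100

per_epoch = updates_per_epoch(n_samples, batch_size)
total_updates = per_epoch * epochs

print(per_epoch, total_updates)  # 20 2000
```

Smaller batches give more, noisier updates per epoch, while larger batches give fewer, smoother ones. More epochs keep lowering the training loss but raise the risk of overfitting, which is why the loss curve is worth watching.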

In [None]:
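# Set the hyperparameters
batch_size = 32
epochs = 100

# Pair each closing price with the next day's close, so the model learns to
# forecast rather than to reproduce its input.
# Inputs are shaped (samples, time_steps=1, features=1), as the LSTM layer expects.
X_train = train_data[:-1].reshape(-1, 1, 1)
y_train = train_data[1:]

# Train the model and keep the per-epoch loss history for plotting
model.compile(loss="mse", optimizer="adam")
history = model.fit(X_train, y_train, epochs=epochs, batch_size=batch_size)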

In [None]:
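# Predict each test day's close from the previous day's close,
# so that predictions[i] lines up with test_data[i]
test_inputs = np.concatenate([train_data[-1:], test_data[:-1]]).reshape(-1, 1, 1)
predictions = model.predict(test_inputs)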

Model Evaluation:

Use the trained model to make predictions on the testing data.
Evaluate the model's performance using appropriate metrics like Mean Squared Error (MSE) or Root Mean Squared Error (RMSE).

In [None]:
# Evaluate the model's performance (scikit-learn expects y_true first, then y_pred)
rmse = np.sqrt(mean_squared_error(test_data, predictions))
print("RMSE:", rmse)

Lower RMSE: A lower RMSE indicates that the predicted stock prices are closer to the true stock prices, i.e. better predictive accuracy. RMSE is expressed in the same units as the prices (US dollars here), so it should be judged relative to the price level. A value close to 0 would mean the model's predictions are almost perfect.
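
An RMSE value is easier to interpret next to a simple reference point. The sketch below computes RMSE by hand and compares it with a naive "persistence" forecast that predicts each day's price as the previous day's price. The helper names `rmse` and `persistence_rmse` and the toy prices are assumptions made for illustration; the baseline is a common sanity check, not part of the exercise itself.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error between two equally long series."""
    y_true = np.asarray(y_true, dtype=float).reshape(-1)
    y_pred = np.asarray(y_pred, dtype=float).reshape(-1)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def persistence_rmse(series):
    """RMSE of predicting each value as the value before it."""
    series = np.asarray(series, dtype=float).reshape(-1)
    return rmse(series[1:], series[:-1])

# Toy prices for illustration only
toy_prices = np.array([100.0, 102.0, 101.0, 105.0])
baseline = persistence_rmse(toy_prices)

# Day-to-day errors are 2, -1, 4, so the baseline is sqrt((4 + 1 + 16) / 3) = sqrt(7)
print(round(baseline, 4))  # 2.6458
```

If a model's RMSE on the test set is not clearly below the persistence baseline on the same data, the model has likely learned little beyond "tomorrow looks like today".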

Visualization:

Plot the true stock prices and the predicted stock prices over time to visually assess the model's predictions.

In [None]:
# Plot the true stock prices and the predicted stock prices
plt.figure(figsize=(10, 6))
plt.plot(test_data, label="True Stock Prices", color="red") 
plt.plot(predictions, label="Predicted Stock Prices", color="blue")
plt.title("True vs. Predicted Stock Prices")
plt.xlabel("Time")
plt.ylabel("Stock Price")
plt.legend()
plt.grid(True)
plt.show()