<a href="https://colab.research.google.com/github/galib-1206/Machine-Learning-Basics/blob/main/LSTMp1_1206.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### TensorFlow LSTM

Install dependencies

In [None]:
!pip install nltk



In [None]:
import nltk

nltk.download('punkt')
nltk.download('stopwords')


[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Unzipping corpora/stopwords.zip.


True

### Loading Dataset
We use a Tesla stock price dataset with `date`, `close`, `volume`, `open`, `high`, and `low` columns for this work.

In [None]:
!wget https://raw.githubusercontent.com/plotly/datasets/refs/heads/master/tesla-stock-price.csv

--2024-11-27 06:32:52--  https://raw.githubusercontent.com/plotly/datasets/refs/heads/master/tesla-stock-price.csv
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.108.133, 185.199.109.133, 185.199.110.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 55371 (54K) [text/plain]
Saving to: ‘tesla-stock-price.csv’


2024-11-27 06:32:52 (5.89 MB/s) - ‘tesla-stock-price.csv’ saved [55371/55371]



In [None]:
import pandas as pd

data = pd.read_csv('tesla-stock-price.csv')

In [None]:
data.drop("date", inplace=True, axis=1)

In [None]:
data = data.replace({',': ''}, regex=True)

# Convert the relevant columns to numeric
data = data.apply(pd.to_numeric, errors='coerce')
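The cleaning step above strips thousands separators and then coerces every column to a numeric dtype. The following is a minimal sketch of the same two calls on a made-up frame (the `toy` values are illustrative, not taken from the Tesla CSV), showing that any string that still cannot be parsed becomes `NaN` rather than raising an error.

```python
import pandas as pd

# Hypothetical toy data: numbers stored as strings, some with commas,
# plus one entry that is not a number at all.
toy = pd.DataFrame({
    "close": ["1,270.49", "259.59"],
    "volume": ["4,787,699", "missing"],
})

# Same two steps as in the notebook cell
cleaned = toy.replace({",": ""}, regex=True)
cleaned = cleaned.apply(pd.to_numeric, errors="coerce")

# Count values that could not be parsed and were coerced to NaN
nan_counts = cleaned.isna().sum()
print(cleaned.dtypes)
print(nan_counts)
```

Checking `isna().sum()` after coercion is a cheap way to confirm that no rows were silently turned into `NaN` before they reach the scaler.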


In [None]:
data

index,close,volume,open,high,low
0,270.49,4787699.0,264.50,273.8800,262.2400
1,259.59,6189026.0,259.06,263.2800,254.5367
2,258.78,7189257.0,261.00,261.9900,252.0100
3,252.23,8128184.0,257.53,262.2500,249.0300
4,256.88,12781560.0,264.61,265.5100,247.7700
...,...,...,...,...,...
752,210.09,4177956.0,211.99,214.8100,208.8000
753,213.03,14877020.0,227.72,228.6000,202.0000
754,228.10,2506836.0,226.50,231.1500,224.9400
755,227.01,4327574.0,223.04,230.4805,222.8700


### Preparing data for training

In [None]:
# Split into train (first 80% of rows) and test (last 20%) sets, without shuffling
train_size = int(len(data) * 0.8)
train_data, test_data = data[:train_size], data[train_size:]

In [None]:
from sklearn.preprocessing import MinMaxScaler

# Normalize data
scaler = MinMaxScaler()
train_data = scaler.fit_transform(train_data)
test_data = scaler.transform(test_data)
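The scaler is fitted on the training rows only and then reused for the test rows, so the test set does not influence the learned minimum and maximum. A small sketch with made-up numbers (the `train` and `test` arrays here are illustrative) shows the consequence: test values outside the training range map outside [0, 1].

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical single-feature series
train = np.array([[10.0], [20.0], [30.0]])
test = np.array([[25.0], [40.0]])

demo_scaler = MinMaxScaler()
train_scaled = demo_scaler.fit_transform(train)  # learns min=10, max=30
test_scaled = demo_scaler.transform(test)        # reuses the training min/max

print(train_scaled.ravel())
print(test_scaled.ravel())  # 40 is above the training max, so it scales above 1
```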

An LSTM layer expects its input as a 3D array with the shape (samples, timesteps, features), so we need to slice the scaled data into fixed-length windows.

- samples is the number of sequences (windows) cut from the dataset
- timesteps is the number of consecutive rows in each sequence
- features is the number of variables recorded at each time step.
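To make the (samples, timesteps, features) layout concrete, here is a small sketch on a made-up array of 10 rows and 2 features, using NumPy's `sliding_window_view` instead of an explicit loop. The names `toy`, `X_toy`, and `y_toy` are illustrative only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

toy = np.arange(20, dtype=float).reshape(10, 2)  # 10 rows, 2 features
timesteps = 3

# Windows over the row axis come back as (n_windows, features, timesteps)
windows = sliding_window_view(toy, window_shape=timesteps, axis=0)
windows = windows.transpose(0, 2, 1)  # -> (n_windows, timesteps, features)

# Drop the last window because no row follows it to serve as a target
X_toy = windows[:-1]
y_toy = toy[timesteps:]

print(X_toy.shape)  # (7, 3, 2)
print(y_toy.shape)  # (7, 2)
```

Each `X_toy[i]` holds rows `i` to `i + 2`, and `y_toy[i]` is the row immediately after that window.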

In [None]:
import numpy as np

def create_sequences(data, seq_length):
    X = []
    y = []
    for i in range(len(data) - seq_length):
        X.append(data[i:i+seq_length])
        y.append(data[i+seq_length])
    return np.array(X), np.array(y)

# Define sequence length
seq_length = 50

# Create sequences for training and testing sets
X_train, y_train = create_sequences(train_data, seq_length)
X_test, y_test = create_sequences(test_data, seq_length)

# Print the shape of X_train and X_test
print("X_train shape:", X_train.shape)
print("X_test shape:", X_test.shape)

# create_sequences already returns 3D arrays of shape (samples, timesteps, features),
# so this reshape is a no-op that only confirms there are 5 features per time step
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 5))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 5))

# Print reshaped shapes
print("Reshaped X_train shape:", X_train.shape)
print("Reshaped X_test shape:", X_test.shape)


X_train shape: (555, 50, 5)
X_test shape: (102, 50, 5)
Reshaped X_train shape: (555, 50, 5)
Reshaped X_test shape: (102, 50, 5)
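Note that `create_sequences` uses the whole next row as the target, so `y_train` has one column per feature. If the goal is to predict only the closing price, one option is a variant that keeps a single target column. This is a sketch, not the notebook's method: `create_sequences_single_target` and `target_col` are hypothetical names, and it assumes `close` is column 0 after `date` is dropped, as the table above suggests.

```python
import numpy as np

def create_sequences_single_target(data, seq_length, target_col=0):
    """Build input windows with all features and a 1D target from one column."""
    X, y = [], []
    for i in range(len(data) - seq_length):
        X.append(data[i:i + seq_length])            # window of seq_length rows
        y.append(data[i + seq_length, target_col])  # next row, one column only
    return np.array(X), np.array(y)

# Random stand-in for the scaled data: 60 rows, 5 features
demo = np.random.default_rng(0).random((60, 5))
X_demo, y_demo = create_sequences_single_target(demo, seq_length=50)
print(X_demo.shape, y_demo.shape)  # (10, 50, 5) (10,)
```

With a 1D target like this, a final `Dense(units=1)` layer and the target have matching sizes.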


### Training the Model

In [None]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense


In [None]:
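model = Sequential()
# Declare the input shape (timesteps, features) with an explicit Input layer
model.add(tf.keras.Input(shape=(X_train.shape[1], X_train.shape[2])))
model.add(LSTM(units=128, return_sequences=True))
model.add(LSTM(units=64, return_sequences=True))
model.add(LSTM(units=64, return_sequences=False))  # Return only the last time step's output

# Single output unit. Note that y_train has one column per feature (shape (555, 5)),
# so the loss compares this one prediction against all five scaled columns.
model.add(Dense(units=1))

model.compile(loss='mean_squared_error', optimizer='adam')
# batch_size=1 updates the weights after every sequence (555 steps per epoch)
model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=2)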




Epoch 1/100
555/555 - 9s - 17ms/step - loss: 0.0410
Epoch 2/100
555/555 - 6s - 11ms/step - loss: 0.0370
Epoch 3/100
555/555 - 11s - 20ms/step - loss: 0.0360
Epoch 4/100
555/555 - 5s - 9ms/step - loss: 0.0356
Epoch 5/100
555/555 - 4s - 7ms/step - loss: 0.0352
Epoch 6/100
555/555 - 7s - 14ms/step - loss: 0.0349
Epoch 7/100
555/555 - 8s - 14ms/step - loss: 0.0350
Epoch 8/100
555/555 - 6s - 10ms/step - loss: 0.0350
Epoch 9/100
555/555 - 5s - 8ms/step - loss: 0.0349
Epoch 10/100
555/555 - 7s - 13ms/step - loss: 0.0349
Epoch 11/100
555/555 - 8s - 15ms/step - loss: 0.0349
Epoch 12/100
555/555 - 6s - 11ms/step - loss: 0.0346
Epoch 13/100
555/555 - 4s - 7ms/step - loss: 0.0346
Epoch 14/100
555/555 - 4s - 7ms/step - loss: 0.0347
Epoch 15/100
555/555 - 6s - 11ms/step - loss: 0.0348
Epoch 16/100
555/555 - 4s - 7ms/step - loss: 0.0347
Epoch 17/100
555/555 - 5s - 8ms/step - loss: 0.0346
Epoch 18/100
555/555 - 5s - 9ms/step - loss: 0.0346
Epoch 19/100
555/555 - 5s - 8ms/step - loss: 0.0346
Epoch 20/1

<keras.src.callbacks.history.History at 0x7cbbcb186ad0>
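The loss values above are measured on min-max scaled data, so predictions from this model are also in scaled units. To read a predicted column in its original units, the scaling can be undone for that one column. The sketch below assumes a fitted `MinMaxScaler`; `inverse_scale_column` is a hypothetical helper, and the `demo` numbers are made up.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

def inverse_scale_column(scaler, values, col):
    """Undo MinMaxScaler for one column: X_scaled = X * scale_ + min_."""
    return (np.asarray(values) - scaler.min_[col]) / scaler.scale_[col]

# Hypothetical two-feature data standing in for (close, volume)
demo = np.array([[100.0, 1.0], [200.0, 3.0], [300.0, 5.0]])
demo_scaler = MinMaxScaler().fit(demo)

scaled_close = demo_scaler.transform(demo)[:, 0]
restored = inverse_scale_column(demo_scaler, scaled_close, col=0)
print(restored)
```

This avoids padding a single predicted column out to all five features just to call `inverse_transform`.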