In [3]:
import kagglehub

# Download latest version
path = kagglehub.dataset_download("divyanshpalia/manual-load-of-daily-temperature")

print("Path to dataset files:", path)

Downloading from https://www.kaggle.com/api/v1/datasets/download/divyanshpalia/manual-load-of-daily-temperature?dataset_version_number=1...


100%|██████████| 11.5k/11.5k [00:00<00:00, 21.7MB/s]

Extracting files...
Path to dataset files: /root/.cache/kagglehub/datasets/divyanshpalia/manual-load-of-daily-temperature/versions/1





In [5]:
import pandas as pd
import os

# List the contents of the directory to find the correct file path
print(os.listdir(path))

# Once you identify the correct file path, you can read it like this:
# df = pd.read_csv(os.path.join(path, "your_correct_subfolder", "daily-temperature.csv"))
# df.head()

['MaunaLoaDailyTemps.csv']


In [6]:
df = pd.read_csv(os.path.join(path, "MaunaLoaDailyTemps.csv"))
df.head()

       DATE  MinTemp  MaxTemp  AvgTemp  Sunrise  Sunset
0  1/1/2014     33.0     46.0     40.0      657    1756
1  1/2/2014     35.0     50.0     43.0      657    1756
2  1/3/2014     36.0     45.0     41.0      657    1757
3  1/4/2014     32.0     41.0     37.0      658    1757
4  1/5/2014     24.0     38.0     31.0      658    1758


# Task
Build and evaluate an LSTM model for time series forecasting using the `df` DataFrame.

## Prepare Data for LSTM

### Subtask:
Preprocess the `df` DataFrame for LSTM model input, including converting dates, handling missing values, scaling numerical features, and creating sequences.


**Reasoning**:
The first step in preparing the data for an LSTM model is to convert the 'DATE' column to datetime objects and set it as the DataFrame's index, so that every row is labeled by its date and the chronological order of the observations is explicit.



In [7]:
df['DATE'] = pd.to_datetime(df['DATE'])
df = df.set_index('DATE')
print("DataFrame after converting 'DATE' to datetime and setting as index:")
print(df.head())

DataFrame after converting 'DATE' to datetime and setting as index:
            MinTemp  MaxTemp  AvgTemp  Sunrise  Sunset
DATE                                                  
2014-01-01     33.0     46.0     40.0      657    1756
2014-01-02     35.0     50.0     43.0      657    1756
2014-01-03     36.0     45.0     41.0      657    1757
2014-01-04     32.0     41.0     37.0      658    1757
2014-01-05     24.0     38.0     31.0      658    1758
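
The `DATE` strings in this dataset look like `1/2/2014`, which a parser could in principle read as month/day or day/month. Because the first rows advance 1/1, 1/2, 1/3 in a daily dataset, the month-first reading is the one that gives consecutive days. As a minimal sketch using only the standard library, the snippet below parses the same strings with an explicit format and checks the spacing; `pd.to_datetime` accepts a `format` argument that could be used in the same way.

```python
from datetime import datetime

# Sample strings copied from the first rows of the DATE column.
raw_dates = ["1/1/2014", "1/2/2014", "1/3/2014"]

# "%m/%d/%Y" makes the month-first interpretation explicit.
parsed = [datetime.strptime(s, "%m/%d/%Y") for s in raw_dates]

# Consecutive rows should be exactly one day apart under this format.
gaps_in_days = [(b - a).days for a, b in zip(parsed, parsed[1:])]
print(parsed[1].date(), gaps_in_days)  # 2014-01-02 [1, 1]
```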


**Reasoning**:
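Now, I will handle any missing values in the `MinTemp`, `MaxTemp`, and `AvgTemp` columns using forward fill, which carries the last observed value forward. This suits daily temperature data, where consecutive days tend to be similar. A missing value in the very first row would remain unfilled, since there is nothing before it to carry forward.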



In [8]:
for col in ['MinTemp', 'MaxTemp', 'AvgTemp']:
    df[col] = df[col].ffill()

print("Missing values after forward fill:")
print(df[['MinTemp', 'MaxTemp', 'AvgTemp']].isnull().sum())

Missing values after forward fill:
MinTemp    0
MaxTemp    0
AvgTemp    0
dtype: int64


**Reasoning**:
Next, I will select the numerical features (`MinTemp`, `MaxTemp`, `AvgTemp`, `Sunrise`, `Sunset`) that will serve as inputs to the LSTM model. `MinTemp` is listed first because the sequence-building step later reads the prediction target from column 0.



In [9]:
features = ['MinTemp', 'MaxTemp', 'AvgTemp', 'Sunrise', 'Sunset']
data = df[features].values
print("Selected features for LSTM model:")
print(df[features].head())

Selected features for LSTM model:
            MinTemp  MaxTemp  AvgTemp  Sunrise  Sunset
DATE                                                  
2014-01-01     33.0     46.0     40.0      657    1756
2014-01-02     35.0     50.0     43.0      657    1756
2014-01-03     36.0     45.0     41.0      657    1757
2014-01-04     32.0     41.0     37.0      658    1757
2014-01-05     24.0     38.0     31.0      658    1758
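
The `Sunrise` and `Sunset` values (657, 1756, ...) look like clock times written as HHMM integers. This is an assumption based on the printed rows; nothing in the notebook confirms it. If it holds, the raw integers are not evenly spaced in time (659 would be followed by 700), and converting them to minutes since midnight before scaling would remove that jump. A hypothetical helper, `hhmm_to_minutes`, sketches the conversion:

```python
import numpy as np

def hhmm_to_minutes(hhmm):
    """Convert HHMM-style integers (e.g. 657 -> 6:57) to minutes since midnight.

    Assumes the values encode clock times; this is inferred, not documented.
    """
    hhmm = np.asarray(hhmm)
    return (hhmm // 100) * 60 + (hhmm % 100)

# 657, 658 and 1756..1758 appear in the data; 659 and 700 are illustrative.
sunrise = np.array([657, 658, 659, 700])
sunset = np.array([1756, 1757, 1758])

print(hhmm_to_minutes(sunrise))  # [417 418 419 420]
print(hhmm_to_minutes(sunset))   # [1076 1077 1078]
```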


**Reasoning**:
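Now I will initialize a `MinMaxScaler`, fit it to the selected features, and transform them to the range [0, 1]. Scaling matters for LSTM models because their gates use sigmoid and tanh activations, which saturate on large inputs; here the `Sunset` values are near 1,750 while the temperatures are near 40. Note that the scaler is fit on the full dataset, before the train/test split.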



In [10]:
from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(data)

print("Scaled data (first 5 rows):\n", scaled_data[:5])

Scaled data (first 5 rows):
 [[0.40740741 0.32258065 0.34615385 0.72611465 0.08588957]
 [0.48148148 0.4516129  0.46153846 0.72611465 0.08588957]
 [0.51851852 0.29032258 0.38461538 0.72611465 0.09202454]
 [0.37037037 0.16129032 0.23076923 0.73248408 0.09202454]
 [0.07407407 0.06451613 0.         0.73248408 0.09815951]]
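
`MinMaxScaler` rescales each column independently as x' = (x - min) / (max - min). As a quick check of that formula, the sketch below reproduces the first column of the printed output by hand. The column minimum of 22 and maximum of 49 are inferred from the printed scaled values rather than read from the data, so treat them as assumptions.

```python
import numpy as np

# First five MinTemp values from df.head().
min_temp = np.array([33.0, 35.0, 36.0, 32.0, 24.0])

# Assumed column statistics, inferred from the scaled output above.
col_min, col_max = 22.0, 49.0

scaled = (min_temp - col_min) / (col_max - col_min)
print(scaled.round(8))
```

The result should agree with the first column of the scaled rows shown above (0.40740741, 0.48148148, and so on).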


**Reasoning**:
I will define a function to create input sequences (`X`) and target values (`y`) from the scaled data, using a specified `look_back` window. This function will be crucial for structuring the time series data for the LSTM model.



In [11]:
import numpy as np

def create_sequences(data, look_back):
    X, y = [], []
    for i in range(len(data) - look_back):
        X.append(data[i:(i + look_back), :])
        y.append(data[i + look_back, 0]) # Predicting the 'MinTemp' of the next day as the target
    return np.array(X), np.array(y)

print("Function 'create_sequences' defined.")

Function 'create_sequences' defined.
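
To see what `create_sequences` produces, it helps to run it on a tiny array where every value is recognizable. The toy data below (10 time steps, 2 features) is made up for illustration; the function body is the same as the one defined above.

```python
import numpy as np

def create_sequences(data, look_back):
    X, y = [], []
    for i in range(len(data) - look_back):
        X.append(data[i:i + look_back, :])   # window of look_back rows, all features
        y.append(data[i + look_back, 0])     # column 0 of the row right after the window
    return np.array(X), np.array(y)

# Toy series: row t is [2t, 2t + 1], so column 0 holds 0, 2, 4, ...
toy = np.arange(20).reshape(10, 2)
X_toy, y_toy = create_sequences(toy, look_back=3)

print(X_toy.shape, y_toy.shape)  # (7, 3, 2) (7,)
print(X_toy[0].tolist())         # [[0, 1], [2, 3], [4, 5]]
print(y_toy[0])                  # 6, i.e. toy[3, 0]
```

With n rows and a window of `look_back`, the function yields n - look_back samples: 10 - 3 = 7 in this toy case.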


**Reasoning**:
Now I will apply the `create_sequences` function to the `scaled_data` using a `look_back` value of 7 to generate the `X` (input sequences) and `y` (target values) arrays, which are necessary for training the LSTM model.



In [12]:
look_back = 7
X, y = create_sequences(scaled_data, look_back)

print(f"Shape of X (sequences): {X.shape}")
print(f"Shape of y (targets): {y.shape}")


Shape of X (sequences): (1818, 7, 5)
Shape of y (targets): (1818,)


**Reasoning**:
I will split the `X` and `y` arrays into training and testing sets while preserving the temporal order, which is crucial for time series forecasting. I will use an 80/20 split for this purpose.



In [13]:
train_size = int(len(X) * 0.8)
X_train, X_test = X[0:train_size], X[train_size:len(X)]
y_train, y_test = y[0:train_size], y[train_size:len(y)]

print(f"Shape of X_train: {X_train.shape}")
print(f"Shape of y_train: {y_train.shape}")
print(f"Shape of X_test: {X_test.shape}")
print(f"Shape of y_test: {y_test.shape}")

Shape of X_train: (1454, 7, 5)
Shape of y_train: (1454,)
Shape of X_test: (364, 7, 5)
Shape of y_test: (364,)
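
One thing to be aware of: the scaler above was fit on the full dataset before this split, so the minimum and maximum of the test period influenced how the training data was scaled. A common alternative is to compute the scaling statistics from the training rows only. The sketch below shows the idea with plain NumPy on synthetic data; the random array and the name `n_train_rows` are illustrative assumptions, not values from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
raw = rng.normal(loc=40.0, scale=5.0, size=(100, 5))  # synthetic stand-in for `data`

look_back = 7
n_sequences = len(raw) - look_back
train_size = int(n_sequences * 0.8)

# Training sequences 0..train_size-1 touch rows 0..train_size+look_back-1
# (inputs plus targets), so only those rows define the scaling statistics.
n_train_rows = train_size + look_back
col_min = raw[:n_train_rows].min(axis=0)
col_max = raw[:n_train_rows].max(axis=0)

# Apply the training statistics to every row; later rows may fall outside [0, 1].
scaled = (raw - col_min) / (col_max - col_min)

print(n_train_rows, scaled[:n_train_rows].min(), scaled[:n_train_rows].max())
```

With scikit-learn, the equivalent would be to call `scaler.fit` on the training rows and `scaler.transform` on the full array.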


## Build LSTM Model

### Subtask:
Design and compile an LSTM model using a deep learning framework like Keras or TensorFlow.


**Reasoning**:
I will import the classes needed to define the LSTM architecture from TensorFlow/Keras: the `Sequential` model and the `LSTM`, `Dense`, and `Dropout` layers.



In [14]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout

print("Keras modules imported successfully.")

Keras modules imported successfully.


**Reasoning**:
I will define the LSTM model architecture, including multiple LSTM layers with dropout for regularization, and then compile it with the Adam optimizer and Mean Squared Error loss, finally printing the model summary to confirm the structure.



In [15]:
model = Sequential()
model.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(Dropout(0.2))
model.add(LSTM(units=50, return_sequences=False))
model.add(Dropout(0.2))
model.add(Dense(units=1))

model.compile(optimizer='adam', loss='mean_squared_error')

print("LSTM Model Summary:")
model.summary()

LSTM Model Summary:


**Reasoning**:
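Now that the LSTM model is defined and compiled, I will train it on the prepared training data (`X_train`, `y_train`) and pass the test data (`X_test`, `y_test`) as `validation_data` to monitor its performance after each epoch. No separate validation set is held out, so the test set does double duty here; that is acceptable as long as it is not also used to choose hyperparameters or a stopping point.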



In [16]:
history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test), verbose=1)

print("Model training complete.")

Epoch 1/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 23ms/step - loss: 0.1393 - val_loss: 0.0122
Epoch 2/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - loss: 0.0178 - val_loss: 0.0118
Epoch 3/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - loss: 0.0181 - val_loss: 0.0115
Epoch 4/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - loss: 0.0163 - val_loss: 0.0107
Epoch 5/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 10ms/step - loss: 0.0156 - val_loss: 0.0119
Epoch 6/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step - loss: 0.0152 - val_loss: 0.0106
Epoch 7/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 11ms/step - loss: 0.0164 - val_loss: 0.0107
Epoch 8/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 13ms/step - loss: 0.0149 - val_loss: 0.0111
Epoch 9/50
[1m46/46[0m [32m━━━━━━━━━━━━━━━━━━━━
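
The visible part of the log can also be examined programmatically. The sketch below takes the eight validation losses printed above, finds the best epoch, and applies a simple patience rule of the kind early stopping uses. The `patience` value and the helper name `first_stop_epoch` are illustrative assumptions; this is plain Python, not the Keras callback itself.

```python
# val_loss for epochs 1..8, copied from the training log above.
val_losses = [0.0122, 0.0118, 0.0115, 0.0107, 0.0119, 0.0106, 0.0107, 0.0111]

best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__) + 1
print(best_epoch, val_losses[best_epoch - 1])  # 6 0.0106

def first_stop_epoch(losses, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Stops once `patience` consecutive epochs fail to improve on the best loss.
    """
    best, waited = float("inf"), 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best, waited = loss, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None

print(first_stop_epoch(val_losses, patience=2))  # 8
```

In Keras, the corresponding built-in is the `EarlyStopping` callback (with `monitor`, `patience`, and `restore_best_weights` arguments), passed to `model.fit` through `callbacks`.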

## Evaluate LSTM Model

### Subtask:
Evaluate the performance of the trained LSTM model on the test dataset.



**Reasoning**:
Now I will make predictions using the trained LSTM model on the test dataset (`X_test`) to get the model's forecasts.



In [17]:
y_pred_scaled = model.predict(X_test)
print("Predictions on test data (first 5):")
print(y_pred_scaled[:5])

12/12 ━━━━━━━━━━━━━━━━━━━━ 1s 17ms/step
Predictions on test data (first 5):
[[0.5293625 ]
 [0.5255373 ]
 [0.513582  ]
 [0.48747796]
 [0.61139524]]


**Reasoning**:
I need to inverse transform the scaled predictions (`y_pred_scaled`) and the actual scaled test targets (`y_test`) to their original scale. This is crucial for interpreting the model's output in the context of the original data (temperatures) and for calculating performance metrics in the correct units.



In [18]:
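# The scaler was fit on all five columns, so inverse_transform expects five columns.
# Place the MinTemp values in column 0 of a zero-filled array and read column 0 back.
y_test_reshaped = y_test.reshape(-1, 1)
empty_array = np.zeros((len(y_test_reshaped), scaled_data.shape[1]))
empty_array[:, 0] = y_test_reshaped[:, 0]
y_test_inversed = scaler.inverse_transform(empty_array)[:, 0]

# Reuse the same array for the predictions; the other columns are ignored.
empty_array[:, 0] = y_pred_scaled[:, 0]
y_pred_inversed = scaler.inverse_transform(empty_array)[:, 0]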

print("Original test targets (first 5) after inverse transformation:", y_test_inversed[:5])
print("Predicted targets (first 5) after inverse transformation:", y_pred_inversed[:5])

Original test targets (first 5) after inverse transformation: [35. 32. 35. 39. 38.]
Predicted targets (first 5) after inverse transformation: [36.29278749 36.18950742 35.86671376 35.16190487 38.50767148]
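
The zero-padded array is needed only because `scaler.inverse_transform` expects all five columns. For a single column, the same result follows from inverting the min-max formula directly: x = x' * (max - min) + min. The sketch below checks this against the printed predictions, assuming a `MinTemp` minimum of 22 and maximum of 49 (inferred from the scaled output shown earlier, not read from the fitted scaler). With the real scaler, these values would come from `scaler.data_min_[0]` and `scaler.data_max_[0]`.

```python
import numpy as np

# First five scaled predictions, copied from the output of model.predict.
y_pred_scaled = np.array([0.5293625, 0.5255373, 0.513582, 0.48747796, 0.61139524])

# Assumed MinTemp statistics; with the fitted scaler use data_min_[0] / data_max_[0].
col_min, col_max = 22.0, 49.0

y_pred = y_pred_scaled * (col_max - col_min) + col_min
print(y_pred.round(2))
```

The values should agree with the inverse-transformed predictions printed above to the precision shown.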


**Reasoning**:
I will calculate the Mean Squared Error (MSE) and Mean Absolute Error (MAE) between the inverse-transformed actual and predicted values to quantify the model's performance and understand its accuracy.



In [19]:
from sklearn.metrics import mean_squared_error, mean_absolute_error

mse = mean_squared_error(y_test_inversed, y_pred_inversed)
mae = mean_absolute_error(y_test_inversed, y_pred_inversed)

print(f"Mean Squared Error (MSE): {mse}")
print(f"Mean Absolute Error (MAE): {mae}")

Mean Squared Error (MSE): 4.553486898729161
Mean Absolute Error (MAE): 1.6902047838647296
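
MSE is expressed in squared units, which makes it hard to compare with the MAE. Taking the square root gives the RMSE, which is in the same units as the temperatures. A small sketch using the two numbers printed above:

```python
import math

mse = 4.553486898729161   # from the output above
mae = 1.6902047838647296  # from the output above

rmse = math.sqrt(mse)
print(f"RMSE: {rmse:.3f}")  # RMSE: 2.134
print(f"RMSE / MAE: {rmse / mae:.2f}")
```

An RMSE that is larger than the MAE, as here, indicates that some individual errors are well above the typical error size, since squaring weights large errors more heavily.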


## Final Task

### Subtask:
Summarize the process of building and evaluating the LSTM model, including insights from the model's performance and potential next steps.


## Summary:

### Data Analysis Key Findings

*   **Data Preprocessing:** The `DATE` column was successfully converted to datetime objects and set as the DataFrame index. Missing values in 'MinTemp', 'MaxTemp', and 'AvgTemp' were imputed using forward-fill. Selected features ('MinTemp', 'MaxTemp', 'AvgTemp', 'Sunrise', 'Sunset') were scaled using `MinMaxScaler`.
*   **Sequence Generation:** Data was transformed into sequences with a `look_back` window of 7, resulting in input sequences `X` of shape `(1818, 7, 5)` and target values `y` (predicting 'MinTemp') of shape `(1818,)`.
*   **Train-Test Split:** The dataset was split into training and testing sets (80/20 ratio), yielding `X_train` of shape `(1454, 7, 5)`, `y_train` of shape `(1454,)`, `X_test` of shape `(364, 7, 5)`, and `y_test` of shape `(364,)`.
*   **LSTM Model Architecture:** An LSTM model was constructed with two LSTM layers (50 units each, with `return_sequences=True` for the first and `False` for the second), two Dropout layers (0.2 rate), and a final Dense output layer (1 unit). The model had 31,451 trainable parameters.
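*   **Model Training:** The model was trained for 50 epochs with a batch size of 32. In the portion of the training log shown (epochs 1 to 8), training loss fell from 0.1393 to about 0.015, while validation loss moved only from 0.0122 to roughly 0.011, so most of the visible improvement happened in the first epoch.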
*   **Model Evaluation:** After inverse transforming the predictions and actual values to their original scale, the model achieved a Mean Squared Error (MSE) of 4.553 and a Mean Absolute Error (MAE) of 1.690 on the test set.

### Insights or Next Steps

*   The MAE of 1.690 means that, on average, the model's predictions for 'MinTemp' are off by approximately 1.69 degrees (in the dataset's temperature units). Whether that is good depends on the application's tolerance and on how it compares with a naive baseline, such as predicting that tomorrow's minimum equals today's.
*   Further model optimization could be explored by tuning hyperparameters (e.g., number of LSTM units, dropout rate, `look_back` period), experimenting with different optimizers or loss functions, or incorporating additional relevant features to potentially improve forecasting accuracy.
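
To judge whether an MAE of 1.69 is good, it can be compared against a naive persistence forecast that predicts tomorrow's `MinTemp` to be today's value. With the arrays built earlier, today's value is the last time step of each input window, column 0. The sketch below defines a hypothetical helper, `persistence_mae`, and runs it on a made-up toy series; no baseline number for the real dataset is claimed here.

```python
import numpy as np

def persistence_mae(X, y, target_col=0):
    """MAE of predicting the target with its most recent value in each window."""
    last_observed = X[:, -1, target_col]
    return float(np.mean(np.abs(y - last_observed)))

# Toy check: one feature, look_back of 2, target equals the next value in the series.
series = np.array([10.0, 12.0, 11.0, 15.0, 14.0])
X_toy = np.stack([series[i:i + 2] for i in range(3)])[:, :, None]  # shape (3, 2, 1)
y_toy = series[2:]                                                  # [11, 15, 14]

print(persistence_mae(X_toy, y_toy))  # 2.0
```

On the notebook's data, `persistence_mae(X_test, y_test)` would be in scaled units; multiplying by the `MinTemp` range (`scaler.data_range_[0]`) would convert it to degrees for comparison with the model's MAE.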



### Explanation of `model = Sequential()` and LSTM Model Architecture

`model = Sequential()` initializes a **Sequential model** in Keras. This is the simplest type of model, where layers are added one after another in a linear stack.

The lines below, taken from the model cell above, define the architecture and compile the LSTM model:

*   `model = Sequential()`: As explained, this creates an empty linear stack where we can add layers.

*   `model.add(LSTM(units=50, return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2])))`: This line adds the first LSTM layer:
    *   `LSTM`: Long Short-Term Memory, a type of recurrent neural network (RNN) suitable for sequence prediction problems.
    *   `units=50`: Specifies that this LSTM layer has 50 units, so its hidden state and cell state are 50-dimensional vectors at each time step.
    *   `return_sequences=True`: This is crucial for stacked LSTM layers. It means the layer will return the full sequence of outputs for each time step, rather than just the last output. This is required when passing the output to another LSTM layer.
    *   `input_shape=(X_train.shape[1], X_train.shape[2])`: Defines the expected input shape for the first layer. `X_train.shape[1]` is the `look_back` window (7 in our case), and `X_train.shape[2]` is the number of features (5 in our case).

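*   `model.add(Dropout(0.2))`: A Dropout layer is added for regularization. During training it randomly sets 20% of its inputs to 0 at each update and scales the remaining inputs by 1/(1 - 0.2) so the expected sum is unchanged, which helps prevent overfitting. At prediction time the layer passes its inputs through unchanged.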

*   `model.add(LSTM(units=50, return_sequences=False))`: This adds a second LSTM layer:
    *   `units=50`: Again, 50 memory units.
    *   `return_sequences=False`: Since this is the last LSTM layer before the output layer, we only need the output from the final time step of the sequence, so it returns a single output for each input sequence.

*   `model.add(Dropout(0.2))`: Another Dropout layer with a 20% rate for further regularization.

*   `model.add(Dense(units=1))`: This is the output layer:
    *   `Dense`: A fully connected neural network layer.
    *   `units=1`: Indicates a single output neuron, as we are predicting a single continuous value (MinTemp).

*   `model.compile(optimizer='adam', loss='mean_squared_error')`: This configures the model for training:
    *   `optimizer='adam'`: Specifies the Adam optimizer, which adapts the step size for each parameter. Passing the string uses Keras's default settings, including a learning rate of 0.001.
    *   `loss='mean_squared_error'`: Defines the loss function to be minimized during training. Mean Squared Error (MSE) is a common choice for regression tasks like temperature forecasting, as it measures the average squared difference between the predicted and actual values.
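
The 31,451 trainable parameters mentioned in the summary can be reproduced by hand. An LSTM layer has four gates, each with an input kernel, a recurrent kernel, and a bias, giving 4 * ((input_dim + units) * units + units) weights; Dropout layers add none. A short sketch of that arithmetic, with helper names chosen for illustration and assuming the default Keras layers with biases enabled:

```python
def lstm_params(input_dim, units):
    # Four gates, each with input weights, recurrent weights, and a bias vector.
    return 4 * ((input_dim + units) * units + units)

def dense_params(input_dim, units):
    # One weight per input-output pair plus one bias per output.
    return input_dim * units + units

first_lstm = lstm_params(input_dim=5, units=50)    # 5 features in
second_lstm = lstm_params(input_dim=50, units=50)  # 50 outputs of the first LSTM in
output = dense_params(input_dim=50, units=1)

print(first_lstm, second_lstm, output, first_lstm + second_lstm + output)
# 11200 20200 51 31451
```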