Import the following packages and functions

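- random, warnings, numpy as `np`, pandas as `pd`, and matplotlib.pyplot as `plt`
- tensorflow as `tf`
- `MinMaxScaler` from `sklearn.preprocessing`
- `list_physical_devices` from `tensorflow.config`, `Sequential` from `tensorflow.keras.models`, and `LSTM` and `Dense` from `tensorflow.keras.layers`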

- Execute the following cell

In [None]:
# Setting a fixed random seed for reproducibility
random.seed(42)  # For Python's built-in random module
np.random.seed(42)  # For NumPy's random number generator
tf.random.set_seed(42)  # For TensorFlow's random number generator

# To suppress the warnings in the notebook
warnings.filterwarnings("ignore")

# List the devices TensorFlow can see. If no GPU is listed, the model
# will be trained on the CPU, which can take many hours
list_physical_devices()

- Read the data file **pgm_consumption.csv**
    - Set the `datetime` column as the index, and make sure the timestamps are parsed as datetime values, not `str`
    - Sort the index
    - Display the top 2 rows.
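
As a rough illustration of these steps, the sketch below reads a small in-memory CSV in place of **pgm_consumption.csv**. The three rows are invented; only the `datetime` and `consumption` column names are taken from this notebook.

```python
import io

import pandas as pd

# Stand-in for pgm_consumption.csv: three invented rows, deliberately unsorted.
csv_text = """datetime,consumption
2005-01-01 02:00:00,310.0
2005-01-01 00:00:00,300.0
2005-01-01 01:00:00,305.0
"""

# parse_dates converts the column to datetime values instead of str;
# index_col makes that column the index in the same call.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["datetime"], index_col="datetime")

# Put the rows in chronological order.
df = df.sort_index()

print(df.head(2))
```

With the real file, the `io.StringIO(...)` argument would be replaced by the path to the CSV.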

- Check for missing values
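
One common way to check for gaps is to count the missing values per column. The sketch below does this on invented data with a single gap.

```python
import numpy as np
import pandas as pd

# Invented hourly data with one missing reading, for illustration only.
df = pd.DataFrame(
    {"consumption": [300.0, np.nan, 310.0]},
    index=pd.date_range("2005-01-01", periods=3, freq=pd.Timedelta(hours=1)),
)

# isna() marks each missing cell; sum() counts them per column.
missing_per_column = df.isna().sum()
print(missing_per_column)
```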

- Keep only the data between the following dates (both inclusive) using the `loc` indexer.
    - start : **2005-01-01 00:00:00**
    - end : **2017-12-31 23:59:00**
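
A minimal sketch of date-range slicing with `loc`, assuming a sorted datetime index. The four rows are invented so that one falls before the range and one after it.

```python
import pandas as pd

# Invented readings on a sorted DatetimeIndex, for illustration only.
df = pd.DataFrame(
    {"consumption": [1.0, 2.0, 3.0, 4.0]},
    index=pd.to_datetime(
        [
            "2004-06-01 00:00:00",
            "2005-01-01 00:00:00",
            "2017-12-31 23:00:00",
            "2018-01-01 00:00:00",
        ]
    ),
)

# Label-based slicing with .loc keeps both endpoints.
subset = df.loc["2005-01-01 00:00:00":"2017-12-31 23:59:00"]
print(subset)
```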

- Resample the data from hourly to daily frequency, aggregating by `mean`.
- Display the top 2 rows.
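
The resampling step can be sketched on two days of invented hourly data, chosen so that the daily means are easy to verify by eye.

```python
import pandas as pd

# Two days of invented hourly readings: 10.0 all of day one, 20.0 all of day two.
index = pd.date_range("2005-01-01", periods=48, freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame({"consumption": [10.0] * 24 + [20.0] * 24}, index=index)

# "D" groups the rows by calendar day; mean() averages each group.
daily = hourly.resample("D").mean()
print(daily.head(2))
```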

- Display the bottom 2 rows

- Manually split the dataset into a train subset and a test subset using the `loc` indexer.
    - Name the new subsets `train` and `test`
    - The train subset should cover **2005-01-01 00:00:00** to **2016-12-31 23:59:00**
    - The test subset should cover **2017-01-01 00:00:00** onwards
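
A sketch of a chronological split with `loc`, using an invented daily series that covers only 2016 and 2017 to keep the example small. The slice boundaries are the ones given above.

```python
import numpy as np
import pandas as pd

# Invented daily series covering 2016 and 2017, for illustration only.
index = pd.date_range("2016-01-01", "2017-12-31", freq="D")
df = pd.DataFrame({"consumption": np.arange(len(index), dtype=float)}, index=index)

# Closed slice for the training period.
train = df.loc["2005-01-01 00:00:00":"2016-12-31 23:59:00"]

# Open-ended slice: everything from the start label to the last row.
test = df.loc["2017-01-01 00:00:00":]

print(train.shape, test.shape)
```

Splitting by date rather than at random keeps every test observation later than every training observation, which matters for time series.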

- Execute the following cell

In [549]:
def get_X_and_y(dataframe: pd.DataFrame, lag: int, is_train=False) -> tuple:
    """Scale the data to [0, 1] and build lagged windows X with targets y."""

    # Scale the data between 0 and 1.
    # Note: the scaler is fitted on whatever dataframe is passed in.
    values = dataframe.values
    scale = MinMaxScaler(feature_range=(0, 1))
    data_array = scale.fit_transform(values)

    # Empty lists for the inputs X and the targets y
    X = []
    y = []

    # Start at index `lag` so that every sample has `lag` previous values
    for i in range(lag, data_array.shape[0]):

        # Input window: the `lag` values before index i
        X.append(data_array[i-lag:i])

        # Target: the value at index i, predicted from the window above
        y.append(data_array[i])

    # Convert to NumPy arrays
    X = np.array(X)
    y = np.array(y)

    # Reshape X to 3-D: (number of samples, time steps, 1 feature)
    X = np.reshape(X, (X.shape[0], X.shape[1], 1))

    # Return the fitted scaler as well when building the training set
    if is_train:
        return X, y, scale
    return X, y
    

- In the cell below, create a variable `lag` with the value **30**

- Now add lag features of the `consumption` data.
    - You set `lag = 30` earlier, so each sample uses the 30 previous days as features.
    - Using the user-defined function `get_X_and_y`, build the following arrays from the `train` and `test` subsets.
        - `X_train` and `y_train`. Be careful about how many values the function returns.
        - `X_test` and `y_test`. Again, be careful about how many values the function returns.

- Display the shape of **X_train, y_train, X_test, y_test**.
    - Do you see any anomaly in the shape of X_test and y_test?
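
When reading the shapes, it may help to see how the windowing alone affects the row count. The sketch below repeats the loop logic of `get_X_and_y` without the scaling step, on an invented column of 365 values (the row count is an assumption for illustration).

```python
import numpy as np

lag = 30
n_rows = 365  # illustrative: one non-leap year of daily values

# Invented single-column data with the same layout as a scaled array.
data_array = np.arange(n_rows, dtype=float).reshape(-1, 1)

# Same windowing as get_X_and_y, without the scaling step.
X = np.array([data_array[i - lag:i] for i in range(lag, n_rows)])
y = np.array([data_array[i] for i in range(lag, n_rows)])

print(X.shape, y.shape)  # (335, 30, 1) (335, 1)
```

The first `lag` rows have no complete history, so they never become targets, and the number of samples is `n_rows - lag`.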

- Build a TensorFlow model by adding the layers sequentially
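
The exercise leaves the architecture open. One possible sketch, built from the `Sequential`, `LSTM` and `Dense` classes imported earlier, is shown below. The number of units, the optimizer and the loss are assumptions chosen for illustration, not values prescribed by this notebook.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input, LSTM

lag = 30

# Layer sizes, optimizer and loss are illustrative assumptions.
model = Sequential()
model.add(Input(shape=(lag, 1)))  # 30 time steps, 1 feature per step
model.add(LSTM(32))               # returns only the last hidden state
model.add(Dense(1))               # one regression output per sample
model.compile(optimizer="adam", loss="mean_squared_error")

model.summary()
```

The input shape `(lag, 1)` matches the last two dimensions of the 3-D array returned by `get_X_and_y`.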

- Fit the model on `X_train` and `y_train` and store the result in a variable `history`. Pass the following arguments to the `fit` method.
    - epochs = 50
    - batch_size = 32
    - validation_split = 0.2
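
The call can be sketched as follows on random stand-in data. The arrays and the small model are assumptions for illustration; only the `fit` arguments come from the list above.

```python
import numpy as np
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input, LSTM

# Invented stand-ins for X_train / y_train: 200 windows of 30 steps.
rng = np.random.default_rng(42)
X_train = rng.random((200, 30, 1)).astype("float32")
y_train = rng.random((200, 1)).astype("float32")

# Small illustrative model; the layer sizes are assumptions.
model = Sequential([Input(shape=(30, 1)), LSTM(8), Dense(1)])
model.compile(optimizer="adam", loss="mean_squared_error")

# validation_split=0.2 holds out the last 20% of the samples and
# reports their loss as val_loss after every epoch.
history = model.fit(
    X_train,
    y_train,
    epochs=50,
    batch_size=32,
    validation_split=0.2,
    verbose=0,
)

print(sorted(history.history))
```

`history.history` is a dictionary with one list per metric and one entry per epoch, which is what `plot_losses` reads.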

- Execute the following cell

In [554]:
def plot_losses(history):
    # Extract the training loss and validation loss
    train_loss = history.history['loss']
    validation_loss = history.history['val_loss']
    
    # Plotting the training and validation loss
    plt.figure(figsize=(10, 6))
    plt.plot(train_loss, label='Training Loss')
    plt.plot(validation_loss, label='Validation Loss')
    plt.title('Training and Validation Loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.legend()
    plt.grid(True)
    plt.show()

- Plot the losses.
- In a markdown cell below
    - State whether the model is learning well, underfitting, or overfitting.
    - Would you approve the model for use on the test subset? Justify your answer.

- Predict on **X_test** and store it in a variable `y_predict_test`

- Using the `scale` that you obtained above, `inverse_transform` **y_test** and **y_predict_test**
    - You can store the results in the same variables, i.e. **y_test** and **y_predict_test**
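
A small sketch of what `inverse_transform` does, using a scaler fitted on three invented values rather than on the notebook's data. The array names here are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Invented values in original units, used to fit an illustrative scaler.
original = np.array([[100.0], [150.0], [200.0]])
scale = MinMaxScaler(feature_range=(0, 1)).fit(original)

# Scaled values shaped (n_samples, 1), like y_test or the model's predictions.
scaled = np.array([[0.0], [0.5], [1.0]])

# inverse_transform maps [0, 1] back to the range seen during fit.
restored = scale.inverse_transform(scaled)
print(restored.ravel())
```

The values are restored relative to the minimum and maximum the scaler saw when it was fitted, so the choice of scaler determines the units of the result.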

- If you display both, you will see that they are multi-dimensional arrays. We need them to be 1-dimensional, like lists.
    - Use the `flatten` method of NumPy arrays to convert them to 1-D.
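
For illustration, the sketch below flattens an invented column vector of the kind returned by `inverse_transform`.

```python
import numpy as np

# Invented column vector shaped like inverse-transformed predictions.
y_predict_test = np.array([[100.0], [150.0], [200.0]])
print(y_predict_test.shape)  # (3, 1)

# flatten() returns a 1-D copy of the array.
y_predict_test = y_predict_test.flatten()
print(y_predict_test.shape)  # (3,)
```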

- Make a dataframe of the two arrays above i.e. **y_test** and **y_predict_test**.
- Plot the dataframe
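
One way to combine the two arrays is a dictionary of columns, sketched here with invented values. The column names `actual` and `predicted` are hypothetical.

```python
import numpy as np
import pandas as pd

# Invented 1-D arrays standing in for the flattened results.
y_test = np.array([100.0, 150.0, 200.0])
y_predict_test = np.array([110.0, 140.0, 205.0])

# One column per array keeps the values aligned row by row.
results = pd.DataFrame({"actual": y_test, "predicted": y_predict_test})
print(results)
```

Calling `results.plot()` afterwards draws both columns on the same axes, provided matplotlib is installed.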

- Save the model in `.keras` format. See [New high-level .keras format](https://www.tensorflow.org/tutorials/keras/save_and_load#new_high-level_keras_format)
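
A minimal sketch of saving and reloading in this format, assuming a TensorFlow version that supports `.keras` files. The model architecture and the file name `lstm_model.keras` are hypothetical, and a temporary directory is used so the example leaves no files in the working folder.

```python
import os
import tempfile

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Input, LSTM
from tensorflow.keras.models import load_model

# Small illustrative model; the architecture is an assumption.
model = Sequential([Input(shape=(30, 1)), LSTM(8), Dense(1)])
model.compile(optimizer="adam", loss="mean_squared_error")

# The .keras extension selects the single-file Keras format.
path = os.path.join(tempfile.mkdtemp(), "lstm_model.keras")
model.save(path)

# Loading restores the architecture, the weights and the compile settings.
restored = load_model(path)
print(restored.output_shape)
```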