Code Documentation
====================

This code performs time series forecasting with LSTM (Long Short-Term Memory) networks using TensorFlow and Keras. It uses the `jena_climate_2009_2016.csv` dataset, which contains weather data collected in Jena, Germany.

Importing Libraries and Loading Data
------------------------------------

The required libraries are imported and the dataset is loaded into a pandas dataframe. The first few rows of the dataframe and the column names are printed.
```python
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow import keras

# Load the Jena weather CSV into a DataFrame
df = pd.read_csv('jena_climate_2009_2016.csv')
print(df.head())          # preview the first rows
print(list(df.columns))   # list the column names
```
Preparing Data
--------------

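All columns except the first (the timestamp in this dataset) are kept as the input features in `raw_data`, which therefore still includes the temperature. The column at position 2 is extracted separately as the prediction target, `temperature`. The first 1,440 temperature readings (10 days at one reading every 10 minutes) are plotted to visualize the data. The sample counts for a chronological split are then computed: the first 50% of the rows for training, the next 25% for validation, and the remainder for testing.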
```python
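# Input features: every column except the first one
raw_data = df.iloc[:, 1:]
# Prediction target: the column at position 2 of df
temperature = df.iloc[:, 2]
print(temperature.head())

# Plot the first 1440 temperature readings
temperature.iloc[:1440].plot(figsize=(12, 5))

# Chronological split: 50% train, 25% validation, remainder test
num_train_samples = int(0.5 * len(temperature))
num_val_samples = int(0.25 * len(temperature))
num_test_samples = len(temperature) - num_train_samples - num_val_samples
print(f'number of train samples: {num_train_samples}\n'
      f'number of validation samples: {num_val_samples}\n'
      f'number of test samples: {num_test_samples}')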
```
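To make the split arithmetic concrete, here is a minimal, dependency-free sketch. The helper names `split_sizes` and `split_ranges` and the toy row count are illustrative assumptions, not part of the original code; only the 50% / 25% / remainder fractions come from the code above.

```python
def split_sizes(n_rows, train_frac=0.5, val_frac=0.25):
    """Return (train, val, test) sizes for a chronological split.

    Hypothetical helper: it mirrors the int() truncation used above, so the
    test split absorbs whatever rows are left over.
    """
    n_train = int(train_frac * n_rows)
    n_val = int(val_frac * n_rows)
    n_test = n_rows - n_train - n_val
    return n_train, n_val, n_test


def split_ranges(n_rows, train_frac=0.5, val_frac=0.25):
    """Return half-open (start, end) row ranges for train, val and test."""
    n_train, n_val, _ = split_sizes(n_rows, train_frac, val_frac)
    return (0, n_train), (n_train, n_train + n_val), (n_train + n_val, n_rows)


# Toy row count, not the real dataset size
print(split_sizes(1001))   # (500, 250, 251)
print(split_ranges(1001))  # ((0, 500), (500, 750), (750, 1001))
```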
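The features are then normalized. The mean and standard deviation of each column are computed on the training rows only and then applied to the whole dataset, so no statistics from the validation or test rows leak into preprocessing.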
```python
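# Compute statistics on the training rows only, then apply them to all rows
mean = raw_data[:num_train_samples].mean(axis=0)
raw_data -= mean
std = raw_data[:num_train_samples].std(axis=0)
raw_data /= std
# Plot the first 1440 rows of every normalized feature
raw_data.iloc[:1440].plot(figsize=(120, 5))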
```
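The idea of fitting the statistics on the training rows and applying them everywhere can be sketched on a toy single-feature series. This is a simplified, standard-library illustration: the helpers `fit_scaler` and `apply_scaler` and the numbers are made up for the example and do not appear in the original code.

```python
from statistics import mean, stdev


def fit_scaler(train_values):
    """Return (mean, std) estimated from the training values only."""
    # stdev() is the sample standard deviation (n - 1 denominator),
    # which matches the default of pandas' DataFrame.std().
    return mean(train_values), stdev(train_values)


def apply_scaler(values, mu, sigma):
    """Standardize every value with statistics fitted elsewhere."""
    return [(v - mu) / sigma for v in values]


# Toy series (made-up numbers); the first 4 values play the role of training rows
series = [2.0, 4.0, 6.0, 8.0, 20.0, 22.0]
num_train = 4

mu, sigma = fit_scaler(series[:num_train])
scaled = apply_scaler(series, mu, sigma)

# The training part now has mean 0 and standard deviation 1;
# the later values are scaled with the same statistics.
print(mu, sigma)
print(scaled)
```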
Creating Datasets
-----------------

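The training, validation, and testing datasets are created with the `timeseries_dataset_from_array` function from Keras. Each input window keeps one reading out of every `sampling_rate` (6) and contains `sequence_length` (120) readings. The `delay` offsets the targets so that each window is paired with the temperature 24 sampled steps after the window's last reading.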
```python
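sampling_rate = 6      # keep one reading out of every 6
sequence_length = 120  # readings per input window
# Offset so each target lies 24 sampled steps after the end of its window
delay = sampling_rate * (sequence_length + 24 - 1)
batch_size = 256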

train_dataset = keras.utils.timeseries_dataset_from_array(
    raw_data[:-delay],
    targets=temperature[delay:],
    sampling_rate=sampling_rate,
    sequence_length=sequence_length,
    shuffle=True,
    batch_size=batch_size,
    start_index=0,
    end_index=num_train_samples)
val_dataset = keras.utils.timeseries_dataset_from_array(
    raw_data[:-delay],
    targets=temperature[delay:],
    sampling_rate=sampling_rate,
    sequence_length=sequence_length,
    shuffle=True,
    batch_size=batch_size,
    start_index=num_train_samples,
    end_index=num_train_samples + num_val_samples)

test_dataset = keras.utils.timeseries_dataset_from_array(
    raw_data[:-delay],
    targets=temperature[delay:],
    sampling_rate=sampling_rate,
    sequence_length=sequence_length,
    shuffle=True,
    batch_size=batch_size,
    start_index=num_train_samples + num_val_samples)
```
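How `raw_data[:-delay]` and `temperature[delay:]` line up can be checked with plain index arithmetic. The sketch below relies on the documented behavior of `timeseries_dataset_from_array` that the window starting at index `i` is paired with `targets[i]`; the helper functions and the name `horizon` are hypothetical and used only for illustration.

```python
def window_indices(start, sequence_length, sampling_rate):
    """Row indices that make up the input window starting at `start`."""
    return [start + k * sampling_rate for k in range(sequence_length)]


def target_index(start, delay):
    """Row of the original series used as the target for that window.

    Because targets = temperature[delay:], targets[start] is
    temperature[start + delay].
    """
    return start + delay


sampling_rate = 6
sequence_length = 120
horizon = 24  # sampled steps between the last input and the target
delay = sampling_rate * (sequence_length + horizon - 1)

rows = window_indices(0, sequence_length, sampling_rate)
gap = target_index(0, delay) - rows[-1]

print(delay)                      # 858
print(rows[0], rows[-1])          # 0 714
print(gap, gap // sampling_rate)  # 144 24
```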
The shape of the samples and targets in the training dataset is printed.
```python
for samples, targets in train_dataset:
    print("samples shape:", samples.shape)
    print("targets shape:", targets.shape)
    break
```
Building and Training the Model
-------------------------------

The first model consists of an LSTM layer with 16 units followed by a dense layer with 1 unit. It is compiled with the RMSprop optimizer and the mean squared error (MSE) loss, with the mean absolute error (MAE) tracked as a metric. The model is trained on the training dataset for 10 epochs and evaluated on the validation dataset after each epoch.
```python
inputs=keras.Input(shape=(sequence_length,raw_data.shape[-1]))
x=keras.layers.LSTM(16)(inputs)
outputs=keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
history = model.fit(train_dataset,
                    epochs=10,
                    validation_data=val_dataset)
```
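For a sense of the model's size, the trainable parameter count can be worked out by hand. The sketch below assumes the standard LSTM layout (four gates, each with input weights, recurrent weights, and a bias); the feature count of 14 is an assumption about this dataset, so substitute `raw_data.shape[-1]` for the real value and compare the result with `model.summary()`.

```python
def lstm_params(units, input_dim):
    """Trainable parameters of a standard LSTM layer."""
    # 4 gates, each with input weights, recurrent weights and a bias vector
    return 4 * (units * input_dim + units * units + units)


def dense_params(units, input_dim):
    """Trainable parameters of a dense layer: weights plus biases."""
    return input_dim * units + units


n_features = 14  # assumed number of input columns

first_model = lstm_params(16, n_features) + dense_params(1, 16)
print(first_model)  # 2001
```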
The training and validation mean absolute error (MAE) are plotted.
```python
loss = history.history["mae"]
val_loss = history.history["val_mae"]
epochs = range(1, len(loss) + 1)
plt.figure()
plt.plot(epochs, loss, "bo", label="Training MAE")
plt.plot(epochs, val_loss, "b", label="Validation MAE")
plt.title("Training and validation MAE")
plt.legend()
plt.show()
```
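Besides plotting the curves, it can help to read off the epoch with the lowest validation MAE directly. This is a small sketch with a hypothetical helper, `best_epoch`, and made-up metric values; with a real run, pass `history.history` instead of the toy dictionary.

```python
def best_epoch(history_dict, metric="val_mae"):
    """Return (epoch_number, value) for the lowest value of `metric`.

    Epoch numbers are 1-based, matching the x-axis of the plot above.
    """
    values = history_dict[metric]
    idx = min(range(len(values)), key=values.__getitem__)
    return idx + 1, values[idx]


# Made-up numbers, for illustration only
fake_history = {
    "mae": [3.1, 2.8, 2.6, 2.5],
    "val_mae": [3.0, 2.7, 2.9, 3.2],
}

print(best_epoch(fake_history))         # (2, 2.7)
print(best_epoch(fake_history, "mae"))  # (4, 2.5)
```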
A second model uses an LSTM layer with 32 units and a recurrent dropout of 0.25, followed by a dropout layer with rate 0.5 and a dense layer with 1 unit. It is compiled with the RMSprop optimizer and the mean squared error loss, then trained for 10 epochs. The validation dataset is passed to `fit` so that the validation MAE plotted afterwards is recorded in the training history.
```python
inputs = keras.Input(shape=(sequence_length, raw_data.shape[-1]))
x = keras.layers.LSTM(32, recurrent_dropout=0.25)(inputs)
x = keras.layers.Dropout(0.5)(x)
outputs = keras.layers.Dense(1)(x)
model = keras.Model(inputs, outputs)

model.compile(optimizer="rmsprop", loss="mse", metrics=["mae"])
# validation_data is required for "val_mae" to appear in history.history
history = model.fit(train_dataset,
                    epochs=10,
                    validation_data=val_dataset)
```
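What a dropout layer with rate 0.5 does during training can be sketched without Keras. The function below is a simplified, standard-library illustration of inverted dropout: each value is zeroed with probability `rate` and the survivors are scaled by `1 / (1 - rate)` so the expected sum is unchanged. It is not the Keras implementation, and the name `inverted_dropout` is hypothetical.

```python
import random


def inverted_dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale the rest by 1 / (1 - rate)."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep_scale = 1.0 / (1.0 - rate)
    return [v * keep_scale if rng.random() >= rate else 0.0 for v in values]


rng = random.Random(0)  # fixed seed so the run is repeatable
activations = [1.0] * 10_000
dropped = inverted_dropout(activations, rate=0.5, rng=rng)

# Roughly half of the values survive, and the average stays close to 1.0
kept_fraction = sum(1 for v in dropped if v != 0.0) / len(dropped)
average = sum(dropped) / len(dropped)
print(kept_fraction, average)
```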
The training and validation mean absolute error (MAE) are plotted.
```python
loss = history.history["mae"]
val_loss = history.history["val_mae"]
epochs = range(1, len(loss) + 1)
plt.figure()
plt.plot(epochs, loss, "bo", label="Training MAE")
plt.plot(epochs, val_loss, "b", label="Validation MAE")
plt.title("Training and validation MAE")
plt.legend()
plt.show()
```
