# Neural Network Model

We use Keras to build a neural network model. Recurrent neural networks (RNNs), and LSTMs in particular, are widely used for time-series prediction, so we adopt that approach here. Our goal is to produce a 36-hour temperature forecast from the previous 4 days (96 hours) of temperature data. We train the model on a Google Colab GPU.

### Importing libraries and getting data

In [None]:
from google.colab import drive
drive.mount('/content/drive/')

In [None]:
!ls "/content/drive/My Drive"

We import the libraries.

In [45]:
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import tensorflow as tf
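# Import from the public tensorflow.keras API rather than the private tensorflow.python package
from tensorflow.keras.layers import Dense, LSTM, Dropout, Bidirectional
from tensorflow.keras.models import Sequential
# The KerasRegressor import is dropped: it is not used in this notebook,
# and the Keras scikit-learn wrappers are deprecated in favour of SciKeras.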
from sklearn.preprocessing import MinMaxScaler

In [82]:
df_nn = pd.read_csv('/content/drive/My Drive/weatherpredict/weather_data_initial_clean.csv')
df_nn['dt_iso'] = pd.to_datetime(df_nn['dt_iso'], format='%Y-%m-%d %H:%M:%S.%f')
df_nn = df_nn.set_index('dt_iso')
print('Data Shape = {}'.format(df_nn.shape))
print(df_nn.columns)

Data Shape = (364512, 13)
Index(['temp', 'feels_like', 'pressure', 'humidity', 'wind_speed', 'wind_deg',
       'rain_1h', 'rain_3h', 'snow_1h', 'snow_3h', 'clouds_all', 'weather_id',
       'weather_main'],
      dtype='object')


We only keep the temperature column. 

In [83]:
df_nn = df_nn['temp']

In [84]:
len(df_nn)

364512

### Preparing training and test data 

We first split the data into a training set and a test set. Because this is time-series data, we split chronologically and do not shuffle: the test set must contain only observations that come after those in the training set.

In [85]:
# Splitting into training and test data, in an 80-20 split
split_point = int(len(df_nn)*(80/100))
nn_train = df_nn[:split_point]
nn_test = df_nn[split_point:]

In [76]:
print("Length of nn_train:", len(nn_train))
print("Length of nn_test:", len(nn_test))

Length of nn_train: 291609
Length of nn_test: 72903


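Next we convert the data to NumPy arrays and normalise it with MinMaxScaler. The scaler is fitted on the training set only and then applied to the test set, so no information from the test period leaks into training.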

In [86]:
# Converting each split to a NumPy column vector of shape (n_samples, 1), since MinMaxScaler expects 2-D input

nn_train = nn_train.to_numpy().reshape(-1,1)
nn_test = nn_test.to_numpy().reshape(-1,1)

print("Shape of nn_train: ", nn_train.shape)
print("Shape of nn_test: ", nn_test.shape)

Shape of nn_train:  (291609, 1)
Shape of nn_test:  (72903, 1)


In [87]:
# Normalising using MinMaxScaler 
min_max_scaler = MinMaxScaler()

nn_train_norm = min_max_scaler.fit_transform(nn_train)
nn_test_norm = min_max_scaler.transform(nn_test)

In [88]:
print("Shape of nn_train_norm: ", nn_train_norm.shape)
print("Shape of nn_test_norm: ", nn_test_norm.shape)

Shape of nn_train_norm:  (291609, 1)
Shape of nn_test_norm:  (72903, 1)
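
To make the scaling step concrete, here is a minimal NumPy sketch of the arithmetic that MinMaxScaler performs with its default (0, 1) range. It assumes toy temperature values that are not taken from the dataset, and the helper names `fit_min_max`, `scale` and `unscale` are hypothetical, chosen only for illustration. In scikit-learn, `inverse_transform` plays the role of `unscale` when forecasts need converting back to the original units.

```python
import numpy as np

# Toy values, purely illustrative (not taken from the weather dataset).
train = np.array([[270.0], [280.0], [290.0]])
test = np.array([[275.0], [295.0]])

def fit_min_max(data):
    """Return the per-column minimum and maximum of the training data."""
    return data.min(axis=0), data.max(axis=0)

def scale(data, lo, hi):
    """Map values linearly so that lo -> 0 and hi -> 1."""
    return (data - lo) / (hi - lo)

def unscale(scaled, lo, hi):
    """Undo scale(), returning values in the original units."""
    return scaled * (hi - lo) + lo

lo, hi = fit_min_max(train)        # statistics come from the training set only
train_norm = scale(train, lo, hi)
test_norm = scale(test, lo, hi)    # reuses the training minimum and maximum

print(train_norm.ravel())
print(test_norm.ravel())           # a value above the training maximum exceeds 1
print(unscale(test_norm, lo, hi).ravel())
```

Because the minimum and maximum come from the training set alone, test values outside the training range scale to below 0 or above 1. This is expected and does not indicate an error.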


In [89]:
# Creating train and test data
x_train = []
y_train = []
x_test = []
y_test = []

# Predicting the next 'n_future' days from the previous 'n_past' days
n_future = 1.5
n_past = 4

# Converting days to hours (each row is treated as one hour)
n_future = int(n_future * 24)
n_past = int(n_past * 24)

# Each sample pairs n_past input hours with the n_future hours that follow them
for i in range(len(nn_train_norm) - n_past - n_future + 1):
    x_train.append(nn_train_norm[i : i + n_past, 0])
    y_train.append(nn_train_norm[i + n_past : i + n_past + n_future, 0])
for i in range(len(nn_test_norm) - n_past - n_future + 1):
    x_test.append(nn_test_norm[i : i + n_past, 0])
    y_test.append(nn_test_norm[i + n_past : i + n_past + n_future, 0])

x_train , y_train, x_test, y_test = np.array(x_train), np.array(y_train), np.array(x_test), np.array(y_test)

x_train = np.reshape(x_train, (x_train.shape[0] , x_train.shape[1], 1) )
x_test = np.reshape(x_test, (x_test.shape[0] , x_test.shape[1], 1) )

print('Training data:')
print('x_train: ', x_train.shape)
print('y_train: ', y_train.shape)
print('x_test: ', x_test.shape)
print('y_test: ', y_test.shape)

Training data:
x_train:  (291478, 96, 1)
y_train:  (291478, 36)
x_test:  (72772, 96, 1)
y_test:  (72772, 36)
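
The windowing loop above can also be written without explicit Python iteration. The following is a sketch, assuming NumPy 1.20 or later (which provides `sliding_window_view`) and a toy series rather than the real data; `make_windows` is a hypothetical helper name. It yields the same number of samples as the loop, `len(series) - n_past - n_future + 1`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def make_windows(series, n_past, n_future):
    """Split a 1-D series into (inputs, targets) sliding-window pairs."""
    # Each row of `windows` holds n_past input steps followed by n_future target steps.
    windows = sliding_window_view(series, n_past + n_future)
    x = windows[:, :n_past, np.newaxis]   # (samples, n_past, 1), the shape the LSTM expects
    y = windows[:, n_past:]               # (samples, n_future)
    # sliding_window_view returns read-only views; copy to get independent arrays.
    return x.copy(), y.copy()

# Toy series, purely illustrative.
toy = np.arange(10, dtype=float)
x_toy, y_toy = make_windows(toy, n_past=4, n_future=2)

print(x_toy.shape, y_toy.shape)
print(x_toy[0, :, 0], y_toy[0])
```

With the notebook's values, the same formula gives 291609 - 96 - 36 + 1 = 291478 training samples, which matches the shapes printed above.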


### Building the model

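We build the model from stacked LSTM layers: a bidirectional LSTM followed by two further LSTM layers, each with 20% dropout, and a final dense layer with one output per forecast hour. We use Google Colab to run the training on a GPU.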

In [59]:
units = x_train.shape[1]

regressor = Sequential()

regressor.add(Bidirectional(LSTM(units=units, return_sequences=True, input_shape = (x_train.shape[1],1) ) ))
regressor.add(Dropout(0.2))

regressor.add(LSTM(units= units , return_sequences=True))
regressor.add(Dropout(0.2))

# regressor.add(LSTM(units= units , return_sequences=True))
# regressor.add(Dropout(0.2))

regressor.add(LSTM(units= units))
regressor.add(Dropout(0.2))
regressor.add(Dense(units = n_future,activation='linear'))

regressor.compile(optimizer='adam', loss='mean_squared_error', metrics=['mse', 'mae'])
regressor.fit(x_train, y_train, epochs=20,batch_size=512)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x7fe97e4e6518>
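
As a sanity check on the architecture, the sketch below estimates the number of trainable parameters by hand. It assumes the standard Keras LSTM layout (four gates, each with an input kernel, a recurrent kernel and a bias), the default concatenation of the two directions in `Bidirectional`, and the layer sizes used above. The helper names are hypothetical, and `regressor.summary()` remains the authoritative source for the actual figures.

```python
def lstm_params(units, input_dim):
    """Parameters of one LSTM layer: 4 gates x (input kernel + recurrent kernel + bias)."""
    return 4 * (units * input_dim + units * units + units)

def dense_params(units, input_dim):
    """Parameters of a dense layer: weights plus biases."""
    return units * input_dim + units

units, n_future = 96, 36   # values taken from the notebook

layers = {
    # Forward and backward LSTMs, each reading one feature per time step.
    "bidirectional_lstm": 2 * lstm_params(units, input_dim=1),
    # Receives the concatenated forward and backward outputs (2 * units features).
    "lstm_2": lstm_params(units, input_dim=2 * units),
    "lstm_3": lstm_params(units, input_dim=units),
    "dense": dense_params(n_future, input_dim=units),
}
total = sum(layers.values())

for name, count in layers.items():
    print(name, count)
print("total", total)
```

Dropout layers add no parameters, so they are omitted from the count.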

### Saving the model

We save the trained model to Google Drive. We test it and analyse the results in the 'lstm_model_test' notebook.

In [66]:
regressor.save("/content/drive/My Drive/weatherpredict/lstm_model.h5")