### Predicting the Weather with Neural Networks

Example Neural Network

![Example Neural Network](ExampleNN.jpg)

Importing Useful Libraries

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV

Readin the Data into Program Memory

In [2]:
df = pd.read_csv("Brisbane Data.csv")
df.head(2)

Unnamed: 0.1,Unnamed: 0,Date,Location,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustDir,WindGustSpeed,...,Humidity9am,Humidity3pm,Pressure9am,Pressure3pm,Cloud9am,Cloud3pm,Temp9am,Temp3pm,RainToday,RainTomorrow
0,84007,01-07-2008,Brisbane,10.3,25.2,0.0,1.4,9.5,WNW,26.0,...,81.0,37.0,1019.6,1014.8,0.0,1.0,14.9,24.6,No,No
1,84008,02-07-2008,Brisbane,8.1,22.9,0.0,2.0,9.8,W,30.0,...,41.0,30.0,1018.8,1015.0,0.0,0.0,16.2,22.4,No,No


In [3]:
df.tail(2)

Unnamed: 0.1,Unnamed: 0,Date,Location,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustDir,WindGustSpeed,...,Humidity9am,Humidity3pm,Pressure9am,Pressure3pm,Cloud9am,Cloud3pm,Temp9am,Temp3pm,RainToday,RainTomorrow
3191,87198,24-06-2017,Brisbane,10.4,24.5,0.0,3.4,8.9,S,17.0,...,75.0,33.0,1018.6,1015.4,7.0,5.0,14.3,24.0,No,No
3192,87199,25-06-2017,Brisbane,11.0,24.2,0.0,2.2,9.8,ENE,20.0,...,68.0,53.0,1020.5,1017.3,6.0,3.0,15.9,22.6,No,No


The variables Location and Date are unwanted for the further analysis and modelling. It is because the Location feature only contains the "Perth", since the data is only for the Perth Region. Date is also irrelevant because we are not doing any time series analysis or comaprison for now, we are only going to model the data in such a way that we can predict if the rain is going to occur tomorrow or not based on the historical data. Although , we can see that Data is available for each day from July 1st 2008 upto June 25th 2017.

In [4]:
exclude = ["Unnamed: 0","Date", "Location"]
for att in exclude:
 del df[att] # Deleting Date and Location features from the data
df.head()

Unnamed: 0,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustDir,WindGustSpeed,WindDir9am,WindDir3pm,WindSpeed9am,...,Humidity9am,Humidity3pm,Pressure9am,Pressure3pm,Cloud9am,Cloud3pm,Temp9am,Temp3pm,RainToday,RainTomorrow
0,10.3,25.2,0.0,1.4,9.5,WNW,26.0,SSW,W,6.0,...,81.0,37.0,1019.6,1014.8,0.0,1.0,14.9,24.6,No,No
1,8.1,22.9,0.0,2.0,9.8,W,30.0,W,WNW,15.0,...,41.0,30.0,1018.8,1015.0,0.0,0.0,16.2,22.4,No,No
2,9.7,22.4,0.0,5.8,9.4,E,22.0,SW,E,7.0,...,55.0,52.0,1021.4,1019.1,1.0,4.0,15.4,21.3,No,No
3,11.8,20.0,0.8,1.8,1.1,SW,24.0,SW,SSE,9.0,...,76.0,53.0,1023.5,1021.7,7.0,7.0,14.1,19.6,No,No
4,12.3,16.7,0.0,2.0,0.3,S,37.0,S,SSW,11.0,...,81.0,89.0,1027.3,1026.2,7.0,8.0,16.1,15.0,No,Yes


Dealing with the missing values

In [5]:
df.iloc[214] #From visual inspection we can see that some values are missing
# MinTemp value missing at row number 215 meaning at index 214

MinTemp             NaN
MaxTemp            29.4
Rainfall            NaN
Evaporation         8.2
Sunshine           10.9
WindGustDir         ENE
WindGustSpeed      35.0
WindDir9am            E
WindDir3pm          ESE
WindSpeed9am       15.0
WindSpeed3pm       13.0
Humidity9am        72.0
Humidity3pm        64.0
Pressure9am      1017.0
Pressure3pm      1015.0
Cloud9am            5.0
Cloud3pm            6.0
Temp9am            26.1
Temp3pm            28.0
RainToday           NaN
RainTomorrow         No
Name: 214, dtype: object

In [6]:
df.isnull().sum() # Number of Missing values in each column

MinTemp           9
MaxTemp          14
Rainfall         32
Evaporation      19
Sunshine         49
WindGustDir      41
WindGustSpeed    41
WindDir9am       70
WindDir3pm       34
WindSpeed9am      1
WindSpeed3pm      8
Humidity9am       4
Humidity3pm      15
Pressure9am       1
Pressure3pm       8
Cloud9am          1
Cloud3pm          2
Temp9am           4
Temp3pm          15
RainToday        32
RainTomorrow     32
dtype: int64

In [7]:
original_size = len(df) # Initial size of dataset

In [8]:
df = df.dropna(axis = 0) # Dropping rows having missing values
print(df.isnull().sum()) # Seeing how many missing values are there now in dataset
print()
print("Number of rows dropped", original_size-len(df))

MinTemp          0
MaxTemp          0
Rainfall         0
Evaporation      0
Sunshine         0
WindGustDir      0
WindGustSpeed    0
WindDir9am       0
WindDir3pm       0
WindSpeed9am     0
WindSpeed3pm     0
Humidity9am      0
Humidity3pm      0
Pressure9am      0
Pressure3pm      0
Cloud9am         0
Cloud3pm         0
Temp9am          0
Temp3pm          0
RainToday        0
RainTomorrow     0
dtype: int64

Number of rows dropped 240


Boolean Variables to 0s and 1s

In [9]:
bools = ["RainToday", "RainTomorrow"]
# these are the columns in the dataset that can only take one of the two value
# Meaning that It is raining (1) or it isn't (0)
for var in bools:
    df[var] = df[var].map({
        'Yes': 1,
        'No': 0,
    })
df.head()

Unnamed: 0,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustDir,WindGustSpeed,WindDir9am,WindDir3pm,WindSpeed9am,...,Humidity9am,Humidity3pm,Pressure9am,Pressure3pm,Cloud9am,Cloud3pm,Temp9am,Temp3pm,RainToday,RainTomorrow
0,10.3,25.2,0.0,1.4,9.5,WNW,26.0,SSW,W,6.0,...,81.0,37.0,1019.6,1014.8,0.0,1.0,14.9,24.6,0,0
1,8.1,22.9,0.0,2.0,9.8,W,30.0,W,WNW,15.0,...,41.0,30.0,1018.8,1015.0,0.0,0.0,16.2,22.4,0,0
2,9.7,22.4,0.0,5.8,9.4,E,22.0,SW,E,7.0,...,55.0,52.0,1021.4,1019.1,1.0,4.0,15.4,21.3,0,0
3,11.8,20.0,0.8,1.8,1.1,SW,24.0,SW,SSE,9.0,...,76.0,53.0,1023.5,1021.7,7.0,7.0,14.1,19.6,0,0
4,12.3,16.7,0.0,2.0,0.3,S,37.0,S,SSW,11.0,...,81.0,89.0,1027.3,1026.2,7.0,8.0,16.1,15.0,0,1


Cyclic Attributes

![Map Cardinal Directions](CardinalDirections.jpg)

Map Cardinal Directions to Radians

In [10]:
dirs = ['N','NNE','NE','ENE','E','ESE','SE','SSE','S','SSW','SW','WSW','W','WNW','NW','NNW','N']
angles = np.arange(0.0, 2.0*np.pi, 2.0*np.pi/16)
wind_angles = dict(zip(dirs,angles))
print(wind_angles)

{'N': 0.0, 'NNE': 0.39269908169872414, 'NE': 0.7853981633974483, 'ENE': 1.1780972450961724, 'E': 1.5707963267948966, 'ESE': 1.9634954084936207, 'SE': 2.356194490192345, 'SSE': 2.748893571891069, 'S': 3.141592653589793, 'SSW': 3.5342917352885173, 'SW': 3.9269908169872414, 'WSW': 4.319689898685965, 'W': 4.71238898038469, 'WNW': 5.105088062083414, 'NW': 5.497787143782138, 'NNW': 5.890486225480862}


Replace cyclical attributes with sin() and cos(), we are doing this to preserve the cyclical attributes, since the value of 'N' and 'NNW' must be similar but the angle will be very different but their cos and sin values will be closer since those are continous functions.

In [11]:
wind_attributes = ["WindGustDir","WindDir9am","WindDir3pm"]
for att in wind_attributes:
    df[att] = df[att].map(wind_angles)
    df[att +"_cos"] = np.cos(df[att])
    df[att +"_sin"] = np.sin(df[att])
    df = df.drop(columns=att)
df.head()

Unnamed: 0,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustSpeed,WindSpeed9am,WindSpeed3pm,Humidity9am,Humidity3pm,...,Temp9am,Temp3pm,RainToday,RainTomorrow,WindGustDir_cos,WindGustDir_sin,WindDir9am_cos,WindDir9am_sin,WindDir3pm_cos,WindDir3pm_sin
0,10.3,25.2,0.0,1.4,9.5,26.0,6.0,15.0,81.0,37.0,...,14.9,24.6,0,0,0.3826834,-0.9238795,-0.9238795,-0.3826834,-1.83697e-16,-1.0
1,8.1,22.9,0.0,2.0,9.8,30.0,15.0,19.0,41.0,30.0,...,16.2,22.4,0,0,-1.83697e-16,-1.0,-1.83697e-16,-1.0,0.3826834,-0.92388
2,9.7,22.4,0.0,5.8,9.4,22.0,7.0,15.0,55.0,52.0,...,15.4,21.3,0,0,6.123234000000001e-17,1.0,-0.7071068,-0.7071068,6.123234000000001e-17,1.0
3,11.8,20.0,0.8,1.8,1.1,24.0,9.0,7.0,76.0,53.0,...,14.1,19.6,0,0,-0.7071068,-0.7071068,-0.7071068,-0.7071068,-0.9238795,0.382683
4,12.3,16.7,0.0,2.0,0.3,37.0,11.0,7.0,81.0,89.0,...,16.1,15.0,0,1,-1.0,1.224647e-16,-1.0,1.224647e-16,-0.9238795,-0.382683


Extract Target Class

In [12]:
y = df['RainTomorrow']
y.head()

0    0
1    0
2    0
3    0
4    1
Name: RainTomorrow, dtype: int64

Extraxt Other Attribbutes

In [13]:
X = df.drop(columns='RainTomorrow')
X.head()

Unnamed: 0,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustSpeed,WindSpeed9am,WindSpeed3pm,Humidity9am,Humidity3pm,...,Cloud3pm,Temp9am,Temp3pm,RainToday,WindGustDir_cos,WindGustDir_sin,WindDir9am_cos,WindDir9am_sin,WindDir3pm_cos,WindDir3pm_sin
0,10.3,25.2,0.0,1.4,9.5,26.0,6.0,15.0,81.0,37.0,...,1.0,14.9,24.6,0,0.3826834,-0.9238795,-0.9238795,-0.3826834,-1.83697e-16,-1.0
1,8.1,22.9,0.0,2.0,9.8,30.0,15.0,19.0,41.0,30.0,...,0.0,16.2,22.4,0,-1.83697e-16,-1.0,-1.83697e-16,-1.0,0.3826834,-0.92388
2,9.7,22.4,0.0,5.8,9.4,22.0,7.0,15.0,55.0,52.0,...,4.0,15.4,21.3,0,6.123234000000001e-17,1.0,-0.7071068,-0.7071068,6.123234000000001e-17,1.0
3,11.8,20.0,0.8,1.8,1.1,24.0,9.0,7.0,76.0,53.0,...,7.0,14.1,19.6,0,-0.7071068,-0.7071068,-0.7071068,-0.7071068,-0.9238795,0.382683
4,12.3,16.7,0.0,2.0,0.3,37.0,11.0,7.0,81.0,89.0,...,8.0,16.1,15.0,0,-1.0,1.224647e-16,-1.0,1.224647e-16,-0.9238795,-0.382683


In [14]:
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.33,
    random_state=0
)

In [15]:
print('X_train: ',X_train.shape)
print('X_test: ',X_test.shape)
print('y_train: ',y_train.shape)
print('y_test: ',y_test.shape)

X_train:  (1978, 23)
X_test:  (975, 23)
y_train:  (1978,)
y_test:  (975,)


Scaling the data

In [16]:
scaler = StandardScaler()
X_train = scaler.fit(X_train).transform(X_train)
X_test = scaler.fit(X_test).transform(X_test)

Now for each column in X_train and X_test , mean is 0 and standard deviation is 1

![Hidden Layers](HiddenLayer.jpg)

Input Layer Size

In [17]:
print(X_train.shape)

(1978, 23)


The size of Input Layer will be same as the number of features in training set, also the number of hidden layers and number of nodes in a hidden layer needs to be decided. The more the number of nodes and hidden layer, it network will be more capable of handling complex relationsships. But, it will then be more prone to overfitting, and will not generalize enough to produce good results for new data. For now, lets take 2 hidden layers with 50 nodes each.

Instantiate a Neural Network and Train it

In [18]:
X.head()

Unnamed: 0,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustSpeed,WindSpeed9am,WindSpeed3pm,Humidity9am,Humidity3pm,...,Cloud3pm,Temp9am,Temp3pm,RainToday,WindGustDir_cos,WindGustDir_sin,WindDir9am_cos,WindDir9am_sin,WindDir3pm_cos,WindDir3pm_sin
0,10.3,25.2,0.0,1.4,9.5,26.0,6.0,15.0,81.0,37.0,...,1.0,14.9,24.6,0,0.3826834,-0.9238795,-0.9238795,-0.3826834,-1.83697e-16,-1.0
1,8.1,22.9,0.0,2.0,9.8,30.0,15.0,19.0,41.0,30.0,...,0.0,16.2,22.4,0,-1.83697e-16,-1.0,-1.83697e-16,-1.0,0.3826834,-0.92388
2,9.7,22.4,0.0,5.8,9.4,22.0,7.0,15.0,55.0,52.0,...,4.0,15.4,21.3,0,6.123234000000001e-17,1.0,-0.7071068,-0.7071068,6.123234000000001e-17,1.0
3,11.8,20.0,0.8,1.8,1.1,24.0,9.0,7.0,76.0,53.0,...,7.0,14.1,19.6,0,-0.7071068,-0.7071068,-0.7071068,-0.7071068,-0.9238795,0.382683
4,12.3,16.7,0.0,2.0,0.3,37.0,11.0,7.0,81.0,89.0,...,8.0,16.1,15.0,0,-1.0,1.224647e-16,-1.0,1.224647e-16,-0.9238795,-0.382683


In [19]:
nn = MLPClassifier(
    hidden_layer_sizes=(50,50),
    random_state=0,
    max_iter=500,
) 
nn.fit(X_train,y_train)

Predict Target Class for testing Data

In [20]:
y_pred = nn.predict(X_test)
print(accuracy_score(y_test, y_pred))

0.8451282051282051


Search for the best Network Layout

In [21]:
parameters = {
    'hidden_layer_sizes': ((2,),(10,),(25,40),(50,50))
}
nn = MLPClassifier(max_iter=3000, random_state=0)
gs = GridSearchCV(nn, parameters, cv = 3)
gs.fit(X_train, y_train)

Display GridSearch Results

In [22]:
print(gs.cv_results_['params'])
print()
print(gs.cv_results_['mean_test_score'])

[{'hidden_layer_sizes': (2,)}, {'hidden_layer_sizes': (10,)}, {'hidden_layer_sizes': (25, 40)}, {'hidden_layer_sizes': (50, 50)}]

[0.83164345 0.83367744 0.80839579 0.82002728]


Predictions Using best Neural Network

In [23]:
best_nn = gs.best_estimator_
y_pred = best_nn.predict(X_test)
print(accuracy_score(y_test, y_pred))

0.8717948717948718


Although the accuracy is same as previous estimnator but the training time is greatly reduced, since previosuly the there were 2 hidden layers having 50 nodes each and in this there is only one layer with 2 nodes. Even if this new network is simpler but the accuracy is not compromised and this is a good thing.