### Predicting the Weather with Neural Networks

Example Neural Network

![Example Neural Network](ExampleNN.png)

Importing Useful Libraries

In [1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import GridSearchCV

Readin the Data into Program Memory

In [2]:
df = pd.read_csv("Data\WeatherPerth_csv.csv")
df.head(2)

Unnamed: 0,Date,Location,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustDir,WindGustSpeed,WindDir9am,...,Humidity9am,Humidity3pm,Pressure9am,Pressure3pm,Cloud9am,Cloud3pm,Temp9am,Temp3pm,RainToday,RainTomorrow
0,01-07-2008,Perth,2.7,18.8,0.0,0.8,9.1,ENE,20.0,,...,97.0,53.0,1027.6,1024.5,2.0,3.0,8.5,18.1,No,No
1,02-07-2008,Perth,6.4,20.7,0.0,1.8,7.0,NE,22.0,ESE,...,80.0,39.0,1024.1,1019.0,0.0,6.0,11.1,19.7,No,No


In [3]:
df.tail(2)

Unnamed: 0,Date,Location,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustDir,WindGustSpeed,WindDir9am,...,Humidity9am,Humidity3pm,Pressure9am,Pressure3pm,Cloud9am,Cloud3pm,Temp9am,Temp3pm,RainToday,RainTomorrow
3191,24-06-2017,Perth,11.5,18.2,0.0,3.8,9.3,SE,30.0,ESE,...,62.0,47.0,1025.9,1023.4,1.0,3.0,14.0,17.6,No,No
3192,25-06-2017,Perth,6.3,17.0,0.0,1.6,7.9,E,26.0,SE,...,75.0,49.0,1028.6,1026.0,1.0,3.0,11.5,15.6,No,No


The variables Location and Date are unwanted for the further analysis and modelling. It is because the Location feature only contains the "Perth", since the data is only for the Perth Region. Date is also irrelevant because we are not doing any time series analysis or comaprison for now, we are only going to model the data in such a way that we can predict if the rain is going to occur tomorrow or not based on the historical data. Although , we can see that Data is available for each day from July 1st 2008 upto June 25th 2017.

In [4]:
exclude = ["Date", "Location"]
for att in exclude:
 del df[att] # Deleting Date and Location features from the data
df.head()

Unnamed: 0,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustDir,WindGustSpeed,WindDir9am,WindDir3pm,WindSpeed9am,...,Humidity9am,Humidity3pm,Pressure9am,Pressure3pm,Cloud9am,Cloud3pm,Temp9am,Temp3pm,RainToday,RainTomorrow
0,2.7,18.8,0.0,0.8,9.1,ENE,20.0,,E,0,...,97.0,53.0,1027.6,1024.5,2.0,3.0,8.5,18.1,No,No
1,6.4,20.7,0.0,1.8,7.0,NE,22.0,ESE,ENE,6,...,80.0,39.0,1024.1,1019.0,0.0,6.0,11.1,19.7,No,No
2,6.5,19.9,0.4,2.2,7.3,NE,31.0,,WNW,0,...,84.0,71.0,1016.8,1015.6,1.0,3.0,12.1,17.7,No,Yes
3,9.5,19.2,1.8,1.2,4.7,W,26.0,NNE,NNW,11,...,93.0,73.0,1019.3,1018.4,6.0,6.0,13.2,17.7,Yes,Yes
4,9.5,16.4,1.8,1.4,4.9,WSW,44.0,W,SW,13,...,69.0,57.0,1020.4,1022.1,7.0,5.0,15.9,16.0,Yes,Yes


Dealing with the missing values

In [5]:
df.iloc[214] #From visual inspection we can see that some values are missing
# Sunshine value missing at row number 215 meaning at index 214

MinTemp            23.3
MaxTemp            36.0
Rainfall            0.0
Evaporation         5.6
Sunshine            NaN
WindGustDir          SW
WindGustSpeed      31.0
WindDir9am            E
WindDir3pm           SE
WindSpeed9am         15
WindSpeed3pm        6.0
Humidity9am        63.0
Humidity3pm        42.0
Pressure9am      1008.6
Pressure3pm      1005.9
Cloud9am            3.0
Cloud3pm            5.0
Temp9am            26.6
Temp3pm            34.8
RainToday            No
RainTomorrow         No
Name: 214, dtype: object

In [6]:
df.isnull().sum() # Number of Missing values in each column

MinTemp            0
MaxTemp            1
Rainfall           0
Evaporation        1
Sunshine           5
WindGustDir        5
WindGustSpeed      5
WindDir9am       134
WindDir3pm         7
WindSpeed9am       0
WindSpeed3pm       1
Humidity9am        9
Humidity3pm        8
Pressure9am        1
Pressure3pm        1
Cloud9am           2
Cloud3pm           4
Temp9am            0
Temp3pm            1
RainToday          0
RainTomorrow       0
dtype: int64

In [7]:
original_size = len(df) # Initial size of dataset

In [8]:
df = df.dropna(axis = 0) # Dropping rows having missing values
print(df.isnull().sum()) # Seeing how many missing values are there now in dataset
print()
print("Number of rows dropped", original_size-len(df))

MinTemp          0
MaxTemp          0
Rainfall         0
Evaporation      0
Sunshine         0
WindGustDir      0
WindGustSpeed    0
WindDir9am       0
WindDir3pm       0
WindSpeed9am     0
WindSpeed3pm     0
Humidity9am      0
Humidity3pm      0
Pressure9am      0
Pressure3pm      0
Cloud9am         0
Cloud3pm         0
Temp9am          0
Temp3pm          0
RainToday        0
RainTomorrow     0
dtype: int64

Number of rows dropped 168


Boolean Variables to 0s and 1s

In [9]:
bools = ["RainToday", "RainTomorrow"]
# these are the columns in the dataset that can only take one of the two value
# Meaning that It is raining (1) or it isn't (0)
for var in bools:
    df[var] = df[var].map({
        'Yes': 1,
        'No': 0,
    })
df.head()

Unnamed: 0,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustDir,WindGustSpeed,WindDir9am,WindDir3pm,WindSpeed9am,...,Humidity9am,Humidity3pm,Pressure9am,Pressure3pm,Cloud9am,Cloud3pm,Temp9am,Temp3pm,RainToday,RainTomorrow
1,6.4,20.7,0.0,1.8,7.0,NE,22.0,ESE,ENE,6,...,80.0,39.0,1024.1,1019.0,0.0,6.0,11.1,19.7,0,0
3,9.5,19.2,1.8,1.2,4.7,W,26.0,NNE,NNW,11,...,93.0,73.0,1019.3,1018.4,6.0,6.0,13.2,17.7,1,1
4,9.5,16.4,1.8,1.4,4.9,WSW,44.0,W,SW,13,...,69.0,57.0,1020.4,1022.1,7.0,5.0,15.9,16.0,1,1
5,0.7,15.9,6.8,2.4,9.3,NNE,24.0,ENE,NE,4,...,86.0,41.0,1032.0,1029.6,0.0,1.0,6.9,15.5,1,0
6,0.7,18.3,0.0,0.8,9.3,N,37.0,NE,NNE,15,...,72.0,36.0,1028.9,1024.2,1.0,5.0,8.7,17.9,0,0


Cyclic Attributes

![Map Cardinal Directions](CardinalDirections.png)

Map Cardinal Directions to Radians

In [10]:
dirs = ['N','NNE','NE','ENE','E','ESE','SE','SSE','S','SSW','SW','WSW','W','WNW','NW','NNW','N']
angles = np.arange(0.0, 2.0*np.pi, 2.0*np.pi/16)
wind_angles = dict(zip(dirs,angles))
print(wind_angles)

{'N': 0.0, 'NNE': 0.39269908169872414, 'NE': 0.7853981633974483, 'ENE': 1.1780972450961724, 'E': 1.5707963267948966, 'ESE': 1.9634954084936207, 'SE': 2.356194490192345, 'SSE': 2.748893571891069, 'S': 3.141592653589793, 'SSW': 3.5342917352885173, 'SW': 3.9269908169872414, 'WSW': 4.319689898685965, 'W': 4.71238898038469, 'WNW': 5.105088062083414, 'NW': 5.497787143782138, 'NNW': 5.890486225480862}


Replace cyclical attributes with sin() and cos(), we are doing this to preserve the cyclical attributes, since the value of 'N' and 'NNW' must be similar but the angle will be very different but their cos and sin values will be closer since those are continous functions.

In [11]:
wind_attributes = ["WindGustDir","WindDir9am","WindDir3pm"]
for att in wind_attributes:
    df[att] = df[att].map(wind_angles)
    df[att +"_cos"] = np.cos(df[att])
    df[att +"_sin"] = np.sin(df[att])
    df = df.drop(columns=att)
df.head()

Unnamed: 0,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustSpeed,WindSpeed9am,WindSpeed3pm,Humidity9am,Humidity3pm,...,Temp9am,Temp3pm,RainToday,RainTomorrow,WindGustDir_cos,WindGustDir_sin,WindDir9am_cos,WindDir9am_sin,WindDir3pm_cos,WindDir3pm_sin
1,6.4,20.7,0.0,1.8,7.0,22.0,6,9.0,80.0,39.0,...,11.1,19.7,0,0,0.7071068,0.707107,-0.3826834,0.92388,0.382683,0.92388
3,9.5,19.2,1.8,1.2,4.7,26.0,11,6.0,93.0,73.0,...,13.2,17.7,1,1,-1.83697e-16,-1.0,0.9238795,0.382683,0.92388,-0.382683
4,9.5,16.4,1.8,1.4,4.9,44.0,13,17.0,69.0,57.0,...,15.9,16.0,1,1,-0.3826834,-0.92388,-1.83697e-16,-1.0,-0.707107,-0.707107
5,0.7,15.9,6.8,2.4,9.3,24.0,4,7.0,86.0,41.0,...,6.9,15.5,1,0,0.9238795,0.382683,0.3826834,0.92388,0.707107,0.707107
6,0.7,18.3,0.0,0.8,9.3,37.0,15,13.0,72.0,36.0,...,8.7,17.9,0,0,1.0,0.0,0.7071068,0.707107,0.92388,0.382683


Extract Target Class

In [12]:
y = df['RainTomorrow']
y.head()

1    0
3    1
4    1
5    0
6    0
Name: RainTomorrow, dtype: int64

Extraxt Other Attribbutes

In [13]:
X = df.drop(columns='RainTomorrow')
X.head()

Unnamed: 0,MinTemp,MaxTemp,Rainfall,Evaporation,Sunshine,WindGustSpeed,WindSpeed9am,WindSpeed3pm,Humidity9am,Humidity3pm,...,Cloud3pm,Temp9am,Temp3pm,RainToday,WindGustDir_cos,WindGustDir_sin,WindDir9am_cos,WindDir9am_sin,WindDir3pm_cos,WindDir3pm_sin
1,6.4,20.7,0.0,1.8,7.0,22.0,6,9.0,80.0,39.0,...,6.0,11.1,19.7,0,0.7071068,0.707107,-0.3826834,0.92388,0.382683,0.92388
3,9.5,19.2,1.8,1.2,4.7,26.0,11,6.0,93.0,73.0,...,6.0,13.2,17.7,1,-1.83697e-16,-1.0,0.9238795,0.382683,0.92388,-0.382683
4,9.5,16.4,1.8,1.4,4.9,44.0,13,17.0,69.0,57.0,...,5.0,15.9,16.0,1,-0.3826834,-0.92388,-1.83697e-16,-1.0,-0.707107,-0.707107
5,0.7,15.9,6.8,2.4,9.3,24.0,4,7.0,86.0,41.0,...,1.0,6.9,15.5,1,0.9238795,0.382683,0.3826834,0.92388,0.707107,0.707107
6,0.7,18.3,0.0,0.8,9.3,37.0,15,13.0,72.0,36.0,...,5.0,8.7,17.9,0,1.0,0.0,0.7071068,0.707107,0.92388,0.382683


In [14]:
X_train, X_test, y_train, y_test = train_test_split(
    X,
    y,
    test_size=0.33,
    random_state=0
)

In [15]:
print('X_train: ',X_train.shape)
print('X_test: ',X_test.shape)
print('y_train: ',y_train.shape)
print('y_test: ',y_test.shape)

X_train:  (2026, 23)
X_test:  (999, 23)
y_train:  (2026,)
y_test:  (999,)


Scaling the data

In [16]:
scaler = StandardScaler()
X_train = scaler.fit(X_train).transform(X_train)
X_test = scaler.fit(X_test).transform(X_test)

Now for each column in X_train and X_test , mean is 0 and standard deviation is 1

![Hidden Layers](HiddenLayer.png)

Input Layer Size

In [19]:
print(X_train.shape)

(2026, 23)


The size of Input Layer will be same as the number of features in training set, also the number of hidden layers and number of nodes in a hidden layer needs to be decided. The more the number of nodes and hidden layer, it network will be more capable of handling complex relationsships. But, it will then be more prone to overfitting, and will not generalize enough to produce good results for new data. For now, lets take 2 hidden layers with 50 nodes each.

Instantiate a Neural Network and Train it

In [20]:
nn = MLPClassifier(
    hidden_layer_sizes=(50,50),
    random_state=0,
    max_iter=500,
) 
nn.fit(X_train,y_train)

Predict Target Class for testing Data

In [21]:
y_pred = nn.predict(X_test)
print(accuracy_score(y_test, y_pred))

0.8908908908908909


Search for the best Network Layout

In [22]:
parameters = {
    'hidden_layer_sizes': ((2,),(10,),(25,40),(50,50))
}
nn = MLPClassifier(max_iter=3000, random_state=0)
gs = GridSearchCV(nn, parameters, cv = 3)
gs.fit(X_train, y_train)

Display GridSearch Results

In [23]:
print(gs.cv_results_['params'])
print()
print(gs.cv_results_['mean_test_score'])

[{'hidden_layer_sizes': (2,)}, {'hidden_layer_sizes': (10,)}, {'hidden_layer_sizes': (25, 40)}, {'hidden_layer_sizes': (50, 50)}]

[0.90424355 0.89487179 0.88154211 0.88450873]


Predictions Using best Neural Network

In [24]:
best_nn = gs.best_estimator_
y_pred = best_nn.predict(X_test)
print(accuracy_score(y_test, y_pred))

0.8938938938938938


Although the accuracy is same as previous estimnator but the training time is greatly reduced, since previosuly the there were 2 hidden layers having 50 nodes each and in this there is only one layer with 2 nodes. Even if this new network is simpler but the accuracy is not compromised and this is a good thing.