The challenge is to build a neural network that predicts the magnitude of an earthquake from the date, time, latitude, and longitude. This is the dataset. Optimize at least one hyperparameter using random search. See this example for more information.


You can use any library you like; bonus points are given if you use only numpy.



In [1]:
import numpy as np
import pandas as pd

import matplotlib.pyplot as plt
%matplotlib inline

In [2]:
data = pd.read_csv('database.csv')[['Date','Time','Latitude','Longitude','Magnitude']]

In [3]:
y = data['Magnitude'].values

In [4]:
x = data[['Latitude','Longitude']].values
lons = data['Longitude'].values
lats = data['Latitude'].values

In [5]:
years = []
for date in data['Date'].values:
    year = date[-4:]  # the year is expected in the last four characters
    try:
        years.append(int(year))
    except ValueError:
        # some rows use a different date layout; mark them as missing for now
        print("got a weird version")
        years.append(np.nan)

data['Year'] = years
del data['Date']
# back-fill the missing years from the next row, assuming it is the same year
data.bfill(inplace=True)

got a weird version
got a weird version
got a weird version
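
The same year extraction can be sketched without an explicit loop. This is only an illustration under assumptions: the three-row `sample` frame is made up, and its middle row imitates a date stored in a different layout, which may not be what the real file contains.

```python
import pandas as pd

# Hypothetical sample; the middle row imitates a date stored in another layout.
sample = pd.DataFrame(
    {"Date": ["12/30/1974", "1975-02-23T02:58:41.000Z", "02/25/1975"]}
)

# Take the last four characters and coerce anything non-numeric to NaN.
year = pd.to_numeric(sample["Date"].str[-4:], errors="coerce")

# Back-fill the gaps from the next valid row, as the loop-based cell does.
sample["Year"] = year.bfill()
print(sample)
```

Note that a back-fill takes the year from the following row, so it is only a reasonable guess when the rows are sorted by date.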


For starters, let's plot latitude against longitude, colored by magnitude; framing this as a binary problem with a reasonable cutoff is another option.

In [None]:
plt.scatter(lons,lats,c=y,cmap='viridis')
plt.colorbar()

So it looks like lat and long don't play a huge role in what the magnitude is, but you can see the outlines of continents :)

So now it's time to build a model, with year, lat, and lon as inputs, regressing to magnitude. First comes a min-max normalization and a train/test split, then the construction of the model.
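
For reference, min-max normalization maps each column to [0, 1] independently. The snippet below is a small numpy sketch of that formula on made-up rows (`X_demo` is hypothetical); it mirrors what `MinMaxScaler` computes with its default feature range.

```python
import numpy as np

# Hypothetical rows of (latitude, longitude, year).
X_demo = np.array([[ 10.0, -120.0, 1965.0],
                   [-30.0,   60.0, 1990.0],
                   [ 50.0,  150.0, 2016.0]])

col_min = X_demo.min(axis=0)
col_max = X_demo.max(axis=0)

# Each column is rescaled on its own: (value - min) / (max - min).
X_scaled = (X_demo - col_min) / (col_max - col_min)
print(X_scaled)
```

In practice the minimum and maximum should come from the training split only and then be reused on the test split.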

In [6]:
y = y.reshape(np.shape(y)[0],1)

I will be changing the model so there is an additional hidden layer. I will also try to code the NN from memory, to help memorize the architecture and logic of these models.

In [7]:
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split


X = data[['Latitude','Longitude','Year']].values
y = data['Magnitude'].values




In [8]:
#removing the last 12 values to simplify batch size 
X = X[0:23400]
y = y[0:23400]

In [9]:
y = y.reshape(np.shape(y)[0],1)

In [10]:
X_train,X_test,y_train,y_test = train_test_split(X,y,shuffle=True)

scaler = MinMaxScaler()

X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)  # reuse the training min/max on the test split

In [31]:
X_train.shape[0]/50

351.0
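
A result of 351.0 means the training split has 351 * 50 = 17550 rows. One way to turn that into batch slices is sketched below; `X_demo` is placeholder data standing in for `X_train`, and the variable names are illustrative.

```python
import numpy as np

n_rows, batch_size = 17550, 50   # 17550 = 351 * 50
X_demo = np.arange(n_rows * 3, dtype=float).reshape(n_rows, 3)  # placeholder data
n_batches = n_rows // batch_size

seen = 0
for b in range(n_batches):
    # Batch b covers rows [b * batch_size, (b + 1) * batch_size).
    X_batch = X_demo[b * batch_size:(b + 1) * batch_size]
    seen += X_batch.shape[0]

print(n_batches, seen)
```

Indexing by the batch counter guarantees that every row is visited exactly once per epoch.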

In [32]:
#Now initialize hyperparameters:

num_epochs = 52560 #*10 
#something not included before: training can now run in batches.
batches = 351  # 17550 training rows / 351 batches = batch size of 50
syn0 = 2 * np.random.random((3,5)) - 1  # weights: 3 inputs -> first hidden layer (5 units)
syn1 = 2 * np.random.random((5,4)) - 1  # weights: first hidden layer -> second hidden layer (4 units)
syn2 = 2 * np.random.random((4,1)) - 1  # weights: second hidden layer -> output

In [33]:
X_train[0:50]
X_train[50*i:(50*i)+10]

array([[ 0.4245727 ,  0.85197253,  0.15686275],
       [ 0.59233501,  0.0690956 ,  0.35294118],
       [ 0.69491171,  0.77806852,  0.56862745],
       [ 0.79152894,  0.94338795,  0.35294118],
       [ 0.38940565,  0.96290216,  0.88235294],
       [ 0.37719794,  0.96416329,  0.        ],
       [ 0.22042073,  0.73759285,  0.82352941],
       [ 0.58729096,  0.21630083,  0.60784314],
       [ 0.24674072,  0.99221931,  0.49019608],
       [ 0.50879176,  0.4085957 ,  0.25490196]])

In [34]:
#activation function goes here. 
def nonlin(x,deriv = False):
    """This is the sigmoid function.

    With deriv=True, x is expected to already be a sigmoid output, and the
    derivative x*(1-x) is returned for use in gradient descent.
    """
    if deriv:
        return x*(1-x)

    return 1/(1+np.exp(-x))
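
The `x*(1-x)` shortcut relies on the identity that the sigmoid's derivative equals s(z) * (1 - s(z)), where s(z) is the sigmoid output. A quick numerical check of that identity, using a standalone `sigmoid` helper written only for this sketch:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z = np.linspace(-4.0, 4.0, 9)
s = sigmoid(z)

# Analytic derivative written in terms of the sigmoid output.
analytic = s * (1.0 - s)

# Central finite difference as an independent estimate.
eps = 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)

max_gap = np.max(np.abs(analytic - numeric))
print(max_gap)
```

The two estimates should agree closely, which is why the backward pass can pass layer activations straight into the derivative branch.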

In [37]:
#now for training day:

for i in range(num_epochs):
    for batch in range(batches):
        # slice by the batch index; the epoch index i would run past the data
        X_batch = X_train[50*batch:(50*batch)+50]
        y_batch = y_train[50*batch:(50*batch)+50]
        
        k0 = X_batch
        k1 = nonlin(np.dot(k0,syn0)) #activation applied onto the first hidden layer
        k2 = nonlin(np.dot(k1,syn1)) #activation onto the second layer
        k3 = nonlin(np.dot(k2,syn2)) #activation onto the output layer, this is the resulting guess. 
    
        k3_error = y_batch - k3  # how far off you were. This is a fairl
    
        k3_delta = k3_error * nonlin(k3,deriv=True)
    
        # and backpropagation goes here.
        # the error in hidden layer 2 is the dot product of the next layer's delta
        # with the transpose of the weights connecting the two layers.
        k2_error = k3_delta.dot(syn2.T)
    
        k2_delta = k2_error * nonlin(k2,deriv=True)
    
        k1_error = k2_delta.dot(syn1.T)
        k1_delta = k1_error * nonlin(k1,deriv=True)
    
    

        # update weights
        syn0 += k0.T.dot(k1_delta)
        syn1 += k1.T.dot(k2_delta)
        syn2 += k2.T.dot(k3_delta)
    if (i% 10000) == 0:  #quick sanity check for every 10000 epochs.  The error should drop!
        print("Error: " + str(np.mean(np.abs(k3_error))))
        

Error: 4.824


KeyboardInterrupt: 
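
One possible reading of the stalled error: the output layer is a sigmoid, which never exceeds 1, while a mean absolute error of 4.824 implies targets well above 1. Under that assumption the network cannot reach the targets no matter what the weights are. The sketch below uses made-up magnitudes (`y_demo`) to show the error floor and a min-max scaling of the target that would bring it into the sigmoid's range.

```python
import numpy as np

y_demo = np.array([[5.5], [6.0], [7.2]])   # hypothetical magnitudes

# Best case for an unscaled target: the sigmoid output saturates near 1.
best_output = np.ones_like(y_demo)
floor_error = np.mean(np.abs(y_demo - best_output))

# Min-max scaling the target puts it inside the range the sigmoid can reach.
y_min, y_max = y_demo.min(), y_demo.max()
y_scaled = (y_demo - y_min) / (y_max - y_min)

# Predictions in [0, 1] map back to magnitudes with the inverse transform.
y_back = y_scaled * (y_max - y_min) + y_min
print(floor_error)
```

An alternative with the same effect would be a linear output layer instead of a sigmoid.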

So it runs, but it isn't learning. Based on the preliminary data exploration this seemed like it was going to be the case, but regardless, going through a neural network and building it from memory was a pretty good exercise!
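
The random-search part of the challenge is not covered above. A minimal numpy-only sketch of how it could look is given here, with several assumptions stated up front: the data is synthetic and already scaled to [0, 1], the network has a single hidden layer for brevity (not the two used in this notebook), the tuned hyperparameter is a learning rate `lr` that the notebook's update rule does not have, and `train_and_score` is a hypothetical helper.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_and_score(X_tr, y_tr, X_val, y_val, lr,
                    hidden=5, epochs=30, batch_size=50, seed=0):
    """Train a one-hidden-layer sigmoid net; return validation mean abs error."""
    rng = np.random.default_rng(seed)
    w0 = rng.uniform(-1, 1, (X_tr.shape[1], hidden))
    w1 = rng.uniform(-1, 1, (hidden, 1))
    n_batches = X_tr.shape[0] // batch_size
    for _ in range(epochs):
        for b in range(n_batches):
            xb = X_tr[b * batch_size:(b + 1) * batch_size]
            yb = y_tr[b * batch_size:(b + 1) * batch_size]
            h = sigmoid(xb @ w0)
            out = sigmoid(h @ w1)
            out_delta = (yb - out) * out * (1 - out)
            h_delta = (out_delta @ w1.T) * h * (1 - h)
            # Average over the batch so the step does not scale with batch_size.
            w1 += lr * (h.T @ out_delta) / batch_size
            w0 += lr * (xb.T @ h_delta) / batch_size
    pred = sigmoid(sigmoid(X_val @ w0) @ w1)
    return float(np.mean(np.abs(y_val - pred)))

# Synthetic stand-in data: features and target both lie in [0, 1].
rng = np.random.default_rng(42)
X_all = rng.random((1000, 3))
y_all = 0.5 * X_all[:, :1] + 0.3 * X_all[:, 1:2] + 0.1   # hypothetical target
X_tr, X_val = X_all[:800], X_all[800:]
y_tr, y_val = y_all[:800], y_all[800:]

# Random search: sample learning rates log-uniformly between 1e-3 and 1e1.
n_trials = 8
candidates = 10 ** rng.uniform(-3, 1, size=n_trials)
scores = [train_and_score(X_tr, y_tr, X_val, y_val, lr) for lr in candidates]

best = int(np.argmin(scores))
best_lr, best_score = candidates[best], scores[best]
print(f"best learning rate {best_lr:.4f} with validation MAE {best_score:.4f}")
```

Sampling on a log scale spreads the trials evenly across orders of magnitude, which tends to suit learning rates better than a uniform draw. The same loop could sample the hidden layer size or the batch size instead.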