Neural networks: example Python workflow

Import the necessary Python modules.  In this case we need only NumPy for further processing.

In [2]:
import numpy as np

OK, we have our module loaded.  Now the task is to load our data into arrays for further processing.

In [5]:
X=np.genfromtxt('data.txt',delimiter=',')

Now we have all of our data, but it is important to check:
1.  To see if we have all of it.
2.  To see if it makes sense.

Do this by looking at shapes, sizes and colors
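A few quick checks along these lines might look like the sketch below.  It uses a tiny made-up array, X_demo, rather than the real data, and it relies on np.genfromtxt filling missing fields with nan by default, so counting nan entries is one way to spot gaps.

```python
import numpy as np

# Hypothetical stand-in for the loaded array; the nan mimics a missing field.
X_demo = np.array([[0.1, 0.2, 0.3, 0.6367],
                   [0.4, 0.5, np.nan, 1.2]])

n_missing = int(np.isnan(X_demo).sum())   # count missing entries
col_min = np.nanmin(X_demo, axis=0)       # per-column minimum, ignoring nan
col_max = np.nanmax(X_demo, axis=0)       # per-column maximum, ignoring nan

print(X_demo.shape)
print(n_missing)
print(col_min)
print(col_max)
```

Looking at the per-column ranges is a cheap way to see whether the values "make sense" for the problem at hand.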

In [8]:
Nv,N=X.shape
print(X.shape,Nv,N)

(100, 4) 100 4


Each row of the data is in the format x_1,x_2,x_3,sinh(x_1+x_2+x_3): three inputs followed by the target.  I want to grab the last column because it contains our target vector Y.
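If you do not have data.txt at hand, a stand-in file in the same format can be generated.  This is only a sketch: the uniform [0, 1) inputs, the seed, and the file name demo_data.txt are my assumptions, not a description of how the original file was made.

```python
import os
import tempfile
import numpy as np

rng = np.random.RandomState(0)          # fixed seed, chosen arbitrarily
inputs = rng.rand(100, 3)               # assumed: inputs uniform in [0, 1)
target = np.sinh(inputs.sum(axis=1))    # sinh(x_1 + x_2 + x_3)
demo = np.column_stack([inputs, target])

# Write to a temporary directory so no real file is overwritten.
path = os.path.join(tempfile.mkdtemp(), 'demo_data.txt')  # hypothetical file name
np.savetxt(path, demo, delimiter=',')

# Read it back the same way the notebook reads data.txt.
reloaded = np.genfromtxt(path, delimiter=',')
print(reloaded.shape)
```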

In [7]:
y_col=X[:,-1] #This contains the last column

To account for the bias term we need a column of ones, which means our X array needs one additional column.  We already have the sizes of the original array, Nv and N, from the shape check above.  I will do the resizing a different way from last time: I'll allocate an array, x, of the correct size and then copy the original array, X, into it.

In [9]:
x=np.zeros((Nv,N+1))
print(x.shape)

(100, 5)


Now I have the needed array.  I just need to copy the old values in!  Maybe I am uncomfortable with this operation, so I sanity check by printing out shapes before transfer.

In [10]:
print(x[:,:-1].shape)

(100, 4)


In [11]:
x[:,:-1]=X

The transfer has occurred and I am in business, except...I do not have my column of ones yet, and I do not want to overwrite the target vector Y, which currently sits in the next-to-last column of x.

One inquisitive student, Abraham, wants to know if the last column is indeed zeros.  I will satisfy him right now by printing out the last column.

In [12]:
print(x[:,-1])

[0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 0. 0. 0. 0.]


Yay!  Now I need to copy the target vector into the last column.

In [14]:
x[:,N]=y_col

Now the last two columns should be identical.  Are they?  I'll check maybe the first 5 rows and infer that the rest work out.

In [17]:
print(x[:5,-2:])

[[3.25995146 3.25995146]
 [1.21035987 1.21035987]
 [1.20954846 1.20954846]
 [1.90077105 1.90077105]
 [3.37680794 3.37680794]]


Now I merely need to overwrite the next-to-last column with a column of ones.

In [18]:
x[:,-2]=np.ones((Nv))

Now I just need to...once again...sanity check my results.

In [19]:
print(x[30:35,:])

[[0.79117375 0.18919469 0.32776426 1.         1.71446765]
 [0.36692967 0.44009942 0.92279323 1.         2.73116792]
 [0.25537941 0.69284941 0.63459244 1.         2.33163872]
 [0.77489764 0.85272623 0.2290415  1.         3.12307953]
 [0.84849869 0.17401053 0.9808345  1.         3.63946042]]
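For comparison, the same layout (inputs, then ones, then target) can be built in one step with np.column_stack.  The sketch below uses a small made-up array, X_demo, and checks that both routes give the same result; it is an alternative, not a replacement for the step-by-step version above.

```python
import numpy as np

# Hypothetical stand-in for X: three inputs followed by the target.
X_demo = np.array([[0.1, 0.2, 0.3, np.sinh(0.6)],
                   [0.4, 0.5, 0.6, np.sinh(1.5)]])
Nv_demo, N_demo = X_demo.shape

# Step-by-step route used in the notebook.
x_steps = np.zeros((Nv_demo, N_demo + 1))
x_steps[:, :-1] = X_demo              # copy the old values in
x_steps[:, N_demo] = X_demo[:, -1]    # target into the last column
x_steps[:, -2] = np.ones(Nv_demo)     # ones into the next-to-last column

# One-step alternative: inputs, then ones, then target.
x_stack = np.column_stack([X_demo[:, :-1], np.ones(Nv_demo), X_demo[:, -1]])

print(np.array_equal(x_steps, x_stack))
```

The step-by-step route is easier to sanity check along the way; the one-step route leaves less room for overwriting the wrong column.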


Crisco, eat your heart out...I'm cooking!  Now we get to the actual neural network processing.  The stage above is preprocessing; in general it can consume a lot of time, but it is vital in any machine learning application.

We start by randomizing the input and output weights.  Initialization is a critical step in training and is the subject of considerable research.  For now we are going to use naive randomization; a popular alternative is Xavier (Glorot) initialization.

Yikes!  I have not specified my topology in this code, so now is a good time to do so.  I want 2 hidden units and one output.

In [25]:
Nh=2  # number of hidden units
M=1   # number of outputs
N1=3  # number of inputs (x_1, x_2, x_3)

In this code my input weight array is given by W and my output weight array by Woh.  Also, I call the number of inputs N1; the extra column in W multiplies the column of ones for the bias.

In [26]:
W=np.random.rand(Nh,N1+1)
Woh=np.random.rand(M,Nh)
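For reference, here is a minimal sketch of the Xavier (Glorot) uniform scheme mentioned above, which draws weights from a uniform distribution on [-limit, limit] with limit = sqrt(6/(fan_in + fan_out)).  The helper name glorot_uniform, the seed, and the _demo variables are mine, and I treat the bias column like any other input, which is a simplification.

```python
import numpy as np

def glorot_uniform(fan_out, fan_in, rng):
    """Hypothetical helper: uniform weights in [-limit, limit]."""
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_out, fan_in))

rng = np.random.RandomState(0)   # fixed seed, chosen arbitrarily
Nh_demo, N1_demo, M_demo = 2, 3, 1

W_demo = glorot_uniform(Nh_demo, N1_demo + 1, rng)   # hidden weights incl. bias column
Woh_demo = glorot_uniform(M_demo, Nh_demo, rng)      # output weights

print(W_demo.shape)
print(Woh_demo.shape)
```

Unlike np.random.rand, which only produces values in [0, 1), this gives weights of both signs, scaled by the layer sizes.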

I have my arrays and am now ready for full processing.

First we compute the net function.  Note that I am a bit worried about shapes, so I print them along the way.

In [29]:
temp_x=x[:,:-1]  # inputs plus the bias column; the target column is dropped
print(temp_x.shape,W.shape,Woh.shape)
net=np.dot(W,temp_x.T)
print(net.shape)

(100, 4) (2, 4) (1, 2)
(2, 100)


I now have my net function.  The next stage in the pipeline is to apply an activation.  For this design the activation is the hyperbolic tangent, and the result is stored in the variable O.

In [30]:
O=np.tanh(net)
print(O.shape)

(2, 100)


Now I need to apply the output weights to O for my final result.

In [31]:
y_hat=np.dot(Woh,O)
print(y_hat.shape)

(1, 100)
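The three steps above (net function, activation, output weights) can be collected into one function.  This is just a sketch: the name forward and the randomly generated _demo arrays are mine, and the shapes follow the ones printed above.

```python
import numpy as np

def forward(x_in, W, Woh):
    """Hypothetical helper: x_in is (Nv, N1+1) with the bias column last."""
    net = np.dot(W, x_in.T)      # (Nh, Nv) net function
    O = np.tanh(net)             # (Nh, Nv) hidden-unit outputs
    return np.dot(Woh, O), O     # (M, Nv) network output, plus hidden outputs

rng = np.random.RandomState(0)                               # arbitrary seed
x_demo = np.column_stack([rng.rand(100, 3), np.ones(100)])   # inputs plus ones
W_demo = rng.rand(2, 4)
Woh_demo = rng.rand(1, 2)

y_hat_demo, O_demo = forward(x_demo, W_demo, Woh_demo)
print(y_hat_demo.shape)
print(O_demo.shape)
```

Returning O as well as the output is convenient because the hidden-unit outputs are needed again when solving for the output weights.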


So...my output is here.  Now I need a measure of how good it is, which means computing the loss function.  In this case we are using the MSE (mean squared error).  There are numerous loss functions and you should choose one to suit your application.

First we compute the difference between what we expected and what the network actually produced.

In [34]:
temp=y_col-y_hat

This is OK, but I would feel more comfortable without the singleton dimension.  Our friend np.squeeze comes to the rescue!

In [35]:
temp=np.squeeze(temp)
print(temp.shape)

(100,)


Now to the MSE!  Note that the result should be a scalar.

In [36]:
initial_E=1./Nv*np.dot(temp,temp)
print(initial_E)


4.917760577174252
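Two details of the MSE computation are worth seeing in isolation: subtracting a (1, 100) array from a (100,) array broadcasts to (1, 100), which is where the singleton dimension came from, and the dot-product form of the MSE agrees with np.mean of the squared differences.  The sketch below uses random stand-in arrays, not the notebook's data.

```python
import numpy as np

rng = np.random.RandomState(0)    # arbitrary seed
y_demo = rng.rand(100)            # stand-in target, shape (100,)
y_hat_demo = rng.rand(1, 100)     # stand-in network output, shape (1, 100)

diff = y_demo - y_hat_demo        # broadcasts to shape (1, 100)
resid = np.squeeze(diff)          # drop the singleton dimension

E_dot = 1. / resid.size * np.dot(resid, resid)   # the notebook's formulation
E_mean = np.mean(resid ** 2)                     # equivalent one-liner

print(diff.shape)
print(np.isclose(E_dot, E_mean))
```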


Last time we derived formulae to "solve" for the output weights.  We do so here in code.
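As a reminder, the derivation presumably runs along the following lines.  This sketch assumes the MSE defined above and a linear output unit; w_k denotes the output weights, O_{kp} the output of hidden unit k for pattern p, and y_p the target.  Setting the gradient of the error to zero gives a linear system in the output weights.

```latex
\begin{align*}
E(\mathbf{w}) &= \frac{1}{N_v}\sum_{p=1}^{N_v}\Big(y_p - \sum_{k=1}^{N_h} w_k\,O_{kp}\Big)^2 \\
\frac{\partial E}{\partial w_m} &= -\frac{2}{N_v}\sum_{p=1}^{N_v}\Big(y_p - \sum_{k=1}^{N_h} w_k\,O_{kp}\Big)O_{mp} = 0 \\
\sum_{k=1}^{N_h} R_{mk}\,w_k &= C_m,
\qquad R_{mk} = \sum_{p=1}^{N_v} O_{mp}\,O_{kp},
\qquad C_m = \sum_{p=1}^{N_v} O_{mp}\,y_p
\end{align*}
```

In matrix form R is the product of O with its transpose and C is the product of O with the target vector, which is what the next cells compute.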

In [37]:
R=np.dot(O,O.T)
print(R.shape)

(2, 2)


In [39]:
C=np.dot(O,y_col)
print(C.shape)

(2,)


Now we solve!

In [41]:
Woh=np.linalg.solve(R,C)
print(Woh.shape)

(2,)
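A natural follow-up is to confirm that the solved output weights actually lower the error.  The sketch below reruns the pipeline on synthetic data (uniform inputs, arbitrary seed, both my assumptions), compares the MSE before and after the solve, and cross-checks the result against np.linalg.lstsq, which solves the same least-squares problem without forming R explicitly.  The helper mse and all _demo names are mine.

```python
import numpy as np

rng = np.random.RandomState(0)                     # arbitrary seed
inputs = rng.rand(100, 3)                          # assumed uniform inputs
y_demo = np.sinh(inputs.sum(axis=1))               # target in the notebook's format
x_demo = np.column_stack([inputs, np.ones(100)])   # inputs plus column of ones

W_demo = rng.rand(2, 4)
Woh_init = rng.rand(1, 2)
O_demo = np.tanh(np.dot(W_demo, x_demo.T))         # (2, 100) hidden outputs

def mse(Woh, O, y):
    """Hypothetical helper: mean squared error for output weights Woh."""
    resid = y - np.squeeze(np.dot(Woh, O))
    return np.dot(resid, resid) / y.size

# Solve R * Woh = C, as in the notebook.
R_demo = np.dot(O_demo, O_demo.T)
C_demo = np.dot(O_demo, y_demo)
Woh_solved = np.linalg.solve(R_demo, C_demo)

# Cross-check with a least-squares solver applied to O.T directly.
Woh_lstsq = np.linalg.lstsq(O_demo.T, y_demo, rcond=None)[0]

E_before = mse(Woh_init, O_demo, y_demo)
E_after = mse(Woh_solved, O_demo, y_demo)
print(E_before)
print(E_after)
```

Note that the solved Woh has shape (2,) rather than the original (1, 2); np.dot still accepts it, but the network output then comes back with shape (100,) instead of (1, 100).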
