# Project on Neural Networks

We have a dataset of film reviews classified according to the sentiment they produce, which can be **positive** or **negative**. We want to be able to predict whether a film will make you feel good or not using the vocabulary on the reviews.

In order to make your task easier, we have preprocessed the dataset. We want to encode the information of each review as a vector of the most frequent words used in the descriptions of films. As reviews are quite long, we have splitted them into sentences. This decision may distort the feeling of the author of the review, as they can describe both positive and negative aspects of the film. However, being the aim of this project to put into practise the concepts we have learned in this subject, the accuracy of the results play a minor role. 

The texts are tokenized, and we have filtered out stopwords (except *no* and *not*) and some proper nouns (of the characters). We have also discarded very short sentences (less than 3 words). Then, we have obtained the vocabulary of the most frequent 100 words (this number could be adapted in order to enrich the features used as input).

The dataset has been processed in order to be encoded according to the vocabulary. Each column represents a word in the vocabulary. Each row corresponds to a sentence, in which the frequencies of the words in the sentence are given.

In the first column you will find the True value (1 = _positive_ or 0 = _negative_) and in the rest of the columns words of the vocabulary (0 = word not appearing; >0 = word frequence in the sentence).

The dataset has been splitted into train and test samples. The number of review sentences in the train is **246, out of which 112 are positive and 134 negative**. The number of review sentences in the test is **50, out of which 24 are positive and 26 negative.**

Task fullfilled:
 - Build a neural network which is able to process all the examples on the train and classify them according to the true values.
 - Consider different values for $\alpha$ and iterations and choose the best one. 
 - To make a classification of the sentiment of the example, consider that the sentiment is positive when the prediction is bigger than 0.5, and negative otherwise. Considering your last best choices for the weights, what are the predicted sentiments for the test examples? What is the correct classification percentage?
 - Consider two different structures for the neural network. Repeat the evaluation on the test examples.
    
(The complete dataset is available at https://github.com/iamtrask/grokking-deep-learning)

In [6]:
# Train
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import random as rd
import os
import re

cwd = os.getcwd() #

xy_data = pd.read_csv('./train.csv',sep=" ") # Examples are defined as TRUE_LABEL REVIEW
# TRUE_LABEL is clasified as positive or negative


y_val = xy_data.iloc[:,0:1] # only first column
x_val = xy_data.iloc[:,1:] # all columns except the first one

X,Y = np.array(x_val),np.array(y_val)

X = X.T # Each column is a review

print(X.shape, Y.shape, Y[0])
print(xy_data.iloc[:,0].value_counts())


weights = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5]])
#ad hoc. I chose to use 100 random weights so each can make a prediciton for the 100 words of the input X)
#Then I will just update them to obtaining more accurate real values according to an estimated error of 0.
#print (weights.shape)

weights = weights.T
#print ("weights",weights.shape)
#I'm converting my weight matrix to vector (1, 100) 
#so I can multiply it with my input data matrix with dimensions (100, 246)

alpha = 0.1 #Random number of alpha
Ni =  200  #Random number of iterations
input_data = X
goal = Y

prediction =  np.matmul(weights, input_data)# (1,100)*(100,246)
print('Prediction: ', prediction, '\n')

gap = prediction - goal 
print('Gap:\n', gap, '\n')
print('Gap**2:\n ', gap**2, '\n')
print('Sum Gap**2: ', sum(gap**2), '\n')

error = sum(gap**2) 
print('Initial weights =', weights, '\n Initial gap = ', gap, '\n Initial error = ', error, '\n')

Ni = 200
# update the weight
for iteration in range(Ni):
    prediction =  np.matmul(weights, input_data)
    gap = prediction - goal
    error = sum(gap**2)
    weights = weights - alpha*np.matmul(gap, input_data.T)
        
print('\n\n''Updated weights =', weights, '\n\n' 'Updated gap=', gap,'\n\nUpdated error = ', error, '\n')


(100, 246) (246, 1) [1]
0    134
1    112
Name: true_value, dtype: int64
Prediction:  [[0.3        0.35714286 0.2        0.35       0.28461538 0.36666667
  0.36666667 0.35       0.41666667 0.32       0.3        0.28
  0.31111111 0.35       0.3        0.3        0.27857143 0.25
  0.25       0.4        0.38       0.375      0.1        0.3
  0.23333333 0.275      0.4        0.28       0.36666667 0.3
  0.17142857 0.1        0.25555556 0.2        0.44       0.4
  0.4        0.3        0.35       0.4        0.3        0.26666667
  0.2        0.23333333 0.3        0.175      0.375      0.25714286
  0.23333333 0.4        0.28333333 0.25714286 0.22       0.26666667
  0.3        0.15       0.32       0.2        0.125      0.36666667
  0.36666667 0.27777778 0.275      0.26       0.15       0.22
  0.3        0.1        0.23333333 0.25       0.26666667 0.26666667
  0.22857143 0.1        0.36666667 0.25       0.25       0.2
  0.2        0.35       0.2        0.26666667 0.1        0.35
  0.33333333 0



Updated weights = [[1.0096486  1.00133312 1.0042309  ... 1.02116326 0.98407005 0.9993358 ]
 [1.0096486  1.00133312 1.0042309  ... 1.02116326 0.98407005 0.9993358 ]
 [1.0096486  1.00133312 1.0042309  ... 1.02116326 0.98407005 0.9993358 ]
 ...
 [1.0096486  1.00133312 1.0042309  ... 1.02116326 0.98407005 0.9993358 ]
 [1.0096486  1.00133312 1.0042309  ... 1.02116326 0.98407005 0.9993358 ]
 [1.0096486  1.00133312 1.0042309  ... 1.02116326 0.98407005 0.9993358 ]] 

Updated gap= [[ 0.00034159 -0.00214493  0.00541526 ... -0.00203981 -0.00041839
   0.00029217]
 [ 0.00034159 -0.00214493  0.00541526 ... -0.00203981 -0.00041839
   0.00029217]
 [ 0.00034159 -0.00214493  0.00541526 ... -0.00203981 -0.00041839
   0.00029217]
 ...
 [ 0.00034159 -0.00214493  0.00541526 ... -0.00203981 -0.00041839
   0.00029217]
 [ 0.00034159 -0.00214493  0.00541526 ... -0.00203981 -0.00041839
   0.00029217]
 [ 0.00034159 -0.00214493  0.00541526 ... -0.00203981 -0.00041839
   0.00029217]] 

Updated error =  [1.5432134

In [None]:
# Train
#I have decided to work with this type of set up where
#the error is set to be 0, so I can get the most convenient
#value of alpha and number of iterations.
import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import random as rd
import os
import re

cwd = os.getcwd() #

xy_data = pd.read_csv('./train.csv',sep=" ") # Examples are defined as TRUE_LABEL REVIEW
# TRUE_LABEL is clasified as positive or negative


y_val = xy_data.iloc[:,0:1] # only first column
x_val = xy_data.iloc[:,1:] # all columns except the first one

X,Y = np.array(x_val),np.array(y_val)

X = X.T # Each column is a review

print(X.shape, Y.shape, Y[0])
print(xy_data.iloc[:,0].value_counts())


weights = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5]])
#ad hoc. I chose to use 100 random weights so each can make a prediciton for the 100 words of the input X)
#Then I will just update them to obtaining more accurate real values according to an estimated error of 0.
#print (weights.shape)

weights = weights.T
#print ("weights",weights.shape)
#I'm converting my weight matrix to vector (1, 100) 
#so I can multiply it with my input data matrix with dimensions (100, 246)

alpha = 0.1 #Random number of alpha
Ni =  200  #Random number of iterations
for iteration in range(Ni):
    error_for_all = 0
    for ind in range(246):
        input_data = X[:, [ind]] #(100,1[each 1 out of 246]))
        #print ("id",input_data.shape) 
        #input_data2= input_data.T
        #print ("id.T",input_data2.shape) #(1,100)
        goal = Y[ind] 
        #print("goal",goal.shape) (246,1)
        prediction = np.matmul(weights, input_data) #multiply each weight with each 1 out of 246 (1,100)*(100,1)
        #print("pred", prediction.shape) #(1,1) but there are 246 preds: one x for each weight.
        gap = prediction - goal
        error = gap**2
        weights = weights - alpha*gap*input_data.T
        
print('\n\n''Updated weights =', weights, '\n\n' 'Gap=', gap,'\n\nError = ', error_for_all, '\n')
print('These are my predictions =\n', np.matmul(weights, X))

In [340]:
# Train
import numpy as np 
import pandas as pd 
import random as rd
import os
import re

cwd = os.getcwd() 

xy_data = pd.read_csv('./train.csv',sep=" ") 

y_val = xy_data.iloc[:,0:1]
x_val = xy_data.iloc[:,1:] 

X,Y = np.array(x_val),np.array(y_val)

X = X.T 

print(X.shape, Y.shape, Y[0])
print(xy_data.iloc[:,0].value_counts())


weights = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5]])

weights = weights.T

#I will now consider different values for 𝛼 and iterations and choose the best one.

alpha = 0.1
Ni =  37  #Number of iterations
for iteration in range(Ni):
    error_for_all = 0
    for ind in range(246):
        input_data = X[:, [ind]]
        #print (input_data.shape)
        #input_data2= input_data.T
        #print (input_data2.shape)
        goal = Y[ind]
        prediction = np.matmul(weights, input_data)
        gap = prediction - goal
        error = gap**2
        weights = weights - alpha*gap*input_data.T
        
print('\n\n''Updated weights=', weights, '\n\n' 'Gap=', gap,'\n\nError= ', error_for_all, '\n')
print('These are my predictions =\n', np.matmul(weights, X))
#the same step (0.1) with less iterations shows smaller gap between the prediction and the real output. 
#Ni = 2000 -> Gap= [[-0.40053022]] 
#Ni = 200 -> Gap= [[-0.39557737]] 
#Ni = 20 -> Gap= [[-0.38210846]] 
#However, if the number of iterations is too small, the gap starts to get bigger again
#Ni = 10 -> Gap= [[-0.41339655]] 
#The best value is found at 38, anything up or down gets us further from the smallest gap at this alpha value.
#Ni = 41 -> Gap= [[-0.37539414]] 
#I will now maintain the number of iterations and vary the alpha value to see which provides better/worse results

(100, 246) (246, 1) [1]
0    134
1    112
Name: true_value, dtype: int64


Updated weights= [[ 0.96720321  0.78969208  0.42941048  0.25954953  0.82114487  0.49710724
   0.45118091  1.50310527  0.38922573  0.4497097  -0.30628642  0.98444896
  -0.07227812 -0.11477675 -0.30294953  0.74904634 -0.33243995  1.30620107
   0.2080179   0.81712207 -0.21613236 -0.20382165  0.06351953  1.53253875
   0.37124874  0.79029165  0.44814142 -0.52361144  1.33286916  1.35399241
   0.34919135  0.94254813 -0.15737753  0.52596146  1.25764323  0.80778765
  -0.24680843 -0.58574482  0.5665732   0.70626437 -0.12520672  0.57981076
   0.48531307  0.71677958  0.48538636 -0.56033824 -0.45263875 -0.10369534
   0.61348077  0.3636277   0.83928301 -0.00809487  0.84299146 -0.37434021
   0.50807609 -0.01030918  0.44334335  0.05646798  0.82611093  0.09906427
   0.77104309 -0.17655047  0.66111296  1.66417893  0.88765322  0.84350514
  -0.16179477  0.37257854 -0.26126199  0.97470676 -0.82700674  0.85414849
   0.52441717 -0.443

In [341]:
# Train
import numpy as np 
import pandas as pd 
import random as rd
import os
import re

cwd = os.getcwd() 

xy_data = pd.read_csv('./train.csv',sep=" ") 

y_val = xy_data.iloc[:,0:1]
x_val = xy_data.iloc[:,1:] 

X,Y = np.array(x_val),np.array(y_val)

X = X.T 

print(X.shape, Y.shape, Y[0])
print(xy_data.iloc[:,0].value_counts())


weights = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5]])

weights = weights.T

alpha = 0.063 #different tries
Ni =  38  #fixed number of iterations
for iteration in range(Ni):
    error_for_all = 0
    for ind in range(246):
        input_data = X[:, [ind]]
        goal = Y[ind]
        prediction = np.matmul(weights, input_data)
        gap = prediction - goal
        error = gap**2
        weights = weights - alpha*gap*input_data.T
        
print('\n\n''Updated weights =', weights, '\n\n' 'Gap=', gap,'\n\nError = ', error_for_all, '\n')
print('These are my predictions =\n', np.matmul(weights, X))

#different tries for alpha:
#alpha = 5 -> Gap= [[-3.63125484e+10]]
#alpha = 0.5 -> Gap= [[-0.49070269]]
#alpha = 0.05 -> Gap= [[-0.37274581]] 
#alpha = 0.005 -> Gap= [[-0.60096044]]
#the best value is found between 0.05 and 0.005
#after different tries, the best alpha value is 0.063. 



(100, 246) (246, 1) [1]
0    134
1    112
Name: true_value, dtype: int64


Updated weights = [[ 0.99405306  0.77327022  0.40055255  0.27001113  0.84714474  0.50295956
   0.4931697   1.46186834  0.3330859   0.54037277 -0.22912162  0.94219288
  -0.08851101 -0.04184842 -0.21641056  0.75360957 -0.24386631  1.27289334
   0.2809794   0.67550431 -0.16251248 -0.14534007  0.17534465  1.38677827
   0.41899432  0.73613047  0.45802212 -0.2777449   1.23809691  1.27796856
   0.36919523  0.85930168 -0.07794687  0.50461242  1.26407484  0.68080584
  -0.20407245 -0.44521194  0.50626005  0.68391225 -0.12513564  0.55132426
   0.45412857  0.66745158  0.49714838 -0.41665683 -0.35753548  0.04532853
   0.55155381  0.43891167  0.78725387  0.00286828  0.75347777 -0.21057813
   0.5041742   0.03627095  0.4167948   0.13810232  0.73778035  0.20022163
   0.59565478 -0.07811166  0.5575793   1.32542425  0.84744858  0.66005448
  -0.09688213  0.38566318 -0.1259109   0.89576743 -0.60809814  0.72911368
   0.47731092 -0.30

In [376]:
#Train
#I now perform a sentiment classification of the data according to the predictions obtained in my previous operations.
#If the prediction is bigger than 0.5, the example will be classified as positive;
#if smaller, the class will be negative.
import numpy as np 
import pandas as pd 
import random as rd
import os
import re

cwd = os.getcwd() 

xy_data = pd.read_csv('./train.csv',sep=" ") 

y_val = xy_data.iloc[:,0:1]
x_val = xy_data.iloc[:,1:] 

X,Y = np.array(x_val),np.array(y_val)

X = X.T 

print(X.shape, Y.shape, Y[0])
print(xy_data.iloc[:,0].value_counts())


weights = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5]])

weights = weights.T

alpha = 0.063 
Ni =  38  
for iteration in range(Ni):
    error_for_all = 0
    for ind in range(246):
        input_data = X[:, [ind]]
        goal = Y[ind]
        prediction = np.matmul(weights, input_data)
        gap = prediction - goal
        error = gap**2
        weights = weights - alpha*gap*input_data.T
        
print('\n\n''Updated weights =', weights, '\n\n' 'Gap=', gap,'\n\nError = ', error_for_all, '\n')
print('These are my predictions =\n', np.matmul(weights, X))
#print (weights.shape)

#weightsnew = weights.T
#for i in weightsnew: 
#    neww = ','.join(str(i) for i in weightsnew)
#print("New", neww)   
#Here I am just obtaining the values of the new weights so I can use them
#later on for the second layer (B) of weights (last question)


predictions = np.matmul(weights, X) #these are my latest predictions
#print(predictions)
#print (predictions.shape)(1, 246)
'\n'
predictionsnew = np.array(predictions.T)
#print(predictions2)

#The predicted sentiments for the test examples, mixed:
for i in predictionsnew:
    if i <= 0.5:
        print ("Neg", i)
    else:   
        print ("Pos", i) 
        
#I now separate the negative and the positive sentiment examples

#Examples of negative sentiment:
for i in predictionsnew: 
        negs = ','.join(str(i) for i in predictionsnew if i <= 0.5)

#print("Negatives", negs) 
  
#I know this is a bit sketchy and I'm sure there is a way to obtain 
#an array automatically#I printed results above and copied negative values to create a new array with them:
negatives = np.array([[0.46805395],[0.49017903],[0.40891028],[0.248771],[0.21952099],[0.33334332],[0.41577029],[0.49889016],[0.46520188],[0.33481477],[0.12502651],[0.1779762],[0.46422194],[0.35937454],[0.32661209],[0.47391775],[0.2618764],[0.4257705],[0.20593618],[0.3581238],[0.42426313],[0.48959191],[0.04697861],[0.49706102],[0.34705436],[0.43283746],[0.42052578],[0.42975858],[0.31803844],[0.07275571],[0.45628773],[0.25609413],[0.44001772],[0.28869812],[0.26745776],[0.44494372],[0.04251311],[0.07711203],[0.17601171],[-0.11966224],[0.42337548],[0.27345617],[0.39076018],[0.24042795],[0.41494987],[0.36596071],[0.47152886],[0.4632603],[0.46338723],[0.42241831],[0.31937239],[0.34830498],[0.35991243],[0.42084252],[0.47046266],[0.11902003],[-0.01955465],[0.215147],[0.3569808],[0.39314586],[0.02806073],[0.18122624],[0.24743665],[0.13457945],[0.46070361],[0.07834312],[0.15384375],[0.09602306],[0.34671122],[0.10489647],[0.12658124],[0.08571547],[-0.14798422],[0.44047011],[0.39360723],[0.2702076],[0.27938644],[0.2622083],[0.1043391],[0.46839875],[0.08571547],[0.46710119],[0.19033703],[0.36659549],[0.27370736],[0.11014399],[-0.21605396],[0.29835763],[0.29302239],[0.48630682],[0.21575526],[0.07243775],[0.47472442],[0.18426355],[0.15402513],[0.26095405],[0.33036087],[-0.05157633],[0.19915055],[0.23064418],[0.25886222],[0.44423173],[-0.02653716],[-0.03975052],[0.14195087],[-0.17865659],[0.28506725],[0.14561874],[-0.03435983],[0.30336541],[0.00039025],[0.35503687],[-0.19459627],[0.05491561],[0.34921788],[0.1563041],[0.04829654],[0.43552592],[-0.01764202],[0.25679094],[0.06924738],[0.37380976],[0.20232934],[0.41248357],[0.27504656],[0.11403556],[0.39333943],[0.3221467],[0.44225201],[0.4848859],[0.49634568],[-0.01441891],[0.26337518],[0.37131335],[0.48440028],[0.14817356],[0.15183521],[0.45998205],[0.1755474],[0.23592512],[0.49987317],[0.10857915],[0.30023135],[0.27168285],[0.11035841],[0.28932975],[0.33528184],[0.33528184],[0.45459047]])
negatives = negatives.T #it is now a vector (1, 149)
print("Negative examples:"'\n', negatives)
#print (negatives.shape) = (1, 149)


#Examples of negative sentiment:
for i in predictionsnew: 
        pos = ','.join(str(i) for i in predictionsnew if i > 0.5)

#print("Positives", pos) 
  
#Same here: I printed results above and copied negative values to create a new array with them:
positives = np.array([[0.85978248],[0.63228077],[0.63525047],[0.89223767],[0.84490685],[0.75339738],[0.74959604],[1.14829676],[1.02342868],[0.85976676],[0.50723767],[0.5710384],[0.75038075],[0.89482073],[0.66203036],[0.77280344],[0.95994523],[0.53406912],[1.02095446],[0.85669578],[0.59572065],[0.66787318],[0.50590281],[0.52454275],[0.53050845],[0.71134912],[0.70309292],[0.60658168],[0.75732106],[0.75482245],[0.79311665],[0.72717998],[0.99520925],[0.6272892],[0.58431576],[0.56393628],[0.70573128],[0.71278512],[0.62338963],[0.59167193],[0.68162415],[0.96574364],[0.95977199],[0.51262308],[0.68782824],[0.97751902],[0.65240101],[1.03007361],[0.54584852],[0.53382708],[0.74049521],[0.51539236],[1.15169266],[0.51192534],[0.50414817],[0.91842542],[0.72603759],[0.77145224],[0.77636394],[0.5008804],[0.88366164],[0.52444868],[1.02561939],[0.79283023],[0.58210378],[0.52085935],[0.64203048],[0.65842717],[0.8133927],[1.11454063],[0.54439903],[0.57858162],[0.6350194],[0.66247528],[1.04526556],[0.55833447],[0.86044378],[1.17552297],[0.9046525],[0.63642174],[0.76309084],[0.5614024],[0.52619756],[0.51930791],[0.50178537],[0.69797467],[0.52574483],[0.58530247],[0.62463815],[0.57014054],[0.59946713],[0.81214262],[0.70856854],[0.6680275],[0.79246255],[0.82978974],[0.63751432]])
positives = positives.T #vector (1, 97)
print("Positive examples:"'\n', positives)
#print (positives.shape) = (1, 97)

#My new classification percentage is 149 negative examples versus 97 positive examples out of 246,
#compared to the initial 134 negative and 112 positive examples.
#this means that the relation before was 55% negative and 45% positive, 
#and after this classification the new percentage is %60.46 negative and %39.43 positive.



(100, 246) (246, 1) [1]
0    134
1    112
Name: true_value, dtype: int64


Updated weights = [[ 0.99405306  0.77327022  0.40055255  0.27001113  0.84714474  0.50295956
   0.4931697   1.46186834  0.3330859   0.54037277 -0.22912162  0.94219288
  -0.08851101 -0.04184842 -0.21641056  0.75360957 -0.24386631  1.27289334
   0.2809794   0.67550431 -0.16251248 -0.14534007  0.17534465  1.38677827
   0.41899432  0.73613047  0.45802212 -0.2777449   1.23809691  1.27796856
   0.36919523  0.85930168 -0.07794687  0.50461242  1.26407484  0.68080584
  -0.20407245 -0.44521194  0.50626005  0.68391225 -0.12513564  0.55132426
   0.45412857  0.66745158  0.49714838 -0.41665683 -0.35753548  0.04532853
   0.55155381  0.43891167  0.78725387  0.00286828  0.75347777 -0.21057813
   0.5041742   0.03627095  0.4167948   0.13810232  0.73778035  0.20022163
   0.59565478 -0.07811166  0.5575793   1.32542425  0.84744858  0.66005448
  -0.09688213  0.38566318 -0.1259109   0.89576743 -0.60809814  0.72911368
   0.47731092 -0.30

Negative examples:
 [[ 4.6805395e-01  4.9017903e-01  4.0891028e-01  2.4877100e-01
   2.1952099e-01  3.3334332e-01  4.1577029e-01  4.9889016e-01
   4.6520188e-01  3.3481477e-01  1.2502651e-01  1.7797620e-01
   4.6422194e-01  3.5937454e-01  3.2661209e-01  4.7391775e-01
   2.6187640e-01  4.2577050e-01  2.0593618e-01  3.5812380e-01
   4.2426313e-01  4.8959191e-01  4.6978610e-02  4.9706102e-01
   3.4705436e-01  4.3283746e-01  4.2052578e-01  4.2975858e-01
   3.1803844e-01  7.2755710e-02  4.5628773e-01  2.5609413e-01
   4.4001772e-01  2.8869812e-01  2.6745776e-01  4.4494372e-01
   4.2513110e-02  7.7112030e-02  1.7601171e-01 -1.1966224e-01
   4.2337548e-01  2.7345617e-01  3.9076018e-01  2.4042795e-01
   4.1494987e-01  3.6596071e-01  4.7152886e-01  4.6326030e-01
   4.6338723e-01  4.2241831e-01  3.1937239e-01  3.4830498e-01
   3.5991243e-01  4.2084252e-01  4.7046266e-01  1.1902003e-01
  -1.9554650e-02  2.1514700e-01  3.5698080e-01  3.9314586e-01
   2.8060730e-02  1.8122624e-01  2.4743665e-01  1.

In [377]:
# Train
#draft.
#Here I am just playing with my data to understand what's happening at every step
#Results are in the following cell.
import numpy as np 
import pandas as pd 
import random as rd
import os
import re

cwd = os.getcwd() 

xy_data = pd.read_csv('./train.csv',sep=" ") 


y_val = xy_data.iloc[:,0:1] 
x_val = xy_data.iloc[:,1:] 

X,Y = np.array(x_val),np.array(y_val)

X = X.T 

print(X.shape, Y.shape, Y[0])
print(xy_data.iloc[:,0].value_counts())

A = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5], 
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5]])
A = A.T
#print (A.shape) #(1, 100)
#Shape of A = (100, 1) matrix
#Shape of A.T = (1, 100) vector

#For the second array of weights, I would like to try to use the updated weights
#obtained in the previous operations (I figured this is better than using random values)

#B weights:
B = np.array([[0.97829312],[0.83158072],[0.46425509],[0.25254652],[0.84236338],[0.49941111],[0.3982277],[1.53827897],[0.48500617],[0.34455053],[-0.38086452],[1.04936081],[0.01939414],[-0.20121832],[-0.34912384],[0.76017423],[-0.37566003],[1.29371956],[0.13744561],[1.09240483],[-0.26255273],[-0.25677293],[-0.12557412],[1.67559589],[0.27950026],[0.82765408],[0.45568791],[-0.92778133],[1.40005674],[1.37127036],[0.33884329],[1.01830059],[-0.20226904],[0.5416187],[1.1846969],[0.97070746],[-0.20186939],[-0.752855],[0.73500703],[0.76656658],[-0.07205643],[0.60872105],[0.49756567],[0.7778359],[0.50010229],[-0.68669364],[-0.5205433],[-0.33079999],[0.69656099],[0.20482022],[0.84356392],[0.0183879],[0.89559422],[-0.59024129],[0.48906595],[-0.01876273],[0.49119977],[-0.09704396],[0.94684656],[-0.10294812],[1.01193132],[-0.37416234],[0.92393849],[2.2195974],[0.94791971],[1.10624846],[-0.17412155],[0.31680436],[-0.46997878],[1.08782986],[-1.06873495],[1.0434044],[0.56842111],[-0.61248971],[-0.2313668],[1.40435686],[0.82124543],[0.68502217],[-0.05442003],[-0.31402426],[0.62669135],[-0.49257749],[-0.76807836],[-0.49902664],[1.42574697],[0.53251532],[-0.61735453],[0.47533006],[-0.05046968],[-0.32304893],[0.92861148],[0.92020826],[-0.81638137],[0.7758058],[1.47809868],[1.14160865],[0.24207072],[0.62760918],[-0.22811186],[1.20714824]])
#B = B.T
#print(B.shape) #(1, 100) with B.T, 
#Shape of B = (100, 1) matrix
#Shape of B.T = (1, 100) vector

def ReLU(X): #an activation function defined as the positive part of its argument
    return (X >= 0)*X #sets the values that are smaller than 0 to 0
def derivReLU(X):
    return 1*(X > 0)

#print("X", X.shape) 
#Shape of X = (100, 246) matrix

input_data = X
layer1 = ReLU(np.matmul(A, input_data)) # Prediction for first layer #A(1,100)*X(100,246)
print('First layer: ', layer1 )
#print("First layer", layer1.shape) #(1,246). #why no 'for' loop for X here? Wild guess:
#because we didn't do the step of taking each X column in a loop, we don't obtain (1,1)preds x 246 times.
#we obtain (1,246) because we still have another layer coming up, we need a matrix form to operate with?

pred =  np.matmul(B, layer1)   # Prediction for the second layer #B(100,1)*X(1,246)
print('Prediction Last layer: ', pred) #my prediction result is a matrix (100,246)
#this prediction result is different from the one of only one layer, which was a vector of (1,246) dimensions.

goal = Y #Y is (246, 1)
gap =  pred.T - goal  #(100,246) - (246,1) cannot do this so I transposed pred: (pred.T) #(246,100)
#print("gap", gap.shape) #gap (246, 100)
error =  gap**2  
print('Gap= ', gap, 'Error = ', error)


alpha = 0.063 #I will maintain the original step for now
Ni =  38 #maintained for now
for iteration in range(Ni):
    error_for_all = 0
    for ind in range(246):
        input_data = X[:, [ind]]
        goal = Y[ind]
        pre_layer1 =  np.matmul(A, input_data)# <-- Ax product, pre-layer 1
        #print("prelayer",pre_layer1.shape) #(1,1)
        layer1 =  ReLU(pre_layer1)   # <-- At layer 1
        #print("layer",layer1.shape) #dim=(1,1) 
        pred =  np.matmul(B.T, layer1)    # <-- Prediction at layer 2 #transformed B
        #print("pred",pred.shape) #now this is (100,1)
        gap = pred - goal
        error = gap**2
        # Update second layer's weights
        B = B - alpha * gap*layer1.T
        # Update first layer's weights in two steps
        aux = np.multiply(B.T, derivReLU(pre_layer1)) # <-- Multiply arguments element-wise, B and derivReLU evaluated at Ax
        #print("A", A.shape)
        #A = A.T
        #print("A.T", A.shape)
        A =  A - alpha * gap* np.matmul(aux, input_data)# <-- update A
print('Final weights:', A, B)


pred = np.matmul(B, ReLU(np.matmul(A, X)))    # <-- Final predictions
gap2 = (pred- Y.T)**2
total_error = np.sum(gap2)
print('Total error=', total_error)
print('Predictions:\n', pred)


(100, 246) (246, 1) [1]
0    134
1    112
Name: true_value, dtype: int64
Final weights: [[-0.8344544  -0.7344544  -0.6344544  ... -0.6344544  -0.5344544
  -0.4344544 ]
 [-0.42024932 -0.32024932 -0.22024932 ... -0.22024932 -0.12024932
  -0.02024932]
 [-0.40914053 -0.30914053 -0.20914053 ... -0.20914053 -0.10914053
  -0.00914053]
 ...
 [-0.4240788  -0.3240788  -0.2240788  ... -0.2240788  -0.1240788
  -0.0240788 ]
 [-0.732438   -0.632438   -0.532438   ... -0.532438   -0.432438
  -0.332438  ]
 [-1.68122441 -1.58122441 -1.48122441 ... -1.48122441 -1.38122441
  -1.28122441]] [[ 0.08494386 -0.05230639 -1.21291672 ... -0.87508032 -0.99227694
   0.30270963]
 [ 0.22324179  0.08455697 -0.81386592 ... -0.41799872 -0.87284177
   0.44268883]
 [ 0.56950001  0.42722346  0.23241991 ...  0.96638069 -0.57323699
   0.79315646]
 ...
 [ 0.41551485  0.27483558 -0.3074792  ...  0.19123093 -0.70651659
   0.63729932]
 [ 1.2221573   1.07311072  2.02049889 ...  2.81496832 -0.01042514
   1.45374801]
 [-0.13078562 

In [17]:
#Train
#I will now add a second layer to my initial neural network.
#In order to do that, I will add a second array of weights.
import numpy as np 
import pandas as pd 
import random as rd
import os
import re

cwd = os.getcwd() 

xy_data = pd.read_csv('./train.csv',sep=" ") 


y_val = xy_data.iloc[:,0:1] 
x_val = xy_data.iloc[:,1:] 

X,Y = np.array(x_val),np.array(y_val)

X = X.T 

print(X.shape, Y.shape, Y[0])
print(xy_data.iloc[:,0].value_counts())

A = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5], 
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5]])
A = A.T
#print (A.shape) #(1, 100)
#Shape of A = (100, 1) matrix
#Shape of A.T = (1, 100) vector

#For the second array of weights, I would like to try to use the updated weights
#obtained in the previous operations (I figured this is better than using random values)

#B weights:
B = np.array([[0.97829312],[0.83158072],[0.46425509],[0.25254652],[0.84236338],[0.49941111],[0.3982277],[1.53827897],[0.48500617],[0.34455053],[-0.38086452],[1.04936081],[0.01939414],[-0.20121832],[-0.34912384],[0.76017423],[-0.37566003],[1.29371956],[0.13744561],[1.09240483],[-0.26255273],[-0.25677293],[-0.12557412],[1.67559589],[0.27950026],[0.82765408],[0.45568791],[-0.92778133],[1.40005674],[1.37127036],[0.33884329],[1.01830059],[-0.20226904],[0.5416187],[1.1846969],[0.97070746],[-0.20186939],[-0.752855],[0.73500703],[0.76656658],[-0.07205643],[0.60872105],[0.49756567],[0.7778359],[0.50010229],[-0.68669364],[-0.5205433],[-0.33079999],[0.69656099],[0.20482022],[0.84356392],[0.0183879],[0.89559422],[-0.59024129],[0.48906595],[-0.01876273],[0.49119977],[-0.09704396],[0.94684656],[-0.10294812],[1.01193132],[-0.37416234],[0.92393849],[2.2195974],[0.94791971],[1.10624846],[-0.17412155],[0.31680436],[-0.46997878],[1.08782986],[-1.06873495],[1.0434044],[0.56842111],[-0.61248971],[-0.2313668],[1.40435686],[0.82124543],[0.68502217],[-0.05442003],[-0.31402426],[0.62669135],[-0.49257749],[-0.76807836],[-0.49902664],[1.42574697],[0.53251532],[-0.61735453],[0.47533006],[-0.05046968],[-0.32304893],[0.92861148],[0.92020826],[-0.81638137],[0.7758058],[1.47809868],[1.14160865],[0.24207072],[0.62760918],[-0.22811186],[1.20714824]])
B = B.T
#print(B.shape) #(1, 100) with B.T, 
#Shape of B = (1, 100) matrix
#Shape of B.T = (100, 1) vector


def ReLU(X):
    return (X >= 0)*X
def derivReLU(X):
    return 1*(X > 0)

#summary:
#print("X", X.shape) #(100,246) #weights should be 246 or 100??
#print("Y", Y.shape) #(246,1)
#print("A", A.shape) #(1,100)
#print("B", B.shape) #(100,1) #Pasar a vector? en GD es 1,4.
#la única diferencia en la rel dimensional es Y y B. Si se transposa B
#la única diferencia es 246 y 100. Y= (246,1) y B =(1, 100).

alpha = 0.063 #I will maintain the original step for now
Ni =  38 #maintained for now

for iteration in range(Ni):
    for ind in range(246):
        input_data = X[:, [ind]] #(100,1[each 1 out of 246]))
        goal = Y[ind] #(246,1)
        pre_layer1 =  np.matmul(A, input_data)#(1,100)*(100,1)[*246 times]
        #print("pre_layer",pre_layer1) [[0.3]]
        #print("pred1", pre_layer1.shape) #Pre_layer1=(1,1) but there are 246 preds: one x for each weight.
        layer1 =  ReLU(pre_layer1)   #(1,1)*[246 times]
        #print("layer1",layer1) #layer1 [[0.3]]
        pred =  np.matmul(B.T, layer1)    #(100,1)*(1,1) = (100,1)
        #print("pred",pred.shape) #Pred=(100,1)*[246 times]
        gap = pred - goal #(100,1)-(246,1)
        error = gap**2
        # Updating second layer's weights
        B = B - alpha * gap*layer1.T
        # Updating first layer's weights in two steps
        aux = np.multiply(B.T, derivReLU(pre_layer1)) #Multiply arguments element-wise, B and derivReLU evaluated at Ax
        A =  A - alpha * gap* np.matmul(aux, input_data)#update A
print('Final A weights:\n', A,'\n',
      'Final B weights:\n', B)

pred = np.matmul(B, ReLU(np.matmul(A, X)))#final predictions
gap2 = (pred- Y.T)**2
total_error = np.sum(gap2)
print('Total error=', total_error)
print('Predictions for train set:\n', pred)

(100, 246) (246, 1) [1]
0    134
1    112
Name: true_value, dtype: int64
Final A weights:
 [[-0.8344544  -0.7344544  -0.6344544  ... -0.6344544  -0.5344544
  -0.4344544 ]
 [-0.42024932 -0.32024932 -0.22024932 ... -0.22024932 -0.12024932
  -0.02024932]
 [-0.40914053 -0.30914053 -0.20914053 ... -0.20914053 -0.10914053
  -0.00914053]
 ...
 [-0.4240788  -0.3240788  -0.2240788  ... -0.2240788  -0.1240788
  -0.0240788 ]
 [-0.732438   -0.632438   -0.532438   ... -0.532438   -0.432438
  -0.332438  ]
 [-1.68122441 -1.58122441 -1.48122441 ... -1.48122441 -1.38122441
  -1.28122441]] 
 Final B weights:
 [[ 0.08494386 -0.05230639 -1.21291672 ... -0.87508032 -0.99227694
   0.30270963]
 [ 0.22324179  0.08455697 -0.81386592 ... -0.41799872 -0.87284177
   0.44268883]
 [ 0.56950001  0.42722346  0.23241991 ...  0.96638069 -0.57323699
   0.79315646]
 ...
 [ 0.41551485  0.27483558 -0.3074792  ...  0.19123093 -0.70651659
   0.63729932]
 [ 1.2221573   1.07311072  2.02049889 ...  2.81496832 -0.01042514
   1.4

In [25]:
# Test
#I now apply the previous procedure to our examples of the test set.
#In this section, I focus on only one layer of NN and 
#I obtain best alpha values and number of iterations
#A second NN structure is added in the following cell.
import numpy as np 
import pandas as pd 
import random as rd
import os
import re

cwd = os.getcwd() 

xy_data = pd.read_csv('./test.csv',sep=" ") 

y_val = xy_data.iloc[:,0:1]
x_val = xy_data.iloc[:,1:] 

X,Y = np.array(x_val),np.array(y_val)

X = X.T 

print(X.shape, Y.shape, Y[0])
print(xy_data.iloc[:,0].value_counts())


weights = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
                   [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5]])
#Same first 100 random weights to make a prediciton.

weights = weights.T
alpha = 0.1 #Random number of alpha
Ni =  200  #Random number of iterations
input_data = X
goal = Y

prediction =  np.matmul(weights, input_data)# (1,100)*(100,246)
print('Prediction: ', prediction, '\n')

gap = prediction - goal 
print('Gap:\n', gap, '\n')
print('Gap**2:\n ', gap**2, '\n')
print('Sum Gap**2: ', sum(gap**2), '\n')

error = sum(gap**2) 
#print('Initial weights =', weights, '\n Initial gap = ', gap, '\n Initial error = ', error, '\n')

Ni = 200
# update the weight
for iteration in range(Ni):
    prediction =  np.matmul(weights, input_data)
    gap = prediction - goal
    error = sum(gap**2)
    weights = weights - alpha*np.matmul(gap, input_data.T)
        
#print('\n\n''Updated weights =', weights, '\n\n' 'Updated gap=', gap,'\n\nUpdated error = ', error, '\n')

#Trying different alpha and Ni values with total error of 0:
alpha = 4 #After several tries (see below), I found this to be the best alpha value
Ni =  300  #Best result for number of iterations
for iteration in range(Ni):
    error_for_all = 0
    for ind in range(50):
        input_data = X[:, [ind]]
        goal = Y[ind] 
        prediction = np.matmul(weights, input_data)
        gap = prediction - goal
        error = gap**2
        weights = weights - alpha*gap*input_data.T
        
#different tries for alpha:
#alpha = 5 -> Gap= [[1.45710669e+27]]
#alpha = 0.5 -> Gap= [[-0.00574624]]
#alpha = 0.05 -> Gap= [[-0.05765209]]
#alpha = 0.005 -> Gap= [[-0.60096044]
#alpha = 0.6 -> Gap= [[-0.00392432]]
#alpha = 0.8 -> Gap= [[-0.00212213]]
#alpha = 4 -> Gap= [[-1.65472469e-09]]

#different tries for Ni:
#Ni = 200 -> Gap= [[-1.65472469e-09]]
#Ni = 20 -> Gap= [[-0.0022714]] 
#Ni = 335 -> Gap= [[-5.7620575e-14]] #Increasing more gives RuntimeWarning: overflow error


print('\n\n''Updated weights =', weights, '\n\n' 'Gap=', gap,'\n\nError = ', error_for_all, '\n')
print('These are my predictions =\n', np.matmul(weights, X))

#obtaining updated weights value for my second NN layer:
#weightsnew = weights.T
#for i in weightsnew: 
#    neww = ','.join(str(i) for i in weightsnew)
#print("New", neww)

(100, 50) (50, 1) [1]
0    26
1    24
Name: true_value, dtype: int64
Prediction:  [[0.34       0.46666667 0.2        0.35       0.3        0.3
  0.43333333 0.3        0.15       0.3        0.31666667 0.23333333
  0.38571429 0.175      0.4        0.3        0.18       0.22857143
  0.28333333 0.26666667 0.32       0.25       0.25       0.225
  0.25       0.4        0.23333333 0.24285714 0.35       0.33333333
  0.225      0.3        0.24285714 0.2875     0.2        0.35
  0.4        0.25       0.3        0.3        0.4        0.26666667
  0.35       0.35       0.26666667 0.25       0.22       0.23333333
  0.3        0.225     ]] 

Gap:
 [[-0.66       -0.53333333 -0.8        ... -0.76666667 -0.7
  -0.775     ]
 [ 0.34        0.46666667  0.2        ...  0.23333333  0.3
   0.225     ]
 [ 0.34        0.46666667  0.2        ...  0.23333333  0.3
   0.225     ]
 ...
 [ 0.34        0.46666667  0.2        ...  0.23333333  0.3
   0.225     ]
 [-0.66       -0.53333333 -0.8        ... -0.76666667 -0.

In [19]:
#Test
#I will now add a second layer to my initial neural network.
#In order to do that, I will add a second array of weights
#(Weights B obtained from my previously updated A weights).
import numpy as np 
import pandas as pd 
import random as rd
import os
import re

cwd = os.getcwd() 

xy_data = pd.read_csv('./test.csv',sep=" ") 


y_val = xy_data.iloc[:,0:1] 
x_val = xy_data.iloc[:,1:] 

X,Y = np.array(x_val),np.array(y_val)

X = X.T 


print(X.shape, Y.shape, Y[0])
print(xy_data.iloc[:,0].value_counts())

A = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5], 
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5]])
A = A.T
#print (A.shape) #(1, 100)
#Shape of A = (100, 1) matrix
#Shape of A.T = (1, 100) vector

#B weights:
B = np.array([[2.47631677],[1.36646453],[0.48526488],[-0.62961473],[0.23126217],[0.67363088],[1.52368322],[1.51473512],[0.66719809],[1.053561],[-0.23126218],[-0.62857642],[0.10910143],[1.80032682],[-0.07678844],[0.63536659],[0.00315013],[0.9597753],[-0.79438186],[1.36463341],[0.59103477],[-0.10910145],[-0.43191581],[0.97274755],[-0.52987776],[-0.00248008],[0.2],[1.62982099],[-1.82898889],[-0.70986642],[-0.00315012],[1.32004632],[-0.70986642],[0.62843979],[0.5],[0.12063146],[0.86719809],[-0.89510681],[2.04777785],[-0.10365767],[-0.15],[0.2],[0.3],[5.47565916e+142],[1.84265348],[-0.65411806],[0.79123172],[-0.36891434],[0.92366903],[-0.14994765],[-0.46675497],[0.29717401],[0.3],[0.63353545],[0.5],[0.1],[0.2],[0.3],[1.62295058],[0.72252009],[-0.98342359],[1.52443401],[-0.29717366],[0.15],[0.5],[0.1],[0.62961473],[0.3],[0.97274755],[0.5],[0.59028988],[-0.92366905],[0.94616214],[0.4],[0.5],[-0.72252009],[0.70986641],[0.37457371],[-0.28861552],[0.5],[1.78696124],[-0.85704485],[0.36126004],[-0.66719809],[0.5],[0.1],[0.2],[0.96841101],[-0.33191581],[-0.07678844],[1.32295058],[1.93023552],[0.79438185],[0.4],[0.93259262],[0.1],[0.2],[0.3],[-0.36126004],[1.15225964]])
B = B.T
#print(B.shape) #(1, 100) with B.T, 
#Shape of B = (1, 100) matrix
#Shape of B.T = (100, 1) vector


def ReLU(X):
    return (X >= 0)*X
def derivReLU(X):
    return 1*(X > 0)

#summary:
#print("X", X.shape) #(100,50) 
#print("Y", Y.shape) #(50,1)
#print("A", A.shape) #(1,100)
#print("B", B.shape) #(1,100)

alpha = 0.002 
Ni =  30

for iteration in range(Ni):
    for ind in range(50):
        input_data = X[:, [ind]] 
        goal = Y[ind]
        pre_layer1 =  np.matmul(A, input_data)
        layer1 =  ReLU(pre_layer1)   
        pred =  np.matmul(B.T, layer1)   
        gap = pred - goal
        error = gap**2
        # Updating second layer's weights
        B = B - alpha * gap*layer1.T
        # Updating first layer's weights in two steps
        aux = np.multiply(B.T, derivReLU(pre_layer1)) #Multiply arguments element-wise, B and derivReLU evaluated at Ax
        A =  A - alpha * gap* np.matmul(aux, input_data)#update A
print('Final A weights:\n', A,'\n',
      'Final B weights:\n', B)

pred = np.matmul(B, ReLU(np.matmul(A, X)))#final predictions
gap2 = (pred- Y.T)**2
total_error = np.sum(gap2)
print('Total error=', total_error)
print('Predictions for test set:\n', pred)
#I'm not sure why I get a RuntimeWarning: overflow encountered in (function).
#I downloaded the latest version of the data and 
#I checked the test set, and numbers are not too big but it's still an error. 
#I tried also reducing the numbers of the B weights but it didnt work.
#Reduced B weights:
#B = np.array([[0.24],[0.13],[0.48],[-0.62],[0.23],[0.67],[0.15],[0.15],[0.66],[0.10],[-0.23],[-0.62],[0.10],[0.18],[-0.07],[0.63],[0.003],[0.95],[-0.79],[0.13],[0.59],[-0.10],[-0.43],[0.97],[-0.52],[-0.002],[0.2],[0.16],[-0.18],[-0.70],[-0.003],[0.13],[-0.70],[0.62],[0.5],[0.12],[0.86],[-0.89],[0.20],[-0.10],[-0.15],[0.2],[0.3],[0.54],[0.18],[-0.65],[0.79],[-0.36],[0.92],[-0.14],[-0.46],[0.29],[0.3],[0.63],[0.5],[0.1],[0.2],[0.3],[0.16],[0.72],[-0.98],[0.15],[-0.29],[0.15],[0.5],[0.1],[0.62],[0.3],[0.97],[0.5],[0.59],[-0.92],[0.94],[0.4],[0.5],[-0.72],[0.70],[0.37],[-0.28],[0.5],[0.17],[-0.85],[0.36],[-0.66],[0.5],[0.1],[0.2],[0.96],[-0.33],[-0.07],[0.13],[0.19],[0.79],[0.4],[0.93],[0.1],[0.2],[0.3],[-0.36],[0.11]])
#Finally, I managed to make it work in the cell below by setting a very low
#alpha value. This would not work with the B weights in this cell but it works with the second ones
#that I have in the notation above. The results are in the following cell:


(100, 50) (50, 1) [1]
0    26
1    24
Name: true_value, dtype: int64


  error = gap**2
  B = B - alpha * gap*layer1.T
  A =  A - alpha * gap* np.matmul(aux, input_data)#update A
  pred =  np.matmul(B.T, layer1)


Final A weights:
 [[nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 ...
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]] 
 Final B weights:
 [[nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 ...
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]]
Total error= nan
Predictions for test set:
 [[nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 ...
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]
 [nan nan nan ... nan nan nan]]


In [23]:
#Test
#The cumulative error is rather high
#this might be because I over trained the training set 
#so the process does not work good with new data
#It could be because the process is done in a wrong way as well.
import numpy as np 
import pandas as pd 
import random as rd
import os
import re

cwd = os.getcwd() 

xy_data = pd.read_csv('./test.csv',sep=" ") 


y_val = xy_data.iloc[:,0:1] 
x_val = xy_data.iloc[:,1:] 

X,Y = np.array(x_val),np.array(y_val)

X = X.T 


print(X.shape, Y.shape, Y[0])
print(xy_data.iloc[:,0].value_counts())

A = np.array([[0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5], 
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5],
              [0.1], [0.2], [0.3], [0.4], [0.5], [0.1], [0.2], [0.3], [0.4], [0.5]])
A = A.T
#print (A.shape) #(1, 100)
#Shape of A = (100, 1) matrix
#Shape of A.T = (1, 100) vector

#B weights:
B = np.array([[0.24],[0.13],[0.48],[-0.62],[0.23],[0.67],[0.15],[0.15],[0.66],[0.10],[-0.23],[-0.62],[0.10],[0.18],[-0.07],[0.63],[0.003],[0.95],[-0.79],[0.13],[0.59],[-0.10],[-0.43],[0.97],[-0.52],[-0.002],[0.2],[0.16],[-0.18],[-0.70],[-0.003],[0.13],[-0.70],[0.62],[0.5],[0.12],[0.86],[-0.89],[0.20],[-0.10],[-0.15],[0.2],[0.3],[0.54],[0.18],[-0.65],[0.79],[-0.36],[0.92],[-0.14],[-0.46],[0.29],[0.3],[0.63],[0.5],[0.1],[0.2],[0.3],[0.16],[0.72],[-0.98],[0.15],[-0.29],[0.15],[0.5],[0.1],[0.62],[0.3],[0.97],[0.5],[0.59],[-0.92],[0.94],[0.4],[0.5],[-0.72],[0.70],[0.37],[-0.28],[0.5],[0.17],[-0.85],[0.36],[-0.66],[0.5],[0.1],[0.2],[0.96],[-0.33],[-0.07],[0.13],[0.19],[0.79],[0.4],[0.93],[0.1],[0.2],[0.3],[-0.36],[0.11]])
B = B.T
#print(B.shape) #(1, 100) with B.T, 
#Shape of B = (1, 100) matrix
#Shape of B.T = (100, 1) vector


def ReLU(X):
    return (X >= 0)*X
def derivReLU(X):
    return 1*(X > 0)

#summary:
#print("X", X.shape) #(100,50) 
#print("Y", Y.shape) #(50,1)
#print("A", A.shape) #(1,100)
#print("B", B.shape) #(1,100)

alpha = 0.0002 
Ni =  300

for iteration in range(Ni):
    for ind in range(50):
        input_data = X[:, [ind]] 
        goal = Y[ind]
        pre_layer1 =  np.matmul(A, input_data)
        layer1 =  ReLU(pre_layer1)   
        pred =  np.matmul(B.T, layer1)   
        gap = pred - goal
        error = gap**2
        # Updating second layer's weights
        B = B - alpha * gap*layer1.T
        # Updating first layer's weights in two steps
        aux = np.multiply(B.T, derivReLU(pre_layer1)) #Multiply arguments element-wise, B and derivReLU evaluated at Ax
        A =  A - alpha * gap* np.matmul(aux, input_data)#update A
print('Final A weights:\n', A,'\n',
      'Final B weights:\n', B)

pred = np.matmul(B, ReLU(np.matmul(A, X)))#final predictions
gap2 = (pred- Y.T)**2
total_error = np.sum(gap2)
print('Total error=', total_error)
print('Predictions for test set:\n', pred)


(100, 50) (50, 1) [1]
0    26
1    24
Name: true_value, dtype: int64
Final A weights:
 [[-0.26090742 -0.16090742 -0.06090742 ... -0.06090742  0.03909258
   0.13909258]
 [-0.20711853 -0.10711853 -0.00711853 ... -0.00711853  0.09288147
   0.19288147]
 [-0.36700663 -0.26700663 -0.16700663 ... -0.16700663 -0.06700663
   0.03299337]
 ...
 [-0.33405336 -0.23405336 -0.13405336 ... -0.13405336 -0.03405336
   0.06594664]
 [-0.36685485 -0.26685485 -0.16685485 ... -0.16685485 -0.06685485
   0.03314515]
 [-0.20799761 -0.10799761 -0.00799761 ... -0.00799761  0.09200239
   0.19200239]] 
 Final B weights:
 [[-0.01272134 -0.26780102  0.42279986 ...  0.14328221 -0.43806785
  -0.29947727]
 [ 0.15662833  0.06093492  0.45372846 ...  0.2385259  -0.39470441
   0.04253175]
 [-0.40361068 -1.0467181   0.3547207  ... -0.0710054  -0.53393815
  -1.11032501]
 ...
 [-0.11104759 -0.4656654   0.40583458 ...  0.08979681 -0.46195136
  -0.50549003]
 [ 0.86610466  1.42246278  0.5890221  ...  0.64283752 -0.20613627
   1.4