### Ex1: Linear Algebra and Neural Networks

### Goal: predict asperity (which correlates strongly with whether a part is healthy or faulty) from the observed sensor values.

### The columns maxtemp, mintemp, and maxvibration are aggregates of the raw sensor readings, so some information from the raw data has already been lost.


In [3]:
import numpy as np

In [4]:
# create data points
dp1 = {'partno': 100, 'maxtemp': 35, 'mintemp': 35, 'maxvibration': 12, 'asperity': 0.32}
dp2 = {'partno': 101, 'maxtemp': 46, 'mintemp': 35, 'maxvibration': 21, 'asperity': 0.34}
dp3 = {'partno': 130, 'maxtemp': 56, 'mintemp': 46, 'maxvibration': 3412, 'asperity': 12.42}
dp4 = {'partno': 131, 'maxtemp': 58, 'mintemp': 48, 'maxvibration': 3542, 'asperity': 13.43}

#### A simple hardcoded rule to predict asperity from the sensor values.


In [5]:
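def predict(dp):
    # hardcoded threshold on maxvibration separates the two groups of parts
    if dp['maxvibration'] > 100:
        return 13    # rough asperity of the high-vibration parts
    else:
        return 0.33  # rough asperity of the low-vibration parts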

In [6]:
predict(dp1)

0.33

In [7]:
predict(dp2)

0.33

In [8]:
predict(dp3)

13

In [9]:
predict(dp4)

13

#### Let's see if we can do better without hardcoding a rule, using linear regression: the prediction is a weighted sum of the observation X with weights W.

#### In practice, W is set by an optimizer as part of an ML training process; here the weights are chosen by hand.

In [10]:
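w1 = 0.32        # bias (intercept)
w2 = 0           # weight for maxtemp
w3 = 0           # weight for mintemp
w4 = 13/3412.0   # weight for maxvibration

def mlpredict(dp):
    # linear model: bias plus weighted sum of the three sensor columns
    return w1+w2*dp['maxtemp']+w3*dp['mintemp']+w4*dp['maxvibration']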

In [11]:
mlpredict(dp1)

0.36572098475967174

In [12]:
mlpredict(dp2)

0.4000117233294256

In [13]:
mlpredict(dp3)

13.32

In [14]:
mlpredict(dp4)

13.815310668229777
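
The weights above were picked by hand. As a rough sketch of what "set by an optimizer" can mean, the cell below fits W by ordinary least squares with `np.linalg.lstsq`. This is an assumption made for illustration, not the training procedure of the original exercise; the names `X`, `y`, `w_hand`, and `w_fit` are introduced here.

```python
import numpy as np

# Design matrix: a leading 1 for the bias, then maxtemp, mintemp, maxvibration
X = np.array([
    [1, 35, 35,   12],
    [1, 46, 35,   21],
    [1, 56, 46, 3412],
    [1, 58, 48, 3542],
], dtype=float)
y = np.array([0.32, 0.34, 12.42, 13.43])  # observed asperity

# Hand-picked weights from the notebook, kept for comparison
w_hand = np.array([0.32, 0.0, 0.0, 13 / 3412.0])

# Least-squares fit: finds w that minimizes ||X w - y||^2
w_fit, *_ = np.linalg.lstsq(X, y, rcond=None)

# Sum of squared errors for both weight vectors
sse_hand = np.sum((X @ w_hand - y) ** 2)
sse_fit = np.sum((X @ w_fit - y) ** 2)
print("hand-picked SSE:  ", sse_hand)
print("least-squares SSE:", sse_fit)
```

With four data points and four weights, the fitted model can match the training values almost exactly, which says nothing about how it would behave on unseen parts.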

#### It turns out that parts with asperity > 1 are not usable, so we label them as faulty (broken = 1) and all other parts as healthy (broken = 0).

In [15]:
dp1 = {'partno': 100, 'maxtemp': 35, 'mintemp': 35, 'maxvibration': 12, 'broken': 0}
dp2 = {'partno': 101, 'maxtemp': 46, 'mintemp': 35, 'maxvibration': 21, 'broken': 0}
dp3 = {'partno': 130, 'maxtemp': 56, 'mintemp': 46, 'maxvibration': 3412, 'broken': 1}
dp4 = {'partno': 131, 'maxtemp': 58, 'mintemp': 48, 'maxvibration': 3542, 'broken': 1}

In [16]:
# we change the regression problem into a binary classification one.

def predict(dp):
    if dp['maxvibration'] > 100:
        return 1
    else:
        return 0

In [17]:
predict(dp1)

0

In [18]:
predict(dp2)

0

In [19]:
predict(dp3)

1

In [20]:
predict(dp4)

1

#### With a binary label, the linear regression problem becomes a logistic regression problem: the linear score is passed through a sigmoid and compared against a threshold.

In [21]:
import math

def sigmoid(x):
    return 1/(1+math.exp(-x))

In [22]:
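w1 = 0.32        # bias (intercept)
w2 = 0           # weight for maxtemp
w3 = 0           # weight for mintemp
w4 = 13/3412.0   # weight for maxvibration

def mlpredict(dp):
    # linear score, squashed into (0, 1) by the sigmoid
    z = w1+w2*dp['maxtemp']+w3*dp['mintemp']+w4*dp['maxvibration']
    # classify as broken (1) when the sigmoid output exceeds 0.7
    return 1 if sigmoid(z) > 0.7 else 0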

In [23]:
mlpredict(dp1)

0

In [24]:
mlpredict(dp2)

0

In [25]:
mlpredict(dp3)

1

In [26]:
mlpredict(dp4)

1
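
Because the sigmoid is monotonic, thresholding its output at 0.7 is the same as thresholding the linear score at the logit of 0.7. The sketch below checks this on the four data points; `z_cut`, `rows`, and the two label lists are names introduced for illustration only.

```python
import math

def sigmoid(z):
    return 1 / (1 + math.exp(-z))

threshold = 0.7
# logit (inverse sigmoid) of the threshold: the equivalent cut on the linear score
z_cut = math.log(threshold / (1 - threshold))

# same hand-picked weights as above: bias, maxtemp, mintemp, maxvibration
w = [0.32, 0, 0, 13 / 3412.0]
rows = [
    [1, 35, 35, 12],
    [1, 46, 35, 21],
    [1, 56, 46, 3412],
    [1, 58, 48, 3542],
]

labels_by_prob = []
labels_by_score = []
for x in rows:
    z = sum(wi * xi for wi, xi in zip(w, x))  # linear score
    labels_by_prob.append(1 if sigmoid(z) > threshold else 0)
    labels_by_score.append(1 if z > z_cut else 0)

print(z_cut)
print(labels_by_prob, labels_by_score)
```

Both lists should agree, so the choice of 0.7 can be read as requiring a linear score above roughly 0.85 before a part is flagged as broken.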

### Vectorized Implementation of a two-layer MLP (first a single unit, then 4 units in both layers)

In [27]:
dp1 = {'partno': 100, 'maxtemp': 35, 'mintemp': 35, 'maxvibration': 12, 'asperity': 0.32}
dp2 = {'partno': 101, 'maxtemp': 46, 'mintemp': 35, 'maxvibration': 21, 'asperity': 0.34}
dp3 = {'partno': 130, 'maxtemp': 56, 'mintemp': 46, 'maxvibration': 3412, 'asperity': 12.42}
dp4 = {'partno': 131, 'maxtemp': 58, 'mintemp': 48, 'maxvibration': 3542, 'asperity': 13.43}

In [28]:
dp1

{'partno': 100,
 'maxtemp': 35,
 'mintemp': 35,
 'maxvibration': 12,
 'asperity': 0.32}

In [29]:
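# feature vector: a leading 1 for the bias term, then maxtemp, mintemp, maxvibration
# (the slice drops partno and asperity and relies on dict insertion order)
x1 = np.array([1] + list(dp1.values())[1:-1])
x2 = np.array([1] + list(dp2.values())[1:-1])
x3 = np.array([1] + list(dp3.values())[1:-1])
x4 = np.array([1] + list(dp4.values())[1:-1])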

In [32]:
x4

array([   1,   58,   48, 3542])
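
The leading 1 in each vector lets the bias be treated as just another weight, so the whole linear model reduces to a single dot product. A minimal check, assuming the hand-picked regression weights from earlier in the notebook (`score_dot` and `score_sum` are illustrative names):

```python
import numpy as np

# hand-picked weights: bias, maxtemp, mintemp, maxvibration
w = np.array([0.32, 0.0, 0.0, 13 / 3412.0])

# first data point with the bias input prepended
x1 = np.array([1, 35, 35, 12])

# vectorized form: one dot product
score_dot = x1.dot(w)

# explicit form: bias plus a weighted sum, term by term
score_sum = w[0] + w[1] * 35 + w[2] * 35 + w[3] * 12

print(score_dot, score_sum)
```

Both values should match the earlier regression prediction for the first data point up to floating-point rounding.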

In [33]:
w_layer1 = np.random.rand(4) # one weight per input, initialized uniformly at random in [0, 1)

# a single neuron: computes the logistic function for just one data point
def neuron(x):
    return sigmoid(x.dot(w_layer1)) # apply sigmoid as activation function

In [34]:
neuron(x1) # output of first layer

0.9999999999977185

In [35]:
# redefine sigmoid with np.exp so it works element-wise on arrays,
# i.e. for all data points at once
def sigmoid(x):
    return 1/(1+np.exp(-x))

In [36]:
w_layer1 = np.random.rand(4,4) # 4 inputs x 4 units: one column of weights per unit

def layer1(x):
    return sigmoid(x.dot(w_layer1))

In [37]:
x = np.array([x1,x2,x3,x4])

In [38]:
x

array([[   1,   35,   35,   12],
       [   1,   46,   35,   21],
       [   1,   56,   46, 3412],
       [   1,   58,   48, 3542]])

In [39]:
layer1(x) # output from first layer (saturated near 1: large unscaled inputs, non-negative weights)

array([[1.        , 0.99999979, 1.        , 1.        ],
       [1.        , 1.        , 1.        , 1.        ],
       [1.        , 1.        , 1.        , 1.        ],
       [1.        , 1.        , 1.        , 1.        ]])

In [41]:
# the second layer has its own weight matrix: 4 inputs (the outputs of layer 1) x 4 units
w_layer2 = np.random.rand(4,4)

def layer2(x):
    return sigmoid(x.dot(w_layer2))

In [42]:
layer2(layer1(x)) # output from second layer

array([[0.92115564, 0.86387622, 0.89757167, 0.88648855],
       [0.92115565, 0.86387624, 0.89757168, 0.88648857],
       [0.92115565, 0.86387625, 0.89757168, 0.88648857],
       [0.92115565, 0.86387625, 0.89757168, 0.88648857]])
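
To make the vectorization concrete, the sketch below runs the same two-layer forward pass once on the whole matrix and once row by row, then compares the results. The seeded generator, `W1`, `W2`, and `forward` are assumptions introduced here for repeatability; the notebook itself uses unseeded `np.random.rand`.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# seeded generator so the run is repeatable (the notebook uses unseeded np.random.rand)
rng = np.random.default_rng(0)
W1 = rng.random((4, 4))  # layer 1: 4 inputs x 4 units
W2 = rng.random((4, 4))  # layer 2: 4 inputs x 4 units

x = np.array([
    [1, 35, 35,   12],
    [1, 46, 35,   21],
    [1, 56, 46, 3412],
    [1, 58, 48, 3542],
])

def forward(inputs):
    hidden = sigmoid(inputs.dot(W1))  # first layer activations
    return sigmoid(hidden.dot(W2))    # second layer activations

batched = forward(x)                                 # all data points at once
one_by_one = np.array([forward(row) for row in x])   # one data point at a time

print(batched.shape)
print(np.allclose(batched, one_by_one))
```

Each row of the batched output corresponds to one data point, so the matrix form computes exactly what a loop over single data points would.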