
$\text{sigmoid}(x) = \frac{1}{1+e^{-x}}$, also known as the logistic function, is a non-linear function used both in classical Machine Learning (Logistic Regression) and in Deep Learning.

Let's first build a function to compute $\text{sigmoid}(x)$.

In [26]:
import math
import numpy as np

x = np.array([1, 2, 3])
# math.exp only accepts scalars; uncomment the next line to see the TypeError it raises for an array
#print(math.exp(x))
print(np.exp(x))


[ 2.71828183  7.3890561  20.08553692]
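The difference between the two `exp` functions can be checked directly. The sketch below is only illustrative (the `raised` flag is a name introduced here): it catches the error from `math.exp` and then applies `np.exp` element-wise to the same array.

```python
import math
import numpy as np

x = np.array([1, 2, 3])

# math.exp expects a single number, so a 3-element array is rejected
try:
    math.exp(x)
    raised = False
except TypeError as err:
    raised = True
    print("math.exp failed:", err)

# np.exp is applied to every entry and returns an array of the same shape
elementwise = np.exp(x)
print(elementwise)
```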


In [27]:
# using math package
def basic_sigmoid(x):
    return 1/(1+math.exp(-x))

basic_sigmoid(3)

0.9525741268224334

In [28]:
# using numpy package
def sigmoid(x):
    return 1/(1+np.exp(-x))

sigmoid(x)

array([0.73105858, 0.88079708, 0.95257413])
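As a quick sanity check, the two implementations can be compared against each other. This is a minimal sketch that redefines both functions so it runs on its own; `scalar_match`, `per_entry` and `array_match` are names introduced here for illustration.

```python
import math
import numpy as np

def basic_sigmoid(x):
    # scalar-only version based on math.exp
    return 1 / (1 + math.exp(-x))

def sigmoid(x):
    # NumPy version: accepts scalars as well as arrays
    return 1 / (1 + np.exp(-x))

# on a scalar, both versions agree
scalar_match = math.isclose(basic_sigmoid(3), sigmoid(3))
print(scalar_match)

# on an array, the NumPy version equals the scalar version applied entry by entry
x = np.array([1, 2, 3])
per_entry = np.array([basic_sigmoid(v) for v in x])
array_match = np.allclose(sigmoid(x), per_entry)
print(array_match)
```

Since the NumPy version handles both cases, it is the one used in the rest of the notebook.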

In [29]:
# compute z for one element using a loop
x = np.array([1, 2, 3])
w = np.array([0.5, 0.3, 0.2])
b = 0.4
z = b

for i in range(len(x)):
    z += x[i]*w[i]

print(z)

2.1


In [30]:
# compute the activation
a = sigmoid(z)
print(a)

0.8909031788043871


In [31]:
# compute the same activation using a vectorized approach
sigmoid(np.dot(w.T, x) + b)

0.8909031788043871
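The loop and the dot product should give the same $z$. The sketch below checks this with `np.isclose` (the names `z_loop` and `z_vec` are introduced here). Note that for a 1-D array `.T` returns the array unchanged, so `np.dot(w.T, x)` and `np.dot(w, x)` are the same computation.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

x = np.array([1, 2, 3])
w = np.array([0.5, 0.3, 0.2])
b = 0.4

# explicit loop: z = b + sum_i x_i * w_i
z_loop = b
for i in range(len(x)):
    z_loop += x[i] * w[i]

# vectorized: a single dot product
z_vec = np.dot(w, x) + b

print(w.T.shape == w.shape)       # transposing a 1-D array changes nothing
print(np.isclose(z_loop, z_vec))  # both approaches agree
print(sigmoid(z_vec))
```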

Let's generalize the activation calculation to a whole set of elements (your training data) using vectorization.

In [32]:
# build the X matrix so that each column is one element (rows are features)
X = np.array([[1, 2, 3], [1.5, 3.1, 4], [5, 6, 3]]).T
print(X)

[[1.  1.5 5. ]
 [2.  3.1 6. ]
 [3.  4.  3. ]]


In [33]:
print("x(1,1) :",X[0,0])
print("x(2,3) :",X[1,2])

x(1,1) : 1.0
x(2,3) : 6.0


In [34]:
# check the shape of w: it is 1-D, but we need a column vector of shape (3, 1)
w.shape

(3,)

In [35]:
# use reshape to turn w into a column vector
w = w.reshape(len(w), 1)

In [36]:
print("w dim: ", w.shape)
print("X dim: ", X.shape)

w dim:  (3, 1)
X dim:  (3, 3)
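To see what the reshape changes, the sketch below computes $Z$ with both the 1-D and the column version of `w` (the names `w_flat`, `w_col`, `Z_flat` and `Z_col` are introduced here for illustration). The values are the same; only the shape of the result differs, and the explicit `(1, 3)` row matches the layout of one value per column of `X`.

```python
import numpy as np

X = np.array([[1, 2, 3], [1.5, 3.1, 4], [5, 6, 3]]).T
b = 0.4

w_flat = np.array([0.5, 0.3, 0.2])      # shape (3,)
w_col = w_flat.reshape(len(w_flat), 1)  # shape (3, 1)

Z_flat = np.dot(w_flat, X) + b   # 1-D result
Z_col = np.dot(w_col.T, X) + b   # 2-D row vector

print(Z_flat.shape, Z_col.shape)
print(np.allclose(Z_flat, Z_col[0]))
```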


In [37]:
# compute Z for all elements at once (the scalar b is broadcast to every column)
Z = np.dot(w.T, X) + b
print(Z)

[[2.1  2.88 5.3 ]]


In [38]:
Z.shape

(1, 3)

In [39]:
# compute activation of all elements
A = sigmoid(Z)
print(A)

[[0.89090318 0.94684886 0.9950332 ]]
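The vectorized result can be cross-checked against a plain loop over the columns of `X`. This is a minimal, self-contained sketch; `A_loop` is a name introduced here, and each of its entries is computed exactly like the single-element activation earlier in the notebook.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

X = np.array([[1, 2, 3], [1.5, 3.1, 4], [5, 6, 3]]).T
w = np.array([0.5, 0.3, 0.2]).reshape(3, 1)
b = 0.4

# vectorized: all activations in one matrix product
A = sigmoid(np.dot(w.T, X) + b)

# loop: one activation per column (one column = one element)
A_loop = np.array([[sigmoid(np.dot(w[:, 0], X[:, j]) + b)
                    for j in range(X.shape[1])]])

print(np.allclose(A, A_loop))
```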


Let's now compute two versions of the cost function:  
- Least squares: $\text{Cost} =  \sum_{i=1}^{m} (y_i - \hat{y}_i)^2$
- Log loss: $- \frac{1}{m} \sum_{i=1}^{m} \left( y_i \log(\hat{y}_i) + (1 - y_i) \log(1 - \hat{y}_i) \right)$

In [40]:
# Y is the row vector that contains the true labels (one per element)
Y = np.array([[1, 0, 1]])
print(Y.shape)

(1, 3)


In [41]:
# D holds the per-element differences; the sign does not matter once they are squared
D = A - Y
print(D)

[[-0.10909682  0.94684886 -0.0049668 ]]


In [42]:
# least squares via a matrix product (the result is a (1, 1) array)
np.dot(D,D.T)

array([[0.90844956]])

In [43]:
np.sum(D*D)

0.9084495560178965

In [44]:
# log loss
m = Y.shape[1]

-1/m * np.sum(Y*np.log(A) + (1-Y)*np.log(1-A))



1.0183714979486944
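Both costs can be wrapped into small helper functions. The sketch below is one possible way to do it; the function names `least_squares_cost` and `log_loss` are introduced here and are not part of NumPy. It follows the two formulas as written above, so the least squares cost is a plain sum while the log loss is averaged over the $m$ elements.

```python
import numpy as np

def sigmoid(x):
    return 1 / (1 + np.exp(-x))

def least_squares_cost(A, Y):
    # sum of squared differences, with no 1/m factor (as in the formula above)
    D = A - Y
    return np.sum(D * D)

def log_loss(A, Y):
    # average cross-entropy over the m elements stored as columns
    m = Y.shape[1]
    return -1 / m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

# same Z and Y as in the cells above
Z = np.array([[2.1, 2.88, 5.3]])
A = sigmoid(Z)
Y = np.array([[1, 0, 1]])

print(least_squares_cost(A, Y))
print(log_loss(A, Y))
```

On this data the second element dominates both costs: its label is 0, but its activation is close to 1.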