## Understanding the math under the hood

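1. Logit: $$z = wx + b$$
2. Likelihood function: $$\sigma(z)^y[1-\sigma(z)]^{1-y}$$ where $$\sigma(z) = \frac{1}{1+e^{-z}}$$
3. Log-likelihood function: $$\ln\left(\sigma(z)^y[1-\sigma(z)]^{1-y}\right)$$ Taking the logarithm turns the product into a sum, which makes the math easier.
4. Loss function for logistic regression (single example): $$-\ln\left(\sigma(z)^y[1-\sigma(z)]^{1-y}\right)$$
or, averaged over a dataset of $m$ examples:
$$-\frac{1}{m}\sum_{i=1}^m \left[y_i\ln\sigma(z_i) + (1-y_i)\ln\left(1-\sigma(z_i)\right)\right]$$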
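
To see why the logarithm in item 3 makes the math easier, the single-example loss can be expanded with the rules $\ln(ab) = \ln a + \ln b$ and $\ln(a^k) = k\ln a$. This is a short worked derivation using only the symbols defined above:

$$-\ln\left(\sigma(z)^y[1-\sigma(z)]^{1-y}\right) = -\left[y\ln\sigma(z) + (1-y)\ln\left(1-\sigma(z)\right)\right]$$

Since $y$ is either 0 or 1, only one term survives for each example:

$$y = 1:\quad \text{loss} = -\ln\sigma(z) \qquad\qquad y = 0:\quad \text{loss} = -\ln\left(1-\sigma(z)\right)$$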

## Manual practice


In [1]:
# example 1
y = 1
w = 2.0
x = 0.5
b = -0.2

# step 1: compute z 
z = w*x + b

# step 2: estimate sigmoid(z)
import math
e = math.e
sigmoid = 1/(1+e**(-z))

# step 3: compute the likelihood
likelihood = sigmoid**y *(1-sigmoid)**(1-y)

# step 4: compute the log-likelihood
import numpy as np
def ln(x):
    return np.log(x)
log_lh = ln(likelihood)
#print(log_lh)

# step 5: compute the loss
loss = -log_lh
print(f"the loss: {loss}")

the loss: 0.37110066594777763


In [2]:
# example 2:
y = 0 
w = -1.5
x = 1.2
b = 0.3

# step 1: compute z
z = w*x+b
print(f"logit: {z}")

# step 2 compute sigmoid
sigmoid = 1/(1+e**(-z))
print(f"sigmoid: {sigmoid}")

# step 3 compute the likelihood
lh = sigmoid**y * (1-sigmoid)**(1-y)
print(f"likelihood: {lh}")

# step 4 compute the log-likelihood
log_lh = ln(lh)
print(f"the log-likelihood: {log_lh}")

# step 5 compute the loss
loss = -log_lh
print(f"the loss: {loss}")

logit: -1.4999999999999998
sigmoid: 0.18242552380635638
likelihood: 0.8175744761936437
the log-likelihood: -0.2014132779827524
the loss: 0.2014132779827524
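
The five steps used in both examples can be wrapped in one small function. This is only a sketch: the name `single_example_loss` is an assumption introduced here for illustration, and it uses `math.exp` and `math.log` in place of the `e**(-z)` and `ln` helpers from the cells above.

```python
import math

def single_example_loss(y, w, x, b):
    """Negative log-likelihood of one example under sigmoid(w*x + b)."""
    z = w * x + b                                   # step 1: logit
    sigma = 1 / (1 + math.exp(-z))                  # step 2: sigmoid
    likelihood = sigma**y * (1 - sigma)**(1 - y)    # step 3: likelihood
    log_likelihood = math.log(likelihood)           # step 4: log-likelihood
    return -log_likelihood                          # step 5: loss

# Same inputs as example 1 and example 2
loss_1 = single_example_loss(y=1, w=2.0, x=0.5, b=-0.2)
loss_2 = single_example_loss(y=0, w=-1.5, x=1.2, b=0.3)
print(f"example 1 loss: {loss_1}")
print(f"example 2 loss: {loss_2}")
```

Both values should agree with the outputs printed by the two cells above, up to floating-point rounding.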


### negative ln-likelihood = cost function for logistic regression = binary cross-entropy 

In [3]:
# example of interpreting the loss
ln_45=-np.log(0.45)
ln_45

0.7985076962177716

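A loss of about 0.80 means the model assigned a probability of only 0.45 to the true class. That is worse than a 50/50 guess, whose loss is $\ln 2 \approx 0.693$, so the model is not very good on this example.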
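
To get a feel for the scale of the loss, it helps to evaluate $-\ln(p)$ for several probabilities $p$ assigned to the true class. The values in `p_true` below are illustrative assumptions, not results from a model.

```python
import numpy as np

# Probability the model assigns to the true class (illustrative values)
p_true = np.array([0.99, 0.9, 0.5, 0.45, 0.1, 0.01])

# Per-example loss is the negative natural log of that probability
losses = -np.log(p_true)

for p, loss in zip(p_true, losses):
    print(f"p(true class) = {p:.2f} -> loss = {loss:.4f}")
```

The loss is close to 0 when the model is confident and right, equals $\ln 2$ at $p = 0.5$, and grows without bound as $p$ approaches 0.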

#### Practice: calculating the loss function value over a dataset (weights are known)

In [4]:
import pandas as pd
np.random.seed(42)

w = 2.0
b = -1.0

n_samples = 100
X = np.random.randn(n_samples) 

z = w*X + b
def sigmoid(z):
    return 1/(1 + np.exp(-z))

probs = sigmoid(z)
y = np.random.binomial(1, probs)

df = pd.DataFrame({'X': X, 'prob': probs, 'y': y})
print(df.head())

          X      prob  y
0  0.496714  0.498357  0
1 -0.138264  0.218142  0
2  0.647689  0.573312  1
3  1.523030  0.885549  1
4 -0.234153  0.187200  1


In [None]:
m = len(df)
total = 0.0  # running sum of per-example log-likelihoods (avoids shadowing the built-in sum())
for i in range(m): 
    total += df["y"][i]*np.log(df["prob"][i]) + (1-df["y"][i])*np.log(1-df["prob"][i])
loss_val = -total/m
print(f"the value of loss function with current weights: w = {w} and b = {b} is {loss_val}")

the value of loss function with current weights: w = 2.0 and b = -1.0 is 0.43508814398787854


**let's make a function from the code above:**

In [6]:
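def loss_val(w, b, x, y):
    """Average binary cross-entropy of the model sigmoid(w*x + b) on (x, y)."""
    z = w*x + b
    probs = sigmoid(z)
    m = len(probs)
    total = 0.0  # avoids shadowing the built-in sum()
    for i in range(m):
        total += y[i]*np.log(probs[i]) + (1-y[i])*np.log(1-probs[i])
    return -total/m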
        

In [9]:
# let's test the function

lv_21 = loss_val(w, b, df["X"], df["y"])
lv_21

0.43508814398787854

The output matches the value computed by the explicit loop earlier (0.43508814398787854), so the function works.
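
The explicit loop can also be written with NumPy array operations. The sketch below is one possible vectorized version, not a replacement for the function above: the names `loss_loop` and `loss_vectorized` are assumptions introduced here, and the data are regenerated with the same seed and weights as in the earlier cell so the block runs on its own.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def loss_loop(w, b, x, y):
    """Average binary cross-entropy, computed one example at a time."""
    probs = sigmoid(w * x + b)
    total = 0.0
    for i in range(len(probs)):
        total += y[i] * np.log(probs[i]) + (1 - y[i]) * np.log(1 - probs[i])
    return -total / len(probs)

def loss_vectorized(w, b, x, y):
    """Same quantity, computed on whole arrays at once."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    probs = sigmoid(w * x + b)
    return -np.mean(y * np.log(probs) + (1 - y) * np.log(1 - probs))

# Regenerate the toy dataset (same seed and weights as above)
np.random.seed(42)
X = np.random.randn(100)
y = np.random.binomial(1, sigmoid(2.0 * X - 1.0))

loop_value = loss_loop(2.0, -1.0, X, y)
vec_value = loss_vectorized(2.0, -1.0, X, y)
print(f"loop: {loop_value}, vectorized: {vec_value}")
```

The two values should agree up to floating-point rounding. The vectorized form avoids the Python-level loop, which usually matters once the dataset gets large.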

#### now, let's test on various w and b manually:

In [10]:
w = 1.5
b = 0.2
lv = loss_val(w, b, df["X"], df["y"])
lv

0.5730380808445099

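Bad weights: the loss (0.573) is higher than the 0.435 obtained with w = 2.0 and b = -1.0, the weights that generated the data.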

In [13]:
w = 2.5
b = 1.5
lv = loss_val(w, b, df["X"], df["y"])
lv

0.9802465673291713

Very bad weights: the loss (0.980) is more than double the 0.435 obtained with w = 2.0 and b = -1.0.

In [29]:
w = 1.8
b = -1.8

lv = loss_val(w, b, df["X"], df["y"])
lv

0.46343167904032156
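
The manual trials above can be collected into one loop that evaluates every candidate pair and ranks them by loss. This is a minimal sketch under stated assumptions: the helper name `bce` and the `candidates` list are introduced here for illustration, and the dataset is regenerated with the same seed so the block is self-contained.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def bce(w, b, x, y):
    """Average binary cross-entropy of sigmoid(w*x + b) on (x, y)."""
    probs = sigmoid(w * x + b)
    return -np.mean(y * np.log(probs) + (1 - y) * np.log(1 - probs))

# Regenerate the toy dataset (same seed and generating weights as above)
np.random.seed(42)
X = np.random.randn(100)
y = np.random.binomial(1, sigmoid(2.0 * X - 1.0))

# The generating weights plus the (w, b) pairs tried manually above
candidates = [(2.0, -1.0), (1.5, 0.2), (2.5, 1.5), (1.8, -1.8)]

# Sort by loss, lowest (best) first
results = sorted((bce(w, b, X, y), w, b) for w, b in candidates)
for loss, w, b in results:
    print(f"w = {w:4.1f}, b = {b:4.1f} -> loss = {loss:.4f}")
```

Ranking a handful of guesses like this only compares the pairs that were tried. It says nothing about pairs that were not in the list.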

## How to find these $w$ and $b$? 

Now, the whole "train the model" means finding 