**Mathematical expression of the algorithm**:

For one example $x^{(i)}$:
$$z^{(i)} = w^T x^{(i)} + b \tag{1}$$
$$\hat{y}^{(i)} = a^{(i)} = \text{sigmoid}(z^{(i)})\tag{2}$$
$$ \mathcal{L}(yhat^{(i)}, y^{(i)}) = - y^{(i)} \log(yhat^{(i)}) - (1-y^{(i)}) \log(1-yhat^{(i)})\tag{3}$$

The cost is then computed by averaging the loss over all $m$ training examples:
$$ J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(yhat^{(i)}, y^{(i)})\tag{6}$$
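As a quick check of equations (1) to (3), the sketch below evaluates a single example by hand using only Python's `math` module. The loss is written as the negative of the whole log-likelihood term, the same form the cost $J$ takes later in the notebook. The feature values, the zero weights, and the helper name `example_loss` are illustrative assumptions, not part of the notebook.

```python
import math

def example_loss(w, x, b, y):
    """Loss for one example: z = w.x + b, a = sigmoid(z), then cross-entropy."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b            # equation (1)
    a = 1.0 / (1.0 + math.exp(-z))                          # equation (2)
    loss = -(y * math.log(a) + (1 - y) * math.log(1 - a))   # cross-entropy loss
    return z, a, loss

# With zero weights and bias, z = 0 and the prediction is exactly 0.5,
# so the loss is log(2) regardless of the label.
z, a, loss = example_loss(w=[0.0, 0.0], x=[2.0, 7.0], b=0.0, y=1)
print(z, a, loss)
```

The printed loss is $\log 2$, which is the cost every run in this notebook starts from.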

Gradients of the cost:
- $$ \frac{\partial J}{\partial w} = \frac{1}{m}X(yhat-y)^T\tag{7}$$
- $$ \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (yhat^{(i)}-y^{(i)})\tag{8}$$
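A common way to convince yourself that equations (7) and (8) are right is to compare them with finite differences of the cost. The sketch below does this in NumPy; the data, the parameter values, and the helper names `cost`, `analytic_grads`, and `numeric_grads` are made up for illustration.

```python
import numpy as np

def cost(w, b, X, Y):
    """Cross-entropy cost J for w (n, 1), scalar b, X (n, m), Y (1, m)."""
    A = 1.0 / (1.0 + np.exp(-(w.T @ X + b)))
    return float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))

def analytic_grads(w, b, X, Y):
    """Gradients from equations (7) and (8)."""
    m = X.shape[1]
    A = 1.0 / (1.0 + np.exp(-(w.T @ X + b)))
    dw = (X @ (A - Y).T) / m
    db = float(np.sum(A - Y) / m)
    return dw, db

def numeric_grads(w, b, X, Y, eps=1e-6):
    """Central finite differences of the cost, one parameter at a time."""
    dw = np.zeros_like(w)
    for j in range(w.shape[0]):
        step = np.zeros_like(w)
        step[j, 0] = eps
        dw[j, 0] = (cost(w + step, b, X, Y) - cost(w - step, b, X, Y)) / (2 * eps)
    db = (cost(w, b + eps, X, Y) - cost(w, b - eps, X, Y)) / (2 * eps)
    return dw, db

# Illustrative data: 2 features, 4 examples (columns are examples).
X = np.array([[0.5, -1.0, 2.0, 0.0],
              [1.5, 0.3, -0.7, 1.0]])
Y = np.array([[1.0, 0.0, 1.0, 0.0]])
w = np.array([[0.1], [-0.2]])
b = 0.05

dw_a, db_a = analytic_grads(w, b, X, Y)
dw_n, db_n = numeric_grads(w, b, X, Y)
print(np.max(np.abs(dw_a - dw_n)), abs(db_a - db_n))  # both differences should be tiny
```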


In [85]:
import numpy as np
import tensorflow as tf

## Weight and bias initializer

In [86]:
def initializer(input_dim: int) -> tuple:
    w = tf.zeros([input_dim, 1], dtype=tf.float64)
    b = 0.0
    return w, b

In [87]:
W, b = initializer(3)
W

<tf.Tensor: shape=(3, 1), dtype=float64, numpy=
array([[0.],
       [0.],
       [0.]])>

In [88]:
tf.transpose(W)

<tf.Tensor: shape=(1, 3), dtype=float64, numpy=array([[0., 0., 0.]])>

## Calculate $z$ for all $x^{(i)}$:
$$z^{(i)} = w^T x^{(i)} + b \tag{1}$$

In [89]:
def forward(W: tf.Tensor, b: float, X: tf.Tensor) -> tf.Tensor:
    # W: (n, 1); X: (n, m) with one example per column; returns Z of shape (1, m)
    wT = tf.transpose(W)
    Z = tf.tensordot(wT, X, axes=1) + b
    return Z

In [90]:
X = tf.Variable(
    [
        [2, 4, -3],
        [3, 6, -2],
        [4, 6, -1]
        ], dtype=tf.float64
)
Y = tf.Variable([[1, 1, 1]], dtype=tf.float64)  # shape (1, m): one label per column of X
tf.tensordot(tf.transpose(W), X, axes=1)

<tf.Tensor: shape=(1, 3), dtype=float64, numpy=array([[ 0.,  0., -0.]])>

In [91]:
z = forward(W, b, X)
z

<tf.Tensor: shape=(1, 3), dtype=float64, numpy=array([[0., 0., 0.]])>
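Because the weights start at zero, the result above is all zeros and hides what the product does. The NumPy sketch below repeats the computation on the same `X` with hand-picked weights and bias (both hypothetical) and checks that the vectorized form agrees with equation (1) applied one column at a time.

```python
import numpy as np

# Same X as above: each column is one example x^(i), each row one feature.
X = np.array([[2, 4, -3],
              [3, 6, -2],
              [4, 6, -1]], dtype=np.float64)
w = np.array([[1.0], [1.0], [0.0]])  # hypothetical weights, shape (3, 1)
b = 0.5                              # hypothetical bias

Z = w.T @ X + b                      # vectorized form, shape (1, m)

# Equation (1) applied to each example (column) separately.
Z_loop = np.array([[float(w[:, 0] @ X[:, i]) + b for i in range(X.shape[1])]])

print(Z.shape, Z)                    # values: 5.5, 10.5, -4.5
```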

# Sigmoid Function
Compute $\text{sigmoid}(z) = \frac{1}{1 + e^{-z}}$ for $z = w^T x + b$ to make predictions. Use `np.exp()` or `tf.exp()`.

In [92]:
def sigmoid(Z: tf.Tensor):
    a = 1/(1 + tf.exp(-Z))
    return a

In [93]:
yhat = sigmoid(z)
yhat

<tf.Tensor: shape=(1, 3), dtype=float64, numpy=array([[0.5, 0.5, 0.5]])>
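The direct formula is fine for the values in this notebook. For very negative $z$, however, $e^{-z}$ overflows float64 to infinity; the result still rounds to 0, but NumPy emits an overflow warning along the way. A possible alternative, sketched here in NumPy under the assumption that you want to avoid that overflow, only ever exponentiates non-positive numbers. The name `stable_sigmoid` is illustrative; TensorFlow also ships a built-in `tf.math.sigmoid`.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that only exponentiates non-positive numbers."""
    z = np.asarray(z, dtype=np.float64)
    e = np.exp(-np.abs(z))  # always in [0, 1], so it cannot overflow
    # z >= 0: 1 / (1 + e^{-z});  z < 0: e^{z} / (1 + e^{z})
    return np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

z = np.array([-1000.0, -2.0, 0.0, 2.0, 1000.0])
a = stable_sigmoid(z)
print(a)
```

Both branches are algebraically the same function; the split only changes which exponent is evaluated.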

## Calculate the Cost:
$J = -\frac{1}{m}\sum_{i=1}^{m}(y^{(i)}\log(yhat^{(i)})+(1-y^{(i)})\log(1-yhat^{(i)}))$

In [94]:
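def compute_cost(Y: tf.Tensor, Yhat: tf.Tensor):
    m = Yhat.shape[1]  # number of examples (columns)
    # sum of the per-example log-likelihoods; the cost is its negative mean
    log_likelihood = tf.reduce_sum((Y * tf.math.log(Yhat)) + ((1-Y) * tf.math.log(1-Yhat)))
    c = (-1/m) * log_likelihood
    return c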


In [95]:
compute_cost(Y, yhat)

<tf.Tensor: shape=(), dtype=float64, numpy=0.6931471805599452>
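The value above is $\ln 2 \approx 0.6931$, and it would be the same for any labels. With $w = 0$ and $b = 0$ every prediction is $0.5$, so both log terms equal $\log\frac{1}{2}$ and their weights $y^{(i)}$ and $1-y^{(i)}$ sum to one:

$$ J = -\frac{1}{m}\sum_{i=1}^{m}\left(y^{(i)} + (1-y^{(i)})\right)\log\tfrac{1}{2} = -\log\tfrac{1}{2} = \log 2 \approx 0.6931 $$

The same starting cost appears at epoch 0 of each training run later in the notebook.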

## Forward Propagation:
- You get X
- You compute $yhat = \sigma(w^T X + b) $
- You calculate the cost function: $J = -\frac{1}{m}\sum_{i=1}^{m}(y^{(i)}\log(yhat^{(i)})+(1-y^{(i)})\log(1-yhat^{(i)}))$

In [96]:
def forward_prop(W: tf.Tensor, b: tf.Tensor, X: tf.Tensor, Y: tf.Tensor):
    Z = forward(W, b, X)
    Yhat = sigmoid(Z)
    cost = compute_cost(Y, Yhat)
    return Yhat, tf.squeeze(cost)


In [97]:
X = tf.Variable(
    [
        [2, 3, 4, 5, 6],
        [7, 2, 3, 4, 8],
    ], dtype=tf.float64
)
Y = tf.Variable([[1, 1, 0, 0, 1]], dtype=tf.float64)
Y.shape[1]

5

In [98]:
W, b = initializer(2)
Yhat, cost = forward_prop(W, b, X, Y)
Yhat, cost

(<tf.Tensor: shape=(1, 5), dtype=float64, numpy=array([[0.5, 0.5, 0.5, 0.5, 0.5]])>,
 <tf.Tensor: shape=(), dtype=float64, numpy=0.6931471805599454>)

## Back Propagation: 

- $$ \frac{\partial J}{\partial w} = \frac{1}{m}X(yhat-y)^T\tag{7}$$
- $$ \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (yhat^{(i)}-y^{(i)})\tag{8}$$

In [99]:
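def back_prop(X: tf.Tensor, Yhat: tf.Tensor, Y: tf.Tensor) -> dict:
    m = Y.shape[1]  # number of examples
    error = Yhat - Y  # prediction error, shape (1, m)
    dW = (1/m) * (tf.tensordot(X, tf.transpose(error), axes=1))  # equation (7), shape (n, 1)
    db = (1/m) * tf.reduce_sum(error)  # equation (8), scalar
    return {'dW': dW, 'db': db}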

In [100]:
grads = back_prop(X, Yhat, Y)
grads

{'dW': <tf.Tensor: shape=(2, 1), dtype=float64, numpy=
 array([[-0.2],
        [-1. ]])>,
 'db': <tf.Tensor: shape=(), dtype=float64, numpy=-0.1>}
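These numbers are small enough to verify by hand. The NumPy sketch below recomputes equations (7) and (8) for the same `X` and `Y`, assuming the predictions are all 0.5 as they are at the zero initialization used above.

```python
import numpy as np

X = np.array([[2, 3, 4, 5, 6],
              [7, 2, 3, 4, 8]], dtype=np.float64)
Y = np.array([[1, 1, 0, 0, 1]], dtype=np.float64)
Yhat = np.full_like(Y, 0.5)     # predictions at w = 0, b = 0

m = Y.shape[1]
error = Yhat - Y                # [[-0.5, -0.5, 0.5, 0.5, -0.5]]
dW = (X @ error.T) / m          # equation (7), shape (2, 1)
db = float(np.sum(error)) / m   # equation (8)

print(dW.ravel(), db)           # -0.2 and -1.0 for dW, -0.1 for db
```

For the first feature, for example, the sum is $-1 - 1.5 + 2 + 2.5 - 3 = -1$, and dividing by $m = 5$ gives $-0.2$.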

# Optimizer

In [101]:

def optimizer(X: tf.Tensor, Y: tf.Tensor, epochs: int = 100, alpha=0.01):
    # initialize params W, b
    input_dim = X.shape[0]
    W, b = initializer(input_dim)

    # iterations
    for epoch in range(epochs):
        # forward propagation
        Yhat, cost = forward_prop(W, b, X, Y)

        # back propagation
        grads = back_prop(X, Yhat, Y)

        # update state
        W = W - (alpha * grads['dW'])
        b = b - (alpha * grads['db'])

        if epoch % 10 == 0:
            print(f"Epoch: {epoch} => Cost: {cost}")

    return W, b, grads, cost


In [102]:
X = tf.Variable(
    [
        [2, 3, 4, 5, 6],
        [7, 2, 3, 4, 8],
    ], dtype=tf.float64
)
Y = tf.Variable([[1, 1, 0, 0, 1]], dtype=tf.float64)
Y.shape[1]

5

In [103]:
optimizer(X, Y, alpha=0.1, epochs = 8000)

Epoch: 0 => Cost: 0.6931471805599454
Epoch: 10 => Cost: 0.5275387689313074
Epoch: 20 => Cost: 0.4793124844199505
Epoch: 30 => Cost: 0.4555991833558648
Epoch: 40 => Cost: 0.4417688005570246
Epoch: 50 => Cost: 0.4327236051355491
Epoch: 60 => Cost: 0.42629156283270425
Epoch: 70 => Cost: 0.4214092479570941
Epoch: 80 => Cost: 0.4175024566156901
Epoch: 90 => Cost: 0.4142381561926552
Epoch: 100 => Cost: 0.4114124555851992
Epoch: 110 => Cost: 0.4088951541994968
Epoch: 120 => Cost: 0.40660026134154426
Epoch: 130 => Cost: 0.40446940238220946
Epoch: 140 => Cost: 0.4024620344548698
Epoch: 150 => Cost: 0.40054945112102047
Epoch: 160 => Cost: 0.3987109881334195
Epoch: 170 => Cost: 0.3969315553302035
Epoch: 180 => Cost: 0.3951999928205159
Epoch: 190 => Cost: 0.3935079535376927
Epoch: 200 => Cost: 0.3918491299168692
Epoch: 210 => Cost: 0.39021871025224864
Epoch: 220 => Cost: 0.3886129911796839
Epoch: 230 => Cost: 0.387029098026298
Epoch: 240 => Cost: 0.385464780776423
Epoch: 250 => Cost: 0.38391826374

(<tf.Tensor: shape=(2, 1), dtype=float64, numpy=
 array([[-6.31638053],
        [ 3.56959116]])>,
 <tf.Tensor: shape=(), dtype=float64, numpy=13.079798078200502>,
 {'dW': <tf.Tensor: shape=(2, 1), dtype=float64, numpy=
  array([[ 0.0036813 ],
         [-0.00182351]])>,
  'db': <tf.Tensor: shape=(), dtype=float64, numpy=-0.008442131761156758>},
 <tf.Tensor: shape=(), dtype=float64, numpy=0.09826032272992391>)
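One way to sanity-check the learned parameters is to threshold the predicted probabilities and compare them with the training labels. In the sketch below, the `predict` helper and the 0.5 threshold are assumptions for illustration, and the parameter values are copied from the output above. This measures training accuracy on five points only, so it says nothing about generalization.

```python
import numpy as np

def predict(W, b, X, threshold=0.5):
    """Hypothetical helper: label 1 where sigmoid(W^T X + b) >= threshold."""
    probs = 1.0 / (1.0 + np.exp(-(W.T @ X + b)))
    return (probs >= threshold).astype(int)

# Parameters copied from the training output above.
W = np.array([[-6.31638053], [3.56959116]])
b = 13.079798078200502

X = np.array([[2, 3, 4, 5, 6],
              [7, 2, 3, 4, 8]], dtype=np.float64)
Y = np.array([[1, 1, 0, 0, 1]])

preds = predict(W, b, X)
accuracy = float(np.mean(preds == Y))
print(preds, accuracy)
```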

In [104]:
from sklearn.datasets import load_breast_cancer
X, Y = load_breast_cancer(return_X_y=True)

In [105]:
from sklearn.preprocessing import StandardScaler
X = StandardScaler().fit_transform(X)
X = tf.Variable(X.T, dtype=tf.float64)
Y = tf.Variable([Y], dtype=tf.float64)

In [106]:
Y.shape

TensorShape([1, 569])

In [109]:
optimizer(X, Y, epochs=3000, alpha=0.01)

Epoch: 0 => Cost: 0.6931471805599453
Epoch: 10 => Cost: 0.5416293257815342
Epoch: 20 => Cost: 0.45481920916160395
Epoch: 30 => Cost: 0.3987961113545307
Epoch: 40 => Cost: 0.35937603038040783
Epoch: 50 => Cost: 0.32991681475112083
Epoch: 60 => Cost: 0.30691966431074974
Epoch: 70 => Cost: 0.28836744957984617
Epoch: 80 => Cost: 0.2730142226447647
Epoch: 90 => Cost: 0.260047637282393
Epoch: 100 => Cost: 0.24891452349984458
Epoch: 110 => Cost: 0.23922450046606478
Epoch: 120 => Cost: 0.23069371117235724
Epoch: 130 => Cost: 0.2231104464903821
Epoch: 140 => Cost: 0.216313318971083
Epoch: 150 => Cost: 0.2101769409703162
Epoch: 160 => Cost: 0.20460225531283954
Epoch: 170 => Cost: 0.19950984244945102
Epoch: 180 => Cost: 0.19483518498600388
Epoch: 190 => Cost: 0.19052525108591978
Epoch: 200 => Cost: 0.18653598596242674
Epoch: 210 => Cost: 0.1828304408486758
Epoch: 220 => Cost: 0.17937735734770324
Epoch: 230 => Cost: 0.17615008225692128
Epoch: 240 => Cost: 0.17312572569313128
Epoch: 250 => Cost: 0.

(<tf.Tensor: shape=(30, 1), dtype=float64, numpy=
 array([[-0.48926276],
        [-0.50199637],
        [-0.47999754],
        [-0.49438371],
        [-0.18761089],
        [-0.07366309],
        [-0.40871162],
        [-0.52590086],
        [-0.10615976],
        [ 0.26963756],
        [-0.52542925],
        [ 0.00163278],
        [-0.42180792],
        [-0.45789707],
        [-0.04415781],
        [ 0.24785782],
        [ 0.10792388],
        [-0.07808221],
        [ 0.12897158],
        [ 0.30294989],
        [-0.62785443],
        [-0.64854158],
        [-0.59084119],
        [-0.59702738],
        [-0.48663663],
        [-0.21352893],
        [-0.42182526],
        [-0.57312273],
        [-0.43064863],
        [-0.1352551 ]])>,
 <tf.Tensor: shape=(), dtype=float64, numpy=0.43151924226652916>,
 {'dW': <tf.Tensor: shape=(30, 1), dtype=float64, numpy=
  array([[ 0.00285345],
         [ 0.00475967],
         [ 0.0026586 ],
         [ 0.00328614],
         [ 0.00143613],
         [-0.0

# Model
- Initialize $w, b$
- Forward Propagation:
    - You get X
    - You compute $yhat = \sigma(w^T X + b) $
    - You calculate the cost function: $J = -\frac{1}{m}\sum_{i=1}^{m}(y^{(i)}\log(yhat^{(i)})+(1-y^{(i)})\log(1-yhat^{(i)}))$
- Back Propagation: 
    - $$ \frac{\partial J}{\partial w} = \frac{1}{m}X(yhat-y)^T\tag{7}$$
    - $$ \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (yhat^{(i)}-y^{(i)})\tag{8}$$
- Update weights:
    - $$ w = w - \alpha \frac{\partial J}{\partial w} $$
    - $$ b = b - \alpha \frac{\partial J}{\partial b} $$
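The steps listed above can be condensed into a single loop. The sketch below is a NumPy restatement for reference, not the notebook's TensorFlow implementation; the function name `train` and its default arguments are illustrative.

```python
import numpy as np

def train(X, Y, epochs=100, alpha=0.1):
    """Batch gradient descent for logistic regression; X is (n, m), Y is (1, m)."""
    n, m = X.shape
    w, b = np.zeros((n, 1)), 0.0                   # initialize w, b
    costs = []
    for _ in range(epochs):
        A = 1.0 / (1.0 + np.exp(-(w.T @ X + b)))   # forward propagation
        costs.append(float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))))
        dw = (X @ (A - Y).T) / m                   # back propagation
        db = float(np.sum(A - Y)) / m
        w, b = w - alpha * dw, b - alpha * db      # update weights
    return w, b, costs

X = np.array([[2, 3, 4, 5, 6],
              [7, 2, 3, 4, 8]], dtype=np.float64)
Y = np.array([[1, 1, 0, 0, 1]], dtype=np.float64)

w, b, costs = train(X, Y)
print(costs[0], costs[-1])  # the cost starts at log(2) and decreases
```

Each cost is recorded before the corresponding update, so `costs[0]` is the cost at the zero initialization.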

# Test Model