## Cost function

In a previous lab, you developed the *logistic loss* function. Recall that the loss is defined for a single example. Here you combine the losses over all the examples to form the **cost**.


Recall that for logistic regression, the cost function is of the form 

$$ J(\mathbf{w},b) = \frac{1}{m} \sum_{i=0}^{m-1} \left[ loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)}) \right] \tag{1}$$

where
* $loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)})$ is the loss for a single data point, which is:

    $$loss(f_{\mathbf{w},b}(\mathbf{x}^{(i)}), y^{(i)}) = -y^{(i)} \log\left(f_{\mathbf{w},b}\left( \mathbf{x}^{(i)} \right) \right) - \left( 1 - y^{(i)}\right) \log \left( 1 - f_{\mathbf{w},b}\left( \mathbf{x}^{(i)} \right) \right) \tag{2}$$
    
* $m$ is the number of training examples in the data set, and the model is defined by:
$$
\begin{align}
  f_{\mathbf{w},b}(\mathbf{x}^{(i)}) &= g(z^{(i)})\tag{3} \\
  z^{(i)} &= \mathbf{w} \cdot \mathbf{x}^{(i)}+ b\tag{4} \\
  g(z^{(i)}) &= \frac{1}{1+e^{-z^{(i)}}}\tag{5} 
\end{align}
$$
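
As a rough illustration of equations (2) through (5), the sketch below evaluates the model and the loss for a single example. The helper names `sigmoid` and `single_loss`, and the sample values for `x_i`, `w`, and `b`, are assumptions made for this sketch and are not part of the lab's own code.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) from equation (5)."""
    return 1.0 / (1.0 + np.exp(-z))

def single_loss(x, y, w, b):
    """Logistic loss for one example, equation (2). Hypothetical helper."""
    z = np.dot(w, x) + b      # equation (4)
    f_wb = sigmoid(z)         # equation (3)
    return -y * np.log(f_wb) - (1 - y) * np.log(1 - f_wb)

# Illustrative values, assumed for this sketch only
x_i = np.array([0.5, 1.5])
w = np.array([1.0, 1.0])
b = -3.0

loss_if_0 = single_loss(x_i, 0, w, b)  # label 0: the model output is below 0.5
loss_if_1 = single_loss(x_i, 1, w, b)  # label 1: the model output is below 0.5
print(loss_if_0, loss_if_1)
```

With these values the model output is below 0.5, so the loss is smaller when the label is 0 than when it is 1. This is the behavior the cost in equation (1) averages over all examples.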
 

In [2]:
import numpy as np
import math

In [7]:
def compute_cost_logistic(X,y,w,b):
    """
    Args:
        X(ndarray,(m,n)): Training features
        y(ndarray,(m,)): Training outputs
        w(ndarray,(n,)): Parameters
        b(scalar): Parameter

    Returns:
        cost(scalar): Cost of logistic regression
    """
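    m = X.shape[0]
    cost = 0.0
    for i in range(m):
        z_i = np.dot(X[i], w) + b            # linear term, eq. (4)
        f_wb_i = 1 / (1 + math.exp(-z_i))    # sigmoid, eq. (5)
        # Accumulate the log-likelihood of example i; the sign is flipped below
        cost += y[i] * np.log(f_wb_i) + (1 - y[i]) * np.log(1 - f_wb_i)
    cost = -cost / m                         # negate and average, eqs. (1)-(2)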

    return cost

In [9]:
X_train = np.array([[0.5, 1.5], [1,1], [1.5, 0.5], [3, 0.5], [2, 2], [1, 2.5]])  #(m,n)
y_train = np.array([0, 0, 0, 1, 1, 1])       
w_tmp = np.array([1,1])
b_tmp = -3
print(compute_cost_logistic(X_train, y_train, w_tmp, b_tmp))

0.36686678640551745


**Expected output**: 0.3668667864055175
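
The loop-based cost can also be written without an explicit Python loop. The following is a minimal sketch, assuming NumPy broadcasting over all `m` examples at once. The function name `compute_cost_logistic_vectorized` is hypothetical and not part of the lab.

```python
import numpy as np

def compute_cost_logistic_vectorized(X, y, w, b):
    """Hypothetical vectorized counterpart of the loop-based cost."""
    z = X @ w + b                        # shape (m,), equation (4) for every example
    f_wb = 1.0 / (1.0 + np.exp(-z))      # sigmoid, equation (5)
    losses = -y * np.log(f_wb) - (1 - y) * np.log(1 - f_wb)  # equation (2)
    return losses.mean()                 # equation (1)

# Same data and parameters as the cell above
X_train = np.array([[0.5, 1.5], [1, 1], [1.5, 0.5], [3, 0.5], [2, 2], [1, 2.5]])
y_train = np.array([0, 0, 0, 1, 1, 1])
w_tmp = np.array([1, 1])
b_tmp = -3

cost_vec = compute_cost_logistic_vectorized(X_train, y_train, w_tmp, b_tmp)
print(cost_vec)
```

Up to floating-point rounding in the last digits, this should agree with the loop-based result.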

In [7]:
def compute_gradient_logistic(X,y,w,b):
    """
    Args:
        X(ndarray,(m,n)): Training features
        y(ndarray,(m,)): Training outputs
        w(ndarray,(n,)): Model Parameters
        b(scalar): Model Parameter
    
    Returns:
        dj_dw(ndarray,(n,)): Gradient of cost w.r.t parameters w
        dj_db(scalar): Gradient of cost w.r.t parameter b
    """
    m=X.shape[0]
    n=w.shape[0]
    dj_dw=np.zeros(n,dtype=float)
    dj_db=0.

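    for i in range(m):
        z_i = np.dot(X[i], w) + b
        f_wb_i = 1 / (1 + math.exp(-z_i))   # sigmoid prediction for example i
        diff = f_wb_i - y[i]                # prediction error for example i
        for j in range(n):
            dj_dw[j] += X[i][j] * diff
        dj_db += diff
    # Average the accumulated gradients over the m examples
    dj_dw /= m
    dj_db /= m
    return dj_dw, dj_db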
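
The function above accumulates the standard gradient of the logistic cost:

$$
\begin{align}
  \frac{\partial J(\mathbf{w},b)}{\partial w_j} &= \frac{1}{m} \sum_{i=0}^{m-1} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right) x_j^{(i)} \\
  \frac{\partial J(\mathbf{w},b)}{\partial b} &= \frac{1}{m} \sum_{i=0}^{m-1} \left( f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)} \right)
\end{align}
$$

One way to gain confidence in such an implementation is to compare it against a central finite-difference approximation of the cost. The sketch below is an assumption-laden illustration: the helper names `cost`, `gradient`, and `numerical_gradient`, the step size `eps`, and the choice of parameters are all hypothetical. It uses vectorized stand-ins rather than the lab's loop-based functions so that it runs on its own.

```python
import numpy as np

def cost(X, y, w, b):
    """Logistic cost, equation (1), in vectorized form."""
    f_wb = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    return np.mean(-y * np.log(f_wb) - (1 - y) * np.log(1 - f_wb))

def gradient(X, y, w, b):
    """Analytic gradient of the logistic cost w.r.t. w and b."""
    err = 1.0 / (1.0 + np.exp(-(X @ w + b))) - y   # shape (m,)
    return X.T @ err / len(y), err.mean()

def numerical_gradient(X, y, w, b, eps=1e-6):
    """Central finite-difference approximation of the same gradient."""
    dj_dw = np.zeros_like(w, dtype=float)
    for j in range(w.shape[0]):
        step = np.zeros_like(w, dtype=float)
        step[j] = eps                               # perturb only w_j
        dj_dw[j] = (cost(X, y, w + step, b) - cost(X, y, w - step, b)) / (2 * eps)
    dj_db = (cost(X, y, w, b + eps) - cost(X, y, w, b - eps)) / (2 * eps)
    return dj_dw, dj_db

X = np.array([[0.5, 1.5], [1, 1], [1.5, 0.5], [3, 0.5], [2, 2], [1, 2.5]])
y = np.array([0, 0, 0, 1, 1, 1])
w = np.array([1.0, 1.0])   # float parameters, chosen for illustration
b = -3.0

analytic_dw, analytic_db = gradient(X, y, w, b)
numeric_dw, numeric_db = numerical_gradient(X, y, w, b)
print("analytic:", analytic_dw, analytic_db)
print("numeric: ", numeric_dw, numeric_db)
```

If the two results differ by much more than the finite-difference error, the analytic gradient or the cost is likely wrong.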

In [5]:
# Fit the same data with scikit-learn
from sklearn.linear_model import LogisticRegression

X = np.array([[0.5, 1.5], [1,1], [1.5, 0.5], [3, 0.5], [2, 2], [1, 2.5]])
y = np.array([0, 0, 0, 1, 1, 1])
lr_model = LogisticRegression()
lr_model.fit(X, y)

In [6]:
y_pred = lr_model.predict(X)

print("Prediction on training set:", y_pred)

Prediction on training set: [0 0 0 1 1 1]


In [7]:
print("Accuracy on training set:", lr_model.score(X, y))

Accuracy on training set: 1.0
