# Linear & Logistic Regression

## Idea

### Linear Regression

Linear regression models the relationship between one or more input variables and one scalar output variable.
For multiple inputs, this statistical approach estimates a function that "solves" the resulting equations as closely as possible.
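
As an illustration of "solving" the equations approximately, the sketch below fits a line with NumPy's least-squares solver. The data values are made up for illustration, and least squares is assumed here as the fitting criterion.

```python
import numpy as np

# Hypothetical toy data (made up): roughly y = 2*x + 1 with some noise.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.1, 2.9, 5.2, 6.8])

# Design matrix: one column for the input, one column of ones for the intercept.
A = np.column_stack([x, np.ones_like(x)])

# Four equations, two unknowns: no exact solution exists, so lstsq returns
# the slope and intercept that minimize the squared error.
coeffs, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coeffs

print(slope, intercept)
```

The fitted line does not pass through every point; it is the line with the smallest total squared distance to them.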


### Logistic Regression

Logistic regression models the probability distribution of a dependent **discrete** variable, most commonly a binary one.

![](./linear-logistic-regression.png) - [Reference](https://www.saedsayad.com/logistic_regression.htm)


## Improvement

* The output of logistic regression can't be negative or greater than 1
    * Neither can a probability
* Logistic regression is suited to categorical values (binary, classes, etc.)
* Linear regression is suited to continuous values (weight, number of hours, etc.)
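
To make the first point concrete, the sketch below fits a straight line to binary labels and compares its predictions with a sigmoid output. The data and the logistic parameters (`w = 2`, `b = -5`) are hypothetical, hand-picked values, not a trained model.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical binary labels: class 0 for small x, class 1 for large x.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Least-squares line through the labels (degree-1 polynomial fit).
slope, intercept = np.polyfit(x, y, 1)
linear_pred = slope * x + intercept

# Logistic output with hand-picked (assumed) parameters w = 2, b = -5.
logistic_pred = sigmoid(2.0 * x - 5.0)

print(linear_pred)    # leaves the [0, 1] range at both ends
print(logistic_pred)  # stays strictly between 0 and 1
```

The linear predictions fall below 0 at `x = 0` and above 1 at `x = 5`, so they cannot be read as probabilities, while the sigmoid output can.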

![](./logistic-regression-vs-linear.jpg)

## Concept

### Logistic Regression

Given $x$, we want $\hat{y} = P(y=1|x)$   

with $0\leq\hat{y}\leq1$ and $x \in \mathbb{R}^{n_x}$

Parameters: $w \in \mathbb{R}^{n_x}, b \in \mathbb{R}$

Output: $\hat{y}= \sigma(z)$

with $z = w^Tx+b$

### Sigmoid

$\sigma{(z)} = \frac{1}{1+e^{-z}}$

If $z$ is large and positive: $\sigma(z) \approx \frac{1}{1+\text{very small}} \approx 1$

If $z$ is large and negative: $\sigma(z) \approx \frac{1}{1+\text{very large}} \approx 0$
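
The limiting behaviour for large positive and large negative inputs can be checked numerically. This is a small sketch; the sample inputs are arbitrary.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Arbitrary sample inputs, from strongly negative to strongly positive.
z = np.array([-10.0, -1.0, 0.0, 1.0, 10.0])
s = sigmoid(z)

for z_i, s_i in zip(z, s):
    print(f"sigmoid({z_i:+.0f}) = {s_i:.6f}")
```

At `z = 0` the output is exactly 0.5. At `z = 10` and `z = -10` it is already within about `1e-4` of 1 and 0 respectively.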

![](./sigmoid.png)

Other activation functions

![](./activation-functions.png)

### Loss-function

"Linear regression uses mean squared error as its cost function. If this is used for logistic regression, then it will be a non-convex function of parameters (theta). Gradient descent will converge into global minimum only if the function is convex." -[Reference](https://towardsdatascience.com/logistic-regression-detailed-overview-46c4da4303bc) 

![](./convex-non-convex.png)

For logistic regression the loss function for a single example is defined as follows:

$L(\hat{y},y) = -(y\log{\hat{y}}+(1-y)\log{(1-\hat{y})})$

If $y=1$: $L(\hat{y},y) = -\log{\hat{y}} \leftarrow$ want $\log{\hat{y}}$ large, want $\hat{y}$ large.

If $y=0$: $L(\hat{y},y) = -\log{(1-\hat{y})} \leftarrow$ want $\log{(1-\hat{y})}$ large, want $\hat{y}$ small.
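
A quick numeric check of the two cases, as a sketch. The helper name `loss` and the probabilities 0.9 and 0.1 are chosen for illustration only.

```python
import numpy as np

def loss(y_hat, y):
    # Cross-entropy loss for a single example, as in the formula above.
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

# Label y = 1: a confident correct prediction is cheap, a confident wrong one is costly.
good_pos = loss(0.9, 1)
bad_pos = loss(0.1, 1)

# Label y = 0: the roles of large and small y_hat are swapped.
good_neg = loss(0.1, 0)
bad_neg = loss(0.9, 0)

print(good_pos, bad_pos, good_neg, bad_neg)
```

In both cases the loss is small when $\hat{y}$ is close to the true label and grows without bound as $\hat{y}$ approaches the wrong extreme.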

### Cost-function

$J(w,b)=\frac{1}{m}\sum^m_{i=1}{L(\hat{y}^{(i)},y^{(i)})} = -\frac{1}{m}\sum^m_{i=1}[y^{(i)}\log{\hat{y}^{(i)}}+(1-y^{(i)})\log{(1-\hat{y}^{(i)})}]$
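
The cost is the mean of the per-example losses. The sketch below evaluates it for a hypothetical batch of $m = 4$ predictions; the numbers are made up, and arrays are assumed to have shape `(1, m)`.

```python
import numpy as np

# Hypothetical predictions and labels for m = 4 examples, shape (1, m).
Y_hat = np.array([[0.9, 0.2, 0.7, 0.4]])
Y = np.array([[1.0, 0.0, 1.0, 0.0]])

# Per-example loss L(y_hat, y), computed element-wise.
per_example = -(Y * np.log(Y_hat) + (1 - Y) * np.log(1 - Y_hat))

# Cost J: average of the per-example losses over the batch.
J = per_example.mean()

print(per_example)
print(J)
```

Because every per-example loss is non-negative, the cost is non-negative as well.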


## Example

# import libraries
import numpy as np

def sigmoid(X):
    return 1 / (1 + np.exp(-X))

def forward(X, W, B):
    return sigmoid(np.dot(W.T, X) + B)

def compute_cost(A, Y):
    # number of examples; Y has shape (1, m)
    m = Y.shape[1]
    # mean cross-entropy over the m examples
    return -1/m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

def predict(X, W, B):
    # forward pass: probabilities of shape (1, m)
    A = forward(X, W, B)
    
    # for binary classification
    # threshold at 0.5 to get an output array of size (1, m)
    Y_pred = (A > 0.5).astype(float)
    
    return Y_pred
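
A possible end-to-end run of these functions, as a sketch. It restates them in self-contained form, taking `m` from `Y.shape[1]` and thresholding with a vectorized comparison. The shapes are assumed to be `X: (n_x, m)`, `W: (n_x, 1)`, `Y: (1, m)`; the data and the parameters `W`, `B` are hypothetical, hand-picked values rather than trained ones.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def forward(X, W, B):
    # One column of X per example; result has shape (1, m).
    return sigmoid(np.dot(W.T, X) + B)

def compute_cost(A, Y):
    m = Y.shape[1]
    return -1 / m * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))

def predict(X, W, B):
    # Threshold the probabilities at 0.5.
    return (forward(X, W, B) > 0.5).astype(float)

# Hypothetical data: n_x = 2 features, m = 4 examples.
X = np.array([[0.0, 1.0, 2.0, 3.0],
              [1.0, 0.0, 3.0, 2.0]])
Y = np.array([[0.0, 0.0, 1.0, 1.0]])

# Hand-picked (untrained) parameters: z = x1 + x2 - 3.
W = np.array([[1.0], [1.0]])
B = -3.0

A = forward(X, W, B)
cost = compute_cost(A, Y)
Y_pred = predict(X, W, B)

print(A)
print(cost)
print(Y_pred)
```

With these parameters the first two examples get $z = -2$ and the last two get $z = 2$, so the thresholded predictions match the labels.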

## References

### Linear Regression

1. [Linear Regression - Wiki](https://en.wikipedia.org/wiki/Linear_regression)

### Logistic Regression

1. [Logistic Regression - saedsayad](https://www.saedsayad.com/logistic_regression.htm)
2. [(Univariate|Simple) Logistic regression](https://gerardnico.com/data_mining/simple_logistic_regression)
3. [Sigmoid - Wiki](https://en.wikipedia.org/wiki/Sigmoid_function)
4. [Logistic Regression — Detailed Overview](https://towardsdatascience.com/logistic-regression-detailed-overview-46c4da4303bc)