# Logistic Regression
---
- Author: Diego Inácio
- GitHub: [github.com/diegoinacio](https://github.com/diegoinacio)
- Notebook: [regression_logistic.ipynb](https://github.com/diegoinacio/machine-learning-notebooks/blob/master/Machine-Learning-Fundamentals/regression_logistic.ipynb)
---
Overview and implementation of *Logistic Regression* analysis.

In [None]:
%matplotlib inline
import matplotlib
import matplotlib.pyplot as plt
import numpy as np

from regression__utils import *

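The hypothesis passes a linear combination of the inputs through the logistic function, which maps any real value into the interval $(0, 1)$:

$$ \large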
h_{\theta}(x)=g(\theta^Tx)=\frac{e^{\theta^Tx}}{1+e^{\theta^Tx}}=\frac{1}{1+e^{-\theta^Tx}}
$$

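where, stacking the $n$ samples as the rows of a matrix $X$ (with a leading column of ones for the bias term $\theta_0$), the products $\theta^Tx$ for all samples are computed at once as:

$$ \large
X\theta=
\begin{bmatrix}
    1 & x_{11} & \cdots & x_{1i} \\
    1 & x_{21} & \cdots & x_{2i} \\
    \vdots & \vdots & \ddots & \vdots \\
    1 & x_{n1} & \cdots & x_{ni}
\end{bmatrix}
\begin{bmatrix}
    \theta_0 \\
    \theta_1 \\
    \vdots \\
    \theta_i
\end{bmatrix}
$$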

where:

- $\large h_\theta(x)$ is the hypothesis;
- $\large g(z)$ is the logistic function or <em>sigmoid</em>;
- $\large \theta_i$ are the parameters (or <em>weights</em>).
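
As a quick numerical check of the definitions above, the following minimal sketch evaluates the hypothesis for a few samples. The values of `theta` and `X` are hypothetical and chosen only for illustration; they do not come from the notebook's data.

```python
import numpy as np

def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters: theta_0 (bias), theta_1, theta_2
theta = np.array([-1.0, 2.0, 0.5])

# Hypothetical samples: each row is a leading 1 followed by two features
X = np.array([[1.0, 0.0, 0.0],
              [1.0, 0.5, 0.0],
              [1.0, 2.0, 1.0]])

z = X @ theta        # theta^T x for every row of X
h = sigmoid(z)       # hypothesis values, each strictly between 0 and 1

# Both closed forms of the hypothesis give the same values
h_alt = np.exp(z) / (1.0 + np.exp(z))
print(z, h, np.allclose(h, h_alt))
```

The second sample has $\theta^Tx=0$, so its hypothesis value sits on the 0.5 threshold used later for prediction.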

In [None]:
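def arraycast(f):
    '''
    Decorator that casts the inputs to NumPy arrays: the feature
    vectors become the columns of X (after a leading column of ones
    for the bias term) and y, when given, becomes a row vector.
    '''
    def wrap(self, *X, y=None):
        # One row per feature vector, transposed below to (n_samples, n_features)
        X = np.array(X)
        # Prepend the column of ones that multiplies theta_0
        X = np.insert(X.T, 0, 1, 1)
        if y is not None:
            y = np.array(y)[np.newaxis]
            return f(self, X, y)
        return f(self, X)
    return wrap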

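class logisticRegression(object):
    def __init__(self, rate=0.001, iters=1024):
        self._rate = rate      # learning rate (step size)
        self._iters = iters    # number of gradient descent iterations
        self._theta = None     # parameters, set by fit()

    @property
    def theta(self):
        return self._theta

    def _sigmoid(self, Z):
        # Logistic function g(z) = 1/(1 + e^-z)
        return 1/(1 + np.exp(-Z))

    def _dsigmoid(self, Z):
        # Derivative of the sigmoid: g(z)(1 - g(z))
        s = self._sigmoid(Z)
        return s*(1 - s)

    @arraycast
    def fit(self, X, y):
        # X: (n, i+1) with a leading column of ones; y: (1, n)
        self._theta = np.ones((1, X.shape[-1]))
        for _ in range(self._iters):
            thetaTx = np.dot(X, self._theta.T)    # linear term, shape (n, 1)
            h = self._sigmoid(thetaTx)            # predicted probabilities
            delta = h - y.T                       # prediction error
            grad = np.dot(X.T, delta).T           # error summed over samples, shape (1, i+1)
            self._theta -= grad*self._rate        # gradient descent step

    @arraycast
    def pred(self, x):
        # Class 1 wherever the predicted probability exceeds 0.5
        return self._sigmoid(np.dot(x, self._theta.T)) > 0.5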
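
The update inside `fit` is consistent with batch gradient descent on the negative log-likelihood. The notebook does not state a cost function, so the form of $J(\theta)$ below is an assumption, chosen because its gradient reproduces the `grad` line in the code. Here $\alpha$ stands for `rate`, $x_k$ is the $k$-th row of $X$, and the sum runs over the $n$ samples:

$$ \large
J(\theta)=-\sum_{k=1}^{n}\left[y_k\log h_\theta(x_k)+(1-y_k)\log\left(1-h_\theta(x_k)\right)\right]
$$

$$ \large
\nabla_\theta J(\theta)=X^T\left(h_\theta(X)-y\right)
\qquad\Rightarrow\qquad
\theta:=\theta-\alpha X^T\left(h_\theta(X)-y\right)
$$

Under this reading the gradient is summed over the samples rather than averaged, so the effective step size grows with the number of samples.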

In [None]:
# Synthetic data 5
x1, x2, y = synthData5()

![logistic regression data](output/regression_logistic_data.png "Logistic Regression Data")

In [None]:
%%time
# Training
rlogb = logisticRegression(rate=0.001, iters=512)
rlogb.fit(x1, x2, y=y)
rlogb.pred(x1, x2)

![logistic regression training](output/regression_logistic_gradDesc.gif "Logistic Regression Training")
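
The training cell relies on `synthData5` from `regression__utils`, which is not shown here. As a self-contained sketch of the same update rule, the example below trains on two Gaussian blobs. The data, the seed, and the helper names `sigmoid` and `log_loss` are assumptions made for illustration, not the notebook's own dataset or API.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed stand-in data: two Gaussian blobs, 100 points per class
n = 100
X0 = rng.normal(loc=-2.0, scale=1.0, size=(n, 2))   # class 0
X1 = rng.normal(loc=2.0, scale=1.0, size=(n, 2))    # class 1
features = np.vstack([X0, X1])
y = np.concatenate([np.zeros(n), np.ones(n)])

# Prepend the bias column of ones
X = np.insert(features, 0, 1.0, axis=1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(theta):
    # Negative log-likelihood, clipped to avoid log(0)
    h = np.clip(sigmoid(X @ theta), 1e-12, 1 - 1e-12)
    return -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

# Same initialisation and hyperparameters as the training cell
theta = np.ones(X.shape[1])
rate, iters = 0.001, 512

loss_start = log_loss(theta)
for _ in range(iters):
    # Summed gradient X^T (h - y), as in the fit method
    theta -= rate * (X.T @ (sigmoid(X @ theta) - y))
loss_end = log_loss(theta)

# Fraction of training points classified correctly at the 0.5 threshold
accuracy = np.mean((sigmoid(X @ theta) > 0.5) == (y == 1))
print(f"loss {loss_start:.2f} -> {loss_end:.2f}, accuracy {accuracy:.3f}")
```

Tracking the loss before and after training is a cheap way to confirm that the chosen rate is small enough for the updates to converge.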

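The prediction switches class where $\large h_\theta(x)=0.5$, that is, where $\large \theta^Tx=0$. With two features, this decision boundary is the line: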

$$ \large
    \theta_0+\theta_1 x_1+\theta_2 x_2=0
$$

Solving for $\large x_2$ as the dependent variable (assuming $\large \theta_2 \neq 0$):

$$ \large
    x_2=-\frac{\theta_0+\theta_1 x_1}{\theta_2}
$$
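
A small sketch can confirm that points on this line sit exactly at the 0.5 threshold. The weights below are hypothetical stand-ins for the fitted parameters, and `boundary` is an illustrative helper name.

```python
import numpy as np

# Hypothetical weights standing in for the fitted theta_0, theta_1, theta_2
w0, w1, w2 = -1.0, 2.0, 4.0

def boundary(x1):
    """x2 on the decision boundary for a given x1 (requires w2 != 0)."""
    return -(w0 + w1 * x1) / w2

x1_line = np.linspace(-3.0, 3.0, 7)
x2_line = boundary(x1_line)

# On the boundary theta^T x is zero, so the sigmoid evaluates to 0.5
z = w0 + w1 * x1_line + w2 * x2_line
h = 1.0 / (1.0 + np.exp(-z))
print(np.allclose(z, 0.0), np.allclose(h, 0.5))
```

Points with $\theta^Tx>0$ fall on the side predicted as class 1, and points with $\theta^Tx<0$ on the other side.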

In [None]:
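# Prediction: decision boundary from the fitted parameters
w0, w1, w2 = rlogb.theta[0]
# x2 as a function of x1 along the boundary (assumes w2 != 0)
f = lambda x: - (w0 + w1*x)/w2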

![logistic regression prediction](output/regression_logistic_pred.png "Logistic Regression Prediction")