## This session is a primer for Softmax / Logistic Regression
Logistic regression is a simple classification algorithm that learns to predict binary labels. Softmax regression generalizes it to problems with more than two classes.

![](./images/logistic_regression_schematic.png)

Logistic regression learns its weights by maximizing the likelihood of the observed labels given the inputs.
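
To make the likelihood objective concrete, here is the standard form of the log-likelihood for binary labels. The notation is introduced here for illustration and is not taken from the figures: $y^{(i)} \in \{0, 1\}$ is the label of sample $i$, $m$ is the number of samples, and $h_\theta(x) = \sigma(\theta^\top x)$ is the model output, with $\sigma$ the sigmoid function described in the next section.

$$
\ell(\theta) = \sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + \left(1 - y^{(i)}\right) \log\left(1 - h_\theta(x^{(i)})\right) \right]
$$

Training looks for the $\theta$ that makes $\ell(\theta)$ as large as possible, which is the same as minimizing the negative log-likelihood.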

In [1]:
from IPython.display import Latex
from IPython.display import Math
import sys
import numpy

## Sigmoid Function

![](./images/sigmoid.png)

This function is often called the “sigmoid” or “logistic” function. It is an S-shaped function that “squashes” the value of θ⊤x into the range (0, 1), so that we can interpret the output as the probability of the positive class.

In [2]:
def sigmoid(x):
    return 1. / (1 + numpy.exp(-x))
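
As a quick check of the squashing behavior, the sketch below evaluates the sigmoid at a few scores. The input values are arbitrary and stand in for θ⊤x; the function is repeated so the snippet runs on its own.

```python
import numpy

def sigmoid(x):
    return 1. / (1 + numpy.exp(-x))

# Illustrative scores standing in for theta^T x (arbitrary values)
z = numpy.array([-5.0, 0.0, 5.0])
p = sigmoid(z)

print(p)            # every value lies strictly between 0 and 1
print(p[0] + p[2])  # sigmoid(-z) + sigmoid(z) = 1 by symmetry
```

A score of 0 maps to exactly 0.5, large negative scores approach 0, and large positive scores approach 1.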

## Softmax regression
Softmax Regression (also called multinomial logistic regression) is a generalized form of logistic regression. 
![](./images/Softmax_Classifier.png)


In Softmax Regression (SMR), we simply replace the logistic sigmoid function with the softmax function $\phi_{softmax}(\cdot)$, which turns a vector of class scores into a probability distribution over the classes.
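
For reference, the usual definition of the softmax function is written out below. The symbols $z_k$ (the score for class $k$) and $K$ (the number of classes) are notation assumed here for illustration.

$$
\phi_{softmax}(z)_k = \frac{e^{z_k}}{\sum_{j=1}^{K} e^{z_j}}, \qquad k = 1, \dots, K
$$

With $K = 2$ this reduces to the sigmoid applied to the difference of the two scores, which is the sense in which softmax regression generalizes logistic regression:

$$
\phi_{softmax}(z)_1 = \frac{e^{z_1}}{e^{z_1} + e^{z_2}} = \frac{1}{1 + e^{-(z_1 - z_2)}}
$$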

In [3]:
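def softmax(x):
    # Subtract the maximum before exponentiating to prevent overflow;
    # softmax is unchanged when a constant is subtracted from every score.
    if x.ndim == 1:
        e = numpy.exp(x - numpy.max(x))
        return e / numpy.sum(e)
    # ndim == 2: normalize each row (one sample per row) independently
    e = numpy.exp(x - numpy.max(x, axis=1, keepdims=True))
    return e / numpy.sum(e, axis=1, keepdims=True)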
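
The following sketch illustrates two properties of the softmax: each row of the output sums to 1, and subtracting the maximum keeps the computation finite even for very large scores. `softmax_rows` is a hypothetical helper written for this example, and the scores are arbitrary.

```python
import numpy

def softmax_rows(x):
    # Hypothetical helper: row-wise softmax with the row maximum subtracted first
    shifted = x - numpy.max(x, axis=-1, keepdims=True)
    e = numpy.exp(shifted)
    return e / numpy.sum(e, axis=-1, keepdims=True)

# Arbitrary scores; exp(1000) alone would overflow in float64
scores = numpy.array([[1.0, 2.0, 3.0],
                      [1000.0, 1000.0, 1000.0]])
probs = softmax_rows(scores)

print(probs)              # second row is uniform because all scores are equal
print(probs.sum(axis=1))  # each row sums to 1
```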

## Logistic Regression Class

#### Regularization Equation
![](./images/regularization.png)
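
As a rough illustration, assuming the penalty pictured above has the common form $\frac{\lambda}{2}\lVert W \rVert^2$, its gradient is $\lambda W$, so a gradient step on the penalty alone shrinks every weight toward zero. The values of `W`, `lam`, and `lr` below are made up for the example.

```python
import numpy

W = numpy.array([[1.0, -2.0],
                 [0.5, 0.0]])  # illustrative weights
lam = 0.1                      # hypothetical regularization strength
lr = 0.5                       # hypothetical learning rate

penalty = 0.5 * lam * numpy.sum(W ** 2)  # (lambda / 2) * ||W||^2
penalty_grad = lam * W                   # gradient of the penalty w.r.t. W
W_decayed = W - lr * penalty_grad        # same as multiplying W by (1 - lr * lam)

print(penalty)
print(W_decayed)
```

This corresponds to the `- lr * L2_regularization * self.Weights` term in the `train` method of the class defined in this section.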
#### Gradient Descent (Log-Likelihood)
The gradient of the log-likelihood with respect to the kth weight is:
![](./images/gd_loglikelihood.png)
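
One way to sanity-check a gradient formula is to compare it with finite differences. The sketch below assumes the gradient takes the usual matrix form for softmax regression with one-hot labels, $X^\top (Y - P)$, where $P$ holds the predicted probabilities. The random data and the helper names `softmax_rows` and `log_likelihood` are assumptions made for this example.

```python
import numpy

def softmax_rows(z):
    z = z - numpy.max(z, axis=1, keepdims=True)
    e = numpy.exp(z)
    return e / numpy.sum(e, axis=1, keepdims=True)

def log_likelihood(W, b, x, y):
    # Categorical log-likelihood of one-hot labels y under softmax(xW + b)
    p = softmax_rows(numpy.dot(x, W) + b)
    return numpy.sum(y * numpy.log(p))

rng = numpy.random.RandomState(0)
x = rng.rand(5, 4)                           # 5 samples, 4 features
y = numpy.eye(3)[rng.randint(0, 3, size=5)]  # one-hot labels, 3 classes
W = rng.randn(4, 3) * 0.1
b = numpy.zeros(3)

# Analytic gradient: X^T (Y - P)
analytic = numpy.dot(x.T, y - softmax_rows(numpy.dot(x, W) + b))

# Numerical gradient by central differences, one weight at a time
numeric = numpy.zeros_like(W)
eps = 1e-6
for j in range(W.shape[0]):
    for k in range(W.shape[1]):
        W_plus = W.copy()
        W_plus[j, k] += eps
        W_minus = W.copy()
        W_minus[j, k] -= eps
        numeric[j, k] = (log_likelihood(W_plus, b, x, y) -
                         log_likelihood(W_minus, b, x, y)) / (2 * eps)

max_diff = numpy.max(numpy.abs(analytic - numeric))
print(max_diff)  # should be close to zero if the formula is right
```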



In [4]:
class LogisticRegression(object):
    def __init__(self, input, label, n_in, n_out):
        self.x = input
        self.y = label
        self.Weights = numpy.zeros((n_in, n_out))  # weight matrix W, initialized to zeros
        self.biases = numpy.zeros(n_out)           # bias vector b, initialized to zeros

        # self.params = [self.Weights, self.biases]

    def train(self, lr=0.1, input=None, L2_regularization=0.00):
        if input is not None:
            self.x = input

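        # p_y_given_x = sigmoid(numpy.dot(self.x, self.Weights) + self.biases)
        p_y_given_x = softmax(numpy.dot(self.x, self.Weights) + self.biases)
        d_y = self.y - p_y_given_x  # error: one-hot labels minus predicted probabilities

        # Gradient ascent on the log-likelihood; the L2 term shrinks the weights.
        # Note: the weight gradient is summed over samples, the bias gradient is averaged.
        self.Weights += lr * numpy.dot(self.x.T, d_y) - lr * L2_regularization * self.Weights
        self.biases += lr * numpy.mean(d_y, axis=0)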
        
        # cost = self.negative_log_likelihood()
        # return cost

    def negative_log_likelihood(self):
        # p_y_given_x = sigmoid(numpy.dot(self.x, self.Weights) + self.biases)
        p_y_given_x = softmax(numpy.dot(self.x, self.Weights) + self.biases)

        # Binary cross-entropy summed over the output units, averaged over samples.
        # With two softmax outputs and one-hot labels this equals twice the
        # categorical cross-entropy.
        cross_entropy = - numpy.mean(
            numpy.sum(self.y * numpy.log(p_y_given_x) +
                      (1 - self.y) * numpy.log(1 - p_y_given_x),
                      axis=1))

        return cross_entropy


    def predict(self, x):
        # return sigmoid(numpy.dot(x, self.Weights) + self.biases)
        return softmax(numpy.dot(x, self.Weights) + self.biases)

## Test Run
Now we will do a test run on a small hand-written dataset: six binary feature vectors with one-hot labels for two classes.

In [5]:
def test_lr(learning_rate=0.01, n_epochs=200):
    # training data
    x = numpy.array([[1,1,1,0,0,0],
                     [1,0,1,0,0,0],
                     [1,1,1,0,0,0],
                     [0,0,1,1,1,0],
                     [0,0,1,1,0,0],
                     [0,0,1,1,1,0]])
    y = numpy.array([[1, 0],
                     [1, 0],
                     [1, 0],
                     [0, 1],
                     [0, 1],
                     [0, 1]])


    # construct LogisticRegression
    classifier = LogisticRegression(input=x, label=y, n_in=6, n_out=2)

    # train
    for epoch in range(n_epochs):
        classifier.train(lr=learning_rate)
        cost = classifier.negative_log_likelihood()
        print('Training epoch %d, cost is ' % epoch, cost)
        learning_rate *= 0.95  # decay the learning rate after every epoch


    # test on an unseen sample; prints the predicted class probabilities
    x = numpy.array([1, 1, 0, 0, 0, 0])
    print(classifier.predict(x))
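
The `predict` method returns class probabilities rather than class labels. A minimal sketch of turning such probabilities into labels and an accuracy score is shown below. The arrays `probs` and `targets` are made-up values in the same layout (one row per sample, one column per class), not output from the classifier above.

```python
import numpy

# Illustrative probabilities, one row per sample and one column per class
probs = numpy.array([[0.9, 0.1],
                     [0.2, 0.8],
                     [0.6, 0.4]])
# Illustrative one-hot targets for the same three samples
targets = numpy.array([[1, 0],
                       [0, 1],
                       [0, 1]])

predicted = numpy.argmax(probs, axis=1)      # most probable class per sample
expected = numpy.argmax(targets, axis=1)     # class index encoded by each one-hot row
accuracy = numpy.mean(predicted == expected) # fraction of correct predictions

print(predicted, accuracy)
```

For a single 1-D sample, as in the test above, `numpy.argmax(probs)` without the `axis` argument gives the predicted class.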

In [6]:
if __name__ == "__main__":
    test_lr()

Training epoch 0, cost is  1.34345264825
Training epoch 1, cost is  1.30455598779
Training epoch 2, cost is  1.2691578322
Training epoch 3, cost is  1.23687047036
Training epoch 4, cost is  1.2073565859
Training epoch 5, cost is  1.1803220394
Training epoch 6, cost is  1.15550971744
Training epoch 7, cost is  1.13269430085
Training epoch 8, cost is  1.11167781874
Training epoch 9, cost is  1.0922858705
Training epoch 10, cost is  1.07436441371
Training epoch 11, cost is  1.05777703022
Training epoch 12, cost is  1.04240259613
Training epoch 13, cost is  1.02813329242
Training epoch 14, cost is  1.01487290331
Training epoch 15, cost is  1.00253535772
Training epoch 16, cost is  0.991043476319
Training epoch 17, cost is  0.980327892779
Training epoch 18, cost is  0.970326122837
Training epoch 19, cost is  0.960981759043
Training epoch 20, cost is  0.95224377257
Training epoch 21, cost is  0.944065906409
Training epoch 22, cost is  0.936406146759
Training epoch 23, cost is  0.929226261457