# Logistic Regression

**Logistic regression** is the appropriate regression analysis when the dependent variable is dichotomous (binary). Like other regression analyses, it is predictive: it models the probability that the dependent variable takes the value 1. It is used to describe data and to explain the relationship between one binary dependent variable and one or more nominal, ordinal, interval, or ratio-level independent variables.

\begin{equation*}
P(y=1 \mid x) = \frac{1}{1+e^{-x\beta}}
\end{equation*}


### Linear regression: \begin{equation*}  h_\theta(x) = \theta^Tx \end{equation*}
\begin{align}  \text{where: } \theta &= \text{weight vector, i.e. } [b_0,b_1,b_2,\ldots,b_n] \\
x &= \text{feature vector, i.e. } [x_0,x_1,x_2,\ldots,x_n]
\end{align}
                            
### \begin{equation*}  h_\theta(x) = b_0x_0 + b_1x_1 + b_2x_2 + \cdots + b_nx_n = \sum_{i=0}^{n}b_ix_i \end{equation*}
\begin{align}
\text{where: } n = \text{number of features, and } x_0 = 1 \text{ so that } b_0 \text{ is the intercept}
\end{align}
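
As a minimal NumPy sketch of the sum above, assuming the convention $x_0 = 1$ so that $b_0$ acts as the intercept. The weights and feature values are made up for illustration.

```python
import numpy as np

# Hypothetical weights [b_0, b_1, b_2] and one example [x_0, x_1, x_2] with x_0 = 1
theta = np.array([0.5, -1.0, 2.0])
x = np.array([1.0, 3.0, 0.25])

# The hypothesis written as an explicit sum and as a dot product
h_sum = sum(b_i * x_i for b_i, x_i in zip(theta, x))
h_dot = theta @ x

print(h_sum, h_dot)
```

Both expressions compute the same value; the dot product is the form that generalizes to a whole matrix of examples.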

### Logistic regression:  \begin{equation*}  h_\theta(x) = \ g(\  \theta^Tx )\end{equation*}

\begin{equation*}
g(z)=\ \frac{1}{1+e^{-z}}
\end{equation*}

\begin{equation*}where :\ z= \sum_{i=0}^{n}b_ix_i = \theta^Tx  \end{equation*}
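
The logistic (sigmoid) function $1/(1+e^{-z})$ can be sketched in NumPy as follows. The branch on the sign of $z$ is an implementation choice, not part of the definition: it avoids overflow in `np.exp` for large negative inputs. The function name `sigmoid` is illustrative.

```python
import numpy as np

def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)) for array input, computed without overflow."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) is at most 1, so the textbook form is safe
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, rewrite as exp(z) / (1 + exp(z)) so the exponent stays negative
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out

values = sigmoid(np.array([-1000.0, 0.0, 1000.0]))
print(values)
```

The output always lies between 0 and 1, which is what lets $h_\theta(x)$ be read as a probability.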

<center><img src="files\logit.jpg" /></center>

## Cost Function: 
\begin{equation*}
J(\theta)=\frac{1}{m} \sum_{i=1}^{m}Cost\ (\ h_\theta(x^i)\ ,\ y^i)\\
\\
\text{where: } m = \text{number of training examples}
\end{equation*}
<br />
<br />
\begin{equation*}Cost\ (\ h_\theta(x)\ ,\ y)= \begin{cases} -\log(h_\theta(x)), & \text{if $y$=1 }\\[2ex] -\log(1-h_\theta(x)), & \text{if $y$=0 } \end{cases} \end{equation*}

\begin{equation*}
Cost\ (\ h_\theta(x)\ ,\ y)= -y\ \log(h_\theta(x)) - (1-y)\ \log(1-h_\theta(x))
\end{equation*}
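
The single-line form works because $y$ is either 0 or 1, so one of the two terms always vanishes. A small sketch that checks this numerically; the function names and the value `h = 0.9` are chosen only for illustration.

```python
import numpy as np

def cost_piecewise(h, y):
    # Case form: -log(h) when y = 1, -log(1 - h) when y = 0
    return -np.log(h) if y == 1 else -np.log(1.0 - h)

def cost_combined(h, y):
    # Single expression covering both cases
    return -y * np.log(h) - (1 - y) * np.log(1.0 - h)

h = 0.9  # hypothetical predicted probability of class 1
for y in (0, 1):
    print(y, cost_piecewise(h, y), cost_combined(h, y))
```

A confident correct prediction (`h = 0.9`, `y = 1`) costs little, while the same prediction with `y = 0` is penalized much more heavily.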

<h3>Our aim is to minimize: $$J(\theta)$$ </h3> 

\begin{equation*}
\min_{\theta}{J(\theta)}, \qquad J(\theta) = -\frac{1}{m}
\sum_{i=1}^{m} \left[ y^i\ \log(h_\theta(x^i)) + (1-y^i)\ \log(1-h_\theta(x^i)) \right]
\end{equation*}
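
As a rough sketch, $J(\theta)$ can be evaluated in NumPy for a small binary problem. The function name `cost_J` and the toy data are assumptions for illustration; `X` is taken to hold one example per row with a leading column of ones for the intercept.

```python
import numpy as np

def cost_J(theta, X, y):
    """Mean cross-entropy over m examples; X has shape (m, n+1), y holds 0/1 labels."""
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))  # h_theta(x) for every row of X
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

# Toy data: three examples, intercept column plus one feature
X = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([0.0, 0.0, 1.0])

J0 = cost_J(np.zeros(2), X, y)
print(J0)
```

With all weights at zero every prediction is 0.5, so the cost equals $\log 2$ regardless of the labels. This is a handy sanity check for an implementation.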

In [1]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
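# This notebook uses the TensorFlow 1.x API (tf.placeholder, tf.Session);
# the MNIST helper below is not available in TensorFlow 2.
from tensorflow.examples.tutorials.mnist import input_data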

In [2]:
mnist  = input_data.read_data_sets('data/', one_hot=True)
train_x  = mnist.train.images
train_y  = mnist.train.labels
test_x   = mnist.test.images
test_y   = mnist.test.labels

Extracting data/train-images-idx3-ubyte.gz
Extracting data/train-labels-idx1-ubyte.gz
Extracting data/t10k-images-idx3-ubyte.gz
Extracting data/t10k-labels-idx1-ubyte.gz


In [3]:
# Parameters of Logistic Regression
learning_rate   = 0.01
training_epochs = 200
batch_size      = 100
display_step    = 5

In [4]:
# Create Graph for Logistic Regression
x = tf.placeholder("float", [None, 784], name="input") 
y = tf.placeholder("float", [None, 10], name="output")  
W = tf.Variable(tf.zeros([784, 10]), name="weights")
b = tf.Variable(tf.zeros([10]), name="bias")

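\begin{equation*}
Softmax \ Function: \quad
g(z)_j = \frac{e^{z_j}}{\sum_{k=1}^{K} e^{z_k}}, \quad j = 1, \ldots, K \\
where \ z= X*W+b \\
\\
X = input \ matrix \\
\\
W = weight \ matrix \\
\\
b = bias \\
\\
K = number \ of \ classes \ (10 \ for \ the \ MNIST \ digits)
\end{equation*}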

In [5]:
# Softmax function
pred = tf.nn.softmax(tf.matmul(x, W) + b)
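
For intuition, here is a NumPy sketch of what a row-wise softmax computes. It is not TensorFlow's implementation. Subtracting the row maximum before exponentiating is a common stability trick and does not change the result; the logits below are made up.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax: each row of z becomes a probability distribution."""
    z = z - z.max(axis=1, keepdims=True)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Two hypothetical rows of logits; the second would overflow without the shift
logits = np.array([[1.0, 2.0, 3.0],
                   [1000.0, 1000.0, 1000.0]])
probs = softmax(logits)
print(probs)
```

Each row sums to 1, and the largest logit in a row receives the largest probability.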

### The cost function has the form:

$$J(\theta)=-\frac{1}{m}\sum_{i=1}^{m}\left[y^{i}\log(h_\theta(x^{i}))+(1-y^{i})\log(1-h_\theta(x^{i}))\right]$$

In [6]:
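# Cost function: per-class cross-entropy, summed over classes, averaged over the batch
eps = 1e-7  # keeps log() away from 0, which would otherwise produce NaN
clipped = tf.clip_by_value(pred, eps, 1.0 - eps)
cost = tf.reduce_mean(-tf.reduce_sum(y * tf.log(clipped) + (1 - y) * tf.log(1 - clipped), axis=1))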

In [7]:
# Gradient descent
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost) 
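
To show what a gradient descent step does, here is a plain NumPy sketch for the binary case. It relies on the standard result that the gradient of the cross-entropy cost is $\frac{1}{m}X^T(h - y)$. The function names, the synthetic data, and the hyperparameters are assumptions for illustration, not the notebook's settings.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent(X, y, learning_rate=0.1, steps=500):
    """Batch gradient descent on the mean cross-entropy cost."""
    m = X.shape[0]
    theta = np.zeros(X.shape[1])
    costs = []
    for _ in range(steps):
        h = sigmoid(X @ theta)
        costs.append(-np.mean(y * np.log(h) + (1 - y) * np.log(1 - h)))
        grad = X.T @ (h - y) / m          # gradient of J with respect to theta
        theta -= learning_rate * grad     # step against the gradient
    return theta, costs

# Synthetic, noisy 1-D problem: label is 1 when the feature is roughly positive
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
X = np.column_stack([np.ones(200), x1])   # leading column of ones for the intercept
y = (x1 + 0.5 * rng.normal(size=200) > 0).astype(float)

theta, costs = gradient_descent(X, y)
print(theta, costs[0], costs[-1])
```

The cost starts at $\log 2$ (all weights zero) and decreases as the weight on the feature grows positive.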

In [8]:
# Boolean vector: True where the predicted class matches the true class
correct = tf.equal(tf.argmax(pred, 1), tf.argmax(y, 1))

# Accuracy
accuracy = tf.reduce_mean(tf.cast(correct, "float"))
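
The same accuracy computation can be sketched in NumPy on a tiny hand-made batch. The arrays below are hypothetical values, not MNIST data.

```python
import numpy as np

# Hypothetical predicted probabilities for three examples and three classes
pred = np.array([[0.1, 0.7, 0.2],
                 [0.8, 0.1, 0.1],
                 [0.3, 0.3, 0.4]])
# One-hot true labels for the same examples
labels = np.array([[0, 1, 0],
                   [0, 0, 1],
                   [0, 0, 1]])

# Compare the index of the largest probability with the index of the 1 in the label
correct = np.argmax(pred, axis=1) == np.argmax(labels, axis=1)
accuracy = correct.astype(float).mean()
print(correct, accuracy)
```

Casting the boolean vector to floats turns the mean into the fraction of correct predictions, here two out of three.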

In [9]:
# Initializing the variables
init = tf.global_variables_initializer()

### Launch a Session:

In [10]:
sess = tf.Session()
sess.run(init)

In [None]:
num_batch = mnist.train.num_examples // batch_size
for epoch in range(training_epochs):
    sum_cost = 0.
    # Each step draws a random mini-batch (indices sampled with replacement)
    for i in range(num_batch):
        randidx  = np.random.randint(train_x.shape[0], size=batch_size)
        batch_xs = train_x[randidx, :]
        batch_ys = train_y[randidx, :]
        feeds = {x: batch_xs, y: batch_ys}
        # Run one gradient step and fetch the batch cost in the same call
        _, batch_cost = sess.run([optimizer, cost], feed_dict=feeds)
        sum_cost += batch_cost
    avg_cost = sum_cost / num_batch
    # Every display_step epochs, log the average cost and the last batch's accuracy
    if epoch % display_step == 0:
        train_acc = sess.run(accuracy, feed_dict=feeds)
        print("Epoch: %03d/%03d cost: %.9f train_acc: %.3f"
              % (epoch, training_epochs, avg_cost, train_acc))
print("Optimization Finished!")

# Test model
test_accuracy = sess.run(accuracy, feed_dict={x: test_x, y: test_y})
print("Test Accuracy: %.3f" % test_accuracy)
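
The training loop draws batch indices with `np.random.randint`, which samples with replacement, so some examples may be seen several times in an epoch and others not at all. A common alternative, sketched here under the assumption that a full pass per epoch is wanted, shuffles the indices once and slices them. The helper name `minibatches` is hypothetical.

```python
import numpy as np

def minibatches(n_examples, batch_size, rng):
    """Yield index arrays that together cover every example exactly once."""
    order = rng.permutation(n_examples)
    for start in range(0, n_examples, batch_size):
        yield order[start:start + batch_size]

rng = np.random.default_rng(0)
batches = list(minibatches(10, 4, rng))
for idx in batches:
    print(idx)
```

The last batch may be smaller than `batch_size` when the dataset size is not a multiple of it.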