### Logistic Regression
Logistic regression is a common method for solving binary classification problems. It models the relationship between one binary dependent variable and one or more independent variables, and it predicts the probability that an event occurs.

#### Linear Regression vs. Logistic Regression
Linear regression produces a continuous output, such as a house price, while logistic regression produces a discrete output, such as whether or not a patient has cancer. Linear regression is typically fitted with Ordinary Least Squares (OLS), while logistic regression is fitted with Maximum Likelihood Estimation (MLE).
![title](images/linearLogistic.png)
MLE chooses the parameters under which the observed data are most likely. The fitted parameters can then be used to make predictions for new data. OLS, on the other hand, fits the regression line that minimizes the sum of squared errors over the data points.
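
To make the idea of MLE concrete, here is a minimal sketch that fits a logistic model by repeatedly stepping in the direction that increases the log-likelihood. The toy data, the helper names `sigmoid` and `log_likelihood`, and the step size are all assumptions made for illustration; this is not the method used in the code cells further down, which minimize a squared error.

```python
import numpy as np

def sigmoid(z):
    # Squash real-valued scores into the interval (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood(y, p):
    # Bernoulli log-likelihood of labels y under predicted probabilities p
    return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))

# Hypothetical toy data: a bias column plus one feature, with binary labels
x = np.array([-2.0, -1.5, -1.0, -0.5, 0.5, 1.0, 1.5, 2.0])
X = np.column_stack([np.ones_like(x), x])
y = np.array([0, 0, 0, 1, 0, 1, 1, 1])

w = np.zeros(2)
eta = 0.1  # assumed step size
ll_start = log_likelihood(y, sigmoid(X.dot(w)))

for epoch in range(1000):
    p = sigmoid(X.dot(w))
    # Gradient of the log-likelihood with respect to w
    grad = X.T.dot(y - p)
    # Ascent step: move w so that the observed labels become more likely
    w = w + eta * grad

ll_end = log_likelihood(y, sigmoid(X.dot(w)))
print("weights:", w)
print("log-likelihood:", ll_start, "->", ll_end)
```

The log-likelihood after training should be higher than at the starting point, which is exactly what "most likely to produce the observed data" means in practice.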

#### Sigmoid Function
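The sigmoid function, sigmoid(z) = 1 / (1 + e^(-z)), maps any real number to a value between 0 and 1, so its output can be interpreted as a probability.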
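
As a small illustration, the sigmoid 1 / (1 + e^(-z)) can be sketched with NumPy as follows. The function name `sigmoid` and the sample inputs are chosen here for demonstration and do not appear in the original notebook.

```python
import numpy as np

def sigmoid(z):
    """Map real-valued scores to the open interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative inputs: a large negative score, zero, and a large positive score
z = np.array([-5.0, 0.0, 5.0])
p = sigmoid(z)
print(p)
```

Negative scores land below 0.5, a score of zero lands at exactly 0.5, and positive scores land above 0.5, so 0.5 is a natural threshold for a binary decision.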

#### Accuracy
Accuracy is an evaluation metric for classification models. It is computed as (Number of Correct Predictions)/(Total Number of Predictions). Accuracy should be interpreted with care on a class-imbalanced data set, where the positive and negative labels of a binary classification problem occur with very different frequencies: a model that always predicts the majority class can still reach a high accuracy.
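
The following sketch shows why accuracy can mislead on imbalanced data. The labels are made up for illustration (95 negatives and 5 positives), and the "model" is a deliberately trivial one that always predicts the negative class.

```python
import numpy as np

# Hypothetical labels: 95 negatives (0) and 5 positives (1)
y_true = np.array([0] * 95 + [1] * 5)

# A trivial "model" that always predicts the majority class
y_pred = np.zeros_like(y_true)

# (Number of Correct Predictions) / (Total Number of Predictions)
accuracy = np.mean(y_pred == y_true)

# Fraction of the actual positives that the model found
positives_found = np.sum((y_pred == 1) & (y_true == 1)) / np.sum(y_true == 1)

print("Accuracy:", accuracy)
print("Fraction of positives found:", positives_found)
```

The accuracy looks high even though not a single positive case is detected, so on imbalanced data it is worth inspecting the per-class counts as well.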

Reference: https://www.datacamp.com/community/tutorials/understanding-logistic-regression-python

In [20]:
%matplotlib inline
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import scipy as sc
import pandas as pd

df_iris = pd.read_csv(u'data/iris.txt',sep=' ')

target = np.array(df_iris['c'])
features = np.array(df_iris[['sl','sw','pl','pw']])

#print("features: ", features)

In [21]:
# Forward pass of our model: the product of the feature matrix x and the weight vector w.
# Note that this is a linear model; no sigmoid is applied to the output.
def f(x, w):
    return x.dot(w)

# Error function: sum of squared differences between the targets y and the predictions f_est.
def E(y, f_est):
    return np.sum((y-f_est)**2)

In [34]:
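# Initial weights: one per feature column. This is only the starting point for the search.
w = np.array([1.,1.,1.,1.])

# Learning algorithm: batch gradient descent with learning rate eta
eta = 0.0001

for epoch in range(10000):
    f_est = f(features, w)
    e = target - f_est
    # Gradient of E with respect to w (the constant factor 2 is absorbed into eta)
    dE = -features.T.dot(e)

    # Step against the gradient to reduce the error
    w = w - eta * dE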

#print("f_est: ", f_est)
#print("target: ", target)
#print("e: ", e)
#print("dE: ",dE)
#print(w)
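
The update above relies on the gradient of the squared error E with respect to w being -2 * X.T.dot(y - X.dot(w)); the loop drops the factor 2, which only rescales `eta`. As a hedged sanity check, the sketch below compares that analytic gradient with a finite-difference estimate. It redefines `f` and `E` so that it runs on its own, and it uses random stand-in data (`X`, `y`) rather than the iris features.

```python
import numpy as np

def f(x, w):
    return x.dot(w)

def E(y, f_est):
    return np.sum((y - f_est) ** 2)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))  # hypothetical stand-in for `features`
y = rng.normal(size=20)       # hypothetical stand-in for `target`
w = np.ones(4)

# Analytic gradient of E with respect to w
analytic = -2.0 * X.T.dot(y - f(X, w))

# Central finite-difference approximation, one weight at a time
h = 1e-6
numeric = np.zeros_like(w)
for j in range(len(w)):
    step = np.zeros_like(w)
    step[j] = h
    numeric[j] = (E(y, f(X, w + step)) - E(y, f(X, w - step))) / (2 * h)

print("analytic:", analytic)
print("numeric: ", numeric)
print("match:", np.allclose(analytic, numeric, rtol=1e-4, atol=1e-4))
```

If the two vectors agree, the direction used in the training loop is the true descent direction for E.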
    

In [35]:
f_est = f(features, w)
result = np.round(f_est)

# A confusion matrix is a table used to evaluate the performance of a classification model.
# Rows correspond to the true classes and columns to the predicted classes, so correct predictions are counted on the diagonal.
# It is a 3x3 matrix because our target has three classes: 1, 2, and 3.
confusion_matrix = np.zeros((3,3))

for i in range(len(result)):
    confusion_matrix[int(target[i] - 1), int(result[i] - 1)] += 1
    
confusion_matrix

array([[50.,  0.,  0.],
       [ 0., 48.,  2.],
       [ 0.,  4., 46.]])

In [36]:
accuracy = np.sum(np.diag(confusion_matrix))/np.sum(confusion_matrix)

print("Accuracy is:", accuracy*100, "%")

Accuracy is: 96.0 %
