<h1>What is "Logistic Regression"?</h1>

Logistic Regression is a Machine Learning classification algorithm that is used to predict the probability of a categorical dependent variable. <br></br>
In logistic regression, the dependent variable is a binary variable that contains data coded as 1 (yes, success, etc.) or 0 (no, failure, etc.).<br></br>

<h2>Where do we use Logistic Regression?</h2>
To predict whether an email is spam (1) or not (0)<br></br>
To predict whether a tumor is malignant (1) or not (0)<br></br>
To predict whether a voice or face belongs to a man (1) or a woman (0)<br></br>
Logistic regression is generally used where the dependent variable is binary (dichotomous). That means the dependent variable can take only two possible values, such as “Yes or No”, “Default or No Default”, “Living or Dead”, “Man or Woman”, “Responder or Non-Responder”, etc. The independent variables can be categorical or numerical.

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python docker image: https://github.com/kaggle/docker-python
# For example, here's several helpful packages to load in 

import numpy as np # linear algebra
import pandas as pd # data processing, CSV file I/O (e.g. pd.read_csv)
import matplotlib.pyplot as plt
# Input data files are available in the "../input/" directory.
# For example, running this (by clicking run or pressing Shift+Enter) will list the files in the input directory

import os
print(os.listdir("../input"))

# Any results you write to the current directory are saved as output.

In [None]:
data = pd.read_csv("../input/voice.csv") #read data
data.head() # first five rows

As you can see, the "label" column holds binary data: "male" and "female". So we can use logistic regression here.
But the target can't stay as strings (object type) for our calculations. It must be numeric, so we convert the label column to integers: 1 for female, 0 for male.

In [None]:
print(data.label.unique())
data.label = [1 if i =='female' else 0 for i in data.label ]
y = data.label.values.reshape(-1,1)
x_data = data.drop(["label"], axis=1)
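
The same conversion can also be written with an explicit mapping. This is only a sketch on a toy DataFrame (the name `toy` and its four rows are made up for illustration, they are not the voice data). One possible advantage of `map` is that an unexpected label becomes NaN instead of silently turning into 0.

```python
import pandas as pd

# Toy stand-in for the "label" column (hypothetical values, not the real dataset)
toy = pd.DataFrame({"label": ["male", "female", "female", "male"]})

# Explicit mapping: any value other than "female"/"male" would become NaN
toy["label_int"] = toy["label"].map({"female": 1, "male": 0})

print(toy["label_int"].tolist())  # [0, 1, 1, 0]
```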

<h2>Normalization</h2>
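We must normalize the features to the same scale, because each feature can have a different unit and range. Here we use min-max normalization, which maps every feature to the range [0, 1]: x_normalized = (x - min(x)) / (max(x) - min(x)). Formula of normalization: ![](http://nichea.sourceforge.net/images/figures/functions/standardization/formula1.png)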
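
To see what this kind of scaling (min-max normalization) does, here is a minimal sketch on a tiny table. The DataFrame `toy` and its column names are assumptions made up for the example.

```python
import pandas as pd

# Two toy features with very different ranges (hypothetical values)
toy = pd.DataFrame({"a": [1.0, 2.0, 3.0], "b": [10.0, 20.0, 40.0]})

# Column-wise min-max scaling: each column ends up between 0 and 1
scaled = (toy - toy.min()) / (toy.max() - toy.min())

print(scaled)
```

After scaling, both columns start at 0 and end at 1, so no feature dominates just because of its unit.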


In [None]:
x = (x_data - x_data.min())/(x_data.max() - x_data.min()) # column-wise min-max scaling
data.head()

In [None]:
x.head()
# Do you see the difference?

<h2> Train Test Split </h2>
We split our data into a training set and a test set. We use the sklearn library for that. 
I use 20% of the data to test the model and 80% to train it.

In [None]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x,y,test_size = 0.2, random_state = 42)
# x => features
# y => labels
# test_size => 20% of the rows go to the test set
# random_state => seed for the shuffle, so the split is the same on every run

x_train = x_train.T
x_test = x_test.T
y_train = y_train.T
y_test = y_test.T

print("x_train shape: ",x_train.shape )
print("y_train shape: ",y_train.shape )
print("x_test shape: ",x_test.shape )
print("y_test shape: ",y_test.shape )
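
Why the transpose? The hand-written code later in this notebook works with arrays of shape (features, samples), while sklearn returns (samples, features). A small sketch with made-up arrays (`features` and `labels` are hypothetical, 10 samples with 5 features) shows the shapes before and after:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical toy data: 10 samples, 5 features
features = np.arange(50, dtype=float).reshape(10, 5)
labels = (np.arange(10) % 2).reshape(-1, 1)

f_train, f_test, l_train, l_test = train_test_split(
    features, labels, test_size=0.2, random_state=42
)

print(f_train.shape, f_test.shape)        # (8, 5) (2, 5)
print(f_train.T.shape, l_train.T.shape)   # (5, 8) (1, 8)
```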

<h2> Application of Logistic Regression </h2>
1. Initialize parameters (weights and bias)
2. Forward and backward propagation
3. Update parameters
4. Predict <br>
<h3>Initializing Parameters </h3>
 The parameters are the weights and the bias. <br></br>
 Bias: the intercept term

In [None]:
def initialize_weights_and_bias(dimension):
    # (dimension x 1) column vector filled with a small start value, 0.01
    # Note: for plain logistic regression, zeros would also work, because the
    # gradient depends on x and the prediction error, not on w itself
    w = np.full((dimension,1),0.01)
    b = 0.0 # initial bias, as a float
    return w,b

<h3> Forward Propagation </h3>
z = (w.T)x + b, where x is the feature array, w is the weights and b is the bias. <br></br>
We pass z into the "sigmoid function", which returns y_head. What is the sigmoid function?<br></br>
<b>Sigmoid function</b> squashes z into a value between zero and one, so the result can be read as a probability. It is differentiable, so we can use it in the gradient descent algorithm. You can see the formula and graph below.<br></br>
![](https://cdn-images-1.medium.com/max/1600/1*Xu7B5y9gp0iL5ooBj7LtWw.png)
Then we calculate the "loss (error) function".


In [None]:
def sigmoid(z):    
    y_head = 1/(1 + np.exp(-z)) #formula of sigmoid 
    return y_head

# If z = 0, function must give us 0.5 mathematically
sigmoid(0)
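
A few properties of the sigmoid can be checked numerically. The sketch below redefines `sigmoid` so it runs on its own, and compares the known derivative s(z)(1 - s(z)) with a finite-difference estimate. The step size `eps` is just a value picked for the example.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

z = np.array([-4.0, -1.0, 0.0, 1.0, 4.0])
s = sigmoid(z)

# Outputs stay strictly between 0 and 1, and sigmoid(-z) = 1 - sigmoid(z)
print(s)

# Derivative: analytic form vs. central finite difference
eps = 1e-6
analytic = s * (1 - s)
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(np.max(np.abs(analytic - numeric)))  # a very small number
```

Having a simple derivative like this is what makes the sigmoid convenient for gradient descent.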


Now we should calculate the loss function. The formula of the loss function is: <br></br> ![](https://image.ibb.co/eC0JCK/duzeltme.jpg)

It says that if you make a wrong prediction, the loss becomes large. Each input sample produces its own loss value, and the cost function is the average of those losses over all samples (the sum divided by the number of samples). We want to find the weights and bias that minimize the cost.
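
A quick worked example may help. The sketch below uses the log loss in the same form as the code in this notebook, -y*log(y_head) - (1-y)*log(1-y_head). The helper name `log_loss_single` is made up for the illustration.

```python
import numpy as np

def log_loss_single(y, y_head):
    # loss for one sample: y is the true label (0 or 1), y_head the predicted probability
    return -y * np.log(y_head) - (1 - y) * np.log(1 - y_head)

good = log_loss_single(1, 0.9)  # true label 1, confident and right: about 0.105
bad = log_loss_single(1, 0.1)   # true label 1, confident and wrong: about 2.303

# The cost is the average loss over the samples
cost = np.mean([good, bad])
print(good, bad, cost)
```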


In [None]:
def forward_backward_propagation(w,b,x_train,y_train):
    
    z = np.dot(w.T,x_train) + b
    y_head = sigmoid(z)
    loss = -y_train*np.log(y_head)-(1-y_train)*np.log(1-y_head)
    cost = (np.sum(loss))/x_train.shape[1]
    
    # backward propagation: derivatives of the cost with respect to w and b
    derivative_weight = (np.dot(x_train,((y_head-y_train).T)))/x_train.shape[1] # d(cost)/dw
    derivative_bias = np.sum(y_head-y_train)/x_train.shape[1] # d(cost)/db
    gradients = {"derivative_weight": derivative_weight, "derivative_bias": derivative_bias} # gradients used in the update step
    
    return cost,gradients
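
One way to gain confidence in the derivative formulas is a numerical gradient check. The sketch below is self-contained: `cost_and_gradients` is a hypothetical helper that repeats the same math as above, and the data is random toy data (3 features, 8 samples), not the voice dataset.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def cost_and_gradients(w, b, x, y):
    # same formulas as the forward/backward step, on arrays of shape (features, samples)
    m = x.shape[1]
    y_head = sigmoid(np.dot(w.T, x) + b)
    cost = np.sum(-y * np.log(y_head) - (1 - y) * np.log(1 - y_head)) / m
    dw = np.dot(x, (y_head - y).T) / m
    db = np.sum(y_head - y) / m
    return cost, dw, db

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 8))
y = rng.integers(0, 2, size=(1, 8)).astype(float)
w = np.full((3, 1), 0.01)
b = 0.0

_, dw, db = cost_and_gradients(w, b, x, y)

# Central finite differences of the cost, one parameter at a time
eps = 1e-6
numeric_dw = np.zeros_like(w)
for j in range(w.shape[0]):
    step = np.zeros_like(w)
    step[j, 0] = eps
    cost_plus = cost_and_gradients(w + step, b, x, y)[0]
    cost_minus = cost_and_gradients(w - step, b, x, y)[0]
    numeric_dw[j, 0] = (cost_plus - cost_minus) / (2 * eps)

numeric_db = (cost_and_gradients(w, b + eps, x, y)[0]
              - cost_and_gradients(w, b - eps, x, y)[0]) / (2 * eps)

print(np.max(np.abs(dw - numeric_dw)), abs(db - numeric_db))
```

If the analytic and numeric values agree closely, the backward step matches the cost it is supposed to differentiate.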


<h3>Updating Parameters</h3>

In [None]:
# number_of_iteration = how many times we run forward and backward propagation
def update(w, b, x_train, y_train, learning_rate, number_of_iteration):
    cost_list = []
    cost_list2 = []
    index = []
    
    # update the parameters number_of_iteration times
    for i in range(number_of_iteration):
        cost,gradients = forward_backward_propagation(w,b,x_train,y_train)
        # we track the cost to judge how many iterations are enough
        cost_list.append(cost)
        
        # gradient descent step
        w = w - learning_rate * gradients["derivative_weight"]
        b = b - learning_rate * gradients["derivative_bias"]
        # the steps shrink as the derivatives approach zero (near the minimum)

        if i % 10 == 0:
            cost_list2.append(cost) 
            index.append(i)
            print("cost after iteration %i: %f" %(i,cost))

    parameters = {"weight": w, "bias": b} # the learned parameters
    plt.plot(index,cost_list2)
    plt.xticks(index,rotation = 'vertical')
    plt.xlabel("number of iterations")
    plt.ylabel("cost")
    plt.show()
    
    return parameters, gradients, cost_list
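
The loop above always runs a fixed number of iterations. A possible variation, sketched here on made-up one-feature data, stops early once the gradient is close to zero. The names `tolerance` and `max_iterations` and their values are assumptions chosen for the example, not part of the original code.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Toy data: one noisy feature, shape (features, samples) = (1, 200)
rng = np.random.default_rng(1)
x = rng.normal(size=(1, 200))
y = (x + 0.5 * rng.normal(size=(1, 200)) > 0).astype(float)

w = np.full((1, 1), 0.01)
b = 0.0
learning_rate = 1.0
tolerance = 1e-3        # hypothetical stopping threshold
max_iterations = 5000   # safety limit
costs = []

for i in range(max_iterations):
    y_head = sigmoid(np.dot(w.T, x) + b)
    costs.append(np.mean(-y * np.log(y_head) - (1 - y) * np.log(1 - y_head)))
    dw = np.dot(x, (y_head - y).T) / x.shape[1]
    db = np.mean(y_head - y)
    grad_norm = np.sqrt(np.sum(dw ** 2) + db ** 2)
    if grad_norm < tolerance:
        break  # derivatives are close to zero, further updates change little
    w = w - learning_rate * dw
    b = b - learning_rate * db

print("stopped after", i + 1, "iterations, cost:", costs[-1])
```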

<h3>Prediction Method</h3>

In [None]:
def predict(weight, bias, x_test):
    # forward propagation on x_test gives the predicted probabilities
    y_head = sigmoid(np.dot(weight.T,x_test)+bias)
    # if y_head > 0.5, prediction = 1
    # if y_head <= 0.5, prediction = 0
    y_prediction = (y_head > 0.5).astype(float)
    return y_prediction

<h3> Logistic Regression Method ~ Last </h3> 

In [None]:
def logistic_reg(x_train, y_train, x_test, y_test, learning_rate, num_iteration):
    dimension = x_train.shape[0] # number of features (rows of x_train after the transpose)
    w,b = initialize_weights_and_bias(dimension)
    parameters, gradients, cost_list = update(w,b,x_train,y_train,learning_rate,num_iteration)
    y_prediction_test = predict(parameters["weight"],parameters["bias"], x_test)
    
    # report the accuracy on the test set
    print("test accuracy: {} %".format(100 - np.mean(np.abs(y_prediction_test - y_test)) * 100))
    
logistic_reg(x_train, y_train, x_test, y_test,learning_rate = 1, num_iteration = 500)
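
The accuracy expression `100 - mean(|prediction - truth|) * 100` may look unusual. For 0/1 labels it is the same as the percentage of matching predictions, as this small sketch with made-up arrays shows:

```python
import numpy as np

# Hypothetical labels and predictions, shape (1, samples) like in the code above
y_true = np.array([[1, 0, 1, 1, 0]])
y_pred = np.array([[1, 0, 0, 1, 0]])

# |prediction - truth| is 1 for a mistake and 0 for a hit
accuracy_from_abs = 100 - np.mean(np.abs(y_pred - y_true)) * 100
accuracy_from_match = np.mean(y_pred == y_true) * 100

print(accuracy_from_abs, accuracy_from_match)  # both are 80 percent (4 of 5 correct)
```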


<h2>Simple Logistic Regression Method</h2>
Of course, we don't have to write this much code every time. The sklearn library has a ready-made class for it. Love Python <3

In [None]:
from sklearn.linear_model import LogisticRegression
lr = LogisticRegression()
lr.fit(x_train.T,y_train.T.ravel()) # sklearn expects (samples, features) and a 1-D label array
print("test accuracy {}".format(lr.score(x_test.T,y_test.T.ravel())))
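
The sklearn model can also return probabilities, not only class labels. The sketch below uses synthetic data from `make_classification` (an assumption for the example, not the voice dataset) to show how `predict_proba` relates to `predict`.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic binary classification data: 200 samples, 5 features
X, y = make_classification(n_samples=200, n_features=5, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LogisticRegression()
model.fit(X_train, y_train)

proba = model.predict_proba(X_test)  # columns: P(class 0), P(class 1)
pred = model.predict(X_test)         # class labels

print(proba[:3])
print(pred[:3])
```

Keep in mind that sklearn's `LogisticRegression` applies L2 regularization by default, so its result will not match the hand-written version exactly.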