Logistic Regression

Introduction

The last two posts dealt with linear regression, where we train a model to output a continuous value given one or more input variables. Now we move into the field of classification with logistic regression, where we attempt to predict which class something belongs to given some input variables. For example, we might want to predict whether an email is spam or not spam, or whether a transaction is fraudulent or genuine.

Binary Classification

For the example of 'spam' or 'not spam', we want the model to output either 0 or 1. Technically, 0 stands for the 'negative class' and 1 for the 'positive class'; in this case, 0 would be 'not spam' and 1 would be 'spam'.

Logistic Function

Previously we used the simple linear hypothesis y = mx + b to predict a value, but since a classifier's output should be bounded between 0 and 1, we need a different hypothesis function.

hθ(x) = g(θᵀx),   where g(z) = 1 / (1 + e⁻ᶻ)

The benefit of using this function is that it maps any input number to a value between 0 and 1, which can be read as the probability of the positive class, making it well suited for classification.

(Plot: the sigmoid curve, which approaches 0 for large negative z and 1 for large positive z.)

The implementation of this function in the code is:

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))
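A quick sanity check of the function's behaviour at a few sample inputs (my addition, not from the original repo):

print(sigmoid(0))     # 0.5: the midpoint of the curve
print(sigmoid(6))     # ~0.9975: large positive inputs approach 1
print(sigmoid(-6))    # ~0.0025: large negative inputs approach 0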

Cost Function

As with our hypothesis function, we cannot use the same cost function as in linear regression: with the sigmoid hypothesis it would produce a wavy, non-convex surface with many local optima, so gradient descent wouldn't be guaranteed to find the global minimum. To solve this, we use a modified cost function that is convex, so we can find the optimal values. The updated cost function:

J(θ) = −(1/m) Σᵢ [ y⁽ⁱ⁾ log(hθ(x⁽ⁱ⁾)) + (1 − y⁽ⁱ⁾) log(1 − hθ(x⁽ⁱ⁾)) ]

Or a vectorized version:

J(θ) = (1/m) (−yᵀ log(h) − (1 − y)ᵀ log(1 − h)),   where h = g(Xθ)

This is implemented in our code as:

def cost(h, y):
    # Average cross-entropy loss over all m training examples
    return (-y * np.log(h) - (1 - y) * np.log(1 - h)).mean()
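As a quick illustration (my addition), with all-zero theta every prediction is sigmoid(0) = 0.5, so the cost collapses to ln 2 ≈ 0.6931 regardless of the labels; this matches the initial cost reported in the Results section below:

h = np.full(100, 0.5)               # all-zero theta means h = 0.5 for every example
y = np.array([0] * 50 + [1] * 50)   # 50 examples per class, as in the iris subset below
print(cost(h, y))                   # 0.6931471805599453 (= ln 2)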

Gradient Descent

To minimize our cost function we can use the exact same gradient descent update as in linear regression; the gradient of this cost with respect to θ turns out to have the same form, (1/m) · Xᵀ(h − y), just with the sigmoid hypothesis h in place of the linear one.

def gradient(X, h, y):
    # Gradient of the cost function: (1/m) * X^T (h - y)
    return np.dot(X.T, (h - y)) / y.shape[0]

def logistic_regression(X, y, theta, alpha, iters):
    cost_array = np.zeros(iters)
    for i in range(iters):
        h = sigmoid(np.dot(X, theta))       # current predictions
        cost_array[i] = cost(h, y)          # record the cost at this iteration
        gradient_val = gradient(X, h, y)
        theta = theta - (gradient_val * alpha)  # take one gradient descent step
    return theta, cost_array
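A minimal usage sketch (my addition; the stand-in data is hypothetical, and the real features come from the iris subset in the next section). Since the fitted theta reported below has three components, I'm assuming X carries a leading column of ones for the intercept:

rng = np.random.default_rng(0)
features = rng.normal(size=(100, 2))          # stand-in (m, 2) feature matrix
y = (features[:, 0] > 0).astype(float)        # stand-in 0/1 labels
X = np.hstack([np.ones((100, 1)), features])  # prepend the bias column
theta, cost_array = logistic_regression(X, y, np.zeros(3), alpha=0.01, iters=10000)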

The Data

To test my implementation of logistic regression I have decided to use the famous iris dataset. For this example I use only 2 features (length and width) and 2 classes (Iris setosa = 0 and Iris versicolour = 1).

    length  width  Type
0      5.1    3.5     0
1      4.9    3.0     0
2      4.7    3.2     0
3      4.6    3.1     0
4      5.0    3.6     0
..     ...    ...   ...
95     5.7    3.0     1
96     5.7    2.9     1
97     6.2    2.9     1
98     5.1    2.5     1
99     5.7    2.8     1

Shown on a scatter plot: (plot of length vs. width, coloured by class)
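A sketch of how this subset could be loaded (my addition; the repo's own loading code isn't shown here, so using scikit-learn and taking the first two measurement columns as 'length' and 'width' are assumptions):

from sklearn.datasets import load_iris
import numpy as np
import pandas as pd

iris = load_iris()
mask = iris.target < 2                      # keep setosa (0) and versicolour (1)
df = pd.DataFrame(iris.data[mask, :2], columns=['length', 'width'])
df['Type'] = iris.target[mask]
features = df[['length', 'width']].values
y = df['Type'].astype(float).values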

Results

With starting theta values of 0, our cost function gives:

Initial cost value for theta values [0. 0. 0.] is: 0.6931471805599453

With a learning rate of 0.01 and 10,000 iterations, our logistic regression algorithm gives the following result:

Final cost value for theta values [-0.70846899 3.04714971 -5.10943662] is: 0.10131731164299049
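The cost_array returned by logistic_regression makes it easy to check that training converged; a minimal plotting sketch (my addition, assuming matplotlib is available):

import matplotlib.pyplot as plt

plt.plot(cost_array)          # cost recorded at each gradient descent iteration
plt.xlabel('Iteration')
plt.ylabel('Cost J(theta)')
plt.show()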

(Plot: the cost value falling over the training iterations.)

Comparing the model's hypothesis output with the actual class for each training example:

Hypothesis = 0.05 actual =  0 
Hypothesis = 0.25 actual =  0 
Hypothesis = 0.06 actual =  0 
Hypothesis = 0.07 actual =  0 
Hypothesis = 0.02 actual =  0 
Hypothesis = 0.02 actual =  0 
Hypothesis = 0.02 actual =  0 
Hypothesis = 0.05 actual =  0 
Hypothesis = 0.11 actual =  0 
Hypothesis = 0.17 actual =  0 
Hypothesis = 0.04 actual =  0 
Hypothesis = 0.03 actual =  0 
Hypothesis = 0.20 actual =  0 
Hypothesis = 0.05 actual =  0 
Hypothesis = 0.03 actual =  0 
Hypothesis = 0.00 actual =  0 
Hypothesis = 0.02 actual =  0 
Hypothesis = 0.05 actual =  0 
Hypothesis = 0.06 actual =  0 
Hypothesis = 0.01 actual =  0 
Hypothesis = 0.16 actual =  0 
Hypothesis = 0.02 actual =  0 
Hypothesis = 0.01 actual =  0 
Hypothesis = 0.12 actual =  0 
Hypothesis = 0.03 actual =  0 
Hypothesis = 0.31 actual =  0 
Hypothesis = 0.05 actual =  0 
Hypothesis = 0.06 actual =  0 
Hypothesis = 0.10 actual =  0 
Hypothesis = 0.06 actual =  0 
Hypothesis = 0.13 actual =  0 
Hypothesis = 0.16 actual =  0 
Hypothesis = 0.00 actual =  0 
Hypothesis = 0.00 actual =  0 
Hypothesis = 0.17 actual =  0 
Hypothesis = 0.14 actual =  0 
Hypothesis = 0.14 actual =  0 
Hypothesis = 0.17 actual =  0 
Hypothesis = 0.07 actual =  0 
Hypothesis = 0.07 actual =  0 
Hypothesis = 0.03 actual =  0 
Hypothesis = 0.78 actual =  0 
Hypothesis = 0.03 actual =  0 
Hypothesis = 0.03 actual =  0 
Hypothesis = 0.01 actual =  0 
Hypothesis = 0.20 actual =  0 
Hypothesis = 0.01 actual =  0 
Hypothesis = 0.05 actual =  0 
Hypothesis = 0.03 actual =  0 
Hypothesis = 0.09 actual =  0 
Hypothesis = 0.99 actual =  1 
Hypothesis = 0.92 actual =  1 
Hypothesis = 0.99 actual =  1 
Hypothesis = 0.99 actual =  1 
Hypothesis = 0.99 actual =  1 
Hypothesis = 0.91 actual =  1 
Hypothesis = 0.84 actual =  1 
Hypothesis = 0.88 actual =  1 
Hypothesis = 0.99 actual =  1 
Hypothesis = 0.79 actual =  1 
Hypothesis = 0.99 actual =  1 
Hypothesis = 0.87 actual =  1 
Hypothesis = 1.00 actual =  1 
Hypothesis = 0.96 actual =  1 
Hypothesis = 0.82 actual =  1 
Hypothesis = 0.98 actual =  1 
Hypothesis = 0.74 actual =  1 
Hypothesis = 0.96 actual =  1 
Hypothesis = 1.00 actual =  1 
Hypothesis = 0.97 actual =  1 
Hypothesis = 0.71 actual =  1 
Hypothesis = 0.97 actual =  1 
Hypothesis = 1.00 actual =  1 
Hypothesis = 0.97 actual =  1 
Hypothesis = 0.98 actual =  1 
Hypothesis = 0.98 actual =  1 
Hypothesis = 1.00 actual =  1 
Hypothesis = 0.99 actual =  1 
Hypothesis = 0.94 actual =  1 
Hypothesis = 0.97 actual =  1 
Hypothesis = 0.98 actual =  1 
Hypothesis = 0.98 actual =  1 
Hypothesis = 0.96 actual =  1 
Hypothesis = 0.98 actual =  1 
Hypothesis = 0.60 actual =  1 
Hypothesis = 0.55 actual =  1 
Hypothesis = 0.98 actual =  1 
Hypothesis = 1.00 actual =  1 
Hypothesis = 0.74 actual =  1 
Hypothesis = 0.96 actual =  1 
Hypothesis = 0.94 actual =  1 
Hypothesis = 0.93 actual =  1 
Hypothesis = 0.98 actual =  1 
Hypothesis = 0.94 actual =  1 
Hypothesis = 0.93 actual =  1 
Hypothesis = 0.79 actual =  1 
Hypothesis = 0.86 actual =  1 
Hypothesis = 0.97 actual =  1 
Hypothesis = 0.89 actual =  1 
Hypothesis = 0.91 actual =  1

Given the decision rule hθ(x) ≥ 0.5 → y = 1 and hθ(x) < 0.5 → y = 0, all but one of the 100 training examples (the hypothesis of 0.78 against an actual 0) are classified correctly, so our model has been trained successfully.
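A small sketch of applying that threshold to get hard predictions and an accuracy score (my addition, reusing the X, y, and theta variables from the sketches above):

predictions = (sigmoid(np.dot(X, theta)) >= 0.5).astype(int)  # 0.5 decision threshold
accuracy = (predictions == y).mean()
print(f'Training accuracy: {accuracy:.2f}')
# With the run reported above (one hypothesis of 0.78 against an actual 0),
# this would come out at 0.99.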

Usage

python irislogisticreg.py

Links