# Machine Learning Algorithm: Logistic Regression
Logistic regression is a classification technique used when the dependent variable is binary. It can be extended to a multi-class target variable via a one-vs-all strategy. It is used to describe data and to explain the relationship between one binary dependent variable and one or more features (independent variables).

## Assumptions
  * The dependent variable should be binary in nature, e.g. true / false.
  * There should be no outliers or anomalies in the source data.
  * There should be no high correlations among the input variables.
  * Logistic regression requires a large sample size.
  
The goal of logistic regression is to construct a model: a hypothesis that can be used to estimate Y based on X. We use linear regression as the starting point for our classification model. Given a target value (y) and a set of attribute values (x), the goal of linear regression is to find an equation that describes the target in terms of the attributes:
 y = θ0 + θ1x1 + θ2x2 + … + θnxn, where θj is the weight associated with attribute xj.
 
The hypothesis is given by the equation hθ(x) = θ0x0 + θ1x1 + ... + θnxn (with x0 = 1), where our goal is to choose θ0, θ1, ..., θn such that for each training example (x(i), y(i)), hθ(x(i)) is as close as possible to y(i) on average. We minimize the squared-error cost function J(θ0, θ1, ..., θn) to find the optimal hypothesis. There are two main ways to find θ0, θ1, ..., θn: <b>Closed Form Solution and Batch Gradient Descent.</b>
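
As a minimal sketch of the closed-form option for the linear hypothesis above, the least-squares weights can be obtained with NumPy. The helper name `fit_closed_form` and the toy data are illustrative assumptions and are not part of the project code.

```python
import numpy as np

def fit_closed_form(X, y):
    """Least-squares weights for h(x) = theta_0 + theta_1*x_1 + ... (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Prepend the intercept column x0 = 1.
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    # lstsq minimises ||Xb @ theta - y||^2 without explicitly inverting Xb.T @ Xb.
    theta, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return theta

# Toy data generated from y = 1 + 2*x1 - 3*x2 (assumed for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
y_demo = 1.0 + 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 1]
theta_closed = fit_closed_form(X_demo, y_demo)
```

Because the toy targets are noise-free, the recovered weights should match the generating coefficients up to floating-point error.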

## Batch Gradient Descent: 
The gradient descent algorithm starts with an initial set of values and iteratively moves toward a set of parameter values that minimize the function. In linear regression, how well the hypothesis fits the data is measured by the error (cost) function. If we minimize this function, we get the best-fitting line for our data. To minimize it, we need to differentiate the error function. In our case there are multiple input variables, and therefore multiple parameters, so we compute a partial derivative with respect to each of them.

<img src="data/update_weights.png" style="width: 300px;float:left"/>

Where, 
* J is a convex quadratic function. 
* The intuition behind the convergence is that the derivative of J(θ) approaches 0 as we approach the bottom of our convex function, so the update steps shrink. Provided the learning rate is small enough, gradient descent will arrive at the global minimum regardless of where we started.

  ### Features
  * General-purpose algorithm for finding a local minimum of any continuous differentiable function
  * Iterative; not as efficient as closed form
  * Applicable to many optimisation tasks
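
The iterative update can be sketched for the linear, squared-error case as follows. This is a minimal illustration under assumptions: the function name, the learning rate `alpha=0.1`, the iteration count and the toy data are all chosen for the example and do not come from the project.

```python
import numpy as np

def gradient_descent_linear(X, y, alpha=0.1, n_iter=2000):
    """Batch gradient descent on J(theta) = 1/(2m) * sum((X @ theta - y)^2)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    m = X.shape[0]
    Xb = np.hstack([np.ones((m, 1)), X])          # intercept column x0 = 1
    theta = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        gradient = Xb.T @ (Xb @ theta - y) / m    # all partial derivatives at once
        theta = theta - alpha * gradient          # simultaneous update of every weight
    return theta

# Toy data generated from y = 0.5 + 1.5*x1 - 2*x2 (assumed for illustration only).
rng = np.random.default_rng(1)
X_gd = rng.normal(size=(100, 2))
y_gd = 0.5 + 1.5 * X_gd[:, 0] - 2.0 * X_gd[:, 1]
theta_gd = gradient_descent_linear(X_gd, y_gd)
```

On this convex problem the iterates approach the same weights a closed-form solver would return, which is the behaviour the convergence argument above describes.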
  
## Worked Example

We now move on to our owl classification example. There are three types: Barn Owl, Snowy Owl and Long-Eared Owl. Our job is to build a classifier that distinguishes between them given several features. We have been given the file owls.csv. Each line in the file describes one owl found in Ireland: body-length, wing-length, body-width, wing-width, type. As this is a multi-class classification problem, we use the one-vs-all strategy to compute the results.

## Note: 
All the utility functions used in the classifier below are kept in /model/utils.py and /common/utils.py.

### Interactive UI: Classification_run.py

When you run the program via <B>classification_run.py</B>, you will be prompted to enter the filename, the column names and the target attribute.

<img src="console.png" style="width: 1000px;float:left"/>

### Pre-processing: SupervisedLearningTester.py

As part of pre-processing, it imports the source file and performs various operations using the Python pandas library.

In [0]:
def test_logistic_regression(file_path, column_names, target_column):
    print()
    print(Color.BOLD+Color.PURPLE + "==>Pre-processing source file:"+file_path+Color.END)

    df = pd.read_csv(file_path, header=0)
    
    #Assigning names to the columns as per the data set
    df.columns=column_names.split(",")
    #Re-numbering data set index from 1 (using numpy)
    df.index = np.arange(1, len(df) + 1)

    #Mapping data set into dependent attributes and class label attribute
    X = df.drop(target_column,axis=1)
    y = np.array(df[target_column])
    
    #Converting target class attributes into numerical ones.
    y = TestUtil.replace_target_class_values(y)
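
The label conversion in the last step can also be sketched with NumPy alone. The helper below is a hypothetical alternative to `TestUtil.replace_target_class_values`, not the project's implementation; it assigns integers in sorted label order, so the mapping is the same on every run.

```python
import numpy as np

def encode_labels(y):
    """Map class names to integers 0..k-1 in sorted order (hypothetical helper)."""
    classes, encoded = np.unique(np.asarray(y), return_inverse=True)
    class_map = {str(name): index for index, name in enumerate(classes)}
    return encoded, class_map

encoded, class_map = encode_labels(["SnowyOwl", "BarnOwl", "LongEaredOwl", "BarnOwl"])
```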

Now we are ready to feed our data to our hypothesis. Let's build the Logistic Regression model step by step.



### Hypothesis Representation
* In order to perform classification using logistic regression, we need to modify our hypothesis so that it gives values between 0 and 1. We use the sigmoid function (also known as the logistic function).

<img src="sigmoid.png" style="width: 200px;float:left"/><br><br><br><br><br><br>

<br><br>
When z approaches positive infinity, g(z) tends to 1. Similarly, when z approaches negative infinity, g(z) tends to 0.

### Sigmoid function Implementation:
* Takes the value z (the weighted sum of the inputs) and applies the logistic function.
* Output is always between 0 and 1.

In [1]:
def _logistic_function(self, z):
        i = 1 / (1 + np.exp(-z))
        return i
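
For very negative `z`, `np.exp(-z)` can overflow and emit a runtime warning. A possible overflow-free variant is sketched below; the name `stable_sigmoid` is an assumption for this example and the function is not part of the project code.

```python
import numpy as np

def stable_sigmoid(z):
    """Logistic function 1 / (1 + exp(-z)) evaluated without overflow."""
    z = np.asarray(z, dtype=float)
    # exp(-|z|) always lies in (0, 1], so it cannot overflow.
    e = np.exp(-np.abs(z))
    # z >= 0: 1 / (1 + e^-z); z < 0: e^z / (1 + e^z). Both are the same function.
    return np.where(z >= 0, 1.0 / (1.0 + e), e / (1.0 + e))

values = stable_sigmoid([-1000.0, 0.0, 1000.0])
```

The three sample inputs illustrate the limits described above: the output tends to 0 for large negative `z`, is 0.5 at `z = 0`, and tends to 1 for large positive `z`.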

### Hypothesis function
* Takes the input attributes (sparse matrix form) and the corresponding theta values.
* Invokes the logistic function on the weighted sum of each input attribute set.

In [None]:

def _hypothesis(self, X,theta_vector):
        z = 0.0
        for i in range(len(theta_vector)):
            z += X.item(i)*theta_vector.item(i)
        return self._logistic_function(z)
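
The same hypothesis can be evaluated for all rows at once with a matrix product. This is a sketch rather than the project's method; `hypothesis_vectorised` and the sample numbers are assumptions, and `X` is taken to already contain the intercept column.

```python
import numpy as np

def hypothesis_vectorised(X, theta):
    """Return sigmoid(X @ theta) for every row of X (X already contains x0 = 1)."""
    z = np.asarray(X, dtype=float) @ np.asarray(theta, dtype=float)
    return 1.0 / (1.0 + np.exp(-z))

X_rows = np.array([[1.0, 2.0, -1.0],
                   [1.0, 1.0, 1.0]])
theta_demo = np.array([0.0, 0.5, 1.0])
# First row gives z = 0, second row gives z = 1.5.
probabilities = hypothesis_vectorised(X_rows, theta_demo)
```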

### Cost Function
We cannot reuse the squared-error cost function of linear regression, because the logistic function makes the resulting cost surface wavy, with many local optima. In other words, it will not be a convex function.

Instead, our cost function for logistic regression looks like:
<br><img src="update_weights.png" style="width: 450px;float:left" />
<br><br><br><br><br><br><br><br><br>
If we take partial derivative of this cost function using calculus and apply it to gradient descent: <br><img src="gradient.png" style="width: 450px;float:left" />
<br><br><br><br><br><br><br>
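
Assuming the cost shown above is the standard cross-entropy (log-loss), J(θ) = -(1/m) Σ [ y(i) log hθ(x(i)) + (1 - y(i)) log(1 - hθ(x(i))) ], it could be computed as in the sketch below. The function name, the clipping constant `eps` and the sample data are assumptions made for this illustration.

```python
import numpy as np

def logistic_cost(X, y, theta, eps=1e-12):
    """Average cross-entropy cost J(theta); X already contains the intercept column."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    # Clip so log() stays finite if a prediction saturates at 0 or 1 (eps is an assumption).
    h = np.clip(h, eps, 1.0 - eps)
    return -np.mean(y * np.log(h) + (1.0 - y) * np.log(1.0 - h))

X_cost = np.array([[1.0, 0.5], [1.0, -1.5], [1.0, 2.0]])
y_cost = np.array([1, 0, 1])
# With all weights at zero every prediction is 0.5, so the cost equals log(2).
cost_at_zero = logistic_cost(X_cost, y_cost, np.zeros(2))
```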

### Batch Gradient Implementation:

In [0]:
def _batch_gradient_descent(self, X,y,j,theta_vector):
        sum = 0
        for i in range(len(y)):
            sum+=(self._hypothesis(X[i],theta_vector)-y[i])*X[i][j]
        return sum/len(y)
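
The per-coordinate loop above computes (1/m) Σ (hθ(x(i)) - y(i)) x(i)j for one j at a time. A vectorised sketch that returns every partial derivative in one call is given below, together with a loop version used only to check that the two agree. Both function names and the sample data are assumptions for this example.

```python
import numpy as np

def gradient_vectorised(X, y, theta):
    """All partial derivatives dJ/dtheta_j at once: X^T (h - y) / m."""
    h = 1.0 / (1.0 + np.exp(-(X @ theta)))
    return X.T @ (h - y) / len(y)

def gradient_loop(X, y, j, theta):
    """Per-coordinate version mirroring _batch_gradient_descent above."""
    total = 0.0
    for i in range(len(y)):
        h_i = 1.0 / (1.0 + np.exp(-np.dot(X[i], theta)))
        total += (h_i - y[i]) * X[i][j]
    return total / len(y)

X_g = np.array([[1.0, 0.2, -0.4], [1.0, 1.5, 0.3], [1.0, -0.7, 0.9]])
y_g = np.array([0.0, 1.0, 1.0])
theta_g = np.array([0.1, -0.2, 0.3])
grad = gradient_vectorised(X_g, y_g, theta_g)
```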

### Classifier - linear_model.py: LogisticRegression()
It has four default parameters:
* alpha (learning rate): 0.009
* tol (tolerance level): 10^(-3)
* maxiter (maximum iteration): 2000
* multi_class (multi class notifier): auto

In [None]:
class LogisticRegression:
    def __init__(self, alpha=0.009, tol=1e-3, maxiter=2000, multi_class="auto"):
        self.alpha = alpha
        self.tol = tol
        self.maxiter = maxiter
        self.multi_class = multi_class
        self.nd_theta_vector = None


### fit method

* The fit method accepts input attribute(sparse matrix) and target attribute(array-like structure)
* It converts input and target attributes into the array-like structure.
* Then it adds intercept (X0=1) to the input matrix.
* Finally on the basis of multi_class variable it calls the respective fit method.

### fit_binary_model
* Initializes the theta vector to zeros and the delta vector to ones (so that the initial tolerance check passes).
* Then it updates every theta value using _batch_gradient_descent, which computes the gradient of the cost function, in a loop of at most maxiter iterations.
* Finally, when the loop ends, the theta vector is appended to the stored list of hypotheses.

### fit_multinomial_model
* It is called in case of a multi-class classification problem.
* It maps the target variable into a binary classification problem (assigning 1 and 0 accordingly) via the one-vs-all strategy.
* It calls fit_binary_model once for each target class.

In [None]:
    def fit(self, X, y):
        self.nd_theta_vector = None
        X = ArrayUtil.convert_to_array(X)
        y = ArrayUtil.convert_to_array(y)
        X = LinearUtil.append_intercept(X)
        self.multi_class = LinearUtil.check_target_class_type(self.multi_class, y)

        if self.multi_class == "binary":
            self._fit_binary_model(X, y)
        elif self.multi_class == "multinomial":
            self._fit_multinomial_model(X, y)
        else:
            raise InvalidInputException(Color.BOLD+Color.RED+'Invalid attribute value:multi_class='+self.multi_class+'. Please enter value as {auto, binary or multinomial}'+Color.END)

    def _fit_multinomial_model(self, X, y):
        self.target_class_vector = set(y)
        for i in self.target_class_vector:
            modified_y = np.array([1 if o == i else 0 for o in y])
            self._fit_binary_model(X, modified_y)
        self.plot_graph(X,y)

    def _fit_binary_model(self, X, y):

        theta_vector = ArrayUtil.create_zero_vector(X)
        delta_vector = ArrayUtil.create_one_vector(X)
        itr = 0
        while LinearUtil.check_tolerance(delta_vector,self.tol) and itr < self.maxiter:

            for j in range(len(theta_vector)):
                theta = theta_vector[j] - (self.alpha * self._batch_gradient_descent(X, y, j, theta_vector))
                delta_vector[j] = theta

            theta_vector = delta_vector
            itr = itr + 1

        if self.nd_theta_vector is None:
            self.nd_theta_vector = theta_vector
        else:
            self.nd_theta_vector = np.vstack([self.nd_theta_vector, theta_vector])
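
In the loop above, `theta_vector = delta_vector` makes both names refer to the same array after the first pass, and the tolerance check inspects the weight values rather than how much they changed, which is consistent with every run in the console log stopping at `maxiter`. One possible alternative, with a simultaneous update and a stopping rule based on the step size, is sketched below. It is an illustration under assumptions (the function name, the stopping rule and the toy data are not from the project) and it is not the code that produced the reported results.

```python
import numpy as np

def fit_binary(X, y, alpha=0.009, tol=1e-3, maxiter=2000):
    """Batch gradient descent for one binary model; X already contains x0 = 1."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    theta = np.zeros(X.shape[1])
    for iteration in range(1, maxiter + 1):
        h = 1.0 / (1.0 + np.exp(-(X @ theta)))
        step = alpha * (X.T @ (h - y)) / len(y)
        theta = theta - step               # new array each pass, so no aliasing
        if np.max(np.abs(step)) < tol:     # stop once no weight moves by more than tol
            break
    return theta, iteration

# Tiny separable example (assumed data): negative x -> class 0, positive x -> class 1.
X_toy = np.array([[1.0, -2.0], [1.0, -1.0], [1.0, 1.0], [1.0, 2.0]])
y_toy = np.array([0, 0, 1, 1])
theta_fit, n_iter = fit_binary(X_toy, y_toy, alpha=0.5, tol=1e-4)
```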

### predict
* It is called when the multi_class variable is binary.
* It pre-processes the test data by converting it to an array-like structure and adding the intercept.
* It predicts the target attribute by applying the obtained theta values to the test data.
* The process continues until all the test rows are processed.
* Finally, predicted values are mapped via the threshold (y=1 if predicted>=0.5, y=0 if predicted<0.5).

### predict_multinomial
* It is called when the multi_class variable is multinomial.
* It pre-processes the test data by converting it to an array-like structure and adding the intercept.
* Predicted values from all the different hypotheses are stored in a temporary list.
* It finds the hypothesis with the maximum probability in the temporary list and assigns the corresponding class to the resultant vector.
* The process continues in a loop until all the test rows are processed.

In [None]:
    def predict(self, X, threshold=0.5):

        if self.multi_class == "multinomial":
            return self.predict_multinomial(X)

        X = ArrayUtil.convert_to_array(X)
        X = LinearUtil.append_intercept(X)
        predicted_values=[]
        for i in range(X.shape[0]):
            predicted_values.append(self._hypothesis(X[i], self.nd_theta_vector))
        return np.array([1 if o>=threshold else 0 for o in predicted_values])

    def predict_multinomial(self, X):

        X = ArrayUtil.convert_to_array(X)
        X = LinearUtil.append_intercept(X)

        result_array = np.zeros(X.shape[0])
        for i in range(X.shape[0]):
            temp_predict_list = []
            for theta in self.nd_theta_vector:
                temp_predict_list.append(self._hypothesis(X[i], theta))
            result_array[i] = list(self.target_class_vector)[np.argmax(temp_predict_list)]
        return np.array(result_array)
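
The one-vs-all decision rule can also be written without explicit loops. The sketch below assumes one row of weights per class, stored in the same order as `classes`; the function name and the demonstration weights are illustrative and are not taken from the trained model.

```python
import numpy as np

def predict_one_vs_all(X, thetas, classes):
    """Pick, for each row, the class whose binary model gives the highest probability."""
    X = np.asarray(X, dtype=float)
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])                   # add intercept column
    scores = 1.0 / (1.0 + np.exp(-(Xb @ np.asarray(thetas).T)))     # (n_samples, n_classes)
    return np.asarray(classes)[np.argmax(scores, axis=1)]

# Illustrative weights (one row per class), not learned from the owl data.
thetas_demo = np.array([[0.0, 1.0, 0.0],
                        [0.0, -1.0, 0.0],
                        [0.0, 0.0, 1.0]])
classes_demo = np.array([0, 1, 2])
X_new = np.array([[3.0, 0.0], [-3.0, 0.0], [0.0, 3.0]])
predicted = predict_one_vs_all(X_new, thetas_demo, classes_demo)
```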

### Testing Program: SupervisedLearningTester.py
* Create a model of the type LogisticRegression()
* Iteratively split the data into 2/3 for training and 1/3 for testing, shuffling the data every time.
* Call the predict function on the 1/3 test data.
* Predicted values are then used to compute the accuracy and confusion matrix for each iteration.
* Finally, compute the average accuracy over all 10 iterations.

In [2]:
model = LogisticRegression()
lr_accuracy_score = []  # list to store accuracy for all 10 combination of data sets.
for itr in range(10):
    # train_test_split takes the lowercase keyword `shuffle`
    X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.33, shuffle=True)
    model.fit(X_train, Y_train)
    y_pred = model.predict(X_test)
    accuracy=LinearUtil.calculate_accuracy(Y_test, y_pred)
    print(Color.BOLD+Color.PURPLE + "Fold "+str(itr+1)+" Accuracy: "+str(accuracy)+Color.END)
    lr_accuracy_score.append(accuracy)
    LinearUtil.confusion_matrix(Y_test, y_pred)

print()

print(Color.BOLD+Color.PURPLE + "Average accuracy of a model after 10 fold is " + str(sum(lr_accuracy_score) / len(lr_accuracy_score))+Color.END)
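
The loop above draws ten independent random 2/3 and 1/3 splits, so the same sample can appear in several test sets. If disjoint test folds are wanted instead (k-fold cross-validation), the indices could be partitioned as in the sketch below. The generator name `k_fold_indices`, the seed and the sample count of 23 are assumptions chosen for this illustration.

```python
import numpy as np

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs whose test parts partition range(n_samples)."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    folds = np.array_split(shuffled, k)      # k nearly equal, disjoint chunks
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# 23 samples and 5 folds are arbitrary numbers used only for the demonstration.
splits = list(k_fold_indices(23, k=5))
```

Each pair of index arrays can then be used to slice `X` and `y` before calling `fit` and `predict`, in place of the repeated `train_test_split` call.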

## Observations and Test Results: (for entire logs, please see Appendix console.log)
* Results as taken from PyCharm_2018.1.4(Python IDE) console window.
* Entire console log is attached with this report.
* It displays theta values for the 3 hypotheses (one per target class) found by the model.

<img src="observations.png" style="width: 600px;float:left"/>
<br><br><br><br><br><br><br><br><br><br><br>

### These are the final 3 weights (3 class problem):

Finished after  2000  iterations: theta= [ 0.14526234 -0.6985769   0.00495833  0.39798835 -0.23206944]

Finished after  2000  iterations: theta= [ 0.21249024  1.14925333  0.34541407 -1.83277407 -0.80193038]

Finished after  2000  iterations: theta= [-0.45363001 -1.02217097 -0.91944209  1.4997036   1.05582103]

### Confusion Matrix 
We can read the confusion matrix as follows (rows are actual classes, columns are predicted classes; values are from fold 1 of the console log):
 * The number of instances predicted as class 0 is the sum of the values in the 0 column (14).
 * The number of instances predicted as class 1 is the sum of the values in the 1 column (13).
 * The number of instances predicted as class 2 is the sum of the values in the 2 column (3+15 = 18).
 * The correct predictions lie on the diagonal from top-left to bottom-right of the matrix (14+13+15 = 42 of 45).
 * All errors were made by predicting actual class 0 instances as class 2 (3 cases); classes 1 and 2 were always classified correctly.

<img src="confusion_matrix.png" style="width: 300px;float:left"/>
<br><br><br><br><br><br><br><br><br><br><br><br><br><br><br>
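
The macro-averaged figures reported for fold 1 can be derived directly from its confusion matrix. The sketch below copies the fold 1 matrix from the console log and uses NumPy only; the variable names are chosen for this example.

```python
import numpy as np

# Rows are actual classes, columns are predicted classes (fold 1 in the console log).
cm = np.array([[14, 0, 3],
               [0, 13, 0],
               [0, 0, 15]])

correct = np.diag(cm)
accuracy = correct.sum() / cm.sum()
precision = correct / cm.sum(axis=0)    # per predicted class (column sums)
recall = correct / cm.sum(axis=1)       # per actual class (row sums)
f1 = 2 * precision * recall / (precision + recall)

# Macro average: unweighted mean over the three classes.
macro_precision = precision.mean()
macro_recall = recall.mean()
macro_f1 = f1.mean()
```

These values agree with the accuracy, precision, recall and F-score printed for fold 1 in the console log.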

### Summary: Prediction Accuracy
* Average: 0.8800
* Standard Deviation: 0.0669
* Median: 0.8999

* Only for iteration 1 (macro average):
    * Precision: 0.94444444444444453
    * Recall: 0.94117647058823517
    * F_score: 0.93743890518084072

<b>With a learning rate of 0.009 and 2000 iterations, the classifier reached an average accuracy of 0.88 over the 10 runs. Gradient descent only minimizes the cost if the learning rate is low enough, so if we trained our model with a smaller learning rate and more iterations, we would expect to find approximately equal weights.</b>

## References
* Scikit-learn packages and its dependencies. http://scikit-learn.org Accessed 1 Dec, 2018 
* Matplotlib library. https://matplotlib.org/contents.html Accessed 1 Dec, 2018
* Pandas Library. http://pandas.pydata.org/pandas-docs/stable/ Accessed 1 Dec, 2018
* Coursera Machine Learning course: https://www.coursera.org/learn/machine-learning/home/welcome Accessed 1 Dec, 2018

<br><br><br><br><br><br><br><br><br><br>
## Appendix
### classification_run.py

In [None]:
'''
Starting point for classification program.
Run this file to execute Logistic Regression.
'''
from tester.SupervisedLearningTester import test_logistic_regression
from common.utils import Color

print (Color.BOLD+Color.PURPLE + "-----------------------------------------------------------------------------------------------------------------------------"+Color.END)
print (Color.BOLD+Color.PURPLE + "Demonstration of supervised classification based Machine Learning using Logistic Regression. v1.0 - Dhaval Salwala (18230845)"+Color.END)
print (Color.BOLD+Color.PURPLE + "Part of Continuous Assessment - Machine Learning and Data Mining [CT475], National University Of Ireland Galway"+Color.END)
print (Color.BOLD+Color.PURPLE + "Tutor: Professor Michael Madden"+Color.END)
print (Color.BOLD+Color.PURPLE + "-----------------------------------------------------------------------------------------------------------------------------"+Color.END)
print ()
print (Color.BOLD+Color.RED+"==> Please enter the absolute path of the classification data file. You can use one of the sample file present"
                 " in the data directory of the project. \nMake sure all the attributes are numeric except for the target attribute."+Color.END)
print()
file_path = input(Color.BOLD+"1. Enter Absolute File Path (Valid Extension: .xls .xlsx .txt .csv) :"+Color.END)
column_names = input(Color.BOLD+"2. Enter Column names (including target column) in the same order as in file seperated by comma without any space :"+Color.END)
target_column = input(Color.BOLD+"3. Enter Target column name :"+Color.END)

print ()
print (Color.BOLD+Color.PURPLE + "==>Initialising Supervised Learning... Preparing for launch..."+Color.END)
test_logistic_regression(file_path,column_names,target_column)

### tester/SupervisedLearningTester.py

In [None]:
'''
Module to test your classification program.
'''
import numpy as np
import pandas as pd
from common.utils import Color
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_fscore_support
from common.utils import TestUtil
from model.utils import LinearUtil
from model.classification.linear_model import Logistic_Regression


def test_logistic_regression(file_path, column_names, target_column):
    print()
    print(Color.BOLD+Color.PURPLE + "==>Pre-processing source file:"+file_path+Color.END)

    df = pd.read_csv(file_path, header=0) # reading from file
    df.columns=column_names.split(",") # Assigning names to the columns as per the data set
    df.index = np.arange(1, len(df) + 1) # Re-numbering data set index from 1 (using numpy)

    # Mapping data set into dependent attributes and class label attribute
    X = df.drop(target_column,axis=1)
    y = np.array(df[target_column])
    # Converting target class attributes into numerical ones.
    y = TestUtil.replace_target_class_values(y)


    print(Color.BOLD+Color.PURPLE + "Applying 10 Fold Cross Validation."+Color.END)
    # try block to catch any error
    try:
        model = Logistic_Regression()
        lr_accuracy_score = []  # list to store accuracy for all 10 combination of data sets.
        for itr in range(10):
            X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.33, shuffle=True) # splits data into train/test
            model.fit(X_train, Y_train) # fit model to train data
            y_pred = model.predict(X_test) # predict values on train model
            accuracy=LinearUtil.calculate_accuracy(Y_test, y_pred) #calculate accuracy
            print(Color.BOLD+Color.PURPLE + "Fold "+str(itr+1)+" Accuracy: "+str(accuracy)+Color.END)
            lr_accuracy_score.append(accuracy)
            print ("Precision, Recall and F_score: "+str(precision_recall_fscore_support(Y_test, y_pred, average='macro')))
            LinearUtil.confusion_matrix(Y_test, y_pred) # preparing confusion matrix
    except Exception as e:
        print("ERROR: " + str(e))

    print()
    #computing average accuracy of all the 10 iterations.
    print(Color.BOLD+Color.PURPLE + "Average accuracy of a model after 10 fold is " + str(sum(lr_accuracy_score) / len(lr_accuracy_score))+Color.END)

### model/classification/linear_model.py

In [None]:
'''
Linear model class containing Logistic_Regression
Implementation of linear algorithms
'''
import numpy as np
from common.utils import ArrayUtil
from model.utils import LinearUtil
from common.utils import Color
from model.exception.linear_exception import InvalidInputException

class Logistic_Regression:
    def __init__(self, alpha=0.009, tol=1e-3, maxiter=2000, multi_class="auto"):
        # Initialising all default values
        self.alpha = alpha
        self.tol = tol
        self.maxiter = maxiter
        self.multi_class = multi_class
        self.nd_theta_vector = None

    # fit model to train data
    def fit(self, X, y):
        self.nd_theta_vector = None
        X = ArrayUtil.convert_to_array(X) # converting train data to array-like
        y = ArrayUtil.convert_to_array(y) # converting train data to array-like
        X = LinearUtil.append_intercept(X) # adding intercept
        self.multi_class = LinearUtil.check_target_class_type(self.multi_class, y)

        if self.multi_class == "binary":
            self._fit_binary_model(X, y)
        elif self.multi_class == "multinomial":
            self._fit_multinomial_model(X, y)
        else:
            raise InvalidInputException(Color.BOLD+Color.RED+'Invalid attribute value:multi_class='+self.multi_class+'. Please enter value as {auto, binary or multinomial}'+Color.END)

    # fit multinomial model
    def _fit_multinomial_model(self, X, y):
        self.target_class_vector = set(y)
        for i in self.target_class_vector:
            modified_y = np.array([1 if o == i else 0 for o in y])
            self._fit_binary_model(X, modified_y)

    # fit binary model
    def _fit_binary_model(self, X, y):

        theta_vector = ArrayUtil.create_zero_vector(X)
        delta_vector = ArrayUtil.create_one_vector(X)
        itr = 0
        while LinearUtil.check_tolerance(delta_vector,self.tol) and itr < self.maxiter:

            # calling _batch_gradient_descent iteratively  until all the theta values have been evaluated.
            for j in range(len(theta_vector)):
                theta = theta_vector[j] - (self.alpha * self._batch_gradient_descent(X, y, j, theta_vector))
                delta_vector[j] = theta

            # assigning delta back to theta vector
            theta_vector = delta_vector
            itr = itr + 1

        # storing theta values for each of trained hypothesis.
        if self.nd_theta_vector is None:
            self.nd_theta_vector = theta_vector
        else:
            self.nd_theta_vector = np.vstack([self.nd_theta_vector, theta_vector])

        print("\nFinished after ", itr, " iterations: theta=", theta_vector)

    def _batch_gradient_descent(self, X, y, j, theta_vector):
        sum = 0
        for i in range(len(y)):
            sum += (self._hypothesis(X[i], theta_vector) - y[i]) * X[i][j]
        return sum / len(y)

    # logistic regression hypothesis
    def _hypothesis(self, X, theta_vector):
        z = 0.0
        for i in range(len(theta_vector)):
            z += X.item(i) * theta_vector.item(i)
        return self._logistic_function(z)

    # sigmoid function
    def _logistic_function(self, z):
        i = 1 / (1 + np.exp(-z))
        return i

    # predict values for binary model
    def predict(self, X, threshold=0.5):

        if self.multi_class == "multinomial":
            return self.predict_multinomial(X)

        X = ArrayUtil.convert_to_array(X)
        X = LinearUtil.append_intercept(X)
        predicted_values=[]
        for i in range(X.shape[0]):
            predicted_values.append(self._hypothesis(X[i], self.nd_theta_vector))
        return np.array([1 if o>=threshold else 0 for o in predicted_values]) # applying threshold before returning array.

    # predict values for multinomial model
    def predict_multinomial(self, X):

        X = ArrayUtil.convert_to_array(X)
        X = LinearUtil.append_intercept(X)

        result_array = np.zeros(X.shape[0])
        for i in range(X.shape[0]):
            temp_predict_list = []
            for theta in self.nd_theta_vector:
                temp_predict_list.append(self._hypothesis(X[i], theta)) # iteratively call hypothesis for each of theta values.
            result_array[i] = list(self.target_class_vector)[np.argmax(temp_predict_list)] # finding max of all hypothesis prediction
        return np.array(result_array)

### model/utils.py

In [None]:
'''
Utility Class for linear models
'''
import numpy as np
from common.utils import Color
from model.exception.linear_exception import InvalidInputException
import pandas as pd
import matplotlib.pyplot as plt

class LinearUtil:

    # check tolerance level of theta values
    @staticmethod
    def check_tolerance(delta_vector,tol):
        stop_loop = False
        for x in np.nditer(delta_vector):
            if abs(x) > tol:
                stop_loop = True
                break
        return stop_loop

    # add intercept to input attributes
    @staticmethod
    def append_intercept(X):
        x0 = np.ones((X.shape[0], 1))
        return np.concatenate((x0, X), axis=1)

    # check target type class value
    @staticmethod
    def check_target_class_type(multi_class, y):
        if multi_class=="auto":
            if len(set(y)) > 2:
                return "multinomial"
            else:
                return "binary"
        else:
            return multi_class

    # calculate accuracy of predicted results
    @staticmethod
    def calculate_accuracy(Y_test, y_pred):

        if len(Y_test)!=len(y_pred):
            raise InvalidInputException(
                Color.BOLD + Color.RED + 'Both input should be of same length.' + Color.END)
        else:
            return (Y_test == y_pred).sum() / float(len(Y_test))

    # create confusion matrix on predicted values
    @staticmethod
    def confusion_matrix(Y_test, y_pred):
        y_actu = pd.Series(Y_test, name='Actual')
        y_pred = pd.Series(y_pred, name='Predicted')
        df_confusion = pd.crosstab(y_actu, y_pred)
        print ("Confusion Matrix: "+str(df_confusion))
        plt.matshow(df_confusion, cmap=plt.cm.gray_r)
        plt.colorbar()
        tick_marks = np.arange(len(df_confusion.columns))
        plt.xticks(tick_marks, df_confusion.columns, rotation=45)
        plt.yticks(tick_marks, df_confusion.index)
        plt.ylabel(df_confusion.index.name)
        plt.xlabel(df_confusion.columns.name)
        plt.show()

### common/utils.py

In [None]:
'''
Common utils to be used by all services
'''
import numpy as np


class ArrayUtil:

    @staticmethod
    def convert_to_array(X):
        if type(X) is not np.ndarray:
            return np.array(X)
        else:
            return X

    @staticmethod
    def create_zero_vector(X):
        return np.zeros(X.shape[1])

    @staticmethod
    def create_one_vector(X):
        return np.ones(X.shape[1])


class TestUtil:

    # convert target attribute into numerical values
    @staticmethod
    def convert_target_column(values):
        class_map = {}
        i = 0
        for val in values:
            class_map[val] = i
            i = i + 1
        print("Converting target variable into numerical form with the following mapping.")
        print (class_map)
        print()
        return class_map

    # Map target attribute value to its corresponding numerical value.
    @staticmethod
    def replace_target_class_values(y):
        class_map = TestUtil.convert_target_column(set(y))
        y = list(y)
        for i in range(len(y)):
            y[i] = class_map[y[i]]
        return np.array(y)

class Color:
   PURPLE = '\033[95m'
   CYAN = '\033[96m'
   DARKCYAN = '\033[36m'
   BLUE = '\033[94m'
   GREEN = '\033[92m'
   YELLOW = '\033[93m'
   RED = '\033[91m'
   BOLD = '\033[1m'
   UNDERLINE = '\033[4m'
   END = '\033[0m'

### model/exception/linear_exception.py

In [None]:
'''
Exception class for Linear models
'''
class InvalidInputException(Exception):

    def __init__(self, value):
        self.parameter = value

    def __str__(self):
        return repr(self.parameter)

## Console Logs
Demonstration of supervised classification based Machine Learning using Logistic Regression. v1.0 - Dhaval Salwala (18230845)
Part of Continuous Assessment - Machine Learning and Data Mining [CT475], National University Of Ireland Galway
#### Tutor: Professor Michael Madden
-----------------------------------------------------------------------------------------------------------------------------

==> Please enter the absolute path of the classification data file. You can use one of the sample file present in the data directory of the project. 
Make sure all the attributes are numeric except for the target attribute.

1. Enter Absolute File Path (Valid Extension: .xls .xlsx .txt .csv) :/home/dsalwala/owls.csv
2. Enter Column names (including target column) in the same order as in file seperated by comma without any space :body-length,wing-length,body-width,wing-width,type
3. Enter Target column name :type

==>Initialising Supervised Learning... Preparing for launch...

==>Pre-processing source file:/home/dsalwala/owls.csv
Converting target variable into numerical form with the following mapping.
{'BarnOwl': 0, 'LongEaredOwl': 1, 'SnowyOwl': 2}

Applying 10 Fold Cross Validation.

Finished after  2000  iterations: theta= [ 0.02715753 -0.6842089   0.00638416  0.42579352 -0.35747045]

Finished after  2000  iterations: theta= [ 0.22666919  1.1001088   0.37835726 -1.8160435  -0.79307123]

Finished after  2000  iterations: theta= [-0.38342875 -1.00782828 -0.91257651  1.47442589  1.11500378]
Fold 1 Accuracy: 0.933333333333
Precision, Recall and F_score: (0.94444444444444453, 0.94117647058823517, 0.93743890518084072, None)
Confusion Matrix: Predicted  0.0  1.0  2.0
Actual                  
0           14    0    3
1            0   13    0
2            0    0   15

Finished after  2000  iterations: theta= [ 0.16848991 -0.66287538  0.08391121  0.26672085 -0.25977649]

Finished after  2000  iterations: theta= [ 0.18746899  1.13338041  0.28596034 -1.76368468 -0.79024594]

Finished after  2000  iterations: theta= [-0.4509548  -1.04976882 -0.91335917  1.52175478  1.06690595]
Fold 2 Accuracy: 0.955555555556
Precision, Recall and F_score: (0.95833333333333337, 0.94444444444444453, 0.94747474747474758, None)
Confusion Matrix: Predicted  0.0  1.0  2.0
Actual                  
0           10    0    2
1            0   19    0
2            0    0   14

Finished after  2000  iterations: theta= [ 0.11150799 -0.78957977  0.10219165  0.40857251 -0.27411633]

Finished after  2000  iterations: theta= [ 0.22087141  1.14275363  0.32700018 -1.82601715 -0.79805189]

Finished after  2000  iterations: theta= [-0.44589537 -0.95944881 -0.95229227  1.46196264  1.05619498]
Fold 3 Accuracy: 0.977777777778
Precision, Recall and F_score: (0.96296296296296291, 0.98333333333333339, 0.97184514831573654, None)
Confusion Matrix: Predicted  0.0  1.0  2.0
Actual                  
0            8    0    0
1            0   17    0
2            1    0   19

Finished after  2000  iterations: theta= [ 0.06630102 -0.83899381  0.05199899  0.4335272  -0.32723222]

Finished after  2000  iterations: theta= [ 0.22907638  1.15960626  0.3583813  -1.81893984 -0.80271193]

Finished after  2000  iterations: theta= [-0.4180923  -0.89312578 -0.92224671  1.41984154  1.13787278]
Fold 4 Accuracy: 0.888888888889
Precision, Recall and F_score: (0.9242424242424242, 0.90740740740740744, 0.90350151640474208, None)
Confusion Matrix: Predicted  0.0  1.0  2.0
Actual                  
0           13    0    5
1            0   10    0
2            0    0   17

Finished after  2000  iterations: theta= [ 0.01363075 -0.65617528  0.03333821  0.30711149 -0.27115646]

Finished after  2000  iterations: theta= [ 0.23751678  1.12892166  0.34321894 -1.79592148 -0.81199061]

Finished after  2000  iterations: theta= [-0.39102498 -1.0515347  -0.90235013  1.53594973  1.0855108 ]
Fold 5 Accuracy: 0.777777777778
Precision, Recall and F_score: (0.84848484848484851, 0.82456140350877194, 0.78291316526610633, None)
Confusion Matrix: Predicted  0.0  1.0  2.0
Actual                  
0            9    0   10
1            0   14    0
2            0    0   12

Finished after  2000  iterations: theta= [ 0.12901805 -0.72077762 -0.01640336  0.44920296 -0.260281  ]

Finished after  2000  iterations: theta= [ 0.22576609  1.16088251  0.35168483 -1.83130534 -0.84405637]

Finished after  2000  iterations: theta= [-0.4614179  -0.99790431 -0.88066931  1.44199976  1.10652934]
Fold 6 Accuracy: 0.933333333333
Precision, Recall and F_score: (0.95238095238095244, 0.93333333333333324, 0.93732193732193725, None)
Confusion Matrix: Predicted  0.0  1.0  2.0
Actual                  
0           12    0    3
1            0   12    0
2            0    0   18

Finished after  2000  iterations: theta= [ 0.10586365 -0.67514884  0.02826976  0.32783773 -0.25471861]

Finished after  2000  iterations: theta= [ 0.21964631  1.09587127  0.3510795  -1.80797558 -0.79280022]

Finished after  2000  iterations: theta= [-0.44084069 -1.01633824 -0.89230219  1.50916016  1.03571636]
Fold 7 Accuracy: 0.866666666667
Precision, Recall and F_score: (0.89473684210526316, 0.875, 0.86057692307692302, None)
Confusion Matrix: Predicted  0.0  1.0  2.0
Actual                  
0           10    0    6
1            0   16    0
2            0    0   13

Finished after  2000  iterations: theta= [ 0.10814145 -0.7651334   0.02483855  0.32401414 -0.25328986]

Finished after  2000  iterations: theta= [ 0.22012109  1.1592874   0.36416002 -1.81712733 -0.81614572]

Finished after  2000  iterations: theta= [-0.4305889  -0.9709106  -0.92936261  1.55106778  1.06421355]
Fold 8 Accuracy: 0.644444444444
Precision, Recall and F_score: (0.8160919540229884, 0.75757575757575746, 0.68253968253968245, None)
Confusion Matrix: Predicted  0.0  1.0  2.0
Actual                  
0            6    0   16
1            0   10    0
2            0    0   13

Finished after  2000  iterations: theta= [ 0.07936323 -0.55912504 -0.08007645  0.38237127 -0.29358472]

Finished after  2000  iterations: theta= [ 0.21751874  1.11278943  0.35247066 -1.77146703 -0.79157893]

Finished after  2000  iterations: theta= [-0.41097308 -1.09579174 -0.81509911  1.44421534  1.07878783]
Fold 9 Accuracy: 0.866666666667
Precision, Recall and F_score: (0.88235294117647056, 0.89473684210526316, 0.86607142857142849, None)
Confusion Matrix: Predicted  0.0  1.0  2.0
Actual                  
0           13    0    6
1            0   15    0
2            0    0   11

Finished after  2000  iterations: theta= [ 0.14526234 -0.6985769   0.00495833  0.39798835 -0.23206944]

Finished after  2000  iterations: theta= [ 0.21249024  1.14925333  0.34541407 -1.83277407 -0.80193038]

Finished after  2000  iterations: theta= [-0.45363001 -1.02217097 -0.91944209  1.4997036   1.05582103]
Fold 10 Accuracy: 0.955555555556
Precision, Recall and F_score: (0.96666666666666667, 0.95238095238095244, 0.95681511470985148, None)
Confusion Matrix: Predicted  0.0  1.0  2.0
Actual                  
0           12    0    2
1            0   13    0
2            0    0   18

Average accuracy of a model after 10 fold is 0.88

Process finished with exit code 0