# Logistic Regression

Logistic regression, despite its name, is a classification algorithm rather than a regression algorithm. It models the relationship between a categorical dependent variable and one or more independent variables by estimating the probability that an event occurs, using the logistic function.
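
As a rough illustration of the logistic function mentioned above, the sketch below turns a linear score into a probability. The helper name `sigmoid`, the weights `w`, the bias `b` and the sample `x` are made-up values for illustration only and are not part of scikit-learn.

```python
import numpy as np

def sigmoid(z):
    """Logistic function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical weights and bias for a two-feature model (illustration only)
w = np.array([0.8, -0.5])
b = 0.1

# A single made-up sample
x = np.array([2.0, 1.0])

# Linear score first, then squashed into a probability for the positive class
z = np.dot(w, x) + b
p = sigmoid(z)

print('score : ', z)
print('probability : ', p)
```

A score of zero maps to a probability of 0.5, so the sign of the linear score decides which side of the decision boundary a sample falls on.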

In [None]:
# importing numpy
import numpy as np

# importing Logistic regression model
from sklearn.linear_model import LogisticRegression

## Example 

### Data

In [None]:
# dataset
from sklearn.datasets import load_iris

# loading dataset
X, Y = load_iris(return_X_y = True)

# printing to see elements
print('X values : \n', X[:5])
print('Y values : \n', Y[:5])

### Creating a Train test split

In [None]:
# importing train test split
from sklearn.model_selection import train_test_split

# creating a train-test split
X_train, X_test, Y_train, Y_test = train_test_split( X, Y, 
                    test_size = 0.4, random_state = 1 )

# printing size of train and test data

print('X_train : ', X_train.shape)
print('X_test : ', X_test.shape)

print('Y_train : ', Y_train.shape)
print('Y_test : ', Y_test.shape)

### Creating a Logistic regression Object

The **"solver"** parameter specifies which algorithm is used to solve the optimization problem. 

* liblinear − It is a good choice for small datasets. It handles both L1 and L2 penalties. For multiclass problems, it is limited to one-versus-rest schemes.

* newton-cg − For multiclass problems, it handles multinomial loss. It supports only the L2 penalty.

* lbfgs − For multiclass problems, it handles multinomial loss. It supports only the L2 penalty.

* saga − It is a good choice for large datasets. For multiclass problems, it handles multinomial loss. Along with L1 and L2 penalties, it also supports the ‘elasticnet’ penalty.

* sag − It is also suited to large datasets. For multiclass problems, it handles multinomial loss. It supports only the L2 penalty.
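
To make the solver and penalty pairing above more concrete, here is a minimal sketch that fits the iris data with `saga` plus an ‘elasticnet’ penalty and with `lbfgs` plus its default L2 penalty. The `l1_ratio` value, the `max_iter` settings, the feature scaling step and the variable names are assumptions chosen for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, Y = load_iris(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.4, random_state=1)

# saga with an elastic-net penalty; l1_ratio mixes L1 (1.0) and L2 (0.0).
# Features are standardized because sag/saga converge faster on scaled data.
saga_model = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver='saga', penalty='elasticnet', l1_ratio=0.5,
                       max_iter=5000, random_state=0),
)
saga_model.fit(X_train, Y_train)

# lbfgs with its default L2 penalty, for comparison
lbfgs_model = make_pipeline(
    StandardScaler(),
    LogisticRegression(solver='lbfgs', max_iter=1000, random_state=0),
)
lbfgs_model.fit(X_train, Y_train)

saga_acc = saga_model.score(X_test, Y_test)
lbfgs_acc = lbfgs_model.score(X_test, Y_test)

print('saga + elasticnet test accuracy : ', round(saga_acc * 100, 2))
print('lbfgs + L2 test accuracy : ', round(lbfgs_acc * 100, 2))
```

On a dataset as small as iris the two solvers are expected to behave similarly; the choice matters more as the number of samples or features grows.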

In [None]:
# a penalty can also be set through the `penalty` parameter (the default is L2)
LR = LogisticRegression(random_state=0, solver = 'lbfgs')

### Train model on training dataset

In [None]:
LR.fit(X_train, Y_train)

### Prediction using the trained model

In [None]:
Y_pred = LR.predict(X_test)
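
Since logistic regression estimates class probabilities, it can help to look at them directly. The sketch below refits an equivalent model under the hypothetical name `clf` (with a larger `max_iter`, an assumption made to avoid convergence warnings) and checks that the predicted label is the class with the highest probability.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, Y = load_iris(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.4, random_state=1)

clf = LogisticRegression(random_state=0, solver='lbfgs', max_iter=1000)
clf.fit(X_train, Y_train)

# One row per sample, one column per class; each row sums to 1
proba = clf.predict_proba(X_test)

# Picking the most probable class reproduces the output of predict()
labels_from_proba = clf.classes_[np.argmax(proba, axis=1)]

print('Probabilities : \n', proba[:5].round(3))
print('Labels from probabilities : ', labels_from_proba[:5])
print('Labels from predict : ', clf.predict(X_test)[:5])
```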

In [None]:
print('Coefficients: \n', LR.coef_)
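
The coefficients are easier to interpret next to the scores they produce. As a small sketch (the names `clf`, `scores_manual` and `scores_sklearn` are illustrative), the per-class scores returned by `decision_function` can be reproduced from `coef_` and `intercept_`.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, Y = load_iris(return_X_y=True)
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.4, random_state=1)

clf = LogisticRegression(random_state=0, solver='lbfgs', max_iter=1000)
clf.fit(X_train, Y_train)

# coef_ has one row per class and one column per feature
print('coef_ shape : ', clf.coef_.shape)
print('intercept_ shape : ', clf.intercept_.shape)

# Linear score per class: features times weights, plus the intercept
scores_manual = X_test @ clf.coef_.T + clf.intercept_
scores_sklearn = clf.decision_function(X_test)

print('Scores match : ', np.allclose(scores_manual, scores_sklearn))
```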

### Accuracy


In [None]:
# accuracy 
print('Accuracy on Train : ', round(LR.score(X_train, Y_train)*100, 2))
print('Accuracy on Test : ', round(LR.score(X_test, Y_test)*100, 2))
print('Accuracy on Whole Dataset : ', round(LR.score(X, Y)*100, 2))

### Metrics

In [None]:
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

# Confusion Matrix 
result = confusion_matrix(Y_test, Y_pred)
print("Confusion Matrix:")
print(result)

# Classification report
result1 = classification_report(Y_test, Y_pred)
print("\nClassification Report:")
print(result1)

# Accuracy score
result2 = accuracy_score(Y_test, Y_pred)
print("\nAccuracy:", result2)
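
To show how these metrics relate to each other, the sketch below derives accuracy, recall and precision from a confusion matrix. The labels `y_true` and `y_pred` are small made-up arrays used only for illustration, not results from the model above.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)

# Made-up labels for three classes (illustration only)
y_true = np.array([0, 0, 1, 1, 2, 2, 2, 1])
y_pred = np.array([0, 0, 1, 2, 2, 2, 1, 1])

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_true, y_pred)
print('Confusion Matrix:')
print(cm)

# Accuracy: correct predictions (the diagonal) over all predictions
acc = np.trace(cm) / cm.sum()

# Recall per class: diagonal over row sums (all samples of that true class)
recall = np.diag(cm) / cm.sum(axis=1)

# Precision per class: diagonal over column sums (all samples predicted as that class)
precision = np.diag(cm) / cm.sum(axis=0)

print('Accuracy : ', acc)
print('Recall per class : ', recall)
print('Precision per class : ', precision)
```

These values agree with `accuracy_score`, `recall_score(..., average=None)` and `precision_score(..., average=None)` on the same labels.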