***

## Logistic Regression Classification

***

### 👉🏾 Data Preprocessing

#### ☑️ Importing Dataset and Libraries

In [None]:
import pandas as pd
import numpy as np 
import matplotlib.pyplot as plt

In [None]:
data = pd.read_csv('/content/Social_Network_Ads.csv')
data

Unnamed: 0,Age,EstimatedSalary,Purchased
0,19,19000,0
1,35,20000,0
2,26,43000,0
3,27,57000,0
4,19,76000,0
...,...,...,...
395,46,41000,1
396,51,23000,1
397,50,20000,1
398,36,33000,0


#### ☑️ Setting Features

In [None]:
# Features: every column except the last one
x = data.iloc[:, 0:-1].values
# Target: the last column (Purchased)
y = data.iloc[:, -1].values
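To make the slicing concrete, here is a minimal sketch on five rows copied from the preview above. The names `sample`, `x_demo` and `y_demo` are hypothetical, and the sketch assumes the file's columns are `Age`, `EstimatedSalary` and `Purchased`, with the leftmost column of the preview being the row index.

```python
import pandas as pd

# Hypothetical five-row sample copied from the dataset preview.
sample = pd.DataFrame({
    "Age": [19, 35, 26, 27, 19],
    "EstimatedSalary": [19000, 20000, 43000, 57000, 76000],
    "Purchased": [0, 0, 0, 0, 0],
})

x_demo = sample.iloc[:, 0:-1].values  # every column except the last
y_demo = sample.iloc[:, -1].values    # only the last column

print(x_demo.shape)  # (5, 2)
print(y_demo.shape)  # (5,)
```

`x_demo` is a 2-D array with one row per customer, while `y_demo` is a 1-D array of labels, which is the layout scikit-learn estimators expect.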

#### ☑️ Splitting the Dataset

In [None]:
from sklearn.model_selection import train_test_split as tts
# Hold out 20% of the rows for testing; random_state makes the split reproducible
x_train, x_test, y_train, y_test = tts(x, y, test_size=0.2, random_state=0)
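As a rough illustration of what `test_size = 0.2` and `random_state = 0` do, the sketch below splits a placeholder array. The arrays `x_demo` and `y_demo` are made up, and the 400-row size is an assumption that matches the 80 test predictions counted in the confusion matrix further down.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data: 400 rows, 2 feature columns (values are arbitrary).
x_demo = np.arange(800).reshape(400, 2)
y_demo = np.arange(400) % 2

x_tr, x_te, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.2, random_state=0
)
print(x_tr.shape, x_te.shape)  # (320, 2) (80, 2)

# Using the same random_state again reproduces exactly the same split.
x_tr2, x_te2, _, _ = train_test_split(
    x_demo, y_demo, test_size=0.2, random_state=0
)
print(np.array_equal(x_te, x_te2))  # True
```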

#### ☑️ Feature Scaling

In [None]:
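from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
# Learn the mean and standard deviation from the training set only
x_train = sc.fit_transform(x_train)
# Reuse the training statistics on the test set
x_test = sc.transform(x_test)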
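The scaler standardizes each column as `(value - mean) / std`, using the mean and standard deviation learned from the training rows. The sketch below checks this by hand on a few rows borrowed from the dataset preview; the names `train_demo`, `test_demo` and `sc_demo` are hypothetical, and the train/test grouping of these rows is arbitrary.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Illustrative rows (Age, EstimatedSalary); the grouping is arbitrary.
train_demo = np.array(
    [[19, 19000], [35, 20000], [26, 43000], [27, 57000]], dtype=float
)
test_demo = np.array([[19, 76000]], dtype=float)

sc_demo = StandardScaler()
train_scaled = sc_demo.fit_transform(train_demo)  # learns mean/std, then scales
test_scaled = sc_demo.transform(test_demo)        # reuses the training mean/std

# Manual version of the same computation.
mean = train_demo.mean(axis=0)
std = train_demo.std(axis=0)  # population std (ddof=0), as StandardScaler uses
manual_test = (test_demo - mean) / std

print(np.allclose(test_scaled, manual_test))  # True
```

Calling `transform` rather than `fit_transform` on the test rows keeps test-set statistics out of the preprocessing step.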

### 👉🏾 Training the Logistic Regression Model on the Training Set

#### ☑️ Hyperparameter Tuning with GridSearchCV

In [None]:
from sklearn.model_selection import GridSearchCV as gsCV
from sklearn.linear_model import LogisticRegression

# Candidate values for C, the inverse regularization strength
C_range = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
hyper_p = {'C': C_range}

# Score each C with 7-fold cross-validation on the training set
clf = LogisticRegression(random_state=0)
clf_ = gsCV(clf, param_grid=hyper_p, cv=7)
clf_.fit(x_train, y_train)

GridSearchCV(cv=7, estimator=LogisticRegression(random_state=0),
             param_grid={'C': [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]})

In [None]:
print(clf_.best_params_)
print(clf_.best_score_)

{'C': 0.3}
0.825672877846791
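`best_score_` is the mean cross-validated accuracy of the best `C`, not a test-set score. The sketch below spells out roughly what the grid search does, as an explicit loop over `C` with `cross_val_score`. It runs on synthetic data from `make_classification` (an assumption, since the CSV is not bundled here), so its numbers will differ from the ones above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, cross_val_score

# Synthetic stand-in for the scaled training set (not the real data).
x_demo, y_demo = make_classification(
    n_samples=320, n_features=2, n_informative=2, n_redundant=0, random_state=0
)

C_range = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

# Manual grid search: mean 7-fold accuracy for each candidate C.
mean_scores = []
for C in C_range:
    model = LogisticRegression(C=C, random_state=0)
    scores = cross_val_score(model, x_demo, y_demo, cv=7)
    mean_scores.append(scores.mean())

best_idx = int(np.argmax(mean_scores))
print("manual:", C_range[best_idx], mean_scores[best_idx])

# The same search through GridSearchCV.
search = GridSearchCV(
    LogisticRegression(random_state=0), param_grid={"C": C_range}, cv=7
)
search.fit(x_demo, y_demo)
print("grid search:", search.best_params_, search.best_score_)
```

After fitting, `GridSearchCV` refits the best model on the whole training set by default, which is why the fitted search object can be used directly for prediction.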


### 👉🏾 Predictions


#### ☑️ Predicting for a Single Customer (Age 30, Salary 87000)

In [None]:
clf_.predict(sc.transform([[30, 87000]]))

array([0])

#### ☑️ Predicting the Test Set

In [None]:
y_pred = clf_.predict(x_test)

In [None]:
# Side-by-side view: first column = actual label, second column = predicted label
a = np.concatenate((y_test.reshape(-1, 1), y_pred.reshape(-1, 1)), axis=1)
a
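For reference, `np.column_stack` builds the same side-by-side array without the explicit reshapes. This is a small sketch on made-up labels; `y_true_demo` and `y_pred_demo` are hypothetical names.

```python
import numpy as np

# Made-up labels, only to illustrate the stacking.
y_true_demo = np.array([0, 1, 0, 1])
y_pred_demo = np.array([0, 1, 1, 1])

# Each row is (actual, predicted).
pairs = np.column_stack((y_true_demo, y_pred_demo))
print(pairs)

# Equivalent to reshaping both arrays into columns and concatenating them.
same = np.concatenate(
    (y_true_demo.reshape(-1, 1), y_pred_demo.reshape(-1, 1)), axis=1
)
print(np.array_equal(pairs, same))  # True
```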


### 👉🏾 Confusion Matrix and Accuracy Score

#### ☑️ Confusion Matrix

In [None]:
from sklearn.metrics import confusion_matrix as cm
conf = cm(y_test, y_pred)
conf

array([[57,  1],
       [ 5, 17]])

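So there are 57 correct predictions for class 0 (customers who did not buy the SUV),<br>
and 17 for class 1 (customers who did buy the SUV). The remaining 6 test customers were misclassified: 1 non-buyer was predicted as a buyer, and 5 buyers were predicted as non-buyers.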
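The four cells can be unpacked by name, which makes it easy to derive precision and recall for class 1. The sketch below hard-codes the matrix shown above as `conf_demo` (a hypothetical name) and relies on scikit-learn's convention that rows are actual classes and columns are predicted classes.

```python
import numpy as np

# Confusion matrix copied from the output above.
conf_demo = np.array([[57, 1],
                      [5, 17]])

# Row-major order for a binary problem: TN, FP, FN, TP.
tn, fp, fn, tp = conf_demo.ravel()

precision = float(tp / (tp + fp))  # of predicted buyers, how many really bought
recall = float(tp / (tp + fn))     # of real buyers, how many were found

print(round(precision, 3), round(recall, 3))  # 0.944 0.773
```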

#### ☑️ Accuracy Score

In [None]:
from sklearn.metrics import accuracy_score as ac_
score = ac_(y_test, y_pred)
score

0.925

This result means that 92.5% of the test set (74 of 80 customers) was predicted correctly.
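The accuracy can also be read straight off the confusion matrix: the diagonal holds the correct predictions, so accuracy is the diagonal sum divided by the total. A minimal check, with the matrix hard-coded as `conf_demo` (a hypothetical name):

```python
import numpy as np

# Confusion matrix copied from the section above.
conf_demo = np.array([[57, 1],
                      [5, 17]])

correct = int(np.trace(conf_demo))  # 57 + 17 = 74 correct predictions
total = int(conf_demo.sum())        # 80 test customers
accuracy = correct / total

print(accuracy)  # 0.925
```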