<img src="../../../../images/logreg.png" style="background:white; display: block; margin-left: auto;margin-right: auto; width:60%"/>

---
<h2>1. Importing the Dataset</h2>

In [1]:
import pandas as pd
import numpy as np

df = pd.read_csv('../../../../data/clean/Social_Network_Ads.csv')
display(df.head())
x = df.iloc[:, :-1].values
y = df.iloc[:, -1].values

Unnamed: 0,Age,EstimatedSalary,Purchased
0,19,19000,0
1,35,20000,0
2,26,43000,0
3,27,57000,0
4,19,76000,0


---
<h2>2. Splitting the Dataset</h2>

In [2]:
from sklearn.model_selection import train_test_split

x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42, stratify=y)
print("train dataset size : {} observations\ntest dataset size : {} observations".format(x_train.shape[0], x_test.shape[0]))

train dataset size : 320 observations
test dataset size : 80 observations
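
The `stratify=y` argument asks `train_test_split` to keep the class ratio roughly the same in both subsets. The following is a minimal sketch of that effect, assuming a synthetic label vector; the names `x_demo` and `y_demo` and the 260/140 class counts are illustrative and are not read from the CSV file.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical labels: 260 negatives and 140 positives (illustrative counts only)
y_demo = np.array([0] * 260 + [1] * 140)
x_demo = np.arange(len(y_demo)).reshape(-1, 1)  # placeholder feature column

_, _, y_tr, y_te = train_test_split(
    x_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# With stratification the positive-class share is nearly identical in both subsets
print("positive share (train): {:.3f}".format(y_tr.mean()))
print("positive share (test) : {:.3f}".format(y_te.mean()))
```

Without `stratify`, a small test set can end up with a noticeably different class balance than the training set, which makes the evaluation less representative.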


---
<h2>3. Feature Scaling</h2>

In [3]:
from sklearn.preprocessing import StandardScaler

stand_x = StandardScaler().fit(x_train)
x_ss = stand_x.transform(x_train)
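
The scaler is fitted on the training data only, so the test data is later transformed with the training mean and standard deviation. As a rough sketch of what `StandardScaler` computes, the snippet below compares it with a manual `(x - mean) / std` calculation; the arrays `train_demo` and `test_demo` are hypothetical values used for illustration.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical (age, salary) rows; values are illustrative only
train_demo = np.array([[19, 19000], [35, 20000], [26, 43000], [27, 57000]], dtype=float)
test_demo = np.array([[30, 87000]], dtype=float)

scaler = StandardScaler().fit(train_demo)

# StandardScaler standardizes each column as (x - mean) / std,
# where mean and std come from the data passed to fit()
train_mean = train_demo.mean(axis=0)
train_std = train_demo.std(axis=0)  # population std (ddof=0), as StandardScaler uses
manual_train = (train_demo - train_mean) / train_std
manual_test = (test_demo - train_mean) / train_std

print(np.allclose(scaler.transform(train_demo), manual_train))
print(np.allclose(scaler.transform(test_demo), manual_test))
```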

---
<h2>4. Training the Model on the Training Dataset</h2>

In [4]:
from sklearn.linear_model import LogisticRegression

'''
> Values for the `penalty` parameter:
    - l1 : L1 (Lasso) regularization adds a penalty equal to the sum of the absolute values of the coefficients.
           Supported by the 'liblinear' and 'saga' solvers.
    - l2 (default) : L2 (Ridge) regularization adds a penalty equal to the sum of the squared coefficients.
                     Supported by the 'lbfgs', 'newton-cg', 'sag', 'saga' and 'liblinear' solvers.
    - elasticnet : combines the L1 and L2 penalties and adds the `l1_ratio` hyperparameter, which sets the mix.
                   Supported only by the 'saga' solver.
    - none : no regularization is applied; not supported by the 'liblinear' solver
             (recent scikit-learn versions spell this as penalty=None).
> Available solvers (numerical optimizers) include:
    - newton-cg : handles multiclass problems; supports L2 or no penalty
    - lbfgs (default) : handles multiclass problems; supports L2 or no penalty
    - liblinear : suited to small datasets; limited to one-versus-rest schemes; supports L1 and L2, but not the 'none' penalty
    - sag : faster on large datasets; handles multiclass problems; supports L2 or no penalty
    - saga : faster on large datasets; handles multiclass problems; supports L1, L2, elasticnet or no penalty
> `C` is the inverse of the regularization strength and must be a positive float;
  smaller values specify stronger regularization.
'''
logreg = LogisticRegression(random_state=0)
logreg.fit(x_ss, y_train)

LogisticRegression(random_state=0)
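
To illustrate the role of `C` described in the comment above, here is a small sketch that fits the default L2-penalized model with several `C` values and prints the size of the learned coefficients. It assumes synthetic data from `make_classification` as a stand-in for the real dataset; `x_demo`, `y_demo` and `norms` are illustrative names.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption; not the Social_Network_Ads set)
x_demo, y_demo = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=0
)

norms = {}
for c_value in (0.01, 1.0, 100.0):
    # Default penalty is L2; a smaller C means stronger regularization
    model = LogisticRegression(C=c_value, random_state=0).fit(x_demo, y_demo)
    norms[c_value] = np.linalg.norm(model.coef_)
    print("C={:<6} coefficient norm = {:.3f}".format(c_value, norms[c_value]))
```

A stronger penalty (smaller `C`) should pull the coefficients toward zero, so the norm for `C=0.01` is expected to be the smallest of the three.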

---
<h2>5. Predicting the Test Dataset Results</h2>

In [5]:
y_pred = logreg.predict(stand_x.transform(x_test))

pd.DataFrame(data=np.stack((y_test, y_pred), axis=1),
             index=None, columns=['y actual', 'y prediction'],
             copy=False).head(10)

Unnamed: 0,y actual,y prediction
0,1,1
1,0,0
2,0,0
3,0,1
4,0,0
5,1,1
6,0,0
7,1,1
8,0,1
9,0,0
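
The hard 0/1 labels returned by `predict` come from thresholding a probability. The sketch below, again on synthetic stand-in data (an assumption), shows that for a binary model `predict_proba` matches the sigmoid of `decision_function`, and that `predict` corresponds to a 0.5 cut-off.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption; not the Social_Network_Ads set)
x_demo, y_demo = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=0
)
model = LogisticRegression(random_state=0).fit(x_demo, y_demo)

proba = model.predict_proba(x_demo)[:, 1]   # estimated P(class 1) per row
labels = model.predict(x_demo)              # hard 0/1 predictions

# The linear score w.x + b passed through the sigmoid gives the same probability
scores = model.decision_function(x_demo)
sigmoid = 1.0 / (1.0 + np.exp(-scores))

print(np.allclose(proba, sigmoid))
print(np.array_equal(labels, (proba > 0.5).astype(int)))
```

Inspecting the probabilities can help explain borderline mistakes, such as rows where the actual class is 0 but the prediction is 1.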


---
<h2>6. Visualizing the Training and Test Set Results (uncomment the code to see the result; warning: the high-resolution grid is slow to render)</h2>

In [6]:
import matplotlib as mpl
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

mpl.style.use('ggplot')
%matplotlib inline

# ============================================= TRAINING SET =============================================

# fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(6,4))
# x_set, y_set = stand_x.inverse_transform(x_ss), y_train
# x1, x2 = np.meshgrid(np.arange(start = x_set[:,0].min() - 10,
#                                stop = x_set[:, 0].max() + 10,
#                                step=0.25),
#                      np.arange(start = x_set[:,1].min() - 1000,
#                                stop = x_set[:, 1].max() + 1000,
#                                step=0.25)
#                      )

# ax.contourf(x1, x2,
#                logreg.predict(stand_x.transform(np.array([x1.ravel(), x2.ravel()]).T)).reshape(x1.shape),
#                alpha=0.75, cmap=ListedColormap(('red', 'green')))

# for i, j in enumerate(np.unique(y_set)):
#     ax.scatter(x_set[y_set==j, 0], x_set[y_set==j, 1], color=ListedColormap(('red', 'green'))(i), label=j)

# ax.set_xlim(x1.min(), x1.max())
# ax.set_ylim(x2.min(), x2.max())
# ax.set_title("Training dataset")
# ax.set_xlabel("Customer Age")
# ax.set_ylabel("Customer Estimated Salary")
# ax.legend(["not buy", "buy"], loc="upper left")

# plt.show()

In [7]:
# ============================================= TEST SET =============================================

# fig, ax = plt.subplots(nrows=1, ncols=1, figsize=(6,4))
# xt_set, yt_set = x_test, y_test
# xt1, xt2 = np.meshgrid(np.arange(start = xt_set[:,0].min() - 10,
#                                stop = xt_set[:, 0].max() + 10,
#                                step=0.25),
#                      np.arange(start = xt_set[:,1].min() - 1000,
#                                stop = xt_set[:, 1].max() + 1000,
#                                step=0.25)
#                      )

# ax.contourf(xt1, xt2,
#                logreg.predict(stand_x.transform(np.array([xt1.ravel(), xt2.ravel()]).T)).reshape(xt1.shape),
#                alpha=0.75, cmap=ListedColormap(('red', 'green')))

# for i, j in enumerate(np.unique(yt_set)):
#     ax.scatter(xt_set[yt_set==j, 0], xt_set[yt_set==j, 1], color=ListedColormap(('red', 'green'))(i), label=j)

# ax.set_xlim(xt1.min(), xt1.max())
# ax.set_ylim(xt2.min(), xt2.max())
# ax.set_title("Test dataset")
# ax.set_xlabel("Customer Age")
# ax.set_ylabel("Customer Estimated Salary")
# ax.legend(["not buy", "buy"], loc="upper left")

# plt.show()
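
A lighter way to draw the same kind of decision-region plot is to fix the number of grid points per axis with `np.linspace` instead of using a small step over a wide value range. The following is a sketch under that assumption, using synthetic stand-in data; the 200-point resolution and the names `g1`, `g2` and `zz` are illustrative choices, not part of the original notebook.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; not the Social_Network_Ads set)
x_demo, y_demo = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=0
)
scaler = StandardScaler().fit(x_demo)
model = LogisticRegression(random_state=0).fit(scaler.transform(x_demo), y_demo)

# 200 x 200 grid: the point count stays fixed regardless of the feature ranges
g1, g2 = np.meshgrid(
    np.linspace(x_demo[:, 0].min() - 1, x_demo[:, 0].max() + 1, 200),
    np.linspace(x_demo[:, 1].min() - 1, x_demo[:, 1].max() + 1, 200),
)
grid = np.column_stack([g1.ravel(), g2.ravel()])
zz = model.predict(scaler.transform(grid)).reshape(g1.shape)

fig, ax = plt.subplots(figsize=(6, 4))
ax.contourf(g1, g2, zz, alpha=0.75, cmap=ListedColormap(("red", "green")))
for cls, colour in zip((0, 1), ("red", "green")):
    mask = y_demo == cls
    ax.scatter(x_demo[mask, 0], x_demo[mask, 1], color=colour, edgecolors="k", label=cls)
ax.set_title("Decision regions (synthetic data)")
ax.legend(loc="upper left")
plt.show()
```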

---
<h2>7. Making the Confusion Matrix</h2>

In [8]:
from sklearn.metrics import confusion_matrix

print(confusion_matrix(y_test, y_pred))
print("\nThe confusion matrix shows that:\n\t- 48 users who didn't buy the product were correctly predicted as class 0 (true negatives)\
        \n\t- 3 users who didn't buy the product were incorrectly predicted as class 1 (false positives)\
        \n\t- 19 users who bought the product were correctly predicted as class 1 (true positives)\
        \n\t- 10 users who bought the product were incorrectly predicted as class 0 (false negatives)")

[[48  3]
 [10 19]]

The confusion matrix shows that:
	- 48 users who didn't buy the product were correctly predicted as class 0 (true negatives)        
	- 3 users who didn't buy the product were incorrectly predicted as class 1 (false positives)        
	- 19 users who bought the product were correctly predicted as class 1 (true positives)        
	- 10 users who bought the product were incorrectly predicted as class 0 (false negatives)
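
The four counts in the matrix can be turned into summary metrics. A minimal sketch, assuming the scikit-learn layout in which rows are actual classes and columns are predicted classes, with the values copied from the output above:

```python
import numpy as np

# Confusion matrix from the output above: rows = actual class, columns = predicted class
cm = np.array([[48, 3],
               [10, 19]])
tn, fp, fn, tp = cm.ravel()

accuracy = (tp + tn) / cm.sum()   # share of all predictions that are correct
precision = tp / (tp + fp)        # share of predicted buyers who actually bought
recall = tp / (tp + fn)           # share of actual buyers the model found

print("accuracy : {:.4f}".format(accuracy))
print("precision: {:.4f}".format(precision))
print("recall   : {:.4f}".format(recall))
```

For this matrix the accuracy is 67/80 = 0.8375, the precision is 19/22 (about 0.864) and the recall is 19/29 (about 0.655), so the model misses roughly a third of the actual buyers.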


---
<h2>8. Predicting with New Input</h2>

In [9]:
new_age = 30
new_sal = 87000
result = logreg.predict(stand_x.transform([[new_age, new_sal]]))

if result[0] == 1:
    print("The {}-year-old user with an estimated salary of about {:,.2f} is predicted to buy the product".format(new_age, new_sal))
else:
    print("The {}-year-old user with an estimated salary of about {:,.2f} is predicted not to buy the product".format(new_age, new_sal))

The 30-year-old user with an estimated salary of about 87,000.00 is predicted not to buy the product
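
Every prediction on raw input has to pass through the fitted scaler first. One way to avoid forgetting that step is to bundle the scaler and the classifier in a scikit-learn `Pipeline`. This is a sketch of that alternative on synthetic stand-in data (an assumption), not the approach used in this notebook; `pipe` and `new_row` are illustrative names.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (an assumption; not the Social_Network_Ads set)
x_demo, y_demo = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=0
)

# The pipeline scales the input and then applies the classifier in one call
pipe = make_pipeline(StandardScaler(), LogisticRegression(random_state=0))
pipe.fit(x_demo, y_demo)

new_row = [[0.5, -1.2]]  # hypothetical raw (unscaled) input
label = pipe.predict(new_row)[0]
prob = pipe.predict_proba(new_row)[0, 1]
print("predicted class: {}, P(class 1) = {:.3f}".format(label, prob))
```

Reporting the probability next to the label also shows how confident the model is, which a bare "will buy" or "won't buy" message hides.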
