## Bank Note Authentication Analysis

Whenever we deposit cash at a bank, the cashier feeds the banknotes into a machine that tells whether each note is genuine. This is a classification problem: given some input data, we must assign the input to one of several predefined categories. Both rule-based and statistical techniques are commonly used to solve classification problems, and machine learning algorithms fall into the statistical category.

In this notebook, we will explain the process of building a banknote authentication system using machine learning algorithms. After reading it, you will understand how classification systems are built with machine learning algorithms.
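To make the difference between a rule-based and a statistical approach concrete, here is a minimal sketch on synthetic one-feature data. The data, the `rule_based_predict` helper, and the threshold of 0 are all hypothetical and are not part of the banknote dataset.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic one-feature data (hypothetical, not the banknote dataset):
# class 0 is centred at -2, class 1 at +2.
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2.0, 1.0, 200), rng.normal(2.0, 1.0, 200)])
labels = np.concatenate([np.zeros(200, dtype=int), np.ones(200, dtype=int)])

# Rule-based: the threshold is chosen by hand.
def rule_based_predict(values, threshold=0.0):
    return (values > threshold).astype(int)

# Statistical: the decision boundary is learned from the data.
model = LogisticRegression().fit(x.reshape(-1, 1), labels)

rule_acc = (rule_based_predict(x) == labels).mean()
learned_acc = model.score(x.reshape(-1, 1), labels)
print(f"rule-based accuracy: {rule_acc:.3f}")
print(f"learned accuracy:    {learned_acc:.3f}")
```

On data this simple both approaches perform similarly; the learned model becomes more useful when several features interact and a good rule is hard to write by hand.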

### Importing pyforest library

In [None]:
import warnings

warnings.filterwarnings('ignore')

In [None]:
!pip install pyforest

In [None]:
from pyforest import *

In [None]:
#lazy_imports()

### Loading the BankNote_Authentication (UCI) dataset from Kaggle

In [None]:
df = pd.read_csv("../input/banknote-authenticationcsv/BankNote_Authentication.csv")
df.head()

In [None]:
df.shape

In [None]:
df['class'].value_counts()

#### Extensive EDA is not needed because the classes are reasonably balanced and there are no missing values

In [None]:
df.isna().sum()

In [None]:
df.info()
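The two checks above (class balance and missing values) can be bundled into one small helper. This is only a sketch: `quick_data_check` and the `toy` frame are hypothetical stand-ins, and the numbers in `toy` are made up for illustration.

```python
import pandas as pd

def quick_data_check(frame, target):
    """Return the class proportions and the total number of missing values."""
    proportions = frame[target].value_counts(normalize=True).sort_index()
    n_missing = int(frame.isna().sum().sum())
    return proportions, n_missing

# Hypothetical toy frame standing in for df
toy = pd.DataFrame({
    "variance": [3.6, 4.5, -1.4, -2.5, 0.3, 1.2],
    "class": [0, 0, 1, 1, 0, 1],
})

proportions, n_missing = quick_data_check(toy, "class")
print(proportions)
print("missing values:", n_missing)
```

On the real data, the same call would be `quick_data_check(df, 'class')`.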

### Visualization

In [None]:
# distplot is deprecated in recent seaborn releases; a count plot suits a binary label better
sns.countplot(x='class',data=df,color='red');

In [None]:
df.describe().T

In [None]:
sns.heatmap(df.corr(),annot=True,cmap='viridis')

In [None]:
df.hist(bins=20,figsize=(11,9),layout=(2,3));

In [None]:
sns.pairplot(df,hue='class');

### Data to Build model

In [None]:
X = df.drop('class',axis=1)
X.head()

In [None]:
y = df['class']
y.head()

### Splitting Data into train & test

In [None]:
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=0.20,random_state=1)

In [None]:
print("X_train:", X_train.shape)
print("y_train:", y_train.shape)
print("X_test:", X_test.shape)
print("y_test:", y_test.shape)

### Data Preprocessing

In [None]:
from sklearn.preprocessing import StandardScaler

In [None]:
scaler = StandardScaler()

# Fit the scaler on the training data only, then apply the same transformation to the test data
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

print("X_train:\n", X_train)
print("\n")
print("X_test:\n", X_test)
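A common convention is to learn the scaling statistics from the training split only and reuse them for the test split, so that no information from the test rows leaks into the model. The sketch below shows this on synthetic data; `features`, `targets`, and `demo_scaler` are hypothetical names used only for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 200 rows, 4 features, random binary labels
rng = np.random.default_rng(1)
features = rng.normal(loc=5.0, scale=3.0, size=(200, 4))
targets = rng.integers(0, 2, size=200)

F_train, F_test, t_train, t_test = train_test_split(
    features, targets, test_size=0.2, random_state=1
)

demo_scaler = StandardScaler()
F_train_scaled = demo_scaler.fit_transform(F_train)  # learn mean/std from training rows only
F_test_scaled = demo_scaler.transform(F_test)        # reuse those statistics on the test rows

print("train means:", F_train_scaled.mean(axis=0).round(3))
print("test means: ", F_test_scaled.mean(axis=0).round(3))
```

The training columns end up with mean 0 and standard deviation 1, while the test columns are only approximately standardized, which is expected.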

### 1) Logistic Regression

In [None]:
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

In [None]:
classifire = LogisticRegression(solver='liblinear',random_state=1)

classifire.fit(X_train, y_train)
accuracies = cross_val_score(estimator = classifire,X=X_train,y=y_train,cv=10)

print("Accuracies:\n ",accuracies)

In [None]:
print("Mean Accuracies: ",accuracies.mean())

In [None]:
classifire_pred = classifire.predict(X_test)
classifire_pred

In [None]:
from sklearn import metrics

In [None]:
cm = metrics.confusion_matrix(y_test,classifire_pred,labels=[0,1])

df_cm = pd.DataFrame(cm,index=['actual 0','actual 1'],columns=['predicted 0','predicted 1'])

# fmt='d' shows the counts as integers instead of scientific notation
sns.heatmap(df_cm,annot=True,fmt='d')
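The four cells of a binary confusion matrix are enough to derive accuracy, precision, and recall by hand. The following sketch uses hypothetical labels (`y_true` and `y_pred` are invented for illustration) and checks the manual accuracy against `metrics.accuracy_score`.

```python
from sklearn import metrics

# Hypothetical labels, for illustration only
y_true = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 1, 1, 0, 1, 1, 0]

# With labels=[0, 1], ravel() yields: true negatives, false positives,
# false negatives, true positives
tn, fp, fn, tp = metrics.confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()

accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)   # of the notes predicted as class 1, how many really are
recall = tp / (tp + fn)      # of the real class 1 notes, how many were found

print("accuracy :", accuracy)
print("precision:", precision)
print("recall   :", recall)
print("sklearn accuracy:", metrics.accuracy_score(y_true, y_pred))
```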

### 2) Support Vector Machine

In [None]:
from sklearn.svm import SVC

In [None]:
svm_classifire = SVC(kernel="linear")

svm_classifire.fit(X_train,y_train)

svm_accuracies = cross_val_score(estimator=svm_classifire,X=X_train,y=y_train,cv=10)

print("Accuracies:\n ",svm_accuracies)

In [None]:
print("Mean Accuracy:",svm_accuracies.mean())

In [None]:
svm_classifire_pred = svm_classifire.predict(X_test)
svm_classifire_pred

In [None]:
cm = metrics.confusion_matrix(y_test,svm_classifire_pred,labels=[0,1])

df_cm = pd.DataFrame(cm,index=['actual 0','actual 1'],columns=['predicted 0','predicted 1'])

# fmt='d' shows the counts as integers instead of scientific notation
sns.heatmap(df_cm,annot=True,fmt='d')

### 3) Random Forest Classifier

In [None]:
from sklearn.ensemble import RandomForestClassifier

In [None]:
rfc_classifier = RandomForestClassifier(n_estimators=50, criterion='entropy',random_state=1)
rfc_classifier.fit(X_train, y_train)

rfc_accuracies = cross_val_score(estimator=rfc_classifier,X=X_train,y=y_train,cv=10)

print("Accuracies:\n ",rfc_accuracies)

In [None]:
print("Mean Accuracy:",rfc_accuracies.mean())

In [None]:
rfc_classifier_pred = rfc_classifier.predict(X_test)
rfc_classifier_pred

In [None]:
cm = metrics.confusion_matrix(y_test,rfc_classifier_pred,labels=[0,1])

df_cm = pd.DataFrame(cm,index=['actual 0','actual 1'],columns=['predicted 0','predicted 1'])

# fmt='d' shows the counts as integers instead of scientific notation
sns.heatmap(df_cm,annot=True,fmt='d')

### 4) KNN Classifier

In [None]:
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import KFold, GridSearchCV

param_grid = {'leaf_size': [2,5,7,9,11],'n_neighbors':[2,5,7,9,11],'p':[1,2]}

grid = GridSearchCV(KNeighborsClassifier(),param_grid=param_grid)

grid.fit(X_train, y_train)


In [None]:
grid.best_params_
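With the default `refit=True`, `GridSearchCV` also exposes the winning model as `best_estimator_`, already refit on the full training data, so the best parameters do not have to be copied by hand. A minimal sketch on synthetic data follows; `X_demo`, `y_demo`, `demo_grid`, and the smaller parameter grid are assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical data standing in for X_train / y_train
X_demo, y_demo = make_classification(
    n_samples=200, n_features=4, n_informative=3, n_redundant=1, random_state=0
)

demo_grid = GridSearchCV(
    KNeighborsClassifier(),
    param_grid={"n_neighbors": [3, 5, 7], "p": [1, 2]},
    cv=5,
)
demo_grid.fit(X_demo, y_demo)

# Already refit on all of X_demo with the best parameter combination
best_knn = demo_grid.best_estimator_

print("best params:", demo_grid.best_params_)
print("best CV accuracy:", round(demo_grid.best_score_, 4))
```

In the notebook itself, the equivalent would be `grid.best_estimator_`.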

In [None]:
KNN = KNeighborsClassifier(n_neighbors=2 ,p=1 ,leaf_size=2 )

In [None]:
KNN.fit(X_train, y_train)

In [None]:
KNN_pred = KNN.predict(X_test)

In [None]:
knn_accuracies = cross_val_score(estimator=KNN,X=X_train,y=y_train,cv=10)

print("Accuracies:\n ",knn_accuracies)

In [None]:
print("Mean Accuracy:",knn_accuracies.mean())

In [None]:
cm = metrics.confusion_matrix(y_test,KNN_pred,labels=[0,1])

df_cm = pd.DataFrame(cm,index=['actual 0','actual 1'],columns=['predicted 0','predicted 1'])

# fmt='d' shows the counts as integers instead of scientific notation
sns.heatmap(df_cm,annot=True,fmt='d')

### 5) Multilayer Perceptron (Neural Network)

In [None]:
from sklearn.neural_network import MLPClassifier 

In [None]:
multi_class = MLPClassifier(hidden_layer_sizes=(8,4),max_iter=8000, alpha=0.0001, solver='sgd',verbose=10,random_state= 21,tol=0.000000001)

In [None]:
multi_class.fit(X_train, y_train)

In [None]:
multi_class_predict=multi_class.predict(X_test)

In [None]:
cm = metrics.confusion_matrix(y_test,multi_class_predict,labels=[0,1])

df_cm = pd.DataFrame(cm,index=['actual 0','actual 1'],columns=['predicted 0','predicted 1'])

# fmt='d' shows the counts as integers instead of scientific notation
sns.heatmap(df_cm,annot=True,fmt='d')

### Comparing All the Algorithms

In [None]:
print("LogisticRegression: {0:.4f}%".format(accuracies.mean()*100))
print("\n")
print("Support Vector Machine: {0:.4f}%".format(svm_accuracies.mean()*100))
print("\n")
print("Random Forest: {0:.4f}%".format(rfc_accuracies.mean()*100))
print("\n")
print("KNN: {0:.4f}%".format(knn_accuracies.mean()*100))
print("\n")
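# Test-set accuracy computed from the predictions; no cross-validation scores exist for this model
print("Neural Network: {0:.4f}%".format(metrics.accuracy_score(y_test,multi_class_predict)*100))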
print("\n")

Here, the neural network gives 100% accuracy. Still, beginners should learn all of the algorithms above.
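A comparison is easiest to read when every model is scored with the same procedure. One possible sketch, assuming synthetic data and a shortened list of models, wraps each estimator in a pipeline with its own scaler and reports the mean cross-validated accuracy. The names `X_demo`, `y_demo`, `candidates`, and `scores` are hypothetical.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Hypothetical data standing in for the banknote features and labels
X_demo, y_demo = make_classification(
    n_samples=300, n_features=4, n_informative=3, n_redundant=1,
    class_sep=2.0, random_state=0,
)

candidates = {
    "Logistic Regression": LogisticRegression(solver="liblinear", random_state=1),
    "Support Vector Machine": SVC(kernel="linear"),
    "Random Forest": RandomForestClassifier(n_estimators=20, random_state=1),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

scores = {}
for name, estimator in candidates.items():
    # The scaler is refit inside every fold, so held-out rows never influence it
    pipeline = make_pipeline(StandardScaler(), estimator)
    scores[name] = cross_val_score(pipeline, X_demo, y_demo, cv=5).mean()

for name, score in sorted(scores.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {score * 100:.2f}%")
```

Because each figure comes from the same 5-fold procedure, the printed percentages are directly comparable.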