# Deep Learning Bootcamp - Assignment 1 - Beginners: BankNote
**Predict if a note is genuine or not**

# Objective

As a data science enthusiast, your task is to build an efficient model that accurately predicts whether a banknote is genuine or forged.

**Evaluation Criteria**

Submissions are evaluated using Accuracy Score. 
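Accuracy is the fraction of predictions that match the true labels. The sketch below shows the calculation on invented labels, not on the competition data; the helper name `accuracy` is hypothetical.

```python
# Accuracy = correct predictions / total predictions.
# The labels below are invented for illustration only.
y_true = [0, 1, 1, 0, 1]
y_pred = [0, 1, 0, 0, 1]

def accuracy(y_true, y_pred):
    """Return the fraction of positions where the prediction equals the truth."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

print(accuracy(y_true, y_pred))  # 4 of 5 correct -> 0.8
```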

**Data Description**

*   VWTI: Variance of Wavelet Transformed Image
*   SWTI: Skewness of Wavelet Transformed Image
*   CWTI: Kurtosis of Wavelet Transformed Image
*   EI: Entropy of Image
*   Class: Class (1: genuine, 0: forged)





In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
%matplotlib inline
from IPython.display import display, HTML, display_html
import seaborn as sns
import datetime




In [None]:
#Load Dataset
bank_note_data = pd.read_csv("https://raw.githubusercontent.com/dphi-official/Datasets/master/bank_note_data/training_set_label.csv" )

In [None]:
bank_note_data

| | VWTI | SWTI | CWTI | EI | Class |
|---|---|---|---|---|---|
| 0 | 2.263400 | -4.4862 | 3.65580 | -0.612510 | 0 |
| 1 | 3.271800 | 1.7837 | 2.11610 | 0.613340 | 0 |
| 2 | -3.941100 | -12.8792 | 13.05970 | -3.312500 | 1 |
| 3 | 0.519500 | -3.2633 | 3.08950 | -0.984900 | 0 |
| 4 | 2.569800 | -4.4076 | 5.98560 | 0.078002 | 0 |
| ... | ... | ... | ... | ... | ... |
| 1091 | 1.640600 | 3.5488 | 1.39640 | -0.364240 | 0 |
| 1092 | -0.048008 | -1.6037 | 8.47560 | 0.755580 | 0 |
| 1093 | 2.942100 | 7.4101 | -0.97709 | -0.884060 | 0 |
| 1094 | 1.964700 | 6.9383 | 0.57722 | 0.663770 | 0 |


In [None]:
bank_note_data.describe()

| | VWTI | SWTI | CWTI | EI | Class |
|---|---|---|---|---|---|
| count | 1096.0 | 1096.0 | 1096.0 | 1096.0 | 1096.0 |
| mean | 0.4485 | 1.780643 | 1.493533 | -1.157454 | 0.445255 |
| std | 2.852623 | 5.922621 | 4.375655 | 2.084983 | 0.497221 |
| min | -7.0364 | -13.7731 | -5.2861 | -8.5482 | 0.0 |
| 25% | -1.79085 | -2.1252 | -1.574975 | -2.246975 | 0.0 |
| 50% | 0.54043 | 2.20585 | 0.6719 | -0.56919 | 0.0 |
| 75% | 2.83535 | 6.793925 | 3.57445 | 0.39998 | 1.0 |
| max | 6.5633 | 12.7302 | 17.9274 | 2.4495 | 1.0 |


In [None]:
bank_note_data.dtypes

VWTI     float64
SWTI     float64
CWTI     float64
EI       float64
Class      int64
dtype: object

**Since every column has 1,096 non-null values and all columns are numeric (float64 or int64), no preprocessing is required.**
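The absence of missing values can also be checked explicitly rather than read off the `describe()` counts. The sketch below uses a small made-up frame named `df` as a stand-in for `bank_note_data`; the same two checks should apply to the real frame.

```python
import pandas as pd

# Toy frame standing in for bank_note_data (values are made up).
df = pd.DataFrame({
    "VWTI": [2.26, 3.27, -3.94],
    "SWTI": [-4.49, 1.78, -12.88],
    "Class": [0, 0, 1],
})

missing_per_column = df.isnull().sum()   # number of NaN values in each column
all_numeric = all(pd.api.types.is_numeric_dtype(t) for t in df.dtypes)

print(missing_per_column)
print("All columns numeric:", all_numeric)
```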

# Separating Input Features and Output Features

In [None]:
X = bank_note_data.drop('Class', axis = 1)    # Input Variables/features
y = bank_note_data.Class      # output variables/features

# Splitting the data

In [None]:
# import train_test_split
from sklearn.model_selection import train_test_split 

# Assign variables to capture train test split output
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
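With 1,096 rows and `test_size=0.2`, the split yields 876 training rows and 220 test rows, because a fractional test size is rounded up. The sketch below reproduces those sizes and shows a simple shuffle-and-slice split in NumPy. It illustrates the idea only and does not reproduce scikit-learn's exact permutation; the variable names are hypothetical.

```python
import math
import numpy as np

n_samples = 1096          # rows in the training file, per describe() above
test_size = 0.2

# A fractional test size is rounded up; the remainder goes to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

# Illustrative shuffle-and-slice; NOT scikit-learn's exact permutation.
rng = np.random.default_rng(42)
indices = rng.permutation(n_samples)
test_idx, train_idx = indices[:n_test], indices[n_test:]

print(n_train, n_test)  # 876 220
```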

In [None]:
# find the number of input features
n_features = X.shape[1]
print(n_features)

4


# Training our model

**The 5 Step Model Life-Cycle**

A model has a life-cycle, and understanding it provides the backbone for both modeling a dataset and using the tf.keras API.

The five steps in the life-cycle are as follows:

1. Define the model.
2. Compile the model.
3. Fit the model.
4. Evaluate the model.
5. Make predictions on the test data.


**1. Define the model**

In [None]:
from tensorflow.keras import Sequential    # import Sequential from tensorflow.keras
from tensorflow.keras.layers import Dense  # import Dense from tensorflow.keras.layers
from numpy.random import seed     # seed fixes NumPy's random number generator (imported but not called in this notebook)
import tensorflow

**I have used two hidden layers with 10 and 8 perceptrons, each with ReLU activation. The output layer uses a sigmoid activation.**

In [None]:
# define the model
model = Sequential()
model.add(Dense(10, activation='relu', input_shape=(n_features,)))
model.add(Dense(8, activation='relu'))
model.add(Dense(1,activation='sigmoid'))
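To make the architecture concrete, here is a minimal NumPy sketch of the forward pass through a 4 -> 10 -> 8 -> 1 network. The weights are random and untrained, and the names `W1`, `b1`, `forward` and so on are hypothetical; the sketch mirrors the layer shapes only, not Keras's initialisation or its trained parameters.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Random weights with the same shapes as the Keras layers:
# 4 inputs -> 10 units -> 8 units -> 1 output.
W1, b1 = rng.normal(size=(4, 10)), np.zeros(10)
W2, b2 = rng.normal(size=(10, 8)), np.zeros(8)
W3, b3 = rng.normal(size=(8, 1)), np.zeros(1)

def forward(x):
    h1 = relu(x @ W1 + b1)        # first Dense layer, ReLU
    h2 = relu(h1 @ W2 + b2)       # second Dense layer, ReLU
    return sigmoid(h2 @ W3 + b3)  # output squashed to the range 0..1

batch = rng.normal(size=(3, 4))   # three made-up notes, four features each
probs = forward(batch)
print(probs.shape)  # (3, 1)
```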

In [None]:
model.summary()

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
dense (Dense)                (None, 10)                50        
_________________________________________________________________
dense_1 (Dense)              (None, 8)                 88        
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 9         
Total params: 147
Trainable params: 147
Non-trainable params: 0
_________________________________________________________________
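The parameter counts in the summary follow from the layer sizes: each Dense layer has one weight per input-output pair plus one bias per output unit. A short check, using a hypothetical helper `dense_params`:

```python
# Units per layer: 4 input features, then Dense(10), Dense(8), Dense(1).
layer_sizes = [4, 10, 8, 1]

def dense_params(n_in, n_out):
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out

per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
print(per_layer, sum(per_layer))  # [50, 88, 9] 147
```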


**2. Compile the model**

In [None]:
# import RMSprop optimizer
from tensorflow.keras.optimizers import RMSprop
optimizer = RMSprop(0.01)    # 0.01 is the learning rate

In [None]:
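# Note: the string 'adam' selects the Adam optimizer with its default settings,
# so the RMSprop instance created above is not used.
# Pass optimizer=optimizer instead to train with RMSprop.
model.compile(optimizer='adam', loss='binary_crossentropy',
              metrics=['accuracy'])    # compile the model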
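The `binary_crossentropy` loss penalises confident wrong probabilities much more heavily than confident right ones. The NumPy sketch below shows the standard formula on made-up values; the function name and the clipping constant `eps` are assumptions for illustration and are not taken from the Keras source.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

# Made-up labels and predicted probabilities.
confident_right = binary_crossentropy([1, 0], [0.99, 0.01])
confident_wrong = binary_crossentropy([1, 0], [0.01, 0.99])
print(confident_right, confident_wrong)
```

The first value is small and the second is large, which is why minimising this loss pushes the sigmoid output toward the correct class.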

**3. Fitting the model**

In [None]:

# Optional: fix TensorFlow's pseudo-random generator for reproducible runs.
# seed_value is not defined in this notebook; choose any integer before uncommenting.
#tensorflow.random.set_seed(seed_value) 
model.fit(X_train, y_train, epochs=20, batch_size=30)    # fit the model

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<tensorflow.python.keras.callbacks.History at 0x7fe9d9322ac8>

**4. Evaluate the model**

In [None]:
model.evaluate(X_test, y_test)



[0.016563642770051956, 1.0]

# Predicting on the test data set with our model

**5. Make a Prediction**

In [None]:
test_data = pd.read_csv('https://raw.githubusercontent.com/dphi-official/Datasets/master/bank_note_data/testing_set_label.csv')

In [None]:
test_data

| | VWTI | SWTI | CWTI | EI |
|---|---|---|---|---|
| 0 | -0.40804 | 0.542140 | -0.52725 | 0.658600 |
| 1 | -3.71810 | -8.508900 | 12.36300 | -0.955180 |
| 2 | 5.50400 | 10.367100 | -4.41300 | -4.021100 |
| 3 | 1.68490 | 8.748900 | -1.26410 | -1.385800 |
| 4 | 4.74320 | 2.108600 | 0.13680 | 1.654300 |
| ... | ... | ... | ... | ... |
| 270 | -1.00500 | 0.084831 | -0.24620 | 0.456880 |
| 271 | 2.21230 | -5.839500 | 7.76870 | -0.853020 |
| 272 | 4.38460 | -4.879400 | 3.36620 | -0.029324 |
| 273 | 3.88400 | 10.027700 | -3.92980 | -4.081900 |


In [None]:
# make a prediction
test_Pred=model.predict(test_data)

In [None]:
# Convert the values from float to int for prediction submission
test_Pred1 = np.rint(test_Pred)
test_Pred1 = test_Pred1.astype(int)
test_Pred1

array([[1],
       [1],
       [0],
       [0],
       [0],
       [1],
       [1],
       [0],
       [1],
       [1],
       [0],
       [1],
       [0],
       [1],
       [0],
       [0],
       [0],
       [1],
       [0],
       [1],
       [1],
       [1],
       [0],
       [0],
       [0],
       [1],
       [0],
       [0],
       [1],
       [1],
       [1],
       [1],
       [0],
       [0],
       [0],
       [1],
       [1],
       [0],
       [0],
       [1],
       [1],
       [0],
       [1],
       [0],
       [1],
       [1],
       [1],
       [1],
       [0],
       [0],
       [1],
       [0],
       [1],
       [0],
       [1],
       [1],
       [1],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [1],
       [0],
       [0],
       [0],
       [1],
       [0],
       [0],
       [0],
       [0],
       [0],
       [0],
       [1],
       [1],
       [0],
       [1],
       [0],
       [0],
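The conversion above relies on `np.rint`, which rounds each sigmoid output to the nearest integer and sends an exact 0.5 to 0 (round half to even). An explicit threshold makes the cut-off visible. The sketch below compares the two on made-up probabilities, not on the model's actual outputs.

```python
import numpy as np

# Made-up sigmoid outputs shaped like model.predict's result: (n_samples, 1).
probs = np.array([[0.93], [0.08], [0.50], [0.51]])

rounded = np.rint(probs).astype(int)        # 0.5 rounds to the nearest even integer, 0
thresholded = (probs > 0.5).astype(int)     # explicit cut-off: 1 only above 0.5

print(rounded.ravel())      # [1 0 0 1]
print(thresholded.ravel())  # [1 0 0 1]
```

Calling `.ravel()` also flattens the `(n, 1)` array into a plain 1-D vector, which is convenient when building a single-column submission.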
    

In [None]:
# Create a file with the neural network's predictions
result = pd.DataFrame(test_Pred1)
result.index = test_data.index
result.columns = ['Prediction']

# Download the file locally (google.colab is only available inside Google Colab)
from google.colab import files
result.to_csv('FakeNote.csv')
files.download('FakeNote.csv')
