<a href="https://colab.research.google.com/github/paulgureghian/Deep_Learning_with_Keras/blob/master/Bike_Buyer_Keras_Model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Created by Paul A. Gureghian in Jan 2019.**

**I will use Keras to build a deep learning (DL) model that predicts whether someone will buy a bike, based on a set of predictor variables.**

This notebook focuses on a specific sub-field of machine learning called **predictive modeling.**

Within predictive modeling is a specialty, or sub-field, called **deep learning.**

I will build a binary classification model with a deep learning library called Keras.

It will predict one of two outcomes: positive (buys a bike) or negative (does not).

I will use a 'Bike Buyer' CSV dataset (BBCN).

In [13]:
### Mount Google drive
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


# Step 1. Import our packages.

In [0]:
### Import packages
import numpy as np
import pandas as pd
from keras.layers import Dense
from keras.models import Sequential

# Step 2.  Set our random seed.


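**Seeding makes results repeatable: the same seed on the same data gives the same random draws on every run.** Note that `np.random.seed` seeds NumPy only; depending on the Keras backend, training may need an additional backend-level seed to be fully repeatable.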

In [0]:
### Set random seed
seed = 9
np.random.seed(seed)
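
As a minimal sketch of what seeding does, the snippet below re-seeds NumPy and draws a few numbers. The helper name `draw_three` is hypothetical and is not part of this notebook; it only illustrates that the same seed replays the same sequence.

```python
import numpy as np

def draw_three(seed):
    """Seed NumPy's global generator, then draw three uniform numbers."""
    np.random.seed(seed)
    return np.random.rand(3)

first = draw_three(9)
second = draw_three(9)    # same seed, so the same sequence is replayed
other = draw_three(10)    # different seed, so a different sequence

print(np.array_equal(first, second))
print(np.array_equal(first, other))
```

The first comparison holds because both calls start the generator from the same state; the second does not because the generator starts elsewhere.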

# Step 3.  Import the data set.


**I store the file path in a variable called `bike_buyer` and load the dataset into a DataFrame called `df`.**

In [16]:
### Import the dataset
bike_buyer = ('/content/drive/My Drive/Bike_Buyer/bike_buyer.csv')
df = pd.read_csv(bike_buyer)
df.head()

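|   | MaritalStatus | Gender | YearlyIncome | TotalChildren | NumberChildrenAtHome | EnglishEducation | HouseOwnerFlag | NumberCarsOwned | CommuteDistance | Region | Age | BikeBuyer |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 1 | 9.0 | 2 | 0 | 5 | 1 | 0 | 2 | 2 | 5 | 1 |
| 1 | 5 | 1 | 6.0 | 3 | 3 | 5 | 0 | 1 | 1 | 2 | 4 | 1 |
| 2 | 5 | 1 | 6.0 | 3 | 3 | 5 | 1 | 1 | 5 | 2 | 4 | 1 |
| 3 | 5 | 2 | 7.0 | 0 | 0 | 5 | 0 | 1 | 10 | 2 | 5 | 1 |
| 4 | 5 | 2 | 8.0 | 5 | 5 | 5 | 1 | 4 | 2 | 2 | 5 | 1 |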


# Step 4.  Split the Input and Output Variables.


**First, I convert the DataFrame to a NumPy array.**

In [17]:
### Store dataset values
array = df.values
print(array)

[[5. 1. 9. ... 2. 5. 1.]
 [5. 1. 6. ... 2. 4. 1.]
 [5. 1. 6. ... 2. 4. 1.]
 ...
 [3. 2. 4. ... 3. 5. 0.]
 [3. 1. 4. ... 1. 4. 0.]
 [3. 1. 4. ... 1. 4. 0.]]


**Define X (features) and y (target). The first 11 columns are the features; the last column, `BikeBuyer`, is the target.**

In [18]:
### Define X and y
X = array[:,0:11] 
y = array[:,11]

print('X_shape:\n', X.shape, '\n')
print('X:\n', X, '\n')
print('y_shape:\n', y.shape, '\n')
print('y:\n', y)

X_shape:
 (500, 11) 

X:
 [[5. 1. 9. ... 2. 2. 5.]
 [5. 1. 6. ... 1. 2. 4.]
 [5. 1. 6. ... 5. 2. 4.]
 ...
 [3. 2. 4. ... 2. 3. 5.]
 [3. 1. 4. ... 2. 1. 4.]
 [3. 1. 4. ... 2. 1. 4.]] 

y_shape:
 (500,) 

y:
 [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 0. 0. 1. 1. 1. 1. 0. 1. 1. 1. 0.
 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 1. 0. 1. 1.
 1. 0. 1. 0. 1. 1. 1. 1. 1. 1. 1. 0. 1. 1. 1. 1. 1. 0. 0. 0. 0. 1. 1. 0.
 1. 0. 0. 1. 1. 1. 0. 0. 1. 0. 1. 1. 1. 0. 0. 0. 1. 1. 1. 0. 1. 1. 0. 1.
 1. 1. 0. 1. 1. 1. 0. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 0. 0. 0. 0. 1. 0. 0.
 1. 0. 0. 0. 1. 0. 1. 0. 0. 1. 0. 0. 1. 0. 1. 1. 1. 0. 0. 0. 0. 0. 0. 0.
 1. 1. 0. 1. 0. 0. 0. 1. 1. 0. 1. 1. 1. 0. 1. 0. 1. 0. 1. 0. 0. 0. 0. 1.
 1. 0. 0. 1. 0. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 0. 1.
 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.
 1. 1. 0. 0. 0. 0. 0. 0. 1. 0. 0. 1. 0. 0. 0. 0. 0. 0. 0. 0. 0. 1. 1. 1.
 1. 1. 1. 1. 1. 1. 1. 1. 0. 1. 1. 1. 1. 0. 0. 0. 0. 1. 0. 1. 0.
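
The slicing above can be sketched on a much smaller array. The 3-row `toy` table below is made up for illustration (two feature columns and one target column); only the slicing pattern mirrors the notebook. Note that the stop index of a slice is exclusive, so `0:11` selects columns 0 through 10.

```python
import numpy as np

# Hypothetical 3-row table: two feature columns followed by one target column.
toy = np.array([[5.0, 1.0, 1.0],
                [3.0, 2.0, 0.0],
                [4.0, 1.0, 1.0]])

n_features = toy.shape[1] - 1    # every column except the last
X_toy = toy[:, 0:n_features]     # all rows, columns 0 .. n_features-1
y_toy = toy[:, n_features]       # all rows, last column only

print(X_toy.shape)  # (3, 2)
print(y_toy.shape)  # (3,)
print(y_toy)
```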

In [19]:
### Print dataset
df.head()

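|   | MaritalStatus | Gender | YearlyIncome | TotalChildren | NumberChildrenAtHome | EnglishEducation | HouseOwnerFlag | NumberCarsOwned | CommuteDistance | Region | Age | BikeBuyer |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 5 | 1 | 9.0 | 2 | 0 | 5 | 1 | 0 | 2 | 2 | 5 | 1 |
| 1 | 5 | 1 | 6.0 | 3 | 3 | 5 | 0 | 1 | 1 | 2 | 4 | 1 |
| 2 | 5 | 1 | 6.0 | 3 | 3 | 5 | 1 | 1 | 5 | 2 | 4 | 1 |
| 3 | 5 | 2 | 7.0 | 0 | 0 | 5 | 0 | 1 | 10 | 2 | 5 | 1 |
| 4 | 5 | 2 | 8.0 | 5 | 5 | 5 | 1 | 4 | 2 | 2 | 5 | 1 |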


# Step 5.  Build the Model.


**The first hidden layer has 11 neurons and expects 11 input variables (`input_dim=11`).**

**The second hidden layer has 8 neurons.**

**The third hidden layer has 8 neurons.** 

**The output layer has 1 neuron to predict the class.** 

In [20]:
### Define the model 
model = Sequential()
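# Keras 2 names the weight initializer argument `kernel_initializer` (formerly `init`)
model.add(Dense(11, input_dim=11, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(8, kernel_initializer='uniform', activation='relu'))
model.add(Dense(1, kernel_initializer='uniform', activation='sigmoid'))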
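
To get a feel for the size of this network, here is a rough sketch that counts its trainable parameters by hand. It assumes each `Dense` layer keeps its default bias term, so a layer with `n_inputs` inputs and `n_units` neurons has `n_inputs * n_units` weights plus `n_units` biases. The helper `dense_params` is hypothetical and not part of Keras.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# (inputs, units) for each Dense layer in the model above
layer_shapes = [(11, 11), (11, 8), (8, 8), (8, 1)]

per_layer = [dense_params(i, u) for i, u in layer_shapes]
total = sum(per_layer)

print(per_layer)  # [132, 96, 72, 9]
print(total)      # 309
```

Calling `model.summary()` on the real model prints Keras's own per-layer parameter counts, which is a convenient way to check this arithmetic.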

  


# Step 6.  Compile the Model.

* **The 'loss' measures the difference between the predicted values and the actual values.**
* **The optimizer adjusts the weights so the loss can be reduced.**
* **A metric is a function that is used to judge the performance of the model.** 

In [0]:
### Compile the model
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
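
To make the loss less abstract, the sketch below computes binary cross-entropy by hand with NumPy on made-up labels and probabilities. The function name and the clipping constant `eps` are my own assumptions for illustration; this is not Keras's internal code, only the same formula, `-(y*log(p) + (1-y)*log(1-p))` averaged over samples.

```python
import numpy as np

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    p = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))
    return float(np.mean(losses))

y_true = np.array([1.0, 0.0, 1.0, 0.0])
good = np.array([0.9, 0.1, 0.8, 0.2])   # leaning toward the correct class
bad = np.array([0.4, 0.6, 0.3, 0.7])    # leaning toward the wrong class

loss_good = binary_crossentropy(y_true, good)
loss_bad = binary_crossentropy(y_true, bad)

print(loss_good)
print(loss_bad)
```

Predictions that lean toward the correct class give a smaller loss, which is the quantity the optimizer tries to reduce.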

# Step 7.  Fit the Model.

**Set the training hyperparameters.**

* **Epoch:** One full pass over the entire training dataset.

* **Batch_Size:** The number of training samples processed before each weight update.
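
To make these two hyperparameters concrete, here is a small back-of-the-envelope calculation using this notebook's numbers: 500 rows, a batch size of 30, and 200 epochs. It assumes one weight update per batch and that the final, smaller batch of each epoch is still used.

```python
import math

n_samples = 500    # rows in X, from the shape printed earlier
batch_size = 30
epochs = 200

updates_per_epoch = math.ceil(n_samples / batch_size)   # last batch is smaller
last_batch = n_samples - (updates_per_epoch - 1) * batch_size
total_updates = updates_per_epoch * epochs

print(updates_per_epoch)  # 17
print(last_batch)         # 20
print(total_updates)      # 3400
```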

In [22]:
### Train the model
model.fit(X, y, epochs=200, batch_size=30)



Epoch 1/200
Epoch 2/200
...
Epoch 77/200
Epoch 78

<keras.callbacks.History at 0x7f32d06207f0>

# Step 8.  Score the Model.

In [23]:
### Get the model accuracy
scores = model.evaluate(X, y)
print("%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))

acc: 70.20%
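
Because `model.evaluate(X, y)` scores the same 500 rows the model was trained on, this figure is a training accuracy; 70.20% of 500 corresponds to 351 correct predictions. As a rough sketch of how such a number comes about, the snippet below turns sigmoid outputs into class labels and compares them with the true labels. The probabilities are made up, and the 0.5 threshold is an assumption for illustration.

```python
import numpy as np

def accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    y_pred = (y_prob > threshold).astype(float)
    return float(np.mean(y_pred == y_true))

y_true = np.array([1.0, 0.0, 1.0, 0.0, 1.0])
y_prob = np.array([0.9, 0.2, 0.4, 0.6, 0.7])   # made-up sigmoid outputs

toy_accuracy = accuracy(y_true, y_prob)
print(toy_accuracy)  # 0.6
```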
