# Keras Overview

Keras is one of the most popular deep learning frameworks available today

It allows us to build powerful neural nets with only a few lines of code

Keras doesn't do all the work itself - it is a front end layer that runs on top of other Deep Learning toolkits

The back end is supported by either TensorFlow or Theano, which handle the heavy calculations

What makes Keras Unique?

-- Best practices are built in 

-- The default settings in Keras are designed to give good results in most cases

-- It comes with several pre-trained models for image recognition which can be configured for one's own data

# Keras Backends

What are the similarities and differences for TensorFlow and Theano?

Theano was created by the Montreal Institute for Learning Algorithms (MILA) at University of Montreal. It was written in Python and is compatible with GPU.

TensorFlow was created by Google and is used internally to power products such as Google Translate. It allows for GPU acceleration and distributed computing. It is also easy to used TensorFlow in Google Cloud to develop models.

In this course, we will use Keras with a TensorFlow backend. Coding will be done in Keras and calculations in TensorFlow.

Why not use TensorFlow on its own?

-- TensorFlow is a low level solution designed to give you total control over neural net, but requires one to do a lot of work to build models

-- Code is long and detailed and training a network could take several pages of code

-- Keras is a high level solution that allows for faster experimentation and enables us to write less code

When would we want to use TensorFlow alone?

-- Researching new types of ML algorithms and need full control of what we are building

-- Building a large scale system that will support many users

-- If processing and memory efficiency is more important than time saved while coding

When would you want to use Keras?

-- Education and experimentation

-- Prototyping new ML systems

-- Production systems that don't have highly specialized requirements

# Keras Process

Create model with: 

model = keras.model.Sequential()

Add layers with: 

model.add(keras.layers.Dense())

Compile the model with a loss function and optimizer using:

model.compile(loss='mean_squared_error', optimizer = 'adam')

Once this is done, we can begin training the model:

model.fit(training_x, training_y)

We can test the model with:

error_rate = model.evaluate(testing_x, testing_y)

We can then save the model to a file:

model.save('trained_model.h5')

To evaluate the saved model, it must be loaded back in:

model = keras.model.load_model('trained_model.h5')

Make new predictions using the model:

pred = model.predict(new_data)

# Sequential API

When making a model in Keras, we need to choose how many layers there should be, how many nodes there should be, and which nodes should be connected

The Keras Sequential API is the easiest way to build neural nets in Keras

As shown above, we create an empty sequential model object then add layers to it in sequence

Before data flows from one layer to the next, they pass through an activation function

We can choose the activation function for each layer by passing it in as an argument

E.g. rectified linear unit can be selected with model.add(number_neurons, activation='relu')

Customizable layer settings in Keras:

-- Layer activation function

-- Initializer function for node weights

-- Regularization function for node weights

The default settings were made based on best practices, so we only need to specify layer size and activation function

Two other major layer types are convolutional and recurrent:

-- keras.layers.convolutional.Conv2D() - typically used for images or spatial data

-- keras.layers.recurrent.LSTM() - memory is built into each neuron and work well for sequential data

Once the layers are set up, the model must be compiled with an optimizer and loss function

-- This builds out the defined model in the backend

-- The optimizer is the algorithm used to train the neural net

-- The loss function is how the training process measures how right or wrong the predictions are

# Preprocessing training data

We will illustrate the preprocessing steps with the sales_training_data.csv file provided in the exercise folder

The ultimate goal will be to predict sales of future video games given the historical data in our training set

In [1]:
import pandas as pd

In [2]:
df = pd.read_csv("sales_data_training.csv")

df.head()

Unnamed: 0,critic_rating,is_action,is_exclusive_to_us,is_portable,is_role_playing,is_sequel,is_sports,suitable_for_kids,total_earnings,unit_price
0,3.5,1,0,1,0,1,0,0,132717,59.99
1,4.5,0,0,0,0,1,1,0,83407,49.99
2,3.0,0,0,0,0,1,1,0,62423,49.99
3,4.5,1,0,0,0,0,0,1,69889,39.99
4,4.0,1,0,1,0,1,0,1,161382,59.99


In [3]:
df.shape

(1000, 10)

In [4]:
df.describe()

Unnamed: 0,critic_rating,is_action,is_exclusive_to_us,is_portable,is_role_playing,is_sequel,is_sports,suitable_for_kids,total_earnings,unit_price
count,1000.0,1000.0,1000.0,1000.0,1000.0,1000.0,1000.0,1000.0,1000.0,1000.0
mean,3.66,0.466,0.267,0.243,0.347,0.746,0.187,0.27,110705.229,54.17
std,0.834024,0.499092,0.442614,0.42911,0.476254,0.435515,0.390107,0.444182,44970.558163,8.036927
min,2.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,31355.0,39.99
25%,3.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,78830.25,49.99
50%,3.75,0.0,0.0,0.0,0.0,1.0,0.0,0.0,104335.5,59.99
75%,4.5,1.0,1.0,0.0,1.0,1.0,0.0,1.0,133271.0,59.99
max,5.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,301860.0,59.99


In [5]:
# loading in test data

test_df = pd.read_csv("sales_data_test.csv")

test_df.head()

Unnamed: 0,critic_rating,is_action,is_exclusive_to_us,is_portable,is_role_playing,is_sequel,is_sports,suitable_for_kids,total_earnings,unit_price
0,3.5,1,1,1,0,1,0,1,247537,59.99
1,2.5,0,0,0,1,1,0,0,73960,59.99
2,3.5,0,0,0,0,1,1,0,82671,59.99
3,4.0,1,1,0,0,1,0,0,137456,39.99
4,2.0,1,0,1,0,1,0,0,89639,59.99


In [7]:
test_df.shape

(400, 10)

In [8]:
# note that the data are all in their original units

# we need to scale these values so they can be processed in our model

# creating a scaler using mix max scaler from scikit learn

from sklearn.preprocessing import MinMaxScaler

scaler = MinMaxScaler(feature_range = (0, 1))

In [9]:
# now we need to fit and transform og data using scaler

train_scaled = scaler.fit_transform(df)

# we only use transform for the test set to make sure it is scaled the same amount as the training data

test_scaled = scaler.transform(test_df)

In [11]:
# note that the scaled data is not a dataframe anymore - need to create new object with pandas

train_scaled.head()

AttributeError: 'numpy.ndarray' object has no attribute 'head'

In [12]:
# creating new dataframes

train_scaled_df = pd.DataFrame(train_scaled, columns = df.columns)

test_scaled_df = pd.DataFrame(train_scaled, columns = df.columns)

In [13]:
train_scaled_df.head()

Unnamed: 0,critic_rating,is_action,is_exclusive_to_us,is_portable,is_role_playing,is_sequel,is_sports,suitable_for_kids,total_earnings,unit_price
0,0.5,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.374714,1.0
1,0.833333,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.192425,0.5
2,0.333333,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.114852,0.5
3,0.833333,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.142452,0.0
4,0.666667,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.480682,1.0


In [14]:
test_scaled_df.head()

Unnamed: 0,critic_rating,is_action,is_exclusive_to_us,is_portable,is_role_playing,is_sequel,is_sports,suitable_for_kids,total_earnings,unit_price
0,0.5,1.0,0.0,1.0,0.0,1.0,0.0,0.0,0.374714,1.0
1,0.833333,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.192425,0.5
2,0.333333,0.0,0.0,0.0,0.0,1.0,1.0,0.0,0.114852,0.5
3,0.833333,1.0,0.0,0.0,0.0,0.0,0.0,1.0,0.142452,0.0
4,0.666667,1.0,0.0,1.0,0.0,1.0,0.0,1.0,0.480682,1.0


In [15]:
# data looks good - save new data in a csv file

train_scaled_df.to_csv("sales_training_data_scaled.csv", index = False)

test_scaled_df.to_csv("sales_testing_data_scaled.csv", index = False)

In [17]:
# time to define a model using sequential API

from keras.models import Sequential
from keras.layers import *

In [18]:
# pull in scaled dataframes

training_df = pd.read_csv("sales_training_data_scaled.csv")

testing_df = pd.read_csv("sales_testing_data_scaled.csv")

In [19]:
# create X and Y values

X = training_df.drop('total_earnings', axis = 1)

y = training_df['total_earnings']

In [20]:
# define the sequential model

model = Sequential()

In [22]:
# add first dense layer with 50 nodes and 9 inputs (1 per X feature) and rectified linear unit activation function

model.add(Dense(50, input_dim = 9, activation = 'relu'))

In [23]:
# add another dense layer with 100 nodes and relu activation function

model.add(Dense(100, activation='relu'))

In [24]:
# add a third dense layer with 50 nodes and relu again

model.add(Dense(50, activation='relu'))

In [25]:
# we don't really know at first what will give us the best results

# nonetheless, we can tell the model to predict one final value using 1 output node and a linear activation function

model.add(Dense(1, activation = 'linear'))

In [26]:
# the find step of building a model is compiling it using model.compile and specifying a loss function and optimizer

# in this case we will use mean squared error as the loss function and adam as the optimizer

model.compile(loss = 'mse', optimizer = 'adam')

In [27]:
# time to start training the model with model.fit

# a single training pass across the entire dataset is called an epoch

# in this case, we will start with 50 passes through the dataset 

# if epochs is too small it will lose accuracy, but if too large it will take a long time and not add effectiveness

# need to try different values to see where point of diminishing returns occurs

# we will also shuffle the data and make verbose = 2 to show more data to us during training

model.fit(X, y, epochs=50, shuffle=True, verbose=2)

Epoch 1/50
 - 0s - loss: 0.0137
Epoch 2/50
 - 0s - loss: 0.0017
Epoch 3/50
 - 0s - loss: 6.5602e-04
Epoch 4/50
 - 0s - loss: 3.5244e-04
Epoch 5/50
 - 0s - loss: 2.3923e-04
Epoch 6/50
 - 0s - loss: 1.9373e-04
Epoch 7/50
 - 0s - loss: 1.2799e-04
Epoch 8/50
 - 0s - loss: 1.1508e-04
Epoch 9/50
 - 0s - loss: 9.2179e-05
Epoch 10/50
 - 0s - loss: 7.6452e-05
Epoch 11/50
 - 0s - loss: 7.6290e-05
Epoch 12/50
 - 0s - loss: 6.2527e-05
Epoch 13/50
 - 0s - loss: 6.8618e-05
Epoch 14/50
 - 0s - loss: 6.3850e-05
Epoch 15/50
 - 0s - loss: 5.8307e-05
Epoch 16/50
 - 0s - loss: 4.3718e-05
Epoch 17/50
 - 0s - loss: 4.2576e-05
Epoch 18/50
 - 0s - loss: 3.7419e-05
Epoch 19/50
 - 0s - loss: 3.5669e-05
Epoch 20/50
 - 0s - loss: 5.3037e-05
Epoch 21/50
 - 0s - loss: 3.5809e-05
Epoch 22/50
 - 0s - loss: 3.1683e-05
Epoch 23/50
 - 0s - loss: 2.7888e-05
Epoch 24/50
 - 0s - loss: 2.8579e-05
Epoch 25/50
 - 0s - loss: 3.2584e-05
Epoch 26/50
 - 0s - loss: 3.1268e-05
Epoch 27/50
 - 0s - loss: 2.7697e-05
Epoch 28/50
 - 0s 

<keras.callbacks.History at 0x2c256d0d550>

In [28]:
# create testing datasets

X_test = testing_df.drop('total_earnings', axis = 1)

y_test = testing_df['total_earnings']

In [29]:
# lets find the testing error rate

test_error_rate = model.evaluate(X_test, y_test, verbose = 0)

In [33]:
print(f"The testing MSE was {test_error_rate}")

The testing MSE was 4.447766454541124e-05


In [34]:
# now that the model has been trained and tested, we can expose it to new data and make predictions

# we will load in a single line of data for a new video game we want to forecast

new_product = pd.read_csv("proposed_new_product.csv")

In [35]:
# note that data is already scaled

new_product

Unnamed: 0,critic_rating,is_action,is_exclusive_to_us,is_portable,is_role_playing,is_sequel,is_sports,suitable_for_kids,unit_price
0,0.7,1.0,1.0,1.0,0.0,1.0,0.0,1.0,0.8


In [36]:
# make a prediction about the new product

pred = model.predict(new_product)

In [37]:
# keras returns predictions in a 2d array bc it assumes multiple predictions with multiple output values

# we will grab the first object in the arrays

pred = pred[0][0]

In [38]:
# rescale from 0 to 1 range back into dollars - we can use the constants from our scaled data

pred = pred + 0.1159

pred = pred / 0.0000036968

In [39]:
print(f"The predicted total earnings for the new product is {pred}")

The predicted total earnings for the new product is 261804.2334053223


# Saving trained models

Instead of retraining a model each time we want to use it, we can save it to our files

To save a Keras model, we just call model.save() and pass in file name

We will save to .h5 file format because it is meant for storing Python array data

We can also load in saved models using the load_model function from keras.models

In [40]:
model.save("trained_model.h5")

In [41]:
# now we can load our model in using load_model

from keras.models import load_model

mod = load_model("trained_model.h5")

In [43]:
# we will go over the same new product prediction process using the loaded in model

pred = mod.predict(new_product)

pred = pred[0][0]

pred = pred + 0.1159

pred = pred / 0.0000036968

print(f"The predicted total earnings for the new product is {pred}")

The predicted total earnings for the new product is 261804.2334053223
