<a href="https://colab.research.google.com/github/DavidSenseman/BIO1173/blob/master/Class_03_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

---------------------------
**COPYRIGHT NOTICE:** This Jupyterlab Notebook is a Derivative work of [Jeff Heaton](https://github.com/jeffheaton) licensed under the Apache License, Version 2.0 (the "License"); You may not use this file except in compliance with the License. You may obtain a copy of the License at

> [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

------------------------

# **BIO 1173: Intro Computational Biology**

**Module 3: Introduction to TensorFlow**

* Instructor: [David Senseman](mailto:David.Senseman@utsa.edu), [Department of Integrative Biology](https://sciences.utsa.edu/integrative-biology/), [UTSA](https://www.utsa.edu/)


### Module 3 Material

* Part 3.1: Deep Learning and Neural Network Introduction
* **Part 3.2: Using Keras to Build Regression Models**
* Part 3.3: Using Keras to Build Classification Models
* Part 3.4: Saving and Loading a Keras Neural Network
* Part 3.5: Early Stopping in Keras to Prevent Overfitting

## Google CoLab Instructions

The following code ensures that Google CoLab is running the correct version of TensorFlow.

In [1]:
try:
    %tensorflow_version 2.x
    COLAB = True
    print("Note: using Google CoLab")
except:
    print("Note: not using Google CoLab")
    COLAB = False

Note: not using Google CoLab


### Lesson Setup

Run the next code cell to load necessary packages

In [2]:
# You MUST run this code cell first

# Import TensorFlow/Keras modules
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras.callbacks import EarlyStopping

# Import scikit-learn metrics
from sklearn import metrics

# Import other needed packages
import time
import numpy as np
import pandas as pd
import requests
import os
os.environ['tf.compat.v1.logging.set_verbosity'] = '1'
import shutil
path = '/'
memory = shutil.disk_usage(path)
dirpath = os.getcwd()

# Print out diagnostics
print("Your current working directory is : " + dirpath)
print("Disk", memory)
print("TensorFlow version:", tf.version.VERSION)
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))

Your current working directory is : C:\Users\David\BIO1173\Class_03_2
Disk usage(total=4000108531712, used=999028875264, free=3001079656448)
TensorFlow version: 2.10.0
Num GPUs Available:  1


# Part 3.2: Using Keras to Build Regression Models

[Keras](https://keras.io/) is a **_software layer_** or API (application programming interface) that "sits" on top of TensorFlow. This makes building neural networks much easier. 

For example, the following code uses Keras to add one layer to a neural network called `model`:
~~~text
model.add(Dense(25, input_dim=x_0.shape[1], activation='relu'))
~~~

When you run this line of code, the apparently simple call `model.add(Dense(25, ...))` executes a large number of hidden commands "behind the scenes". 

For example, here is the code needed just to define a simplified dense layer class, `SimpleDense(Layer)`:

~~~text
class SimpleDense(Layer):
    def __init__(self, units=32):
        super().__init__()
        self.units = units

    # Create the state of the layer (weights)
    def build(self, input_shape):
        self.kernel = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="glorot_uniform",
            trainable=True,
            name="kernel",
        )
        self.bias = self.add_weight(
            shape=(self.units,),
            initializer="zeros",
            trainable=True,
            name="bias",
        )

    # Defines the computation
    def call(self, inputs):
        return ops.matmul(inputs, self.kernel) + self.bias

# Instantiates the layer.
linear_layer = SimpleDense(4)

# This will also call `build(input_shape)` and create the weights.
y = linear_layer(ops.ones((2, 2)))
assert len(linear_layer.weights) == 2

# These weights are trainable, so they're listed in `trainable_weights`:
assert len(linear_layer.trainable_weights) == 2
~~~
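
As a rough illustration of what a dense layer computes, the sketch below reproduces the `call` step (`inputs @ kernel + bias`, followed by an optional ReLU) in plain NumPy. The function name `dense_forward`, the array sizes, and the random starting weights are assumptions made for this illustration only; Keras uses the initializers shown above and its own optimized routines.

```python
import numpy as np

def dense_forward(inputs, kernel, bias, activation=None):
    """Compute one dense layer: inputs @ kernel + bias, then an optional ReLU."""
    z = inputs @ kernel + bias
    if activation == "relu":
        return np.maximum(z, 0.0)  # ReLU sets negative values to zero
    return z

rng = np.random.default_rng(0)
x = np.ones((2, 7), dtype="float32")                 # 2 samples, 7 features (illustrative)
kernel = rng.normal(size=(7, 25)).astype("float32")  # one weight per input/neuron pair
bias = np.zeros(25, dtype="float32")                 # one bias per neuron

hidden = dense_forward(x, kernel, bias, activation="relu")
print(hidden.shape)  # (2, 25)
```

Each of the 2 samples is turned into 25 numbers, one per neuron in the layer.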

Rather than asking you to create each individual layer in a network with hundreds of lines of Python and TensorFlow code, you will be asked to use only a handful of Keras commands. 

Unless you are researching entirely new structures of deep neural networks, it is unlikely that you need to program TensorFlow directly. For this class, we will usually use TensorFlow _through_ Keras, rather than programming in TensorFlow directly.


## Example 1: Simple TensorFlow/Keras Regression

In [Regression Analysis](https://en.wikipedia.org/wiki/Regression_analysis), the goal is to estimate the relationships between a [dependent variable](https://en.wikipedia.org/wiki/Dependent_and_independent_variables) (often called the 'outcome' or 'response' variable, or a 'label' in machine learning parlance) and one or more [independent variables](https://en.wikipedia.org/wiki/Dependent_and_independent_variables)  (often called 'predictors', 'covariates', 'explanatory variables' or 'features').  

With multiple independent variables, the equation for a linear regression is 

> $ Y_{i} = \alpha + \beta_{1} X_{i,1} + \beta_{2} X_{i,2} + \cdots + \beta_{n} X_{i,n} $

where $Y_{i}$ is the **_dependent_** variable and $X_{i,1}, \ldots, X_{i,n}$ are the **_independent_** variables. 

In the language of machine learning, the $Y$-value is the **_response variable_** that we are trying to predict, while the $X$-values are the **_features_** that we are using to predict the value of $Y$. While we already know the values for $X$ and $Y$, what we don't know are the values of the coefficients: the intercept $\alpha$ and one $\beta$ for each independent variable. 
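
To make the idea of "unknown coefficients" concrete, here is a minimal sketch that recovers the intercept and slopes of a linear regression from data using ordinary least squares in NumPy. The data are synthetic and the true coefficient values (`true_alpha`, `true_beta`) are invented for this illustration; they have nothing to do with the Apple Quality dataset.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 200

# Synthetic features (X) and a noise-free response (Y) built from known coefficients
X = rng.normal(size=(n_samples, 3))
true_alpha = 1.5
true_beta = np.array([2.0, -1.0, 0.5])
Y = true_alpha + X @ true_beta

# Add a column of ones so the intercept is estimated along with the slopes
design = np.column_stack([np.ones(n_samples), X])
coef, *_ = np.linalg.lstsq(design, Y, rcond=None)

alpha_hat, beta_hat = coef[0], coef[1:]
print(alpha_hat, beta_hat)
```

Because the synthetic data contain no noise, the estimated coefficients should agree with the true ones up to floating-point error. A neural network approaches the same kind of problem iteratively instead of solving it in one step.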


Example 1 shows how to encode the Apple Quality dataset for regression and predict values. We will see if we can predict the ripeness of an apple based on the apple's size, weight, sweetness, crunchiness, and other features. Example 1 is divided into 3 steps to make the code easier to follow. 

### Example 1-Step 1: Read dataset and create a DataFrame.

Most lessons in the course begin by downloading the dataset from the course HTTPS server and creating a DataFrame to hold the information. The code in the cell below downloads the Apple Quality dataset and creates a DataFrame called `ripeDF`. In most Python examples, just the letters `df` are used as the name of a DataFrame. Some lessons will adopt this convention, but most of the time we will use a more descriptive name, as in this example, where `ripeDF` is the name of our DataFrame.

As is customary, after creating our new DataFrame, we display the first few rows and columns to make sure the data was read correctly.


In [3]:
# Example 1-Step 1: Read data and create DataFrame

# Read the datafile and create DataFrame
ripeDF = pd.read_csv(
    "https://biologicslab.co/BIO1173/data/apple_quality.csv", 
    na_values=['NA', '?'])

# Create variable for later
appleNum = ripeDF['A_id']

# Set the max rows and max columns
pd.set_option('display.max_rows', 8)
pd.set_option('display.max_columns', 8)

# Display the DataFrame
display(ripeDF)

|      | A_id | Size      | Weight    | Sweetness | ... | Juiciness | Ripeness  | Acidity   | Quality |
|------|------|-----------|-----------|-----------|-----|-----------|-----------|-----------|---------|
| 0    | 0    | -3.970049 | -2.512336 | 5.346330  | ... | 1.844900  | 0.329840  | -0.491590 | good    |
| 1    | 1    | -1.195217 | -2.839257 | 3.664059  | ... | 0.853286  | 0.867530  | -0.722809 | good    |
| 2    | 2    | -0.292024 | -1.351282 | -1.738429 | ... | 2.838636  | -0.038033 | 2.621636  | bad     |
| 3    | 3    | -0.657196 | -2.271627 | 1.324874  | ... | 3.637970  | -3.413761 | 0.790723  | good    |
| ...  | ...  | ...       | ...       | ...       | ... | ...       | ...       | ...       | ...     |
| 3996 | 3996 | -0.293118 | 1.949253  | -0.204020 | ... | 0.024523  | -1.087900 | 1.854235  | good    |
| 3997 | 3997 | -2.634515 | -2.138247 | -2.440461 | ... | 2.199709  | 4.763859  | -1.334611 | bad     |
| 3998 | 3998 | -4.008004 | -1.779337 | 2.366397  | ... | 2.161435  | 0.214488  | -2.229720 | good    |
| 3999 | 3999 | 0.278540  | -1.715505 | 0.121217  | ... | 1.266677  | -0.776571 | 1.599796  | good    |


If your code is correct you should see the following output:

![__](http://biologicslab.co/BIO1173/images/class_03_2_Exm1.png)

Notice that the only column that contains non-numeric values is `Quality`.


### Example 1-Step 2: Create Feature Vector

The first step in creating a feature vector is to convert any categorical values (strings) into numerical values. All of the data in the Apple Quality dataset is numeric with the exception of the column `Quality` which has 2 categorical string values, `bad` and `good`. 

Instead of using One-Hot Encoding, the code below uses the Pandas `map()` method to convert the string `bad` to `0` and the string `good` to the value `1`. 
~~~text
# Define the mapping dictionary
mapping = {'bad': 0, 'good': 1}
# Map the strings to integers
ripeDF['Quality'] = ripeDF['Quality'].map(mapping)
~~~
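
The small sketch below shows how `map()` behaves on a toy DataFrame. The DataFrame `toy`, the extra string `unknown`, and the column name `Quality_num` are made up for this illustration. It also shows a common pitfall: any value that is not a key in the mapping dictionary becomes `NaN`, which is what would happen to every row if the mapping were applied a second time to a column that has already been converted.

```python
import pandas as pd

# Toy data, invented for illustration (not the Apple Quality dataset)
toy = pd.DataFrame({"Quality": ["good", "bad", "good", "unknown"]})

# Define the mapping dictionary and map the strings to numbers
mapping = {"bad": 0, "good": 1}
toy["Quality_num"] = toy["Quality"].map(mapping)

# Values missing from the dictionary (here "unknown") become NaN
n_unmapped = int(toy["Quality_num"].isna().sum())
print(toy)
print("Unmapped rows:", n_unmapped)

# Mapping an already-converted column again finds no matching keys: all NaN
mapped_twice = toy["Quality_num"].map(mapping)
print("All NaN after second mapping:", bool(mapped_twice.isna().all()))
```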
The next step is to generate the X-values for the regression. In this example we want to use **ALL** of the numeric values in the `ripeDF` DataFrame, **except** the values in the column `Ripeness`. This should make sense. We never want to include the Y-values with the X-values.

The $X$-values are the numbers contained in the `ripeDF` DataFrame columns, `Size`, `Weight`, `Sweetness`, `Crunchiness`, `Juiciness`, `Acidity` and `Quality`. One way to extract these **values** and create a Numpy array called `ripeX` is shown in this code chunk: 
~~~text
# Generate X-values
ripeX = ripeDF[['Size', 'Weight', 'Sweetness', 'Crunchiness',
       'Juiciness', 'Acidity', 'Quality']].values
ripeX = np.asarray(ripeX).astype('float32')
~~~
Notice the double square brackets `[[ ]]` followed by the attribute `.values`. The double brackets select a list of columns, and `.values` **_converts_** the Pandas numerical values in those columns into a Numpy array, which we call `ripeX`. 

After we generate our X-values, we need to make sure all of the values are in the correct format for Keras, which is `float32`. You should **always** convert your X and Y values to `float32` to avoid errors. 
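
A quick way to confirm that the conversion worked is to inspect the `shape` and `dtype` of the resulting arrays. The sketch below uses a three-row toy DataFrame invented for illustration; the same checks could be applied to the real arrays.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration
toy = pd.DataFrame({
    "Size": [-3.97, -1.20, -0.29],
    "Weight": [-2.51, -2.84, -1.35],
    "Ripeness": [0.33, 0.87, -0.04],
})

# Double brackets select several columns, giving a 2-D array: (rows, features)
X = toy[["Size", "Weight"]].values
X = np.asarray(X).astype("float32")

# Single brackets select one column, giving a 1-D array: (rows,)
y = toy["Ripeness"].values
y = np.asarray(y).astype("float32")

print(X.shape, X.dtype)  # (3, 2) float32
print(y.shape, y.dtype)  # (3,) float32
```

The second number in `X.shape` is the number of features, which is the value a network built on these data would need for its input size.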


Now we can generate our Y-values. The Y-values are in the `Ripeness` column, and we can use a similar code chunk to extract them into a Numpy array called `ripeY` as shown in this code chunk:
~~~text
# Generate Y-values
ripeY = ripeDF['Ripeness'].values
ripeY = np.asarray(ripeY).astype('float32')
~~~
Again, we are using the Pandas attribute `.values` (note that there are no parentheses) to convert the numerical values in the `ripeDF` DataFrame into a Numpy array. The arrays `ripeX` and `ripeY` are the X and Y values that we will use to train our neural network.

The last line of code prints out the Numpy array created for the $Y$-values.

In [4]:
# Example 1-Step 2: Create Feature Vector

# Convert strings to integers
mapping = {'bad': 0, 'good': 1}  # define mapping
ripeDF['Quality'] = ripeDF['Quality'].map(mapping) # map

# Generate X values
ripeX = ripeDF[['Size', 'Weight', 'Sweetness', 'Crunchiness',
       'Juiciness', 'Acidity', 'Quality']].values
ripeX = np.asarray(ripeX).astype('float32')

# Generate Y values
ripeY = ripeDF['Ripeness'].values
ripeY = np.asarray(ripeY).astype('float32')

# Print Y values
print(ripeY)

[ 0.3298398   0.8675301  -0.03803333 ...  4.7638593   0.21448839
 -0.77657145]


If your code is correct, you should see the following output:
~~~text
[ 0.3298398   0.86753008 -0.03803333 ...  4.76385918  0.21448838
 -0.77657147]
~~~

If you compare the Numpy array printed above for the $Y$-values, you will see that the values match the `Ripeness` values that were displayed in Example 1-Step 1 above, apart from small differences in the last decimal places due to rounding.

### Example 1-Step 3: Construct, compile and train the neural network

The last step is to build our neural network, compile and train it. Using Keras these steps are relatively easy -- once you understand what the various commands mean.

We begin by telling Keras what kind of model we want. There are actually several different types of neural networks. In this example we will be using a **Feedforward Neural Network (FNN)**, which is the simplest type: information flows from the input layer through the hidden layers to the output layer without loops. In Keras, this kind of layer-by-layer network is built with the `Sequential` model.

~~~text
# Specify the model type
ripeModel = Sequential()
~~~
For our `ripeModel` model, we are going to add 3 `Dense` layers:

* a first hidden layer, which also defines the input layer
* a second hidden layer
* one output layer 

As you might expect, we need to add each layer in the correct order, starting with the layer closest to the input. 

The first layer we add is somewhat special. It is the first hidden layer, referred to as `Hidden 1`, with 25 neurons, but its `input_dim` argument also defines the input layer, which needs to have one neuron for each X-value. The line of code for adding this layer is: 
~~~text
ripeModel.add(Dense(25, input_dim=ripeX.shape[1], 
                        activation='relu'))  # Hidden 1 
~~~
Pay particular attention to the argument `input_dim=ripeX.shape[1]`. This argument tells Keras exactly how many input neurons should be put into the input layer. When students in the course build neural networks using "copy-and-paste", they often forget to change this argument, which will prevent their model from training. 

In this example there are `7` different inputs: 'Size', 'Weight', 'Sweetness', 'Crunchiness', 'Juiciness', 'Acidity', and 'Quality', so the value of `ripeX.shape[1]` is `7`.

Next, we tell Keras that we want a **_second hidden layer_** with 10 neurons. The code for adding the second hidden layer with 10 neurons is:

~~~text
ripeModel.add(Dense(10, activation='relu'))  # Hidden 2
~~~
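The layer type `Dense` tells Keras that we want **_every_** neuron in the second hidden layer to be connected to **_every_** neuron in the previous layer (`Hidden 1`). The output layer added next is also `Dense`, so it is in turn connected to every neuron in the second hidden layer.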

The output layer is where we will "find" our answer. In a regression model, there is only a **_single neuron_** in the output layer. The numerical prediction of `Ripeness` for a given apple, ($Y$) is the numerical value in this one output neuron.  In other words, the numerical value in the output neuron, at the end of training, will predict the `Ripeness` of a particular apple given its 'Size', 'Weight', 'Sweetness', 'Crunchiness','Juiciness', 'Acidity', and 'Quality'. 

~~~text
ripeModel.add(Dense(1)) # Output
~~~
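Notice that we don't specify an activation function for the output layer. When no activation is given, Keras passes the neuron's value through unchanged (a linear output), which is what we want for regression, since the predicted value can be any positive or negative number.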

Once we have specified all of the different layers that we want in our model, the next step is to **_Compile_** the model. The compile step sets up the framework for your model. It involves:

* Checking for format errors.
* Defining the loss function, which quantifies how well the model’s predictions match the actual target values.
* Choosing an optimizer (such as stochastic gradient descent) and, optionally, setting its learning rate.
* Selecting metrics to evaluate the model’s performance during training.

In our model, we will select the [mean squared error](https://en.wikipedia.org/wiki/Mean_squared_error) as the loss function, and 'adam' as the optimizer. The Adam optimizer is a popular algorithm used in deep learning to adjust the parameters of a neural network during training. Adam stands for _Adaptive Moment Estimation_: it adapts the step size for each parameter using running averages of that parameter's past gradients (a form of momentum) and of their squares. 

~~~text
# Compile model
ripeModel.compile(loss='mean_squared_error', optimizer='adam')
~~~
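
To see what the mean squared error measures, the sketch below computes it by hand for four made-up predictions. The numbers in `y_true` and `y_pred` are invented for illustration and are not taken from the Apple Quality dataset.

```python
import numpy as np

# Made-up target values and predictions
y_true = np.array([0.5, -1.0, 2.0, 0.0], dtype="float32")
y_pred = np.array([0.0, -1.0, 1.0, 1.0], dtype="float32")

# Error for each sample: [-0.5, 0.0, -1.0, 1.0]
errors = y_pred - y_true

# Square each error and take the mean: (0.25 + 0 + 1 + 1) / 4
mse = float(np.mean(errors ** 2))
print(mse)  # 0.5625
```

Squaring makes every error positive and penalizes large errors more heavily than small ones, so a lower value means the predictions are closer to the targets.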

The next line of code uses the Keras `model.summary()` function to print out a summary of our model.
~~~text
# Print model (optional)
ripeModel.summary()
~~~
This step is optional. 
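
The `Param #` column that `summary()` prints for this model can be checked by hand. Each neuron in a `Dense` layer has one weight per input plus one bias, so a layer has `(inputs + 1) * neurons` trainable parameters. The helper name `dense_params` in the sketch below is made up for this illustration.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters in a Dense layer: one weight per input plus one bias, per neuron."""
    return (n_inputs + 1) * n_units

# 7 inputs -> 25 neurons -> 10 neurons -> 1 output neuron
layer_sizes = [7, 25, 10, 1]

counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(counts)

print(counts, total)  # [200, 260, 11] 471
```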

The last line of code "fits the model to the data". This is computerspeak for training the model.

~~~text
# Train model
ripeModel.fit(ripeX,ripeY,verbose=2,epochs=100)
~~~

In [5]:
# Example 1-Step 3: Build the neural network 

# Specify the model type as sequential
ripeModel = Sequential()

# Construct model
ripeModel.add(Dense(25, input_dim=ripeX.shape[1], 
                        activation='relu'))  # Hidden 1
ripeModel.add(Dense(10, activation='relu'))  # Hidden 2
ripeModel.add(Dense(1)) # Output

# Compile model
ripeModel.compile(loss='mean_squared_error', optimizer='adam')

# Print model
ripeModel.summary()

# Train model
ripeModel.fit(ripeX,ripeY,verbose=2,epochs=100)

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense (Dense)               (None, 25)                200       
                                                                 
 dense_1 (Dense)             (None, 10)                260       
                                                                 
 dense_2 (Dense)             (None, 1)                 11        
                                                                 
Total params: 471
Trainable params: 471
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
125/125 - 1s - loss: 2.9913 - 593ms/epoch - 5ms/step
Epoch 2/100
125/125 - 0s - loss: 2.0743 - 322ms/epoch - 3ms/step
Epoch 3/100
125/125 - 0s - loss: 1.7320 - 326ms/epoch - 3ms/step
Epoch 4/100
125/125 - 0s - loss: 1.5860 - 324ms/epoch - 3ms/step
Epoch 5/100
125/125 - 0s - loss: 1.4896 - 328ms/epoch - 3ms/st

<keras.callbacks.History at 0x1e2b58bcbe0>

If your code is correct you should see the following output:

~~~text
Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 25)                200       
                                                                 
 dense_1 (Dense)             (None, 10)                260       
                                                                 
 dense_2 (Dense)             (None, 1)                 11        
                                                                 
=================================================================
Total params: 471
Trainable params: 471
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
125/125 - 1s - loss: 2.8682 - 602ms/epoch - 5ms/step
Epoch 2/100
125/125 - 0s - loss: 2.0818 - 327ms/epoch - 3ms/step
Epoch 3/100
125/125 - 0s - loss: 1.7714 - 332ms/epoch - 3ms/step
Epoch 4/100
125/125 - 0s - loss: 1.5415 - 324ms/epoch - 3ms/step
Epoch 5/100
125/125 - 0s - loss: 1.4121 - 324ms/epoch - 3ms/step

........................

Epoch 95/100
125/125 - 0s - loss: 0.7104 - 461ms/epoch - 4ms/step
Epoch 96/100
125/125 - 0s - loss: 0.7077 - 465ms/epoch - 4ms/step
Epoch 97/100
125/125 - 0s - loss: 0.7116 - 466ms/epoch - 4ms/step
Epoch 98/100
125/125 - 0s - loss: 0.7080 - 497ms/epoch - 4ms/step
Epoch 99/100
125/125 - 0s - loss: 0.7100 - 463ms/epoch - 4ms/step
Epoch 100/100
125/125 - 0s - loss: 0.7053 - 430ms/epoch - 3ms/step

<keras.callbacks.History at 0x1d8ba2baa30>
~~~

-----------------------------------

## **"Fit" the Model?**

In machine learning, the term "fit the model" means to **_train_** the model using the numerical values in our dataset. During training, the model **_learns_** from the data. In essence, the model makes a prediction of what it "thinks" is the correct answer, and then adjusts its trainable parameters (weights and biases) in an effort to make its predictions more accurate (i.e. minimize the loss function).

The **fit step** involves:

* Forward passes (feeding input data through the network).
* Backward passes (calculating derivatives using backpropagation).
* Updating weights based on gradients to improve predictions.
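
The three steps listed above can be sketched in a few lines of NumPy for the simplest possible "network": a single linear neuron. This is only an illustration under simplifying assumptions. It uses synthetic data, plain gradient descent on the whole dataset at once, and a hand-derived gradient, whereas Keras uses backpropagation through all layers, mini-batches, and (in this lesson) the Adam optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y depends linearly on two features (values invented for illustration)
X = rng.normal(size=(100, 2)).astype("float32")
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 1.0

w = np.zeros(2)       # weights start at zero in this sketch
b = 0.0               # bias
learning_rate = 0.1
losses = []

for epoch in range(50):
    # Forward pass: feed the inputs through the model
    pred = X @ w + b
    error = pred - y
    losses.append(float(np.mean(error ** 2)))  # mean squared error

    # Backward pass: gradient of the loss with respect to each parameter
    grad_w = 2.0 * X.T @ error / len(y)
    grad_b = 2.0 * float(np.mean(error))

    # Update: move each parameter a small step against its gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.6f}")
print("weights:", w, "bias:", b)
```

The loss shrinks from one pass to the next as the weights move toward the values used to generate the data (3, -2) and the bias moves toward 1.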

The fit step is by far the most _computationally_ demanding step. This is where GPUs and TPUs are used to speed up the training. With relatively small neural networks like this one, a relatively modern laptop will come with a **_central processing unit (CPU)_** that can handle all the computations involved in training in a reasonable time period. Increasingly, newer laptops are being equipped with "AI chips" that can speed up training by a significant amount. 

The command:
~~~text
# Fit the model to the data
ripeModel.fit(ripeX,ripeY,verbose=2,epochs=100)
~~~
has the following 4 arguments: 

* the $X$-values
* the $Y$-values
* the level of _verbosity_ (how much feedback should be printed out during training)
* the number of `epochs`

In the example above, the number of epochs is set to `100`. An epoch means training the neural network with all the training data for one complete cycle. During an epoch, all of the data is used exactly once. A forward pass and a backward pass together are counted as one pass. 

With the verbosity set to `2`, Keras will print out the loss value, the number of milliseconds the epoch required, and the time per step. 
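
The `125/125` at the start of each output line is the number of steps (batches) in the epoch. It follows from the size of the dataset and the batch size. The sketch below assumes the Keras default of 32 samples per batch, which applies here because no `batch_size` argument was passed to `fit()`.

```python
import math

n_samples = 4000   # apples in the dataset
batch_size = 32    # Keras default when batch_size is not specified

# Each step processes one batch; a final partial batch would still count as a step
steps_per_epoch = math.ceil(n_samples / batch_size)
print(steps_per_epoch)  # 125
```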

-------------------------

Notice that the **loss value** decreases from `2.4560` after the 1st epoch (`Epoch 1/100`), 
~~~text
Epoch 1/100
125/125 - 1s - loss: 2.4560 - 966ms/epoch - 8ms/step
~~~
to less than a third of that amount, `0.7313` after the last epoch (`Epoch 100/100`)
~~~text
Epoch 100/100
125/125 - 0s - loss: 0.7313 - 374ms/epoch - 3ms/step
~~~
This decrease in loss is due to the neural network **_learning_**. During each epoch, the network makes slight adjustments to its _trainable parameters_ (i.e. biases and connection weights) after every batch of data, and in the next epoch it runs the complete dataset through the model again to see if the updated parameters do a better job of predicting the `Ripeness` of each apple in the dataset.

## **Exercise 1: Simple Tensorflow Regression**

For **Exercise 1** you are to build the same Feedforward Neural Network (FNN) demonstrated in Example 1. However, this time your goal will be to build a regression neural network that can predict the `Acidity` of an apple. In other words, the column `Acidity` will be your response variable (Y-values). 

As was done in Example 1, **Exercise 1** has been divided into 3 steps:

1. Read dataset and create a DataFrame
2. Create Feature Vector
3. Construct, compile and train the neural network

### **Exercise 1-Step 1: Read dataset and create a DataFrame**

In the cell below, write the code to create a DataFrame called `acidDF` by reading the Apple Quality dataset from the course HTTPS server. Set the display options to show `8` rows and `8` columns, and then display your `acidDF` DataFrame. 

In [6]:
# Insert your code for Exercise 1-Step 1 here:

acidDF = pd.read_csv(
    "https://biologicslab.co/BIO1173/data/apple_quality.csv", 
    na_values=['NA', '?'])

# Set the max rows and max columns
pd.set_option('display.max_rows', 8)
pd.set_option('display.max_columns', 8)

# Display the DataFrame
display(acidDF)

|      | A_id | Size      | Weight    | Sweetness | ... | Juiciness | Ripeness  | Acidity   | Quality |
|------|------|-----------|-----------|-----------|-----|-----------|-----------|-----------|---------|
| 0    | 0    | -3.970049 | -2.512336 | 5.346330  | ... | 1.844900  | 0.329840  | -0.491590 | good    |
| 1    | 1    | -1.195217 | -2.839257 | 3.664059  | ... | 0.853286  | 0.867530  | -0.722809 | good    |
| 2    | 2    | -0.292024 | -1.351282 | -1.738429 | ... | 2.838636  | -0.038033 | 2.621636  | bad     |
| 3    | 3    | -0.657196 | -2.271627 | 1.324874  | ... | 3.637970  | -3.413761 | 0.790723  | good    |
| ...  | ...  | ...       | ...       | ...       | ... | ...       | ...       | ...       | ...     |
| 3996 | 3996 | -0.293118 | 1.949253  | -0.204020 | ... | 0.024523  | -1.087900 | 1.854235  | good    |
| 3997 | 3997 | -2.634515 | -2.138247 | -2.440461 | ... | 2.199709  | 4.763859  | -1.334611 | bad     |
| 3998 | 3998 | -4.008004 | -1.779337 | 2.366397  | ... | 2.161435  | 0.214488  | -2.229720 | good    |
| 3999 | 3999 | 0.278540  | -1.715505 | 0.121217  | ... | 1.266677  | -0.776571 | 1.599796  | good    |


If your code is correct you should see the following output:

![__](http://biologicslab.co/BIO1173/images/class_03_2_Exm1.png)


### **Exercise 1-Step 2: Create Feature Vector**

In the cell below, create a Feature Vector for your regression model. Start by mapping the categorical values in the column `Quality` to integers as was demonstrated in Example 1-Step 2. 

Then generate your X-values. Remember that your model will be predicting `Acidity`, so you **don't** want to include its values with your other X-values. You can use this code chunk to generate your X-values:

~~~text
# Generate X values
acidX = acidDF[['Size', 'Weight', 'Sweetness', 'Crunchiness',
       'Juiciness', 'Ripeness', 'Quality']].values
acidX = np.asarray(acidX).astype('float32')
~~~
As you can see, compared with Example 1, the string `Acidity` has been replaced by the string `Ripeness` in the list of X columns. 

You can use this code chunk to generate your Y-values:

~~~text
# Generate Y values
acidY = acidDF['Acidity'].values
acidY = np.asarray(acidY).astype('float32')
~~~

As a check, print out your Numpy array `acidY`.

In [7]:
# Insert your code for Exercise 1-Step 2 here

# Convert strings to integers
mapping = {'bad': 0, 'good': 1}  # define mapping
acidDF['Quality'] = acidDF['Quality'].map(mapping) # map

# Generate X values
acidX = acidDF[['Size', 'Weight', 'Sweetness', 'Crunchiness',
       'Juiciness', 'Ripeness', 'Quality']].values
acidX = np.asarray(acidX).astype('float32')

# Generate Y values
acidY = acidDF['Acidity'].values
acidY = np.asarray(acidY).astype('float32')

# Print Y values
print(acidY)

[-0.49159047 -0.7228094   2.6216364  ... -1.3346114  -2.2297199
  1.5997964 ]


If your code is correct, you should see the following output:
~~~text
[-0.49159048 -0.72280937  2.62163647 ... -1.33461139 -2.22971981
  1.59979646]
~~~

### **Exercise 1-Step 3: Construct, compile and train the neural network**

In the cell below, construct a regression neural network called `acidModel` by copying and pasting the code from Example 1-Step 3.

Don't forget to change the value of the argument `input_dim` in the input layer. Your model should read:
~~~text
acidModel.add(Dense(25, input_dim=acidX.shape[1], 
                        activation='relu'))  # Hidden 1
~~~
As mentioned above, students in this course often forget to change this argument, which prevents the model from training. 

In [8]:
# Insert your code for Exercise 1-Step 3 here

# Specify the model type as sequential
acidModel = Sequential()

# Build model
acidModel.add(Dense(25, input_dim=acidX.shape[1], 
                        activation='relu'))  # Hidden 1 
acidModel.add(Dense(10, activation='relu'))  # Hidden 2
acidModel.add(Dense(1))  # Output

# Compile model
acidModel.compile(loss='mean_squared_error', optimizer='adam')

# Print model summary
acidModel.summary()

# Train model
acidModel.fit(acidX,acidY,verbose=2,epochs=100)

Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 dense_3 (Dense)             (None, 25)                200       
                                                                 
 dense_4 (Dense)             (None, 10)                260       
                                                                 
 dense_5 (Dense)             (None, 1)                 11        
                                                                 
Total params: 471
Trainable params: 471
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
125/125 - 1s - loss: 4.0237 - 745ms/epoch - 6ms/step
Epoch 2/100
125/125 - 0s - loss: 3.4108 - 422ms/epoch - 3ms/step
Epoch 3/100
125/125 - 0s - loss: 3.0480 - 417ms/epoch - 3ms/step
Epoch 4/100
125/125 - 0s - loss: 2.7907 - 414ms/epoch - 3ms/step
Epoch 5/100
125/125 - 0s - loss: 2.5917 - 417ms/epoch - 3ms/

<keras.callbacks.History at 0x1e2b82bbac0>

If your code is correct, your output should start with something similar to the following:

~~~text
Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense_3 (Dense)             (None, 25)                200       
                                                                 
 dense_4 (Dense)             (None, 10)                260       
                                                                 
 dense_5 (Dense)             (None, 1)                 11        
                                                                 
=================================================================
Total params: 471
Trainable params: 471
Non-trainable params: 0
_________________________________________________________________
Epoch 1/100
125/125 - 1s - loss: 4.2960 - 746ms/epoch - 6ms/step
Epoch 2/100
125/125 - 0s - loss: 3.5910 - 415ms/epoch - 3ms/step
Epoch 3/100
125/125 - 0s - loss: 3.2880 - 473ms/epoch - 4ms/step
Epoch 4/100
125/125 - 0s - loss: 3.0075 - 415ms/epoch - 3ms/step
Epoch 5/100
125/125 - 0s - loss: 2.7870 - 409ms/epoch - 3ms/step
~~~

and end with something similar to the following:
~~~text
Epoch 95/100
125/125 - 0s - loss: 1.1699 - 451ms/epoch - 4ms/step
Epoch 96/100
125/125 - 0s - loss: 1.1745 - 453ms/epoch - 4ms/step
Epoch 97/100
125/125 - 1s - loss: 1.1708 - 508ms/epoch - 4ms/step
Epoch 98/100
125/125 - 0s - loss: 1.1617 - 464ms/epoch - 4ms/step
Epoch 99/100
125/125 - 0s - loss: 1.1697 - 414ms/epoch - 3ms/step
Epoch 100/100
125/125 - 0s - loss: 1.1617 - 411ms/epoch - 3ms/step

<keras.callbacks.History at 0x1811106fdf0>
~~~

In this particular example, your `acidModel` started with a loss (`loss:`, the mean squared error) equal to `4.296` after the first epoch. At the end of `100` epochs, the model's prediction of the acidity level had improved substantially, with a loss equal to only `1.1617`.

Because Keras initializes the connection weights with random values, your output will likely be somewhat different.

## Introduction to Neural Network Hyperparameters

If you look at the above code, you will see that the neural network contains four layers. The first layer is the input layer, which is defined by the **input_dim** parameter that the programmer sets to the number of inputs the dataset has. The network needs one input neuron for every feature column in the dataset (including dummy variables).  

There are also two hidden layers, with 25 and 10 neurons, respectively. You might be wondering how the programmer chose these numbers. Selecting a hidden neuron structure is one of the most common questions about neural networks. Unfortunately, there is no right answer. These are hyperparameters. They are settings that can affect neural network performance, yet there are no clearly defined means of setting them.

In general, more hidden neurons mean more capability to fit complex problems. However, too many neurons can lead to overfitting and lengthy training times. Too few can lead to underfitting the problem and will sacrifice accuracy. Also, how many layers you have is another hyperparameter. In general, more layers allow the neural network to perform more of its feature engineering and data preprocessing. But this also comes at the expense of training times and the risk of overfitting. In general, you will see that neuron counts start larger near the input layer and tend to shrink towards the output layer in a triangular fashion. 

Some techniques use machine learning to optimize these values. These will be discussed later in this course.

## Controlling the Amount of Output

The program produces one line of output for each training epoch. You can control the amount of output with the `verbose` argument of the fit command:

* **verbose=0** - No progress output (use with Jupyter if you do not want output).
* **verbose=1** - Display progress bar, does not work well with Jupyter.
* **verbose=2** - Summary progress output (use with Jupyter if you want to know the loss at each epoch).

## Regression Prediction

Next, we will perform actual predictions. For Example 2, these will be predictions of apple **Ripeness** from the neural network, stored in the variable `ripePred`; for Exercise 2, these will be predictions of apple **Acidity**, stored in the variable `acidPred`. 

### Example 2: Use model to make predictions

The code in the cell below uses Keras' `model.predict()` function to predict the `Ripeness` of each of the 4000 apples in the Apple Quality dataset based on its 'Size', 'Weight', 'Sweetness', 'Crunchiness', 'Juiciness', 'Acidity', and 'Quality'. The predictions are stored in a variable called `ripePred`.

Keep in mind that these `Ripeness` **_predictions_** are being made by the neural network model, `ripeModel`, after it was trained ('fitted') to the dataset for 100 epochs.

In [9]:
# Example 2: Predict the Ripeness of each apple in the dataset

# Use model to make Ripeness predictions
ripePred = ripeModel.predict(ripeX)

# Print shape of ripePred
print(f"Shape of ripePred: {ripePred.shape}")

# Print first 10 predictions
print(ripePred[0:10])

Shape of ripePred: (4000, 1)
[[ 0.30069673]
 [ 0.87126034]
 [-0.06779067]
 [-3.7803648 ]
 [-1.5543201 ]
 [ 1.640624  ]
 [-1.6265161 ]
 [ 1.6308248 ]
 [ 3.7290533 ]
 [ 2.7232528 ]]


If your code is correct you should see something similar to the following output:
~~~text
125/125 [==============================] - 0s 2ms/step
Shape of ripePred: (4000, 1)
[[ 0.41057712]
 [ 0.8406092 ]
 [ 0.03273106]
 [-3.8475904 ]
 [-1.5179614 ]
 [ 1.9175235 ]
 [-1.9796481 ]
 [ 1.3494562 ]
 [ 4.5986714 ]
 [ 2.5490847 ]]
~~~


### **Exercise 2: Use model to make predictions**

In the cell below, use Keras' `model.predict()` function to predict the `Acidity` of each of the 4000 apples in the Apple Quality dataset based on its 'Size', 'Weight', 'Sweetness', 'Crunchiness', 'Juiciness', 'Ripeness', and 'Quality'. Store these predictions in a variable called `acidPred`.

Print out the shape of `acidPred` and your model's first `10` acidity predictions.

In [10]:
# Insert your code for Exercise 2 here

acidPred = acidModel.predict(acidX)
print(f"Shape of acidPred: {acidPred.shape}")
print(acidPred[0:10])


Shape of acidPred: (4000, 1)
[[-2.0719798]
 [ 0.900121 ]
 [ 3.8747263]
 [ 1.8134047]
 [ 0.7781885]
 [-2.3871996]
 [ 2.6942391]
 [-2.086415 ]
 [-3.2495368]
 [-1.493188 ]]


If your code is correct you should see something similar to the following output:
~~~text
125/125 [==============================] - 0s 2ms/step
Shape of acidPred: (4000, 1)
[[-0.96637136]
 [ 0.5449684 ]
 [ 2.9393399 ]
 [ 1.3829033 ]
 [ 1.7506608 ]
 [-2.8392882 ]
 [ 3.751297  ]
 [-1.6653696 ]
 [-2.7585254 ]
 [-1.591665  ]]
~~~

### Example 3: Determine the accuracy of the model's predictions

An obvious question is: how good are the neural network's predictions? Since we know the correct `Ripeness` for each apple in the dataset, we can measure how close each neural network prediction was to the actual value.

A common measure in regression analysis is the [Root-mean-square error (RMSE)](https://en.wikipedia.org/wiki/Root-mean-square_deviation).

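RMSE measures the typical size of the differences between predicted values and true values. It is calculated by squaring each difference, averaging the squares, and taking the square root of that average.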
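
To make the calculation concrete, here is a minimal sketch that computes RMSE step by step with NumPy. The four values are toy numbers chosen for illustration; they do not come from the Apple Quality dataset.

```python
import numpy as np

# Toy values chosen for illustration only (not from the apple dataset)
y_true = np.array([3.0, -1.0, 2.0, 0.0])
y_pred = np.array([2.0, -1.0, 4.0, 1.0])

errors = y_pred - y_true      # differences: [-1, 0, 2, 1]
mse = np.mean(errors ** 2)    # mean of squares: (1 + 0 + 4 + 1) / 4 = 1.5
rmse = np.sqrt(mse)           # square root of the mean squared error

print(f"MSE:  {mse}")
print(f"RMSE: {rmse:.4f}")
```

Because every difference is squared before averaging, positive and negative errors cannot cancel each other out, and large errors count more heavily than small ones.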

The code in the cell below computes the RMSE of the `Ripeness` predictions made by `ripeModel` with the actual `Ripeness` values in the Apple Quality dataset, which are stored in the array `ripeY`. The RMSE is stored in a new variable called `score`.


In [11]:
# Example 3: Determine the RMSE for ripeModel

# Measure RMSE error (scikit-learn expects the true values first)
score = np.sqrt(metrics.mean_squared_error(ripeY, ripePred))

# Print out the RMSE
print(f"Final score (RMSE) for ripeModel: {score}")

Final score (RMSE) for ripeModel: 0.8483414053916931


If your code is correct, you should see something similar to the following output:
~~~text
Final score (RMSE) for ripeModel: 0.8224384784698486
~~~
So what does this RMSE value mean? 

The answer is somewhat complicated. What can be said with certainty is that an RMSE value equal to `0` would indicate perfect predictions. However, that rarely happens with real data.

We also know that RMSE will always be non-negative (i.e., zero or positive), since it is the square root of an average of _squared_ differences.

However, beyond that, interpreting RMSE is not straightforward. In general, a lower RMSE is better than a higher one. However, comparisons across different types of data are not valid, because the magnitude of the RMSE depends on the scale of the numbers being predicted. In other words, you will tend to get a larger RMSE when predicting large numbers than when predicting small ones.
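
The scale dependence can be shown with a small sketch. The measurements below are made up, and the `rmse` helper is a hypothetical convenience function written for this illustration. The same predictions are scored twice, once in centimeters and once in millimeters.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_true) ** 2)))

# Made-up measurements and predictions in centimeters
true_cm = np.array([10.0, 12.0, 15.0, 11.0])
pred_cm = np.array([11.0, 12.0, 13.0, 12.0])

# The same measurements and predictions expressed in millimeters
true_mm = true_cm * 10
pred_mm = pred_cm * 10

rmse_cm = rmse(true_cm, pred_cm)
rmse_mm = rmse(true_mm, pred_mm)

print(f"RMSE in cm: {rmse_cm:.4f}")
print(f"RMSE in mm: {rmse_mm:.4f}")
```

The predictions are equally good in both cases, yet the RMSE in millimeters is ten times larger. This is why an RMSE is only meaningful relative to the scale of the variable being predicted.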

### Example 4: Compare predictions to actual values

A simple way to get a sense of how accurate the `ripeModel` predictions are is to print out the actual values next to the model's predicted values.

The code in the cell below uses a `for` loop to print out the first `10` comparisons.

In [12]:
# Example 4: Print out predictions and actual values

# set the print option to 4 decimal places
np.set_printoptions(formatter={'float': '{:.4f}'.format})

# For loop for printing values
for i in range(10):
    print(f"{i+1}. Apple:{appleNum[i]}" 
         + f"  Ripeness: {ripeY[i]}"
         + f"  Predicted: {ripePred[i]}") 

1. Apple:0  Ripeness: 0.3298397958278656  Predicted: [0.3007]
2. Apple:1  Ripeness: 0.867530107498169  Predicted: [0.8713]
3. Apple:2  Ripeness: -0.03803332895040512  Predicted: [-0.0678]
4. Apple:3  Ripeness: -3.4137613773345947  Predicted: [-3.7804]
5. Apple:4  Ripeness: -1.303849458694458  Predicted: [-1.5543]
6. Apple:5  Ripeness: 1.9146158695220947  Predicted: [1.6406]
7. Apple:6  Ripeness: -1.8474167585372925  Predicted: [-1.6265]
8. Apple:7  Ripeness: 0.9744378328323364  Predicted: [1.6308]
9. Apple:8  Ripeness: 4.080920696258545  Predicted: [3.7291]
10. Apple:9  Ripeness: 1.620856761932373  Predicted: [2.7233]


If your code is correct you should see an output that is similar to the following:

~~~text
1. Apple:0  Ripeness: 0.3298397958278656  Predicted: [0.4106]
2. Apple:1  Ripeness: 0.867530107498169  Predicted: [0.8406]
3. Apple:2  Ripeness: -0.03803332895040512  Predicted: [0.0327]
4. Apple:3  Ripeness: -3.4137613773345947  Predicted: [-3.8476]
5. Apple:4  Ripeness: -1.303849458694458  Predicted: [-1.5180]
6. Apple:5  Ripeness: 1.9146158695220947  Predicted: [1.9175]
7. Apple:6  Ripeness: -1.8474167585372925  Predicted: [-1.9796]
8. Apple:7  Ripeness: 0.9744378328323364  Predicted: [1.3495]
9. Apple:8  Ripeness: 4.080920696258545  Predicted: [4.5987]
10. Apple:9  Ripeness: 1.620856761932373  Predicted: [2.5491]
~~~

By inspection, we can see that the `ripeModel` is doing a reasonable job, but definitely **not** a perfect job, of predicting the `Ripeness` of an individual apple. 
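
The square brackets around each predicted value appear because `model.predict()` returns an array of shape `(n, 1)`, so each prediction is itself a one-element array. The sketch below is one possible way to tidy the comparison, not part of the original lesson: it flattens the predictions to one dimension and adds an absolute error column. The numbers are the first five rows of the first Example 4 output listing, rounded to four decimal places.

```python
import numpy as np

# First five actual and predicted Ripeness values, copied (rounded) from
# the Example 4 output; predictions keep the (n, 1) shape that
# model.predict() returns
actual = np.array([0.3298, 0.8675, -0.0380, -3.4138, -1.3038])
predicted = np.array([[0.3007], [0.8713], [-0.0678], [-3.7804], [-1.5543]])

# flatten() turns the (5, 1) column into a plain 1-D array of 5 values
predicted_1d = predicted.flatten()
abs_error = np.abs(predicted_1d - actual)

for i in range(len(actual)):
    print(f"{i+1}. Ripeness: {actual[i]:8.4f}"
          f"  Predicted: {predicted_1d[i]:8.4f}"
          f"  |Error|: {abs_error[i]:.4f}")
```

Listing the absolute error for each apple makes it easier to spot which individual predictions are far off, something a single RMSE value cannot show.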

### **Exercise 3: Determine the quality of the model's predictions**

In the cell below, write the code to compute the [Root-mean-square error (RMSE)](https://en.wikipedia.org/wiki/Root-mean-square_deviation) for the apple `Acidity` predicted by your `acidModel`. Print out the RMSE for your `acidModel`.


In [13]:
# Insert your code for Exercise 3 here

# Measure RMSE error (scikit-learn expects the true values first)
score = np.sqrt(metrics.mean_squared_error(acidY, acidPred))

# Print out the RMSE
print(f"Final score (RMSE) for acidModel: {score}")

Final score (RMSE) for acidModel: 1.0666450262069702


If your code is correct you should see something similar to the following output:
~~~text
Final score (RMSE) for acidModel: 1.084747914279273
~~~

### **Exercise 4: Compare predictions to actual values**

In the cell below write the code to print out predictions made by your `acidModel` for the `Acidity` of the first 10 apples as well as their actual `Acidity` values, side-by-side. 

In [14]:
# Insert your code for Exercise 4 here

# Set print precision
np.set_printoptions(formatter={'float': '{:.4f}'.format})

# For loop for printing values
for i in range(10):
    print(f"{i+1}. Apple:{appleNum[i]}" 
         + f"  Acidity: {acidY[i]}"
         + f"  Predicted: {acidPred[i]}") 

1. Apple:0  Acidity: -0.4915904700756073  Predicted: [-2.0720]
2. Apple:1  Acidity: -0.722809374332428  Predicted: [0.9001]
3. Apple:2  Acidity: 2.621636390686035  Predicted: [3.8747]
4. Apple:3  Acidity: 0.7907232046127319  Predicted: [1.8134]
5. Apple:4  Acidity: 0.5019840598106384  Predicted: [0.7782]
6. Apple:5  Acidity: -2.981523275375366  Predicted: [-2.3872]
7. Apple:6  Acidity: 2.414170503616333  Predicted: [2.6942]
8. Apple:7  Acidity: -1.4701250791549683  Predicted: [-2.0864]
9. Apple:8  Acidity: -4.8719048500061035  Predicted: [-3.2495]
10. Apple:9  Acidity: 2.185607671737671  Predicted: [-1.4932]


If your code is correct you should see an output that is similar to the following:
~~~text
1. Apple:0  Acidity: -0.4915904700756073  Predicted: [-1.1459]
2. Apple:1  Acidity: -0.722809374332428  Predicted: [0.6893]
3. Apple:2  Acidity: 2.621636390686035  Predicted: [3.5310]
4. Apple:3  Acidity: 0.7907232046127319  Predicted: [1.1122]
5. Apple:4  Acidity: 0.5019840598106384  Predicted: [0.6371]
6. Apple:5  Acidity: -2.981523275375366  Predicted: [-3.5875]
7. Apple:6  Acidity: 2.414170503616333  Predicted: [2.1825]
8. Apple:7  Acidity: -1.4701250791549683  Predicted: [-1.7862]
9. Apple:8  Acidity: -4.8719048500061035  Predicted: [-3.1646]
10. Apple:9  Acidity: 2.185607671737671  Predicted: [-1.0773]
~~~

Compared to `ripeModel`, your `acidModel` seems to make less accurate predictions. This is consistent with the observation that the RMSE for your `acidModel` was larger than the RMSE for the `ripeModel` created in Example 1 (roughly 1.07 versus 0.85 in the run shown above).

## **Lesson Turn-in**

When you have completed all of the code cells and run them in sequential order (the last code cell should be number 14), use **File --> Print... --> Save to PDF** to generate a PDF of your JupyterLab notebook. Save your PDF as `Class_03_2.lastname.pdf`, where _lastname_ is your last name, and upload the file to Canvas.