<a href="https://colab.research.google.com/github/DavidSenseman/BIO1173/blob/master/Class_04_1.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

---------------------------
**COPYRIGHT NOTICE:** This Jupyterlab Notebook is a Derivative work of [Jeff Heaton](https://github.com/jeffheaton) licensed under the Apache License, Version 2.0 (the "License"); You may not use this file except in compliance with the License. You may obtain a copy of the License at

> [http://www.apache.org/licenses/LICENSE-2.0](http://www.apache.org/licenses/LICENSE-2.0)

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

------------------------

# **BIO 1173: Intro Computational Biology**

##### **Module 4: Training for Tabular Data**

* Instructor: [David Senseman](mailto:David.Senseman@utsa.edu), [Department of Integrative Biology](https://sciences.utsa.edu/integrative-biology/), [UTSA](https://www.utsa.edu/)

### Module 4 Material

* **Part 4.1: Encoding a Feature Vector for Keras Deep Learning**
* Part 4.2: Keras Multiclass Classification for Deep Neural Networks with ROC and AUC
* Part 4.3: Keras Regression for Deep Neural Networks with RMSE
* Part 4.4: Backpropagation, Nesterov Momentum, and ADAM Neural Network Training
* Part 4.5: Neural Network RMSE and Log Loss Error Calculation from Scratch


# Google CoLab Instructions

The following code ensures that Google CoLab is running the correct version of TensorFlow. Running it will also map your GDrive to `/content/drive`.

In [None]:
try:
    from google.colab import drive
    drive.mount('/content/drive', force_remount=True)
    COLAB = True
    print("Note: using Google CoLab")
    %tensorflow_version 2.x
except:
    print("Note: not using Google CoLab")
    COLAB = False

### Lesson Setup

Run the next code cell to load necessary packages

In [None]:
# You MUST run this code cell first
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Activation

import numpy as np
import pandas as pd


import os
import shutil
path = '/'
memory = shutil.disk_usage(path)
dirpath = os.getcwd()
print("Your current working directory is : " + dirpath)
print("Disk", memory)
print("Numpy version =", (np.__version__)) 
print("Tensorflow version =", (tf.__version__))
print("Available GPU acceleration =", tf.test.gpu_device_name())


# Part 4.1: Encoding a Feature Vector for Keras Deep Learning

Neural networks can accept many types of data. We will begin with tabular data, where there are well-defined rows and columns. This data is what you would typically see in a Microsoft Excel spreadsheet.

Neural networks require numeric input. This numeric form is called a **_feature vector_**. Each input neuron receives one feature (or column) from this vector. Each row of training data typically becomes one vector.

In this lesson, we will focus on how to encode **_tabular data_** stored in a Pandas DataFrame into a feature vector that can be used by two types of neural networks, classification and regression.
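
To make the idea of a feature vector concrete, here is a minimal sketch using a made-up table. The column names and values are hypothetical and are not taken from the course datasets; the point is only that each row of a numeric DataFrame becomes one vector.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table: each row is one sample, each column is one feature
toyDF = pd.DataFrame({
    "length_cm": [5.1, 6.3],
    "width_cm": [3.5, 2.9],
    "mass_g": [12.0, 15.5],
})

# Convert the DataFrame to a NumPy array: one feature vector per row
toyX = toyDF.values

print(toyX.shape)  # (number of rows, number of features)
print(toyX[0])     # feature vector for the first row
```

A network trained on this toy table would need three input neurons, one for each column.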

### Example 1: Read dataset and store values in a DataFrame

The code in the cell below reads the Apple Quality dataset file, `apple_quality.csv`, located on the course HTTPS server [https://corgi.genomelab.utsa.edu](https://corgi.genomelab.utsa.edu). If you are working off campus, and you can't download the dataset, you will need to run the UTSA Virtual Private Network (vip.utsa.edu) in order to access this file.

The code in the cell below uses the Pandas function `pd.read_csv()` to read the data file over the internet. The argument for `pd.read_csv()` includes the URL for the server as well as the filepath and the filename of the data file. As the dataset is read, it is saved to a new DataFrame called `aqDF`.

The code then sets the maximum number of rows to 6 and the maximum number of columns to 0, before displaying the DataFrame. Setting the maximum columns (or rows) to `0` means to display **all** columns, even if it means the output extends past the edge of the notebook.

**WARNING:** Be careful to **NEVER** set the maximum number of rows to `0`. Most datasets, especially the ones used in this course, have hundreds or even thousands of rows. If you set `max_rows = 0`, you will end up with page after page of screen output!


In [None]:
# Example 1: Read dataset and store values in a DataFrame

# Read in data and create DataFrame
aqDF = pd.read_csv(
    "https://corgi.genomelab.utsa.edu/BIO1173/Datasets/apple_quality.csv", 
    na_values=['NA', '?'])

# Set the max rows and columns
pd.set_option('display.max_rows', 6) # NEVER set max_rows to 0
pd.set_option('display.max_columns', 0) # 0 displays all the columns

# Display the DataFrame
display(aqDF)

If your code is correct, you should see the following table.

![___](https://corgi.genomelab.utsa.edu/BIO1173/images/Class_04_1_Table1.png)

You should make the following observations from the above tabular data:
* The target column is the column that you seek to predict. By convention, the rightmost column is usually the target column, or _response variable_. In this DataFrame the target column is `Quality`.   
* There is an `A_id` column. We should exclude this column from our analysis because it contains no information useful for making a prediction.
* Many of these fields are numeric and might not require further processing.
* Categorical values (strings) are only found in the target column (`Quality`). Values in the target column do **not** need to be One-Hot Encoded until later.
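
Observations like these can also be checked programmatically rather than by eye. The sketch below uses a hypothetical toy table (the column names are made up for illustration) and the DataFrame method `select_dtypes()` to separate numeric columns from columns that would still need encoding.

```python
import pandas as pd

# Hypothetical toy table: an ID column, two numeric features and a string target
toyDF = pd.DataFrame({
    "sample_id": [0, 1, 2],
    "height": [1.2, 0.7, 1.9],
    "weight": [3.4, 2.2, 4.1],
    "label": ["good", "bad", "good"],
})

# Numeric columns can usually be used as they are
numeric_cols = toyDF.select_dtypes(include="number").columns.tolist()

# Non-numeric columns (here, strings) must be encoded before use
string_cols = toyDF.select_dtypes(exclude="number").columns.tolist()

print("Numeric:", numeric_cols)
print("Needs encoding:", string_cols)
```

Note that an ID column such as `sample_id` is numeric, so this check will not flag it. Deciding that it carries no predictive information is still a judgment you have to make yourself.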


### **Exercise 1: Read dataset and store values in a DataFrame**

In the cell below, use the Pandas function `pd.read_csv()` to read the Heart Disease dataset file, `heart_disease.csv`, located on the course HTTPS server [https://corgi.genomelab.utsa.edu](https://corgi.genomelab.utsa.edu). If you are working off campus, you may need to be running the UTSA Virtual Private Network (vip.utsa.edu) in order to access this file.

Save the dataset to a new DataFrame called `hdDF`. Set the display for 8 rows and **all** of the columns, and then display `hdDF`.
 

In [None]:
# Insert your code for Exercise 1 here



If your code is correct, you should see the following table.

![Heart Failure DataFrame](https://corgi.genomelab.utsa.edu/BIO1173/images/Class_04_1_Table2.png "Tabular Data Example")



You should make the following observations from the above tabular data:
* The target column is the column that you usually want to predict with a classification neural network. As above, the rightmost column, `HeartDisease`, would be the usual target column. 
* The column `FastingBS` doesn't appear to contain information that would be especially useful for predicting heart disease so it will be dropped. 
* Some fields are numeric and might not require further processing.
* There are categorical values in 5 columns including: `Sex`, `ChestPainType`, `RestingECG`, `ExerciseAngina` and `ST_Slope`. The categorical values (strings) in these columns will need to be One-Hot Encoded before they can be used.

### Example 2: Drop unnecessary columns

Since the column `A_id` in the Apple Quality dataset doesn't contain information that will be useful for predicting fruit quality, this column should not be included in the analysis. The code below uses the DataFrame method `drop()` to eliminate this column from the analysis.

In [None]:
# Example 2: Drop unnecessary columns

# Drop the A_id column 
aqDF.drop('A_id', axis=1, inplace=True)

# Set the max rows and max columns
pd.set_option('display.max_columns', 8)
pd.set_option('display.max_rows', 8)

# Display the updated DataFrame
display(aqDF)

If your code is correct you should see the following table:

![__](https://corgi.genomelab.utsa.edu/BIO1173/images/class_04_1_Exam2.png)

You should note that the column `A_id` has been removed. Instead of the original 9 columns, there are now only 8 columns.

### **Exercise 2: Drop unnecessary columns**

Since the column `FastingBS` in the Heart Disease dataset doesn't contain information that will be especially useful for predicting heart disease, this column should not be included in the analysis. In the cell below, write the code to drop this column. Display 8 rows and 8 columns of your updated DataFrame.

In [None]:
# Insert your code for Exercise 2 here



### Example 3: One-Hot encode categorical variables

The Heart Disease dataset has 5 columns that contain categorical variables (strings): `Sex`, `ChestPainType`, `RestingECG`, `ExerciseAngina` and `ST_Slope`. Example 3 illustrates how to One-Hot encode the categorical variables in one of these columns, `Sex`.

The following line of code uses the Pandas function `pd.get_dummies()` to One-Hot encode the string values in the `Sex` column and turn them into the integer values `0` and `1`:

> `dummies = pd.get_dummies(hdDF['Sex'],prefix="Sex", dtype=int)`

Once you have created the dummies, you must complete two more steps. First, you need to use the Pandas `pd.concat()` function to **_add_** the dummies to your DataFrame. The line of code that accomplishes this is:

> `hdDF = pd.concat([hdDF,dummies],axis=1)`

Now that the dummy values have been added, the next step is to **_drop_** the column containing the strings from the DataFrame. The line of code that does this is:

> `hdDF.drop('Sex', axis=1, inplace=True)`

The code then sets the max rows and max columns to `8` and displays the contents of the updated DataFrame.

In [None]:
# Example 3: One-Hot encode categorical variables

# Get dummy values
dummies = pd.get_dummies(hdDF['Sex'],prefix="Sex", dtype=int)

# Add the dummies into the DataFrame
hdDF = pd.concat([hdDF,dummies],axis=1)

# Drop the column replaced by the dummies
hdDF.drop('Sex', axis=1, inplace=True)

# Set the max rows and max columns
pd.set_option('display.max_columns', 8)
pd.set_option('display.max_rows', 8)

# Display the updated DataFrame
display(hdDF)

If your code is correct, you should see the following table.

![__](https://corgi.genomelab.utsa.edu/BIO1173/images/class_04_1_Ex3A.png)
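
The same three steps (get dummies, concatenate, drop) can be seen in miniature on a small table where the result is easy to follow. The DataFrame below and its column names are hypothetical and exist only for this sketch.

```python
import pandas as pd

# Hypothetical toy table with one numeric and one categorical column
toyDF = pd.DataFrame({
    "dose": [10, 20, 30],
    "color": ["red", "green", "red"],
})

# Step 1: create one 0/1 column for each unique string in "color"
dummies = pd.get_dummies(toyDF["color"], prefix="color", dtype=int)

# Step 2: add the dummy columns to the DataFrame
toyDF = pd.concat([toyDF, dummies], axis=1)

# Step 3: drop the original string column
toyDF = toyDF.drop("color", axis=1)

print(toyDF)
```

Each row now has a `1` in exactly one of the `color_` columns, which is where the name "One-Hot" comes from.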


### **Exercise 3: One-Hot encode categorical values**

Example 3 above illustrated how to One-Hot encode the column `Sex` in the Heart Disease DataFrame, `hdDF`. For **Exercise 3** you will need to One-Hot encode the remaining 4 columns: `ChestPainType`, `RestingECG`, `ExerciseAngina` and `ST_Slope`. For your convenience, this exercise will be broken down into 4 separate steps, labeled Exercise 3A to 3D, one step for each column.

**WARNING--WARNING--WARNING**

Any time that you make a major modification to a DataFrame, such as dropping a column, it is _very_ easy to start generating errors when you are working on your exercises. Problems arise after you drop a column from a DataFrame. If you try to re-run the same code cell, the column has already been dropped, so you will generate an error. The simplest way to avoid these errors is to **GO BACK AND RE-RUN _ALL_ THE CODE CELLS** starting from **Exercise 1** where you originally created your complete DataFrame, `hdDF`.
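
For readers curious why a second run fails, the sketch below shows one optional defensive pattern: check that the column is still present before encoding it. The helper `encode_once()` and the toy DataFrame are hypothetical and are not part of the course code; for the exercises, re-running the cells from Exercise 1 remains the expected approach.

```python
import pandas as pd

# Hypothetical toy table
toyDF = pd.DataFrame({"dose": [10, 20], "color": ["red", "green"]})

def encode_once(df, column):
    """One-Hot encode `column` only if it is still present in `df`."""
    if column not in df.columns:
        return df  # column already encoded and dropped: nothing to do
    dummies = pd.get_dummies(df[column], prefix=column, dtype=int)
    return pd.concat([df, dummies], axis=1).drop(column, axis=1)

toyDF = encode_once(toyDF, "color")
toyDF = encode_once(toyDF, "color")  # second call is harmless

print(list(toyDF.columns))
```

Without the `if` check, the second call would raise a `KeyError` because the column `color` no longer exists.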


### **Exercise 3A: One-Hot encode the column `ChestPainType`**

In the cell below, One-Hot encode the column `ChestPainType` in the Heart Disease DataFrame, `hdDF`. Add the dummies to the DataFrame and then drop the column `ChestPainType`. Set the display for 9 columns and 8 rows and print out a display of your updated DataFrame. 

In [None]:
# Insert your code for Exercise 3A here



### **Exercise 3B: One-Hot encode the column `RestingECG`**

In the cell below, One-Hot encode the column `RestingECG` in the Heart Disease DataFrame, `hdDF`. Add the dummies to the DataFrame and then drop the column `RestingECG`. Set the display for 9 columns and 8 rows and print out a display of your updated DataFrame.

In [None]:
# Insert your code for Exercise 3B here



### **Exercise 3C: One-Hot encode the column `ExerciseAngina`**

In the cell below, One-Hot encode the column `ExerciseAngina` in the Heart Disease DataFrame, `hdDF`. Add the dummies to the DataFrame and then drop the column `ExerciseAngina`. Set the display for 9 columns and 8 rows and print out a display of your updated DataFrame. 

In [None]:
# Insert your code for Exercise 3C here



### **Exercise 3D: One-Hot encode the column `ST_Slope`**

In the cell below, One-Hot encode the column `ST_Slope` in the Heart Disease DataFrame, `hdDF`. Add the dummies to the DataFrame and then drop the column `ST_Slope`. Set the display for 9 columns and 8 rows and print out a display of your updated DataFrame. 

In [None]:
# Insert your code for Exercise 3D here



If your code is correct, you should see the following table.

![__](https://corgi.genomelab.utsa.edu/BIO1173/images/class_04_1_Ex3E.png)

You should notice that the number of columns in your DataFrame, `hdDF`, has increased from the original 12 columns to a total of **20** columns. This "column inflation" is an inevitable consequence of One-Hot encoding.
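
The size of this increase can be predicted in advance: each encoded column is replaced by one new column per unique value it contains. The sketch below checks that arithmetic on a hypothetical toy table (the column names and values are made up), using the `columns=` argument of `pd.get_dummies()` to encode several columns in one call.

```python
import pandas as pd

# Hypothetical toy table with two categorical columns
toyDF = pd.DataFrame({
    "age": [41, 52, 63, 38],
    "sex": ["M", "F", "M", "F"],
    "pain": ["A", "B", "C", "A"],
})
categorical = ["sex", "pain"]

# Expected width: remove the encoded columns, add one column per unique value
n_before = toyDF.shape[1]
n_unique = int(toyDF[categorical].nunique().sum())
n_expected = n_before - len(categorical) + n_unique

# Encode both columns at once and compare with the prediction
encoded = pd.get_dummies(toyDF, columns=categorical, dtype=int)

print(n_before, n_expected, encoded.shape[1])
```

Here 3 columns become 6: `sex` contributes 2 columns and `pain` contributes 3, replacing the 2 original string columns.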

### Sample Code: Print out column names

If your coding has been correct so far, your DataFrame, `hdDF`, should have 20 columns. This is an inconveniently large number of columns to display on your computer screen. In this situation, you can take advantage of the DataFrame attribute `columns` to print out a complete list of all the column names in the DataFrame, using this line of code:

>  `print(list(hdDF.columns))` 

Run the next cell to check whether your code is correct before going on to the next part of the lesson. The code in this cell prints out a list of the column names in your updated DataFrame, `hdDF`.

In [None]:
# Run this cell to print the column names

print(list(hdDF.columns))

If your coding has been correct so far, you should see the following list of column names:

If the output from `print(list(hdDF.columns))` doesn't match the sample output above, figure out which columns were not successfully One-Hot encoded and make the necessary code fixes. Remember, you will need to re-run all of the code cells starting with **Exercise 1** to restore your DataFrame `hdDF` to its original condition before you debug the code cells in **Exercise 3**.

## Generate X and Y for a Classification Neural Network

Now that unnecessary columns have been dropped, and all of the string data has been One-Hot encoded (except for the target column), we are ready to use the data stored in the updated DataFrame as input for a neural network.

There are two basic ways to use tabular data (i.e. data stored in a DataFrame) as input for a neural network: as input for a network that performs classification, or as input for a network that performs regression. How you generate X and Y values will differ depending upon which type of neural network you are using.

We will begin by building and training a **_classification_** neural network.

### Example 4: Generate X and Y for _Classification_ Neural Network

The goal of a classification neural network is to accurately categorize input data (X) into predefined classes or categories (Y). The network learns to identify patterns and features within the input data that are associated with each class, allowing it to make predictions about the class of new, unseen data. During training (fitting), the neural network tries to minimize the classification error as a way to improve the overall accuracy of the network's predictions.

In a classification neural network, the output layer will have exactly _the same number of neurons as the number of predefined classes or categories_. For example, with the Apple Quality dataset there are two predefined classes/categories: (1) good and (2) bad. The output layer will have exactly two neurons. One neuron will represent `bad` and the other `good`. At the end of a run (epoch) the `bad` output neuron will have a "voltage" (numerical value) close to `1` if the neural network predicted the apple was bad, while the `good` output neuron will have a much smaller value, closer to `0`. Conversely, if the neural network predicted the apple was `good`, the `good` output neuron will have a value close to `1` while the `bad` output neuron will have a much lower voltage, closer to `0`. In other words, the output neuron with the highest value represents the neural network's classification prediction.
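
The rule that the output neuron with the highest value gives the prediction can be sketched with plain NumPy. The raw output values below are invented for illustration, and the class order `["bad", "good"]` is an assumption about how the One-Hot encoded target columns are ordered; `softmax()` here is a small helper written for this sketch, not a Keras call.

```python
import numpy as np

def softmax(z):
    """Convert raw output-layer values into probabilities that sum to 1."""
    shifted = z - np.max(z)      # subtract the max for numerical stability
    exp_z = np.exp(shifted)
    return exp_z / exp_z.sum()

class_names = ["bad", "good"]      # assumed order of the output neurons
raw_output = np.array([0.3, 2.1])  # hypothetical raw values for one apple

# Turn the raw values into probabilities, then pick the largest one
probs = softmax(raw_output)
predicted = class_names[int(np.argmax(probs))]

print(probs, predicted)
```

Because the second neuron has the larger value, the predicted class for this made-up apple is `good`.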

When using tabular data (i.e. data stored in a DataFrame), the first step is to figure out which columns in the DataFrame will be used to generate the X-values and which column will be used for the Y-values. For the Apple Quality dataset, our goal will be to classify apple **_quality_** so the information in the column `Quality` will be our Y-values and the remaining columns will be our X-values. 

In the code below, we start by creating a variable called `aqX_columns` by dropping the column `Quality`. In other words, the variable `aqX_columns` holds a list of all the columns in the DataFrame that we want to use as our X-values. The line of code that actually creates our X-values is:

> `aqX = aqDF[aqX_columns].values`

The DataFrame attribute `.values` converts the numerical data in the `aqDF` DataFrame into the Numpy array `aqX`. Keep in mind that neural networks, like most machine learning algorithms, work on Numpy numerical arrays, not numbers stored in a DataFrame.

The next step is to One-Hot encode the categorical values in target column `Quality`. The line of code that does this is:

> `dummies = pd.get_dummies(aqDF['Quality'], dtype=int) # Classification`

Finally, we create our Y-values from the dummy values using this line of code:

> `aqY = dummies.values`

Again, the `.values` attribute converts the data into the Numpy array, `aqY`.


In [None]:
# Example 4: Generate X and Y values

# Drop column containing Y-values 
aqX_columns = aqDF.columns.drop('Quality')

# Create X from the values in aqX_columns
aqX = aqDF[aqX_columns].values

# One-hot encode Y-values
dummies = pd.get_dummies(aqDF['Quality'], dtype=int) # Classification

# Create Y from dummy values
aqY = dummies.values

# Print out X and Y
print(aqX)
print(aqY)

If your code is correct you should see the following output:

The output above shows our X and Y values, ready for analysis by a classification neural network. All of the values are numerical (i.e. no strings) and stored in Numpy arrays.
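
Besides printing the arrays, it can help to check their shapes. The sketch below repeats the same steps on a hypothetical three-row table (the column names are invented) so that the shapes are easy to verify by hand.

```python
import pandas as pd

# Hypothetical toy table: two numeric features and a string target
toyDF = pd.DataFrame({
    "size": [0.5, -1.2, 2.0],
    "weight": [1.1, 0.3, -0.7],
    "grade": ["good", "bad", "good"],
})

# X: every column except the target
x_columns = toyDF.columns.drop("grade")
toyX = toyDF[x_columns].values

# Y: One-Hot encoded target, one column per class
toyY = pd.get_dummies(toyDF["grade"], dtype=int).values

print(toyX.shape)  # rows x features
print(toyY.shape)  # rows x classes
print(toyY)
```

X and Y must have the same number of rows. The second dimension of Y is the number of classes, which is why it can be used to size the output layer.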

### **Exercise 4: Generate X and Y for Classification Neural Network**

In the cell below generate X and Y values that can be used in a Heart Disease classification neural network. The Y-values are in the target column, `HeartDisease`, while the X-values are in all of the other columns. 

Even though the values in the column `HeartDisease` are already numerical (not strings), you will still need to One-Hot encode this column. The reason you need to One-Hot encode the `HeartDisease` column is to create the Y variable, `hdY`, with the correct format. 

Call your X-values, `hdX` and your Y-values, `hdY`. After generating these variables, print out their values. 

In [None]:
# Insert your code for Exercise 4 here 



If your code is correct you should see the following output:

If you see any strings (words) in the output above, it means that your code had one (or more) errors in it. It will need to be corrected before you can use it with your neural network.

### Example 5: Construct a Classification Neural Network

When building a classification neural network, there are two important points to remember:

* Classification neural networks have an output neuron count equal to the number of classes.
* Classification neural networks should use the **softmax** activation function on the output layer and **categorical_crossentropy** as the loss function when you compile your neural network.

When building a new neural network, you must begin by telling Keras the name we want for our new network and what kind of neural network we want to build. In this example, we are building a simple sequential network (a linear stack of layers) with the name `aqModel`, so we begin construction with the line:

> `aqModel = Sequential()`

Next we start adding "hidden" layers to our network in sequential order. There are no "hard-and-fast" rules when it comes to how many layers a network should have and how many neurons should be put in each layer. There are trade-offs when it comes to the number of layers and neurons that will be discussed later in this course.

For now, we will construct a simple classification neural network with just 2 hidden layers and one output layer. We have to tell the first hidden layer how many inputs (X values) it will be receiving. We do this with the `input_dim` argument as shown here:

> `input_dim=aqX.shape[1]`

For this example, we will put 50 neurons in the first layer, and half that many (25) in the next (2nd) hidden layer. The layer type `Dense` tells Keras to make the layer fully connected, so that every neuron in the layer receives input from every neuron in the previous layer. At present, the activation function `relu` is widely considered a good default choice for hidden layer neurons in this kind of simple neural network.
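
Because every neuron in a `Dense` layer is connected to every input, the number of trainable values grows quickly with layer size. The arithmetic can be sketched as follows. The helper `dense_params()` is written for this illustration, and the value `n_features = 7` is an assumption about the number of X columns; check `aqX.shape[1]` in your own notebook.

```python
def dense_params(n_inputs, n_units):
    """Trainable values in a Dense layer: one weight per connection plus one bias per neuron."""
    return n_inputs * n_units + n_units

n_features = 7  # assumed number of X columns; verify with aqX.shape[1]
n_classes = 2   # bad and good

# Input size followed by the size of each layer in the network
layer_sizes = [n_features, 50, 25, n_classes]

# Pair each layer with the layer feeding into it
params = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]

print(params, sum(params))
```

Under that assumption the three layers hold 400, 1275 and 52 trainable values, for 1727 in total.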

As mentioned at the beginning of Example 5, we need to make sure there is one neuron in the output layer for each category that we want to classify. This is accomplished by the argument `aqY.shape[1]` in this line of code:

> `aqModel.add(Dense(aqY.shape[1],activation='softmax')) # Output`

The other point to remember was to use the loss function `categorical_crossentropy` and the `adam` optimizer when compiling the neural network. 

Once the neural network has been successfully compiled (one of the jobs of the compiler is to spot errors in your neural network), it can be "fitted" (trained) to the X and Y values. A complete "run" of the neural network through the training data is called an `epoch`. At the end of each epoch, the model evaluates its "inaccuracy" (loss), prints it out (if `verbose=2`), makes small modifications to the weight/strength of each connection between neurons, and gets ready to run again.

By fitting the model over and over again, the model will gradually find better weights for each neural connection by minimizing the loss function.

In this example, the number of epochs is set to 100.  
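
To give a feel for what the loss value measures, here is a minimal NumPy sketch of categorical cross-entropy. It is a simplified stand-in written for illustration, not the Keras implementation, and the predicted probabilities are invented.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Average of -log(probability assigned to the correct class)."""
    y_pred = np.clip(y_pred, eps, 1.0)  # avoid taking log(0)
    per_row = -np.sum(y_true * np.log(y_pred), axis=1)
    return float(np.mean(per_row))

# One-Hot encoded true classes for two samples
y_true = np.array([[0, 1], [1, 0]])

# Hypothetical predictions early in training and after more training
poor = np.array([[0.6, 0.4], [0.5, 0.5]])
better = np.array([[0.1, 0.9], [0.8, 0.2]])

loss_poor = categorical_crossentropy(y_true, poor)
loss_better = categorical_crossentropy(y_true, better)

print(loss_poor, loss_better)
```

The second set of predictions puts more probability on the correct class in each row, so its loss is smaller. This is the trend you should see in the loss values printed during training.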


In [None]:
# Example 5: Construct a Classification Neural Network

# Build neural network
aqModel = Sequential()
aqModel.add(Dense(50, input_dim=aqX.shape[1], activation='relu')) # Hidden 1
aqModel.add(Dense(25, activation='relu')) # Hidden 2
aqModel.add(Dense(aqY.shape[1],activation='softmax')) # Output

# Compile the model
aqModel.compile(loss='categorical_crossentropy', optimizer='adam')

# Train the model for 100 epochs
aqModel.fit(aqX,aqY,verbose=2,epochs=100)


If your code is correct you should see output similar to the following. Notice that the loss value started at 0.4838 and gradually decreased to a value of 0.0675 after 100 epochs.

### **Exercise 5: Construct, Compile and Fit a Classification Neural Network**

In the cell below, construct, compile and fit a classification neural network to the X-values stored in `hdX` and the Y-values stored in `hdY`, using Example 5 as a template. Call your new neural network `hdModel`. Train (fit) your network for 100 epochs.


In [None]:
# Insert your code for Exercise 5 here



If your code is correct you should see something similar to the output shown below. In this example, the loss function decreased from 3.2447 to 0.3265 after training 100 cycles (epochs).

## Generate X and Y for a Regression Neural Network

As mentioned above, the procedure for generating X and Y for a regression neural network is somewhat different from the procedure used above. Even though these differences are not large, they are important. If your X and Y values are not generated in the correct format, your neural network will not compile and run.

### Example 6: Generate X and Y for Regression Neural Network

The goal of a regression neural network is to predict a **_continuous output value_** based on input data. The "target" column, `Quality` in the Apple Quality dataset, would **not** be a good candidate for a regression analysis. The reason is that the data in this column is _not_ a continuous variable, but a categorical variable with only two values, `bad` or `good`.

For regression, we want to predict a variable that has a range of values. By inspection of the statistical summary for `aqDF` (see the Appendix below), there are several possible candidates. Each of these columns has continuous numerical values that would be suitable for a regression: `Weight`, `Sweetness`, `Crunchiness`, `Juiciness`, `Ripeness` and `Acidity`.

For Example 6, we will generate X and Y for a regression neural network designed to predict the `Sweetness` of an apple based on its other characteristics, including its quality. 

We begin by creating a new DataFrame called `aq2DF` from our already _updated_ DataFrame `aqDF` using the DataFrame `copy()` method. We will use `aq2DF` for creating X and Y. Note that we are **not** using a copy of the original DataFrame created in Example 1, since we don't want the column `A_id` that was dropped in Example 2.

When trying to predict the `Sweetness` of an apple, the apple's `Quality` would seem like an important factor to include in the analysis. Since the column `Quality` has categorical values (i.e. `bad` and `good`) we will need to convert these strings into numerical values. However, instead of One-Hot encoding this column, we use the alternative approach of mapping these categorical values to `0` for `bad` and `1` for `good`. Also, since we are mapping the values _inside_ the column, we **don't** want to drop the `Quality` column as was done above when this column was One-Hot encoded.

We do, however, need to drop the column `Sweetness` when creating our variable `aq2X_columns`. This should make sense since we don't want the actual sweetness values to be included as part of the X-values. That would defeat the entire purpose of the neural network! 

A very important difference between regression and classification networks: when creating the Y data for a regression neural network, we **don't** want to One-Hot encode it, like we did above with the classification example. One-Hot encoding in this instance would generate the _wrong_ format for the Y values. Instead, we simply generate `aq2Y` using the numerical values in the target column:

> `aq2Y = aq2DF['Sweetness']`

Finally, the code prints out X and Y so we can visually inspect them.
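
The difference in Y format is easiest to see by comparing shapes. The sketch below runs the regression steps on a hypothetical three-row table (the column names are made up). It adds `.values` to the target column to obtain a NumPy array, which is a choice made for this sketch rather than something the lesson code does.

```python
import pandas as pd

# Hypothetical toy table: "sugar" is the continuous value to predict
toyDF = pd.DataFrame({
    "firmness": [0.4, 1.3, -0.2],
    "sugar": [2.1, 0.7, 1.5],
    "grade": ["good", "bad", "good"],
})

# Map the categorical strings to integers inside the same column
toyDF["grade"] = toyDF["grade"].map({"bad": 0, "good": 1})

# X: every column except the regression target
x_columns = toyDF.columns.drop("sugar")
toyX = toyDF[x_columns].values

# Y: the target column itself, with no One-Hot encoding
toyY = toyDF["sugar"].values

print(toyX.shape)  # rows x features
print(toyY.shape)  # one value per row
```

For regression, Y holds a single number per row, whereas the One-Hot encoded Y used for classification has one column per class.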


In [None]:
# Example 6: Generate X and Y for regression

# Create a copy and use it for X and Y
aq2DF=aqDF.copy()

# Map categorical values in Quality to integers
mapping = {'bad': 0, 'good': 1}
aq2DF['Quality'] = aq2DF['Quality'].map(mapping)

# Drop column containing Y-values 
aq2X_columns = aq2DF.columns.drop('Sweetness')

# Create X from the values in aq2X_columns
aq2X = aq2DF[aq2X_columns].values

# Assign Y to target column
aq2Y = aq2DF['Sweetness']

# Print out X and Y
print(aq2X)
print(aq2Y)


If your code is correct you should see the following output:

### **Exercise 6: Generate X and Y for Regression Neural Network**

In the cell below, generate X and Y for a regression neural network that can predict the maximum heart rate,`MaxHR`, for the Heart Disease dataset. Start by making a copy of the DataFrame `hdDF` and calling it `hd2DF`. 

Use this new DataFrame, `hd2DF`, for generating your X and Y values, `hd2X` and `hd2Y`, respectively. Since the values in the rightmost column `HeartDisease` are already in numerical form, you will **not** need to map these values to integers as was done in Example 6. Don't forget to drop the column `MaxHR` before you generate your X-values!

When you have generated X and Y, print out the values in `hd2X` and `hd2Y`. 

In [None]:
# Insert your code for Exercise 6 here




If your code is correct you should see the following output:

### Example 7: Construct, compile and fit a regression neural network

The x and y values are now ready for a regression neural network. When building a neural network for regression, there are three points to remember. 

* **First:** a regression neural network only has a single neuron in its output layer. The "voltage" in this single output neuron at the end of each epoch, represents the network's numerical prediction. So in this example, the numerical value in the output neuron would be the `Sweetness` value. 

The difference between the predicted sweetness values and the actual sweetness values stored in the Y variable, `aq2Y`, is the loss value. The better the neural network becomes at predicting sweetness based on all of the apple's other attributes (i.e., `Weight`, `Crunchiness`, `Juiciness`, `Ripeness`, `Acidity` and `Quality`), the smaller the loss value becomes.

* **Second:** a regression neural network does **not** use an activation function in the output layer. 

* **Third:** when you compile a regression neural network, the loss function should be `mean_squared_error`. 
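
To show what the `mean_squared_error` loss measures, here is a small NumPy sketch. The function below is written for illustration and is not the Keras implementation; the actual and predicted values are invented.

```python
import numpy as np

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between actual and predicted values."""
    return float(np.mean((y_true - y_pred) ** 2))

# Hypothetical actual and predicted values for three samples
y_true = np.array([1.0, -0.5, 2.0])
y_pred = np.array([0.5, -0.5, 3.0])

mse = mean_squared_error(y_true, y_pred)
rmse = mse ** 0.5  # square root returns the error to the units of Y

print(mse, rmse)
```

The three errors are 0.5, 0 and -1, so the squared errors are 0.25, 0 and 1, and their average is about 0.417. Squaring means that large errors are penalized much more heavily than small ones.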


In [None]:
# Example 7: Construct, compile and fit

# Construct regression neural network
aq2Model = Sequential()
aq2Model.add(Dense(25, input_dim=aq2X.shape[1], activation='relu')) # Hidden 1
aq2Model.add(Dense(10, activation='relu')) # Hidden 2
aq2Model.add(Dense(1)) # Output: 1 neuron, no activation

# Compile model with MSE loss function
aq2Model.compile(loss='mean_squared_error', optimizer='adam')

# Fit model to the data
aq2Model.fit(aq2X,aq2Y,verbose=2,epochs=100)


If your code is correct you should see something similar to the following output:

### **Exercise 7: Construct, compile and fit a regression neural network**

In the cell below, create a regression neural network for the Heart Disease dataset that can predict the maximum heart rate (`MaxHR`). Use Example 7 as a template. Call your neural network `hd2Model`. Train (fit) your model for 100 epochs.

In [None]:
# Insert your code for Exercise 7 here



If your code is correct you should see something similar to the following output:

## **Lesson Turn-in**

When you have completed all of the code cells, run all of them in sequential order (the last code cell should be number 20, not counting the Appendix below). Then use the **File --> Print.. --> Save to PDF** to generate a PDF of your JupyterLab notebook. Save your PDF as `Class_04_1.lastname.pdf` where _lastname_ is your last name, and upload the file to Canvas.

## Appendix

The code in the cells below uses the DataFrame method `describe()` to print out a statistical summary of the data in the Apple Quality and Heart Disease datasets. DataFrame columns with categorical values are **not** included in the statistical summary.
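
If you ever want the categorical columns summarized as well, `describe()` accepts an `include` argument. The sketch below uses a hypothetical two-column table to compare the default summary with `include="all"`.

```python
import pandas as pd

# Hypothetical toy table with one numeric and one string column
toyDF = pd.DataFrame({
    "age": [41, 52, 63],
    "sex": ["M", "F", "M"],
})

# Default: only numeric columns are summarized
numeric_summary = toyDF.describe()

# include="all": string columns are summarized too (count, unique, top, freq)
full_summary = toyDF.describe(include="all")

print(list(numeric_summary.columns))
print(list(full_summary.columns))
```

In the full summary, statistics that do not apply to a column (for example, the mean of a string column) appear as `NaN`.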


In [None]:
aqDF.describe()

In [None]:
hdDF.describe()