# SCS 3546: Deep Learning
> Assignment 1: Deep Learning Using Keras

### Your name & student number:

<pre> Olivier Mawaba Sangam</pre>

<pre> Please enter your student number here. </pre>

## Assignment Description

In this assignment you will demonstrate your ability to:

- Train a neural network using Keras to solve a regression problem.

- Perform sensible data preprocessing.

- Experiment with hyperparameter tuning and different model architectures to achieve the best performance.



### Grade Allocation

**15 points total**

- Part 1: 4 Marks
- Part 2: 9 Marks
- Clarity: 2 Marks

The marks for clarity are awarded for code documentation and how well you explained/supported your answers, including the use of visualizations where appropriate.

In [515]:
# setting up the notebook with important libraries
import tensorflow as tf
import numpy as np
from tensorflow import keras
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.model_selection import cross_val_score
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline


# Preamble

### Hyperparameters

A hyperparameter is a parameter whose value is set before the learning process begins.

Some important neural network hyperparameters include:

- number of hidden layers
- number of neurons
- learning rate
- activation function
- optimizer settings

Hyperparameters are crucial to the performance, training speed, and quality of machine learning models.

Through hyperparameter optimization, we search for the combination (tuple) of hyperparameter values that yields the model minimizing a predefined loss function on held-out (validation or test) data.

Important hyperparameters that could be tuned include:

- num_hidden_layers
- neurons_per_layer
- dropout_rate
- activation
- optimizer
- learning_rate
- batch_size
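
As a rough illustration of what searching for the best tuple means, the sketch below enumerates every combination in a small grid and keeps the one with the lowest score. The grid values and the `toy_score` function are assumptions made up for this example; in the assignment, the score would be a cross-validated MSE from a trained Keras model.

```python
from itertools import product

# Hypothetical search space; the values are illustrative assumptions
search_space = {
    "num_hidden_layers": [1, 2],
    "neurons_per_layer": [16, 32],
    "learning_rate": [0.001, 0.01],
}


def toy_score(params):
    """Stand-in for a cross-validated MSE (NOT a real model evaluation)."""
    return (
        abs(params["num_hidden_layers"] - 2)
        + abs(params["neurons_per_layer"] - 32) / 32
        + abs(params["learning_rate"] - 0.001) * 100
    )


def grid_search(space, score_fn):
    """Evaluate every combination in `space` and return the lowest-scoring one."""
    names = list(space)
    best_params, best_score = None, float("inf")
    for values in product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = grid_search(search_space, toy_score)
print(best_params, best_score)
```

An exhaustive grid grows multiplicatively with each added hyperparameter, which is why random or adaptive search is often preferred for larger spaces.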

### Loss Function

- MSE (Mean Squared Error) is used as the score/loss function that will be minimized for hyperparameter optimization.
- In this assignment, we are going to use cross-validation to calculate the score (MSE) for a given set of hyperparameter values.

- MSE is a desirable metric because taking its square root (RMSE) gives an error value in the same units as the target, which we can interpret directly in the context of the problem; in this assignment, that means thousands of dollars.

- Note: Your results may vary given the stochastic nature of the algorithm, evaluation procedure, or differences in numerical precision
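
To make the relationship between these metrics concrete, here is a minimal sketch that computes MSE, RMSE, and MAE by hand. The target and prediction values are made up for illustration (read them as thousands of dollars) and do not come from the dataset.

```python
import math


def mse(y_true, y_pred):
    """Mean of the squared errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Square root of the MSE, expressed in the units of the target."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true, y_pred):
    """Mean of the absolute errors, also in the units of the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up values in thousands of dollars, purely for illustration
y_true = [24.0, 21.5, 34.5, 33.0]
y_pred = [22.0, 23.5, 30.5, 33.0]

print("MSE :", mse(y_true, y_pred))
print("RMSE:", rmse(y_true, y_pred))
print("MAE :", mae(y_true, y_pred))
```

Because the errors are squared before averaging, the single 4-unit miss contributes more to the MSE than the two 2-unit misses combined, which is why RMSE is never smaller than MAE.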

## Dataset Description

We will be using the **Boston Housing dataset** for this assignment. This dataset was collected in 1978 by the US Census Service, and each of the 506 entries represents aggregated data about 14 features for homes from various suburbs in Boston, Massachusetts.

You are **not** expected to perform Exploratory Data Analysis (EDA) on this dataset.

The information and plots that follow are meant to help you get familiar with the data. Your efforts on this assignment should focus on **model training and hyperparameter tuning**, not on EDA.

Features Include:
- CRIM: per capita crime rate by town
- ZN: proportion of residential land zoned for lots over 25,000 sq.ft.
- INDUS: proportion of non-retail business acres per town.
- CHAS: Charles River dummy variable (1 if tract bounds river; 0 otherwise)
- NOX: nitric oxides concentration (parts per 10 million)
- RM: average number of rooms per dwelling
- AGE: proportion of owner-occupied units built prior to 1940
- DIS: weighted distances to five Boston employment centres
- RAD: index of accessibility to radial highways
- TAX: full-value property-tax rate per 10,000 dollars
- PTRATIO: pupil-teacher ratio by town
- B: 1000(Bk - 0.63)^2, where Bk is the proportion of Black residents by town
- LSTAT: Percent lower status of the population
- MEDV: Median value of owner-occupied homes in thousands of dollars (i.e. the outcome variable)

Below is a sample of this data:

# Data import from tensorflow.keras.datasets

In [516]:
from tensorflow.keras.datasets import boston_housing
from sklearn.model_selection import train_test_split
import pandas as pd

In [517]:
# boston_housing.load_data() returns two tuples of NumPy arrays: (train, test)

(X_train, y_train), (X_test, y_test) = boston_housing.load_data()

# Recombining the train and test splits into the full 506-row dataset
X = np.concatenate([X_train, X_test])
y = np.concatenate([y_train, y_test])

# Creating a pandas dataframe
feature_names = [
    'CRIM', 'ZN', 'INDUS', 'CHAS', 'NOX', 'RM', 'AGE',
    'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT'
]
df = pd.DataFrame(data=X, columns=feature_names)
df['MEDV'] = y

In [518]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 506 entries, 0 to 505
Data columns (total 14 columns):
 #   Column   Non-Null Count  Dtype  
---  ------   --------------  -----  
 0   CRIM     506 non-null    float64
 1   ZN       506 non-null    float64
 2   INDUS    506 non-null    float64
 3   CHAS     506 non-null    float64
 4   NOX      506 non-null    float64
 5   RM       506 non-null    float64
 6   AGE      506 non-null    float64
 7   DIS      506 non-null    float64
 8   RAD      506 non-null    float64
 9   TAX      506 non-null    float64
 10  PTRATIO  506 non-null    float64
 11  B        506 non-null    float64
 12  LSTAT    506 non-null    float64
 13  MEDV     506 non-null    float64
dtypes: float64(14)
memory usage: 55.5 KB


The boxplots below help you understand the univariate distributions of these features. Take note of any outliers.
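
For reference, the points a boxplot draws beyond its whiskers usually follow the 1.5 x IQR rule. The sketch below applies that rule to a small made-up sample; the quartile method chosen here (`method="inclusive"`) is an assumption and may differ slightly from the one the plotting library uses.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Made-up sample with one obvious extreme value
sample = [1, 2, 3, 4, 5, 100]
print(iqr_outliers(sample))
```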

In [519]:
import math

from plotly.subplots import make_subplots
import plotly.graph_objects as go

total_items = len(df.columns)
items_per_row = 3
total_rows = math.ceil(total_items / items_per_row)
fig = make_subplots(rows=total_rows, cols=items_per_row)

cur_row = 1
cur_col = 1
for index, column in enumerate(df.columns):
    fig.add_trace(go.Box(y=df[column], name=column), row=cur_row, col=cur_col)

    if cur_col % items_per_row == 0:
        cur_col = 1
        cur_row = cur_row + 1
    else:
        cur_col = cur_col + 1

fig.update_layout(height=1000, width=550,  showlegend=False)
fig.show()

The plots below show how each feature trends with `MEDV`, which is the target variable we seek to predict; `MEDV` is on the y-axis, while the feature value is on the x-axis.

In [520]:
import math

import numpy as np
import plotly.graph_objects as go
from plotly.subplots import make_subplots

total_items = len(df.columns)
items_per_row = 3
total_rows = math.ceil(total_items / items_per_row)
fig = make_subplots(rows=total_rows, cols=items_per_row, subplot_titles=df.columns)

cur_row = 1
cur_col = 1
for index, column in enumerate(df.columns):
    fig.add_trace(
        go.Scattergl(
            x=df[column],
            y=df['MEDV'],
            mode="markers",
            marker=dict(size=3)
        ),
        row=cur_row,
        col=cur_col
    )

    # Fit a degree-1 (linear) trend line and evaluate it at the unique x-values
    x_unique = np.unique(df[column])
    trend = np.poly1d(np.polyfit(df[column], df['MEDV'], 1))(x_unique)

    fig.add_trace(
        go.Scatter(
            x=x_unique,
            y=trend,
            line=dict(color='red', width=1)
        ),
        row=cur_row,
        col=cur_col
    )

    if cur_col % items_per_row == 0:
        cur_col = 1
        cur_row = cur_row + 1
    else:
        cur_col = cur_col + 1

fig.update_layout(height=1000, width=550, showlegend=False)
fig.show()

# Assignment Start
***

- Please follow all instructions carefully.

- Use MSE (Mean Squared Error) as the score/loss function that will be minimized during optimization.








# Data Import

The code below imports the data for you as numpy arrays. The feature columns are in the same order as the list of features given earlier.

In [521]:
# from tensorflow.keras.datasets import boston_housing
# (X_train, y_train), (X_test, y_test) = boston_housing.load_data()
# The data has already been imported above

In [522]:
print(X_train.shape) # these are the features
print(y_train.shape) # this is the target label (MEDV)

(404, 13)
(404,)


# Part 1: Impact of Changing Model Architecture

In this section, we will be comparing a simple single-layer baseline model with two other models having a different network topology.

## a) Baseline model [2 points]

Use Keras to develop a baseline neural network model that has **one single fully-connected hidden layer with the same number of neurons as input features (i.e. 13 neurons).**

Make sure to **standardize** your features (i.e. subtract mean and divide by standard deviation) before training your model. You can also perform any other data-preprocessing that you deem necessary.

- Note: No activation function is used for the output layer because it is a regression problem and we are interested in predicting numerical values directly without transformation.

- The ADAM optimization algorithm should be used to optimize mean squared error loss function.

- Plot learning curves and report on both training and validation performance.
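
A minimal sketch of the standardization step, assuming NumPy arrays: the mean and standard deviation are estimated on the training split only and then reused for the validation split, so no validation statistics leak into training. The helper names and the random data are hypothetical; the notebook itself uses a Keras `Normalization` layer for the same purpose.

```python
import numpy as np


def fit_standardizer(X_train):
    """Estimate per-feature mean and standard deviation from the training data."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    # Guard against constant features, which would cause a division by zero
    std = np.where(std == 0, 1.0, std)
    return mean, std


def standardize(X, mean, std):
    """Subtract the training mean and divide by the training standard deviation."""
    return (X - mean) / std


# Synthetic data standing in for the real features
rng = np.random.default_rng(0)
X_tr = rng.normal(loc=50.0, scale=10.0, size=(100, 3))
X_va = rng.normal(loc=50.0, scale=10.0, size=(20, 3))

mean, std = fit_standardizer(X_tr)
X_tr_scaled = standardize(X_tr, mean, std)
X_va_scaled = standardize(X_va, mean, std)  # reuses the training statistics

print(X_tr_scaled.mean(axis=0).round(6), X_tr_scaled.std(axis=0).round(6))
```

The validation split will generally not have exactly zero mean and unit variance after this transformation, and that is expected.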


In [524]:
# your code here, use as many cells as you need

In [525]:
# be sure to comment on your results for each and every question, describing not just
# the what, but the 'how' and 'why' where possible, to demonstrate your understanding of
# course content

In [526]:
df['CHAS']=df['CHAS'].astype(int)

In [527]:
# Splitting the data in training and validation sets
target='MEDV'
X=df.drop(columns=[target])
y=df[target]

X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)

In [528]:
# Defining the variables
# Numerical features will be standardized; CHAS is a binary (0/1) indicator,
# so it is treated as categorical and fed through the separate categorical input
categorical_features=['CHAS']
num_features = X_train.drop(columns=categorical_features).columns.tolist()

print(f"Numerical features: {num_features} \n Categorical features: {categorical_features} \n Target variable:{target}")

Numerical features: ['CRIM', 'ZN', 'INDUS', 'NOX', 'RM', 'AGE', 'DIS', 'RAD', 'TAX', 'PTRATIO', 'B', 'LSTAT'] 
 Categorical features: ['CHAS'] 
 Target variable:MEDV


In [529]:
print(type(X_train))

<class 'pandas.core.frame.DataFrame'>


In [530]:
from tensorflow.keras import layers
from tensorflow.keras.layers import Normalization, IntegerLookup, CategoryEncoding


In [531]:
# Defining numerical pre-processing input
input_numerical = tf.keras.Input(shape=(len(num_features),), name="numerical_input")

# Creating a normalization layer
normalization_layer = Normalization()

# Fitting it to our data
normalization_layer.adapt(X_train.drop(columns=categorical_features))

# Applying the normalization layer to standardize the numerical input
input_numerical_standardized = normalization_layer(input_numerical)


In [532]:
df['CHAS'].unique()

array([0, 1])

In [544]:
# Defining categorical pre-processing input
input_categorical = tf.keras.Input(shape=(1,), name="cat_input", dtype=tf.int32)

# Creating a category encoding layer
integer_lookup_layer = IntegerLookup(output_mode='binary', max_tokens=2)
integer_lookup_layer.adapt(X_train[categorical_features])

# Applying the category encoding layer to encode the categorical input
input_categorical_encoded = integer_lookup_layer(input_categorical)

# Keep only one column of the one-hot encoded output (e.g., second column)
input_categorical_encoded = input_categorical_encoded[:, 1:2]

In [545]:
# Combine preprocessing layers
# Original numerical and categorical input tensors
preprocessing_inputs = [input_numerical, input_categorical]

# Concatenate the standardized numerical and binary-encoded categorical inputs into one output
preprocessing_outputs = tf.keras.layers.Concatenate()([input_numerical_standardized, input_categorical_encoded])

# Create a Keras preprocessing model that takes the original inputs and returns the inputs transformed and concatenated as output
preprocessing_model = tf.keras.Model(preprocessing_inputs, preprocessing_outputs)

In [546]:
preprocessing_outputs.shape

TensorShape([None, 13])

In [547]:
# Total number of model input features (numerical + categorical)
len(num_features) + len(categorical_features)

13

In [548]:
# Model building

# Layer 1: Hidden Dense Layer with 13 Neurons and ReLU Activation
layer1 = tf.keras.layers.Dense(13, activation='relu')

# Layer 2: Output Dense Layer for Regression (single neuron, no activation function)
layer2 = tf.keras.layers.Dense(1)

sub_model=tf.keras.Sequential([layer1,layer2])

In [549]:
main_output=sub_model(preprocessing_outputs)

In [550]:
# Combine the preprocessing inputs and the sub-model output into a functional model
model = tf.keras.Model(inputs=[input_numerical, input_categorical], outputs=main_output)

In [551]:
# Compile the model with the Adam optimizer and mean squared error loss (MAE is tracked as a metric)
model.compile(loss='mean_squared_error',
              metrics=[tf.keras.metrics.MeanSquaredError(), tf.keras.metrics.MeanAbsoluteError()],
              optimizer='adam')


In [553]:
# Prepare the input data
train_inputs = [X_train[num_features], X_train[categorical_features]]
val_inputs = [X_val[num_features], X_val[categorical_features]]

# Train the model
history = model.fit(train_inputs, y_train,
                    validation_data=(val_inputs, y_val),
                    epochs=100, batch_size=32,verbose=0)

FailedPreconditionError: ignored

In [None]:
import matplotlib.pyplot as plt

In [None]:
def plot_learning_curves(history):
  # Find the epoch with the lowest validation MAE
  min_val_mae_epoch = np.argmin(history.history['val_mean_absolute_error'])
  min_val_mae = history.history['val_mean_absolute_error'][min_val_mae_epoch]

  # Find the epoch with the lowest validation MSE
  min_val_mse_epoch = np.argmin(history.history['val_mean_squared_error'])
  min_val_mse = history.history['val_mean_squared_error'][min_val_mse_epoch]

  # Plot learning curves
  # Plot training and validation MSE
  fig, axes = plt.subplots(2, 1, figsize=(10, 10))

  axes[0].plot(history.history['mean_squared_error'], label='Training MSE')
  axes[0].plot(history.history['val_mean_squared_error'], label='Validation MSE')

  # Highlight min_val_mse point
  axes[0].scatter(min_val_mse_epoch, min_val_mse, color='red')
  axes[0].text(min_val_mse_epoch, min_val_mse, f'Min MSE: {min_val_mse:.2f}', color='red')

  axes[0].set_xlabel('Epoch')
  axes[0].set_ylabel('Mean Squared Error')
  axes[0].legend()
  axes[0].set_title('Mean Squared Error')

  # Plot training and validation MAE
  axes[1].plot(history.history['mean_absolute_error'], label='Training MAE')
  axes[1].plot(history.history['val_mean_absolute_error'], label='Validation MAE')

  # Highlight min_val_mae point
  axes[1].scatter(min_val_mae_epoch, min_val_mae, color='red')
  axes[1].text(min_val_mae_epoch, min_val_mae, f'Min MAE: {min_val_mae:.2f}', color='red')

  axes[1].set_xlabel('Epoch')
  axes[1].set_ylabel('Mean Absolute Error')
  axes[1].legend()
  axes[1].set_title('Mean Absolute Error')

  # Adjust layout to prevent overlap
  plt.tight_layout()
  plt.show()

plot_learning_curves(history)

*Using cross-validation*

In [None]:
from sklearn.model_selection import KFold

In [None]:
def train_with_cv(model,df,num_features,categorical_features,target,num_fold,num_epochs,num_batch_size):
  """
    Train a compiled keras model using k-fold cross-validation.

    Parameters:
        model (tf.keras.Model): The compiled Keras model to be trained.
        df (pandas.DataFrame): The DataFrame containing the dataset.
        num_features (list): List of numerical feature column names.
        categorical_features (list): List of categorical feature column names.
        target (str): label of target column.
        num_fold (int): Number of folds for k-fold cross-validation.
        num_epochs (int): Number of training epochs for each fold.
        num_batch_size (int): Batch size for training.

    Returns:
        tuple: A tuple containing the History object for the fold with the lowest validation MAE,
               the average MAE across folds, and the standard deviation of MAE across folds.
    """
  data=df[num_features+categorical_features+[target]]

  # Snapshot the starting weights so that every fold trains from the same state.
  # Without this reset, later folds would continue from weights already fitted on
  # samples that are now in their validation split.
  # Note: set_weights does not reset the optimizer state.
  initial_weights = model.get_weights()

  # Storing metrics
  all_scores = []
  all_histories =[]

  # Defining the K-fold Cross Validator
  kfold = KFold(n_splits=num_fold, shuffle=True,random_state=42)

  # kfold.split() yields positional index arrays, so we select rows with .iloc
  # to keep the feature names
  for fold_num, (train_indices, val_indices) in enumerate(kfold.split(data)):
    train_fold, val_fold = data.iloc[train_indices], data.iloc[val_indices]
    y_train_fold = train_fold[target]
    y_val_fold = val_fold[target]

    # Prepare the input data
    train_inputs = [train_fold[num_features], train_fold[categorical_features]]
    val_inputs = [val_fold[num_features], val_fold[categorical_features]]

    # Restore the starting weights before training on this fold
    model.set_weights(initial_weights)

    # Train the model
    history = model.fit(train_inputs, y_train_fold,
                        validation_data=(val_inputs, y_val_fold),
                        epochs=num_epochs, batch_size=num_batch_size,verbose=0)

    # Validation MAE per epoch for this fold
    val_mae_history = history.history['val_mean_absolute_error']

    # Store minimum MAE for this fold
    all_scores.append(np.min(val_mae_history))

    # Store the training history for this fold
    all_histories.append(history)

  # Get the (zero-based) index of the fold with the minimum MAE
  min_mae_index = np.argmin(all_scores)

  # Minimum validation MAE from the history object
  min_val_mae = np.min(all_histories[min_mae_index].history['val_mean_absolute_error'])

  # History of the best fold
  min_mae_fold = all_histories[min_mae_index]

  # Average MAE across folds
  avg_mae=np.mean(all_scores)

  # Standard deviation of MAE across folds
  std_mae=np.std(all_scores)

  print(f"Fold #{min_mae_index} had the lowest validation MAE of: {min_val_mae}")
  print('\n',f"The model has an average MAE of {avg_mae} with a standard deviation of {std_mae}.")

  return min_mae_fold,avg_mae,std_mae

In [None]:
df.info()

In [None]:
# Training parameters
num_fold=3
num_epochs=100
num_batch_size=32

In [None]:
model

In [None]:
# Empty DataFrame to log results
results_df = pd.DataFrame(columns=['model_name', 'avg_mae', 'std_mae'])

# Function to log the results of a model in a dataframe
def log_results(results_df,model_id,avg_mae,std_mae):
  # Append the new results as a new row (modifies results_df in place)
  results_df.loc[len(results_df)] = {'model_name': model_id, 'avg_mae': avg_mae, 'std_mae': std_mae}

  return results_df

In [None]:
# Train and evaluate with cross-validation
model1_hist,avg_mae,std_mae = train_with_cv(model,df,num_features,categorical_features,target,num_fold,num_epochs,num_batch_size)

In [None]:
# Log models results
log_results(results_df,"model_1",avg_mae,std_mae)
results_df

In [None]:
# Plot learning curves
plot_learning_curves(model1_hist)

## b) Deeper Network [1 point]

Construct and evaluate a model with 2 dense layers having a smaller number of neurons (e.g. 16, 8).

In [None]:
preprocessing_outputs

In [None]:
# your code here, use as many cells as you need
# be sure to comment on your results for each and every question, describing not just
# the what, but the 'how' and 'why' where possible, to demonstrate your understanding of
# course content

def deeper_network():
  # He-normal initialization is suited to ReLU activations
  initializer = tf.keras.initializers.HeNormal()

  # Create the sub-model that holds the trainable layers
  sub_model = tf.keras.Sequential()

  # Layer 1: hidden dense layer with 16 neurons and ReLU activation
  sub_model.add(Dense(16, input_dim=13, kernel_initializer=initializer, activation='relu'))

  # Layer 2: hidden dense layer with 8 neurons and ReLU activation
  # (its input size is inferred from the previous layer)
  sub_model.add(Dense(8, kernel_initializer=initializer, activation='relu'))

  # Output layer for regression (single neuron, no activation function)
  sub_model.add(Dense(1, kernel_initializer='normal'))

  # Attach the layers to the shared preprocessing graph to form the main model
  main_output = sub_model(preprocessing_outputs)
  model = tf.keras.Model(inputs=[input_numerical, input_categorical], outputs=main_output)

  # Compile the model with the Adam optimizer and mean squared error loss (MAE is tracked as a metric)
  model.compile(loss='mean_squared_error',
                metrics=[tf.keras.metrics.MeanSquaredError(), tf.keras.metrics.MeanAbsoluteError()],
                optimizer='adam')

  return model

In [None]:
model2 = deeper_network()

In [None]:
from tensorflow.keras.utils import plot_model

plot_model(model2,show_dtype=True,expand_nested=True,layer_range=None,show_shapes=True)

In [None]:
# Train and evaluate with cross-validation
model2_hist,avg_mae,std_mae = train_with_cv(model2,df,num_features,categorical_features,target,num_fold,num_epochs,num_batch_size)

In [None]:
# Log models results
log_results(results_df,"model_2",avg_mae,std_mae)
results_df

In [None]:
# Plot learning curves
plot_learning_curves(model2_hist)

Conclusion:



## c) Wider Network [1 point]

Construct and evaluate a wider model with more neurons (e.g. 32, 16).
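
One way to quantify how much capacity each topology adds is to count trainable parameters. The sketch below assumes fully-connected layers with biases and the 13 input features used in this assignment; the wider layout uses the example sizes (32, 16) suggested above.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a fully-connected network.

    `layer_sizes` lists the number of units per layer, starting with the
    number of input features and ending with the output layer.
    """
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output unit
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


architectures = {
    "baseline (13-13-1)": [13, 13, 1],
    "deeper (13-16-8-1)": [13, 16, 8, 1],
    "wider (13-32-16-1)": [13, 32, 16, 1],
}

for name, sizes in architectures.items():
    print(f"{name}: {dense_param_count(sizes)} trainable parameters")
```

With only a few hundred training samples per fold, a larger parameter count means more capacity but also more room to overfit, which is worth keeping in mind when reading the learning curves.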

In [None]:
# your code here, use as many cells as you need

# be sure to comment on your results for each and every question, describing not just
# the what, but the 'how' and 'why' where possible, to demonstrate your understanding of
# course content

# Part 2: Hyperparameter Tuning Experiments

In the following experiments, you will evaluate and compare models trained with different hyperparameters. Please follow the specifications given for each model.

## a) Model 1 [2 points]

- 2 Dense layers:
  - The first with 64 neurons using a ReLU activation function.
  - The second with 64 neurons using a ReLU activation function.
- Choose an appropriate output layer and activation.
- Train model with 100 epochs and obtain cross-validated performance (e.g. with 3 cross-folds).
- Plot both loss and mean absolute error (i.e. learning curves) for both training and validation.
- Report MAE from CV with standard deviation.
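
The reporting step can be sketched without Keras. In the example below a trivial mean-predictor stands in for the neural network (an assumption made to keep the example self-contained), and the targets are made up. The point is the structure: split into folds, fit from scratch on each training split, score on the held-out fold, then report the mean and standard deviation of the per-fold MAE.

```python
import random
from statistics import mean, pstdev


def kfold_indices(n_samples, n_folds, seed=42):
    """Shuffle sample indices and split them into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def cross_validated_mae(y, n_folds=3):
    """Score a mean-predictor with k-fold CV.

    Returns (average MAE, standard deviation of MAE, per-fold MAEs).
    """
    fold_maes = []
    for val_idx in kfold_indices(len(y), n_folds):
        val_set = set(val_idx)
        train_y = [y[i] for i in range(len(y)) if i not in val_set]
        # "Train" the stand-in model from scratch on this fold's training split
        prediction = mean(train_y)
        fold_maes.append(mean(abs(y[i] - prediction) for i in val_idx))
    # pstdev is the population standard deviation, matching np.std's default
    return mean(fold_maes), pstdev(fold_maes), fold_maes


# Made-up targets, purely for illustration
y = [float(v) for v in range(10, 40)]
avg_mae, std_mae, fold_maes = cross_validated_mae(y, n_folds=3)
print(f"MAE: {avg_mae:.3f} +/- {std_mae:.3f} over {len(fold_maes)} folds")
```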

In [None]:
# your code here, use as many cells as you need

# be sure to comment on your results for each and every question, describing not just
# the what, but the 'how' and 'why' where possible, to demonstrate your understanding of
# course content

## b) Model 2 [2 points]

- 2 Dense layers:
  - The first with 128 neurons using a ReLU activation function.
  - The second with 64 neurons using a ReLU activation function.
- Choose an appropriate output layer and activation.
- Train model with 100 epochs and obtain cross-validated performance (e.g. with 3 cross-folds).
- Plot both loss and mean absolute error (i.e. learning curves) for both training and validation.
- Report MAE from CV with standard deviation.

In [None]:
# your code here, use as many cells as you need

# be sure to comment on your results for each and every question, describing not just
# the what, but the 'how' and 'why' where possible, to demonstrate your understanding of
# course content

## c) Model 3 [2 points]

- Same as Model 2, but use tanh activation functions instead of relu.
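
When comparing the two activations, it helps to look at their gradients. The sketch below is a plain-Python illustration (not the Keras implementation): ReLU passes a gradient of 1 for any positive input, while the gradient of tanh shrinks toward 0 as the input moves away from 0 (saturation).

```python
import math


def relu(x):
    """ReLU activation: max(0, x)."""
    return max(0.0, x)


def relu_grad(x):
    """Derivative of ReLU (taken as 0 at x == 0 by convention)."""
    return 1.0 if x > 0 else 0.0


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


for x in (-3.0, -0.5, 0.0, 0.5, 3.0):
    print(
        f"x={x:+.1f}  relu={relu(x):.3f}  relu'={relu_grad(x):.1f}  "
        f"tanh={math.tanh(x):+.3f}  tanh'={tanh_grad(x):.3f}"
    )
```

Since the inputs here are standardized, most pre-activations in the first layer start near 0, where tanh is close to linear; saturation matters more for large weights or deeper stacks.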

In [None]:
# your code here, use as many cells as you need

# be sure to comment on your results for each and every question, describing not just
# the what, but the 'how' and 'why' where possible, to demonstrate your understanding of
# course content

## d) Model 4 [2 points]

- Same as Model 2, but use the rmsprop optimizer when training.
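
To see how the two optimizers differ, the sketch below implements the textbook RMSprop and Adam update rules on the one-dimensional function f(w) = w^2. The hyperparameter values are assumptions chosen for illustration, and the Keras optimizers include options not shown here. The main difference is that Adam adds a bias-corrected moving average of the gradient (momentum) on top of the squared-gradient scaling that RMSprop uses.

```python
import math


def grad(w):
    """Gradient of f(w) = w^2."""
    return 2.0 * w


def run_rmsprop(w, lr=0.1, rho=0.9, eps=1e-7, steps=500):
    """RMSprop: scale each step by a moving average of squared gradients."""
    v = 0.0
    for _ in range(steps):
        g = grad(w)
        v = rho * v + (1.0 - rho) * g * g
        w -= lr * g / (math.sqrt(v) + eps)
    return w


def run_adam(w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-7, steps=500):
    """Adam: RMSprop-style scaling plus momentum, both bias-corrected."""
    m, v = 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the first moment
        v_hat = v / (1.0 - beta2 ** t)  # bias correction for the second moment
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


w_start = 5.0
print("RMSprop final w:", run_rmsprop(w_start))
print("Adam    final w:", run_adam(w_start))
```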

In [None]:
# your code here, use as many cells as you need

# be sure to comment on your results for each and every question, describing not just
# the what, but the 'how' and 'why' where possible, to demonstrate your understanding of
# course content

## e) Model Comparison [1 point]

Which model performed best? Offer your thoughts on why the particular choice of hyperparameters led to improved performance for this model.
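
A small helper like the one below can support the comparison by ranking models on their cross-validated MAE. The records are placeholder numbers for illustration only and are NOT results from this notebook; `rank_models` is a hypothetical helper name.

```python
def rank_models(results):
    """Sort (name, avg_mae, std_mae) records by average MAE, lowest first."""
    return sorted(results, key=lambda record: record[1])


# Placeholder numbers for illustration only; NOT results from this notebook
placeholder_results = [
    ("model_a", 3.0, 0.5),
    ("model_b", 2.5, 0.25),
    ("model_c", 2.75, 0.75),
]

for name, avg_mae, std_mae in rank_models(placeholder_results):
    print(f"{name:<10} {avg_mae:.2f} +/- {std_mae:.2f}")
```

When the gap between two average MAEs is smaller than their standard deviations across folds, the ranking should be read as inconclusive rather than as evidence that one configuration is better.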

In [None]:
# explain WHY you think the best model was better than the rest, in terms
# of how those hyperparameters theoretically impact the model

# provide visualizations (e.g. tables or comparison plot) to support your response where possible

NOTE: 2 additional points are awarded based on code documentation and overall clarity of work.

In [None]:
# We are looking for a clear explanation of results with each response. We want you to attempt to
# explain the _how_ and _why_ behind your answers, and not just the what, to demonstrate
# your knowledge of the concepts discussed in class. Answers should be backed up with
# visualizations (e.g. plots, charts).

# Code should be easy to follow by using sensical naming conventions for function and variable
# names, providing useful code comments, and refactoring repeated code into re-usable functions.