<a href="https://colab.research.google.com/github/akarajic/CUS615/blob/main/ProblemSet3/Problem_set_03_SVM_Classifer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

This notebook is part of Dr. Christoforos Christoforou's course materials. You may not, nor may you knowingly allow others to reproduce or distribute lecture notes, course materials or any of their derivatives without the instructor's express written consent.

# Problem Set 03 - Support Vector Machines Classifiers
**Professor:** Dr. Christoforos Christoforou

For this problem set you will need the following libraries, which are pre-installed with the colab environment: 

* [Numpy](https://www.numpy.org/) is an array manipulation library, used for linear algebra, Fourier transform, and random number capabilities.
* [Pandas](https://pandas.pydata.org/) is a library for data manipulation and data analysis.
* [Matplotlib](https://matplotlib.org/) is a library that generates figures and provides a graphical user interface toolkit.

You can load them using the following import statement:

In [19]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

## 1. Objective 
As part of this problem set, you will explore the `wine quality dataset` in order to:
- Explore the physicochemical properties of red wine
- Determine an optimal machine learning model for red wine quality classification

For that, you will be using a Support Vector Machine (SVM) classifier. Review the information provided in the problem set, and complete all challenges listed.

## 2. Wine Quality Dataset - Data Description

For this problem set you will be using the `wine quality dataset`. Below is a description of the various parameters listed in that dataset (i.e. potential features):

* fixed.acidity (tartaric acid - g / dm^3): most acids involved with wine are fixed or nonvolatile (do not evaporate readily)
* volatile.acidity (acetic acid - g / dm^3): the amount of acetic acid in wine, which at too high of levels can lead to an unpleasant, vinegar taste 
* citric.acid (g / dm^3): found in small quantities, citric acid can add freshness and flavor to wines
* residual.sugar (g / dm^3): the amount of sugar remaining after fermentation stops, it's rare to find wines with less than 1 gram/liter and wines with greater than 45 grams/liter are considered sweet 
* chlorides (sodium chloride - g / dm^3): the amount of salt in the wine 
* free.sulfur.dioxide (mg / dm^3): the free form of SO2 exists in equilibrium between molecular SO2 (as a dissolved gas) and bisulfite ion; it prevents microbial growth and the oxidation of wine 
* total.sulfur.dioxide (mg / dm^3): amount of free and bound forms of SO2; in low concentrations, SO2 is mostly undetectable in wine, but at free SO2 concentrations over 50 ppm, SO2 becomes evident in the nose and taste of wine
* density (g / cm^3): the density of wine is close to that of water depending on the percent alcohol and sugar content
* pH: describes how acidic or basic a wine is on a scale from 0 (very acidic) to 14 (very basic); most wines are between 3-4 on the pH scale 
* sulphates (potassium sulphate - g / dm^3): a wine additive which can contribute to sulfur dioxide gas (SO2) levels, which acts as an antimicrobial and antioxidant
* alcohol (% by volume): the percent alcohol content of the wine 
* quality: quality score between 0 and 10



## Download the dataset from Kaggle
You will use the Kaggle CLI to download the `Wine Quality Dataset` to your Colab environment. You will need to upload your Kaggle API key (see Problem Set 01 for directions on how to obtain it).

In [20]:
# install kaggle CLI
!pip install -q kaggle

In [21]:
# Upload the kaggle API key of your account 
from google.colab import files 
files.upload()
!mkdir ~/.kaggle
!cp kaggle.json ~/.kaggle
!chmod 600 ~/.kaggle/kaggle.json

Saving kaggle.json to kaggle (1).json
mkdir: cannot create directory ‘/root/.kaggle’: File exists


In [22]:
# View list of data files available in the dataset. 
# Format: kaggle datasets files <dataset-URI>
!kaggle datasets files cchristoforou/practice-dataset-for-tutorials

name                      size  creationDate         
-----------------------  -----  -------------------  
wine.data                 11KB  2021-02-25 21:48:50  
country_total.csv        533KB  2021-02-25 21:48:50  
countries.csv              2KB  2021-02-25 21:48:50  
dataset_37_diabetes.csv   33KB  2021-02-25 21:48:50  
wineQualityReds.csv       92KB  2021-02-25 21:48:50  


In [23]:
# Download - Specify the parameters.  
kaggle_dataset_URI = "cchristoforou/practice-dataset-for-tutorials"
output_folder = "sample_data/problem_set02"
kaggle_data_file1 = "wineQualityReds.csv"

In [24]:
# Download the selected file from the dataset - wineQualityReds.csv
!kaggle datasets download $kaggle_dataset_URI --file $kaggle_data_file1 --path $output_folder 


wineQualityReds.csv: Skipping, found more recently modified local copy (use --force to force download)


## Load the data 
The code below shows how to load the data into a pandas `DataFrame` and apply `train_test_split` to it.

In [25]:
# Code to load the data from file. Here we use the pandas library to read the csv file. 
datafile = "./sample_data/problem_set02/wineQualityReds.csv"
wine_df = pd.read_csv(datafile)
wine_df.drop(wine_df.columns[0],axis=1,inplace=True)

In [26]:
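# Split the data into a training and testing set using the sklearn function train_test_split
# Notice that test_size=.25 holds out 25% of the rows for testing and random_state=42 makes the split reproducible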
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(wine_df.drop('quality',axis=1), wine_df['quality'], test_size=.25, random_state=42)
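
To see what `test_size=.25` and `random_state=42` do in practice, the sketch below applies the same call to a synthetic stand-in frame. It assumes 1599 rows (the sum of the 1199 training and 400 testing rows this notebook reports); the names `toy_df`, `feature_a`, `feature_b` and the helper `split` are placeholders, and the values are random, not wine data.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for wine_df: made-up features and a made-up label column.
rng = np.random.default_rng(0)
toy_df = pd.DataFrame({
    "feature_a": rng.normal(size=1599),
    "feature_b": rng.normal(size=1599),
    "quality": rng.integers(3, 9, size=1599),
})


def split(df):
    """Hypothetical helper: same split call as in the notebook cell."""
    return train_test_split(
        df.drop("quality", axis=1), df["quality"],
        test_size=0.25, random_state=42,
    )


X_tr, X_te, y_tr, y_te = split(toy_df)
print(X_tr.shape, X_te.shape)

# Repeating the call with the same random_state selects the same rows.
X_tr_again, _, _, _ = split(toy_df)
same_split = X_tr.index.equals(X_tr_again.index)
print(same_split)
```

With 1599 rows, a quarter rounds up to 400 test rows and leaves 1199 for training, which matches the shapes shown in the outputs of this notebook.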


## Challenge 1
Use the variables `X_train`, `X_test`, `y_train`, and `y_test` to explore your data. In particular, calculate and display the following information.

* Number of samples in the training set in total and in each class.
* Number of samples in the testing set in total and in each class.
* Number of features in the dataset. 
* Number of classes in the dataset.
* IDs (labels) of the classes.
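
One possible way to obtain the quantities listed above is sketched here on toy data. The variables `toy_X_train`, `toy_y_train` and `toy_y_test` are made-up stand-ins for the real splits, so the numbers are illustrative only.

```python
import pandas as pd

# Toy stand-ins for the real splits (values are made up for illustration).
toy_X_train = pd.DataFrame({"f1": range(8), "f2": range(8)})
toy_y_train = pd.Series([5, 6, 5, 7, 6, 5, 4, 6])
toy_y_test = pd.Series([5, 6, 6, 7])

# Total number of samples and number of samples per class in each split.
n_train = len(toy_y_train)
n_test = len(toy_y_test)
train_counts = toy_y_train.value_counts().sort_index()
test_counts = toy_y_test.value_counts().sort_index()

# Number of features is the number of columns in the feature frame.
n_features = toy_X_train.shape[1]

# Class IDs are the distinct label values seen across both splits.
class_ids = sorted(int(c) for c in pd.concat([toy_y_train, toy_y_test]).unique())
n_classes = len(class_ids)

print("train:", n_train, train_counts.to_dict())
print("test:", n_test, test_counts.to_dict())
print("features:", n_features)
print("classes:", n_classes, class_ids)
```

Calling `value_counts()` on the label series gives per-class counts; calling it on the feature frame instead counts duplicate rows, which is usually not what this challenge asks for.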


In [27]:
import numpy as np
print(X_train.shape)
print('There are 1199 rows and 11 columns in the X_train set')
print(X_test.shape)
print('There are 400 rows and 11 columns in the X_test set')
print(y_train.shape)
print('There are 1199 rows in the y_train set')
print(y_test.shape)
print('There are 400 rows in the y_test set')
print('There are 1199 samples in the training set in total and 400 samples in the testing set in total')

print('There are 11 features in the X_train and X_test sets. They are: ')
print(X_train.columns)

print(X_train.value_counts())
print(X_test.value_counts())
print(y_train.value_counts())
print(y_test.value_counts())

(1199, 11)
There are 1199 rows and 11 columns in the X_train set
(400, 11)
There are 400 rows and 11 columns in the X_test set
(1199,)
There are 1199 rows in the y_train set
(400,)
There are 400 rows in the y_test set
There are 1199 samples in the training set in total and 400 samples in the testing set in total
There are 11 features in the X_train and X_test sets. They are: 
Index(['fixed.acidity', 'volatile.acidity', 'citric.acid', 'residual.sugar',
       'chlorides', 'free.sulfur.dioxide', 'total.sulfur.dioxide', 'density',
       'pH', 'sulphates', 'alcohol'],
      dtype='object')
fixed.acidity  volatile.acidity  citric.acid  residual.sugar  chlorides  free.sulfur.dioxide  total.sulfur.dioxide  density  pH    sulphates  alcohol
7.0            0.69              0.07         2.5             0.091      15.0                 21.0                  0.99572  3.38  0.60       11.3       3
9.3            0.36              0.39         1.5             0.080      41.0              

# Challenge 2

Train an **SVM** classifier using the `(X_train,y_train)` dataset and use the trained model to predict the underlying classes for the observations in the test dataset `X_test`. Store your prediction in a variable called `y_pred`.

In [28]:
# Your solution 
from sklearn import svm
model  = svm.SVC(kernel='linear')
model.fit(X_train,y_train)
y_pred= model.predict(X_test)
print(y_pred)


[5 5 6 5 6 5 5 5 5 6 6 5 5 5 5 6 5 5 6 5 5 5 6 6 5 5 6 5 5 6 5 5 6 5 5 5 6
 6 5 6 5 5 6 5 6 6 6 5 5 6 5 5 6 6 5 5 6 5 6 5 5 6 5 5 6 5 6 5 6 5 6 5 6 6
 6 5 6 6 6 6 5 6 5 6 6 6 5 6 6 5 6 5 6 6 5 5 5 6 5 5 5 5 6 6 5 6 5 5 6 5 6
 5 6 5 6 6 6 5 5 6 6 5 6 5 5 5 6 6 6 6 6 5 5 6 6 5 5 5 5 6 6 6 6 6 6 5 6 5
 6 5 6 6 5 6 6 6 5 6 5 6 6 6 6 5 5 6 5 5 5 5 5 5 6 5 5 6 6 5 5 5 5 6 5 6 5
 6 6 6 6 5 6 6 6 6 6 5 5 5 5 6 5 5 5 5 6 6 5 5 5 6 6 5 6 6 6 6 5 5 6 5 5 6
 6 6 5 5 5 6 5 5 5 5 6 6 6 6 5 6 5 5 5 5 6 6 5 5 6 5 6 5 6 6 5 5 5 5 5 6 6
 6 6 6 5 6 6 6 5 5 6 6 5 6 5 5 5 5 6 6 6 5 6 5 5 5 5 6 5 6 5 6 5 6 5 5 5 6
 5 6 6 6 5 5 6 5 5 5 6 6 6 6 6 6 5 5 5 6 5 5 6 5 6 6 6 5 5 5 6 6 5 6 6 6 5
 5 5 6 6 6 5 5 6 6 6 5 6 5 6 5 6 6 5 6 5 5 6 6 5 5 5 6 6 5 5 6 5 6 6 5 5 5
 5 5 6 6 5 6 5 6 5 5 5 6 6 5 6 6 6 5 5 6 5 6 5 5 6 6 6 5 5 6]


# Challenge 3

Evaluate the performance of your classifier. Calculate and display the following:
* print the `confusion matrix`.
* `normalized confusion matrix`. 
* the probability of correct classification (accuracy score).
* the `precision`, `recall`, and `f1-score` for each class.
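
As a rough illustration of these metrics, the sketch below computes them on a small made-up label set (`toy_true` and `toy_pred` are placeholders, not results from the wine data). It uses `precision_recall_fscore_support` to get one value per class rather than a single macro average, and passes `zero_division=0` so that a class that is never predicted reports a precision of 0 instead of raising a warning.

```python
import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_recall_fscore_support)

# Made-up true and predicted labels; class 7 is never predicted.
toy_true = np.array([5, 5, 5, 6, 6, 6, 7, 7])
toy_pred = np.array([5, 5, 6, 6, 6, 5, 6, 6])
labels = [5, 6, 7]

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(toy_true, toy_pred, labels=labels)
# normalize="true" divides each row by the number of true samples in that class.
cm_norm = confusion_matrix(toy_true, toy_pred, labels=labels, normalize="true")
acc = accuracy_score(toy_true, toy_pred)

# One precision, recall, F1 and support value per class, in the order of `labels`.
precision, recall, f1, support = precision_recall_fscore_support(
    toy_true, toy_pred, labels=labels, zero_division=0
)

print(cm)
print(cm_norm)
print("accuracy:", acc)
for label, p, r, f in zip(labels, precision, recall, f1):
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Each row of the normalized matrix sums to 1, so its diagonal entries are the per-class recall values.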

In [29]:
from sklearn.metrics import confusion_matrix,precision_score,accuracy_score,recall_score,f1_score,classification_report 
cm = confusion_matrix(y_test, y_pred)
cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
ac = accuracy_score(y_test,y_pred)
ps = precision_score(y_test, y_pred, average='macro')
rs = recall_score(y_test, y_pred, average='macro')
f1s = f1_score(y_test, y_pred, average='macro')
print('Confusion Matrix:')
print(cm)
print(' \n Normalized Confusion Matrix:')
print(cm_normalized)
print(' \n Accuracy Score:')
print(ac)
print(' \n Precision Score:')
print(ps)
print(' \n Recall Score:')
print(rs)
print(' \n F1-Score:')
print(f1s)

Confusion Matrix:
[[  0   0   1   0   0   0]
 [  0   0  10   3   0   0]
 [  0   0 125  39   0   0]
 [  0   0  66 103   0   0]
 [  0   0   3  45   0   0]
 [  0   0   0   5   0   0]]
 
 Normalized Confusion Matrix:
[[0.         0.         1.         0.         0.         0.        ]
 [0.         0.         0.76923077 0.23076923 0.         0.        ]
 [0.         0.         0.76219512 0.23780488 0.         0.        ]
 [0.         0.         0.39053254 0.60946746 0.         0.        ]
 [0.         0.         0.0625     0.9375     0.         0.        ]
 [0.         0.         0.         1.         0.         0.        ]]
 
 Accuracy Score:
0.57
 
 Precision Score:
0.18966020429435063
 
 Recall Score:
0.22861042959542022
 
 F1-Score:
0.20724014016696943


  _warn_prf(average, modifier, msg_start, len(result))


# Challenge 4

The code below loads the same dataset, but treats it as a binary classification problem. That is, instead of classifying an observation into one of 11 possible quality scores (0 to 10), we consider all observations with a score above 5 as good and all observations with a score of 5 or below as bad.
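
A tiny worked example of this thresholding rule, using made-up scores (`toy_quality` is a placeholder), shows how the boundary value 5 is handled:

```python
import numpy as np

# Made-up quality scores; 5 sits exactly on the threshold.
toy_quality = np.array([3, 5, 6, 8])

# Scores strictly above 5 become "Good"; 5 and below become "Bad".
toy_labels = np.where(toy_quality > 5, "Good", "Bad")
print(toy_labels)
```

A score of exactly 5 falls on the "Bad" side because the comparison is strict.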





In [30]:
# Code to load the data from file. Here we use the pandas library to read the csv file. 
datafile = "./sample_data/problem_set02/wineQualityReds.csv"
wine_df = pd.read_csv(datafile)
wine_df.drop(wine_df.columns[0],axis=1,inplace=True)

wine_df['quality'] = np.where(wine_df['quality']>5,"Good","Bad")

In [31]:
X_train, X_test, y_train, y_test = train_test_split(wine_df.drop('quality',axis=1), wine_df['quality'], test_size=.25, random_state=42)


## Challenge 4.1
Use the variables `X_train`, `X_test`, `y_train`, and `y_test` to explore your data. In particular, calculate and display the following information.
* Number of samples in the training set in total and in each class.
* Number of samples in the testing set in total and in each class.
* Number of features in the dataset. 
* Number of classes in the dataset.
* IDs (labels) of the classes.




In [32]:
import numpy as np
print(X_train.shape)
print('There are 1199 rows and 11 columns in the X_train set')
print(X_test.shape)
print('There are 400 rows and 11 columns in the X_test set')
print(y_train.shape)
print('There are 1199 rows in the y_train set')
print(y_test.shape)
print('There are 400 rows in the y_test set')
print('There are 1199 samples in the training set in total and 400 samples in the testing set in total')

print('There are 11 features in the X_train and X_test sets. They are: ')
print(X_train.columns)

print(X_train.value_counts())
print(X_test.value_counts())
print(y_train.value_counts())
print(y_test.value_counts())


(1199, 11)
There are 1199 rows and 11 columns in the X_train set
(400, 11)
There are 400 rows and 11 columns in the X_test set
(1199,)
There are 1199 rows in the y_train set
(400,)
There are 400 rows in the y_test set
There are 1199 samples in the training set in total and 400 samples in the testing set in total
There are 11 features in the X_train and X_test sets. They are: 
Index(['fixed.acidity', 'volatile.acidity', 'citric.acid', 'residual.sugar',
       'chlorides', 'free.sulfur.dioxide', 'total.sulfur.dioxide', 'density',
       'pH', 'sulphates', 'alcohol'],
      dtype='object')
fixed.acidity  volatile.acidity  citric.acid  residual.sugar  chlorides  free.sulfur.dioxide  total.sulfur.dioxide  density  pH    sulphates  alcohol
7.0            0.69              0.07         2.5             0.091      15.0                 21.0                  0.99572  3.38  0.60       11.3       3
9.3            0.36              0.39         1.5             0.080      41.0              

## Challenge 4.2 
Train a **Support Vector Machine** classifier using the `(X_train,y_train)` dataset and use the trained model to predict the underlying classes for the observations in the test dataset `X_test`. Store your prediction in a variable called `y_pred`.

In [33]:
from sklearn import svm
model  = svm.SVC(kernel='linear')
model.fit(X_train,y_train)
y_pred= model.predict(X_test)
print(y_pred)

['Bad' 'Bad' 'Good' 'Bad' 'Good' 'Bad' 'Bad' 'Bad' 'Good' 'Good' 'Good'
 'Bad' 'Bad' 'Bad' 'Bad' 'Good' 'Bad' 'Bad' 'Good' 'Bad' 'Bad' 'Bad'
 'Good' 'Good' 'Bad' 'Bad' 'Good' 'Bad' 'Bad' 'Good' 'Bad' 'Bad' 'Good'
 'Bad' 'Bad' 'Bad' 'Good' 'Good' 'Good' 'Good' 'Bad' 'Bad' 'Good' 'Bad'
 'Good' 'Good' 'Good' 'Bad' 'Bad' 'Good' 'Bad' 'Bad' 'Good' 'Good' 'Bad'
 'Bad' 'Good' 'Bad' 'Good' 'Bad' 'Bad' 'Good' 'Bad' 'Bad' 'Good' 'Bad'
 'Good' 'Bad' 'Good' 'Bad' 'Good' 'Bad' 'Good' 'Good' 'Good' 'Bad' 'Good'
 'Good' 'Good' 'Good' 'Bad' 'Good' 'Bad' 'Good' 'Good' 'Good' 'Bad' 'Good'
 'Good' 'Bad' 'Good' 'Bad' 'Good' 'Good' 'Bad' 'Good' 'Bad' 'Good' 'Bad'
 'Bad' 'Bad' 'Bad' 'Good' 'Good' 'Bad' 'Good' 'Bad' 'Bad' 'Good' 'Bad'
 'Good' 'Bad' 'Good' 'Bad' 'Good' 'Good' 'Good' 'Good' 'Bad' 'Good' 'Good'
 'Bad' 'Good' 'Bad' 'Bad' 'Bad' 'Good' 'Good' 'Good' 'Good' 'Good' 'Bad'
 'Bad' 'Good' 'Good' 'Bad' 'Bad' 'Bad' 'Bad' 'Good' 'Good' 'Good' 'Good'
 'Good' 'Good' 'Bad' 'Good' 'Bad' 'Good' 'Bad' 'Good' 'Go

## Challenge 4.3
Evaluate the performance of your classifier. Calculate and display the following:
* print the `confusion matrix`.
* `normalized confusion matrix`. 
* the probability of correct classification (accuracy score).
* the `precision`, `recall`, and `f1-score` for each class.

In [34]:
from sklearn.metrics import confusion_matrix,precision_score,accuracy_score,recall_score,f1_score,classification_report 
cm = confusion_matrix(y_test, y_pred)
cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
ac = accuracy_score(y_test,y_pred)
ps = precision_score(y_test, y_pred, average='macro')
rs = recall_score(y_test, y_pred, average='macro')
f1s = f1_score(y_test, y_pred, average='macro')
print('Confusion Matrix:')
print(cm)
print(' \n Normalized Confusion Matrix:')
print(cm_normalized)
print(' \n Accuracy Score:')
print(ac)
print(' \n Precision Score:')
print(ps)
print(' \n Recall Score:')
print(rs)
print(' \n F1-Score:')
print(f1s)


Confusion Matrix:
[[135  43]
 [ 66 156]]
 
 Normalized Confusion Matrix:
[[0.75842697 0.24157303]
 [0.2972973  0.7027027 ]]
 
 Accuracy Score:
0.7275
 
 Precision Score:
0.7277806945173629
 
 Recall Score:
0.7305648344974187
 
 F1-Score:
0.7267468459942718


# Challenge 5

The **SVM** classifier accepts a number of parameters. These include `C` (the regularization parameter), `kernel`, which specifies the kernel function to be used, and `gamma`, which sets the kernel coefficient for certain kernels (i.e. `rbf`, `poly` and `sigmoid`). You can find more information about the parameters of the SVM classifier implementation on the following pages:

- [SVM documentation on sklearn](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC)
- [User Guide on Support Vector Machines](https://scikit-learn.org/stable/modules/svm.html#svm-classification)
- [Kernel Function Supported by sklearn library](https://scikit-learn.org/stable/modules/svm.html#svm-kernels)


After reading the documentation to understand how the various parameters are used, evaluate the classifier for different values of the `C`, `gamma`, and `kernel` parameters and identify which configuration achieves the best performance on the testing set. Plot or print your results.
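
One way to organize such a comparison is to loop over a small grid of settings and collect the test accuracy of each configuration in a table. The sketch below is a minimal illustration on synthetic data from `make_classification`; the grid values and variable names are assumptions chosen for the example, not values prescribed by the problem set.

```python
from itertools import product

import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic binary data standing in for the wine features and labels.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=42)

# Illustrative grid; gamma is ignored by the linear kernel.
kernels = ["linear", "rbf", "poly", "sigmoid"]
C_values = [0.1, 1.0, 10.0]
gammas = ["scale", "auto"]

rows = []
for kernel, C, gamma in product(kernels, C_values, gammas):
    model = SVC(C=C, kernel=kernel, gamma=gamma)
    model.fit(X_tr, y_tr)
    acc = accuracy_score(y_te, model.predict(X_te))
    rows.append({"kernel": kernel, "C": C, "gamma": gamma, "accuracy": acc})

# Sort so the best-scoring configurations appear first.
results = pd.DataFrame(rows).sort_values("accuracy", ascending=False)
print(results.head())
```

Keeping every configuration in one `DataFrame` makes it straightforward to print the ranking or to plot accuracy against `C` for each kernel.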


In [37]:
from sklearn import svm
model  = svm.SVC(C=1.0,kernel='poly',gamma='scale')
model.fit(X_train,y_train)
y_pred= model.predict(X_test)

from sklearn.metrics import confusion_matrix,precision_score,accuracy_score,recall_score,f1_score,classification_report 
cm = confusion_matrix(y_test, y_pred)
cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
ac = accuracy_score(y_test,y_pred)
ps = precision_score(y_test, y_pred, average='macro')
rs = recall_score(y_test, y_pred, average='macro')
f1s = f1_score(y_test, y_pred, average='macro')
print('Confusion Matrix:')
print(cm)
print(' \n Normalized Confusion Matrix:')
print(cm_normalized)
print(' \n Accuracy Score:')
print(ac)
print(' \n Precision Score:')
print(ps)
print(' \n Recall Score:')
print(rs)
print(' \n F1-Score:')
print(f1s)

Confusion Matrix:
[[ 48 130]
 [ 14 208]]
 
 Normalized Confusion Matrix:
[[0.26966292 0.73033708]
 [0.06306306 0.93693694]]
 
 Accuracy Score:
0.64
 
 Precision Score:
0.6947890818858561
 
 Recall Score:
0.6032999291426258
 
 F1-Score:
0.5714285714285714


Using the 'poly' kernel resulted in a lower accuracy score (0.64) than the linear kernel (0.7275).

In [38]:
from sklearn import svm
model  = svm.SVC(C=1.0,kernel='rbf',gamma='scale')
model.fit(X_train,y_train)
y_pred= model.predict(X_test)

from sklearn.metrics import confusion_matrix,precision_score,accuracy_score,recall_score,f1_score,classification_report 
cm = confusion_matrix(y_test, y_pred)
cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
ac = accuracy_score(y_test,y_pred)
ps = precision_score(y_test, y_pred, average='macro')
rs = recall_score(y_test, y_pred, average='macro')
f1s = f1_score(y_test, y_pred, average='macro')
print('Confusion Matrix:')
print(cm)
print(' \n Normalized Confusion Matrix:')
print(cm_normalized)
print(' \n Accuracy Score:')
print(ac)
print(' \n Precision Score:')
print(ps)
print(' \n Recall Score:')
print(rs)
print(' \n F1-Score:')
print(f1s)

Confusion Matrix:
[[ 65 113]
 [ 32 190]]
 
 Normalized Confusion Matrix:
[[0.36516854 0.63483146]
 [0.14414414 0.85585586]]
 
 Accuracy Score:
0.6375
 
 Precision Score:
0.6485828995270662
 
 Recall Score:
0.6105121975908493
 
 F1-Score:
0.5982683982683983


The 'rbf' kernel scores marginally lower than 'poly', with an accuracy of 0.6375.

In [39]:
from sklearn import svm
model  = svm.SVC(C=1.0,kernel='sigmoid',gamma='scale')
model.fit(X_train,y_train)
y_pred= model.predict(X_test)

from sklearn.metrics import confusion_matrix,precision_score,accuracy_score,recall_score,f1_score,classification_report 
cm = confusion_matrix(y_test, y_pred)
cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
ac = accuracy_score(y_test,y_pred)
ps = precision_score(y_test, y_pred, average='macro')
rs = recall_score(y_test, y_pred, average='macro')
f1s = f1_score(y_test, y_pred, average='macro')
print('Confusion Matrix:')
print(cm)
print(' \n Normalized Confusion Matrix:')
print(cm_normalized)
print(' \n Accuracy Score:')
print(ac)
print(' \n Precision Score:')
print(ps)
print(' \n Recall Score:')
print(rs)
print(' \n F1-Score:')
print(f1s)

Confusion Matrix:
[[ 62 116]
 [120 102]]
 
 Normalized Confusion Matrix:
[[0.34831461 0.65168539]
 [0.54054054 0.45945946]]
 
 Accuracy Score:
0.41
 
 Precision Score:
0.4042746244581107
 
 Recall Score:
0.40388703310051627
 
 F1-Score:
0.40404040404040403


The 'sigmoid' kernel had the worst accuracy, 0.41, which suggests that 'linear' is the best of the four kernels for this dataset at `C=1.0` and `gamma='scale'`.

In [46]:
from sklearn import svm
model  = svm.SVC(C=5.0,kernel='linear',gamma='scale')
model.fit(X_train,y_train)
y_pred= model.predict(X_test)

from sklearn.metrics import confusion_matrix,precision_score,accuracy_score,recall_score,f1_score,classification_report 
cm = confusion_matrix(y_test, y_pred)
cm_normalized = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
ac = accuracy_score(y_test,y_pred)
ps = precision_score(y_test, y_pred, average='macro')
rs = recall_score(y_test, y_pred, average='macro')
f1s = f1_score(y_test, y_pred, average='macro')
print('Confusion Matrix:')
print(cm)
print(' \n Normalized Confusion Matrix:')
print(cm_normalized)
print(' \n Accuracy Score:')
print(ac)
print(' \n Precision Score:')
print(ps)
print(' \n Recall Score:')
print(rs)
print(' \n F1-Score:')
print(f1s)

Confusion Matrix:
[[136  42]
 [ 68 154]]
 
 Normalized Confusion Matrix:
[[0.76404494 0.23595506]
 [0.30630631 0.69369369]]
 
 Accuracy Score:
0.725
 
 Precision Score:
0.7261904761904762
 
 Recall Score:
0.7288693187569593
 
 F1-Score:
0.7244419950399559


Increasing `C` from 1.0 to 5.0 with the linear kernel lowered the accuracy slightly, from 0.7275 to 0.725.
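
The accuracy scores reported in the cells above can be gathered into a single table for comparison. The sketch below copies those values by hand (it does not retrain anything); the `C` value of 1.0 for the first linear run is the `SVC` default.

```python
import pandas as pd

# Accuracy values copied from the outputs reported in this notebook.
reported = pd.DataFrame(
    [
        ("linear", 1.0, 0.7275),
        ("linear", 5.0, 0.725),
        ("poly", 1.0, 0.64),
        ("rbf", 1.0, 0.6375),
        ("sigmoid", 1.0, 0.41),
    ],
    columns=["kernel", "C", "accuracy"],
)

# Rank configurations from best to worst test accuracy.
ranked = reported.sort_values("accuracy", ascending=False).reset_index(drop=True)
best = ranked.iloc[0]
print(ranked)
print("best configuration:", best["kernel"], "with C =", best["C"])
```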


Copyright Statement: Copyright © 2020 Christoforou. The materials provided by the instructor of this course, including this notebook, are for the use of the students enrolled in the course. Materials are presented in an educational context for personal use and study and should not be shared, distributed, disseminated or sold in print — or digitally — outside the course without permission. You may not, nor may you knowingly allow others to reproduce or distribute lecture notes, course materials as well as any of their derivatives without the instructor's express written consent.