**<span style="color:#448844">Note</span>** This notebook is meant to be interactive. Launch this notebook in Jupyter to see its full potential.

# Naive Bayes Exercise

In this notebook, you will learn to implement a Naive Bayes classifier using sklearn. We will create two classifiers: one that assumes a Gaussian distribution and another that assumes a multinomial distribution.

## Instructions for All Labs
* Read each cell and implement the TODOs sequentially. The markdown/text cells also contain instructions which you need to follow to get the whole notebook working.
* Do not change the variable names unless the instructor allows you to.
* Some markdown cells contain questions.
  * For questions <span style="color:red;">colored in red</span>, you must submit your answers in the corresponding Assignment in the course page. Make sure that you enter your responses in the item with the matching question code. Answers that do not follow the prescribed format will automatically be marked wrong by the checker.
  * For questions <span style="color:green;">colored in green</span>, you don't have to submit your answers, but you must think about these questions as they will help enrich your understanding of the concepts covered in the labs.
* You are expected to look up how some functions work on the Internet or in the docs.
* You may add new cells for "scrap work".
* The notebooks will undergo a "Restart and Run All" command, so make sure that your code is working properly.
* You may not reproduce this notebook or share it with anyone.

In [1]:
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline
plt.style.use('ggplot')

plt.rcParams['figure.figsize'] = (12.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'

# Fix the seed of the random number 
# generator so that your results will match ours
np.random.seed(1)

%load_ext autoreload
%autoreload 2

# Gaussian Naive Bayes


For our first dataset (the iris dataset), we assume that the features follow a Gaussian distribution.

**Dataset:**

Our first data set is the iris data set which contains 3 classes of 50 instances each. Each class refers to a type of iris plant.

Attribute Information:

1. Sepal length in cm
2. Sepal width in cm
3. Petal length in cm
4. Petal width in cm
5. Class (Species):
    - Iris Setosa
    - Iris Versicolour
    - Iris Virginica

## Preprocessing our data

Let's load the iris dataset.

In [None]:
import pandas as pd

iris = pd.read_csv('iris.csv')
iris.head()

In [None]:
iris.groupby('species').describe()

Right now, we have to convert our nominal labels (word labels: setosa, versicolor, virginica) into numerical labels (number labels: 0, 1, 2; 0 for setosa, 1 for versicolor, 2 for virginica).

In [4]:
from sklearn import preprocessing

This will map each unique nominal value in `iris["species"]` to a unique number
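
Conceptually, the encoding amounts to sorting the distinct labels and numbering them in that order. The sketch below reproduces that idea in plain Python on a hypothetical list of labels (not read from `iris.csv`); the names `labels`, `classes`, `name_to_code` and `encoded` are made up for this illustration.

```python
# Hypothetical toy labels, not read from iris.csv
labels = ["versicolor", "setosa", "virginica", "setosa", "versicolor"]

# The sorted unique values play the role of label_enc.classes_
classes = sorted(set(labels))

# Each class name is assigned its position in the sorted list
name_to_code = {name: code for code, name in enumerate(classes)}
encoded = [name_to_code[name] for name in labels]

print(classes)   # ['setosa', 'versicolor', 'virginica']
print(encoded)   # [1, 0, 2, 0, 1]
```

Because the codes follow alphabetical order, setosa maps to 0, versicolor to 1 and virginica to 2, as described above.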

In [None]:
label_enc = preprocessing.LabelEncoder()
label_enc.fit(iris["species"])

We can check the original labels

This will transform the list to match the numerical code mapping

In [None]:
label_enc.transform(iris["species"])

Let's see the mapping of the original nominal labels and the numerical codes

In [None]:
print("Original labels:", label_enc.classes_, "\n")

print("Mapping from nominal to numerical labels:")
print(dict(zip(label_enc.classes_,label_enc.transform(label_enc.classes_))))

Now that we have the numerical encoding and the mapping, we can now change the `species` column to its numerical mapping

In [8]:
iris["species"] = label_enc.transform(iris["species"])

Let's see the results now:

In [None]:
iris.head()

Like in the previous notebooks, we will separate our `X` from our target `y` (species). 


__Note__: `iris.values[:,:-1]` will get all rows, and all columns except for the last column


__Note__: `iris.values[:,-1]` will get the last column only. We cast the labels to integers because their default data type is float.
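
The two slicing patterns in the notes can be tried on a small array first. This is a minimal sketch on a hypothetical 3x3 table (two feature columns and one label column); `table`, `features` and `labels` are illustrative names only.

```python
import numpy as np

# Hypothetical table: two feature columns followed by a label column
table = np.array([[5.1, 3.5, 0.0],
                  [7.0, 3.2, 1.0],
                  [6.3, 3.3, 2.0]])

features = table[:, :-1]            # all rows, every column except the last
labels = table[:, -1].astype(int)   # all rows, last column only, cast to int

print(features.shape)    # (3, 2)
print(labels.tolist())   # [0, 1, 2]
```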

In [10]:
X = iris.values[:,:-1]
y = iris.values[:,-1].astype(int)

In [None]:
print("X shape: ", X.shape)
print("y shape: ", y.shape)

Let's separate the training from the test set. 

Set the test size to `0.3` and make sure to stratify based on `species`/`y`. Also set the `random_state` to `42` so our results match.

In [12]:
from sklearn.model_selection import train_test_split
# train_test_split?

In [13]:
# write code here
X_train, X_test, y_train, y_test = None

In [None]:
X_train.shape

**Sanity Check:** X_train should have a shape of `(105, 4)`
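
To see what stratifying buys us, the sketch below splits a hypothetical label vector by hand so that every class contributes 30% of its rows to the test set. It is written in plain NumPy and is not how `train_test_split` is implemented; the data, the seed and all variable names are assumptions made for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)          # seed chosen arbitrarily for this sketch
y_toy = np.array([0] * 10 + [1] * 20)   # hypothetical labels with a 1:2 class ratio

test_idx = []
for cls in np.unique(y_toy):
    cls_idx = np.flatnonzero(y_toy == cls)   # positions of this class
    rng.shuffle(cls_idx)                     # pick rows at random within the class
    n_test = len(cls_idx) * 3 // 10          # 30% of each class goes to the test set
    test_idx.extend(cls_idx[:n_test])

test_idx = np.array(test_idx)
train_mask = np.ones(len(y_toy), dtype=bool)
train_mask[test_idx] = False

print(np.bincount(y_toy[train_mask]).tolist())  # [7, 14]
print(np.bincount(y_toy[test_idx]).tolist())    # [3, 6]
```

Both subsets keep the original 1:2 ratio, which is the property the `stratify` argument is meant to preserve.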

## Building our model
Because our features `X` are continuous values, we will use `sklearn`'s `GaussianNB` model.

In [15]:
from sklearn.naive_bayes import GaussianNB

Initialize a `GaussianNB` model

In [16]:
iris_nb = GaussianNB()

Train it

In [None]:
iris_nb.fit(X_train, y_train)

And, get its training predictions

In [None]:
predictions = iris_nb.predict(X_train)
predictions

We will be computing for the accuracy multiple times in this notebook, so let's create a function for this.

`compute_accuracy()` will compute for the accuracy given two vectors of equal length

__Inputs:__
- `predictions`: A numpy array of shape `(N,)` consisting of `N` samples representing the predicted values
- `actual`: A numpy array of shape `(N,)` consisting of `N` samples representing the actual (target) values

__Outputs:__
- `accuracy`: A scalar representing the percentage of elements where `predictions` and `actual` match out of the total number of elements

In [19]:
def compute_accuracy(predictions, actual):
    # write code here
    return None

Let's see how well our model performed on the training data

In [None]:
print("Training accuracy: ", compute_accuracy(predictions, y_train), "%")

That's a good result. Let's see if it will perform well on our test set.

In [None]:
predictions = iris_nb.predict(X_test)
predictions

In [None]:
print("Test accuracy: ", compute_accuracy(predictions, y_test), "%")

**Sanity Check:** You should get a 91.111% accuracy

## Checking the learned parameters
We can also peer into the parameters the model learned.

This is how you get the number of instances (of each class) the model received as the training set

In [None]:
iris_nb.class_count_

You can also get the priors the model learned

In [None]:
# write code here


<span style="color:red;">**Question 7-1:** What is the prior computed for the first class? Round off to four decimal places.</span>

<span style="color:red;">**Question 7-2:** What is the prior computed for the second class? Round off to four decimal places.</span>

<span style="color:red;">**Question 7-3:** What is the prior computed for the third class? Round off to four decimal places.</span>

<span style="color:green;">**Question:** How are the priors calculated?</span>

Gaussian Naive Bayes classifiers have **`k * d * 2`** number of parameters (not including the priors)

> where <br>
> **`k`** - number of classes <br>
> **`d`** - number of dimensions/features <br>
> **`2`** - because we calculate for the means and variances of each feature <br>
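
As a rough illustration of where the `k * d * 2` count comes from, the sketch below computes one mean and one variance per feature per class on a hypothetical toy dataset with `k = 2` and `d = 3`. It uses plain NumPy rather than `GaussianNB`, which additionally adds a small smoothing term to the variances, so treat it as a sketch of the idea only.

```python
import numpy as np

# Hypothetical toy data: 6 samples, d = 3 features, k = 2 classes
X_toy = np.array([[1.0, 2.0, 3.0],
                  [2.0, 2.0, 5.0],
                  [3.0, 2.0, 4.0],
                  [7.0, 0.0, 1.0],
                  [9.0, 1.0, 1.0],
                  [8.0, 2.0, 1.0]])
y_toy = np.array([0, 0, 0, 1, 1, 1])

classes = np.unique(y_toy)
# One row per class, one column per feature
means = np.array([X_toy[y_toy == c].mean(axis=0) for c in classes])
variances = np.array([X_toy[y_toy == c].var(axis=0) for c in classes])

k, d = means.shape
print(means.tolist())   # [[2.0, 2.0, 4.0], [8.0, 1.0, 1.0]]
print(k * d * 2)        # 12
```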

Get the computed means of the model

In [None]:
# write code here


<span style="color:red;">**Question 7-4:** What is the shape of the computed means?</span>

Get the computed variances of the model

In [None]:
# write code here


<span style="color:red;">**Question 7-5:** What is the shape of the computed variances?</span>

____________

# Multinomial Naive Bayes

For our second dataset (spam/not spam), we assume that the features follow a multinomial distribution.

**Dataset:**

Our goal with this dataset is to classify a sentence as either **spam** or **not spam** (ham). 

You can check out `spam_ham.csv` for examples of spam and not spam messages. Open the file and look at the contents of its `body` column.

(This section is a slight modification of <a href="http://www.ritchieng.com/machine-learning-multinomial-naive-bayes-vectorization/">Ritchie Ng's notebook</a>)

## Sample data

Before we go and train with the spam/ham dataset, we have to convert the `body` column into numbers we can crunch. In our case, our features will be the frequency of words in the data instance.

**Example:**


|                                                  | Never | gonna | give | you | up | let | down | make | cry | say | goodbye |
|--------------------------------------------------|-------|-------|------|-----|----|-----|------|------|-----|-----|---------|
|                          Never gonna give you up |   1   |   1   |   1  |  1  |  1 |  0  |   0  |   0  |  0  |  0  |    0    |
| Never gonna give you up Never gonna let you down |   2   |   2   |   1  |  2  |  1 |  1  |   1  |   0  |  0  |  0  |    0    |
|                         Never gonna make you cry |   1   |   1   |   0  |  1  |  0 |  0  |   0  |   1  |  1  |  0  |    0    |
|                          Never gonna say goodbye |   1   |   1   |   0  |  0  |  0 |  0  |   0  |   0  |  0  |  1  |    1    |

<div style="text-align: right"><sub>Reference: Never Gonna Give You Up by Rick Astley</sub></div>

In [27]:
data = ["Never gonna give you up",
        "Never gonna give you up Never gonna let you down",
        "Never gonna make you cry",
        "Never gonna say goodbye"]

First, let's convert our words all to lower case. This is a common practice.

In [None]:
for i in range(len(data)):
    data[i] = data[i].lower()
    
data

Now, we'll count the frequency of each word in each sentence.

We will use [CountVectorizer](http://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.text.CountVectorizer.html) to convert the text into a matrix of word/token counts

In [29]:
from sklearn.feature_extraction.text import CountVectorizer

count_vect = CountVectorizer()

The following code will get the words in our dataset

In [None]:
count_vect.fit(data)

Let's see our words/tokens/features:

In [None]:
word_features = count_vect.get_feature_names()
word_features

The following code computes the counts of each word in each of our sentences.

It outputs a **sparse count matrix**. 

__Note:__ "Sparse" refers to the matrix having mostly 0 values (see the table above). If we stored this as a normal (dense) matrix, it would take up a lot of space. To save space, the data is stored in this fashion:
> `(<sentence>, <word>)         <count>`

All combinations where the count is 0 will be ignored

In [None]:
count_sparse_matrix = count_vect.transform(data)
print(count_sparse_matrix)

It may seem like a lot of work to save a little space, but as your data grows this will save a ton of memory.
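
The saving can be made concrete with a small count. The sketch below stores only the non-zero `(sentence, word)` entries in a dictionary and compares that with the number of cells a dense matrix would need. It is a plain-Python illustration of the idea, not the storage format `CountVectorizer` actually returns, and the two hypothetical sentences are assumed to be lower-cased and split on spaces already.

```python
# Hypothetical tokenized sentences (already lower-cased and split on spaces)
sentences = [["never", "gonna", "give", "you", "up"],
             ["never", "gonna", "say", "goodbye"]]
vocabulary = sorted({word for sentence in sentences for word in sentence})

# Sparse storage: keep (sentence index, word index) -> count for non-zero counts only
sparse_counts = {}
for row, sentence in enumerate(sentences):
    for word in sentence:
        key = (row, vocabulary.index(word))
        sparse_counts[key] = sparse_counts.get(key, 0) + 1

dense_cells = len(sentences) * len(vocabulary)
print(dense_cells)          # 14
print(len(sparse_counts))   # 9
```

With two short sentences the gap is small, but with tens of thousands of messages and a vocabulary of over a hundred thousand tokens almost every dense cell would be 0.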

In [None]:
n_sentences = count_sparse_matrix.shape[0]
n_word_features = count_sparse_matrix.shape[1]

# header
for i in range(n_word_features):
    print(word_features[i], end ="\t")
print("sentence", end="\n")
    
for i in range(n_sentences):
    for j in range(n_word_features):
        print(count_sparse_matrix[i, j], end="\t")
    print(data[i], end="\n")

From the [scikit-learn documentation](http://scikit-learn.org/stable/modules/feature_extraction.html#text-feature-extraction):

> In this scheme, features and samples are defined as follows:

> - Each individual token occurrence frequency (normalized or not) is treated as a **feature**.
> - The vector of all the token frequencies for a given document/sentence is considered a multivariate **sample**.

> A **corpus of documents** can thus be represented by a matrix with **one row per document/sentence** and **one column per token** (e.g. word) occurring in the corpus.

> We call **vectorization** the general process of turning a collection of text documents into numerical feature vectors. This specific strategy (tokenization, counting and normalization) is called the **Bag of Words** or "Bag of n-grams" representation. Documents are described by word occurrences while completely ignoring the relative position information of the words in the document.

For our training, we will convert `count_sparse_matrix` into a dense matrix, `count_dense_matrix`.

In [None]:
count_dense_matrix = count_sparse_matrix.toarray()
count_dense_matrix

Here is our data in a pandas DataFrame

In [None]:
pd.DataFrame(count_dense_matrix, columns=count_vect.get_feature_names(), index=data)

**Summary:**

- `vect.fit(train)` **learns the vocabulary** of the training data
- `vect.transform(train)` uses the **fitted vocabulary** to build a document-term matrix from the training data
- `vect.transform(test)` uses the **fitted vocabulary** to build a document-term matrix from the testing data (and **ignores tokens** it hasn't seen before)
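
The fit/transform split in the summary can be sketched in plain Python. The helper functions `fit_vocabulary` and `transform` below are hypothetical stand-ins written for this illustration; they split on whitespace only, whereas `CountVectorizer` applies its own tokenization rules.

```python
def fit_vocabulary(documents):
    """Collect the sorted set of lower-cased, whitespace-separated tokens."""
    return sorted({token for doc in documents for token in doc.lower().split()})

def transform(documents, vocabulary):
    """Count vocabulary tokens per document; tokens outside the vocabulary are skipped."""
    index = {token: i for i, token in enumerate(vocabulary)}
    rows = []
    for doc in documents:
        counts = [0] * len(vocabulary)
        for token in doc.lower().split():
            if token in index:
                counts[index[token]] += 1
        rows.append(counts)
    return rows

train_docs = ["never gonna give you up", "never gonna let you down"]
test_docs = ["never gonna run around"]   # "run" and "around" were never seen in training

vocab = fit_vocabulary(train_docs)
print(vocab)                        # ['down', 'give', 'gonna', 'let', 'never', 'up', 'you']
print(transform(test_docs, vocab))  # [[0, 0, 1, 0, 1, 0, 0]]
```

The test sentence still gets a row with one column per training token, and the two unseen words simply leave no trace.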

# On with the spam/ham dataset
## Preprocessing

Load the text data from the csv file

In [None]:
spamham = pd.read_csv("spam_ham.csv")
spamham.dropna(inplace=True)
spamham.head(10)

In [37]:
from sklearn import preprocessing

Before we proceed to vectorizing, let's change our label type from "spam" and "ham" to numerical values.

Use `sklearn`'s `LabelEncoder` to encode the spamham dataset's labels (`type` in the csv)

In [None]:
# write code here
label_enc = None

Then, get the mapping so we know what the `0`s and `1`s mean later in the notebook

In [None]:
mapping = dict(zip(label_enc.classes_, label_enc.transform(label_enc.classes_)))
print("Mapping:", mapping)

Now, use the fitted `LabelEncoder` to encode the `type` column

In [None]:
# write code here
spamham["type"] = None

spamham.head(10)

**Sanity Check:** The type column should now be in 1's and 0's. Make sure that they are still properly labelled.

Now, we will separate our features `X` from our labels `y`. Disregard the `location` column (it points to the text file where the text `body` came from)

In [None]:
# write code here
X = None
y = None

print("X shape : ", X.shape)
print("y shape : ", y.shape)

**Sanity Check:**
You should see the following:
```
X shape :  (30974,)
y shape :  (30974,)
```

In [None]:
print(X[0:5])
print(y[0:5])

**Sanity Check:**
You should see the following:
```
0    LUXURY WATCHES - BUY YOUR OWN ROLEX FOR ONLY $...
1    Academic Qualifications available from prestig...
2    Greetings all. This is to verify your subscrip...
3    try chauncey may conferred the luscious not co...
4    It's quiet. Too quiet. Well, how about a straw...
Name: body, dtype: object
0    1
1    1
2    0
3    1
4    0
Name: type, dtype: int64
```

Split the dataset into train and test data sets. Set the test size to 30%, and `random_state` to 42. Make sure we also stratify based on the type (spam/ham).

In [43]:
# write code here
X_train, X_test, y_train, y_test = None

In [None]:
y_train.value_counts()

In [None]:
y_test.value_counts()

You should see that the ratio of classes is maintained in both the train and test sets (1.648:1)

### Vectorization

Let's process the data as we did in the section before. Note that we will get a new vocabulary based on the training data (we won't use the *Never gonna give you up* dataset anymore).

In [46]:
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()

Get the words from the training set (we should train without knowing the words from the test set)

In [None]:
count_vect.fit(X_train)

**Sanity Check:** This is a large dataset, so it may take a few seconds.

And then get the frequency of each word in each sentence

In [48]:
X_train_count_sparse_matrix = count_vect.transform(X_train)

__Note:__ A shorthand of these two lines is `count_vect.fit_transform()`

In [None]:
X_train_count_sparse_matrix.shape

**Sanity Check:** The shape should be around (21681, 147622)

Let's check out the fitted vocabulary

In [None]:
count_vect.get_feature_names()

**Sanity Check:** The first entries will look like funny characters. Try looking from the 45,000th word onward to see more "normal" words

In [None]:
count_vect.get_feature_names()[45000:]

Now, we also have to transform our test data to our fitted vocab. 


**Note:** We should not fit the test data's vocabulary. We're going to use the word features we culled from the training dataset.

In [None]:
# write code here
X_test_count_sparse_matrix = None

X_test_count_sparse_matrix.shape

**Sanity Check:**
The number of features (dimensions, not instances) of the train and test should match

**Now we have two transformed sparse matrices:**
- X_train_count_sparse_matrix
- X_test_count_sparse_matrix



## Modelling

Now that we've got preprocessing done, we can focus on building the model. Here, we will use sklearn's `MultinomialNB` because our assumption is that our data follows a multinomial distribution.

In [53]:
from sklearn.naive_bayes import MultinomialNB
# MultinomialNB?

Initialize your `MultinomialNB` model

In [54]:
spam_nb = MultinomialNB()

Fit it to our training data (`X_train_count_sparse_matrix`)

In [None]:
spam_nb.fit(X_train_count_sparse_matrix, y_train)

And, get our training predictions

In [None]:
predictions = spam_nb.predict(X_train_count_sparse_matrix)
predictions

Let's see how well our model worked on our training data

In [None]:
print("Spam/ham training accuracy: ", compute_accuracy(predictions, y_train), "%")

Then, let's see how it does on our test set

In [None]:
predictions = spam_nb.predict(X_test_count_sparse_matrix)
predictions

In [None]:
print("Spam/ham test accuracy: ", compute_accuracy(predictions, y_test), "%")

**Sanity Check:** You should get around a 98% accuracy

We should also be able to call `classification_report` to see how well our model performed with different metrics

In [60]:
from sklearn.metrics import classification_report
# classification_report?

Print the test classification report of our model. Set the `target_names` to `mapping.keys()` so we can see what `0` and `1` refer to.

In [None]:
# write code here


**Sanity check:** You should get the following results
```
              precision    recall  f1-score   support

         ham     0.9573    0.9972    0.9768      3509
        spam     0.9982    0.9730    0.9855      5784

    accuracy                         0.9821      9293
   macro avg     0.9778    0.9851    0.9811      9293
weighted avg     0.9828    0.9821    0.9822      9293
```

<span style="color:red;">**Question 7-6:** Among the classes (`ham` or `spam`), which is more likely to be labelled as its own class?</span>

## Testing our model with our own input

In [None]:
input_test = input("Enter text to check if spam or ham : ")
while input_test.lower() != "q":

    input_test_matrix = count_vect.transform([input_test])

    results = spam_nb.predict(input_test_matrix)
    results_label = ["HAM", "SPAM"]
    print("Text : " + input_test + " is " + results_label[results[0]])

    input_test = input("Enter text to check if spam or ham : ")

## Checking the learned parameters
Let's see the parameters the `MultinomialNB` model learned.

Get the token counts the model computed

In [None]:
token_counts = spam_nb.feature_count_
token_counts.shape

**Sanity check:** You should get a `(2, 147622)` matrix

<span style="color:green;">Question: Why did we get a `(2, 147622)` matrix for the token counts?</span>

To get the token counts of `spam` or `ham`, we can use our `mapping`.

In [64]:
spam_token_counts = token_counts[mapping["spam"]]
ham_token_counts = token_counts[mapping["ham"]]

We can sort the token counts to see the words that occur the least/most for that class

In [None]:
np.sort(spam_token_counts)

While `np.sort` returns the actual counts, `np.argsort` returns the sorted indices
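
The difference is easiest to see on a short array. A minimal sketch with hypothetical counts (the name `counts` and its values are made up for this example):

```python
import numpy as np

counts = np.array([3, 0, 7, 1])   # hypothetical token counts

print(np.sort(counts).tolist())      # [0, 1, 3, 7]
print(np.argsort(counts).tolist())   # [1, 3, 0, 2]

# Reversing the argsort result lists indices from most to least frequent
top_two = np.argsort(counts)[::-1][:2]
print(top_two.tolist())              # [2, 0]
```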

In [None]:
np.argsort(spam_token_counts)

**Sanity check:** You should see the following:

`array([ 73810, 119826, 119827, ..., 128265,  21596, 124004])`

The two sorts show that the `73,810th` word occurred `0` times, while the `124,004th` occurred `53,076` times in the spam sentences. Note that these are raw counts that are skewed because there are significantly more spam sentences. **The model normalizes the counts relative to the class.**
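
One way to picture this normalization is additive smoothing followed by a per-class division, which is the scheme documented for `MultinomialNB`. The sketch below applies it to hypothetical counts with NumPy; `toy_counts`, `smoothed` and `log_prob` are illustrative names, and the numbers are not taken from the spam/ham data.

```python
import numpy as np

# Hypothetical raw token counts: 2 classes (rows) x 3 tokens (columns)
toy_counts = np.array([[2.0, 0.0, 2.0],
                       [30.0, 10.0, 0.0]])
alpha = 1.0   # additive smoothing, so a count of 0 never produces log(0)

smoothed = toy_counts + alpha
# Divide each row by its own total so a class with more text is not favoured
log_prob = np.log(smoothed / smoothed.sum(axis=1, keepdims=True))

# Each row is now a probability distribution over the tokens
print(np.allclose(np.exp(log_prob).sum(axis=1), 1.0))   # True

# Within a class, the ranking of tokens is the same as with raw counts
print(np.argsort(toy_counts[1]).tolist())   # [2, 1, 0]
print(np.argsort(log_prob[1]).tolist())     # [2, 1, 0]
```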

To get the `i`th word/token, we can use `count_vect.get_feature_names()`

In [None]:
count_vect.get_feature_names()[73810]

<span style="color:red;">Question 7-7: What word occurred the most in the spam sentences? Write the word in small letters only.</span>

The following code lists the top occurring words per class:

In [None]:
top = 50

ham_idx = np.argsort(ham_token_counts)[::-1][:top]
spam_idx = np.argsort(spam_token_counts)[::-1][:top]

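# Look up the vocabulary once instead of on every loop iteration
feature_names = count_vect.get_feature_names()

# The header order matches the printed columns: ham first, then spam
print("ham \t spam")
print("------------")

for i in range(top):
    print(feature_names[ham_idx[i]], "\t", feature_names[spam_idx[i]])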

<s>With the code above, you can now reword your scam so that it can bypass a common spam filter.</s>

The model does not depend on raw counts but instead uses the log probability. Get the model's log probabilities.

In [None]:
# write code here


Let's sort the `feature_log_prob_` similar to the way we sorted the token counts

In [None]:
np.sort(spam_nb.feature_log_prob_[mapping["spam"]])

In [None]:
np.argsort(spam_nb.feature_log_prob_[mapping["spam"]])

We can see that the order is maintained, and the `124,004th` word is still the most frequent word in the `spam` sentences.

We can also see the class count and computed priors for each class

In [None]:
spam_nb.class_count_

In [None]:
spam_nb.class_log_prior_

Note that the priors are computed from the count of each class (spam or not spam) in the training data, and they are stored as log probabilities.
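
As a small worked example of that relationship, the sketch below turns hypothetical class counts into log priors with NumPy. The counts are made up and are not the values from this lab, and the calculation corresponds to the case where the priors are learned from the data (`fit_prior=True`).

```python
import numpy as np

toy_class_count = np.array([30.0, 70.0])   # hypothetical counts, not the lab's values

# Prior of a class = its share of the training rows; the model keeps the logarithm
log_prior = np.log(toy_class_count / toy_class_count.sum())

# Exponentiating recovers the class proportions, approximately [0.3, 0.7]
print(np.exp(log_prior))
```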

# Tuning our Naive Bayes model

In this section we will reuse our spam/ham dataset. We will resplit our dataset in the following manner:
1. Allot 20% of the original dataset as our hold-out test set.
1. Allot 25% of the remaining data as our validation data set. The other 75% of the remaining data (60% of the original dataset) will serve as our training data.
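
As a rough check of the sizes these two splits produce, the arithmetic can be sketched as follows. The row count 30974 comes from the earlier `X shape` sanity check; rounding each held-out portion up is an assumption, chosen because it reproduces the shapes in the sanity check later in this section.

```python
import math

n_total = 30974   # number of rows, from the earlier X shape sanity check

n_test = math.ceil(0.20 * n_total)             # 20% hold-out test set
n_train_val = n_total - n_test
n_validation = math.ceil(0.25 * n_train_val)   # 25% of what remains
n_train = n_train_val - n_validation

print(n_test, n_validation, n_train)   # 6195 6195 18584
print(round(n_train / n_total, 2))     # 0.6
```

The training set therefore ends up with about 60% of the original rows, and the validation and test sets with about 20% each.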

We will use `sklearn`'s `ParameterGrid` to tune our hyperparameters

Let's separate our test set. Set the test set to `20%`, stratify based on the target class, and set the `random_state` to 42.

In [None]:
X_train_val, X_test, y_train_val, y_test = train_test_split(X, y, test_size=.2, random_state=42, stratify=y)

print("X_test shape : ", X_test.shape)
print("y_test shape : ", y_test.shape)

We will do the same thing to separate our validation set. Set the validation set size to `25%`, stratify based on the target class, and set the `random_state` to 42.

__Don't forget that we are now splitting `X_train_val` and `y_train_val`.__ There should be no data leakage.

In [None]:
X_train, X_validation, y_train, y_validation = train_test_split(X_train_val, y_train_val, test_size=.25, random_state=42, stratify=y_train_val)

print("X_train shape : ", X_train.shape)
print("y_train shape : ", y_train.shape)
print("X_validation shape : ", X_validation.shape)
print("y_validation shape : ", y_validation.shape)

## Vectorization

Now that we have our data sets prepared, we can now start computing for the token counts. Remember that we will have to refit our vectorizer to our new train data.

Initialize a `CountVectorizer`. Set it to remove English stop words. This should remove common words like `the`, `of`, `and` that will likely not contribute anything meaningful to telling the two classes apart.

In [76]:
# write code here
count_vect = None

Get the count matrix for the training and validation set. 

In [77]:
X_train_count_sparse_matrix = count_vect.fit_transform(X_train)
X_validation_count_sparse_matrix = count_vect.transform(X_validation)

In [None]:
print("X_train_count_sparse_matrix shape: ", X_train_count_sparse_matrix.shape)
print("X_validation_count_sparse_matrix shape: ", X_validation_count_sparse_matrix.shape)

**Sanity check:** You should see the following values:
```
X_train_count_sparse_matrix shape:  (18584, 146302)
X_validation_count_sparse_matrix shape:  (6195, 146302)
```

<span style="color:red;">**Question 7-8:** Why should we not get the count vectorizer to fit on the validation data set instead?</span>

## GridSearch with `ParameterGrid`
In this section, we will use `ParameterGrid` to get the combinations of hyperparameters we will try on our model.
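
The idea behind such a grid is a Cartesian product over the candidate values of each hyperparameter. The sketch below builds one with `itertools.product` on a hypothetical toy grid, deliberately not the values used in this lab; `toy_grid` and `combinations` are names invented for this example, and this is an illustration of the concept rather than `ParameterGrid` itself.

```python
from itertools import product

# Hypothetical toy grid, deliberately not the values used in this lab
toy_grid = {"alpha": [0.5, 1.0], "fit_prior": [True, False]}

names = sorted(toy_grid)
# One dictionary per combination of candidate values
combinations = [dict(zip(names, values))
                for values in product(*(toy_grid[name] for name in names))]

for combo in combinations:
    print(combo)
print(len(combinations))   # 4
```

Each dictionary can be unpacked into keyword arguments, which is what `set_params(**g)` relies on in the tuning loop.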

In [79]:
from sklearn.model_selection import ParameterGrid

Set our base classifier for the spam/ham classifier. Don't train yet

In [80]:
# write code here
spam_nb = None

For this model, we can tweak `alpha` (our smoothing parameter) and whether or not we want the model to learn the class priors from the data (`fit_prior`). You can read more about this in the docs.

In [None]:
spam_nb.get_params()

For the following section, we will define our hyperparameters. For now, set the following hyperparameter choices:

__Hyperparameters__:
- alpha could be 1, 3, 5, 10, 15, 20, 50
- fit_prior could be true or false

In [82]:
# write code here
hyperparameters = [{
    
    
}]

If we call `ParameterGrid`, it should list the following:

In [None]:
list(ParameterGrid(hyperparameters))

For every iteration, we will:
1. Set the parameters of our base model to the current hyperparameter combination 
1. Fit our model to our training data
1. Compute for our training accuracy
1. Run predictions on our validation data
1. Compute for our validation accuracy
1. Keep track of the best performing validation accuracy and its associated hyperparameter combination.

In [None]:
best_score = 0
for g in ParameterGrid(hyperparameters):
    print(g)
    
    spam_nb.set_params(**g)
    
    # write code here
    
    predictions = None
    train_acc = None
    
    # write code here
    predictions = None
    val_acc = None
    
    print(f"Train acc: {train_acc}% \t Val acc: {val_acc}%", end="\n\n")
    
    if val_acc > best_score:
        best_score = val_acc
        best_grid = g

print("Best accuracy: ", best_score, "%")
print("Best grid: ", best_grid)

<span style="color:red;">Question 7-9: What is the best found value for `alpha`? Round off to four decimal places.</span>

<span style="color:red;">Question 7-10: What is the best found value for `fit_prior`?</span>

## Retraining our estimator with the best hyperparameters

Now that we know the best hyperparameters, we can now make a new classifier and retrain it.

In [85]:
# write code here
spam_nb = None

Make sure you train it with both our training and validation set. You can keep the trained `count_vect`.

In [86]:
# write code here
X_train_val_count_sparse_matrix = None

In [None]:
# write code here


## Testing phase

Run predictions on the test data set

In [88]:
X_test_val_count_sparse_matrix = count_vect.transform(X_test)
predictions = spam_nb.predict(X_test_val_count_sparse_matrix)

Compute for the test accuracy

In [None]:
test_acc = compute_accuracy(predictions, y_test)
print("Test accuracy: ", test_acc, "%")

<span style="color:red;">Question 7-11: What is the final test accuracy? Round off to four decimal places.</span> 

# Summary

In this notebook, we created two kinds of Naive Bayes models: Gaussian and Multinomial. 

We also saw the models' learned parameters. For Gaussian NB models, the model learns the mean and variance of each feature per class, while multinomial NB models learn the log probability of each token per class.

We also experienced creating a natural language processing (NLP) machine learning model. Unlike its deep learning counterpart, the features are more hand-crafted because we dictate what the model should look at. In this case, we specifically designed it to look at token/term frequency/count, but we could build more sophisticated versions like inverse document frequency or term frequency-inverse document frequency (TF-IDF). 

## <center>fin</center>


<!-- DO NOT MODIFY OR DELETE THIS -->

<sup>made/compiled by daniel stanley tan & courtney anne ngo 🐰 & thomas james tiam-lee</sup> <br>
<sup>for comments, corrections, suggestions, please email:</sup><sup> danieltan07@gmail.com & courtneyngo@gmail.com & thomasjamestiamlee@gmail.com</sup><br>
<sup>please cc your instructor, too</sup>
<!-- DO NOT MODIFY OR DELETE THIS -->