<img src="../../../imgs/CampQMIND_banner.png">

# Naive Bayes
This notebook gives a short review of Bayes' rule and then fits both the Gaussian and Multinomial naive Bayes models from sklearn.

<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Naive-Bayes" data-toc-modified-id="Naive-Bayes-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Naive Bayes</a></span><ul class="toc-item"><li><span><a href="#Probability-Review" data-toc-modified-id="Probability-Review-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Probability Review</a></span><ul class="toc-item"><li><span><a href="#Conditional-Probability" data-toc-modified-id="Conditional-Probability-1.1.1"><span class="toc-item-num">1.1.1&nbsp;&nbsp;</span>Conditional Probability</a></span></li></ul></li><li><span><a href="#Bayes-Rule" data-toc-modified-id="Bayes-Rule-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Bayes Rule</a></span></li><li><span><a href="#Independence-Assumption" data-toc-modified-id="Independence-Assumption-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Independence Assumption</a></span></li><li><span><a href="#What-do-we-want-to-model?" data-toc-modified-id="What-do-we-want-to-model?-1.4"><span class="toc-item-num">1.4&nbsp;&nbsp;</span>What do we want to model?</a></span></li></ul></li><li><span><a href="#Gaussian-Naive-Bayes" data-toc-modified-id="Gaussian-Naive-Bayes-2"><span class="toc-item-num">2&nbsp;&nbsp;</span>Gaussian Naive Bayes</a></span></li><li><span><a href="#Multinomial-Naive-Bayes" data-toc-modified-id="Multinomial-Naive-Bayes-3"><span class="toc-item-num">3&nbsp;&nbsp;</span>Multinomial Naive Bayes</a></span></li><li><span><a href="#Resources" data-toc-modified-id="Resources-4"><span class="toc-item-num">4&nbsp;&nbsp;</span>Resources</a></span></li></ul></div>

In [1]:
from IPython.display import IFrame
IFrame('https://www.youtube.com/embed/CPqOCI0ahss',560,315)

## Probability Review

### Conditional Probability

$$P(A|B) = \frac{P(A\cap{B})}{P(B)} $$
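As a quick check of the definition, the sketch below enumerates the outcomes of two fair dice and computes a conditional probability by counting. The dice example and the helper `prob` are illustrative assumptions, not part of the original notebook.

```python
from fractions import Fraction
from itertools import product

# Sample space: all 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a predicate on outcomes) under the uniform measure."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def A(o):
    return o[0] + o[1] == 8   # event A: the two dice sum to 8

def B(o):
    return o[0] % 2 == 0      # event B: the first die is even

p_b = prob(B)
p_a_and_b = prob(lambda o: A(o) and B(o))

# Definition of conditional probability: P(A|B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b
print(p_a_given_b)  # 1/6
```

Conditioning on B shrinks the sample space to the 18 outcomes where the first die is even; 3 of those sum to 8.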

## Bayes Rule

$$P(A|B) = \frac{P(A)\cdot P(B|A)}{P(B)}$$

We can expand the denominator with the law of total probability.

$$P(A|B) = \frac{P(A)\cdot P(B|A)}{P(A)\cdot P(B|A) + P(A^C)\cdot P(B|A^C)}$$
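The expanded form can be sketched as a small function. The diagnostic-test numbers below (1% prior, 95% true positive rate, 5% false positive rate) are made up for illustration, and `bayes_posterior` is a hypothetical helper name.

```python
def bayes_posterior(prior, likelihood, likelihood_given_complement):
    """Return P(A|B) from P(A), P(B|A) and P(B|A^C).

    The denominator is P(B), expanded with the law of total probability.
    """
    numerator = prior * likelihood
    evidence = numerator + (1 - prior) * likelihood_given_complement
    return numerator / evidence

# Hypothetical numbers: A = "has the condition", B = "test is positive".
p_a = 0.01              # P(A)
p_b_given_a = 0.95      # P(B|A)
p_b_given_not_a = 0.05  # P(B|A^C)

posterior = bayes_posterior(p_a, p_b_given_a, p_b_given_not_a)
print(round(posterior, 3))
```

Even with an accurate test, the posterior stays well below one half here because the prior $P(A)$ is small.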


## Independence Assumption

$$P(A\cap{B}) = P(A){P(B)} $$
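A minimal sketch of checking this identity by enumeration, again on two fair dice. The events and the helper `is_independent` are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))

def prob(event):
    """Probability of an event (a predicate on outcomes)."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def is_independent(a, b):
    """True when P(A and B) equals P(A) * P(B)."""
    return prob(lambda o: a(o) and b(o)) == prob(a) * prob(b)

def first_even(o):
    return o[0] % 2 == 0

def second_high(o):
    return o[1] > 4

def sum_is_8(o):
    return o[0] + o[1] == 8

# Events about different dice are independent.
print(is_independent(first_even, second_high))  # True

# The sum depends on the first die, so these events are not independent.
print(is_independent(first_even, sum_is_8))     # False
```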

## What do we want to model?

The objective is to find $P(y|x_{1},...,x_{n})$, which is the probability of y given the observed features.

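By applying Bayes' rule, assuming the features are conditionally independent given $y$ (the "naive" assumption), and dropping the normalizing term $P(x_{1},...,x_{n})$, which is the same for every class, the result is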

$$P(y|x_{1},...,x_{n})\propto{}P(y)\prod_{i=1}^{n}P(x_{i}|y)$$
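The proportionality above can be sketched from scratch on categorical data by estimating each factor with relative frequencies. The tiny dataset and all names below are hypothetical, and no smoothing is applied, so a feature value never seen with a class would zero out that class's score.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Hypothetical training rows: ((outlook, temperature), label)
rows = [
    (("sunny", "hot"), "no"),
    (("sunny", "mild"), "no"),
    (("rainy", "mild"), "yes"),
    (("rainy", "hot"), "yes"),
    (("sunny", "mild"), "yes"),
    (("rainy", "mild"), "no"),
]

# Counts used to estimate P(y) and P(x_i | y).
class_counts = Counter(label for _, label in rows)
feature_counts = defaultdict(Counter)  # (feature index, label) -> value counts
for features, label in rows:
    for i, value in enumerate(features):
        feature_counts[(i, label)][value] += 1

def unnormalized_posterior(features, label):
    """P(y) * prod_i P(x_i | y), estimated from relative frequencies."""
    score = Fraction(class_counts[label], len(rows))
    for i, value in enumerate(features):
        score *= Fraction(feature_counts[(i, label)][value], class_counts[label])
    return score

query = ("sunny", "mild")
scores = {label: unnormalized_posterior(query, label) for label in class_counts}

# Dividing by the sum restores the normalizing term that was dropped.
total = sum(scores.values())
posterior = {label: score / total for label, score in scores.items()}
print(posterior)
```

The predicted class is the one with the largest score; normalizing is only needed when actual probabilities are wanted.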


# Gaussian Naive Bayes

Gaussian naive Bayes models each class-conditional likelihood $P(x_{i}|y)$ as a Gaussian, so it is most useful for real-valued features.
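To make the Gaussian likelihood concrete, here is a minimal from-scratch sketch: it estimates a mean and variance per class and feature, then scores a point with log prior plus summed log densities. The data and names are hypothetical, and it omits the variance smoothing that a library implementation may apply.

```python
import math
from statistics import mean, pvariance

# Hypothetical training data: (height_cm, weight_kg) rows grouped by class.
train = {
    "short": [(150.0, 50.0), (155.0, 55.0), (160.0, 60.0)],
    "tall": [(180.0, 80.0), (185.0, 85.0), (190.0, 90.0)],
}
n_total = sum(len(rows) for rows in train.values())

def gaussian_log_pdf(x, mu, var):
    """Log density of a univariate Gaussian with mean mu and variance var."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)

# Per class: log prior plus a (mean, population variance) pair for each feature.
params = {}
for label, rows in train.items():
    columns = list(zip(*rows))
    params[label] = {
        "log_prior": math.log(len(rows) / n_total),
        "stats": [(mean(col), pvariance(col)) for col in columns],
    }

def log_score(x, label):
    """log P(y) + sum_i log P(x_i | y), up to the shared normalizing term."""
    p = params[label]
    return p["log_prior"] + sum(
        gaussian_log_pdf(xi, mu, var) for xi, (mu, var) in zip(x, p["stats"])
    )

def predict(x):
    """Return the class with the highest log score."""
    return max(params, key=lambda label: log_score(x, label))

print(predict((158.0, 57.0)))
print(predict((183.0, 84.0)))
```

Working in log space turns the product over features into a sum and avoids underflow when there are many features.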

In [2]:
def load_titanic() -> tuple:
    """
    Loads preprocessed titanic
    """
    import seaborn as sns
    import pandas as pd
    cols = ['survived', 'pclass', 'sex', 'age', 'sibsp', 'parch', 'fare', 'alone']
    df = sns.load_dataset('titanic')[cols]
    df.sex = df.sex.astype('category')
    df.alone = df.alone.astype('category')
    df.pclass = df.pclass.astype('category')
    df.parch = df.parch.astype('category')
    df.sibsp = df.sibsp.astype('category')
    df.age = df.age.fillna(df.age.median(skipna=True))
    df = pd.get_dummies(df)
    return df.drop("survived",axis=1), df.survived

In [3]:
import warnings
warnings.filterwarnings("ignore")
from sklearn.model_selection import train_test_split
X,y = load_titanic()
X_train, X_test, y_train, y_test = train_test_split(X,y, random_state=0)

X.head()

Unnamed: 0,age,fare,pclass_1,pclass_2,pclass_3,sex_female,sex_male,sibsp_0,sibsp_1,sibsp_2,...,sibsp_8,parch_0,parch_1,parch_2,parch_3,parch_4,parch_5,parch_6,alone_False,alone_True
0,22.0,7.25,0,0,1,0,1,0,1,0,...,0,1,0,0,0,0,0,0,1,0
1,38.0,71.2833,1,0,0,1,0,0,1,0,...,0,1,0,0,0,0,0,0,1,0
2,26.0,7.925,0,0,1,1,0,1,0,0,...,0,1,0,0,0,0,0,0,0,1
3,35.0,53.1,1,0,0,1,0,0,1,0,...,0,1,0,0,0,0,0,0,1,0
4,35.0,8.05,0,0,1,0,1,1,0,0,...,0,1,0,0,0,0,0,0,0,1


In [4]:
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import classification_report

nb = GaussianNB()
nb.fit(X_train, y_train)
preds = nb.predict(X_test)
print(classification_report(y_test, preds))

              precision    recall  f1-score   support

           0       0.85      0.08      0.14       139
           1       0.39      0.98      0.56        84

    accuracy                           0.42       223
   macro avg       0.62      0.53      0.35       223
weighted avg       0.67      0.42      0.30       223



# Multinomial Naive Bayes

Multinomial naive Bayes is useful for modeling count data, such as a Bag of Words representation of text.
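The idea can be sketched without any libraries: count words per class, then score a document with the log prior plus the log probability of each word. The four-document corpus and all names are hypothetical. Add-one (Laplace) smoothing with `alpha = 1.0` is used here so unseen word/class pairs do not get zero probability, and words outside the training vocabulary are skipped.

```python
import math
from collections import Counter

# Hypothetical labelled corpus.
docs = [
    ("great fun great cast", "positive"),
    ("wonderful fun film", "positive"),
    ("boring plot bad cast", "negative"),
    ("bad boring film", "negative"),
]
alpha = 1.0  # smoothing strength (assumed value for this sketch)

# Bag of words per class: how often each word occurs in documents of that class.
class_docs = Counter(label for _, label in docs)
word_counts = {label: Counter() for label in class_docs}
for text, label in docs:
    word_counts[label].update(text.split())
vocab = {word for counts in word_counts.values() for word in counts}

def log_score(text, label):
    """log P(y) + sum over words of log P(word | y), with additive smoothing."""
    counts = word_counts[label]
    total = sum(counts.values())
    score = math.log(class_docs[label] / len(docs))
    for word in text.split():
        if word in vocab:  # ignore words never seen in training
            score += math.log((counts[word] + alpha) / (total + alpha * len(vocab)))
    return score

def predict(text):
    """Return the class with the highest log score."""
    return max(class_docs, key=lambda label: log_score(text, label))

print(predict("great fun film"))
print(predict("bad boring plot"))
```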

In [5]:
import pandas as pd
data = pd.read_csv("../../data/IMDB Dataset.csv",nrows=10000)
data.head()

Unnamed: 0,review,sentiment
0,One of the other reviewers has mentioned that ...,positive
1,A wonderful little production. <br /><br />The...,positive
2,I thought this was a wonderful way to spend ti...,positive
3,Basically there's a family where a little boy ...,negative
4,"Petter Mattei's ""Love in the Time of Money"" is...",positive


In [6]:
# We would like the data to be in a document term matrix (Bag of Words)
from sklearn.feature_extraction.text import CountVectorizer

vect = CountVectorizer(stop_words='english', ngram_range=(1,2),max_features = 5000)
bagofwords = vect.fit_transform(data.review)

In [7]:
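# get_feature_names() was removed in recent scikit-learn releases; get_feature_names_out() replaces it
df = pd.DataFrame(bagofwords.toarray(), columns=vect.get_feature_names_out())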
y = data.sentiment.apply(lambda x: 1 if x=="positive" else 0)
df["target"] = y
df.head()

Unnamed: 0,000,10,10 10,10 br,10 minutes,10 years,100,11,12,13,...,young girl,young man,young people,young woman,younger,youth,zero,zombie,zombies,zone
0,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
1,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0
2,0,0,0,0,0,0,0,0,0,0,...,0,0,0,1,0,0,0,0,0,0
3,0,1,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,2,0,0
4,0,0,0,0,0,0,0,0,0,0,...,0,0,0,0,0,0,0,0,0,0


In [8]:
from sklearn.naive_bayes import MultinomialNB
X_train, X_test, y_train, y_test = train_test_split(df.drop("target",axis=1),df.target, random_state=0)

nb = MultinomialNB()
nb.fit(X_train, y_train)
preds = nb.predict(X_test)
print(classification_report(y_test, preds))

              precision    recall  f1-score   support

           0       0.84      0.84      0.84      1232
           1       0.84      0.85      0.85      1268

    accuracy                           0.84      2500
   macro avg       0.84      0.84      0.84      2500
weighted avg       0.84      0.84      0.84      2500



# Resources

- https://programmerbackpack.com/naive-bayes-classifier-explained/
- https://towardsdatascience.com/naive-bayes-explained-9d2b96f4a9c0