# Naive Bayes
<div class="alert alert-block alert-info">
<b>Content:</b> In this notebook, 
    we demonstrate the simplest version of the Naive Bayes classifier on the tennis dataset.
</div>


In [None]:
import pandas as pd
import numpy as np

from sklearn.preprocessing import OrdinalEncoder
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold, StratifiedKFold, cross_validate
from sklearn.pipeline import Pipeline

from sklearn.naive_bayes import CategoricalNB
from sklearn.metrics import accuracy_score

In [None]:
df=pd.read_csv("data/play_tennis.csv", sep=',')

In [None]:
df

In [None]:
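# The "day" column is dropped and not used as a feature.
df = df.drop("day", axis="columns")
X_df = df.drop("play", axis="columns")
y_df = df.loc[:, ["play"]]  # kept as a one-column DataFrame because OrdinalEncoder expects 2D input

X_raw = X_df.to_numpy()
y_raw = y_df.to_numpy()
X_raw, y_raw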

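We can safely use an ordinal encoder because categorical Naive Bayes treats the encoded integers purely as category identifiers: it estimates a separate probability for each category and ignores any order among the codes.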

<div class="alert alert-block alert-warning">
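<b>Warning:</b> The same is not true for many other classification algorithms. Linear models and distance-based methods, for example, interpret ordinal codes as ordered numeric values.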
</div>
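The claim that the order of the integer codes does not matter can be checked with a small experiment. The following minimal sketch fits `CategoricalNB` twice on a tiny made-up feature (not the tennis data): once with the original codes and once with the codes relabelled. The names `toy_X`, `toy_y` and `code_permutation` exist only for this illustration.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Made-up toy data: one feature with three category codes (0, 1, 2).
toy_X = np.array([[0], [0], [1], [1], [2], [2], [2], [0]])
toy_y = np.array([0, 0, 1, 1, 1, 0, 1, 1])

# Relabel the categories with a different integer order: 0 -> 2, 1 -> 0, 2 -> 1.
code_permutation = np.array([2, 0, 1])
toy_X_permuted = code_permutation[toy_X]

proba_original = CategoricalNB().fit(toy_X, toy_y).predict_proba(toy_X)
proba_permuted = CategoricalNB().fit(toy_X_permuted, toy_y).predict_proba(toy_X_permuted)

# The predicted class probabilities agree regardless of which integer names which category.
same_probabilities = np.allclose(proba_original, proba_permuted)
print(same_probabilities)
```

If the two probability arrays agree, the model has used the codes only as labels for the categories.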


In [None]:
# Encode the class labels as integers and flatten the single column into a 1D vector.
target_enc = OrdinalEncoder()
y = target_enc.fit_transform(y_df)[:, -1]
y

In [None]:
# The encoder is part of the pipeline, so it is refit on the training part of every fold.
clf = Pipeline([('encoder', OrdinalEncoder()), ('classifier', CategoricalNB())])

# 3-fold stratified cross-validation, repeated 10 times with different shuffles.
outer_cv = RepeatedStratifiedKFold(n_splits=3, n_repeats=10, random_state=1)
cv_result = cross_validate(clf, X=X_raw, y=y, cv=outer_cv, scoring="balanced_accuracy", n_jobs=8)
print(f"The mean balanced acc is {cv_result['test_score'].mean():.2f} with std {cv_result['test_score'].std():.2f}.")
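With a dataset this small, a training fold may happen to contain no example of some category that later appears in the test fold. One possible way to guard against this, shown here as a sketch and not as part of the original pipeline, is to give `OrdinalEncoder` the full category lists up front and to tell `CategoricalNB` how many categories to expect via `min_categories`. The data and the names `toy_categories` and `robust_clf` are made up for this illustration.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OrdinalEncoder

# Made-up data for illustration only (not the tennis dataset).
toy_X = np.array([
    ["Sunny", "Weak"],
    ["Sunny", "Strong"],
    ["Rain", "Weak"],
    ["Rain", "Strong"],
    ["Overcast", "Weak"],
    ["Overcast", "Strong"],
])
toy_y = np.array([0, 0, 1, 0, 1, 1])

# Collect the full category list of every feature once, before any splitting.
toy_categories = [np.unique(column) for column in toy_X.T]

robust_clf = Pipeline([
    ("encoder", OrdinalEncoder(categories=toy_categories)),
    ("classifier", CategoricalNB(min_categories=[len(c) for c in toy_categories])),
])

# The first four rows play the role of a training fold that contains no "Overcast".
robust_clf.fit(toy_X[:4], toy_y[:4])

# Predicting on the unseen category works; its probability comes from smoothing alone.
toy_proba = robust_clf.predict_proba(toy_X[4:])
print(toy_proba)
```

The trade-off is that the category lists are derived from the whole dataset, which is harmless for known, fixed vocabularies such as weather conditions but should be a deliberate choice.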

In [None]:
clf.fit(X_raw, y)

In [None]:
clf['classifier'].category_count_  # one array per feature, each of shape (n_classes, n_categories)

For each feature, we see how often each category occurs within each class. Naive Bayes estimates its per-class conditional probabilities from these counts.
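To make the link between counts and probabilities concrete, the sketch below recomputes the smoothed conditional probabilities by hand and compares them with the fitted attribute `feature_log_prob_`. It uses a made-up single-feature dataset and the default smoothing `alpha=1.0`; the `toy_` names are only for this illustration.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Made-up toy data: one feature with three categories, two classes.
toy_X = np.array([[0], [0], [1], [1], [2], [2], [2], [0]])
toy_y = np.array([0, 0, 1, 1, 1, 0, 1, 1])

toy_nb = CategoricalNB(alpha=1.0).fit(toy_X, toy_y)

# Counts for the first (and only) feature: rows are classes, columns are categories.
toy_counts = toy_nb.category_count_[0]
toy_n_categories = toy_counts.shape[1]

# Laplace smoothing: (count + alpha) / (class total + alpha * number of categories).
toy_manual_prob = (toy_counts + toy_nb.alpha) / (
    toy_counts.sum(axis=1, keepdims=True) + toy_nb.alpha * toy_n_categories
)

# The classifier stores the same quantities as log probabilities.
toy_fitted_prob = np.exp(toy_nb.feature_log_prob_[0])
print(np.allclose(toy_manual_prob, toy_fitted_prob))
```

Because of the smoothing term, a category that never occurs together with a class still receives a small non-zero probability.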

In [None]:
new_instances=np.array([
    ["Sunny", "Hot", "High", "Strong"],
    ["Overcast", "Hot", "High", "Strong"]
])

In [None]:
pred=clf.predict(new_instances)
pred

In [None]:
target_enc.inverse_transform(pred.reshape(-1,1))

In [None]:
probas = clf.predict_proba(new_instances)
probas
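The probabilities returned by `predict_proba` follow from Bayes' rule under the naive independence assumption: the class prior is multiplied by one conditional probability per feature, and the result is normalised over the classes. The sketch below reproduces this by hand on made-up, already encoded data; the `toy_` names are only for this illustration.

```python
import numpy as np
from sklearn.naive_bayes import CategoricalNB

# Made-up, already ordinal-encoded data: two features, two classes.
toy_X = np.array([[0, 1], [0, 0], [1, 1], [2, 0], [2, 1], [1, 0]])
toy_y = np.array([0, 0, 1, 1, 1, 0])
toy_nb = CategoricalNB().fit(toy_X, toy_y)

toy_query = np.array([[2, 1]])

# Start from the log prior of each class ...
toy_log_joint = toy_nb.class_log_prior_.copy()
# ... and add the log conditional probability of every observed feature value.
for feature_idx, category in enumerate(toy_query[0]):
    toy_log_joint += toy_nb.feature_log_prob_[feature_idx][:, category]

# Normalise in log space so that the class probabilities sum to one.
toy_manual_proba = np.exp(toy_log_joint - np.logaddexp.reduce(toy_log_joint))

print(toy_manual_proba)
print(toy_nb.predict_proba(toy_query)[0])
```

Both printed vectors should agree, which shows that the classifier needs nothing beyond the class priors and the per-feature tables.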

<div class="alert alert-block alert-info">
<b>Takeaways:</b> 

* Run a Naive Bayes classifier on categorical data.
* Interpret the results and the predicted probabilities.
</div>

<div class="alert alert-block alert-success">
<b>Play with:</b> 
    
* Create further weather situations and classify them.
</div>