# Naive Bayes practice:

- In today's practice we will use the dinner_data.csv dataset to predict whether a person "Cooks" or "Orders" dinner, based on the other features in the dataset.

## Features (Predictors):
- Weather: Describes the weather conditions (Clear, Cloudy, Rainy, Snowy).
- Time: Describes the time of day (Evening, Night, Midday, Midnight).
- Day of the week: Indicates whether it's a Weekend or Weekday.

## Target (Response Variable):
- Dinner: The action taken regarding dinner, either "Cooks" or "Orders".


The task is a classification problem where we use the weather, time, and day of the week to predict the target variable, "Dinner", which has two classes: "Cooks" or "Orders". The Naive Bayes classifier is used to model the relationship between these features and the target, allowing it to make predictions on new, unseen data.

In [None]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import LabelEncoder
from sklearn.metrics import accuracy_score, classification_report




In [None]:
# Load the dataset and check the share of rows where Dinner is "Orders"
df = pd.read_csv("dinner_data.csv")
p_orders = len(df[df['Dinner'] == 'Orders']) / len(df)
print(p_orders)


0.5029469548133595


To do:
- Compute the prior probabilities of the target classes "Orders" and "Cooks" in the dataset.

Note:
- Prior probabilities are the probabilities of each class occurring in the dataset without considering any other features. They represent the overall distribution of the target classes in the data.

In [12]:
# Calculate the prior probabilities P(Orders) and P(Cooks)
p_orders = len(df[df['Dinner'] == 'Orders']) / len(df)
p_cooks = len(df[df['Dinner'] == 'Cooks']) / len(df)

print("Prior probability of orders is: ", p_orders)
print("Prior probability of cooks is: ", p_cooks)

Prior probability of orders is:  0.5029469548133595
Prior probability of cooks is:  0.49705304518664045
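
The same priors can also be read off in one call. A minimal sketch, assuming a small made-up table in place of dinner_data.csv (the rows below are hypothetical, not taken from the dataset):

```python
import pandas as pd

# Hypothetical toy data standing in for dinner_data.csv (not the real dataset)
toy = pd.DataFrame({"Dinner": ["Orders", "Cooks", "Orders", "Orders", "Cooks"]})

# value_counts(normalize=True) returns the relative frequency of each class,
# which is exactly the prior probability of that class
priors = toy["Dinner"].value_counts(normalize=True)

print("Prior probability of orders is: ", priors["Orders"])
print("Prior probability of cooks is: ", priors["Cooks"])
```

Because the frequencies cover every class, the priors always sum to 1, which is a quick sanity check on the real data as well.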


To do:
- Compute the likelihood of observing "Snowy" weather given that "Dinner" is "Orders":

P(Snowy∣Orders) = (Number of instances where Weather is Snowy and Dinner is Orders) / (Number of instances where Dinner is Orders)

- Compute the likelihood of the time being "Midnight" given that "Dinner" is "Orders":

P(Midnight∣Orders) = (Number of instances where Time is Midnight and Dinner is Orders) / (Number of instances where Dinner is Orders)

- Compute the likelihood of it being a "Weekday" given that "Dinner" is "Orders":

P(Weekday∣Orders) = (Number of instances where Day of the week is Weekday and Dinner is Orders) / (Number of instances where Dinner is Orders)







In [13]:
# Calculate the likelihoods P(Snowy|Orders), P(Midnight|Orders), P(Weekday|Orders)
p_snowy_given_orders = len(df[(df['Weather'] == 'Snowy') & (df['Dinner'] == 'Orders')]) / len(df[df['Dinner'] == 'Orders'])
p_midnight_given_orders = len(df[(df['Time'] == 'Midnight') & (df['Dinner'] == 'Orders')]) / len(df[df['Dinner'] == 'Orders'])
p_weekday_given_orders = len(df[(df['Day of the week'] == 'Weekday') & (df['Dinner'] == 'Orders')]) / len(df[df['Dinner'] == 'Orders'])
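
As a cross-check on the filter-and-count approach, all likelihoods for one feature can be tabulated at once. This is a sketch under the assumption of a made-up toy table (hypothetical rows, not the real dataset), using `pd.crosstab` with row normalization:

```python
import pandas as pd

# Hypothetical toy data (not the real dinner_data.csv)
toy = pd.DataFrame({
    "Weather": ["Snowy", "Clear", "Snowy", "Rainy", "Snowy", "Clear"],
    "Dinner":  ["Orders", "Orders", "Orders", "Cooks", "Cooks", "Cooks"],
})

# normalize="index" makes each row (one Dinner class) sum to 1,
# so each cell holds P(Weather value | Dinner class)
likelihoods = pd.crosstab(toy["Dinner"], toy["Weather"], normalize="index")
print(likelihoods)

# Look up a single likelihood, here P(Snowy | Orders) on the toy data
p_snowy_given_orders_toy = likelihoods.loc["Orders", "Snowy"]
```

Each row of the table is a conditional distribution, so a row that does not sum to 1 signals a mistake in the counting.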


To do:
- Compute the likelihood of "Snowy" weather given "Cooks".
- Compute the likelihood of "Midnight" given "Cooks".
- Compute the likelihood of "Weekday" given "Cooks".

In [14]:
# Calculate the likelihoods P(Snowy|Cooks), P(Midnight|Cooks), P(Weekday|Cooks)
p_snowy_given_cooks = len(df[(df['Weather'] == 'Snowy') & (df['Dinner'] == 'Cooks')]) / len(df[df['Dinner'] == 'Cooks'])
p_midnight_given_cooks = len(df[(df['Time'] == 'Midnight') & (df['Dinner'] == 'Cooks')]) / len(df[df['Dinner'] == 'Cooks'])
p_weekday_given_cooks = len(df[(df['Day of the week'] == 'Weekday') & (df['Dinner'] == 'Cooks')]) / len(df[df['Dinner'] == 'Cooks'])
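
One caveat with raw frequency estimates like these: if a feature value never occurs together with a class, its likelihood is exactly 0 and the whole product for that class becomes 0. A common remedy is additive (Laplace) smoothing, which is also what `CategoricalNB` applies through its `alpha` parameter. A minimal sketch, assuming made-up toy rows and a hypothetical helper `smoothed_likelihood` (neither is part of this notebook):

```python
import pandas as pd

# Hypothetical toy data in which "Snowy" never occurs together with "Cooks"
toy = pd.DataFrame({
    "Weather": ["Snowy", "Clear", "Snowy", "Rainy", "Clear", "Clear"],
    "Dinner":  ["Orders", "Orders", "Orders", "Cooks", "Cooks", "Cooks"],
})

def smoothed_likelihood(data, feature, value, target, label, alpha=1.0):
    """Hypothetical helper: P(feature=value | target=label) with additive smoothing."""
    in_class = data[data[target] == label]
    count = (in_class[feature] == value).sum()
    n_categories = data[feature].nunique()
    # Add alpha to the count and alpha * (number of categories) to the class size
    return (count + alpha) / (len(in_class) + alpha * n_categories)

# alpha=0 reproduces the raw frequency estimate; alpha=1 is Laplace smoothing
unsmoothed = smoothed_likelihood(toy, "Weather", "Snowy", "Dinner", "Cooks", alpha=0.0)
smoothed = smoothed_likelihood(toy, "Weather", "Snowy", "Dinner", "Cooks", alpha=1.0)
print(unsmoothed, smoothed)
```

With smoothing, an unseen combination gets a small nonzero likelihood instead of vetoing the class outright.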




To do:
- Compute the unnormalized posterior probabilities for both "Orders" and "Cooks" using Bayes' Theorem. The denominator (the evidence) is the same for both classes, so it can be dropped when we only compare them.

- Compute the posterior probability of "Orders" given "Snowy", "Midnight", and "Weekday":

P(Orders∣Snowy, Midnight, Weekday) ∝ P(Orders) × P(Snowy∣Orders) × P(Midnight∣Orders) × P(Weekday∣Orders)

- Compute the posterior probability of "Cooks" given "Snowy", "Midnight", and "Weekday":

P(Cooks∣Snowy, Midnight, Weekday) ∝ P(Cooks) × P(Snowy∣Cooks) × P(Midnight∣Cooks) × P(Weekday∣Cooks)

- Predict the class with the higher posterior probability, so the decision is either "Orders" or "Cooks".

In [16]:
# Calculate the posterior probability of Ordering and Cooking given the features Snowy, Midnight, Weekday
# Since P(Snowy, Midnight, Weekday) is constant for both, we can ignore it for comparing the relative probabilities
posterior_orders = p_orders * p_snowy_given_orders * p_midnight_given_orders * p_weekday_given_orders
posterior_cooks = p_cooks * p_snowy_given_cooks * p_midnight_given_cooks * p_weekday_given_cooks

# Decision based on the highest posterior probability
decision = 'Orders' if posterior_orders > posterior_cooks else 'Cooks'
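
The two products are only proportional to the posteriors, so they do not sum to 1. If actual probabilities are wanted, dividing each score by their sum restores the dropped evidence term. A minimal sketch, assuming made-up scores (the numbers and variable names below are hypothetical, not results from the dataset):

```python
# Hypothetical unnormalized scores (made-up numbers, not results from the dataset)
score_orders = 0.012
score_cooks = 0.004

# Under the naive Bayes model the evidence is the sum of the scores over all classes
evidence = score_orders + score_cooks

prob_orders = score_orders / evidence
prob_cooks = score_cooks / evidence

print("P(Orders | features) =", prob_orders)
print("P(Cooks | features)  =", prob_cooks)
```

Normalizing never changes which class wins; it only makes the margin between the classes easier to interpret.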



- Encode the categorical variables in the dataset as integers, since CategoricalNB expects each feature to be coded as integer categories.

In [17]:
# Encode categorical variables into numerical values
label_encoders = {}

for column in df.columns:
    label_encoders[column] = LabelEncoder()
    df[column] = label_encoders[column].fit_transform(df[column])
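
Once the columns are integer-coded, the model can be fitted. The following is a sketch under stated assumptions, not the notebook's own continuation: it uses a made-up toy table instead of dinner_data.csv, and it swaps in `OrdinalEncoder` for the feature columns (with `LabelEncoder` kept for the target), which is one reasonable alternative to the per-column loop above.

```python
import pandas as pd
from sklearn.naive_bayes import CategoricalNB
from sklearn.preprocessing import LabelEncoder, OrdinalEncoder

# Hypothetical toy data (not the real dinner_data.csv)
toy = pd.DataFrame({
    "Weather": ["Snowy", "Clear", "Snowy", "Rainy", "Clear", "Clear", "Snowy", "Rainy"],
    "Time": ["Midnight", "Evening", "Night", "Evening", "Midday", "Evening", "Midnight", "Midday"],
    "Day of the week": ["Weekday", "Weekend", "Weekday", "Weekend", "Weekday", "Weekend", "Weekday", "Weekday"],
    "Dinner": ["Orders", "Cooks", "Orders", "Cooks", "Cooks", "Cooks", "Orders", "Cooks"],
})
feature_columns = ["Weather", "Time", "Day of the week"]

# Encode the features as integer categories and the target as integer labels
feature_encoder = OrdinalEncoder()
X = feature_encoder.fit_transform(toy[feature_columns]).astype(int)
target_encoder = LabelEncoder()
y = target_encoder.fit_transform(toy["Dinner"])

# alpha=1.0 applies Laplace smoothing to the per-class category counts
model = CategoricalNB(alpha=1.0)
model.fit(X, y)

# Classify the same scenario as the manual calculation: Snowy, Midnight, Weekday
new_row = pd.DataFrame([["Snowy", "Midnight", "Weekday"]], columns=feature_columns)
encoded = feature_encoder.transform(new_row).astype(int)
prediction = target_encoder.inverse_transform(model.predict(encoded))[0]
print(prediction)
```

On the real data, the encoded rows would normally be split with `train_test_split` first and the fitted model scored with `accuracy_score`, which is what the imports at the top of the notebook suggest.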
