# Weather Conditions for Playing Outdoors

# Suppose we have a dataset of weather conditions and a corresponding target variable, "Play". Using this dataset, we want to decide whether to play on a particular day given that day's weather conditions. To solve this problem, we follow these steps:



# Convert the given dataset into frequency tables.
# Generate a likelihood table for each feature by finding the probability of each feature value given the class.
# Use Bayes' theorem to calculate the posterior probability.
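# As a sketch of what the last step computes, the posterior can be written out as below. The symbols o, t, h, w (the day's outlook, temperature, humidity and wind) and c (the class, Yes or No) are notation introduced here for illustration. The product form assumes the features are conditionally independent given the class, which is the assumption the per-feature likelihood tables rely on.

```latex
P(\text{Play}=c \mid o, t, h, w)
  = \frac{P(o \mid c)\, P(t \mid c)\, P(h \mid c)\, P(w \mid c)\, P(c)}
         {\sum_{c' \in \{\text{Yes},\, \text{No}\}} P(o \mid c')\, P(t \mid c')\, P(h \mid c')\, P(w \mid c')\, P(c')}
```

# The numerator is the class prior times the four likelihoods; the denominator sums that same quantity over both classes so that the two posteriors add up to 1.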

# Import Libraries

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns 
import warnings
warnings.filterwarnings('ignore')

# Dataset Loading

In [5]:
file_path = r'C:\Users\shubham\Desktop\Tennis.csv'
data = pd.read_csv(file_path)
data.head()

Unnamed: 0,Outlook,Temperature,Humidity,Wind,Play Tennis
0,Sunny,Hot,High,Weak,No
1,Sunny,Hot,High,Strong,No
2,Overcast,Hot,High,Weak,Yes
3,Rain,Mild,High,Weak,Yes
4,Rain,Cool,Normal,Weak,Yes


# The first step is to convert the dataset into frequency tables. We create one table per feature, plus one for the target; each holds the relative frequency (the share of rows) of every category.

In [12]:
# Frequency table for outlook: share of rows with each Outlook value
outlook_freq = data['Outlook'].value_counts(normalize=True)

# Frequency table for temperature
temp_freq = data['Temperature'].value_counts(normalize=True)

# Frequency table for humidity
humid_freq = data['Humidity'].value_counts(normalize=True)

# Frequency table for wind
wind_freq = data['Wind'].value_counts(normalize=True)

# Frequency table for play: the class prior P(Play Tennis)
play_freq = data['Play Tennis'].value_counts(normalize=True)
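# To make the frequency tables concrete, here is a minimal sketch that uses only the five rows printed by data.head() above (the full CSV is not reproduced here, so the numbers will differ from those of the complete dataset). The names sample, sample_outlook_freq and sample_play_freq are introduced for illustration only.

```python
import pandas as pd

# Only the five rows shown by data.head(); not the full Tennis.csv file.
sample = pd.DataFrame({
    'Outlook': ['Sunny', 'Sunny', 'Overcast', 'Rain', 'Rain'],
    'Temperature': ['Hot', 'Hot', 'Hot', 'Mild', 'Cool'],
    'Humidity': ['High', 'High', 'High', 'High', 'Normal'],
    'Wind': ['Weak', 'Strong', 'Weak', 'Weak', 'Weak'],
    'Play Tennis': ['No', 'No', 'Yes', 'Yes', 'Yes'],
})

# value_counts(normalize=True) returns the share of rows in each category,
# so the values of each table sum to 1.
sample_outlook_freq = sample['Outlook'].value_counts(normalize=True)
sample_play_freq = sample['Play Tennis'].value_counts(normalize=True)

print(sample_outlook_freq)
print(sample_play_freq)
```

# In these five rows, Sunny and Rain each account for 2 of 5 rows (0.4), Overcast for 1 of 5 (0.2), and the target is Yes in 3 of 5 rows (0.6).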


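# Next, we generate a likelihood table for each feature: the probability of each feature value given the class, P(feature value | Play Tennis). Each class column of a likelihood table therefore sums to 1.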

In [14]:
# Likelihood table for outlook given play: P(Outlook | Play Tennis)
outlook_play_lh = pd.crosstab(index=data['Outlook'], columns=data['Play Tennis'], normalize='columns')
# Likelihood table for temperature given play: P(Temperature | Play Tennis)
temp_play_lh = pd.crosstab(index=data['Temperature'], columns=data['Play Tennis'], normalize='columns')
# Likelihood table for humidity given play: P(Humidity | Play Tennis)
humid_play_lh = pd.crosstab(index=data['Humidity'], columns=data['Play Tennis'], normalize='columns')
# Likelihood table for wind given play: P(Wind | Play Tennis)
wind_play_lh = pd.crosstab(index=data['Wind'], columns=data['Play Tennis'], normalize='columns')
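# The difference between a likelihood P(feature | class) and a joint probability P(feature, class) is easy to miss, so here is a small sketch of both for Outlook. It again uses only the five rows printed by data.head(), and the names sample, sample_outlook_lh and sample_outlook_joint are illustrative.

```python
import pandas as pd

# Only the five rows shown by data.head(); not the full Tennis.csv file.
sample = pd.DataFrame({
    'Outlook': ['Sunny', 'Sunny', 'Overcast', 'Rain', 'Rain'],
    'Play Tennis': ['No', 'No', 'Yes', 'Yes', 'Yes'],
})

# normalize='columns' divides each count by its column (class) total,
# giving the likelihood P(Outlook = value | Play Tennis = class).
sample_outlook_lh = pd.crosstab(sample['Outlook'], sample['Play Tennis'],
                                normalize='columns')

# normalize='all' divides each count by the total number of rows,
# giving the joint probability P(Outlook = value, Play Tennis = class).
sample_outlook_joint = pd.crosstab(sample['Outlook'], sample['Play Tennis'],
                                   normalize='all')

print(sample_outlook_lh)
print(sample_outlook_joint)
```

# For Rain and Yes, the likelihood is 2/3 (two of the three Yes rows are rainy) while the joint probability is 2/5 (two of all five rows). Bayes' theorem multiplies the likelihood by the class prior, so feeding it the joint value would count the prior twice.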


# Finally, we can use Bayes' theorem to calculate the posterior probability of playing on a particular day given the weather conditions.

In [16]:
# Posterior probability of playing on a particular day given the weather conditions.

In [17]:
def predict(outlook, temp, humidity, wind):
    # Unnormalized score per class: product of the four likelihood-table entries and the class prior
    p_yes = (outlook_play_lh.loc[outlook, 'Yes'] * temp_play_lh.loc[temp, 'Yes']
             * humid_play_lh.loc[humidity, 'Yes'] * wind_play_lh.loc[wind, 'Yes'] * play_freq['Yes'])
    p_no = (outlook_play_lh.loc[outlook, 'No'] * temp_play_lh.loc[temp, 'No']
            * humid_play_lh.loc[humidity, 'No'] * wind_play_lh.loc[wind, 'No'] * play_freq['No'])

    # Dividing by the sum of both scores makes the two posteriors add up to 1
    total_probability = p_yes + p_no

    return p_yes / total_probability, p_no / total_probability
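# As a usage sketch, the block below rebuilds the prior and the likelihood tables from the five rows printed by data.head() and scores one day. It is self-contained, so it does not touch the tables built from the full file, and its result reflects only those five rows. The helper predict_sample and the names sample, prior and likelihood are hypothetical and introduced here for illustration.

```python
import pandas as pd

# Only the five rows shown by data.head(); not the full Tennis.csv file.
sample = pd.DataFrame({
    'Outlook': ['Sunny', 'Sunny', 'Overcast', 'Rain', 'Rain'],
    'Temperature': ['Hot', 'Hot', 'Hot', 'Mild', 'Cool'],
    'Humidity': ['High', 'High', 'High', 'High', 'Normal'],
    'Wind': ['Weak', 'Strong', 'Weak', 'Weak', 'Weak'],
    'Play Tennis': ['No', 'No', 'Yes', 'Yes', 'Yes'],
})
features = ['Outlook', 'Temperature', 'Humidity', 'Wind']

# Class prior P(Play Tennis) and one likelihood table P(feature | Play Tennis) per feature.
prior = sample['Play Tennis'].value_counts(normalize=True)
likelihood = {
    f: pd.crosstab(sample[f], sample['Play Tennis'], normalize='columns')
    for f in features
}

def predict_sample(**conditions):
    """Return (P(Yes | conditions), P(No | conditions)) from the sample tables."""
    scores = {}
    for label in ['Yes', 'No']:
        score = prior[label]
        for feature, value in conditions.items():
            score *= likelihood[feature].loc[value, label]
        scores[label] = score
    total = scores['Yes'] + scores['No']
    return scores['Yes'] / total, scores['No'] / total

p_yes, p_no = predict_sample(Outlook='Rain', Temperature='Mild',
                             Humidity='High', Wind='Weak')
print(p_yes, p_no)
```

# With so few rows, the posterior here is 1 for Yes and 0 for No: none of the No rows is rainy, so P(Rain | No) is 0 and the whole product for No vanishes. A combination whose product is 0 under both classes makes the total 0, and the final division is then undefined.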
