# Weather conditions for playing outdoors

# Problem statement

Suppose we have a dataset of weather conditions with a corresponding target variable, "Play". Using this dataset, we need to decide whether or not to play on a particular day, given that day's weather. To solve this problem, we follow these steps:



1. Convert the given dataset into frequency tables.
2. Generate a likelihood table by computing the conditional probability of each weather condition given the outcome.
3. Use Bayes' theorem to calculate the posterior probability.
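
Step 3 can be written out explicitly. As a notational sketch, take $A$ to be the event "Play = Yes" and $B$ an observed weather condition such as "Weather = Sunny" (this assignment of symbols is an assumption made here for illustration):

$$
P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)}
$$

Here $P(B \mid A)$ is the likelihood, $P(A)$ is the prior, $P(B)$ is the evidence, and $P(A \mid B)$ is the posterior.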

In [1]:
import pandas as pd
data = pd.DataFrame({
    'Weather': ['Rainy', 'Sunny', 'Overcast', 'Overcast', 'Sunny', 'Rainy', 'Sunny',
                'Overcast', 'Rainy', 'Sunny', 'Sunny', 'Rainy', 'Overcast', 'Overcast'],
    'Play': ['Yes', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes',
             'Yes', 'No', 'No', 'Yes', 'No', 'Yes', 'Yes'],
})
data

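|    | Weather  | Play |
|----|----------|------|
| 0  | Rainy    | Yes  |
| 1  | Sunny    | Yes  |
| 2  | Overcast | Yes  |
| 3  | Overcast | Yes  |
| 4  | Sunny    | No   |
| 5  | Rainy    | Yes  |
| 6  | Sunny    | Yes  |
| 7  | Overcast | Yes  |
| 8  | Rainy    | No   |
| 9  | Sunny    | No   |
| 10 | Sunny    | Yes  |
| 11 | Rainy    | No   |
| 12 | Overcast | Yes  |
| 13 | Overcast | Yes  |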


# Frequency Tables

In [2]:
weather_freq = data['Weather'].value_counts()
play_freq = data['Play'].value_counts()

In [3]:
print(weather_freq,'\n\n',play_freq)

Sunny       5
Overcast    5
Rainy       4
Name: Weather, dtype: int64 

 Yes    10
No      4
Name: Play, dtype: int64
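
The two tables above count each column on its own. Since Bayes' theorem also needs to know how often each weather condition occurs together with each outcome, a joint frequency table may be a helpful companion. A minimal sketch using `pd.crosstab` (the name `joint_freq` is introduced here for illustration and is not part of the original notebook):

```python
import pandas as pd

# Same dataset as in the notebook, repeated so this block runs on its own.
data = pd.DataFrame({
    'Weather': ['Rainy', 'Sunny', 'Overcast', 'Overcast', 'Sunny', 'Rainy', 'Sunny',
                'Overcast', 'Rainy', 'Sunny', 'Sunny', 'Rainy', 'Overcast', 'Overcast'],
    'Play': ['Yes', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes',
             'Yes', 'No', 'No', 'Yes', 'No', 'Yes', 'Yes'],
})

# Rows are weather conditions, columns are Play outcomes, cells are raw counts.
# margins=True appends an "All" row and column holding the totals.
joint_freq = pd.crosstab(data['Weather'], data['Play'], margins=True)
print(joint_freq)
```

The "All" row and column of this table reproduce the per-column counts, so it can double as a sanity check on them.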


# Likelihood table | conditional probability

In [4]:
# normalize='columns' makes each Play column sum to 1, giving P(Weather | Play).
# (normalize='index' would instead give P(Play | Weather), which is already the posterior.)
likelihood_table = pd.crosstab(data['Weather'], data['Play'], normalize='columns')
likelihood_table

| Weather  | No  | Yes |
|----------|-----|-----|
| Overcast | 0.0 | 0.5 |
| Rainy    | 0.5 | 0.2 |
| Sunny    | 0.5 | 0.3 |


# Prior probability of playing outside

In [5]:
prior_play_yes = play_freq['Yes'] / len(data)
prior_play_no = play_freq['No'] / len(data)

# P(A)

In [6]:
prior_play_yes

0.7142857142857143

# Evidence | P(B)

In [7]:
# P(Weather): the marginal probability of each weather condition
evidence = weather_freq / len(data)

# Posterior probability

In [8]:
# Bayes' theorem: P(Yes | Weather) = P(Weather | Yes) * P(Yes) / P(Weather)
post_prob_yes_sunny = (likelihood_table.loc['Sunny', 'Yes'] * prior_play_yes) / evidence['Sunny']
post_prob_yes_overcast = (likelihood_table.loc['Overcast', 'Yes'] * prior_play_yes) / evidence['Overcast']
post_prob_yes_rainy = (likelihood_table.loc['Rainy', 'Yes'] * prior_play_yes) / evidence['Rainy']

# Result

In [9]:
# Round to suppress floating-point noise in the printed values
print("Posterior probabilities:")
print("P(Play = Yes | Weather = Sunny):", round(post_prob_yes_sunny, 4))
print("P(Play = Yes | Weather = Overcast):", round(post_prob_yes_overcast, 4))
print("P(Play = Yes | Weather = Rainy):", round(post_prob_yes_rainy, 4))

Posterior probabilities:
P(Play = Yes | Weather = Sunny): 0.6
P(Play = Yes | Weather = Overcast): 1.0
P(Play = Yes | Weather = Rainy): 0.5
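
With a single feature, a posterior obtained through Bayes' theorem should agree with the direct relative frequency of each outcome within a weather condition, which makes for a simple cross-check. The sketch below is one way to do this from the raw counts. The helper `posterior_play` and the variable `direct` are hypothetical names introduced here, not part of the original notebook.

```python
import numpy as np
import pandas as pd

# Same dataset as in the notebook, repeated so this block runs on its own.
data = pd.DataFrame({
    'Weather': ['Rainy', 'Sunny', 'Overcast', 'Overcast', 'Sunny', 'Rainy', 'Sunny',
                'Overcast', 'Rainy', 'Sunny', 'Sunny', 'Rainy', 'Overcast', 'Overcast'],
    'Play': ['Yes', 'Yes', 'Yes', 'Yes', 'No', 'Yes', 'Yes',
             'Yes', 'No', 'No', 'Yes', 'No', 'Yes', 'Yes'],
})

def posterior_play(data, weather, play='Yes'):
    """Hypothetical helper: P(Play = play | Weather = weather) via Bayes' theorem."""
    n = len(data)
    prior = (data['Play'] == play).sum() / n               # P(Play)
    evidence = (data['Weather'] == weather).sum() / n      # P(Weather)
    given_play = data[data['Play'] == play]
    likelihood = (given_play['Weather'] == weather).mean() # P(Weather | Play)
    return likelihood * prior / evidence

# Direct estimate: share of each Play outcome within each weather condition.
direct = pd.crosstab(data['Weather'], data['Play'], normalize='index')

for weather in direct.index:
    bayes = posterior_play(data, weather)
    # The two routes should give the same number up to floating-point error.
    assert np.isclose(bayes, direct.loc[weather, 'Yes'])
    print(f"P(Play = Yes | Weather = {weather}) = {bayes:.4f}")
```

If the assertion fails for any weather condition, the likelihood, prior, or evidence was most likely computed over the wrong axis.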
