# Introduction To Probability
## Challenge 1

$A$ and $B$ are events of a probability space $(\Omega, \Sigma, P)$ such that $P(A) = 0.3$, $P(B) = 0.6$ and $P(A \cap B) = 0.1$.

Which of the following statements are false?
* $P(A \cup B) = 0.6$
* $P(A \cap B^{C}) = 0.2$
* $P(A \cap (B \cup B^{C})) = 0.4$
* $P(A^{C} \cap B^{C}) = 0.3$
* $P((A \cap B)^{C}) = 0.9$

In [None]:
"""

- Statement 1: FALSE
    By the addition rule:
    P(A U B) = P(A) + P(B) - P(A ∩ B) = 0.3 + 0.6 - 0.1 = 0.8, not 0.6

- Statement 2: TRUE
    P(A ∩ B^C) = P(A) - P(A ∩ B) = 0.3 - 0.1 = 0.2

- Statement 3: FALSE
    B U B^C is the whole sample space, so A ∩ (B U B^C) = A.
    The probability is therefore P(A) = 0.3, not 0.4

- Statement 4: FALSE
    By De Morgan's law, the intersection of not A and not B is the complement of the union
    of A and B (i.e. everything outside A and B).
    Thus, we can calculate it as 1 - P(A U B) = 1 - 0.8 = 0.2, not 0.3

- Statement 5: TRUE
    This includes everything except for the intersection between A and B. 
    We calculate it as 1 - 0.1 = 0.9
    

"""
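The five verdicts above can be double-checked numerically. The snippet below is a minimal sketch, not part of the original exercise: it recomputes each quantity from the three given probabilities and compares it with the claimed value. The variable names (`computed`, `claimed`, `verdicts`) are illustrative, and exact fractions are used only to avoid floating-point rounding in the comparisons.

```python
from fractions import Fraction

# Given values from the exercise, kept as exact fractions.
p_a = Fraction(3, 10)
p_b = Fraction(6, 10)
p_a_and_b = Fraction(1, 10)

# Addition rule for the union.
p_union = p_a + p_b - p_a_and_b

# Each statement recomputed from the identities used in the answer.
computed = {
    "P(A U B)": p_union,               # addition rule
    "P(A n B^C)": p_a - p_a_and_b,     # part of A outside B
    "P(A n (B U B^C))": p_a,           # B U B^C is the whole space
    "P(A^C n B^C)": 1 - p_union,       # De Morgan's law
    "P((A n B)^C)": 1 - p_a_and_b,     # complement rule
}

# Values claimed in the exercise statements.
claimed = {
    "P(A U B)": Fraction(6, 10),
    "P(A n B^C)": Fraction(2, 10),
    "P(A n (B U B^C))": Fraction(4, 10),
    "P(A^C n B^C)": Fraction(3, 10),
    "P((A n B)^C)": Fraction(9, 10),
}

# A statement is true when the claimed value matches the computed one.
verdicts = {name: computed[name] == claimed[name] for name in computed}

for name, is_true in verdicts.items():
    label = "TRUE" if is_true else "FALSE"
    print(f"{name} = {float(computed[name]):.1f} -> statement is {label}")
```

Statements 1, 3 and 4 come out false, which agrees with the reasoning above.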

## Challenge 2
There is a box with 10 white balls, 12 red balls and 8 black balls. Calculate the probability of:
* Taking a white ball out.
* Taking a white ball out after taking a black ball out.
* Taking a red ball out after taking a black and a red ball out.
* Taking a red ball out after taking a black and a red ball out with reposition.

**Hint**: Reposition means putting back the ball into the box after taking it out.

In [None]:
"""
There is a total of: 10 + 12 + 8 = 30 balls in the box. 

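Every ball is equally likely to be taken out --> Laplace's Rule applies:
    P(event) = favourable outcomes / possible outcomes

Without reposition the draws are NOT independent: each ball taken out changes the contents of the box.
The probability of a sequence of draws is therefore the product of conditional probabilities
(multiplication rule): P(A ∩ B) = P(A) * P(B | A)

Note: as there is no reposition (except in the last exercise), each ball taken out must be subtracted
from the counts before the next draw.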


a. - Probability of taking a white ball:
        P(A) = n(A)/n(S) = 10/30 = 0.333  --> There's a 33.3% probability

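b. - Probability of taking a black ball and then a white ball:
        8/30 * 10/29 = 80/870 = 9.2% probability
        (If the black ball is treated as already taken out, the conditional probability of
        white is simply 10/29 = 34.5%)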

c. - Probability of taking a red ball after a black and a red ball:
        8/30 * 12/29 * 11/28 = 1056/24360 = 4.33% probability

d. - Red ball after black and red ball (but with reposition):
        8/30 * 12/30 * 12/30 = 1152/27000 = 4.27% probability

"""
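The arithmetic for the four cases can be reproduced with exact fractions. This is a small sketch under the same reading used in the answer (black first, then red, then red for cases c and d); the variable names are illustrative. For case b it prints both the joint probability of "black, then white" and the conditional probability of white once a black ball is already out, since the wording of the exercise admits either reading.

```python
from fractions import Fraction

white, red, black = 10, 12, 8
total = white + red + black  # 30 balls in the box

# a. One draw: Laplace's rule.
p_white = Fraction(white, total)

# b. Black first, then white, without reposition.
p_black_then_white = Fraction(black, total) * Fraction(white, total - 1)
# Conditional reading: the black ball is already out of the box.
p_white_given_black = Fraction(white, total - 1)

# c. Black, then red, then red, without reposition:
# each draw removes one ball from the box (and one red after the second draw).
p_c = (Fraction(black, total)
       * Fraction(red, total - 1)
       * Fraction(red - 1, total - 2))

# d. Same sequence with reposition: the box is restored before every draw,
# so the draws are independent and the counts never change.
p_d = Fraction(black, total) * Fraction(red, total) * Fraction(red, total)

for label, p in [
    ("a. white", p_white),
    ("b. black then white (joint)", p_black_then_white),
    ("b. white given black (conditional)", p_white_given_black),
    ("c. black, red, red without reposition", p_c),
    ("d. black, red, red with reposition", p_d),
]:
    print(f"{label}: {p} = {float(p):.2%}")
```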

## Challenge 3

You are planning to go on a picnic today but the morning is cloudy. You hate rain so you don't know whether to go out or stay home! To help you make a decision, you gather the following data about rainy days:

* Knowing that it is a rainy day, the probability that it started off cloudy is 50%.
* The probability of any day (rainy or not) starting off cloudy is 40%. 
* This month is usually dry so only 3 of 30 days (10%) tend to be rainy. 

What is the probability of rain, given the day started cloudy?

In [None]:
"""
Here we must apply Bayes' theorem.
The evidence (what we observe first) is that the day starts cloudy.
The hypothesis (what we want to know) is that it rains.

With A = cloudy and B = rainy, the conditional probability is:
P(B | A) = P(A ∩ B) / P(A) = P(A | B) * P(B) / P(A)

So the probability of rain, given that the day started cloudy, is:
    P(cloudy ∩ rainy) / P(cloudy)

First, let's calculate the probability of the intersection (cloudy and rainy):
    The probability of a rainy day is 10%: P(rainy) = 0.1
    Half of the rainy days start off cloudy: P(cloudy | rainy) = 0.5
    Therefore, P(cloudy ∩ rainy) = 0.5 * 0.1 = 0.05 --> it is rainy and cloudy on 5% of the days

Second, let's take into account the probability of being a cloudy day, which is 40% (0.4)

Therefore, knowing that it is a cloudy day, the probability of rain should be:
    0.05 / 0.4 = 0.125 --> 12.5% chance of rain on a day that starts cloudy

"""
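The same calculation can be written as a tiny helper. This is only a sketch: the function name `bayes` and its parameter names are made up for illustration, and exact fractions are used so the result is not affected by floating-point rounding.

```python
from fractions import Fraction


def bayes(p_evidence_given_hypothesis, p_hypothesis, p_evidence):
    """Return P(hypothesis | evidence) using Bayes' theorem."""
    return p_evidence_given_hypothesis * p_hypothesis / p_evidence


# Data from the exercise.
p_cloudy_given_rainy = Fraction(1, 2)   # 50% of rainy days start cloudy
p_rainy = Fraction(3, 30)               # 3 of 30 days are rainy
p_cloudy = Fraction(4, 10)              # 40% of days start cloudy

p_rainy_given_cloudy = bayes(p_cloudy_given_rainy, p_rainy, p_cloudy)
print(f"P(rainy | cloudy) = {p_rainy_given_cloudy} = {float(p_rainy_given_cloudy):.1%}")
```

The result is 1/8, that is, the 12.5% obtained above.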

## Challenge 4

One thousand people were asked through a telephone survey whether they thought more street lighting is needed at night or not.

Out of the 480 men that answered the survey, 324 said yes and 156 said no. On the other hand, out of the 520 women that answered, 351 said yes and 169 said no. 

We wonder if men and women have different opinions about the street lighting matter. Is gender relevant or irrelevant to the question?

Consider the following events:
- The answer is yes, so the person that answered thinks that more street lighting is needed.
- The person who answered is a man.

We want to know if these events are independent, that is, if the fact of wanting more light depends on whether one is male or female. Are these events independent or not?

**Hint**: To clearly compare the answers by gender, it is best to place the data in a table.

In [5]:
# (This is not the most effective way to do it from a Pandas user perspective, as I am
# hard coding the totals manually instead of calculating them automatically
# - for speed I am copying them from the paper where I worked on the problem first)

import pandas as pd

# Each row: gender, number of "yes" answers, number of "no" answers, row total
lst = [['male', 324, 156, 480], ['female', 351, 169, 520], ['Total all', 675, 325, 1000]]
survey = pd.DataFrame(lst, columns=['Gender', 'Yes', 'No', 'Total'])
survey

|   | Gender | Yes | No | Total |
|---|---|---|---|---|
| 0 | male | 324 | 156 | 480 |
| 1 | female | 351 | 169 | 520 |
| 2 | Total all | 675 | 325 | 1000 |


In [None]:
"""
Two events A and B (with P(A) > 0 and P(B) > 0) are independent if any one of the following
equivalent conditions holds:
    - P(A|B) = P(A), P(B|A) = P(B)
    - P(A ∩ B) = P(A)P(B)

Let's take the following two events as per the exercise:
    A --> gender = male
    B --> answer = yes

Looking at the values of the table, let's check the probabilities we need:
    P(A) = 480/1000 = 0.48 probability of surveyed person being male
    P(B) = 675/1000 = 0.675 probability of the answer being yes
    P (A ∩ B) = 324/1000 = 0.324 probability of the surveyed person being male and saying yes

Let's check if the condition P(B|A) = P(B) is met:
    P(B|A) = P(A ∩ B) / P(A) = 0.324 / 0.48 = 0.675
    P (B) = 0.675
Therefore: P(B|A) = P(B) --> True

Same for P(A|B) = P(A)
    P(A|B) = P(A ∩ B) / P(B) = 0.324 / 0.675 = 0.48
    P(A) = 0.48
Therefore: P(A|B) = P(A) --> True

CONCLUSION: the 2 events (gender and answer) are independent. The gender does not reveal any information
about the opinion of the person surveyed. 67.5% of both male and female participants answered positively.

"""
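The independence check can also be derived directly from the four raw counts, without typing any totals by hand. The following is a minimal sketch using only the standard library; the dictionary layout and variable names are illustrative choices, not part of the original solution.

```python
from fractions import Fraction

# Raw survey counts: (gender, answer) -> number of people.
counts = {
    ("male", "yes"): 324,
    ("male", "no"): 156,
    ("female", "yes"): 351,
    ("female", "no"): 169,
}

n = sum(counts.values())  # total number of people surveyed

# Marginal probabilities, computed from the counts.
p_male = Fraction(sum(v for (g, _), v in counts.items() if g == "male"), n)
p_yes = Fraction(sum(v for (_, a), v in counts.items() if a == "yes"), n)

# Joint probability of "male and yes".
p_male_and_yes = Fraction(counts[("male", "yes")], n)

# Independence holds exactly when the joint equals the product of the marginals.
independent = p_male_and_yes == p_male * p_yes

print(f"P(male) = {float(p_male)}")
print(f"P(yes) = {float(p_yes)}")
print(f"P(male and yes) = {float(p_male_and_yes)}")
print(f"P(male) * P(yes) = {float(p_male * p_yes)}")
print(f"Independent: {independent}")
```

With survey data the equality will rarely hold exactly as it does here; in that situation a statistical test of independence is the usual tool rather than an exact comparison.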