Solution to: [Day 3: Conditional Probability](https://www.hackerrank.com/challenges/s10-mcq-4/problem)



This notebook contains 4 sections:
	- Notes on Conditional Probability
	- Sample Problems
	- Math solution to Conditional Probability Challenge
	- Monte Carlo simulation of Problem


# Notes on Conditional Probability

## Conditional Probability
**Conditional probability** is the probability of an event occurring, given that one or more other events have already occurred. Two events, A and B, are considered independent if event A has no effect on the probability of event B (i.e. P(B | A) = P(B)).

If events A and B are not independent, then we must consider the probability that both events occur. 
This can be referred to as the intersection of events A and B:

\begin{equation}
\large
P(A \cap B) = P(B | A) * P(A)
\end{equation}


We can then use this definition to find the conditional probability by dividing the probability of the intersection of the two events (A $\cap$ B) by the probability of the event that is assumed to have already occurred (event A):

\begin{equation}
\large
P(B | A) = 
\frac
{P(A \cap B)}
{P(A)}
\end{equation}
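
The definition above can be checked by counting outcomes. The sketch below is only an illustration: the two-dice setup and the events A (first die is even) and B (the dice sum to 8) are assumptions chosen for this example, not part of the challenge.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two fair dice (illustrative setup).
outcomes = list(product(range(1, 7), repeat=2))


def prob(event) -> Fraction:
	"""Returns the exact probability of an event given as a predicate on outcomes."""
	return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))


def is_a(outcome) -> bool:
	"""Event A: the first die is even."""
	return outcome[0] % 2 == 0


def is_b(outcome) -> bool:
	"""Event B: the two dice sum to 8."""
	return sum(outcome) == 8


p_a = prob(is_a)
p_b = prob(is_b)
p_a_and_b = prob(lambda o: is_a(o) and is_b(o))

# P(B | A) = P(A and B) / P(A)
p_b_given_a = p_a_and_b / p_a
print(p_b_given_a, p_b)
```

Here P(B | A) differs from P(B), so under these assumptions A and B are not independent.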

## Bayes' Theorem
Let A and B be two events such that P(A | B) denotes the probability of the occurrence of A given that B has occurred and P(B | A) denotes the probability of the occurrence of B given that A has occurred, then:

\begin{equation}
\large
P(A | B) = 
\frac{ P(B | A) * P(A)}
{P(B)}
\end{equation}

\begin{equation}
\large
= \frac
{P(B | A) * P(A)}
{P(B | A) * P(A) + P(B | A_{C}) * P(A_{C})}
\end{equation}


*c subscript indicates complement.*
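
As a rough sketch, the expanded form of Bayes' theorem translates directly into a small helper. The function name `bayes_posterior` and the input values below are hypothetical and only serve to illustrate the formula.

```python
from fractions import Fraction


def bayes_posterior(p_b_given_a: Fraction, p_a: Fraction, p_b_given_not_a: Fraction) -> Fraction:
	"""Returns P(A | B), expanding P(B) with the law of total probability."""
	p_not_a = 1 - p_a
	# P(B) = P(B | A) * P(A) + P(B | Ac) * P(Ac)
	p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a
	return p_b_given_a * p_a / p_b


# Made-up inputs: P(B | A) = 3/4, P(A) = 1/5, P(B | Ac) = 1/4.
posterior = bayes_posterior(Fraction(3, 4), Fraction(1, 5), Fraction(1, 4))
print(posterior)
```

Using `Fraction` keeps the arithmetic exact, which makes the result easy to compare against a hand calculation.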

# Sample Problems
## Sample Problem 1

Question 1
If the probability of student A passing an exam is 2/7 and the probability of student B *failing* the exam is 3/7, 
then find the probability that at least 1 of the 2 students will pass the exam.

We are given P(A) and P($B_{C}$), and we assume that the two students' results are independent.


There are 4 possible events in our sample space:

1. A passes the exam and B fails: P(A ∩ $B_{C}$).
2. B passes the exam and A fails: P(B ∩ $A_{C}$).
3. A and B both pass the exam: P(A ∩ B).
4. A and B both fail the exam: P($A_{C}$ ∩ $B_{C}$).

We are interested in 3 of these events: 1, 2, and 3.

### Approach 1: Use first 3 events
Calculate the probability that at least one of events 1 - 3 occurs, using the addition rule:

\begin{equation}
\large
P(\text{1+ students pass}) = P(\text{A passes}) + P(\text{B passes}) - P(\text{A and B pass})
\end{equation}

\begin{equation}
\large
= \frac{2}{7} + \frac{4}{7} - (\frac{2}{7} * \frac{4}{7})
\end{equation}

\begin{equation}
\large
=\frac{42}{49} - \frac{8}{49}
\end{equation}

\begin{equation}
\large
= \frac{34}{49}
\end{equation}

### Approach 2: Use 4th event
Calculate the probability that both students fail the exam, and subtract it from the probability of the whole sample space, P(S) = 1:

\begin{equation}
\large
P(\text{1+ students pass}) = P(S) - P(\text{both fail})
\end{equation}

\begin{equation}
\large
= 1 - (\frac{5}{7} * \frac{3}{7})
\end{equation}

\begin{equation}
\large
= 1 - \frac{15}{49}
\end{equation}

\begin{equation}
\large
= \frac{34}{49}
\end{equation}
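
Both approaches can be double-checked with exact fractions. This is a minimal sketch that assumes, as the calculations above do, that the two students' results are independent; the variable names are illustrative.

```python
from fractions import Fraction

p_a_pass = Fraction(2, 7)   # given: P(A passes)
p_b_fail = Fraction(3, 7)   # given: P(B fails)
p_b_pass = 1 - p_b_fail     # complement: P(B passes) = 4/7

# Approach 1: addition rule, P(A) + P(B) - P(A and B).
approach_1 = p_a_pass + p_b_pass - p_a_pass * p_b_pass

# Approach 2: complement of "both fail".
approach_2 = 1 - (1 - p_a_pass) * p_b_fail

print(approach_1, approach_2)
```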

## Sample Problem 2

Historical data shows that it has only rained 5 days per year in some desert region (assuming a 365 day year). A meteorologist predicts that it will rain today. 

- When it actually rains, the meteorologist correctly predicts rain 90% of the time. 
- When it doesn't rain, the meteorologist incorrectly predicts rain 10% of the time. 

Find the probability that it will rain today.


In this question, the probability of rain today depends on whether or not the meteorologist predicted that it will rain today.
We define the following events:

- Event R: It rains today. P(R) =  5/365 = 1/73
- Event Rc: It doesn't rain today. P(Rc) = 360/365 = 72/73
- Event M: The meteorologist predicted it will rain today:
    - P(M | R) = 9/10
    - P(M | Rc) = 1/10

Now we want to find the value of P(R|M):

### Bayes' Theorem solution:
Here P(E) is the probability of the evidence, i.e. P(M), the total probability that the meteorologist predicts rain:

\begin{equation}
\large
P(R | M) = \frac
{P(M|R) * P(R)}
{P(E)}
\end{equation}

\begin{equation}
\large
= \frac
{P(M | R) * P(R)}
{P(M | R) * P(R) + P(M | R_{C}) * P(R_{C})}
\end{equation}

\begin{equation}
\large
= \frac
{ \frac{9}{10} * \frac{1}{73} }
{\frac{9}{10} * \frac{1}{73} + \frac{1}{10} * \frac{72}{73}  }
\end{equation}

\begin{equation}
\large
= \frac
{\frac{9}{730}}
{\frac{9}{730} + \frac{72}{730} }
\end{equation}

\begin{equation}
\large
= \frac
{9}
{81}
\end{equation}


\begin{equation}
\large
= \frac{1}{9}
\end{equation}
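
The arithmetic above can be reproduced with exact fractions. This is only a sketch of the same calculation; the variable names are illustrative.

```python
from fractions import Fraction

p_rain = Fraction(5, 365)                # P(R), simplifies to 1/73
p_no_rain = 1 - p_rain                   # P(Rc) = 72/73
p_predict_given_rain = Fraction(9, 10)   # P(M | R)
p_predict_given_dry = Fraction(1, 10)    # P(M | Rc)

# Total probability that the meteorologist predicts rain, P(M).
p_predict = p_predict_given_rain * p_rain + p_predict_given_dry * p_no_rain

# Bayes' theorem: P(R | M) = P(M | R) * P(R) / P(M)
p_rain_given_predict = p_predict_given_rain * p_rain / p_predict
print(p_rain_given_predict)
```

Even with a 90% accurate forecast, the low prior probability of rain keeps the posterior small.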


# Math solution to Conditional Probability Challenge

Suppose a family has 2 children, one of which is a boy. What is the probability that both children are boys?

*Note*: This is a [famous question in probability theory](https://en.wikipedia.org/wiki/Boy_or_Girl_paradox)

The wording of this question is ambiguous, but under the usual reading (at least one child is a boy, and all four birth orders are equally likely) the logic is straightforward:

Possible families:
- 1	:	(B, B)
- 2	:	(B, G)
- 3	:	(G, B)
- 4	:	(G, G)

We select only the families with at least one boy, thus we have 3 options:
- 1	:	(B, B)
- 2	:	(B, G)
- 3	:	(G, B)

What is the probability that we select the family with both boys?
1/3, because the three remaining families are equally likely.
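
The counting argument can be written out in a few lines. This is a minimal sketch that assumes each of the four (first child, second child) combinations is equally likely; `itertools.product` is used here purely for brevity.

```python
from fractions import Fraction
from itertools import product

# All equally likely (first child, second child) combinations.
families = list(product("BG", repeat=2))

# Condition on the family having at least one boy.
with_boy = [family for family in families if "B" in family]

# Among those, count the families where both children are boys.
both_boys = [family for family in with_boy if family == ("B", "B")]

probability = Fraction(len(both_boys), len(with_boy))
print(probability)
```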

# Monte Carlo Solution


## Imports

In [2]:
import random
from typing import List, Set

## Constants

In [3]:
CHILDREN = 2
TYPES = ['B', 'G']
NEEDED_GENDER = 'B'
DESIRED_GENDER = 'G'		# Currently unused; main() passes NEEDED_GENDER to get_ratio

## Create families

In [4]:
def create_all_possible_families(num_children: int, child_types: List[str]) -> List[List[str]]:
	"""Returns list of list of all possible family combinations.
	
	Referenced: https://www.geeksforgeeks.org/print-all-possible-combinations-of-r-elements-in-a-given-array-of-size-n/
	"""
	def helper(combos, combo) -> list:
		"""Helper recursive function to return all possible family combinations."""
		if len(combo) == num_children:		## Base case
			combos.append(combo)
			return combos
		
		for child in child_types:
			combos = helper(
				combos,
				combo + [child])
		return combos
	

	return helper([], [])

In [5]:
def remove_without_gender(families: List[List[str]], needed_gender: str) -> List[List[str]]:
	"""Returns families after removing one child of the needed gender from each family.
	
	If the needed gender is not inside a family, the whole family is removed.
	The input list and its inner lists are modified in place.
	"""
	removed = 0
	for i in range(len(families)):
		family = families[i - removed]
		flag_removed_gender = False

		for j in range(len(family)):
			if family[j] == needed_gender:
				del family[j]
				flag_removed_gender = True
			
			if flag_removed_gender:
				break
		
		if not flag_removed_gender:
			del families[i - removed]
			removed += 1

	return families

In [6]:
def get_ratio(families: List[List[str]], gender: str, iterations: int) -> float:
	"""Returns the estimated probability that a child picked from a random family is the specified gender."""
	number_picked = 0
	for _ in range(iterations):
		family = random.choice(families)
		child = random.choice(family)
		
		if child == gender:
			number_picked += 1

	return number_picked / iterations

## Main

In [7]:
def main():
	possible_families = create_all_possible_families(
		num_children=CHILDREN,
		child_types=TYPES
	)

	allowed_families = remove_without_gender(possible_families, NEEDED_GENDER)

	iterations = 100000
	ratio = get_ratio(allowed_families, NEEDED_GENDER, iterations)
	print(ratio)		# approximates 1/3

In [8]:
if __name__ == "__main__":
	main()

0.33297
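
For comparison, a more direct Monte Carlo sketch draws random families instead of sampling from the enumerated list. This is an alternative to the approach above, not a replacement for it; the function name `simulate_two_boys` and the `seed` parameter are hypothetical, and each child is assumed to be a boy or a girl with equal probability.

```python
import random


def simulate_two_boys(trials: int, seed: int = 0) -> float:
	"""Estimates P(both boys | at least one boy) by sampling random two-child families."""
	rng = random.Random(seed)		# Seeded generator so runs are repeatable
	at_least_one_boy = 0
	both_boys = 0

	for _ in range(trials):
		family = [rng.choice("BG") for _ in range(2)]
		if "B" in family:		# Keep only families that satisfy the condition
			at_least_one_boy += 1
			if family == ["B", "B"]:
				both_boys += 1

	return both_boys / at_least_one_boy


estimate = simulate_two_boys(100_000)
print(estimate)		# should be close to 1/3
```

Families without a boy are simply discarded, which is the simulation equivalent of conditioning on the event "at least one boy".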
