# Introduction to Conditional Probability 

One of the most important concepts in probability is *conditional probability*, which describes how knowing that one event has occurred changes the probabilities of other events. In data science and statistics, this is particularly important because we are often trying to determine whether an observed effect depends on some underlying feature, or to analyze the nature of that dependence.


Probability is generally considered a tricky subject in which answers often defy initial intuition, and many examples from conditional probability fit this description.

Let's start with a simple example that fools many who are new to probability or even those who have some basic knowledge of probability but who haven't studied conditional probability in detail:

**Example: The Magician's Coins**

Suppose you attend a magic show. The magician shows you two coins: a two-headed coin and a fair coin. She asks you to pick one of the coins at random and flip it, observing only the top face. If the outcome of that first flip is Heads, does that affect the probability that Heads would come up if you flip that coin again?

To answer this question, we first need to know the probability of getting heads when one of the coins is chosen at random and flipped. I will show you how to analyze this probability shortly, but let's first build a simple simulation to estimate it.

As a first step, let's choose a random coin from the list `['fair', '2head']` and print it out for a very small simulation:

In [1]:
import random

In [12]:
def choose_coin(num_sims=10):
    coins=['fair','2head']
    for sim in range(num_sims):
        coin=random.choice(coins)
        print(coin)

In [13]:
choose_coin()

2head
fair
fair
fair
2head
fair
fair
fair
fair
2head


Now, we can check the outcome of the coin choice and randomly choose one of the faces of the chosen coin. If you are following along in a separate Jupyter notebook, I suggest you copy the function from above, rename it, and then add the rest of the code (the part below the comment in the function below):

In [14]:
def choose_and_flip(num_sims=10):
    coins=['fair','2head']
    for sim in range(num_sims):
        coin=random.choice(coins)
        # Delete the print statement and add the following:
        if coin=='fair':
            faces=['H','T'] 
        else:
            faces=['H','H']
        value=random.choice(faces)
        print(coin,value)

In [11]:
choose_and_flip()

2head H
fair T
fair T
fair H
fair T
2head H
2head H
2head H
fair T
fair T


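In the small run shown above, heads came up in 5 of the 10 trials, but if you rerun the simulation several times, you should see that the relative frequency of heads is usually more than 1/2. That should be intuitive, and you may be able to guess the answer.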

Finally, let's estimate the probability of heads by determining the relative frequency of 'H'. We initialize a counter called `num_heads` to zero outside the simulation loop and then increment it every time the outcome is `H`. 

**Important: Note that we need to drastically increase the number of simulated coin flips, so we will turn off printing inside the for loop.**

In [15]:
def one_flip(num_sims=100_000):
    coins=['fair','2head']
    
    num_heads=0
    for sim in range(num_sims):
        coin=random.choice(coins)
        if coin=='fair':
            faces=['H','T'] 
        else:
            faces=['H','H']
        value=random.choice(faces)
        if value=='H':
            num_heads+=1
            
    print("Prob. of H is approximately", num_heads/num_sims)

In [17]:
one_flip()

Prob. of H is approximately 0.74921



How can we calculate this probability using equally likely outcomes?

$F$ (fair coin chosen): H T

$F^c$ (two-headed coin chosen): H H
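One way to make the equally likely argument concrete is to enumerate the four (coin, face) pairs. Each coin is chosen with probability 1/2 and each face of the chosen coin lands up with probability 1/2, so every pair has probability 1/4. The sketch below is an illustration, not part of the original notebook; the names `faces_by_coin` and `outcomes` are made up for this example.

```python
from fractions import Fraction

# Each coin is equally likely to be chosen, and each of its two faces is
# equally likely to land up, so all four (coin, face) pairs are equally likely.
faces_by_coin = {'fair': ['H', 'T'], '2head': ['H', 'H']}
outcomes = [(coin, face)
            for coin, faces in faces_by_coin.items()
            for face in faces]

# Count the pairs whose face is heads and divide by the total number of pairs.
num_heads = sum(1 for coin, face in outcomes if face == 'H')
prob_heads = Fraction(num_heads, len(outcomes))
print("Exact prob. of H is", prob_heads)
```

Three of the four equally likely outcomes are heads, so the exact probability is 3/4, which agrees with the simulated value of about 0.749.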

Now for the second question: the magician withdraws a coin at random and flips it. If it comes up Heads, what is the probability that a second flip of that same coin will also be Heads?

How can we simulate this?

Do this before Thursday's class. Make a simulation of two flips in which you count:

* the number of times heads occurred on the first flip of the coin
* the number of times heads occurred on a second flip of the coin when heads occurred on the first flip

We are looking for the ratio of the second counter to the first counter. I.e., we are ignoring all those cases where heads did not occur on the first flip because we are only asking about events where heads did occur on the first flip.

In [6]:
def two_flips(num_sims=10000):
    coins=['fair','2head']
    heads_count1=0
    heads_count2=0
    for sim in range(num_sims):
        coin=random.choice(coins)
        if coin=='fair':
            faces=['H','T']
        else:
            faces=['H','H']
        value=random.choice(faces)
        if value=='H':
            heads_count1+=1
            value=random.choice(faces)
            if value=='H':
                heads_count2+=1
    print("Prob. of heads on second flip given heads on first flip is",
          heads_count2/heads_count1)

In [5]:
two_flips()

Prob. of heads on second flip given heads on first flip is 0.8333111259160559
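The simulated ratio can be checked against an exact count over equally likely outcomes. With two flips there are eight equally likely (coin, first face, second face) triples, and we keep only those where the first face is heads. This is a sketch added for illustration, with hypothetical names (`faces_by_coin`, `heads_first`, `heads_both`) that do not appear in the original notebook.

```python
from fractions import Fraction
from itertools import product

faces_by_coin = {'fair': ['H', 'T'], '2head': ['H', 'H']}

# All (coin, first face, second face) triples; each has probability 1/8.
outcomes = [(coin, f1, f2)
            for coin, faces in faces_by_coin.items()
            for f1, f2 in product(faces, repeat=2)]

# Keep only the outcomes with heads on the first flip, then count how many
# of those also have heads on the second flip.
heads_first = [o for o in outcomes if o[1] == 'H']
heads_both = [o for o in heads_first if o[2] == 'H']

prob_second_given_first = Fraction(len(heads_both), len(heads_first))
print("Exact prob. of heads on second flip given heads on first flip is",
      prob_second_given_first)
```

Six of the eight outcomes have heads on the first flip, and five of those six also have heads on the second flip, giving 5/6 ≈ 0.8333, in line with the simulation. Since this is larger than 3/4, observing heads on the first flip does raise the probability of heads on the second flip.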


In [1]:
def one_flip2(num_sims=10000):
    coins=['fair','2head']
    heads_count=0
    two_head_count=0
    for sim in range(num_sims):
        coin=random.choice(coins)
        if coin=='fair':
            faces=['H','T']
        else:
            faces=['H','H']
        value=random.choice(faces)
        if value=='H':
            heads_count+=1
            if coin=='2head':
                two_head_count+=1
    print("Prob. of heads is",heads_count/num_sims)
    print("Prob of 2head coin when got heads is", two_head_count/heads_count)



In [5]:
one_flip2(100000)

Prob. of heads is 0.7502
Prob of 2head coin when got heads is 0.6700746467608638
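The second number printed by `one_flip2` estimates the probability that the two-headed coin was chosen, given that the flip came up heads. As a hedged check, we can compute it with the standard definition of conditional probability, $P(A \mid B) = P(A \cap B)/P(B)$, which these notes have not formally introduced at this point. Here $F^c$ denotes the event that the two-headed coin was chosen and $H$ the event that the flip is heads:

$$
P(F^c \mid H) = \frac{P(F^c \cap H)}{P(H)} = \frac{\tfrac{1}{2} \cdot 1}{\tfrac{3}{4}} = \frac{2}{3} \approx 0.667
$$

This is consistent with the simulated value of about 0.670. It also suggests why heads on the first flip makes heads on the second flip more likely: seeing heads makes it more probable that we are holding the two-headed coin.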
