## Key concepts 

In this notebook, you will be introduced to the following concepts:

- Intersections of events 
- Marginal probabilities
- Independence

The notebook also relies on a separate file `independence_code.py` where you will code an experiment to measure independence. Note that not all of the answers are provided in the notebook; we will figure out the missing parts together as a class.

### Warm up discussion questions

_Answer the following questions with a partner in a text file, then turn in your text file to Canvas using today's link. In your text file, you should answer each question with a short sentence, explaining your reasoning._

- If the Broncos throw at least one interception during a game, are they more or less likely to win the game?
- If a person smokes cigarettes, does it make it more or less likely they will get lung cancer? 
- If you flip a coin while walking forwards, are you more likely to get heads than if you flip a coin while walking backwards?

### A quick reminder from earlier

Recall that the sample space $\Omega$ is the set of all possible outcomes of an experiment, and that an event $A$ is a subset of $\Omega$. 

- For instance, if the Buffs play a game on Saturday, $A=\{W\}$ might be the event that the Buffs win. Note that $A \subset \Omega = \{W, L\}$.
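
Because events are sets, Python's built-in `set` type can mirror this notation directly. The sketch below assumes the two-outcome game from the example above; the variable names are only illustrative.

```python
# Sample space for a single game: win (W) or loss (L)
omega = {"W", "L"}

# Event A: the Buffs win
A = {"W"}

# An event is a subset of the sample space
print(A <= omega)  # True
print(A < omega)   # True: A is a proper subset of omega
```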


#### Class warm up question

Say you flip a coin twice. Use "1" to indicate that you get heads and "0" to indicate that you get tails. So for instance, getting heads and heads would be written as "11". What is the sample space $\Omega$? Let's answer this one together as a class using a whiteboard.

#### Canvas group question

_Using your same text file, answer the following question with a partner, then turn in your answers to Canvas using today's link. Don't delete the earlier part of your text file. Just add your answer to this question at the bottom in a new section._

Say you flip a coin twice. Let $A$ be the event that you flip a heads on your first toss. What is $A$? Recall that $A$ is a set, specifically a subset of $\Omega$. So you should answer this question by defining a set.

Hint: $A = \{"11" ...$ ? 

### Intersections of events 

Because events are just sets, it makes sense to talk about the intersection of two events $A$ and $B$. More formally, let's define $A \cap B$ as the set of all elements in $\Omega$ that are in both $A$ and $B$.
   - There is nothing really new here. This is the same set intersection operation we defined earlier in the class. 
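
Python sets support intersection with the `&` operator, so the definition can be tried out directly. The sketch below uses a hypothetical six-sided die roll rather than the coin experiment, so the class question stays open.

```python
# Hypothetical experiment: roll one six-sided die
omega = {1, 2, 3, 4, 5, 6}

A = {2, 4, 6}   # event: the roll is even
B = {4, 5, 6}   # event: the roll is greater than 3

# Elements that are in both A and B
A_and_B = A & B
print(sorted(A_and_B))    # [4, 6]
print(A_and_B <= omega)   # True: the intersection is itself an event
```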
   
#### Class question 



Say $A$ is the event that you flip a heads on your first toss. Say $B$ is the event that you flip a heads on your second toss. Let's do the following on a whiteboard.
   - What is $A$?
   - What is $B$? 
   - What is $A \cap B$? Note that because $A$ is a set and $B$ is a set, $A \cap B$ is a set.

### Coding question

In `independence_code.py` you will simulate an experiment in which you flip a coin twice.

#### Step 1
- To get started, fill out the `two_flips` function. This function simulates two coin flips. If you fill out the function correctly, the two corresponding tests will pass. We will check in about this function in a few minutes.

#### Step 2
- After the check in, code the `find_omega` function to identify the set of all possible outcomes of the two flips experiment. The `find_omega` function runs the experiment 1000 times.
    - Note that we are making an assumption that if you run the experiment 1000 times you will see all possible outcomes at least once. (This assumption is reasonable, in this case.) 
- We will check in about this function after you take a crack at coding it.
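
One rough way to see why the 1000-run assumption is reasonable: assume the experiment has four equally likely outcomes and that runs do not affect one another. Then any single outcome fails to appear in all 1000 runs with probability $(3/4)^{1000}$, and adding that up over the four outcomes gives an upper bound on the chance of missing at least one. This is a sketch under those stated assumptions, not part of `independence_code.py`.

```python
n_runs = 1000
n_outcomes = 4  # assumed: four equally likely outcomes per run

# Probability that one particular outcome never appears in n_runs runs
p_miss_one = (1 - 1 / n_outcomes) ** n_runs

# Union bound: P(at least one outcome is missing) <= sum of the individual misses
bound = n_outcomes * p_miss_one
print(bound < 1e-100)  # True
```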

### Question

Using your `two_flips` function, compute the probability associated with each possible outcome of the experiment. To compute these probabilities, you should run the two flips experiment 1000 times using the code below.

```python
from independence_code_instructor import two_flips

from collections import defaultdict

outcomes = defaultdict(int)

for i in range(1000):
    # your code here.
    # 1. run the two flips experiment and
    # 2. record the outcome in the outcomes dictionary
    pass  # delete me
```

### Question

Present your results from the previous experiment using the table below.

In this table, the rows indicate the value of your coin on the first flip and the columns indicate the value of your coin on the second flip. So for example, the top right corner should be the probability of getting a heads and then getting a tails.

| First flip ↓ / Second flip → | 1 | 0 |
|:-:|:-:|:-:|
| 1 | x | x |
| 0 | x | x |

### Question

Using your `outcomes` data structure, what is the probability of getting a heads on the first flip? Note that this can happen two ways. You can draw a "10" or you can draw a "11". Because these two outcomes cannot both occur on the same run, their probabilities add. In other words, $p($heads on first flip$)$ = $p(\{10\} \cup \{11\})$ = $p(10) + p(11)$.

### Marginal probabilities

Notice that to compute the probability of getting a heads on the first flip, you would add across the row in the table above. In olden times, people used to print out tables like this and write the sum of the probabilities in the margins. Hence these kinds of probabilities are called **marginal probabilities**. More formally, the marginal probability of an event $x$ is $p(x) = \sum_{y \in Y} p(x \cap y)$, i.e. the sum, over every possible event $y$ in $Y$, of the probability that $x$ and $y$ both occur.

For instance, $p($heads on first flip$)$ = $p($heads on first flip $\cap$ heads on second flip$)$ + $p($heads on first flip $\cap$ tails on second flip$)$. 
   - In this case, $Y = \{H, T\}$ would be the possible outcomes for the second flip.

The distribution of $p(x)$, over all possible values of $x$, is called the **marginal distribution**.
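
The sum in the definition can be sketched in a few lines of Python. The joint probabilities below are made up for illustration and are not results from the coin experiment; the helper names `marginal_first` and `marginal_second` are hypothetical. Keys follow the notebook's convention, where the first character is the first variable and the second character is the second variable.

```python
# Made-up joint probabilities p(x ∩ y), NOT results from the coin experiment.
# First character of each key is x, second character is y.
joint = {"11": 0.4, "10": 0.1, "01": 0.2, "00": 0.3}

def marginal_first(joint, x):
    """Sum p(x ∩ y) over every value y of the second position."""
    return sum(p for outcome, p in joint.items() if outcome[0] == x)

def marginal_second(joint, y):
    """Sum p(x ∩ y) over every value x of the first position."""
    return sum(p for outcome, p in joint.items() if outcome[1] == y)

# Marginal distribution of the first position: one probability per value of x
first_marginals = {x: marginal_first(joint, x) for x in "10"}
print(round(first_marginals["1"], 2))  # 0.5
print(round(first_marginals["0"], 2))  # 0.5
```

Summing with `outcome[0] == x` corresponds to adding across a row of the table, and `outcome[1] == y` corresponds to adding down a column.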

#### Question

Using your `outcomes` data structure, what is the marginal probability of getting a tails on the second flip?

### Independence

We are now ready to introduce the idea of independence. 

- Two events are **independent** if the probability of both events occurring is the same as the marginal probability of one event times the marginal probability of the second event. 

    - You can think about independence a little more **loosely**. If one event happening does not make it more or less likely that the other event happens, the two events are independent.

    - You can also think about independence a little more **formally**. If $p(A \cap B) = p(A)p(B)$ then $A$ and $B$ are independent. 

##### Notation for independence
- We write $A \perp \!\!\! \perp B$ to say that $A$ and $B$ are independent.
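
The formal definition translates into a one-line numeric check. The function `is_independent` and its `tol` parameter are hypothetical names chosen for this sketch, and the numbers passed to it are made up for illustration.

```python
def is_independent(p_a, p_b, p_a_and_b, tol=1e-9):
    """Return True when p(A ∩ B) equals p(A) * p(B), up to a tolerance."""
    return abs(p_a_and_b - p_a * p_b) <= tol

# Made-up numbers for illustration only
print(is_independent(0.5, 0.6, 0.30))  # True:  0.5 * 0.6 = 0.30
print(is_independent(0.5, 0.6, 0.40))  # False: 0.40 differs from 0.30
```

When the probabilities are estimated from a finite number of simulated runs, the two sides will rarely match exactly, so a looser tolerance (for example, an assumed `tol=0.05`) is usually more realistic than the default.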

#### Question

_Using your same text file, answer the following question with a partner, then turn in your answers to Canvas using today's link. Don't delete the earlier part of your text file. Just add your answer to this question at the bottom in a new section._

Using the definition of independence above, and your prior answers, do you think the events $A$ (heads on the first flip) and $B$ (heads on the second flip) from our experiment are independent?

### But what about real data?

There is something a little funny about this notebook. We coded this notebook so that the probability of each coin flip did not depend on any of the other coin flips. So of course we could compute that the flips were independent. If this feels a little circular to you, you're right!

In real life, we don't usually simulate data with a computer. Instead, we just observe data in the world and have to make our best guess about how it is created or generated. This is sometimes called the [data generating process](https://en.wikipedia.org/wiki/Data_generating_process). 

Next time, we will look at crime data from the Boulder police. Specifically, we will look at the race and gender of people who are stopped by the Boulder police and try to determine if these variables are independent by estimating the marginal distributions based on data.