# Chapter 2: Conditional Probability

# The source of the content is freely available online
# https://drive.google.com/file/d/1VmkAAGOYCTORq1wxSQqy255qLJjTNvBI/view
# https://projects.iq.harvard.edu/stat110/

<h4>Definition 2.2.1 (Conditional Probability)</h4>

If $A$ and $B$ are events with $P(B) \gt 0$, then the conditional probability of $A$ given $B$, $P(A|B)$, is:

$P(A|B) = \frac{P(A \cap B)}{P(B)}$

$P(A)$ is the prior probability of $A$, and $P(A|B)$ is the posterior probability of $A$.

<h4>Example 2.2.2. (Two Cards)</h4>

Two cards are drawn randomly, one at a time without replacement. Let $A$ be the event that the first card is a heart, and $B$ be the event that the second card is red. Find $P(A|B)$ and $P(B|A)$.

Answer:

By the naive definition of probability and the multiplication rule:

$P(A \cap B) = \frac{13 \cdot 25}{52 \cdot 51} = \frac{25}{204}$

since a favorable outcome is determined by choosing any of the $13$ hearts and then any of the remaining $25$ red cards. Also, $P(A) = \frac{1}{4}$ since the $4$ suits are equally likely, and 

$P(B) = \frac{26 \cdot 51}{52 \cdot 51} = \frac{1}{2}$

since there are $26$ favorable possibilities for the second card, and for each of those, the first card can be any other card.

A neater way to see that $P(B) = \frac{1}{2}$ is by symmetry - from a vantage point before having done the experiment, the second card is equally likely to be any card in the deck.

We now have all the pieces needed to apply the definition of conditional probability.

$P(A|B) = \frac{P(A \cap B)}{P(B)} = \frac{25/204}{1/2} = \frac{25}{102}$
$P(B|A) = \frac{P(B \cap A)}{P(A)} = \frac{25/204}{1/4} = \frac{25}{51}$

There are several things worth noting:

1. It's important to be careful about which events to put on which side of the conditioning bar: in general $P(A|B) \neq P(B|A)$ (here, $\frac{25}{102}$ versus $\frac{25}{51}$). Confusing these two quantities is the 'prosecutor's fallacy'.

2. The chronological order in which cards were chosen does not dictate which conditional probabilities we can look at. Imagine that someone spreads out the cards and draws one card with their left hand and another card with their right hand, at the same time. Defining $A$ and $B$ based on the left hand's card and the right hand's card, rather than the first and second card, would not change the structure of the problem in any important way.

3. We can see that $P(B|A) = \frac{25}{51}$ by a direct interpretation of what conditional probability means: if the first card drawn is a heart, then the remaining cards consist of $25$ red cards and $26$ black cards, and the conditional probability of getting a red card is $25/(25+26) = 25/51$.
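The values in Example 2.2.2 can be checked by brute-force enumeration. The sketch below assumes a standard 52-card deck and treats every ordered pair of distinct cards as equally likely; the helper names (`prob`, `first_is_heart`, `second_is_red`) are illustrative and not part of the original notes.

```python
from fractions import Fraction
from itertools import permutations

# Build a 52-card deck as (rank, suit) pairs; hearts and diamonds are red.
SUITS = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for suit in SUITS for rank in range(1, 14)]

# Every ordered pair of distinct cards is an equally likely outcome.
draws = list(permutations(deck, 2))

def prob(event):
    """Naive probability: favorable ordered draws over all ordered draws."""
    return Fraction(sum(1 for d in draws if event(d)), len(draws))

def first_is_heart(d):
    return d[0][1] == "hearts"

def second_is_red(d):
    return d[1][1] in ("hearts", "diamonds")

p_a = prob(first_is_heart)
p_b = prob(second_is_red)
p_ab = prob(lambda d: first_is_heart(d) and second_is_red(d))

p_a_given_b = p_ab / p_b
p_b_given_a = p_ab / p_a
print(p_ab, p_a_given_b, p_b_given_a)  # 25/204 25/102 25/51
```

Using `Fraction` keeps the arithmetic exact, so the output can be compared directly with the fractions in the worked answer.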

<h4>Example 2.2.5 (Two Children)</h4>

Mr. Jones has two children. The older child is a girl. What is the probability that both children are girls?

Mr. Smith has two children. At least one of them is a boy. What is the probability that both children are boys?

Answer:

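The definition of conditional probability gives the results below. Mr. Smith's question is stated for boys, but by symmetry between the sexes it has the same answer as the "at least one girl" version worked out here.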

$P(\text{both girls | elder is girl}) = \frac{ P(\text{both girls, elder is girl}) }{ P(\text{elder is girl}) } = \frac{1/4}{1/2} = \frac{1}{2}$

$P(\text{both girls | at least one girl}) = \frac{P(\text{both girls, at least one girl})}{P(\text{at least one girl})} = \frac{1/4}{3/4} = \frac{1}{3}$

By symmetry:

$P(\text{both girls | younger is girl}) = P(\text{both girls | elder is girl}) = \frac{1}{2}$

However, there is no such symmetry between the conditional probabilities $P(\text{both girls | elder is girl})$ and $P(\text{both girls | at least one girl})$. Saying that the elder is a girl designates a specific child, and then the other child (the younger) has a $50\%$ chance of being a girl.

Conditioning on a specific child being a girl knocks away $2$ of the $4$ 'pebbles' in the sample space of $\{GG, GB, BG, BB\}$, where $GB$ means the elder child is a girl and the younger child is a boy. In contrast, conditioning on at least one child being a girl knocks away only $BB$.
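The 'pebble' picture translates directly into a small enumeration. This is a minimal sketch that assumes the four outcomes are equally likely; `cond_prob` is a hypothetical helper that discards the pebbles ruled out by the conditioning event and counts what is left.

```python
from fractions import Fraction
from itertools import product

# Each outcome is (elder, younger); all four are assumed equally likely.
outcomes = list(product("GB", repeat=2))

def cond_prob(event, given):
    """P(event | given), obtained by restricting the sample space to `given`."""
    kept = [o for o in outcomes if given(o)]
    return Fraction(sum(1 for o in kept if event(o)), len(kept))

def both_girls(o):
    return o == ("G", "G")

p_given_elder = cond_prob(both_girls, lambda o: o[0] == "G")
p_given_at_least_one = cond_prob(both_girls, lambda o: "G" in o)
print(p_given_elder, p_given_at_least_one)  # 1/2 1/3
```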

<h4>Example 2.2.6 (Random Child is a Girl)</h4>

A family has two children. You randomly run into one of the two, and learn that she is a girl. What is the conditional probability that both are girls?

Let $G_1$, $G_2$, and $G_3$ be the events that the elder, younger, and random child is a girl, respectively. By assumption, $P(G_1) = P(G_2) = P(G_3) = 1/2$. By the naive definition of probability, we have:

$P(G_1 \cap G_2 | G_3) = P(G_1 \cap G_2 \cap G_3) / P(G_3) = \frac{1/4}{1/2} = \frac{1}{2}$

because $G_1 \cap G_2 \cap G_3 = G_1 \cap G_2$. If both children are girls, it guarantees that the random child is a girl.

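Keep in mind that arriving at $\frac{1}{2}$ requires an assumption about how the random child was selected: each of the two children is equally likely to be the one we meet, regardless of sex. In statistical language, we have collected a random sample of one child.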

<h4>Example 2.2.7 (A Girl Born in Winter)</h4>

A family has two children. Find the probability that both children are girls, given that at least one of the two is a girl who was born in winter. Assume that all four seasons are equally likely and that gender is independent of season.

Answer:

$P(\text{both girls | at least one winter girl}) = \frac{ P(\text{both girls, at least one winter girl}) }{P(\text{at least one winter girl})}$

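For the denominator, each child is independently a winter-born girl with probability $\frac{1}{2} \cdot \frac{1}{4} = \frac{1}{8}$ (taking the four seasons as equally likely), so $P(\text{at least one winter girl}) = 1 - (7/8)^2$. For the numerator, both children are girls with probability $\frac{1}{4}$, and given that, at least one of them is winter-born with probability $1 - (3/4)^2$. Thus: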

$P(\text{both girls | at least one winter girl}) = \frac{ \left( \frac{1}{4} \right) \left( 1-\left(\frac{3}{4} \right)^2 \right) }{ 1-(7/8)^2 } = \frac{7/64}{15/64} = \frac{7}{15}$

The result in example 2.2.5 is that the conditional probability of both children being girls, given that at least one is a girl, is $\frac{1}{3}$. Why should it be any different when we learn that at least one is a winter-born girl?

Conditioning on more and more specific information brings the probability closer and closer to $\frac{1}{2}$.

Conditioning on "at least one girl born on March 31" comes very close to specifying a child. The seemingly irrelevant information such as season of birth interpolates between the two parts of example 2.2.5.
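The interpolation between $\frac{1}{3}$ and $\frac{1}{2}$ can be seen numerically. The sketch below is an illustration under stated assumptions: the extra trait (season, birthday, and so on) takes one of `k` equally likely values, independently of sex and independently across the two children. The function name `p_both_girls` and the choice of $k = 365$ for birthdays are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product

def p_both_girls(k):
    """P(both girls | at least one girl with a named trait of probability 1/k)."""
    # A child is (sex, trait_value); trait value 0 is the named one (e.g. winter).
    children = list(product("GB", range(k)))
    special = ("G", 0)
    kept = both = 0
    for elder, younger in product(children, repeat=2):
        if special in (elder, younger):
            kept += 1
            if elder[0] == "G" and younger[0] == "G":
                both += 1
    return Fraction(both, kept)

print(p_both_girls(1))    # 1/3   (no extra information beyond "a girl")
print(p_both_girls(4))    # 7/15  (winter-born girl)
print(p_both_girls(365))  # 729/1459, already very close to 1/2
```

With $k = 1$ the trait carries no information and the result matches the "at least one girl" case; as $k$ grows, the condition behaves more and more like naming a specific child.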

<h3>Bayes' Rule and the LOTP</h3>

<h4>Theorem 2.3.1 (Probability of the Intersection of Two Events)</h4>

For any events $A$ and $B$ with positive probabilities:

$P(A \cap B) = P(B)P(A|B) = P(A) P(B|A)$

This follows from taking the definition of $P(A|B)$ and multiplying both sides by $P(B)$, and then taking the definition of $P(B|A)$ and multiplying both sides by $P(A)$.

Applying theorem 2.3.1 repeatedly, we can generalize to the intersection of $n$ events.

<h4>Theorem 2.3.2 (Probability of the Intersection of $n$ Events)</h4>

For any events $A_1, \ldots, A_n$ with $P(A_1, A_2, \ldots, A_{n-1}) > 0$, 

$P(A_1, A_2, \ldots, A_n) = P(A_1) P(A_2 | A_1) P(A_3 | A_1, A_2) \cdots P(A_n | A_1, \ldots, A_{n-1})$

<h4>Theorem 2.3.3. (Bayes' Rule)</h4>

Bayes' Rule can be derived as follows:

$P(A|B)P(B) = P(B|A)P(A)$
$P(A|B) = \frac{ P(B|A) P(A) }{P(B)}$

Another way to work with Bayes' rule is in terms of odds rather than probability.

$\frac{P(A|B)}{P(A^C | B)} = \frac{P(B|A)}{P(B|A^C)} \frac{P(A)}{P(A^C)}$

<h4>Theorem 2.3.6 (Law of Total Probability)</h4>

Let $A_1, \ldots, A_n$ be a partition of the sample space $S$ (i.e., the $A_i$ are disjoint events and their union is $S$), with $P(A_i) \gt 0$ for all $i$. Then, $P(B) = \sum_{i=1}^n P(B|A_i)P(A_i)$.

<h4>Example 2.3.7 (Random Coin)</h4>

You have one fair coin and one biased coin which lands heads with probability $\frac{3}{4}$. You pick one of the coins at random and flip it $3$ times. It lands heads all $3$ times. What is the probability that the coin you picked is the fair one?

Let $F$ be the event that the fair coin is picked, and $A$ be the event that $3$ heads are flipped.

$P(F|A) = \frac{ P(A|F) P(F) }{P(A)}$

$P(F|A) = \frac{ P(A|F) P(F) }{ P(A|F) P(F) + P(A|F^C) P(F^C) }$

$P(F|A) = \frac{ \left( \frac{1}{2} \right)^3 \cdot \frac{1}{2} }{ \left( \frac{1}{2} \right)^3 \cdot \frac{1}{2} + \left( \frac{3}{4} \right)^3 \cdot \frac{1}{2} } = \frac{8}{35} \approx 0.23$
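The same Bayes' rule calculation can be written as a short function, which also shows how the posterior changes with the number of heads observed. This is a sketch; `posterior_fair` and its parameter names are hypothetical, with defaults taken from the example (biased coin lands heads with probability $\frac{3}{4}$, each coin picked with probability $\frac{1}{2}$).

```python
from fractions import Fraction

def posterior_fair(n_heads, p_biased=Fraction(3, 4), prior_fair=Fraction(1, 2)):
    """P(fair coin | n_heads heads in n_heads tosses), by Bayes' rule and LOTP."""
    like_fair = Fraction(1, 2) ** n_heads
    like_biased = p_biased ** n_heads
    # Law of total probability: P(A) = P(A|F)P(F) + P(A|F^C)P(F^C)
    evidence = like_fair * prior_fair + like_biased * (1 - prior_fair)
    return like_fair * prior_fair / evidence

print(posterior_fair(3), float(posterior_fair(3)))  # 8/35, about 0.23
print(posterior_fair(0))                            # 1/2, the prior
```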

<h4>Example 2.3.9 (Testing for a Rare Disease)</h4>

Fred is tested for a disease which afflicts $1\%$ of the population, and the test is positive. Let $D$ be the event that Fred has the disease, and $T$ be the event that he tests positive.

The test is $95\%$ accurate, in this case meaning $P(T|D) = 0.95$ and $P(T^C|D^C) = 0.95$ (the quantity $P(T|D)$ is known as the sensitivity or true positive rate, and $P(T^C|D^C)$ as the specificity or true negative rate).

Find the conditional probability that Fred has the disease, given the positive test result.

Answer:

Applying Bayes' rule and the law of total probability:

$P(D|T) = \frac{ P(T|D) P(D) }{P(T)}$
$P(D|T) = \frac{ P(T|D) P(D) }{ P(T|D)P(D) + P(T|D^C) P(D^C) }$
$P(D|T) = \frac{ 0.95 \cdot 0.01 }{ 0.95 \cdot 0.01 + 0.05 \cdot 0.99 } \approx 0.16$

For intuition, consider a population of $10,000$ people, where $100$ have the disease and $9,900$ do not. If we tested everybody in the population, we'd expect that out of the $100$ diseased individuals, $95$ would test positive and $5$ would test negative. Out of the $9,900$ healthy individuals, we'd expect $(0.95)(9900) = 9405$ to test negative and $495$ to test positive. The $95$ true positives are far outnumbered by the $495$ false positives, so most people who test positive don't actually have the disease.
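The calculation above can be packaged as a small function to explore how the answer depends on prevalence. This is a minimal sketch; the function name and parameter names are illustrative rather than taken from the source.

```python
from fractions import Fraction

def p_disease_given_positive(prevalence, sensitivity, specificity):
    """P(D | T) via Bayes' rule with the law of total probability."""
    true_pos = sensitivity * prevalence                # P(T|D) P(D)
    false_pos = (1 - specificity) * (1 - prevalence)   # P(T|D^C) P(D^C)
    return true_pos / (true_pos + false_pos)

fred = p_disease_given_positive(Fraction(1, 100), Fraction(95, 100), Fraction(95, 100))
print(fred, float(fred))  # 19/118, about 0.16

# The same numbers as expected counts in a population of 10,000:
true_positives = 95
false_positives = 495
print(Fraction(true_positives, true_positives + false_positives))  # 19/118
```

Both routes give the same fraction, which is the point of the population-count intuition: the posterior is the share of true positives among all positives.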

**img, pg 57**


Since all probabilities are conditional on background information, we can imagine that there is always a vertical conditioning bar with the background knowledge $K$ to its right, and the unconditional probability $P(A)$ is just shorthand for $P(A|K)$.

<h4>Theorem 2.4.2 (Bayes' Rule with Extra Conditioning)</h4>

Provided that $P(A \cap E) \gt 0$ and $P(B \cap E) \gt 0$, we have:

$P(A|B,E) = \frac{ P(B|A,E) P(A|E) }{ P(B|E) }$

<h4>Theorem 2.4.3 (Law of Total Probability with Extra Conditioning)</h4>

Let $A_1, \ldots, A_n$ be a partition of $S$. Provided that $P(A_i \cap E) \gt 0$ for all $i$, we have:

$P(B|E) = \sum_{i=1}^n P(B|A_i, E) P(A_i|E)$

<h4>Example 2.4.4 (Random Coin Continued)</h4>

In example 2.3.7, we had a fair coin and a biased coin, and we picked one at random and flipped it $3$ times.

Suppose we have now seen our chosen coin land heads $3$ times. If we toss the coin a fourth time, what is the probability that it will land heads once more?

Answer:

Let $A$ be the event that the chosen coin lands heads $3$ times, and define a new event $H$ for the chosen coin landing heads on the fourth toss.

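We are interested in $P(H|A)$. The law of total probability with extra conditioning gives $P(H|A)$ as a weighted average of $P(H|F,A)$ and $P(H|F^C, A)$, where the weights are the posterior probabilities of holding the fair or the biased coin. Once we know which coin we hold, the earlier tosses carry no further information, so $P(H|F,A) = \frac{1}{2}$ and $P(H|F^C,A) = \frac{3}{4}$.

$P(H|A) = P(H|F,A) P(F|A) + P(H|F^C,A) P(F^C|A) \approx \frac{1}{2} \cdot 0.23 + \frac{3}{4} \cdot (1-0.23) \approx 0.69$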

The posterior probabilities $P(F|A)$ and $P(F^C|A)$ are from our answer to example 2.3.7.
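As a cross-check, $P(H|A)$ can be computed in two ways: by weighting with the posteriors as above, and directly from the definition as $P(\text{4 heads}) / P(\text{3 heads})$. The sketch below uses exact fractions; the variable and function names are illustrative.

```python
from fractions import Fraction

half = Fraction(1, 2)  # probability of picking either coin
p_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

def p_all_heads(n):
    """P(first n tosses are all heads), by the law of total probability."""
    return sum(half * p**n for p in p_heads.values())

# Route 1: weight each coin's heads probability by its posterior given A.
posterior = {coin: half * p**3 / p_all_heads(3) for coin, p in p_heads.items()}
via_posterior = sum(p_heads[coin] * posterior[coin] for coin in p_heads)

# Route 2: definition of conditional probability, P(4 heads) / P(3 heads).
via_definition = p_all_heads(4) / p_all_heads(3)

print(via_posterior, via_definition, float(via_posterior))  # 97/140 97/140, about 0.69
```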

<h3>2.5 Independence of Events</h3>

Events $A$ and $B$ are independent if:

$P(A \cap B) = P(A)P(B)$, or equivalently (when $P(B) \gt 0$) $P(A|B) = P(A)$

i.e., two events are independent if we can obtain the probability of their intersection by multiplying their individual probabilities, or alternatively, if learning that $B$ occurred gives us no information that would change our probability of $A$ occurring.

If $A$ and $B$ are independent, then $A$ and $B^C$ are independent, $A^C$ and $B$ are independent, and $A^C$ and $B^C$ are independent.

<h4>Definition 2.5.6 (Independence of Many Events)</h4>

For $n$ events $A_1, A_2, \ldots, A_n$ to be independent, we require every pair to satisfy $P(A_i \cap A_j) = P(A_i) P(A_j)$ for $i \neq j$, every triple to satisfy $P(A_i \cap A_j \cap A_k) = P(A_i) P(A_j) P(A_k)$ for distinct $i$, $j$, and $k$, and similarly for all quadruples, quintuples, and so on.

For infinitely many events, we say that they are independent if every finite subset of the events is independent.
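Pairwise independence alone does not satisfy this definition. A standard illustration, sketched here under the assumption of two fair, independent coin tosses (this setup is an added example, not part of the notes above): let $A$ be "first toss is heads", $B$ be "second toss is heads", and $C$ be "both tosses match".

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))  # (first toss, second toss), equally likely

def prob(event):
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def A(o):
    return o[0] == "H"      # first toss is heads

def B(o):
    return o[1] == "H"      # second toss is heads

def C(o):
    return o[0] == o[1]     # both tosses show the same face

# Check the product rule for every pair, then for the triple.
pairwise_ok = all(
    prob(lambda o: X(o) and Y(o)) == prob(X) * prob(Y)
    for X, Y in [(A, B), (A, C), (B, C)]
)
triple_ok = prob(lambda o: A(o) and B(o) and C(o)) == prob(A) * prob(B) * prob(C)
print(pairwise_ok, triple_ok)  # True False
```

Every pair passes the product rule, but the triple fails ($\frac{1}{4} \neq \frac{1}{8}$), so the three events are pairwise independent without being independent.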

It is easy to confuse independence with conditional independence.
- Two events can be conditionally independent given $E$, but not conditionally independent given $E^C$
- Two events can be conditionally independent given $E$, but not independent 
- Two events can be independent, but not conditionally independent given $E$

<h4>Example 2.5.9 (Conditional Independence Given $E$ vs. Given $E^C$)</h4>

Suppose there are two types of classes: good classes and bad classes. In good classes, hard work likely leads to a grade of $A$, whereas in bad classes, the professor randomly assigns grades. Let $G$ be the event that a class is good, $W$ be the event that you work hard, and $A$ be the event that you receive an $A$. Then $W$ and $A$ are conditionally independent given $G^C$, but not conditionally independent given $G$.

<h4>Example 2.5.10 (Conditional Independence Doesn't Imply Independence)</h4>

[Returning to the scenario for 2.3.7]: 

<i>"You have one fair coin and one biased coin which lands heads with probability $\frac{3}{4}$. You pick one of the coins at random and flip it $3$ times. It lands heads all $3$ times. What is the probability that the coin you picked is the fair one? Let $F$ be the event that the fair coin is picked, and $A$ be the event that $3$ heads are flipped."</i>

Suppose we have chosen either a fair coin or a biased coin with probability $3/4$ of heads, but we do not know which one we have chosen. We flip the coin a number of times. Conditional on choosing the fair coin, the coin tosses are independent, with each toss having probability $1/2$ of heads. Conditional on choosing the biased coin, the tosses are independent, each with probability $3/4$ of heads.

The coin tosses are not unconditionally independent, because if we don't know which coin we've chosen, observing the sequence of tosses gives us information about whether we have the fair coin.

To state this formally, let $F$ be the event that we've chosen the fair coin, and let $A_1$ and $A_2$ be the events that the first and second coin tosses land heads. Conditional on $F$, $A_1$ and $A_2$ are independent, but $A_1$ and $A_2$ are not unconditionally independent because $A_1$ provides information about $A_2$.
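A short calculation makes the dependence concrete. Assuming, as in Example 2.3.7, that each coin is picked with probability $\frac{1}{2}$, the sketch below compares $P(A_1 \cap A_2)$ with $P(A_1)P(A_2)$; the variable names are illustrative.

```python
from fractions import Fraction

half = Fraction(1, 2)  # probability of picking either coin
p_heads = {"fair": Fraction(1, 2), "biased": Fraction(3, 4)}

# Law of total probability over which coin was picked.
p_a1 = sum(half * p for p in p_heads.values())          # P(A_1) = P(A_2)
p_a1_a2 = sum(half * p * p for p in p_heads.values())   # tosses independent given the coin
p_a2_given_a1 = p_a1_a2 / p_a1

print(p_a1, p_a1 * p_a1, p_a1_a2)  # 5/8 25/64 13/32
print(p_a2_given_a1)               # 13/20, larger than 5/8
```

Since $\frac{13}{32} \neq \frac{25}{64}$, the tosses are not unconditionally independent: seeing heads on the first toss raises the probability of heads on the second from $\frac{5}{8}$ to $\frac{13}{20}$.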

<h4>Example 2.5.11 (Independence Doesn't Imply Conditional Independence)</h4>

My friends Alice and Bob are the only two people who ever call me on the phone. Each day, they independently decide whether to call. Let $A$ be the event that Alice calls me next Friday and $B$ be the event that Bob calls next Friday. Assume $A$ and $B$ are unconditionally independent with $P(A) \gt 0$ and $P(B) \gt 0$.

However, given that I receive exactly one call next Friday, $A$ and $B$ are no longer independent: the call is from Alice if and only if it is not from Bob. That is, letting $C$ be the event that I receive exactly one call next Friday, $P(B|C) \gt 0$ while $P(B|A,C) = 0$, so $A$ and $B$ are not conditionally independent given $C$.

<h4>Example 2.5.12 (Baby Crying)</h4>

**can be found in notes?**

<h3>2.6 Coherency of Bayes' Rule</h3>

Bayes' rule is coherent, meaning that if we receive multiple pieces of information and wish to update our probabilities to incorporate all the information, it does not matter whether we update sequentially or all at once. If we're conducting a weeklong experiment that yields data at the end of each day, we could use Bayes' rule every day to update our probabilities based on the data from that day, or using the entire week's worth of data at the end of the week.

<h4>Example 2.6.1 (Testing for a Rare Disease Continued)</h4>

**in notes?**

<h1>Exercises</h1>

<h3>Conditioning on Evidence</h3>

<h4>Exercise 1</h4>

A spam filter is designed by looking at commonly occurring phrases in spam. Suppose that $80\%$ of email is spam. In $10\%$ of the spam emails, the phrase "free money" is used, whereas this phrase is only used in $1\%$ of non-spam emails. A new email arrives which mentions "free money". What is the probability that it is spam?

<i>Answer:</i>

Let $S$ be the event that an email is spam and $F$ be the event that an email has the "free money" phrase. By Bayes' rule:

$P(S|F) = \frac{ P(F|S)P(S) }{P(F)}$
$P(S|F) = \frac{ 0.1 \cdot 0.8 }{ 0.1 \cdot 0.8 + 0.01 \cdot 0.2 }$
$P(S|F) = \frac{80/1000}{82/1000} = \frac{80}{82} \approx 0.9756$

<h4>Exercise 2</h4>

A woman is pregnant with twin boys. Twins may be either identical or fraternal. In general, $\frac{1}{3}$ of twins born are identical. Identical twins are always of the same sex, while fraternal twins may or may not be. What is the probability that the woman's twins are identical?

<i>Answer:</i>

$P(\text{identical} | BB) = \frac{ P(BB | \text{identical}) P(\text{identical}) }{P(BB)}$

$P(\text{identical} | BB) = \frac{ \frac{1}{2} \cdot \frac{1}{3} }{ \frac{1}{2} \cdot \frac{1}{3} + \frac{1}{4} \cdot \frac{2}{3} } = \frac{1}{2}$

<h4>Exercise 22</h4>

A bag contains one marble which is either green or blue, with equal probabilities. A green marble is put in the bag (so there are $2$ marbles now), and then a random marble is taken out. The marble taken out is green. What is the probability that the remaining marble is also green?

<i>Answer:</i>

Let $A$ be the event that the initial marble is green, $B$ be the event that the removed marble is green, and $C$ be the event that the remaining marble is green. We need to find $P(C|B)$, and one natural way is to condition on whether the initial marble is green.

$P(C|B) = P(C|B,A) P(A|B) + P(C|B,A^C) P(A^C|B)$
$P(C|B) = 1 \cdot P(A|B) + 0 \cdot P(A^C|B) = P(A|B)$

To find $P(A|B)$, use Bayes' rule.

$P(A|B) = \frac{ P(B|A) P(A) }{P(B)}$

$P(A|B) = \frac{ P(B|A) P(A) }{ P(B|A) P(A) + P(B|A^C) P(A^C) }$

$P(A|B) = \frac{ 1 \cdot \frac{1}{2} }{ 1 \cdot \frac{1}{2} + \frac{1}{2} \cdot \frac{1}{2} } = \frac{1/2}{3/4} = \frac{2}{3}$

So $P(C|B) = \frac{2}{3}$
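The answer can also be confirmed by listing the equally likely outcomes. In this sketch, an outcome is the color of the initial marble together with which of the two marbles is drawn; the helper `removed_and_remaining` is a hypothetical name introduced for the example.

```python
from fractions import Fraction
from itertools import product

# Outcome = (color of the initial marble, which marble is drawn); 4 equally likely cases.
outcomes = list(product(["green", "blue"], ["initial", "added"]))

def removed_and_remaining(initial, drawn):
    """Return (color removed, color left in the bag) for one outcome."""
    added = "green"  # the marble put into the bag is always green
    if drawn == "initial":
        return initial, added
    return added, initial

# Condition on B: the removed marble is green.
kept = [o for o in outcomes if removed_and_remaining(*o)[0] == "green"]
favorable = [o for o in kept if removed_and_remaining(*o)[1] == "green"]
p_c_given_b = Fraction(len(favorable), len(kept))
print(p_c_given_b)  # 2/3
```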

<h4>Exercise 23</h4>

Let $G$ be the event that a certain individual is guilty of a certain robbery. In gathering evidence, it is learned that an event $E_1$ occurred, and a little later it is also learned that another event $E_2$ occurred. Is it possible that individually, these pieces of evidence increase the chance of guilt (so $P(G|E_1) \gt P(G)$ and $P(G|E_2) \gt P(G)$), but together they decrease the chance of guilt (so $P(G|E_1, E_2) \lt P(G)$)?

<i>Answer:</i>

Yes, it is possible to have two events which separately provide evidence in favor of $G$, yet which together preclude $G$.

For example, suppose that the crime was committed between 1pm and 3pm on a certain day. Let $E_1$ be the event that the suspect was at a nearby coffee shop from 1pm to 2pm that day, and $E_2$ be the event that the suspect was at the same coffee shop from 2pm to 3pm that day.

Then $P(G|E_1) \gt P(G)$ and $P(G|E_2) \gt P(G)$, yet $P(G|E_1 \cap E_2) \lt P(G)$, as being in the coffee shop from 1pm to 3pm gives an alibi for the whole time.

<h4>Exercise 25</h4>

A crime is committed by one of two suspects, A and B. Initially, there is equal evidence against both of them. In further investigation, it is found that the guilty party had a blood type found in $10\%$ of the population. Suspect $A$ does match this blood type, whereas the blood type of suspect $B$ is unknown.

<b>Part A:</b>

Given this new information, what is the probability that $A$ is the guilty party?

Answer:

Let $M$ be the event that $A$'s blood type matches the guilty party's. Let $A$ mean $A$ is guilty, and $B$ mean $B$ is guilty. By Bayes' rule:

$P(A|M) = \frac{ P(M|A)P(A) }{ P(M|A)P(A) + P(M|B)P(B) }$

$P(A|M) = \frac{ \frac{1}{2} }{ \frac{1}{2} + \left( \frac{1}{10} \right) \left( \frac{1}{2} \right) } = \frac{10}{11}$

We have $P(M|B) = \frac{1}{10}$ since, given that $B$ is guilty, the probability that $A$'s blood type matches the guilty party's is the same probability as for the general population.


<b>Part B:</b>

What is the probability that $B$'s blood type matches that found at the scene?

Let $C$ be the event that $B$'s blood type matches, and condition on whether $B$ is guilty. This gives:

$P(C|M) = P(C|M,A) P(A|M) + P(C|M,B) P(B|M)$

$P(C|M) = \frac{1}{10} \cdot \frac{10}{11} + \frac{1}{11} = \frac{2}{11}$
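Both parts of this exercise can be reproduced with exact fractions. The function below is a sketch: its name and parameters are illustrative, and it encodes the same modeling assumptions as the worked answer ($P(M|A) = 1$, $P(M|B)$ equal to the population frequency, and $P(C|M,B) = 1$).

```python
from fractions import Fraction

def blood_type_posteriors(p_match, prior_a=Fraction(1, 2)):
    """Return (P(A guilty | M), P(B's blood type matches | M))."""
    prior_b = 1 - prior_a
    # P(M|A) = 1: if A is guilty, A certainly matches the guilty party.
    # P(M|B) = p_match: if B is guilty, A matches only by coincidence.
    p_m = 1 * prior_a + p_match * prior_b
    p_a_given_m = prior_a / p_m
    p_b_given_m = 1 - p_a_given_m
    # P(C|M,A) = p_match (B is an ordinary member of the population),
    # P(C|M,B) = 1 (B is the guilty party, so B matches).
    p_c_given_m = p_match * p_a_given_m + 1 * p_b_given_m
    return p_a_given_m, p_c_given_m

part_a, part_b = blood_type_posteriors(Fraction(1, 10))
print(part_a, part_b)  # 10/11 2/11
```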

<h4>Exercise 26</h4>

Bob installs two anti-spam programs. An email arrives, which is either legitimate (event $L$) or spam (event $L^C$), and which program $j$ marks as legitimate (event $M_j$) or spam (event $M_j^C$). Assume that $10\%$ of Bob's email is legitimate and that the two programs are each $90\%$ accurate, i.e. $P(M_j|L) = P(M_j^C|L^C) = 0.9$. Assume that given whether an email is spam, the two programs' outputs are conditionally independent.

<b>Part A:</b>

Find the probability that the email is legitimate, given that the first program marks it as legitimate.

<i>Answer:</i>

$P(L|M_1) = \frac{ P(M_1|L) P(L) }{P(M_1)}$

$P(L|M_1) = \frac{ \frac{9}{10} \cdot \frac{1}{10} }{ \frac{9}{10} \cdot \frac{1}{10} + \frac{1}{10} \cdot \frac{9}{10} } = \frac{1}{2}$


<b>Part B:</b>

Find the probability that the email is legitimate, given that both programs mark it as legitimate.

$P(L|M_1, M_2) = \frac{ P(M_1,M_2|L) P(L) }{ P(M_1,M_2) }$

$P(L|M_1, M_2) = \frac{ \left( \frac{9}{10} \right)^2 \cdot \frac{1}{10} }{ \left( \frac{9}{10} \right)^2 \cdot \frac{1}{10} + \left( \frac{1}{10} \right)^2 \cdot \frac{9}{10} } = \frac{9}{10}$
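Because the programs are conditionally independent given the email's type, the likelihoods multiply, and the calculation extends to any number of agreeing programs. The sketch below uses a hypothetical function `p_legit_given_marks`; the case of more than two programs is an extrapolation beyond the exercise, under the same accuracy and independence assumptions.

```python
from fractions import Fraction

def p_legit_given_marks(n_programs, p_legit=Fraction(1, 10), accuracy=Fraction(9, 10)):
    """P(L | all n_programs mark the email as legitimate)."""
    like_legit = accuracy ** n_programs        # P(M_1, ..., M_n | L)
    like_spam = (1 - accuracy) ** n_programs   # P(M_1, ..., M_n | L^C)
    numerator = like_legit * p_legit
    return numerator / (numerator + like_spam * (1 - p_legit))

print(p_legit_given_marks(1))  # 1/2   (Part A)
print(p_legit_given_marks(2))  # 9/10  (Part B)
```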

<h3>Independence and Conditional Independence</h3>

<h4>Exercise 30</h4>

A family has 3 children, $A$, $B$, and $C$.

<b>Part A:</b>

Discuss whether the event "A is older than B" is independent of the event "A is older than C".

<i>Answer:</i>

They are not independent. Knowing that $A$ is older than $B$ makes it more likely that $A$ is older than $C$: if $A$ is older than $B$, the only way that $A$ can be younger than $C$ is if the birth order is $CAB$, whereas the birth orders $ABC$ and $ACB$ are both compatible with $A$ being older than $C$.

To make this more intuitive, think of an extreme case where there are $100$ children instead of $3$, $A_1, \ldots, A_{100}$. Given that $A_1$ is older than all of $A_2, A_3, \ldots, A_{99}$, it's clear that $A_1$ is very old (relatively), whereas there isn't evidence about where $A_{100}$ fits into the birth order.

<b>Part B:</b>

Find the probability that $A$ is older than $B$, given that $A$ is older than $C$.

<i>Answer:</i>

Write $x \gt y$ to mean that $x$ is older than $y$. Then:

$P(A \gt B | A \gt C) = \frac{ P(A \gt B, A \gt C) }{ P(A \gt C) } = \frac{1/3}{1/2} = \frac{2}{3}$

since $P(A \gt B, A \gt C) = P(\text{A is oldest}) = \frac{1}{3}$. Unconditionally, each of the three children is equally likely to be the oldest.
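With only six birth orders, both parts can be verified by enumeration. This sketch assumes all six orders are equally likely; `older` and `prob` are illustrative helper names.

```python
from fractions import Fraction
from itertools import permutations

orders = list(permutations("ABC"))  # each tuple lists the children oldest first

def older(x, y):
    """Event: child x is older than child y."""
    return lambda order: order.index(x) < order.index(y)

def prob(event, given=lambda order: True):
    kept = [o for o in orders if given(o)]
    return Fraction(sum(1 for o in kept if event(o)), len(kept))

p_ab = prob(older("A", "B"))
p_ab_given_ac = prob(older("A", "B"), given=older("A", "C"))
print(p_ab, p_ab_given_ac)  # 1/2 2/3
```

The conditional probability $\frac{2}{3}$ differs from the unconditional $\frac{1}{2}$, which is exactly the dependence described in Part A.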

<h4>Exercise 31</h4>

Is it possible that an event is independent of itself?

Answer:

If $A$ is independent of itself, $P(A) = P(A \cap A) = P(A)^2$, therefore this is only possible in the extreme cases where $P(A)=0$ or $P(A)=1$.

<h4>Exercise 35</h4>

You are going to play two games of chess with an opponent whom you have never played against before. The opponent is equally likely to be a beginner, intermediate, or master, and depending on which, your chances of winning a game are $90\%$, $50\%$, and $30\%$ respectively.

<b>Part A:</b>

What is your probability of winning the first game?

<i>Answer:</i>

$P(W_1) = (0.9 + 0.5 + 0.3) / 3 = 17/30$


<b>Part B:</b>

You won the first game. What is the probability that you will also win the second?

<i>Answer:</i>

$P(W_2|W_1) = \frac{ P(W_2,W_1) }{P(W_1)}$

The denominator is known from part $A$ while the numerator can be found by conditioning on the skill level of the opponent.

$P(W_1, W_2) = \frac{1}{3} P(W_1, W_2 | \text{beginner}) + \frac{1}{3} P(W_1, W_2 | \text{intermediate}) + \frac{1}{3} P(W_1, W_2 | \text{master})$

Since $W_1$ and $W_2$ are conditionally independent given the skill of the opponent:

$P(W_1, W_2) = (0.9^2 + 0.5^2 + 0.3^2) / 3 = 23/60$

$P(W_2 | W_1) = \frac{23/60}{17/30} = \frac{23}{34}$
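Parts A and B can be checked with exact fractions. The sketch below conditions on the opponent's skill level, using conditional independence of the two games given that level; the dictionary and variable names are illustrative.

```python
from fractions import Fraction

# Probability of winning a single game against each type of opponent.
win_prob = {
    "beginner": Fraction(9, 10),
    "intermediate": Fraction(1, 2),
    "master": Fraction(3, 10),
}
prior = Fraction(1, 3)  # each skill level is equally likely

# Law of total probability over the opponent's skill level.
p_w1 = sum(prior * p for p in win_prob.values())
# Given the skill level, the two games are independent, so P(W1, W2 | level) = p^2.
p_w1_w2 = sum(prior * p**2 for p in win_prob.values())
p_w2_given_w1 = p_w1_w2 / p_w1

print(p_w1, p_w1_w2, p_w2_given_w1)  # 17/30 23/60 23/34
```

Note that $\frac{23}{34} \approx 0.68$ exceeds $\frac{17}{30} \approx 0.57$: winning the first game makes a weaker opponent more likely.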


<b>Part C:</b>

Explain the distinction between assuming that the outcomes of the games are independent and assuming that they are conditionally independent given the opponent's skill level. Which of these assumptions seems more reasonable, and why?

<i>Answer:</i>

Independence means that knowing one game's outcome gives no information about the other game's outcome, while conditional independence is the same statement where all probabilities are conditioned on the opponent's skill level.

Conditional independence given the opponent's skill level is the more reasonable assumption here. Winning the first game gives information about the opponent's skill level, which in turn gives information about the result of the second game, so the outcomes are not unconditionally independent.

<h3>First-Step Analysis and Gambler's Ruin Exercises</h3>

<h4>Exercise 42</h4>

A fair die is rolled repeatedly, and a running total is kept. Let $p_n$ be the probability that the running total is ever exactly $n$. Assume the die will always be rolled enough times so that the running total will eventually exceed $n$, but may or may not ever equal $n$.

<b>Part A:</b>

Write a recursive equation for $p_n$ (relating $p_n$ to earlier terms $p_k$). Your equation should be true for all positive integers $n$, so give a definition of $p_0$ and of $p_k$ for $k \lt 0$ that makes the recursive equation hold for small values of $n$.

<i>Answer:</i>

We will find something to condition on to reduce the case of interest to earlier, simpler cases. This is first-step analysis.

Let $p_n$ be the probability that the running total is ever exactly $n$. If, for example, the first throw is a $3$, then the probability of reaching $n$ exactly is $p_{n-3}$ since starting from that point, we need to get a total of $n-3$. So,

$p_n = \frac{1}{6}(p_{n-1} + p_{n-2} + \cdots + p_{n-6})$

where we define $p_0 = 1$ and $p_k = 0$ for $k \lt 0$.


<b>Part B:</b>

Find $p_7$.

<i>Answer:</i>

Using the recursive equation in part $A$, we have:

$p_1 = \frac{1}{6}$

$p_2 = \frac{1}{6} \left( 1 + \frac{1}{6} \right)$

$p_3 = \frac{1}{6} \left( 1 + \frac{1}{6} \right)^2$

$\ldots$

$p_6 = \frac{1}{6} \left( 1 + \frac{1}{6} \right)^5$

Hence:

$p_7 = \frac{1}{6} (p_1 + p_2 + \ldots + p_6)$

$p_7 = \frac{1}{6} \left( \left( 1 + \frac{1}{6} \right)^6 - 1 \right) \approx 0.2536$
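The recursion is easy to evaluate exactly. The sketch below implements first-step analysis with the boundary conditions $p_0 = 1$ and $p_k = 0$ for $k \lt 0$, using memoization so each value is computed once; the function name `p` mirrors the notation $p_n$ and is otherwise an arbitrary choice.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def p(n):
    """Probability that the running total of fair die rolls ever equals n."""
    if n < 0:
        return Fraction(0)   # a negative total can never be hit
    if n == 0:
        return Fraction(1)   # the total starts at 0
    # Condition on the first roll k = 1..6, each with probability 1/6.
    return sum(p(n - k) for k in range(1, 7)) / 6

closed_form = Fraction(1, 6) * (Fraction(7, 6) ** 6 - 1)
print(p(7), float(p(7)))  # 70993/279936, about 0.2536
print(p(7) == closed_form)  # True
```

For $n \le 6$ the function also reproduces the pattern $p_n = \frac{1}{6}\left(1 + \frac{1}{6}\right)^{n-1}$ used in the answer above.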