# Debunking Probability in Court

As a respectable data analyst, you have been hired by a Statistical Law Firm to investigate cases where probability has gone wrong. In this mini-project, you will examine probability in real court cases.

### The Sally Clark Trial

#### Evidence

Sally Clark was accused of murdering two of her babies. Both died of SIDS (Sudden Infant Death Syndrome), a rare diagnosis given when no other cause of death can be found.

In the trial, the prosecutor stated that the probability of SIDS is 1/8543. Since this happened to two of Sally Clark's children, the multiplication rule of probability was used to put the odds of both children dying from SIDS at 1/8543 * 1/8543, which is approximately 1 in 73 million.

##### 1. Given the prosecutor's argument, what is the probability that Sally Clark is guilty?

In [2]:
1/8543*1/8543

1.3701849320790423e-08

##### 2. What are the odds that Sally Clark is guilty? 1 in how many?

In [5]:
8543**2

72982849

This probability, approximately 1 in 73 million, was the primary piece of "evidence" used to convict Sally Clark of murder and sentence her to life imprisonment.

#### Refutation

Now it's your turn to present a counterargument.

##### 3. Why are the odds of 1 in 73 million untrue?

Multiplying 1/8543 by itself assumes that the two deaths are independent events. According to Ray Hill, a math professor, they are not: the probability of having a second child die from SIDS, given that a first child died from SIDS, may be as high as 1 in 60.

##### 4. Use the new probability to determine a more accurate probability of both babies dying from SIDS.

In [3]:
1/8543*1/60

1.9509149791252097e-06

##### 5. What are the new odds? 1 in how many?

In [4]:
8543*60

512580
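To see how much the independence assumption matters, the two estimates can be compared directly. The following is a minimal sketch that uses only the figures quoted above; the variable names are chosen here for illustration.

```python
# Figures taken from the text above.
p_sids = 1 / 8543               # probability of one SIDS death, as stated at trial
p_second_given_first = 1 / 60   # Ray Hill's "as high as" figure for a second death

p_independent = p_sids * p_sids              # treats the two deaths as unrelated
p_dependent = p_sids * p_second_given_first  # uses the conditional probability

# How many times larger is the dependent estimate?
ratio = p_dependent / p_independent
print(round(ratio, 1))  # 142.4
```

Under these figures, the independence assumption understates the probability of two SIDS deaths by a factor of roughly 142.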

As you can see, the probability is still very low. Well, there's something else that's wrong with the argument. It has to do with conditional probability, and it's referred to as the prosecutor's fallacy. The argument goes something like this.

1. Let's assume that Sally Clark is innocent.
2. If she's innocent, the odds of both children dying from SIDS are approximately 1 in 500 thousand.
3. Since her innocence would require such an improbable event, she must be guilty.

Let's rephrase this argument using probability. 

P(A) = Probability that Sally Clark is innocent. <br>
P(B) = Probability of both children dying suddenly.


##### 6. True or False? P(A|B) = P(B|A).

False!

This is most definitely false! If you've studied probability, the answer may seem obvious, but if you have not, it's not so clear.

In the Sally Clark case, and in many others, the argument is, "Given that Sally Clark is innocent, the odds of both babies dying from SIDS are 1 in 500 thousand." The fallacy is thinking this is the same as, "Given that both babies died from SIDS, the probability that Sally Clark is innocent is 1 in 500 thousand." A quick study of conditional probability reveals that these are not the same.
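A small counting example can make the difference concrete. The sketch below uses invented counts for 10,000 people and two generic events labeled A and B; the numbers are purely hypothetical and have nothing to do with the case.

```python
# HYPOTHETICAL counts for 10,000 people, invented purely for illustration.
n_a_and_b = 10          # A happened and B happened
n_a_not_b = 990         # A happened, B did not
n_not_a_and_b = 90      # A did not happen, B did
n_not_a_not_b = 8910    # neither happened

# P(B|A): among everyone with A, what fraction also has B?
p_b_given_a = n_a_and_b / (n_a_and_b + n_a_not_b)

# P(A|B): among everyone with B, what fraction also has A?
p_a_given_b = n_a_and_b / (n_a_and_b + n_not_a_and_b)

print(p_b_given_a, p_a_given_b)  # 0.01 0.1
```

Both probabilities share the same numerator, but they divide by different totals, so they differ whenever A and B are not equally common.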

Bayes' Theorem provides the correct interpretation.

P(A) = Probability that Sally Clark is innocent. <br>
P(B) = Probability of both children dying suddenly and unexpectedly.

P(A|B) = (P(B|A) * P(A))/P(B)

Let's go over the right side one term at a time.

##### 7. Given that P(A) = Probability that Sally Clark is innocent, and P(B) = Probability of both children dying suddenly, what is P(B|A)?

In [6]:
1/8543*1/60

1.9509149791252097e-06

Now let's consider P(A), the probability that Sally Clark is innocent. This is difficult to determine by itself. Isn't that the whole point? Not exactly. We are trying to determine her innocence based on the condition that her children died suddenly and unexpectedly. Ignoring that condition, we can treat her as any other person. What's the probability that someone is innocent of murder? In England, where Sally Clark is from, there are 1.22 homicides per 100,000 people.

##### 8. Given 1.22 homicides per 100,000 people, what's the probability that a random person is a murderer?

In [9]:
1.22/100000

1.22e-05

##### 9. Given no other information, what's the probability that Sally Clark is innocent?

In [10]:
1-1.22/100000

0.9999878

So far we have the following information:

P(B|A) = 1.9509149791252097e-06 <br>
P(A) = 0.9999878

All we need is P(B), the probability of both babies dying suddenly and unexpectedly. There are two ways for this to happen: either Sally Clark is innocent, which contributes P(B|A) weighted by P(A), or she is not, which contributes P(B|~A) weighted by P(~A). This last possibility is tricky to compute. Given that Sally Clark is not innocent, what's the probability of both babies dying suddenly and unexpectedly?
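The pieces gathered so far can be combined in a short sketch. It assumes P(B) is the sum of the innocent and not-innocent contributions, each weighted by its prior probability. The function name `posterior_innocent` and the trial values for P(B|~A) are hypothetical placeholders introduced here, not estimates from the case.

```python
def posterior_innocent(p_b_given_a, p_a, p_b_given_not_a):
    """Return P(A|B) from Bayes' theorem, with P(B) built from both cases."""
    p_not_a = 1 - p_a
    # P(B) = P(B|A) * P(A) + P(B|~A) * P(~A)
    p_b = p_b_given_a * p_a + p_b_given_not_a * p_not_a
    return p_b_given_a * p_a / p_b


# Values computed earlier in this notebook.
p_b_given_a = (1 / 8543) * (1 / 60)
p_a = 1 - 1.22 / 100000

# HYPOTHETICAL placeholders for P(B|~A); the true value is unknown here.
for p_b_given_not_a in (0.001, 0.1, 1.0):
    result = posterior_innocent(p_b_given_a, p_a, p_b_given_not_a)
    print(f"P(B|~A) = {p_b_given_not_a}: P(A|B) = {result:.4f}")
```

Passing P(B|~A) as a parameter keeps the unknown quantity explicit, so any value plugged in can be argued for separately rather than buried in the arithmetic.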