## The Prosecutor's Fallacy problem


This is a classic problem in probability theory. David Morin's book <i>Probability: For the Enthusiastic Beginner</i> gives a clear overview and solution, although it is always worth trying to solve the problem yourself before reading the solution.

This notebook takes that approach: it simulates the scenario many times and calculates the fraction of rounds in which the person on trial is actually guilty.

#### Problem: 
Detectives in a city, say San Diego (whose population we will assume to be one million), are working on a crime and have put together a description of the perpetrator, based on things such as height, a tattoo, a limp, an earring, etc.  Let's assume that only one person in 10,000 fits the description.  On a routine patrol the next day, police officers see a person fitting the description.  This person is arrested and brought to trial based solely on the fact that he fits the description.

During the trial, the prosecutor tells the jury that since only one person in 10,000 fits the description, it is highly unlikely (far beyond a reasonable doubt) that an innocent person fits the description; it is therefore highly unlikely that the defendant is innocent.  

If you were a member of the jury, would you cast a guilty or not-guilty vote?  What is your level of confidence in your decision?

In [None]:
import numpy as np

In [None]:
# set the number of iterations
n_rounds = int(1e5)

# set the number of people in the city (an integer, since it is a head count)
n_people = int(1e6)

# probability that a random person fits the description
odds_random_person_fits_desc = 1/1e4

# the number of people who fit the description of the crime
# (rounded to an integer so it can be used as a sample size)
n_fit_description = round(n_people*odds_random_person_fits_desc)

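We assume that, without any other information, every person in the city is equally likely to be the perpetrator, so the probability that a given person is guilty is 1/n_people. The simulation below also assumes that the perpetrator is one of the people who fit the description.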
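Before simulating, the same assumptions can be worked through analytically with Bayes' theorem. The cell below is a minimal sketch, not part of the original notebook: the names `population`, `p_fits`, `p_guilty_prior`, `p_fits_given_guilty`, and `p_guilty_given_fits` are hypothetical, and it assumes the perpetrator fits the description with certainty.

```python
from fractions import Fraction

# Inputs taken from the problem statement (variable names are illustrative)
population = 1_000_000            # assumed city population
p_fits = Fraction(1, 10_000)      # chance that a random person fits the description

# Prior: every resident is equally likely to be the perpetrator
p_guilty_prior = Fraction(1, population)

# Assumption: the perpetrator always fits the description
p_fits_given_guilty = Fraction(1)

# Bayes' theorem: P(guilty | fits) = P(fits | guilty) * P(guilty) / P(fits)
p_guilty_given_fits = p_fits_given_guilty * p_guilty_prior / p_fits

print(f'P(guilty | fits description)   = {p_guilty_given_fits}')
print(f'P(innocent | fits description) = {1 - p_guilty_given_fits}')
```

Exact fractions are used here so that the result is not blurred by floating-point rounding. The prosecutor quotes P(fits | innocent), which is small; the jury needs P(innocent | fits), which this calculation shows to be a different quantity.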

In [None]:
results = []

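for k in range(n_rounds):
    # IDs of all people fitting the description; the upper bound of randint is
    # exclusive, so add 1 to include the last person. IDs are drawn with
    # replacement, which is harmless here because only indices are compared.
    guilty_looking = np.random.randint(1, int(n_people) + 1, int(n_fit_description))

    # randomly choose the guilty person from all who fit the description
    guilty_person_index = np.random.randint(0, len(guilty_looking))

    # randomly choose the person police see on the street, who also fits the description
    fits_desc_person_index = np.random.randint(0, len(guilty_looking))

    # the person on trial is guilty only if both draws picked the same person
    if guilty_person_index == fits_desc_person_index:
        results.append(1)
    else:
        results.append(0)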
        
p_guilty = sum(results)/len(results)

In [None]:
print(f'Total rounds: \t\t\t\t\t\t\t\t{n_rounds}')
print(f'Probability the person on trial is guilty: \t\t\t\t{p_guilty}')
print(f'Said differently, probability the person on trial is innocent: \t\t{1-p_guilty}')
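
As a cross-check on the loop above, the same experiment can be run in a vectorized form and compared with the value 1/100 implied by the problem's numbers. This is a rough sketch under the same assumptions (the perpetrator is one of the 100 people who fit the description, and the arrested person is drawn uniformly from that group); the function `simulate_p_guilty` and its parameters `n_fit` and `seed` are hypothetical names, not part of the original notebook.

```python
import numpy as np

def simulate_p_guilty(n_rounds, n_fit, seed=0):
    """Estimate P(arrested person is guilty) when n_fit people fit the description."""
    rng = np.random.default_rng(seed)
    # index of the true perpetrator among those who fit, one draw per round
    guilty = rng.integers(0, n_fit, size=n_rounds)
    # index of the person the police happen to arrest, one draw per round
    arrested = rng.integers(0, n_fit, size=n_rounds)
    # fraction of rounds in which the arrested person is the perpetrator
    return float(np.mean(guilty == arrested))

rounds = 100_000
estimate = simulate_p_guilty(n_rounds=rounds, n_fit=100)

# value implied by the assumptions, and the binomial standard error of the estimate
expected = 1 / 100
std_err = (expected * (1 - expected) / rounds) ** 0.5

print(f'Simulated estimate: {estimate:.4f}')
print(f'Expected value:     {expected:.4f} (standard error about {std_err:.4f})')
```

Drawing all rounds at once avoids the Python-level loop, and fixing the seed makes the estimate reproducible from run to run.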