## The Droid Problem

This notebook presents the Droid Problem and walks through its solution using a `BayesTable`.

In [12]:
# Configure Jupyter so figures appear in the notebook
%matplotlib inline

# Configure Jupyter to display the assigned value after an assignment
%config InteractiveShell.ast_node_interactivity='last_expr_or_assign'

import numpy as np
import pandas as pd

## Problem description

You have been tasked with finding two fugitive droids that your employer believes are hiding on a planet, and have been told that one is a C-series protocol droid and the other is an R-series Astromech. Furthermore, you know the following information:
- The total number of droids on the planet is approximately 300,000
- R-series droids make up approximately 1% of the galaxy's droid population, while C-series droids make up approximately 0.1%

You come across a vehicle carrying, as you were told to expect, one R-series droid and one C-series droid, along with an old man and a child. 

**Are these the droids you are looking for?**

In [34]:
population = 300000
R_density = 0.01
C_density = 0.001;

### The BayesTable class

Here's the class that represents a Bayesian table. More information is available in the `bayes_table.ipynb` example.

In [14]:
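class BayesTable(pd.DataFrame):
    """A DataFrame with one row per hypothesis and one column per step of a Bayesian update."""

    def __init__(self, hypo, prior=1):
        columns = ['hypo', 'prior', 'likelihood', 'unnorm', 'posterior']
        super().__init__(columns=columns)
        self.hypo = hypo
        self.prior = prior

    def mult(self):
        """Compute the unnormalized posteriors: prior times likelihood."""
        self.unnorm = self.prior * self.likelihood

    def norm(self):
        """Normalize the posteriors and return the normalizing constant."""
        nc = np.sum(self.unnorm)
        self.posterior = self.unnorm / nc
        return nc

    def update(self):
        """Run mult() and then norm(); return the normalizing constant."""
        self.mult()
        return self.norm()

    def reset(self):
        """Return a new table whose priors are this table's posteriors."""
        return BayesTable(self.hypo, self.posterior)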
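
The mechanics of the class can also be sketched with a plain `DataFrame`. The two hypotheses `A` and `B` and their numbers below are hypothetical and are not part of the Droid Problem; they only illustrate what `mult()` and `norm()` compute.

```python
import pandas as pd

# Hypothetical two-hypothesis example (illustrative values only)
df = pd.DataFrame({
    'hypo': ['A', 'B'],
    'prior': [0.5, 0.5],
    'likelihood': [0.75, 0.5],
})

# Equivalent of mult(): unnormalized posterior is prior times likelihood
df['unnorm'] = df['prior'] * df['likelihood']

# Equivalent of norm(): divide by the total probability of the data
nc = df['unnorm'].sum()
df['posterior'] = df['unnorm'] / nc

print(df)
print('normalizing constant:', nc)
```

In this sketch the posteriors come out to 0.6 and 0.4, and the normalizing constant is 0.625.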

## Priors

To compute an appropriate prior belief that a randomly selected vehicle contains the droids you are looking for, we need to formalize the problem description slightly more. 

### Option 1: Independent droids
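If we interpret "these are the droids you are looking for" to mean that _both_ of the droids you are looking for are currently in the vehicle, and we assume that the droids in a vehicle are selected at random from the droid population on the planet, then the two droids in a specific vehicle are equally likely to be any of the `population * (population-1) / 2` possible pairs. Exactly one of those pairs is the fugitives, so the prior probability of a specific vehicle containing both droids is: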

In [15]:
independent_prior = 2/(population * (population-1))
independent_prior

2.2222296296543212e-11
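
As a quick check on that formula, the sketch below counts the unordered pairs directly with `math.comb` and compares the result with the closed form used in the cell above. The variable names `n_pairs`, `prior_from_pairs`, and `prior_closed_form` are introduced here for illustration only.

```python
from math import comb, isclose

population = 300000  # value from the problem description

# Number of distinct unordered pairs of droids on the planet
n_pairs = comb(population, 2)

# Exactly one of those pairs is the pair of fugitives
prior_from_pairs = 1 / n_pairs

# The same quantity written as in the cell above
prior_closed_form = 2 / (population * (population - 1))

print(n_pairs)
print(prior_from_pairs, prior_closed_form)
print(isclose(prior_from_pairs, prior_closed_form))
```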

### Option 2: Fugitives stick together

If, on the other hand, we assume that the two fugitives you seek are probably in the same vehicle, then we could model the population as being composed of pre-assigned droid pairs, of which one pair is the fugitives you seek. In that case, the prior probability is:

In [16]:
dependent_prior = 1/(population/2)
dependent_prior

6.666666666666667e-06

The two priors differ by more than five orders of magnitude (a factor of about 300,000), so we will use the Law of Total Probability to compute a hybrid prior, assuming the fugitives were equally likely to stay together or to split up.

In [17]:
hybrid_prior = 0.5 * independent_prior + 0.5 * dependent_prior
hybrid_prior

3.3333444444814817e-06
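
The 50-50 split is itself an assumption, so it can help to see how the hybrid prior moves as that weight changes. The sketch below wraps the same Law of Total Probability calculation in a hypothetical helper, `mix_priors`, whose argument `p_together` is the assumed probability that the fugitives stayed together.

```python
population = 300000  # value from the problem description

independent_prior = 2 / (population * (population - 1))
dependent_prior = 1 / (population / 2)

def mix_priors(p_together):
    """Law of Total Probability over the two scenarios.

    p_together is the assumed probability that the fugitives stayed together.
    """
    return p_together * dependent_prior + (1 - p_together) * independent_prior

# Tabulate the hybrid prior for a few assumed weights
for p in [0.0, 0.25, 0.5, 0.75, 1.0]:
    print(p, mix_priors(p))
```

Because the independent prior is so much smaller, the hybrid prior is close to `p_together * dependent_prior` for any weight that is not near zero.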

## Calculation

In [18]:
table = BayesTable(['Fugitives', 'Innocents'], prior=[hybrid_prior, 1-hybrid_prior])

|   | hypo | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|---|
| 0 | Fugitives | 3e-06 |  |  |  |
| 1 | Innocents | 0.999997 |  |  |  |


The observation is that the vehicle contains one R-series and one C-series droid. If these are the fugitives, that happens with likelihood 1. If the droids are innocent, the coincidence can arise in two ways, depending on which droid is chosen first (R then C, or C then R), so its likelihood is `2 * R_density * C_density`.

In [20]:
table.likelihood = [1.0, 2*R_density*C_density]
table

|   | hypo | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|---|
| 0 | Fugitives | 3e-06 | 1.0 |  |  |
| 1 | Innocents | 0.999997 | 2e-05 |  |  |
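
The factor of 2 in the "Innocents" likelihood can be checked by enumerating ordered pairs of droid types. This is a sketch under two assumptions that go beyond the notebook: the two droids are drawn independently (reasonable for a large population), and every droid that is neither R-series nor C-series is lumped into a hypothetical `other` category.

```python
from itertools import product

# Densities from the problem description; 'other' is everything else (assumption)
density = {'R': 0.01, 'C': 0.001}
density['other'] = 1 - sum(density.values())

# Sum the probability of every ordered pair that contains one R and one C
likelihood_innocent = sum(
    density[a] * density[b]
    for a, b in product(density, repeat=2)
    if {a, b} == {'R', 'C'}
)

print(likelihood_innocent)
```

Only the ordered pairs `('R', 'C')` and `('C', 'R')` pass the filter, which is where the factor of 2 comes from.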


Using this observation, we can update the table.

In [22]:
table.update()
table

|   | hypo | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|---|
| 0 | Fugitives | 3e-06 | 1.0 | 3e-06 | 0.142858 |
| 1 | Innocents | 0.999997 | 2e-05 | 2e-05 | 0.857142 |


Notice that the normalization constant of this table is quite small. It is the total probability of the observation under the prior, so a small value indicates that seeing a vehicle with one R-series and one C-series droid is unlikely given the prior information you have available.

In [23]:
table.norm()

2.3333277777592593e-05
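
The same numbers can be reproduced without the table by applying Bayes' theorem directly. The sketch below hard-codes the values from the cells above; the names `evidence`, `posterior_fugitives`, and `posterior_odds` are introduced here for illustration.

```python
# Values taken from the cells above
hybrid_prior = 3.3333444444814817e-06
R_density = 0.01
C_density = 0.001

likelihood_fugitives = 1.0
likelihood_innocents = 2 * R_density * C_density

# Total probability of the observation (the normalizing constant)
evidence = (hybrid_prior * likelihood_fugitives
            + (1 - hybrid_prior) * likelihood_innocents)

# Bayes' theorem for the "Fugitives" hypothesis
posterior_fugitives = hybrid_prior * likelihood_fugitives / evidence

# Equivalent odds form: posterior odds = prior odds * likelihood ratio
prior_odds = hybrid_prior / (1 - hybrid_prior)
posterior_odds = prior_odds * likelihood_fugitives / likelihood_innocents

print(evidence, posterior_fugitives, posterior_odds)
```

The posterior odds come out at roughly 1 to 6 in favor of "Fugitives", which corresponds to the posterior of about 0.143 in the table.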

### Alternate calculation

Alternatively, we could treat seeing each of the droids as a separate observation and update the table twice, feeding the posterior from the first update in as the prior for the second.

In [30]:
table2 = BayesTable(['Fugitives', 'Innocents'], prior=[hybrid_prior, 1-hybrid_prior])

# The vehicle contains an R-series droid
# (either of the two droids could be the R-series one, hence the factor of 2)
table2.likelihood = [1.0, 2*R_density]
table2.update()

# The vehicle also contains a C-series droid
table2 = table2.reset()
table2.likelihood = [1.0, C_density]
table2.update()
table2

|   | hypo | prior | likelihood | unnorm | posterior |
|---|---|---|---|---|---|
| 0 | Fugitives | 0.000167 | 1.0 | 0.000167 | 0.142858 |
| 1 | Innocents | 0.999833 | 0.001 | 0.001 | 0.857142 |


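Note that this two-step approach yields the same posterior distribution as the single-step update above, because the two "Innocents" likelihoods multiply to `2 * R_density * C_density`. The `prior` column differs only because it now holds the posterior from the first observation.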
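
The agreement between the one-step and two-step updates can be checked numerically. The sketch below uses a small hypothetical helper, `bayes_update`, in place of `BayesTable`, and hard-codes the hybrid prior computed earlier.

```python
import numpy as np

def bayes_update(prior, likelihood):
    """Return the normalized posterior for the given prior and likelihood."""
    unnorm = np.asarray(prior) * np.asarray(likelihood)
    return unnorm / unnorm.sum()

# Values taken from the cells above
hybrid_prior = 3.3333444444814817e-06
R_density, C_density = 0.01, 0.001
prior = [hybrid_prior, 1 - hybrid_prior]

# Single update with the joint likelihood
one_step = bayes_update(prior, [1.0, 2 * R_density * C_density])

# Two updates, one per droid, reusing the first posterior as the next prior
after_R = bayes_update(prior, [1.0, 2 * R_density])
two_step = bayes_update(after_R, [1.0, C_density])

print(one_step, two_step)
print(np.allclose(one_step, two_step))
```

Normalizing between the two updates rescales both hypotheses by the same constant, so it does not change the final posterior.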

**These are _probably_ not the droids you are looking for.**