# Report 03
#### Matthew Beaudouin-Lafon

In [2]:
from __future__ import print_function, division

%matplotlib inline
import warnings
warnings.filterwarnings('ignore')

import math
import numpy as np

from thinkbayes2 import Pmf, Cdf, Suite, Joint
import thinkplot

## GPS Problem 
#### From Think Bayes Chapter 9

> GPS included a (currently disabled) feature called Selective Availability (SA) that adds intentional, time varying errors of up to 100 meters (328 ft) to the publicly available navigation signals. This was intended to deny an enemy the use of civilian GPS receivers for precision weapon guidance.
> [...]
> Before it was turned off on May 2, 2000, typical SA errors were about 50 m (164 ft) horizontally and about 100 m (328 ft) vertically.[10] Because SA affects every GPS receiver in a given area almost equally, a fixed station with an accurately known position can measure the SA error values and transmit them to the local GPS receivers so they may correct their position fixes. This is called Differential GPS or DGPS. DGPS also corrects for several other important sources of GPS errors, particularly ionospheric delay, so it continues to be widely used even though SA has been turned off. The ineffectiveness of SA in the face of widely available DGPS was a common argument for turning off SA, and this was finally done by order of President Clinton in 2000.

Suppose it is 1 May 2000, and you are standing in a field that is 200m square.  You are holding a GPS unit that indicates that your location is 51m north and 15m west of a known reference point in the middle of the field.

However, you know that each of these coordinates has been perturbed by a "feature" that adds random errors with mean 0 and standard deviation 30m.

1) After taking one measurement, what should you believe about your position?

Note: Since the intentional errors are independent, you could solve this problem independently for X and Y.  But we'll treat it as a two-dimensional problem, partly for practice and partly to see how we could extend the solution to handle dependent errors.

You can start with the code in gps.py.

2) Suppose that after one second the GPS updates your position and reports coordinates (48, 90).  What should you believe now?

3) Suppose you take 8 more measurements and get:

    (11.903060613102866, 19.79168669735705)
    (77.10743601503178, 39.87062906535289)
    (80.16596823095534, -12.797927542984425)
    (67.38157493119053, 83.52841028148538)
    (89.43965206875271, 20.52141889230797)
    (58.794021026248245, 30.23054016065644)
    (2.5844401241265302, 51.012041625783766)
    (45.58108994142448, 3.5718287379754585)

At this point, how certain are you about your location?
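
The update described above can be sketched without any library support. This is a minimal illustration, not the `gps.py` solution. It assumes the errors are normally distributed (the problem only gives their mean and standard deviation), that the first coordinate is metres north and the second metres east (so "15m west" becomes -15), and that the prior is uniform over a 1 m grid covering the field. The helper names `normal_pdf`, `make_prior`, `update` and `marginal_std` are made up for this sketch.

```python
import math

SIGMA = 30.0  # standard deviation of the error in each coordinate (m)

def normal_pdf(x, mu, sigma):
    """Normal density; the normal shape of the error is an assumption."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

def make_prior(half_width=100, step=1):
    """Uniform prior over a grid covering the 200 m square field."""
    coords = range(-half_width, half_width + 1, step)
    hypos = [(north, east) for north in coords for east in coords]
    return {hypo: 1.0 / len(hypos) for hypo in hypos}

def update(dist, measurement, sigma=SIGMA):
    """Multiply by the likelihood of one measurement and renormalize."""
    m_north, m_east = measurement
    post = {}
    for (north, east), prob in dist.items():
        like = normal_pdf(m_north, north, sigma) * normal_pdf(m_east, east, sigma)
        post[(north, east)] = prob * like
    total = sum(post.values())
    return {hypo: prob / total for hypo, prob in post.items()}

def marginal_std(dist, axis):
    """Standard deviation of one coordinate of the joint distribution."""
    mean = sum(hypo[axis] * prob for hypo, prob in dist.items())
    var = sum((hypo[axis] - mean) ** 2 * prob for hypo, prob in dist.items())
    return math.sqrt(var)

measurements = [
    (51, -15),
    (48, 90),
    (11.903060613102866, 19.79168669735705),
    (77.10743601503178, 39.87062906535289),
    (80.16596823095534, -12.797927542984425),
    (67.38157493119053, 83.52841028148538),
    (89.43965206875271, 20.52141889230797),
    (58.794021026248245, 30.23054016065644),
    (2.5844401241265302, 51.012041625783766),
    (45.58108994142448, 3.5718287379754585),
]

# Question 1: posterior after the first measurement only.
after_one = update(make_prior(), measurements[0])

# Questions 2 and 3: fold in the remaining measurements one at a time.
posterior = after_one
for m in measurements[1:]:
    posterior = update(posterior, m)

print(max(after_one, key=after_one.get))  # most likely position after one fix
print(marginal_std(posterior, 0), marginal_std(posterior, 1))
```

Under these assumptions, and as long as the posterior stays well inside the field, each marginal standard deviation should be close to 30 / sqrt(n) after n measurements, which is about 9.5 m for the ten measurements here.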

In [3]:
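from thinkbayes2 import EvalNormalPdf

class GPS(Suite, Joint):
    """
    Bayesian model for the GPS Problem. Inherits from Suite and Joint.
    Each hypothesis is a candidate true position (x, y) in meters.
    """
    sigma = 30  # Standard deviation of the error in each coordinate (m)

    def Likelihood(self, data, hypo):
        """
        data: (gpsX, gpsY), the position reported by the GPS unit
        hypo: (x, y), a hypothetical true position
        """
        x, y = hypo
        gpsX, gpsY = data
        # Assumption: the errors are normally distributed. The problem only
        # states their mean (0) and standard deviation (30 m).
        # The errors are independent, so the joint likelihood is a product.
        likeX = EvalNormalPdf(gpsX - x, 0, self.sigma)
        likeY = EvalNormalPdf(gpsY - y, 0, self.sigma)
        return likeX * likeY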

## Lincoln Index Problem 
#### From John D. Cook
> Suppose you have a tester who finds 20 bugs in your program. You want to estimate how many bugs are really in the program. You know there are at least 20 bugs, and if you have supreme confidence in your tester, you may suppose there are around 20 bugs. But maybe your tester isn't very good. Maybe there are hundreds of bugs. How can you have any idea how many bugs there are? There’s no way to know with one tester. But if you have two testers, you can get a good idea, even if you don’t know how skilled the testers are.
>
> Suppose two testers independently search for bugs. Let k1 be the number of errors the first tester finds and k2 the number of errors the second tester finds. Let c be the number of errors both testers find. The Lincoln Index estimates the total number of errors as k1 k2 / c [I changed his notation to be consistent with mine].

So if the first tester finds 20 bugs, the second finds 15, and they find 3 in common, we estimate that there are about 100 bugs. What is the Bayesian estimate of the number of errors based on this data?
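
The reasoning behind the index: if there are n bugs and the testers find any given bug independently with probabilities p1 and p2, then on average k1 is n p1, k2 is n p2 and c is n p1 p2, so k1 k2 / c is roughly n. The arithmetic for the example above, as a small Python 3 sketch (the function name `lincoln_index` is made up for illustration):

```python
def lincoln_index(k1, k2, c):
    """Point estimate of the total number of bugs from two testers' counts."""
    if c == 0:
        # With no bugs in common the ratio is undefined.
        raise ValueError("the estimate is undefined when c is 0")
    return k1 * k2 / c

print(lincoln_index(20, 15, 3))  # 100.0
```

The estimate becomes unstable when c is small, which is one reason to look at a full posterior distribution rather than a single number.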

In [40]:
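# Forward problem: simulate two testers who each find any given bug
# independently, then look at the average fraction of bugs found.
p1 = 0.4  # Probability that tester 1 finds any given bug
p2 = 0.6  # Probability that tester 2 finds any given bug
n = 100   # Number of bugs (must be an integer to be used as an array size)

K1 = []  # Fraction of bugs found by tester 1 in each trial
K2 = []  # Fraction of bugs found by tester 2 in each trial
C = []   # Fraction of bugs found by both testers in each trial

for i in range(1000):
    sample1 = np.random.random(size=n) < p1
    sample2 = np.random.random(size=n) < p2
    k1 = np.sum(sample1) / n
    k2 = np.sum(sample2) / n
    c = np.sum(sample1 & sample2) / n
    K1.append(k1)
    K2.append(k2)
    C.append(c)

# The three means should be close to p1, p2 and p1 * p2.
print(sum(K1)/len(K1), sum(K2)/len(K2), sum(C)/len(C))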

0.39967 0.59894 0.23978


In [48]:
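from itertools import product
from thinkbayes2 import EvalBinomialPmf

class Lincoln(Suite, Joint):
    """
    Joint distribution over the number of bugs and each tester's hit rate.
    """
    def Likelihood(self, data, hypo):
        """
        data: k1, k2, c
        hypo: n, p1, p2
        """
        k1, k2, c = data
        n, p1, p2 = hypo
        # EvalBinomialPmf takes its arguments in the order (k, n, p).
        # Multiplying three binomials treats k1, k2 and c as independent,
        # which is only an approximation: c is constrained by k1 and k2.
        like1 = EvalBinomialPmf(k1, n, p1)
        like2 = EvalBinomialPmf(k2, n, p2)
        likeC = EvalBinomialPmf(c, n, p1 * p2)
        return like1 * like2 * likeC

numSamples = 100.0
p1s = [i / numSamples for i in range(int(numSamples))]  # candidate values of p1
p2s = [i / numSamples for i in range(int(numSamples))]  # candidate values of p2
ns = range(20, 201, 4)  # candidate bug counts; n must be an integer

# Suite takes a single collection of hypotheses (plus an optional label),
# so the three grids are combined into (n, p1, p2) tuples.
lincoln = Lincoln(list(product(ns, p1s, p2s)))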
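
The counts k1, k2 and c are not independent: c can never exceed k1 or k2. One way to account for that is to split the n bugs into four groups (found by both testers, by only the first, by only the second, by neither) and use a multinomial likelihood. The following is a library-free Python 3 sketch of that idea, not the author's method. It assumes a uniform prior, and the grids for n, p1 and p2 are arbitrary choices made for illustration. The name `lincoln_likelihood` is made up.

```python
from math import comb  # Python 3.8+

def lincoln_likelihood(data, hypo):
    """Multinomial likelihood of (k1, k2, c) given (n, p1, p2)."""
    k1, k2, c = data
    n, p1, p2 = hypo
    both, only1, only2 = c, k1 - c, k2 - c
    neither = n - k1 - k2 + c
    if neither < 0:
        # Fewer bugs than the testers found between them is impossible.
        return 0.0
    # Number of ways to assign the n bugs to the four groups.
    coef = comb(n, both) * comb(n - both, only1) * comb(n - both - only1, only2)
    return (coef
            * (p1 * p2) ** both
            * (p1 * (1 - p2)) ** only1
            * ((1 - p1) * p2) ** only2
            * ((1 - p1) * (1 - p2)) ** neither)

data = (20, 15, 3)
probs = [i / 20 for i in range(1, 20)]  # 0.05, 0.10, ..., 0.95 (assumed grid)
ns = range(32, 351)                     # upper bound of 350 is an assumption

# With a uniform prior the posterior is proportional to the likelihood.
joint = {(n, p1, p2): lincoln_likelihood(data, (n, p1, p2))
         for n in ns for p1 in probs for p2 in probs}
total = sum(joint.values())

# Marginal posterior of n: sum over the two nuisance parameters.
marginal_n = {}
for (n, p1, p2), like in joint.items():
    marginal_n[n] = marginal_n.get(n, 0.0) + like / total

print(max(marginal_n, key=marginal_n.get))  # most probable n on this grid
```

The result depends on the prior and on the upper bound chosen for n, so it is worth rerunning with a few different grids before trusting any single summary.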