To write a likelihood function for the locomotive problem, we had to answer this question: “If the railroad has N locomotives, what is the probability that we see number 60?”

The answer depends on what sampling process we use when we observe the locomotive. In this chapter, I resolved the ambiguity by specifying that there is only one train-operating company (or only one that we care about).
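Under the single-company assumption, the likelihood can be sketched as follows. This is a minimal illustration in plain Python, not the thinkbayes implementation; the function name `single_company_likelihood` is hypothetical.

```python
def single_company_likelihood(data, hypo):
    """Probability of seeing locomotive number `data` if the company has `hypo` locomotives.

    Assumes every locomotive of that one company is equally likely to be observed.
    """
    if hypo < data:
        # A company with fewer than `data` locomotives cannot own number `data`.
        return 0.0
    return 1.0 / hypo


# With 100 locomotives, number 60 has probability 1/100; with 50, it is impossible.
print(single_company_likelihood(60, 100))
print(single_company_likelihood(60, 50))
```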
But suppose instead that there are many companies with different numbers of trains. And suppose that you are equally likely to see any train operated by any company. In that case, the likelihood function is different because you are more likely to see a train operated by a large company.
As an exercise, implement the likelihood function for this variation of the locomotive problem, and compare the results.
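One possible reading of this variation, sketched under stated assumptions rather than offered as the intended solution: `hypo` is the size of the company whose locomotive we happened to see, the chance that the observed locomotive belongs to a company of size `hypo` is proportional to `hypo`, and within that company each number is equally likely. The sketch handles a single observation only, uses plain dictionaries instead of thinkbayes, and all function names are hypothetical.

```python
def single_company_likelihood(data, hypo):
    """Single-company case: each of the `hypo` numbers is equally likely."""
    return 1.0 / hypo if hypo >= data else 0.0


def size_biased_likelihood(data, hypo):
    """Likelihood of one sighting of number `data`, up to a constant factor.

    Assumption: the chance of seeing a company of size `hypo` is proportional
    to `hypo`, and the chance of number `data` within it is 1/hypo. The two
    factors cancel, so every feasible hypothesis gets the same value.
    """
    return 1.0 if hypo >= data else 0.0


def posterior_mean(likelihood, data, hypos, alpha=1.0):
    """Posterior mean after one observation, with a power-law prior hypo**(-alpha)."""
    weights = {h: h ** (-alpha) * likelihood(data, h) for h in hypos}
    total = sum(weights.values())
    return sum(h * w for h, w in weights.items()) / total


hypos = range(1, 1001)  # assumed upper bound of 1000, chosen only for illustration
print(posterior_mean(single_company_likelihood, 60, hypos))
print(posterior_mean(size_biased_likelihood, 60, hypos))
```

Under these assumptions the size-biased posterior puts more weight on large companies than the single-company posterior does, so its mean is larger.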

In [1]:
from __future__ import division
%matplotlib inline
from ThinkBayes.code.thinkbayes import Pmf, Suite
import matplotlib.pyplot as plt

In [2]:
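class Train(Suite):
    """Locomotive problem with a power-law prior over company size."""

    def __init__(self, hypos, alpha=1.0):
        Pmf.__init__(self)
        # Power-law prior: weight each hypothesis by hypo**(-alpha).
        for hypo in hypos:
            self.Set(hypo, hypo**(-alpha))
        self.Normalize()

    def Likelihood(self, data, hypo, Ns=(10, 100, 1000, 10000)):
        # Idea adapted from
        # http://stats.stackexchange.com/questions/70096/locomotive-problem-with-various-size-companies
        # Ns lists the assumed company sizes (a tuple avoids a mutable default).
        # Caution: the value computed here does not depend on hypo.
        total_number_of_locomotives = sum(Ns)
        # A company owns a locomotive numbered `data` only if it has at least `data` of them.
        number_of_locomotives_with_that_number = sum(1 for N in Ns if data <= N)
        likelihood = number_of_locomotives_with_that_number / total_number_of_locomotives
        return likelihood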

In [3]:
hypos = xrange(1, 2001)
suite = Train(hypos)
for data in [60, 30, 90]:
    suite.Update(data)
suite.Mean()

244.54756433830198
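
When comparing results, it can help to compute the prior mean on its own, because a likelihood that assigns the same value to every hypothesis leaves the prior unchanged after normalization. The sketch below uses plain Python rather than thinkbayes, and the helper name `power_law_prior_mean` is hypothetical. For hypotheses 1 through 2000 with alpha = 1 it gives approximately 244.55, which matches the mean reported above to that precision.

```python
def power_law_prior_mean(hypos, alpha=1.0):
    """Mean of a prior with weights hypo**(-alpha), normalized over `hypos`."""
    hypos = list(hypos)
    weights = [h ** (-alpha) for h in hypos]
    total = sum(weights)
    return sum(h * w for h, w in zip(hypos, weights)) / total


# Same hypotheses and alpha as the Train suite above, but with no update applied.
print(power_law_prior_mean(range(1, 2001)))
```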