The Poisson distribution tells us that $P(\chi = x) = \frac{e^{-\lambda} \lambda^{x}}{x!}$ for $x = 0, 1, 2, \dots$.

That is, the probability that the observed number of events $\chi$ in a given interval is exactly $x$ is given by the formula above, where the parameter $\lambda$ is the expected number of events in that interval.

For time intervals, $\lambda = T \cdot r$, where $T$ is the length of the interval and $r$ is the rate of events per unit of time. The choice of unit does not matter as long as $T$ and $r$ use the same one.
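
As a quick worked example of $\lambda = T \cdot r$, the sketch below uses made-up numbers (a rate of 3 events per hour observed over a 2-hour interval) and only the standard library. The variable names are illustrative and are not part of the code in this notebook.

```python
import math

# Assumed example values: 3 events per hour, observed for 2 hours
r = 3.0   # rate, events per hour
T = 2.0   # interval length, in hours (same time unit as the rate)

lam = T * r  # lambda = 6 expected events in the interval

# P(X = 4) = e^{-lambda} * lambda^4 / 4!
p_4 = math.exp(-lam) * lam ** 4 / math.factorial(4)
print(f"lambda = {lam}, P(X = 4) = {p_4:.4f}")  # P(X = 4) is roughly 0.134

# The same situation expressed in minutes: the units change, lambda does not
lam_minutes = (T * 60) * (r / 60)
print(math.isclose(lam, lam_minutes))
```

Converting both $T$ and $r$ to minutes leaves $\lambda$ unchanged, which is why only the consistency of the units matters.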

In [1]:
import numpy as np

# Rate r: events per unit of time
def get_rate(events, interval):
    return events / interval

# Parameter lambda = T * r: expected number of events in the interval
def get_parameter(rate, interval):
    return rate * interval

# Factorial x!, defined recursively for non-negative integers x
def factorial(x):
    if x == 0:
        return 1
    return x * factorial(x - 1)

# Probability P(X = x), with x = value and lambda = parameter
def get_probability(value, parameter):
    return (np.e ** -parameter) * (parameter ** value) / factorial(value)

# Returns a dictionary {x: P(X = x)} for every x in indexes
def get_probabilities_over_range(parameter, indexes):
    return {value: get_probability(value, parameter) for value in indexes}

# Some quick unit tests
if __name__ == "__main__":
    assert get_rate(3, 0.5) == 6
    assert get_parameter(3, 4) == 12
    assert factorial(5) == 120
    assert get_probability(0, 0) == 1  # relies on 0 ** 0 == 1 in Python
    assert get_probabilities_over_range(0, range(15))[5] == 0
    print("All tests passed!")

All tests passed!
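
Two further properties are worth checking: the probabilities should sum to (almost) 1 once the range covers enough values, and the mean of the distribution should come out as $\lambda$. The sketch below is self-contained, so it restates the probability formula with the standard `math` module rather than reusing the functions above. `poisson_pmf` is a hypothetical name, and $\lambda = 2.5$ and the cut-off of 50 are arbitrary choices.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with parameter lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 2.5            # arbitrary example parameter
values = range(50)   # the tail beyond 50 is negligible for lam = 2.5

probabilities = {x: poisson_pmf(x, lam) for x in values}

# Total probability and mean, computed from the truncated distribution
total = sum(probabilities.values())
mean = sum(x * p for x, p in probabilities.items())

assert math.isclose(total, 1.0, rel_tol=1e-9)
assert math.isclose(mean, lam, rel_tol=1e-9)
print("Distribution checks passed!")
```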


Example applications:

Suppose we are 64 minutes into a 90-minute game of football and the home team is 2-1 up. What is our best guess for the final score?

In [6]:
def predict_score(team_1_score=0, team_2_score=0, time=45):
    elapsed = time / 90      # fraction of the match played so far
    remaining = 1 - elapsed  # fraction of the match still to play
    # Estimate each team's rate (goals per full match) from the play so far
    rate_1 = get_rate(team_1_score, elapsed)
    rate_2 = get_rate(team_2_score, elapsed)
    # Expected further goals: lambda = rate * remaining time
    expected_1 = get_parameter(rate_1, remaining)
    expected_2 = get_parameter(rate_2, remaining)
    # int() truncates, so fractional expected goals are rounded down
    return int(expected_1 + team_1_score), int(expected_2 + team_2_score)

predict_score(3, 1, 40)

(6, 2)
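
The question posed above (64 minutes played, home team 2-1 up) can also be approached through the full distribution of further goals rather than a single expected value. The following is a minimal sketch, assuming each team keeps scoring at the rate it has shown so far and that the two teams score independently. `poisson_pmf` and `most_likely_extra_goals` are hypothetical helpers written for this illustration, and the cap of 10 further goals is an arbitrary cut-off.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for a Poisson distribution with parameter lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def most_likely_extra_goals(score, minutes_played, match_length=90, max_goals=10):
    """Most probable number of further goals and its probability.

    Assumes the team keeps scoring at the rate observed so far.
    """
    rate = score / minutes_played                 # goals per minute so far
    lam = rate * (match_length - minutes_played)  # lambda = r * T for the time left
    probs = {k: poisson_pmf(k, lam) for k in range(max_goals + 1)}
    best = max(probs, key=probs.get)
    return best, probs[best]

home_extra, p_home = most_likely_extra_goals(2, 64)
away_extra, p_away = most_likely_extra_goals(1, 64)

print((2 + home_extra, 1 + away_extra))  # (2, 1)
print(f"Probability of exactly this score: {p_home * p_away:.3f}")
```

Under these assumptions the single most likely outcome is that neither side scores again, although that outcome is far from certain.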