# Example Poisson distribution: field goals attempted per game

In [1]:
import numpy as np
import scipy.stats as st

import bokeh.plotting
import bokeh.io
bokeh.io.output_notebook()

The story behind the Poisson distribution is as follows.
>The number of arrivals of a Poisson process in a fixed time interval is Poisson distributed.
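
To make this story concrete, here is a minimal sketch that simulates a Poisson process directly. It relies on the fact that the waiting times between arrivals of a Poisson process are exponentially distributed. The function name `count_arrivals`, the rate, the interval length, and the seed are all chosen for illustration and are not part of the analysis that follows.

```python
import numpy as np

rng = np.random.default_rng(seed=3252)


def count_arrivals(rate, interval, rng):
    """Count arrivals of a Poisson process with the given rate in [0, interval]."""
    t = 0.0
    n_arrivals = 0
    while True:
        # Waiting times between arrivals are Exponential with mean 1/rate
        t += rng.exponential(1 / rate)
        if t > interval:
            return n_arrivals
        n_arrivals += 1


# Hypothetical values, for illustration only
rate = 20.0      # arrivals per unit time
interval = 1.0   # length of the observation window

counts = np.array([count_arrivals(rate, interval, rng) for _ in range(5000)])

# For a Poisson distribution, the mean and variance both equal rate * interval
print(counts.mean(), counts.var())
```

If the story holds, both printed numbers should be close to `rate * interval`.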

We could model field goal attempts in a basketball game using a Poisson distribution. The timing of a player's shots is largely stochastic, influenced by the myriad ebbs and flows of a basketball game. Some players shoot more than others, though, so each player has a well-defined *rate* of shooting. Let's consider LeBron James's field goal attempts for the 2017-2018 NBA season. First, the data.

In [2]:
fga = [19, 16, 15, 20, 20, 11, 15, 22, 34, 17, 20, 24, 14, 14, 
       24, 26, 14, 17, 20, 23, 16, 11, 22, 15, 18, 22, 23, 13, 
       18, 15, 23, 22, 23, 18, 17, 22, 17, 15, 23, 8, 16, 25, 
       18, 16, 17, 23, 17, 15, 20, 21, 10, 17, 22, 20, 20, 23, 
       17, 18, 16, 25, 25, 24, 19, 17, 25, 20, 20, 14, 25, 26, 
       29, 19, 16, 19, 18, 26, 24, 21, 14, 20, 29, 16, 9]
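
A quick numerical check that can be run before any plotting: the mean and variance of a Poisson distribution are equal, so the ratio of sample variance to sample mean should be close to one for Poisson-distributed counts. The sketch below is one way to compute that ratio. The helper name `dispersion_index` is made up for this example, and the block uses simulated counts with an arbitrary rate, not the real data, so that it runs on its own.

```python
import numpy as np


def dispersion_index(counts):
    """Ratio of sample variance to sample mean; near 1 for Poisson counts."""
    counts = np.asarray(counts)
    return np.var(counts, ddof=1) / np.mean(counts)


# Simulated Poisson counts (not the real data); rate and seed are arbitrary
rng = np.random.default_rng(seed=3252)
simulated = rng.poisson(lam=19.0, size=10000)

print(dispersion_index(simulated))
```

Calling `dispersion_index(fga)` applies the same check to the data above. A value well above one would suggest more game-to-game variability than a single Poisson rate can explain.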

To show that this random variable is approximately Poisson distributed, we will plot its empirical cumulative distribution function (ECDF) and compare it with ECDFs of samples drawn from a Poisson distribution whose rate is set to its maximum likelihood estimate, which is the sample mean. First, we'll generate the *x* and *y* values for the ECDF.

In [3]:
# Make x and y values for ECDF plot
x_ecdf = np.sort(fga)
y_ecdf = np.arange(1, len(x_ecdf)+1) / len(x_ecdf)

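Next, we will draw many samples, each the same size as the data set, out of a Poisson distribution with rate equal to the mean of the data. Sorting each sample and pairing it with the same *y* values gives many simulated ECDFs, which together represent the theoretical ECDF.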

In [5]:
n_reps = 1000
x_theor = np.concatenate([np.sort(np.random.poisson(np.mean(fga), size=len(fga))) 
                               for _ in range(n_reps)])
y_theor = np.concatenate([y_ecdf]*n_reps)
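
As an alternative to sampling, the cumulative distribution function of the Poisson distribution can be evaluated directly with `scipy.stats.poisson.cdf`. The sketch below assumes, as above, that the rate is set to the sample mean. The helper name `poisson_cdf_mle` is hypothetical, and the block runs on simulated counts with an arbitrary rate so that it is self-contained.

```python
import numpy as np
import scipy.stats as st


def poisson_cdf_mle(counts):
    """Return (k, cdf) for a Poisson distribution with rate set to the sample mean."""
    counts = np.asarray(counts)
    mu_mle = counts.mean()  # MLE of the Poisson rate is the sample mean
    k = np.arange(0, counts.max() + 1)
    return k, st.poisson.cdf(k, mu_mle)


# Simulated counts stand in for real data here; rate and seed are arbitrary
rng = np.random.default_rng(seed=3252)
k, cdf = poisson_cdf_mle(rng.poisson(lam=19.0, size=100))
```

With the real data, `k, cdf = poisson_cdf_mle(fga)` gives a single curve that could be overlaid on the ECDF. The sampling approach used here has the advantage of also showing how much scatter to expect from a data set of this size.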

Now let's build the plot!

In [15]:
p = bokeh.plotting.figure(height=250,
                          width=450,
                          x_axis_label='field goal attempts',
                          y_axis_label='ECDF')

p.circle(x_theor, y_theor, alpha=0.01, color='gray', line_alpha=0)
p.circle(x_ecdf, y_ecdf)

bokeh.io.show(p)

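Indeed, LeBron's field goal attempts per game are approximately Poisson distributed: the measured ECDF is consistent with the ECDFs of samples drawn from a Poisson distribution with the same mean.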
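
One way to put a rough number on this visual comparison is to compute a pointwise envelope for each sorted value from the simulated samples, then ask what fraction of the sorted data falls inside it. This is a sketch under stated assumptions rather than a formal goodness-of-fit test: the function names, the 2.5 and 97.5 percentile bounds, and the simulated stand-in data are all choices made for this example.

```python
import numpy as np


def order_statistic_envelope(mu, n, n_reps=1000, ptiles=(2.5, 97.5), rng=None):
    """Pointwise percentile bounds for each sorted value of a Poisson sample of size n."""
    rng = np.random.default_rng() if rng is None else rng
    # Each row is one sorted sample of size n
    samples = np.sort(rng.poisson(mu, size=(n_reps, n)), axis=1)
    low, high = np.percentile(samples, ptiles, axis=0)
    return low, high


def fraction_inside(counts, low, high):
    """Fraction of sorted observations lying within the envelope."""
    x = np.sort(counts)
    return np.mean((x >= low) & (x <= high))


# Simulated counts stand in for real data here; rate and seed are arbitrary
rng = np.random.default_rng(seed=3252)
demo = rng.poisson(19.0, size=100)
low, high = order_statistic_envelope(demo.mean(), len(demo), rng=rng)

print(fraction_inside(demo, low, high))
```

For the real data, the analogous calls would be `order_statistic_envelope(np.mean(fga), len(fga))` followed by `fraction_inside(fga, low, high)`. Because the bounds are pointwise, a few points falling outside them is expected even when the model is adequate.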