notebooks/2019_09_10_polya/index.idyll

[meta title:"Polya Urn Idyll" description:"Short description of your
project" /]


### Introduction

The Pólya urn model is a statistical experiment in which we study the distribution of two populations of distinctly colored balls in an urn. In its simplest form, the problem can be stated as follows:

[div className:"definition"]
Given an urn containing initially [equation]a[/equation]   [span className:"popa"] orange[/span] balls and [equation]b[/equation]   [span className:"popb"] blue[/span] balls, we pick one ball at random from the urn, place it back and add one new ball of the same color in the urn; We then repeat this sampling procedure [equation]n[/equation] times.
[/div]


[var name:"a0_fig_urn" value:1 type:int/]
[var name:"b0_fig_urn" value:2 type:int/]
[var name:"run_fig_urn" value:false /]


[div className:"figure"]
[div]
[Inline]
[button className:"simulate" onClick:`run_fig_urn = true`]
Simulate
[/button]
[/Inline]
Pólya urn with [equation]a =[/equation] [Dynamic value:a0_fig_urn max:12 min:1 display:`a0_fig_urn.toString()`/] and [equation]b =[/equation] [Dynamic value:b0_fig_urn max:12 min:1 display:`b0_fig_urn.toString()`/]
[/div]
[PolyaUrn className:"d3-component" state:state a0:a0_fig_urn b0:b0_fig_urn run:run_fig_urn /]

**Figure 1**: Simulation of a Pólya urn with initial conditions [span className:"popa"]a = [Display value:a0_fig_urn format:"d" /][/span] and [span className:"popb"]b = [Display value:b0_fig_urn format:"d" /][/span].
[/div]


This process is a good example of a model where small imbalances are magnified over time: In fact, the more balls of one color are drawn, the more this color is likely to be drawn. As such, it has been used to model for instance propagation of epidemics in a population [(Hayhoe et al.)](https://arxiv.org/pdf/1705.02239.pdf) or the correlated dynamics between novelty events [(Tria et al, 2014)](https://www.nature.com/articles/srep05890). In the following, we study the dynamics of the model from a probabilistic point of view and its asymptotical properties, in particular with respect to the initial ratio of populations.


### Basic dynamics of the model

#### Composition of the urn at step n
Let us first express the *probability of having picked [equation] k [/equation] [span className:"popb"] blue[/span] objects after [equation]n \geq k[/equation] steps* of the experiment. Denoting by [equation]B_n[/equation] the number of blue balls at step  [equation]n[/equation], and by [equation]S_n[/equation] the sampling event at step [equation]n[/equation], taking value [span className:"popb"]1[/span] if a blue ball is sampled and  [span className:"popa"]0[/span] otherwise:

[Equation display:true]
\mathbb{P} (B_n = b + k) =   \sum_{\mathbf{s} \in {\{0, 1\}}^n \atop \sum_i \mathbf{s}_i = k}  \mathbb{P}(S_1 = s_1, \dots S_{n} = s_n)
[/Equation]


[Aside]
[div className:"detail"]
[Equation]\ \ \ \ \mathbb{P}(S_1 = s_1, \dots, S_{n} = s_n)[/Equation]
[Equation]= \mathbb{P}(S_1 = s_1)  \mathbb{P}(S_2 = s_2\ |\ S_1 = s_1) \dots[/Equation]
[/div]
[/Aside]


We can expand the inner term by using the probability [span className:"text_detail"]chain rule[/span]. Moreover, the probability of drawing a ball of a certain color given the composition of the urn is simply the ratio of correspondingly colored balls in the urn, which evolves with each sampling. Therefore, the conditional probability of drawing a [span className:"popb"]blue[/span]  object at step [equation]n[/equation] can be written as:

[Aside][equation](1)[/equation][/Aside]
[Equation display:true]
\mathbb{P}(S_n = 1\ |\ S_{1} = s_1, \dots, S_{n-1} = s_{n-1}) =
\frac{b + \sum_{i = 1}^{n - 1} s_i}{a + b + n - 1}
[/Equation]

From this expression, we can see that the probability does not depend on the  time step [equation]n[/equation] but rather on *how many blue balls have been sampled until now*, as captured by the quantity [equation]\sum_{i=1}^{n-1} s_i[/equation]. This also means that the model is *Markovian*, as the probability at step [equation]n[/equation] only depends on the urn's composition at step [equation]n - 1[/equation].

Consequently, for any sequence of draws [equation]\mathbf{s}[/equation], we can arrange the order in which the balls have been drawn without affecting the probability of the next draw. This is sometimes called the *exchangeability* property. More formally, it means that **Equation (1)** is equivalent to:

[Aside][equation](2)[/equation][/Aside]
[Equation display:true]
\mathbb{P}(S_n = 1\ |\ S_{1} = \textcolor{CornflowerBlue}{1}, \dots,
S_{\bar{\mathbf{s}}} = \textcolor{CornflowerBlue}{1},
S_{\bar{\mathbf{s}} + 1} = \textcolor{Orange}{0} \dots S_{n-1} = \textcolor{Orange}{0}  )
[/Equation]

where [equation]\bar{\mathbf{s}}[/equation] is a shortcut notation for [equation]\sum_{i=1}^{n-1} s_i[/equation] from [equation](1)[/equation]. The same analysis holds for the conditional probability of drawing an orange ball. Bringing everything together, we can thus finally write
[Equation display:true]
\mathbb{P} (B_n = b + k) = \binom{n}{k} \mathbb{P}(S_1 =
\textcolor{CornflowerBlue}{1}, \dots, S_k = \textcolor{CornflowerBlue}{1}, S_{k+1} = \textcolor{Orange}{0} \dots S_{n} = \textcolor{Orange}{0})
[/Equation]

[Equation display:true]
= \binom{n}{k} \mathbb{P}(S_1 =
\textcolor{CornflowerBlue}{1}) \dots \mathbb{P}(S_{n} = \textcolor{Orange}{0}\ |\  S_{1} = \textcolor{CornflowerBlue}{1} \dots S_{n-1} = \textcolor{Orange}{0})
[/Equation]

[Equation display:true]
= \binom{n}{k} \prod_{p=0}^{k-1} \frac{b + p}{a+ b + p}  \prod_{q=0}^{n-k-1} \frac{a + q}{a+ b + q}
[/Equation]

[Equation display:true]
= \binom{n}{k} \frac{(b+k-1)! (a+b-1)! (a+n-k-1)!}{(b-1)! (a-1)! (a+b+n-1)!}
[/Equation]

[div className:"equation"]
[Equation display:true]
\mathbb{P} (B_n = b + k) = \frac{\binom{n}{k} \binom{a+b}{b}}{\binom{a+b+n-1}{b+k}}  \frac{ab}{(a+b)(b+k)}
[/Equation]
[/div]


#### Probability of the next draw

Following the previous result, we can now compute the probability of *picking a [span className:"popb"]blue[/span] ball at step [equation]n + 1[/equation]*, independently of the state of the urn at time [equation]n [/equation]:

[equation display:true]
\mathbb{P} (S_{n + 1} = 1) = \sum_{k=0}^{n} \mathbb{P}(S_{n+1} = 1\ |\ B_{n} = b + k)\ \mathbb{P}(B_{n} = b + k)
[/equation]

[equation display:true]
= \sum_{k=0}^{n} \frac{b+k}{a+b+n}\ \mathbb{P}(B_{n} = b + k)
[/equation]

[Aside][equation](3)[/equation][/Aside]
[equation display:true]
= \frac{ab\binom{a+b}{b}}{(a+b+n)(a+b)} \underbrace{\sum_{k=0}^{n} \frac{\binom{n}{k}}{\binom{a+b+n-1}{b+k}}}_{f(n, a, b)}
[/equation]


[Aside]
[div className:"detail"]
Pascal's triangle: [Equation]\binom{n}{k} = \binom{n - 1}{k - 1} + \binom{n - 1}{k}[/Equation]
[/div]
[/Aside]

To simplify the expression of [equation]f(n, a, b)[/equation], we can use [span className:"text_detail"]Pascal's triangle formula[/span] on the numerator (while isolating the terms for [equation]j = 0[/equation] and [equation]j = n[/equation]) and observe that:
[Equation display:true]
f(n, a, b) = \sum_{j=0}^n \frac{\binom{n}{j}}{\binom{a+b+n-1}{b+j}}
[/Equation]

[Equation display:true]
= \frac{1}{\binom{a+b+n-1}{b}} + \frac{1}{\binom{a+b+n-1}{b+n}} + \sum_{j=1}^{n-1} \frac{\binom{n-1}{j}}{\binom{a+b+n-1}{b+j}} + \sum_{j=1}^{n-1} \frac{\binom{n-1}{j-1}}{\binom{a+b+n-1}{b+j}}
[/Equation]

[Equation display:true]
= \sum_{j=0}^{n-1} \frac{\binom{n-1}{j}}{\binom{a+b+n-1}{b+j}} + \sum_{j=0}^{n-1} \frac{\binom{n-1}{j}}{\binom{a+b+n-1}{b+ j +1}}
[/Equation]

[Equation display:true]
f(n, a, b) = f(n-1, a+1, b) + f(n-1, a, b+1)
[/Equation]

By extending the relation, one can infer and finally prove by induction over [equation]n[/equation], that [equation]f(n, a, b) = \frac{a+b+n}{(a+b) \binom{a+b-1}{b}}[/equation]. Plugging this expression in **Equation (3)**, we finally get:

[div className:"equation"]
[equation display:true]
\mathbb{P} (S_n = 1) = \frac{b}{a+b}
[/equation]
[/div]

In other words, the overall probability of the urn composition after [equation]n[/equation] only *depends on the initial color distribution in the urn*. In particular, for the case where [equation]a = 1[/equation] and [equation]b = 1[/equation] we have
[equation]\mathbb{P} (B_n = k + 1) = \frac{[\!|\ k \leq n \ |\!]}{n + 1}[/equation] and [equation]\mathbb{P} (S_n = 1) = 0.5[/equation]. In other words, all urn configurations are equiprobable when starting from a balanced urn.


### Asymptotic behavior

#### Expected value

Since various urn compositions can be reached from the same initial configuration  ([span className:"popa"][equation]a[/equation][/span], [span className:"popb"][equation]b[/equation][/span]),  one can also wonder what the urn composition looks like asymptotically, when the number of repeats [equation]n[/equation] goes to [equation] \infty[/equation].


Let us introduce [equation]Y_n = \frac{B_n}{a + b +n}[/equation], the ratio of the two colored ball populations at step [equation]n[/equation]. Using the previous results, we can easily see that the expected value of [equation]Y_n[/equation] is equal to the initial ratio, independently of the time step:

[equation display:true]
\mathbb{E}(Y_n) = \sum_{k=0}^n \mathbb{P}(B_n = b + k) \frac{b + k}{a + b +n}
[/equation]

[equation display:true]
= \sum_{k=0}^n \mathbb{P}(B_n = b + k) \mathbb{P}(S_n = 1\ |\ B_n = b + k) =  \mathbb{P}(S_n = 1)
[/equation]

[div className:"equation"]
[equation display:true]
\mathbb{E}(Y_n) = \frac{b}{a + b} = Y_0
[/equation]
[/div]


[var name:"a0_fig_dynamics" value:1 type:int/]
[var name:"b0_fig_dynamics" value:2 type:int/]
[var name:"num_runs_fig_dynamics" value:70 type:int/]
[var name:"num_steps_fig_dynamics" value:100 type:int/]
[var name:"run_fig_dynamics" value:false /]


This can also be observed in simulations. Here, blue lines ( [span className:"legend_lines"]---[/span] ) represent the observed value of [equation]Y_n[/equation] for each time step [equation] n \leq [/equation]  [Display value:num_steps_fig_dynamics format:"d" /]  for [Display value:num_runs_fig_dynamics format:"d" /]  different runs with the same initial conditions [equation]a[/equation] and [equation]b[/equation]. Red circles ( [span className:"legend_circles"]●[/span] ) mark the average of these values across runs, for each time step, hence serve as an approximation of [equation]\mathbb{E}(Y_n)[/equation].


[div className:"figure"]
[div]
[Inline]
[button className:"simulate" onClick:`run_fig_dynamics = true`]
Simulate
[/button]
[/Inline]
observed values of [equation]Y_n[/equation] with [equation]a =[/equation] [Dynamic value:a0_fig_dynamics max:12 min:1 display:`a0_fig_dynamics.toString()`/] and [equation]b =[/equation] [Dynamic value:b0_fig_dynamics max:12 min:1 display:`b0_fig_dynamics.toString()`/].
[/div]
[PolyaDynamics className:"d3-component" state:state a0:a0_fig_dynamics b0:b0_fig_dynamics run:run_fig_dynamics num_steps:100 num_runs: 60/]

**Figure 2**: Study of the average asymptotic distribution for a Pólya urn experiment.
[/div]


#### Link to the Dirichlet-multinomial distribution
Going further, we can also study the convergence *in distribution* of the random variable [equation]Y_n[/equation]. First we observe that the probability distribution of [equation]Y_n[/equation] is directly linked to the one of [equation]B_n[/equation], that we expressed earlier.

[Equation display:true]
\mathbb{P}\left(Y_n = \frac{b + k}{a + b+ n}\right) = \mathbb{P} (B_n = b + k) = \frac{\binom{n}{k} \binom{a+b}{b}}{\binom{a+b+n-1}{b+k}}  \frac{ab}{(a+b)(b+k)}
[/Equation]

We can further simplify the expression by making use of the [Gamma](https://en.wikipedia.org/wiki/Gamma_function) and [Beta](https://en.wikipedia.org/wiki/Beta_function) functions. This yields the following result:

[Equation display:true]
\mathbb{P}\left(Y_n = \frac{b + k}{a + b+ n}\right)  = \binom{n}{k}\frac{ab \binom{a+b}{b}}{a + b} \frac{1}{\binom{a+b+n-1}{b+k}(b+k)}
[/Equation]

[Aside][equation](4)[/equation][/Aside]
[Equation display:true]
= \binom{n}{k} \frac{\Gamma(a+b)}{\Gamma(a)\Gamma(b)} \frac{\Gamma(b + k) \Gamma(a + n - k)}{\Gamma(a + b + n)}
[/Equation]


[div className:"equation"]
[equation display:true]
\mathbb{P}\left(Y_n = \frac{b + k}{a + b+ n}\right)  = \binom{n}{k} \frac{\beta(b + k, a + n - k)}{\beta(a, b)}
[/equation]
[/div]

Where the Gamma and Beta functions are defined as [equation]\Gamma: x \mapsto (x - 1)![/equation] and [equation]\beta: (a, b) \mapsto \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)} = \int_0^1 x^{a-1} (1-x)^{b-1}dx[/equation] respectively.

Note that, up to a simple rearrangement of terms in **Equation (4)**, we observe that [equation]Y_n[/equation] follows a [Dirichlet-multinomal](https://en.wikipedia.org/wiki/Dirichlet-multinomial_distribution) distribution with parameters [equation]\alpha = (a, b)[/equation]. And in fact, this distribution is also known as the *multivariate Pólya distribution* and can be generalized to urns with more than two color populations.


#### Asymptotic distribution
To prove a convergence in distribution, we want to look at *the limit of the repartition function* of [equation]Y_n[/equation], defined as [equation]F_{Y_n}(x) = \mathbb{P}(Y_n \leq x)[/equation], when the time step [equation]n \rightarrow \infty[/equation]. Using the previous simplified expression, we have:


[Equation display:true]
F_{Y_n} (t) = \sum_{k = 0}^{t (a + b+ n) - b} \mathbb{P}\left(Y_n = \frac{b + k}{a + b+ n}\right)
[/Equation]


[Equation display:true]
= \frac{1}{\beta(a,b)} \int_0^1 \sum_{k=0}^{t(a+b+n) - b} \binom{n}{k} x^{b+k-1} (1 - x)^{a+n-k-1} dx
[/Equation]

[Aside][equation](5)[/equation][/Aside]
[Equation display:true]
= \frac{1}{\beta(a,b)} \int_0^1 x^{b-1} (1 - x)^{a-1} \underbrace{\sum_{k=0}^{t(a+b+n) - b} \binom{n}{k} x^{k} (1 - x)^{n-k}}_{g(x, t, n,a,b)} dx
[/Equation]

Thus [equation]g(x,t,n, a, b)[/equation] is the only term that depends on [equation]n[/equation]. Its expression is clearly linked, and similar to, a binomial expansion: More specifically, if we define the random variable [equation]X_n[/equation] with a *binomial distribution* of parameters [equation]n[/equation] and [equation]x \in [0, 1][/equation], then we have exactly that [equation]g(x, t, n, a, b) = \mathbb{P}(X_n \leq t(a+b+n) - b)[/equation].

We can then directly apply the [Demoivre-Laplace theorem](https://en.wikipedia.org/wiki/De_Moivre%E2%80%93Laplace_theorem), which derives from the Central Limit theorem, to [equation]X_n[/equation] which yields that, for large [equation]n[/equation],

[Equation display:true]
g(x, t, n, a,b) \simeq \Phi\left(\frac{n (t - x)}{\sqrt{nx(1-x)}} + \frac{t(a+b) - b}{\sqrt{nx(1-x)}}\right)
[/Equation]

where [equation]\Phi[/equation] is the repartition function of the standard Gaussian distribution, [equation]\mathcal{N}(0,1)[/equation], and, as such, is continuous and bounded. As a result, we have that:


[Equation display:true]
\text{lim}_{n \rightarrow \infty} g(x, t, n, a,b) = \left\{\begin{array}{lr}
1, & \text{if } t > x\\
0, & \text{if } t < x
\end{array}\right.
[/Equation]

By splitting the integral from **Equation (5)** at point [equation]t \in [0, 1][/equation], we conclude that

[div className:"equation"]
[Equation display:true]
\text{lim}_{n \rightarrow \infty} F_{Y_n} (t) = \frac{1}{\beta(a,b)} \int_0^t x^{b-1} (1 - x)^{a-1} dx
[/Equation]
[/div]

The right-side term is exactly the repartition function of a [Beta distribution](https://en.wikipedia.org/wiki/Beta_distribution) at point [equation]t[/equation]. Thus, the proportion of blue balls in the urn, [equation]Y_n[/equation], converges in distribution to the Beta distribution with parameters *[span className:"popb"]b[/span]* and  *[span className:"popa"]a[/span]* when [equation]n[/equation] grows.

[div className:"figure"]
[Inline][img src:"static/images/polya_1_1.png" width:"300px"/][/Inline]
[Inline][img src:"static/images/polya_1_6.png" width:"300px"/][/Inline]

[Inline][img src:"static/images/polya_4_9.png" width:"300px"/][/Inline]
[Inline][img src:"static/images/polya_7_2.png" width:"300px"/][/Inline]

**Figure 3:** Histograms of observed values of [equation]Y_{n = 1000}[/equation] for 10000 runs with different initial conditions (*[span className:"popa"]a[/span]* , *[span className:"popb"]b[/span]*), compared to the density function of the [equation]\beta(b, a)[/equation] distribution (represented as [span className:"legend_circles"]---[/span] ).
[/div]


### Generalized Pólya urn
We won't study such models here, but it is possible to extend the Pólya urn experiment to more flexible frameworks: Generalized Pólya urns follow the same workflow as the one we have been working with until now, but with different replacement values. More specifically, the experiment is described as follows:

[div className:"definition"]
Given an urn containing initially [equation]a[/equation]   [span className:"popa"] orange[/span] balls and [equation]b[/equation]   [span className:"popb"] blue[/span] balls, we pick one ball at random from the urn. **If the ball is  [span className:"popa"]orange[/span]**, place it back and add   [span className:"popa"][equation]\alpha[/equation][/span]  new orange balls and  [span className:"popb"][equation]\beta[/equation][/span]  new blue balls to the urn; **If the ball is  [span className:"popb"] blue[/span]**, do the same but with  [span className:"popa"][equation]\gamma[/equation][/span]  new orange balls and  [span className:"popb"][equation]\delta[/equation][/span] respectively instead. We then repeat this sampling procedure [equation]n[/equation] times.
[/div]

Such a model is often represented by its initial conditions (*[span className:"popa"]a[/span]* , *[span className:"popb"]b[/span]*), and its replacement matrix, [equation]R = \begin{pmatrix}
\alpha & \beta \\
\gamma & \delta
\end{pmatrix} [/equation].

It is also possible to generalize the urn to an arbitrary number of colors, leading to a [equation]d \times d[/equation] replacement matrix. For instance, in the standard Pólya setting, we have [equation]d = 2[/equation] and [equation]R = \begin{pmatrix}
1 & 0 \\
0& 1
\end{pmatrix} = I_2[/equation]

### References

* *"A Pólya Contagion Model for Networks"*, M. Hayhoe et al, 2017
* *"A survey of random processes with reinforcement"*, R. Pemantle, 2017
* *"On generalized Pólya urn models"*, M. Chen and M. Kuba, 2011
* *"Pólya urn models: Lecture Notes"*, N. Pouyanne, 2004
* *"Rate of convergence of Pólya's urn to the Beta distribution"*, J. Drinane, 2008
* "*The dynamics of correlated novelties"*, F. Tria et al, 2014
* *"Urnes aléatoires, populations en équilibre et séries génératrices"*, B. Mallein, 2010