In [1]:
import math
from nbmetalog import nbmetalog as nbm
import numpy as np
import random
import sympy

random.seed(1)


In [2]:
nbm.print_metadata()


context: ci
hostname: 812d283872fe
interpreter: 3.8.12 (default, Jan 15 2022, 18:39:47)  [GCC 7.5.0]
nbcellexec: 2
nbname: maximum_likelihood_popsize_estimator_expected_value
nbpath: /opt/hereditary-stratigraph-concept/binder/popsize/maximum_likelihood_popsize_estimator_expected_value.ipynb
revision: null
session: 13156ca8-595c-4702-b3a6-2796b8316885
timestamp: 2022-07-31T03:55:12Z00:00


IPython==7.16.1
keyname==0.4.1
yaml==5.3.1
nbmetalog==0.2.6
numpy==1.21.5
sympy==1.5.1
re==2.2.1
ipython_genutils==0.2.0
logging==0.5.1.2
zmq==22.3.0
json==2.0.9
ipykernel==5.5.3


# Goal

Derive the expected value for the maximum likelihood population size estimator, $\hat{n}_\mathrm{mle}$.


# Derivation

From [gene_drive_scenario.ipynb](gene_drive_scenario.ipynb), each independent observation of fixed gene magnitude $\boldsymbol{X}_i$ is distributed as $\boldsymbol{X}_i \sim p(x_i) = n x_i^{n-1}$ for $x_i \in [0,1]$, with $p(x_i) = 0$ otherwise.
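
To make the density concrete, here is a minimal sketch of drawing from $p(x) = n x^{n-1}$ by inverse transform sampling. Because the CDF is $x^n$ on $[0,1]$, a uniform draw $U$ maps to $X = U^{1/n}$. The function name `sample_fixed_magnitude`, the choice $n = 10$, and the sample size are illustrative assumptions, not taken from the referenced notebooks.

```python
import random


def sample_fixed_magnitude(n, rng):
    """Draw one X with density n * x**(n - 1) on [0, 1].

    Inverse transform: the CDF is x**n, so X = U**(1 / n)
    for U ~ Uniform(0, 1).
    """
    return rng.random() ** (1.0 / n)


rng = random.Random(1)  # local generator, seeded for repeatability
n = 10  # hypothetical population size, chosen for illustration
draws = [sample_fixed_magnitude(n, rng) for _ in range(100_000)]
sample_mean = sum(draws) / len(draws)

# The density n * x**(n - 1) has mean n / (n + 1),
# so the two printed values should be close.
print(sample_mean, n / (n + 1))
```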

From [maximum_likelihood_popsize_estimator.ipynb](maximum_likelihood_popsize_estimator.ipynb) we have

$$
\hat{n}_\mathrm{mle}
= -\frac{k}{\sum_{i=1}^k \log( \boldsymbol{X}_i )}.
$$
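
As a small sketch, the estimator above can be written directly as a function of the observations. The name `mle_popsize` and the sample values are hypothetical, used only to show the calculation.

```python
import math


def mle_popsize(xs):
    """Return -k / sum(log(x_i)) for k observations x_i in (0, 1)."""
    k = len(xs)
    return -k / sum(math.log(x) for x in xs)


# Illustrative observations, not data from the notebook.
observations = [0.9, 0.95, 0.85]
estimate = mle_popsize(observations)
print(estimate)
```

Observations close to 1 give a large estimate, since each $\log(\boldsymbol{X}_i)$ is then close to 0.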

From the definition of expected value, and using the Laplace transform identity $\frac{1}{s} = \int_0^\infty e^{-ts} \, \mathrm{d}t$ for $s > 0$ as shown in <https://math.stackexchange.com/a/302442>, applied with $s = -\sum_{i=1}^k \log( \boldsymbol{X}_i )$ (which is positive with probability 1 because each $\boldsymbol{X}_i \in (0,1)$),

$\begin{align*}
E(\hat{n}_\mathrm{mle})
&= E\Big(-\frac{k}{\sum_{i=1}^k \log( \boldsymbol{X}_i )}\Big)\\
&= k E\Big(\frac{1}{-\sum_{i=1}^k \log( \boldsymbol{X}_i )}\Big)\\
&= k E\Big[\int_0^\infty \exp \Big( t \sum_{i=1}^k \log( \boldsymbol{X}_i )\Big) \, \mathrm{d}t \Big] \\
&= k \int_0^\infty E\Big[ \exp \Big( t \sum_{i=1}^k \log( \boldsymbol{X}_i )\Big)\Big] \, \mathrm{d}t \\
&= k \int_0^\infty E\Big[ \exp \Big( t \log( \boldsymbol{X} )\Big)\Big]^k \, \mathrm{d}t \\
&= k \int_0^\infty \Big[\int_0^1 \exp \Big( t \log( x )\Big) p(x) \, \mathrm{d}x \Big]^k \, \mathrm{d}t \\
&= k \int_0^\infty \Big[\int_0^1 \exp \Big( t \log( x )\Big) n x^{n-1} \, \mathrm{d}x \Big]^k \, \mathrm{d}t \\
&= k \times n^k \int_0^\infty \Big[\int_0^1 \exp \Big( t \log( x )\Big) x^{n-1} \, \mathrm{d}x \Big]^k \, \mathrm{d}t \\
&= k \times n^k \int_0^\infty \Big[\int_0^1 x^{t} x^{n-1} \, \mathrm{d}x \Big]^k \, \mathrm{d}t \\
&= k \times n^k \int_0^\infty \Big[\int_0^1 x^{n+t-1} \, \mathrm{d}x \Big]^k \, \mathrm{d}t \\
&= k \times n^k \int_0^\infty \Big[ \frac{x^{n+t}}{n+t} \Big|_0^1 \Big]^k \, \mathrm{d}t \\
&= k \times n^k \int_0^\infty \Big[\frac{1^{n+t}}{n+t} - \frac{0^{n+t}}{n+t} \Big]^k \, \mathrm{d}t \\
&= k \times n^k \int_0^\infty \Big[\frac{1}{n+t} \Big]^k \, \mathrm{d}t \\
&= k \times n^k \int_n^\infty u^{-k} \, \mathrm{d}u.
\end{align*}$

The expectation and the integral over $t$ can be exchanged because the integrand is nonnegative (Tonelli's theorem). The expectation of the product factors into a $k$th power because the $\boldsymbol{X}_i$ are independent and identically distributed. The last step substitutes $u = n + t$, so $\mathrm{d}u = \mathrm{d}t$ and the limits $t = 0$ and $t = \infty$ become $u = n$ and $u = \infty$.

For $k > 1$, we find

$\begin{align*}
E(\hat{n}_\mathrm{mle})
&= k \times n^k \int_n^\infty u^{-k} \, \mathrm{d}u \\
&= k \times n^k \frac{u^{1-k}}{1-k} \Big|_n^\infty \\
&= k \times n^k \frac{1}{u^{k-1}(1-k)} \Big|_n^\infty \\
&= k \times n^k \Big( 0 - \frac{1}{n^{k-1}(1-k)} \Big) \\
&= -\frac{k \times n^k}{n^{k-1}(1-k)} \\
&= -\frac{k \times n}{1-k} \\
&= n\frac{k}{k-1}.
\end{align*}$

For $k = 1$, the expected value diverges as

$\begin{align*}
E(\hat{n}_\mathrm{mle})
&= k \times n^k \int_n^\infty u^{-k} \, \mathrm{d}u \\
&= n \int_n^\infty u^{-1} \, \mathrm{d}u \\
&= n \log(u) \Big|_n^\infty \\
&= n \Big[ \lim_{u \to \infty} \log(u) - \log(n) \Big] \\
&= \infty.
\end{align*}$
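
As a cross-check that does not rely on the Laplace transform step, the same values can be reached through the Gamma distribution. This is a sketch that assumes only the density $p(x) = n x^{n-1}$ stated above. The symbol $G$ is introduced here for illustration.

Since $P(-\log \boldsymbol{X}_i > s) = P(\boldsymbol{X}_i < e^{-s}) = e^{-ns}$ for $s \geq 0$, each $-\log \boldsymbol{X}_i$ is exponentially distributed with rate $n$. The sum $G = -\sum_{i=1}^k \log( \boldsymbol{X}_i )$ is therefore Gamma distributed with shape $k$ and rate $n$, and for $k > 1$

$$
E(\hat{n}_\mathrm{mle})
= k E\Big(\frac{1}{G}\Big)
= k \int_0^\infty \frac{1}{g} \frac{n^k g^{k-1} e^{-n g}}{\Gamma(k)} \, \mathrm{d}g
= k \frac{n^k}{\Gamma(k)} \frac{\Gamma(k-1)}{n^{k-1}}
= n\frac{k}{k-1}.
$$

For $k = 1$ the integrand behaves like $n/g$ near $g = 0$, so the integral diverges.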


# Literature Review

[(Terelius et al., 2012)](#terelius2012distributed) report an identical result.


# Result

The expected value of the maximum likelihood population size estimator $\hat{n}_\mathrm{mle}$ is

$$
E(\hat{n}_\mathrm{mle})
= n\frac{k}{k-1}
$$

for $k>1$. For $k=1$, the expected value diverges.
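
The result can be checked empirically with a small Monte Carlo simulation. This is a sketch, not the notebook's own validation. The function name `simulate_mean_estimate`, the values $n = 100$ and $k = 10$, and the replicate count are assumptions chosen for illustration. It uses the fact that if $U$ is uniform on $(0, 1]$ then $\log(U)/n$ has the same distribution as $\log(\boldsymbol{X})$ under $p(x) = n x^{n-1}$.

```python
import math
import random


def simulate_mean_estimate(n, k, replicates, rng):
    """Average the estimator -k / sum(log(X_i)) over many replicates."""
    total = 0.0
    for _ in range(replicates):
        # 1 - rng.random() lies in (0, 1], which avoids log(0).
        # log(U) / n is distributed as log(X) for X with density n * x**(n - 1).
        log_sum = sum(math.log(1.0 - rng.random()) / n for _ in range(k))
        total += -k / log_sum
    return total / replicates


rng = random.Random(1)  # seeded for repeatability
n, k = 100, 10  # hypothetical population size and observation count
mean_estimate = simulate_mean_estimate(n, k, replicates=100_000, rng=rng)
expected = n * k / (k - 1)

# The simulated mean should be close to n * k / (k - 1), which exceeds n.
print(mean_estimate, expected)
```

The simulated mean sits above $n$, consistent with the upward bias factor $k/(k-1)$. That factor approaches 1 as $k$ grows.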


# References

<a
   id="terelius2012distributed"
   href="http://doi.org/10.1109/CDC.2012.6425912">
H. Terelius, D. Varagnolo and K. H. Johansson, "Distributed size estimation of dynamic anonymous networks," 2012 IEEE 51st IEEE Conference on Decision and Control (CDC), 2012, pp. 5221-5227, doi: 10.1109/CDC.2012.6425912.
</a>
