## 1. Maximum likelihood estimation

A set of multi-face dice contains dice with 4, 6, 8, 10, 12 and 20 faces.

One die from the set, with d faces, is rolled 10 times, and k of the 10 rolls show either 1 or 2. Using the maximum likelihood principle, estimate which die (that is, which d) was rolled.

Describe the method of estimation.

Obtain the number k of appearances of either 1 or 2 from the description page in the test application.

Enter the estimated number of faces on the rolled die d as the first answer in the quiz.

#### [Answer]

Let A be the event that a single roll of the die shows 1 or 2. Since all d faces are equally likely:

$$Pr(A) = 2 \times \frac{1}{d} = \frac{2}{d} $$

$$Pr( \lnot A) = 1-Pr(A) = 1 - \frac{2}{d} = \frac{d-2}{d}$$


Then the probability of event B, that 10 independent rolls of the die show either 1 or 2 exactly k times, is binomial:

$$
\begin{split}
Prob &= \binom{10}{k}Pr(A)^kPr(\lnot A)^{10-k} \\
&= \frac{10!}{k!(10-k)!}Pr(A)^kPr(\lnot A)^{10-k} \\
&= \frac{10!}{k!(10-k)!}\left(\frac{2}{d}\right)^k\left(\frac{d-2}{d}\right)^{10-k}  \\
&= \frac{10!}{k!(10-k)!}\frac{2^k(d-2)^{10-k}}{d^{10}} 
\end{split}\tag{1.2}
$$

$$
\begin{split}
\log Prob &= \log{10!}+k\log2 + (10-k)\log{(d-2)} - \log{k!} - \log{(10-k)!} - 10\log{d} \\
&= \log{10!}+k\log2 + (10-k)\log{(d-2)} - \sum_{i=1}^{k}\log{i} - \sum_{i=1}^{10-k}\log{i} - 10\log{d}
\end{split}\tag{1.3}
$$

In [11]:
import math
import numpy as np
import pandas as pd
def logP(k, d):
    # Log-probability that 10 rolls of a d-faced die show 1 or 2 exactly k times
    return np.log(math.factorial(10)) + k*np.log(2) + (10-k)*np.log(d-2) \
            - np.log(math.factorial(k)) - np.log(math.factorial(10-k)) - 10 * np.log(d)

In [12]:
dices_list = [4, 6, 8, 10, 12, 20]
appearance_times = 11  # k takes the values 0, 1, ..., 10
# One column per die, one row per k; each entry is the log-likelihood logP(k, dice)
result = pd.DataFrame(
    {dice:{k:logP(k,dice) for k in range(appearance_times)} for dice in dices_list},
)
# For every k, pick the die with the largest log-likelihood
result['maximum_dices'] = [dices_list[row.argmax()] for row in result.values]
result.index.name = '# Appearance'
result.columns.name = '# dices'
result

| # Appearance | 4 | 6 | 8 | 10 | 12 | 20 | maximum_dices |
|---|---|---|---|---|---|---|---|
| 0 | -6.931472 | -4.054651 | -2.876821 | -2.231436 | -1.823216 | -1.053605 | 20 |
| 1 | -4.628887 | -2.445213 | -1.672848 | -1.315145 | -1.130068 | -0.948245 | 20 |
| 2 | -3.124809 | -1.634283 | -1.267383 | -1.197362 | -1.235429 | -1.641392 | 10 |
| 3 | -2.14398 | -1.346601 | -1.385166 | -1.602827 | -1.864038 | -2.857787 | 6 |
| 4 | -1.584364 | -1.480132 | -1.924162 | -2.429505 | -2.91386 | -4.495396 | 6 |
| 5 | -1.402043 | -1.990958 | -2.840453 | -3.633478 | -4.340976 | -6.510299 | 4 |
| 6 | -1.584364 | -2.866427 | -4.121387 | -5.202094 | -6.132736 | -8.889845 | 4 |
| 7 | -2.14398 | -4.11919 | -5.779615 | -7.148004 | -8.301789 | -11.646685 | 4 |
| 8 | -3.124809 | -5.793166 | -7.859057 | -9.515128 | -10.892056 | -14.824739 | 4 |
| 9 | -4.628887 | -7.990391 | -10.461746 | -12.4055 | -14.005572 | -18.526041 | 4 |
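
The table can be cross-checked with a short standalone sketch that needs only the standard library. It evaluates the same binomial log-likelihood with success probability 2/d and returns the die that maximizes it for a given k. The names `log_likelihood` and `mle_die` are illustrative and not part of the notebook above.

```python
import math

def log_likelihood(k, d, n=10):
    """Log-probability of k appearances of 1 or 2 in n rolls of a d-faced die."""
    p = 2 / d  # probability that a single roll shows 1 or 2
    # For d >= 4 we have 0 < p <= 0.5, so both logarithms are finite
    return (math.log(math.comb(n, k))
            + k * math.log(p)
            + (n - k) * math.log(1 - p))

def mle_die(k, dice=(4, 6, 8, 10, 12, 20), n=10):
    """Die from the set whose log-likelihood is largest for the observed k."""
    return max(dice, key=lambda d: log_likelihood(k, d, n))

print(mle_die(2))  # 10, matching the k = 2 row of the table
```

A quick sanity check on the result: the unrestricted binomial estimate of the success probability is k/10, which corresponds to d = 20/k, and the selected die is the member of the set that fits this value best in likelihood terms.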


## 2. Testing hypotheses using likelihood ratio

The random variable X has a uniform distribution on [0,T].

In [13]:
test_sample = pd.read_csv('test_sample.csv')

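In [14]:
def pdf(x,T):
    # Density of the uniform distribution on [0,T]: 1/T inside the interval, 0 outside
    return np.where((x >= 0) & (x <= T), 1/T, 0.0)
def cdf(x,T):
    # Distribution function of the uniform distribution on [0,T], clipped to [0,1]
    return np.clip(x/T, 0.0, 1.0)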

#### 2.1 Maximum likelihood estimate

Derive the maximum likelihood estimate for T given the i.i.d. sample x1,…,xn.

Enter the Maximum Likelihood estimate of T as answer to the second question in the quiz.

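The likelihood of the sample is

$$L(T)=\prod_{i=1}^{n}f(x_i)=\left(\frac{1}{T}\right)^n \text{ if } T \ge \max_i x_i, \qquad L(T)=0 \text{ otherwise}$$

Wherever the likelihood is positive it decreases in T, so it is maximized by the smallest admissible value of T:

$$\hat{T}_{MLE}=\max_i x_i$$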

In [15]:
def likelihood(T, n):
    # Likelihood (1/T)^n of n observations; valid only when T >= max(x_i)
    return (1 / T) ** n

In [16]:
test_sample.max()

x    9.746431
dtype: float64
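
To illustrate why the sample maximum is the estimate, the sketch below evaluates the likelihood at a few candidate values of T. It uses synthetic data drawn with a fixed seed rather than `test_sample.csv`, and the helper `uniform_likelihood` is a hypothetical name introduced only for this illustration.

```python
import random

def uniform_likelihood(T, sample):
    """Likelihood of T for an i.i.d. sample assumed to come from Uniform[0, T]."""
    if T <= 0 or max(sample) > T:
        return 0.0  # at least one observation lies outside [0, T]
    return (1.0 / T) ** len(sample)

random.seed(0)
sample = [random.uniform(0, 10) for _ in range(10)]  # synthetic data, true T = 10
t_hat = max(sample)  # candidate maximum likelihood estimate

# Below t_hat the likelihood is zero; above it the likelihood shrinks as T grows
for T in (0.9 * t_hat, t_hat, 1.1 * t_hat, 2 * t_hat):
    print(f"T = {T:7.3f}  likelihood = {uniform_likelihood(T, sample):.3e}")
```

The likelihood jumps from zero to its largest value exactly at the sample maximum, so the maximum is located at a boundary rather than at a point where a derivative vanishes.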

#### 2.2 Method of moments estimate

Recall that an estimate of the same parameter can be obtained by the method of moments. Derive it using the equation for the first moment of the uniform distribution on [0,T].

Enter the Method of Moments estimate of T in the quiz as the answer to the third question.

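The first moment can be obtained from the moment generating function of X, using the density f(x) = 1/T on [0,T]:

$$
\begin{split}
M_X(t) &=E(e^{Xt}) \\
&=\int_0^T{e^{xt}f(x)dx} \\
&=\int_0^T{e^{xt}\frac{1}{T}dx} \\
&=\frac{1}{T}\int_0^T{e^{xt}dx} \\
&=\frac{1}{T}\left.\frac{e^{xt}}{t}\right|_0^T \\
&=\frac{1}{T}\frac{e^{Tt}-1}{t} \\
&=\frac{e^{Tt}-1}{Tt}
\end{split}
$$

$$
\begin{split}
\frac{\partial M_X(t)}{\partial t} &= \frac{Te^{Tt}Tt-T(e^{Tt}-1)}{(Tt)^2} \\
&= \frac{(Tt-1)e^{Tt}+1}{Tt^2}
\end{split}
$$

The first moment is the limit of this derivative as t tends to 0, which has the indeterminate form 0/0. By applying L'Hôpital's rule,
$$
\begin{split}
E(X)=\lim_{t \to 0}\frac{\partial M_X(t)}{\partial t} &=\lim_{t \to 0}\frac{(Tt-1)e^{Tt}+1}{Tt^2} \\
&= \lim_{t \to 0}\frac{Te^{Tt}+(Tt-1)Te^{Tt}}{2Tt} \\
&= \lim_{t \to 0}\frac{T^2te^{Tt}}{2Tt} \\
&= \lim_{t \to 0}\frac{Te^{Tt}}{2} \\
&= \frac{T}{2}
\end{split}
$$

Equating the first moment to the sample mean gives the method of moments estimate:

$$\bar{x}=\frac{\hat{T}_{MM}}{2} \quad\Rightarrow\quad \hat{T}_{MM}=2\bar{x}$$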
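
As a cross-check that avoids the moment generating function, the same first moment follows directly from the density f(x) = 1/T on [0,T]:

$$
\begin{split}
E(X) &= \int_0^T{x\frac{1}{T}dx} \\
&= \frac{1}{T}\left.\frac{x^2}{2}\right|_0^T \\
&= \frac{T}{2}
\end{split}
$$

This agrees with the limit T/2 obtained above.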


In [17]:
test_sample.mean()*2

x    10.637446
dtype: float64
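
The two estimators can disagree noticeably on small samples. The sketch below compares them on made-up observations (not the contents of `test_sample.csv`); the function names `mle_estimate` and `mom_estimate` are illustrative.

```python
def mle_estimate(sample):
    """Maximum likelihood estimate of T for Uniform[0, T]: the sample maximum."""
    return max(sample)

def mom_estimate(sample):
    """Method of moments estimate of T: twice the sample mean."""
    return 2 * sum(sample) / len(sample)

sample = [1.0, 2.0, 9.0]  # hypothetical observations
print(mle_estimate(sample))  # 9.0
print(mom_estimate(sample))  # 8.0
```

In this made-up case the method of moments estimate (8) falls below the largest observation (9), a value of T under which the data could not have occurred. The maximum likelihood estimate can never fall below an observation. For the notebook's sample the issue does not arise, since 10.637 exceeds the sample maximum 9.746.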

#### 2.3 Test hypothesis using likelihood ratio

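In [18]:
T0 = 10
# Likelihood ratio L(T0) / L(max x_i) = (max x_i / T0)^n; the exponent 10 is taken to be the sample size n
(test_sample.max() / T0) ** 10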

x    0.773493
dtype: float64
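
The computation in the cell above can be written as a small reusable function. This is a sketch under two assumptions: the null hypothesis is T = T0 against an unrestricted T, and the exponent is the sample size. The name `likelihood_ratio` and the example data are hypothetical.

```python
def likelihood_ratio(sample, T0):
    """Ratio L(T0) / L(T_hat) for an i.i.d. sample assumed to come from Uniform[0, T]."""
    n = len(sample)
    t_hat = max(sample)  # unrestricted maximum likelihood estimate of T
    if t_hat > T0:
        return 0.0  # an observation above T0 has zero likelihood under T = T0
    # L(T0) / L(t_hat) = (1/T0)^n / (1/t_hat)^n = (t_hat / T0)^n
    return (t_hat / T0) ** n

sample = [1.0, 2.0, 5.0]  # hypothetical observations
print(likelihood_ratio(sample, T0=10.0))  # 0.125
```

If T = T0 holds, the probability that the sample maximum is at most m equals (m/T0)^n, so the ratio itself is uniformly distributed on [0,1]. A test at level α can therefore reject T = T0 when the ratio is below α. By that rule, a ratio of about 0.77 would not lead to rejection at the usual levels.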