# Problem Set 3: Estimators

## MLE Proof

Suppose we have a random sample $X_1, X_2, \dots, X_N$ whose assumed probability distribution depends on some unknown parameter $\theta$. The observed values of the sample are $x_1, x_2, \dots, x_N$. We want to find an estimator $u(X_1, X_2, \dots, X_N)$ such that $u(x_1, x_2, \dots, x_N)$ is a "good" estimate of $\theta$. A reasonable choice is the value of $\theta$ that maximises the probability, or the likelihood, of obtaining the data we actually observed.

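If the probability mass (or density) function of each $X_i$ is $f(x_i; \theta)$, then, because the $X_i$ in a random sample are independent, the joint probability mass (or density) function of $X_1, X_2, \dots, X_N$ is the product of the individual functions:

$$L(\theta) = P(X_1 = x_1, X_2 = x_2, \dots, X_N = x_N) = \prod_{i=1}^N f(x_i; \theta)$$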

Assuming that the $X_i$ are independent Bernoulli random variables with unknown parameter $p$, the probability mass function of each $X_i$ is

$$f(x_i; p) = p^{x_i} (1-p)^{1-x_i}$$

To estimate $p$, we need to find the value of $p$ that maximises the likelihood $L(p)$. Because the logarithm is a strictly increasing function, the value of $p$ that maximises $\ln \left[L(p)\right]$ also maximises $L(p)$, and the log-likelihood is easier to differentiate.

$$\ln[L(p)] = \ln \left[ \prod_i p^{x_i} (1-p)^{1-x_i} \right] = \sum_i \left[x_i \ln(p) + (1-x_i) \ln(1-p) \right]$$

Take the first derivative with respect to $p$, and set it to zero:

$$\frac{\partial}{\partial p} \ln[L(p)] = \sum_i \left[\frac{x_i}{p} - \frac{1-x_i}{1-p} \right] = 0$$

Multiply both sides by $p (1-p)$:

$$ \sum_i \left[(1-p) x_i - p (1 - x_i)\right] = \sum_i (x_i - p) = 0$$

$$\sum_i x_i - N p = 0$$

Solving for $p$ gives the maximum likelihood estimate, which is simply the sample mean:

$$\hat{p} = \frac{1}{N} \sum_i x_i$$
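As a quick check that this stationary point is a maximum rather than a minimum, we can differentiate the log-likelihood once more. This step is an addition to the derivation, using the same symbols:

$$\frac{\partial^2}{\partial p^2} \ln[L(p)] = -\sum_i \left[\frac{x_i}{p^2} + \frac{1-x_i}{(1-p)^2} \right] < 0 \quad \text{for } 0 < p < 1$$

Every $x_i$ is either 0 or 1, so each term in the sum is strictly positive. The log-likelihood is therefore concave on $0 < p < 1$, and the stationary point is its unique maximum.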

Source: https://onlinecourses.science.psu.edu/stat414/node/191
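The result can also be checked numerically. The following is a minimal sketch, not part of the cited source: it simulates Bernoulli data and compares the sample mean with the value of $p$ that maximises the log-likelihood over a grid. The seed, the value `true_p = 0.3`, the sample size and the helper name `bernoulli_log_likelihood` are all assumptions chosen for illustration.

```python
import math
import random


def bernoulli_log_likelihood(p, xs):
    """Log-likelihood ln L(p) of 0/1 observations xs under a Bernoulli(p) model."""
    return sum(x * math.log(p) + (1 - x) * math.log(1 - p) for x in xs)


random.seed(0)   # arbitrary seed, only so the run is repeatable
true_p = 0.3     # hypothetical "unknown" parameter used to simulate data
n = 1000         # hypothetical sample size

# Simulate n independent Bernoulli(true_p) draws as 0/1 values.
xs = [1 if random.random() < true_p else 0 for _ in range(n)]

# Closed-form estimate from the derivation: the sample mean.
p_hat = sum(xs) / float(len(xs))

# Brute-force check: evaluate ln L(p) on a grid strictly inside (0, 1).
grid = [k / 1000.0 for k in range(1, 1000)]
best_p = max(grid, key=lambda p: bernoulli_log_likelihood(p, xs))

print(p_hat, best_p)
```

Because the grid spacing is $1/N$ here, the sample mean falls exactly on a grid point, so the two printed values should agree.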

## Standardize and scale data

Suppose you have a set of data $x_1, x_2, \dots, x_N$ with mean 5, standard deviation 4 and variance 16. If $x_i = 9$, what is its standard score?

In [16]:
from __future__ import print_function, division

xi = 9
mu = 5
sd = 4

print((xi-mu)/sd)

1.0


Now let's multiply every data point by 1.5 and calculate the new mean, variance and standard deviation. Also calculate the new standard score $z$ of the point we considered earlier.

$$\mu' = \frac{1}{N} \sum_i x_i' = \frac{1}{N} \sum_i (1.5 x_i) = 1.5 \mu$$

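$$\sigma'^2 = \frac{1}{N} \sum_i (1.5 x_i - \mu')^2 = \frac{1}{N} \sum_i (1.5 x_i - 1.5 \mu)^2 = 2.25 \sigma^2$$

Taking the square root gives $\sigma' = 1.5 \sigma$, so the factor of 1.5 cancels in the standard score and $z$ is unchanged:

$$z' = \frac{1.5 x_i - 1.5 \mu}{1.5 \sigma} = \frac{x_i - \mu}{\sigma} = z$$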

In [17]:
xi = 1.5*xi
mu = 1.5*mu
sd = 1.5*sd

print(xi)
print((xi-mu)/sd)

13.5
1.0
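The same behaviour can be seen on a concrete data set. The sketch below is an illustration only: the list `data` is a hypothetical sample chosen so that its mean is 5 and its population standard deviation is 4, and `z_scores` is a helper name introduced here. It assumes Python 3, where the `statistics` module is part of the standard library.

```python
import statistics


def z_scores(values):
    """Standard scores using the population mean and standard deviation."""
    mu = statistics.mean(values)
    sd = statistics.pstdev(values)  # population sd (divides by N)
    return [(v - mu) / sd for v in values]


# Hypothetical data with mean 5, population variance 16 and sd 4.
data = [1.0, 9.0, 1.0, 9.0]
scaled = [1.5 * v for v in data]

# Mean and sd scale by 1.5, variance by 1.5**2 = 2.25.
new_mean = statistics.mean(scaled)
new_var = statistics.pvariance(scaled)
new_sd = statistics.pstdev(scaled)
print(new_mean, new_var, new_sd)

# The standard scores are the same before and after scaling.
z_before = z_scores(data)
z_after = z_scores(scaled)
print(z_before)
print(z_after)
```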


## Scatter plot spread

If we have the following plot of adult height vs child height,

![](scatterplot.png)

which of the two variables has the larger variance?

*Answer: adult height - look at how the data is spread out with respect to the mean for each variable.*

## Histogram averages

In the following histogram, what are the mean, median and mode?

![](histogram.png)

In [18]:
mean = (1 + 2*2 + 3*3 + 2*4 + 5 + 6 + 11)/(1 + 2 + 3 + 2 + 1 + 1 + 1)
print(mean)

4.0


The value 3 is both the median and the mode.
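These three values can be cross-checked by rebuilding the raw observations from the bar heights. The sketch below assumes the frequencies implied by the mean calculation above (value 1 once, 2 twice, 3 three times, 4 twice, and 5, 6 and 11 once each); the dictionary `counts` is a reading of that cell, not data taken from the image. It assumes Python 3 for the `statistics` module.

```python
import statistics

# Assumed histogram frequencies {value: count}, inferred from the mean calculation.
counts = {1: 1, 2: 2, 3: 3, 4: 2, 5: 1, 6: 1, 11: 1}

# Expand the frequencies into a sorted list of the 11 individual observations.
values = [v for v, c in sorted(counts.items()) for _ in range(c)]

mean = statistics.mean(values)      # sum of observations divided by their count
median = statistics.median(values)  # middle (6th) of the 11 sorted observations
mode = statistics.mode(values)      # most frequent value

print(mean, median, mode)
```

The single observation at 11 pulls the mean above the median and mode, which is typical of a right-skewed distribution.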