# P03: Point Processes

## Problem 1: Mean and variance of Gaussian random variable

Let us simulate the experiment of measuring a Gaussian random variable with mean $\mu=0$ and standard deviation $\sigma=1$. Each measurement is a draw from a standard normal distribution.

(i) Simulate 10 repetitions of this experiment by drawing $n_{\mathrm{samp}}=10$ samples from this distribution.

In [2]:
import numpy as np

# Parameters
mu = 0
sigma = 1
n_samp = 10
n_reps = 10

# Simulating 10 repetitions of drawing 10 samples
samples = np.random.normal(mu, sigma, (n_reps, n_samp))

print(samples)

[[ 0.18596878  1.25769569 -2.26003271  0.16180086  0.03540603  2.37001372
  -0.44626117 -1.20232799 -1.02225544  0.17747129]
 [ 1.13908121  0.1833797  -0.31566111  0.66199693  0.49851565  0.07725671
   0.48884498 -2.51503537 -0.65923549 -0.24083509]
 [-2.41838318  0.03604784 -0.67005602 -1.99995643  0.94067819  0.78042428
  -1.85106765  0.79235972 -0.36791975 -0.1928916 ]
 [-0.79296772 -0.59284281  0.33399968 -0.13004299  0.12375311  0.48139531
   1.73264781  0.36933436 -0.7146222   0.88984051]
 [-0.55933459 -0.26415729  0.05500363  1.09609434 -1.21687146  0.74316929
   0.66765461  0.25211435  1.58121629  1.73936241]
 [ 0.93168568 -0.00944482 -1.07113096 -2.06389163  0.05125423  0.73178613
   1.30585322  1.03783724 -0.18002798 -0.65741046]
 [-0.38347193  0.45907645  0.53886446 -1.34151891  1.80760496 -0.5294096
   0.36643489 -2.17839976  0.27580512 -1.38383154]
 [ 0.53601094  1.19307675 -0.65927961 -0.87093796  2.6795793  -1.03928139
  -0.58069058  2.63139212 -0.79200464  0.66972753]
 

(ii) Use `numpy` routines to estimate the mean and the variance of the samples. For the variance, use the trivial but biased estimator as well as the unbiased one.
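For reference, the estimators referred to here can be written out as follows, where $x_i$ are the $n_{\mathrm{samp}}$ draws of one repetition. The subscript labels are notation introduced only for this note. In `numpy`, `np.var` divides by `n - ddof`, so the default `ddof=0` gives the biased form and `ddof=1` the unbiased one.

$$\bar{x}=\frac{1}{n_{\mathrm{samp}}}\sum_{i=1}^{n_{\mathrm{samp}}}x_i,\qquad \hat{\sigma}^2_{\mathrm{biased}}=\frac{1}{n_{\mathrm{samp}}}\sum_{i=1}^{n_{\mathrm{samp}}}(x_i-\bar{x})^2,\qquad \hat{\sigma}^2_{\mathrm{unbiased}}=\frac{1}{n_{\mathrm{samp}}-1}\sum_{i=1}^{n_{\mathrm{samp}}}(x_i-\bar{x})^2.$$

The biased estimator underestimates the true variance on average: $\mathrm{E}[\hat{\sigma}^2_{\mathrm{biased}}]=\frac{n_{\mathrm{samp}}-1}{n_{\mathrm{samp}}}\sigma^2$.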

In [15]:
np.random.normal(2, 1, (10,))

array([ 2.98555645, -0.02075629,  1.0571324 ,  3.32090786,  2.20380181,
        2.19652335,  3.14205284,  3.16024399,  1.63291071,  2.22335813])

In [4]:
# Calculating the mean and the variance
sample_means = np.mean(samples, axis=1)
variances_biased = np.var(samples, axis=1) # biased variance
variances_unbiased = np.var(samples, axis=1, ddof=1) # unbiased variance (divides by n_samp - 1)

print(sample_means)

[-0.07425209 -0.06816919 -0.49507646  0.17004951  0.40942516  0.00765106
 -0.23688459  0.37675925 -0.29045298 -0.19732674]


## Problem 2: Distribution of sample mean

Write a function that repeats problem 1 $n$ times.

(i) Using these samples, investigate the distribution of the sample mean. How does it compare to your expectations? How do your conclusions change if you increase the sample size to $n_{\mathrm{samp}}=100$?

(ii) Using these samples, investigate the bias of the variance estimators. How does it compare to your expectations? How do your conclusions change if you increase the sample size to $n_{\mathrm{samp}}=100$?
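One possible way to set this up is sketched below. It is not the reference solution: the function name `repeat_experiment`, its arguments, and the choice of 20000 repetitions are assumptions made for illustration. The sketch compares the spread of the sample means with the expectation $\sigma/\sqrt{n_{\mathrm{samp}}}$, and the average of each variance estimator with its expected value.

```python
import numpy as np

def repeat_experiment(n_reps, n_samp, mu=0.0, sigma=1.0, seed=None):
    """Draw n_reps independent samples of size n_samp.

    Returns the per-sample mean, biased variance, and unbiased variance.
    (Hypothetical helper, not part of the problem sheet.)
    """
    rng = np.random.default_rng(seed)
    samples = rng.normal(mu, sigma, size=(n_reps, n_samp))
    means = samples.mean(axis=1)
    var_biased = samples.var(axis=1)            # divides by n_samp
    var_unbiased = samples.var(axis=1, ddof=1)  # divides by n_samp - 1
    return means, var_biased, var_unbiased

for n_samp in (10, 100):
    means, var_b, var_u = repeat_experiment(n_reps=20000, n_samp=n_samp, seed=0)
    print(f"n_samp = {n_samp}")
    print(f"  std of sample means:    {means.std(ddof=1):.4f} "
          f"(expected sigma/sqrt(n_samp) = {1 / np.sqrt(n_samp):.4f})")
    print(f"  mean biased variance:   {var_b.mean():.4f} "
          f"(expected (n_samp-1)/n_samp = {(n_samp - 1) / n_samp:.4f})")
    print(f"  mean unbiased variance: {var_u.mean():.4f} (expected 1)")
```

A histogram of `means` is a natural next step for comparing the shape of the distribution with a Gaussian of width $\sigma/\sqrt{n_{\mathrm{samp}}}$.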

## Problem 3: Analyzing samples

Load `sample1.out` from the `data` directory into your notebook. Inspect the distribution of the sample by plotting a histogram. 

(i) Write a class that estimates the mean and variance of a sample, as well as the error on the mean and the bias of the variance. Make the choice of variance estimator an argument of the class's `__init__` function.

(ii) Use your class to estimate the mean and variance of `sample1.out`. In addition, estimate the median, and plot the mean and median on top of your histogram.

(iii) Now also estimate the mean and variance of the remaining samples in the `data` directory. Are the samples consistent with being drawn from the same underlying distribution? If yes, investigate the distribution of the sample means and compare to your expectations.
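A minimal sketch of such a class is given below, under stated assumptions: the class name `SampleEstimator`, its method names, and the boolean `unbiased` flag are all hypothetical choices. The bias of the biased estimator is $-\sigma^2/n$; since $\sigma^2$ is unknown, the sketch plugs in the unbiased variance estimate. The format of `sample1.out` is not specified here, so the usage lines draw a synthetic sample instead. For a plain-text column of numbers, `np.loadtxt` would be one way to load the file.

```python
import numpy as np

class SampleEstimator:
    """Hypothetical helper: mean, variance, error on the mean, variance bias."""

    def __init__(self, unbiased=True):
        # unbiased=True divides by n - 1, unbiased=False divides by n
        self.unbiased = unbiased

    def mean(self, x):
        return float(np.mean(x))

    def variance(self, x):
        return float(np.var(x, ddof=1 if self.unbiased else 0))

    def error_on_mean(self, x):
        # Standard error of the mean, using the chosen variance estimator
        return float(np.sqrt(self.variance(x) / np.size(x)))

    def variance_bias(self, x):
        # Estimated bias E[estimator] - sigma^2 of the chosen estimator.
        # The biased estimator has bias -sigma^2 / n; sigma^2 is replaced
        # by the unbiased estimate because the true value is unknown.
        if self.unbiased:
            return 0.0
        return -float(np.var(x, ddof=1)) / np.size(x)

# Usage on a synthetic stand-in for the data file (assumption, see above)
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=1000)
est = SampleEstimator(unbiased=False)
print("mean:         ", est.mean(x))
print("variance:     ", est.variance(x))
print("error on mean:", est.error_on_mean(x))
print("variance bias:", est.variance_bias(x))
print("median:       ", float(np.median(x)))
```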

## Problem 4: The Poisson distribution

Let us assume we have a process in which the probability $p$ of a given event is small, but we perform a large number of trials $N$. We further assume that the so-called rate of this process (the mean number of occurring events), $\lambda=Np$, is finite and constant. Then the probability of $k$ events occurring follows a Poisson distribution given by $$P(k|\lambda)=\frac{\lambda^k e^{-\lambda}}{k!}.$$ The Poisson distribution thus describes the number of events when each individual event is rare but the number of trials is large enough that $\lambda$ stays fixed. An example of this is the number of photons reaching a telescope.
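The limit described above can be checked numerically. The sketch below compares the binomial probability of $k$ successes in $N$ trials with $p=\lambda/N$ to the Poisson formula as $N$ grows. The value $\lambda=2$, the range $k\le 10$, and the function names are arbitrary choices for illustration.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k events for rate lam."""
    return lam**k * exp(-lam) / factorial(k)

lam = 2.0  # illustrative rate, held fixed while N grows
for n in (10, 100, 10000):
    p = lam / n
    max_diff = max(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
                   for k in range(11))
    print(f"N = {n:6d}, p = {p:.4f}: "
          f"max |binomial - Poisson| for k <= 10 is {max_diff:.2e}")
```

The largest difference shrinks as $N$ increases at fixed $\lambda$, which is the sense in which the Poisson distribution is the rare-event limit of the binomial.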

One of the earliest applications of the Poisson distribution was to the number of Prussian soldiers kicked to death by horses. This problem was analyzed by Ladislaus Bortkiewicz in 1898. Using records from 10 Prussian army corps over 20 years (200 corps-years in total), he collected the following data:

| Number of deaths per corps and year | Number of corps-years |
|:---| :--- |
| 0 | 109 |
| 1 | 65 |
| 2 | 22 |
| 3 | 3 |
| 4 | 1 |

(i) Plot the probability distribution of these data.

(ii) Compare to the theoretical prediction using Poisson statistics.
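A sketch of the comparison, using only the table above: the empirical probabilities are the counts divided by their total, and the Poisson rate is estimated as the mean number of deaths per group. The variable names are illustrative, and plotting is left out so that the block depends only on `numpy`; a bar chart of `empirical` against `poisson` would cover the plotting part.

```python
import numpy as np
from math import exp, factorial

deaths = np.array([0, 1, 2, 3, 4])      # number of deaths (table, left column)
counts = np.array([109, 65, 22, 3, 1])  # number of groups (table, right column)

n_total = counts.sum()                  # total number of groups
empirical = counts / n_total            # empirical probability of each outcome
lam = (deaths * counts).sum() / n_total # estimated rate: mean deaths per group

# Poisson prediction for the same outcomes, using the estimated rate
poisson = np.array([lam**int(k) * exp(-lam) / factorial(int(k)) for k in deaths])
expected_counts = n_total * poisson

print(f"estimated rate lambda = {lam:.2f}")
for k, c, pe, pp, ec in zip(deaths, counts, empirical, poisson, expected_counts):
    print(f"k = {k}: observed {c:3d} (p = {pe:.3f}), "
          f"Poisson p = {pp:.3f} (expected count {ec:.1f})")
```

Note that the Poisson probabilities for $k=0,\dots,4$ do not sum exactly to one, because the distribution assigns a small probability to $k\ge 5$.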