# HW3 problems

### Learning objectives
In this homework you will:

- See that you can use $\sigma$ as a measure of resolution even for distributions that are not Gaussian
- Learn how to write a simple Monte Carlo and use it to reproduce an analytical result
- Understand what position resolution means using a silicon strip detector as an example
- Demonstrate that you can use simulation to solve problems that are more complicated than what can be done analytically
- Learn how noise affects measurements


## Problem 1: Measurement Uncertainties (8 pts)

The [*Central Limit Theorem*](https://en.wikipedia.org/wiki/Central_limit_theorem) tells us  that 
the distribution of the sum (or average) of a large number of independent, identically 
distributed measurements will be approximately normal, regardless of the underlying distribution
(subject to the condition that mean and variance of the underlying distribution are not infinite).
We'll see how this works for the simplest pdf ([probability density function](https://www.khanacademy.org/math/statistics-probability/random-variables-stats-library/random-variables-continuous/v/probability-density-functions)), a random variable $x$ uniformly distributed:

$$f(x) = \begin{cases} \frac{1}{b-a} &\mbox{if } a \leq x \leq b \\ 
0 & \mbox{otherwise} \end{cases} $$
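
As an aside, the theorem is not specific to the uniform pdf. The following is a minimal sketch, assuming an exponential distribution with unit mean (chosen here only for illustration; the names `rng`, `n_experiments`, `skewness`, and `skew_by_n` are not part of the assignment). It averages $N$ draws many times and tracks the skewness of those averages, which is zero for a normal distribution:

```python
import numpy as np

# Assumption: an exponential pdf (mean 1, strongly skewed) stands in for a
# generic non-Gaussian distribution; it is not part of the assignment.
rng = np.random.default_rng(seed=1)
n_experiments = 20000

def skewness(values):
    """Sample skewness: third central moment divided by sigma cubed."""
    centered = values - values.mean()
    return np.mean(centered**3) / np.mean(centered**2) ** 1.5

skew_by_n = {}
for n in (1, 4, 16, 64):
    # Each row is one experiment of n draws; average along the row
    averages = rng.exponential(scale=1.0, size=(n_experiments, n)).mean(axis=1)
    skew_by_n[n] = skewness(averages)
    print(f"N = {n:2d}: skewness of the averages = {skew_by_n[n]:.3f}")
```

The printed skewness should shrink toward zero as $N$ grows, which is one way to see the averages becoming "approximately normal".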

### 1a. (2 pts)

Show that the mean ($\mu$) is $\frac{b+a}{2}$ and the variance ($\sigma^2$) is $\frac{(b-a)^2}{12}$ for the above distribution.
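
A numerical cross-check can be useful once the derivation is done. The sketch below is one possible check, not a substitute for the analytic answer: it integrates the pdf with a midpoint rule for hypothetical bounds $a=2$, $b=5$ (the function name `uniform_moments` and these bounds are assumptions made for illustration).

```python
import numpy as np

def uniform_moments(a, b, n=100_000):
    """Midpoint-rule estimate of the mean and variance of a uniform pdf on [a, b]."""
    dx = (b - a) / n
    x = a + dx * (np.arange(n) + 0.5)      # midpoints of n equal slices
    pdf = np.full(n, 1.0 / (b - a))        # f(x) = 1/(b-a) inside [a, b]
    mean = np.sum(x * pdf) * dx            # integral of x f(x)
    var = np.sum((x - mean) ** 2 * pdf) * dx  # integral of (x - mean)^2 f(x)
    return mean, var

a, b = 2.0, 5.0  # hypothetical bounds, for illustration only
mean, var = uniform_moments(a, b)
print(mean, (b + a) / 2)        # numerical vs. stated mean
print(var, (b - a) ** 2 / 12)   # numerical vs. stated variance
```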

Write your answer here

### 1b. (2 pts)

Let $a=0$ and $b=1$. Using your favorite random number generator, generate 1000 random numbers from the uniform distribution, $f(x)$.  Calculate the mean and variance of the numbers you generate.  

Hint: You can use the NumPy (Numerical Python) library to generate random numbers from the (0,1) uniform distribution:

In [None]:
#Import the NumPy library as "np"
import numpy as np

#Use NumPy to generate a list (it's actually a numpy.ndarray) of 1000 random numbers
samples = np.random.rand(1000)

#Print the first 10 random numbers
print(samples[0:10])
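
If you want your numbers to be reproducible from run to run, one option is NumPy's seeded `Generator` interface. This is a sketch of an alternative to `np.random.rand`, not a requirement of the assignment; the seed value 42 and the variable names are arbitrary choices.

```python
import numpy as np

# Two generators created with the same seed produce the same sequence
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

samples_a = rng_a.random(1000)   # 1000 floats, uniform on [0, 1)
samples_b = rng_b.random(1000)

print(np.array_equal(samples_a, samples_b))
print(samples_a.min() >= 0.0, samples_a.max() < 1.0)
```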

Write your own functions to find the mean and variance of a list of numbers:

In [None]:
def find_mean(num_list):
    """Compute mean of an input list of numbers
    
    Parameters
    ==========
    num_list : list of floats
      given as a NumPy array of numbers
      
    Returns
    =======
    mean : float
      the mean of num_list
    """
    
    # Your code to calculate mean here
    
    return mean

def find_variance(num_list):
    """Compute variance of an input list of numbers
    
    Parameters
    ==========
    num_list : list of floats
      given as a NumPy array of numbers
      
    Returns
    =======
    variance : float
      the variance of num_list
    """    
    
    # Your code to calculate variance here
    
    return variance
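
One detail worth settling before comparing your result to a library value: `np.var` divides by $n$ by default (`ddof=0`), while the unbiased sample variance divides by $n-1$ (`ddof=1`). The sketch below uses a small made-up array to show the two conventions; for 1000 samples they differ by only about 0.1%.

```python
import numpy as np

data = np.array([1.0, 2.0, 3.0, 4.0])  # made-up numbers, for illustration only

population_var = np.var(data)          # divides by n
sample_var = np.var(data, ddof=1)      # divides by n - 1

print(population_var, sample_var)
```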

### 1c. (1 pt)

Do these numerical results agree with the analytical results you found above?

Write your answer here

### 1d. (1 pt)

Make a histogram  with 100 bins where the lower edge of the first bin is at $x=0$ and the upper
edge of the last is at $x=1$.  Fill your histogram with the random numbers you generated above.
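
The binning described above can be inspected without drawing anything. The following sketch assumes `np.histogram` as a stand-in for the plotting call (it accepts `bins` and `range` arguments like `plt.hist`) and a seeded generator for reproducibility; both are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
values = rng.random(1000)  # uniform on [0, 1)

# 100 equal-width bins spanning exactly [0, 1]
counts, edges = np.histogram(values, bins=100, range=(0.0, 1.0))

print(len(counts), edges[0], edges[-1])  # number of bins and outer edges
print(counts.sum())                      # every sample lands in some bin
```

With 1000 samples in 100 bins, each bin holds about 10 entries on average, so sizeable bin-to-bin fluctuations are expected.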

We'll use the **matplotlib.pyplot** module to make several histograms throughout the assignment, so we should go ahead and import it now.  Making a histogram from a list of numbers is as simple as calling [plt.hist()](https://matplotlib.org/api/_as_gen/matplotlib.pyplot.hist.html), with the list as the input parameter.  There are many optional parameters, like the number of bins and the color of the bars.

Since our histograms will share similar formatting, it's also useful to define a function rather than typing the same thing over and over.  For this homework, we'll supply the histogramming routine.  In the future, you will write this code yourself.

Take some time to play with the plot formatting and choose a [color](https://matplotlib.org/2.0.2/api/colors_api.html) you like for the bars.  Then call the function with your random samples.

In [None]:
#Import the pyplot module of matplotlib as "plt"
import matplotlib.pyplot as plt


#Makes a histogram filled with the random numbers we generate
def plot_histogram(samples,xtitle,ytitle, title, limits):
    """Create and plot a histogram of an array of numbers
    
    Parameters
    ==========
    samples : 1D numpy array
      the numbers to histogram
      
    xtitle : string
      the label of the x-axis of the histogram
      
    ytitle : string
      the label of the y-axis of the histogram
      
    title : string
      the title added to the histogram
      
    limits : tuple of floats
      the x-axis lower and upper limit
      
    Returns
    =======
    Nothing!
    """
    #It would be nice to have the mean and standard deviation in the title, so let's get these
    mean, sigma = np.mean(samples), np.sqrt(np.var(samples))
    #Plot the histogram of the sampled data with 100 bins and a nice color
    plt.hist(samples, bins=100, range=limits, color=(0,0.7,0.9))  #Set the color using (r,g,b) values or
                                                                  #  use a built-in matplotlib color

    #Add some axis labels and a descriptive title
    plt.xlabel(xtitle)
    plt.ylabel(ytitle)
    plt.title(title + '\n' + r' $\mu={0:.3f},\ \sigma={1:.3f}$'.format(mean, sigma))

    #Get rid of the extra white space on the left/right edges (you can delete these two lines without a problem)
    xmin, xmax, ymin, ymax = plt.axis()
    plt.axis([limits[0],limits[1],ymin,ymax])

    #Not necessarily needed in a Jupyter notebook, but it doesn't hurt
    plt.show()

In [None]:
#Write your answer here

### 1f. (1 pt)

Now suppose you make an ensemble of 1000 pseudoexperiments where each pseudoexperiment consists of $N$ uniformly distributed random numbers.  For each pseudoexperiment, define the measurement $S$ to be

$$
S \equiv \frac{1}{N} \sum_{i=1}^N x_i
$$

Make histograms of $S$ with the same $x$-axis as above for the cases $N=2$, $N=5$
and $N=10$.  Determine the mean and the $\sigma$ of the distributions displayed in
these histograms.  

In [None]:
#Calculates and returns S as defined above
def pseudoexperiment(N):
    """Run a single pseudoexperiment consisting of N measurements
    
    Parameters
    ==========
    N : int
      number of uniformly sampled random numbers per pseudoexperiment
      
    Returns
    =======
    s : float
      the measurement result as defined in the text above
    """
    samples = np.random.rand(N)  #samples = [x_1, x_2, ... x_N]
    s = 0  #replace this with your calculation of s
    
    """Your code here"""
    
    return s

#Runs the pseudoexperiment 1000 times and plots a histogram of the results
def run_pseudoexperiments(N):
    """Run a set of pseudoexperiments each with N random numbers
    
    Parameters
    ==========
    N : int
      number of random numbers in each pseudoexperiment
      
    Returns
    =======
    Nothing!
    """
    s_list = []
    #run the ensemble of 1000 pseudoexperiments and store measurements
    for i in range(1000):
        s_list.append(pseudoexperiment(N))
    #plot a histogram of these 1000 measurements
    plot_histogram(s_list,"Mean Value of x","Number of Entries","1000 PseudoExperiments, each with "+str(N)+" Randomly Distributed x values",[0.0,1.0])

In [None]:
# Run this cell after you have completed the code above
for N in [2,5,10]:
    run_pseudoexperiments(N)

### 1g. (1 pt)

Compare the values of $\sigma$ you obtain to what you would predict if you assumed the experiments followed a uniform distribution.

Write your answer here

## Problem 2: Silicon Detector Position Resolution – Analytic Calculation (6 pts)

In this problem and the next we will study how the position resolution of a detector depends upon the properties of that detector. For our example, we will consider a silicon micro-strip detector. We will describe our detector as a plane segmented into strips, each of width $\ell$. When a track passes through the plane, it deposits energy in the detector and that energy is collected using charge-sensitive amplifiers (one per strip). You may assume that the incident track is normal to the silicon plane. Looking down on the strip detector (so that the incident tracks are traveling into the page), the detector looks like this:

<img src="strips.png" alt="Drawing" style="width: 600px;"/>

The position $x=0$, $y=0$ is taken to be the center of the middle strip.

### 2a. (3 pts)

Suppose all the energy is deposited in a single strip (the strip the
track passes through).  Find an expression for the position resolution
of the detector as a function of $\ell$. The position resolution is defined
to be $\sigma_x = \sqrt{var[(x_{meas}-x_{true})]}$ where $x_{true}$ is the 
position where the track actually hit the detector.  Because we only
know which strip is hit, in this example 
$x_{meas}$ is the center of the strip that is hit.

Write your answer here

### 2b. (3 pts)

Suppose that 
the charge deposited in our detector spreads out due
to physical effects such as diffusion. It is possible for more than
one strip to register a signal.  Assume in this part that our electronics
is binary (i.e. registers a 1 if the deposited energy on the
strip is above a specified threshold
and 0 otherwise). Assume the threshold on the electronics is
such that particles hitting within
a  distance of $\pm \ell/3 $ of the center of the strip only register on
a single strip while all particles hitting further from the strip
center register on two strips.

What is the position resolution now? (Here, if
only one strip is hit, $x_{meas}$ is the center of the strip.  If
two strips are hit, then $x_{meas}$ is the common edge of the two
hit strips).  **Note:** this is *not* an unrealistic example.  The ATLAS silicon
strip detector has such binary readout.

Write your answer here

## Problem 3: Silicon Detector Position Resolution – Monte Carlo Calculation (6 pts)

In problem 2, it was possible to calculate the position resolution analytically. In cases where the detector response is more complicated, this may not be the case. Typically, physicists model detector performance using Monte Carlo simulations. In this problem, you will write a simple simulation to determine the position resolution of a silicon detector.


### 3a. (1 pt)

Let's begin by reproducing the analytic results obtained in problem **2**.
Consider a silicon strip detector consisting of several strips of width $\ell$.  
Assume that the incident particles have a uniform distribution
in $x$ with $-\ell/2 < x < \ell/2$ and all have $y=0$. (We'll just focus on this "center strip", so $x_{meas}$ can either be $-\ell/2$, $0$, or $\ell/2$ depending on where the particle hits.)
Generate 10,000 such particles for
the case described in part **2(a)** and for the case described
in part **2(b)**.
For each case, make a histogram of ($x_{meas}-x_{true}$) and 
verify that
the resolution 
is consistent with that  obtained in problem **2**. 

You can use np.random.uniform() to sample from a uniform distribution with arbitrary bounds. We'd like 10,000 samples for this problem, so we set size = 10000. Note that we take $\ell = 1$ for simplicity.
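
Setting $\ell = 1$ loses no generality, because every length in the problem scales with $\ell$. The sketch below illustrates this under assumed values: a hypothetical strip width of 0.08 (arbitrary units) and a seeded generator, neither of which comes from the assignment.

```python
import numpy as np

rng = np.random.default_rng(seed=3)
unit_hits = rng.uniform(-0.5, 0.5, size=10000)   # hit positions for ell = 1

ell = 0.08  # hypothetical strip width, for illustration only
scaled_hits = ell * unit_hits                    # uniform on (-ell/2, ell/2)

# The spread scales linearly with the strip width
print(np.std(scaled_hits) / np.std(unit_hits), ell)
```

Any resolution you find for $\ell = 1$ can therefore be converted to a physical strip width by multiplying by $\ell$.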


In [None]:
samples = np.random.uniform(-0.5, 0.5, size=10000)

For the case of **2(a)**, $x_{meas}=0$ (the center of the strip), so we can simply find the variance of these samples and take the square root to find the position resolution.  Note that this is really just a repetition of the first problem, but with our bounds shifted from (0,1) to (-0.5,0.5).

In [None]:
#Write your answer here

### 3b. (2 pts)

For the case of **2(b)**, we need to consider the three different cases for $x_{meas}$. Comment on how the resolution compares.

In [None]:
#We'll call the quantity (x_meas - x_true) the "error" of the measurement
errors = []

#Go through each of our random samples and append (x_meas - x_true) to the error list
for x_true in samples:
    #particle hits the left third of strip
    if x_true < -1/3:
        x_meas = """Your code here"""
    #particle hits the right third of strip
    elif x_true > 1/3:
        x_meas = """Your code here"""
    else:
        x_meas = """Your code here"""
        
    errors.append(x_meas - x_true)

plot_histogram(errors,"x value","Number of Entries","Measured-True Position",[-0.5,0.5])

In [None]:
#Write your answer here

### 3c.  (3 pts)

Now, let us replace our binary electronics from part 2(b) with analog electronics (so that the magnitude of the charge deposited on the strip is recorded).  We will model the transverse spreading of the charge from our incident
track using a Gaussian distribution with width $\sigma_M$: 

$$ f(x)\ dx = \frac{1}{\sigma_M\sqrt{2\pi}} \exp(-(x-x_0)^2/2\sigma_M^2)\ dx$$

where $f(x)$ is the charge deposited between position $x$ and $x+dx$ and $x_0$ is the point where the track hits the detector.
Assume that the total energy deposited by each track is 1 MIP (a MIP is the energy deposited by a single minimum ionizing particle), that our analog electronics has a threshold of 0.2 MIP and that $\sigma_M=\ell$.  Also assume that the electronics has an intrinsic noise contribution $\sigma_N=0.1$ MIP. (This means that the measurement of the charge on each strip is modified by adding a noise contribution that is distributed according
to a Gaussian with mean 0 and variance $\sigma_N^2$.  Assume that the
noise on neighboring strips is uncorrelated.)

Generate 10,000 particles and simulate the response of this
silicon strip detector (**using 7 strips now**) to these particles.  From this simulation
determine the position resolution of the silicon detector.
Assume that in the analysis of these data the measured position of the particle is:
$$
x_{meas} = \sum_{i=strips} q_i x_i
$$
where the index $i$ is the strip number, $q_i$ is the measured
charge on the strip (set to zero for strips with charge below
threshold) and $x_i$ is the position of the center of strip $i$.

Once again, assume that the incident particles have a uniform distribution
in $x$ with $-\ell/2 < x < \ell/2$.
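
The noise and threshold steps described above can be sketched on their own before building the full simulation. In the example below the noise-free charges are made-up numbers used only to illustrate the mechanics; `sigma_n` and `threshold` take the values quoted in the problem statement, and the variable names are assumptions.

```python
import numpy as np

rng = np.random.default_rng(seed=7)
sigma_n = 0.1      # noise width in MIP, from the problem statement
threshold = 0.2    # threshold in MIP, from the problem statement

# Hypothetical noise-free charges on 7 strips (illustrative values only)
true_charges = np.array([0.0, 0.05, 0.25, 0.40, 0.25, 0.05, 0.0])

# Independent Gaussian noise on every strip (uncorrelated between strips)
noisy_charges = true_charges + rng.normal(0.0, sigma_n, size=true_charges.size)

# Strips below threshold are read out as zero
read_out = np.where(noisy_charges >= threshold, noisy_charges, 0.0)
print(read_out)
```

Working with NumPy arrays rather than Python lists keeps the noise addition element-wise.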

We'll take $\sigma_M = \ell = 1$ for simplicity.  To keep track of our detector geometry, we'll first create a list of each strip's centers along with a list of its left/right bounds.  

In [None]:
num_strips = 7  #Keep this odd so the problem is symmetric
centers = []
bounds = []

for i in range(-int(num_strips/2),int(num_strips/2)+1):
    centers.append(i)
    bounds.append([i-0.5, i+0.5])
    
print(centers, bounds)

Next, we'll use the error function to integrate over the Gaussian distribution while finding the charge deposited on each strip.  We'll also implement the weighted sum in finding the measured position.
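
As a reminder of how the error function relates to a Gaussian integral, here is a generic sketch that is independent of the detector geometry. The helper name `gaussian_fraction` and its arguments are assumptions made for illustration; it returns the fraction of a normalized Gaussian lying between two limits.

```python
from math import erf, sqrt

def gaussian_fraction(lo, hi, mu, sigma):
    """Fraction of a normalized Gaussian (mean mu, width sigma) between lo and hi."""
    z_lo = (lo - mu) / (sigma * sqrt(2.0))
    z_hi = (hi - mu) / (sigma * sqrt(2.0))
    return 0.5 * (erf(z_hi) - erf(z_lo))

# Familiar check: about 68.27% of a Gaussian lies within one sigma of the mean
print(gaussian_fraction(-1.0, 1.0, 0.0, 1.0))
```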

In [None]:
#Import the error function to help integrate the gaussian distribution
from math import erf

def get_charge(i,x):
    """Finds the charge deposited on strip i with a hit at location x
    
    Parameters
    ==========
    i : int
      index of the strip in the centers and bounds lists
      
    x : float
      the true x-coordinate of the hit
      
    Returns
    =======
    charge : float
      the charge measured by the strip indexed by i given a hit coordinate x
    """
    charge=0 # replace this with your code below
    
    """Your code here.  Using the error function erf() is convenient/fast, but
       feel free to use an integration package like scipy.integrate or write your
       own numerical integration function.  The left and right bound of each strip
       is contained in bounds[i][0] and bounds[i][1], respectively. """
    
    return charge
                         
def find_x_meas(charges, cutoff):
    """Finds the measured particle position as given by the expression in the
    text above
    
    Parameters
    ==========
    charges : list of floats
      a list representing the charge measured at each silicon strip
      
    cutoff : float
      the value below which a measured charge is registered as a zero
      
    Returns
    =======
    xmeas : float
      the measured x-coordinate based on the charges seen in each strip
    """
    xmeas = 0 # replace this with your code below
    
    """Your code here.  Use the weighted sum definition given in the problem statement.
       Note that x_i = centers[i]."""
    
    return xmeas

Finally, we'll write a function that finds the position resolution for any intrinsic noise and charge threshold.

In [None]:
#noise and cutoff correspond to sigma_N and the threshold described in the problem statement
def test_analog_electronics(noise, cutoff):
    """Finds the position resolution given intrinsic electronics noise and
    charge threshold
    
    Parameters
    ==========
    noise : float
      additional measurement smearing from electronics noise, in units of
      MIP (the energy deposited by a minimum ionizing particle)
      
    cutoff : float
      the value below which a measured charge is registered as a zero
      
    Returns
    =======
    errors : list of floats
      a list of differences between the measured position and the true 
      hit position
    """
    samples = np.random.uniform(-0.5,0.5, size=10000)
    errors = []
    for x_true in samples:
        #Calculate the charge deposited on each detector strip
        charges = []
        for i in range(num_strips):
            charges.append(get_charge(i,x_true))

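        #Add intrinsic noise to the electronics (convert to an array first:
        #"list += array" would extend the list instead of adding element-wise)
        charges = np.array(charges) + np.random.normal(0, noise, num_strips)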

        #Find the measured position of the particle
        x_meas = find_x_meas(charges, cutoff)

        errors.append(x_meas - x_true)
    
    return errors

First run the test_analog_electronics($\sigma_N$, cutoff) function with $\sigma_N =0.05$ and cutoff $= 0.2$ MIP.  How does the position resolution of the detector change as you increase/decrease these two parameters? (Note: physicists often characterize the threshold in units of $\sigma_N$.  For example, the parameters above correspond to a threshold of $4\sigma_N$.  Consider describing the threshold this way when exploring these parameters.)
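
One reason for quoting the threshold in units of $\sigma_N$ is that, for Gaussian noise as assumed in the problem statement, the probability that a strip with no real signal fluctuates above threshold depends only on that ratio. The sketch below evaluates this one-sided tail probability; the helper names are assumptions made for illustration.

```python
from math import erfc, sqrt

def threshold_in_sigma(cutoff, noise):
    """Express a threshold (in MIP) as a multiple of the noise width sigma_N."""
    return cutoff / noise

def noise_only_hit_probability(n_sigma):
    """One-sided Gaussian tail probability above n_sigma standard deviations."""
    return 0.5 * erfc(n_sigma / sqrt(2.0))

n_sigma = threshold_in_sigma(cutoff=0.2, noise=0.05)
fake_rate = noise_only_hit_probability(n_sigma)
print(n_sigma, fake_rate)
```

For a $4\sigma_N$ threshold this probability is roughly $3 \times 10^{-5}$ per strip, so noise-only strips rarely pass; lowering the threshold toward $1\sigma_N$ or $2\sigma_N$ changes that quickly.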

In [None]:
#Write your answer here