**Question 1:**

CODING: Students in PHYS150 were asked to measure the local acceleration of gravity $g$ by dropping an object from height $H$, measuring the time $t$ it takes to reach the ground, and using the formula
\begin{equation}
    H = \frac{1}{2}gt^2.
\end{equation}
The data file $\texttt{GravityMeasurements.dat}$ contains
  measurements of the height $H$ in meters, its uncertainty $\sigma_{H_i}$, the fall time $t$ in seconds, and its uncertainty $\sigma_{t_i}$, made independently by $N=350$ students.  The file has five columns: (1) the
  index $i=0,\ldots,N-1$, (2) the measurement $\{H_i\}$, (3) its uncertainty $\{\sigma_{H_i}\}$, (4) the measurement $\{t_i\}$, and (5) its uncertainty $\{\sigma_{t_i}\}$ from each student.
  
```
      0   20.066  1.120   2.092  0.084
      1   20.363  0.389   2.023  0.050
      2   21.498  0.706   1.939  0.115
      3   19.709  0.791   1.963  0.118
      4   21.192  1.225   2.132  0.097
      5   16.631  1.479   1.877  0.112
      ...
```

 

(a) (1 pt) Calculate the values of the acceleration of gravity $\{g_i\}$.

(b) (2 pts) Calculate the corresponding uncertainties $\{\sigma_{g_i}\}$ using the proper error propagation formulae.
    
(c) (2 pts) In this part and the next, ignore the propagated uncertainties
    $\{\sigma_{g_i}\}$ and instead estimate a single uncertainty directly from the
    measurements by computing the standard deviation of $\{g_i\}$.  What is the best
    estimate of this value, $\tilde{\sigma}_g$?
    
(d) (2 pts) Using this value of $\tilde{\sigma}_g$, calculate the maximum-likelihood
    estimate (MLE) of the mean of $\{g_i\}$ and the uncertainty in the mean from the
    measurements.

(e) (2 pts) Now compute the inverse-variance weighted MLE of the mean of $\{g_i\}$ using
    the actual uncertainty $\{\sigma_{g_i}\}$ on each measurement.  Also compute the
    uncertainty in the mean to show that it is smaller than the value
    $\tilde{\sigma}_g$ estimated in (c).
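
As a reminder of the standard first-order result, assuming the errors on $H_i$ and $t_i$ are uncorrelated and small compared to the measured values, solving the drop formula for $g$ and propagating the uncertainties gives

\begin{equation}
    g_i = \frac{2H_i}{t_i^2}, \qquad
    \sigma_{g_i}^2 = \left(\frac{\partial g}{\partial H}\right)^2 \sigma_{H_i}^2
                   + \left(\frac{\partial g}{\partial t}\right)^2 \sigma_{t_i}^2
    \quad\Longrightarrow\quad
    \frac{\sigma_{g_i}}{g_i} = \sqrt{\left(\frac{\sigma_{H_i}}{H_i}\right)^2 + \left(\frac{2\sigma_{t_i}}{t_i}\right)^2}.
\end{equation}

The factor of 2 on the time term comes from the $t^{-2}$ dependence: $\partial g/\partial H = g/H$ and $\partial g/\partial t = -2g/t$.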

In [None]:
# Problem 1a
def calculate_g(data):
    """Given measurements of H and t, calculate the inferred gravitational acceleration constant, g.
    
    The input height (H) and time (t) are given as data['H'] and data['t'].
    
    Returns an array with the corresponding g values.
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
import numpy as np

data = np.genfromtxt('GravityMeasurements.dat', names=['ID','H','sigma_H','t','sigma_t'])

print('The first few rows are:')
for i in range(6):
    print(data[i])
print()

g = calculate_g(data)

print('The first few g values are: ',g[:5])
print('The lowest estimate is ',np.min(g))
print('The highest estimate is ',np.max(g))
print('The mean estimate is ',np.mean(g))

# None of the students had measurements that were more than 30% off.
assert np.allclose(g, 9.8, rtol=0.3)

In [None]:
# Problem 1b
def propagate_sigma_g(data):
    """Calculate the uncertainties in the g measurement based on the reported uncertainties in H and t.
    
    The input height (H) and time (t) are given as data['H'] and data['t'].
    Their respective estimated uncertainties are data['sigma_H'] and data['sigma_t'].
    
    Returns an array with the corresponding sigma_g values.
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
sigma_g = propagate_sigma_g(data)

print('The first few calculated uncertainties are: ',sigma_g[:10])
print('The range of sigma_g values is ',np.min(sigma_g),np.max(sigma_g))

# The "pull" is sometimes a useful quantity to look at.
# It is the difference of each measurement from the mean (or expected value) divided by sigma.
# For Gaussian errors, you should expect around 68% of values to have pull between -1 and 1.
# If this isn't the case, it could mean your errors are poorly estimated, or the distribution is not
# Gaussian, or both.

pull = (g-np.mean(g)) / sigma_g
print('The first few pulls are: ',pull[:5])
frac_lt_1 = np.sum(np.abs(pull) < 1) / len(pull)
print(f'The fraction of points with |pull| < 1 = {frac_lt_1:0.3f}')
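
To make the 68% expectation for the pull concrete, here is a minimal sketch on synthetic data rather than the students' measurements. The seed, the true value of 9.8, and the range of uncertainties are illustrative assumptions; the only point is that when the scatter really is Gaussian with the quoted sigma, the fraction with |pull| < 1 comes out close to 0.68.

```python
import numpy as np

# Synthetic illustration only: all numbers below are assumptions, not course data.
rng = np.random.default_rng(0)                # fixed seed for reproducibility
true_value = 9.8                              # assumed true value
sigma = rng.uniform(0.2, 1.0, size=10_000)    # assumed per-point uncertainties
x = rng.normal(true_value, sigma)             # Gaussian scatter matching each sigma

# Pull: deviation from the expected value in units of each point's own sigma.
pull = (x - true_value) / sigma
frac_within_1 = np.mean(np.abs(pull) < 1)
print(f'Fraction with |pull| < 1: {frac_within_1:.3f}')
```

A fraction well below 0.68 would suggest underestimated uncertainties, and one well above would suggest overestimated ones.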

In [None]:
# Problem 1c
def estimate_unweighted_sigma_g(data):
    """Compute a single overall estimate of the uncertainty in the g values from their empirical
    standard deviation.
    
    This estimate ignores the students' own estimates of the uncertainties on H and t, and instead
    just uses their calculated g values.
    
    Returns a single (not array) value sigma_g.
    """
    # Hints: 1. Use your calculate_g function to get the array of g values.
    #        2. Feel free to use appropriate numpy or scipy functions.
    
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
unweighted_sigma_g = estimate_unweighted_sigma_g(data)

print(f'The estimated unweighted sigma_g is {unweighted_sigma_g:.4f}')
print(f'Compare this to the mean propagated sigma_g: {np.mean(sigma_g):.4f}')

In [None]:
# Problem 1d
def calculate_unweighted_mle_meang(data):
    """Calculate the unweighted maximum-likelihood estimate of <g> and its uncertainty.
    
    This estimate ignores the students' own estimates of the uncertainties on H and t.
    
    Returns a tuple of two values: <g>, sigma_<g>
    """
    # Hints: 1. Use calculate_g to get the array of g values.
    #        2. Use estimate_unweighted_sigma_g for sigma_g.
    #        3. Feel free to use appropriate numpy or scipy functions.
    
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
meang, sigma_meang = calculate_unweighted_mle_meang(data)

print('Using unweighted maximum likelihood,')
print(f'the mean estimate of g is {meang:0.4f} +- {sigma_meang:0.4f}')

In [None]:
# Problem 1e
def calculate_weighted_mle_meang(data):
    """Calculate the inverse-variance weighted maximum-likelihood estimate of <g> and its uncertainty.
    
    This estimate uses the students' own estimates of the uncertainties on H and t to propagate
    into a separate estimate of sigma_g for each data point.
    
    Returns a tuple of two values: <g>, sigma_<g>
    """
    # Hints: 1. Use calculate_g to get the array of g values.
    #        2. Use propagate_sigma_g for sigma_g.
    #        3. Feel free to use appropriate numpy or scipy functions.
    
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
meang, sigma_meang = calculate_weighted_mle_meang(data)

print('Using weighted maximum likelihood,')
print(f'the mean estimate of g is {meang:0.4f} +- {sigma_meang:0.4f}')
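
The difference between the unweighted and inverse-variance weighted estimates can be sketched on synthetic data with unequal uncertainties. This is an illustration under stated assumptions, not the assignment's solution: the data are randomly generated, the variable names are hypothetical, and the sample standard deviation uses `ddof=1` as one common choice of "best estimate".

```python
import numpy as np

# Synthetic illustration only: all numbers below are assumptions, not course data.
rng = np.random.default_rng(1)
true_value = 9.8
sigma = rng.uniform(0.2, 2.0, size=500)   # assumed heteroscedastic uncertainties
x = rng.normal(true_value, sigma)

# Unweighted: every point treated as equally reliable.
mean_unw = np.mean(x)
sigma_unw = np.std(x, ddof=1) / np.sqrt(len(x))   # standard error of the mean

# Inverse-variance weighted: precise points count for more.
w = 1.0 / sigma**2
mean_w = np.sum(w * x) / np.sum(w)
sigma_w = 1.0 / np.sqrt(np.sum(w))

print(f'unweighted: {mean_unw:.4f} +- {sigma_unw:.4f}')
print(f'weighted:   {mean_w:.4f} +- {sigma_w:.4f}')
```

When the individual uncertainties are reliable and vary from point to point, the weighted mean has the smaller uncertainty because it down-weights the noisiest measurements.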


**Question 2:**

CODING: Using the same dataset as above:

(a) (3 pts) Calculate the $\chi^2$ value defined as:

\begin{equation}
      \chi^2 = \sum_{i=0}^{N-1} \left( \frac{g_i - \mu^\prime}{\sigma_{g_i}} \right)^2
\end{equation}
    
where $\mu^\prime$ is the inverse-variance weighted MLE of the mean of $\{g_i\}$
from 1(e).
     
(b) (4 pts) Now calculate the $\chi^2$ value as a function of $\mu^\prime$ from
     $\mu^\prime = 9.6$ to 10.0 in steps of 0.001 and show that the value of
     $\mu^\prime$ that minimizes the $\chi^2$ is indeed given by the answer from 1(e).
     
(c) (4 pts) Finally, determine the lower and upper values of $\mu^\prime$ at which the
     $\chi^2$ is larger than the minimum value by 1.00.  
     (NOTE: These values should match the $\pm 1\sigma$ range from
     the MLE in part 1(e). This is another way to perform parameter estimation using the
     $\chi^2$ statistic.)
     
(d) **EXTRA CREDIT**: (2 pts) Make a single plot, with appropriate labels, that shows all of these results and highlights the important features.
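
The reason the $\Delta\chi^2 = 1$ interval reproduces the $\pm 1\sigma$ range can be seen by expanding the $\chi^2$ about its minimum. Writing $\hat{\mu}$ for the inverse-variance weighted mean and $\sigma_{\hat{\mu}}$ for its uncertainty (notation introduced here for illustration, not used elsewhere in the assignment), and treating the $\sigma_{g_i}$ as fixed, the $\chi^2$ is exactly quadratic in $\mu^\prime$:

\begin{equation}
    \chi^2(\mu^\prime) = \chi^2(\hat{\mu}) + \frac{(\mu^\prime - \hat{\mu})^2}{\sigma_{\hat{\mu}}^2},
    \qquad
    \hat{\mu} = \frac{\sum_i g_i/\sigma_{g_i}^2}{\sum_i 1/\sigma_{g_i}^2},
    \qquad
    \sigma_{\hat{\mu}}^2 = \frac{1}{\sum_i 1/\sigma_{g_i}^2}.
\end{equation}

The cross term vanishes because $\sum_i (g_i - \hat{\mu})/\sigma_{g_i}^2 = 0$ by the definition of $\hat{\mu}$, so $\chi^2 = \chi^2_{\min} + 1$ exactly at $\mu^\prime = \hat{\mu} \pm \sigma_{\hat{\mu}}$. On a grid with step 0.001, the scanned values should agree with this to within about one grid step.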
     

In [None]:
# Problem 2a
def calculate_mle_chisq(data):
    """Calculate chisq at the inverse-variance weighted MLE of the mean, mu'.
    
    The sigma_g values are based on the students' estimated uncertainties sigma_H and sigma_t.
    
    Returns chisq
    """
    # Hints: 1. Use calculate_g to get the array of g values.
    #        2. Use propagate_sigma_g for sigma_g.
    #        3. Use calculate_weighted_mle_meang to get mu'.

    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
chisq = calculate_mle_chisq(data)

print(f'The chisq value at the MLE of <g> is {chisq:.2f}')
print('This should be roughly comparable to the number of data points: ',len(data))
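
The statement that chisq should be roughly comparable to the number of data points can be checked on synthetic data. This sketch assumes Gaussian errors with correctly quoted sigmas; the seed and parameter values are arbitrary, and the names are hypothetical. With one fitted parameter, the expected value is the number of degrees of freedom, N - 1, with a typical spread of about sqrt(2(N - 1)).

```python
import numpy as np

# Synthetic illustration only: all numbers below are assumptions, not course data.
rng = np.random.default_rng(3)
n_points = 350
sigma = rng.uniform(0.2, 2.0, size=n_points)   # assumed per-point uncertainties
x = rng.normal(9.8, sigma)

# Inverse-variance weighted mean, the value of mu that minimizes chisq.
w = 1.0 / sigma**2
mu_hat = np.sum(w * x) / np.sum(w)

chisq = np.sum(((x - mu_hat) / sigma) ** 2)
dof = n_points - 1                             # one fitted parameter (mu)
print(f'chisq = {chisq:.1f}, dof = {dof}, expected spread ~ {np.sqrt(2 * dof):.1f}')
```

A chisq far above the degrees of freedom would point to underestimated uncertainties or a poor model; one far below would point to overestimated uncertainties.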


In [None]:
# Problem 2b
def calculate_chisq_range(data, min_mu, max_mu):
    """Calculate chisq over a range of mu values from min_mu to max_mu in steps of 0.001
    
    This function will generate an array of mu values over the given range.
    For each value of mu, it will calculate the corresponding chisq value.

    The sigma_g values are based on the students' estimated uncertainties sigma_H and sigma_t.

    Note: The output mu values should be monotonically increasing.
    
    Returns mu_array, chisq_array as arrays of equal length.
    """
    # Hints: 1. Use calculate_g to get the array of g values.
    #        2. Use propagate_sigma_g for sigma_g.
    
    # YOUR CODE HERE
    raise NotImplementedError()
    
def find_minimum_chisq(mu_array, chisq_array):
    """Find the minimum chisq and its corresponding mu, given arrays of each.
    
    Returns mu_minimum, chisq_minimum
    """
    # Hint: np.min(chisq_array) will return the value of the minimum.  There is another
    #       numpy function that will instead give you the index of the minimum, which 
    #       will let you access the corresponding element from mu_array.
    
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
mu_array, chisq_array = calculate_chisq_range(data, 9.6, 10.0)
mu_minimum, chisq_minimum = find_minimum_chisq(mu_array, chisq_array)

print(f'The minimum chisq in steps of 0.001 is {chisq_minimum:.4f}')
print(f'mu at the minimum is {mu_minimum:.3f}')
assert chisq_minimum == np.min(chisq_array)

# Do this again for comparison.
chisq_mle = calculate_mle_chisq(data)

print('\nFor comparison:')
print(f'chisq at the MLE solution is {chisq_mle:.3f}')
print(f'The MLE estimate of mu is {calculate_weighted_mle_meang(data)[0]:.3f}')
assert np.isclose(chisq_minimum, chisq_mle, rtol=1.e-3)

In [None]:
# Problem 2c:
def find_one_sigma_range(mu_array, chisq_array):
    """Find a range of mu values where chisq < min_chisq + 1.0, given arrays of mu and chisq.
    
    Note: the input values in mu_array may be assumed to be monotonically increasing.
    
    Returns min_mu, max_mu.
    """
    # YOUR CODE HERE
    raise NotImplementedError()

In [None]:
min_mu, max_mu = find_one_sigma_range(mu_array, chisq_array)
print(f'Range where chisq < ({chisq_minimum:.2f} + 1) is {min_mu:.3f} < mu < {max_mu:.3f}')

print('\nFor comparison with 1(e):')
meang, sigma_meang = calculate_weighted_mle_meang(data)
print(f'The MLE estimate would predict {meang-sigma_meang:.3f} < mu < {meang+sigma_meang:.3f}')

In [None]:
# Problem 2d (EXTRA CREDIT)

# YOUR CODE HERE
raise NotImplementedError()