# In-class assignment: more MCMC

The purpose of today's in-class assignment is to use the Markov Chain Monte Carlo (MCMC) technique to find the best fit to a poorly-sampled, error-rich time series.  Fourier Transform-based analysis is a poor match for this time series because the data are not uniformly sampled in time and have large, heteroscedastic (i.e., non-uniform) errors, so a different technique is necessary.

The dataset has been provided, and has three columns: sample time (in seconds), flux (in arbitrary units), and error (in the same arbitrary units). You decide that a reasonable model to describe the data is:

$f(t) = A \sin(\omega t + \phi)$
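
Assuming the errors are independent and Gaussian (an assumption, not something stated in the data file), one common way to score a trial set of parameters against the data is the error-weighted $\chi^2$. Here $f_i$ and $\sigma_i$ denote the flux and error measured at time $t_i$; these symbols are introduced only for illustration:

$\chi^2(A, \omega, \phi) = \sum_i \left[ \frac{f_i - A \sin(\omega t_i + \phi)}{\sigma_i} \right]^2$

Under that assumption the log-likelihood is $\ln \mathcal{L} = -\chi^2/2$ plus a constant that does not depend on the parameters, so points with larger errors pull less on the fit.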

Extend the MCMC code used in the pre-class assignment to fit the parameters A, $\omega$, and $\phi$, show the 2D histograms of the resulting MCMC trace, and estimate the true value of each parameter. Here are a few suggestions that might be useful:

1. When exploring more dimensions, you generally need to do more sampling to get a reasonable solution.  Try using a minimum of $10^5$ samples in your Markov Chain - $10^6$ would be better if your computer can handle it.
2. Make sure to use the errors given to you in the file for this assignment - don't guess!
3. You do not have to use equal steps in each direction in parameter space.  In other words, $\Delta A$, $\Delta\omega$ and $\Delta\phi$ do not all have to be equally sized.  Think about which one probably has the most range, and make that one largest!
4. You also don't necessarily need to take steps where $\Delta A, \Delta\omega,\Delta\phi$ are uniformly distributed (using, e.g., `np.random.uniform()`).  Experiment with using a Gaussian distribution (via `np.random.normal()`) for the same values of $\Delta$ and see how it impacts convergence!  Note that you may need to smooth the errors measured in the trace to tell the difference between the two algorithms.  See [the SciPy Cookbook's signal-smoothing example](https://scipy-cookbook.readthedocs.io/items/SignalSmooth.html) for some code you can copy and paste.
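
Suggestions 3 and 4 can be combined in a single Metropolis step. The following is a minimal sketch, not the pre-class code: the names `model`, `chi2`, and `metropolis`, the `proposal` switch, and the acceptance rule based on $\exp(-\Delta\chi^2/2)$ (which assumes independent Gaussian errors) are all assumptions made for illustration.

```python
import numpy as np

def model(t, A, omega, phi):
    """Sinusoidal model f(t) = A sin(omega t + phi)."""
    return A * np.sin(omega * t + phi)

def chi2(params, t, f_t, f_err):
    """Error-weighted chi-squared of the model against the data."""
    return np.sum(((f_t - model(t, *params)) / f_err) ** 2)

def metropolis(t, f_t, f_err, start, steps, n_samples,
               proposal="normal", seed=0):
    """Hypothetical Metropolis sampler with one step size per parameter.

    steps    : array of (Delta A, Delta omega, Delta phi)
    proposal : "normal" draws steps from N(0, Delta);
               "uniform" draws them from U(-Delta, +Delta)
    """
    rng = np.random.default_rng(seed)
    steps = np.asarray(steps, dtype=float)
    current = np.asarray(start, dtype=float)
    current_chi2 = chi2(current, t, f_t, f_err)

    trace = np.empty((n_samples, current.size))
    chi2_trace = np.empty(n_samples)
    n_accept = 0

    for i in range(n_samples):
        # Propose a move; each parameter gets its own step scale.
        if proposal == "normal":
            delta = rng.normal(0.0, steps)
        elif proposal == "uniform":
            delta = rng.uniform(-steps, steps)
        else:
            raise ValueError("proposal must be 'normal' or 'uniform'")
        trial = current + delta
        trial_chi2 = chi2(trial, t, f_t, f_err)

        # Accept with probability min(1, exp(-(chi2_trial - chi2_current)/2)).
        if np.log(rng.random()) < -0.5 * (trial_chi2 - current_chi2):
            current, current_chi2 = trial, trial_chi2
            n_accept += 1

        # Record the current state whether or not the move was accepted.
        trace[i] = current
        chi2_trace[i] = current_chi2

    return trace, chi2_trace, n_accept / n_samples
```

The function returns the trace, the $\chi^2$ at every step, and the fraction of accepted proposals, so the two proposal types can be compared by plotting (and, if needed, smoothing) the two `chi2_trace` arrays. Keep in mind that for the same $\Delta$, a Gaussian step has a somewhat wider spread than a uniform step on $[-\Delta, +\Delta]$, whose standard deviation is $\Delta/\sqrt{3}$.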

To get you started, we're going to read in and plot the data!

In [None]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import numpy.random as npr

In [None]:
# Read the time series into 3 arrays: time, flux, error in flux.
# Note that there is a "low noise" version of the dataset in the file
#    timeseries_lownoise.txt, which is otherwise identical except it
#    has 8x smaller errors!
t, f_t, f_err = np.loadtxt("timeseries.txt",skiprows=1,unpack=True)


In [None]:
# plot it up.  That's some awful data, eh?
plt.errorbar(t,f_t,yerr=f_err,fmt='bo')
plt.xlabel('Time [s]')
plt.ylabel('Flux [arb. units]')
plt.title('Flux vs. Time')

In [None]:
# put your MCMC code and traces here, adding additional cells if necessary!




## Some questions

1. What are the best-fit values for A, $\omega$, and $\phi$, and what are their $68\%$ confidence intervals?
2. How does your answer to the previous question differ when you use the low-noise dataset (in `timeseries_lownoise.txt`) instead of the original dataset?
3. Do you see a difference in behavior when the walker chooses its steps from a normal distribution vs. a uniform distribution?
4. In what ways does the outcome of this experiment differ from your pre-class assignment, if any?
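
One common way to turn a trace into the numbers asked for in the first question is to report the median of each parameter together with its 16th and 84th percentiles, which bound the central $68\%$ of the samples. The sketch below assumes the trace is an array of shape `(n_samples, 3)` with columns A, $\omega$, $\phi$; the helper name `summarize_trace` and the `burn_in` argument are hypothetical, and how many early samples to discard is left for you to judge from the trace.

```python
import numpy as np

def summarize_trace(trace, burn_in=0, names=("A", "omega", "phi")):
    """Return {name: (median, lower, upper)} for the central 68% interval."""
    # Drop the first burn_in samples, taken before the chain settled down.
    kept = np.asarray(trace, dtype=float)[burn_in:]
    # Percentiles are computed column by column (one column per parameter).
    low, median, high = np.percentile(kept, [16, 50, 84], axis=0)
    return {name: (median[i], low[i], high[i])
            for i, name in enumerate(names)}
```

Because $\phi$ is periodic, a chain that wanders across a $2\pi$ boundary can inflate this interval, so it is worth checking the histogram of $\phi$ before trusting the percentiles.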
 

*Put your answers here!*