# Example of fitting: 51 Pegasi b

The ELODIE data and parameters used in this example notebook for 51 Peg b were obtained from [Birkby et al. 2017](http://doi.org/10.3847/1538-3881/aa5c87).

In [None]:
from ravest.model import Planet, Star, Trend, calculate_mpsini
from ravest.fit import Fitter
from ravest.param import Parameter, Parameterisation
import ravest.prior

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd

Import and inspect the data

In [None]:
data = pd.read_csv('example_data/51Pegb.txt', sep=r'\s+')  # whitespace-separated columns
data

In [None]:
plt.figure(figsize=(15,3.5))
plt.title("51 Peg b ELODIE data")
plt.ylabel("Radial Velocity [m/s]")
plt.xlabel("BJD_TDB")
plt.errorbar(data["time"], data["vel"], yerr=data["verr"], marker=".", linestyle="None")
plt.show()

In [None]:
fitter = Fitter(planet_letters=["b"], parameterisation=Parameterisation("per k e w tc"))
fitter.add_data(time=data["time"], vel=data["vel"], verr=data["verr"], t0=np.mean(data["time"]))
print("t0:", fitter.t0)

# Construct the params dict
# These values will be used as your initial guess for the fit
params = {"per_b": Parameter(4.23, "d", fixed=False),
          "k_b": Parameter(60, "m/s", fixed=False),
          "e_b": Parameter(0, "", fixed=True),
          "w_b": Parameter(np.pi/2, "rad", fixed=True),
          "tc_b": Parameter(2456326.9, "d", fixed=False),
          
          "g": Parameter(-33251.9, "m/s", fixed=False),
          "gd": Parameter(0, "m/s/day", fixed=True),
          "gdd": Parameter(0, "m/s/day^2", fixed=True),
          "jit": Parameter(0, "m/s", fixed=True),}

fitter.add_params(params)
fitter.params

In [None]:
# Construct the priors dict. Every parameter that isn't fixed requires a prior.
priors = {
          "per_b": ravest.prior.Gaussian(4.2293, 0.0011),
          "k_b": ravest.prior.Uniform(0,100),
          "tc_b": ravest.prior.Uniform(2456326.9-(4.2293), 2456326.9+(4.2293)),
          "g": ravest.prior.Uniform(-33260, -33240),
        }

fitter.add_priors(priors)
fitter.priors

The `Fitter` now holds the data, our parameterisation, our initial parameter values, and a prior for each of the free parameters, so we can fit the free parameters of the model to the data. First, Maximum A Posteriori (MAP) optimisation is performed to find the best-fit solution. Then, MCMC is used to explore the parameter space and estimate the parameter uncertainties. This can take a few minutes; pass `progress=True` to enable a progress bar.
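
To make the quantity being optimised and sampled more concrete, here is a minimal, standalone sketch of a log-posterior (log-prior plus log-likelihood) for a circular orbit. It is not `ravest`'s implementation: the function names, the sign convention `RV = g - K sin(2π(t - tc)/P)`, the noise-free synthetic data, and the omission of the (flat) priors on `tc` and `g` are all assumptions made for illustration.

```python
import math

def rv_circular(t, per, k, tc, g):
    """Circular-orbit RV, assuming RV = g - k*sin(2*pi*(t - tc)/per)."""
    return g - k * math.sin(2.0 * math.pi * (t - tc) / per)

def log_prior(per, k):
    """Gaussian prior on the period, Uniform(0, 100) prior on K."""
    if not 0.0 <= k <= 100.0:
        return -math.inf  # outside the support of the uniform prior
    mu, sigma = 4.2293, 0.0011
    log_gauss = -0.5 * ((per - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))
    return log_gauss - math.log(100.0)

def log_likelihood(theta, times, vels, verrs):
    """Gaussian log-likelihood with independent per-point uncertainties."""
    per, k, tc, g = theta
    total = 0.0
    for t, v, s in zip(times, vels, verrs):
        resid = v - rv_circular(t, per, k, tc, g)
        total += -0.5 * (resid / s) ** 2 - math.log(s * math.sqrt(2.0 * math.pi))
    return total

def log_posterior(theta, times, vels, verrs):
    """Unnormalised log-posterior: the quantity MAP maximises and MCMC samples."""
    per, k, tc, g = theta
    lp = log_prior(per, k)
    if not math.isfinite(lp):
        return -math.inf
    return lp + log_likelihood(theta, times, vels, verrs)

# Hypothetical, noise-free synthetic data (not the ELODIE measurements)
truth = (4.2293, 55.0, 1.0, 0.0)
times = [0.5 * i for i in range(20)]
vels = [rv_circular(t, *truth) for t in times]
verrs = [5.0] * len(times)

lp_truth = log_posterior(truth, times, vels, verrs)
lp_wrong_k = log_posterior((4.2293, 30.0, 1.0, 0.0), times, vels, verrs)
lp_outside = log_posterior((4.2293, 150.0, 1.0, 0.0), times, vels, verrs)
print(lp_truth > lp_wrong_k, lp_outside)
```

Parameter values outside a prior's support return negative infinity, which is why every free parameter needs a prior before fitting.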

In [None]:
nwalkers = 2 * len(fitter.get_free_params_dict())
nsteps = 5000

# Fit the free parameters to the data
samples = fitter.fit_model_to_data(nwalkers=nwalkers, nsteps=nsteps, progress=False)  # This will take a while!

Now that the MCMC has finished, the `emcee` sampler is stored in the `Fitter` object. We can therefore interact with it in the usual way to export the samples as a numpy array, which can be passed into other functions (for example, to compare two models by calculating the Bayesian evidence; example notebook coming soon!). We can also export them as a Pandas dataframe, which keeps each parameter labelled. In both cases, we can pass the `discard` and `thin` arguments as desired.
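
As a rough, conceptual sketch of what `discard` and `thin` do, the toy example below drops the first few steps of a chain and then keeps every n-th step. The helper names `trim_chain` and `flatten` are hypothetical, the chain is a made-up nested list rather than real sampler output, and the exact offset a sampler uses when thinning may differ from the simple slice shown here.

```python
# Toy chain: 10 steps, 2 walkers, 1 parameter, stored as chain[step][walker]
nsteps_toy, nwalkers_toy = 10, 2
chain = [[float(step + 100 * walker) for walker in range(nwalkers_toy)]
         for step in range(nsteps_toy)]

def trim_chain(chain, discard=0, thin=1):
    """Drop the first `discard` steps (burn-in), then keep every `thin`-th step."""
    return chain[discard::thin]

def flatten(chain):
    """Merge the walkers so that each row is one posterior sample."""
    return [value for step in chain for value in step]

trimmed = trim_chain(chain, discard=4, thin=2)  # keeps steps 4, 6 and 8
flat = flatten(trimmed)
print(len(trimmed), len(flat))
```

Discarding removes the early steps taken before the walkers have settled, while thinning reduces the correlation between consecutive samples at the cost of keeping fewer of them.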

In [None]:
# Get the results from the sampler, as a numpy array
samples = fitter.sampler.get_chain(discard=0)

# Get the samples as a labelled Pandas dataframe
samples_df = fitter.get_samples_df(discard=0, thin=1)
samples_df

To inspect the chains visually, we can plot (and optionally save) the time series of each parameter in the chain.

In [None]:
fitter.plot_chains(discard=0, save=False)

We can also visualise (and optionally save) the posterior parameter distributions in corner plots, using the `corner` module.

In [None]:
fitter.plot_corner(discard=0, save=False)

Inspecting the conjunction time $t_{c}$, we can see that the automatically generated value and error (the 50th percentile, with the 16th and 84th percentiles as bounds) shouldn't always be taken at face value. It's a good idea to inspect the posterior distribution visually with the corner plots. For further analysis and inspection, recall that the `Fitter.get_samples_df()` method we saw earlier returns a dataframe of the samples, which we can use to plot a histogram, for example.
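
To see why percentile summaries can mislead, consider a toy bimodal distribution. The samples below are made up (two narrow Gaussians centred on -1 and +1) and are not the notebook's posterior; the `percentile` helper is a simple hypothetical implementation written for this sketch.

```python
import random

def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of an already sorted list."""
    pos = (len(sorted_values) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(sorted_values) - 1)
    frac = pos - lo
    return sorted_values[lo] * (1.0 - frac) + sorted_values[hi] * frac

rng = random.Random(42)  # fixed seed so the sketch is reproducible
samples_toy = ([rng.gauss(-1.0, 0.1) for _ in range(1000)]
               + [rng.gauss(1.0, 0.1) for _ in range(1000)])
samples_toy.sort()

p16, p50, p84 = (percentile(samples_toy, q) for q in (16, 50, 84))

# Count how many samples actually lie close to the reported "best" value
near_median = sum(1 for x in samples_toy if abs(x - p50) < 0.2)
print(p16, p50, p84, near_median)
```

Here the median falls in the gap between the two modes, where almost no samples lie, so quoting it as the best value with the 16th and 84th percentiles as errors would describe neither mode well.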

## Calculate $M_p\sin{i}$
Now that we have fitted for the parameters, we can investigate the $M_p\sin{i}$ of 51 Peg b. To do this, we'll pass in the samples from the `Fitter` for the parameters we need. We also need the stellar mass, which I've again obtained from Birkby et al. 2017.

$$ M_p\sin{i}=K\sqrt{1-e^2}\left(\frac{PM^2_*}{2\pi G}\right)^{1/3} $$
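
As a quick sanity check of the formula (which assumes the planet is much less massive than the star), here is a standalone sketch that evaluates it for a single set of values. It is not `ravest`'s `calculate_mpsini`: the function name `mpsini_mjup` is hypothetical, the physical constants are approximate, and the inputs are just the initial guesses used earlier in this notebook rather than fitted results.

```python
import math

G = 6.67430e-11     # gravitational constant [m^3 kg^-1 s^-2]
M_SUN = 1.98841e30  # approximate solar mass [kg]
M_JUP = 1.89813e27  # approximate Jupiter mass [kg]
DAY = 86400.0       # seconds per day

def mpsini_mjup(k_ms, per_days, mstar_msun, ecc=0.0):
    """Minimum mass in Jupiter masses from K [m/s], P [days] and M_* [solar masses]."""
    per_s = per_days * DAY
    mstar_kg = mstar_msun * M_SUN
    cube_root_term = (per_s * mstar_kg ** 2 / (2.0 * math.pi * G)) ** (1.0 / 3.0)
    mass_kg = k_ms * math.sqrt(1.0 - ecc ** 2) * cube_root_term
    return mass_kg / M_JUP

# Illustrative inputs: the initial guesses for K and P, and the quoted stellar mass
estimate = mpsini_mjup(k_ms=60.0, per_days=4.23, mstar_msun=1.11)
print(f"Mp sin(i) is roughly {estimate:.2f} M_jupiter")
```

With these inputs the result is around half a Jupiter mass. Because $M_p\sin{i}$ scales linearly with $K$ but only as $M_*^{2/3}$, it is worth propagating the stellar mass uncertainty through the calculation alongside the posterior samples.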

In [None]:
# Stellar mass [solar masses] from Birkby et al. 2017.
mass_star_val = 1.11
mass_star_err = 0.066

# Create a distribution to ensure the uncertainty in the stellar mass is captured in the mpsini uncertainty
mass_star = np.random.normal(loc=mass_star_val, scale=mass_star_err, size=len(samples_df))

# Get the fixed parameters, as some of the params needed for mpsini were fixed
fixed_params = fitter.get_fixed_params_dict()

# Calculate the mpsini value
mpsini_unit = "M_jupiter"  # must be M_jupiter or M_earth
mpsini = calculate_mpsini(mass_star, samples_df["per_b"], samples_df["k_b"], fixed_params["e_b"].value, unit=mpsini_unit)

# Calculate the knuth bin width for histogram plotting
from astropy.stats import knuth_bin_width
width, edges = knuth_bin_width(mpsini, return_bins=True)

# Let's plot the mpsini posterior distribution in a histogram
fig, ax = plt.subplots(1, 1)
ax.set_title(r"Minimum mass estimate $M_p\sin{i}$ for 51 Peg b")  # raw strings avoid invalid "\s" escapes
ax.set_xlabel(r"$M_p \sin(i)$ [M$_J$]")
ax.set_ylabel("Frequency")
plot = ax.hist(mpsini, bins=edges, color="tab:blue", alpha=0.7)

# Let's overplot the 16th, 50th and 84th percentiles
ps = np.percentile(mpsini, [16, 50, 84])
# Search for the heights of the bins in which the percentiles are located
heights = plot[0][np.searchsorted(plot[1], ps, side='left')-1]
# The line height will be bin-height / y_bound
_, ymax = ax.get_ybound()
ax.axvline(ps[0], label='16%', color='blue', linestyle=':', linewidth=2, ymax=heights[0] / ymax)
ax.axvline(ps[1], label='50%', color='blue', linestyle='--', linewidth=2, ymax=heights[1] / ymax)
ax.axvline(ps[2], label='84%', color='blue', linestyle=':', linewidth=2, ymax=heights[2] / ymax)
plt.legend()
print(f"51 Peg b Mpsini: {ps[1]:.3g} +{ps[2]-ps[1]:.1g} -{ps[1]-ps[0]:.1g} {mpsini_unit}")