# Multi-instrument model with `simppler`

In this tutorial, we will explore an example where we have datasets from various instruments that we want to analyze jointly in `simppler`.

We will replicate the ["Fitting radial-velocities" tutorial](https://juliet.readthedocs.io/en/latest/tutorials/rvfits.html) from the `juliet` package, analyzing the TOI-141 observations presented in [Espinoza et al. (2019)](https://arxiv.org/abs/1903.07694).

In [None]:
import os
import radvel
import numpy as np
from pandas import read_csv
import matplotlib.pyplot as plt

plt.style.use("tableau-colorblind10")

data_df = read_csv(os.path.join(radvel.DATADIR, "rvs_toi141.dat"), sep=" ", names=["t", "rv", "erv", "inst"])
t, rv, erv, inst = data_df.t.values, data_df.rv.values, data_df.erv.values, data_df.inst.values
sort_inds = np.argsort(t)
t = t[sort_inds]
rv = rv[sort_inds]
erv = erv[sort_inds]
inst = inst[sort_inds]
tmod = np.linspace(t.min(), t.max(), num=1000)
print("Instruments:", ", ".join(np.unique(inst)))

The observations come from four different instruments listed above.
Each instrument can have its own systematic parameters: `jit_{inst}`, `gamma_{inst}`, `dvdt_{inst}` and `curv_{inst}`.
Orbital parameters are shared between instruments so they do not have a suffix.
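
As a rough sketch of how such per-instrument terms usually enter an RV likelihood: the offset `gamma_{inst}` is subtracted from the data and the jitter `jit_{inst}` is added in quadrature to the reported uncertainties. The function name and the toy numbers below are made up for illustration, and whether `simppler` implements its likelihood exactly this way is an assumption.

```python
import math


def rv_log_likelihood(rv, erv, inst, model_rv, gamma, jit):
    """Gaussian log-likelihood with a per-instrument offset and jitter.

    Hypothetical helper for illustration, not part of simppler.
    """
    total = 0.0
    for v, e, name, m in zip(rv, erv, inst, model_rv):
        var = e**2 + jit[name] ** 2      # jitter added in quadrature
        resid = v - gamma[name] - m      # remove the instrument zero point
        total += -0.5 * (resid**2 / var + math.log(2.0 * math.pi * var))
    return total


# Toy data for two hypothetical instruments "A" and "B" with different zero points
toy_rv = [10.5, 9.0, -3.0, -4.5]
toy_erv = [1.0, 1.0, 2.0, 2.0]
toy_inst = ["A", "A", "B", "B"]
toy_model = [0.0, 0.0, 0.0, 0.0]  # no planetary signal in this toy case
toy_jit = {"A": 0.5, "B": 0.5}

lnl_offsets = rv_log_likelihood(
    toy_rv, toy_erv, toy_inst, toy_model, {"A": 10.0, "B": -4.0}, toy_jit
)
lnl_no_offsets = rv_log_likelihood(
    toy_rv, toy_erv, toy_inst, toy_model, {"A": 0.0, "B": 0.0}, toy_jit
)
print(lnl_offsets > lnl_no_offsets)
```

Giving each instrument its own offset lets datasets with different zero points be compared against the same orbital model.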

## Building the Model

Let us start by building a model with all required parameters for a single-planet fit to the four instruments.

In [None]:
import simpple.distributions as sdist
import simppler.model as smod

P = 1.007917
P_err = 0.000073
t0 = 2458325.5386
t0_err = 0.0011

parameters = {
    "per1": sdist.Normal(P, P_err),
    "tc1": sdist.Normal(t0, t0_err),
    "e1": sdist.Fixed(0.0),
    "w1": sdist.Fixed(90.0 * np.pi / 180.0),
    "k1": sdist.Uniform(0.0, 100.0),
    "gamma_CORALIE14": sdist.Uniform(-100.0, 100.0),
    "gamma_CORALIE07": sdist.Uniform(-100.0, 100.0),
    "gamma_HARPS": sdist.Uniform(-100.0, 100.0),
    "gamma_FEROS": sdist.Uniform(-100.0, 100.0),
    "jit_CORALIE14": sdist.LogUniform(1e-3, 100.0),
    "jit_CORALIE07": sdist.LogUniform(1e-3, 100.0),
    "jit_HARPS": sdist.LogUniform(1e-3, 100.0),
    "jit_FEROS": sdist.LogUniform(1e-3, 100.0),
}
model = smod.RVModel(parameters, 1, t, rv, erv, inst=inst, basis="per tc e w k", tmod=tmod)
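
With `e1` fixed to 0 and `w1` fixed to 90 degrees, the planetary signal reduces to a sinusoid. The sketch below assumes the usual Keplerian convention $v = K[\cos(\nu + \omega) + e\cos\omega]$ with conjunction at $\nu + \omega = \pi/2$, which gives $v(t) = -K \sin[2\pi (t - t_c)/P]$. The function name is hypothetical, and the sign convention used internally by `simppler` is an assumption.

```python
import math


def circular_rv(t, per, tc, k):
    """RV of a circular orbit (e = 0, w = pi/2) with conjunction time tc."""
    return [-k * math.sin(2.0 * math.pi * (ti - tc) / per) for ti in t]


# Period and conjunction time from the cell above; k is an arbitrary test value
per, tc, k = 1.007917, 2458325.5386, 10.0
times = [tc, tc + 0.25 * per, tc + 0.5 * per, tc + 0.75 * per]
for ti, vi in zip(times, circular_rv(times, per, tc, k)):
    print(f"t - tc = {ti - tc:.4f} d, rv = {vi:+.3f}")
```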

In [None]:
import simppler.plot as sp
fig, axs = sp.plot_rv(model)
plt.show()

The CORALIE data are very sparse and spread over a long time span.
We can exclude it from the plots by specifying which instruments we want to see.


In [None]:
fig, axs = sp.plot_rv(model, inst=["FEROS", "HARPS"])
plt.show()

We can display a test model by passing the `parameters` argument to the plotting function.

In [None]:
test_p = {"per1": P, "tc1": t0, "k1": 10.0}
for inst_name in model.inst_unique:
    inst_mask = model.inst == inst_name
    test_p[f"gamma_{inst_name}"] = np.mean(model.rv[inst_mask])
    test_p[f"jit_{inst_name}"] = np.mean(model.erv[inst_mask])
fig, axs = sp.plot_rv(model, parameters=test_p, inst=["FEROS", "HARPS"])
plt.show()

## MAP Optimization

Let us now optimize our model as we would normally do.

In [None]:
from scipy.optimize import minimize

# `minimize` finds a minimum, so we minimize the negative log-posterior
res = minimize(
    lambda p: -model.log_prob(p), list(test_p.values()), method='Nelder-Mead',
    options=dict(maxiter=200, maxfev=100000, xatol=1e-8)
)

In [None]:
fig, axs = sp.plot_rv(model, parameters=res.x, inst=["FEROS", "HARPS"])
fig.suptitle("MAP model")
plt.show()

We can also look at the phase-folded results.

In [None]:
fig, axs = sp.plot_phase(model, parameters=res.x)
fig.suptitle("MAP model")
plt.show()
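
Phase folding maps every observation time onto a single orbital cycle. A minimal sketch, assuming phase 0 is placed at the time of conjunction and phases are centred on zero (the helper name is hypothetical, and `sp.plot_phase` may use a different convention):

```python
def fold_phase(t, per, tc):
    """Orbital phase centred on zero, with phase 0 at the conjunction time tc."""
    return [((ti - tc) / per + 0.5) % 1.0 - 0.5 for ti in t]


# Toy example with a 2-day period and conjunction at t = 10
print(fold_phase([10.0, 11.0, 12.0, 10.5, 9.5], per=2.0, tc=10.0))
```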

## Nested Sampling for a Single Planet

To properly explore the posterior, and to derive Bayes factors for model comparison, we can use Nested Sampling with the `ultranest` package.

### Prior Samples

It is always a good idea to draw samples from the prior to see what models are allowed before looking at the data.

In [None]:
prior_samples = model.get_prior_samples(10_000)

import corner
corner.corner(prior_samples)
plt.show()

In [None]:
sp.plot_rv(model, prior_samples, inst=["FEROS", "HARPS"])
plt.show()

In [None]:
sp.plot_phase(model, prior_samples)
plt.show()

As we can see here, the model does not quite capture the variability of the data.
However, the period and time of conjunction are tightly constrained by the transits, and there is only so much we can do by varying the semi-amplitude.
We will see in the next section how a multi-planet model can improve this, but for now let us sample this single-planet model!

### Sampling the Posterior

In [None]:
from ultranest import ReactiveNestedSampler
from ultranest.stepsampler import SliceSampler, generate_mixture_random_direction

sampler = ReactiveNestedSampler(model.keys(), model.log_likelihood, model.prior_transform)
nsteps = model.ndim * 2
sampler.stepsampler = SliceSampler(
    nsteps=nsteps, generate_direction=generate_mixture_random_direction,
)
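
Nested samplers such as `ultranest` work in the unit hypercube, so the prior transform passed to the sampler must map uniform draws in (0, 1) to parameter values through each prior's inverse CDF. The sketch below illustrates this for the three prior families used in this tutorial, on a reduced three-parameter toy problem. All function names are hypothetical, and how `model.prior_transform` is implemented internally is not shown in this tutorial.

```python
import math
from statistics import NormalDist


def uniform_ppf(u, low, high):
    """Inverse CDF of a uniform prior on [low, high]."""
    return low + u * (high - low)


def loguniform_ppf(u, low, high):
    """Inverse CDF of a log-uniform prior on [low, high]."""
    return low * (high / low) ** u


def normal_ppf(u, mu, sigma):
    """Inverse CDF of a normal prior (requires 0 < u < 1)."""
    return NormalDist(mu, sigma).inv_cdf(u)


def prior_transform(cube):
    """Map a unit-cube point to (per1, k1, jit) for a toy 3-parameter model."""
    return [
        normal_ppf(cube[0], 1.007917, 0.000073),
        uniform_ppf(cube[1], 0.0, 100.0),
        loguniform_ppf(cube[2], 1e-3, 100.0),
    ]


# The centre of the unit cube maps to the median of each prior
print(prior_transform([0.5, 0.5, 0.5]))
```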

In [None]:
import logging
# Set root logger
logging.getLogger().setLevel(logging.WARNING)

# Force all existing loggers
for logger_name in logging.root.manager.loggerDict:
    logging.getLogger(logger_name).setLevel(logging.WARNING)

sampler.run()

We can first take a look at the nested sampling diagnostic plots.

In [None]:
sampler.plot()
plt.show()

Then we can extract the posterior samples and the evidence.

In [None]:
samples_one = sampler.results["samples"]
lnZ_one = sampler.results["logz"]
lnZerr_one = sampler.results["logzerr"]
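
A common way to summarize equally weighted posterior samples is the median with the 16th and 84th percentiles as a credible interval. The sketch below uses only the standard library and made-up draws; the helper name and the two parameter names in the toy call are illustrative, and it assumes one row per posterior draw.

```python
from statistics import quantiles


def summarize(samples, names):
    """Median and lower/upper 68% interval widths for each parameter.

    `samples` is a sequence of rows, one row per posterior draw.
    """
    summary = {}
    for j, name in enumerate(names):
        column = [row[j] for row in samples]
        cuts = quantiles(column, n=100, method="inclusive")  # 99 percentile cut points
        lo, med, hi = cuts[15], cuts[49], cuts[83]           # 16th, 50th, 84th
        summary[name] = (med, med - lo, hi - med)
    return summary


# Made-up draws for two parameters, not results from this tutorial
toy_samples = [[float(i), 2.0 * i] for i in range(101)]
for name, (med, minus, plus) in summarize(toy_samples, ["k1", "gamma_HARPS"]).items():
    print(f"{name}: {med:.1f} -{minus:.1f} +{plus:.1f}")
```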

And finally, we can take a look at the models sampled from the posterior.

In [None]:
sp.plot_rv(model, parameters=samples_one.T, inst=["FEROS", "HARPS"])
plt.show()

In [None]:
fig, axs = sp.plot_phase(model, samples_one.T)
fig.suptitle("Posterior model")
plt.show()

## Nested Sampling for Two Planets

As we can see above, the single-planet model does not provide a great fit for the data.
Let us fit a two-planet model and see how it compares.

### Building the Two-Planet Model

In [None]:
parameters = {
    "per1": sdist.Normal(P, P_err),
    "tc1": sdist.Normal(t0, t0_err),
    "e1": sdist.Fixed(0.0),
    "w1": sdist.Fixed(90.0 * np.pi / 180.0),
    "k1": sdist.Uniform(0.0, 100.0),
    "per2": sdist.Uniform(1.0, 10.0),
    "tc2": sdist.Uniform(2458325.0, 2458330.0),
    "e2": sdist.Fixed(0.0),
    "w2": sdist.Fixed(90.0 * np.pi / 180.0),
    "k2": sdist.Uniform(0.0, 100.0),
    "gamma_CORALIE14": sdist.Uniform(-100.0, 100.0),
    "gamma_CORALIE07": sdist.Uniform(-100.0, 100.0),
    "gamma_HARPS": sdist.Uniform(-100.0, 100.0),
    "gamma_FEROS": sdist.Uniform(-100.0, 100.0),
    "jit_CORALIE14": sdist.LogUniform(1e-3, 100.0),
    "jit_CORALIE07": sdist.LogUniform(1e-3, 100.0),
    "jit_HARPS": sdist.LogUniform(1e-3, 100.0),
    "jit_FEROS": sdist.LogUniform(1e-3, 100.0),
}
model = smod.RVModel(parameters, 2, t, rv, erv, inst=inst, basis="per tc e w k", tmod=tmod)

### Prior Checks


In [None]:
prior_samples = model.get_prior_samples(10_000)

import corner
corner.corner(prior_samples)
plt.show()

In [None]:
sp.plot_rv(model, prior_samples, inst=["FEROS", "HARPS"])
plt.show()

### MAP


In [None]:
test_p = {"per1": P, "tc1": t0, "k1": 10.0}
test_p |= {"per2": 3.0, "tc2": 2458325+2, "k2": 10.0}
for inst_name in model.inst_unique:
    inst_mask = model.inst == inst_name
    test_p[f"gamma_{inst_name}"] = np.mean(model.rv[inst_mask])
    test_p[f"jit_{inst_name}"] = np.mean(model.erv[inst_mask])

from scipy.optimize import minimize

# `minimize` finds a minimum, so we minimize the negative log-posterior
res = minimize(
    lambda p: -model.log_prob(p), list(test_p.values()), method='Nelder-Mead',
    options=dict(maxiter=200, maxfev=100000, xatol=1e-8)
)

In [None]:
fig, axs = sp.plot_rv(model, parameters=res.x, inst=["FEROS", "HARPS"])
fig.suptitle("MAP model")
plt.show()

In [None]:
fig, axs = sp.plot_phase(model, parameters=res.x)
fig.suptitle("MAP model")
plt.show()

### Sampling the Posterior

In [None]:
sampler = ReactiveNestedSampler(model.keys(), model.log_likelihood, model.prior_transform)
nsteps = model.ndim * 2
sampler.stepsampler = SliceSampler(
    nsteps=nsteps, generate_direction=generate_mixture_random_direction,
)

In [None]:
import logging
# Set root logger
logging.getLogger().setLevel(logging.WARNING)

# Force all existing loggers
for logger_name in logging.root.manager.loggerDict:
    logging.getLogger(logger_name).setLevel(logging.WARNING)

sampler.run()

In [None]:
sampler.plot()
plt.show()

In [None]:
samples_two = sampler.results["samples"]
lnZ_two = sampler.results["logz"]
lnZerr_two = sampler.results["logzerr"]

In [None]:
sp.plot_rv(model, parameters=samples_two.T, inst=["FEROS", "HARPS"])
plt.show()

In [None]:
fig, axs = sp.plot_phase(model, samples_two.T)
fig.suptitle("Posterior model")
plt.show()

## Model Comparison

Finally, we can compare the two models by computing the log Bayes factor, i.e. the difference between their log-evidences.

In [None]:
print(f"Log-evidence for one planet: {lnZ_one} +/- {lnZerr_one}")
print(f"Log-evidence for two planets: {lnZ_two} +/- {lnZerr_two}")
lnK = lnZ_two - lnZ_one
# Propagate the evidence uncertainties (not the evidences themselves) in quadrature
lnKerr = np.sqrt(lnZerr_two**2 + lnZerr_one**2)
print(f"Log-Bayes factor for two planets - one planet: {lnK} +/- {lnKerr}")
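
The log Bayes factor is the difference of two log-evidences, so if the two evidence estimates are treated as independent, their uncertainties combine in quadrature. A minimal sketch with made-up numbers (the function name is hypothetical, and independence of the two errors is an assumption):

```python
import math


def log_bayes_factor(lnz_a, lnzerr_a, lnz_b, lnzerr_b):
    """Return ln(Z_a / Z_b) and its uncertainty, assuming independent errors."""
    lnk = lnz_a - lnz_b
    lnk_err = math.hypot(lnzerr_a, lnzerr_b)  # sqrt(err_a**2 + err_b**2)
    return lnk, lnk_err


# Made-up evidences, not results from this tutorial
lnk, lnk_err = log_bayes_factor(-150.0, 0.3, -160.0, 0.4)
print(f"ln K = {lnk:.1f} +/- {lnk_err:.1f}")
```

As a commonly quoted rule of thumb, a log Bayes factor above roughly 5 is read as strong evidence for the preferred model, although the exact thresholds vary between authors.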