# Eclipsing binary: Full solution

In this notebook, we continue our tutorial on how to do inference. In [a previous notebook](EclipsingBinary_Generate.ipynb) we showed how to use `pymc3` to get posteriors over the map coefficients of an eclipsing binary light curve, and in [another notebook](EclipsingBinary_Linear.ipynb) we did the same thing using the analytic linear formalism of `starry`.

Here, we're going to combine the two methods. We're going to sample the nonlinear parameters (the orbital parameters, limb darkening coefficients, etc.) using `pymc3` and analytically *marginalize* over the linear parameters (the spherical harmonic coefficients) using the `starry` linear formalism.
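
To make the marginalization concrete, here is a minimal `numpy` sketch of the marginal likelihood for a generic linear model. It assumes the flux is `f = A @ y + noise`, with a Gaussian prior `y ~ N(mu, L)` on the linear coefficients and Gaussian noise with covariance `C`. The function name `marginal_lnlike` and the toy inputs are ours, for illustration only; this is not the `starry` implementation, which may differ in detail.

```python
import numpy as np


def marginal_lnlike(A, f, C, mu, L):
    """Log likelihood of f with the linear coefficients y integrated out.

    Model (an assumption of this sketch): f = A @ y + noise,
    y ~ N(mu, L), noise ~ N(0, C). Marginally, f ~ N(A @ mu, A @ L @ A.T + C).
    """
    r = f - A @ mu                      # residual about the prior-mean model
    S = A @ L @ A.T + C                 # marginal covariance of the data
    cho = np.linalg.cholesky(S)         # S = cho @ cho.T
    z = np.linalg.solve(cho, r)         # so that z @ z = r.T @ inv(S) @ r
    logdet = 2 * np.sum(np.log(np.diag(cho)))
    return -0.5 * (z @ z + logdet + len(f) * np.log(2 * np.pi))


# Toy example: a straight-line model with two linear coefficients
t_toy = np.linspace(0, 1, 5)
A_toy = np.vstack([np.ones_like(t_toy), t_toy]).T
f_toy = np.array([1.0, 0.9, 0.8, 0.7, 0.6])
C_toy = 0.1 ** 2 * np.eye(5)
lnlike_toy = marginal_lnlike(A_toy, f_toy, C_toy, np.zeros(2), np.eye(2))
```

Because the coefficients `y` never appear as sampled variables, the sampler only has to explore the nonlinear parameters that enter through the design matrix `A`.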

**Note that since we're using `pymc3`, we need to enable `lazy` evaluation mode in `starry`.**

In [None]:
%matplotlib inline
%config InlineBackend.figure_format='retina'

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm
import exoplanet as xo
import os
import starry
from corner import corner
import theano.tensor as tt

np.random.seed(12)
starry.config.lazy = True
starry.config.quiet = True

## Load the data

Let's load the EB dataset:

In [None]:
# Run the Generate notebook if needed
if not os.path.exists("eb.npz"):
    import nbformat
    from nbconvert.preprocessors import ExecutePreprocessor

    with open("EclipsingBinary_Generate.ipynb") as f:
        nb = nbformat.read(f, as_version=4)
    ep = ExecutePreprocessor(timeout=600, kernel_name="python3")
    ep.preprocess(nb);

In [None]:
data = np.load("eb.npz", allow_pickle=True)
A = data["A"].item()
B = data["B"].item()
t = data["t"]
flux = data["flux"]
sigma = data["sigma"]

Here's the light curve we're going to do inference on:

In [None]:
fig, ax = plt.subplots(1, figsize=(12, 5))
ax.plot(t, flux, "k.", alpha=0.5, ms=4)
ax.set_xlabel("time [days]", fontsize=24)
ax.set_ylabel("normalized flux", fontsize=24);

Next, we instantiate the primary, secondary, and system objects within a `pm.Model()` context.
Here are the priors we are going to assume for the parameters of the primary:

| Parameter      | True Value | Assumed Value / Prior     | Units          |
| ---            | ---        | ---                       | ---            |
|$\mathrm{amp}$  | $1.0$      | $1.0$                     | $-$            |
|$r$             | $1.0$      | $\mathcal{N}(0.95,0.1^2)$ | $R_\odot$      |
|$m$             | $1.0$      | $\mathcal{N}(1.05,0.1^2)$ | $M_\odot$      |
|$P_\mathrm{rot}$| $1.25$     | $\mathcal{N}(1.25,0.01^2)$| $\mathrm{days}$|
|$i$             | $80.0$     | $\mathcal{N}(80.0,5.0^2)$ | $\mathrm{deg}$ |
|$u_1$           | $0.40$     | $\mathrm{Kipping}$        | $-$            |
|$u_2$           | $0.25$     | $\mathrm{Kipping}$        | $-$            |

And here are the priors we are going to assume for the secondary:

| Parameter      | True Value | Assumed Value / Prior      | Units          |
| ---            | ---        | ---                        | ---            |
|$\mathrm{amp}$  | $0.1$      | $0.1$                      | $-$            |
|$r$             | $0.7$      | $\mathcal{N}(0.75,0.1^2)$  | $R_\odot$      |
|$m$             | $0.7$      | $\mathcal{N}(0.70,0.1^2)$  | $M_\odot$      |
|$P_\mathrm{rot}$| $0.625$    | $\mathcal{N}(0.625,0.01^2)$| $\mathrm{days}$|
|$P_\mathrm{orb}$| $1.0$      | $\mathcal{N}(1.01,0.01^2)$ | $\mathrm{days}$|
|$t_0$           | $0.15$     | $\mathcal{N}(0.15,0.001^2)$| $\mathrm{days}$|
|$i$             | $80.0$     | $\mathcal{N}(80.0,5.0^2)$  | $\mathrm{deg}$ |
|$e$             | $0.0$      | $0.0$                      | $-$            |
|$\Omega$        | $0.0$      | $0.0$                      | $\mathrm{deg}$ |
|$u_1$           | $0.20$     | $\mathrm{Kipping}$         | $-$            |
|$u_2$           | $0.05$     | $\mathrm{Kipping}$         | $-$            |

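Above, $\mathcal{N}(\mu, \sigma^2)$ denotes a 1-d normal prior with mean $\mu$ and variance $\sigma^2$, and $\mathrm{Kipping}$ denotes the prior introduced in [Kipping (2013)](https://arxiv.org/abs/1308.0009) for uninformative sampling over limb darkening coefficients. Note that in the cell below, every parameter except the radius of the secondary is held fixed (the corresponding priors are commented out), and the radius of the secondary is given a uniform prior between $0.1$ and $1.0\,R_\odot$ in place of the normal prior listed in the table.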
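
As a quick illustration of the Kipping parametrization, the sketch below converts between the uniformly sampled parameters $(q_1, q_2)$ and the quadratic limb darkening coefficients $(u_1, u_2)$. The helper names `q_to_u` and `u_to_q` are ours and are not part of `starry` or `pymc3`.

```python
import math


def q_to_u(q1, q2):
    """Map Kipping's (q1, q2), each in [0, 1], to quadratic coefficients (u1, u2)."""
    sqrt_q1 = math.sqrt(q1)
    return 2 * sqrt_q1 * q2, sqrt_q1 * (1 - 2 * q2)


def u_to_q(u1, u2):
    """Inverse mapping; requires u1 + u2 != 0."""
    return (u1 + u2) ** 2, u1 / (2 * (u1 + u2))


# Round trip for the true limb darkening coefficients of the primary
q1, q2 = u_to_q(0.40, 0.25)
u1, u2 = q_to_u(q1, q2)
```

Sampling $q_1$ and $q_2$ uniformly on $[0, 1]$ and transforming to $(u_1, u_2)$ keeps the sampler within the physically allowed region of limb darkening coefficients.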

In [None]:
with pm.Model() as model:

    PositiveNormal = pm.Bound(pm.Normal, lower=0.0)

    # Primary
    A_inc = 80  # pm.Normal("A_inc", mu=80, sd=5, testval=80)
    A_amp = 1.0
    A_r = 1.0  # PositiveNormal("A_r", mu=0.95, sd=0.1, testval=0.95)
    A_m = 1.0  # PositiveNormal("A_m", mu=1.05, sd=0.1, testval=1.05)
    A_prot = 1.25  # PositiveNormal("A_prot", mu=1.25, sd=0.01, testval=1.25)
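    # Kipping (q1, q2) values that map to the true (u1, u2) = (0.40, 0.25)
    A_q1 = (0.40 + 0.25) ** 2  # pm.Uniform("A_q1", lower=0, upper=1, testval=0.1)
    A_q2 = 0.40 / (2 * (0.40 + 0.25))  # pm.Uniform("A_q2", lower=0, upper=1, testval=0.1)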
    A_u1 = 2 * tt.sqrt(A_q1) * A_q2
    A_u2 = tt.sqrt(A_q1) * (1 - 2 * A_q2)
    pm.Deterministic("A_u1", A_u1)
    pm.Deterministic("A_u2", A_u2)
    pri = starry.Primary(
        starry.Map(ydeg=A["ydeg"], udeg=A["udeg"], inc=A_inc, amp=A_amp),
        r=A_r,
        m=A_m,
        prot=A_prot,
    )
    pri.map[1] = A_u1
    pri.map[2] = A_u2

    # Secondary
    B_inc = 80  # pm.Normal("B_inc", mu=80, sd=5, testval=80)
    B_amp = 0.1
    # Alternatives: B_r = 0.7, or PositiveNormal("B_r", mu=0.75, sd=0.1, testval=0.75)
    B_r = pm.Uniform("B_r", lower=0.1, upper=1.0)
    B_m = 0.7  # PositiveNormal("B_m", mu=0.70, sd=0.1, testval=0.70)
    B_prot = 0.625  # PositiveNormal("B_prot", mu=0.625, sd=0.01, testval=0.625)
    B_porb = 1.0  # PositiveNormal("B_porb", mu=1.01, sd=0.01, testval=1.01)
    B_t0 = 0.15  # pm.Normal("B_t0", mu=0.15, sd=0.001, testval=0.15)
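    # Kipping (q1, q2) values that map to the true (u1, u2) = (0.20, 0.05)
    B_q1 = (0.20 + 0.05) ** 2  # pm.Uniform("B_q1", lower=0, upper=1, testval=0.1)
    B_q2 = 0.20 / (2 * (0.20 + 0.05))  # pm.Uniform("B_q2", lower=0, upper=1, testval=0.1)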
    B_u1 = 2 * tt.sqrt(B_q1) * B_q2
    B_u2 = tt.sqrt(B_q1) * (1 - 2 * B_q2)
    pm.Deterministic("B_u1", B_u1)
    pm.Deterministic("B_u2", B_u2)
    sec = starry.Secondary(
        starry.Map(ydeg=B["ydeg"], udeg=B["udeg"], inc=B_inc, amp=B_amp),
        r=B_r,
        m=B_m,
        porb=B_porb,
        prot=B_prot,
        t0=B_t0,
        inc=B_inc,
    )
    sec.map[1] = B_u1
    sec.map[2] = B_u2

    # System
    sys = starry.System(pri, sec)

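Now let's define the likelihood. We give the system the observed flux and the data covariance (here the variance `sigma ** 2`), place a Gaussian prior with covariance `L` on the spherical harmonic coefficients of each map, and add the log of the marginal likelihood returned by `sys.lnlike` to the model as a `pm.Potential`.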

In [None]:
with model:
    sys.set_data(flux, C=sigma ** 2)
    pri.map.set_prior(L=1e-2)
    sec.map.set_prior(L=1e-2)
    pm.Potential("marginal", sys.lnlike(t=t))

Now that we've specified the model, it's a good idea to run a quick gradient-based optimization to find the maximum a posteriori (MAP) solution. This will give us a decent starting point for the inference problem.

In [None]:
%%time
with model:
    map_soln = xo.optimize()
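
To illustrate what finding a MAP solution means, here is a toy sketch using plain gradient descent on a one-parameter problem with a Gaussian likelihood and a Gaussian prior. All numbers and function names are hypothetical, and this is not what `xo.optimize` does internally; it only shows the idea of descending the negative log posterior.

```python
# Toy problem (hypothetical numbers): infer a single parameter `mu` from
# noisy measurements with known noise `noise_sd`, under a N(mu0, s0^2) prior.
measurements = [0.71, 0.68, 0.73, 0.69]
noise_sd = 0.05
mu0, s0 = 0.75, 0.1


def neg_log_post_grad(mu):
    """Gradient of the negative log posterior with respect to mu."""
    grad_like = sum(mu - d for d in measurements) / noise_sd ** 2
    grad_prior = (mu - mu0) / s0 ** 2
    return grad_like + grad_prior


def gradient_descent(mu_init, step=1e-4, n_iter=200):
    """Take fixed-size steps down the gradient of the negative log posterior."""
    mu = mu_init
    for _ in range(n_iter):
        mu -= step * neg_log_post_grad(mu)
    return mu


mu_map = gradient_descent(mu_init=mu0)

# Closed-form MAP for this conjugate toy problem, for comparison
precision = len(measurements) / noise_sd ** 2 + 1 / s0 ** 2
mu_exact = (sum(measurements) / noise_sd ** 2 + mu0 / s0 ** 2) / precision
```

In this toy case the MAP is available in closed form, which makes it easy to check that the descent has converged; in the real model we rely on the optimizer.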

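With the nonlinear parameters fixed at their MAP values, we can solve the linear problem for the two maps and visualize the result, first for the primary and then for the secondary:

In [None]: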
with model:
    MAP = xo.eval_in_model(sys.solve(t=t)[0], point=map_soln)
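
For intuition, a linear solve of this kind can be sketched in plain `numpy`. Assuming a generic linear model `f = A @ y + noise`, with a Gaussian prior `y ~ N(mu, L)` and noise covariance `C`, the posterior over `y` is Gaussian with the mean and covariance computed below. The function `linear_solve` and the toy numbers are hypothetical and are not the `starry` internals.

```python
import numpy as np


def linear_solve(A, f, C_inv, mu, L_inv):
    """Posterior mean and covariance of y for f = A @ y + noise.

    C_inv is the inverse data covariance and L_inv the inverse prior covariance.
    """
    W = A.T @ C_inv @ A + L_inv            # posterior precision matrix
    rhs = A.T @ C_inv @ f + L_inv @ mu     # precision-weighted data plus prior
    y_hat = np.linalg.solve(W, rhs)        # posterior mean
    cov = np.linalg.inv(W)                 # posterior covariance
    return y_hat, cov


# Toy example: recover the two coefficients of a straight line
t_toy = np.linspace(0, 1, 20)
A_toy = np.vstack([np.ones_like(t_toy), t_toy]).T
y_true = np.array([1.0, -0.5])
f_toy = A_toy @ y_true                     # noise-free data, for simplicity
C_inv_toy = np.eye(len(t_toy)) / 0.01 ** 2
y_hat, cov = linear_solve(A_toy, f_toy, C_inv_toy, np.zeros(2), np.eye(2))
```

With precise data and a broad prior, `y_hat` lands very close to `y_true`; a tighter prior would pull it toward `mu`.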

In [None]:
# Use a dedicated name to avoid shadowing the Python builtin `map`
map_A = starry.Map(ydeg=A["ydeg"])
map_A.inc = A["inc"]  # map_soln["A_inc"]
map_A[1:, :] = MAP[0]
map_A.show(theta=np.linspace(0, 360, 50))

In [None]:
# Use a dedicated name to avoid shadowing the Python builtin `map`
map_B = starry.Map(ydeg=B["ydeg"])
map_B.inc = B["inc"]  # map_soln["B_inc"]
map_B[1:, :] = MAP[1]
map_B.show(theta=np.linspace(0, 360, 50))

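Finally, let's sample the posterior with NUTS, starting each chain from the MAP solution:

In [None]:
%%time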
with model:
    trace = pm.sample(
        tune=250,
        draws=500,
        start=map_soln,
        chains=4,
        cores=1,
        step=xo.get_dense_nuts_step(target_accept=0.9),
    )

Let's look at a summary of the posterior on the radius of the secondary:

In [None]:
varnames = ["B_r"]
display(pm.summary(trace, varnames=varnames).head())

And here is the posterior on the radius of the secondary, with the true value marked:

In [None]:
corner(trace["B_r"], truths=[B["r"]]);