## Aim

Compute an occurrence rate estimate with approximate Bayesian computation (ABC), and compare the results with ```occurrence_mcmc```.

In [1]:
#!/usr/bin/env python
import numpy as np
from matplotlib import pyplot as plt
%matplotlib inline
import pandas as pd
from scipy import optimize, stats, linalg
from utils import dfm
from utils.abc import ABCSampler
# import astroabc

%load_ext autoreload
%autoreload 2

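My ABC setup requires:

- a prior for the model parameters $\theta$, i.e. a ```stats.rv_continuous``` or ```stats.rv_discrete``` object whose ```.rvs()``` method returns candidate parameters
- a sampler for $f(x \mid \theta)$, which generates synthetic data $x$ for a given $\theta$
- a statistic function that summarizes a dataset, and a distance function that compares two such summaries
- the data to fit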

I'm not 100% sure how any of these work.
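
As a rough sketch of how the four pieces fit together, here is plain ABC rejection sampling on a toy Poisson problem. This is not the ```utils.abc.ABCSampler``` interface, which is not shown here; the function ```abc_rejection```, its arguments, the tolerance ```eps``` and all toy values are assumptions made up for illustration.

```python
import numpy as np
from scipy import stats

def abc_rejection(prior, simulate, statistic, distance, data, eps, n_draws, seed=0):
    """Keep prior draws whose simulated summary lies within eps of the observed one."""
    rng = np.random.default_rng(seed)
    observed = statistic(data)
    accepted = []
    for _ in range(n_draws):
        theta = prior.rvs(random_state=rng)           # candidate from the prior
        synthetic = simulate(theta, len(data), rng)   # draw x ~ f(x | theta)
        if distance(statistic(synthetic), observed) <= eps:
            accepted.append(theta)
    return np.array(accepted)

# Toy problem (hypothetical values): recover a Poisson rate of 2.0.
toy_rng = np.random.default_rng(42)
data = toy_rng.poisson(2.0, size=200)

prior = stats.uniform(loc=0.0, scale=10.0)            # theta ~ Uniform(0, 10)
simulate = lambda theta, n, rng: rng.poisson(theta, size=n)
statistic = np.mean                                   # summary of a dataset
distance = lambda a, b: abs(a - b)                    # compares two summaries

posterior = abc_rejection(prior, simulate, statistic, distance,
                          data, eps=0.1, n_draws=5000)
print(len(posterior), posterior.mean())
```

The accepted draws approximate the posterior for $\theta$; shrinking ```eps``` tightens the approximation at the cost of fewer accepted samples.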

In [2]:
kois = dfm.get_kois()
stellar = dfm.stlr
kois = kois[kois["kepid"].isin(stellar["kepid"])]
# kois = dfm.kois_cuts(kois) - don't do the cuts! Not enough data to work with otherwise
# the only part of the cuts that is necessary is to get rid of NaN radii
kois = kois[np.isfinite(kois["koi_prad"])]

AttributeError: module 'utils.dfm' has no attribute 'stlr'

In [None]:
starcounts = np.array(pd.crosstab(index=kois['kepid'], columns="count")).flatten()
plt.hist(starcounts)
plt.xlabel("Planets per star")
plt.ylabel("Counts")

Now the dataset has sufficient complexity to fit a Poisson process.
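
One caveat when fitting the rate: ```starcounts``` is tabulated from ```kois```, so it only contains stars with at least one KOI. If the intended rate is per star over the whole stellar sample, the plain mean of those counts is biased high. The sketch below assumes the counts follow a zero-truncated Poisson distribution and solves for the underlying rate; the helper ```truncated_poisson_rate``` and the synthetic data are hypothetical, not part of ```utils```.

```python
import numpy as np
from scipy import optimize

def truncated_poisson_rate(counts):
    """MLE of lambda for a zero-truncated Poisson, given counts that are all >= 1."""
    mean = float(np.mean(counts))
    if mean <= 1.0:
        return 0.0  # every star has exactly one planet: the MLE sits at the boundary
    # The truncated mean is lam / (1 - exp(-lam)); it increases with lam,
    # so the root of f lies between 0 and the sample mean.
    f = lambda lam: lam / (-np.expm1(-lam)) - mean
    return optimize.brentq(f, 1e-9, mean)

# Synthetic check with a known rate of 0.8 (made-up value).
rng = np.random.default_rng(0)
full = rng.poisson(0.8, size=100_000)
observed = full[full > 0]            # stars with zero planets never enter the table

naive = observed.mean()              # what a plain mean of the counts gives
corrected = truncated_poisson_rate(observed)
print(naive, corrected)
```

When the full stellar table is available, dividing the total number of planets by the total number of stars estimates the same untruncated rate directly.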

In [None]:
assert np.all(np.isfinite(kois["koi_prad"]))
assert np.all(np.isfinite(kois["koi_period"]))
radii = kois["koi_prad"]
periods = kois["koi_period"]

In [None]:
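# crosstab returns a one-column DataFrame; reduce it explicitly to a scalar rate
lam = pd.crosstab(index=kois['kepid'], columns="count").to_numpy().mean()
# draw a planet count for every star, capped at 10
planet_numbers = np.minimum(stats.poisson(lam).rvs(size=(len(stellar),)), 10)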

In [None]:
plt.hist(planet_numbers)

In [None]:
min_period, min_radius = periods.min(), radii.min()
max_period, max_radius = periods.max(), radii.max()
# left edges of factor-of-two buckets; the last bucket, [edge, 2 * edge], reaches the maximum
period_buckets = [min_period * 2 ** i for i in range(int(np.ceil(np.log2(max_period / min_period))))]
radius_buckets = [min_radius * 2 ** i for i in range(int(np.ceil(np.log2(max_radius / min_radius))))]
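
The bucket lists above hold the left edge of each factor-of-two bucket. To count planets per period-radius cell with ```np.histogram2d```, the closing right edge is needed as well. A minimal sketch on synthetic data follows; the helper ```log2_edges``` and the uniform toy samples are assumptions for illustration, not the KOI catalogue.

```python
import numpy as np

def log2_edges(lo, hi):
    """Edges lo * 2**i for i = 0..n, with n chosen so the last edge is >= hi."""
    n = max(int(np.ceil(np.log2(hi / lo))), 1)
    return lo * 2.0 ** np.arange(n + 1)

# Synthetic stand-ins for the period and radius columns (made-up ranges).
rng = np.random.default_rng(0)
toy_periods = rng.uniform(0.5, 400.0, size=1000)
toy_radii = rng.uniform(0.3, 20.0, size=1000)

period_edges = log2_edges(toy_periods.min(), toy_periods.max())
radius_edges = log2_edges(toy_radii.min(), toy_radii.max())

# counts[i, j] is the number of planets in period bucket i and radius bucket j
counts, _, _ = np.histogram2d(toy_periods, toy_radii,
                              bins=[period_edges, radius_edges])
print(counts.shape, int(counts.sum()))
```

Because the first edge equals the minimum and the last edge is at or above the maximum, every sample lands in some cell.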