# Posterior and Marginal distributions

This notebook continues `bandwidths.ipynb`, applying the bandwidth results and the class used to obtain the n-dimensional PDF.

#### Some considerations: 
1. As **prior** information we take the cleaned data (without NaN or inf values) from exoplanet.eu; this cleaning is the first part of the notebook.

2. The PDF from the `optimal_pdf` class plays the role of the **likelihood** for a given number of variables in synthetic systems with no perturbation, low perturbation and high perturbation.

3. To get the **marginal** distribution of a variable of interest, we follow the same marginalization procedure as the example in the notebook `3D.ipynb`.
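
The three considerations above can be sketched on a regular grid. This is only a minimal illustration under stated assumptions: the grid, the toy Gaussian `likelihood` and the toy `prior` are hypothetical stand-ins, not the output of `optimal_pdf` or the exoplanet.eu data, and the integration is a plain Riemann sum rather than the procedure used in `3D.ipynb`.

```python
import numpy as np

# Hypothetical 2-D grid over two variables x and y (illustrative names).
x = np.linspace(-5.0, 5.0, 201)
y = np.linspace(-5.0, 5.0, 201)
dx = x[1] - x[0]
dy = y[1] - y[0]
X, Y = np.meshgrid(x, y, indexing="ij")

# Toy likelihood: a correlated Gaussian bump standing in for a KDE-based PDF.
likelihood = np.exp(-0.5 * (X**2 + (Y - 0.5 * X)**2))

# Toy prior: broad Gaussian in x, flat in y.
prior = np.exp(-0.5 * (X / 3.0)**2)

# The posterior is proportional to prior * likelihood; normalize it on the grid.
posterior = prior * likelihood
posterior /= posterior.sum() * dx * dy

# Marginal of x: integrate the posterior over y (Riemann sum along axis 1).
marginal_x = posterior.sum(axis=1) * dy
```

With `indexing="ij"`, axis 0 follows `x` and axis 1 follows `y`, so summing over axis 1 removes `y` and leaves a one-dimensional density in `x` that integrates to one on the grid.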

In [1]:
import numpy as np
import pandas as pd
import warnings; warnings.simplefilter('ignore')

import nbimporter
from bandwidths import optimal_pdf  # import the class used to compute the PDF

Importing Jupyter notebook from bandwidths.ipynb


In [2]:
from IPython.display import HTML
HTML("""
<style>
.output_png {
    display: table-cell;
    text-align: center;
    vertical-align: middle;
}
</style>
""")

## Data Cleaning

### 1. Simulation data   

In [3]:
#========================== Simulation Data ==========================
dn=pd.read_csv('data/proc_no_pert.csv',index_col=None); dn["gia"]=dn.ngi>0 # without perturbations
dl=pd.read_csv('data/proc_lo_pert.csv',index_col=None); dl["gia"]=dl.ngi>0 # with low perturbations
dh=pd.read_csv('data/proc_hi_pert.csv',index_col=None); dh["gia"]=dh.ngi>0 # with high perturbations

In [4]:
#======================= Simulation variables ========================
# Suffixes: t = terrestrial, g = giant
dnt=dn[~dn["gia"]]; dng=dn[dn["gia"]] # without perturbations
dlt=dl[~dl["gia"]]; dlg=dl[dl["gia"]] # low perturbations
dht=dh[~dh["gia"]]; dhg=dh[dh["gia"]] # high perturbations

x_variables = [dng,dlg,dhg,dnt,dlt,dht,dn,dl,dh]

for var in x_variables:
    var['logeff'] = np.log10(var.massefficiency)
    var['logcom'] = np.log10(var.com)
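
The split-then-transform step can also be written so that each subset is an explicit copy, which avoids pandas' chained-assignment warning instead of silencing it. The sketch below uses a made-up table `sim` and a hypothetical helper `add_logs`; only the column names `ngi`, `gia`, `massefficiency`, `com`, `logeff` and `logcom` come from the notebook.

```python
import numpy as np
import pandas as pd

# Toy stand-in for one simulation table; the values are made up for illustration.
sim = pd.DataFrame({
    "ngi": [0, 2, 0, 1],
    "massefficiency": [1e-3, 1e-2, 1e-4, 1e-1],
    "com": [1.0, 10.0, 100.0, 1000.0],
})
sim["gia"] = sim["ngi"] > 0  # True when the system hosts at least one giant

def add_logs(frame):
    """Return a copy of `frame` with log10 columns added (hypothetical helper)."""
    out = frame.copy()
    out["logeff"] = np.log10(out["massefficiency"])
    out["logcom"] = np.log10(out["com"])
    return out

terrestrial = add_logs(sim[~sim["gia"]])  # systems without giants
giant = add_logs(sim[sim["gia"]])         # systems with giants
```

Because `add_logs` works on a copy, the original `sim` table is left without the log columns; whether that is desirable depends on whether the full table is also needed in log form, as it is in the cell above.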

In [5]:
dnt.head()

Unnamed: 0.1,Unnamed: 0,ident,com,nplanets,massbudget,massefficiency,sigmag0,md,rc,ms,metal,taugas,qest,ngi,mtr,apert,gia,logeff,logcom
1,1,5.0,2.932894,12.0,17.882769,0.000488,102.431593,0.11,38.977428,1.075269,-0.15016,1014449.0,5.464831,0.0,17.882769,0.0,False,-3.311837,0.467296
3,3,8.0,5.740174,9.0,8.166382,0.000163,62.737337,0.15,58.158928,1.076658,-0.282408,6017040.0,4.704798,0.0,8.166382,0.0,False,-3.78694,0.758925
5,5,15.0,8.394027,8.0,16.003091,0.000436,106.824759,0.11,38.167542,0.986003,0.388613,2435406.0,5.218175,0.0,16.003091,0.0,False,-3.360068,0.92397
6,6,16.0,4.289089,24.0,12.426573,0.000219,118.54372,0.17,45.042137,1.258747,-0.352459,1107032.0,4.469478,0.0,12.426573,0.0,False,-3.658976,0.632365
7,7,17.0,3.771156,12.0,16.762554,0.000811,35.587738,0.062,49.645451,0.739731,0.121866,9050091.0,7.257983,0.0,16.762554,0.0,False,-3.09093,0.576474


### 2. Observational data 

Data obtained from <a href="http://exoplanet.eu/">exoplanet.eu</a>.

In [9]:
data_obs = pd.read_csv('data/exoplanet.eu_catalog.csv',
                       usecols = ['mass', 'mass_error_min', 'orbital_period', 'orbital_period_error_min',
                                  'semi_major_axis', 'semi_major_axis_error_min',
                                  'star_metallicity', 'star_metallicity_error_min', 'star_mass', 'star_mass_error_min'])

# Replace inf values and zeros with NaN, then drop every row that contains a NaN:
data_obs = data_obs.replace([np.inf, -np.inf], np.nan)
data_obs = data_obs.replace([0], np.nan)
data_obs = data_obs.dropna()
# Total of NaN values:
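
The effect of this cleaning sequence can be checked on a small example. The frame `toy` below is invented for illustration and is not taken from the catalog; it only reuses two of the column names. It also shows one way to count the NaN values per column before dropping rows, using `isna().sum()`.

```python
import numpy as np
import pandas as pd

# Made-up rows: one infinite mass and one metallicity equal to exactly zero.
toy = pd.DataFrame({
    "mass": [1.0, np.inf, 0.5, 2.0],
    "star_metallicity": [0.1, 0.2, 0.0, -0.3],
})

# Same sequence as above: inf -> NaN, 0 -> NaN.
cleaned = toy.replace([np.inf, -np.inf], np.nan).replace(0, np.nan)

# Number of NaN values per column before dropping rows.
nan_counts = cleaned.isna().sum()

# Keep only complete rows.
cleaned = cleaned.dropna()
```

Note that replacing `0` treats a value of exactly zero as missing in every column, so a star with `star_metallicity` equal to 0 would be discarded along with genuinely missing entries. Whether that is intended depends on how the catalog encodes missing data.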

In [7]:
data_obs

Unnamed: 0,# name,planet_status,mass,mass_error_min,mass_error_max,mass_sini,mass_sini_error_min,mass_sini_error_max,radius,radius_error_min,...,star_sp_type,star_age,star_age_error_min,star_age_error_max,star_teff,star_teff_error_min,star_teff_error_max,star_detected_disc,star_magnetic_field,star_alternate_names
61,55 Cnc e,Confirmed,0.02703,0.00135,0.00135,,,,0.17370,0.003390,...,K0IV-V,10.200,2.50,2.500,5196.0,24.0,24.0,,,
86,BD+20 594 b,Confirmed,0.05130,0.01900,0.01900,0.0513,0.019,0.019,0.19900,0.009800,...,K0,3.340,1.49,1.950,5766.0,99.0,99.0,,,2MASS 03343623+2035574
114,CoRoT-1 b,Confirmed,1.03000,0.12000,0.12000,,,,1.49000,0.080000,...,G0V,,,,6298.0,66.0,66.0,,,
115,CoRoT-10 b,Confirmed,2.75000,0.14000,0.14000,,,,0.97000,0.050000,...,K1V,3.000,,,5075.0,75.0,75.0,,,
116,CoRoT-11 b,Confirmed,2.33000,0.27000,0.27000,,,,1.43000,0.033000,...,F6V,2.000,1.00,1.000,6343.0,72.0,72.0,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4314,XO-5 b,Confirmed,1.07700,0.03700,0.03700,,,,1.03000,0.050000,...,G8V,8.500,0.80,0.800,5510.0,44.0,44.0,,,
4315,XO-6 b,Confirmed,1.90000,0.50000,0.50000,,,,2.07000,0.220000,...,F5,1.880,0.20,0.900,6720.0,100.0,100.0,,,TYC 4357-995-1
4316,XO-7 b,Confirmed,0.70900,0.03400,0.03400,,,,1.37300,0.026000,...,GOV,1.180,0.71,0.980,6250.0,100.0,100.0,,,BD+85 317
4341,kappa And b,Confirmed,13.00000,2.00000,12.00000,,,,1.20000,0.100000,...,B9IV,0.047,0.04,0.027,11361.0,66.0,66.0,,,"HD 222439, HR 8976, HIP 116805, 19 Andromedae"


In [8]:
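# `multidim_bw` is not defined in this notebook; the class imported above is `optimal_pdf`.
# Assumption: `optimal_pdf` takes the variables positionally and exposes `pdf_ndim()`.
a = optimal_pdf(dng.logeff, dng.logcom)
likelihood = a.pdf_ndim()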