# Demo No. 12 - Light curve fitting.

This demo will guide you through the process of inferring the parameters of an eclipsing binary from the shape of its light curve. We will start again by importing the necessary modules and setting up our logging:

In [3]:
from IPython.core.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

%matplotlib notebook
import os
import astropy.units as u

from elisa.conf import config
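# the task class instantiated later in this demo; its import path is assumed to match that of LCData
from elisa.analytics import LCData, BinarySystemAnalyticsTask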
from elisa.analytics.params.parameters import BinaryInitialParameters


# setting up a custom logging config to prevent unreasonably long log messages during fit
config.LOG_CONFIG = 'jupyter_fit_logging.json'
config.set_up_logging()

In this particular case of KIC 4851217, we will make use of the results obtained earlier from the radial velocity fit. Radial velocities are much more sensitive to the mass ratio of the components than the light curve is, so we will adopt the value obtained in the previous demo and keep it fixed.

The procedure itself is very similar to the one described in the previous demo (No. 11). First, we will initialize the dataset with our Kepler phase curve:

In [4]:
kepler_data = LCData.load_from_file(filename='demo_data/lc_data/kepler_phs_crv.dat', 
                                    x_unit=u.dimensionless_unscaled, 
                                    y_unit=u.dimensionless_unscaled
                                   )

Let's finally define our starting parameters. They are divided into the `system`, `primary` and `secondary` categories. Each category is a dictionary whose keys are parameter names and whose values are dictionaries describing the starting value of the parameter (`value`), its status (`fixed`: True/False), the boundaries of a fitted parameter (`min`, `max`) and its unit in astropy format (`unit`). For the `semi_major_axis` parameter we used the ability to constrain a parameter to the `system@inclination` parameter, since a*sin(i) was determined during the radial velocity fit:

In [5]:
lc_initial = {
    'system': {
        'inclination': {
            'value': 81.0,
            'fixed': False,
            'min': 79,
            'max': 83,
            'unit': u.deg
        },
        'eccentricity': {
            'value': 0.01,
            'fixed': False,
            'min': 0.00,
            'max': 0.05,
        },
        'mass_ratio': {
            'value': 1.077,
            'fixed': True,
        },  # value obtained during RV fit
        'argument_of_periastron': {
            'value': 285,
            'fixed': False,
            'min': 0,
            'max': 360,
            'unit': u.deg
        },  # similar case to eccentricity
        'semi_major_axis': {
            'constraint': '11.86 / sin(radians(system@inclination))'
        },  # using parameter asin(i) obtained from RV fit
        'period': {
            'value': 2.47028376,
            'fixed': True,
            'unit': u.d
        },
        'additional_light': {
            'value': 0.2415143632,
            'fixed': False,
            'min': 0.20,
            'max': 0.30,
        },  # there is strong evidence for the presence of the third body
        'phase_shift': {
            'value': 0,
            'fixed': False,
            'min': -0.02,
            'max': 0.02,
        },  # accounting for misalignment of the phase curve
    },
    'primary': {
        't_eff': {
            'value': 7022.0,
            'fixed': True,
            'unit': u.K
        },  # fixed to value found in literature
        'surface_potential': {
            'value': 8.1,
            'fixed': False,
            'min': 7.6,
            'max': 8.7,
        },
        'albedo': {
            'value': 1.0,
            'fixed': True
        },
        'gravity_darkening': {
            'value': 1.0,
            'fixed': True
        },
    },
    'secondary': {
        't_eff': {
            'value': 6961.75240,
            'fixed': False,
            'min': 6700.0,
            'max': 7300.0,
            'unit': u.K
        },
        'surface_potential': {
            'value': 5.2,
            'fixed': False,
            'min': 4.8,
            'max': 5.8,
        },
        'albedo': {
            'value': 1.0,
            'fixed': True
        },
        'gravity_darkening': {
            'value': 1.0,
            'fixed': True
        },  # we are presuming radiative envelopes on both components
    },
}
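
Before handing such a dictionary to a fit, it can be worth checking that every variable parameter starts inside its own boundaries. The following is a minimal sketch of such a check in plain Python; the function `check_initial_parameters` is a hypothetical helper written for this demo and not part of ELISa, and the secondary surface potential of 5.9 is a deliberately wrong value used only to trigger a message.

```python
def check_initial_parameters(params):
    """Return a list of problems found in a nested starting-parameter dictionary."""
    problems = []
    for group, entries in params.items():  # e.g. 'system', 'primary', 'secondary'
        for name, spec in entries.items():
            label = f"{group}@{name}"
            if 'constraint' in spec:
                continue  # constrained parameters carry no value or boundaries
            if spec.get('fixed', False):
                continue  # fixed parameters do not need boundaries
            if 'min' not in spec or 'max' not in spec:
                problems.append(f"{label}: variable parameter without min/max")
            elif not spec['min'] <= spec['value'] <= spec['max']:
                problems.append(
                    f"{label}: value {spec['value']} outside [{spec['min']}, {spec['max']}]"
                )
    return problems


# shortened example in the same layout as lc_initial
example = {
    'system': {
        'inclination': {'value': 81.0, 'fixed': False, 'min': 79, 'max': 83},
        'mass_ratio': {'value': 1.077, 'fixed': True},
        'semi_major_axis': {'constraint': '11.86 / sin(radians(system@inclination))'},
    },
    'secondary': {
        # 5.9 is intentionally outside the boundaries to demonstrate the check
        'surface_potential': {'value': 5.9, 'fixed': False, 'min': 4.8, 'max': 5.8},
    },
}

problems = check_initial_parameters(example)
print(problems)
```

An empty list means that all variable parameters start within their search intervals.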

This is followed by the initialization of `BinarySystemAnalyticsTask`, where light curves are supplied as the values of a dictionary whose keys are the corresponding filter names:

In [18]:
task = BinarySystemAnalyticsTask(light_curves={'Kepler': kepler_data})

Make sure that the keys of the `light_curves` argument, i.e. the passband names, are identical to the passband names available in your ELISa installation:

In [19]:
config.PASSBANDS

['bolometric',
 'Generic.Bessell.U',
 'Generic.Bessell.B',
 'Generic.Bessell.V',
 'Generic.Bessell.R',
 'Generic.Bessell.I',
 'SLOAN.SDSS.u',
 'SLOAN.SDSS.g',
 'SLOAN.SDSS.r',
 'SLOAN.SDSS.i',
 'SLOAN.SDSS.z',
 'Generic.Stromgren.u',
 'Generic.Stromgren.v',
 'Generic.Stromgren.b',
 'Generic.Stromgren.y',
 'Kepler',
 'GaiaDR2']
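
The comparison of passband names can also be done programmatically. The sketch below uses a hard-coded subset of the list above so that it runs without ELISa; in the notebook one would presumably pass `config.PASSBANDS` instead. The helper `unknown_passbands` is a hypothetical name introduced only for this illustration.

```python
def unknown_passbands(light_curves, passbands):
    """Return the light-curve keys that do not match any known passband name."""
    return sorted(set(light_curves) - set(passbands))


# subset of the passband list printed above
available = ['bolometric', 'Generic.Bessell.V', 'SLOAN.SDSS.g', 'Kepler', 'GaiaDR2']

# passband names are case sensitive, so 'kepler' is reported as unknown
print(unknown_passbands({'Kepler': None}, available))
print(unknown_passbands({'kepler': None, 'Kepler': None}, available))
```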

As in the `rv_fit` module, we can choose between the `standard` and `community` approaches. The lists of available parameters can be accessed in a similar fashion:

In [4]:
task.lc_fit.FIT_PARAMS_COMBINATIONS

{'standard': ['p__mass',
  's__mass',
  'inclination',
  'eccentricity',
  'argument_of_periastron',
  'period',
  'primary_minimum_time',
  'p__t_eff',
  's__t_eff',
  'p__surface_potential',
  's__surface_potential',
  'p__gravity_darkening',
  's__gravity_darkening',
  'p__albedo',
  's__albedo',
  'additional_light',
  'phase_shift',
  'p__synchronicity',
  's__synchronicity',
  'p__metallicity',
  's__metallicity',
  'p__spots',
  's__spots',
  'p__pulsations',
  's__pulsations'],
 'community': ['mass_ratio',
  'semi_major_axis',
  'inclination',
  'eccentricity',
  'argument_of_periastron',
  'period',
  'primary_minimum_time',
  'p__t_eff',
  's__t_eff',
  'p__surface_potential',
  's__surface_potential',
  'p__gravity_darkening',
  's__gravity_darkening',
  'p__albedo',
  's__albedo',
  'additional_light',
  'phase_shift',
  'p__synchronicity',
  's__synchronicity',
  'p__metallicity',
  's__metallicity',
  'p__spots',
  's__spots',
  'p__pulsations',
  's__pulsations']}

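Using the parameter names from this dictionary, we can define our starting parameters in flat form, where the `p__` and `s__` prefixes mark the parameters of the primary and secondary component: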

In [5]:
lc_initial = {
    'inclination': {
        'value': 81.0,
        'fixed': False,
        'min': 79,
        'max': 83
    },
    'eccentricity': {
        'value': 0.01,
        'fixed': False,
        'min': 0.00,
        'max': 0.05,
    },  # value obtained during RV fit but kept variable due to poor coverage of RV measurements and sensitivity of LC 
        # on eccentricity
    'mass_ratio': {
        'value': 1.077,
        'fixed': True,
    },  # value obtained during RV fit
    'argument_of_periastron': {
        'value': 285,
        'fixed': False,
        'min': 0,
        'max': 360,
    },  # similar case to eccentricity
    'p__t_eff': {
        'value': 7022.0,
        'fixed': True,
    },  # fixed to value found in literature
    's__t_eff': {
        'value': 6961.75240,
        'fixed': False,
        'min': 6700.0,
        'max': 7300.0
    },
    'p__surface_potential': {
        'value': 8.1,
        'fixed': False,
        'min': 7.6,
        'max': 8.7,
    },
    's__surface_potential': {
        'value': 5.2,
        'fixed': False,
        'min': 4.8,
        'max': 5.8,
    },
    'p__albedo': {
        'value': 1.0,
        'fixed': True
    },  
    's__albedo': {
        'value': 1.0,
        'fixed': True
    },
    'p__gravity_darkening': {
        'value': 1.0,
        'fixed': True
    },  
    's__gravity_darkening': {
        'value': 1.0,
        'fixed': True
    },  # we are presuming radiative envelopes on both components
    'semi_major_axis': {
        'constraint': '11.86 / sin(radians({inclination}))'
    },  # using parameter asin(i) obtained from RV fit
    'period': {
        'value': 2.47028376,
        'fixed': True,
    },
    'additional_light': {
        'value': 0.2415143632,
        'fixed': False,
        'min': 0.20,
        'max': 0.30,
    },  # there is strong evidence for the presence of the third body
    'phase_shift': {
        'value': 0,
        'fixed': False,
        'min': -0.02,
        'max': 0.02,
    },  # accounting for misalignment of the phase curve
}

Here we used the constrained parameter `semi_major_axis` to make use of the `asini` value obtained during the RV fit. The constraint itself is supplied as a string in which the other fit parameters it depends on are enclosed in curly brackets. The list of operators and numerals that can be used to form a constraint can be accessed here:

In [6]:
task.CONSTRAINT_OPERATORS

['arcsin',
 'arccos',
 'arctan',
 'log',
 'sin',
 'cos',
 'tan',
 'exp',
 'degrees',
 'radians',
 '(',
 ')',
 '+',
 '-',
 '*',
 '/',
 '.',
 '0',
 '1',
 '2',
 '3',
 '4',
 '5',
 '6',
 '7',
 '8',
 '9']
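
To illustrate how such a constraint string turns into a number, here is a minimal sketch of an evaluator restricted to the functions named in the list above. It is not ELISa's implementation: the function `evaluate_constraint` and the mapping `ALLOWED_FUNCTIONS` are assumptions made for this example, and the inclination of 81.31 degrees is taken from the least squares summary shown later in this demo.

```python
import math
import re

# functions permitted inside a constraint, mirroring the names in the list above
ALLOWED_FUNCTIONS = {
    'arcsin': math.asin, 'arccos': math.acos, 'arctan': math.atan,
    'log': math.log, 'sin': math.sin, 'cos': math.cos, 'tan': math.tan,
    'exp': math.exp, 'degrees': math.degrees, 'radians': math.radians,
}


def evaluate_constraint(constraint, values):
    """Evaluate a constraint string such as '11.86 / sin(radians({inclination}))'."""
    # turn '{inclination}' into the bare name 'inclination'
    expression = re.sub(r'\{(\w+)\}', r'\1', constraint)
    namespace = {**ALLOWED_FUNCTIONS, **values}
    # every identifier has to be an allowed function or a supplied parameter
    for name in re.findall(r'[A-Za-z_]\w*', expression):
        if name not in namespace:
            raise ValueError(f'unsupported name in constraint: {name}')
    # only names, digits, whitespace, '.', arithmetic operators and parentheses may remain
    if re.search(r'[^\w\s.+\-*/()]', expression):
        raise ValueError('unsupported character in constraint')
    return eval(expression, {'__builtins__': {}}, namespace)


a_lsq = evaluate_constraint('11.86 / sin(radians({inclination}))', {'inclination': 81.31})
print(f'semi-major axis for i = 81.31 deg: {a_lsq:.3f} solRad')
```

Because the semi-major axis is tied to the inclination in this way, it is recomputed whenever the inclination changes and never appears as a free parameter.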

## Least squares method 

Before the fit itself, it is recommended to make use of the multiprocessing capabilities of this package. The least squares method is executed serially, therefore parallelization has to be implemented within the calculation of each light curve. This can be achieved by setting the configuration variable `NUMBER_OF_PROCESSES` in this script or by setting the `number_of_processes` parameter in your config file:

In [7]:
config.NUMBER_OF_PROCESSES = os.cpu_count()  # this will make sure to utilize all available processors

We can now finally perform the fit itself. As in the previous demo, we will start with the least squares method. Since we expect the system to be detached, we set the `morphology` argument to 'detached' instead of 'over-contact'. We also reduced the discretization factor of the primary component to 10, since we expect the secondary component to be much larger than the primary. A suitable value of the discretization factor can reduce computational time significantly. The interpolation threshold `interp_treshold` defines the maximum number of points in the observed data above which the synthetic light curve is calculated on `interp_treshold` equidistant phases and subsequently interpolated to produce residuals for every observed data point.
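
The idea behind the interpolation threshold can be sketched in plain Python as follows. This is only an illustration under stated assumptions: `toy_light_curve` is a stand-in for the expensive synthetic model, the helper names are hypothetical, linear interpolation is assumed, and the keyword is spelled `interp_threshold` here, whereas the ELISa argument used in this demo is `interp_treshold`.

```python
import math


def toy_light_curve(phase):
    """Stand-in for an expensive synthetic model: a single Gaussian-shaped eclipse."""
    return 1.0 - 0.3 * math.exp(-(phase / 0.03) ** 2)


def linear_interpolate(x, xs, ys):
    """Linearly interpolate the tabulated curve (xs ascending) at position x."""
    for i in range(1, len(xs)):
        if x <= xs[i]:
            weight = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
            return ys[i - 1] + weight * (ys[i] - ys[i - 1])
    return ys[-1]  # guards against rounding at the upper edge of the grid


def synthetic_at_observed(model, observed_phases, interp_threshold):
    """Evaluate `model` directly for small data sets, otherwise on an equidistant
    grid of `interp_threshold` (at least 2) phases followed by interpolation."""
    if len(observed_phases) <= interp_threshold:
        return [model(p) for p in observed_phases]
    start, stop = min(observed_phases), max(observed_phases)
    step = (stop - start) / (interp_threshold - 1)
    grid = [start + i * step for i in range(interp_threshold)]
    grid_flux = [model(p) for p in grid]  # the only expensive model evaluations
    return [linear_interpolate(p, grid, grid_flux) for p in observed_phases]


observed = [-0.5 + i / 1000 for i in range(1001)]  # 1001 observed phases
synthetic = synthetic_at_observed(toy_light_curve, observed, interp_threshold=150)
worst = max(abs(s - toy_light_curve(p)) for s, p in zip(synthetic, observed))
print(f'{len(synthetic)} points, largest interpolation error {worst:.4f}')
```

In this toy case the model is evaluated 150 times instead of 1001 times, at the cost of a small interpolation error around the eclipse.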

In [8]:
param_file = 'demo_data/aux/lc_least_squares_params.json'

# # this part can take a few hours to complete in case of an eccentric orbit
# fit_params = task.lc_fit.fit(x0=lc_initial, method='least_squares', morphology='detached', discretization=10,
#                              interp_treshold=150)
# task.lc_fit.store_parameters(filename=param_file)

# loading stored results
fit_params = task.lc_fit.load_parameters(filename=param_file)
task.lc_fit.fit_summary()

# BINARY SYSTEM
# Parameter                                        value            -1 sigma            +1 sigma                unit    status                                            
#-----------------------------------------------------------------------------------------------------------------------------
Mass ratio (q=M_2/M_1):                            1.077                                                                Variable                                          
Semi major axis (a):                              11.998                                                      solRad    Variable                                          
Inclination (i):                                   81.31                                                      degree    Variable                                          
Eccentricity (e):                                   0.03                                                                Variable                                          
Ar

The resulting model can be visualized as well:

In [9]:
task.lc_fit.plot.model(discretization=10, number_of_points=150, start_phase=-0.6, stop_phase=0.6)

2020-04-14 09:12:06,691 - 11849 - observer.mp - INFO: starting observation worker for batch index 2
2020-04-14 09:12:06,691 - 11848 - observer.mp - INFO: starting observation worker for batch index 1
2020-04-14 09:12:06,696 - 11850 - observer.mp - INFO: starting observation worker for batch index 3
2020-04-14 09:12:06,691 - 11847 - observer.mp - INFO: starting observation worker for batch index 0
2020-04-14 09:12:06,765 - 11849 - binary_system.curves.lc - INFO: surface geometry at some orbital positions will not be recalculated due to similarities to previous orbital positions
2020-04-14 09:12:06,758 - 11848 - binary_system.curves.lc - INFO: surface geometry at some orbital positions will not be recalculated due to similarities to previous orbital positions
2020-04-14 09:12:06,783 - 11847 - binary_system.curves.lc - INFO: surface geometry at some orbital positions will not be recalculated due to similarities to previous orbital positions
2020-04-14 09:12:06,792 - 11850 - binary_system.

## Markov chain Monte Carlo (MCMC)

Markov chain Monte Carlo (MCMC) is also implemented in the light curve fitting module, where it serves a similar purpose: to produce reliable error estimates for the parameters of your eclipsing binary. Since this method supports parallel execution, we can change our approach to parallelism by changing a few config variables, because it turns out to be much more efficient to parallelize at the level of the MCMC method itself:

In [10]:
config.NUMBER_OF_PROCESSES = 1  # we want a single process approach on the light curve integration level
config.NUMBER_OF_MCMC_PROCESSES = os.cpu_count()

The least squares method has already found the approximate position of the solution, and the good coverage of the phase curve means that we can probably trust it. We can therefore use this solution as the starting point for our MCMC sampling and to reduce the size of the search space, which speeds up the burn-in phase. Estimating the size of the search box for each parameter is left to the user; however, the validity of the selected interval can be checked a posteriori by studying the traces, where the chain distribution should not be clipped by the borders of the search box:

In [11]:
lc_initial = {
    'inclination': {
        'value': fit_params['inclination']['value'],
        'fixed': False,
        'min': fit_params['inclination']['value'] - 2,
        'max': fit_params['inclination']['value'] + 2
    },
    'eccentricity': {
        'value': fit_params['eccentricity']['value'],
        'fixed': False,
        'min': fit_params['eccentricity']['value'] - 0.01,
        'max': fit_params['eccentricity']['value'] + 0.01,
    },
    'mass_ratio': {
        'value': 1.077,
        'fixed': True,
    },  # value obtained during RV fit
    'argument_of_periastron': {
        'value': fit_params['argument_of_periastron']['value'],
        'fixed': False,
        'min': fit_params['argument_of_periastron']['value'] - 30,
        'max': fit_params['argument_of_periastron']['value'] + 30,
    },  
    'p__t_eff': {
        'value': 7022.0,
        'fixed': True,
    },  # fixed to value found in literature
    's__t_eff': {
        'value': fit_params['s__t_eff']['value'],
        'fixed': False,
        'min': fit_params['s__t_eff']['value'] - 100,
        'max': fit_params['s__t_eff']['value'] + 100
    },
    'p__surface_potential': {
        'value': fit_params['p__surface_potential']['value'],
        'fixed': False,
        'min': fit_params['p__surface_potential']['value'] - 0.2,
        'max': fit_params['p__surface_potential']['value'] + 0.2,
    },
    's__surface_potential': {
        'value': fit_params['s__surface_potential']['value'],
        'fixed': False,
        'min': fit_params['s__surface_potential']['value'] - 0.1,
        'max': fit_params['s__surface_potential']['value'] + 0.1,
    },
    'p__albedo': {
        'value': 1.0,
        'fixed': True
    },  
    's__albedo': {
        'value': 1.0,
        'fixed': True
    },
    'p__gravity_darkening': {
        'value': 1.0,
        'fixed': True
    },  
    's__gravity_darkening': {
        'value': 1.0,
        'fixed': True
    },  # we are presuming radiative envelopes on both components
    'semi_major_axis': {
        'constraint': '11.86 / sin(radians({inclination}))'
    },  # using parameter asin(i) obtained from RV fit
    'period': {
        'value': 2.47028376,
        'fixed': True,
    },
    'additional_light': {
        'value': fit_params['additional_light']['value'],
        'fixed': False,
        'min': fit_params['additional_light']['value'] - 0.03,
        'max': fit_params['additional_light']['value'] + 0.03,
    },  # there is strong evidence for the presence of the third body
    'phase_shift': {
        'value': fit_params['phase_shift']['value'],
        'fixed': True,
    },  # phase shift parameter was determined sufficiently in least squares method 
        # and its main purpose was to remove any imperfections in supplied phase curve or ephemeris, 
        # therefore, its error estimation is no longer of any significant importance to us
}
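
The repeated "value plus or minus half-width" pattern in the dictionary above can be wrapped in a small helper. The sketch below is a hypothetical convenience function, not part of ELISa; the optional `lower` and `upper` arguments are an added assumption that keeps the box inside physical limits, for example a non-negative eccentricity.

```python
def box_around(value, half_width, lower=None, upper=None):
    """Build a variable-parameter entry centred on a previous solution,
    optionally clipped to physical limits."""
    low, high = value - half_width, value + half_width
    if lower is not None:
        low = max(low, lower)
    if upper is not None:
        high = min(high, upper)
    return {'value': value, 'fixed': False, 'min': low, 'max': high}


# inclination rounded from the least squares summary above
inclination_box = box_around(81.31, 2)
# hypothetical small eccentricity, showing the clipping at the physical limit of 0
eccentricity_box = box_around(0.005, 0.01, lower=0.0)

print(inclination_box)
print(eccentricity_box)
```

With the loaded results, an entry could then be written as `box_around(fit_params['inclination']['value'], 2)`.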

This initial vector can now be used to perform MCMC sampling:

In [12]:
param_file = 'demo_data/aux/lc_mcmc_params.json'
# fit_params = task.lc_fit.fit(x0=lc_initial, method='mcmc', nsteps=1000, burn_in=0, morphology='detached',
#                              discretization=10, interp_treshold=150, progress=True)

# task.lc_fit.store_parameters(filename=param_file)

# again, due to the very time-consuming nature of MCMC sampling, we will load pre-calculated results
chain_file = 'demo_data/aux/lc_mcmc_chain.json'
task.lc_fit.load_parameters(filename=param_file)
task.lc_fit.load_chain(filename=chain_file)

task.lc_fit.fit_summary()

# BINARY SYSTEM
# Parameter                                        value            -1 sigma            +1 sigma                unit    status                                            
#-----------------------------------------------------------------------------------------------------------------------------
Mass ratio (q=M_2/M_1):                            1.077                                                                Variable                                          
Semi major axis (a):                              12.007                                                      solRad    Variable                                          
Inclination (i):                                 81.0128             -0.0046              0.0028              degree    Variable                                          
Eccentricity (e):                               0.037139           -0.000259             2.4e-05                        Variable                                          
Ar

With the `burn_in` parameter set to 0, we have to inspect the traces not only for any sign of clipping but also to determine how large a portion of the chain belongs to the thermalization stage and has to be discarded:

In [13]:
task.lc_fit.plot.traces()

Here we can see that the chain reaches the sampling phase after roughly 2000 steps. Therefore, we have to discard the first 2000 steps using the `discard` argument:

In [14]:
task.lc_fit.load_chain(filename=chain_file, discard=2000)

(array([[0.48487982, 0.65851191, 0.26095683, ..., 0.64215529, 0.43612447,
         0.40279391],
        [0.48784044, 0.65745998, 0.26023961, ..., 0.65091757, 0.4348926 ,
         0.39865376],
        [0.48687438, 0.64609288, 0.2812855 , ..., 0.63897936, 0.43752101,
         0.41400257],
        ...,
        [0.4879476 , 0.657797  , 0.26010918, ..., 0.65638301, 0.43505752,
         0.39814654],
        [0.48738214, 0.65697515, 0.26007648, ..., 0.65025003, 0.43590682,
         0.39773416],
        [0.48819143, 0.6576412 , 0.26009808, ..., 0.65716855, 0.43507312,
         0.39786526]]),
 ['inclination',
  'eccentricity',
  'argument_of_periastron',
  's__t_eff',
  'p__surface_potential',
  's__surface_potential',
  'additional_light'],
 {'inclination': [79.06078610807855, 83.06078610807855],
  'eccentricity': [0.0239893249235908, 0.043989324923590804],
  'argument_of_periastron': [138.07564135071294, 198.07564135071294],
  'gamma': [0, 1000000.0],
  'p__mass': [0.1, 50],
  's__mass': [0.1

And plot the corner plot:

In [15]:
task.lc_fit.plot.corner(truths=True)


As is obvious from the corner plot, 1000 MCMC steps were not sufficient to produce a statistical sample large enough to reliably determine the errors of the fitted parameters.
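
The two checks described in this section, discarding the thermalization stage and looking for clipping at the borders of the search box, can be sketched on a plain list of samples. Everything here is an assumption made for illustration: the chain is synthetic, the helper names are hypothetical, and the 16th, 50th and 84th percentiles are used as a common convention for the value and its -1 sigma and +1 sigma errors, which may differ from what ELISa reports.

```python
def percentile(ascending, q):
    """Linearly interpolated percentile (0 <= q <= 100) of an ascending list."""
    position = (len(ascending) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ascending) - 1)
    return ascending[lower] + (position - lower) * (ascending[upper] - ascending[lower])


def summarize(chain, discard, bounds, margin=0.02):
    """Median, -1/+1 sigma offsets and the fraction of samples lying within
    `margin` (relative to the box width) of either border of the search box."""
    kept = sorted(chain[discard:])  # drop the thermalization stage
    median = percentile(kept, 50)
    edge = margin * (bounds[1] - bounds[0])
    near_border = sum(1 for s in kept if s - bounds[0] < edge or bounds[1] - s < edge)
    return {
        'value': median,
        '-1 sigma': percentile(kept, 16) - median,
        '+1 sigma': percentile(kept, 84) - median,
        'clipped_fraction': near_border / len(kept),
    }


# synthetic one-parameter chain: 200 drifting steps followed by 801 stationary samples
burn_in = [79.5 + 0.0075 * k for k in range(200)]
stationary = [80.6 + 0.001 * ((37 * k) % 801) for k in range(801)]
chain = burn_in + stationary

raw = summarize(chain, discard=0, bounds=(79.0, 83.0))
summary = summarize(chain, discard=200, bounds=(79.0, 83.0))
print('without discard:', raw)
print('with discard:   ', summary)
```

A non-zero `clipped_fraction` suggests that the search box is too narrow for that parameter, while a value that shifts noticeably when `discard` is increased suggests that part of the thermalization stage is still included.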