# Nested Samplers available in Gleipnir

Gleipnir currently has four classes that can be used to set up and launch Nested Sampling runs: one for using a built-in classic Nested Sampling implementation and three interface classes for external Nested Sampling codes. In this notebook, we will explore each of these Nested Samplers and how to use them.

## Model: Egg Carton likelihood

For the purposes of this tutorial, we will again use the Egg Carton likelihood landscape model (as in the [Intro to Nested Sampling Notebook](https://github.com/LoLab-VU/Gleipnir/blob/master/jupyter_notebooks/Intro_to_Nested_Sampling_with_Gleipnir.ipynb)).

The model is typically two-dimensional (two parameters) and the landscape generated by the likelihood function is a multi-modal egg carton-like shape; see slide 15 of [this pdf](http://www.nbi.dk/~koskinen/Teaching/AdvancedMethodsInAppliedStatistics2016/Lecture14_MultiNest.pdf) for a visualization of the likelihood landscape. The parameters are each defined on \[0:10pi\] with uniform priors. 

Here is the Egg Carton loglikelihood function which returns the natural logarithm of the likelihood for a given parameter vector:

In [None]:
# Import NumPy
import numpy as np
# Define the loglikelihood function.
def loglikelihood(parameter_vector):
    chi = (np.cos(parameter_vector)).prod()
    return (2. + chi)**5
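
As a quick sanity check, the function can be evaluated at a couple of hand-picked points. The sketch below re-implements the same formula using only the standard library (`math` instead of NumPy, an assumption made purely to keep the snippet dependency-free; the name `loglikelihood_stdlib` is hypothetical) and evaluates it where the product of cosines is +1 and where it is -1.

```python
import math

def loglikelihood_stdlib(parameter_vector):
    # Same formula as the NumPy version: (2 + prod(cos(theta_i)))**5
    chi = math.prod(math.cos(p) for p in parameter_vector)
    return (2.0 + chi) ** 5

# At the origin both cosines are 1, so chi = 1.
peak = loglikelihood_stdlib([0.0, 0.0])
# At (pi, 0) the cosines are -1 and 1, so chi = -1.
trough = loglikelihood_stdlib([math.pi, 0.0])
print("peak: {}, trough: {}".format(peak, trough))
```

Because the product of cosines is bounded by -1 and 1, the loglikelihood is bounded by 1 and 243 (that is, 1**5 and 3**5).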

### Sampled Parameters

Now that we have our loglikelihood function, let's look at how to define parameters for sampling during the Nested Sampling run. The parameters that are sampled are defined by a list of SampledParameter class instances. The SampledParameter class object stores data on the name of the parameter and the parameter's prior probability distribution. 

In [None]:
# Import the SampledParameter class.
from gleipnir.sampled_parameter import SampledParameter

A new SampledParameter needs two arguments: a name and an object defining the prior.

For priors we can use frozen RV objects from scipy.stats; in special cases you could also write your own prior distribution class objects, but for most purposes scipy.stats distributions will be sufficient.
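
To make the idea of a custom prior object concrete, here is a minimal sketch of a hand-written uniform prior. It mirrors a small part of the interface of a scipy.stats frozen distribution (`pdf`, `logpdf`, `ppf`, `rvs`) and uses the same `loc`/`scale` convention, where the support is [loc, loc + scale]. Which of these methods Gleipnir actually calls is not stated in this notebook, so treat the method set, the simplified `rvs` signature, and the class name `UniformPrior` as assumptions.

```python
import math
import random

class UniformPrior:
    """Hypothetical stand-in mimicking part of a scipy.stats frozen uniform."""

    def __init__(self, loc=0.0, scale=1.0, seed=None):
        self.loc = loc
        self.scale = scale
        self._rng = random.Random(seed)

    def pdf(self, x):
        # Constant density inside [loc, loc + scale], zero outside.
        inside = self.loc <= x <= self.loc + self.scale
        return 1.0 / self.scale if inside else 0.0

    def logpdf(self, x):
        density = self.pdf(x)
        return math.log(density) if density > 0.0 else -math.inf

    def ppf(self, q):
        # Inverse CDF: maps a quantile in [0, 1] onto the support.
        return self.loc + q * self.scale

    def rvs(self):
        # Draw one random variate by inverse transform sampling.
        return self.ppf(self._rng.random())

prior_x = UniformPrior(loc=0.0, scale=10.0 * math.pi, seed=1)
draw = prior_x.rvs()
print("draw: {}, density: {}".format(draw, prior_x.pdf(draw)))
```

For most purposes the scipy.stats distributions remain the simpler choice; a class like this would only be needed for a distribution that scipy.stats does not provide.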

In [None]:
# Let's import the uniform distribution.
from scipy.stats import uniform

In [None]:
# Now we'll create our list of sampled parameters.
# There are two parameters, 'x' and 'y', each with a uniform prior on [0:10pi].
sampled_parameters = list()
sampled_parameters.append(SampledParameter(name='x', prior=uniform(loc=0.0,scale=10.0*np.pi)))
sampled_parameters.append(SampledParameter(name='y', prior=uniform(loc=0.0,scale=10.0*np.pi)))  

Now we've defined our list of parameters that are to be sampled and their prior probability distributions. We'll use these sampled parameters for each of the Nested Samplers.

## 1. NestedSampling
The first Nested Sampler is Gleipnir's built-in implementation of the classic Nested Sampling algorithm, NestedSampling. It is imported from the gleipnir.nested_sampling module. 

In [None]:
from gleipnir.nested_sampling import NestedSampling

This Nested Sampler uses a plug-in style approach to sampling (i.e., replacing dead points) and stopping criterion (i.e., when to exit the Nested Sampling run). Therefore, we need to define a sampler and stopping criterion.

### Sampler
The sampler is used in the classic Nested Sampling algorithm to replace dead points during each nested iteration. The samplers are imported from the samplers module:

In [None]:
# Import the sampler we want to use during the Nested Sampling run.
from gleipnir.samplers import MetropolisComponentWiseHardNSRejection

Currently, Gleipnir just has the one sampler: MetropolisComponentWiseHardNSRejection. This sampler uses a [Metropolis Monte Carlo](http://xbeams.chem.yale.edu/~batista/vaa/node42.html) scheme adapted for Nested Sampling with a hard Nested likelihood level rejection criterion. During each Nested Sampling iteration, the most recent dead point is replaced with a survivor that is then modified using the sampler. More samplers may be added in the future.

Now we can initialize the sampler:

In [None]:
# Initialize the sampler.
sampler = MetropolisComponentWiseHardNSRejection(iterations=10, tuning_cycles=1)

Here iterations=10 sets the total number of component-wise update cycles, so with our two sampled parameters a value of 10 yields 20 component-wise Monte Carlo trial moves (i.e., iterations*(number of sampled parameters)). tuning_cycles=1 sets the number of trial move size tuning cycles; each tuning cycle is 20 iterations.
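
To illustrate what a component-wise update with a hard likelihood constraint looks like, here is a minimal standard-library sketch. It is not Gleipnir's implementation: the function name `hard_rejection_update`, the fixed `step_size`, the uniform proposal, and the example threshold are all assumptions chosen for illustration, and trial move size tuning is left out. With a uniform prior and a symmetric proposal, a trial move is accepted whenever it stays inside the prior bounds and its loglikelihood exceeds the current Nested Sampling threshold.

```python
import math
import random

def loglikelihood(parameter_vector):
    chi = math.prod(math.cos(p) for p in parameter_vector)
    return (2.0 + chi) ** 5

def hard_rejection_update(start, logl_threshold, iterations,
                          step_size, lower, upper, rng):
    """Component-wise Metropolis moves under a hard likelihood constraint."""
    point = list(start)
    n_trials = 0
    for _ in range(iterations):
        # One update cycle: try to move each parameter in turn.
        for k in range(len(point)):
            trial = list(point)
            trial[k] += step_size * rng.uniform(-1.0, 1.0)
            n_trials += 1
            in_bounds = lower <= trial[k] <= upper
            # Hard rejection: discard moves at or below the threshold.
            if in_bounds and loglikelihood(trial) > logl_threshold:
                point = trial
    return point, n_trials

rng = random.Random(0)
threshold = 100.0          # hypothetical current likelihood level
survivor = [0.1, 0.1]      # a point already above the threshold
new_point, n_trials = hard_rejection_update(
    survivor, threshold, iterations=10, step_size=0.5,
    lower=0.0, upper=10.0 * math.pi, rng=rng)
print("trial moves: {}".format(n_trials))
```

With two parameters and iterations=10, the loop performs 20 trial moves, matching the count described above.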

### Stopping Criterion

The stopping criterion sets the method used to determine when to terminate the Nested Sampling iterations. Stopping criteria are imported from the stopping_criterion module. There are currently two criteria that can be used:
  * NumberOfIterations - Stop after a fixed number of nested iterations.
  * RemainingPriorMass - Stop after the remaining fraction of prior mass is less than or equal to a preset threshold.

Here, we'll use the fixed number of iterations:

In [None]:
# Import the stopping criterion object. In this case, we'll use a fixed number of iterations.
from gleipnir.stopping_criterion import NumberOfIterations
# Initialize the stopping criterion -- We'll stop after 1000 Nested Sampling iterations.
stopping_criterion = NumberOfIterations(1000)

Now that we've got the sampled parameters, sampler, and stopping criterion, all we need now is the Nested Sampling population size. Let's go ahead and set it:

In [None]:
# Set the NS population size.
population_size=500
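
The number of iterations and the population size together determine how much of the prior is explored. In classic Nested Sampling the expected remaining prior mass after i iterations with a population of N points is approximately exp(-i/N). The sketch below applies that standard approximation to the values chosen here; it is not taken from Gleipnir's code, and the 0.001 target is a hypothetical value used only for illustration.

```python
import math

population_size = 500
n_iterations = 1000

# Expected remaining prior mass after n_iterations: exp(-i/N).
remaining_prior_mass = math.exp(-n_iterations / population_size)
print("Expected remaining prior mass: {:.4f}".format(remaining_prior_mass))

# Invert the relation: iterations needed to shrink to a target prior mass.
target_prior_mass = 0.001  # hypothetical threshold
iterations_needed = math.ceil(-population_size * math.log(target_prior_mass))
print("Iterations to reach {}: {}".format(target_prior_mass, iterations_needed))
```

This kind of estimate can help when choosing between a fixed iteration count and a prior mass threshold.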

Now we create the instance of the NestedSampling class:

In [None]:
NS = NestedSampling(sampled_parameters, loglikelihood,
                    sampler, population_size,
                    stopping_criterion)

Then the Nested Sampling is launched with the run function:

In [None]:
log_evidence, log_evidence_error = NS.run(verbose=False)

Then we can check the output:

In [None]:
print(" Ln(Evidence): {}+-{}".format(log_evidence, log_evidence_error))
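
Because this model has only two parameters with uniform priors, the Nested Sampling estimate can be cross-checked by brute force: with a uniform prior the evidence is simply the likelihood averaged over the prior box. The sketch below does that with a midpoint grid using only the standard library. The grid size of 400 points per dimension and the function name `grid_log_evidence` are assumptions for illustration, and the approach is only practical in very low dimensions.

```python
import math

def loglikelihood(parameter_vector):
    chi = math.prod(math.cos(p) for p in parameter_vector)
    return (2.0 + chi) ** 5

def grid_log_evidence(n_grid=400, upper=10.0 * math.pi):
    """Midpoint-rule estimate of ln(evidence) for uniform priors on [0, upper]."""
    h = upper / n_grid
    centers = [(i + 0.5) * h for i in range(n_grid)]
    logls = [loglikelihood((x, y)) for x in centers for y in centers]
    # Average the likelihood in a numerically stable way by factoring
    # out the largest loglikelihood before exponentiating.
    max_logl = max(logls)
    mean_scaled = sum(math.exp(l - max_logl) for l in logls) / len(logls)
    return max_logl + math.log(mean_scaled)

log_z = grid_log_evidence()
print(" Ln(Evidence) from grid: {}".format(log_z))
```

If the Nested Sampling estimate differs substantially from the grid value, that may indicate the run stopped too early or the population size is too small for this multi-modal landscape.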

In addition to the log_evidence and log_evidence_error estimates returned by the run function, the NestedSampling Nested Sampler has the following accessible properties:
  * evidence
  * evidence_error
  * information

In [None]:
evidence = NS.evidence
evidence_error = NS.evidence_error
information = NS.information

The NestedSampling Nested Sampler also has the following functions:
  * posteriors - Estimates of the posterior marginal probability distributions of each parameter. 
  * akaike_ic - Estimate of the Akaike Information Criterion.
  * bayesian_ic - Estimate of the Bayesian Information Criterion.
  * deviance_ic - Estimate of the Deviance Information Criterion.

## 2. MultiNestNestedSampling

Gleipnir also provides a Nested Sampler class, MultiNestNestedSampling, that is an interface to [MultiNest](https://academic.oup.com/mnras/article/398/4/1601/981502). Use of MultiNest via Gleipnir requires separate building and installation of both MultiNest and the Python wrapper PyMultiNest; see Gleipnir's [README](https://github.com/LoLab-VU/Gleipnir#multinest) for the links to those instructions.

It is imported from the multinest module:

In [None]:
from gleipnir.multinest import MultiNestNestedSampling

Then the instance is created with a similar input pattern as the NestedSampling, but only requires the sampled parameters, loglikelihood, and population size:

In [None]:
MNNS = MultiNestNestedSampling(sampled_parameters, loglikelihood, population_size)

However, the MultiNestNestedSampling object does have some extra keyword arguments that can be passed in to alter the behavior of the MultiNest run (internally passed along to PyMultiNest):
  * importance_nested_sampling (bool): Should MultiNest use
    Importance Nested Sampling (INS). Default: True
  * constant_efficiency_mode (bool): Should MultiNest run in
    constant sampling efficiency mode. Default: False
  * sampling_efficiency (float): Set the MultiNest sampling
    efficiency. 0.3 is recommended for evidence evaluation,
    while 0.8 is recommended for parameter estimation.
    Default: 0.8
  * resume (bool): Resume from a previous MultiNest run (using
    the last saved checkpoint in the MultiNest output files).
    Default: True
  * write_output (bool): Specify whether MultiNest should write
    to output files. True is required for additional
    analysis. Default: True
  * multimodal (bool): Set whether MultiNest performs mode
    separation. Default: True
  * max_mode (int): Set the maximum number of modes allowed in
    mode separation (if multimodal=True). Default: 100
  * mode_tolerance (float): A lower bound that MultiNest will
    use to separate mode samples and statistics with a
    log-evidence value greater than the given value.
    Default: -1e90
  * n_clustering_params (int): If multimodal=True, set the number
    of parameters to use in clustering during mode separation.
    If None, then MultiNest will use all the parameters for
    clustering during mode separation. If
    n<(number of sampled parameters), then MultiNest will only
    use a subset composed of the first n parameters for
    clustering during mode separation. Default: None
  * null_log_evidence (float): If multimodal=True, a lower bound
    that MultiNest can use to separate mode samples and
    statistics with a local log-evidence value greater than the
    given bound. Default: -1.e90
  * log_zero (float): Set a threshold value for which points with
    a loglikelihood less than the given value will be ignored
    by MultiNest. Default: -1e100
  * max_iter (int): Set the maximum number of nested sampling
    iterations performed by MultiNest. If 0, then it is
    unlimited and MultiNest will stop using a different
    criterion. Default: 0
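
As a sketch of how these keyword arguments might be combined, the snippet below collects a few of them in a dictionary and passes them to the constructor. The option names and the 0.3 sampling efficiency for evidence evaluation come from the list above; the dictionary name `EVIDENCE_OPTIONS` and the helper `build_multinest_sampler` are hypothetical. The import is deferred inside the helper because it requires Gleipnir, MultiNest, and PyMultiNest to be installed.

```python
# Options aimed at evidence evaluation, using keyword names from the list above.
EVIDENCE_OPTIONS = {
    "importance_nested_sampling": True,
    "sampling_efficiency": 0.3,   # recommended above for evidence evaluation
    "multimodal": True,
    "max_iter": 0,                # 0 means no iteration limit
}

def build_multinest_sampler(sampled_parameters, loglikelihood, population_size):
    """Hypothetical helper: create a MultiNestNestedSampling with the options."""
    # Deferred import: needs Gleipnir, MultiNest, and PyMultiNest installed.
    from gleipnir.multinest import MultiNestNestedSampling
    return MultiNestNestedSampling(sampled_parameters, loglikelihood,
                                   population_size, **EVIDENCE_OPTIONS)
```

Keeping the options in one dictionary makes it easy to switch between an evidence-oriented setup and one tuned for parameter estimation (for example, a sampling efficiency of 0.8).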

By default, MultiNest will output a set of files with the root 'multinest_run_'. If you want to change the file root you can do so via the multinest_file_root property:

In [None]:
print("Default: {}".format(MNNS.multinest_file_root))
MNNS.multinest_file_root = 'run_eggcarton_multinest_'
print("Changed to: {}".format(MNNS.multinest_file_root))

Now the MultiNest output files will begin with 'run_eggcarton_multinest_'.

Once the instance has been defined, you start the Nested Sampling using the run function:

In [None]:
log_evidence, log_evidence_error = MNNS.run(verbose=False)

In [None]:
print(" Ln(Evidence): {}+-{}".format(log_evidence, log_evidence_error))

In addition to the log_evidence and log_evidence_error estimates returned by the run function, the MultiNestNestedSampling Nested Sampler has the following accessible properties:
  * evidence
  * evidence_error
  
The MultiNestNestedSampling Nested Sampler also has the following functions:
  * posteriors - Estimates of the posterior marginal probability distributions of each parameter. 
  * akaike_ic - Estimate of the Akaike Information Criterion.
  * bayesian_ic - Estimate of the Bayesian Information Criterion.
  * deviance_ic - Estimate of the Deviance Information Criterion.  

## 3. PolyChordNestedSampling

Gleipnir also provides a Nested Sampler class, PolyChordNestedSampling, that is an interface to the public version of [PolyChord](https://github.com/PolyChord/PolyChordLite). Use of PolyChord via Gleipnir requires separate building and installation of pypolychord; see Gleipnir's [README](https://github.com/LoLab-VU/Gleipnir#polychord) for the links to those instructions.

It is imported from the polychord module:

In [None]:
from gleipnir.polychord import PolyChordNestedSampling

Then the instance is created with a similar input pattern as the NestedSampling, but like the MultiNestNestedSampling object it only requires the sampled parameters, loglikelihood, and population size:

In [None]:
PCNS = PolyChordNestedSampling(sampled_parameters, loglikelihood, population_size)

The PolyChordNestedSampling object does not have any extra keyword arguments to worry about.

By default, PolyChord will output a set of files with the root 'polychord_run_'. If you want to change the file root you can do so via the polychord_file_root property:

In [None]:
print("Default: {}".format(PCNS.polychord_file_root))
PCNS.polychord_file_root = 'run_eggcarton_polychord_'
print("Changed to: {}".format(PCNS.polychord_file_root))

Now the PolyChord output files will begin with 'run_eggcarton_polychord_'.

Once the instance has been defined, you start the Nested Sampling using the run function:

In [None]:
%%capture
# Run. There is no verbose setting for the PCNS object's run function. 
log_evidence, log_evidence_error = PCNS.run()

In [None]:
print(" Ln(Evidence): {}+-{}".format(log_evidence, log_evidence_error))

In addition to the log_evidence and log_evidence_error estimates returned by the run function, the PolyChordNestedSampling Nested Sampler has the following accessible properties:
  * evidence
  * evidence_error
  
The PolyChordNestedSampling Nested Sampler also has the following functions:
  * posteriors - Estimates of the posterior marginal probability distributions of each parameter. 
  * akaike_ic - Estimate of the Akaike Information Criterion.
  * bayesian_ic - Estimate of the Bayesian Information Criterion.
  * deviance_ic - Estimate of the Deviance Information Criterion.  

## 4. DNest4NestedSampling

Gleipnir also provides a Nested Sampler class, DNest4NestedSampling, that is an interface to [DNest4](https://github.com/eggplantbren/DNest4). Use of DNest4 via Gleipnir requires separate building and installation of DNest4 and its Python bindings; see Gleipnir's [README](https://github.com/LoLab-VU/Gleipnir#dnest4) for the links to those instructions.

It is imported from the dnest4 module:

In [None]:
from gleipnir.dnest4 import DNest4NestedSampling

Then the instance is created with a similar input pattern as the NestedSampling, but only requires the sampled parameters, loglikelihood, and population size:

In [None]:
DNS = DNest4NestedSampling(sampled_parameters, loglikelihood, population_size)

However, the DNest4NestedSampling object does have some extra keyword arguments that can be passed in to alter the behavior of the DNest4 run:
  * n_diffusive_levels (int): Set the maximum number of
      diffusive likelihood levels for DNest4 to use.
      Default: 20
  * dnest4_backend (str : "memory" or "csv"): Set which
      DNest4 backend to use. "memory" means outputs are 
      just kept in memory. "csv" means outputs are 
      written to disk in files with a csv format. Default: "memory"
  * num_steps (int): The number of nested iterations to run. If None, 
      will run forever. 
      Default: None
  * new_level_interval (int): The number of nested iterations to run before
     creating a new diffusive likelihood level. Default: 10000
  * lam (float): Set the backtracking scale length. Default: 5.0
  * beta (float): Set the strength of effect to force the histogram
      to equal bin counts. Default: 100

Let's reinitialize the sampler and set the num_steps keyword argument (so that DNest4 will not run forever).

In [None]:
DNS = DNest4NestedSampling(sampled_parameters, loglikelihood, population_size, num_steps=500)

If the dnest4_backend keyword argument is set to "csv", then DNest4 will output a set of files with the root 'dnest4_run_'. If you want to change the file root you can do so via the dnest4_file_root property:

In [None]:
print("Default: {}".format(DNS.dnest4_file_root))
DNS.dnest4_file_root = 'run_eggcarton_dnest4_'
print("Changed to: {}".format(DNS.dnest4_file_root))

Now the DNest4 output files (written when using the "csv" backend) will begin with 'run_eggcarton_dnest4_'.

Once the instance has been defined, you start the Nested Sampling using the run function:

In [None]:
log_evidence, log_evidence_error = DNS.run(verbose=False)

In [None]:
print(" Ln(Evidence): {}+-{}".format(log_evidence, log_evidence_error))

In addition to the log_evidence and log_evidence_error estimates returned by the run function, the DNest4NestedSampling Nested Sampler has the following accessible properties:
  * evidence
  * evidence_error
  * information
  
The DNest4NestedSampling Nested Sampler also has the following functions:
  * posteriors - Estimates of the posterior marginal probability distributions of each parameter. 
  * akaike_ic - Estimate of the Akaike Information Criterion.
  * bayesian_ic - Estimate of the Bayesian Information Criterion.
  * deviance_ic - Estimate of the Deviance Information Criterion.  