Skip to content

Releases: dom0015/surrDAMH

Version 0.2.0

28 May 21:34
Compare
Choose a tag to compare

surrDAMH

Python implementation of surrogate-accelerated Markov chain Monte Carlo methods for postrior sampling in Bayesian inversion. Based on the delayed acceptance Metropolis-Hastings algorithm.

Requirements

  • numpy
  • scipy
  • mpi4py
  • matplotlib
  • torch (for neural network surrogate model)
  • scikit-learn (for polynomial surrogate model)
  • pandas (for post-processing)
  • emcee (for post-processing)

Bayesian inverse problem setting

What is given:

  • forward model: $G:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$
  • observations (noisy outputs of the forward model): $y\in\mathbb{R}^{m}$
  • prior probability density function (pdf): $f_{U}$
  • noise pdf: $f_{Z}$

Input parameters $u\in\mathbb{R}^{n}$ are unknown.

Additive noise is considered; therefore, the posterior pdf is given by the following formula:

$f_{U|Y}\left(u|y\right)\propto{f_{Z}\left(y-G\left(u\right)\right)}{f_{U}\left(u\right)}$

Parallel processes:

  • several Markov chains generated in parallel
  • the chains share one surrogate model, which is refined during the sampling process using data from all chains
  • the chains share a pool of spawned solvers (processes that evaluate $G$)
    • assumed to be the computationally most demanding part
    • a solver is typically a linked numerical library
    • number of solvers is typically lower than number of chains

Getting started:

  • open repository in the Docker container (e.g. using the Dev Containers extension of Visual Studio Code)
  • pip install .

Before running the sampling process, it is necessary to specify:

  • configuration (basic settings, e.g. number of solvers, initial samples, ...)
  • prior
  • likelihood
  • solver specification
  • surrogate model updater
  • list of stages

Check examples in the toy_examples folder, e.g.:

  • cd toy_examples/
  • mpiexec -n 4 python3 -m mpi4py minimal_example.py

Surrogate models:

The following non-intrusive surrogate models are implemented. They are constructed from snapshots $\left(u^{\left(k\right)},G\left(u^{\left(k\right)}\right)\right)$ and they can be adaptively refined during the sampling process.

  • polynomial chaos approximation - complete polynomials, adaptive increase of maximum degree based on available data (using scikit-learn)
  • radial basis functions interpolation (RBF) - combined with polynomials up to chosen degree (using SciPy)
  • approximation using nearest points identified using kd-tree (using SciPy)
  • multilayer perceptron regressor (using PyTorch)

Version 0.1.0

14 Mar 12:28
Compare
Choose a tag to compare

surrDAMH 1.0

Python implementation of surrogate-accelerated Markov chain Monte Carlo methods for the numerical realization of the Bayesian inversion

The library is based on the delayed acceptance Metropolis-Hastings algorithm and accelerated using surrogate models and using parallelization. It provides samples from the posterior distribution π(u|y) ∝ fη(y - G(u)) π0(u), where y is a given vector of observations, G is an observation operator, fη is a probability density function (pdf) of Gaussian observational noise, π0(u) is Gaussian prior pdf.

Requirements

  • numpy
  • scipy
  • pandas
  • json
  • mpi4py
  • petsc4py (for "Darcy" example)
  • MyFEM (for "Darcy" example)
  • pcdeflation (for the use of an own deflation basis in "Darcy" example)
    • make -C examples/solvers/pcdeflation clean
    • make -C examples/solvers/pcdeflation build
  • cython (for pcdeflation build)

Instructions for running the sampling process

  • problem_name
    • defines the name of the json configuration file that will be loaded: "conf/" + problem_name + ".json"
    • prepared toy examples: "simple", "simple_MPI", "Darcy" (in the form of json configuration files)
  • N = number of sampling processes

run the sampling process:

python3 run.py problem_name N (oversubscribe)

  • toy examples:

    • python3 run.py simple 4
    • python3 run.py simple_MPI 4
    • python3 run.py Darcy 4
  • use "oversubscribe" if there are not enough slots available, for example:

    • python3 run.py simple 4 oversubscribe
  • python3 run.py simple 4   <=>   mpirun -n 4 python3 -m mpi4py surrDAMH/process_SAMPLER.py : -n 1 python3 -m mpi4py surrDAMH/process_SOLVER.py simple : -n 1 python3 -m mpi4py surrDAMH/process_COLLECTOR.py

  • python3 run.py simple 4 oversubscribe   <=>   mpirun -n 4 --oversubscribe python3 -m mpi4py surrDAMH/process_SAMPLER.py : -n 1 --oversubscribe python3 -m mpi4py surrDAMH/process_SOLVER.py simple : -n 1 --oversubscribe python3 -m mpi4py surrDAMH/process_COLLECTOR.py

Obtained samples are saved into saved_samples/problem_name.

run the visualization of obtained samples:

python3 run.py problem_name N visualize

  • toy examples:

    • python3 run.py simple 4 visualize
    • python3 run.py simple_MPI 4 visualize
    • python3 run.py Darcy 4 visualize
  • python3 run.py simple 4 visualize   <=>   python3 examples/visualization/simple.py 4

MPI processes

  • process_SAMPLER.py: N sampling processes based on the Metropolis-Hastings (MH) and the delayed-acceptance MH algorithm
  • process_SOLVER.py: solvers parent process that spawns new MPI processes (number and type specified in the JSON configuration file)
  • process_COLLECTOR.py: collects snapshots and uses them for the construction and updates of the surrogate model ("poly" or "rbf" ... polynomial or radial basis functions based surrogate model)

spawned solvers:

  • provide evaluations of an observation operator

    observations = G(parameters)

  • e.g. wrapper of an external library (see the "Darcy" example)

Path and constructor arguments of the spawed solver (wrapper) class are specified in the JSON configuration file. The following methods are required (executed by all MPI processes in the spawned communicator):

  • set_parameters(self, parameters: list[no_parameters]) -> None
  • get_observations(self) -> list[no_observations]

MPI processes

JSON configuration file example & comments

{

"problem_name": "Darcy",

"no_parameters": 4,

"no_observations": 3, length of the vector of the observations, repetitive observations are not supported

"problem_parameters": {

"prior_mean": 0.0, scalar or vector of length no_parameters

"prior_std": 1.0, scalar or vector or covariance matrix

"observations": [9.62638828, -5.90755323, -3.71883564], vector of observations

"noise_std": [0.2, 0.1, 0.1] standard deviation or covariance matrix of observational noise

},

"paths_to_append": [ optional

"/home/simona/GIT/Simple_Python_PETSc_FEM"

],

"solver_module_path": "examples/solvers/FEM_interfaces.py",

"solver_module_name": "FEM_interfaces",

"solver_init_name": "FEM", spawned solver class

"solver_parameters": { spawned solver class constructor arguments (in the form of a dictionary)

},

"solver_parent_parameters": { optional

"maxprocs": 4 number of MPI processes of each spawned solver

},

"no_solvers": 3, number of separate solvers to be spawned

"samplers_list": [ dictionary of sampling algoritms run by all sampling processes in consecutive order

{

"type": "MH", MH (does not use surrogate model) or DAMH (uses surrogate model)

"max_samples": 200, maximal length of the chain

"time_limit": 60, maximal sampling time in seconds

"proposal_std": 0.2, standard deviation or covariance matrix of Gaussian proposal distribution

"surrogate_is_updated": true update surrogate while sampling

},

{

"type": "DAMH",

"max_samples": 2000,

"time_limit": 60,

"proposal_std": 0.2,

"surrogate_is_updated": true

},

{

"type": "DAMH",

"max_samples": 10000,

"time_limit": 60,

"proposal_std": 0.2,

"surrogate_is_updated": false

}

],

"surrogate_type": "rbf", "poly" or "rbf" (polynomial or radial basis functions based surrogate model)

"surr_solver_parameters": { optional - arguments of class Surrogate_apply that evaluates surrogate model

"kernel_type": "polyharmonic",

"beta": 3

},

"surr_updater_parameters": { optional - arguments of class Surrogate_update that updates surrogate model

"kernel_type": "polyharmonic",

"beta": 3,

"max_centers": 500,

"expensive": false

}

}