surrDAMH

Python implementation of surrogate-accelerated Markov chain Monte Carlo methods for postrior sampling in Bayesian inversion. Based on the delayed acceptance Metropolis-Hastings algorithm.

Requirements

numpy
scipy
mpi4py
matplotlib
torch (for neural network surrogate model)
scikit-learn (for polynomial surrogate model)
pandas (for post-processing)
emcee (for post-processing)

Bayesian inverse problem setting

What is given:

forward model: $G:\mathbb{R}^{n}\rightarrow\mathbb{R}^{m}$
observations (noisy outputs of the forward model): $y\in\mathbb{R}^{m}$
prior probability density function (pdf): $f_{U}$
noise pdf: $f_{Z}$

Input parameters $u\in\mathbb{R}^{n}$ are unknown.

Additive noise is considered; therefore, the posterior pdf is given by the following formula:

$f_{U|Y}\left(u|y\right)\propto{f_{Z}\left(y-G\left(u\right)\right)}{f_{U}\left(u\right)}$

Parallel processes:

several Markov chains generated in parallel
the chains share one surrogate model, which is refined during the sampling process using data from all chains
the chains share a pool of spawned solvers (processes that evaluate $G$)
- assumed to be the computationally most demanding part
- a solver is typically a linked numerical library
- number of solvers is typically lower than number of chains

Getting started:

open repository in the Docker container (e.g. using the Dev Containers extension of Visual Studio Code)
pip install .

Before running the sampling process, it is necessary to specify:

configuration (basic settings, e.g. number of solvers, initial samples, ...)
prior
likelihood
solver specification
surrogate model updater
list of stages

Check examples in the toy_examples folder, e.g.:

cd toy_examples/
mpiexec -n 4 python3 -m mpi4py minimal_example.py

Surrogate models:

The following non-intrusive surrogate models are implemented. They are constructed from snapshots $\left(u^{\left(k\right)},G\left(u^{\left(k\right)}\right)\right)$ and they can be adaptively refined during the sampling process.

polynomial chaos approximation - complete polynomials, adaptive increase of maximum degree based on available data (using scikit-learn)
radial basis functions interpolation (RBF) - combined with polynomials up to chosen degree (using SciPy)
approximation using nearest points identified using kd-tree (using SciPy)
multilayer perceptron regressor (using PyTorch)

surrDAMH 1.0

Python implementation of surrogate-accelerated Markov chain Monte Carlo methods for the numerical realization of the Bayesian inversion

The library is based on the delayed acceptance Metropolis-Hastings algorithm and accelerated using surrogate models and using parallelization. It provides samples from the posterior distribution π(u|y) ∝ f_η(y - G(u)) π₀(u), where y is a given vector of observations, G is an observation operator, f_η is a probability density function (pdf) of Gaussian observational noise, π₀(u) is Gaussian prior pdf.

Requirements

numpy
scipy
pandas
json
mpi4py
petsc4py (for "Darcy" example)
MyFEM (for "Darcy" example)
- github: https://github.com/Beremi/Simple_Python_PETSc_FEM/tree/SD
pcdeflation (for the use of an own deflation basis in "Darcy" example)
- make -C examples/solvers/pcdeflation clean
- make -C examples/solvers/pcdeflation build
cython (for pcdeflation build)

Instructions for running the sampling process

problem_name
- defines the name of the json configuration file that will be loaded: "conf/" + problem_name + ".json"
- prepared toy examples: "simple", "simple_MPI", "Darcy" (in the form of json configuration files)
N = number of sampling processes

run the sampling process:

python3 run.py problem_name N (oversubscribe)

toy examples:
- python3 run.py simple 4
- python3 run.py simple_MPI 4
- python3 run.py Darcy 4
use "oversubscribe" if there are not enough slots available, for example:
- python3 run.py simple 4 oversubscribe
python3 run.py simple 4 <=> mpirun -n 4 python3 -m mpi4py surrDAMH/process_SAMPLER.py : -n 1 python3 -m mpi4py surrDAMH/process_SOLVER.py simple : -n 1 python3 -m mpi4py surrDAMH/process_COLLECTOR.py
python3 run.py simple 4 oversubscribe <=> mpirun -n 4 --oversubscribe python3 -m mpi4py surrDAMH/process_SAMPLER.py : -n 1 --oversubscribe python3 -m mpi4py surrDAMH/process_SOLVER.py simple : -n 1 --oversubscribe python3 -m mpi4py surrDAMH/process_COLLECTOR.py

Obtained samples are saved into saved_samples/problem_name.

run the visualization of obtained samples:

python3 run.py problem_name N visualize

toy examples:
- python3 run.py simple 4 visualize
- python3 run.py simple_MPI 4 visualize
- python3 run.py Darcy 4 visualize
python3 run.py simple 4 visualize <=> python3 examples/visualization/simple.py 4

MPI processes

process_SAMPLER.py: N sampling processes based on the Metropolis-Hastings (MH) and the delayed-acceptance MH algorithm
process_SOLVER.py: solvers parent process that spawns new MPI processes (number and type specified in the JSON configuration file)
process_COLLECTOR.py: collects snapshots and uses them for the construction and updates of the surrogate model ("poly" or "rbf" ... polynomial or radial basis functions based surrogate model)

spawned solvers:

provide evaluations of an observation operator

observations = G(parameters)
e.g. wrapper of an external library (see the "Darcy" example)

Path and constructor arguments of the spawed solver (wrapper) class are specified in the JSON configuration file. The following methods are required (executed by all MPI processes in the spawned communicator):

set_parameters(self, parameters: list[no_parameters]) -> None
get_observations(self) -> list[no_observations]

JSON configuration file example & comments

{

"problem_name": "Darcy",

"no_parameters": 4,

"no_observations": 3, length of the vector of the observations, repetitive observations are not supported

"problem_parameters": {

"prior_mean": 0.0, scalar or vector of length no_parameters

"prior_std": 1.0, scalar or vector or covariance matrix

"observations": [9.62638828, -5.90755323, -3.71883564], vector of observations

"noise_std": [0.2, 0.1, 0.1] standard deviation or covariance matrix of observational noise

},

"paths_to_append": [ optional

"/home/simona/GIT/Simple_Python_PETSc_FEM"

],

"solver_module_path": "examples/solvers/FEM_interfaces.py",

"solver_module_name": "FEM_interfaces",

"solver_init_name": "FEM", spawned solver class

"solver_parameters": { spawned solver class constructor arguments (in the form of a dictionary)

},

"solver_parent_parameters": { optional

"maxprocs": 4 number of MPI processes of each spawned solver

},

"no_solvers": 3, number of separate solvers to be spawned

"samplers_list": [ dictionary of sampling algoritms run by all sampling processes in consecutive order

{

"type": "MH", MH (does not use surrogate model) or DAMH (uses surrogate model)

"max_samples": 200, maximal length of the chain

"time_limit": 60, maximal sampling time in seconds

"proposal_std": 0.2, standard deviation or covariance matrix of Gaussian proposal distribution

"surrogate_is_updated": true update surrogate while sampling

},

{

"type": "DAMH",

"max_samples": 2000,

"time_limit": 60,

"proposal_std": 0.2,

"surrogate_is_updated": true

},

{

"type": "DAMH",

"max_samples": 10000,

"time_limit": 60,

"proposal_std": 0.2,

"surrogate_is_updated": false

}

],

"surrogate_type": "rbf", "poly" or "rbf" (polynomial or radial basis functions based surrogate model)

"surr_solver_parameters": { optional - arguments of class Surrogate_apply that evaluates surrogate model

"kernel_type": "polyharmonic",

"beta": 3

},

"surr_updater_parameters": { optional - arguments of class Surrogate_update that updates surrogate model

"kernel_type": "polyharmonic",

"beta": 3,

"max_centers": 500,

"expensive": false

}

}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

surrDAMH

Requirements

Bayesian inverse problem setting

Parallel processes:

Getting started:

Surrogate models:

surrDAMH 1.0

Python implementation of surrogate-accelerated Markov chain Monte Carlo methods for the numerical realization of the Bayesian inversion

Requirements

Instructions for running the sampling process

run the sampling process:

`python3 run.py problem_name N (oversubscribe)`

run the visualization of obtained samples:

`python3 run.py problem_name N visualize`

MPI processes

spawned solvers:

JSON configuration file example & comments

Releases: dom0015/surrDAMH

Version 0.2.0

surrDAMH

Requirements

Bayesian inverse problem setting

Parallel processes:

Getting started:

Surrogate models:

Version 0.1.0

surrDAMH 1.0

Python implementation of surrogate-accelerated Markov chain Monte Carlo methods for the numerical realization of the Bayesian inversion

Requirements

Instructions for running the sampling process

run the sampling process:

python3 run.py problem_name N (oversubscribe)

run the visualization of obtained samples:

python3 run.py problem_name N visualize

MPI processes

spawned solvers:

JSON configuration file example & comments

`python3 run.py problem_name N (oversubscribe)`

`python3 run.py problem_name N visualize`