Arbitrary Marginal Neural Ratio Estimation for Likelihood-free Inference

This repository is the official implementation of the Master's thesis Arbitrary Marginal Neural Ratio Estimation for Likelihood-free Inference and the article Arbitrary Marginal Neural Ratio Estimation for Simulation-based Inference.

Code

The majority of the code is written in Python, with some Bash automation. The neural networks are built and trained using the PyTorch automatic differentiation framework. We also rely on nflows to implement normalizing flow networks, torchist to manipulate histograms, and matplotlib to display results graphically.

All dependencies are listed in a conda environment file, from which the environment can be created and activated with

conda env create -f environment.yml
conda activate amnre

Organization

The core of the code is the amnre folder, which is a Python package. It contains the implementation of the models, simulators, data loaders and more.

Simulators and datasets

Three simulators are provided: Simple Likelihood and Complex Posterior (SLCP), Hodgkin-Huxley (HH) and Gravitational Waves (GW). Other simulators can be implemented by extending the Simulator class; a sketch is given after the setup commands below. In order to use HH and GW, the code of other repositories has to be imported and compiled, which is automated by hh_setup.sh and gw_setup.sh, respectively.

conda activate amnre
bash misc/hh_setup.sh
bash misc/gw_setup.sh

If you don't plan on using HH or GW, this step can be skipped.
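As an illustration, a custom simulator might look like the following hypothetical sketch; the import path and the members to define (here a prior attribute and a __call__ producing x from theta) are assumptions and should be checked against the Simulator class in the amnre package.

import torch
from amnre import Simulator  # assumed import path

class Gaussian(Simulator):
    """Toy simulator: x is theta corrupted by unit Gaussian noise."""

    def __init__(self):
        super().__init__()
        # prior over the parameters theta (assumed attribute name)
        self.prior = torch.distributions.MultivariateNormal(
            torch.zeros(2), torch.eye(2)
        )

    def __call__(self, theta: torch.Tensor) -> torch.Tensor:
        # simulate a realization x given parameters theta
        return theta + torch.randn_like(theta)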

To store and access samples from these simulators, we use the HDF5 file format through its Python interface h5py. In particular, we store the parameters in a dataset theta and the realizations in a dataset x within a .h5 file. From such a file, an OfflineDataset acting as an iterable data loader can be instantiated. The latter has the advantage over PyTorch's built-in data loaders of loading only a chunk of the data into RAM at a time, which is necessary when realizations are large. The script sample.py generates samples from the simulators and creates the .h5 files.
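For illustration, here is a minimal sketch of the expected file layout, written and read directly with h5py (file name and shapes are placeholders):

import h5py
import numpy as np

# write parameters 'theta' and realizations 'x' to a .h5 file
with h5py.File('dummy.h5', 'w') as f:
    f.create_dataset('theta', data=np.random.rand(1024, 5))
    f.create_dataset('x', data=np.random.rand(1024, 8))

# read back a chunk; only the requested slice is loaded into RAM
with h5py.File('dummy.h5', 'r') as f:
    theta, x = f['theta'][:256], f['x'][:256]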

For example, to create training and validation datasets for the SLCP simulator, we use

python sample.py -simulator SLCP -seed 0 -samples 1048576 -o slcp-train.h5
python sample.py -simulator SLCP -seed 1 -samples 131072 -o slcp-valid.h5

Training

The script train.py links the different modules of amnre to instantiate and train the models. Many options are available; they can be listed with the --help flag.

For example, to train an AMNRE model with the SLCP training dataset, we use

MODEL='{"num_layers": 7, "hidden_size": 256, "activation": "ELU"}'
python train.py -simulator SLCP -samples slcp-train.h5 -valid slcp-valid.h5 -model "$MODEL" -arbitrary -device cuda -o slcp-amnre.pth

The outputs are three files: the instance settings slcp-amnre.json, the network weights slcp-amnre.pth and the training statistics slcp-amnre.csv. The latter reports, for each epoch, the time, the learning rate, the mean and standard deviation of the training loss, and the mean and standard deviation of the validation loss.

epoch,time,lr,mean,std,v_mean,v_std
1,2.0477960109710693,0.001,0.52534419298172,0.27149534225463867,0.23883646726608276,0.01787589117884636
2,1.9976160526275635,0.001,0.20120219886302948,0.04054423049092293,0.16070948541164398,0.016732219606637955
3,1.9926025867462158,0.001,0.13525991141796112,0.07273339480161667,0.08658836781978607,0.01457754522562027
...
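As an illustration, the loss curves can be visualized from this file, for instance with pandas and matplotlib (a sketch, not part of the repository):

import matplotlib.pyplot as plt
import pandas as pd

stats = pd.read_csv('slcp-amnre.csv')

plt.plot(stats['epoch'], stats['mean'], label='training loss')
plt.plot(stats['epoch'], stats['v_mean'], label='validation loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.show()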

The settings and weights files should always stay together as the former contains the instructions to build the network.
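Reloading a trained network could then look like the following sketch; build_model is a hypothetical helper standing in for the instantiation logic of train.py, which should be consulted for the actual construction.

import json
import torch

# the settings file contains the instructions to build the network
with open('slcp-amnre.json') as f:
    settings = json.load(f)

model = build_model(settings)  # hypothetical: mirror the instantiation in train.py
model.load_state_dict(torch.load('slcp-amnre.pth', map_location='cpu'))
model.eval()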

Evaluation

The evaluation process is done in two stages: eval.py performs the actual computations and plots.py renders the results graphically. These scripts are heavily specialized for our experiments. If you wish to perform other experiments, it is probably easier to write your own evaluation procedures.

Experiments

Our experiments were performed on a cluster of GPUs managed with Slurm. All the scripts are provided in the slurm folder.

Citation

@mastersthesis{rozet2021arbitrary,
  title={Arbitrary Marginal Neural Ratio Estimation for Likelihood-free Inference},
  author={Rozet, Fran{\c{c}}ois and Louppe, Gilles},
  year={2021},
  school={University of Liège, Belgium},
  url={https://hdl.handle.net/2268.2/12993}
}

@misc{rozet2021amnre,
  title={Arbitrary Marginal Neural Ratio Estimation for Simulation-based Inference},
  author={Rozet, Fran{\c{c}}ois and Louppe, Gilles},
  year={2021},
  eprint={2110.00449},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}