Skip to content
Go to file

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

Inferring dark matter substructure with machine learning

Code repository for the paper Mining for Dark Matter Substructure: Inferring subhalo population properties from strong lenses with machine learning by Johann Brehmer, Siddharth Mishra-Sharma, Joeri Hermans, Gilles Louppe, and Kyle Cranmer.

License: MIT Dark matter arXiv

Visualization of Bayesian inference on substructure properties.


The subtle and unique imprint of dark matter substructure on extended arcs in strong lensing systems contains a wealth of information about the properties and distribution of dark matter on small scales and, consequently, about the underlying particle physics. However, teasing out this effect poses a significant challenge since the likelihood function for realistic simulations of population-level parameters is intractable. We apply recently-developed simulation-based inference techniques to the problem of substructure inference in galaxy-galaxy strong lenses. By leveraging additional information extracted from the simulator, neural networks are efficiently trained to estimate likelihood ratios associated with population-level parameters characterizing substructure. Through proof-of-principle application to simulated data, we show that these methods can provide an efficient and principled way to simultaneously analyze an ensemble of strong lenses, and can be used to mine the large sample of lensing images deliverable by near-future surveys for signatures of dark matter substructure.


In figures/ we collect the figures shown in the paper. The folder also contains a few additional plots and animations:

  • deflection_maps.pdf shows an example simulated lens including the subhalos, the deflection map from the host halo, and the deflection map from all subhalos combined.
  • calibration_curves.pdf shows calibration curves (raw network output vs calibrated likelihood ratio) for several randomly chosen parameter points.
  • live_inference_population.gif shows the evolution of the posterior on the population-level parameters with successive simulated observations.
  • live_inference_shmf.gif shows the evolution of the posterior on the subhalo mass function with successive simulated observations.
  • live_inference_both.gif and live_inference_with_images.gif combine the evolution of both posteriors, the latter (shown at the top of this readme) also includes the simulated observed lensed images.


The dependencies of our code are listed in environments.yml.

The starting point of our code based are five high-level scripts:

  • starts the simulation of lensing systems and, depending on the command-line parameters, creates training, calibration, or test samples.
  • makes it easy to combine multiple simulation runs into a single file as a preparation for training.
  • trains neural networks on the simulated data.
  • steers the evaluation of the estimated likelihood ratio on test or calibration data.
  • calibrated network predictions based on histograms of the network output.

All scripts can be called with the argument --help to show the command line options. In scripts/ we collect the actual calls we used (on a HPC environment) during this project.

Generally, the simulation code resides in simulation, while the inference code is in the inference folder. Notebooks in notebooks contain the plotting code.

Note: Internally the code uses a different convention for the SHMF slope beta than in the paper. The relation is beta_code = beta_paper - 1, so the fiducial value beta = -0.9 from the paper is internally represented as beta = -1.9.


If you use this code, please cite our paper:

      author         = "Brehmer, Johann and Mishra-Sharma, Siddharth and Hermans,
                        Joeri and Louppe, Gilles and Cranmer, Kyle",
      title          = "{Mining for Dark Matter Substructure: Inferring subhalo
                        population properties from strong lenses with machine
      year           = "2019",
      eprint         = "1909.02005",
      archivePrefix  = "arXiv",
      primaryClass   = "astro-ph.CO",
      SLACcitation   = "%%CITATION = ARXIV:1909.02005;%%"

The inference method is based on 1805.00013, 1805.00020, 1805.12244, and 1808.00973.


Inference of substructure properties in strong lensing systems with machine learning. Code reposity associated with





No packages published
You can’t perform that action at this time.