

We describe the outcome of a data challenge, conducted as part of the Dark Machines initiative, to detect signals of new physics at the LHC using unsupervised machine learning algorithms. We first define and describe a large benchmark dataset, consisting of more than 1 billion simulated LHC events corresponding to $10~\mathrm{fb}^{-1}$ of proton-proton collisions at a center-of-mass energy of 13 TeV. We then review a wide range of anomaly detection and density estimation algorithms, developed in the context of the data challenge, and we measure their performance in a set of realistic analysis environments. We draw a number of useful conclusions that will aid the development of unsupervised new physics searches during the third run of the LHC, and provide our benchmark dataset for future studies at phenoMLdata.


Results for the Dark Machines Unsupervised Learning Challenge.

The notebooks directory contains the Jupyter notebooks used to create the figures in the paper, which summarize the results of the more than 1000 models submitted to the challenge.

The individual results can be found in the data directory.

├── data/
│   ├── AE.csv
│   ├── ALAD.csv
│   ├── CNN_BVAE.csv
│   ├── CNN_VAE.csv
│   ├── Combined.csv
│   ├── ConvVAE_and_Flows.csv
│   ├── DAGMM.csv
│   ├── DarkMachinesUnsupervisedChallenge_TotalImprovements.csv
│   ├── DeepSetVAE.csv
│   ├── DeepSVDD.csv
│   ├── Flow.csv
│   ├── KDE.csv
│   ├── MethodsInLatentSpaceOfVAE.csv
│   ├── Metric_Scores.csv
│   ├── ModelsToSecretResults.csv
│   └── VAE.csv
├── figures/                  <- Figures from the paper
│   └── indivdual_signals/    <- Not included in the paper, contains best results for each BSM signal
├── notebooks/
│   ├── 01-ExampleBoxAndWhiskerPlot.ipynb
│   ├── 02-AnalysisAcrossPhysicsSignals_FigureOfMerit.ipynb
│   ├── 03-AnalysisOfTopMethodsForFiguresOfMerit.ipynb
│   ├── 04-SignificanceImprovement.ipynb
│   ├── 05-CompareWithSecretSetResults.ipynb
│   └──

Contributing new models

We encourage the development of new models using the Dark Machines datasets. The easiest way to compare new models will be to add a CSV file to the data directory using the same formatting.


Then, append the new file to the list of files being analyzed in the notebooks and rerun them.
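
Before rerunning the notebooks, it can help to confirm that a new file really uses the same columns as an existing one. The following is a minimal sketch of such a check, not part of the repository: the helper names `read_header`, `compare_headers`, and `result_files` are hypothetical, and the sketch assumes that "same formatting" means an identical header row. Not every file in the data directory necessarily shares one layout, so pick a reference file of the same kind as the one you are adding.

```python
import csv
from pathlib import Path


def read_header(csv_path):
    """Return the column names from the first row of a CSV file."""
    with open(csv_path, newline="") as handle:
        # An empty file yields an empty list instead of raising StopIteration.
        return next(csv.reader(handle), [])


def compare_headers(new_csv, reference_csv):
    """Compare the header of a new results file with an existing one.

    Returns (missing, extra): columns present in the reference but not in
    the new file, and columns present in the new file but not in the
    reference. Two empty lists mean the headers contain the same columns.
    """
    new_cols = read_header(new_csv)
    ref_cols = read_header(reference_csv)
    missing = [col for col in ref_cols if col not in new_cols]
    extra = [col for col in new_cols if col not in ref_cols]
    return missing, extra


def result_files(data_dir):
    """List every CSV file in a directory, sorted by file name."""
    return sorted(Path(data_dir).glob("*.csv"))
```

For example, `compare_headers("data/MyNewModel.csv", "data/AE.csv")` compares a hypothetical new file, `MyNewModel.csv`, against the existing `AE.csv`. If both returned lists are empty, the new file has the same columns as the reference, and `result_files("data")` will include it alongside the existing results.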

BibTeX citation

Please use the following BibTeX citations if you use the data or notebooks from this study.

  • The article:
    author = "Aarrestad, T. and others",
    title = "{The Dark Machines Anomaly Score Challenge: Benchmark Data and Model Independent Event Classification for the Large Hadron Collider}",
    eprint = "2105.14027",
    archivePrefix = "arXiv",
    primaryClass = "hep-ph",
    month = "5",
    year = "2021"
  • The code:
    author       = {Bryan Ostdiek},
    title        = {{bostdiek/DarkMachines-UnsupervisedChallenge:
    month        = jun,
    year         = 2021,
    publisher    = {Zenodo},
    version      = {0.2-alpha},
    doi          = {10.5281/zenodo.4897467},
    url          = {}

