
Order-Agnostic Autoregressive Diffusion Models for Geostatistical Applications

Reference: DOI

A live version of this article is published on Curvenote Curvespace at this link.

Introduction

This is a short introduction to the reasoning behind this work.
The introductory notebook provides a full-length description and implementation of the methods.
Introductory Notebook Open In Colab

Geostatistical Modeling

Geostatistical models are critical for applications such as mineral resource estimation, CO2 storage modeling, and many other geospatial tasks.

Sequential indicator simulation (SIS) (see Gómez-Hernández & Srivastava, 2021 for an excellent review) is an autoregressive model for categorical properties that has found widespread adoption due to its flexibility and its ability to incorporate existing data.

These features allow SIS to generate stochastic realizations that honor existing observations.
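
For intuition, here is a highly simplified sketch of the sequential simulation loop. It is not a full SIS implementation: indicator kriging is replaced by nearest-neighbour class frequencies purely for illustration, and all names below are hypothetical. What it shows is the autoregressive structure and how hard data are honored.

```python
# Highly simplified sketch of sequential simulation (NOT full SIS):
# the local conditional distribution is estimated from nearest-neighbour
# class frequencies instead of indicator kriging, purely for illustration.
import numpy as np

def sequential_simulation_sketch(grid_shape, n_classes, conditioning,
                                 n_neighbours=8, rng=None):
    """Fill a 2D categorical grid one cell at a time along a random path.

    `conditioning` is a dict {(row, col): class_value} of hard data that is
    honored exactly; every other cell is drawn from a local conditional
    distribution based on previously informed cells.
    """
    if rng is None:
        rng = np.random.default_rng()

    grid = -np.ones(grid_shape, dtype=int)          # -1 marks "not yet simulated"
    for (r, c), v in conditioning.items():
        grid[r, c] = v

    # Random path over the uninformed cells (the "sequential" part).
    path = np.argwhere(grid < 0)
    rng.shuffle(path)

    for r, c in path:
        known = np.argwhere(grid >= 0)              # data + previously simulated cells
        dists = np.hypot(known[:, 0] - r, known[:, 1] - c)
        nearest = known[np.argsort(dists)[:n_neighbours]]
        counts = np.bincount(grid[nearest[:, 0], nearest[:, 1]],
                             minlength=n_classes) + 1e-6
        probs = counts / counts.sum()               # stand-in for indicator kriging
        grid[r, c] = rng.choice(n_classes, p=probs)
    return grid
```

The key point is the autoregressive factorization: each cell is drawn conditional on the data and on all previously simulated cells, which is exactly the structure that deep autoregressive models formalize.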

(Deep) Autoregressive Generative Models

Autoregressive models (see chapter 22 of Kevin Patrick Murphy's book Probabilistic Machine Learning: Advanced Topics for an introduction) based on (deep) neural networks have shown great potential for representing complex data distributions. In many cases, however, so-called causal convolutions require data to be generated in a fixed pattern (top-down, left-to-right), which does not allow sampling realizations with conditioning data at arbitrary locations.

A recent method, order-agnostic autoregressive diffusion models (ARDMs) by Hoogeboom et al., allows the generation steps to be performed in arbitrary order.
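
As a rough sketch of the idea (not the authors' implementation, and assuming a hypothetical `model(x, mask)` that returns per-pixel categorical logits), order-agnostic sampling fills the pixels along a random permutation, always conditioning on whatever has been generated or observed so far:

```python
# Minimal sketch of order-agnostic autoregressive sampling.
# `model(x, mask)` is a hypothetical network returning per-pixel categorical
# logits of shape (n_pixels, n_classes); it is NOT the API of this repository.
import torch

def sample_order_agnostic(model, n_pixels, n_classes, conditioning=None):
    """Generate one realization by filling pixels in a random order.

    `conditioning` is an optional dict {pixel_index: class_value} of observed
    data that is fixed before sampling and never resampled.
    """
    x = torch.zeros(n_pixels, dtype=torch.long)      # current (partial) image
    mask = torch.zeros(n_pixels, dtype=torch.bool)   # True = generated or observed

    if conditioning:
        for idx, value in conditioning.items():
            x[idx] = value
            mask[idx] = True

    # Visit the remaining pixels along a random permutation.
    for idx in torch.randperm(n_pixels):
        if mask[idx]:
            continue                                  # conditioning data stays fixed
        logits = model(x, mask)                       # (n_pixels, n_classes)
        probs = torch.softmax(logits[idx], dim=-1)
        x[idx] = torch.multinomial(probs, num_samples=1).item()
        mask[idx] = True
    return x
```

Because the model is trained to predict any subset of missing pixels given any subset of observed ones, the permutation, and therefore the location of conditioning data, can be chosen freely at sampling time.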

New possibilities for geostatistical modeling with (deep) generative models

This opens up the ability to incorporate spatially distributed conditioning data and to generate geostatistical realizations that honor these data.

Furthermore, the model parameterizes a categorical distribution, which allows us to directly compute the entropy, i.e. the uncertainty, of the predictive distribution given the conditioning data.
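
A minimal sketch of that computation, reusing the hypothetical `model(x, mask)` from the sketch above:

```python
# Minimal sketch: per-pixel entropy of the predicted categorical distribution,
# conditioned on the pixels marked True in `mask` (observed or already generated).
import torch

def pixelwise_entropy(model, x, mask):
    """Return the entropy (in nats) of the predictive distribution at each pixel."""
    logits = model(x, mask)                     # (n_pixels, n_classes)
    log_probs = torch.log_softmax(logits, dim=-1)
    probs = log_probs.exp()
    return -(probs * log_probs).sum(dim=-1)     # (n_pixels,) entropy map
```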

I hope that these connections between (deep) autoregressive models and sequential geostatistical methods are also of interest to the reader and spur further research at the intersection of geostatistics and machine learning.

Disclaimer and a note on publishing

Right now this is a few notebooks, some code, and some models. I do not have the funding necessary to publish in a proper journal, but I may consider publishing through Curvenote.

If you find this useful or interesting, please consider referencing the repository anyway.

As such, this article is not peer-reviewed, but I am happy to receive comments and will acknowledge them.

The models have been trained at my own cost via Google Colab Pro+. They are hosted in Hugging Face model repositories, and monitoring was done with Weights & Biases.

Installation

Installation can be performed via pip:

pip install git+https://github.com/LukasMosser/order_agnostic_diffusion_geostats@main

Demos

The following demos are available and deployed on Huggingface Spaces:

| Description | Demo Link |
| --- | --- |
| Conditional Channels Generation | Huggingface Spaces Link |
| Conditional MNIST Generation | Huggingface Spaces Link |

Models

A few pre-trained models are available via Hugging Face model repositories:

| Model Description | Huggingface Model Hub Link | Weights & Biases Logging Run |
| --- | --- | --- |
| Channels Dataset at 64x64 px | Huggingface Model Hub Link | Weights & Biases Monitoring |
| MNIST Dataset at 32x32 px | Huggingface Model Hub Link | Weights & Biases Monitoring |

Notebooks

These notebooks are intended to be run on Google Colab.

| Description | Google Colab Link |
| --- | --- |
| Introduction and Walkthrough (Start Here) | Open In Colab |
| Train MNIST and Channel Models with Google Colab | Open In Colab |
| Conditional Channel Image Generation on Google Colab | Open In Colab |
| Log Likelihood Evaluation Demo | Open In Colab |

Acknowledgments

I would like to thank Emiel Hoogeboom [Website] [Twitter] for clarifications via email on the methodology of the ARDM approach.

An excellent official implementation by Emiel and his co-authors is available here: Official Implementation GitHub

Furthermore, thanks to Eric Laloy and colleagues for making their channel training image available online.

Finally, a huge thanks to the ML community for making libraries such as the Hugging Face Hub, diffusers, accelerate, PyTorch, and many more available, without which this work couldn't exist.

Please consider citing their work and providing proper attribution.

Reference

If you've found this useful, please consider referencing this repository in your own work:

@software{lukas_mosser_2022_6961205,
  author       = {Lukas Mosser},
  title        = {{LukasMosser/order\_agnostic\_diffusion\_geostats: 
                   Initial Release}},
  month        = aug,
  year         = 2022,
  publisher    = {Zenodo},
  version      = {v0.0.2\_zenodo},
  doi          = {10.5281/zenodo.6961205},
  url          = {https://doi.org/10.5281/zenodo.6961205}
}

License

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/
