# Autoregressive Diffusion for Geostatistical Applications
**Author: [Lukas Mosser](https://scholar.google.com/citations?hl=en&user=y0R9snMAAAAJ), August 2022**
## Demos for log-likelihood evaluation on two datasets

### Install Dependencies

In [1]:
!pip install --upgrade --no-cache git+https://github.com/LukasMosser/order_agnostic_diffusion_geostats@main


### Log-Likelihood demo for the MNIST dataset
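
The cell below scores a drawn image by walking through its pixels along one randomly sampled path and summing the conditional log-probabilities predicted by the model. As a sketch of what the loop accumulates in `total_log_prob`, using notation that is assumed here rather than taken from the notebook ($\sigma$ for the sampled path, $D = w \cdot h$ for the number of pixels):

```latex
\log p(x \mid \sigma) \;=\; \sum_{t=1}^{D} \log p\!\left(x_{\sigma(t)} \,\middle|\, x_{\sigma(1)}, \dots, x_{\sigma(t-1)}\right),
\qquad D = w \cdot h
```

With `image_size = 32` this amounts to $D = 1024$ forward passes of the network per image, one for each step of the path.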

In [5]:
import torch
import gradio as gr
from diffusers.models import UNet2DModel
from huggingface_hub import hf_hub_download
from oadg.sampling import sample, make_conditional_paths_and_realization, initialize_empty_realizations_and_paths
from oadg.training import sample_random_path, sample_random_index_for_sampling, create_mask_at_random_path_index
from oadg.training import log_prob_of_realization
from oadg.training import one_hot_realization, create_sampling_location_mask, predict_conditional_prob
from tqdm.auto import tqdm

image_size = 32
batch_size = 1
device = 'cuda'

path = hf_hub_download(repo_id="porestar/oadg_mnist_32", filename="model.pt")

model = UNet2DModel(
    sample_size=32,
    in_channels=2,
    out_channels=2,
    layers_per_block=2,
    block_out_channels=(64, 64, 128, 128),
    down_block_types=(
        "DownBlock2D",
        "DownBlock2D",
        "AttnDownBlock2D",
        "DownBlock2D",
    ),
    up_block_types=(
        "UpBlock2D",
        "AttnUpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
    ),
)

model.load_state_dict(torch.load(path, map_location=torch.device('cpu')))

model = model.to(device)

model.eval()

def evaluate_log_prob_nats(img):
  # Binarize the drawing: every non-zero pixel becomes class 1
  img = (img > 0).astype(int)

  # NumPy image arrays are indexed (rows, columns), i.e. (height, width)
  h, w = img.shape
  batch_size = 1

  realization = torch.from_numpy(img).view(1, 1, h, w).to(device)

  realization = one_hot_realization(realization).float()

  # Get a random sampling path for the single image in the batch
  sampled_random_path = sample_random_path(batch_size, w, h, device=device)

  total_log_prob = 0.
  for idx in tqdm(torch.arange(start=0, end=w*h, device=device, requires_grad=False)):
    # We create a mask that masks the locations where we assume we've already sampled
    random_path_mask = create_mask_at_random_path_index(sampled_random_path.view(-1, w, h), idx, batch_size, w, h)

    mask_at_sampling = create_sampling_location_mask(sampled_random_path, idx, w, h).long()

    # We predict the conditional probability for the current step of the path,
    # given the locations sampled at earlier steps,
    # e.g. at step 5: log p(x5 | x4, x3, x2, x1)
    with torch.inference_mode():
      conditional_prob = predict_conditional_prob(realization, model, random_path_mask, idx)

      # Evaluate the value of the log probability for the given realization
      log_prob = log_prob_of_realization(conditional_prob, realization)

    # Accumulate the log probability at the current sampling location only
    total_log_prob += (log_prob*mask_at_sampling).sum().item()

  # The sign is flipped, so the reported value is a negative log-likelihood
  return (img.astype(int))*255, "Negative Log Likelihood: {0:.2f} [nats]".format(-total_log_prob)


img = gr.Image(image_mode="L", source="canvas", shape=(image_size, image_size), invert_colors=True, label="Drawing Canvas")
img_in = gr.Image(image_mode="L", source="canvas", shape=(image_size, image_size), invert_colors=True, label="Drawn Image")
text = gr.Text(label="Log Likelihood")
demo = gr.Interface(fn=evaluate_log_prob_nats, inputs=img, outputs=[img_in, text],
                    title="Order Agnostic Autoregressive Diffusion MNIST Log-Likelihood Demo",
                    description="""Compute the log-likelihood of a drawn image under the model. Try to draw a number and evaluate its log probability.
                    Then draw something that doesn't resemble a number and compare the results.
                    You should see that the model assigns a higher negative log-likelihood (the value reported below), i.e. a lower probability, to such an image than to images resembling its training data.
                    This allows us, for example, to compare how probable different scenarios are under a given model.""")
demo.launch(debug=True)


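
The cell above scores an image along a single random path. The following is a minimal, library-free sketch of that idea on a tiny flattened image, extended to average over several paths. Everything in it is an assumption made for illustration: `laplace_conditional` is a toy stand-in for the trained network (it is not part of `oadg`), and the helper names are hypothetical.

```python
import math
import random


def laplace_conditional(observed):
    """Toy stand-in for the network: P(next pixel = 1 | pixels seen so far)."""
    return (1 + sum(observed)) / (2 + len(observed))


def log_prob_along_path(pixels, path, conditional=laplace_conditional):
    """Sum log p(x_path[t] | x_path[:t]) over one ordering of the pixels (nats)."""
    observed = []
    total = 0.0
    for index in path:
        p_one = conditional(observed)
        value = pixels[index]
        # Log-probability of the value actually present at this location
        total += math.log(p_one if value == 1 else 1.0 - p_one)
        observed.append(value)
    return total


def mean_log_prob_over_paths(pixels, n_paths=8, seed=0):
    """Average the path-wise log-probability over several random orderings."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_paths):
        path = list(range(len(pixels)))
        rng.shuffle(path)
        totals.append(log_prob_along_path(pixels, path))
    return sum(totals) / len(totals)


pixels = [1, 0, 1, 1, 0, 0, 1, 0, 1]  # a flattened 3x3 binary image
print(f"{mean_log_prob_over_paths(pixels):.4f} nats")
```

The toy rule used here happens to give the same total for every ordering, which makes the sketch easy to check by hand. A trained network may give somewhat different totals for different paths, so averaging over a few paths is one possible way to get a more stable score, at the cost of proportionally more forward passes.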

### Log-Likelihood demo for the Channels dataset

In [6]:
image_size = 64
batch_size = 1
device = 'cuda'

path = hf_hub_download(repo_id="porestar/oadg_channels_64", filename="model.pt")

model = UNet2DModel(
    sample_size=image_size,
    in_channels=2,
    out_channels=2,
    layers_per_block=2,
    block_out_channels=(64, 64, 128, 128),
    down_block_types=(
        "DownBlock2D",
        "DownBlock2D",
        "AttnDownBlock2D",
        "DownBlock2D",
    ),
    up_block_types=(
        "UpBlock2D",
        "AttnUpBlock2D",
        "UpBlock2D",
        "UpBlock2D",
    ),
)

model.load_state_dict(torch.load(path, map_location=torch.device('cpu')))

model = model.to(device)

model.eval()

img = gr.Image(image_mode="L", source="canvas", shape=(image_size, image_size), invert_colors=True, label="Drawing Canvas")
img_in = gr.Image(image_mode="L", source="canvas", shape=(image_size, image_size), invert_colors=True, label="Drawn Image")
text = gr.Text(label="Log Likelihood")
demo = gr.Interface(fn=evaluate_log_prob_nats, inputs=img, outputs=[img_in, text],
                    title="Order Agnostic Autoregressive Diffusion Channels Log-Likelihood Demo",
                    description="""Compute the log-likelihood of a drawn image under the model. Try to draw a channel system and evaluate its log probability.
                    Then draw something that doesn't resemble a channel system, like sand lenses, and compare the results.
                    You should see that the model assigns a higher negative log-likelihood (the value reported below), i.e. a lower probability, to such an image than to images resembling its training data.
                    This allows us, for example, to compare how probable different scenarios are under a given model.""")
demo.launch(debug=True)

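
The value returned by `evaluate_log_prob_nats` is a sum over all pixels, so its magnitude grows with image size (1024 pixels for the MNIST model, 4096 for the channels model), and its unit depends on the base of the logarithm used inside `log_prob_of_realization`. The sketch below assumes the natural logarithm, as the function name suggests, and shows how such a number could be converted to bits and normalized per pixel so that the two demos become comparable. The helper names are hypothetical and not part of `oadg`.

```python
import math


def nats_to_bits(nats):
    """Convert a quantity measured with the natural log into bits."""
    return nats / math.log(2)


def bits_per_pixel(nll_nats, width, height):
    """Normalize a negative log-likelihood by the number of pixels."""
    return nats_to_bits(nll_nats) / (width * height)


def uniform_nll_nats(width, height, n_classes=2):
    """NLL of any image under a model that picks every pixel class uniformly."""
    return width * height * math.log(n_classes)


# Reference values for the two image sizes used in the demos
for size in (32, 64):
    baseline = uniform_nll_nats(size, size)
    print(size, round(nats_to_bits(baseline)), "bits,", round(baseline, 2), "nats")
```

A model that assigns both classes equal probability at every pixel scores exactly 1 bit per pixel, so this baseline gives a rough yardstick: a drawing that resembles the training data would be expected to score noticeably below it.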