# Backend-Agnostic Two Moons

This example notebook covers a backend-agnostic model trained online on the two moons dataset. You will learn how to:

1. Use BayesFlow with your backend of choice
2. Define joint distributions with BayesFlow decorators
3. Fit an amortized posterior with the new BayesFlow interface

## 1. Select a Backend

You can select a backend by setting the `KERAS_BACKEND` environment variable to one of "jax", "tensorflow", or "torch", using any of the options below. Keras reads this variable when it is first imported, so set it before importing `keras` or BayesFlow. This notebook sets the variable dynamically so that you can switch backends easily, but in general we recommend the conda environment variant.

1. Using your system environment variables:
```
export KERAS_BACKEND="torch"
```

2. Using conda:
```
conda env config vars set KERAS_BACKEND="torch"
```

3. Dynamically in Python:

In [1]:
import os
# "jax", "tensorflow", or "torch"
os.environ["KERAS_BACKEND"] = "torch"
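
Because Keras picks its backend at import time, the assignment above only takes effect if it runs before `import keras`. The following is a minimal sketch of a guard around that step; `select_backend` and `VALID_BACKENDS` are hypothetical helper names used for illustration and are not part of Keras or BayesFlow.

```python
import os
import sys
import warnings

# Backend names listed in this notebook.
VALID_BACKENDS = ("jax", "tensorflow", "torch")


def select_backend(name):
    """Set KERAS_BACKEND, warning if Keras has already been imported."""
    if name not in VALID_BACKENDS:
        raise ValueError(f"Unknown backend {name!r}; expected one of {VALID_BACKENDS}")
    if "keras" in sys.modules:
        # Keras has already chosen its backend; changing the variable now has no effect.
        warnings.warn("keras is already imported; restart the kernel to switch backends")
    os.environ["KERAS_BACKEND"] = name
    return name


selected = select_backend("torch")
```

Calling the helper in the first cell of a notebook, before any other import, keeps the backend choice in one visible place.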

## 2. Defining the Simulation

We will use online training on the two moons toy dataset in this example. To define a joint distribution, we use convenience decorators:

In [2]:
import keras
import keras.callbacks

import numpy as np

import bayesflow.experimental as bf

Context Distribution:

In [3]:
@bf.distribution
def two_moons_context():
    # r ~ N(0.1, 0.01²), i.e. mean 0.1 and standard deviation 0.01
    r = keras.random.normal(shape=(1,), mean=0.1, stddev=0.01)
    # alpha ~ U(-π/2, π/2)
    alpha = keras.random.uniform(shape=(1,), minval=-0.5 * np.pi, maxval=0.5 * np.pi)
    return dict(r=r, alpha=alpha)

Parameter Prior:

In [4]:
@bf.distribution
def two_moons_prior():
    # θ ~ U(-1, 1)
    theta = keras.random.uniform(shape=(2,), minval=-1.0, maxval=1.0)
    return dict(theta=theta)

Simulator:

In [5]:
@bf.distribution
def two_moons_likelihood(r, alpha, theta):
    # simulate the two moons
    x1 = -keras.ops.abs(theta[0] + theta[1]) / np.sqrt(2.0) + r * keras.ops.cos(alpha) + 0.25
    x2 = (-theta[0] + theta[1]) / np.sqrt(2.0) + r * keras.ops.sin(alpha)
    return dict(x=keras.ops.concatenate([x1, x2], axis=0))
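
As a cross-check of the generative model, here is a NumPy-only sketch that draws a whole batch at once. It mirrors the context, prior, and simulator above but does not use BayesFlow; `simulate_two_moons` is a hypothetical helper, and the explicit batch axis is an assumption of this sketch.

```python
import numpy as np


def simulate_two_moons(batch_size, rng):
    """Draw context, parameters, and observables for a batch of two moons samples."""
    # context: r ~ N(0.1, 0.01²), alpha ~ U(-π/2, π/2)
    r = rng.normal(loc=0.1, scale=0.01, size=(batch_size, 1))
    alpha = rng.uniform(-0.5 * np.pi, 0.5 * np.pi, size=(batch_size, 1))
    # prior: θ ~ U(-1, 1) in two dimensions
    theta = rng.uniform(-1.0, 1.0, size=(batch_size, 2))

    # slices keep the trailing axis so every term has shape (batch_size, 1)
    theta_1, theta_2 = theta[:, :1], theta[:, 1:]
    x1 = -np.abs(theta_1 + theta_2) / np.sqrt(2.0) + r * np.cos(alpha) + 0.25
    x2 = (-theta_1 + theta_2) / np.sqrt(2.0) + r * np.sin(alpha)

    x = np.concatenate([x1, x2], axis=1)
    return dict(r=r, alpha=alpha, theta=theta, x=x)


rng = np.random.default_rng(0)
sample = simulate_two_moons(64, rng)
```

Each entry of `sample` carries the batch size on its first axis, and `sample["x"]` has two columns, one per observable.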

Combining these to yield a joint distribution:

In [6]:
joint_distribution = bf.JointDistribution(
    local_context=two_moons_context,
    global_context=None,
    prior=two_moons_prior,
    likelihood=two_moons_likelihood,
)
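
The likelihood above takes `r`, `alpha`, and `theta` as arguments, which are exactly the keys returned by the context and the prior. One way such a composition could work is to match dictionary keys to argument names, as sketched below in plain Python. This is an assumption made for illustration, not a description of how `bf.JointDistribution` is implemented; `sample_joint` and the `toy_*` functions are hypothetical.

```python
import inspect


def sample_joint(context_fn, prior_fn, likelihood_fn):
    """Sample context and prior, then feed the likelihood by argument name."""
    draws = {}
    draws.update(context_fn())
    draws.update(prior_fn())
    # pass only those draws that the likelihood declares as parameters
    wanted = inspect.signature(likelihood_fn).parameters
    kwargs = {name: draws[name] for name in wanted}
    draws.update(likelihood_fn(**kwargs))
    return draws


# deterministic stand-ins so the wiring is easy to follow
def toy_context():
    return {"r": 0.25, "alpha": 0.0}


def toy_prior():
    return {"theta": (0.5, -0.5)}


def toy_likelihood(r, alpha, theta):
    return {"x": (theta[0] + r, theta[1] + alpha)}


draw = sample_joint(toy_context, toy_prior, toy_likelihood)
```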

## 3. Defining the Training Strategy via a Dataset

We want to train online, meaning we sample new data for each training step. BayesFlow already provides a Dataset for such common cases:

In [7]:
# for now, batch size and steps per epoch are set on the dataset
# supporting the Keras issue below would help move them to posterior.fit()
# https://github.com/keras-team/keras/issues/19528
train_dataset = bf.datasets.OnlineDataset(
    joint_distribution=joint_distribution,
    batch_size=64,
    steps_per_epoch=100,
    workers=8,
    use_multiprocessing=True,
)
validation_dataset = bf.datasets.OnlineDataset(
    joint_distribution=joint_distribution,
    batch_size=64,
    steps_per_epoch=10,
    workers=8,
    use_multiprocessing=True,
)
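
To make "online" concrete: every request for a batch triggers a fresh simulation, so no batch is ever reused and an epoch is only a bookkeeping unit of `steps_per_epoch` batches. The sketch below illustrates the idea with NumPy. `ToyOnlineDataset` and `draw_theta` are hypothetical and only share the `batch_size` and `steps_per_epoch` arguments with `bf.datasets.OnlineDataset`, whose actual implementation may differ.

```python
import numpy as np


class ToyOnlineDataset:
    """Indexable dataset that simulates a new batch on every access."""

    def __init__(self, sample_fn, batch_size, steps_per_epoch, seed=0):
        self.sample_fn = sample_fn
        self.batch_size = batch_size
        self.steps_per_epoch = steps_per_epoch
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        # number of batches that make up one epoch
        return self.steps_per_epoch

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        # the index is ignored on purpose: each access draws fresh data
        return self.sample_fn(self.batch_size, self.rng)


def draw_theta(batch_size, rng):
    # stand-in simulator: draws from the two moons prior only
    return rng.uniform(-1.0, 1.0, size=(batch_size, 2))


dataset = ToyOnlineDataset(sample_fn=draw_theta, batch_size=64, steps_per_epoch=100)
first = dataset[0]
second = dataset[0]
```

Requesting the same index twice returns different batches, which is the defining difference from an offline dataset with a fixed pool of simulations.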

## 4. Defining Summary and Inference Networks

This example does not need a summary network, so we set it to `None`. As the inference network, we use a 4-layer coupling flow with affine transforms.

In [None]:
summary_network = None

In [8]:
# define this manually, for now
class AffineSubnet(keras.Layer):
    def __init__(self, in_features, out_features):
        super().__init__()
        self.in_features = in_features
        self.out_features = out_features
        
        self.network = keras.Sequential([
            keras.layers.Input(shape=(in_features,)),
            keras.layers.Dense(512, activation="relu"),
            keras.layers.Dense(2 * out_features),
        ])
        
    def call(self, x):
        parameters = self.network(x)
        scale, shift = keras.ops.split(parameters, 2, axis=1)
        return dict(scale=scale, shift=shift)
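
The subnet returns a `scale` and a `shift` for each transformed feature. A common way to use them in an affine coupling layer is sketched below in NumPy. The exponential parameterization of the scale is an assumption of this sketch, and BayesFlow's internal transform may differ in detail; `affine_forward` and `affine_inverse` are hypothetical names.

```python
import numpy as np


def affine_forward(x, scale, shift):
    """Elementwise affine map and its log-determinant of the Jacobian."""
    z = x * np.exp(scale) + shift
    # the Jacobian is diagonal with entries exp(scale), so the log-det is a sum
    log_det = np.sum(scale, axis=-1)
    return z, log_det


def affine_inverse(z, scale, shift):
    """Exact inverse of affine_forward, with the negated log-determinant."""
    x = (z - shift) * np.exp(-scale)
    log_det = -np.sum(scale, axis=-1)
    return x, log_det


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 1))
# in a coupling layer, scale and shift would come from the subnet, evaluated
# on the untouched half of the features together with the conditions
scale = rng.normal(size=(4, 1))
shift = rng.normal(size=(4, 1))

z, log_det = affine_forward(x, scale, shift)
x_back, inv_log_det = affine_inverse(z, scale, shift)
```

Because the map is invertible in closed form and its log-determinant is a plain sum, both density evaluation and sampling stay cheap.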

In [9]:
# use a sequential coupling flow
# the method name is subject to change
# the default BayesFlow networks will be usable here in the future
inference_network = bf.networks.CouplingFlow.uniform(
    subnet_constructor=AffineSubnet,
    # 2 parameters
    features=2,
    # 2 observables that we condition on
    conditions=2,
    layers=4,
    transform="affine",
    base_distribution="normal",
)

## 5. Putting Things Together

Now that the model internals are defined, collect them in an `AmortizedPosterior` and train.

In [10]:
posterior = bf.AmortizedPosterior(inference_network=inference_network, summary_network=summary_network)

In [11]:
optimizer = keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=0.01)

In [12]:
posterior.compile(optimizer)

In [13]:
callbacks = [
    # track losses and metrics in TensorBoard
    keras.callbacks.TensorBoard("logs/two_moons/"),
    # save the best model seen so far; Keras 3 requires the filepath to end in ".keras"
    keras.callbacks.ModelCheckpoint("logs/two_moons/checkpoints/best.keras", save_best_only=True)
]

Finally, fit your model:

In [None]:
posterior.fit(train_dataset, validation_data=validation_dataset, epochs=100, callbacks=callbacks)

Epoch 1/100

In [None]:
bf.diagnostics.show_posterior(posterior=posterior)