# Real NVP (non-volume-preserving) flows

__NOTE:__ the procedure below doesn't seem to work, as the Keras `Model` subclass does not seem to have any trainable variables. In particular, it looks like `tfb.real_nvp_default_template` doesn't produce trainable variables that either Keras or TensorFlow can track (below I also try to build a `TransformedDistribution` object directly from a real NVP, and even in that case no trainable variables are seen). This is probably related to [this issue](https://github.com/tensorflow/probability/issues/1439).

__Objective:__ train a real NVP model.

Source: [here](https://github.com/tensorchiefs/dl_book/blob/master/chapter_06/nb_ch06_04.ipynb)

In [None]:
import tensorflow as tf
import tensorflow_probability as tfp
import matplotlib.pyplot as plt
import seaborn as sns

tfd = tfp.distributions
tfb = tfp.bijectors

sns.set_theme()

## Generate data

Generate a complicated distribution of points in 2 dimensions.
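
Written out, the sampling procedure in the cell below corresponds to the following (the second argument of $\mathcal{N}$ denotes the variance, while `tfd.Normal` takes the standard deviation as `scale`):

$$
x_2 \sim \mathcal{N}(0,\, 4^2), \qquad
x_1 \mid x_2 \sim \mathcal{N}\!\left(\tfrac{1}{4}\, x_2^2,\; 1\right), \qquad
(x_1, x_2) \mapsto \tfrac{1}{40}\,(x_1, x_2).
$$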

In [None]:
n_samples = 2500

x2_samples = tfd.Normal(loc=0., scale=4.).sample(n_samples)

x1_samples = tfd.Normal(loc=.25 * tf.square(x2_samples), scale=tf.ones(n_samples, dtype=tf.float32)).sample()

samples = tf.stack(
    [x1_samples, x2_samples],
    axis=1
) / 40.

samples

In [None]:
fig = plt.figure(figsize=(14, 6))

sns.scatterplot(
    x=samples[:, 0],
    y=samples[:, 1],
)

plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.title('Samples', fontsize=14)

## Build the NVP model

Build the model as a subclass of the Keras `Model` class.
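
For reference, a sketch of the affine coupling transformation that each `tfb.RealNVP` bijector applies in the forward direction, with $D$ the total number of dimensions, $d$ = `num_masked`, and $t$, $s$ the shift and log-scale returned by `shift_and_log_scale_fn` (the symbols are notation introduced here, not names from the code):

$$
y_{1:d} = x_{1:d}, \qquad
y_{d+1:D} = x_{d+1:D} \odot \exp\!\big(s(x_{1:d})\big) + t(x_{1:d}),
$$

$$
\log \left| \det \frac{\partial y}{\partial x} \right| = \sum_{j} s_j(x_{1:d}).
$$

The first $d$ coordinates pass through unchanged, which is why the permutation between blocks is needed: without it the same coordinates would never be transformed.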

In [None]:
class RealNVP(tf.keras.Model):
    """
    Subclass of a Keras `Model` object implementing a real
    NVP flow.
    """
    def __init__(self, *, output_dim, num_masked, **kwargs):
        """
        Constructor of the real NVP.
        """
        super().__init__(**kwargs)
        
        self.output_dim = output_dim
        self.nets = []
        
        bijectors = []
        
        num_blocks = 5  # Number of layers.
        
        # Number of units in the hidden layers of the NN parametrizing
        # the affine transformation in the real NVP flow.
        h = 32
        
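        # Each block (layer) is composed of a real NVP flow and a
        # permutation, appended to the list in this order. Once the
        # list is reversed and chained (see below), the inverse pass
        # (data -> base distribution, used by `log_prob`) applies
        # them in reversed order (first the permutation, then the
        # real NVP), while the forward pass applies them in the
        # order written here. The permutation that would come first
        # in the inverse pass is actually discarded (see below).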
        for i in range(num_blocks):
            # Build a function to be used to compute the affine
            # parameters in the real NVP (in this case, a NN).
            net = tfb.real_nvp_default_template(
                hidden_layers=[h, h]  # Two hidden layers with h units each.
            )
            
            # Instantiate a real NVP object and append it to
            # the list of bijectors.
            bijectors.append(
                tfb.RealNVP(
                    shift_and_log_scale_fn=net,
                    # Number of masked dimensions.
                    # Note: in 2 dimensions this can only be 1 to get a
                    #       nontrivial case.
                    num_masked=num_masked
                )
            )
            
            # Instantiate a bijector implementing the permutation
            # operation among dimensions, so that singling out the
            # first n dimensions in the real NVP doesn't select
            # the same ones in each layer (block).
            # Note: the argument is the permutation to be used,
            #       which in our 2-dimensional case can be only
            #       [1, 0] ([0, 1] would be the identity).
            bijectors.append(tfb.Permute([1, 0]))
            
            # Append the neural network function (parametrizing the
            # affine parameters) to keep track of it.
            self.nets.append(net)
            
        # Build the full bijector corresponding to the real NVP by
        # chaining together the bijectors in the `bijectors` list.
        # Notes: 
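        #   * We reverse the list of bijectors so that, reading the
        #     chain in list order (the order of the inverse pass), they
        #     appear in reversed order w.r.t. the one we populated
        #     the list with. `tfb.Chain` applies its list from last to
        #     first in the forward pass, so `forward` follows the
        #     original order.
        #   * Before reversing the list, we leave out the last bijector,
        #     which would be a useless initial permutation in the
        #     inverse pass (a final one in the forward pass).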
        bijector = tfb.Chain(list(reversed(bijectors[:-1])))
        
        # Instantiate the flow object: a distribution obtained starting
        # from a simple source distribution and then applying the full
        # bijector obtained above.
        self.flow = tfd.TransformedDistribution(
            # Source distribution.
            distribution=tfd.MultivariateNormalDiag(loc=[0., 0.]),
            # Bijector (NF) to apply.
            bijector=bijector
        )
        
    def call(self, *inputs):
        """
        Forward pass.
        """
        return self.flow.bijector.forward(*inputs)
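
The ordering conventions in the constructor above are easy to get wrong, so here is a minimal pure-Python sketch (no TensorFlow) of a two-block version of the same construction. The helper names (`make_coupling`, `permute`, `chain_forward`, `chain_inverse`) and the toy shift and log-scale functions are hypothetical stand-ins for the neural networks, not part of TFP. The only library behaviour assumed is that `tfb.Chain` applies the last bijector in its list first in `forward` and the first one first in `inverse`.

```python
import math


def make_coupling(shift_fn, log_scale_fn):
    """2D affine coupling with one masked dimension: x[0] passes through."""
    def forward(x):
        x1, x2 = x
        return (x1, x2 * math.exp(log_scale_fn(x1)) + shift_fn(x1))

    def inverse(y):
        y1, y2 = y
        return (y1, (y2 - shift_fn(y1)) * math.exp(-log_scale_fn(y1)))

    return forward, inverse


def permute(x):
    # Swap the two coordinates; this map is its own inverse.
    return (x[1], x[0])


def chain_forward(forwards, x):
    # Assumed Chain convention: the LAST entry of the list is applied first.
    for f in reversed(forwards):
        x = f(x)
    return x


def chain_inverse(inverses, y):
    # The inverse pass walks the list from the first entry onwards.
    for g in inverses:
        y = g(y)
    return y


# Toy stand-ins for the networks computing shift and log-scale.
fwd_a, inv_a = make_coupling(lambda a: 0.5 * a, math.tanh)
fwd_b, inv_b = make_coupling(lambda a: a * a, lambda a: -0.3 * a)

# Populate the lists as in the constructor: coupling, permutation, ...
written_fwd = [fwd_a, permute, fwd_b, permute]
written_inv = [inv_a, permute, inv_b, permute]

# Drop the trailing permutation, then reverse before chaining.
chain_fwd = list(reversed(written_fwd[:-1]))
chain_inv = list(reversed(written_inv[:-1]))

x = (0.3, -1.2)
y = chain_forward(chain_fwd, x)       # applies fwd_a, permute, fwd_b
x_back = chain_inverse(chain_inv, y)  # applies inv_b, permute, inv_a
```

In this sketch `y` equals `fwd_b(permute(fwd_a(x)))`, and `x_back` recovers `x` up to floating-point error, so the reversed list gives a forward pass in the order the blocks were written and an inverse pass in the opposite order.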

In [None]:
model = RealNVP(output_dim=2, num_masked=1)

**Note:** what the untrained flow does depends on the random initialization of the NN weights.

In [None]:
fig = plt.figure(figsize=(14, 6))

sns.scatterplot(
    x=samples[:, 0].numpy(),
    y=samples[:, 1].numpy(),
    label='Original samples'
)

sns.scatterplot(
    x=model(samples)[:, 0].numpy(),
    y=model(samples)[:, 1].numpy(),
    label='Transformed samples (untrained flow)'
)

plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.title('Samples', fontsize=14)

**WARNING:** no trainable variables in the model?

In [None]:
model.summary()

Attempt: let's build a `TransformedDistribution` object directly from a real NVP bijector and see if it has trainable variables. It doesn't look like it (probably `tfb.real_nvp_default_template` doesn't create them for some reason).

In [None]:
test_transf_distr = tfd.TransformedDistribution(
    distribution=tfd.MultivariateNormalDiag(loc=[0., 0.]),
    bijector=tfb.RealNVP(
        num_masked=1,
        shift_and_log_scale_fn=tfb.real_nvp_default_template([4, 4])
    )
)

In [None]:
test_transf_distr