In [1]:
# Some basic imports
import numpy as np
import matplotlib.pyplot as plt
import torch

This tutorial explains how to implement your own flow using the tools provided by this library. We start with general flows, since they provide the basis for explaining the architecture of Transformers and Conditioners.

* [Flow](#flow)
* [Transformer](#transformer)
* [Conditioner](#conditioner)
    * [Conditional Conditioners](#cond-conditioners)

<a id="flow" />

# Implementing my own Flow

When implementing a Flow from scratch, three functions need to be overridden in the inheriting class:

In [5]:
from flow import Flow

class MyOwnFlow(Flow):
    
    def _transform(self, x, log_det=False, **kwargs): 
        # Transforms x into u. Used for training.
        pass

    def _invert(self, u, log_det=False, **kwargs): 
        # Transforms u into x. Used for sampling.
        pass

    def warm_start(self, x, **kwargs):
        # Warm start operation for the flow, if necessary.
        return self # don't forget to return self!
    
    # Also, if you need to save any parameters or buffers, for the init:
    def __init__(self, arg1, arg2, **kwargs):
        # Don't forget to capture **kwargs, that will need to be passed to the base class
        super().__init__(**kwargs)
        
        # Here we do what we want with the initializing function
        self.arg1 = arg1
        self.arg2 = arg2

There are two important aspects to take into account. **Always** capture `**kwargs` in each of these functions, since there may be other flows that depend on this (for example, conditioning tensors).
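
To see why capturing `**kwargs` matters, consider a chain of flows where only some steps use a conditioning tensor. The sketch below uses plain Python stand-ins rather than the real `Flow` base class; `Shift`, `CondShift` and `chain` are hypothetical names, not part of the library:

```python
class Shift:
    """Toy flow that ignores any extra keyword arguments."""

    def __init__(self, offset):
        self.offset = offset

    def _transform(self, x, log_det=False, **kwargs):
        # **kwargs silently absorbs `cond`, which this flow does not use
        u = [xi + self.offset for xi in x]
        return (u, 0.0) if log_det else u


class CondShift:
    """Toy flow that shifts by a value taken from the `cond` kwarg."""

    def _transform(self, x, log_det=False, cond=None, **kwargs):
        u = [xi + cond for xi in x]
        return (u, 0.0) if log_det else u


def chain(flows, x, **kwargs):
    # Every flow receives the same kwargs, whether it needs them or not
    for flow in flows:
        x = flow._transform(x, **kwargs)
    return x


u = chain([Shift(1.0), CondShift()], [0.0, 2.0], cond=10.0)
print(u)  # [11.0, 13.0]
```

Without `**kwargs` in `Shift._transform`, the call would raise a `TypeError` as soon as `cond` is passed down the chain.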

On the other hand, notice the `log_det=False` argument. If it is True, both `_transform` and `_invert` need to compute the logarithm of the absolute value of the determinant of the Jacobian of the transformation at x or u, respectively. This term is used to compute the negative log-likelihood (NLL) of a sample, and it is essential for training.

If `log_det` is True, these functions are expected to compute **and** return that log det. In that case, the return type is a tuple with the transformed tensor and the log det, like this: `return u, log_det`. If `log_det` is False, only the transformed tensor is returned.

Taking this into account, you can do whatever you want with this Flow. Look for the implementations in the library to see some examples.
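
As a sanity check on the `log_det` convention, the following plain-Python sketch (no `torch`, no library classes; `transform` and `invert` are hypothetical helper names) applies the element-wise map u_i = exp(x_i). Its Jacobian is diagonal with entries exp(x_i), so the log-determinant is the sum of the x_i, and the inverse returns the negated value:

```python
import math

def transform(x, log_det=False):
    # u_i = exp(x_i); the Jacobian is diagonal with entries exp(x_i)
    u = [math.exp(xi) for xi in x]
    if log_det:
        # log|det J| = sum_i log(exp(x_i)) = sum_i x_i
        return u, sum(x)
    return u

def invert(u, log_det=False):
    # x_i = log(u_i); the Jacobian is diagonal with entries 1 / u_i
    x = [math.log(ui) for ui in u]
    if log_det:
        # log|det J| = sum_i log(1 / u_i) = -sum_i x_i
        return x, -sum(x)
    return x

x = [0.5, -1.0, 2.0]
u, ld_fwd = transform(x, log_det=True)
x_rec, ld_inv = invert(u, log_det=True)

# The two log-determinants cancel, and the round trip recovers x
assert abs(ld_fwd + ld_inv) < 1e-12
assert all(abs(a - b) < 1e-12 for a, b in zip(x, x_rec))
```

Note that both functions return a bare list when `log_det` is False and a `(tensor, log_det)` tuple otherwise, mirroring the rule described above.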

<a id="transformer" />

# Implementing my own Transformer

Transformers are a special kind of Flow. When we call them directly (through `__call__` and eventually `forward`), we give them an additional positional argument, `h`, which holds the parameters that the transformer needs. For example, in an Affine transformer, `h` contains loc and scale.

When we define a Transformer, we need to implement the following methods in the inheriting class:

In [6]:
from flow import Transformer

class MyOwnAffineTransformer(Transformer):
    
    def __init__(self, **kwargs):
        # Extend the constructor to pass, to its base class,
        # the required kwarg h_dim >= 0. 
        
        # h_dim is the number of dimensions that your parameters (in total) require PER DIMENSION.
        # For example, in an Affine transformer, we have 2 parameters, loc and scale, 
        # each of dimension 1 per each dimension we want to transform. That means,
        # h_dim = 1 (for loc) + 1 (for scale). If the Flow is used with a 10-dimensional distribution,
        # then we'll have 10 pairs of parameters, each of h_dim=2 dimensions
        # (the first one for loc, the second one for scale).
        
        # When we define h_dim, we pass it to the base class:
        h_dim = 2 # whatever you need here
        super().__init__(h_dim=h_dim, **kwargs) # call the base class
        
        # do something else if you need to
        # ...

    def _activation(self, h, **kwargs): 
        # Transform h by activation before calling _transform or _invert.
        # 
        # For example, for a scale parameter, _activation could pass h
        # through a softplus function to make it positive.
        # Returns a tuple with the activated tensor parameters.
        
        # Let's do the Affine one
        loc, log_scale = h[:, ::2], h[:, 1::2]
        # here, the even indices (0, 2, ...) are for loc, and the odd ones (1, 3, ...) for log_scale
        # Note that h can contain a single dimension's worth of parameters,
        # or all dimensions at the same time. That's why we use this approach here,
        # getting all odds or evens at once.
        
        # dimension parameters always come together, like this:
        # loc1 | log_scale1 | loc2 | log_scale2
        # Always take this into account.
        # Otherwise, some Conditioners might not work with your transformer.
        
        # Now, we need to pass an activation function through log_scale to transform it
        # into a positive tensor.
        scale = torch.exp(log_scale)
        
        return loc, scale # notice how we return all divided parameters in a single tuple
        # even if you return a single parameter, return it as a tuple, like this:
        # return loc,
        
    def _transform(self, x, *h, log_det=False, **kwargs): 
        # Transform x into u using parameters h.
        # Notice that, in contrast with a base Flow, here we receive our tuple of parameters
        # as positional arguments. Here h is the tuple we return from _activation
        loc, scale = h # unpack the tuple
        
        u = x * scale + loc
        
        if log_det: # remember to return log_det if required
            # the Jacobian is diagonal, so its log-determinant is the sum of the logs of the diagonal's elements
            log_det = torch.log(scale).sum(dim=1)
            
            return u, log_det
        else:
            return u

    def _invert(self, u, *h, log_det=False, **kwargs):
        # Transform u into x using parameters h.
        loc, scale = h # unpack again
        
        x = (u - loc) / scale
        
        if log_det:
            log_det = -torch.log(scale).sum(dim=1) # in the inverse, log_det is the negative of the forward one
            
            return x, log_det
        else:
            return x

    def _h_init(self):
        # Return initialization values for pre-activation h parameters.
        
        # If you want to set the pre-_activation parameters h to a default, stable value,
        # return that value in this function.
        # This is useful to initialize your flow to perform the identity function at first.
        # This helps in stabilizing training.
        
        # For this example, to make it the identity function, we need scale = 1 and loc = 0.
        # But remember, we return the pre-_activation values! The inverse of exp is log, and log(1) is 0.
        # So we just need to return 0s
        h_init = torch.zeros(self.dim * self.h_dim, device=self.device)
        # Notice that the shape of h_init is (self.dim * self.h_dim),
        # which means, h_dim dimensions per each distribution's dimension.

        return h_init
        
        # If you don't want to do initialization, just return None instead:
        # return None
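
The interleaved parameter layout is easy to get wrong, so here is a minimal plain-Python sketch of the same slicing on a single row of `h` for a 3-dimensional input. The numbers are made up for illustration, and lists stand in for tensors. It also checks that an all-zero `h`, as returned by `_h_init` above, yields the identity:

```python
import math

# One row of pre-activation parameters for dim = 3, h_dim = 2:
# loc1 | log_scale1 | loc2 | log_scale2 | loc3 | log_scale3
h = [0.1, 0.0, -0.2, 1.0, 0.3, -1.0]

loc = h[::2]         # indices 0, 2, 4
log_scale = h[1::2]  # indices 1, 3, 5
scale = [math.exp(s) for s in log_scale]

x = [1.0, 2.0, 3.0]
u = [xi * si + li for xi, si, li in zip(x, scale, loc)]
log_det = sum(math.log(si) for si in scale)  # equals sum(log_scale)

# With h = 0 everywhere, loc = 0 and scale = exp(0) = 1: the identity map
h0 = [0.0] * len(h)
u0 = [xi * math.exp(s) + l for xi, l, s in zip(x, h0[::2], h0[1::2])]
assert u0 == x
```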

<a id="conditioner" />

# Implementing my own Conditioner

Conditioners use the abstract interface we described before to define a Flow irrespective of the actual transformation we perform with its Transformer. 

When we define a Conditioner, we need to implement the following methods in the inheriting class:

In [7]:
from flow import Conditioner

class MyOwnConditioner(Conditioner):
    
    conditional = False # True means that this is a conditional-enabled Conditioner
    
    # Take into account that a Conditioner has an attribute self.trnf
    # that contains the transformer it's going to use.
    
    def _h(self, x, cond=None, **kwargs): 
        # Return the (non-activated) tensor of parameters h 
        # corresponding to the given x. If this is a conditional flow,
        # the conditioning tensor is passed as the 'cond' kwarg.
        
        # For example, this calls a network that gets x as input and returns h.
        pass

    def _invert(self, u, cond=None, log_det=False, **kwargs): 
        # Transform u into x.
        pass
        
    # Also, optionally, you might want to use _h_init in your __init__
    # to initialize your network. Look at flow.conditioner for examples.

Notice that a Conditioner does not implement a `_transform` method. This is taken care of by the Conditioner abstract class directly: it calls `_h` with the given x (and possibly cond) to obtain the pre-activation parameters h. Then, it passes x and h to the Transformer's `forward` method, which is the one that actually performs the transformation.
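
This division of labour can be pictured with a toy stand-in. It is only a sketch of the control flow, not the library's actual code: `ToyAffine`, `ToyConditioner`, `ConstantConditioner` and their method bodies are hypothetical, and only the names `_h`, `_activation`, `_transform`, `forward` and `trnf` come from this tutorial.

```python
import math

class ToyAffine:
    """Stand-in transformer: activates h, then applies u = x * scale + loc."""

    def _activation(self, h):
        loc, log_scale = h[::2], h[1::2]
        return loc, [math.exp(s) for s in log_scale]

    def forward(self, x, h, log_det=False):
        loc, scale = self._activation(h)
        u = [xi * si + li for xi, si, li in zip(x, scale, loc)]
        if log_det:
            return u, sum(math.log(si) for si in scale)
        return u


class ToyConditioner:
    """Stand-in conditioner: the base class owns _transform, subclasses own _h."""

    def __init__(self, trnf):
        self.trnf = trnf

    def _h(self, x, cond=None, **kwargs):
        raise NotImplementedError  # supplied by the subclass

    def _transform(self, x, log_det=False, cond=None, **kwargs):
        # 1. obtain the pre-activation parameters
        h = self._h(x, cond=cond, **kwargs)
        # 2. delegate the actual transformation to the transformer
        return self.trnf.forward(x, h, log_det=log_det)


class ConstantConditioner(ToyConditioner):
    def _h(self, x, cond=None, **kwargs):
        # Same (loc, log_scale) = (1, 0) for every dimension, regardless of x
        return [1.0, 0.0] * len(x)


u, ld = ConstantConditioner(ToyAffine())._transform([0.0, 5.0], log_det=True)
print(u, ld)  # [1.0, 6.0] 0.0
```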

However, you do need to define the `_invert` operation. Since each Conditioner is different, you need to specify how exactly your Conditioner is going to invert u into x. The rules for `log_det` are the same as before.
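
As one illustration of why inversion is conditioner-specific, take an autoregressive conditioner, where the parameters for dimension i depend only on the earlier dimensions of x. This is an assumed example, not necessarily how the library's conditioners are written, and `h_for_dim` is a hypothetical stand-in for a network. The forward pass knows all of x and so can compute every parameter directly, but the inverse must recover x one dimension at a time:

```python
import math

def h_for_dim(prev_x):
    # Hypothetical "network": (loc, log_scale) as a function of earlier dims
    s = sum(prev_x)
    return 0.5 * s, 0.1 * s

def transform(x):
    # All parameters can be computed directly, because x is fully known
    u, log_det = [], 0.0
    for i, xi in enumerate(x):
        loc, log_scale = h_for_dim(x[:i])
        u.append(xi * math.exp(log_scale) + loc)
        log_det += log_scale
    return u, log_det

def invert(u):
    # x must be rebuilt sequentially: dimension i needs x[:i] first
    x, log_det = [], 0.0
    for ui in u:
        loc, log_scale = h_for_dim(x)
        x.append((ui - loc) / math.exp(log_scale))
        log_det -= log_scale
    return x, log_det

x = [0.3, -1.2, 0.8]
u, ld_fwd = transform(x)
x_rec, ld_inv = invert(u)
assert all(abs(a - b) < 1e-9 for a, b in zip(x, x_rec))
assert abs(ld_fwd + ld_inv) < 1e-9
```

A conditioner with a different dependency structure would need a different `_invert`, which is why the base class cannot provide one.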

<a id="cond-conditioners" />

## Conditional Conditioners

The last thing to take into account with Conditioners is whether you want to make them conditional. The library expects you to support conditional distributions (see the Iris tutorial). If you do not want to support them, mark your Conditioner class with `conditional = False`, as above.

If you do want to support them, make sure you use the `cond` keyword argument in both `_h` and `_invert`. `cond` contains the conditioning tensor, which is expected to have `self.cond_dim` dimensions. If `self.cond_dim` is 0, the user is not using your Conditioner in conditional mode and, as such, `cond` should be ignored (it will be None). Look at the examples in `flow.conditioner` for more details.
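
A conditional `_h` typically has to branch on whether a conditioning tensor is present. The sketch below shows that pattern in plain Python, with lists standing in for tensors; `ToyCondConditioner` and its trivial "network" are hypothetical, and only `conditional`, `cond`, `cond_dim` and `_h` are names taken from this tutorial:

```python
class ToyCondConditioner:
    conditional = True

    def __init__(self, cond_dim=0):
        self.cond_dim = cond_dim

    def _h(self, x, cond=None, **kwargs):
        if self.cond_dim > 0:
            # Conditional mode: cond must be given and have cond_dim entries
            assert cond is not None and len(cond) == self.cond_dim
            inputs = list(cond)
        else:
            # Unconditional mode: cond is None and must be ignored
            inputs = []
        # Hypothetical "network": each dimension gets loc = sum(inputs), log_scale = 0
        loc = float(sum(inputs))
        return [loc, 0.0] * len(x)


plain = ToyCondConditioner(cond_dim=0)
conditional = ToyCondConditioner(cond_dim=2)

print(plain._h([1.0, 2.0]))                         # [0.0, 0.0, 0.0, 0.0]
print(conditional._h([1.0, 2.0], cond=[0.5, 1.5]))  # [2.0, 0.0, 2.0, 0.0]
```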