# Bijectors

## Basics

### Bijection 

A bijection is a mapping between two sets that is both one-to-one and onto: every element of one set is paired with exactly one element of the other.

The most important property of a bijective function is that it is invertible. That is, if $f : X \to Y$ is a bijection, then there exists a function $g : Y \to X$, the inverse of $f$, such that both ways of composing the two functions produce an identity function: $g(f(x)) = x$ for each $x \in X$ and $f(g(y)) = y$ for each $y \in Y$.
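
As a small illustration (a sketch, not part of the original notes), take $f(x) = e^x$ from the reals to the positive reals, with inverse $g(y) = \log y$. The names `f` and `g` are chosen only to match the notation above, and the sample points are arbitrary.

```python
import math

def f(x: float) -> float:
    """f : R -> (0, inf), a bijection onto the positive reals."""
    return math.exp(x)

def g(y: float) -> float:
    """g : (0, inf) -> R, the inverse of f."""
    return math.log(y)

xs = [-2.0, 0.0, 0.5, 3.0]   # sample points in X
ys = [0.1, 1.0, 7.5]         # sample points in Y

# Both compositions act as the identity (up to floating-point rounding).
left_ok = all(math.isclose(g(f(x)), x, abs_tol=1e-12) for x in xs)
right_ok = all(math.isclose(f(g(y)), y, rel_tol=1e-12) for y in ys)
print(left_ok, right_ok)
```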

### Homeomorphism
A homeomorphism, also called a topological isomorphism or bicontinuous function, is a bijective and continuous function between topological spaces that has a continuous inverse function.

Homeomorphisms are the isomorphisms in the category of topological spaces—that is, they are the mappings that preserve all the topological properties of a given space. **Two spaces with a homeomorphism between them are called homeomorphic, and from a topological viewpoint they are the same**. 

A function $f : X \to Y$  between two topological spaces is a homeomorphism if it has the following properties:
- $f$ is a bijection (one-to-one and onto),
- $f$ is continuous,
- the inverse function $f^{-1}$ is continuous ($f$ is an open mapping).

### Differentiable manifold

A differentiable manifold (also differential manifold) is a type of manifold that is locally similar enough to a vector space to allow one to apply calculus.

A differentiable manifold is a topological manifold with a globally defined differential structure. Any topological manifold can be given a differential structure locally by using the homeomorphisms in its atlas and the standard differential structure on a vector space. To induce a global differential structure on the local coordinate systems induced by the homeomorphisms, **their compositions on chart intersections in the atlas must be differentiable functions on the corresponding vector space.** 

### Image (mathematics)

In mathematics, for a function $f : X \to Y$, the image of an input value $x$ is the single output value produced by $f$ when passed $x$. The preimage of an output value $y$ is the set of input values that produce $y$.

More generally, evaluating $f$ at each element of a given subset $A$ of its domain $X$ produces a set, called the "image of $A$ under (or through) $f$". Similarly, the inverse image (or preimage) of a given subset $B$ of the codomain $Y$ is the set of all elements of $X$ that map to a member of $B$.
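
For finite sets, both notions can be computed directly. The sketch below uses hypothetical helper names `image` and `preimage` and a small illustrative domain; it is only meant to make the definitions concrete.

```python
def image(f, subset):
    """Image of `subset` under f: the set {f(x) : x in subset}."""
    return {f(x) for x in subset}

def preimage(f, domain, targets):
    """Preimage of `targets`: all x in `domain` with f(x) in `targets`."""
    return {x for x in domain if f(x) in targets}

X = {-2, -1, 0, 1, 2}      # a finite domain chosen for illustration

def square(x):
    # Not injective: -1 and 1 both map to 1.
    return x * x

print(sorted(image(square, {-1, 0, 1})))   # [0, 1]
print(sorted(preimage(square, X, {4})))    # [-2, 2]
```

Because `square` is not injective, the preimage of a single output can contain more than one input, which is why the preimage is defined as a set.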

## Local diffeomorphism

A local diffeomorphism is intuitively a map between smooth manifolds that preserves the local differentiable structure.

Let $X$ and $Y$ be differentiable manifolds. A function $f : X \to Y$ is a local diffeomorphism if, for each point $x \in X$, there exists an open set $U$ containing $x$ such that the image $f(U)$ is open in $Y$ and $f|_U : U \to f(U)$ is a diffeomorphism.

A local diffeomorphism is a special case of an immersion $f : X \to Y$. For an immersion, each $x \in X$ has an open neighborhood $U$ such that the image $f(U)$ is an embedded submanifold of $Y$ and $f|_U : U \to f(U)$ is a diffeomorphism. Here $X$ and $f(U)$ have the same dimension, which may be less than the dimension of $Y$. For a local diffeomorphism, $f(U)$ is open in $Y$, so the dimensions of $X$ and $Y$ agree.

An important property of local diffeomorphisms follows from the inverse function theorem: a smooth map $f : X \to Y$ is a local diffeomorphism if and only if the derivative $$D f_x : T_x X \to T_{f ( x )} Y$$ is a linear isomorphism for all points $x \in X$. This implies that $X$ and $Y$ have the same dimension.

Conversely, when $X$ and $Y$ have the same dimension, $D f_x$ is a linear isomorphism if and only if it is injective, or equivalently, if and only if it is surjective.
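
A standard example (added here as an illustration, not taken from the original notes) is the polar-coordinate map $(r, \theta) \mapsto (r\cos\theta, r\sin\theta)$ on $r > 0$. Its Jacobian determinant equals $r$, which is never zero, so the map is a local diffeomorphism, yet it is not globally injective. The function names below are hypothetical.

```python
import math

def polar_to_cartesian(r: float, theta: float):
    """Map (r, theta) with r > 0 to Cartesian coordinates."""
    return (r * math.cos(theta), r * math.sin(theta))

def jacobian_det(r: float, theta: float) -> float:
    """Determinant of the Jacobian [[cos t, -r sin t], [sin t, r cos t]]."""
    a, b = math.cos(theta), -r * math.sin(theta)
    c, d = math.sin(theta), r * math.cos(theta)
    return a * d - b * c   # simplifies analytically to r

points = [(0.5, 0.0), (1.0, 1.0), (2.0, 4.0)]
dets = [jacobian_det(r, t) for r, t in points]

# Nonzero determinant at every sampled point: Df is a linear isomorphism there.
all_invertible = all(abs(d) > 0 for d in dets)

# The map is still not globally injective: theta and theta + 2*pi coincide.
p1 = polar_to_cartesian(1.0, 1.0)
p2 = polar_to_cartesian(1.0, 1.0 + 2.0 * math.pi)
same_point = all(math.isclose(u, v, abs_tol=1e-12) for u, v in zip(p1, p2))
print(all_invertible, same_point)
```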

For more details, see https://en.wikipedia.org/wiki/Local_diffeomorphism

## Covering space

In topology, a covering or covering projection is a map between topological spaces that, intuitively, locally acts like a projection of multiple copies of a space onto itself. In particular, coverings are special types of local homeomorphisms. If $p:\tilde X \to X$ is a covering, $(\tilde X,p)$ is said to be a covering space or cover of $X$, and $X$ is said to be the base of the covering, or simply the base.
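
A classic instance, given here only as an illustrative sketch, is the map $p(t) = (\cos 2\pi t, \sin 2\pi t)$ from the real line onto the unit circle: every point of the circle is covered by the infinitely many points $t + k$, $k \in \mathbb{Z}$, while $p$ is injective on any short interval. The helper `close` and the tolerance are assumptions made for the numerical check.

```python
import math

def p(t: float):
    """Covering map p : R -> S^1 wrapping the real line around the unit circle."""
    angle = 2.0 * math.pi * t
    return (math.cos(angle), math.sin(angle))

def close(a, b, tol=1e-9):
    """Compare two points of the plane coordinate-wise."""
    return all(abs(u - v) <= tol for u, v in zip(a, b))

t = 0.3
# Every integer shift of t lies in the same fiber (maps to the same base point).
fiber_ok = all(close(p(t), p(t + k)) for k in range(-3, 4))

# Restricted to a short interval around t, p is injective (a single "sheet").
local_pts = [p(t + 0.01 * i) for i in range(-5, 6)]
locally_injective = all(
    not close(local_pts[i], local_pts[j])
    for i in range(len(local_pts))
    for j in range(i + 1, len(local_pts))
)
print(fiber_ok, locally_injective)
```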

## Bijectors 

Using the above concepts, we can view a bijector as an implementation of a local diffeomorphism. It therefore provides three kinds of functions:

- Forward: Turns one random outcome into another random outcome from a different distribution.
- Inverse: Maps an outcome produced by the forward operation back to the corresponding outcome of the original distribution.
- Log_det_jacobian: The log of the absolute value of the determinant of the matrix of all first-order partial derivatives of the inverse function.

Geometrically, the absolute value of the Jacobian determinant is the local change in volume under the transformation, and it is used to rescale the probability density. We take the absolute value of the determinant before taking the log, because the log of a negative number is undefined and would produce NaN values. A negative determinant corresponds to an orientation-reversing transformation. It is safe to discard the sign of the determinant because we only integrate everywhere-nonnegative functions (probability densities), and **the correct orientation is always the one that produces a nonnegative integrand**.
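
To see why the sign can be dropped, consider the orientation-reversing map $Y = aX$ with $a = -2$ and $X \sim \mathrm{Normal}(0, 1)$. The sketch below (the value of `a` and the evaluation point are arbitrary choices) compares the signed and absolute Jacobian factors against the known density of $Y \sim \mathrm{Normal}(0, |a|)$.

```python
import math

def normal_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

a = -2.0                      # Y = a * X, an orientation-reversing transformation

def inverse(y: float) -> float:
    return y / a              # g^{-1}(y)

inverse_jacobian = 1.0 / a    # d g^{-1} / dy, negative here

y = 1.3
signed = inverse_jacobian * normal_pdf(inverse(y))         # negative: not a density
density = abs(inverse_jacobian) * normal_pdf(inverse(y))   # change of variables
reference = normal_pdf(y, mu=0.0, sigma=abs(a))            # Y ~ Normal(0, |a|)
print(signed < 0, math.isclose(density, reference))
```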

By convention, transformations of random variables are named in terms of the forward transformation. The forward transformation creates samples; the inverse is useful for computing probabilities.
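
The three operations can be sketched as a small class. `ExpBijector` and its method names are hypothetical, chosen to mirror the list above; this is not any particular library's implementation.

```python
import math

class ExpBijector:
    """Hypothetical minimal bijector for Y = exp(X); not a library class."""

    def forward(self, x: float) -> float:
        # Maps a sample of X to a sample of Y.
        return math.exp(x)

    def inverse(self, y: float) -> float:
        # Maps a sample of Y back to the corresponding sample of X.
        return math.log(y)

    def inverse_log_det_jacobian(self, y: float) -> float:
        # d/dy log(y) = 1/y, so log|1/y| = -log(y) for y > 0.
        return -math.log(y)

    def forward_log_det_jacobian(self, x: float) -> float:
        # d/dx exp(x) = exp(x), so log|exp(x)| = x.
        return x

bij = ExpBijector()
x = 0.7
y = bij.forward(x)
round_trip = bij.inverse(y)
# Forward and inverse log-determinants cancel at matching points.
ldj_sum = bij.forward_log_det_jacobian(x) + bij.inverse_log_det_jacobian(y)
print(round_trip, ldj_sum)
```

Working with the log of the determinant rather than the determinant itself keeps the density computation additive in log space, which is the usual reason for exposing it in this form.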


## Example of Bijector

### Exponential function
```
    Y = g(X) = exp(X)
    X ~ Normal(0, 1)  # Univariate.
```
Implies:
```
    g^{-1}(Y) = log(Y)
    |Jacobian(g^{-1})(y)| = 1 / y
    Y ~ LogNormal(0, 1), i.e.,
    prob(Y=y) = |Jacobian(g^{-1})(y)| * prob(X=g^{-1}(y))
            = (1 / y) Normal(log(y); 0, 1)
```
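
As a numerical sanity check of the density above (a sketch using only the standard library; the midpoint rule and its step count are arbitrary choices), the probability $P(Y \le e)$ computed from the transformed density should match $P(X \le 1)$, since $\exp$ is increasing.

```python
import math

def normal_pdf(x: float) -> float:
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def lognormal_pdf(y: float) -> float:
    """Density of Y = exp(X), X ~ Normal(0, 1), via the change of variables."""
    return (1.0 / y) * normal_pdf(math.log(y))

def integrate_midpoint(f, a: float, b: float, n: int = 20000) -> float:
    """Midpoint rule; never evaluates f at the endpoints, so y = 0 is avoided."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

upper = math.e
prob_y = integrate_midpoint(lognormal_pdf, 0.0, upper)      # P(Y <= e)
prob_x = 0.5 * (1.0 + math.erf(1.0 / math.sqrt(2.0)))       # P(X <= 1)
print(prob_y, prob_x)
```
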
### ScaleMatvecTriL

```
Y = g(X) = sqrtSigma * X
X ~ MultivariateNormal(0, I_d)
```

Implies:

```
    g^{-1}(Y) = inv(sqrtSigma) * Y
    |Jacobian(g^{-1})(y)| = |det(inv(sqrtSigma))| = 1 / |det(sqrtSigma)|
    Y ~ MultivariateNormal(0, sqrtSigma) , i.e.,  # scale sqrtSigma, covariance sqrtSigma * sqrtSigma^T
    prob(Y=y) = |Jacobian(g^{-1})(y)| * prob(X=g^{-1}(y))
            = |det(inv(sqrtSigma))| *
                MultivariateNormal(inv(sqrtSigma) * y; 0, I_d)
```
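
The same change of variables can be checked numerically for $d = 2$. In the sketch below the lower-triangular matrix `L` plays the role of `sqrtSigma`; its entries and the evaluation point are arbitrary illustrative values. The reference density is the usual multivariate normal with covariance `L L^T`, written out by hand for the two-dimensional case.

```python
import math

# Lower-triangular scale matrix (an arbitrary illustrative choice), d = 2.
L = [[2.0, 0.0],
     [1.0, 3.0]]

def inverse(y):
    """Solve L x = y by forward substitution, i.e. x = inv(L) y."""
    x0 = y[0] / L[0][0]
    x1 = (y[1] - L[1][0] * x0) / L[1][1]
    return (x0, x1)

def standard_normal_pdf_2d(x):
    return math.exp(-0.5 * (x[0] ** 2 + x[1] ** 2)) / (2.0 * math.pi)

def density_via_bijector(y):
    # For a triangular matrix, det(L) is the product of the diagonal entries.
    abs_det_inv = 1.0 / abs(L[0][0] * L[1][1])
    return abs_det_inv * standard_normal_pdf_2d(inverse(y))

def density_reference(y):
    """MultivariateNormal(0, Sigma) density with Sigma = L L^T, for d = 2."""
    s00 = L[0][0] ** 2
    s01 = L[0][0] * L[1][0]
    s11 = L[1][0] ** 2 + L[1][1] ** 2
    det = s00 * s11 - s01 ** 2
    quad = (s11 * y[0] ** 2 - 2.0 * s01 * y[0] * y[1] + s00 * y[1] ** 2) / det
    return math.exp(-0.5 * quad) / (2.0 * math.pi * math.sqrt(det))

y = (0.8, -1.5)
print(density_via_bijector(y), density_reference(y))
```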
