### Distribution Transformation

Distribution transformation is a useful tool that will be used extensively with the copula concept discussed in the next section.
The technique transforms a random variable from the uniform distribution to any other distribution and vice versa; it is called the *probability integral transform* (or percentile-to-percentile transform).

Computationally, this method involves computing the quantile function of the distribution, that is, taking the cumulative distribution function (CDF) of the distribution (which maps a number in the domain to a probability between 0 and 1) and inverting it. We won't go into the details, but we will show a few examples of how this can be done in $\tt{python}$.
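
As a quick illustration of the CDF/quantile relationship, the sketch below checks numerically that $\tt{norm.ppf}$ undoes $\tt{norm.cdf}$ for the standard normal. The variable names (`points`, `probs`, `points_back`) are chosen here for illustration only.

```python
import numpy as np
from scipy.stats import norm

# A few points in the domain of the standard normal distribution
points = np.array([-2.0, -0.5, 0.0, 1.0, 2.5])

# CDF: domain value -> cumulative probability in (0, 1)
probs = norm.cdf(points)

# Quantile function (ppf): cumulative probability -> domain value
points_back = norm.ppf(probs)

# True if the round trip recovers the original points (up to rounding)
print(np.allclose(points, points_back))
```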

For example, imagine that $\mathbb{P}(X)$ is the standard normal distribution with mean zero and standard deviation one. To convert uniformly distributed samples to standard normal samples, we apply the inverse CDF to each sample. Below are a few examples that use $\tt{scipy.stats}$, which defines many useful statistical distributions.

(Remember that the uniform samples have to be interpreted as cumulative probabilities.)

In [1]:
from scipy.stats import uniform, norm

uniform_samples = [0.5, .975, 0.995, 0.999999]

print("Unif.\t\tStd. Normal")
for u in uniform_samples:
    print("{:.7f}\t{:.8f}".format(u, norm.ppf(u)))

Unif.		Std. Normal
0.5000000	0.00000000
0.9750000	1.95996398
0.9950000	2.57582930
0.9999990	4.75342431


The same transformation can be applied directly to an entire sample.

In [18]:
x = uniform(0, 1).rvs(size=10000)

Next we apply $\tt{ppf()}$ to the whole array $x$ directly.

In [6]:
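# norm.ppf is the inverse CDF of the standard normal (loc=0, scale=1 by default);
# calling it on the imported name avoids rebinding norm to a frozen instance
x_trans = norm.ppf(x)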

<img src="uniform_gauss.png">

If we plot them together in a 2D plot, we can get a sense of what is going on when using the inverse CDF transformation:

<img src="uniform_to_gauss_2d.png">

The inverse CDF stretches the outer regions of the uniform to yield a normal distribution. 
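
The stretching can also be seen numerically. The sketch below (seed and bin edges are arbitrary choices made here) maps equal-width intervals of the uniform through $\tt{norm.ppf}$ and shows that their images get wider towards the tail, while the transformed sample keeps mean close to 0 and standard deviation close to 1.

```python
import numpy as np
from scipy.stats import norm, uniform

# Uniform sample transformed with the inverse CDF of the standard normal
u = uniform(0, 1).rvs(size=10000, random_state=0)
z = norm.ppf(u)
print("mean: {:.3f}, std: {:.3f}".format(z.mean(), z.std()))

# Equal-width intervals on the uniform side: [0.5, 0.6], [0.6, 0.7], ...
edges = np.linspace(0.5, 0.9, 5)

# Widths of the corresponding intervals on the normal side
widths = np.diff(norm.ppf(edges))
print(widths)  # each width is larger than the previous one
```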

The nice thing about this technique is that it can be used with any (univariate) probability distribution, for example [t-Student](https://en.wikipedia.org/wiki/Student%27s_t-distribution) or [Gumbel](https://en.wikipedia.org/wiki/Gumbel_distribution).

In [45]:
from scipy.stats import t
tstudent = t(4)  # t-Student distribution with 4 degrees of freedom
x_trans = tstudent.ppf(x)

<img src="uniform_tstudent_2d.png">

To do the opposite transformation, from an arbitrary distribution to the uniform(0, 1), we just apply the inverse of the inverse CDF, which is the CDF itself:

In [46]:
x = uniform(0, 1).rvs(size=10000)
x_trans = norm.ppf(x)
x_uniform = norm.cdf(x_trans)

<img src="full_chain.png">

## Copula

In probability theory a *copula* $\mathcal{C}(U_1, U_2, \ldots, U_n, \rho)$ is a multivariate (multidimensional) cumulative distribution function for which the marginal probability distribution (the probability distribution of each dimension) of each variable is uniform on the interval $[0, 1]$ ($U_i \sim$ Uniform(0,1)). $\rho$ represents the correlation between the variables.

Copulas are used to describe the dependencies between random variables and have been widely used in quantitative finance to model risk. They are popular because they make it easy to model and estimate the distribution of random vectors by representing the marginals and their correlation separately.

### Example Problem Case
Imagine measuring two variables that are correlated. For example, we look at various rivers, and for every river we record its maximum water level and count in how many months it caused flooding.

For the maximum level of the river we know that maxima are Gumbel distributed, while the number of floods can be modelled with a [Beta distribution](https://en.wikipedia.org/wiki/Beta_distribution).

It is reasonable to assume that the maximum level and the number of floods are correlated; however, we don't know how to model that correlated probability distribution. Above we only specified the distributions of the individual variables, irrespective of each other (i.e. the marginals), whereas in reality we are dealing with the joint distribution of both together.

And here is where copulas come to our rescue.

Copulas essentially allow us to decompose a joint probability distribution into its marginals (which by definition carry no information about the correlation) and a function which couples (hence the name) them together, and thus allow us to specify the correlation separately.

The copula is that coupling function.
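
In symbols, this decomposition is usually stated as Sklar's theorem. A sketch for two continuous variables, where the joint CDF $F$ and the marginal CDFs $F_1$, $F_2$ are notation introduced here for illustration:

$$F(x_1, x_2) = \mathcal{C}\big(F_1(x_1), F_2(x_2)\big), \qquad U_i = F_i(X_i) \sim \mathrm{Uniform}(0, 1)$$

The marginals $F_1$ and $F_2$ describe each variable on its own, while $\mathcal{C}$ carries all the information about how they depend on each other.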

## Adding Correlation with Gaussian Copulas

So let's continue with our example, even though we are actually almost done.
Indeed, we saw before how to convert pretty much any distribution to and from the uniform distribution. This means we can generate uniformly distributed data with the correlation we want and then transform the marginals into the desired distributions.

How do we do that?

* simulate from a multivariate Gaussian with the specific correlation structure;
* transform each Gaussian marginal to uniform;
* finally transform the uniforms to whatever we like.

So let's sample from a multivariate normal (2D) with a 0.5 correlation.

In [47]:
from scipy.stats import multivariate_normal

mvnorm = multivariate_normal(mean=[0, 0] , cov=[[1, 0.5],
                                                [0.5, 1]])
x = mvnorm.rvs(size=100000)

<img src="multivariate_2d.png">

Now we use what we have just seen to transform the marginals to uniform using the $\tt{cdf}$ function of the normal distribution ($x$ is a 2D array in this case, but the transformation is applied separately to each component):

In [48]:
# norm.cdf is applied element-wise, so each Gaussian marginal becomes Uniform(0, 1)
x_unif = norm.cdf(x)
print(x_unif.shape)

<table>
    <tr>
        <td><img src="copula_2d.png" width=300></td>
        <td><img src="copula_3d.png" width=300></td>
    </tr>
</table>

The plots above show how copulas are usually visualized. **Since we used a multivariate standard normal to model the correlation, this is also called a Gaussian copula.**

Finally we can just transform the marginals again from uniform to what we want (i.e. Gumbel and Beta in our river example): 

In [49]:
from scipy.stats import gumbel_l, beta

m1 = gumbel_l()
m2 = beta(a=10, b=3)

# transform U1 into Gumbel
# [:, 0] means all entries of dim 0  
x1_trans = m1.ppf(x_unif[:, 0])

#transform U2 into Beta
# [:, 1] means all entries of dim 1
x2_trans = m2.ppf(x_unif[:, 1])

To see that it is actually working as expected we should now compare our scatter plot with correlation to the joint distribution of the same marginals without correlation:

In [50]:
# sample from Gumbel
x1 = m1.rvs(size=10000)
# sample from Beta
x2 = m2.rvs(size=10000)

<table>
    <tr>
        <td><img src="gumbel_beta_corr.png" width=300></td>
        <td><img src="gumbel_beta_uncorr.png" width=300></td>
    </tr>
</table>
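
Besides comparing the plots by eye, the dependence can be checked numerically. The sketch below rebuilds both samples (seeds and variable names are choices made here) and computes Spearman's rank correlation, which is unchanged by the monotone $\tt{cdf}$ and $\tt{ppf}$ transformations. For a Gaussian copula with correlation $\rho$ it should be close to $\frac{6}{\pi}\arcsin(\rho/2)$, while for the independent samples it should be close to zero.

```python
import numpy as np
from scipy.stats import beta, gumbel_l, multivariate_normal, norm, spearmanr

n = 100000
rho = 0.5

# Correlated sample built through the Gaussian copula
mv = multivariate_normal(mean=[0, 0], cov=[[1, rho], [rho, 1]])
u = norm.cdf(mv.rvs(size=n, random_state=0))
x1_corr = gumbel_l().ppf(u[:, 0])
x2_corr = beta(a=10, b=3).ppf(u[:, 1])

# Independent samples from the same marginals
x1_ind = gumbel_l().rvs(size=n, random_state=1)
x2_ind = beta(a=10, b=3).rvs(size=n, random_state=2)

rho_corr, _ = spearmanr(x1_corr, x2_corr)
rho_ind, _ = spearmanr(x1_ind, x2_ind)

# Theoretical rank correlation of a Gaussian copula with parameter rho
expected = 6 / np.pi * np.arcsin(rho / 2)

print("Spearman, copula sample:      {:.3f}".format(rho_corr))
print("Spearman, theoretical value:  {:.3f}".format(expected))
print("Spearman, independent sample: {:.3f}".format(rho_ind))
```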

Using the uniform distribution as a common base for our transformations, we can easily introduce correlations and flexibly construct complex probability distributions. This extends directly to higher-dimensional distributions as well.

### Generate Correlated Distributions
Let's now see how copulas can be used to generate numbers from correlated distributions. These are the steps to follow:

* generate a random vector $\mathbf{x}=(x_1, x_2,\ldots)$ from a multivariate distribution with the desired correlation;
* determine the single $U_i(x_i)$ by applying $\tt{cdf}$ to each $x_i$;
* transform again each $U_i(x_i)$ to the desired marginal distributions using $\tt{ppf}$.

Each component of the vector $\mathbf{x}$ is now transformed as if it had been drawn from the desired marginal, with the appropriate correlation.
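
The three steps can be wrapped in a small helper. The function below is a minimal sketch, not a library API: the name `sample_gaussian_copula`, its parameters, the correlation matrix and the three marginals are all hypothetical choices made for illustration, and the multivariate distribution is assumed to be a standard Gaussian.

```python
import numpy as np
from scipy.stats import beta, expon, multivariate_normal, norm

def sample_gaussian_copula(corr, marginals, size, seed=None):
    """Sample from a Gaussian copula with correlation matrix `corr`
    and the given frozen scipy.stats distributions as marginals."""
    corr = np.asarray(corr, dtype=float)
    mv = multivariate_normal(mean=np.zeros(len(marginals)), cov=corr)
    # Step 1: correlated standard normal vectors, shape (size, n)
    z = np.atleast_2d(mv.rvs(size=size, random_state=seed))
    # Step 2: each Gaussian marginal -> Uniform(0, 1)
    u = norm.cdf(z)
    # Step 3: each uniform -> desired marginal via its quantile function
    return np.column_stack([m.ppf(u[:, i]) for i, m in enumerate(marginals)])

# Hypothetical example: three marginals with an arbitrary correlation matrix
corr = [[1.0, 0.7, 0.2],
        [0.7, 1.0, 0.4],
        [0.2, 0.4, 1.0]]
marginals = [norm(loc=1, scale=2), expon(scale=3), beta(a=2, b=5)]

samples = sample_gaussian_copula(corr, marginals, size=50000, seed=42)
print(samples.shape)
print(samples.mean(axis=0))  # should be close to the means of the marginals
```

Note that `corr` parametrizes the underlying Gaussian; the linear correlation between the transformed variables is generally not exactly the same value.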

A practical application concerns the probability of default. Imagine there are three companies (A, B and C), each with a cumulative probability of defaulting within the next two years of 10%.

Let’s compute the probability that all three default within the next two years, both with independent and with correlated default probabilities.

In the first case (independent probabilities), the probability of three defaults within two years is the product of the individual probabilities, hence:
$$\mathbb{P}_{\mathrm{uncorr}}= 10\%\cdot 10\%\cdot 10\% = 0.1\%$$

We can verify this in $\tt{python}$ by applying the Monte Carlo algorithm outlined above: generate a random sample from an uncorrelated multivariate normal distribution, then transform each sample
( $[ x_A , x_B , x_C ]$ ) into the uniform distribution with the $\tt{norm.cdf}$ function (i.e. we convert the samples into probabilities) and then count how many times all three of them are lower than 10%. The final probability is the ratio of the number of successes to the total number of trials.

I am not interested in the real distribution of the marginals since I just want to work with cumulative probabilities, so the uniforms are sufficient.

In [58]:
from scipy.stats import multivariate_normal, uniform, norm
import numpy

numpy.random.seed(10)
trials = 500000
mvnorm_no_corr = multivariate_normal(mean=[0, 0, 0], cov=[[1, 0, 0],
                                                          [0, 1, 0],
                                                          [0, 0, 1]])

defaults = 0
x = mvnorm_no_corr.rvs(trials)
x_trans = norm.cdf(x)
for i in range(len(x_trans)):
    if x_trans[i][0] < 0.1 and x_trans[i][1] < 0.1 and \
       x_trans[i][2] < 0.1:
        defaults += 1

print ("Defaults w/o correlation: {:.2f}%".format(defaults/trials*100))

Defaults w/o correlation: 0.10%
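
The explicit loop can also be written in vectorized form with NumPy. The following is a sketch of the same experiment (passing the seed through `random_state` is a choice made here, so the individual draws need not match the cell above exactly):

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

trials = 500000
mv = multivariate_normal(mean=[0, 0, 0], cov=np.eye(3))

# Convert the independent normal samples into cumulative probabilities
u = norm.cdf(mv.rvs(size=trials, random_state=10))

# True for every trial in which all three companies are below 10%
all_default = np.all(u < 0.1, axis=1)
prob = all_default.mean()

print("Defaults w/o correlation: {:.2f}%".format(prob * 100))
```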


If we repeat the same Monte Carlo experiment with (almost) perfectly correlated default probabilities, we have

In [59]:
from scipy.stats import multivariate_normal, uniform, norm
import numpy

numpy.random.seed(10)
trials = 500000
# correlations of 0.99999 approximate perfect correlation while keeping
# the covariance matrix non-singular
mvnorm_corr = multivariate_normal(mean=[0, 0, 0], 
                                  cov=[[1, 0.99999, 0.99999],
                                       [0.99999, 1, 0.99999],
                                       [0.99999, 0.99999, 1]])

defaults = 0
x = mvnorm_corr.rvs(trials)
x_trans = norm.cdf(x)
for i in range(len(x_trans)):
    if x_trans[i][0] < 0.1 and x_trans[i][1] < 0.1 and \
       x_trans[i][2] < 0.1:
        defaults += 1

print("Defaults with correlation: {:.2f}%".format(defaults/trials*100))

Defaults with correlation: 9.96%


In this case the result is about 10%, as if we had only one company: being perfectly
correlated, either there is no default or there are three "simultaneous" defaults, the latter with 10% probability.
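
Between these two extremes, the joint default probability grows with the correlation. The sketch below wraps the experiment in a hypothetical helper, `joint_default_prob`, and runs it for a few correlation values; the intermediate value 0.5 and the assumption that all pairs share the same correlation are illustrative choices, not part of the example above.

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

def joint_default_prob(rho, p_default=0.1, trials=500000, seed=10):
    """Monte Carlo estimate of the probability that three companies all
    default, assuming a Gaussian copula with pairwise correlation rho."""
    cov = np.full((3, 3), rho)
    np.fill_diagonal(cov, 1.0)
    mv = multivariate_normal(mean=np.zeros(3), cov=cov)
    u = norm.cdf(mv.rvs(size=trials, random_state=seed))
    return np.all(u < p_default, axis=1).mean()

for rho in (0.0, 0.5, 0.99999):
    prob = joint_default_prob(rho)
    print("rho = {:.5f}: {:.2f}%".format(rho, prob * 100))
```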