# Many Transformations

## Advanced Statistics Tutorial
## 2024-10-04 Paul Libbrecht CC-BY

## The Idea of a Transformation
* A random variable X: Ω → D ⊂ ℝⁿ
* A mapping m: D → E
* Obtain a transformed random variable m∘X where m∘X(ω) = m(X(ω))
* So first you measured X(ω), and now you measure m(X(ω))
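The composition above can be sketched in a few lines of Python. This is only an illustration: the helper name `transform`, the seed, and the choice of a uniform $X$ on $[0,100]$ with the shift $x ↦ x+273.15$ are assumptions made for the example.

```python
import random

def transform(sample_x, m):
    """Build a sampler for m∘X from a sampler for X (hypothetical helper)."""
    def sample_mx(rng):
        # First measure X(ω), then apply m to the measured value.
        return m(sample_x(rng))
    return sample_mx

rng = random.Random(0)  # fixed seed, only for reproducibility

def sample_x(rng):
    # X uniform on [0, 100] (assumed for this illustration)
    return rng.uniform(0.0, 100.0)

sample_kelvin = transform(sample_x, lambda x: x + 273.15)
values = [sample_kelvin(rng) for _ in range(5)]
print(values)
```

Every printed value lies in $[273.15, 373.15]$, the image of $[0,100]$ under the mapping.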


## Example 1d-transformation
* Bijective, differentiable... well-behaved maps
* Displacement: d:[0,100] → [273.15, 373.15] 
	* x ↦ x+273.15
* Log: l:[1,100000]→[0,5]
	* x ↦ log₁₀(x)
* Square: s:[0,1]→[0,1]
	* x ↦ x²

## Applying a 1d-Transformation to a Uniform RV
* Say we have a random variable $X: Ω → [a,b]$
* For the displacement $m = d$: $X: Ω → [0,100]$, so $m∘X: Ω → [273.15,373.15]$
* Assume $X$ is uniformly distributed on $[0,100]$, i.e. $f_X(x)=\frac{1}{100}\mbox{ for }x∈[0,100]$, hence $F_X(x)=\frac{x}{100}$ on that interval

$$F_{m∘X}(y) = P(m∘X(ω) \leq y) = P(X(ω) \leq m^{-1}(y)) = P(X(ω) \leq y-273.15) = \frac{y-273.15}{100} \mbox{ for } y \in [273.15, 373.15]$$

Now differentiate the CDF to get the density function:

$$ f_{m∘X}(y) = F_{m∘X}'(y) = \frac{1}{100}\mbox{ if }y\in[273.15,373.15]$$
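A quick simulation can make the shifted CDF plausible. The sketch below is not a proof: it draws samples of $X$, shifts them, and compares the empirical CDF with $\frac{y-273.15}{100}$ at a few arbitrarily chosen points. The helper names, the seed, and the sample size are assumptions.

```python
import random

def empirical_cdf(samples, y):
    """Fraction of samples that are <= y."""
    return sum(1 for s in samples if s <= y) / len(samples)

def cdf_shifted(y):
    """CDF of X + 273.15 for X uniform on [0, 100], valid on [273.15, 373.15]."""
    return (y - 273.15) / 100.0

rng = random.Random(42)  # fixed seed for reproducibility
shifted = [rng.uniform(0.0, 100.0) + 273.15 for _ in range(100_000)]

check_points = (280.0, 300.0, 323.15, 350.0, 370.0)
max_gap = max(abs(empirical_cdf(shifted, y) - cdf_shifted(y)) for y in check_points)
print(max_gap)
```

With this many samples the gap between the empirical and the derived CDF should be small (on the order of a few thousandths).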

## Applying again
* Suppose we have the RV $X: Ω → [1,100000]$, so $log∘X: Ω → [0,5]$ (log is the decimal log)
* Density function of $X$ is $f_X(x) = \frac{1}{100000-1}$ for $x\in[1,100000]$
* So CDF of $X$ is $F_X(x) = P(X(ω)\leq x) = \frac{x-1}{100000-1}$ for $x\in[1,100000]$
* Remember: $log$ is monotonically increasing
* $log(x)=y \Longleftrightarrow 10^y = x$

$$F_{log∘X}(y)=P(log∘X(ω) \leq y) = P(X(ω) \leq log^{-1}(y)) = P(X(ω) \leq 10^y) = \frac{10^y-1}{100000-1}\mbox{ for }10^y\in[1,100000]$$

The last condition means $y\in[0,5]$. Differentiating the CDF with respect to $y$ (recall $\frac{d}{dy}10^y = \ln(10)\cdot 10^y$) gives the PDF:

$$f_{log∘X}(y) =  \frac{\ln(10)\cdot 10^y}{100000-1}\mbox{ for }y\in[0,5]$$
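The decimal-log case can be checked numerically in two ways, assuming the CDF $\frac{10^y-1}{100000-1}$ on $[0,5]$ and its derivative $\frac{\ln(10)\cdot 10^y}{100000-1}$. A central difference of the CDF should reproduce the PDF, and a simulation should reproduce the CDF. The function names, the check point $y=4$, and the seed are choices made for this sketch.

```python
import math
import random

def cdf_log(y):
    """CDF of log10(X) for X uniform on [1, 100000], valid for y in [0, 5]."""
    return (10 ** y - 1) / 99999

def pdf_log(y):
    """Derivative of cdf_log; the factor ln(10) comes from d/dy 10**y."""
    return math.log(10) * 10 ** y / 99999

# Central difference of the CDF as an independent check of the PDF.
y0, h = 4.0, 1e-6
numeric_pdf = (cdf_log(y0 + h) - cdf_log(y0 - h)) / (2 * h)

# Monte Carlo check of the CDF at the same point.
rng = random.Random(7)  # fixed seed for reproducibility
samples = [math.log10(rng.uniform(1.0, 100000.0)) for _ in range(100_000)]
empirical = sum(1 for s in samples if s <= y0) / len(samples)

print(numeric_pdf, pdf_log(y0))
print(empirical, cdf_log(y0))
```

Note how skewed the transformed variable is: only about 10% of the mass lies below $y=4$, because 90% of the interval $[1,100000]$ lies above $10^4$.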

## Applying again
* Suppose we have the RV $X: Ω → [0,1]$, so $s∘X: Ω → [0,1]$ with $s(x)=x^2$
* Density function of $X$ is $f_X(x) = 1$ for $x\in[0,1]$
* So CDF of $X$ is $F_X(x) = P(X(ω)\leq x) = x$ for $x\in[0,1]$
* Remember: $s$ is monotonically increasing on $[0,1]$
* $s(x)=y \Longleftrightarrow \sqrt{y} = x$

$$F_{s∘X}(y)=P(s∘X(ω) \leq y) = P(X(ω) \leq s^{-1}(y)) = P(X(ω) \leq \sqrt{y}) = \sqrt{y}\mbox{ for }y\in[0,1]$$

The PDF is then:

$$f_{s∘X}(y) = \frac{d}{dy}\sqrt{y} = \frac{1}{2\sqrt{y}}\mbox{ for }y\in(0,1]$$
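For the square, differentiating the CDF $\sqrt{y}$ gives the density $\frac{1}{2\sqrt{y}}$ on $(0,1]$. The sketch below compares both with a simulation: the fraction of samples below $0.25$ should be close to $\sqrt{0.25}=0.5$, and a histogram bin around $0.275$ should have a height close to $\frac{1}{2\sqrt{0.275}}$. The helper `bin_density`, the bin, the seed, and the sample size are assumptions for illustration.

```python
import random

rng = random.Random(3)  # fixed seed for reproducibility
n = 200_000
squared = [rng.random() ** 2 for _ in range(n)]  # samples of s∘X, X uniform on [0, 1)

# CDF check: F(0.25) = sqrt(0.25) = 0.5.
below_quarter = sum(1 for s in squared if s <= 0.25) / n

def bin_density(samples, lo, hi):
    """Histogram estimate of the density on the bin [lo, hi)."""
    count = sum(1 for s in samples if lo <= s < hi)
    return count / (len(samples) * (hi - lo))

def pdf_square(y):
    """Density of X**2 for X uniform on [0, 1], valid for y in (0, 1]."""
    return 1.0 / (2.0 * y ** 0.5)

estimate = bin_density(squared, 0.25, 0.30)
expected = pdf_square(0.275)  # density at the bin midpoint
print(below_quarter, estimate, expected)
```

The density is unbounded near $y=0$: squaring pushes small values of $X$ even closer to zero.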

## Rationale: The Transformation Theorem (1d)

Let $X: Ω → [a,b]$ be a random variable with density function $f_X$, and let $t:[a,b] → [c,d]$ be a transformation (bijective and differentiable).
We call the random variable $t∘X: Ω → [c,d]$ defined by $t∘X(ω) = t(X(ω))$ the _random variable $X$ transformed by $t$_. The theorem states that:

$$ f_{t∘X}(y) = f_X(t^{-1}(y))\cdot \left|(t^{-1})'(y)\right|$$

(see, e.g., Hogg & McKean Theorem 1.7.1)
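The formula can be turned into a small numerical helper and tried on the three 1d examples (shift, decimal log, square). This is a sketch under assumptions: `transformed_pdf` is a hypothetical name, and the derivative of $t^{-1}$ is approximated by a central difference rather than computed symbolically.

```python
import math

def transformed_pdf(f_x, t_inv, y, h=1e-6):
    """Evaluate f_X(t^{-1}(y)) * |(t^{-1})'(y)| with a numerical derivative."""
    d_inv = (t_inv(y + h) - t_inv(y - h)) / (2 * h)
    return f_x(t_inv(y)) * abs(d_inv)

def uniform_pdf(a, b):
    """Density of the uniform distribution on [a, b]."""
    return lambda x: 1.0 / (b - a) if a <= x <= b else 0.0

# Shift: t(x) = x + 273.15, so t^{-1}(y) = y - 273.15.
shift_value = transformed_pdf(uniform_pdf(0, 100), lambda y: y - 273.15, 300.0)

# Decimal log: t(x) = log10(x), so t^{-1}(y) = 10**y.
log_value = transformed_pdf(uniform_pdf(1, 100000), lambda y: 10 ** y, 3.0)

# Square on [0, 1]: t(x) = x**2, so t^{-1}(y) = sqrt(y).
square_value = transformed_pdf(uniform_pdf(0, 1), math.sqrt, 0.25)

print(shift_value)   # close to 1/100
print(log_value)     # close to ln(10) * 1000 / 99999
print(square_value)  # close to 1 / (2 * sqrt(0.25)) = 1
```

The absolute value matters for decreasing transformations, where $(t^{-1})'$ is negative while a density must stay non-negative.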

## Transforming

Transformed random variables become more interesting when you combine several random variables: e.g. from a random angle and a random distance you get random coordinates (see, e.g., [this Stack Overflow question on sampling uniformly inside a spherical volume](https://stackoverflow.com/questions/5408276/sampling-uniformly-distributed-random-points-inside-a-spherical-volume)).

You need multidimensional random variables $X: Ω → D ⊂ ℝ^n$.
So you no longer work with a one-dimensional CDF... but with the probabilities (measures) of sets.

You need multidimensional transformations: $t: D → E⊂ ℝ^n$... and still want them to be bijective and differentiable.

The transformation theorem says about the same, but the derivative is now replaced by the determinant of the Jacobian (_think: how much does the transformation stretch or shrink the volume?_).

$$ f_{t∘X}(y) = f_X(t^{-1}(y))\cdot \left|\det J_{t^{-1}}(y)\right|$$

(see Hogg & McKean sec 2.7)
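The random angle and random distance example illustrates why the Jacobian matters. For the polar map $t(r,θ) = (r\cos θ, r\sin θ)$ the inverse has $|\det J_{t^{-1}}| = \frac{1}{r}$, so a uniform radius on $[0,1]$ with a uniform angle gives the density $\frac{1}{2π r}$ in the plane, which piles up near the origin. Taking the radius as $\sqrt{U}$ with $U$ uniform compensates for this. The sketch below checks both claims by counting points within radius $0.5$; the helper names, the seed, and the sample size are assumptions.

```python
import math
import random

def polar_to_cartesian(r, theta):
    """The transformation t(r, θ) = (r cos θ, r sin θ)."""
    return r * math.cos(theta), r * math.sin(theta)

def fraction_within(points, radius):
    """Fraction of points whose distance to the origin is <= radius."""
    return sum(1 for x, y in points if math.hypot(x, y) <= radius) / len(points)

rng = random.Random(1)  # fixed seed for reproducibility
n = 100_000

# Uniform radius: density 1/(2πr) in the plane, NOT uniform on the disk.
naive = [polar_to_cartesian(rng.random(), rng.uniform(0.0, 2 * math.pi))
         for _ in range(n)]

# Radius sqrt(U): radial density 2r cancels the 1/r factor, uniform on the disk.
corrected = [polar_to_cartesian(math.sqrt(rng.random()), rng.uniform(0.0, 2 * math.pi))
             for _ in range(n)]

frac_naive = fraction_within(naive, 0.5)          # expected near 0.5
frac_corrected = fraction_within(corrected, 0.5)  # expected near 0.25 (area ratio)
print(frac_naive, frac_corrected)
```

A uniform distribution on the unit disk puts a quarter of its mass inside radius $0.5$, since that inner disk has a quarter of the area.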

## Combining, Transforming, Projecting

Working in several dimensions is particularly useful for computing simple operations between several random variables.
For example, the sum of two independent $\Gamma$-distributed variables $X: Ω → ℝ^+$ and $Y: Ω → ℝ^+$:

* Join the random variables, obtain $Z: Ω → ℝ^+ \times ℝ^+$ (by independence, the joint pdf is the product $f_X\cdot f_Y$)
* Transform, e.g. $t: ℝ^+ \times ℝ^+ → D$ with $t(x,y) = (x+y, x-y) = (u,v)$ (note: define $D$ !)
* Obtain the pdf of the transformed random variable
* The pdf $f_{X+Y}$ is the marginal distribution: $f_{X+Y}(u) = \int_v f_{t∘Z}(u,v)\cdot dv$
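The four steps above can be sketched numerically. For $t(x,y) = (x+y, x-y)$ the inverse is $x=\frac{u+v}{2}$, $y=\frac{u-v}{2}$ with $|\det J_{t^{-1}}| = \frac{1}{2}$, and the image is $D = \{(u,v): u>0,\ -u<v<u\}$. The example assumes shape parameters 2 and 3 with a common unit scale (values chosen only for illustration) and integrates over $v$ with a midpoint rule. As a cross-check it relies on the known fact that the sum of independent Gamma variables with the same scale is again Gamma-distributed, with the shapes added.

```python
import math

def gamma_pdf(x, shape, scale=1.0):
    """Density of the Gamma distribution (shape/scale parameterization)."""
    if x <= 0:
        return 0.0
    return x ** (shape - 1) * math.exp(-x / scale) / (math.gamma(shape) * scale ** shape)

def joint_pdf_uv(u, v, shape_x, shape_y):
    """Density of t∘Z at (u, v) for t(x, y) = (x + y, x - y)."""
    x, y = (u + v) / 2, (u - v) / 2  # inverse transformation
    # Joint pdf of independent X, Y times |det J_{t^{-1}}| = 1/2.
    return gamma_pdf(x, shape_x) * gamma_pdf(y, shape_y) * 0.5

def pdf_sum(u, shape_x, shape_y, steps=2000):
    """Marginal density of U = X + Y: integrate over v in (-u, u), midpoint rule."""
    if u <= 0:
        return 0.0
    width = 2 * u / steps
    total = sum(joint_pdf_uv(u, -u + (k + 0.5) * width, shape_x, shape_y)
                for k in range(steps))
    return total * width

numeric = pdf_sum(3.0, 2.0, 3.0)
closed_form = gamma_pdf(3.0, 5.0)  # Gamma with shape 2 + 3, same scale
print(numeric, closed_form)
```

The integration limits come directly from $D$: for a fixed $u$, both $x$ and $y$ are positive exactly when $-u<v<u$.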