# Linear Transformation of Normal Distribution

---

## Table of contents
1. Linear Transformation of Normal Distribution
2. Standardisation of Normal Distribution
3. Examples 

---

## 1. Linear Transformation of Normal Distribution

Given a random variable $X \sim \mathcal{N}(\mu,\sigma^2)$, $aX+b \, (a \ne 0)$ follows $\mathcal{N}(a\mu+b,a^2\sigma^2)$.

**Proof 1**

The probability density function of a random variable $X \sim \mathcal{N}(\mu,\sigma^2)$ is:

$f(x) = \frac{1}{\sigma\sqrt{2\pi}}exp\{-\frac{1}{2\sigma^2}(x-\mu)^2\}$

Let: $y = g(x) = ax+b$

Note: $x = g^{-1}(y) = \frac{y-b}{a}$

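Now, the probability density function of $Y = g(X)$, say $h(y)$, follows from the change-of-variables formula. The absolute value of the derivative is needed so that $h(y)$ stays non-negative when $a < 0$.

$h(y) = f(x)\left|\frac{dx}{dy}\right| = f(g^{-1}(y))\left|\frac{dx}{dy}\right|$

Note: $\frac{dx}{dy} = (\frac{y-b}{a})' = \frac{1}{a}$

$h(y) = f(\frac{y-b}{a})\cdot \frac{1}{|a|}$

$= \frac{1}{|a|\sigma\sqrt{2\pi}}exp\{-\frac{1}{2\sigma^2}(\frac{y-b}{a}-\mu)^2\}$

$= \frac{1}{|a|\sigma\sqrt{2\pi}}exp\{-\frac{1}{2a^2\sigma^2}(y-(a\mu+b))^2\}$

which is the probability density function of $\mathcal{N}(a\mu+b,a^2\sigma^2)$, whose standard deviation is $|a|\sigma$.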
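
As a quick numerical sanity check of Proof 1, the sketch below compares the change-of-variables density $f(\frac{y-b}{a})\cdot\frac{1}{|a|}$ with the density of $\mathcal{N}(a\mu+b,a^2\sigma^2)$ at a few points, including a case with negative $a$. The helper name `transformed_pdf` and the chosen parameter values are illustrative assumptions, not part of the original derivation.

```python
from statistics import NormalDist


def transformed_pdf(y, mu, sigma, a, b):
    """Density of Y = aX + b via change of variables: f((y - b) / a) / |a|."""
    x = (y - b) / a
    return NormalDist(mu, sigma).pdf(x) / abs(a)


mu, sigma = 100.0, 20.0  # illustrative parameters
for a, b in [(0.5, 0.0), (-2.0, 3.0)]:
    # Claimed distribution of Y: mean a*mu + b, standard deviation |a|*sigma.
    target = NormalDist(a * mu + b, abs(a) * sigma)
    for y in (-250.0, -200.0, 40.0, 50.0, 65.0):
        assert abs(transformed_pdf(y, mu, sigma, a, b) - target.pdf(y)) < 1e-12
```

The check passes silently when the two densities agree; it is a spot check at a handful of points, not a proof.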

**Proof 2**

The characteristic function of a random variable $X \sim \mathcal{N}(\mu,\sigma^2)$ is:

$\varphi_X(t) = E[e^{itX}] = exp\{i\mu t - \frac{\sigma^2t^2}{2}\}$

Now, the characteristic function of $Y=aX+b$ is:

$\varphi_Y(t) = E[e^{itY}] = E[e^{it(aX+b)}]$

$=E[e^{itaX}e^{itb}] = e^{itb}E[e^{itaX}] = e^{itb}\varphi_X(ta)$

$=exp\{itb\} exp\{i\mu at - \frac{\sigma^2a^2t^2}{2}\}$

$= exp\{itb + i\mu at - \frac{\sigma^2a^2t^2}{2}\}$

$= exp\{i(a\mu+b)t - \frac{a^2\sigma^2t^2}{2}\}$

which is the characteristic function of $\mathcal{N}(a\mu+b,a^2\sigma^2)$. Since a characteristic function uniquely determines its distribution, $Y \sim \mathcal{N}(a\mu+b,a^2\sigma^2)$.
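
The identity in Proof 2 can also be checked numerically. The sketch below approximates $E[e^{it(aX+b)}]$ with a midpoint rule over the density of $X$ and compares it with the closed form for $\mathcal{N}(a\mu+b,a^2\sigma^2)$. The function names, the integration range of $\pm 10\sigma$ and the parameter values are assumptions chosen for illustration.

```python
import cmath
from statistics import NormalDist


def cf_closed_form(t, mu, sigma):
    """Characteristic function of N(mu, sigma^2): exp(i*mu*t - sigma^2 * t^2 / 2)."""
    return cmath.exp(1j * mu * t - 0.5 * sigma**2 * t**2)


def cf_numeric(t, mu, sigma, a, b, n=20000, width=10.0):
    """Approximate E[exp(i*t*(aX + b))] by a midpoint rule on mu +/- width*sigma."""
    dist = NormalDist(mu, sigma)
    lo = mu - width * sigma
    step = 2 * width * sigma / n
    total = 0j
    for k in range(n):
        x = lo + (k + 0.5) * step
        total += cmath.exp(1j * t * (a * x + b)) * dist.pdf(x)
    return total * step


mu, sigma, a, b, t = 1.0, 2.0, -1.5, 0.7, 0.3  # illustrative values
lhs = cf_numeric(t, mu, sigma, a, b)
rhs = cf_closed_form(t, a * mu + b, abs(a) * sigma)
assert abs(lhs - rhs) < 1e-8
```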

---

## 2. Standardisation of Normal Distribution 

$\mathcal{N}(0,1)$ is called the standard normal distribution. Standardising a random variable $X \sim \mathcal{N}(\mu,\sigma^2)$ to $\mathcal{N}(0,1)$ lets us look up the upper probability $P(X > x)$ for a given value $x$ in a standard normal distribution table (also called a z-table).

If a random variable $X \sim \mathcal{N}(\mu,\sigma^2)$, then $\frac{X-\mu}{\sigma} \sim \mathcal{N}(0,1)$.

**Proof**

From section 1, $aX+b \, (a \ne 0)$ follows $\mathcal{N}(a\mu+b,a^2\sigma^2)$ for a given random variable $X \sim \mathcal{N}(\mu,\sigma^2)$.

Let: $a = \frac{1}{\sigma}, b = -\frac{\mu}{\sigma}$

Then: $a\mu+b = 0, a^2\sigma^2 = 1$, so $aX+b = \frac{X-\mu}{\sigma} \sim \mathcal{N}(0,1)$.
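
In code, standardisation means an upper probability for any normal distribution can be read off the standard normal CDF. The sketch below uses Python's `statistics.NormalDist` in place of a printed table; the helper name `upper_probability` is hypothetical, and the numbers are borrowed from the examples in section 3.

```python
from statistics import NormalDist


def upper_probability(x, mu, sigma):
    """P(X > x) for X ~ N(mu, sigma^2), computed via the z-score and N(0, 1)."""
    z = (x - mu) / sigma
    return 1.0 - NormalDist().cdf(z)


# Both routes should agree: direct CDF of N(100, 20^2) vs. standardised z-score.
direct = 1.0 - NormalDist(100, 20).cdf(108)
via_z = upper_probability(108, 100, 20)
assert abs(direct - via_z) < 1e-12
```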

## 3. Examples

Answer **Q1~Q5** based on the facts below. The upper probabilities of the standard normal distribution are given in the table in the Appendix.
- This year, the top half of the applicants to university X passed the entrance exam, based only on their exam scores.
- None of the applicants who scored in the top 10% accepted the offer from university X, whereas all other successful applicants accepted it.
- The exam scores of all applicants follow $\mathcal{N}(100,20^2)$.
- John scored 108 points in this exam.

**Q1. If you convert the distribution of exam scores of all applicants from $\mathcal{N}(100,20^2)$ to $\mathcal{N}(50,10^2)$, what is John's converted score?**

Here, the original score $X$ follows $\mathcal{N}(100,20^2)$ and the converted score $Y$ follows $\mathcal{N}(50,10^2)$. Let $Y=aX+b$, then $100a+b=50$ and $a^2 \cdot 20^2=10^2$. Taking $a>0$ so that the ordering of scores is preserved, $a=\frac{1}{2}, b=0$. John's converted score is $Y=aX+b=\frac{1}{2}\cdot 108+0 = 54$.
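
The same calculation can be sketched in a few lines of Python. The helper `linear_map` is a hypothetical name; it assumes $a>0$, matching the choice made above.

```python
def linear_map(mu_x, sigma_x, mu_y, sigma_y):
    """Return (a, b) with a > 0 so that aX + b maps N(mu_x, sigma_x^2) to N(mu_y, sigma_y^2)."""
    a = sigma_y / sigma_x   # from a^2 * sigma_x^2 = sigma_y^2 with a > 0
    b = mu_y - a * mu_x     # from a * mu_x + b = mu_y
    return a, b


a, b = linear_map(100, 20, 50, 10)
john_converted = a * 108 + b
print(a, b, john_converted)
```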

**Q2. John's score is in the upper A% of all applicants, the upper B% of all those who passed the exam, and the upper C% of all those who accepted the offer. Fill in A, B and C.**

Convert the original score to z-score, to use the standard normal distribution table. 

$z_{John} = \frac{X-\mu}{\sigma} = \frac{108-100}{20} = 0.4$

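From the table in the Appendix, the upper probability for $z=0.4$ is $0.3446$, so $A=34.46$.

Next, the applicants who passed the exam are exactly the top 50% of all applicants (the normal distribution is symmetric about its mean), so $B=34.46 \times 2=68.92$.

Finally, since the top 10% of applicants did not accept the offer, those who accepted are the applicants between the top 10% and the top 50%, so $C=100 \times \frac{34.46-10}{50-10} = 61.15$.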
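
For readers without the table at hand, A, B and C can be reproduced with `statistics.NormalDist`. This is a sketch using the exact CDF rather than the four-digit table, so the results may differ from the table-based figures in the last digit.

```python
from statistics import NormalDist

z_john = (108 - 100) / 20
upper = 1.0 - NormalDist().cdf(z_john)  # fraction of all applicants scoring above John

A = 100 * upper                            # among all applicants
B = 100 * upper / 0.5                      # among the top 50% who passed
C = 100 * (upper - 0.10) / (0.50 - 0.10)   # among those between the top 10% and top 50%
print(round(A, 2), round(B, 2), round(C, 2))
```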

**Q3. What's the lowest and highest score of applicants who accepted the offer?**

The lowest score is $100$, because the pass mark is the median of $\mathcal{N}(100,20^2)$, which equals its mean.

From the standard normal distribution table in the Appendix, the top 10% of applicants have z-scores $z>1.28$. Hence the highest score among those who accepted the offer corresponds to $z=1.28$.

$\frac{X-100}{20} = 1.28$

The highest score is $X = 125.6$.
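
The two cut-offs are simply the 50% and 90% quantiles of the score distribution. A minimal sketch with `NormalDist.inv_cdf` follows; it uses the exact quantile rather than the rounded table value $1.28$, so the upper cut-off comes out slightly above $125.6$.

```python
from statistics import NormalDist

scores = NormalDist(100, 20)
lowest = scores.inv_cdf(0.5)    # pass mark: the median of the score distribution
highest = scores.inv_cdf(0.9)   # boundary below which 90% of applicants fall
print(lowest, round(highest, 1))
```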

**Q4. Describe the distribution of successful applicants' scores using the pdf of the standard normal distribution: $\varphi(z) = \frac{1}{\sqrt{2\pi}}exp\{-\frac{z^2}{2}\}$**

The density of the successful applicants' scores is the upper half ($X \ge 100$) of the $\mathcal{N}(100,20^2)$ density multiplied by $2$: the upper half has area $1/2$, and the total probability must be $1$.

So, what we need to do here is to express the pdf of $\mathcal{N}(100,20^2)$ using $\varphi(z)$, and multiply it by $2$.

$z = \frac{x-100}{20}$

$f(x) = \varphi(z)\frac{dz}{dx}$

$\frac{dz}{dx} = \frac{1}{20}$

Hence, the density we want is $\varphi(z)\frac{dz}{dx}\cdot 2 = \varphi(\frac{x-100}{20})\cdot\frac{1}{20}\cdot 2 = \frac{1}{10}\varphi(\frac{x-100}{20})$ for $x \ge 100$ (and $0$ otherwise).
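
A quick way to confirm that this is a valid density is to integrate it numerically and check that the total is $1$. The sketch below uses a simple midpoint rule; the function names and the upper integration limit of $300$ (ten standard deviations above the mean) are assumptions made for illustration.

```python
import math


def phi(z):
    """pdf of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


def successful_pdf(x):
    """Density of a successful applicant's score: (1/10) * phi((x - 100) / 20) for x >= 100."""
    return phi((x - 100) / 20) / 10 if x >= 100 else 0.0


def integrate(f, lo, hi, n=100000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    step = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * step) for k in range(n)) * step


total = integrate(successful_pdf, 100, 300)
assert abs(total - 1.0) < 1e-6
```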

**Q5. What are the mean and variance of the scores of the successful applicants?**

$E[X|X\ge100] = \int_{100}^\infty x \frac{1}{10}\varphi(\frac{x-100}{20})dx$

$=\int_{0}^\infty (20z+100) \frac{1}{10}\varphi(z)\frac{dx}{dz}dz$

Note: $\frac{dx}{dz} = 20$

$=2\int_{0}^\infty (20z+100)\varphi(z)dz$

$=40\int_{0}^\infty z\varphi(z)dz + 200\int_{0}^\infty \varphi(z)dz$

Note: $\int_{0}^\infty \varphi(z)dz = 0.5$

$=40\int_{0}^\infty z\varphi(z)dz + 100$

$=40\int_{0}^\infty z\frac{1}{\sqrt{2\pi}}exp(-\frac{z^2}{2})dz + 100$

$=\frac{40}{\sqrt{2\pi}}\int_{0}^\infty z \cdot exp(-\frac{z^2}{2})dz + 100$

Note: $\int_{0}^\infty x \cdot exp(-ax^2)dx = \frac{1}{2a}$

$=\frac{40}{\sqrt{2\pi}}\frac{1}{2\frac{1}{2}}+100$

$=\frac{40}{\sqrt{2\pi}} + 100$

Note: $\sqrt{2\pi}=2.50662827$ (in an exam setting, this value would normally be given in the question).

$=115.95769124$

Then,

$V[X|X\ge100]= E[X^2|X\ge100] - (E[X|X\ge100])^2$

$E[X^2|X\ge100] = \int_{100}^\infty x^2 \frac{1}{10}\varphi(\frac{x-100}{20})dx$

$=\int_{0}^\infty (20z+100)^2 \frac{1}{10}\varphi(z)\frac{dx}{dz}dz$

$=2\int_{0}^\infty (400z^2+4000z+100^2) \varphi(z)dz$

$=800\int_{0}^\infty z^2\varphi(z)dz+8000\int_{0}^\infty z\varphi(z)dz +2\cdot 100^2\int_{0}^\infty \varphi(z)dz$

$=800\int_{0}^\infty z^2\varphi(z)dz + \frac{8000}{\sqrt{2\pi}}+100^2$

$=\frac{800}{\sqrt{2\pi}}\int_{0}^\infty z^2 exp(-\frac{z^2}{2})dz + \frac{8000}{\sqrt{2\pi}}+100^2$

Note: $\int_{-\infty}^\infty x^2 e^{-ax^2}dx = \frac{1}{2a}\sqrt{\frac{\pi}{a}} \,\,\,\,$ hence, $\int_0^\infty x^2 e^{-ax^2}dx = \frac{1}{4a}\sqrt{\frac{\pi}{a}}$

$=\frac{800}{\sqrt{2\pi}}\frac{1}{2}\sqrt{2\pi} + \frac{8000}{\sqrt{2\pi}}+100^2$

$=400+3191.5382491+100^2 = 13591.538249$

Now, 

$V[X|X\ge100]= E[X^2|X\ge100] - (E[X|X\ge100])^2$

$= 13591.538249 - (115.95769124)^2$

$= 13591.538249 - 13446.186157$

$= 145.3520912$
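
The mean and variance can be cross-checked by numerical integration. The sketch below integrates over $z \ge 0$ with conditional density $2\varphi(z)$ and $X = 20z+100$, then compares against the closed forms $100+\frac{40}{\sqrt{2\pi}}$ and $400(1-\frac{2}{\pi})$. The second expression is a simplification of $E[X^2|X\ge100] - (E[X|X\ge100])^2$ that does not appear in the derivation above, and the truncation of the integral at $z=12$ is an assumption.

```python
import math


def phi(z):
    """pdf of the standard normal distribution."""
    return math.exp(-z * z / 2) / math.sqrt(2 * math.pi)


def half_line_integral(f, hi=12.0, n=200000):
    """Midpoint-rule approximation of the integral of f over [0, hi]."""
    step = hi / n
    return sum(f((k + 0.5) * step) for k in range(n)) * step


# X = 20 Z + 100, and the density of Z given Z >= 0 is 2 * phi(z).
mean = half_line_integral(lambda z: (20 * z + 100) * 2 * phi(z))
second_moment = half_line_integral(lambda z: (20 * z + 100) ** 2 * 2 * phi(z))
variance = second_moment - mean ** 2

closed_mean = 100 + 40 / math.sqrt(2 * math.pi)
closed_variance = 400 * (1 - 2 / math.pi)
assert abs(mean - closed_mean) < 1e-4
assert abs(variance - closed_variance) < 1e-3
```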

## Appendix

**The upper probabilities of the standard normal distribution**

![upper probabilities of standard normal distribution table](linear_transformation_of_normal_distribution_1.png)


This example is from the November 2019 exam of [Japan Statistical Society Certificate Grade1](http://www.toukei-kentei.jp/about/grade1/).