In [ ]:
# __future__ imports must be the first statement in the cell
from __future__ import print_function

import math
import scipy as sp
import numpy as np
import numpy.polynomial as np_poly
import matplotlib.pyplot as plt
%matplotlib inline

# required for interactive plotting
from ipywidgets import interact, interactive, fixed
import ipywidgets as widgets

In [ ]:
from IPython.display import Math
from IPython.display import Latex
Math(r'F(k) = \int_{-\infty}^{\infty} f(x) e^{-2\pi i k x} dx')


initialization  
$ \newcommand{\E}[1]{\mathbb{E}\left[#1\right]}$  
$ \newcommand{\V}[1]{\mathbb{V}\left[#1\right]}$
$ \newcommand{\EXP}[1]{\exp\left(#1\right)}$  
$ \newcommand{\P}[1]{\mathbb{P}\left(#1\right)}$
$\newcommand{\nck}[2]{\left( \begin{matrix}#1\\#2\end{matrix} \right)}$

# Exponential distribution

[here](/notebooks/probability-distributions/exponential.ipynb)

Bernoulli Distribution
============
[here](/notebooks/probability-distributions/bernoulli.ipynb)

Binomial Distribution
=====================
[here](/notebooks/probability-distributions/binomial.ipynb)

# Geometric
[here](/notebooks/probability-distributions/geometric.ipynb)

Memorylessness
================

[wiki](http://www.wikiwand.com/en/Memorylessness)

A probability distribution is memoryless if and only if $P( X>s+t \mid X>s ) = P(X>t)$ for all $s, t \ge 0$.  
The definition applies to both discrete and continuous random variables.

For example, let $X$ be the number of trials until the first success in a sequence of independent Bernoulli trials.  
Then,  
$P(X>(20+10) \mid X>20) = P(X>10)$  
The distribution forgets the fact that $X>20$.

As another example,  
suppose X is the lifetime of a car engine given in terms of number of miles driven.   
If the engine has lasted 200,000 miles, then, based on our intuition,   
it is clear that the probability that the engine lasts another 100,000 miles  
is not the same as the engine lasting 100,000 miles from the first time it was built.  
However, memorylessness states that the two probabilities are the same.   
In essence, we 'forget' what state the car is in.   
In other words, the probabilities are not influenced by how much time has elapsed.

The only memoryless discrete probability distributions are the geometric distributions,  
which feature the number of independent Bernoulli trials needed to get one "success,"  
with a fixed probability p of "success" on each trial.  
In other words those are the distributions of waiting time in a Bernoulli process.  
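
The memoryless identity can be checked directly from the geometric tail $P(X>k) = (1-p)^k$. The sketch below is only an illustration: the function names and the value `p = 0.3` are assumptions chosen for the example, not part of the notes above.

```python
def geometric_tail(k, p):
    """P(X > k), where X counts trials up to and including the first success."""
    return (1.0 - p) ** k

def conditional_tail(s, t, p):
    """P(X > s + t | X > s), from the definition of conditional probability."""
    return geometric_tail(s + t, p) / geometric_tail(s, p)

p = 0.3          # illustrative success probability (an assumption)
s, t = 20, 10    # same numbers as the example above
print(conditional_tail(s, t, p), geometric_tail(t, p))
```

The two printed numbers agree up to floating-point rounding, whatever values of `s`, `t` and `p` are used.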

Poisson Distribution
===============

[here](/notebooks/probability-distributions/poisson.ipynb)

Erlang Distribution
=========================
[here](/notebooks/probability-distributions/erlang.ipynb)

Gamma distribution
===========================
[here](/notebooks/probability-distributions/gamma.ipynb)

# Beta Distribution

[here](/notebooks/probability-distributions/beta.ipynb)

t Distribution
================

[here](/notebooks/probability-distributions/students-t.ipynb)

Cauchy Distribution
===========================
[here](/notebooks/probability-distributions/cauchy.ipynb)

Chi Squared ( $\chi^2$ ) Distribution
==========================
[here](/notebooks/probability-distributions/chi-squared.ipynb)

*******************************************

Pearson's chi-squared test
================

* Applicable to categorical data
* Asks how likely it is that the observed difference arose by chance
* Tests the null hypothesis
* Events considered must be mutually exclusive and have total probability of 1
* Used to assess two types of comparison
  * [Goodness of fit](https://www.wikiwand.com/en/Goodness_of_fit): whether or not the observed frequency distribution differs from a theoretical / expected distribution
  * Test of independence: whether unpaired observations of two variables, expressed in a contingency table, are independent of each other. For example, whether the polling responses of people of different nationalities are related to their nationality
 

Procedure
--------
1. Calculate the chi-squared test statistic, $\chi^2$, which resembles a normalized sum of squared deviations between observed and theoretical frequencies (see below).
2. Determine the [degrees of freedom](https://www.wikiwand.com/en/Degrees_of_freedom_(statistics)), $df$, of that statistic, which is essentially the number of categories reduced by the number of parameters of the fitted distribution.
3. Select a significance level $\alpha$ (the threshold applied to the [p-value](https://www.wikiwand.com/en/P-value), [explanation](http://stats.stackexchange.com/questions/31/what-is-the-meaning-of-p-values-and-t-values-in-statistical-tests)) for the result of the test.
4. Compare $\chi^2$ to the critical value from the chi-squared distribution with $df$ degrees of freedom at the selected significance level (one-sided, since the test only asks whether the test value is greater than the critical value). In many cases the chi-squared distribution gives a good approximation of the distribution of the statistic.
5. Reject or fail to reject the null hypothesis that the observed frequency distribution is the same as the theoretical distribution, based on whether the test statistic exceeds the critical value of $\chi^2$. If the test statistic exceeds the critical value, the null hypothesis ($H_0$ = there is no difference between the distributions) can be rejected at the selected significance level in favour of the alternative hypothesis ($H_a$ = there is a difference between the distributions). Otherwise the data do not give enough evidence to reject $H_0$.

Example
----------

Given Data ([Khan Academy](https://www.khanacademy.org/math/probability/statistics-inferential/chi-square/v/contingency-table-chi-square-test))

|  	        | Herb 1 	| Herb 2 	| Placebo 	|
|---	    |---	    |---	    |---	    |
| Sick 	    |  	20      | 30        | 30        |
| Not Sick 	|  100      | 110       | 90        |

Null Hypothesis $H_0$: No difference between herbs and Placebo.  
$H_a$: Herbs show some effect

Compute the totals

|  	        | Herb 1 	| Herb 2 	| Placebo 	| *Total* 	|
|---	    |---	    |---	    |---	    |--- 	 	|
| Sick 	    | 20 	 	| 30        | 30        | <span class="mark">*80*</span> 	 	|
| Not Sick 	| 100 	 	| 110       | 90        | <span class="mark">*300*</span>	 	|
| *Total*	| <span class="mark">*120*</span> 	| <span class="mark">*140*</span> 	| <span class="mark">*120*</span> 	| <span class="mark">*380*</span>	 	|

Find the percentages

|  	        | Herb 1 	| Herb 2 	| Placebo 	| *Total* 				|
|---	    |---	    |---	    |---	    |--- 	 				|
| Sick		| 20 	 	| 30        | 30        | *80*  <br> <span class="mark">(*21%*)</span>	|
| Not Sick 	| 100 	 	| 110       | 90        | *300* <br> <span class="mark">(*79%*)</span>	|
| *Total*	| *120* 	| *140* 	| *120* 	| *380*	 				|

Find expected values

|  	        | Herb 1			| Herb 2 				| Placebo 			| *Total* 				|
|---	    |---	    		|---					|---				|--- 	 				|
| Sick		| 20 <br> <span class="mark">(*25.3*)</span>	| 30 <br> <span class="mark">(*29.4*)</span>		| 30 <br> <span class="mark">(*25.3*)</span>	| *80*  <br> (*21%*)	|
| Not Sick 	| 100 <br> <span class="mark">(*94.7*)</span>	| 110 <br> <span class="mark">(*110.6*)</span>	| 90 <br> <span class="mark">(*94.7*)</span>	| *300* <br> (*79%*)	|
| *Total*	| *120* 			| *140* 				| *120* 			| *380*	 				|

Question: why take 21% of 120, 140 and 120?  
Because, if there is no difference between the herbs and the placebo, then the percentage of sick people in "Herb 1", "Herb 2" and "Placebo" should be equal.

Compute the statistic:  
$$ \sum_{i} (O_i-E_i)^2/E_i \\
= (20-25.3)^2/25.3 + (30-29.4)^2/29.4 + (30-25.3)^2/25.3 + \\
(100-94.7)^2/94.7 + (110-110.6)^2/110.6 + (90-94.7)^2/94.7 \\
 = 2.53
$$

Now, say the significance level is 0.10 (10%).  
The d.f. is $(r-1)(c-1)$ = $1 \times 2$ = $2$  
From the tables, we find the critical value for $0.10$ is $\chi_2^2(0.10) = 4.60$. This value can be thought of as the maximum allowed value of the statistic, i.e. of the deviation.   
That is, under $H_0$ the probability of the deviation taking a value greater than 4.60 is 10%,  
that is, $P(X>4.60) = 0.10$  
Since the computed value $2.53 < 4.60$, we fail to reject $H_0$: the data show no significant difference between the placebo and the herbs.
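
The arithmetic of this example can be reproduced in a few lines. This is a minimal sketch using only the standard library; the helper name `chi_squared_statistic` is made up for illustration. It relies on one special fact: for exactly 2 degrees of freedom the chi-squared tail probability is $P(X > x) = e^{-x/2}$, so the critical value and p-value have closed forms. For other degrees of freedom a table or a statistics library would be needed.

```python
import math

observed = [[20, 30, 30],     # sick
            [100, 110, 90]]   # not sick

def chi_squared_statistic(table):
    """Pearson statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = float(sum(row_totals))
    stat = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # expected count if row and column were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (obs - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof

stat, dof = chi_squared_statistic(observed)
alpha = 0.10
critical = -2.0 * math.log(alpha)   # closed form, valid only for dof == 2
p_value = math.exp(-stat / 2.0)     # closed form, valid only for dof == 2
print(stat, dof, critical, p_value)
```

The expected counts here are computed from exact proportions rather than the rounded 21% / 79%, so the statistic can differ from the hand calculation in the later decimal places.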

  
Caveats
-------
1. Breaks down if the expected frequencies are too low.
2. Acceptable as long as no more than 20% of the events have expected frequencies less than 5.
3. If there is one d.f., then approximation is not reliable if the expected frequencies are below 10.
4. In such cases, use [Yates' correction for continuity](https://www.wikiwand.com/en/Yates%27s_correction_for_continuity): reduce the abs. value of the difference by 0.5 before squaring

**************************

Bivariate distributions
=================

Example

$$
f(x, y) = 
\begin{cases}
x+y & \text{if } 0 \le x,y \le 1\\
0 & \text{otherwise}
\end{cases}
$$

Cumulative distribution:  
$$
\begin{align}
F(x_1, y_1) & = \int_0^{y_1} \int_0^{x_1} (x+y) \, dx \, dy\\
     & = \int_0^{y_1} \left\{ \int_0^{x_1} x \, dx\right\} dy + 
         \int_0^{y_1} \left\{ \int_0^{x_1} y \, dx\right\} dy \\
     & = \int_0^{y_1} \frac{x_1^2}{2} \, dy + 
         \int_0^{y_1} x_1 y \, dy \\
     & = \frac{x_1^2 y_1 + x_1 y_1^2}{2} \\ 
     & = \frac{x_1 y_1 (x_1 + y_1) }{2}
\end{align}
$$
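
As a sanity check, the closed form can be compared with a numerical double integral of $x+y$. The sketch below uses a midpoint rule, which is exact for a linear integrand up to rounding; the function names are illustrative only.

```python
def closed_form_cdf(x1, y1):
    """x1*y1*(x1 + y1)/2, the result of the derivation above."""
    return x1 * y1 * (x1 + y1) / 2.0

def riemann_cdf(x1, y1, n=200):
    """Midpoint-rule approximation of the integral of (x + y) over [0, x1] x [0, y1]."""
    dx, dy = x1 / float(n), y1 / float(n)
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        for j in range(n):
            y = (j + 0.5) * dy
            total += (x + y) * dx * dy
    return total

print(closed_form_cdf(0.5, 0.8), riemann_cdf(0.5, 0.8))
print(closed_form_cdf(1.0, 1.0))   # total probability over the unit square
```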

In [ ]:
for x1,x2 in enumerate([1,3,5]):
    print(x1,x2)

In [ ]:
# cumulative distribution
def compute_bivariate_cdf(x, y):
    return x*y*(x+y)/2.

in_bivariate_cdf = 100
xy = np.linspace(0, 1, in_bivariate_cdf)
mat1 = np.zeros((in_bivariate_cdf,in_bivariate_cdf))
for ix, xx in enumerate(xy):
    for iy, yy in enumerate(xy):
        mat1[ix, iy] = compute_bivariate_cdf(xx, yy)
        

plt.matshow(mat1)
plt.gca().invert_yaxis()
plt.colorbar()
plt.title('CDF of Bivariate')
plt.show()

********************************

Independence of Random Variables
========================

Two RVs $X$ and $Y$ are independent if, for all sets $A, B$,  
$$\mathbb{P}(X \in A, Y \in B) = \mathbb{P}(X \in A) \mathbb{P}(Y \in B)$$  
This is written as $X \amalg Y$ or $X \perp Y$.  

Checking the above for all subsets $A, B$ is impractical, but there are simpler criteria.

**Method 1:**  
If $X, Y$ have joint PDF $f_{X,Y}$, then $X \amalg Y$ iff $f_{X,Y}(x,y) = f_X(x) f_Y(y)$ for all values $x, y$. This holds for both discrete and continuous random variables.

**Method 2:**  
Suppose that the range of $X$ and $Y$ is a (possibly infinite) rectangle. If $f(x,y) = g(x) ~ h(y)$ for some functions $g, h$, which need not be PDFs themselves, then $X \amalg Y$.
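
To see Method 1 in action, it can be applied to the bivariate density $f(x,y) = x + y$ on the unit square from the earlier example. The marginals are $f_X(x) = x + \tfrac{1}{2}$ and $f_Y(y) = y + \tfrac{1}{2}$, and their product is not $x + y$, so $X$ and $Y$ are not independent. The sketch below checks this numerically; the helper names and the test point are assumptions made for illustration.

```python
def joint_pdf(x, y):
    """f(x, y) = x + y on the unit square, 0 elsewhere."""
    return x + y if (0 <= x <= 1 and 0 <= y <= 1) else 0.0

def marginal_x(x, n=1000):
    """f_X(x): integrate the joint density over y with the midpoint rule."""
    dy = 1.0 / n
    return sum(joint_pdf(x, (j + 0.5) * dy) for j in range(n)) * dy

def marginal_y(y, n=1000):
    """f_Y(y): integrate the joint density over x with the midpoint rule."""
    dx = 1.0 / n
    return sum(joint_pdf((i + 0.5) * dx, y) for i in range(n)) * dx

x, y = 0.2, 0.2   # an arbitrary test point
print(joint_pdf(x, y), marginal_x(x) * marginal_y(y))
```

A single point where the joint density and the product of marginals disagree is enough to rule out independence.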


More than two events
-------------------
A finite set of events $\{A_i\}$ is *pairwise independent* if and only if every pair of events is independent. That is, $\forall m \ne k$,  
$$\mathbb{P}(A_m \cap A_k) = \mathbb{P}(A_m) ~ \mathbb{P}(A_k)$$

A finite set of $n$ events is *mutually independent* if and only if every event is independent of any intersection of the other events. That is, for every $k \le n$ and every $k$-element subset $\{B_1, \dots, B_k\}$ of the events,  
$$\mathbb{P}\left( \bigcap_{i=1}^k B_i \right) = 
\prod_{i=1}^k \mathbb{P}(B_i)$$  
* This is called the multiplication rule for independent events.  
* For more than 2 events, mutual independence $\Rightarrow$ pairwise independence, but not the other way around.
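
A standard small example of the gap between the two notions uses two fair coin tosses, with $A$ = "first toss is heads", $B$ = "second toss is heads" and $C$ = "both tosses agree". The sketch below enumerates the four outcomes with exact fractions; this example and its names are illustrative additions, not taken from the figures that follow.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))   # four equally likely outcomes

def prob(event):
    """Exact probability of an event given as a predicate on outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

def A(w):
    return w[0] == "H"      # first toss is heads

def B(w):
    return w[1] == "H"      # second toss is heads

def C(w):
    return w[0] == w[1]     # both tosses agree

def both(e1, e2):
    return lambda w: e1(w) and e2(w)

pairwise = all(prob(both(e1, e2)) == prob(e1) * prob(e2)
               for e1, e2 in [(A, B), (A, C), (B, C)])
triple = prob(lambda w: A(w) and B(w) and C(w)) == prob(A) * prob(B) * prob(C)
print(pairwise, triple)
```

Every pair satisfies the product rule, but the triple intersection has probability $1/4$ rather than $1/8$, so the events are pairwise but not mutually independent.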

*********************************************
Pairwise independent, not mutually independent
![Alt text](images/440px-Pairwise_independent.svg.png "Pairwise but not mutually independent")

$\mathbb{P}(A) = \mathbb{P}(B) = 1/2$ and $\mathbb{P}(C) = 1/4$.  
$$
\begin{align}
\mathrm{P}(A|BC) &= \frac{ \frac{4}{40} }
                         {\frac{4}{40} + \frac{1}{40}}
                 &= \frac{4}{5}
                 & \ne \mathrm{P}(A)\\
\mathrm{P}(B|AC) &= \frac{ \frac{4}{40} }
                         {\frac{4}{40} + \frac{1}{40}}
                 & = \frac{4}{5} 
                 &\ne \mathrm{P}(B)\\
\mathrm{P}(C|AB) &= \frac{\frac{4}{40}}
                         {\frac{4}{40} + \frac{6}{40}}
                 &= \frac{2}{5} 
                 &\ne \mathrm{P}(C)
\end{align}
$$




![Alt Text](images/440px-Mutually_independent.svg.png "Mutually independent")
$\mathbb{P}(A) = \mathbb{P}(B) = 1/2$ and $\mathbb{P}(C) = 1/4$.  

$$
\begin{align}
\mathrm{P}(A|BC) &= \frac{\frac{1}{16}}
                         {\frac{1}{16} + \frac{1}{16}}
                 &= \tfrac{1}{2}
                 &= \mathrm{P}(A) \\
\mathrm{P}(B|AC) &= \frac{\frac{1}{16}}
                         {\frac{1}{16} + \frac{1}{16}}
                 &= \tfrac{1}{2}
                 &= \mathrm{P}(B)\\
\mathrm{P}(C|AB) &= \frac{\frac{1}{16}}
                         {\frac{1}{16} + \frac{3}{16}}
                 &= \tfrac{1}{4}
                 &= \mathrm{P}(C)
\end{align}
$$

*********************************************
Mutual Independence

George, Glyn, "Testing for the independence of three events," Mathematical Gazette 88, November 2004, 568 [pdf](resources/testing-for-independence-of-three-events.pdf)

![Mutual, not pairwise](images/mutual-independence-not-pairwise.png)

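$\mathbb{P}(ABC) = \mathbb{P}(A) \mathbb{P}(B) \mathbb{P}(C) = 0.04$, so the product rule holds for the triple intersection.

But no two of the 3 events are independent of each other, so the events are not mutually independent either.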

$$
\begin{align}
\mathbb{P}(AB) &= 0.10 
& \text{but } & \mathbb{P}(A) \times \mathbb{P}(B)
= 0.2 \times 0.4 = 0.08\\
\mathbb{P}(BC) &= 0.24 
& \text{but } & \mathbb{P}(B) \times \mathbb{P}(C)
= 0.4 \times 0.5 = 0.20\\
\mathbb{P}(CA) &= 0.14 
& \text{but } & \mathbb{P}(C) \times \mathbb{P}(A)
= 0.5 \times 0.2 = 0.10
\end{align}
$$

Problems
=====================

13. Transformation of RV's
----

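$Y = e^{X}$ and $X \sim N(0,1)$

Since $y = e^x$ is increasing, $F_Y(y) = \mathbb{P}(X \le \ln y) = F_X(\ln y)$ for $y > 0$. Differentiating gives the density, which picks up the Jacobian factor $1/y$:

$$
\begin{align}
f_Y(y) = f_X(\ln y) \left| \frac{d \ln y}{dy} \right| = \frac{1}{y\sqrt{2\pi}} e^{-(\ln{y})^2/2}, \quad y > 0
\end{align}
$$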
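
A quick Monte Carlo check of this transformation, using only the standard library. Because $e^x$ is increasing, $\mathbb{P}(Y \le y) = \Phi(\ln y)$, and the density of $Y$ is the derivative of that CDF, which carries a $1/y$ factor. The function names, the seed and the test point `y0 = 2.0` are assumptions made for this sketch.

```python
import math
import random

def normal_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def lognormal_cdf(y):
    """P(Y <= y) for Y = exp(X), X ~ N(0, 1); valid for y > 0."""
    return normal_cdf(math.log(y))

def lognormal_pdf(y):
    """Derivative of lognormal_cdf: normal density at ln(y) times 1/y."""
    return math.exp(-math.log(y) ** 2 / 2.0) / (y * math.sqrt(2.0 * math.pi))

random.seed(0)   # fixed seed so the run is repeatable
samples = [math.exp(random.gauss(0.0, 1.0)) for _ in range(100000)]
y0 = 2.0
empirical = sum(1 for s in samples if s <= y0) / float(len(samples))
print(empirical, lognormal_cdf(y0))
```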

In [ ]:
import math

def p_y(y):
    # density of Y = exp(X), X ~ N(0,1): normal density at ln(y) times the Jacobian 1/y
    term1 = 1./(y*math.sqrt(2*math.pi))
    term2_pwr = -(math.log(y)**2) / 2.0
    return term1 * math.exp(term2_pwr)

def normal_mine(x):
    # standard normal density
    term1 = 1./math.sqrt(2*math.pi)
    term2_pwr = -x**2/2.
    return term1 * math.exp(term2_pwr)

def find_format_hist(data, in_pts, in_bins):
    # empirical CDF at the bin centers; in_pts is the sample size used to normalise
    hist, edges = np.histogram(data, bins=in_bins)
    dist_cum = np.cumsum(hist)/(1.*in_pts)
    bin_centers = (edges[1:]+edges[:-1])/2.
    return (bin_centers, dist_cum)

in_pts = 10000
e_power = math.e**3
y = np.linspace(1./e_power, e_power, in_pts)
p_y_exp = [p_y(yy) for yy in y]
# Riemann-sum approximation of the CDF (ignores the small mass below y[0])
cdf_exp = np.cumsum(p_y_exp) * (y[1]-y[0])
plt.plot(y, cdf_exp, label='expected y')

xx = np.random.normal(loc=0.0, scale=1.0, size=in_pts)
bin_centers0, dist_cum0 = find_format_hist(xx, in_pts, 1000)
plt.plot(bin_centers0, dist_cum0, label='normal sampled')

yy = np.exp(xx)

data1 = yy[yy<=1.0]
bin_centers1, dist_cum1 = find_format_hist(data1, in_pts, 1000)
print(dist_cum1[-1])   # fraction of samples with y <= 1; should be close to 0.5
plt.plot(bin_centers1, dist_cum1, label='first half')

data2 = yy[yy>1.0]
bin_centers2, dist_cum2 = find_format_hist(data2, in_pts, 1000)
#plt.plot(bin_centers2, dist_cum2, label='second half')

#plt.legend(loc='upper left')
plt.legend(loc='lower right')
plt.grid()
plt.show()

# # hist2, edges2 = np.histogram([yyy for yyy in yy[0] if yyy>1.0], 10)
# # plt.plot((edges2[1:]+edges2[:-1])/2., hist2/(1.*in_pts), label='y actual > 1')

# # bin_size = 10
# # cdf = np.zeros((1,bin_size+1))
# # #print(cdf)
# # prev = 0.0
# # for ix, xx_hist in enumerate(range(bin_size+1)):
# #     edge1 = xx_hist/(1.*bin_size)
# #     current = len([yyy for yyy in yy[0] if yyy<=edge1])/(1.*in_pts)
# #     #print(edge1, current-prev)
# #     plt.plot(edge1, current-prev, '*', markersize=15)
# #     cdf[0,ix] = current
# #     prev = current
    
# plt.show()
# #print(cdf)


In [ ]:
plt.plot(p_y_exp)
print(sum(p_y_exp))

In [ ]:
y=1.
term1 = 1./math.sqrt(2*math.pi)
print(term1)
term2_pwr = -(math.log(y)**2) / 2.0
print(term2_pwr)
print(term1 * math.exp(term2_pwr))

print(math.sqrt(2))
print(math.pi)
print(math.log(math.e))

In [ ]:
def p_y(x):
    term1 = 1./math.sqrt(2*math.pi)
    term2_pwr = -(x**2) / 2.0
    return term1 * math.e**term2_pwr

x = np.linspace(-5,5,100)
yy = [p_y(xx) for xx in x]
plt.plot(x, yy)
plt.show()


In [ ]:
def p_y(y):
    # density of Y = exp(X), X ~ N(0,1); the 1/y factor is the Jacobian of x = ln(y)
    term1 = 1./(y*math.sqrt(2*math.pi))
    term2_pwr = -(math.log(y)**2) / 2.0
    return term1 * math.e**term2_pwr

x = np.linspace(0.1, 5, 100)
yy = [p_y(xx) for xx in x]
plt.plot(x, yy)
plt.show()
print(sum(yy))

14. Sampling from a circle
---------------------------------

In [ ]:
def compute_hist(data, in_pts, in_bins):
    hist, edges = np.histogram(data, bins=in_bins)

    dist_cum = np.matrix([sum(hist[:ix])/(1.*in_pts) for ix in range(in_bins)])
    bin_centers = (edges[1:]+edges[:-1])/2.
    
    bin_centers = bin_centers.reshape((in_bins,1))
    dist_cum = dist_cum.reshape((in_bins,1))
    
    return (bin_centers, dist_cum, hist, edges)

def plot_circle(radius=1):
    # draw the circle directly, so this cell does not depend on polar_to_cartesian,
    # which is only defined in a later cell
    thetas = np.linspace(0, 1, 100)* 2*math.pi
    plt.plot(radius*np.cos(thetas), radius*np.sin(thetas))
    

Method 1:  
Choose x uniformly from [-1,1] and choose a y in $[-\sqrt{1-x^2}, \sqrt{1-x^2}]$

In [ ]:
def validate_on_circle(x, y):
    # every (x, y) pair should satisfy x^2 + y^2 = 1 up to rounding error
    is_valid = np.all(np.abs(x**2 + y**2 - 1.) < 1e-8)
    print('x and y on circle? :: ' + ('yes' if is_valid else 'no'))

def validate_inside_circle(arr_r, radius=1.):
    # check every point, not just the first one
    is_valid = np.all(np.asarray(arr_r) <= radius)
    print('x and y_rand inside? :: ' + ('yes' if is_valid else 'no'))


In [ ]:

in_pts, radius = 5e4, 1.
x = np.random.rand(int(in_pts),1)*2.*radius - radius   # array sizes must be ints
x.sort()
y = np.sqrt(radius**2 - x**2)
validate_on_circle(x, y)

y_rand = np.asarray([np.random.rand(1)[0]*(2*yy)-yy for yy in y]).reshape(x.shape)
arr_r = np.sqrt(x**2+y_rand**2)   # distance from the centre, to compare against the r**2 curve below
validate_inside_circle(arr_r, radius)
plot_circle(radius=1)
plt.gca().set_aspect('equal', adjustable='box')
plt.plot(x, y_rand, '.', markersize=1)
plt.show()

bin_centers, dist_cum, hist, edges = compute_hist(arr_r, in_pts, 100)
print(bin_centers.shape)
print(dist_cum.shape)
plt.plot(bin_centers, dist_cum, label='actual cdf')
plt.plot(bin_centers, [ctr**2 for ctr in bin_centers], label='expected cdf')
plt.legend(loc='upper left')
plt.grid()
plt.show()

plt.plot(bin_centers, hist/in_pts, label='actual pdf')
edges_sqrd = np.square(edges)
expected_pdf = edges_sqrd[1:] - edges_sqrd[:-1]
plt.plot(bin_centers, expected_pdf, label='expected pdf')
plt.legend(loc='upper left')
plt.show()

bin_centers_x, cum_pdf, hist_x, edges_x = compute_hist(x, in_pts, in_bins=50)
plt.plot(bin_centers_x, hist_x, label='X Histogram')
bin_centers_y, cum_pdf, hist_y, edges_y = compute_hist(y_rand, in_pts, in_bins=50)
plt.plot(bin_centers_y, hist_y, label='Y Histogram')
plt.title('X-Y Histogram')
plt.legend()
plt.show()

Method 2:  
* choose two points randomly on the circumference of the circle
  * sample two values for $\theta$ from $[0, 2\pi)$
* find (x,y) for these points
* find the midpoint of the chord formed by these points  
 

In [ ]:
def check_mid_pts(mid_pts, radius=1.):
    # every chord midpoint must lie inside or on the circle
    mid_pts = np.asarray(mid_pts)
    mid_pts_valid = np.all(mid_pts[:,0]**2 + mid_pts[:,1]**2 <= radius**2)
    print('midpts valid?: ' + ('yes' if mid_pts_valid else 'no'))

def polar_to_cartesian(theta, radius=1):
    return [radius*math.cos(theta), radius*math.sin(theta)]
def midpoint(pt1, pt2):
    x1, y1 = pt1
    x2, y2 = pt2
    return [(x1+x2)/2., (y1+y2)/2.]


In [ ]:
in_pts, radius = 5e4, 1
arr_thetas = np.random.rand(int(in_pts), 2) * 2*math.pi   # array sizes must be ints
mid_pts = np.asarray([midpoint(polar_to_cartesian(thetas[0], radius), polar_to_cartesian(thetas[1], radius))
                      for thetas in arr_thetas])
check_mid_pts(mid_pts, radius)
plt.plot(mid_pts[:,0], mid_pts[:, 1], '.', markersize=1)
plot_circle(radius)
plt.gca().set_aspect('equal', adjustable='box')
plt.show()

radii = np.sqrt(mid_pts[:,0]**2 + mid_pts[:,1]**2)
bin_centers_r, cum_pdf_r, hist_r, edges_r = compute_hist(radii, in_pts, in_bins=10)
plt.plot(bin_centers_r, cum_pdf_r)
plt.show()

plt.plot(bin_centers_r, hist_r/(in_pts*1.))
plt.title('Radius Histogram')
plt.show()

bin_centers_x, cum_pdf, hist_x, edges_x = compute_hist(mid_pts[:,0], in_pts, in_bins=50)
plt.plot(bin_centers_x, hist_x, label='X Histogram')
bin_centers_y, cum_pdf, hist_y, edges_y = compute_hist(mid_pts[:,1], in_pts, in_bins=50)
plt.plot(bin_centers_y, hist_y, label='Y Histogram')
plt.title('X-Y Histogram')
plt.legend()
plt.show()

Transformation of multiple RV's
---------------

$X,Y \sim U(0,1)$

Z=X+Y  
* compute the cdf as follows
* find $X+Y=z$ over multiple values of z
* $Z \in [0,2)$
* four ranges
  * $Z < 0$, $Z \in [0, 1)$, $Z \in [1, 2)$, $Z \ge 2$
  
* $\mathbb{P}(Z \lt z)$ for $z \in [0,1)$ = Area(blue triangle) = $\frac{1}{2} z^2$
* $\mathbb{P}(Z \lt z)$ for $z \in [1,2)$ = $1 - \mathbb{P}(Z \gt z)$ = 1 - Area(red triangle)
= $1 - \frac{1}{2} (2-z)^2$

In [ ]:
import matplotlib.lines as pltline
import matplotlib.patches as pltpatch

def show_poly(ax, vertices, color='b', label=''):
    ax.add_patch(pltpatch.Polygon(vertices,
                                       closed=True,
                                       fill=True,
                                       color=color, label=label
                                      )
                     )

def show_x_plus_y(z1=0.2, z2=1.4):
    fig = plt.figure()
    arrowprops = dict(facecolor='black', width=1)
    
    ax1 = fig.add_subplot(111)
    case_1_poly = [[0, 0], [0, z1], [z1, 0]]
    show_poly(ax1, case_1_poly, color='b', label='0 <= z < 1')
    ax1.annotate('P(Z<='+str(z1)+')', xy=(0.0, 0.0), xytext=(-0.3, 0.2), arrowprops=arrowprops)
    
    case_2_poly = [[1, 1], [1, z2-1], [z2-1, 1]]
    show_poly(ax1, case_2_poly, color='r', label='1 <= z < 2')
    ax1.annotate('P(Z>='+str(z2)+')', xy=(1.0, 1.0), xytext=(1.1, 1.1), arrowprops=arrowprops)

    plt.xlim(0, 1)
    plt.ylim(0, 1)
    plt.legend(loc='upper left')
    plt.grid()
    plt.show()
    

interact(show_x_plus_y, z1=(0,1,0.1), z2=(1,2,0.1))

$$
F(z) = \mathbb{P}(Z \le z) =
\begin{cases}
0                       & z \lt 0\\
\frac{1}{2} z^2         & z \in [0, 1)\\
1 - \frac{1}{2} (2-z)^2 & z \in [1, 2)\\
1                       & z \ge 2
\end{cases}
$$

$$
f(z) = \frac{dF(z)}{dz} =
\begin{cases}
z     & z \in [0, 1)\\
(2-z) & z \in [1, 2)\\
0     & \text{elsewhere}
\end{cases}
$$
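
The piecewise CDF for $Z = X + Y$ can be checked by simulation. This is a minimal sketch using the standard library; the function names, the seed and the sample size are arbitrary choices for illustration.

```python
import random

def cdf_sum(z):
    """Closed-form CDF of Z = X + Y for independent X, Y ~ U(0, 1)."""
    if z < 0:
        return 0.0
    if z < 1:
        return 0.5 * z ** 2
    if z < 2:
        return 1.0 - 0.5 * (2.0 - z) ** 2
    return 1.0

def empirical_cdf(samples, z):
    """Fraction of samples that are <= z."""
    return sum(1 for s in samples if s <= z) / float(len(samples))

random.seed(1)   # fixed seed so the run is repeatable
samples = [random.random() + random.random() for _ in range(100000)]
for z in (0.5, 1.0, 1.5):
    print(z, empirical_cdf(samples, z), cdf_sum(z))
```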

In [ ]:
x = np.linspace(0, 1, 100)
y = [xx for xx in x]
plt.plot(x, y, label='0 <= z < 1')

x = np.linspace(1, 2, 100)
y = [2-xx for xx in x]
plt.plot(x, y, label='1 <= z < 2')
plt.legend()
plt.grid()
plt.title('Z = X+Y')
plt.xlabel('z')
plt.ylabel('f(z)')
plt.show()

*******************************

$X,Y \sim U(0,1)$

Z=X-Y  
* compute the cdf as follows
* find $X-Y=z$ over multiple values of z
* $Z \in (-1,1)$
* four ranges
  * $Z < -1$, $Z \in (-1, 0]$, $Z \in [0, 1)$, $Z \ge 1$
  
* $\mathbb{P}(Z \lt z)$ for $z \in (-1,0]$ = Area(blue triangle) = $\frac{1}{2} (1+z)^2 $
* $\mathbb{P}(Z \lt z)$ for $z \in [0,1)$ = $1 - \mathbb{P}(Z \gt z)$ = 1 - Area(red triangle)
= $1 - \frac{1}{2} (1-z)^2$

In [ ]:
import matplotlib.lines as pltline
import matplotlib.patches as pltpatch

def show_poly(ax, vertices, color='b', label=''):
    ax.add_patch(pltpatch.Polygon(vertices,
                                  closed=True, fill=True,
                                  color=color, label=label))

def show_x_minus_y(z1=-0.6, z2=0.4):
    fig = plt.figure()
    arrowprops = dict(facecolor='black', width=1)
    
    ax1 = fig.add_subplot(111)
    case_1_poly = [[0, 1], [0, -z1], [1+z1, 1]]
    show_poly(ax1, case_1_poly, color='b', label='-1 <= z < '+str(z1))
    ax1.annotate('P(Z<='+str(z1)+')', xy=(0.0, 0.9), xytext=(-0.3, 0.8), arrowprops=arrowprops)
    
    case_2_poly = [[1, 0], [z2, 0], [1, 1-z2]]
    show_poly(ax1, case_2_poly, color='r', label=str(z2) + ' <= z < 1')
    ax1.annotate('P(Z>='+str(z2)+')', xy=(1.0, 0.0), xytext=(1.1, 0.1), arrowprops=arrowprops)

    plt.xlim(0, 1)
    plt.ylim(0, 1)
    plt.legend(loc='lower left')
    plt.grid()
    plt.show()
    

interact(show_x_minus_y, z1=(-1.0,0,0.1), z2=(0,1,0.1))

$$
F(z) = \mathbb{P}(Z \le z) =
\begin{cases}
0                       & z \lt -1\\
\frac{1}{2} (1+z)^2         & z \in (-1, 0]\\
1 - \frac{1}{2} (1-z)^2 & z \in [0, 1)\\
1                       & z \ge 1
\end{cases}
$$

$$
f(z) = \frac{dF(z)}{dz} =
\begin{cases}
1+z  & z \in (-1, 0]\\
1-z  & z \in [0, 1)\\
0    & \text{elsewhere}
\end{cases}
$$
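
For $Z = X - Y$ the CDF is an area inside the unit square, so it can also be checked without random numbers by counting grid points. The sketch below counts the cell midpoints of an $n \times n$ grid that satisfy $x - y \le z$; the grid size and the function names are assumptions, and the result is only accurate to roughly $1/n$.

```python
def cdf_diff(z):
    """Closed-form CDF of Z = X - Y for independent X, Y ~ U(0, 1)."""
    if z < -1:
        return 0.0
    if z < 0:
        return 0.5 * (1.0 + z) ** 2
    if z < 1:
        return 1.0 - 0.5 * (1.0 - z) ** 2
    return 1.0

def grid_cdf(z, n=400):
    """Fraction of cell midpoints in the unit square with x - y <= z."""
    h = 1.0 / n
    count = 0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            if x - y <= z:
                count += 1
    return count / float(n * n)

for z in (-0.5, 0.0, 0.4):
    print(z, grid_cdf(z), cdf_diff(z))
```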

In [ ]:
x = np.linspace(-1, 0, 100)
y = [1+xx for xx in x]
plt.plot(x, y, label='-1 < z <= 0')

x = np.linspace(0, 1, 100)
y = [1-xx for xx in x]
plt.plot(x, y, label='0 <= z < 1')
plt.legend()
plt.grid()
plt.title('Z = X-Y')
plt.xlabel('z')
plt.ylabel('f(z)')
plt.show()

*******************************

$X,Y \sim U(0,1)$

Z=X/Y  
* compute the cdf as follows
* find the region $X/Y \le z$ over multiple values of z
* $Z \in (0,\infty)$
* two ranges
  * $0 < Z < 1 $, $Z \ge 1$
  
* For $z \in (0,1)$, the event $Z \lt z$ is the (blue) triangle with vertices $(0,0)$, $(0,1)$, $(z,1)$
* For $z \ge 1$, $\mathbb{P}(Z \lt z) = 1 - \mathbb{P}(Z \gt z)$, where the event $Z \gt z$ is the (red) triangle with vertices $(0,0)$, $(1,0)$, $(1,1/z)$


In [ ]:
in_pts = 1000
xy = np.linspace(0, 1, in_pts)
mat1 = np.zeros((in_pts, in_pts))
for ix, xx in enumerate(xy):
    if xx < 1e-3:
        continue
    for iy, yy in enumerate(xy):
        if yy < 1e-3:
            continue
        result = xx/yy
        if result > 4:
            #mat1[iy, ix] = math.log(xx/yy)
            mat1[iy, ix] = 1
        if result < 0.25:
            mat1[iy, ix] = -1

plt.matshow(mat1)
plt.gca().invert_yaxis()
plt.colorbar()
plt.title('Z = X/Y')

In [ ]:
import matplotlib.lines as pltline
import matplotlib.patches as pltpatch

def show_poly(ax, vertices, color='b', label=''):
    ax.add_patch(pltpatch.Polygon(vertices,
                                  closed=True, fill=True,
                                  color=color, label=label))

def show_x_div_y(z1=0.5, z2=2):
    fig = plt.figure()
    arrowprops = dict(facecolor='black', width=1)
    
    ax1 = fig.add_subplot(111)
    case_1_poly = [[0, 1], [0, 0], [z1, 1]]
    show_poly(ax1, case_1_poly, color='b', label='z < '+str(z1))
    ax1.annotate('P(Z<='+str(z1)+')', xy=(0.0, 0.9), xytext=(-0.3, 0.8), arrowprops=arrowprops)
    
    case_2_poly = [[1, 0], [0, 0], [1, 1./z2]]
    show_poly(ax1, case_2_poly, color='r', label='z >' + str(z2))
    ax1.annotate('P(Z>='+str(z2)+')', xy=(1.0, 0.0), xytext=(1.1, 0.1), arrowprops=arrowprops)

    plt.xlim(0, 1)
    plt.ylim(0, 1)
    plt.legend(loc='lower left')
    plt.grid()
    plt.show()
    

interact(show_x_div_y, z1=(0.0001,1,0.2), z2=(1,10,0.1))

* $\mathbb{P}(Z \lt z)$ for $0 \lt z \lt 1$ = Area(blue triangle) = $\frac{z}{2} $
* $\mathbb{P}(Z \lt z)$ for $z \ge 1$ = $1 - \mathbb{P}(Z \gt z)$ = 1 - Area(red triangle)
= $1 - \frac{1}{2z}$

$$
F(z) = \mathbb{P}(Z \le z) =
\begin{cases}
0                 & z \le 0 \\
\frac{z}{2}       & 0 < z < 1 \\
1 - \frac{1}{2z}  & z \ge 1
\end{cases}
$$

$$
f(z) = \frac{dF(z)}{dz} =
\begin{cases}
\frac{1}{2}     & 0 < z < 1 \\
\frac{1}{2z^2}  & z \ge 1 \\
0               & \text{elsewhere}
\end{cases}
$$
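
A simulation check of the CDF of the ratio, $F(z) = z/2$ below 1 and $1 - \frac{1}{2z}$ from 1 onwards. This is a sketch under stated assumptions: the names, the seed and the sample size are arbitrary, and the denominator is drawn as `1 - random.random()` so that it lies in $(0, 1]$ and never equals zero.

```python
import random

def cdf_ratio(z):
    """Closed-form CDF of Z = X / Y for independent X, Y ~ U(0, 1)."""
    if z <= 0:
        return 0.0
    if z < 1:
        return z / 2.0
    return 1.0 - 1.0 / (2.0 * z)

def empirical_cdf(samples, z):
    """Fraction of samples that are <= z."""
    return sum(1 for s in samples if s <= z) / float(len(samples))

random.seed(2)   # fixed seed so the run is repeatable
# 1 - random.random() is uniform on (0, 1], which avoids division by zero
ratios = [random.random() / (1.0 - random.random()) for _ in range(100000)]
for z in (0.5, 2.0, 4.0):
    print(z, empirical_cdf(ratios, z), cdf_ratio(z))
```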

*********************************
$X_1, \cdots, X_n \sim Exp(\beta)$ independently, with $\beta$ taken as the rate parameter  
$Y = \max(X_1, \cdots, X_n)$  
$Y \le y \iff X_i \le y ~ \forall i$  
$$
\begin{align}
F(y) &= \mathbb{P}(Y \le y)\\
     &= \prod_{i=1}^n \mathbb{P}(X_i \le y)\\
     &= (1 - e^{-\beta y})^n
\end{align}
$$

$f(y) = n (1 - e^{-\beta y})^{n-1} (-e^{-\beta y}) (-\beta) = n \beta e^{-\beta y} (1 - e^{-\beta y})^{n-1}$
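
The CDF of the maximum can be checked by simulation. In this sketch `random.expovariate` takes the rate as its argument, which matches the use of $\beta$ as a rate in $1 - e^{-\beta y}$; the values `n = 5`, `beta = 2.0`, the seed and the number of trials are illustrative assumptions.

```python
import math
import random

def cdf_max_exp(y, n, beta):
    """P(max(X_1..X_n) <= y) for independent X_i ~ Exp(rate=beta)."""
    if y <= 0:
        return 0.0
    return (1.0 - math.exp(-beta * y)) ** n

def empirical_cdf(samples, y):
    """Fraction of samples that are <= y."""
    return sum(1 for s in samples if s <= y) / float(len(samples))

random.seed(3)        # fixed seed so the run is repeatable
n, beta = 5, 2.0      # illustrative values (assumptions)
maxima = [max(random.expovariate(beta) for _ in range(n))
          for _ in range(50000)]
for y in (0.5, 1.0, 2.0):
    print(y, empirical_cdf(maxima, y), cdf_max_exp(y, n, beta))
```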


