In [2]:
import numpy as np
# ! pip3 install scipy
from scipy.stats import norm

In [3]:
mean = 0
std_dev = 1

data = np.random.normal(mean, std_dev, 1000)

print(data[:10])

[-1.78676866  1.33960941  0.86356562  1.61161539  1.86484455  0.90516332
  1.21514956 -0.96340817 -1.06097487  2.59309748]


### Law of Large Numbers

The **law of large numbers** is a fundamental theorem in probability theory. It states that if an experiment is repeated independently many times, the average of the observed outcomes approaches the expected value of a single outcome.

Let:

$$
\overline{X}_n = \frac{X_1 + X_2 + \dots + X_n}{n}
$$

be the sample mean based on observations $ X_1, X_2, \dots, X_n $ drawn from a distribution with expected value $ \mu $.

---

### Weak Law of Large Numbers

If $ X_1, X_2, \dots $ are independent and identically distributed (i.i.d.) random variables with a finite expectation $ \mu $, then for every $ \epsilon > 0 $ the sample mean $ \overline{X}_n $ satisfies:

$$
\lim_{n \to \infty} P(|\overline{X}_n - \mu| > \epsilon) = 0
$$

This means that, for any fixed $ \epsilon > 0 $, the probability that the sample mean deviates from the expected value by more than $ \epsilon $ tends to zero as $ n $ grows. This mode of convergence is called convergence in probability.
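The limit above can be checked numerically. The following is a minimal sketch, assuming standard normal data; the helper name `deviation_probability`, the tolerance `eps = 0.1`, the seed, and the number of repetitions are illustrative choices, not part of the original notebook. For each sample size it repeats the experiment many times and reports the fraction of repetitions in which the sample mean misses $ \mu $ by more than `eps`.

```python
import numpy as np

def deviation_probability(n, eps, trials, rng, mu=0.0, sigma=1.0):
    """Monte Carlo estimate of P(|sample mean - mu| > eps) for samples of size n."""
    # Each row is one independent sample of size n.
    means = rng.normal(mu, sigma, size=(trials, n)).mean(axis=1)
    return np.mean(np.abs(means - mu) > eps)

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
for n in (10, 100, 1_000, 2_000):
    p_hat = deviation_probability(n, eps=0.1, trials=1_000, rng=rng)
    print(f"n = {n:>5}: estimated P(|mean - mu| > 0.1) = {p_hat:.3f}")
```

The estimated probability should shrink toward zero as `n` increases, which is what the weak law predicts.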

---

### Strong Law of Large Numbers (SLLN)

The **strong law** strengthens this result by stating that:

$$
\lim_{n \to \infty} \overline{X}_n = \mu \quad \text{almost surely}
$$

Equivalently,

$$
P\left( \lim_{n \to \infty} \overline{X}_n = \mu \right) = 1
$$

In other words, with probability one the sequence of sample means converges to the true mean as the number of trials grows (in the long run).
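Whereas the weak law is a statement about probabilities at each fixed $ n $, the strong law is a statement about a single infinite sequence of observations. A rough illustration, assuming standard normal draws and an arbitrary seed, is to follow the running mean along one long sample path:

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only
mu, sigma = 0.0, 1.0

# One long i.i.d. sample; running_mean[n - 1] is the mean of the first n draws.
samples = rng.normal(mu, sigma, 100_000)
running_mean = np.cumsum(samples) / np.arange(1, samples.size + 1)

for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"n = {n:>6}: sample mean = {running_mean[n - 1]: .4f}")
```

A finite simulation cannot prove almost sure convergence, but the printed values should settle near $ \mu = 0 $ as $ n $ grows.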


### Empirical Distribution Function (EDF)

The **Empirical Distribution Function (EDF)** is a statistical tool used to estimate the cumulative distribution function (CDF) of a given sample. It provides a non-parametric estimate of the distribution from which the sample was drawn.

The EDF is defined as:

$$
F_n(x) = \frac{\text{number of elements in the dataset } \leq x}{n}
$$

where $ n $ is the total number of observations.

---

There is a direct relationship between a histogram and the empirical distribution function. When the histogram is normalized so that its total area is 1, the area of a single bin equals the relative frequency of the observations in that bin, which is exactly the increase in $ F_n $ across that bin.
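This relationship can be verified directly. The sketch below assumes a standard normal sample and 20 equal-width bins; the helper `edf` and the padding of the bin range are illustrative choices. It compares the bin areas of a density-normalized histogram with the increase of $ F_n $ over each bin.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only
sample = rng.normal(0, 1, 1000)

def edf(sample, x):
    """Proportion of sample points less than or equal to x."""
    return np.mean(sample <= x)

# Pad the bin range so that no observation sits exactly on an outer edge.
edges = np.linspace(sample.min() - 1, sample.max() + 1, 21)

# density=True rescales the bar heights so that the total area equals 1.
heights, _ = np.histogram(sample, bins=edges, density=True)
areas = heights * np.diff(edges)

# Increase of the EDF across each bin.
edf_at_edges = np.array([edf(sample, e) for e in edges])
edf_increase = np.diff(edf_at_edges)

print(np.allclose(areas, edf_increase))  # the two arrays agree up to rounding
```

The padding matters because `np.histogram` treats bins as half-open on the right, while $ F_n $ counts points up to and including $ x $; the two conventions differ only for observations that fall exactly on a bin edge.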


More formally, given a sample of $ n $ independent and identically distributed (i.i.d.) random variables $ X_1, X_2, \dots, X_n $, the **empirical distribution function** $ F_n(x) $ can be written as:

$$
F_n(x) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}\{X_i \leq x\}
$$

where $ \mathbf{1}\{X_i \leq x\} $ is the indicator function that equals 1 if $ X_i \leq x $ and 0 otherwise. This is the same quantity as the counting definition above: $ F_n(x) $ is the proportion of sample points less than or equal to $ x $.

---

### Properties of the Empirical Distribution Function

- **Step Function**: The EDF is a nondecreasing step function that jumps by $ \frac{1}{n} $ at each sample point (by $ \frac{k}{n} $ at a value that occurs $ k $ times in the sample).
- **Convergence**: As $ n \to \infty $, $ F_n $ converges to the true cumulative distribution function $ F $ uniformly in $ x $, with probability one (the Glivenko-Cantelli theorem).
- **Approximation**: For large $ n $, the empirical distribution function is, with high probability, close to the true distribution function at any point $ a $:

  $$
  F_n(a) \approx F(a)
  $$

- **Law of Large Numbers**: For each fixed $ x $, $ F_n(x) $ is the sample mean of the i.i.d. indicator variables of the events $ X_i \leq x $, each with expected value $ F(x) $. By the law of large numbers, $ F_n(x) $ therefore approaches $ F(x) $ as more data is collected.
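The uniform convergence property can be illustrated by measuring the largest gap between $ F_n $ and $ F $. The sketch below assumes the true distribution is standard normal; the function name `sup_distance` and the seed are illustrative. Because $ F_n $ is a step function, the supremum is attained at a sample point, either just before or just after a jump.

```python
import numpy as np
from scipy.stats import norm

def sup_distance(sample):
    """sup over x of |F_n(x) - F(x)|, with F the standard normal CDF."""
    x = np.sort(sample)
    n = x.size
    cdf = norm.cdf(x)
    # At the i-th sorted point, F_n equals i/n from that point on
    # and (i - 1)/n just before it.
    gap_after_jump = np.arange(1, n + 1) / n - cdf
    gap_before_jump = cdf - np.arange(0, n) / n
    return max(gap_after_jump.max(), gap_before_jump.max())

rng = np.random.default_rng(2)  # arbitrary seed, for reproducibility only
for n in (100, 1_000, 10_000, 100_000):
    d = sup_distance(rng.normal(0, 1, n))
    print(f"n = {n:>6}: sup |F_n - F| = {d:.4f}")
```

The printed distances should decrease as the sample size grows, roughly at the rate $ 1/\sqrt{n} $.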


In [4]:
def empirical_distribution_function(data, value):
    """Return F_n(value): the fraction of points in `data` that are <= `value`."""
    data = np.asarray(data)

    # Count the data points less than or equal to the given value;
    # no sorting is needed for a single evaluation.
    count = np.sum(data <= value)

    # Divide by the sample size to get the EDF
    return count / data.size


data = np.random.normal(0, 1, 1000)  
value = 0  # Example value
edf_value = empirical_distribution_function(data, value)
print(f"Empirical distribution function value for {value}: {edf_value}")

Empirical distribution function value for 0: 0.506


<h4>Independent and Identically Distributed Random Variables</h4>
Identically distributed: Every $X_i$ follows the same underlying distribution.

Independent: The value of one observation carries no information about the value of any other.

<h4>The Inverse Cumulative Distribution Function (Inverse CDF)</h4>

The inverse cumulative distribution function (inverse CDF), also called the quantile function, "reverses" the effect of the cumulative distribution function. Given a probability $ p $, it returns the value of the random variable such that the probability of the random variable being less than or equal to that value is $ p $.

Mathematically, for a random variable $ X $ whose cumulative distribution function $ F(x) $ is continuous and strictly increasing, the inverse CDF, denoted as $ F^{-1}(p) $ for $ 0 < p < 1 $, satisfies the condition:

$$
F^{-1}(p) = x \quad \text{if} \quad F(x) = p
$$

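When $ F $ has flat stretches or jumps, as the empirical distribution function does, an exact solution of $ F(x) = p $ may not exist or may not be unique. In that case the inverse is defined as the smallest $ x $ for which $ F(x) \geq p $.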
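For a step function such as the empirical distribution function $ F_n $, the equation $ F_n(x) = p $ usually has no unique solution, so a common convention is to take the smallest $ x $ with $ F_n(x) \geq p $. The sketch below applies that convention to a sample; the function name `empirical_quantile` and the toy data are hypothetical and only meant to illustrate the idea.

```python
import numpy as np

def empirical_quantile(sample, p):
    """Smallest x in the sample with F_n(x) >= p, for 0 < p <= 1."""
    x = np.sort(sample)
    # F_n at the k-th smallest point is at least k/n, so the first
    # index with k/n >= p is k = ceil(n * p).
    k = int(np.ceil(p * x.size))
    return x[max(k, 1) - 1]

sample = np.array([3.0, 1.0, 4.0, 1.5, 9.0])
print(empirical_quantile(sample, 0.5))  # 3.0, since F_n(1.5) = 0.4 and F_n(3.0) = 0.6
```

Because `p * x.size` is computed in floating point, values of `p` that land exactly on a multiple of $ 1/n $ can be sensitive to rounding; a production implementation would need to handle that case deliberately.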


In [5]:
def normal_cdf(x, mu, sigma):
    # x: the value(s) at which to evaluate the CDF of a normal
    # distribution with mean mu and standard deviation sigma.
    return norm.cdf(x, loc=mu, scale=sigma)


mu = 0
sigma = 1
x = 1.5
cdf_value = normal_cdf(x, mu, sigma)
print(f"CDF at {x}: {cdf_value}")
x = 0
cdf_value = normal_cdf(x, mu, sigma)
print(f"CDF at {x}: {cdf_value}")

CDF at 1.5: 0.9331927987311419
CDF at 0: 0.5


In [6]:
def inverse_normal_cdf(p, mu, sigma):
    # norm.ppf is the percent-point function (ppf), i.e. the inverse CDF
    return norm.ppf(p, loc=mu, scale=sigma)

mu = 0
sigma = 1
p = 0.5
x = inverse_normal_cdf(p, mu, sigma)
print(f"The value corresponding to the probability {p}: {x}")
p = 0.9331927987311419
x = inverse_normal_cdf(p, mu, sigma)
print(f"The value corresponding to the probability {p}: {x}")

The value corresponding to the probability 0.5: 0.0
The value corresponding to the probability 0.9331927987311419: 1.4999999999999996


<h3>Non-Parametric Estimates</h3>

In situations where we lack sufficient knowledge about the underlying phenomenon, it is often preferable not to assume a specific parametric form for the probability distribution. Instead, we treat the available dataset as a realization of a random sample of size $ n $ drawn from an unknown continuous probability distribution.

To approximate the true probability density function $ f $ and the cumulative distribution function $ F $, we use the kernel density estimate and the empirical distribution function, respectively. By examining the resulting plots of these estimates, we may gain insights into whether the underlying distribution resembles any known parametric distribution.

These two plots are more than graphical summaries of the data: the kernel density estimate is an estimator of the density function $ f $, and the empirical distribution function is an estimator of the cumulative distribution function $ F $.

Because these approaches do not assume any particular parametric model, they are referred to as non-parametric estimates.
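As a rough sketch of both estimators side by side, the code below draws a sample from a standard normal distribution, pretends the distribution is unknown, and evaluates the two estimates on a small grid. The use of `scipy.stats.gaussian_kde` with its default bandwidth, the grid, and the seed are assumptions made for illustration; the true $ f $ and $ F $ are printed only so the estimates can be compared against them.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

rng = np.random.default_rng(4)  # arbitrary seed, for reproducibility only
sample = rng.normal(0, 1, 1000)  # treated as coming from an unknown distribution

grid = np.linspace(-3, 3, 7)

# Kernel density estimate of f (Gaussian kernel, SciPy's default bandwidth rule).
kde = gaussian_kde(sample)
f_hat = kde(grid)

# Empirical distribution function: fraction of sample points <= each grid value.
F_hat = np.searchsorted(np.sort(sample), grid, side="right") / sample.size

for x, fh, Fh in zip(grid, f_hat, F_hat):
    print(f"x = {x:+.1f}  f_hat = {fh:.3f} (f = {norm.pdf(x):.3f})  "
          f"F_hat = {Fh:.3f} (F = {norm.cdf(x):.3f})")
```

Neither estimate uses the fact that the data are normal, which is what makes them non-parametric; the kernel density estimate does depend on the bandwidth, so a different bandwidth rule would give a somewhat different curve.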
