# KS Test - Kolmogorov-Smirnov

The Kolmogorov-Smirnov test (KS test) is a nonparametric statistical test for comparing distributions. It assesses whether two samples come from the same underlying distribution, or whether a single sample follows a specified theoretical distribution (e.g., a normal distribution). The test is based on the maximum distance between cumulative distribution functions (CDFs): two empirical CDFs in the two-sample case, or one empirical CDF and the reference CDF in the one-sample case.

Test Statistic: The KS statistic (D) is the maximum absolute difference between the CDFs of the two distributions:

$D = \sup_x |F_1(x) - F_2(x)|$

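where $F_1(x)$ and $F_2(x)$ are the CDFs being compared, and $\sup_x$ denotes the supremum over all $x$, i.e. the least upper bound of the differences. When both are empirical CDFs (step functions built from samples), the supremum is attained at one of the observed data points, so it equals the maximum.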
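The definition of $D$ can be checked by hand for the two-sample case. The sketch below is a minimal illustration, not SciPy's own implementation: the helper name `ks_statistic`, the seed, and the sample sizes are assumptions made for the example. It evaluates both empirical CDFs at every observed point and takes the largest absolute gap.

```python
import numpy as np
from scipy.stats import ks_2samp

def ks_statistic(sample1, sample2):
    """Two-sample KS statistic D (hypothetical helper for illustration)."""
    s1 = np.sort(np.asarray(sample1, dtype=float))
    s2 = np.sort(np.asarray(sample2, dtype=float))
    # The empirical CDFs only change at observed values, so it is
    # enough to evaluate them on the pooled data points.
    points = np.concatenate([s1, s2])
    # Fraction of each sample that is <= each point (the empirical CDF)
    ecdf1 = np.searchsorted(s1, points, side="right") / s1.size
    ecdf2 = np.searchsorted(s2, points, side="right") / s2.size
    return np.max(np.abs(ecdf1 - ecdf2))

rng = np.random.default_rng(0)   # seed chosen arbitrarily for the example
a = rng.normal(0, 1, 200)
b = rng.normal(0.5, 1, 150)

d_manual = ks_statistic(a, b)
d_scipy = ks_2samp(a, b).statistic
print(f"manual D: {d_manual:.6f}, scipy D: {d_scipy:.6f}")
```

The two values should agree up to floating-point rounding, which is a quick way to confirm that the formula and the library compute the same quantity.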

### Key Points:
#### Purpose:
- One-sample KS test: Compares a sample's empirical distribution to a reference distribution (e.g., normal, uniform).
- Two-sample KS test: Compares the empirical distributions of two samples.

### Use Cases:
- Checking if a dataset follows a specific distribution (e.g., normality test).
- Comparing two datasets to see if they come from the same population.


### Limitations:
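- Most sensitive to differences near the center of the distributions; less sensitive to differences in the tails.
- Works best with continuous distributions; less reliable for discrete distributions, where tied values occur.
- With small samples, the test has low power and may fail to detect a real difference.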
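The small-sample limitation can be illustrated with a short simulation. This is a rough sketch under assumed settings (a mean shift of 0.5, a significance level of 0.05, 200 trials, and a fixed seed); `rejection_rate` is a hypothetical helper, not part of SciPy. It estimates how often the two-sample test detects a real difference at two sample sizes.

```python
import numpy as np
from scipy.stats import ks_2samp

def rejection_rate(n, shift=0.5, alpha=0.05, trials=200, seed=0):
    """Estimate power: the share of trials in which the KS test rejects H0.

    Hypothetical helper; all defaults are assumptions for this example.
    """
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(trials):
        x = rng.normal(0.0, 1.0, n)      # reference sample
        y = rng.normal(shift, 1.0, n)    # sample with a shifted mean
        if ks_2samp(x, y).pvalue < alpha:
            rejections += 1
    return rejections / trials

power_small = rejection_rate(n=10)
power_large = rejection_rate(n=200)
print(f"n=10:  rejected in {power_small:.0%} of trials")
print(f"n=200: rejected in {power_large:.0%} of trials")
```

Although the underlying distributions differ in every trial, the test with 10 observations per sample should reject far less often than the test with 200, so a large p-value from a small sample is weak evidence that the distributions match.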

In [1]:
from scipy.stats import ks_2samp, kstest
import numpy as np

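# Two-sample KS test: two normal samples whose means differ by 0.5
# (no random seed is set, so the exact numbers vary from run to run)
data1 = np.random.normal(0, 1, 100)
data2 = np.random.normal(0.5, 1, 100)
statistic, p_value = ks_2samp(data1, data2)
print(f"KS Statistic: {statistic}, p-value: {p_value}")

# One-sample KS test: compare data1 against the normal distribution N(0, 1);
# args=(0, 1) passes loc=0 and scale=1 to the 'norm' CDF
statistic, p_value = kstest(data1, 'norm', args=(0, 1))
print(f"One-sample KS Statistic: {statistic}, p-value: {p_value}")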

KS Statistic: 0.26, p-value: 0.002219935934558366
One-sample KS Statistic: 0.07869970930984135, p-value: 0.5391910000517836
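To read these numbers, the p-value is compared with a chosen significance level. The sketch below assumes the conventional level of 0.05, which the run above does not specify, and uses a hypothetical helper `interpret_ks` applied to the two p-values printed above.

```python
def interpret_ks(p_value, alpha=0.05):
    """Turn a KS p-value into a decision (hypothetical helper; alpha is assumed)."""
    if p_value < alpha:
        return "reject H0: the data are unlikely under 'same distribution'"
    return "fail to reject H0: no evidence of a difference"

# p-values copied from the output above
print("Two-sample:", interpret_ks(0.002219935934558366))
print("One-sample:", interpret_ks(0.5391910000517836))
```

Under that assumption, the two-sample result rejects the hypothesis that `data1` and `data2` share a distribution, which matches the 0.5 shift in their means, while the one-sample result gives no evidence that `data1` departs from N(0, 1). Failing to reject is not proof that the distributions are identical.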
