# Overview

This notebook provides a function that tests whether two samples come from normal distributions with the same scale, or from Laplace distributions with the same scale. The procedure has four steps:

1. **Scale Test:**  
   Levene's test is applied to examine whether the two samples have the same scale (i.e., variance or spread).

2. **Distribution Fit Tests:**  
   Each sample is tested separately for goodness of fit to both the normal and the Laplace distribution using the one-sample Kolmogorov-Smirnov (KS) test, with location and scale estimated from that sample.

3. **Distribution Type Identification:**  
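   A sample is labeled normal (or Laplace) when that fit is not rejected at the significance level `alpha` and its KS p-value exceeds that of the other fit. The pair is reported as normal or Laplace only when both samples receive the same label; otherwise the distribution type is undetermined.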

4. **Final Decision:**  
   If Levene's test does not reject equal scale, the function reports the shared distribution type, or states that the type is undetermined. If it does reject, the function reports only that the scales differ.
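
Steps 2 and 3 can be sketched for a single sample as follows. This is a minimal illustration, not the notebook's function: `classify_sample` is a hypothetical helper name, and the seed and sample size are arbitrary choices. Because the parameters are estimated from the same data that is tested, the KS p-values tend to be conservative, so they are best read as a rough ranking of the two fits rather than as exact significance levels.

```python
import numpy as np
from scipy import stats


def classify_sample(sample, alpha=0.05):
    """Label one sample 'normal', 'laplace', or 'undetermined' (hypothetical helper)."""
    x = np.asarray(sample, dtype=float)

    # Normal fit: sample mean and sample standard deviation.
    p_norm = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1))).pvalue

    # Laplace fit: median and mean absolute deviation about the median.
    loc = np.median(x)
    scale = np.mean(np.abs(x - loc))
    p_laplace = stats.kstest(x, "laplace", args=(loc, scale)).pvalue

    # A fit wins only if it is not rejected at alpha and beats the other fit.
    if p_norm > alpha and p_norm > p_laplace:
        return "normal"
    if p_laplace > alpha and p_laplace > p_norm:
        return "laplace"
    return "undetermined"


rng = np.random.default_rng(0)  # seed chosen arbitrarily for this sketch
normal_label = classify_sample(rng.normal(0.0, 2.0, 5000))
laplace_label = classify_sample(rng.laplace(0.0, 2.0, 5000))
print(normal_label, laplace_label)
```

With samples this large, one would expect each draw to be labeled with its generating distribution, while data that fits neither family (for example, evenly spaced values) should come back as undetermined.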

The notebook also includes examples with synthetic data generated from normal and Laplace distributions, both with matching and differing scale parameters, to demonstrate the use of the function.
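
As a quick illustration of why the examples vary location and scale separately, the sketch below applies `scipy.stats.levene` to a sample and two transformed copies of it. With its default median centering, the test works on absolute deviations from each group's center, so a pure shift should leave it unaffected while a rescaling should not. The copies are not independent samples, so this only demonstrates how the statistic behaves; the seed and variable names are arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)  # arbitrary seed for this sketch
base = rng.normal(0.0, 2.0, 1000)

# Shifting a sample changes its location but not its spread.
shifted = base + 3.0
# Multiplying a sample by 3 triples its spread.
stretched = 3.0 * base

_, p_shift = stats.levene(base, shifted)
_, p_stretch = stats.levene(base, stretched)
print(f"shifted copy:   p = {p_shift:.4f}")
print(f"stretched copy: p = {p_stretch:.4g}")
```

This is the behavior the final decision relies on: a difference in means alone should not trigger a "Different scales" result.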


In [2]:
import numpy as np
from scipy import stats

def distribution_test(sample1, sample2, alpha=0.05):
    """
    Test whether two samples come from the same scale normal distribution
    or the same scale Laplace distribution.
    
    Parameters:
    -----------
    sample1, sample2 : array-like
        Samples to be compared
    alpha : float, optional
        Significance level (default: 0.05)
    
    Returns:
    --------
    dict : Dictionary containing test results
    """
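    # Convert inputs so that lists and other array-likes work with the arithmetic below
    sample1 = np.asarray(sample1, dtype=float)
    sample2 = np.asarray(sample2, dtype=float)

    results = {}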
    
    # 1. Test for scale parameter equivalence (Levene test)
    lev_stat, p_value_lev = stats.levene(sample1, sample2)
    
    results['Levene_test'] = {
        'Levene_statistic': lev_stat,
        'p_value': p_value_lev,
        'result': 'Same scale' if p_value_lev > alpha else 'Different scales'
    }
    
    # 2. Test sample 1 for normality
    ks_norm1 = stats.kstest(sample1, 'norm', args=(np.mean(sample1), np.std(sample1, ddof=1)))
    
    # 3. Test sample 1 for Laplace distribution
    #    (maximum-likelihood fit: loc = median, scale = mean absolute deviation about the median)
    loc1 = np.median(sample1)
    scale1 = np.mean(np.abs(sample1 - loc1))
    ks_laplace1 = stats.kstest(sample1, 'laplace', args=(loc1, scale1))
    
    # 4. Test sample 2 for normality
    ks_norm2 = stats.kstest(sample2, 'norm', args=(np.mean(sample2), np.std(sample2, ddof=1)))
    
    # 5. Test sample 2 for Laplace distribution
    #    (maximum-likelihood fit: loc = median, scale = mean absolute deviation about the median)
    loc2 = np.median(sample2)
    scale2 = np.mean(np.abs(sample2 - loc2))
    ks_laplace2 = stats.kstest(sample2, 'laplace', args=(loc2, scale2))
    
    # Add KS test results to dictionary
    results['KS_test_sample1'] = {
        'normal': {'statistic': ks_norm1.statistic, 'p_value': ks_norm1.pvalue},
        'laplace': {'statistic': ks_laplace1.statistic, 'p_value': ks_laplace1.pvalue}
    }
    
    results['KS_test_sample2'] = {
        'normal': {'statistic': ks_norm2.statistic, 'p_value': ks_norm2.pvalue},
        'laplace': {'statistic': ks_laplace2.statistic, 'p_value': ks_laplace2.pvalue}
    }
    
    # Determine distribution type based on p-values
    norm_p1 = ks_norm1.pvalue
    laplace_p1 = ks_laplace1.pvalue
    norm_p2 = ks_norm2.pvalue
    laplace_p2 = ks_laplace2.pvalue
    
    # A fit is accepted only if it is not rejected at alpha
    # and its p-value is higher than that of the competing fit
    is_norm1 = norm_p1 > alpha and norm_p1 > laplace_p1
    is_laplace1 = laplace_p1 > alpha and laplace_p1 > norm_p1
    is_norm2 = norm_p2 > alpha and norm_p2 > laplace_p2
    is_laplace2 = laplace_p2 > alpha and laplace_p2 > norm_p2
    
    if is_norm1 and is_norm2:
        dist_type = "Normal distribution"
    elif is_laplace1 and is_laplace2:
        dist_type = "Laplace distribution"
    else:
        dist_type = "Undetermined distribution"
    
    results['distribution_type'] = dist_type
    
    # Final determination
    if results['Levene_test']['result'] == 'Same scale':
        if dist_type in ["Normal distribution", "Laplace distribution"]:
            results['final_determination'] = f"Same scale {dist_type}"
        else:
            results['final_determination'] = "Same scale, but distribution type undetermined"
    else:
        results['final_determination'] = "Different scales"
    
    return results

# Generate test data
np.random.seed(42)
n_samples = 1000

# 1. Normal distributions with different means but same scale
sample1_1 = np.random.normal(0, 2, n_samples)
sample1_2 = np.random.normal(3, 2, n_samples)

# 2. Normal distributions with different means and scales
sample2_1 = np.random.normal(0, 1, n_samples)
sample2_2 = np.random.normal(3, 3, n_samples)

# 3. Laplace distributions with different means but same scale
sample3_1 = np.random.laplace(0, 2, n_samples)
sample3_2 = np.random.laplace(3, 2, n_samples)

# 4. Laplace distributions with different means and scales
sample4_1 = np.random.laplace(0, 1, n_samples)
sample4_2 = np.random.laplace(3, 3, n_samples)

# Dictionary of samples
samples_dict = {
    "1. Normal distributions with different means but same scale": (sample1_1, sample1_2),
    "2. Normal distributions with different means and scales": (sample2_1, sample2_2),
    "3. Laplace distributions with different means but same scale": (sample3_1, sample3_2),
    "4. Laplace distributions with different means and scales": (sample4_1, sample4_2)
}

# Run test for all sample pairs
for name, samples in samples_dict.items():
    print(f"\n===== {name} =====")
    sample1, sample2 = samples
    results = distribution_test(sample1, sample2)
    
    # Display results
    print(f"Levene test result: {results['Levene_test']['result']} (p-value: {results['Levene_test']['p_value']:.4f})")
    
    print("Sample 1 distribution tests:")
    print(f"  Normal: p-value = {results['KS_test_sample1']['normal']['p_value']:.4f}")
    print(f"  Laplace: p-value = {results['KS_test_sample1']['laplace']['p_value']:.4f}")
    
    print("Sample 2 distribution tests:")
    print(f"  Normal: p-value = {results['KS_test_sample2']['normal']['p_value']:.4f}")
    print(f"  Laplace: p-value = {results['KS_test_sample2']['laplace']['p_value']:.4f}")
    
    print(f"Distribution type: {results['distribution_type']}")
    print(f"Final determination: {results['final_determination']}")


===== 1. Normal distributions with different means but same scale =====
Levene test result: Same scale (p-value: 0.6840)
Sample 1 distribution tests:
  Normal: p-value = 0.7413
  Laplace: p-value = 0.0003
Sample 2 distribution tests:
  Normal: p-value = 0.9718
  Laplace: p-value = 0.0219
Distribution type: Normal distribution
Final determination: Same scale Normal distribution

===== 2. Normal distributions with different means and scales =====
Levene test result: Different scales (p-value: 0.0000)
Sample 1 distribution tests:
  Normal: p-value = 0.9632
  Laplace: p-value = 0.0037
Sample 2 distribution tests:
  Normal: p-value = 0.8089
  Laplace: p-value = 0.0027
Distribution type: Normal distribution
Final determination: Different scales

===== 3. Laplace distributions with different means but same scale =====
Levene test result: Same scale (p-value: 0.9579)
Sample 1 distribution tests:
  Normal: p-value = 0.0000
  Laplace: p-value = 0.9764
Sample 2 distribution tests:
  Normal: p-va