<a href="https://colab.research.google.com/github/tomheston/fragility-metrics/blob/main/notebooks/correlation_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
# @title
# Fragility Metrics Toolkit: Correlation Analysis — COMPLETE p–fr–nb
# v.25-NOV-2025
# Now with Zerko Fragility Quotient (ZFQ) → full triplet achieved
#
# Input: r (Pearson or Spearman), n (sample size)
# Output: p, fr (ZFQ), nb (DTI)
#
# Breakthrough:
#   Zerko Fragility Quotient (ZFQ) = (|atanh(r)| − 1.96) / (1 + |atanh(r)| − 1.96)
#   → Exact analogue of CFQ/ANOVA-FQ in Fisher-z space
#   → Model-free, path-independent, unique, reproducible
#   → The formula uses only r (n enters only through the p-value); no assumptions beyond Fisher's z transform
#
# DTI remains the official robustness (nb) metric per NBF
# ZFQ is the canonical fragility (fr) — the missing piece is now found
#
# IF YOU USE THIS CALCULATOR PLEASE CITE:
# Heston, T. F. (2025). Fragility Metrics Toolkit [Software]. Zenodo.
# https://doi.org/10.5281/zenodo.17254763
#
# © Thomas F. Heston 2025. CC-BY 4.0

import numpy as np
from scipy.stats import t as tdist

ALPHA = 0.05
Z_CRIT = 1.96  # Two-sided α = 0.05 critical value of the standard normal (≈ 1.959964), compared against |atanh(r)|

# ==================== Zerko Fragility Quotient (ZFQ) ====================
def compute_zfq(r: float):
    """
    Zerko Fragility Quotient (ZFQ): fragility (fr) for a correlation.

    ZFQ = (|z| − Z_CRIT) / (1 + |z| − Z_CRIT), where z = atanh(r).
    Returns 0.0 when |z| <= Z_CRIT, i.e. when |r| <= tanh(1.96) ≈ 0.961,
    and 1.0 when |r| >= 1.
    """
    if abs(r) >= 1.0:
        return 1.0
    z = np.abs(np.arctanh(r))
    zfs = z - Z_CRIT  # Zerko Fragility Score
    if zfs <= 0:
        return 0.0
    return zfs / (1 + zfs)

# ==================== DTI — Distance to Independence (official nb) ====================
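def compute_dti(r: float):
    """
    Distance to Independence (DTI): robustness (nb) for a correlation.

    DTI = |z| / (1 + |z|), where z = atanh(r). Depends on r only, not on n.
    Returns 1.0 when |r| >= 1.
    """
    if abs(r) >= 1.0:
        return 1.0
    z = np.abs(np.arctanh(r))
    return z / (1 + z)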

# ==================== Classic p-value (for reference) ====================
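def classic_pvalue(r: float, n: int):
    """
    Two-sided p-value for H0: rho = 0, from the t-test with n - 2 degrees of freedom.

    Returns None when n < 3 (no degrees of freedom) and 0.0 when |r| >= 1.
    """
    if n < 3:
        return None
    if abs(r) >= 1.0:
        return 0.0
    # t = r * sqrt((n - 2) / (1 - r^2))
    t_stat = r * np.sqrt((n - 2) / (1 - r**2))
    return 2 * tdist.sf(np.abs(t_stat), n - 2)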

# ==================== Interpretation ====================
def interpret_zfq(zfq):
    if zfq is None:
        return "not computed"
    if zfq < 0.01: return "extremely fragile"
    elif zfq < 0.05: return "very fragile"
    elif zfq < 0.10: return "fragile"
    elif zfq < 0.25: return "mildly stable"
    elif zfq < 0.40: return "moderate stability"
    else: return "very stable"

def interpret_dti(dti):
    if dti < 0.05: return "at independence (no relationship)"
    elif dti < 0.10: return "near independence"
    elif dti < 0.25: return "moderate distance"
    elif dti < 0.50: return "clear separation"
    else: return "far from independence (strong relationship)"

# ==================== Complete Evidence ====================
def correlation_complete(r: float, n: int):
    if not (-1 <= r <= 1):
        raise ValueError("r must be in [-1, 1]")
    if n < 2:
        raise ValueError("n must be ≥ 2")

    p_val = classic_pvalue(r, n)
    zfq = compute_zfq(r)
    dti = compute_dti(r)

    return {
        "r": r,
        "n": n,
        "p": p_val,
        "fr": zfq,
        "nb": dti,
        "z": np.arctanh(r) if abs(r) < 1 else float(np.copysign(np.inf, r))  # signed infinity at r = ±1
    }

# ==================== Output ====================
def print_correlation_results(res):
    print("\n" + "="*60)
    print("COMPLETE EVIDENCE ASSESSMENT: CORRELATION (p–fr–nb)")
    print("="*60)
    print(f"Pearson r = {res['r']:+.6f}  (n = {res['n']})")
    print(f"p-value   = {res['p']:.6f}" if res['p'] is not None else "p-value   = n/a (n < 3)")
    print(f"fr (ZFQ)  = {res['fr']:.6f}  →  {interpret_zfq(res['fr'])}")
    print(f"nb (DTI)  = {res['nb']:.6f}  →  {interpret_dti(res['nb'])}")
    print("="*60)

    # Strength verdict
    abs_r = abs(res['r'])
    if abs_r >= 0.7:
        strength = "very strong"
    elif abs_r >= 0.5:
        strength = "strong"
    elif abs_r >= 0.3:
        strength = "moderate"
    elif abs_r >= 0.1:
        strength = "weak"
    else:
        strength = "negligible"

    print("\nInterpretation:")
    print(f"• Observed correlation: {strength} (r = {res['r']:+.3f})")
    print(f"• Fragility (ZFQ): {interpret_zfq(res['fr'])}")
    print(f"• Robustness (DTI): {interpret_dti(res['nb'])}")

    if res['p'] is not None and res['p'] <= ALPHA and res['fr'] < 0.10:
        print("→ Significant but fragile — interpret with caution")
    elif res['p'] is not None and res['p'] > ALPHA and res['nb'] > 0.25:
        print("→ Non-significant p-value, but meaningful correlation exists (small sample)")
    else:
        print("→ p–fr–nb triplet is concordant")

    print("\nZerko Fragility Quotient (ZFQ) now completes the triplet for correlation.")
    print("Reference: FRAGILITY_METRICS.md v9.7 + Zerko Addendum (2025)")

# ==================== CLI ====================
def main():
    print("Correlation Complete Evidence Calculator (p–fr–nb)\n")
    r = float(input("Enter Pearson r (-1 to +1): ").strip())
    n = int(input("Enter sample size n: ").strip())

    result = correlation_complete(r, n)
    print_correlation_results(result)

if __name__ == "__main__":
    main()

Correlation Analysis

Pearson r (-1 to +1): .1
Sample size n: 500

p  = 0.025347 by two-sided t-test for Pearson r
fr = n/a for correlations*
nb (DTI) = 0.091186

Interpretation:
The p-value is significant at p≤0.05**
Pearson r = 0.1000  (n = 500)
DTI (Distance to Independence) = 0.091186

The observed correlation of r = 0.100 and DTI agree: weak, essentially indistinguishable from no relationship (r=0).

*Fragility (fr) is not defined because no unique, model-free way exists to toggle points (see: FRAGILITY_METRICS.md https://doi.org/10.5281/zenodo.17254763)
**The p-value is provided for reference only and should be interpreted cautiously, as it is highly dependent on sample size and can be misleading.
DTI is the framework’s official robustness (nb) metric for correlations.
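
The sample-size dependence of the p-value noted above can be illustrated with a minimal sketch. It re-implements the two formulas from the code cell (the two-sided t-test for r and DTI = |atanh(r)| / (1 + |atanh(r)|)) in a single helper; the name `p_and_dti` and the choice of sample sizes are assumptions made for illustration and are not part of the toolkit. With r held at 0.1, the p-value moves with n while DTI stays fixed, since DTI depends on r alone.

```python
import numpy as np
from scipy.stats import t as tdist


def p_and_dti(r: float, n: int):
    """Hypothetical helper: two-sided t-test p-value and DTI for a correlation r.

    Assumes |r| < 1 and n >= 3, so neither formula degenerates.
    """
    # t statistic for H0: rho = 0, with n - 2 degrees of freedom
    t_stat = r * np.sqrt((n - 2) / (1 - r**2))
    p = 2 * tdist.sf(abs(t_stat), n - 2)
    # DTI uses the Fisher z transform of r and ignores n
    z = abs(np.arctanh(r))
    return float(p), float(z / (1 + z))


# Same r, different n: the p-value changes, DTI does not.
for n in (50, 500, 5000):
    p, dti = p_and_dti(0.1, n)
    print(f"n = {n:>4}  p = {p:.6f}  DTI = {dti:.6f}")
```

For n = 500 this reproduces the values in the output above (p = 0.025347, DTI = 0.091186).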
