# Appendix B — Computation and Interpretation of the Relevant Metrics
**Version:** v1.0 — August 28, 2025

> This notebook mirrors the Appendix content from the manuscript’s supplementary file. It is intended as a static, citable artifact. Replace placeholders and bracketed notes before archiving (e.g., Zenodo/OSF/Figshare).

## B.1 Computing the Relevant Metrics
This section outlines the reliability coefficients and metrics referenced in the framework, together with their formulas. Metrics are grouped as **individual‑level** or **systematic‑level** reliability. Because the ICC has multiple forms (ten common variants), guidance on selecting the ICC formula is given in **B.2**.


### Appendix Table 1 — Concordance Coefficients and Metrics Used in DHT Analytical Validation (Summary)
| Metric | Description | Optimal | Level |
|---|---|---:|---|
| SW | Shapiro–Wilk normality test | p ≤ 0.05 (reject normality) | Normality |
| ρ | Pearson correlation (linear relationship) | ρ ≥ 0.8 (very strong) | Correlation |
| CCC | Concordance correlation (agreement) | ≥ 0.95 (substantial) | Individual |
| ICC | Intraclass correlation (reproducibility) | ≥ 0.90 (excellent) | Individual |
| BA (LoA) | Bland–Altman limits of agreement vs MCID | LoA < MCID | Individual/Systematic |
| BA (Bias) | Bland–Altman mean bias vs MCID | Bias < MCID | Systematic |
| MAPE | Mean absolute percentage error | ≤ 10% (common DHT criterion) | Systematic |


### Appendix Table 2 — Formulas (Notation)
Let \(d_i\) and \(m_i\) be the digital and manual measurements for sample \(i\), with \(n\) paired samples; \(\bar{\cdot}\) denotes the mean and \(s(\cdot)\) the standard deviation.

- **Pearson**: \(\displaystyle \rho = \frac{\sum_i (d_i-\bar d)(m_i-\bar m)}{\sqrt{\sum_i(d_i-\bar d)^2}\sqrt{\sum_i(m_i-\bar m)^2}}\)
- **CCC (Lin)**: \(\displaystyle \rho_c=\frac{2\rho\,\sigma_d \sigma_m}{\sigma_d^2+\sigma_m^2+(\bar d-\bar m)^2}\), where \(\sigma_d\) and \(\sigma_m\) are the standard deviations of \(d\) and \(m\).
- **MAE**: \(\displaystyle \mathrm{MAE}=\frac{1}{n}\sum_i |d_i-m_i|\)
- **RMSE**: \(\displaystyle \mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_i (d_i-m_i)^2}\)
- **MAPE**: \(\displaystyle \mathrm{MAPE}=\frac{100}{n}\sum_i \left|\frac{d_i-m_i}{m_i}\right|\ \%\)
- **Bland–Altman**: differences \( \Delta_i=d_i-m_i\); bias \(\bar{\Delta}\); SD \(s_\Delta\); LoA \( \bar{\Delta}\pm 1.96\,s_\Delta\).
- **ICC**: see Selection Guide (B.2); formulas depend on ANOVA mean squares (MSR, MSC, MSE, etc.).
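
The formulas above can be sketched in plain Python as follows. The function name `agreement_metrics` and the dictionary keys are illustrative choices, not part of the manuscript. Two assumptions are worth flagging: the CCC is computed with population (1/n) variances, as in Lin's original estimator, and the Bland–Altman SD uses the sample (n − 1) denominator. MAPE is undefined when any manual value is zero, so the sketch lets that case raise an error rather than guessing a convention.

```python
import math


def agreement_metrics(digital, manual):
    """Compute the Appendix Table 2 metrics for paired measurements.

    Illustrative sketch only: names and return layout are assumptions.
    """
    if len(digital) != len(manual) or len(digital) < 2:
        raise ValueError("need two equal-length sequences with n >= 2")
    n = len(digital)
    mean_d = sum(digital) / n
    mean_m = sum(manual) / n

    # Sums of squares and cross-products about the means.
    ss_d = sum((d - mean_d) ** 2 for d in digital)
    ss_m = sum((m - mean_m) ** 2 for m in manual)
    sp_dm = sum((d - mean_d) * (m - mean_m) for d, m in zip(digital, manual))

    # Pearson correlation.
    rho = sp_dm / math.sqrt(ss_d * ss_m)

    # Lin's CCC with 1/n variances (assumption, see lead-in).
    # Note that 2 * rho * sigma_d * sigma_m equals 2 * cov(d, m).
    ccc = (2 * sp_dm / n) / (ss_d / n + ss_m / n + (mean_d - mean_m) ** 2)

    # Error metrics on the paired differences.
    diffs = [d - m for d, m in zip(digital, manual)]
    mae = sum(abs(x) for x in diffs) / n
    rmse = math.sqrt(sum(x * x for x in diffs) / n)
    # Raises ZeroDivisionError if any manual value is zero.
    mape = 100 / n * sum(abs(x / m) for x, m in zip(diffs, manual))

    # Bland-Altman bias and 95% limits of agreement (sample SD, n - 1).
    bias = sum(diffs) / n
    sd_diff = math.sqrt(sum((x - bias) ** 2 for x in diffs) / (n - 1))
    loa = (bias - 1.96 * sd_diff, bias + 1.96 * sd_diff)

    return {
        "rho": rho,
        "ccc": ccc,
        "mae": mae,
        "rmse": rmse,
        "mape": mape,
        "bias": bias,
        "loa": loa,
    }


if __name__ == "__main__":
    # Made-up values: the digital readings sit exactly 1 unit above manual.
    print(agreement_metrics([11, 21, 31, 41], [10, 20, 30, 40]))
```

With the made-up data in the usage lines, the constant offset of 1 leaves ρ at 1 while the CCC drops to 250/251 ≈ 0.996, which illustrates why correlation alone does not establish agreement.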


## B.2 ICC Formula Selection
ICC selection is a 3‑step process:

1) **Model** — one‑way random / two‑way random / two‑way mixed  
2) **Type** — single rater vs multiple raters  
3) **Definition** — absolute agreement vs consistency

- One‑way random: each subject is measured by a different, randomly assigned device/rater.  
- Two‑way random: the devices/raters are treated as a random sample, so results generalize to all similar devices.  
- Two‑way mixed: only the specific devices studied are of interest.  
- Single rater: compare against a single criterion method.  
- Multiple raters: compare against the mean of multiple methods.  
- Absolute agreement: tests \(y \approx x\).  
- Consistency: tests \(y \approx x + c\) (constant bias tolerated).

> When only two raters are present, ICC and CCC often yield similar values.


### Appendix Table 3 — ICC Equation Selection Guide (Schematic)
| Type | Model | Definition |
|---|---|---|
| Single rater | 1‑way random | Agreement |
| Single rater | 2‑way random | Agreement / Consistency |
| Single rater | 2‑way mixed | Agreement / Consistency |
| Multiple raters | 1‑way random | Agreement |
| Multiple raters | 2‑way random | Agreement / Consistency |
| Multiple raters | 2‑way mixed | Agreement / Consistency |

*MSR = Mean Square Rows; MSC = Mean Square Columns; MSE = Mean Square Error (from ANOVA).*
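
Because the table above is schematic, the following sketch shows one way the mean squares could be turned into ICC values. It uses the widely cited mean-square forms associated with Shrout and Fleiss and with McGraw and Wong. The function name `icc_from_anova`, the dictionary keys, and the within-subject mean square `msw` (needed for the one-way model) are illustrative assumptions. The two-way random and two-way mixed models share the same computational formulas and differ only in how the result is generalized, so the sketch reports them together.

```python
def icc_from_anova(ratings):
    """Return six ICC forms from an n-subjects by k-raters table.

    ratings: list of n rows (subjects), each a list of k values
    (devices/raters). Illustrative sketch; names are assumptions.
    """
    n = len(ratings)
    k = len(ratings[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a complete table with n >= 2 and k >= 2")

    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares (no replication).
    ss_rows = k * sum((r - grand) ** 2 for r in row_means)
    ss_cols = n * sum((c - grand) ** 2 for c in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)                    # mean square rows
    msc = ss_cols / (k - 1)                    # mean square columns
    mse = ss_err / ((n - 1) * (k - 1))         # mean square error
    msw = (ss_cols + ss_err) / (n * (k - 1))   # within-subject (one-way)

    return {
        # One-way random, absolute agreement.
        "one_way_single": (msr - msw) / (msr + (k - 1) * msw),
        "one_way_average": (msr - msw) / msr,
        # Two-way (random or mixed), absolute agreement.
        "two_way_agreement_single":
            (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n),
        "two_way_agreement_average":
            (msr - mse) / (msr + (msc - mse) / n),
        # Two-way (random or mixed), consistency.
        "two_way_consistency_single": (msr - mse) / (msr + (k - 1) * mse),
        "two_way_consistency_average": (msr - mse) / msr,
    }
```

Returning all six values at once is a convenience for comparison; in practice only the form chosen through the three-step selection above would be reported.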


## B.3 Interpretation of Metrics (Example Criteria)
- **ρ (Chan)**: <0.3 poor; 0.3–0.6 fair; 0.6–0.8 moderately strong; ≥0.8 very strong.  
- **CCC (McBride)**: <0.90 poor; 0.90–0.95 moderate; 0.95–0.99 substantial; ≥0.99 near perfect.  
- **ICC (Koo)**: <0.50 poor; 0.50–0.75 moderate; 0.75–0.90 good; ≥0.90 excellent.  
- **BA vs MCID (relative)**: LoA < 10% and bias < 10% ⇒ accurate / systematic agreement (example for 6MWD in HF).  
- **MAPE (DHT convention)**: ≤10% often used, though no universal standard.

> Note: Only BA uses **clinically** derived thresholds (MCID) by design.
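
The example criteria above could be encoded as a small lookup, sketched below. The names `BANDS`, `interpret`, and `ba_within_relative_mcid` are hypothetical. Two assumptions are made where the text leaves room for interpretation: shared boundary values (for example ρ = 0.6) are assigned to the higher band, and the relative Bland–Altman check expresses the bias and both limits of agreement as a fraction of a reference mean supplied by the caller.

```python
from bisect import bisect_right

# Cut points and labels transcribed from B.3.
# Assumption: a value equal to a cut point falls in the higher band.
BANDS = {
    "rho": ([0.3, 0.6, 0.8],
            ["poor", "fair", "moderately strong", "very strong"]),
    "ccc": ([0.90, 0.95, 0.99],
            ["poor", "moderate", "substantial", "near perfect"]),
    "icc": ([0.50, 0.75, 0.90],
            ["poor", "moderate", "good", "excellent"]),
}


def interpret(metric, value):
    """Map a coefficient to its B.3 label ('rho', 'ccc', or 'icc')."""
    cuts, labels = BANDS[metric]
    # bisect_right counts how many cut points are <= value.
    return labels[bisect_right(cuts, value)]


def ba_within_relative_mcid(bias, loa_lower, loa_upper,
                            reference_mean, mcid_fraction=0.10):
    """Check |bias| and both LoA against a relative MCID threshold.

    Assumption: the 10% criterion is taken relative to reference_mean.
    """
    limit = mcid_fraction * abs(reference_mean)
    widest_loa = max(abs(loa_lower), abs(loa_upper))
    return abs(bias) < limit and widest_loa < limit


if __name__ == "__main__":
    print(interpret("icc", 0.92))
    # Made-up numbers for illustration only.
    print(ba_within_relative_mcid(5.0, -20.0, 30.0, 400.0))
```

Keeping the thresholds in one table makes it easy to swap in different published criteria without touching the lookup logic.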


---
## Optional: Placeholder Cells (non‑functional)

```python
# Placeholder: compute metrics (non-functional)
def compute_metrics_placeholder(digital, manual):
    """Return rho, CCC, ICC, BA (bias/LoA), MAE, RMSE, MAPE, etc.

    This is a documentation stub; implement with real data.
    """
    raise NotImplementedError("Documentation stub only")
```