An open, reproducible benchmark suite and reference baselines for high-entropy alloy (HEA) phase prediction.
- A consolidated, deduplicated open dataset of 7,784 experimentally characterized multi-principal element alloys, merged from three primary sources (Borg 2020, Pei 2020, Peivaste 2023) with per-row source provenance.
- Reference baseline implementations of the four canonical empirical phase-prediction rules (Yeh ΔSmix, Zhang δ, Guo-Liu VEC, Yang-Zhang Ω), wrapped as proper diagnostic classifiers with sensitivity / specificity / Wilson 95% CIs.
- A clean, dependency-free Python API (pip install hea-bench) and a self-contained HTML calculator that runs entirely client-side, computes all six descriptors plus the Miedema decompositions, and applies the four phase-prediction rules to the entered composition. Open it by URL or just double-click the file.
Using an AI coding agent to integrate this? See AGENTS.md for a machine-oriented guide to the API, exact return types and units, the fastest path to each task, and the mistakes to avoid.
Running the four canonical rules against the consolidated benchmark produces the reference baselines below. These are pinned in tests so any drift in dataset, descriptor code, or rule thresholds surfaces as a test failure.
| Rule | n_eval | Accuracy | Sens (single-phase) | Spec (multi-phase) | Youden's J |
|---|---|---|---|---|---|
| Zhang δ < 6.5% | 6,651 | 56.7% | 99.0% | 8.5% | 0.075 |
| Yang Ω > 1.1 | 6,651 | 54.4% | 95.8% | 7.4% | 0.032 |
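The columns in the table above follow the usual diagnostic-test definitions, with single-phase as the positive class. The sketch below shows one way to compute them, including a Wilson 95% interval. The function names and the confusion-matrix counts are hypothetical and are not taken from the hea-bench code or dataset; only the final line reuses numbers from the table.

```python
import math


def wilson_interval(successes, n, z=1.959964):
    """Wilson score interval for a binomial proportion (z ~ 1.96 gives ~95%)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (center - half, center + half)


def diagnostic_stats(tp, fn, tn, fp):
    """Positive class = single-phase, negative class = multi-phase."""
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": sens,
        "specificity": spec,
        "youden_j": sens + spec - 1.0,
        "sensitivity_ci95": wilson_interval(tp, tp + fn),
        "specificity_ci95": wilson_interval(tn, tn + fp),
    }


# Hypothetical counts, chosen only to exercise the functions.
stats = diagnostic_stats(tp=90, fn=10, tn=20, fp=80)

# Youden's J recomputed from the rounded Zhang row of the table above.
j_zhang = 0.990 + 0.085 - 1.0
print(round(j_zhang, 3))
```

A rule that labels nearly everything single-phase gets high sensitivity and low specificity, so J stays close to zero even when accuracy looks moderate.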
The Guo–Liu VEC rule predicts crystal structure rather than single-vs-multi, so it's evaluated stratified to single-phase observations (BCC|FCC only):
| Rule | n_eval | Accuracy | FCC sensitivity | BCC sensitivity |
|---|---|---|---|---|
| Guo–Liu VEC (FCC if VEC ≥ 8.0, BCC if VEC < 6.87) | 3,463 | 66.9% | 92.4% | 48.3% |
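As an illustration of the thresholds in the table above, here is a minimal sketch of a VEC classifier. It is not the hea_bench implementation: the function names and the small valence-electron table are assumptions made for this example, and the six-decimal rounding mirrors the boundary handling this README describes for the calculator.

```python
# Standard valence-electron counts for the five Cantor-alloy elements,
# plus Cr/Fe for the extra examples. Illustrative table, not the package's.
VALENCE_ELECTRONS = {"Co": 9, "Cr": 6, "Fe": 8, "Mn": 7, "Ni": 10}


def vec(composition):
    """Composition-weighted valence electron concentration, rounded to 6 decimals."""
    total = sum(composition.values())
    value = sum(x / total * VALENCE_ELECTRONS[el] for el, x in composition.items())
    # Rounding keeps float noise from pushing a value across a threshold.
    return round(value, 6)


def guo_liu_structure(vec_value):
    """FCC if VEC >= 8.0, BCC if VEC < 6.87, otherwise a mixed FCC+BCC region."""
    if vec_value >= 8.0:
        return "FCC"
    if vec_value < 6.87:
        return "BCC"
    return "mixed"


cantor = {"Co": 0.2, "Cr": 0.2, "Fe": 0.2, "Mn": 0.2, "Ni": 0.2}
print(vec(cantor), guo_liu_structure(vec(cantor)))
print(guo_liu_structure(vec({"Cr": 0.5, "Fe": 0.5})))
```

The Cantor alloy sits exactly on the FCC boundary, which is why the rounding step matters: without it, summing five 0.2 fractions can land a hair under 8.0.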
Yeh ΔSmix is descriptive (no phase-prediction claim attached) — 47% of the consolidated benchmark passes the 1.5R HEA-class threshold, 37% sits in the MEA bin, 16% is dilute.
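The entropy binning can be sketched as follows. The 1.5R HEA threshold comes from the text above; the 1.0R lower edge of the MEA bin is the conventional Yeh boundary and is an assumption here, as are the function names and the choice of inclusive comparisons.

```python
import math

R = 8.314  # J/(mol·K), the value listed for constants.py


def mixing_entropy(composition):
    """Ideal configurational entropy: -R * sum(x_i * ln x_i), in J/(mol·K)."""
    total = sum(composition.values())
    fractions = [x / total for x in composition.values() if x > 0]
    return -R * sum(x * math.log(x) for x in fractions)


def yeh_class(composition):
    """Bin by mixing entropy; the 1.0R MEA floor is an assumed convention."""
    s = mixing_entropy(composition)
    if s >= 1.5 * R:
        return "HEA"
    if s >= 1.0 * R:
        return "MEA"
    return "dilute"


cantor = {"Co": 0.2, "Cr": 0.2, "Fe": 0.2, "Mn": 0.2, "Ni": 0.2}
print(round(mixing_entropy(cantor), 3), yeh_class(cantor))
```

For an equiatomic five-component alloy the sum reduces to R · ln 5, which is the Cantor value quoted in the quickstart.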
The publishable observation: on a consolidated benchmark drawn from three independent open sources, both binary rules collapse to "predict single-phase almost always" (Youden's J ~ 0.03–0.08), and the VEC rule misses about half of observed BCC alloys despite catching 92% of FCC alloys. The canonical rules generalize poorly.
pip install hea-bench
import hea_bench
cantor = {"Co": 0.2, "Cr": 0.2, "Fe": 0.2, "Mn": 0.2, "Ni": 0.2}
hea_bench.smix(cantor) # 13.381 J/(mol·K) = R · ln 5
hea_bench.delta(cantor) # 3.164 % atomic-size mismatch
hea_bench.vec(cantor) # 8.0 valence electrons
hea_bench.mixing_enthalpy(cantor) # -4.16 kJ/mol (Miedema)
hea_bench.omega(cantor) # 5.79 (Yang-Zhang)
# Apply the canonical rules
from hea_bench.rules import zhang_delta, yang_omega, guo_vec
zhang_delta.predict(cantor) # 'single-phase'
yang_omega.predict(cantor) # 'single-phase'
guo_vec.predict(cantor) # 'FCC'
# Run the full rule benchmark against the consolidated v0.1.0 dataset
from hea_bench.evaluate import build_report
report = build_report()
print(report["rules"]["zhang_delta_6_5"]["accuracy"]) # 0.5670
hea-bench --version
python -m hea_bench.evaluate # run all 4 rules on v0.1.0
python -m hea_bench.benchmark.coverage # coverage analysis on v0.1.0
A self-contained HTML calculator computes the descriptors, applies the four phase-prediction rules, and runs the Miedema decompositions entirely client-side. Two equivalent paths:
- Open the hosted page: https://dfieser.github.io/hea-bench/
- Or download / clone the repo and double-click web/index.html. No install, no terminal, no server.
The page reports each rule's verdict (Yeh HEA/MEA/dilute, Zhang single/multi, Guo–Liu FCC/BCC/mixed, Yang–Zhang single/multi) alongside the computed descriptor values. Logic matches the Python library, including the six-decimal VEC-boundary rounding.
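When checking the calculator against the library, it can help to recompute Ω by hand. The sketch below assumes the usual Yang–Zhang form Ω = T_m · ΔS_mix / |ΔH_mix| with a composition-weighted mean melting point. The melting points are textbook values and may differ slightly from the package's data table; the ΔS_mix and ΔH_mix inputs are the Cantor values quoted in this README. Function names are hypothetical.

```python
# Textbook melting points in kelvin (assumed; not read from hea_bench).
MELTING_POINT_K = {"Co": 1768.0, "Cr": 2180.0, "Fe": 1811.0, "Mn": 1519.0, "Ni": 1728.0}


def mean_melting_point(composition):
    """Composition-weighted average melting temperature in K."""
    total = sum(composition.values())
    return sum(x / total * MELTING_POINT_K[el] for el, x in composition.items())


def omega(composition, smix_j_per_mol_k, hmix_kj_per_mol):
    """Omega = T_m * dS_mix / |dH_mix|, with dH_mix converted from kJ to J."""
    hmix_j = abs(hmix_kj_per_mol) * 1000.0
    if hmix_j == 0.0:
        return float("inf")  # ideal solution: the entropy term dominates outright
    return mean_melting_point(composition) * smix_j_per_mol_k / hmix_j


def yang_zhang_verdict(omega_value):
    """Threshold from the baseline table: single-phase if Omega > 1.1."""
    return "single-phase" if omega_value > 1.1 else "multi-phase"


cantor = {"Co": 0.2, "Cr": 0.2, "Fe": 0.2, "Mn": 0.2, "Ni": 0.2}
omega_cantor = omega(cantor, smix_j_per_mol_k=13.381, hmix_kj_per_mol=-4.16)
print(round(omega_cantor, 2), yang_zhang_verdict(omega_cantor))
```

The unit conversion is the easy place to slip: ΔS_mix is in J/(mol·K) while ΔH_mix is reported in kJ/mol.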
                             ┌────────────────────────────┐
                             │ data/consolidated/v0.1.0/  │
                             │  - consolidated.csv        │
                             │  - rule_baselines.json     │
                             │  - coverage_report.json    │
                             │  - manifest.json           │
                             └─────────────▲──────────────┘
                                           │
                                           │ produced by
                                           │
┌─────────────────────┐    ┌───────────────┴───────────────┐
│ data/raw/           │    │ src/hea_bench/                │
│  - borg2020/        │───►│  - benchmark/                 │
│  - pei2020/         │    │      consolidate.py           │
│  - peivaste/        │    │      coverage.py              │
│  (per-source READMEs│    │      loaders/{borg,pei,...}.py│
│   + provenance)     │    │  - descriptors/{size, vec,    │
└─────────────────────┘    │      melting, miedema, omega} │
                           │  - rules/{yeh, zhang,         │
                           │      guo, yang}               │
                           │  - classifiers/               │
                           │      diagnostic_stats.py      │
                           │  - evaluate.py                │
                           └───────────────┬───────────────┘
                                           │
                                           │ independent
                                           │ implementation
                                           ▼
                            ┌──────────────────────────────┐
                            │ web/ (standalone HTML +      │
                            │       JavaScript)            │
                            │  - index.html                │
                            │  - mathjax/ (vendored)       │
                            └──────────────────────────────┘
data/consolidated/v0.1.0/consolidated.csv contains 7,784 unique compositions × 14 columns:
- composition_key: alphabetically sorted element symbols + 4-decimal mole fractions, the canonical join key
- n_elements, sources (semicolon-separated)
- canonical_phase: one of BCC / FCC / HCP / multi-phase (blank when the contributing sources disagree)
- has_conflict: 1 when canonical_phase is blank because of a source-label disagreement
- Per-source canonical and raw labels preserved verbatim
- borg_processing, borg_doi, source_row_ids for provenance
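A join key of the kind described for composition_key can be sketched as below. The README specifies alphabetical ordering and 4-decimal mole fractions; the exact string layout (symbol followed by fraction, joined with hyphens) and the function name are assumptions for illustration and may not match the strings in consolidated.csv.

```python
def composition_key(composition, decimals=4):
    """Order-independent key: sorted symbols with normalized mole fractions."""
    total = sum(composition.values())
    if total <= 0:
        raise ValueError("composition must have a positive total amount")
    parts = [
        f"{element}{composition[element] / total:.{decimals}f}"
        for element in sorted(composition)
    ]
    return "-".join(parts)  # hyphen separator is an assumed convention


# The same alloy written two ways collapses to one key.
a = composition_key({"Ni": 1, "Co": 1})
b = composition_key({"Co": 0.5, "Ni": 0.5})
print(a, a == b)
```

Normalizing before formatting is what lets rows given in atomic ratios and rows given in mole fractions deduplicate to the same entry.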
100 of the 7,784 compositions are cross-source label conflicts — flagged for downstream resolution rather than silently picked. The sources are: Borg 2020 (740 alloys), Pei 2020 (1,209 alloys), Peivaste 2023 (7,747 alloys).
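The flag-rather-than-pick policy can be illustrated with a short sketch. The function name and the per-source dictionary shape are hypothetical; only the behavior (blank canonical_phase and has_conflict = 1 on disagreement) follows the description above.

```python
def resolve_canonical_phase(labels_by_source):
    """Return (canonical_phase, has_conflict) for one composition.

    labels_by_source maps a source name to its canonical label, or to
    None / "" when that source does not report the composition.
    """
    labels = {label for label in labels_by_source.values() if label}
    if len(labels) == 1:
        return labels.pop(), 0  # all reporting sources agree
    if not labels:
        return "", 0            # nothing reported, so nothing to conflict
    return "", 1                # disagreement: leave blank and flag it


print(resolve_canonical_phase({"borg2020": "FCC", "pei2020": "FCC", "peivaste": None}))
print(resolve_canonical_phase({"borg2020": "FCC", "peivaste": "multi-phase"}))
```

Leaving the conflicting rows blank keeps them out of the rule evaluation by default while still letting downstream users apply their own tie-break.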
See data/consolidated/v0.1.0/README.md
for the full schema, per-source attribution, and a complete
description of the consolidation rules. See
data/raw/ for per-source provenance, licenses, and
SHA-256s.
- 86.7% of the 7,784 compositions are scorable by every descriptor (δ, VEC, T_m, ΔS_mix, ΔH_mix, Ω) with the current 24-element ELEMENTAL_DATA table
- 99.6% are scorable for Miedema-based descriptors only (the vendored matminer pair table covers 75 elements)
- Top elements whose addition would lift coverage to ~95%: Mg, C, Zn, B, Sn, Re (all already in the matminer pair table — pending v0.2.0 data release)
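The coverage figures above amount to asking, for each composition, whether every element is present in the relevant data table. A minimal sketch of that check, with a ranking of the elements that block the most compositions, might look like this. The function name and the toy inputs are hypothetical and do not reflect the package's coverage module.

```python
from collections import Counter


def coverage(compositions, supported_elements):
    """Return (scorable fraction, [(missing element, blocked count), ...])."""
    scorable = 0
    blockers = Counter()
    for composition in compositions:
        missing = set(composition) - supported_elements
        if missing:
            blockers.update(missing)  # each missing element blocks this row
        else:
            scorable += 1
    fraction = scorable / len(compositions) if compositions else 0.0
    return fraction, blockers.most_common()


# Toy data for illustration only.
supported = {"Co", "Cr", "Fe", "Ni"}
alloys = [
    {"Co": 0.5, "Ni": 0.5},
    {"Fe": 0.5, "Mg": 0.5},
    {"Cr": 0.5, "Mg": 0.25, "Zn": 0.25},
    {"Co": 0.5, "Cr": 0.5},
]
fraction, ranked = coverage(alloys, supported)
print(fraction, ranked)
```

Note that a row missing two elements is only unblocked when both are added, so the per-element counts give an upper bound on the coverage gained by adding any single element.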
Re-run the coverage analysis on your own version of the dataset with:
python -m hea_bench.benchmark.coverage
Every primary source is cited per-row in the consolidated CSV. The
data files in data/raw/ carry per-source READMEs with
DOIs, licenses, and acquisition SHA-256s.
| Source | Citation | License | Status |
|---|---|---|---|
| Borg 2020 | Sci. Data 7, 430 (doi:10.1038/s41597-020-00768-9) | CC-BY-4.0 | Mirrored |
| Pei 2020 | npj Comput. Mater. 6, 50 (doi:10.1038/s41524-020-0308-7) | CC-BY-4.0 | Mirrored |
| Peivaste 2023 | Sci. Rep. 13, 22556 + GitHub | none on data | Pointer-only (fetch.py) |
| Miedema pair enthalpies | matminer MiedemaLiquidDeltaHf.tsv | BSD-3-Clause | Vendored (see descriptors/data/) |
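For pointer-only or re-downloaded sources, the recorded acquisition SHA-256 can be compared against a local file with the standard library alone. This is a generic sketch: the helper names are hypothetical, and the temporary file stands in for a real file under data/raw/ whose expected hash would come from that source's README.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Stream a file through SHA-256 so large CSVs need not fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_hex):
    """True when the file's hash matches the recorded hex digest."""
    return sha256_of(path) == expected_hex.strip().lower()


# Demonstration on a throwaway file; the "recorded" hash is computed here
# only so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_bytes(b"abc")
    recorded = hashlib.sha256(b"abc").hexdigest()
    ok = verify(sample, recorded)
    tampered = verify(sample, "0" * 64)
print(ok, tampered)
```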
hea-bench/
├── data/
│ ├── raw/ per-source data with READMEs, licenses, SHAs
│ └── consolidated/ versioned benchmark releases (v0.1.0 here)
├── src/hea_bench/
│ ├── benchmark/ loaders, consolidator, coverage analysis
│ ├── descriptors/ ΔS_mix, δ, VEC, T_m, ΔH_mix, Ω + data tables
│ ├── rules/ four canonical empirical rules as classifiers
│ ├── classifiers/ diagnostic-stats machinery
│ ├── composition.py formula parser, normalizer
│ ├── constants.py R = 8.314
│ ├── evaluate.py orchestrator: rules vs benchmark → headline stats
│ └── cli.py command-line entry point
├── tests/ 157 tests, all passing
├── web/ self-contained HTML calculator (pure JS, no server)
└── pyproject.toml
git clone <repo>
cd hea-bench
pip install -e ".[dev,data]"
python -m pytest tests/ -q
The HTML calculator (web/index.html) is an independent
JavaScript implementation of the same descriptors and rules. When you
modify Python descriptor code, sanity-check the calculator against
the same composition (e.g. the Cantor alloy values) so the two
surfaces don't drift.
Contributions, bug reports, and dataset additions are welcome. See
CONTRIBUTING.md for development setup, the
testing convention, and the data-provenance policy. To report a bug
or ask a question, open a GitHub issue; for direct contact, email
the maintainer at davjfies@gmail.com. Participation is governed by
the Code of Conduct.
hea-bench is released under the MIT license. The vendored
matminer Miedema data files remain
under their upstream BSD-3-Clause license, preserved at
descriptors/data/LICENSE.matminer.txt.
Citation metadata is in CITATION.cff. When citing
hea-bench, please also cite the original source datasets (Borg, Pei,
Peivaste) and matminer — see data/raw/<source>/README.md for each
source's preferred citation.
hea-bench is archived on Zenodo. The concept DOI 10.5281/zenodo.20346287 always resolves to the latest version; v0.1.0 specifically is 10.5281/zenodo.20346288.
All numerical parameters, formulas, threshold values, and benchmark numbers are derived from cited primary sources or computed in this codebase from documented inputs; the author verified outputs against the cited literature.