LEAP Data Release

This repository contains data tables supporting the LEAP / Perovskite-RL manuscript. It is a data-release repository: model weights, Hugging Face training datasets, raw PDFs, logs, procurement tables, and the full internally curated candidate pool are not stored here.

Resource Links

| Resource | Link | Notes |
| --- | --- | --- |
| Manuscript | arXiv:2605.20242 | Associated LEAP / Perovskite-RL preprint. |
| Model weights | JH976/Perovskite-RL | Perovskite-RL model repository. Update this link if the public model repository uses a different name. |
| Training datasets | datasets/JH976/Perovskite-RL | SFT and GRPO datasets for the language-model training stages. |
| Data-release repository | WD928/LEAP | Tables and source data used for benchmark, ablation, and candidate-selection reporting. |

Repository Map

| Module | Path | Contents |
| --- | --- | --- |
| Hot-start additive data | data/ | 36 experimentally characterized additives, hard descriptors, soft mechanistic descriptor statistics, and relative PCE changes. |
| Mechanism benchmark | benchmark/questions.csv | 32 multiple-choice questions from held-out literature sources. |
| Model benchmark results | benchmark/model_results/ | Per-question answers for Perovskite-RL and baseline models. |
| Benchmark statistics | benchmark/statistics/ | Accuracy summaries, exact McNemar-test tables, and Holm-Bonferroni-adjusted pairwise comparisons. |
| Representation ablation | ablation/representation/ | Hard/soft/hybrid representation ablation source data and bootstrap confidence intervals for Figure 3 metrics. |
| Reasoning-source ablation | ablation/reasoning_source/ | Perovskite-RL versus backbone soft-descriptor ablation tables and top-k diagnostics. |
| Decision-policy ablation | ablation/decision_policy/ | Expected-improvement, predicted-mean, uncertainty, and random-policy comparison data. |
| Candidate selection | candidate_selection/ | Round-specific top-50 validation shortlists with molecule identifiers and mechanism-score summaries. |
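
For readers unfamiliar with the four policies compared in the decision-policy ablation, the sketch below shows how each one could rank the same set of candidates. It is an illustration only: it assumes a Gaussian predictive distribution and the textbook expected-improvement formula, which this repository does not confirm as the manuscript's exact formulation. The names `expected_improvement`, `rank_candidates`, and the `cand_*` entries with their values are hypothetical and do not come from the data tables.

```python
import random
from math import erf, exp, pi, sqrt


def normal_pdf(z: float) -> float:
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def expected_improvement(mean: float, std: float, best_so_far: float) -> float:
    """Textbook EI for maximization under a Gaussian prediction (assumed form)."""
    if std <= 0.0:
        # With no predictive uncertainty, EI reduces to the plain improvement.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    return (mean - best_so_far) * normal_cdf(z) + std * normal_pdf(z)


def rank_candidates(predictions, best_so_far, policy, seed=0):
    """Return candidate names ordered from most to least preferred.

    predictions maps a candidate name to a (mean, std) pair.
    """
    names = list(predictions)
    if policy == "random":
        random.Random(seed).shuffle(names)
        return names
    if policy == "expected_improvement":
        score = lambda n: expected_improvement(*predictions[n], best_so_far)
    elif policy == "predicted_mean":
        score = lambda n: predictions[n][0]
    elif policy == "uncertainty":
        score = lambda n: predictions[n][1]
    else:
        raise ValueError(f"unknown policy: {policy}")
    return sorted(names, key=score, reverse=True)


# Hypothetical predictions: (predicted relative PCE change, predictive std).
predictions = {
    "cand_A": (1.0, 0.10),
    "cand_B": (0.8, 1.50),
    "cand_C": (1.2, 0.05),
}

for policy in ("expected_improvement", "predicted_mean", "uncertainty", "random"):
    print(policy, rank_candidates(predictions, best_so_far=1.1, policy=policy))
```

In this toy example the three deterministic policies produce three different orderings, which is the kind of disagreement a policy comparison is meant to quantify.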

Key Files

| File | Description |
| --- | --- |
| data/hot_start_additives.csv | Main 36-additive hot-start table with measured PCE values and descriptors. |
| benchmark/statistics/model_summary_with_ci.csv | Benchmark accuracy summary with confidence intervals. |
| benchmark/statistics/mcnemar_vs_reference_holm.csv | Exact McNemar comparisons against Perovskite-RL with Holm-Bonferroni adjustment. |
| ablation/representation/figure3_bootstrap_ci_table.csv | Bootstrap confidence intervals for the Figure 3 representation-ablation metrics. |
| candidate_selection/top50_validation_shortlists_mechanism_scores.csv | Cleaned round-specific top-50 validation shortlist table. |
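
As a reading aid for the McNemar table, the sketch below shows one standard way to compute a two-sided exact McNemar p-value from paired per-question outcomes and then apply a Holm-Bonferroni step-down adjustment across several comparisons. It is a minimal sketch, not the script used to produce the released tables. The baseline names and discordant counts are hypothetical, and the two-sided convention (doubling the smaller binomial tail, capped at 1) is an assumption about how the p-values were defined.

```python
from math import comb


def exact_mcnemar_p(b: int, c: int) -> float:
    """Two-sided exact McNemar p-value from the discordant-pair counts.

    b: questions the reference model got right and the baseline got wrong.
    c: questions the baseline got right and the reference model got wrong.
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


def holm_adjust(p_values):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity so adjusted values never decrease with rank.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical discordant counts (b, c) for three baselines versus the reference.
discordant = {
    "baseline_a": (9, 1),
    "baseline_b": (6, 2),
    "baseline_c": (4, 4),
}

raw = [exact_mcnemar_p(b, c) for b, c in discordant.values()]
for name, p_raw, p_adj in zip(discordant, raw, holm_adjust(raw)):
    print(f"{name}: raw p = {p_raw:.4f}, Holm-adjusted p = {p_adj:.4f}")
```

Only the discordant pairs enter the test, so with a 32-question benchmark the achievable p-values are coarse and the adjusted values can be noticeably larger than the raw ones.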

License

This repository is released under the Apache License 2.0. See the LICENSE file for details.

Citation

Please cite the associated arXiv preprint if you use this repository:

https://arxiv.org/abs/2605.20242
