Version 0.0.2 -- Released 2026-05-08
Added
Added per-instance predictor base class (InstancePredictor) and random example
Users can now specify patterns matching helm runs, suites, or all outputs as predictor input
Added symbol sweeping capability to evaluation card evaluator
Added modal CLI for evaluation.py script
Added support for KWDagger pipelines in evaluation cards (both as explicit pipelines and as YAML-defined pipelines)
Added support for symbol overrides to magnet evaluate with the --override argument
Added parallelization to magnet evaluate with the --jobs (and --parallel_backend) arguments
Added claim resolution and final result file output to magnet evaluate
Added support for claim_aggregation_strategy to evaluation cards (supporting any, all, and fraction strategies)
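As a rough illustration of the three claim_aggregation_strategy values named above, the sketch below combines a list of per-claim boolean outcomes into one verdict. It is not taken from the magnet codebase: the function name `aggregate_claims`, the `threshold` parameter, and the reading of "fraction" as "at least this share of claims hold" are all assumptions made for the example.

```python
from typing import Sequence


def aggregate_claims(
    results: Sequence[bool],
    strategy: str = "all",
    threshold: float = 0.5,  # assumed parameter, used only by "fraction"
) -> bool:
    """Hypothetical helper (not part of the magnet API).

    Combines per-claim pass/fail outcomes into a single verdict.
    """
    if not results:
        raise ValueError("no claim results to aggregate")
    if strategy == "any":
        # Passes if at least one claim holds.
        return any(results)
    if strategy == "all":
        # Passes only if every claim holds.
        return all(results)
    if strategy == "fraction":
        # Assumed semantics: passes if the share of holding claims
        # reaches the threshold.
        return sum(results) / len(results) >= threshold
    raise ValueError(f"unknown strategy: {strategy!r}")
```

Under these assumptions, `aggregate_claims([True, False, True], "fraction", threshold=0.6)` passes because two of three claims hold, while the same input fails under "all". The actual behavior in evaluation cards may differ.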
Changed
Switched to single argument path input for example predictors
Cleaned up predicted vs. actual code for predictors
HelmRuns.coerce can now accept a more expressive set of inputs
BREAKING: You must now specify helm_runs when calling the predictor.
magnet download helm can now download multiple benchmarks
Fixed
Fixed doctests and README wrt predictor refactors
Updated predict_inputs_exploration.ipynb notebook wrt API updates