You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Introduced a configuration-driven benchmarking module for evaluating
backtracking performance.
Integration with lm-evaluation-harness for running standardized NLP
benchmarks.
Hyperparameter optimization using optuna to find the best Operator
settings.
New CLI script for launching benchmark runs from a YAML configuration file.
Fixed
Corrected the stop sequence handling to robustly and automatically stop on
both the model's EOS token and user-defined sequences, preventing premature
generation termination.