fix(suite_C): describe the actual reason a precision level is skipped #46
Merged
Follow-up to the cleanup in #45. That PR removed the runner-declared quantization-backend gating logic and renamed the obvious skip-reason in the headline `print` (line 101), but two sibling references to the old strategy were missed:

* The function-level docstring still claimed format selection intersects with `runner.SUPPORTED_QUANTIZATIONS` and warns on any format the runner doesn't declare.
* The per-format final-summary line printed `skipped (backend not in SUPPORTED_QUANTIZATION_BACKENDS)` even though the `skipped` list now only ever holds the *other* full-precision baseline (e.g. FP16 on Ampere where the hw baseline is BF16).

Rewrite both so the docstring describes today's policy (always include the hw-supported full-precision baseline; dispatch every quantized level; let the inference subprocess decide hardware compatibility) and the skip-reason print matches what actually causes the entry.

The result.json field name `precision_levels_skipped` is **kept**: it's a stable schema field already indexed by the leaderboard and used by older results, so the name stays; only the human-readable strings around it are corrected.

No functional change.

Co-authored-by: Cursor <cursoragent@cursor.com>
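For reviewers who want the policy in concrete terms, here is a minimal sketch of the selection rule described above. It is not the code in suite_C: the names `select_precision_levels`, `skip_reason`, and `FULL_PRECISION`, the quantized level names `INT8`/`INT4`, and the exact wording of the printed reason are all hypothetical. Only the `precision_levels_skipped` field name and the FP16/BF16 example come from this PR.

```python
# Hypothetical sketch of the selection policy; not the actual suite_C code.
FULL_PRECISION = ("FP16", "BF16")  # assumed set of full-precision baselines


def select_precision_levels(hw_baseline, quantized_levels):
    """Return (selected, skipped) precision levels for one run."""
    if hw_baseline not in FULL_PRECISION:
        raise ValueError(f"unknown full-precision baseline: {hw_baseline}")
    # Always include the hardware-supported full-precision baseline and
    # dispatch every quantized level; no runner-declared gating is consulted.
    # Hardware compatibility is left to the inference subprocess.
    selected = [hw_baseline, *quantized_levels]
    # The only entries that can be skipped are the other full-precision baselines.
    skipped = [p for p in FULL_PRECISION if p != hw_baseline]
    return selected, skipped


def skip_reason(level, hw_baseline):
    """Human-readable reason; the wording here is illustrative only."""
    return f"{level}: skipped (hardware full-precision baseline is {hw_baseline})"


# Example: an Ampere-like setup where the hardware baseline is BF16.
selected, skipped = select_precision_levels("BF16", ["INT8", "INT4"])
result = {"precision_levels_skipped": skipped}  # schema field name is unchanged
for level in skipped:
    print(skip_reason(level, "BF16"))
```

Under these assumptions the skipped list contains only FP16, which is why a reason mentioning unsupported quantization backends would be misleading for that entry.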