
feat(inference): add benchmark performance guardrails #219

Merged
TobiBu merged 4 commits into feat/posterior-predictive-outputs from feat/perf-guardrails
Apr 21, 2026

Conversation

TobiBu (Collaborator) commented Apr 21, 2026

Summary

  • add performance_guardrails module for optimization/VI benchmark threshold checks
  • add runtime/objective threshold dataclasses and pass/fail result object
  • add unit tests for pass/fail regression scenarios
  • export guardrail APIs and document usage in inference workflow docs

Validation

  • pre-commit passed on changed files
  • compileall passed for module and tests

Copilot AI (Contributor) left a comment

Pull request overview

Adds an inference “performance guardrails” utility to evaluate benchmark results against configurable runtime/objective thresholds, with tests and documentation so regressions can be detected early in CI or local benchmarking workflows.

Changes:

  • Introduces rubix.inference.performance_guardrails with threshold dataclasses and pass/fail check helpers for optimization + VI benchmarks.
  • Exports guardrail APIs from rubix.inference for public use and adds Sphinx docs coverage.
  • Adds unit tests covering pass/fail scenarios for both optimization and VI guardrail checks.
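As a rough illustration of the API shape described above, the sketch below shows what a threshold dataclass, a pass/fail result object, and a check helper could look like. The names (RuntimeThresholds, GuardrailResult, check_runtime) are hypothetical and chosen for illustration; the actual identifiers live in rubix.inference.performance_guardrails and may differ.

```python
# Minimal sketch of a benchmark guardrail API; all names here are
# illustrative assumptions, not the actual rubix.inference API.
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class RuntimeThresholds:
    """Upper bound on benchmark wall-clock time, in seconds (None = unchecked)."""

    max_total_runtime_s: Optional[float] = None


@dataclass(frozen=True)
class GuardrailResult:
    """Pass/fail outcome plus human-readable failure reasons for CI logs."""

    passed: bool
    failures: List[str] = field(default_factory=list)


def check_runtime(runtime_s: float, thresholds: RuntimeThresholds) -> GuardrailResult:
    """Compare a measured runtime against the configured threshold."""
    failures: List[str] = []
    if (
        thresholds.max_total_runtime_s is not None
        and runtime_s > thresholds.max_total_runtime_s
    ):
        failures.append(
            f"runtime {runtime_s:.2f}s exceeds limit "
            f"{thresholds.max_total_runtime_s:.2f}s"
        )
    return GuardrailResult(passed=not failures, failures=failures)


result = check_runtime(12.5, RuntimeThresholds(max_total_runtime_s=10.0))
print(result.passed)    # False
print(result.failures)
```

Returning a result object with reasons (rather than a bare bool) lets a CI step both fail the build and print which threshold was violated.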

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 1 comment.

Summary per file:

  • rubix/inference/performance_guardrails.py: New guardrail checking utilities and threshold/result dataclasses.
  • rubix/inference/__init__.py: Re-exports guardrail APIs at the package level.
  • tests/test_inference_performance_guardrails.py: Unit tests for guardrail pass/fail behavior.
  • docs/rubix.inference.rst: Adds the new module to Sphinx API docs.
  • docs/inference_workflows.rst: Documents intended usage pattern for guardrails in workflows.


Comment on lines +16 to +23:

@dataclass(frozen=True)
class ObjectiveThresholds:
    """Thresholds for objective quality checks."""

    max_final_loss: Optional[float] = None
    max_best_loss: Optional[float] = None
    max_final_objective: Optional[float] = None
    max_best_objective: Optional[float] = None

Copilot AI commented Apr 21, 2026

ObjectiveThresholds mixes optimization loss thresholds (max_final_loss/max_best_loss) with VI objective thresholds (max_final_objective/max_best_objective), but each guardrail checker only evaluates a subset. As written, setting the “wrong” fields for a given checker will be silently ignored and can incorrectly report passed=True. Consider splitting this into two threshold dataclasses (optimization vs VI), or add explicit validation in check_ifu_optimization_guardrails/check_vi_guardrails to raise a ValueError when irrelevant threshold fields are set.

TobiBu (Collaborator, Author) commented:
@copilot apply changes based on this feedback

feat(inference): add synthetic science recipe workflow
Agent-Logs-Url: https://github.com/AstroAI-Lab/rubix/sessions/13c7f31b-922d-4a76-8508-d6d549378715

Co-authored-by: TobiBu <7574273+TobiBu@users.noreply.github.com>
TobiBu (Collaborator, Author) commented Apr 21, 2026

@copilot can you try again to apply changes based on the feedback in this PR?

Copilot AI commented Apr 21, 2026

@copilot can you try again to apply changes based on the feedback in this PR?

The changes are applied in commit 9bffb0e. ObjectiveThresholds has been split into two typed dataclasses:

  • OptimizationObjectiveThresholds(max_final_loss, max_best_loss) — accepted by check_ifu_optimization_guardrails
  • VIObjectiveThresholds(max_final_objective, max_best_objective) — accepted by check_vi_guardrails

This removes the silent-ignore issue: passing the wrong threshold type to a checker now results in a type mismatch rather than incorrect pass/fail reporting. Tests and docs have been updated accordingly.
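The split described above can be sketched as follows. The dataclass and checker names come from this comment thread; the field layout follows the bullets above, while the checker body and its positional parameters are simplified assumptions about how such a check might be wired up.

```python
# Illustrative shape of the post-split design from commit 9bffb0e.
# Names follow the PR discussion; the checker body is a simplified sketch.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OptimizationObjectiveThresholds:
    """Loss thresholds accepted by the optimization guardrail checker."""

    max_final_loss: Optional[float] = None
    max_best_loss: Optional[float] = None


@dataclass(frozen=True)
class VIObjectiveThresholds:
    """ELBO-style objective thresholds accepted by the VI guardrail checker."""

    max_final_objective: Optional[float] = None
    max_best_objective: Optional[float] = None


def check_vi_guardrails(
    final_objective: float,
    best_objective: float,
    thresholds: VIObjectiveThresholds,
) -> bool:
    """Return True when every configured VI threshold is satisfied."""
    checks = [
        (final_objective, thresholds.max_final_objective),
        (best_objective, thresholds.max_best_objective),
    ]
    # Unset (None) thresholds are intentionally skipped, matching the
    # optional-field semantics of the original design.
    return all(limit is None or value <= limit for value, limit in checks)
```

With separate types, passing OptimizationObjectiveThresholds to check_vi_guardrails is an annotation-level mismatch that static type checkers flag, rather than a silent no-op at runtime.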

@TobiBu TobiBu merged commit 30e4be3 into feat/posterior-predictive-outputs Apr 21, 2026
1 of 2 checks passed
Copilot stopped work on behalf of TobiBu due to an error April 21, 2026 14:19
