
feat: run robustness estimator on empirical test data and persist metric #21

Merged
daedalus merged 1 commit into master from copilot/run-estimator-empirical-data
May 6, 2026

Conversation

Copilot AI (Contributor) commented May 6, 2026

Replaces the placeholder example numbers in the robustness evaluator docs with real values measured from the current test suite, and persists the full metric snapshot in impactguard.toml.

Empirical inputs

| Input | Value |
| --- | --- |
| Tests run | 1,054 (425 adversarial / 629 normal) |
| Adversarial passing | 424 / 425 |
| Normal passing | 629 / 629 |
| Coverage | 57% |
| α | 0.65 (security context) |

Results

| Metric | Value | Label |
| --- | --- | --- |
| Robustness Score R | 0.5691 | FAIR |
| Robustness + Diversity R_d | 0.5691 | |
| Fragility Index F | 0.0024 | ROBUST |
| Adversarial ratio | 40.3% | ✓ ≥ 25% |
| Diversity D | 1.000 | all categories covered |

Per-category (taxonomy): boundary 28/28, semantic 22/22, evasion 24/24, compositional 19/19.
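
As a sanity check, the headline figures are mutually consistent under one natural reading of the metric: the α-weighted adversarial/normal pass rates scaled by coverage reproduce R, and the adversarial failure share reproduces F. The sketch below is an inference from the published numbers, not the evaluator's confirmed implementation.

```python
# Inferred formulas only -- the evaluator's actual implementation is not
# shown in this PR; these expressions merely reproduce the published
# numbers to the printed precision.
n_total, n_adversarial, n_normal = 1054, 425, 629
passing_adv, passing_norm = 424, 629
coverage, alpha = 0.57, 0.65  # security context

p_adv = passing_adv / n_adversarial   # 0.9976
p_norm = passing_norm / n_normal      # 1.0000

r = coverage * (alpha * p_adv + (1 - alpha) * p_norm)  # 0.5691 -> FAIR
fragility = 1 - p_adv                                  # 0.0024 -> ROBUST
adv_ratio = n_adversarial / n_total                    # 0.403  -> meets the 25% gate
diversity = 4 / 4                                      # all taxonomy categories hit

print(f"R = {r:.4f}, F = {fragility:.4f}, ratio = {adv_ratio:.1%}, D = {diversity:.3f}")
```

With D = 1.0, R_d collapses to R, which is why both rows show 0.5691.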

Changes

  • impactguard.toml — new [impactguard.robustness] section persisting all metric values and the per-category breakdown; acts as the canonical measured baseline (a read-back sketch follows this list)
  • README.md — CLI example, Python API snippet, and "Example output" block replaced with the empirical numbers above
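
Since the snapshot lives in impactguard.toml, downstream tooling can read the baseline back with the standard library alone. A minimal read-back sketch, assuming the key names listed in the reviewer's guide further down (the merged file's exact keys may differ):

```python
# Read-back sketch -- key names are taken from the reviewer's guide below;
# the merged impactguard.toml may differ in detail.
import tomllib  # stdlib TOML parser, Python 3.11+

with open("impactguard.toml", "rb") as f:
    config = tomllib.load(f)

baseline = config["impactguard"]["robustness"]
print(baseline["robustness_score"], baseline["robustness_label"])  # 0.5691 FAIR
print(baseline["fragility_index"], baseline["fragility_label"])    # 0.0024 ROBUST
for name, stats in baseline["categories"].items():
    print(name, stats)  # per-category totals and passing counts
```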

Summary by Sourcery

Persist empirically measured robustness metrics from the current test suite and surface them in documentation as the canonical example outputs.

New Features:

  • Add an [impactguard.robustness] section to impactguard.toml that stores the latest robustness evaluation snapshot and per-category adversarial breakdown.

Enhancements:

  • Update README examples for the robustness evaluator CLI and Python API to use real metrics derived from the current test suite instead of placeholder values.


sourcery-ai Bot commented May 6, 2026

Reviewer's Guide

Update robustness evaluation documentation and configuration to use empirically measured metrics from the current test suite, and persist the full robustness metric snapshot (including per-category breakdown) in impactguard.toml as a canonical baseline.

File-Level Changes

README.md — Replace the README robustness evaluator examples with empirical metrics from the current test suite.
  • Update the Python API usage example to pass the empirical totals, adversarial/normal counts, coverage, alpha, and per-category CategoryStats matching the current taxonomy tests (a hypothetical sketch of this call follows below).
  • Update the CLI example invocation arguments (n-total, n-adversarial, passing-adv, passing-norm, coverage, categories JSON) to match the empirical test data.
  • Update the CLI JSON-output example command to use the same empirical inputs as the primary CLI example.
  • Refresh the sample human-readable CLI output block to show the computed metrics and per-category breakdown from the empirical run, including the updated robustness, fragility, and diversity values.
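
For orientation, here is a hypothetical shape of that API call using the empirical inputs above. CategoryStats is named in this PR, but evaluate_robustness, the import path, and the keyword names are illustrative guesses, not the project's confirmed signature:

```python
# Hypothetical sketch -- evaluate_robustness, the import path, and the
# keyword names are guesses; only CategoryStats and the input values
# appear in this PR.
from impactguard.robustness import CategoryStats, evaluate_robustness

categories = {
    "boundary": CategoryStats(total=28, passing=28),
    "semantic": CategoryStats(total=22, passing=22),
    "evasion": CategoryStats(total=24, passing=24),
    "compositional": CategoryStats(total=19, passing=19),
}

result = evaluate_robustness(
    n_total=1054,
    n_adversarial=425,
    passing_adv=424,
    passing_norm=629,
    coverage=0.57,
    alpha=0.65,  # security context
    categories=categories,
)
print(result.robustness_score)  # 0.5691 for the empirical run
```
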
impactguard.toml — Persist the latest empirically measured robustness metrics and test composition as a baseline.
  • Introduce a new [impactguard.robustness] section documenting the last measured robustness evaluation from empirical test runs.
  • Record the test-composition inputs (n_total, n_adversarial, n_normal, passing_adv, passing_norm, coverage, alpha) used for the robustness calculation.
  • Store the derived primary metrics (robustness_score, robustness_score_with_diversity, robustness_label) and adversarial-specific metrics (p_adversarial, p_normal, adversarial_ratio, fragility_index, fragility_label, diversity_score) with inline explanatory comments and thresholds.
  • Add a nested [impactguard.robustness.categories] table capturing per-category adversarial totals and passing counts aligned with test_adversarial_taxonomy.py (a sketch of the resulting section follows below).
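
Combining the guide's field names with the measured values, the persisted section plausibly resembles the sketch below; the comments, ordering, and exact thresholds in the merged file may differ.

```toml
# Sketch only -- key names come from the reviewer's guide above and values
# from the empirical run; the merged file's layout may differ.
[impactguard.robustness]
n_total = 1054
n_adversarial = 425
n_normal = 629
passing_adv = 424
passing_norm = 629
coverage = 0.57
alpha = 0.65                           # security context

robustness_score = 0.5691              # FAIR
robustness_score_with_diversity = 0.5691
robustness_label = "FAIR"
p_adversarial = 0.9976
p_normal = 1.0
adversarial_ratio = 0.403              # gate: >= 0.25
fragility_index = 0.0024
fragility_label = "ROBUST"
diversity_score = 1.0                  # all taxonomy categories covered

[impactguard.robustness.categories]
boundary = { total = 28, passing = 28 }
semantic = { total = 22, passing = 22 }
evasion = { total = 24, passing = 24 }
compositional = { total = 19, passing = 19 }
```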


@daedalus daedalus marked this pull request as ready for review May 6, 2026 20:30
@daedalus daedalus merged commit e9bd719 into master May 6, 2026
1 check was pending
@codacy-production

Up to standards ✅

🟢 Issues: 0

Results: 0 new issues

View in Codacy



sourcery-ai Bot left a comment


Hey - I've left some high-level feedback:

  • The empirical robustness metrics are now duplicated between README examples and impactguard.toml; consider centralizing these values (e.g., generating README snippets from the TOML or a single snapshot file) to avoid future drift when the baseline is updated.
  • Storing a specific empirical run’s metrics directly in impactguard.toml mixes configuration with measurement output; you might want to move the snapshot to a dedicated metrics/baseline file and keep the TOML focused on user-adjustable settings.

@daedalus daedalus deleted the copilot/run-estimator-empirical-data branch May 7, 2026 02:52
