
Add golden regression store and coverage-aware validation #23

Closed
DevOpsMadDog wants to merge 1 commit into main from codex/implement-golden-regression-dataset-loader

Conversation

@DevOpsMadDog
Owner

Summary

  • add a golden regression data store for loading and matching historical cases
  • integrate the decision engine with the store to return coverage-aware regression validation details
  • seed historical regression cases and add focused tests that exercise pass/fail/no-coverage paths
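
The store described above could look roughly like this minimal sketch. The class and method names (`GoldenRegressionStore`, `seed`, `match`) and the feature-overlap matching rule are assumptions for illustration, not necessarily the PR's actual implementation:

```python
from dataclasses import dataclass


@dataclass
class GoldenCase:
    """One historical regression case with its known outcome."""
    case_id: str
    features: dict
    passed: bool
    confidence: float  # how strongly later inputs matched this case


class GoldenRegressionStore:
    """In-memory store of golden regression cases (hypothetical sketch)."""

    def __init__(self) -> None:
        self._cases: list[GoldenCase] = []

    def seed(self, cases: list[GoldenCase]) -> None:
        """Load historical cases into the store."""
        self._cases.extend(cases)

    def match(self, features: dict, min_overlap: int = 1) -> list[GoldenCase]:
        """Return cases sharing at least `min_overlap` identical feature values."""
        matches = []
        for case in self._cases:
            overlap = sum(
                1 for k, v in features.items() if case.features.get(k) == v
            )
            if overlap >= min_overlap:
                matches.append(case)
        return matches
```

An empty `match` result is what produces the "no-coverage" path the tests exercise.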

Testing

  • pytest tests/test_golden_regression.py

https://chatgpt.com/codex/tasks/task_e_68df1ed639848329924e3bced93280b5


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting

Comment on lines 771 to +774

```diff
 return {
-    "status": "validated",
-    "confidence": 0.89,
-    "similar_cases": 23,
-    "validation_passed": True
+    "status": status,
+    "confidence": average_confidence,
+    "validation_passed": validation_passed,
```

P1: Regression failures still boost consensus confidence

When _real_golden_regression_validation finds failing historical cases it sets status to "regression_failed", but it still returns the average match confidence in confidence. Downstream consensus (_real_consensus_checking) only consumes this numeric confidence and never inspects validation_passed, so a regression failure with high-confidence matches will still contribute a positive 0.6–0.9 score toward the overall decision and may allow a release despite detected regressions. Consider zeroing or penalizing confidence when validation_passed is False, or have consensus incorporate the failure flag.

Useful? React with 👍 / 👎.
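
The fix Codex suggests could be sketched as follows. The function name and the exact match fields here are hypothetical, but the key move is zeroing `confidence` whenever `validation_passed` is False, so a consensus step that only reads the numeric confidence cannot be inflated by high-confidence matches against failing cases:

```python
def summarize_regression_validation(matches: list[dict]) -> dict:
    """Aggregate matched golden cases into a coverage-aware result.

    Zeroes out confidence on failure so downstream consensus scoring
    (which only consumes the numeric `confidence`) cannot be boosted
    by high-confidence matches against *failing* historical cases.
    """
    if not matches:
        # No historical coverage for this input.
        return {"status": "no_coverage", "confidence": 0.0, "validation_passed": None}

    validation_passed = all(case["passed"] for case in matches)
    average_confidence = sum(case["confidence"] for case in matches) / len(matches)
    status = "validated" if validation_passed else "regression_failed"
    return {
        "status": status,
        # Penalize to 0.0 on failure, per the review suggestion.
        "confidence": average_confidence if validation_passed else 0.0,
        "validation_passed": validation_passed,
    }
```

The alternative fix named in the comment is equally valid: leave `confidence` untouched and have the consensus step read `validation_passed` directly.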

DevOpsMadDog added a commit that referenced this pull request May 2, 2026
…endpoints

Closed in this batch (canonical envelope, mirroring batch-6 pattern):
- /api/v1/posture-reports/reports #7: canonical envelope shipped
- /api/v1/cloud-ir/incidents #17: canonical envelope shipped
- /api/v1/network-forensics/captures #21: canonical envelope shipped
- /api/v1/network-segmentation/segments #22: canonical envelope shipped
- /api/v1/microsegmentation/segments #23: canonical envelope shipped
- /api/v1/awareness-gamification/challenges #29: canonical envelope shipped
- /api/v1/gdpr/activities #30: canonical envelope shipped

Pattern (class-c): all seven list endpoints upgraded from minimal
{<legacy_key>, total, hint} to the canonical batch-6/batch-7 envelope:
    {
        "items": [...],
        "<legacy_key>": [...],   # back-compat (reports/incidents/captures/etc.)
        "total": int,
        "org_id": str,
        "limit": int,            # ge=1, le=500 — defaults to 50
        "offset": int,           # ge=0 — defaults to 0
        "filters_applied": {...}, # echoes every filter param (None if unset)
        "hint": str              # only present when total == 0
    }

Each endpoint now (1) accepts limit + offset query params with FastAPI
ge/le validation, (2) echoes every filter back into filters_applied even
when None (no missing keys), (3) always returns the full envelope shape
even on hit (legacy clients keep their original key, new clients use
items + pagination context), (4) preserves the actionable empty-state
hint with a "this is correct for fresh tenants" framing.

Triage status update: 26/30 fully closed. 4 class-a deferred (need real
cloud creds, OAuth flows, or PAM tenant access not present in fleet —
sprint-able with customer engagement). All class-b importer-gated
endpoints (8) and all class-c structured-empty endpoints (12) now
closed.

Verified: pytest tests/test_empty_endpoints_batch7.py 11/11 PASS.
Beast Mode regression on phase4/phase7/trustgraph/pipeline_api: 170/170 PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DevOpsMadDog added a commit that referenced this pull request May 5, 2026
…34c5fb

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
DevOpsMadDog added a commit that referenced this pull request May 5, 2026
… at HEAD 2c72e3a

Suite 1 Beast Mode 13 files: 753 passed, 0 failed in 8.63s
Suite 2 Perf -m perf: 194 passed, 2 skipped, 0 failed in 26.28s
Suite 3 OWASP -m owasp: 47 passed, 2 skipped, 0 failed in 17.86s
Suite 4 Lockdown (asyncio + coroutines): 11 passed, 0 failed in 6.50s
Total: 1005 passed, 0 failed, 4 skipped, 0 broken collectors
Commits validated since sweep #23: 9945b72, 2c72e3a (doc-only)
CERTIFICATION: ALL GREEN

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>