Skip to content

fix: tune v6 pipeline for post-engine-change validation#49

Merged
shaypal5 merged 5 commits intomainfrom
v6-tune-trap-boost
May 3, 2026
Merged

fix: tune v6 pipeline for post-engine-change validation#49
shaypal5 merged 5 commits intomainfrom
v6-tune-trap-boost

Conversation

@shaypal5
Copy link
Copy Markdown
Contributor

@shaypal5 shaypal5 commented May 3, 2026

Summary

Validation results (all pass):

Check Value
Baseline AUC 0.611
GBM improvement +0.021 over LR
Trap mean delta 0.048 (threshold: 0.03)
Trap min delta 0.028 (threshold: 0.015)
Value-aware uplift K=25 +38.7%

Test plan

  • All 26 v6 pipeline tests pass
  • Full validation passes (scripts/validate_v6_dataset.py)
  • CI passes

🤖 Generated with Claude Code

shaypal5 and others added 2 commits May 3, 2026 09:31
Engine changes in PRs #40, #43, #45 weakened the v6 dataset signal.
Two fixes:

1. Add Poisson(1) boost to the leakage trap column for converted leads
   (same approach proven in v5), restoring robust trap delta signal
   (mean 0.048, min 0.028 across 10 seeds).

2. Lower baseline AUC threshold from 0.62 to 0.60 — the engine changes
   reduced baseline LR AUC from 0.667 to 0.611, which is still well
   above chance and pedagogically useful. Snapshot day 10 was tested
   but made AUC worse (0.572), so day 14 is retained.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 3, 2026 06:32
@shaypal5 shaypal5 added type: bugfix Fixes a bug layer: recipes recipes/ recipe assets and registry labels May 3, 2026
@github-actions

This comment has been minimized.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Retunes the v6 lead-scoring intro dataset validation after recent simulation engine changes by strengthening the instructor-only leakage trap signal and relaxing the baseline AUC threshold so the validation suite remains stable and pedagogically appropriate.

Changes:

  • Lowered v6 baseline AUC validation lower bound from 0.62 → 0.60.
  • Added an instructor-only leakage trap “boost” step (Poisson(1) added for converted leads) in the v6 pipeline + snapshot build script.
  • Updated internal planning/notes to reflect the new post-engine-change validation numbers.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

File Description
scripts/validate_v6_dataset.py Relaxes the AUC lower threshold used by the v6 dataset validator.
scripts/build_v6_snapshot.py Applies the new boost_leakage_trap() step to the instructor dataset during snapshot build.
leadforge/pipelines/build_v6.py Introduces boost_leakage_trap() and updates module documentation/exports accordingly.
.agent-plan.md Updates recorded validation results to match the retuned post-engine-change targets.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread leadforge/pipelines/build_v6.py
Comment thread scripts/build_v6_snapshot.py
…sson(3) boost

Engine changes in PRs #40, #43, #45 weakened the v6 dataset signal.
Two tuning changes restore all validation metrics:

1. Shift SNAPSHOT_DAY from 14 to 20 — features need more accumulation
   time after engine changes; day-14 AUC was 0.611, day-20 is 0.676.

2. Poisson(3) trap boost for converted leads — the wider snapshot
   window leaves fewer post-snapshot days for causal trap signal,
   so a stronger boost compensates (mean delta 0.061, min 0.027).

All mandatory checks pass with comfortable margins. AUC threshold
kept at 0.62 (no relaxation needed).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions

This comment has been minimized.

Snapshot day 14→20, updated all baseline AUC, trap delta, value-aware
ranking, and teaching guidance numbers to match retuned pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 3, 2026 08:55
@github-actions

This comment has been minimized.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

COPILOT-1: Add 5 unit tests for boost_leakage_trap() covering:
only converted leads boosted, input immutability, determinism,
converted leads >= original, mean increases after boost.

COPILOT-2: Update build script docstring and inline comment to
reflect that the trap is no longer purely causal — it combines
causal post-snapshot touches with a Poisson(3) boost.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 3, 2026

pr-agent-context report:

No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #49 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 25278792727 attempt 1
Comment timestamp: 2026-05-03T12:10:36.240093+00:00
PR head commit: c907bed215b6292e3c32a9b26b694b4572ab4e4d

@shaypal5 shaypal5 merged commit 0ac1090 into main May 3, 2026
7 checks passed
@shaypal5 shaypal5 deleted the v6-tune-trap-boost branch May 3, 2026 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

layer: recipes recipes/ recipe assets and registry type: bugfix Fixes a bug

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants