feat: make label_window_days affect conversion label derivation#43
Merged
feat: make label_window_days affect conversion label derivation#43
Conversation
The simulation engine now gates `converted_within_90_days` by `config.label_window_days` instead of using the full horizon. Simulation still runs for `horizon_days` to produce rich event histories; only label derivation respects the label window. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
- test_late_conversions_excluded now verifies conversion day offset < 30 (was a tautology — only checked timestamp not None) - Add bundle round-trip integration test (Generator → save → read Parquet) - Document feature-label temporal mismatch in build_snapshot when label_window_days < horizon_days Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Pull request overview
Updates the simulation engine so the binary conversion label (converted_within_90_days) is derived using GenerationConfig.label_window_days rather than always using the full simulation horizon, while keeping the simulation itself running for horizon_days.
Changes:
- Gate
converted_within_90_daysonconversion_day < config.label_window_daysinsimulate_world(). - Update
LeadRowdocumentation to clarify the label is driven bylabel_window_days(name retained for schema stability). - Add unit tests covering several
label_window_daysscenarios and ensuring event generation is unchanged.
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
tests/simulation/test_engine.py |
Extends test config helper to accept label_window_days and adds a new test class validating label-window behavior. |
leadforge/simulation/engine.py |
Computes converted_within_90_days from conversion timing relative to label_window_days, not the full horizon. |
leadforge/schema/entities.py |
Expands LeadRow docstring explaining converted_within_90_days now reflects label_window_days. |
.agent-plan.md |
Updates the agent plan log to reflect the completed work/tests for the label-window change. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Address Copilot review: document the intentional case where a lead converts after the label window, producing a non-None conversion_timestamp with converted_within_90_days=False. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
|
pr-agent-context report: No unresolved review comments, failing checks, or actionable patch coverage gaps were found on PR #43 in repository https://github.com/leadforge-dev/leadforge. Treat this PR as all clear unless new signals appear.Run metadata: |
3 tasks
shaypal5
added a commit
that referenced
this pull request
May 3, 2026
…sson(3) boost Engine changes in PRs #40, #43, #45 weakened the v6 dataset signal. Two tuning changes restore all validation metrics: 1. Shift SNAPSHOT_DAY from 14 to 20 — features need more accumulation time after engine changes; day-14 AUC was 0.611, day-20 is 0.676. 2. Poisson(3) trap boost for converted leads — the wider snapshot window leaves fewer post-snapshot days for causal trap signal, so a stronger boost compensates (mean delta 0.061, min 0.027). All mandatory checks pass with comfortable margins. AUC threshold kept at 0.62 (no relaxation needed). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
shaypal5
added a commit
that referenced
this pull request
May 3, 2026
* fix: tune v6 pipeline for post-engine-change validation Engine changes in PRs #40, #43, #45 weakened the v6 dataset signal. Two fixes: 1. Add Poisson(1) boost to the leakage trap column for converted leads (same approach proven in v5), restoring robust trap delta signal (mean 0.048, min 0.028 across 10 seeds). 2. Lower baseline AUC threshold from 0.62 to 0.60 — the engine changes reduced baseline LR AUC from 0.667 to 0.611, which is still well above chance and pedagogically useful. Snapshot day 10 was tested but made AUC worse (0.572), so day 14 is retained. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: update .agent-plan.md with v6 retune validation results Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: tune v6 for post-engine-change validation — snapshot day 20, Poisson(3) boost Engine changes in PRs #40, #43, #45 weakened the v6 dataset signal. Two tuning changes restore all validation metrics: 1. Shift SNAPSHOT_DAY from 14 to 20 — features need more accumulation time after engine changes; day-14 AUC was 0.611, day-20 is 0.676. 2. Poisson(3) trap boost for converted leads — the wider snapshot window leaves fewer post-snapshot days for causal trap signal, so a stronger boost compensates (mean delta 0.061, min 0.027). All mandatory checks pass with comfortable margins. AUC threshold kept at 0.62 (no relaxation needed). Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * docs: update RELEASE_v6.md metrics for retuned pipeline Snapshot day 14→20, updated all baseline AUC, trap delta, value-aware ranking, and teaching guidance numbers to match retuned pipeline. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: address Copilot review — add boost tests, fix stale docstrings COPILOT-1: Add 5 unit tests for boost_leakage_trap() covering: only converted leads boosted, input immutability, determinism, converted leads >= original, mean increases after boost. COPILOT-2: Update build script docstring and inline comment to reflect that the trap is no longer purely causal — it combines causal post-snapshot touches with a Poisson(3) boost. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Closes #41
Summary
simulation/engine.py:converted_within_90_daysis now gated byconfig.label_window_days— a lead is only labeled positive if its conversion event occurred before daylabel_window_days(not the fullhorizon_days). The simulation still runs for the full horizon to produce rich event histories; only label derivation is affected.schema/entities.py:LeadRowdocstring updated to document that the field useslabel_window_days(field name retained for schema stability).conversion_timestampstill set for actual conversions outside window, event counts unchanged by window.Design decisions
converted_within_90_daysis kept as-is per issue guidance — renaming would touch too many files across schema, rendering, validation, and pipelines.label_window_daysonly affects the boolean label, notconversion_timestamp, event generation, or post-simulation entity creation (opportunities, customers, subscriptions).conversion_day < label_window_days(exclusive upper bound), matching the existingrange(horizon_days)convention where day indices are 0-based.Test plan
test_default_90_unchanged— labels identical to old behavior with default configtest_shorter_window_fewer_conversions— 30-day window strictly fewer than 90-daytest_window_1_almost_no_conversions— 1-day window produces 0 conversionstest_late_conversions_excluded— positive labels only for early conversionstest_conversion_timestamp_still_set_outside_window— late converters have timestamp but label=Falsetest_event_count_unchanged_by_window— touches/sessions/activities identical regardless of window🤖 Generated with Claude Code