feat: carry primary_task and label_window_days into WorldSpec by shaypal5 · Pull Request #36 · leadforge-dev/leadforge

shaypal5 · 2026-05-01T07:14:05Z

Summary

Add primary_task: str and label_window_days: int fields to GenerationConfig (defaults preserve current behavior: "converted_within_90_days" / 90)
Propagate through Recipe.from_dict(), resolve_config() (all 4 precedence layers), and Generator.from_recipe()
Update render_dataset_card() to read from world_spec.config instead of hard-coded string literals

Closes #6

Test plan

🤖 Generated with Claude Code

…aset card Add `primary_task` and `label_window_days` fields to `GenerationConfig` (with defaults preserving current behavior). Propagate through `Recipe.from_dict()`, `resolve_config()`, and `Generator.from_recipe()` so recipe YAML can override them. Update `render_dataset_card()` to read from `world_spec.config` instead of hard-coded string literals. Closes #6 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot

Pull request overview

Propagates task metadata (primary_task, label_window_days) through the generation configuration so narrative outputs (dataset card) can reflect resolved task settings instead of hard-coded literals.

Changes:

Add primary_task and label_window_days to GenerationConfig with validation and defaults preserving current behavior.
Extend Recipe.from_dict() / Recipe.resolve_config() and Generator.from_recipe() to parse/resolve/override the new fields.
Update dataset card rendering and add targeted tests for dataset card + config precedence.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
`leadforge/core/models.py`	Adds `primary_task` / `label_window_days` to `GenerationConfig` and validates them.
`leadforge/api/recipes.py`	Parses optional `label_window_days` from recipe dict and propagates both fields through config resolution layers.
`leadforge/api/generator.py`	Exposes `primary_task` / `label_window_days` as `Generator.from_recipe()` kwargs and forwards them to config resolution.
`leadforge/narrative/dataset_card.py`	Renders task name + label window from `world_spec.config` instead of literals.
`tests/api/test_recipes.py`	Adds tests for precedence/defaulting behavior of the new config fields.
`tests/narrative/test_dataset_card.py`	Adds tests asserting dataset card reflects custom `primary_task` / `label_window_days`.
`.agent-plan.md`	Updates internal plan checklist for the PR work.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

shaypal5

PR #36 Review — Senior Dev Perspective

The big problem: this PR creates a configurable lie

You've added primary_task and label_window_days to GenerationConfig and wired them into the dataset card renderer. But that's the only place they're actually used. The rest of the system completely ignores these fields:

1. The bundle still hardcodes the task everywhere (critical)

api/bundle.py:97 — task_row_counts={CONVERTED_WITHIN_90_DAYS.task_id: task_row_counts} writes the manifest with the hardcoded constant regardless of config.primary_task
render/tasks.py:25 — task splits are written to tasks/converted_within_90_days/ using the CONVERTED_WITHIN_90_DAYS constant as default
validation/realism.py:43,85 — validates by reading tasks/converted_within_90_days/train.parquet literally
validation/drift.py:33,37 — same hardcoded path

So if someone sets primary_task="churned_within_60_days", the dataset card proudly says "Task: churned_within_60_days" but the actual directory is still tasks/converted_within_90_days/, the manifest still says converted_within_90_days, and validation still looks for that hardcoded path. The card becomes a misleading label on an unchanged box.

2. primary_task on Recipe vs GenerationConfig — asymmetric handling

Recipe already has primary_task: str (line 38), and resolve_config() does resolved["primary_task"] = self.primary_task at layer 3. But the PR doesn't acknowledge this existing field or explain the relationship. Meanwhile, label_window_days gets None-guarded treatment on Recipe (if self.label_window_days is not None) but primary_task always overwrites from the recipe — meaning the package default on GenerationConfig for primary_task is dead code that can never win over any loaded recipe.

3. label_window_days vs horizon_days — no enforcement of any relationship

These two fields are conceptually related (the label window can't exceed the simulation horizon), but there's zero validation tying them together. Someone can set horizon_days=30, label_window_days=90 — simulating 30 days but claiming the label covers 90. The dataset card will happily render "within 90 days" for a 30-day simulation. This is worse than the hardcoded version, which at least was consistent.

4. The dataset card template is still mostly hardcoded

The label definition text still says "A lead is considered converted if a closed_won event is recorded within {N} days..." — only the number is dynamic. If primary_task is "churned_within_60_days", it still says "converted" and "closed_won". The template is task-specific prose with two values punched out. That's not really configurable — it's a string interpolation that produces nonsense for any non-default value.

5. Config hashing side effect

core/hashing.py uses dataclasses.asdict(config) — so adding two new fields to GenerationConfig silently changes the config hash for every existing run, even when using the same defaults. This breaks the determinism invariant: same (recipe, config, seed, version) must produce identical output, but now the hash differs from pre-PR bundles. Not addressed or mentioned.

6. No test for invalid primary_task on GenerationConfig

You validate primary_task must be a non-empty string in __post_init__, but there's no test for GenerationConfig(primary_task="") or GenerationConfig(primary_task=42) raising InvalidConfigError.

7. No test for invalid label_window_days on Recipe.from_dict

You added validation in from_dict for label_window_days (bool check, positive int), but there are no tests exercising those error paths.

Verdict

The issue asks to "carry primary_task and label_window_days into WorldSpec for dataset card rendering" — and technically this PR does exactly that. But it does it in a way that creates semantic inconsistency: the config says one thing, the bundle does another. If the scope is truly just the dataset card, then these fields arguably don't belong on GenerationConfig at all — they should be derived from the TaskManifest / CONVERTED_WITHIN_90_DAYS constant that already exists in schema/tasks.py. Adding them to config suggests they're configurable across the system, which they aren't.

The cleaner approach: have render_dataset_card() read from schema.tasks.CONVERTED_WITHIN_90_DAYS (which already has task_id and label_window_days), or accept a TaskManifest parameter. No new config fields, no precedence plumbing, no pretense of configurability.

If the intent is full configurability, the PR is maybe 20% of the work — it needs to flow through bundle writing, task splits, validation, simulation, entity schema, and features before it's actually useful.

Review feedback addressed: - Remove primary_task/label_window_days as explicit kwargs from resolve_config() and Generator.from_recipe() — these fields are resolved from recipe YAML and override dict only, not casually overridable, since the generation pipeline doesn't yet support arbitrary task types (Copilot-1, Copilot-3, shaypal5 #1, #2) - Add label_window_days <= horizon_days validation in GenerationConfig.__post_init__ (Copilot-2, shaypal5 #3) - Add tests for invalid primary_task on GenerationConfig: empty string, non-string type (shaypal5 #6, pr-agent-context) - Add tests for invalid label_window_days on Recipe.from_dict: bool, non-positive, float (shaypal5 #7, pr-agent-context) - Add test for label_window_days > horizon_days rejection - Fix existing test using horizon_days=30 (now conflicts with default label_window_days=90) Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-05-01T10:46:13Z

pr-agent-context report:

This run includes unresolved review comments on PR #36 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: leadforge/narrative/dataset_card.py:104
URL: https://github.com/leadforge-dev/leadforge/pull/36#discussion_r3172443347
Root author: copilot-pull-request-reviewer

Comment:
    `render_dataset_card()` now renders `cfg.primary_task` / `cfg.label_window_days`, but the actual generated bundle still hard-codes the task id/label column/window to `converted_within_90_days`/90 (e.g., bundle writing uses `CONVERTED_WITHIN_90_DAYS` and simulation outputs a `converted_within_90_days` column). This means callers can produce a dataset card that contradicts the exported task files and labels. Either (a) thread these config fields through task export + manifest + label column generation so they truly control the task, or (b) disallow/ignore overrides for now (e.g., validate they match the v1 invariant) so the dataset card can’t drift from the bundle contents.

## COPILOT-2
Location: leadforge/core/models.py:66
URL: https://github.com/leadforge-dev/leadforge/pull/36#discussion_r3172443359
Root author: copilot-pull-request-reviewer

Comment:
    `label_window_days` is validated as positive, but there’s no validation tying it to `horizon_days`. Since the simulation only runs for `horizon_days` days, allowing `label_window_days > horizon_days` will make the config (and dataset card) claim a longer label window than the data can support. Consider enforcing `label_window_days <= horizon_days` (or `==` if that’s the current invariant) in `__post_init__`.

## COPILOT-3
Location: leadforge/api/generator.py
URL: https://github.com/leadforge-dev/leadforge/pull/36#discussion_r3172443369
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    `Generator.from_recipe()` exposes `primary_task` and `label_window_days` as overrides, but (as of now) the generation pipeline still exports `tasks/converted_within_90_days/` and a `converted_within_90_days` label column. This makes these kwargs look like they change the generated task when they currently only affect metadata (dataset card/config). Consider either removing these kwargs until task export supports them, or documenting very explicitly that they don’t affect task generation yet.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 25211600765 attempt 1
Comment timestamp: 2026-05-01T10:45:25.044468+00:00
PR head commit: 23352aea9f9a7357dac5931e576e7bf14fc3a0ce

shaypal5 and others added 2 commits May 1, 2026 09:52

docs: update .agent-plan.md for WorldSpec task fields (PR #36)

1a06183

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Copilot AI review requested due to automatic review settings May 1, 2026 07:14

shaypal5 added this to the v0.3.0 — Motif variability + exposure modes milestone May 1, 2026

shaypal5 added type: feature New capability layer: core core/ primitives (RNG, IDs, models, exceptions) layer: narrative narrative/ vertical story layer layer: recipes recipes/ recipe assets and registry labels May 1, 2026

Copilot started reviewing on behalf of shaypal5 May 1, 2026 07:14 View session

This comment has been minimized.

Sign in to view

Copilot AI reviewed May 1, 2026

View reviewed changes

Comment thread leadforge/narrative/dataset_card.py

Comment thread leadforge/core/models.py

Comment thread leadforge/api/generator.py Outdated

shaypal5 commented May 1, 2026

View reviewed changes

This comment has been minimized.

Sign in to view

This was referenced May 1, 2026

Thread primary_task through bundle writing, validation, and pipeline #37

Closed

Generalize dataset card template prose for non-conversion tasks #38

Closed

shaypal5 merged commit 515363f into main May 1, 2026
7 checks passed

shaypal5 deleted the feat/worldspec-task-fields branch May 1, 2026 15:52

shaypal5 mentioned this pull request May 1, 2026

Make label_window_days affect simulation and snapshot behavior #41

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: carry primary_task and label_window_days into WorldSpec#36

feat: carry primary_task and label_window_days into WorldSpec#36
shaypal5 merged 3 commits intomainfrom
feat/worldspec-task-fields

shaypal5 commented May 1, 2026

Uh oh!

This comment has been minimized.

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

shaypal5 left a comment

Uh oh!

This comment has been minimized.

github-actions Bot commented May 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

shaypal5 commented May 1, 2026

Summary

Test plan

Uh oh!

This comment has been minimized.

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

shaypal5 left a comment

Choose a reason for hiding this comment

PR #36 Review — Senior Dev Perspective

The big problem: this PR creates a configurable lie

Verdict

Uh oh!

This comment has been minimized.

github-actions Bot commented May 1, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants