Skip to content

feat: carry primary_task and label_window_days into WorldSpec#36

Merged
shaypal5 merged 3 commits intomainfrom
feat/worldspec-task-fields
May 1, 2026
Merged

feat: carry primary_task and label_window_days into WorldSpec#36
shaypal5 merged 3 commits intomainfrom
feat/worldspec-task-fields

Conversation

@shaypal5
Copy link
Copy Markdown
Contributor

@shaypal5 shaypal5 commented May 1, 2026

Summary

  • Add primary_task: str and label_window_days: int fields to GenerationConfig (defaults preserve current behavior: "converted_within_90_days" / 90)
  • Propagate through Recipe.from_dict(), resolve_config() (all 4 precedence layers), and Generator.from_recipe()
  • Update render_dataset_card() to read from world_spec.config instead of hard-coded string literals

Closes #6

Test plan

  • Dataset card renders custom primary_task name
  • Dataset card renders custom label_window_days value
  • Dataset card renders both custom fields together
  • Config resolution: primary_task defaults from recipe
  • Config resolution: primary_task via explicit kwarg
  • Config resolution: label_window_days defaults (90)
  • Config resolution: label_window_days from recipe YAML
  • Config resolution: label_window_days via explicit kwarg
  • Config resolution: override dict applies task fields
  • Config resolution: explicit kwargs beat override dict
  • All 750 existing tests still pass
  • ruff check . clean

🤖 Generated with Claude Code

shaypal5 and others added 2 commits May 1, 2026 09:52
…aset card

Add `primary_task` and `label_window_days` fields to `GenerationConfig`
(with defaults preserving current behavior). Propagate through
`Recipe.from_dict()`, `resolve_config()`, and `Generator.from_recipe()`
so recipe YAML can override them. Update `render_dataset_card()` to read
from `world_spec.config` instead of hard-coded string literals.

Closes #6

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 1, 2026 07:14
@shaypal5 shaypal5 added type: feature New capability layer: core core/ primitives (RNG, IDs, models, exceptions) layer: narrative narrative/ vertical story layer layer: recipes recipes/ recipe assets and registry labels May 1, 2026
@github-actions

This comment has been minimized.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Propagates task metadata (primary_task, label_window_days) through the generation configuration so narrative outputs (dataset card) can reflect resolved task settings instead of hard-coded literals.

Changes:

  • Add primary_task and label_window_days to GenerationConfig with validation and defaults preserving current behavior.
  • Extend Recipe.from_dict() / Recipe.resolve_config() and Generator.from_recipe() to parse/resolve/override the new fields.
  • Update dataset card rendering and add targeted tests for dataset card + config precedence.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
leadforge/core/models.py Adds primary_task / label_window_days to GenerationConfig and validates them.
leadforge/api/recipes.py Parses optional label_window_days from recipe dict and propagates both fields through config resolution layers.
leadforge/api/generator.py Exposes primary_task / label_window_days as Generator.from_recipe() kwargs and forwards them to config resolution.
leadforge/narrative/dataset_card.py Renders task name + label window from world_spec.config instead of literals.
tests/api/test_recipes.py Adds tests for precedence/defaulting behavior of the new config fields.
tests/narrative/test_dataset_card.py Adds tests asserting dataset card reflects custom primary_task / label_window_days.
.agent-plan.md Updates internal plan checklist for the PR work.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread leadforge/narrative/dataset_card.py
Comment thread leadforge/core/models.py
Comment thread leadforge/api/generator.py Outdated
Copy link
Copy Markdown
Contributor Author

@shaypal5 shaypal5 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR #36 Review — Senior Dev Perspective

The big problem: this PR creates a configurable lie

You've added primary_task and label_window_days to GenerationConfig and wired them into the dataset card renderer. But that's the only place they're actually used. The rest of the system completely ignores these fields:

1. The bundle still hardcodes the task everywhere (critical)

  • api/bundle.py:97task_row_counts={CONVERTED_WITHIN_90_DAYS.task_id: task_row_counts} writes the manifest with the hardcoded constant regardless of config.primary_task
  • render/tasks.py:25 — task splits are written to tasks/converted_within_90_days/ using the CONVERTED_WITHIN_90_DAYS constant as default
  • validation/realism.py:43,85 — validates by reading tasks/converted_within_90_days/train.parquet literally
  • validation/drift.py:33,37 — same hardcoded path

So if someone sets primary_task="churned_within_60_days", the dataset card proudly says "Task: churned_within_60_days" but the actual directory is still tasks/converted_within_90_days/, the manifest still says converted_within_90_days, and validation still looks for that hardcoded path. The card becomes a misleading label on an unchanged box.

2. primary_task on Recipe vs GenerationConfig — asymmetric handling

Recipe already has primary_task: str (line 38), and resolve_config() does resolved["primary_task"] = self.primary_task at layer 3. But the PR doesn't acknowledge this existing field or explain the relationship. Meanwhile, label_window_days gets None-guarded treatment on Recipe (if self.label_window_days is not None) but primary_task always overwrites from the recipe — meaning the package default on GenerationConfig for primary_task is dead code that can never win over any loaded recipe.

3. label_window_days vs horizon_days — no enforcement of any relationship

These two fields are conceptually related (the label window can't exceed the simulation horizon), but there's zero validation tying them together. Someone can set horizon_days=30, label_window_days=90 — simulating 30 days but claiming the label covers 90. The dataset card will happily render "within 90 days" for a 30-day simulation. This is worse than the hardcoded version, which at least was consistent.

4. The dataset card template is still mostly hardcoded

The label definition text still says "A lead is considered converted if a closed_won event is recorded within {N} days..." — only the number is dynamic. If primary_task is "churned_within_60_days", it still says "converted" and "closed_won". The template is task-specific prose with two values punched out. That's not really configurable — it's a string interpolation that produces nonsense for any non-default value.

5. Config hashing side effect

core/hashing.py uses dataclasses.asdict(config) — so adding two new fields to GenerationConfig silently changes the config hash for every existing run, even when using the same defaults. This breaks the determinism invariant: same (recipe, config, seed, version) must produce identical output, but now the hash differs from pre-PR bundles. Not addressed or mentioned.

6. No test for invalid primary_task on GenerationConfig

You validate primary_task must be a non-empty string in __post_init__, but there's no test for GenerationConfig(primary_task="") or GenerationConfig(primary_task=42) raising InvalidConfigError.

7. No test for invalid label_window_days on Recipe.from_dict

You added validation in from_dict for label_window_days (bool check, positive int), but there are no tests exercising those error paths.


Verdict

The issue asks to "carry primary_task and label_window_days into WorldSpec for dataset card rendering" — and technically this PR does exactly that. But it does it in a way that creates semantic inconsistency: the config says one thing, the bundle does another. If the scope is truly just the dataset card, then these fields arguably don't belong on GenerationConfig at all — they should be derived from the TaskManifest / CONVERTED_WITHIN_90_DAYS constant that already exists in schema/tasks.py. Adding them to config suggests they're configurable across the system, which they aren't.

The cleaner approach: have render_dataset_card() read from schema.tasks.CONVERTED_WITHIN_90_DAYS (which already has task_id and label_window_days), or accept a TaskManifest parameter. No new config fields, no precedence plumbing, no pretense of configurability.

If the intent is full configurability, the PR is maybe 20% of the work — it needs to flow through bundle writing, task splits, validation, simulation, entity schema, and features before it's actually useful.

@github-actions

This comment has been minimized.

Review feedback addressed:

- Remove primary_task/label_window_days as explicit kwargs from
  resolve_config() and Generator.from_recipe() — these fields are
  resolved from recipe YAML and override dict only, not casually
  overridable, since the generation pipeline doesn't yet support
  arbitrary task types (Copilot-1, Copilot-3, shaypal5 #1, #2)
- Add label_window_days <= horizon_days validation in
  GenerationConfig.__post_init__ (Copilot-2, shaypal5 #3)
- Add tests for invalid primary_task on GenerationConfig: empty
  string, non-string type (shaypal5 #6, pr-agent-context)
- Add tests for invalid label_window_days on Recipe.from_dict: bool,
  non-positive, float (shaypal5 #7, pr-agent-context)
- Add test for label_window_days > horizon_days rejection
- Fix existing test using horizon_days=30 (now conflicts with default
  label_window_days=90)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

pr-agent-context report:

This run includes unresolved review comments on PR #36 in repository https://github.com/leadforge-dev/leadforge

For each unresolved review comment, recommend one of: resolve as irrelevant, accept and implement
the recommended solution, open a separate issue and resolve as out-of-scope for this PR, accept and
implement a different solution, or resolve as already treated by the code.

After I reply with my decision per item, implement the accepted actions, resolve the corresponding
PR comments, and push all of these changes in a single commit.

# Copilot Comments

## COPILOT-1
Location: leadforge/narrative/dataset_card.py:104
URL: https://github.com/leadforge-dev/leadforge/pull/36#discussion_r3172443347
Root author: copilot-pull-request-reviewer

Comment:
    `render_dataset_card()` now renders `cfg.primary_task` / `cfg.label_window_days`, but the actual generated bundle still hard-codes the task id/label column/window to `converted_within_90_days`/90 (e.g., bundle writing uses `CONVERTED_WITHIN_90_DAYS` and simulation outputs a `converted_within_90_days` column). This means callers can produce a dataset card that contradicts the exported task files and labels. Either (a) thread these config fields through task export + manifest + label column generation so they truly control the task, or (b) disallow/ignore overrides for now (e.g., validate they match the v1 invariant) so the dataset card can’t drift from the bundle contents.

## COPILOT-2
Location: leadforge/core/models.py:66
URL: https://github.com/leadforge-dev/leadforge/pull/36#discussion_r3172443359
Root author: copilot-pull-request-reviewer

Comment:
    `label_window_days` is validated as positive, but there’s no validation tying it to `horizon_days`. Since the simulation only runs for `horizon_days` days, allowing `label_window_days > horizon_days` will make the config (and dataset card) claim a longer label window than the data can support. Consider enforcing `label_window_days <= horizon_days` (or `==` if that’s the current invariant) in `__post_init__`.

## COPILOT-3
Location: leadforge/api/generator.py
URL: https://github.com/leadforge-dev/leadforge/pull/36#discussion_r3172443369
Status: outdated
Root author: copilot-pull-request-reviewer

Comment:
    `Generator.from_recipe()` exposes `primary_task` and `label_window_days` as overrides, but (as of now) the generation pipeline still exports `tasks/converted_within_90_days/` and a `converted_within_90_days` label column. This makes these kwargs look like they change the generated task when they currently only affect metadata (dataset card/config). Consider either removing these kwargs until task export supports them, or documenting very explicitly that they don’t affect task generation yet.

Run metadata:

Tool ref: v4
Tool version: 4.0.21
Trigger: commit pushed
Workflow run: 25211600765 attempt 1
Comment timestamp: 2026-05-01T10:45:25.044468+00:00
PR head commit: 23352aea9f9a7357dac5931e576e7bf14fc3a0ce

@shaypal5 shaypal5 merged commit 515363f into main May 1, 2026
7 checks passed
@shaypal5 shaypal5 deleted the feat/worldspec-task-fields branch May 1, 2026 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

layer: core core/ primitives (RNG, IDs, models, exceptions) layer: narrative narrative/ vertical story layer layer: recipes recipes/ recipe assets and registry type: feature New capability

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat: carry primary_task and label_window_days into WorldSpec for dataset card rendering

2 participants