Context
PR #36 added primary_task and label_window_days to GenerationConfig for dataset card rendering. However, the rest of the generation pipeline still hardcodes converted_within_90_days:
api/bundle.py — manifest uses CONVERTED_WITHIN_90_DAYS.task_id
render/tasks.py — task splits written to tasks/converted_within_90_days/
validation/realism.py — reads hardcoded task directory path
validation/drift.py — reads hardcoded task directory path
schema/entities.py — LeadRow.converted_within_90_days field
simulation/engine.py — sets converted_within_90_days=state.converted
pipelines/build_v5.py, build_v6.py — hardcoded column rename mappings
What to do
Make the task name and label window configurable across the full pipeline so that config.primary_task and config.label_window_days control the actual generated output, not just the dataset card.
This is a large refactor touching the API, schema, simulation, render, validation, and pipeline layers.
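One possible shape for the change, as a minimal sketch rather than the actual implementation: each hardcoded site would call a small helper that derives its value from the config. Only the field names primary_task and label_window_days come from PR #36. The defaults, the helper names (task_dir, label_record, within_window), the reuse of primary_task as both directory name and label column, and the inclusive window boundary are all assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Optional


@dataclass(frozen=True)
class GenerationConfig:
    # Stand-in for the real class. Only the two field names are from the
    # issue; the defaults are assumed to match current behaviour.
    primary_task: str = "converted_within_90_days"
    label_window_days: int = 90


def task_dir(config: GenerationConfig, root: str = "tasks") -> PurePosixPath:
    """Hypothetical helper: task split directory derived from config."""
    return PurePosixPath(root) / config.primary_task


def label_record(config: GenerationConfig, converted: bool) -> dict[str, bool]:
    """Hypothetical helper: emit the label under the configured column name."""
    return {config.primary_task: converted}


def within_window(config: GenerationConfig, days_to_conversion: Optional[int]) -> bool:
    """Hypothetical helper: True if conversion happened inside the window.

    Assumes the boundary day counts as inside the window.
    """
    return (
        days_to_conversion is not None
        and days_to_conversion <= config.label_window_days
    )
```

With helpers like these, render/tasks.py and the two validation modules could share task_dir(config) instead of a literal path, and simulation/engine.py could build its label through label_record(config, state.converted). A fixed dataclass field such as LeadRow.converted_within_90_days cannot be renamed at runtime, so the schema layer would probably need either a generic label field or a dynamically built row type.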