Merged
12 changes: 10 additions & 2 deletions .agent-plan.md
@@ -6,7 +6,7 @@

## Current System State

**v0.5.0 in progress — Milestones 7–11 complete, v6 dataset shipped.** Full simulation engine + render/bundle + exposure filtering + CLI commands + validation harness implemented. v6 dataset adds: latent-aware touch intensity (`LatentDecayIntensity`), causally-grounded leakage trap (post-snapshot touches), student/instructor split, `touches_last_7_days` momentum feature, `acquisition_wave` cohort feature, soft ACV winsorization, GBM improvement validation.
**v1.0.0 released (2026-05-02).** All milestones (M0–M13) complete. Package version bumped to 1.0.0 in pyproject.toml and leadforge/version.py. README updated with `pip install leadforge`. CHANGELOG consolidated under v1.0.0 heading.

---

@@ -154,6 +154,14 @@ Documentation + CI:
- [x] `CHANGELOG.md` — created; covers v0.1.0 through post-v0.5.0 improvements; user-facing descriptions grouped by version
- [x] `.agent-plan.md` — updated deferred items table

### M15: Version bump to v1.0.0 (PR #47)

- [x] `pyproject.toml` — version `0.1.0` → `1.0.0`, classifier `Pre-Alpha` → `Production/Stable`
- [x] `leadforge/version.py` — `__version__` now read from installed package metadata via `importlib.metadata.version("leadforge")` (resolves to `1.0.0`)
- [x] `README.md` — install changed from `pip install git+...` to `pip install leadforge`; removed "PyPI coming" note
- [x] `CHANGELOG.md` — `v1.0.0 — 2026-05-02` section added below `Unreleased`; milestone headings folded into collapsible development history
- [x] `.agent-plan.md` — updated to reflect v1.0.0 release

### Fix: direct conversion bypass for pre-SQL leads (PR #45, closes #44)

- [x] `leadforge/simulation/engine.py` — added `_DIRECT_CONVERSION_STAGES` and `_DIRECT_CONVERSION_DISCOUNT` (0.01) constants; pre-SQL leads (`mql`, `sal`) now have a small daily probability of converting directly, bypassing the full funnel
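
A minimal sketch of how such a bypass could work, assuming the discount multiplies a per-day base hazard. Only the stage names (`mql`, `sal`) and the `0.01` value come from the note above; the function names, the `base_hazard` parameter, and the multiplicative form are hypothetical and are not taken from `engine.py`.

```python
import random

# Stage names and the 0.01 discount come from the plan note; everything else
# in this sketch is an illustrative assumption.
DIRECT_CONVERSION_STAGES = frozenset({"mql", "sal"})
DIRECT_CONVERSION_DISCOUNT = 0.01


def daily_conversion_probability(stage: str, base_hazard: float) -> float:
    """Return the per-day conversion probability for a lead in `stage`.

    Pre-SQL stages get the base hazard scaled down by the discount, so a
    direct conversion is possible but rare. Other stages use the base hazard.
    """
    if stage in DIRECT_CONVERSION_STAGES:
        return base_hazard * DIRECT_CONVERSION_DISCOUNT
    return base_hazard


def converts_today(stage: str, base_hazard: float, rng: random.Random) -> bool:
    """Draw one Bernoulli trial for today's conversion."""
    return rng.random() < daily_conversion_probability(stage, base_hazard)
```

Under these assumptions, a lead at `mql` with a base hazard of 0.2 would convert directly with probability 0.002 per day, which keeps the path rare while removing the guarantee that every converted lead passed through SQL.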
@@ -213,7 +221,7 @@ Documentation + CI:
| M14: Notebook 2 (lead scoring baseline) | Deferred | v4 validation script covers this |
| M14: Notebook 3 (public vs instructor) | Discarded | No current audience |
| M14: Notebook 4 (recipe customization) | Discarded | Premature |
| M15: Docs polish + v1.0 RC | **In progress** | README + CHANGELOG done; architecture diagram and notebooks remain |
| M15: Docs polish + v1.0 release | **Done** | README, CHANGELOG, version bump to 1.0.0 complete; architecture diagram and notebooks remain post-v1 |

### From post-v1 list

36 changes: 29 additions & 7 deletions CHANGELOG.md
@@ -7,23 +7,43 @@ Format inspired by [Keep a Changelog](https://keepachangelog.com/).

## Unreleased

---

## v1.0.0 — 2026-05-02

First stable release. All milestones (M0–M13) complete.

### Highlights

- **Full end-to-end generation**: recipe → simulated world → relational bundle with deterministic reproducibility.
- **90-day daily-step simulation engine** with churn, stage advancement, conversion hazards, and touch emission.
- **5 motif families** (fit-dominant, intent-dominant, sales-execution-sensitive, demo/trial-mediated, buying-committee-friction) with stochastic rewiring.
- **Exposure filtering**: `student_public` (leakage-safe) and `research_instructor` (full truth) modes.
- **CLI**: `generate`, `inspect`, `validate`, `list-recipes` — all fully wired.
- **Validation harness**: determinism, exposure monotonicity, realism bounds, difficulty profiles, cross-seed drift detection.
- **v4–v6 dataset pipelines**: progressive dataset versions with leakage traps, student/instructor splits, value-aware scoring, and GBM improvement validation.

### Post-milestone improvements (since M13)

- **Direct conversion bypass** (PR #45): pre-SQL leads can now convert via a rare direct path, fixing the deterministic `is_sql → converts` invariant.
- **Configurable label window** (PR #43): `label_window_days` controls conversion label derivation in the simulation.
- **Generalized task support** (PR #40, #42): `primary_task` threaded through bundle, validation, and pipelines; dataset card prose adapts to non-conversion tasks.
- **Pipeline extraction** (PR #29, #34): build pipeline functions extracted into `leadforge.pipelines` with proper RNG conventions.
- **Latent-aware touch intensity** (PR #31): `LatentDecayIntensity` mechanism creates causal link between latent traits and touch patterns.
- **Canonical validation module** (PR #26): reusable lead scoring validation with sklearn pipeline.
- **v4–v6 dataset pipelines**: progressive dataset versions with leakage traps, student/instructor splits, value-aware scoring, and GBM improvement validation.

---
### Development history

<details>
<summary>Milestone changelog (no intermediate versions were published)</summary>

## Milestone 0.5.0 — Validation Harness & CLI Complete (2026-04-29)
#### Milestone 0.5.0 — Validation Harness & CLI Complete (2026-04-29)

- Full validation harness: determinism checks, exposure monotonicity, realism bounds, difficulty validation, cross-seed drift detection.
- `leadforge validate` command with artifact checks, FK integrity, leakage detection, and task split validation.
- Parquet metadata used for row counts (no full table reads during validation).

## Milestone 0.4.0 — Simulation Engine & End-to-End Generation (2026-04-28)
#### Milestone 0.4.0 — Simulation Engine & End-to-End Generation (2026-04-28)

- 90-day daily-step simulation engine with churn, stage advancement, conversion hazards, and touch emission.
- Population generation: accounts (3 latent traits), contacts (4 traits), leads (1 trait) with motif-family biases.
@@ -32,23 +52,25 @@ Format inspired by [Keep a Changelog](https://keepachangelog.com/).
- CLI commands: `generate`, `inspect`, `validate`, `list-recipes` — all fully wired.
- Bundle manifest with provenance, row counts, and SHA-256 file hashes.
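
As an illustration of the SHA-256 file hashes mentioned for the bundle manifest, the sketch below hashes every file under a directory in fixed-size chunks. The function names and the returned mapping layout are assumptions for this example, not leadforge's actual manifest format.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large tables are never loaded whole."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_file_hashes(bundle_dir: Path) -> dict[str, str]:
    """Map each file's POSIX-style relative path to its SHA-256 hex digest.

    The mapping shape is hypothetical; a real manifest may store more fields.
    """
    return {
        p.relative_to(bundle_dir).as_posix(): sha256_of_file(p)
        for p in sorted(bundle_dir.rglob("*"))
        if p.is_file()
    }
```

Sorting the paths keeps the mapping order stable across runs, which helps when the manifest itself is compared or hashed.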

## Milestone 0.3.0 — World Structure & Mechanisms (2026-04-25)
#### Milestone 0.3.0 — World Structure & Mechanisms (2026-04-25)

- Hidden world graph (DAG) with 5 motif families: fit-dominant, intent-dominant, sales-execution-sensitive, demo/trial-mediated, buying-committee-friction.
- Stochastic graph rewiring: optional-node dropping, edge-weight jitter, latent-confounder injection.
- Mechanism layer: latent scores, conversion hazards, stage transitions, Poisson intensities, categorical influences, noisy proxies.
- Motif-aware mechanism assignment policies.

## Milestone 0.2.0 — Config, Recipes & Narrative (2026-04-20)
#### Milestone 0.2.0 — Config, Recipes & Narrative (2026-04-20)

- Typed `GenerationConfig`, `Recipe`, `WorldSpec` models with full precedence resolution (kwargs > override > recipe > defaults).
- Seeded RNG with SHA-256-derived named substreams for reproducibility.
- Narrative layer: company, product, market, GTM motion, personas, funnel stages — loaded from recipe YAML.
- Schema layer: 9 entity dataclasses, 10 FK constraints, 29 snapshot features, feature dictionary writer.
- Dataset card renderer from narrative + world spec.
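
One way to derive named substreams from a root seed with SHA-256 is sketched below. It is an assumption about the general technique, not the project's implementation: the `substream` helper, the `"seed:name"` encoding, and the use of the standard library's `random.Random` are all hypothetical.

```python
import hashlib
import random


def substream(root_seed: int, name: str) -> random.Random:
    """Return an independent generator keyed by (root_seed, name).

    The name is hashed together with the root seed, and the first 8 bytes of
    the digest seed a fresh generator. The same inputs always give the same
    stream, and adding a new named stream does not shift existing ones.
    """
    material = f"{root_seed}:{name}".encode("utf-8")
    derived = int.from_bytes(hashlib.sha256(material).digest()[:8], "big")
    return random.Random(derived)


accounts_rng = substream(42, "accounts")
touches_rng = substream(42, "touches")
```

Keying streams by name rather than by draw order means that a component can consume more or fewer random numbers without changing what any other component sees.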

## Milestone 0.1.0 — Project Foundation (2026-04-18)
#### Milestone 0.1.0 — Project Foundation (2026-04-18)

- Package skeleton, CLI entry point (`leadforge list-recipes`), CI pipeline.
- Recipe registry with `b2b_saas_procurement_v1` recipe.
- GitHub Actions: lint, typecheck, test matrix (Python 3.11 + 3.12).

</details>
8 changes: 6 additions & 2 deletions README.md
@@ -11,10 +11,14 @@
Requires **Python 3.11+**.

```bash
pip install git+https://github.com/leadforge-dev/leadforge.git
pip install leadforge
```

> PyPI package coming with the v1.0 release.
Or install directly from GitHub:

```bash
pip install git+https://github.com/leadforge-dev/leadforge.git
```

For development:

6 changes: 5 additions & 1 deletion leadforge/version.py
@@ -1 +1,5 @@
__version__ = "0.1.0"
"""Single-source package version, read from installed metadata."""

from importlib.metadata import version

__version__: str = version("leadforge")
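
`importlib.metadata.version` raises `PackageNotFoundError` when the distribution is not installed, for example when the module is imported from a source checkout that was never `pip install`ed. The sketch below shows one possible guard; the `resolve_version` helper and the fallback string are hypothetical and are not part of this PR.

```python
from importlib.metadata import PackageNotFoundError, version


def resolve_version(dist_name: str, fallback: str = "0.0.0+unknown") -> str:
    """Return the installed version of `dist_name`, or `fallback` if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # Not installed: running from a bare checkout or a vendored copy.
        return fallback
```

Whether to fall back or let the import fail is a design choice: failing fast surfaces a broken install early, while a fallback keeps ad hoc scripts working from an uninstalled tree.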
4 changes: 2 additions & 2 deletions pyproject.toml
@@ -4,15 +4,15 @@ build-backend = "setuptools.build_meta"

[project]
name = "leadforge"
version = "0.1.0"
version = "1.0.0"
description = "Opinionated framework for generating synthetic CRM and GTM datasets from simulated commercial worlds"
readme = "README.md"
license = { text = "MIT" }
requires-python = ">=3.11"
authors = [{ name = "leadforge contributors" }]
keywords = ["synthetic data", "CRM", "lead scoring", "machine learning", "simulation"]
classifiers = [
"Development Status :: 2 - Pre-Alpha",
"Development Status :: 5 - Production/Stable",
"Intended Audience :: Developers",
"Intended Audience :: Education",
"Intended Audience :: Science/Research",