Proposal 84: DriftEvaluationTask - pseudocode design #172

Merged
neoneye merged 1 commit into PlanExeOrg:main from VoynichLabs:proposal/84-drift-measurement-pseudocode on Mar 7, 2026

Conversation

@82deutschmark
Collaborator

Proposal 84: Drift Measurement — Pseudocode Design

What: Pseudocode proposal for DriftEvaluationTask — an optional post-pipeline Luigi task that measures how faithfully a generated plan reflects the original prompt.

Why: Follows from proposals 82 (framework) and 83 (agent-facing spec) merged in PR #170. Simon requested pseudocode before full implementation.

Files touched: 1 file, 336 lines — docs/proposals/84-drift-measurement-pseudocode.md

Explicitly out of scope: No executable code. No task wiring. No changes to existing pipeline files.

Design summary:

  • DriftEvaluationTask as optional post-pipeline Luigi task
  • 2 Pydantic models: PromptContract + DriftEvaluationResult
  • 3 sequential structured LLM calls: extract contract → detect incidents → score + verdict
  • Weighted fidelity score formula matching proposal 82 weights
  • Outputs: drift-evaluation.json + drift-evaluation.md
  • 5 open questions for neoneye at the bottom
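
The design summary above can be sketched roughly as follows. This is not the proposal's pseudocode and not an implementation: plain dataclasses stand in for the two Pydantic models, the `llm` callable stands in for the structured LLM calls, and every field name, stage name, weight, and threshold is a hypothetical placeholder (the real weights live in proposal 82 and are not reproduced here). Only the three-stage order and the two output file names come from the summary.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Optional

# Stand-ins for the two Pydantic models. Field names are assumptions.
@dataclass
class PromptContract:
    requirements: list[str]

@dataclass
class DriftEvaluationResult:
    incidents: list[dict]   # assumed shape: {"requirement": str, "description": str}
    fidelity_score: float   # 0.0 (fully drifted) to 1.0 (fully faithful)
    verdict: str

# Placeholder weights only; NOT the values defined in proposal 82.
PLACEHOLDER_WEIGHTS = {"scope": 0.5, "constraints": 0.5}

def weighted_fidelity(scores: dict, weights: dict) -> float:
    """Weighted average of per-dimension scores; missing dimensions count as 0."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * scores.get(name, 0.0) for name, w in weights.items()) / total

def evaluate_drift(
    prompt: str,
    plan: str,
    llm: Callable[[str, dict], dict],   # hypothetical: (stage name, payload) -> parsed dict
    weights: Optional[dict] = None,
    pass_threshold: float = 0.8,        # hypothetical cut-off for the verdict
) -> DriftEvaluationResult:
    weights = weights or PLACEHOLDER_WEIGHTS
    # Call 1: extract what the prompt actually asks for.
    contract = PromptContract(**llm("extract_contract", {"prompt": prompt}))
    # Call 2: find places where the plan departs from that contract.
    incidents = llm(
        "detect_incidents", {"contract": asdict(contract), "plan": plan}
    )["incidents"]
    # Call 3: per-dimension scores, combined locally into one weighted score.
    scores = llm("score", {"incidents": incidents})["dimension_scores"]
    fidelity = weighted_fidelity(scores, weights)
    verdict = "PASS" if fidelity >= pass_threshold else "FAIL"
    return DriftEvaluationResult(incidents, fidelity, verdict)

def render_outputs(result: DriftEvaluationResult) -> dict:
    """Return the content of the two output files, keyed by file name."""
    lines = [
        "# Drift evaluation",
        "",
        f"Verdict: {result.verdict}",
        f"Fidelity score: {result.fidelity_score:.2f}",
        "",
        "## Incidents",
    ]
    lines += [f"- {i['description']}" for i in result.incidents] or ["- none"]
    return {
        "drift-evaluation.json": json.dumps(asdict(result), indent=2),
        "drift-evaluation.md": "\n".join(lines),
    }
```

Keeping the weighted combination outside the LLM call is one possible reading of "score + verdict"; it makes the final number reproducible from the per-dimension scores, but the proposal may well place the whole step inside the third call.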

EgonBot pre-review: PASS — single proposal file, zero scope creep, zero executable code.

@neoneye merged commit f51fe6a into PlanExeOrg:main on Mar 7, 2026
3 checks passed
@neoneye deleted the proposal/84-drift-measurement-pseudocode branch on March 7, 2026 17:59

2 participants