docs: task complexity scoring + model routing proposal (post-mortem + rubric) by 82deutschmark · Pull Request #102 · PlanExeOrg/PlanExe

82deutschmark · 2026-02-27T04:22:17Z

Summary

Post-mortem of Simon's 26 Feb 2026 refactor + a proposal for adding task complexity scoring and model routing to PlanExe.

What's in this PR (docs-only, no code changes)

71-model-routing-postmortem-simon-26feb2026.md
Analysis of Simon's day (64 commits, 108 files, ~15K lines). Corrected cost estimate: ~$18 total at Opus. Identifies where Haiku/Minimax could have handled execution after an Opus planning pass.

72-complexity-assessment-*.md (3 files)
Three complexity assessments of the same 8 task clusters, one from each model perspective (Sonnet 4.6, Haiku 4.5, Minimax M2.5). Each was written independently to show how different model tiers assess the same work.

73-task-complexity-routing.md
The core proposal: PlanExe should score each task it generates on a 4-dimension Likert rubric (file size, semantic complexity, ambiguity, context dependency) and route tasks to the appropriate model tier for execution.
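
A minimal sketch of the scoring-and-routing idea, for illustration only. The four dimension names come from the proposal. Everything else is an assumption made for this example and is not taken from the proposal docs: the 1 to 5 Likert scale, the unweighted sum, the cutoffs 8 and 14, and the tier labels `light`, `mid`, and `heavy`.

```python
from dataclasses import dataclass

# The four rubric dimensions named in the proposal.
DIMENSIONS = ("file_size", "semantic_complexity", "ambiguity", "context_dependency")


@dataclass(frozen=True)
class TaskScore:
    """One task's rubric scores. The 1..5 scale is an assumption."""
    file_size: int
    semantic_complexity: int
    ambiguity: int
    context_dependency: int

    def __post_init__(self):
        # Reject scores outside the assumed Likert range.
        for name in DIMENSIONS:
            value = getattr(self, name)
            if not 1 <= value <= 5:
                raise ValueError(f"{name} must be in 1..5, got {value}")

    def total(self) -> int:
        # Unweighted sum across the four dimensions (range 4..20).
        return sum(getattr(self, name) for name in DIMENSIONS)


def route(score: TaskScore) -> str:
    """Map a total score to a hypothetical tier label.

    The cutoffs (8 and 14) are placeholders, not calibrated values.
    """
    total = score.total()
    if total <= 8:
        return "light"
    if total <= 14:
        return "mid"
    return "heavy"


if __name__ == "__main__":
    print(route(TaskScore(1, 2, 1, 2)))  # total 6
    print(route(TaskScore(5, 4, 5, 4)))  # total 18
```

A plain sum keeps the sketch easy to reason about. A real rubric might weight the dimensions differently or let a single high score (for example, ambiguity) force escalation on its own.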

For Simon's Review

The three complexity assessments need your calibration. Questions embedded in each doc:

Which model's scores most closely matched the actual difficulty you experienced?
Were there tasks we all scored too high or too low?
Would you have trusted Haiku for the security hardening work given an explicit spec?

This is the first real calibration dataset for the routing rubric. Your feedback drives the next iteration.
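
One possible way to answer the first question above, sketched under assumptions: treat the reviewer's rating per task cluster as ground truth and compare each assessment against it by mean absolute error. The function names, the cluster keys, and all numbers below are hypothetical placeholders, not values from the assessment docs.

```python
def mean_absolute_error(predicted: dict[str, int], actual: dict[str, int]) -> float:
    """Average absolute gap between predicted and actual scores,
    computed over the task clusters that appear in both mappings."""
    shared = predicted.keys() & actual.keys()
    if not shared:
        raise ValueError("no task clusters in common")
    return sum(abs(predicted[k] - actual[k]) for k in shared) / len(shared)


def closest_assessor(assessments: dict[str, dict[str, int]],
                     actual: dict[str, int]) -> str:
    """Name of the assessment with the lowest mean absolute error."""
    return min(assessments, key=lambda name: mean_absolute_error(assessments[name], actual))


if __name__ == "__main__":
    # Placeholder data only; real scores would come from the reviewer
    # and from the assessment docs.
    actual = {"cluster_a": 3, "cluster_b": 5}
    assessments = {
        "assessor_x": {"cluster_a": 3, "cluster_b": 4},
        "assessor_y": {"cluster_a": 1, "cluster_b": 2},
    }
    print(closest_assessor(assessments, actual))
```

Mean absolute error is only one choice. If under-scoring a hard task is costlier than over-scoring an easy one, an asymmetric penalty may fit better.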

Authors

Larry (Sonnet 4.6) — primary author
Egon (Minimax M2.5) — Minimax perspective
Bubba (Haiku 4.5) — Haiku perspective
Authorized by: Mark Barney

- Post-mortem of Simon's 26 Feb 2026 refactor with realistic cost estimates (~$18)
- Three independent complexity assessments (Sonnet/Haiku/Minimax perspectives)
- Proposal 73: task complexity scoring + model routing for PlanExe
- For Simon's calibration review

Larry the Laptop Lobster added 2 commits February 26, 2026 23:21

docs: remove fake Egon assessment — must come from Egon directly

aa6d60c

82deutschmark mentioned this pull request Feb 27, 2026

docs: proposal 74 — model routing UX modes (auto/optimize/review) #103

Merged

neoneye merged commit ed8a14d into PlanExeOrg:main Feb 27, 2026
3 checks passed

neoneye deleted the docs/complexity-routing-proposal branch February 27, 2026 10:26

82deutschmark mentioned this pull request Feb 27, 2026

docs: proposal 77 — cache-aware model handoff architecture #106

Merged

neoneye merged 2 commits into PlanExeOrg:main from VoynichLabs:docs/complexity-routing-proposal
