refactor: remove planning module and repair system by spinje · Pull Request #122 · spinje/pflow

spinje · 2026-03-16T18:58:10Z

Summary

Remove the gated planning module (~7,300 production lines, ~23,000 test lines) and repair system (~1,050 production lines). Replace PocketFlow-based discovery nodes with plain functions. Drop direct anthropic SDK dependency. Fix two pre-existing runtime bugs in error reporting exposed during the refactor.

Net impact: ~42,000 lines removed. 4,070 tests pass, 0 skipped from planning/repair.

Task

Task 92: Remove Planning Module and Repair System

Changes

Phase 1: Relocate active dependencies

parse_structured_response → core/llm_utils.py
smart_filter.py → registry/smart_filter.py
build_planning_context + build_nodes_context → registry/context_builder.py
build_workflows_context → core/workflow/context.py
Prompt loader → core/prompt_utils.py
WorkflowDiscoveryNode → core/workflow/discovery.py:discover_workflow() (plain function + WorkflowMatch dataclass)
ComponentBrowsingNode → registry/discovery.py:discover_components() (plain function + ComponentSelection dataclass)
Removed install_anthropic_model monkey-patch, dropped anthropic>=0.75 direct dep

Phase 2: Delete and clean

Deleted src/pflow/planning/ (32 files), repair files, 68+ test files
Simplified workflow_execution.py from 642 → 83 lines (validate→execute→return)
Removed 654 lines from cli/main.py (13 planner functions, 7 CLI flags)
Removed planner metrics, is_planner threading, __modified_nodes__ tracking
Renamed: CriticalPlanningError → CriticalDiscoveryError, build_planning_context → build_component_context
Updated all CLAUDE.md files, architecture docs, agent definitions, user-facing docs

Bug fixes (pre-existing, exposed by refactor)

API warnings (__warnings__) now surfaced as actionable error messages instead of generic "Workflow failed with action: error"
MCP "error": null no longer produces "None" string in error output
Nested MCP data.error (Slack/Discord style) now correctly unwrapped
Added 3 regression tests protecting these fixes

Explanation

The planning module was gated since Task 107 (markdown format migration) with 516 skipped tests and unreachable code paths. Rather than maintaining dormant code, we removed it cleanly. The few active features that depended on planning code (discovery commands, context builder, structured response parser) were relocated to their natural homes.

Discovery features (pflow workflow discover, pflow registry discover) were preserved as plain functions with typed dataclass returns — simpler, more testable, and with explicit contracts vs the PocketFlow node ceremony.

The anthropic SDK was only used by planning's custom Anthropic wrapper for prompt caching and thinking tokens. The llm-anthropic plugin natively supports schema= for structured output, which is all discovery needs.

Created Docs

.taskmaster/tasks/task_92/task-92.md — updated task spec reflecting what was actually implemented
.taskmaster/tasks/task_92/task-review.md — post-implementation review with integration points, patterns, and AI agent guidance
.taskmaster/tasks/task_92/implementation/progress-log.md — detailed implementation journal with decisions, corrections, and learnings

Testing

make test: 4,070 passed, 0 skipped, 0 failed
make check: all green (ruff, mypy, deptry)
Manual integration testing: 27 scenarios across execution, error handling, CLI flags, registry, workflow management, pipes, batch, traces, MCP server
Zero pflow.planning references remain in src/ or tests/

Move actively-used code out of planning/ module to their natural homes: - parse_structured_response → core/llm_utils.py - smart_filter.py → registry/smart_filter.py - build_planning_context + build_nodes_context → registry/context_builder.py - build_workflows_context → core/workflow/context.py - prompt loader → core/prompt_utils.py Replace PocketFlow discovery nodes with plain functions: - WorkflowDiscoveryNode → core/workflow/discovery.py:discover_workflow() - ComponentBrowsingNode → registry/discovery.py:discover_components() Remove install_anthropic_model monkey-patch and drop direct anthropic>=0.75 dependency (llm-anthropic provides it transitively). Deprecation notices added to planning/ source files pending Phase 2 deletion.

Phase 2 of planning removal — deletes src/pflow/planning/ (~7,300 lines), repair system (~1,050 lines), and all associated tests (~23,000 lines). Code cleanup: - Simplified workflow_execution.py from 642 to 83 lines (validate→execute→return) - Removed 654 lines from cli/main.py (13 planner functions, 7 CLI flags) - Removed planner metrics fields, is_planner threading, __modified_nodes__ tracking - Removed _is_fixable_error(), repaired_workflow_ir, PlannerError, update_ir() - Renamed CriticalPlanningError→CriticalDiscoveryError, build_planning_context→build_component_context Bug fixes (pre-existing, exposed by refactor): - API warnings (__warnings__) now surfaced as actionable error messages instead of generic "Workflow failed with action: error" - MCP "error": null no longer produces "None" string in error output - Nested MCP data.error (Slack/Discord style) now correctly unwrapped Documentation: Updated all CLAUDE.md files, architecture docs, agent definitions, user-facing docs, scripts. Removed natural language mode from CLI docs. Tests: 4,070 pass, 0 skipped from planning/repair. Added 3 regression tests for the error formatting bug fixes.

…iew] - Tightened CLI test assertions to verify post-planner direct execution - Added explicit validation short-circuit assertion in workflow_execution test - Added degraded-success formatter coverage (2 tests) - Updated 5 CLAUDE.md files: build_planning_context → build_component_context - Removed stale DEP003 anthropic exclusion (no longer imported) - Updated progress log

claude · 2026-03-16T19:01:24Z

Code Review — PR #122: Remove Planning Module and Repair System

Overall Assessment: This is an excellent cleanup PR. Removing ~42,000 lines of dead code, simplifying workflow_execution.py from 642 to 83 lines, and preserving active features as plain functions with typed returns is exactly the right approach. The bug fixes for MCP null-error handling and API warning surfacing are well-motivated and well-tested.

Critical — must fix before merge

None.

Warnings — should be addressed

1. context_builder.py is 900 lines — largest new file in the PR

src/pflow/registry/context_builder.py was relocated from planning/context_blocks.py and appears to have been moved largely as-is. While this is fine for a relocation, it's worth noting this is a large module with many internal functions. Not a blocker for this PR (it was already this size), but worth flagging for future decomposition.

2. parse_structured_response silently falls back on Pydantic validation failure

In src/pflow/core/llm_utils.py:62-66:

except Exception as e:
    # If validation fails, log and return raw result
    logger.warning(f"Failed to validate result through {expected_type.__name__}: {e}")
    return result

This silently returns unvalidated data when Pydantic validation fails. The callers (discover_workflow, discover_components, smart_filter) all rely on specific field names being present. If the LLM returns malformed data that passes JSON parsing but fails Pydantic validation, the caller will get a dict missing expected keys and fail with a confusing KeyError downstream.

Consider either:

Raising a ValueError so the caller can handle it explicitly, or
At minimum, logging at error level rather than warning

3. anthropic dependency change could be clearer

pyproject.toml removes anthropic>=0.75 as a direct dependency but adds a DEP003 exception for it as a transitive dep. The comment says "used by llm-anthropic plugin at runtime." This is correct behavior, but if a future llm-anthropic version drops or changes the transitive dep, pflow would break silently at runtime. The current approach is fine — just worth documenting in a comment near the deptry exception that the transitive dep is load-bearing.

Suggestions — optional improvements

1. discover_components returns empty string on context-builder error

In src/pflow/registry/discovery.py:103-106:

if isinstance(component_context, dict) and "error" in component_context:
    logger.warning(...)
    component_context_str = ""

Returning an empty string when build_component_context fails means the MCP server will return empty content to the AI agent with no indication of what went wrong. Consider including the error message in the return or raising so the caller can decide.

2. Bug fix tests are high quality

The three regression tests in test_api_warning_system.py (TestErrorFormattingSurfacesWarnings) are excellent — they test the actual bug scenario end-to-end through WorkflowExecutorService._build_error_list rather than mocking internals. The test names and docstrings clearly explain what bug each test protects against. This is exactly the right testing pattern.

3. workflow_execution.py simplification is clean

The reduction from 478 lines with repair loops to 18 lines of validate-execute-return is a model cleanup. The resume_state parameter removal is correct since it was only used by the repair loop.

4. Consider: WorkflowMatch.workflow typing

WorkflowMatch.workflow is typed as Optional[dict[str, Any]] — consider whether a more specific type (e.g., a TypedDict or reference to the WorkflowManager's return type) would help callers. Low priority since this is internal API.

Summary

Strong PR. The approach of relocating active code to natural homes, converting PocketFlow nodes to plain functions, and removing the entire planning/repair subsystem is well-executed. The bug fixes are well-motivated with clear regression tests. The MCP server discovery service simplification from PocketFlow ceremony to direct function calls is particularly clean.

The only actionable item I'd recommend addressing before merge is the silent fallback in parse_structured_response (Warning #2).

spinje added 3 commits March 16, 2026 12:00

spinje merged commit d8a70ed into main Mar 16, 2026
9 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: remove planning module and repair system#122

refactor: remove planning module and repair system#122
spinje merged 3 commits into
mainfrom
refactor/remove-planning-and-repair

spinje commented Mar 16, 2026

Uh oh!

claude Bot commented Mar 16, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

spinje commented Mar 16, 2026

Summary

Task

Changes

Phase 1: Relocate active dependencies

Phase 2: Delete and clean

Bug fixes (pre-existing, exposed by refactor)

Explanation

Created Docs

Testing

Uh oh!

claude Bot commented Mar 16, 2026

Code Review — PR #122: Remove Planning Module and Repair System

Critical — must fix before merge

Warnings — should be addressed

Suggestions — optional improvements

Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant