feat(#106): Planner agent + TaskSpec + auto-inference — Phase B backend by Abernaughty · Pull Request #121 · Abernaughty/agent-dev

Abernaughty · 2026-04-04T21:28:09Z

Summary

Implements Issue #106 Phase B backend: the Planner agent, TaskSpec schema, workspace auto-inference, readiness checklist validation, and three new API endpoints for conversational task planning.

Closes the backend portion of #106 Phase B. Dashboard (ChatView) updates will follow in a subsequent PR.

What's New

Planner Agent (`src/agents/planner.py`)

TaskSpec Pydantic model — structured task specification with workspace, objective, languages, frameworks, output_type, acceptance_criteria, constraints, related_files
PlannerSession — manages conversational planning session state with 30-min idle TTL
ChecklistStatus — tracks readiness checklist completion (Required/Recommended/Optional tiers)
infer_workspace_stack() — auto-detects languages and frameworks from project files (package.json, pyproject.toml, Cargo.toml, go.mod, etc.)
send_planner_message() — core Planner interaction: sends user message to LLM, extracts TaskSpec updates from response JSON block, updates checklist
PlannerSessionStore — in-memory session store with lazy TTL cleanup
PLANNER_MODEL env var — configurable model (default: gemini-2.0-flash-lite), supports Gemini, Anthropic, and OpenAI-compatible models
Zero tool access enforced — Planner never gets filesystem, GitHub, or sandbox tools

API Endpoints (`src/api/main.py`)

POST /tasks/plan — Start a new Planner session with workspace validation + auto-inference
POST /tasks/plan/{session_id}/message — Send a message to the Planner agent
POST /tasks/plan/{session_id}/submit — Confirm and create task from TaskSpec (validates required checklist items)

SSE Events (`src/api/events.py`)

PLANNER_MESSAGE event type — streams Planner responses to the dashboard in real-time

API Models (`src/api/models.py`)

PlannerStartRequest, PlannerMessageRequest — request bodies
PlannerSessionResponse, PlannerSubmitResponse — response schemas
PlannerChecklistItemResponse, PlannerChecklistResponse — readiness checklist display
PlannerTaskSpecResponse — structured task spec for dashboard rendering

Tests (`tests/test_planner.py`)

TaskSpec serialization and to_description()
Checklist validation (required satisfied, recommended warnings, partial required)
Auto-inference: JavaScript, TypeScript, Python/FastAPI, SvelteKit, Rust, Go, multi-language projects
JSON extraction from Planner LLM responses
TaskSpec update application (including workspace override protection)
Session lifecycle (creation, expiry, touch, store CRUD)
Planner message flow with mocked LLM (field extraction, minimal-input warnings, workspace safety)

Security Considerations

Workspace never overridden: _apply_spec_updates() explicitly skips workspace field from LLM output
Zero tools: Planner agent has no filesystem/GitHub/sandbox access
Session TTL: 30-minute idle timeout prevents memory leaks from abandoned sessions
Auto-inference whitelist: Only reads known project manifest files (package.json, pyproject.toml, etc.)
Protected workspace PIN: Checked at session start, consistent with existing POST /tasks flow

Design Decisions

Decision	Resolution
Planner scope	Pre-graph conversational agent — NOT a LangGraph node
Model	`PLANNER_MODEL` env var, default `gemini-2.0-flash-lite`
Tool access	Zero — non-negotiable
Session storage	In-memory dict with TTL (production: consider Redis)
Submission guard	Required fields must be satisfied; recommended warn but don't block

What's Next (Session 2)

Dashboard: ChatView Planner flow — conversational messages, summary card, "Submit to Architect" button
Dashboard: Planner message styling (distinct from system/orchestrator)
Dashboard: Minimal-input warning display
Wire SSE planner_message events to dashboard store

Testing

cd dev-suite
uv run --group dev pytest tests/test_planner.py -v

Summary by CodeRabbit

New Features
- Interactive AI-guided task planning with checklist-based readiness assessment
- Automatic workspace technology detection (languages, frameworks)
- Multi-turn conversational planning sessions with persistence, expiry, and session lifecycle APIs
- Real-time event streaming for planner responses (SSE)
- API endpoints to start sessions, send messages, and submit planned tasks
- User-facing warnings for incomplete or minimal inputs
Tests
- Comprehensive test coverage for planner inference, checklist logic, message flow, and session lifecycle

Phase B Step 1-3: Planner agent module with: - TaskSpec Pydantic model for structured task specifications - PlannerSession for managing conversational planning sessions - ChecklistStatus for tracking readiness checklist completion - infer_workspace_stack() for auto-detecting languages/frameworks - Planner LLM integration (PLANNER_MODEL env var, zero tools) - Comprehensive unit tests for all planner logic

- EventType.PLANNER_MESSAGE for streaming planner responses via SSE - PlannerStartRequest, PlannerMessageRequest for API endpoints - PlannerSessionResponse, PlannerSubmitResponse for responses - PlannerChecklistItemResponse, PlannerChecklistResponse for readiness - PlannerTaskSpecResponse for structured task spec display

Three new endpoints for the Planner conversational flow: - POST /tasks/plan — start a session with workspace auto-inference - POST /tasks/plan/{id}/message — send message to Planner agent - POST /tasks/plan/{id}/submit — confirm and create task from TaskSpec Includes workspace validation, protected workspace PIN checks, PLANNER_MESSAGE SSE events, and submission guardrails (required fields must be satisfied, workspace never overridden by LLM).

coderabbitai · 2026-04-04T21:28:21Z

📝 Walkthrough

Walkthrough

Adds a Planner agent for conversational task-spec refinement: workspace manifest inference, checklist validation, TTL session management, LLM-driven structured updates, an in-memory session store, SSE events, three API endpoints (create/message/submit), and tests covering inference, checklist, JSON handling, and sessions.

Changes

Cohort / File(s)	Summary
Core Planner Agent `dev-suite/src/agents/planner.py`	New Planner implementation: Pydantic models (`TaskSpec`, checklist enums/models, `PlannerSession`, messages/responses), workspace manifest inference (`infer_workspace_stack`), checklist builder (`build_checklist`), session lifecycle & TTL, in-memory `PlannerSessionStore` singleton, LLM-driven flow (`send_planner_message`), JSON extract/apply utilities.
API Routes & Integration `dev-suite/src/api/main.py`	Adds planner endpoints: `POST /tasks/plan` (start session, infer workspace), `POST /tasks/plan/{session_id}/message` (send message, call planner LLM, publish `PLANNER_MESSAGE` SSE), `POST /tasks/plan/{session_id}/submit` (validate readiness, create/submit task). Adds `_planner_session_to_response` helper and SSE integration.
API Event Types `dev-suite/src/api/events.py`	Added `EventType.PLANNER_MESSAGE = "planner_message"` to support planner SSE events.
API Models `dev-suite/src/api/models.py`	New request/response models for Planner API: `PlannerStartRequest`, `PlannerMessageRequest`, checklist/task-spec/session/submit response models.
Tests `dev-suite/tests/test_planner.py`	Comprehensive pytest suite validating TaskSpec/to_description, checklist construction & warnings, manifest-based workspace inference across languages/frameworks, JSON extraction/stripping, spec update application rules (no workspace override, coercions, skip empties), session TTL/store behavior, and end-to-end `send_planner_message` with mocked LLM.

Sequence Diagram(s)

sequenceDiagram
    actor Client
    participant API as API Server
    participant Planner as Planner Agent
    participant LLM as LLM Service
    participant Store as Session Store
    participant EventBus as Event Bus

    Client->>API: POST /tasks/plan (workspace)
    API->>Planner: infer_workspace_stack(workspace)
    Planner-->>API: {languages, frameworks}
    API->>Planner: create_planner_session(...)
    Planner->>Store: store session
    Planner-->>API: PlannerSessionResponse

    Client->>API: POST /tasks/plan/{id}/message
    API->>Store: get session
    API->>Planner: send_planner_message(session, message)
    Planner->>LLM: call model with spec + checklist
    LLM-->>Planner: text + JSON block
    Planner->>Planner: extract JSON, apply updates, rebuild checklist
    Planner-->>API: PlannerResponse (ready, warnings)
    API->>EventBus: publish PLANNER_MESSAGE
    EventBus-->>Client: SSE event

    Client->>API: POST /tasks/plan/{id}/submit
    API->>Store: get & validate session
    API->>Planner: verify required_satisfied
    API->>API: state_manager.create_task + task_runner.submit
    API->>Store: mark session submitted
    API-->>Client: PlannerSubmitResponse

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

feat: Planner agent + Chat task creation — readiness checklist, workspace selector, TaskSpec #106: Implements the Planner agent, TaskSpec/checklist, workspace inference, PLANNER_MODEL usage, and the /tasks/plan API with planner_message SSE described in the issue.

Poem

🐰 I hop through manifests and whisper the plan,
I sniff languages and frameworks by scan,
With checklist in paw and JSON to bake,
I polish your spec for the Architect's sake,
Hoppity-hop — now send it off, fluffle and fan! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 53.66% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately and specifically describes the main change: implementing a Planner agent with TaskSpec and workspace auto-inference as Phase B backend for Issue `#106`.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

📝 Generate docstrings

Create stacked PR
Commit on current branch

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch feature/106-planner-agent

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dev-suite/src/agents/planner.py`:
- Around line 271-315: The heuristics currently use substring matches and weak
file presence checks which cause false positives (e.g., matching "preact" as
React via _PACKAGE_JSON_FRAMEWORK_MAP and treating any build.gradle as Kotlin);
update the inference logic to use exact package-name matching for package.json
deps (compare dependency keys exactly against keys in
_PACKAGE_JSON_FRAMEWORK_MAP, allow scoped-prefix exceptions explicitly like
"react-" if intended), change pyproject checks to require whole-token matches
against _PYPROJECT_FRAMEWORK_MAP, and tighten Kotlin detection so build.gradle
alone does not cause "Kotlin" to be appended—only add Kotlin when a
build.gradle.kts exists or when build.gradle contains explicit Kotlin DSL/plugin
coordinates or the Kotlin JVM/Android plugin identifiers; keep references to the
same variables (languages, frameworks, workspace) and ensure build_checklist()
receives only strongly-evidenced entries.
- Around line 549-564: The _apply_spec_updates function currently writes any
parsed JSON value into TaskSpec fields, which allows wrong types (e.g.,
"languages": "Python") to corrupt the spec; add runtime validation before
mutating: for each field in updates (inside _apply_spec_updates) check the
expected TaskSpec field type/shape (use TaskSpec.model_fields or
TaskSpec.__fields__ to get expected types), coerce or reject mismatched types
(e.g., require list for languages, dict/structured types for
acceptance_criteria), log or skip invalid updates, and only setattr when the
value matches the expected schema; reference TaskSpec, _apply_spec_updates,
build_checklist(), and TaskSpec.to_description() to ensure incoming values will
be compatible with downstream consumers.

In `@dev-suite/src/api/main.py`:
- Around line 450-456: Before calling state_manager.create_task/
task_runner.submit with session.task_spec, re-validate the workspace policy for
session.task_spec.workspace (i.e., run the same allow-list and PIN checks that
were used at session open) or verify a carried PlannerSession auth flag is still
valid; if the re-check fails, abort and return an error instead of creating the
task. Locate the usage around session.task_spec.to_description(),
state_manager.create_task and task_runner.submit and either invoke the existing
workspace policy check function (the one used when opening a session) or consult
and invalidate a stored PlannerSession auth state before proceeding.

In `@dev-suite/tests/test_planner.py`:
- Around line 307-312: The test test_ignore_unknown_fields is currently vacuous
because the assertion uses "or True"; update the assertion in that test to
actually verify unknown fields are not set: remove the "or True" and assert that
hasattr(spec, "unknown_field") is False (or use "assert not hasattr(spec,
'unknown_field')"), ensuring _apply_spec_updates and TaskSpec behavior is
exercised; locate the test function test_ignore_unknown_fields and replace the
final assertion accordingly so it fails if unknown fields are applied.
- Around line 9-28: The import block includes unused names (Path,
ChecklistPriority, PlannerSession) and is not sorted, causing Ruff I001/F401;
remove the unused imports (drop Path, ChecklistPriority, PlannerSession) from
the import list and reorder the remaining imports into a properly sorted group
(standard library first like json, time, pathlib if kept, then third-party like
pytest, then local module imports) or run isort/ruff --fix to normalize
ordering; ensure the test references still match the retained symbols such as
AsyncMock, patch, pytest, and the planner symbols that are actually used (e.g.,
TaskSpec, _apply_spec_updates, build_checklist, create_planner_session, etc.).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b5a3b371-c820-4886-ae50-b35b6fedb694

📥 Commits

Reviewing files that changed from the base of the PR and between e43843c and f316940.

📒 Files selected for processing (5)

dev-suite/src/agents/planner.py
dev-suite/src/api/events.py
dev-suite/src/api/main.py
dev-suite/src/api/models.py
dev-suite/tests/test_planner.py

dev-suite/src/agents/planner.py

dev-suite/src/api/main.py

dev-suite/tests/test_planner.py

1. test_planner.py: Remove unused imports (Path, ChecklistPriority, PlannerSession) fixing ruff F401/I001 CI failure 2. test_planner.py: Fix vacuous assertion (remove `or True`) 3. test_planner.py: Add tests for type coercion and preact/gradle edge cases 4. planner.py: Use exact dep matching in infer_workspace_stack — prevents "preact" → React false positive 5. planner.py: Tighten Kotlin detection — only from .kts, not plain build.gradle 6. planner.py: Add type validation in _apply_spec_updates — coerce string→list, reject non-string/non-list values 7. planner.py: Move `import re` to top-level (was lazy-imported twice)

Addresses CodeRabbit Major: submit_planner_session now re-checks ws_mgr.is_allowed() and ws_mgr.is_protected() before creating the task. Prevents a session opened before workspace removal/protection from submitting into that directory.

coderabbitai

♻️ Duplicate comments (1)

dev-suite/tests/test_planner.py (1)

15-25: ⚠️ Potential issue | 🟡 Minor

Fix import sorting to resolve CI failure.

The ruff linter is failing with I001 because imports within the src.agents.planner block are not sorted according to ruff's isort rules (typically case-insensitive alphabetical order).

🧹 Proposed fix

 from src.agents.planner import (
-    PlannerSessionStore,
-    TaskSpec,
     _apply_spec_updates,
     _extract_task_spec_updates,
     _strip_json_block,
     build_checklist,
     create_planner_session,
     infer_workspace_stack,
+    PlannerSessionStore,
     send_planner_message,
+    TaskSpec,
 )

Alternatively, run ruff check --fix or ruff format to auto-fix.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@dev-suite/tests/test_planner.py` around lines 15 - 25, The import block from
src.agents.planner is not sorted which triggers ruff I001; reorder the imported
symbols (PlannerSessionStore, TaskSpec, _apply_spec_updates,
_extract_task_spec_updates, _strip_json_block, build_checklist,
create_planner_session, infer_workspace_stack, send_planner_message) into
ruff/isort-compliant alphabetical order (case-insensitive) or run `ruff check
--fix`/`ruff format` to auto-fix the imports so the test file passes CI.

🧹 Nitpick comments (2)

dev-suite/src/agents/planner.py (2)
797-841: Consider thread safety if scaling beyond single-worker deployment.

PlannerSessionStore uses a plain dict which is not thread-safe for concurrent modifications. This is fine for:

Single-worker asyncio (default uvicorn mode)

Multiple workers (each has isolated memory)

If you later need shared session state across workers or use threaded concurrency, consider using threading.Lock or migrating to Redis/external store.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dev-suite/src/agents/planner.py` around lines 797 - 841, PlannerSessionStore
is not thread-safe because it mutates _sessions without synchronization; add a
threading.Lock (e.g. self._lock = threading.Lock()) and acquire/release it (use
context manager) around all accesses and mutations of self._sessions in create,
get, remove, count, and _cleanup_expired to prevent concurrent races, or
alternatively replace the in-memory dict with a process-shared store (e.g.
Redis) and update create/get/remove/count/_cleanup_expired to use that backend
if cross-worker sharing is required.
373-385: Semantic note on auto_inferred flag.

The auto_inferred flag is set to True whenever languages or frameworks have values, even if the user manually provided them. This is technically imprecise—it only indicates "has values" rather than "values came from auto-inference."

For the MVP this is acceptable since it's primarily for UI hints, but consider tracking the actual inference source if future features need to distinguish user-provided vs auto-detected values.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@dev-suite/src/agents/planner.py` around lines 373 - 385, The auto_inferred
flag for the ChecklistItem entries for "languages" and "frameworks" incorrectly
treats any non-empty value as auto-inferred; update the code that builds those
items (the ChecklistItem construction for fields "languages" and "frameworks" in
planner.py) to read a true inference indicator from the TaskSpec instead of
using bool(task_spec.languages/frameworks). Add or use explicit flags on
TaskSpec (e.g., task_spec.inferred_languages and task_spec.inferred_frameworks
or similar properties), set auto_inferred=task_spec.inferred_languages for the
languages item and auto_inferred=task_spec.inferred_frameworks for the
frameworks item, and fall back to False if those inference flags are not present
to preserve current behavior.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@dev-suite/tests/test_planner.py`:
- Around line 15-25: The import block from src.agents.planner is not sorted
which triggers ruff I001; reorder the imported symbols (PlannerSessionStore,
TaskSpec, _apply_spec_updates, _extract_task_spec_updates, _strip_json_block,
build_checklist, create_planner_session, infer_workspace_stack,
send_planner_message) into ruff/isort-compliant alphabetical order
(case-insensitive) or run `ruff check --fix`/`ruff format` to auto-fix the
imports so the test file passes CI.

---

Nitpick comments:
In `@dev-suite/src/agents/planner.py`:
- Around line 797-841: PlannerSessionStore is not thread-safe because it mutates
_sessions without synchronization; add a threading.Lock (e.g. self._lock =
threading.Lock()) and acquire/release it (use context manager) around all
accesses and mutations of self._sessions in create, get, remove, count, and
_cleanup_expired to prevent concurrent races, or alternatively replace the
in-memory dict with a process-shared store (e.g. Redis) and update
create/get/remove/count/_cleanup_expired to use that backend if cross-worker
sharing is required.
- Around line 373-385: The auto_inferred flag for the ChecklistItem entries for
"languages" and "frameworks" incorrectly treats any non-empty value as
auto-inferred; update the code that builds those items (the ChecklistItem
construction for fields "languages" and "frameworks" in planner.py) to read a
true inference indicator from the TaskSpec instead of using
bool(task_spec.languages/frameworks). Add or use explicit flags on TaskSpec
(e.g., task_spec.inferred_languages and task_spec.inferred_frameworks or similar
properties), set auto_inferred=task_spec.inferred_languages for the languages
item and auto_inferred=task_spec.inferred_frameworks for the frameworks item,
and fall back to False if those inference flags are not present to preserve
current behavior.

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: f08a9345-e23e-460c-8f51-4274f8863c65

📥 Commits

Reviewing files that changed from the base of the PR and between f316940 and e17062c.

📒 Files selected for processing (2)

dev-suite/src/agents/planner.py
dev-suite/tests/test_planner.py

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dev-suite/src/api/main.py`:
- Around line 376-381: The exception message returned to the client leaks
internal details; keep the existing logger.error with exc_info=True for full
diagnostics but replace the client-facing _error call so it sends a generic
message (e.g., "Planner failed" or "Internal planner error") and the same 502
status without including the exception object; update the except block around
send_planner_message to call _error with a sanitized string while still logging
the full exception via logger.error and then return.
- Around line 462-464: The planner submission path currently calls
state_manager.create_task(...) and task_runner.submit(task_id, description,
workspace=workspace) but omits the publish_pr flag that the regular POST /tasks
path forwards; update the planner flow to accept and propagate publish_pr (e.g.
add publish_pr to PlannerStartRequest or the planner handler's params) and pass
it into task_runner.submit(task_id, description, workspace=workspace,
publish_pr=publish_pr) so planner-created tasks honor the same PR publishing
option as the regular endpoint.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0a539a69-8463-4bdb-80e9-e34c070186cd

📥 Commits

Reviewing files that changed from the base of the PR and between e17062c and 61a7456.

📒 Files selected for processing (1)

dev-suite/src/api/main.py

coderabbitai · 2026-04-05T03:30:32Z

dev-suite/src/api/main.py

+    try:
+        planner_resp = await send_planner_message(session, body.message)
+    except Exception as e:
+        logger.error("Planner LLM call failed: %s", e, exc_info=True)
+        _error(f"Planner failed: {e}", 502)
+        return


⚠️ Potential issue | 🟡 Minor

Sanitize exception details before returning to client.

The error message f"Planner failed: {e}" may leak internal details (stack traces, file paths, or configuration info) from the underlying LLM call or JSON extraction. Consider a generic message for the client and log the full exception separately.

🛡️ Suggested fix

try: planner_resp = await send_planner_message(session, body.message) except Exception as e: logger.error("Planner LLM call failed: %s", e, exc_info=True) - _error(f"Planner failed: {e}", 502) + _error("Planner service temporarily unavailable. Please try again.", 502) return

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

try:

planner_resp = await send_planner_message(session, body.message)

except Exception as e:

logger.error("Planner LLM call failed: %s", e, exc_info=True)

_error(f"Planner failed: {e}", 502)

return

try:

planner_resp = await send_planner_message(session, body.message)

except Exception as e:

logger.error("Planner LLM call failed: %s", e, exc_info=True)

_error("Planner service temporarily unavailable. Please try again.", 502)

return

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@dev-suite/src/api/main.py` around lines 376 - 381, The exception message returned to the client leaks internal details; keep the existing logger.error with exc_info=True for full diagnostics but replace the client-facing _error call so it sends a generic message (e.g., "Planner failed" or "Internal planner error") and the same 502 status without including the exception object; update the except block around send_planner_message to call _error with a sanitized string while still logging the full exception via logger.error and then return.

coderabbitai · 2026-04-05T03:30:32Z

dev-suite/src/api/main.py

+    # Create task via existing flow
+    task_id = await state_manager.create_task(description, workspace=workspace)
+    task_runner.submit(task_id, description, workspace=workspace)


⚠️ Potential issue | 🟡 Minor

Missing publish_pr parameter for planner-submitted tasks.

The regular POST /tasks endpoint at line 201 passes publish_pr=body.publish_pr to task_runner.submit(), but the planner submit path omits it. Planner-created tasks will always use the default PR behavior with no user control.

Consider adding publish_pr to the submit request or PlannerStartRequest, or document that planner tasks always use the default.

🔧 Suggested fix if adding to submit endpoint

+from .models import PlannerSubmitRequest # Add new request model `@app.post`("/tasks/plan/{session_id}/submit", response_model=ApiResponse) async def submit_planner_session( session_id: str, + body: PlannerSubmitRequest | None = None, _auth: str | None = Depends(require_auth), ): # ... existing validation ... # Create task via existing flow task_id = await state_manager.create_task(description, workspace=workspace) - task_runner.submit(task_id, description, workspace=workspace) + task_runner.submit( + task_id, + description, + workspace=workspace, + publish_pr=body.publish_pr if body else None, + )

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@dev-suite/src/api/main.py` around lines 462 - 464, The planner submission path currently calls state_manager.create_task(...) and task_runner.submit(task_id, description, workspace=workspace) but omits the publish_pr flag that the regular POST /tasks path forwards; update the planner flow to accept and propagate publish_pr (e.g. add publish_pr to PlannerStartRequest or the planner handler's params) and pass it into task_runner.submit(task_id, description, workspace=workspace, publish_pr=publish_pr) so planner-created tasks honor the same PR publishing option as the regular endpoint.

Single blank line between import block and first code, not two.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

dev-suite/tests/test_planner.py (1)

386-389: Use SESSION_TTL_SECONDS in expiry tests instead of hard-coded 3600.

These assertions are coupled to an implicit 30-minute policy but encode 1 hour directly. If TTL changes, tests can become misleading. Prefer deriving from the source constant.

Proposed diff

 from src.agents.planner import (
     PlannerSessionStore,
+    SESSION_TTL_SECONDS,
     TaskSpec,
@@
     def test_session_expiry(self):
         session = create_planner_session(workspace="/proj")
-        session.last_activity = time.time() - 3600  # 1 hour ago
+        session.last_activity = time.time() - (SESSION_TTL_SECONDS + 1)
         assert session.is_expired
@@
     def test_expired_session_removed(self):
         store = PlannerSessionStore()
         session = create_planner_session(workspace="/proj")
-        session.last_activity = time.time() - 3600
+        session.last_activity = time.time() - (SESSION_TTL_SECONDS + 1)
         store.create(session)
         assert store.get(session.session_id) is None

Also applies to: 412-417

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@dev-suite/tests/test_planner.py` around lines 386 - 389, Update the expiry
tests to compute the stale timestamp using the module constant
SESSION_TTL_SECONDS instead of the hard-coded 3600; specifically, in
test_session_expiry (and the similar test around lines for session inactivity)
set session.last_activity = time.time() - (SESSION_TTL_SECONDS + 1) (or -
SESSION_TTL_SECONDS to test boundary) so the assertion session.is_expired
derives from SESSION_TTL_SECONDS, and import or reference the constant from the
planner/session module where create_planner_session and session.is_expired are
defined.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@dev-suite/tests/test_planner.py`:
- Around line 391-396: The test_session_touch uses time.sleep which causes
flakiness; instead, set session.last_activity to a deterministic older timestamp
and call session.touch(), then assert session.last_activity is greater than the
previous value. Modify test_session_touch to create the session via
create_planner_session(), assign session.last_activity to a past datetime (or
timestamp) value, call session.touch(), and assert the updated
session.last_activity > previous value (referencing the test function name
test_session_touch and methods create_planner_session, session.touch, and
session.last_activity).

---

Nitpick comments:
In `@dev-suite/tests/test_planner.py`:
- Around line 386-389: Update the expiry tests to compute the stale timestamp
using the module constant SESSION_TTL_SECONDS instead of the hard-coded 3600;
specifically, in test_session_expiry (and the similar test around lines for
session inactivity) set session.last_activity = time.time() -
(SESSION_TTL_SECONDS + 1) (or - SESSION_TTL_SECONDS to test boundary) so the
assertion session.is_expired derives from SESSION_TTL_SECONDS, and import or
reference the constant from the planner/session module where
create_planner_session and session.is_expired are defined.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 68c5fb54-8cf2-4e77-afce-254a0cc3424b

📥 Commits

Reviewing files that changed from the base of the PR and between 61a7456 and a34fbd4.

📒 Files selected for processing (1)

dev-suite/tests/test_planner.py

coderabbitai · 2026-04-05T03:38:44Z

dev-suite/tests/test_planner.py

+    def test_session_touch(self):
+        session = create_planner_session(workspace="/proj")
+        old_time = session.last_activity
+        time.sleep(0.01)
+        session.touch()
+        assert session.last_activity > old_time


⚠️ Potential issue | 🟡 Minor

Avoid time.sleep() in test_session_touch to reduce CI flakiness.

Sleeping in unit tests is slow and timing-sensitive. You can set last_activity to a known older value and assert touch() updates it.

Proposed diff

def test_session_touch(self): session = create_planner_session(workspace="/proj") - old_time = session.last_activity - time.sleep(0.01) + old_time = session.last_activity + session.last_activity = old_time - 10 session.touch() - assert session.last_activity > old_time + assert session.last_activity > old_time

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

def test_session_touch(self):

session = create_planner_session(workspace="/proj")

old_time = session.last_activity

time.sleep(0.01)

session.touch()

assert session.last_activity > old_time

def test_session_touch(self):

session = create_planner_session(workspace="/proj")

old_time = session.last_activity

session.last_activity = old_time - 10

session.touch()

assert session.last_activity > old_time

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@dev-suite/tests/test_planner.py` around lines 391 - 396, The test_session_touch uses time.sleep which causes flakiness; instead, set session.last_activity to a deterministic older timestamp and call session.touch(), then assert session.last_activity is greater than the previous value. Modify test_session_touch to create the session via create_planner_session(), assign session.last_activity to a past datetime (or timestamp) value, call session.touch(), and assert the updated session.last_activity > previous value (referencing the test function name test_session_touch and methods create_planner_session, session.touch, and session.last_activity).

Wire the dashboard ChatView to the Planner backend (PR #121). Conversational task planning flow: workspace select → describe objective → Planner validates checklist → submit to Architect. New files: - stores/planner.svelte.ts — session state, message history, checklist - routes/api/planner/+server.ts — proxy POST /tasks/plan - routes/api/planner/[id]/message/+server.ts — proxy message endpoint - routes/api/planner/[id]/submit/+server.ts — proxy submit endpoint Modified files: - types/api.ts — Planner types + planner_message SSE event - sse.ts — dispatch planner_message events to plannerStore - stores/index.ts — export + teardown plannerStore - components/views/ChatView.svelte — full rewrite with Planner flow Closes #106

Abernaughty added 3 commits April 4, 2026 15:24

coderabbitai bot reviewed Apr 4, 2026

View reviewed changes

Abernaughty added 2 commits April 4, 2026 21:20

coderabbitai bot reviewed Apr 5, 2026

View reviewed changes

fix(#106): remove extra blank line after imports (ruff I001)

a34fbd4

Single blank line between import block and first code, not two.

coderabbitai bot reviewed Apr 5, 2026

View reviewed changes

Abernaughty merged commit c92efbc into main Apr 5, 2026
3 checks passed

Abernaughty deleted the feature/106-planner-agent branch April 5, 2026 03:40

Abernaughty mentioned this pull request Apr 5, 2026

feat: Planner agent + Chat task creation — readiness checklist, workspace selector, TaskSpec #106

Closed

29 tasks

Abernaughty mentioned this pull request Apr 5, 2026

feat(dashboard): #106 Phase B — ChatView planner UI #124

Merged

Conversation

Abernaughty commented Apr 4, 2026 • edited by coderabbitai bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's New

Planner Agent (src/agents/planner.py)

API Endpoints (src/api/main.py)

SSE Events (src/api/events.py)

API Models (src/api/models.py)

Tests (tests/test_planner.py)

Security Considerations

Design Decisions

What's Next (Session 2)

Testing

Summary by CodeRabbit

Uh oh!

coderabbitai bot commented Apr 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Possibly related issues

Poem

❌ Failed checks (1 warning)

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Abernaughty commented Apr 4, 2026 •

edited by coderabbitai bot

Loading

Planner Agent (`src/agents/planner.py`)

API Endpoints (`src/api/main.py`)

SSE Events (`src/api/events.py`)

API Models (`src/api/models.py`)

Tests (`tests/test_planner.py`)

coderabbitai bot commented Apr 4, 2026 •

edited

Loading