fix(discovery): Allow starting discovery when project is running but discovery idle (#181)
Conversation
…discovery idle

When a project is created, it's set to status "running" but discovery may not have started (state: "idle"). Previously, the `/start` endpoint returned "already running" without checking discovery state, blocking the "Start Discovery" button.

Changes:
- `agents.py`: check discovery state via `LeadAgent` when project is running
  - If discovery is "idle": start discovery (202 Accepted)
  - If discovery is "discovering": return "already running" (200 OK)
  - If discovery is "completed": return appropriate status (200 OK)
  - On status check error: fall back to starting discovery
- Updated unit tests with 4 new test cases for discovery state handling
- Updated existing test to mock discovery state correctly
- Added frontend test for "already running" response handling
- Added E2E tests for discovery start workflow
Walkthrough

Adds discovery-state awareness to the project start endpoint: when a project is RUNNING and an API key is set, the router queries `LeadAgent` for discovery status (idle / discovering / completed / error) and either starts discovery in the background or returns an appropriate 200/202 response.

Changes
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant Router as Agent Router
    participant LeadAgent
    participant BG as Background Task
    Client->>Router: POST /projects/:id/start
    Router->>Router: Fetch project & check status
    Router->>Router: Check API key configured?
    alt API key configured
        Router->>LeadAgent: get_discovery_status(project_id)
        alt discovery == idle
            LeadAgent-->>Router: idle
            Router->>BG: start_agent(project_id) (async)
            Router-->>Client: 202 Accepted (starting discovery)
        else discovery == discovering
            LeadAgent-->>Router: discovering
            Router-->>Client: 200 OK (discovery in progress)
        else discovery == completed
            LeadAgent-->>Router: completed
            Router-->>Client: 200 OK (project running)
        else status check error
            LeadAgent-->>Router: error/exception
            Router->>BG: start_agent(project_id) (fallback)
            Router-->>Client: 202 Accepted (starting discovery)
        end
    else No API key
        Router-->>Client: 200 OK (project running)
    end
```
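The branching above can be sketched as a plain decision function. This is an illustrative sketch, not the PR's code: `decide_start_response` and its return shape are made-up names, condensing the router's HTTP responses into tuples.

```python
# Minimal sketch of the /start branching shown in the diagram.
# decide_start_response is a hypothetical helper, not part of the PR.

def decide_start_response(api_key_set: bool, discovery_state) -> tuple:
    """Map (API key present, discovery state) to (HTTP status, action)."""
    if not api_key_set:
        # Can't instantiate LeadAgent to check state; report already running
        return (200, "project running")
    if discovery_state == "idle":
        return (202, "start discovery")       # kick off background task
    if discovery_state == "discovering":
        return (200, "discovery in progress")
    if discovery_state == "completed":
        return (200, "project running")
    # Status-check error: fall back to starting discovery
    return (202, "start discovery")
```

The key property is that the RUNNING-but-idle case now maps to 202 (start) instead of an unconditional "already running".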
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~28 minutes

Possibly related PRs
Pre-merge checks and finishing touches: ✅ Passed checks (3 passed)
Code Review - PR #181

Great work identifying and fixing the discovery state synchronization issue! This is a thoughtful solution that properly checks discovery state before blocking the "Start Discovery" button. Here's my detailed review:

✅ Strengths
🔍 Code Quality Issues

1. Resource Leak - LeadAgent Not Cleaned Up
Actionable comments posted: 0
🧹 Nitpick comments (5)
codeframe/ui/routers/agents.py (2)
84-94: Potential duplicate LeadAgent instantiation.

When discovery is idle (lines 90-94), a temporary `LeadAgent` is created to check status, then `start_agent` is immediately called, which creates another `LeadAgent` instance (per the `start_agent` snippet in relevant code). This results in two `LeadAgent` instances being created for the same project.

Consider either:
- Reusing the `temp_agent` by storing it in `running_agents` before calling `start_agent`
- Checking if an agent already exists in `running_agents` before creating `temp_agent`

🔎 Proposed optimization to avoid duplicate agent creation

```diff
 if discovery_state == "idle":
     # Discovery not started - proceed to start discovery
     logger.info(f"Project {project_id} is running but discovery is idle - starting discovery")
+    # Store the temp_agent to avoid creating a duplicate in start_agent
+    from codeframe.ui.shared import shared_state
+    async with shared_state._agents_lock:
+        if project_id not in shared_state._running_agents:
+            shared_state._running_agents[project_id] = temp_agent
+            running_agents[project_id] = temp_agent
     background_tasks.add_task(start_agent, project_id, db, running_agents, api_key)
     return {"message": f"Starting discovery for project {project_id}", "status": "starting"}
```

Note: This is a performance consideration. The current implementation works correctly but may be wasteful if `LeadAgent` initialization is non-trivial.
113-118: Reasonable fallback when API key unavailable.

When the API key is not configured, the endpoint can't instantiate a `LeadAgent` to check discovery state, so returning a generic "already running" message is appropriate. Consider adding a log message here for observability.

```diff
 else:
     # No API key available - can't check discovery status or start
+    logger.debug(f"Project {project_id} is running but no API key to check discovery state")
     return JSONResponse(
         status_code=200,
         content={"message": f"Project {project_id} is already running", "status": "running"},
     )
```

tests/e2e/test_start_agent_flow.spec.ts (3)
139-147: Unused locator and weak final assertion.

The `discoveryActive` locator (lines 139-141) is defined but never used in any assertion. The final assertion only checks that `discoverySection` is visible, which was already verified at line 115. This makes the test's verification incomplete.

Consider either:
- Removing the unused locator
- Adding a meaningful assertion using `discoveryActive`

🔎 Proposed fix to strengthen the test assertion

```diff
   // At this point, discovery should be started (either auto or via button)
   // Verify that we see either:
   // - Discovery question input
   // - Discovery in progress indicator
   // - Discovery complete message
-  const discoveryActive = page.locator('[data-testid="discovery-answer-input"]')
-    .or(page.locator('text=/discovery complete/i'))
-    .or(page.locator('text=/not started/i'));
-
-  // Give some time for the state to update
-  await page.waitForTimeout(1000);
-
-  // The test passes if discovery section is visible and properly initialized
-  await expect(discoverySection).toBeVisible();
+  // Give some time for the state to update
+  await page.waitForTimeout(1000);
+
+  // Verify discovery section shows meaningful content (not just visible)
+  const discoveryContent = page.locator('[data-testid="discovery-answer-input"]')
+    .or(page.locator('text=/discovery complete/i'))
+    .or(page.locator('text=/not started/i'))
+    .or(page.getByTestId('mock-progress-bar'));
+
+  await expect(discoverySection).toBeVisible();
+  // At least one of these states should be present
+  await expect(discoveryContent).toBeVisible({ timeout: 5000 });
 });
```
130-132: Consider replacing `waitForTimeout` with explicit element waits.

Using `page.waitForTimeout(2000)` introduces arbitrary delays that can cause flaky tests. Prefer waiting for specific state changes.

🔎 More robust wait alternative

```diff
-  // Wait for discovery to actually start (question appears or progress updates)
-  await page.waitForTimeout(2000);
+  // Wait for discovery to actually start (question appears or button state changes)
+  await Promise.race([
+    page.getByTestId('discovery-answer-input').waitFor({ state: 'visible', timeout: 5000 }),
+    page.locator('text=/not started/i').waitFor({ state: 'hidden', timeout: 5000 }),
+  ]).catch(() => {
+    // Either condition is acceptable - discovery may or may not have fully started
+  });
```
164-172: Test assertion could be more explicit.

The conditional check means the test passes whether or not an alert is visible. If you specifically want to ensure no error alert appears, consider a more direct assertion.

```diff
-  const errorAlert = page.getByRole('alert');
-  const hasError = await errorAlert.isVisible().catch(() => false);
-
-  // Should not show any error on initial load
-  if (hasError) {
-    const errorText = await errorAlert.textContent();
-    // "Failed to start discovery" is an error we want to catch
-    expect(errorText).not.toContain('Failed to start discovery');
-  }
+  // Verify no "Failed to start discovery" error is shown
+  // Using a short timeout since we expect this NOT to be visible
+  await expect(page.locator('text=/failed to start discovery/i'))
+    .not.toBeVisible({ timeout: 2000 });
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- codeframe/ui/routers/agents.py
- tests/agents/test_agent_lifecycle.py
- tests/e2e/test_start_agent_flow.spec.ts
- web-ui/src/components/__tests__/DiscoveryProgress.test.tsx
🧰 Additional context used
📓 Path-based instructions (5)
tests/e2e/**/*.ts
📄 CodeRabbit inference engine (CLAUDE.md)
Implement E2E tests using Playwright + TestSprite with loginUser() helper from tests/e2e/test-utils.ts for authentication
Files:
tests/e2e/test_start_agent_flow.spec.ts
web-ui/src/**/*.{ts,tsx}
📄 CodeRabbit inference engine (CLAUDE.md)
web-ui/src/**/*.{ts,tsx}: Use TypeScript 5.3+ for frontend development with React 18, Next.js 14, Tailwind CSS, and Hugeicons for icons
Use TypeScript strict mode and ensure 100% type safety with no hardcoded types or any usage in frontend code
Files:
web-ui/src/components/__tests__/DiscoveryProgress.test.tsx
web-ui/src/components/**/*.tsx
📄 CodeRabbit inference engine (CLAUDE.md)
web-ui/src/components/**/*.tsx: Use shadcn/ui Nova template components with semantic color palette (bg-card, text-foreground, etc.) and avoid hardcoded color values
Use cn() utility for conditional Tailwind CSS classes and follow Nova's compact spacing conventions
Files:
web-ui/src/components/__tests__/DiscoveryProgress.test.tsx
web-ui/src/**/*.tsx
📄 CodeRabbit inference engine (CLAUDE.md)
Replace all icon usage with Hugeicons (@hugeicons/react) and do not mix with lucide-react
Files:
web-ui/src/components/__tests__/DiscoveryProgress.test.tsx
codeframe/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
codeframe/**/*.py: Use Python 3.11+ for backend development with FastAPI, AsyncAnthropic, SQLite with async support (aiosqlite), and tiktoken for token counting
Use token counting via tiktoken library for token budget management with ~50,000 token limit per conversation
Use asyncio patterns with AsyncAnthropic for async/await in Python backend for concurrent operations
Implement quality gates with multi-stage pre-completion checks (tests → type → coverage → review) and Git + SQLite + context snapshots for project state rollback
Use tiered memory system (HOT/WARM/COLD) with importance scoring using hybrid exponential decay algorithm for context management with 30-50% token reduction
Implement session lifecycle management with auto-save/restore using file-based storage at .codeframe/session_state.json
Files:
codeframe/ui/routers/agents.py
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: frankbria/codeframe PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-04T06:26:12.845Z
Learning: Applies to web-ui/src/app/page.tsx : Implement automatic project discovery start after project creation with loading state transitions and 'Start Discovery' button for idle projects
📚 Learning: 2026-01-04T06:26:12.845Z
Learnt from: CR
Repo: frankbria/codeframe PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-04T06:26:12.845Z
Learning: Applies to web-ui/src/app/page.tsx : Implement automatic project discovery start after project creation with loading state transitions and 'Start Discovery' button for idle projects
Applied to files:
- tests/e2e/test_start_agent_flow.spec.ts
- web-ui/src/components/__tests__/DiscoveryProgress.test.tsx
- codeframe/ui/routers/agents.py
📚 Learning: 2025-11-25T19:08:37.203Z
Learnt from: CR
Repo: frankbria/codeframe PR: 0
File: docs/CLAUDE.md:0-0
Timestamp: 2025-11-25T19:08:37.203Z
Learning: Implement Lead Agent for orchestration and Worker Agents for specialization (Backend, Frontend, Test, Review) with maturity levels D1-D4
Applied to files:
codeframe/ui/routers/agents.py
📚 Learning: 2025-12-17T19:21:30.131Z
Learnt from: frankbria
Repo: frankbria/codeframe PR: 128
File: tests/agents/test_bottleneck_detection.py:486-500
Timestamp: 2025-12-17T19:21:30.131Z
Learning: In tests/agents/*.py, when testing bottleneck detection logic, ensure that tests exercising detect_bottlenecks are async and mock _get_agent_workload to return a value below AGENT_OVERLOAD_THRESHOLD (5) while providing a non-empty tasks list to prevent early return. This guarantees the code path for low workload is exercised and behavior under threshold is verified.
Applied to files:
tests/agents/test_agent_lifecycle.py
🧬 Code graph analysis (3)
web-ui/src/components/__tests__/DiscoveryProgress.test.tsx (2)
web-ui/src/types/api.ts (1)
`DiscoveryProgressResponse` (135-139)

web-ui/src/lib/api.ts (1)

`projectsApi` (30-54)
codeframe/ui/routers/agents.py (2)
codeframe/agents/lead_agent.py (2)
`LeadAgent` (28-2126), `get_discovery_status` (393-447)

codeframe/ui/shared.py (1)

`start_agent` (257-360)
tests/agents/test_agent_lifecycle.py (3)
codeframe/core/models.py (2)
`ProjectStatus` (34-43), `status` (246-247)

codeframe/persistence/database.py (1)

`update_project` (286-288)

codeframe/agents/lead_agent.py (1)

`get_discovery_status` (393-447)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Backend Unit Tests
- GitHub Check: Frontend Unit Tests
- GitHub Check: E2E Smoke Tests (Chromium)
- GitHub Check: claude-review
🔇 Additional comments (6)
codeframe/ui/routers/agents.py (1)
107-112: Broad exception handling with fallback is appropriate here.

The fallback to starting discovery when the status check fails is a reasonable defensive pattern. The comment correctly notes that `start_agent` handles duplicates gracefully. The warning log provides visibility into these edge cases.

web-ui/src/components/__tests__/DiscoveryProgress.test.tsx (1)
448-505: Well-structured test for the "already running" edge case.

The test correctly validates that:
- `startProject` is called when the button is clicked
- No error is shown when the backend returns a "running" status (non-error response)
- The component still refreshes discovery progress after the response

The comment on line 449 clearly explains the scenario being tested.
tests/agents/test_agent_lifecycle.py (4)
191-234: Updated test correctly reflects new behavior.

The test now properly mocks `LeadAgent.get_discovery_status` to return the "discovering" state, which is the condition under which the endpoint should return 200 "already running". The docstring update clearly explains the new behavior.

609-648: Comprehensive test for the core fix scenario.

This test directly validates the fix: when the project is RUNNING but discovery is idle, the endpoint should return 202 and start discovery. The assertion at line 647 verifies that `start_agent` was actually called.
649-689: Good coverage for "discovery already active" scenario.

The test correctly verifies that when discovery is already in progress:
- The endpoint returns 200 OK
- `start_agent` is NOT called (avoiding duplicate work)

736-768: Important error fallback test.

This test validates the defensive behavior: when the discovery status check fails (e.g., a database error), the endpoint falls back to starting discovery rather than failing. This ensures the user can still start discovery even if there's a transient error checking status.
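The test shapes described above can be sketched with `unittest.mock` against a simplified stand-in. This is a hedged illustration: `start_project` here is a made-up stand-in for the router function, not the real FastAPI endpoint the PR's tests exercise.

```python
# Hedged sketch of the unit-test shape described above, using unittest.mock.
# start_project is a hypothetical stand-in, not codeframe's router function.
from unittest.mock import MagicMock

def start_project(discovery_state: str, start_agent) -> int:
    """Stand-in endpoint: calls start_agent and returns 202 only when discovery is idle."""
    if discovery_state == "idle":
        start_agent()  # the real router schedules this via background_tasks.add_task
        return 202
    return 200  # "already running" / "completed" cases

# Case 1: running project, idle discovery -> 202 and start_agent called
mock_start = MagicMock()
status_idle = start_project("idle", mock_start)

# Case 2: discovery already in progress -> 200 and start_agent NOT called
mock_start_busy = MagicMock()
status_discovering = start_project("discovering", mock_start_busy)
```

The mock's call count is what distinguishes the two scenarios: exactly one call in the idle case, none in the discovering case.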
Check running_agents for existing agent before creating temp_agent for status check. This avoids creating two LeadAgent instances when one already exists. Addresses CodeRabbit feedback on PR #181.
Actionable comments posted: 0
🧹 Nitpick comments (2)
codeframe/ui/routers/agents.py (2)
114-119: Error handling falls back gracefully.

The exception handler catches all errors when checking discovery status and falls back to starting discovery. This is a safe approach since `start_agent` handles duplicates gracefully. The warning log at line 115 is appropriate.

Consider enhancing the log message to include the exception type for better debugging.

🔎 Optional enhancement: Include exception type in log

```diff
-logger.warning(f"Failed to check discovery status for project {project_id}: {e}")
+logger.warning(f"Failed to check discovery status for project {project_id}: {type(e).__name__}: {e}")
```
127-130: Consider refactoring to avoid duplicate API key fetch.

The API key is fetched from the environment at both line 82 (for the discovery state check) and line 127 (for starting the agent). While this doesn't cause incorrect behavior, it's slightly redundant.

Consider fetching the API key once at the beginning of the function after authorization checks.

🔎 Proposed refactor to eliminate duplicate fetch

```diff
     # Authorization check
     if not db.user_has_project_access(current_user.id, project_id):
         raise HTTPException(status_code=403, detail="Access denied")
+
+    # Get API key from environment once
+    api_key = os.environ.get("ANTHROPIC_API_KEY")

     # cf-10.2: Handle idempotent behavior - check discovery state if already running
     if project["status"] == ProjectStatus.RUNNING.value:
         # Check discovery state before returning "already running"
         # If discovery is idle, we should still start discovery
-        api_key = os.environ.get("ANTHROPIC_API_KEY")
         if api_key:
             try:
                 # First check if agent already exists in running_agents to avoid duplicate instantiation
                 existing_agent = running_agents.get(project_id)
                 if existing_agent:
                     # Reuse existing agent for status check
                     discovery_status = existing_agent.get_discovery_status()
                 else:
                     # Create temporary agent only for status check (will be replaced by start_agent if needed)
                     temp_agent = LeadAgent(project_id=project_id, db=db, api_key=api_key)
                     discovery_status = temp_agent.get_discovery_status()
                 discovery_state = discovery_status.get("state", "idle")
                 if discovery_state == "idle":
                     # Discovery not started - proceed to start discovery
                     logger.info(f"Project {project_id} is running but discovery is idle - starting discovery")
                     background_tasks.add_task(start_agent, project_id, db, running_agents, api_key)
                     return {"message": f"Starting discovery for project {project_id}", "status": "starting"}
                 elif discovery_state == "discovering":
                     # Discovery already in progress
                     return JSONResponse(
                         status_code=200,
                         content={"message": f"Project {project_id} discovery already in progress", "status": "running"},
                     )
                 elif discovery_state == "completed":
                     # Discovery completed
                     return JSONResponse(
                         status_code=200,
                         content={"message": f"Project {project_id} discovery already completed", "status": "completed"},
                     )
             except Exception as e:
                 logger.warning(f"Failed to check discovery status for project {project_id}: {e}")
                 # Fall back to normal start flow on error - try to start discovery
                 # This is safer than blocking since start_agent handles duplicates gracefully
                 background_tasks.add_task(start_agent, project_id, db, running_agents, api_key)
                 return {"message": f"Starting discovery for project {project_id}", "status": "starting"}
         else:
             # No API key available - can't check discovery status or start
             return JSONResponse(
                 status_code=200,
                 content={"message": f"Project {project_id} is already running", "status": "running"},
             )

-    # cf-10.2: Get API key from environment
-    api_key = os.environ.get("ANTHROPIC_API_KEY")
     if not api_key:
         raise HTTPException(status_code=500, detail="ANTHROPIC_API_KEY not configured")

     # cf-10.2: Start agent in background task (non-blocking)
     background_tasks.add_task(start_agent, project_id, db, running_agents, api_key)

     # cf-10.2: Return 202 Accepted immediately
     return {"message": f"Starting Lead Agent for project {project_id}", "status": "starting"}
```
📜 Review details
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
codeframe/ui/routers/agents.py
🧰 Additional context used
📓 Path-based instructions (1)
codeframe/**/*.py
📄 CodeRabbit inference engine (CLAUDE.md)
codeframe/**/*.py: Use Python 3.11+ for backend development with FastAPI, AsyncAnthropic, SQLite with async support (aiosqlite), and tiktoken for token counting
Use token counting via tiktoken library for token budget management with ~50,000 token limit per conversation
Use asyncio patterns with AsyncAnthropic for async/await in Python backend for concurrent operations
Implement quality gates with multi-stage pre-completion checks (tests → type → coverage → review) and Git + SQLite + context snapshots for project state rollback
Use tiered memory system (HOT/WARM/COLD) with importance scoring using hybrid exponential decay algorithm for context management with 30-50% token reduction
Implement session lifecycle management with auto-save/restore using file-based storage at .codeframe/session_state.json
Files:
codeframe/ui/routers/agents.py
🧠 Learnings (3)
📓 Common learnings
Learnt from: CR
Repo: frankbria/codeframe PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-04T06:26:12.845Z
Learning: Applies to web-ui/src/app/page.tsx : Implement automatic project discovery start after project creation with loading state transitions and 'Start Discovery' button for idle projects
📚 Learning: 2025-11-25T19:08:37.203Z
Learnt from: CR
Repo: frankbria/codeframe PR: 0
File: docs/CLAUDE.md:0-0
Timestamp: 2025-11-25T19:08:37.203Z
Learning: Implement Lead Agent for orchestration and Worker Agents for specialization (Backend, Frontend, Test, Review) with maturity levels D1-D4
Applied to files:
codeframe/ui/routers/agents.py
📚 Learning: 2026-01-04T06:26:12.845Z
Learnt from: CR
Repo: frankbria/codeframe PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-04T06:26:12.845Z
Learning: Applies to web-ui/src/app/page.tsx : Implement automatic project discovery start after project creation with loading state transitions and 'Start Discovery' button for idle projects
Applied to files:
codeframe/ui/routers/agents.py
🧬 Code graph analysis (1)
codeframe/ui/routers/agents.py (3)
codeframe/agents/lead_agent.py (1)
`LeadAgent` (28-2126)

codeframe/core/models.py (2)

`ProjectStatus` (34-43), `project_id` (234-235)

codeframe/ui/shared.py (1)

`start_agent` (257-360)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Backend Unit Tests
- GitHub Check: Frontend Unit Tests
- GitHub Check: claude-review
- GitHub Check: E2E Smoke Tests (Chromium)
🔇 Additional comments (5)
codeframe/ui/routers/agents.py (5)
23-23: LGTM!

The `LeadAgent` import is necessary for the discovery-state checking logic introduced in this PR.

48-63: LGTM!

The docstring updates accurately reflect the new discovery-state-aware behavior and all possible return codes.

95-113: Discovery state branching logic is correct.

The three-way branch (idle/discovering/completed) with appropriate HTTP status codes (202 for starting, 200 for already running/completed) correctly implements idempotent behavior. The status messages are clear and informative.

89-89: `get_discovery_status()` method is properly implemented on LeadAgent.

The method exists at line 393 of codeframe/agents/lead_agent.py with return type `Dict[str, Any]` and correctly returns a dictionary with a "state" key containing the values "idle", "discovering", or "completed", as used at lines 89 and 93.
85-93: The race condition is real but mitigated by design.

At line 86, `running_agents.get(project_id)` lacks synchronization. A thread could find no agent, allow another thread to create one, then create a temporary agent. However, verification confirms this is acceptable because:
- The temporary agent is not stored in `running_agents`, so it's garbage collected after the status check
- If `start_agent` is called (lines 100 or 118), it uses `shared_state._agents_lock` to prevent duplicate instantiation
- The `get_discovery_status()` method exists on LeadAgent and is safe to invoke

The race window is narrow and its only consequence is a brief unnecessary instantiation. No additional locking is required.
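The lock-guarded check-then-create pattern that the review attributes to `start_agent` can be sketched in isolation. `AgentRegistry` and `get_or_create` are illustrative names, not codeframe's API; the point is that checking and inserting under one `asyncio.Lock` prevents duplicate instantiation.

```python
# Sketch of check-then-create under an asyncio.Lock (hypothetical names).
import asyncio

class AgentRegistry:
    def __init__(self) -> None:
        self._agents: dict = {}
        self._lock = asyncio.Lock()

    async def get_or_create(self, project_id, factory):
        # Check and insert under one lock so two concurrent callers
        # cannot both instantiate an agent for the same project.
        async with self._lock:
            if project_id not in self._agents:
                self._agents[project_id] = factory()
            return self._agents[project_id]

async def demo() -> bool:
    reg = AgentRegistry()
    first, second = await asyncio.gather(
        reg.get_or_create(1, object),
        reg.get_or_create(1, object),
    )
    return first is second  # same instance: no duplicate creation

same_instance = asyncio.run(demo())
```

An unsynchronized `dict.get` followed by a separate insert would leave a window where both callers run the factory; holding the lock across both steps closes it.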
Summary
- `/start` endpoint now checks discovery state via `LeadAgent.get_discovery_status()` before responding

Root Cause

When a project is created, it's set to status "running" but discovery may not have started (state: "idle"). The `/start` endpoint was returning "already running" without checking discovery state, blocking the Start Discovery button.

Changes

- codeframe/ui/routers/agents.py
- tests/agents/test_agent_lifecycle.py
- web-ui/.../DiscoveryProgress.test.tsx
- tests/e2e/test_start_agent_flow.spec.ts

Test plan

- `npx playwright test test_start_agent_flow.spec.ts`

Summary by CodeRabbit