Conversation

@svenaric
Collaborator

@svenaric svenaric commented Oct 23, 2025

Overview: This PR introduces a new API endpoint to allow users to check the current processing status of a report.

Changes

  • Added a GET endpoint at /reports/{report_id}/status within app/api/v1/routes.py.
  • Integrated with report_service.py to fetch the report's state from the in-memory store.
  • The endpoint returns a JSON object containing the report_id and its status (e.g., 'processing', 'completed', 'failed').
  • Implemented error handling to return a 404 HTTPException if the provided report_id does not exist.
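The lookup-and-404 behavior described above can be sketched framework-free (a sketch only: `in_memory_reports` and the response shape come from this PR, but the sample records are illustrative, and a plain `KeyError` stands in for FastAPI's `HTTPException(status_code=404)`):

```python
from typing import Dict, Optional

# Hypothetical contents of the in-memory store keyed by report_id.
in_memory_reports: Dict[str, Dict] = {
    "r1": {"status": "processing"},
    "r2": {"status": "completed"},
}

def get_report_status_from_memory(report_id: str) -> Optional[Dict]:
    """Return the stored report record, or None if the id is unknown."""
    return in_memory_reports.get(report_id)

def report_status_endpoint(report_id: str) -> Dict:
    """Mimics GET /reports/{report_id}/status; the real route raises
    HTTPException(404) where this sketch raises KeyError."""
    report = get_report_status_from_memory(report_id)
    if report is None:
        raise KeyError("Report not found")  # stands in for the 404 response
    return {"report_id": report_id, "status": report["status"]}
```

The real route body is the same three steps: look up, branch on `None`, return the two-field JSON object.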

Summary by CodeRabbit

  • New Features

    • Agents now run concurrently for faster processing.
    • Report status responses are more structured, returning report ID, overall status, and per-agent results.
  • Bug Fixes & Improvements

    • Per-agent timeouts and clearer per-agent error reporting added.
    • Overall report status now distinguishes complete vs. partial success.
    • API route updated: /reports/{report_id}/status (404 for missing reports remains).

@coderabbitai

coderabbitai bot commented Oct 23, 2025

Walkthrough

Changes add a concrete Orchestrator implementation with named agent registration, concurrent agent execution and per-agent result aggregation; introduce a public get_report_status_from_memory(report_id) service function; and update the API route to /reports/{report_id}/status to use that service.

Changes

Cohort / File(s) Summary
Orchestrator Implementation
backend/app/core/orchestrator.py
Replaces prior abstract/placeholder behavior with concrete agent registry using register_agent(name, agent_func), async execute_agents(report_id, token_id) producing per-agent statuses and timeouts, aggregate_results(results: Dict) signature, new Orchestrator subclass with execute_agents_concurrently(report_id, token_id) that aggregates results, derives overall status, updates in_memory_reports, and exports orchestrator singleton.
Report Service Layer
backend/app/services/report_service.py
Adds public function `get_report_status_from_memory(report_id: str) -> Dict | None` that returns the stored report record from in_memory_reports, or None when the report_id is unknown.
API Endpoint
backend/app/api/v1/routes.py
Changes route path to /reports/{report_id}/status, imports and uses get_report_status_from_memory(report_id), returns {"report_id": report_id, "status": report["status"]} or raises 404 if not found.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant API as "API Routes (/reports/{report_id}/status)"
    participant Service as "Report Service"
    participant Memory as "in_memory_reports"
    participant Orch as "Orchestrator"

    Client->>API: GET /reports/{report_id}/status
    API->>Service: get_report_status_from_memory(report_id)
    Service->>Memory: lookup report_id
    Memory-->>Service: report dict | None
    Service-->>API: report dict | None
    alt found
        API-->>Client: { "report_id": id, "status": report.status }
    else not found
        API-->>Client: 404 Report not found
    end

    Note over Orch,Memory: Background: concurrent agent execution
    Orch->>Orch: execute_agents(report_id, token_id) (async)
    par Agents (concurrent)
        Orch->>AgentA: invoke agent_func()
        AgentA-->>Orch: result | error (with timeout handling)
    end
    Orch->>Orch: aggregate_results(per_agent_results)
    Orch->>Memory: update in_memory_reports[report_id] with overall status & agent_results
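The background flow in the diagram can be sketched end-to-end as a self-contained asyncio script. This is a sketch only: the class, method, and store names mirror the walkthrough, while the two dummy agents and the 1-second timeout are illustrative.

```python
import asyncio
from typing import Any, Callable, Dict

in_memory_reports: Dict[str, Dict[str, Any]] = {}

class Orchestrator:
    """Sketch of the registry + concurrent-execution design described above."""

    def __init__(self, agent_timeout: float = 10.0):
        self.agents: Dict[str, Callable] = {}
        self.agent_timeout = agent_timeout

    def register_agent(self, name: str, agent_func: Callable) -> None:
        self.agents[name] = agent_func

    async def execute_agents(self, report_id: str, token_id: str) -> Dict[str, Dict]:
        # Timeout is bound at task creation, so all agents share one clock.
        tasks = {
            name: asyncio.create_task(
                asyncio.wait_for(fn(report_id, token_id), timeout=self.agent_timeout))
            for name, fn in self.agents.items()
        }
        results: Dict[str, Dict] = {}
        for name, task in tasks.items():
            try:
                results[name] = {"status": "completed", "data": await task}
            except asyncio.TimeoutError:
                results[name] = {"status": "failed", "error": "Agent timed out"}
            except Exception as e:
                results[name] = {"status": "failed", "error": str(e)}
        return results

    async def execute_agents_concurrently(self, report_id: str, token_id: str) -> None:
        results = await self.execute_agents(report_id, token_id)
        failed = sum(1 for r in results.values() if r["status"] == "failed")
        status = "completed" if failed == 0 else (
            "failed" if failed == len(results) else "partial_success")
        in_memory_reports[report_id] = {"status": status, "agent_results": results}

# Illustrative agents: one succeeds, one raises.
async def agent_ok(report_id: str, token_id: str) -> str:
    return "ok"

async def agent_boom(report_id: str, token_id: str) -> str:
    raise RuntimeError("boom")

orch = Orchestrator(agent_timeout=1.0)
orch.register_agent("one", agent_ok)
orch.register_agent("two", agent_boom)
asyncio.run(orch.execute_agents_concurrently("r1", "t1"))
```

With one success and one failure, the stored overall status comes out as the mixed outcome the walkthrough calls partial_success.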

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

🐰 I hopped through code with nimble feet,
Registered agents, all names complete,
They ran in parallel, each reported back,
I stitched their notes into a single track,
Now reports return — status neat and sweet.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
  • Docstring Coverage ⚠️ Warning — Docstring coverage is 28.57%, which is below the required threshold of 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
  • Description Check ✅ Passed — Check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check ✅ Passed — The title "Feat: Add endpoint for report processing status" accurately describes the primary user-facing feature of this PR: a GET API endpoint at /reports/{report_id}/status for checking report processing status. While the changeset also includes significant backend refactoring in the orchestrator (changed agent storage, async execution, error handling), the title appropriately focuses on the main deliverable, and is clear and specific enough that a teammate scanning the repository history would understand the purpose of this PR.
✨ Finishing touches
  • 📝 Generate docstrings
  • 🧪 Generate unit tests (beta)
    • Create PR with unit tests
    • Post copyable unit tests in a comment
    • Commit unit tests in branch feat/report-status-endpoint

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
backend/app/api/v1/routes.py (1)

37-43: On background orchestration failure, flip report status to failed and fix logging.

Surface failures to clients by setting status, and avoid including the exception object in logger.exception message (TRY401).

     def _on_done(t: asyncio.Task):
         try:
             t.result()
-        except Exception as e:
-            logger.exception('Background orchestration failed for %s: %s', report_id, e)
-            # Optionally update report status to failed here as well
+        except Exception as e:
+            logger.exception('Background orchestration failed for %s', report_id)
+            # Update report status so status endpoint reflects failure
+            if report_id in in_memory_reports:
+                in_memory_reports[report_id].update({"status": "failed", "error": str(e)})

Optional: dispatch an async save via asyncio.create_task(save_report_data(report_id, {"status": "failed", "error": str(e)})) if you prefer to keep writes encapsulated in the service layer. (As per static analysis hints)
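The done-callback pattern suggested above can be exercised in isolation. This is a sketch: `orchestration` is a stand-in that always fails, the logging call is elided to a comment, and the report record is illustrative.

```python
import asyncio
from typing import Any, Dict

in_memory_reports: Dict[str, Dict[str, Any]] = {"r1": {"status": "processing"}}

async def orchestration(report_id: str) -> None:
    # Stand-in for the background orchestration; always fails in this sketch.
    raise RuntimeError("agent blew up")

def make_on_done(report_id: str):
    def _on_done(t: asyncio.Task) -> None:
        try:
            t.result()
        except Exception as e:
            # logger.exception('Background orchestration failed for %s', report_id)
            # Flip the stored status so the status endpoint reflects the failure.
            if report_id in in_memory_reports:
                in_memory_reports[report_id].update({"status": "failed", "error": str(e)})
    return _on_done

async def main() -> None:
    task = asyncio.create_task(orchestration("r1"))
    task.add_done_callback(make_on_done("r1"))
    try:
        await task
    except RuntimeError:
        pass
    await asyncio.sleep(0)  # give the done-callback a turn on the event loop

asyncio.run(main())
```

After the task fails, the callback rewrites the record, so a subsequent GET on the status endpoint would report "failed" rather than a stale "processing".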

🧹 Nitpick comments (4)
backend/app/core/orchestrator.py (2)

59-63: Classify overall status as failed when all agents fail.

Current logic never returns "failed". Compute fail count to distinguish completed, partial_success, and failed.

-        overall_status = "completed"
-        if any(result["status"] == "failed" for result in agent_results.values()):
-            overall_status = "partial_success"
+        fail_count = sum(1 for r in agent_results.values() if r["status"] == "failed")
+        if fail_count == 0:
+            overall_status = "completed"
+        elif fail_count == len(agent_results):
+            overall_status = "failed"
+        else:
+            overall_status = "partial_success"
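The three-way classification can be expressed as a small pure function (a sketch over the per-agent result shape used in this PR; the function name is illustrative):

```python
from typing import Dict

def derive_overall_status(agent_results: Dict[str, Dict]) -> str:
    """Classify an aggregate run: completed, failed, or partial_success."""
    fail_count = sum(1 for r in agent_results.values() if r["status"] == "failed")
    if fail_count == 0:
        return "completed"
    if fail_count == len(agent_results):
        return "failed"
    return "partial_success"
```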

3-3: Tighten agent typing to ensure async call signature.

Annotate agents as Callable[[str, str], Awaitable[Dict[str, Any]]] for clarity and static safety.

-from typing import Callable, Dict, Any, List
+from typing import Callable, Dict, Any, Awaitable, List
@@
-        self.agents: Dict[str, Callable] = {}
+        self.agents: Dict[str, Callable[[str, str], Awaitable[Dict[str, Any]]]] = {}
@@
-    def register_agent(self, name: str, agent_func: Callable):
+    def register_agent(self, name: str, agent_func: Callable[[str, str], Awaitable[Dict[str, Any]]]):
@@
-            agent_func (Callable): The asynchronous function representing the agent.
+            agent_func (Callable[[str, str], Awaitable[Dict[str, Any]]]): The asynchronous function representing the agent.

Also applies to: 15-25

backend/app/services/report_service.py (1)

29-30: Accessor returns full record; consider a minimal, typed status view.

Expose only status to callers to avoid leaking internal fields and to decouple storage shape. Option: add get_report_status(report_id) -> str | None returning just the status, or a small Pydantic model.

Also, confirm runtime is Python 3.10+ for Dict | None union syntax.
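A narrow accessor along those lines might look like this (a sketch; the `internal_token` field is hypothetical, included only to show what the minimal view hides from callers):

```python
from typing import Dict, Optional

# Illustrative store: the record carries internal fields callers need not see.
in_memory_reports: Dict[str, Dict] = {
    "r1": {"status": "processing", "internal_token": "secret"},
}

def get_report_status(report_id: str) -> Optional[str]:
    """Expose only the status string, decoupling callers from storage shape."""
    report = in_memory_reports.get(report_id)
    return None if report is None else report.get("status")
```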

backend/app/api/v1/routes.py (1)

14-21: Use logger instead of print in async agents.

Prefer structured logging over stdout for observability.

-    print(f"Dummy Agent One running for report {report_id} and token {token_id}")
+    logger.info("Dummy Agent One running for report %s and token %s", report_id, token_id)
@@
-    print(f"Dummy Agent Two running for report {report_id} and token {token_id}")
+    logger.info("Dummy Agent Two running for report %s and token %s", report_id, token_id)

Additionally, consider moving agent registration to an application startup event to avoid duplicate registration on reloads.

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 318af84 and 11529b4.

⛔ Files ignored due to path filters (6)
  • backend/__pycache__/__init__.cpython-313.pyc is excluded by !**/*.pyc
  • backend/app/core/__pycache__/orchestrator.cpython-313.pyc is excluded by !**/*.pyc
  • backend/app/services/__pycache__/report_processor.cpython-313.pyc is excluded by !**/*.pyc
  • backend/app/services/__pycache__/report_service.cpython-313.pyc is excluded by !**/*.pyc
  • backend/tests/__pycache__/test_orchestrator.cpython-313-pytest-8.4.2.pyc is excluded by !**/*.pyc
  • backend/tests/__pycache__/test_report_processor.cpython-313-pytest-8.4.2.pyc is excluded by !**/*.pyc
📒 Files selected for processing (3)
  • backend/app/api/v1/routes.py (2 hunks)
  • backend/app/core/orchestrator.py (2 hunks)
  • backend/app/services/report_service.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
backend/app/api/v1/routes.py (1)
backend/app/services/report_service.py (2)
  • generate_report (11-20)
  • get_report_status_from_memory (29-30)
🪛 Ruff (0.14.1)
backend/app/core/orchestrator.py

36-36: Redundant exception object included in logging.exception call

(TRY401)


38-38: Local variable has_failed_agent is assigned to but never used

Remove assignment to unused variable has_failed_agent

(F841)

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
backend/app/core/orchestrator.py (3)

26-40: Bind timeouts at scheduling; collect finished tasks together; don’t swallow cancellations.

Loop-based wait_for(..., timeout=10) grants each agent extra time relative to when it’s awaited, not when it was scheduled. Bind the timeout at task creation and gather the finished tasks with asyncio.wait. (Note: iterating asyncio.as_completed yields wrapper awaitables rather than the original tasks, so it cannot be used to map results back to agent names.) Also re-raise CancelledError and downgrade timeouts to warning.

Apply:

-        tasks = {name: asyncio.create_task(agent_func(report_id, token_id)) for name, agent_func in self.agents.items()}
-        results = {}
-
-        for name, task in tasks.items():
-            try:
-                result = await asyncio.wait_for(task, timeout=10) # Added timeout
-                results[name] = {"status": "completed", "data": result}
-            except asyncio.TimeoutError: # Handle timeout specifically
-                logger.exception("Agent %s timed out for report %s", name, report_id)
-                results[name] = {"status": "failed", "error": "Agent timed out"}
-            except Exception as e:
-                logger.exception("Agent %s failed for report %s", name, report_id)
-                results[name] = {"status": "failed", "error": str(e)}
+        task_to_name = {
+            asyncio.create_task(asyncio.wait_for(agent_func(report_id, token_id), timeout=self.agent_timeout)): name
+            for name, agent_func in self.agents.items()
+        }
+        results = {}
+        done, _ = await asyncio.wait(task_to_name.keys())
+        for task in done:
+            name = task_to_name[task]
+            try:
+                result = task.result()
+                results[name] = {"status": "completed", "data": result}
+            except asyncio.TimeoutError:
+                logger.warning("Agent %s timed out for report %s", name, report_id)
+                results[name] = {"status": "failed", "error": "Agent timed out"}
+            except asyncio.CancelledError:
+                # Preserve cooperative cancellation (shutdown, client disconnect, etc.)
+                raise
+            except Exception as e:
+                logger.exception("Agent %s failed for report %s", name, report_id)
+                results[name] = {"status": "failed", "error": str(e)}
         return results
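The scheduling-time binding can be seen in a standalone demo (a sketch: delays and the 0.2-second timeout are illustrative). Because `wait_for` wraps each coroutine when the task is created, both clocks start immediately, even though the second task is awaited only after the first finishes.

```python
import asyncio

async def slow(delay: float) -> str:
    await asyncio.sleep(delay)
    return "done"

async def main() -> dict:
    # Timeout is bound at task creation: each agent gets 0.2 s from now,
    # regardless of when its result is collected below.
    tasks = {
        name: asyncio.create_task(asyncio.wait_for(slow(d), timeout=0.2))
        for name, d in [("fast", 0.05), ("too_slow", 0.5)]
    }
    results = {}
    for name, task in tasks.items():
        try:
            results[name] = await task
        except asyncio.TimeoutError:
            results[name] = "timed out"
    return results

results = asyncio.run(main())
```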

14-16: Make per‑agent timeout configurable.

Hardcoding 10s reduces flexibility. Add a parameter and store it.

Apply:

-    def __init__(self):
-        self.agents: Dict[str, Callable] = {}
+    def __init__(self, agent_timeout: float = 10.0):
+        self.agents: Dict[str, Callable] = {}
+        self.agent_timeout = agent_timeout

17-24: Strengthen typing and validate agent registration.

Constrain agent_func to an async callable and fail fast on mis-registration.

Apply:

-    def register_agent(self, name: str, agent_func: Callable):
+    def register_agent(self, name: str, agent_func: Callable[[str, str], Any]):
         """
         Registers an AI agent with the orchestrator.
         Args:
             name (str): The name of the agent.
-            agent_func (Callable): The asynchronous function representing the agent.
+            agent_func (Callable[[str, str], Any]): The asynchronous function representing the agent.
         """
-        self.agents[name] = agent_func
+        # Runtime guard: must be async
+        if not inspect.iscoroutinefunction(agent_func):
+            raise TypeError(f"agent_func for '{name}' must be an async function")
+        self.agents[name] = agent_func

Add imports (outside this hunk):

import inspect
from typing import Awaitable  # if you later refine to Callable[[str, str], Awaitable[Any]]
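Putting the guard together, registration can be exercised in a minimal sketch (the class is trimmed to the registration path; agent names are illustrative):

```python
import inspect
from typing import Any, Callable, Dict

class Orchestrator:
    def __init__(self) -> None:
        self.agents: Dict[str, Callable] = {}

    def register_agent(self, name: str, agent_func: Callable[[str, str], Any]) -> None:
        # Fail fast: agents must be async so they can be scheduled as tasks.
        if not inspect.iscoroutinefunction(agent_func):
            raise TypeError(f"agent_func for '{name}' must be an async function")
        self.agents[name] = agent_func

async def good_agent(report_id: str, token_id: str) -> str:
    return "ok"

def bad_agent(report_id: str, token_id: str) -> str:
    return "not async"  # plain function: registration should be rejected

orch = Orchestrator()
orch.register_agent("good", good_agent)
try:
    orch.register_agent("bad", bad_agent)
    rejected = False
except TypeError:
    rejected = True
```

Catching the mis-registration at startup turns a confusing runtime failure inside `execute_agents` into an immediate, named error.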
📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 11529b4 and 24d5a64.

📒 Files selected for processing (2)
  • backend/app/api/v1/routes.py (2 hunks)
  • backend/app/core/orchestrator.py (2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • backend/app/api/v1/routes.py
🔇 Additional comments (2)
backend/app/core/orchestrator.py (2)

66-74: The review comment's primary concern is already addressed by the code.

The status endpoint at lines 47–51 correctly filters the response to {"report_id": report_id, "status": report["status"]}, explicitly excluding agent_results and any internal data. The route does not expose per-agent internals to clients. No further action is required for the data exposure concern.

The persistence concern (in-memory storage across multi-process workers) is valid architectural guidance but is not a bug in the code being reviewed—it relates to deployment resilience rather than this specific change.

Likely an incorrect or invalid review comment.


58-64: The review comment is incorrect; "partial_success" is the intended API contract.

The test at backend/tests/test_orchestrator.py:57 explicitly asserts assert in_memory_reports[report_id]["status"] == "partial_success" for a scenario where one agent completes successfully and one fails. This status value accurately reflects a mixed outcome (partial work completed, not total failure). Changing it to "failed" would misrepresent the result and break the test.

The endpoint directly returns report["status"] (routes.py:50), so the status vocabulary in in_memory_reports is the public API contract. The existing design correctly uses:

  • "completed" when all agents succeed
  • "partial_success" when some agents succeed and some fail
  • "failed" for individual agent failures

Ignore the proposed status change to "failed". The only reasonable suggestion from the original comment—adding status to aggregated_data for convenience—is independent of the vocabulary concern.

Likely an incorrect or invalid review comment.

@felixjordandev
Copy link
Collaborator

The new /reports/{report_id}/status endpoint will definitely help users keep track of their report processing; nice addition.

@felixjordandev felixjordandev merged commit 83e8b51 into main Oct 23, 2025
1 check passed
@felixjordandev felixjordandev deleted the feat/report-status-endpoint branch October 23, 2025 01:10