-
Notifications
You must be signed in to change notification settings - Fork 0
Feat: Update report state in orchestrator after each stage #67
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughOrchestration moved from in-memory to DB-backed: Orchestrator now constructed with a session factory and uses ReportRepository and expanded ReportStatusEnum values. Agents run under AsyncSession contexts, persist partial outputs/statuses, and process_report/NLG/summary stages update report state via the repository. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant API as FastAPI
participant BG as Background Task
participant Repo as ReportRepository (session_factory)
participant Orch as Orchestrator (session per-agent)
participant Agents as Agents (concurrent)
participant NLG as NLG/Summary
API->>Repo: create_report_entry(report_id,...)
API->>BG: enqueue _run_agents_in_background(report_id, token, repo)
BG->>Repo: update_partial(report_id, {status: RUNNING_AGENTS})
BG->>Orch: create Orchestrator(session_factory)
BG->>Orch: Orch.execute_agents(report_id, token)
Orch->>Agents: dispatch agents (each opens AsyncSession via session_factory)
par Agents run concurrently
Agents-->>Repo: update_partial(report_id, {agent: partial_output}) or return result
end
Orch-->>Repo: update_partial(report_id, {status: AGENTS_COMPLETED/AGENTS_PARTIAL_SUCCESS, agent_outputs: ...})
BG->>Repo: update_partial(report_id, {status: GENERATING_NLG})
BG->>NLG: generate NLG content
NLG-->>Repo: update_partial(report_id, {nlg_sections: ..., status: NLG_COMPLETED})
BG->>Repo: update_partial(report_id, {status: GENERATING_SUMMARY})
BG->>NLG: generate summary
NLG-->>Repo: update_partial(report_id, {final_report_json: ..., status: SUMMARY_COMPLETED/COMPLETED})
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/core/orchestrator.py (1)
41-251: Critical: Session lifecycle violation will cause database errors.The
create_orchestratorfunction creates a session within anasync withcontext (line 51), but returns anOrchestratorinstance (line 251) that holds a reference to that session viaself.report_repository. When the function returns, the context manager closes the session, leaving the returned orchestrator with a closed session.Any subsequent database operations through
orchestrator.report_repositorywill fail with session-closed errors.Solution: The orchestrator should either:
- Accept a session from the caller (who manages the lifecycle), or
- Create and manage its own session internally with proper lifecycle management
Apply this diff for option 1 (caller-managed session):
-async def create_orchestrator(register_dummy: bool = False) -> Orchestrator: +async def create_orchestrator(session: AsyncSession, register_dummy: bool = False) -> Orchestrator: """ Factory function to create and configure an Orchestrator instance. Args: + session (AsyncSession): Database session for the orchestrator register_dummy (bool): If True, a 'dummy_agent' will be registered with the orchestrator. Returns: Orchestrator: A new instance of the Orchestrator. """ - async with get_session() as session: - def _is_valid_url(url: str | None, url_name: str) -> bool: + def _is_valid_url(url: str | None, url_name: str) -> bool: # ... rest of function - return orch + return orchThen the caller would manage the session:
async with get_session() as session: orchestrator = await create_orchestrator(session) # use orchestrator while session is open
🧹 Nitpick comments (1)
backend/app/core/orchestrator.py (1)
141-141: Refine exception handling to catch specific exceptions.Catching blind
Exception(lines 141, 182, 233) is overly broad and can mask unexpected errors. While the current logging withorchestrator_logger.exception()helps, it's better practice to catch specific expected exceptions.Consider catching specific exceptions for each agent. For example:
- except Exception as e: + except (asyncio.TimeoutError, ValueError, KeyError, httpx.HTTPError) as e: orchestrator_logger.exception("Social Sentiment Agent failed for report %s", report_id) return {"status": "failed", "error": str(e)} + except Exception as e: + orchestrator_logger.exception("Unexpected error in Social Sentiment Agent for report %s", report_id) + raise # Re-raise unexpected exceptionsThis pattern allows expected errors to be handled gracefully while surfacing unexpected issues for investigation.
Also applies to: 182-182, 233-233
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (53)
__pycache__/main.cpython-313.pycis excluded by!**/*.pycbackend/__pycache__/main.cpython-313.pycis excluded by!**/*.pycbackend/app/api/v1/__pycache__/__init__.cpython-313.pycis excluded by!**/*.pycbackend/app/api/v1/__pycache__/routes.cpython-313.pycis excluded by!**/*.pycbackend/app/core/__pycache__/config.cpython-313.pycis excluded by!**/*.pycbackend/app/core/__pycache__/exceptions.cpython-313.pycis excluded by!**/*.pycbackend/app/core/__pycache__/orchestrator.cpython-313.pycis excluded by!**/*.pycbackend/app/core/__pycache__/storage.cpython-313.pycis excluded by!**/*.pycbackend/app/db/__pycache__/connection.cpython-313.pycis excluded by!**/*.pycbackend/app/db/models/__pycache__/report_state.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/db/repositories/__pycache__/__init__.cpython-313.pycis excluded by!**/*.pycbackend/app/db/repositories/__pycache__/report_repository.cpython-313.pycis excluded by!**/*.pycbackend/app/db/tests/__pycache__/test_connection.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/__pycache__/report_processor.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/__pycache__/report_processor.cpython-313.pycis excluded by!**/*.pycbackend/app/services/agents/__pycache__/price_agent.cpython-313.pycis excluded by!**/*.pycbackend/app/services/agents/__pycache__/trend_agent.cpython-313.pycis excluded by!**/*.pycbackend/app/services/agents/__pycache__/volume_agent.cpython-313.pycis excluded by!**/*.pycbackend/app/services/agents/tests/__pycache__/test_code_audit_agent.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/agents/tests/__pycache__/test_code_audit_agent.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/agents/tests/__pycache__/test_onchain_agent.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/agents/tests/__pycache__/test_onchain_agent.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/agents/tests/__pycache__/test_social_sentiment_agent.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/agents/tests/__pycache__/test_social_sentiment_agent.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/agents/tests/__pycache__/test_team_doc_agent.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/agents/tests/__pycache__/test_team_doc_agent.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/agents/tests/__pycache__/test_team_doc_agent_new_feature.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/agents/tests/__pycache__/test_team_doc_agent_new_feature.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/nlg/tests/__pycache__/test_llm_client.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/nlg/tests/__pycache__/test_llm_client.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/nlg/tests/__pycache__/test_nlg_engine.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/nlg/tests/__pycache__/test_nlg_engine.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/nlg/tests/__pycache__/test_report_nlg_engine.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/nlg/tests/__pycache__/test_report_nlg_engine.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/summary/tests/__pycache__/test_report_summary_engine.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/summary/tests/__pycache__/test_report_summary_engine.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/summary/tests/__pycache__/test_summary_engine_advanced.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/summary/tests/__pycache__/test_summary_engine_advanced.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/tests/__pycache__/test_onchain_agent.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/services/validation/__pycache__/__init__.cpython-313.pycis excluded by!**/*.pycbackend/app/services/validation/__pycache__/validation_engine.cpython-313.pycis excluded by!**/*.pycbackend/app/services/validation/tests/__pycache__/test_validation_engine.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/services/validation/tests/__pycache__/test_validation_engine.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/app/tests/__pycache__/test_report_processor.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/app/tests/__pycache__/test_report_processor.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/tests/__pycache__/test_orchestrator.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/tests/__pycache__/test_orchestrator_config.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/tests/__pycache__/test_orchestrator_config.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/tests/__pycache__/test_orchestrator_integration.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/tests/__pycache__/test_orchestrator_integration.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/tests/__pycache__/test_report_processor.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pycbackend/tests/__pycache__/test_routes.cpython-313-pytest-8.2.0.pycis excluded by!**/*.pycbackend/tests/__pycache__/test_routes.cpython-313-pytest-8.4.2.pycis excluded by!**/*.pyc
📒 Files selected for processing (4)
backend/app/core/orchestrator.py(3 hunks)backend/app/db/models/report_state.py(1 hunks)backend/app/db/repositories/report_repository.py(1 hunks)requirements.txt(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
backend/app/db/repositories/report_repository.py (1)
backend/app/db/models/report_state.py (1)
ReportState(25-34)
backend/app/core/orchestrator.py (6)
backend/app/db/repositories/report_repository.py (1)
ReportRepository(8-82)backend/app/db/models/report_state.py (1)
ReportState(25-34)backend/app/services/agents/onchain_agent.py (2)
fetch_onchain_metrics(36-78)fetch_tokenomics(86-129)backend/app/services/agents/social_sentiment_agent.py (3)
SocialSentimentAgent(28-148)fetch_social_data(63-91)analyze_sentiment(93-148)backend/app/services/agents/team_doc_agent.py (2)
scrape_team_profiles(100-146)analyze_whitepaper(148-188)backend/app/services/agents/code_audit_agent.py (4)
CodeAuditAgent(44-336)fetch_repo_metrics(195-236)analyze_code_activity(238-266)search_and_summarize_audit_reports(268-296)
🪛 Ruff (0.14.5)
backend/app/db/repositories/report_repository.py
79-79: Consider moving this statement to an else block
(TRY300)
backend/app/core/orchestrator.py
141-141: Do not catch blind exception: Exception
(BLE001)
182-182: Do not catch blind exception: Exception
(BLE001)
233-233: Do not catch blind exception: Exception
(BLE001)
🔇 Additional comments (4)
backend/app/db/repositories/report_repository.py (1)
73-82: LGTM! Implementation is consistent with existing patterns.The
update_partialmethod follows the same structure as other update methods in this repository. The use of**dataprovides flexibility for updating arbitrary fields, which aligns with the PR's goal of persisting various stage-specific updates.backend/app/core/orchestrator.py (2)
10-13: LGTM! Database integration imports are appropriate.The imports for
ReportRepository,ReportState,AsyncSession, andget_sessionproperly support the shift to database-backed orchestration.
37-39: LGTM! Repository initialization is correct.The orchestrator properly stores the
ReportRepositoryinstance for database operations.requirements.txt (1)
12-12: Preserved async database driver is appropriate.The
asyncpg==0.30.0dependency aligns with the orchestrator's shift toAsyncSessionfor database-backed state persistence as described in the PR objectives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 7
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
backend/app/services/summary/report_summary_engine.py (1)
23-29: Keep code‑maturity comments aligned with actual inputsYou removed the unused
lines_of_codeplaceholder (good), but the surrounding comments still reference it and “reasonable LOC”, while the score now depends only on coverage and bug density. This is slightly misleading.Either reintroduce
lines_of_codeinto the formula or (simpler) update the comments, e.g.:- # Assuming 'code_audit_data' contains 'lines_of_code', 'test_coverage', 'bug_density' + # Assuming 'code_audit_data' contains 'test_coverage' and 'bug_density' @@ - # Simple rule: higher coverage, lower bug density, reasonable LOC - scores["code_maturity"] = (test_coverage * 0.6 + (1 - bug_density) * 0.4) * 10 # Scale to 1-10 + # Simple rule: higher coverage and lower bug density + scores["code_maturity"] = (test_coverage * 0.6 + (1 - bug_density) * 0.4) * 10 # Scale to 1-10backend/app/core/orchestrator.py (1)
68-127: Critical:orchis undefined, factory is missing, and trailingreturn orchmakes the file invalid.
- All agent functions and registrations (
onchain_data_agent,social_sentiment_agent_func,team_documentation_agent,code_audit_agent_func) callorch.report_repository...andorch.register_agent(...), but there is no definition oforchanywhere in this file. This will raise aNameErrorat import time whenorch.register_agent(...)is executed (e.g., Line [127], Line [160], Line [215], Line [270]).- The docstring says instances “should be created using the
create_orchestratorfactory function”, andbackend/main.pyimportscreate_orchestrator, but there is no such function defined here.- Line [274] (
return orch) is indented without being inside any function/class, which triggers anIndentationErrorand prevents the module from importing at all.You need to make the wiring coherent and syntactically valid, for example by:
- Introducing an async factory
create_orchestrator(session: AsyncSession) -> Orchestratorthat:
- Instantiates
orch = Orchestrator(session).- Registers all these agents on
orch.- Returns
orch.- Moving the agent definitions and
orch.register_agent(...)calls inside that factory, so they close over the localorchinstead of relying on a global.- Removing the stray top-level
return orchand any other unreachable/incorrectly indented code.Until this is fixed,
backend.app.core.orchestrator(and anything importing it) will fail to import.Also applies to: 131-161, 163-215, 217-272, 274-274
backend/app/services/report_processor.py (1)
74-77: Fixoverall_statushandling to avoid type mismatches and crashes on unexpected agent results.Two issues here:
Type mismatch when persisting status (Line [90]):
overall_status = "completed" # or "failed" await report_repository.update_report_status(report_id, overall_status)
update_report_statusis typed to acceptReportStatusEnum, and the rest of the code (routes, services) expect values from that enum. Persisting raw strings breaks that contract and makes DB state heterogeneous.Prefer:
overall_status = "completed"if any(result["status"] == "failed" for result in agent_results.values()):overall_status = "failed"
if any(isinstance(result, dict) and result.get("status") == "failed" for result in agent_results.values()):overall_status = ReportStatusEnum.FAILEDelse:overall_status = ReportStatusEnum.COMPLETED...
await report_repository.update_report_status(report_id, overall_status)
await report_repository.update_report_status(report_id, overall_status)
Potential crash if an agent returns an Exception (Line [75]):
Because
execute_agentsusesreturn_exceptions=True, a failing agent can produce anExceptionobject instead of a dict. Indexingresult["status"]on such an object will raiseTypeError. Theisinstance(result, dict)guard in the diff above fixes that by only inspecting structured results.Together, these changes make status updates type-safe and robust to unexpected agent failures.
Also applies to: 80-83, 90-91
backend/app/api/v1/routes.py (1)
48-73: Extend status handling in/reports/{report_id}/datafor all ReportStatusEnum values.Currently,
get_report_data_endpointtreats statuses as follows:
COMPLETED→ return data (Lines [53-55]).PROCESSING,PENDING,RUNNING_AGENTS,GENERATING_NLG,GENERATING_SUMMARY→ 202 “still processing” (Lines [56-63]).FAILED→ 409 with error detail (Lines [64-73]).However
ReportStatusEnumincludes additional states (AGENTS_COMPLETED,AGENTS_FAILED,AGENTS_PARTIAL_SUCCESS,NLG_COMPLETED,SUMMARY_COMPLETED, and likelyCANCELLEDper other code) that fall through to the final “not found or not completed” branch and are treated as an error.Consider:
- Mapping
AGENTS_COMPLETED,NLG_COMPLETED,SUMMARY_COMPLETEDinto the “still processing” path until the finalCOMPLETEDstate is reached.- Mapping
AGENTS_FAILEDandCANCELLEDinto a failure branch similar toFAILED, returning a 4xx with appropriate detail.- Ensuring whatever additional failure statuses you add in
ReportStatusEnumare properly surfaced here.This will keep the API semantics aligned with the richer state machine implemented in the orchestrator and processor.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
backend/app/api/v1/routes.py(1 hunks)backend/app/core/orchestrator.py(6 hunks)backend/app/services/report_processor.py(4 hunks)backend/app/services/report_service.py(1 hunks)backend/app/services/summary/report_summary_engine.py(1 hunks)backend/main.py(2 hunks)requirements.txt(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
backend/main.py (1)
backend/app/core/orchestrator.py (1)
Orchestrator(22-49)
backend/app/services/report_processor.py (6)
backend/app/core/orchestrator.py (1)
Orchestrator(22-49)backend/app/core/storage.py (1)
set_report_status(7-13)backend/app/services/nlg/report_nlg_engine.py (2)
ReportNLGEngine(12-147)generate_nlg_outputs(24-72)backend/app/services/summary/report_summary_engine.py (1)
ReportSummaryEngine(4-72)backend/app/db/repositories/report_repository.py (2)
ReportRepository(8-82)update_report_status(35-44)backend/app/db/models/report_state.py (1)
ReportStatusEnum(8-20)
backend/app/services/report_service.py (4)
backend/app/db/repositories/report_repository.py (1)
ReportRepository(8-82)backend/app/db/models/report_state.py (1)
ReportStatusEnum(8-20)backend/app/models/report_models.py (2)
ReportRequest(4-6)ReportResponse(8-10)backend/app/utils/id_generator.py (1)
generate_report_id(3-8)
🪛 Ruff (0.14.5)
backend/app/services/report_processor.py
93-93: Consider moving this statement to an else block
(TRY300)
backend/app/core/orchestrator.py
115-115: missing closing quote in string literal
(invalid-syntax)
117-117: missing closing quote in string literal
(invalid-syntax)
274-274: Unexpected indentation
(invalid-syntax)
274-274: Expected a statement
(invalid-syntax)
backend/app/api/v1/routes.py
24-24: Do not catch blind exception: Exception
(BLE001)
29-29: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
38-38: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
48-48: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
🔇 Additional comments (2)
requirements.txt (1)
12-16: ✓ All dependencies verified as legitimately integrated and security fix confirmed.The review comment is accurate. All newly added dependencies are actively imported and used in the codebase:
- tenacity (retry logic): Used in
onchain_agent.pyandsocial_sentiment_agent.py- textblob (NLG): Used in
social_sentiment_agent.py- beautifulsoup4 (HTML parsing): Used in
team_doc_agent.py(imported asbs4)- requests (HTTP): Upgraded from 2.32.3 to 2.32.5, addressing CVE-2024-47081
All versions are current and the dependencies align well with the PR's orchestration and state management objectives. No spurious or unused additions detected.
backend/app/core/orchestrator.py (1)
35-49: Core orchestration helpers look sound.
execute_agentsandaggregate_resultscorrectly run all registered agents concurrently and only merge"status" == "completed"payloads, logging everything else._is_valid_url’s HTTP/HTTPS and netloc checks are a good guard against misconfigured endpoints. No changes needed here.Also applies to: 52-62
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
backend/app/services/report_processor.py (1)
36-42: Orchestrator now creates its ownReportRepositoryinternally.Line 36 creates an
Orchestrator(report_repository.session), but looking at theOrchestrator.__init__inorchestrator.py, it creates its ownReportRepositoryfrom the session. This means there are now twoReportRepositoryinstances for the same session—one passed toprocess_reportand one created byOrchestrator. This could lead to subtle commit/rollback inconsistencies.Additionally, lines 37-39 register
price_agent_run,trend_agent_run,volume_agent_runwhich appear to be legacy agents, while thecreate_orchestratorfactory inorchestrator.pyregisters the new DB-backed agents (onchain_data_agent,social_sentiment_agent, etc.). This inconsistency means the processor won't use the new agents.Consider either:
- Use
create_orchestrator()factory instead of manually instantiating and registering agents- Or pass the existing
report_repositoryto the orchestrator instead of creating a new one internally
♻️ Duplicate comments (1)
backend/app/api/v1/routes.py (1)
19-19: Remove duplicate import.
get_sessionis already imported at line 4. This redefinition is flagged by Ruff (F811).-from backend.app.db.connection import get_session # Added for background task -
🧹 Nitpick comments (3)
backend/app/services/report_processor.py (1)
7-7: Unused imports frombackend.app.core.storage.
save_report_dataandset_report_statusare imported but never used—onlytry_set_processingis called. Consider removing the unused imports.-from backend.app.core.storage import save_report_data, set_report_status, try_set_processing +from backend.app.core.storage import try_set_processingbackend/app/api/v1/routes.py (1)
60-60: Duplicate condition check forRUNNING_AGENTS.value.The condition checks
ReportStatusEnum.RUNNING_AGENTS.valuetwice, which is redundant.- elif report_result.get("status") == ReportStatusEnum.RUNNING_AGENTS.value or report_result.get("status") == ReportStatusEnum.PENDING.value or report_result.get("status") == ReportStatusEnum.RUNNING_AGENTS.value or report_result.get("status") == ReportStatusEnum.GENERATING_NLG.value or report_result.get("status") == ReportStatusEnum.GENERATING_SUMMARY.value: + elif report_result.get("status") in ( + ReportStatusEnum.PENDING.value, + ReportStatusEnum.RUNNING_AGENTS.value, + ReportStatusEnum.GENERATING_NLG.value, + ReportStatusEnum.GENERATING_SUMMARY.value, + ):backend/app/core/orchestrator.py (1)
36-41: Addstrict=Truetozip()for safer iteration.If the number of task names doesn't match the number of results (which shouldn't happen here but is a good defensive practice),
strict=Truewould raise an error instead of silently truncating.- results = await asyncio.gather(*tasks.values(), return_exceptions=True) - return dict(zip(tasks.keys(), results)) + results = await asyncio.gather(*tasks.values(), return_exceptions=True) + return dict(zip(tasks.keys(), results, strict=True))
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
backend/app/api/v1/routes.py(1 hunks)backend/app/core/orchestrator.py(3 hunks)backend/app/services/report_processor.py(4 hunks)backend/app/services/report_service.py(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
backend/app/services/report_service.py (4)
backend/app/db/repositories/report_repository.py (3)
ReportRepository(8-82)create_report_entry(12-33)get_report_state(68-71)backend/app/db/models/report_state.py (1)
ReportStatusEnum(8-20)backend/app/models/report_models.py (2)
ReportRequest(4-6)ReportResponse(8-10)backend/app/utils/id_generator.py (1)
generate_report_id(3-8)
backend/app/services/report_processor.py (4)
backend/app/core/orchestrator.py (1)
Orchestrator(23-50)backend/app/services/nlg/report_nlg_engine.py (2)
ReportNLGEngine(12-147)generate_nlg_outputs(24-72)backend/app/db/repositories/report_repository.py (5)
ReportRepository(8-82)update_report_status(35-44)update_partial(73-82)store_partial_report_results(46-55)save_final_report(57-66)backend/app/db/models/report_state.py (1)
ReportStatusEnum(8-20)
backend/app/core/orchestrator.py (6)
backend/app/db/repositories/report_repository.py (3)
ReportRepository(8-82)update_report_status(35-44)update_partial(73-82)backend/app/db/models/report_state.py (1)
ReportStatusEnum(8-20)backend/app/services/agents/onchain_agent.py (2)
fetch_onchain_metrics(36-78)fetch_tokenomics(86-129)backend/app/services/agents/social_sentiment_agent.py (2)
fetch_social_data(63-91)analyze_sentiment(93-148)backend/app/services/agents/team_doc_agent.py (3)
TeamDocAgent(9-188)scrape_team_profiles(100-146)analyze_whitepaper(148-188)backend/app/services/agents/code_audit_agent.py (4)
CodeAuditAgent(44-336)fetch_repo_metrics(195-236)analyze_code_activity(238-266)search_and_summarize_audit_reports(268-296)
🪛 Ruff (0.14.5)
backend/app/services/report_processor.py
93-93: Consider moving this statement to an else block
(TRY300)
backend/app/core/orchestrator.py
41-41: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
65-65: Redefinition of unused get_session from line 13
Remove definition: get_session
(F811)
135-135: Consider moving this statement to an else block
(TRY300)
140-140: Do not catch blind exception: Exception
(BLE001)
170-170: Consider moving this statement to an else block
(TRY300)
175-175: Do not catch blind exception: Exception
(BLE001)
227-227: Consider moving this statement to an else block
(TRY300)
232-232: Do not catch blind exception: Exception
(BLE001)
284-284: Consider moving this statement to an else block
(TRY300)
289-289: Do not catch blind exception: Exception
(BLE001)
backend/app/api/v1/routes.py
26-26: Expected except or finally after try block
(invalid-syntax)
28-28: Expected a statement
(invalid-syntax)
28-28: Expected a statement
(invalid-syntax)
28-29: Expected an expression
(invalid-syntax)
29-29: Unexpected indentation
(invalid-syntax)
32-32: Expected a statement
(invalid-syntax)
🔇 Additional comments (4)
backend/app/services/report_service.py (2)
8-12: Repository integration looks correct.The
generate_reportfunction now properly usescreate_report_entry(report_id)and returns the enum value viaReportStatusEnum.PENDING.value, which aligns with the repository API.
14-19: Status retrieval correctly migrated to repository.The function now uses
get_report_stateand properly accessesreport.status.valuefor the response.backend/app/core/orchestrator.py (2)
23-30: Orchestrator class structure is sound.The class properly initializes with a session and creates a
ReportRepository. The agent registration and execution patterns are appropriate.
53-63: URL validation helper is well-implemented.The
_is_valid_urlfunction properly validates HTTP/HTTPS URLs and logs appropriate warnings for invalid configurations.
| except Exception as e: | ||
| logger.exception("Error generating NLG outputs for report %s", report_id) | ||
| await report_repository.update_partial(report_id, {"status": ReportStatusEnum.FAILED, "error": str(e)}) | ||
| raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
Script executed:
#!/bin/bash
# Find ReportState model and examine its structure
find . -type f -name "*.py" | xargs grep -l "class ReportState" | head -5Repository: LumintelAnalytics/ChainReport-API
<shell_result>
Repository: LumintelAnalytics/ChainReport-API
Exit code: 0
stdout:
./backend/app/db/models/report_state.py
</shell_result>
Script executed:
#!/bin/bash
# Examine ReportState model definition
cat -n backend/app/db/models/report_state.pyRepository: LumintelAnalytics/ChainReport-API
<shell_result>
Repository: LumintelAnalytics/ChainReport-API
Exit code: 0
stdout:
1 from enum import Enum as PyEnum
2 from sqlalchemy import Column, String, DateTime, JSON, ForeignKey
3 from sqlalchemy.sql import func
4 from sqlalchemy.types import Enum as SQLEnum
5
6 from backend.app.db.base import Base
7
8 class ReportStatusEnum(PyEnum):
9 PENDING = "pending"
10 RUNNING = "running"
11 FAILED = "failed"
12 COMPLETED = "completed"
13 RUNNING_AGENTS = "running_agents"
14 AGENTS_COMPLETED = "agents_completed"
15 AGENTS_FAILED = "agents_failed"
16 AGENTS_PARTIAL_SUCCESS = "agents_partial_success"
17 GENERATING_NLG = "generating_nlg"
18 NLG_COMPLETED = "nlg_completed"
19 GENERATING_SUMMARY = "generating_summary"
20 SUMMARY_COMPLETED = "summary_completed"
21
22
23
24
25 class ReportState(Base):
26 __tablename__ = "report_states"
27
28 report_id = Column(String, ForeignKey("reports.id"), primary_key=True)
29 status = Column(SQLEnum(ReportStatusEnum, values_callable=lambda x: [e.value for e in x]), default=ReportStatusEnum.PENDING, index=True)
30 created_at = Column(DateTime(timezone=True), server_default=func.now())
31 updated_at = Column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())
32 partial_agent_output = Column(JSON, nullable=True) # Stores partial outputs from agents
33 raw_data = Column(JSON, nullable=True) # Stores raw data collected by agents
34 final_report_json = Column(JSON, nullable=True) # Stores the final generated report JSON
</shell_result>
Script executed:
#!/bin/bash
# Search for update_partial method definition
rg -n "def update_partial" backend/app -A 10Repository: LumintelAnalytics/ChainReport-API
<shell_result>
Repository: LumintelAnalytics/ChainReport-API
Exit code: 0
stdout:
backend/app/db/repositories/report_repository.py:73: async def update_partial(self, report_id: str, data: Dict[str, Any]) -> ReportState | None:
backend/app/db/repositories/report_repository.py-74- try:
backend/app/db/repositories/report_repository.py-75- stmt = update(ReportState).where(ReportState.report_id == report_id).values(**data).returning(ReportState)
backend/app/db/repositories/report_repository.py-76- result = await self.session.execute(stmt)
backend/app/db/repositories/report_repository.py-77- updated_report_state = result.scalar_one_or_none()
backend/app/db/repositories/report_repository.py-78- await self.session.commit()
backend/app/db/repositories/report_repository.py-79- return updated_report_state
backend/app/db/repositories/report_repository.py-80- except Exception:
backend/app/db/repositories/report_repository.py-81- await self.session.rollback()
backend/app/db/repositories/report_repository.py-82- raise
</shell_result>
Perfect! I've verified the concern. The review comment is correct - there is indeed a bug:
-
The
ReportStatemodel has columns:report_id,status,created_at,updated_at,partial_agent_output,raw_data, andfinal_report_json- noerrorcolumn. -
The
update_partialmethod uses SQLAlchemy's.values(**data)which directly maps dictionary keys to model columns. -
Passing
{"status": ReportStatusEnum.FAILED, "error": str(e)}will cause SQLAlchemy to attempt setting a non-existenterrorcolumn, resulting in a runtime error.
update_partial call passes undefined "error" key to ReportState model.
The ReportState model has no error column. The call update_partial(report_id, {"status": ReportStatusEnum.FAILED, "error": str(e)}) will raise a SQLAlchemy error at runtime because .values(**data) attempts to set the non-existent column. Either remove the "error" key or add an error column to the ReportState model if error tracking is needed.
🤖 Prompt for AI Agents
In backend/app/services/report_processor.py around lines 50 to 53, the call to
update_partial passes an "error" key which the ReportState model does not have,
causing SQLAlchemy to fail when mapping .values(**data); remove the "error" key
from the data passed (e.g. update_partial(report_id, {"status":
ReportStatusEnum.FAILED})) or alternatively add an error column to the
ReportState model and migrate the DB if you need to persist error text; choose
one approach and implement the corresponding code and migration so
update_partial only receives valid model fields.
| await report_repository.save_final_report(report_id, final_report_content) | ||
|
|
||
| set_report_status(report_id, overall_status) | ||
| await report_repository.update_report_status(report_id, overall_status) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Type mismatch: passing string to update_report_status expecting ReportStatusEnum.
Line 90 passes overall_status (a string like "completed" or "failed") to update_report_status, but the repository method signature expects status: ReportStatusEnum. This will likely cause a database error or type validation failure.
- await report_repository.update_report_status(report_id, overall_status)
+ final_status = ReportStatusEnum.COMPLETED if overall_status == "completed" else ReportStatusEnum.FAILED
+ await report_repository.update_report_status(report_id, final_status)🤖 Prompt for AI Agents
In backend/app/services/report_processor.py around lines 88 to 90, the call
await report_repository.update_report_status(report_id, overall_status) passes
overall_status as a raw string but the repository expects a ReportStatusEnum;
change the code to convert/validate the string into the enum before calling
update_report_status (e.g., import ReportStatusEnum and map or construct the
enum from the string, handling invalid values by raising a clear error or
defaulting to a safe enum value), then pass that enum instance to
update_report_status.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 9
🧹 Nitpick comments (5)
alembic.ini (1)
63-63: Hardcoded SQLite URL may cause issues in production environments.The
sqlalchemy.urlis hardcoded to a local SQLite database. For production deployments, this should typically be overridden via environment variables or a separate configuration. Consider using Alembic's ability to set the URL programmatically inenv.pyfrom environment variables.backend/app/db/repositories/report_repository.py (1)
72-76: Missing exception handling inget_report_by_id.Unlike other methods,
get_report_by_idlacks try/except/rollback handling. While SELECT queries rarely require rollback, consistency with other methods and protection against connection errors would be beneficial.async def get_report_by_id(self, report_id: str) -> ReportState | None: async with self.session_factory() as session: - stmt = select(ReportState).where(ReportState.report_id == report_id) - result = await session.execute(stmt) - return result.scalar_one_or_none() + try: + stmt = select(ReportState).where(ReportState.report_id == report_id) + result = await session.execute(stmt) + return result.scalar_one_or_none() + except Exception: + await session.rollback() + raisebackend/app/core/orchestrator.py (1)
41-41: Addstrict=Truetozip()for safety.Per Ruff B905, using
zip()without explicitstrict=can silently drop elements if iterables have different lengths. Sincetasks.keys()andresultsshould always match, addingstrict=Trueensures early detection of any mismatch.- return dict(zip(tasks.keys(), results)) + return dict(zip(tasks.keys(), results, strict=True))backend/app/api/v1/routes.py (1)
27-29: Consider narrowing the exception catch.Catching bare
Exceptionis flagged by Ruff (BLE001). While acceptable for top-level background task handlers to prevent silent failures, consider catching more specific exceptions or at minimum excludingBaseExceptionsubclasses likeKeyboardInterruptandSystemExit.- except Exception as e: + except (ValueError, RuntimeError, IOError) as e: # or define specific exceptionsAlternatively, if you want to catch broadly but re-raise system exceptions:
except BaseException as e: if isinstance(e, (KeyboardInterrupt, SystemExit)): raise # handle other exceptionsbackend/app/db/migrations/script.py.mako (1)
15-18: Broadendown_revisiontype annotation to support merge revisionsAlembic allows
down_revisionto be a tuple or list (e.g.,('rev1', 'rev2')) for merge migrations, not just a single string orNone. The current annotation is narrower than what Alembic actually generates.To match real values and align with
branch_labels/depends_on, change line 16:-down_revision: Union[str, None] = ${repr(down_revision)} +down_revision: Union[str, Sequence[str], None] = ${repr(down_revision)}This preserves runtime behavior while enabling type checkers and IDEs to recognize all valid
down_revisionshapes.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
backend/app/db/__pycache__/base.cpython-313.pycis excluded by!**/*.pycbackend/app/db/migrations/__pycache__/env.cpython-313.pycis excluded by!**/*.pyc
📒 Files selected for processing (9)
alembic.ini(1 hunks)backend/app/api/v1/routes.py(1 hunks)backend/app/core/orchestrator.py(3 hunks)backend/app/db/migrations/README(1 hunks)backend/app/db/migrations/env.py(1 hunks)backend/app/db/migrations/script.py.mako(1 hunks)backend/app/db/models/report_state.py(2 hunks)backend/app/db/repositories/report_repository.py(1 hunks)requirements.txt(2 hunks)
✅ Files skipped from review due to trivial changes (1)
- backend/app/db/migrations/README
🚧 Files skipped from review as they are similar to previous changes (1)
- requirements.txt
🧰 Additional context used
🧬 Code graph analysis (3)
backend/app/db/repositories/report_repository.py (2)
backend/app/db/models/report.py (1)
Report(4-6)backend/app/db/models/report_state.py (2)
ReportState(25-35)ReportStatusEnum(8-20)
backend/app/api/v1/routes.py (7)
backend/app/db/database.py (1)
get_db(26-28)backend/app/db/repositories/report_repository.py (3)
ReportRepository(8-88)update_report_status(36-46)update_partial(78-88)backend/app/models/report_models.py (2)
ReportRequest(4-6)ReportResponse(8-10)backend/app/services/report_service.py (3)
generate_report(8-12)get_report_status(14-19)get_report_data(21-34)backend/app/core/storage.py (1)
get_report_status(15-18)backend/app/core/exceptions.py (1)
ReportNotFoundException(4-6)backend/app/db/models/report_state.py (1)
ReportStatusEnum(8-20)
backend/app/db/migrations/env.py (1)
backend/app/db/connection.py (1)
connect(18-62)
🪛 Ruff (0.14.5)
backend/app/core/orchestrator.py
41-41: zip() without an explicit strict= parameter
Add explicit value for parameter strict=
(B905)
129-129: Consider moving this statement to an else block
(TRY300)
134-134: Do not catch blind exception: Exception
(BLE001)
166-166: Consider moving this statement to an else block
(TRY300)
171-171: Do not catch blind exception: Exception
(BLE001)
225-225: Consider moving this statement to an else block
(TRY300)
230-230: Do not catch blind exception: Exception
(BLE001)
284-284: Consider moving this statement to an else block
(TRY300)
289-289: Do not catch blind exception: Exception
(BLE001)
backend/app/db/repositories/report_repository.py
21-21: Consider moving this statement to an else block
(TRY300)
43-43: Consider moving this statement to an else block
(TRY300)
55-55: Consider moving this statement to an else block
(TRY300)
67-67: Consider moving this statement to an else block
(TRY300)
85-85: Consider moving this statement to an else block
(TRY300)
backend/app/api/v1/routes.py
27-27: Do not catch blind exception: Exception
(BLE001)
33-33: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
33-33: Undefined name get_session
(F821)
42-42: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
42-42: Undefined name get_session
(F821)
52-52: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
52-52: Undefined name get_session
(F821)
🔇 Additional comments (5)
backend/app/db/models/report_state.py (2)
13-20: New status enum values properly support stage tracking.The expanded
ReportStatusEnumprovides granular visibility into the report generation pipeline stages as intended by the PR objectives. The values align with the orchestrator's state transitions.
35-35: LGTM on the error_message column addition.The
error_messagecolumn provides a proper place to persist error details during failed processing stages, complementing the status transitions.backend/app/core/orchestrator.py (2)
65-69: Factory function correctly initializes Orchestrator with session factory.The
create_orchestratorfunction properly passesAsyncSessionLocalas the session factory to theOrchestrator, which is the correct pattern for the repository's per-call session management.
76-138: Agent function structure implements DB-backed state tracking as intended.The
onchain_data_agentfunction properly:
- Updates status to
RUNNING_AGENTSbefore execution- Handles timeout and exception scenarios
- Persists partial results and status via
update_partial- Uses appropriate
ReportStatusEnumvaluesThis aligns with the PR objectives to update report state after each processing stage.
However, verify the session/repository instantiation issue flagged earlier is resolved before merging.
backend/app/db/migrations/script.py.mako (1)
1-13: Template structure and stubs look solidHeader metadata, imports, and
upgrade/downgradestubs are idiomatic for Alembic script templates. The use of type hints and optionalimportsinjection is clean and should work well with autogenerated bodies.Also applies to: 21-26
| from fastapi import APIRouter, BackgroundTasks, Depends | ||
| from fastapi.responses import JSONResponse | ||
| from sqlalchemy.ext.asyncio import AsyncSession | ||
| from backend.app.db.database import get_db |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
get_session is undefined; should use get_db.
Line 4 imports get_db, but lines 33, 42, and 52 reference get_session which doesn't exist. This will raise a NameError at runtime.
-from backend.app.db.database import get_db
+from backend.app.db.database import get_db as get_sessionAlternatively, update all usages on lines 33, 42, 52 to use get_db instead of get_session.
🤖 Prompt for AI Agents
In backend/app/api/v1/routes.py around line 4 (and usages at lines 33, 42, 52),
the module imports get_db but the route dependencies call get_session
(undefined); replace each get_session reference with get_db in the Depends(...)
calls so the correct dependency is used, or if you intended to keep get_session
instead, import get_session from backend.app.db.database; ensure the dependency
name matches the imported function everywhere.
| async def _run_agents_in_background(report_id: str, token_id: str): | ||
| try: | ||
| await orchestrator_instance.execute_agents_concurrently(report_id, token_id) | ||
| except Exception as e: | ||
| api_logger.error(f"Agent execution failed for report {report_id}: {e}") | ||
| # Here you might want to update the report status to 'failed' in in_memory_reports | ||
| # For now, we'll just log it. | ||
| if report_id in in_memory_reports: | ||
| in_memory_reports[report_id]["status"] = "failed" | ||
| in_memory_reports[report_id]["detail"] = f"Agent execution failed: {e}" | ||
| async for session in get_db(): | ||
| report_repository = ReportRepository(session) | ||
| try: | ||
| await report_repository.update_report_status(report_id, ReportStatusEnum.RUNNING_AGENTS) | ||
| await process_report(report_id, token_id, report_repository) | ||
| await report_repository.update_report_status(report_id, ReportStatusEnum.COMPLETED) | ||
| break # Exit the async for loop after successful processing | ||
| except Exception as e: | ||
| api_logger.error(f"Report processing failed for report {report_id}: {e}") | ||
| await report_repository.update_partial(report_id, {"status": ReportStatusEnum.FAILED, "error_message": str(e)}) | ||
| break # Exit the async for loop on failure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReportRepository expects a session factory, not a session instance.
According to ReportRepository.__init__(self, session_factory: Callable[..., AsyncSession]), the constructor expects a callable that creates sessions. The repository methods internally call async with self.session_factory() as session. Passing a raw AsyncSession object will fail when the repository attempts to invoke it as a callable.
Either pass the factory directly:
async def _run_agents_in_background(report_id: str, token_id: str):
- async for session in get_db():
- report_repository = ReportRepository(session)
- try:
- await report_repository.update_report_status(report_id, ReportStatusEnum.RUNNING_AGENTS)
- await process_report(report_id, token_id, report_repository)
- await report_repository.update_report_status(report_id, ReportStatusEnum.COMPLETED)
- break # Exit the async for loop after successful processing
- except Exception as e:
- api_logger.error(f"Report processing failed for report {report_id}: {e}")
- await report_repository.update_partial(report_id, {"status": ReportStatusEnum.FAILED, "error_message": str(e)})
- break # Exit the async for loop on failure
+ from backend.app.db.database import AsyncSessionLocal
+ report_repository = ReportRepository(AsyncSessionLocal)
+ try:
+ await report_repository.update_report_status(report_id, ReportStatusEnum.RUNNING_AGENTS)
+ await process_report(report_id, token_id, report_repository)
+ await report_repository.update_report_status(report_id, ReportStatusEnum.COMPLETED)
+ except Exception as e:
+ api_logger.error(f"Report processing failed for report {report_id}: {e}")
+ await report_repository.update_partial(report_id, {"status": ReportStatusEnum.FAILED, "error_message": str(e)})Or refactor ReportRepository to accept either a session or session factory based on your architectural preference.
🧰 Tools
🪛 Ruff (0.14.5)
27-27: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
In backend/app/api/v1/routes.py around lines 19-30, the code constructs
ReportRepository with a session instance but ReportRepository.__init__ expects a
session factory (callable). Change the loop to obtain and pass the session
factory (rename the loop variable to session_factory) and instantiate the
repository as ReportRepository(session_factory) so repository methods can call
async with self.session_factory(); do not pass a raw AsyncSession
instance—alternatively, if get_db yields sessions instead of a factory, wrap it
in a no-arg async callable that returns that session and pass that callable to
ReportRepository.
| async def generate_report_endpoint(request: ReportRequest, background_tasks: BackgroundTasks, session: AsyncSession = Depends(get_session)): | ||
| api_logger.info(f"Received report generation request for token_id: {request.token_id}") | ||
| report_response = await generate_report(request) | ||
| report_repository = ReportRepository(session) | ||
| report_response = await generate_report(request, report_repository) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same ReportRepository constructor mismatch in endpoint.
The endpoint passes a raw AsyncSession to ReportRepository, but the repository expects a session factory callable. This inconsistency with the repository's API will cause failures when repository methods try to use self.session_factory() as a context manager.
This applies to all three endpoints (lines 33-35, 42-44, 52-54). Either the ReportRepository needs to be refactored to accept raw sessions, or you need to pass a factory. Given the repository's current design, you might need a wrapper:
report_repository = ReportRepository(lambda: contextmanager_yielding_session(session))Or consider a design where request-scoped endpoints use a different repository pattern than background tasks.
🧰 Tools
🪛 Ruff (0.14.5)
33-33: Do not perform function call Depends in argument defaults; instead, perform the call within the function, or read the default from a module-level singleton variable
(B008)
33-33: Undefined name get_session
(F821)
🤖 Prompt for AI Agents
In backend/app/api/v1/routes.py around lines 33-36 (and similarly at 42-44 and
52-54), the code passes a raw AsyncSession into ReportRepository even though the
repository expects a session-factory callable; wrap the request-scoped session
into a callable that returns a contextmanager yielding that session (e.g.,
create a small lambda or helper using contextlib.contextmanager that yields the
provided session) and pass that callable to ReportRepository for all three
endpoints, or alternatively refactor ReportRepository to accept raw sessions and
update its methods accordingly so the usage is consistent.
| api_logger.info(f"Returning data for report_id: {report_id}") | ||
| return report_result | ||
| elif report_result.get("status") == "processing": | ||
| elif report_result.get("status") == ReportStatusEnum.RUNNING_AGENTS.value or report_result.get("status") == ReportStatusEnum.PENDING.value or report_result.get("status") == ReportStatusEnum.RUNNING_AGENTS.value or report_result.get("status") == ReportStatusEnum.GENERATING_NLG.value or report_result.get("status") == ReportStatusEnum.GENERATING_SUMMARY.value: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duplicate condition: RUNNING_AGENTS.value checked twice.
The condition checks ReportStatusEnum.RUNNING_AGENTS.value twice, which is redundant and suggests a copy-paste error.
- elif report_result.get("status") == ReportStatusEnum.RUNNING_AGENTS.value or report_result.get("status") == ReportStatusEnum.PENDING.value or report_result.get("status") == ReportStatusEnum.RUNNING_AGENTS.value or report_result.get("status") == ReportStatusEnum.GENERATING_NLG.value or report_result.get("status") == ReportStatusEnum.GENERATING_SUMMARY.value:
+ elif report_result.get("status") in (
+ ReportStatusEnum.PENDING.value,
+ ReportStatusEnum.RUNNING_AGENTS.value,
+ ReportStatusEnum.GENERATING_NLG.value,
+ ReportStatusEnum.GENERATING_SUMMARY.value,
+ ):This refactor also improves readability by using in with a tuple instead of chained or comparisons.
🤖 Prompt for AI Agents
In backend/app/api/v1/routes.py around line 60, the conditional redundantly
checks ReportStatusEnum.RUNNING_AGENTS.value twice and uses a long chained OR
expression; replace it with a single membership test using "in" and a tuple of
the desired status values (remove the duplicate RUNNING_AGENTS entry) so the
condition reads something like: report_result.get("status") in
(ReportStatusEnum.RUNNING_AGENTS.value, ReportStatusEnum.PENDING.value,
ReportStatusEnum.GENERATING_NLG.value,
ReportStatusEnum.GENERATING_SUMMARY.value).
| def __init__(self, session_factory: Callable[..., AsyncSession]): | ||
| self._agents: Dict[str, Callable] = {} | ||
| self.report_repository = ReportRepository(session_factory) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent session handling between Orchestrator and agent functions.
The Orchestrator.__init__ correctly receives a session_factory and passes it to ReportRepository. However, the agent functions (lines 78-79, 145-146, 180-181, 241-242) create their own session via get_db() and then pass that session directly to ReportRepository(session). This creates two issues:
- The
ReportRepositoryconstructor expects a factory, not a session instance - The orchestrator's
self.report_repositoryis never used by agents (they create their own)
Consider either:
- Having agents use
self.report_repositoryfrom the orchestrator (requires passing orchestrator reference) - Updating agents to pass
AsyncSessionLocalas the factory:
- async with get_db() as session:
- report_repository = ReportRepository(session)
+ report_repository = ReportRepository(AsyncSessionLocal)Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/app/core/orchestrator.py around lines 28-30 and agent usages at lines
78-79, 145-146, 180-181, 241-242: agents are creating a session via get_db() and
instantiating ReportRepository(session) even though Orchestrator.__init__
receives a session_factory and ReportRepository expects a factory, and
Orchestrator.self.report_repository is unused; fix by either (A) have agents use
the orchestrator's report_repository (pass the orchestrator or its
report_repository into agent functions and remove per-agent ReportRepository
creation), or (B) if agents must construct their own repository, pass the
session factory (AsyncSessionLocal) into ReportRepository instead of a session
instance and stop calling get_db() there; ensure ReportRepository always
receives the factory type and that all code paths consistently use the same
factory-based pattern.
| existing_report = await report_repository.get_report_by_id(report_id) | ||
| existing_partial_agent_output = existing_report.partial_agent_output if existing_report else {} | ||
| if overall_agent_status == "completed": | ||
| await report_repository.update_partial(report_id, {"status": ReportStatusEnum.AGENTS_COMPLETED, "partial_agent_output": {**existing_partial_agent_output, "onchain_data_agent": result}}) | ||
| else: | ||
| await report_repository.update_partial(report_id, {"status": ReportStatusEnum.AGENTS_FAILED, "error_message": "Onchain Data Agent failed", "partial_agent_output": {**existing_partial_agent_output, "onchain_data_agent": result}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential AttributeError if partial_agent_output is None.
When existing_report exists but partial_agent_output is None (its default), the dict unpacking {**existing_partial_agent_output, ...} will fail with TypeError: 'NoneType' object is not a mapping.
Apply defensive handling:
existing_report = await report_repository.get_report_by_id(report_id)
- existing_partial_agent_output = existing_report.partial_agent_output if existing_report else {}
+ existing_partial_agent_output = (existing_report.partial_agent_output or {}) if existing_report else {}The same fix is needed at lines 164, 223, and 282.
🤖 Prompt for AI Agents
In backend/app/core/orchestrator.py around lines 123-128,
existing_report.partial_agent_output can be None which will cause a TypeError
when using dict unpacking; change the code to coalesce the value to an empty
dict before unpacking (e.g., set existing_partial_agent_output =
existing_report.partial_agent_output or {}), then use
{**existing_partial_agent_output, "onchain_data_agent": result} and similar for
the failure branch; apply the same defensive coalescing fix at the other
occurrences on lines ~164, ~223, and ~282.
| # this is the Alembic Config object, which provides | ||
| # access to the values within the .ini file in use. | ||
| config = context.config | ||
|
|
||
| # Interpret the config file for Python logging. | ||
| # This line sets up loggers basically. | ||
| if config.config_file_name is not None: | ||
| fileConfig(config.config_file_name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion | 🟠 Major
Remove duplicate configuration blocks.
Lines 8-15 and 23-30 contain identical code (config assignment and fileConfig call). This duplication should be removed.
# this is the Alembic Config object, which provides
# access to the values within the .ini file in use.
config = context.config
# Interpret the config file for Python logging.
# This line sets up loggers basically.
if config.config_file_name is not None:
fileConfig(config.config_file_name)
# add your model's MetaData object here
# for 'autogenerate' support
import os, sys
sys.path.append(os.getcwd())
from backend.app.db.base import Base
-
-# this is the Alembic Config object, which provides
-# access to the values within the .ini file in use.
-config = context.config
-
-# Interpret the config file for Python logging.
-# This line sets up loggers basically.
-if config.config_file_name is not None:
- fileConfig(config.config_file_name)
# add your model's MetaData object hereAlso applies to: 23-30
🤖 Prompt for AI Agents
In backend/app/db/migrations/env.py around lines 8-15 and 23-30, the assignment
"config = context.config" and the subsequent fileConfig call are duplicated;
remove the second duplicate block (lines 23-30) so the config assignment and
fileConfig(config.config_file_name) appear only once before any logging or
environment setup, leaving the rest of the migration setup unchanged.
| import asyncio | ||
|
|
||
| async def process_migrations(): | ||
| connectable = engine_from_config( | ||
| config.get_section(config.config_ini_section, {}), | ||
| prefix="sqlalchemy.", | ||
| poolclass=pool.NullPool, | ||
| ) | ||
|
|
||
| async with connectable.connect() as connection: | ||
| context.configure( | ||
| connection=connection, target_metadata=target_metadata | ||
| ) | ||
|
|
||
| async with context.begin_transaction(): | ||
| context.run_migrations() | ||
|
|
||
| asyncio.run(process_migrations()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Critical: Async/sync engine mismatch will cause runtime errors.
engine_from_config returns a synchronous Engine, but the code attempts to use it with async with connectable.connect() patterns. This will fail at runtime because a sync engine doesn't support async context managers.
For async migrations with aiosqlite, use the run_sync pattern:
def run_migrations_online() -> None:
"""Run migrations in 'online' mode.
In this scenario we need to create an Engine
and associate a connection with the context.
"""
- import asyncio
-
- async def process_migrations():
- connectable = engine_from_config(
- config.get_section(config.config_ini_section, {}),
- prefix="sqlalchemy.",
- poolclass=pool.NullPool,
- )
-
- async with connectable.connect() as connection:
- context.configure(
- connection=connection, target_metadata=target_metadata
- )
-
- async with context.begin_transaction():
- context.run_migrations()
-
- asyncio.run(process_migrations())
+ from sqlalchemy.ext.asyncio import create_async_engine
+ import asyncio
+
+ async def run_async_migrations():
+ connectable = create_async_engine(
+ config.get_main_option("sqlalchemy.url"),
+ poolclass=pool.NullPool,
+ )
+
+ async with connectable.connect() as connection:
+ await connection.run_sync(do_run_migrations)
+
+ await connectable.dispose()
+
+ def do_run_migrations(connection):
+ context.configure(
+ connection=connection,
+ target_metadata=target_metadata,
+ )
+ with context.begin_transaction():
+ context.run_migrations()
+
+ asyncio.run(run_async_migrations())Committable suggestion skipped: line range outside the PR's diff.
| def __init__(self, session_factory: Callable[..., AsyncSession]): | ||
| self.session_factory = session_factory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Constructor signature conflicts with usage in orchestrator agent functions.
The constructor expects session_factory: Callable[..., AsyncSession], but in orchestrator.py (lines 79, 146, 181, 242), agents call ReportRepository(session) passing an AsyncSession directly, not a callable factory. This type mismatch will cause runtime errors when methods try to call self.session_factory().
Either update the orchestrator to pass a factory, or provide an overloaded constructor:
class ReportRepository:
- def __init__(self, session_factory: Callable[..., AsyncSession]):
- self.session_factory = session_factory
+ def __init__(self, session_or_factory: AsyncSession | Callable[..., AsyncSession]):
+ if callable(session_or_factory) and not isinstance(session_or_factory, AsyncSession):
+ self.session_factory = session_or_factory
+ else:
+ # Wrap a single session in a context manager
+ self._session = session_or_factory
+ self.session_factory = self._wrap_session
+
+ @contextlib.asynccontextmanager
+ async def _wrap_session(self):
+ yield self._sessionAlternatively, update orchestrator agent functions to pass AsyncSessionLocal instead of session.
Committable suggestion skipped: line range outside the PR's diff.
🤖 Prompt for AI Agents
In backend/app/db/repositories/report_repository.py around lines 9-10, the
constructor currently types and treats session_factory as Callable[...,
AsyncSession] but orchestrator passes an AsyncSession instance; change the
constructor to accept either a Callable[..., AsyncSession] or an AsyncSession
(use Union[Callable[..., AsyncSession], AsyncSession]) and normalize it: if
given a callable store it as self.session_factory, otherwise wrap the provided
AsyncSession in a zero-arg factory (e.g., a lambda returning that session) so
existing methods can safely call self.session_factory(); update the type hints
and any imports accordingly so runtime calls to self.session_factory() work
without changing orchestrator, or alternatively update the orchestrator to pass
the actual session factory (AsyncSessionLocal) at the call sites (lines 79, 146,
181, 242).
Overview: The orchestrator now updates the report state after each processing stage, providing better visibility into report generation progress.
Changes
report_repository.update_partialcalls intoorchestrator.pyto persist status changes.pending,running, andcompletedstatuses.report_idand relevant stage-specific data are correctly passed for partial updates.Summary by CodeRabbit
New Features
Improvements
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.