Skip to content

Conversation

@svenaric
Copy link
Collaborator

@svenaric svenaric commented Dec 3, 2025

Changes requested:

Create a background task monitor_generation_time inside app/tasks/ that periodically checks active report timers. If a report is taking too long, log an alert and update the report state accordingly.

Please review.

Summary by CodeRabbit

  • Tests

    • Refactored integration tests to use repository-based mocking for improved test reliability and maintainability.
  • Refactor

    • Updated internal orchestrator initialization to accept optional session configuration parameters.
  • Chores

    • Added dependency import for async database session management.

✏️ Tip: You can customize this high-level summary in your review settings.

…time` inside `app/tasks/` that periodically checks active report timers. If
@coderabbitai
Copy link

coderabbitai bot commented Dec 3, 2025

Walkthrough

The pull request migrates test infrastructure from direct in-memory state management to mock-based repository interactions. Changes include adding AsyncSession import to the report processor, removing the autouse fixture that cleared in-memory state, introducing a session_factory parameter to create_orchestrator, and updating test assertions to verify repository method calls instead of direct state inspection.

Changes

Cohort / File(s) Summary
Import and service layer updates
backend/app/services/report_processor.py
Added AsyncSession import from sqlalchemy.ext.asyncio; import marked as potentially unused.
Orchestrator factory signature
backend/app/core/orchestrator.py
Updated create_orchestrator() signature to accept optional session_factory parameter for customizing session provisioning in tests and runtime.
Test infrastructure refactoring
backend/tests/test_orchestrator_integration.py
Removed autouse fixture; replaced in-memory state setup with mock-based approach; introduced session_factory and mock AsyncSession injection; mocked ReportRepository.get_report_by_id() and update_partial() methods; updated test assertions to verify repository interactions (e.g., status transitions and error details) instead of direct state inspection.
Test route adjustments
backend/tests/test_routes.py
Removed import of in_memory_reports, which may create a broken reference in any dependent fixtures.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~40 minutes

  • Public API signature change: create_orchestrator() now accepts session_factory parameter; verify all call sites have been updated.
  • Test infrastructure overhaul: Shift from autouse fixture-based state clearing to mock-based repository interactions introduces multiple mock assertion patterns that require careful validation.
  • Incomplete refactoring risk: The removal of in_memory_reports import from test_routes.py without visible corresponding fixture cleanup suggests potential breakage in dependent test setup.
  • Unused import: AsyncSession import in report_processor.py appears unused and may be placeholder for future changes.

Possibly related PRs

Suggested reviewers

  • felixjordandev

Poem

🐰 The mocks replace the memories clear,
Session factories now appear,
State flows through repositories true,
Async sessions see tests through—
A cleaner path for code to cheer!

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Title check ⚠️ Warning The PR title mentions creating a background task 'monitor_generation_time', but the actual changes involve refactoring report processing tests to use mock-based repository interactions and updating orchestrator dependencies, not implementing the claimed background task. Update the PR title to accurately reflect the actual changes: something like 'Refactor report processor tests to use mock-based repository approach' would better represent the modifications to test orchestrator integration and test routes.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/20251203204046

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 123e0ea and 5b7c113.

📒 Files selected for processing (3)
  • backend/app/services/report_processor.py (1 hunks)
  • backend/tests/test_orchestrator_integration.py (4 hunks)
  • backend/tests/test_routes.py (0 hunks)
💤 Files with no reviewable changes (1)
  • backend/tests/test_routes.py
🧰 Additional context used
🧬 Code graph analysis (1)
backend/tests/test_orchestrator_integration.py (4)
backend/app/db/models/report_state.py (1)
  • ReportStatusEnum (8-20)
backend/tests/test_orchestrator_updates.py (2)
  • orchestrator (31-34)
  • mock_report_repository (24-28)
backend/app/core/orchestrator.py (1)
  • create_orchestrator (102-305)
backend/app/db/repositories/report_repository.py (2)
  • get_report_by_id (72-76)
  • update_partial (90-100)
🔇 Additional comments (4)
backend/app/services/report_processor.py (1)

4-4: AsyncSession import is consistent with session-factory based orchestration

Importing AsyncSession here is fine as preparation for Async DB session wiring and keeps this module aligned with the orchestrator/session_factory changes; no further changes needed in this file.

backend/tests/test_orchestrator_integration.py (3)

92-98: Repository interaction assertions look good once mocking is fixed

The assertions on get_report_by_id and update_partial correctly verify that the orchestrator updates the report to AGENTS_COMPLETED with the expected report ID. After fixing the mocking as suggested above, this block should behave as intended.

Please re-run this test after applying the mocking fix to confirm the final update_partial call indeed carries status == ReportStatusEnum.AGENTS_COMPLETED.


160-166: Timeout-path repository assertions are appropriate

These assertions correctly check that:

  • The report lookup uses the right report_id.
  • The final update_partial call sets status to AGENTS_FAILED and includes an onchain_data_agent entry in errors.

Once the repository methods are mocked with AsyncMock as above, this should accurately validate timeout handling.

After fixing the mocks, verify that update_partial.call_args_list[-1].args[1] actually contains the expected errors structure in practice.


227-233: Exception-path assertions correctly validate AGENTS_FAILED + errors

These checks mirror the timeout test and look correct: they assert that the orchestrator marks the report as AGENTS_FAILED and includes an error entry for onchain_data_agent. With the mocking adjustments applied, this will robustly validate exception handling behavior.

Please re-run the exception-handling test after updating the mocks to ensure the last update_partial call indeed carries the expected status and errors.

Comment on lines +50 to +78
# Setup mock for ReportRepository and its interactions
from backend.app.db.models.report_state import ReportStatusEnum

mock_session = AsyncMock()
# Mock the async context manager for the session factory
mock_session_factory = AsyncMock(return_value=AsyncMock(__aenter__=AsyncMock(return_value=mock_session)))

orchestrator = await create_orchestrator(session_factory=mock_session_factory)
mock_report_repository = orchestrator.report_repository

# Mock initial report state (PENDING)
mock_report_state_pending = AsyncMock()
mock_report_state_pending.report_id = SAMPLE_REPORT_ID
mock_report_state_pending.status = ReportStatusEnum.PENDING
mock_report_state_pending.partial_agent_output = {}

# Mock final report state (COMPLETED)
mock_report_state_completed = AsyncMock()
mock_report_state_completed.report_id = SAMPLE_REPORT_ID
mock_report_state_completed.status = ReportStatusEnum.AGENTS_COMPLETED
mock_report_state_completed.partial_agent_output = {} # Will be updated

# Configure get_report_by_id to return pending then completed after update
mock_report_repository.get_report_by_id.side_effect = [
mock_report_state_pending, # First call during initial status check
mock_report_state_completed # Second call after update_partial for AGENTS_COMPLETED
]
mock_report_repository.update_partial.return_value = mock_report_state_completed # Ensure update returns something

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Mocking ReportRepository methods via .side_effect/.return_value on real methods will break

Here orchestrator.report_repository is a real ReportRepository instance; its get_report_by_id / update_partial are bound async methods, not mocks. Accessing .side_effect/.return_value on them will not work and will likely raise AttributeError.

Override these attributes with AsyncMock instances instead:

@@
-        orchestrator = await create_orchestrator(session_factory=mock_session_factory)
-        mock_report_repository = orchestrator.report_repository
+        orchestrator = await create_orchestrator(session_factory=mock_session_factory)
+        mock_report_repository = orchestrator.report_repository
@@
-        # Configure get_report_by_id to return pending then completed after update
-        mock_report_repository.get_report_by_id.side_effect = [
-            mock_report_state_pending, # First call during initial status check
-            mock_report_state_completed # Second call after update_partial for AGENTS_COMPLETED
-        ]
-        mock_report_repository.update_partial.return_value = mock_report_state_completed # Ensure update returns something
+        # Configure get_report_by_id to return pending then completed after update
+        mock_report_repository.get_report_by_id = AsyncMock(side_effect=[
+            mock_report_state_pending,  # First call during initial status check
+            mock_report_state_completed  # Second call after update_partial for AGENTS_COMPLETED
+        ])
+        # Ensure update_partial is a mock and returns the completed state
+        mock_report_repository.update_partial = AsyncMock(return_value=mock_report_state_completed)

This keeps the orchestrator instance but safely mocks just the repository calls used in the test.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Setup mock for ReportRepository and its interactions
from backend.app.db.models.report_state import ReportStatusEnum
mock_session = AsyncMock()
# Mock the async context manager for the session factory
mock_session_factory = AsyncMock(return_value=AsyncMock(__aenter__=AsyncMock(return_value=mock_session)))
orchestrator = await create_orchestrator(session_factory=mock_session_factory)
mock_report_repository = orchestrator.report_repository
# Mock initial report state (PENDING)
mock_report_state_pending = AsyncMock()
mock_report_state_pending.report_id = SAMPLE_REPORT_ID
mock_report_state_pending.status = ReportStatusEnum.PENDING
mock_report_state_pending.partial_agent_output = {}
# Mock final report state (COMPLETED)
mock_report_state_completed = AsyncMock()
mock_report_state_completed.report_id = SAMPLE_REPORT_ID
mock_report_state_completed.status = ReportStatusEnum.AGENTS_COMPLETED
mock_report_state_completed.partial_agent_output = {} # Will be updated
# Configure get_report_by_id to return pending then completed after update
mock_report_repository.get_report_by_id.side_effect = [
mock_report_state_pending, # First call during initial status check
mock_report_state_completed # Second call after update_partial for AGENTS_COMPLETED
]
mock_report_repository.update_partial.return_value = mock_report_state_completed # Ensure update returns something
# Setup mock for ReportRepository and its interactions
from backend.app.db.models.report_state import ReportStatusEnum
mock_session = AsyncMock()
# Mock the async context manager for the session factory
mock_session_factory = AsyncMock(return_value=AsyncMock(__aenter__=AsyncMock(return_value=mock_session)))
orchestrator = await create_orchestrator(session_factory=mock_session_factory)
mock_report_repository = orchestrator.report_repository
# Mock initial report state (PENDING)
mock_report_state_pending = AsyncMock()
mock_report_state_pending.report_id = SAMPLE_REPORT_ID
mock_report_state_pending.status = ReportStatusEnum.PENDING
mock_report_state_pending.partial_agent_output = {}
# Mock final report state (COMPLETED)
mock_report_state_completed = AsyncMock()
mock_report_state_completed.report_id = SAMPLE_REPORT_ID
mock_report_state_completed.status = ReportStatusEnum.AGENTS_COMPLETED
mock_report_state_completed.partial_agent_output = {} # Will be updated
# Configure get_report_by_id to return pending then completed after update
mock_report_repository.get_report_by_id = AsyncMock(side_effect=[
mock_report_state_pending, # First call during initial status check
mock_report_state_completed # Second call after update_partial for AGENTS_COMPLETED
])
# Ensure update_partial is a mock and returns the completed state
mock_report_repository.update_partial = AsyncMock(return_value=mock_report_state_completed)
🤖 Prompt for AI Agents
In backend/tests/test_orchestrator_integration.py around lines 50 to 78, the
test sets .side_effect and .return_value on real bound async methods of
orchestrator.report_repository which will fail; replace those bound methods with
AsyncMock instances on the repository before assigning side_effect/return_value
(e.g., orchestrator.report_repository.get_report_by_id = AsyncMock(...) and
orchestrator.report_repository.update_partial = AsyncMock(...)), then configure
their side_effect and return_value as needed so the orchestrator instance
remains but the repository calls are properly mocked.

Comment on lines +128 to +155
# Setup mock for ReportRepository and its interactions
from backend.app.db.models.report_state import ReportStatusEnum
mock_session = AsyncMock()
mock_session_factory = AsyncMock(return_value=AsyncMock(__aenter__=AsyncMock(return_value=mock_session)))

orchestrator = await create_orchestrator(session_factory=mock_session_factory)
mock_report_repository = orchestrator.report_repository

# Mock initial report state (PENDING)
mock_report_state_pending = AsyncMock()
mock_report_state_pending.report_id = SAMPLE_REPORT_ID
mock_report_state_pending.status = ReportStatusEnum.PENDING
mock_report_state_pending.partial_agent_output = {}

# Mock failed report state
mock_report_state_failed = AsyncMock()
mock_report_state_failed.report_id = SAMPLE_REPORT_ID
mock_report_state_failed.status = ReportStatusEnum.AGENTS_FAILED
mock_report_state_failed.partial_agent_output = {}
mock_report_state_failed.errors = {"onchain_data_agent": True}

# Configure get_report_by_id to return pending then failed after update
mock_report_repository.get_report_by_id.side_effect = [
mock_report_state_pending, # First call during initial status check
mock_report_state_failed # Second call after update_partial for AGENTS_FAILED
]
mock_report_repository.update_partial.return_value = mock_report_state_failed

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Same mocking issue for timeout test: methods must be replaced with AsyncMock

This block repeats the same pattern: mock_report_repository.get_report_by_id.side_effect = [...] and mock_report_repository.update_partial.return_value = ... on real async methods. Use AsyncMock assignments instead, analogous to the success test:

@@
-        orchestrator = await create_orchestrator(session_factory=mock_session_factory)
-        mock_report_repository = orchestrator.report_repository
+        orchestrator = await create_orchestrator(session_factory=mock_session_factory)
+        mock_report_repository = orchestrator.report_repository
@@
-        # Configure get_report_by_id to return pending then failed after update
-        mock_report_repository.get_report_by_id.side_effect = [
-            mock_report_state_pending, # First call during initial status check
-            mock_report_state_failed # Second call after update_partial for AGENTS_FAILED
-        ]
-        mock_report_repository.update_partial.return_value = mock_report_state_failed
+        # Configure get_report_by_id to return pending then failed after update
+        mock_report_repository.get_report_by_id = AsyncMock(side_effect=[
+            mock_report_state_pending,  # First call during initial status check
+            mock_report_state_failed   # Second call after update_partial for AGENTS_FAILED
+        ])
+        mock_report_repository.update_partial = AsyncMock(return_value=mock_report_state_failed)

Without this, the test will not run reliably.

🤖 Prompt for AI Agents
In backend/tests/test_orchestrator_integration.py around lines 128 to 155, the
test assigns side_effect and return_value directly to coroutine methods on the
real mock_report_repository which are async and must be replaced with AsyncMock
instances; change the assignments so mock_report_repository.get_report_by_id =
AsyncMock(side_effect=[mock_report_state_pending, mock_report_state_failed]) and
mock_report_repository.update_partial =
AsyncMock(return_value=mock_report_state_failed) (and apply the same AsyncMock
replacement pattern to any other async repository methods used in this timeout
test) so the async methods behave correctly in the test.

Comment on lines +195 to +221
# Setup mock for ReportRepository and its interactions
from backend.app.db.models.report_state import ReportStatusEnum
mock_session = AsyncMock()
mock_session_factory = AsyncMock(return_value=AsyncMock(__aenter__=AsyncMock(return_value=mock_session)))

orchestrator = await create_orchestrator(session_factory=mock_session_factory)
mock_report_repository = orchestrator.report_repository

# Mock initial report state (PENDING)
mock_report_state_pending = AsyncMock()
mock_report_state_pending.report_id = SAMPLE_REPORT_ID
mock_report_state_pending.status = ReportStatusEnum.PENDING
mock_report_state_pending.partial_agent_output = {}

# Mock failed report state
mock_report_state_failed = AsyncMock()
mock_report_state_failed.report_id = SAMPLE_REPORT_ID
mock_report_state_failed.status = ReportStatusEnum.AGENTS_FAILED
mock_report_state_failed.partial_agent_output = {}
mock_report_state_failed.errors = {"onchain_data_agent": True}

# Configure get_report_by_id to return pending then failed after update
mock_report_repository.get_report_by_id.side_effect = [
mock_report_state_pending, # First call during initial status check
mock_report_state_failed # Second call after update_partial for AGENTS_FAILED
]
mock_report_repository.update_partial.return_value = mock_report_state_failed
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Repeat of mocking issue for exception-handling test

Same concern here: get_report_by_id and update_partial are real async methods, so setting .side_effect / .return_value on them is unsafe. Mirror the AsyncMock pattern:

@@
-        orchestrator = await create_orchestrator(session_factory=mock_session_factory)
-        mock_report_repository = orchestrator.report_repository
+        orchestrator = await create_orchestrator(session_factory=mock_session_factory)
+        mock_report_repository = orchestrator.report_repository
@@
-        # Configure get_report_by_id to return pending then failed after update
-        mock_report_repository.get_report_by_id.side_effect = [
-            mock_report_state_pending, # First call during initial status check
-            mock_report_state_failed # Second call after update_partial for AGENTS_FAILED
-        ]
-        mock_report_repository.update_partial.return_value = mock_report_state_failed
+        # Configure get_report_by_id to return pending then failed after update
+        mock_report_repository.get_report_by_id = AsyncMock(side_effect=[
+            mock_report_state_pending,  # First call during initial status check
+            mock_report_state_failed   # Second call after update_partial for AGENTS_FAILED
+        ])
+        mock_report_repository.update_partial = AsyncMock(return_value=mock_report_state_failed)

This keeps all three tests consistent and prevents AttributeErrors when configuring mocks.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
# Setup mock for ReportRepository and its interactions
from backend.app.db.models.report_state import ReportStatusEnum
mock_session = AsyncMock()
mock_session_factory = AsyncMock(return_value=AsyncMock(__aenter__=AsyncMock(return_value=mock_session)))
orchestrator = await create_orchestrator(session_factory=mock_session_factory)
mock_report_repository = orchestrator.report_repository
# Mock initial report state (PENDING)
mock_report_state_pending = AsyncMock()
mock_report_state_pending.report_id = SAMPLE_REPORT_ID
mock_report_state_pending.status = ReportStatusEnum.PENDING
mock_report_state_pending.partial_agent_output = {}
# Mock failed report state
mock_report_state_failed = AsyncMock()
mock_report_state_failed.report_id = SAMPLE_REPORT_ID
mock_report_state_failed.status = ReportStatusEnum.AGENTS_FAILED
mock_report_state_failed.partial_agent_output = {}
mock_report_state_failed.errors = {"onchain_data_agent": True}
# Configure get_report_by_id to return pending then failed after update
mock_report_repository.get_report_by_id.side_effect = [
mock_report_state_pending, # First call during initial status check
mock_report_state_failed # Second call after update_partial for AGENTS_FAILED
]
mock_report_repository.update_partial.return_value = mock_report_state_failed
# Setup mock for ReportRepository and its interactions
from backend.app.db.models.report_state import ReportStatusEnum
mock_session = AsyncMock()
mock_session_factory = AsyncMock(return_value=AsyncMock(__aenter__=AsyncMock(return_value=mock_session)))
orchestrator = await create_orchestrator(session_factory=mock_session_factory)
mock_report_repository = orchestrator.report_repository
# Mock initial report state (PENDING)
mock_report_state_pending = AsyncMock()
mock_report_state_pending.report_id = SAMPLE_REPORT_ID
mock_report_state_pending.status = ReportStatusEnum.PENDING
mock_report_state_pending.partial_agent_output = {}
# Mock failed report state
mock_report_state_failed = AsyncMock()
mock_report_state_failed.report_id = SAMPLE_REPORT_ID
mock_report_state_failed.status = ReportStatusEnum.AGENTS_FAILED
mock_report_state_failed.partial_agent_output = {}
mock_report_state_failed.errors = {"onchain_data_agent": True}
# Configure get_report_by_id to return pending then failed after update
mock_report_repository.get_report_by_id = AsyncMock(side_effect=[
mock_report_state_pending, # First call during initial status check
mock_report_state_failed # Second call after update_partial for AGENTS_FAILED
])
mock_report_repository.update_partial = AsyncMock(return_value=mock_report_state_failed)
🤖 Prompt for AI Agents
In backend/tests/test_orchestrator_integration.py around lines 195 to 221, the
test assigns .side_effect and .return_value directly on real async methods which
can raise AttributeErrors; change the setup so
mock_report_repository.get_report_by_id and
mock_report_repository.update_partial are AsyncMock instances (use
AsyncMock(side_effect=[...]) for get_report_by_id and
AsyncMock(return_value=...) for update_partial) mirroring the pattern used
elsewhere in the tests, and ensure the mock_session_factory/mock_session usage
remains an AsyncMock context manager as currently done.

@felixjordandev
Copy link
Collaborator

that background task for monitor_generation_time is gonna help catch those slow reports. nice touch with the logging!

@felixjordandev felixjordandev merged commit 2e7b4f6 into main Dec 3, 2025
1 check passed
@felixjordandev felixjordandev deleted the feat/20251203204046 branch December 3, 2025 22:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants