refactor: core/app pipeline, core/datasource, and core/indexing_runner #34359
Pyrefly Diff
base → PR
--- /tmp/pyrefly_base.txt 2026-04-01 00:26:17.932690376 +0000
+++ /tmp/pyrefly_pr.txt 2026-04-01 00:26:07.360672106 +0000
@@ -3021,13 +3021,13 @@
ERROR Argument `SimpleNamespace` is not assignable to parameter `application_generate_entity` with type `RagPipelineGenerateEntity` in function `core.app.apps.pipeline.pipeline_runner.PipelineRunner.__init__` [bad-argument-type]
--> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:65:37
ERROR Argument `SimpleNamespace` is not assignable to parameter `application_generate_entity` with type `RagPipelineGenerateEntity` in function `core.app.apps.pipeline.pipeline_runner.PipelineRunner.__init__` [bad-argument-type]
- --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:148:37
+ --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:143:37
ERROR Argument `SimpleNamespace` is not assignable to parameter `application_generate_entity` with type `RagPipelineGenerateEntity` in function `core.app.apps.pipeline.pipeline_runner.PipelineRunner.__init__` [bad-argument-type]
- --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:173:37
+ --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:168:37
ERROR Argument `SimpleNamespace` is not assignable to parameter `application_generate_entity` with type `RagPipelineGenerateEntity` in function `core.app.apps.pipeline.pipeline_runner.PipelineRunner.__init__` [bad-argument-type]
- --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:203:37
+ --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:194:37
ERROR Argument `SimpleNamespace` is not assignable to parameter `application_generate_entity` with type `RagPipelineGenerateEntity` in function `core.app.apps.pipeline.pipeline_runner.PipelineRunner.__init__` [bad-argument-type]
- --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:266:37
+ --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:253:37
ERROR `Literal['generated-conversation-id']` is not assignable to attribute `id` with type `Never` [bad-assignment]
--> tests/unit_tests/core/app/apps/test_advanced_chat_app_generator.py:53:22
ERROR `Literal['generated-message-id']` is not assignable to attribute `id` with type `Never` [bad-assignment]
@@ -3873,7 +3873,7 @@
ERROR Object of class `BlobChunkMessage` has no attribute `text`
ERROR Object of class `BlobChunkMessage` has no attribute `json_object`
ERROR No matching overload found for function `list.__init__` called with arguments: (Generator[Unknown] | None) [no-matching-overload]
- --> tests/unit_tests/core/datasource/test_datasource_file_manager.py:404:20
+ --> tests/unit_tests/core/datasource/test_datasource_file_manager.py:390:20
ERROR Object of class `FunctionType` has no attribute `assert_called_once` [missing-attribute]
--> tests/unit_tests/core/datasource/test_datasource_manager.py:52:5
ERROR Argument `SimpleNamespace` is not assignable to parameter `datasource_type` with type `DatasourceProviderType` in function `core.datasource.datasource_manager.DatasourceManager.get_datasource_plugin_provider` [bad-argument-type]
@@ -5184,13 +5184,13 @@
ERROR Missing required key `enable` for TypedDict `SummaryIndexSettingDict` [bad-typed-dict-key]
--> tests/unit_tests/core/rag/indexing/processor/test_qa_index_processor.py:333:78
ERROR Argument `None` is not assignable to parameter `state` with type `InstanceState[Any]` in function `sqlalchemy.orm.exc.ObjectDeletedError.__init__` [bad-argument-type]
- --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:934:84
+ --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:945:84
ERROR Argument `Literal['completed']` is not assignable to parameter `after_indexing_status` with type `IndexingStatus` in function `core.indexing_runner.IndexingRunner._update_document_index_status` [bad-argument-type]
- --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1045:13
+ --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1063:13
ERROR Argument `Literal['completed']` is not assignable to parameter `after_indexing_status` with type `IndexingStatus` in function `core.indexing_runner.IndexingRunner._update_document_index_status` [bad-argument-type]
- --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1060:71
+ --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1078:71
ERROR Argument `Literal['completed']` is not assignable to parameter `after_indexing_status` with type `IndexingStatus` in function `core.indexing_runner.IndexingRunner._update_document_index_status` [bad-argument-type]
- --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1071:71
+ --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1089:71
ERROR Object of class `FunctionType` has no attribute `assert_called_once` [missing-attribute]
--> tests/unit_tests/core/rag/rerank/test_reranker.py:1578:9
ERROR Object of class `FunctionType` has no attribute `call_args` [missing-attribute]
976bae1 to 680ec03
Pyrefly Diff
base → PR
--- /tmp/pyrefly_base.txt 2026-04-01 00:47:53.147056397 +0000
+++ /tmp/pyrefly_pr.txt 2026-04-01 00:47:42.788905975 +0000
@@ -3021,13 +3021,13 @@
ERROR Argument `SimpleNamespace` is not assignable to parameter `application_generate_entity` with type `RagPipelineGenerateEntity` in function `core.app.apps.pipeline.pipeline_runner.PipelineRunner.__init__` [bad-argument-type]
--> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:65:37
ERROR Argument `SimpleNamespace` is not assignable to parameter `application_generate_entity` with type `RagPipelineGenerateEntity` in function `core.app.apps.pipeline.pipeline_runner.PipelineRunner.__init__` [bad-argument-type]
- --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:148:37
+ --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:143:37
ERROR Argument `SimpleNamespace` is not assignable to parameter `application_generate_entity` with type `RagPipelineGenerateEntity` in function `core.app.apps.pipeline.pipeline_runner.PipelineRunner.__init__` [bad-argument-type]
- --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:173:37
+ --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:168:37
ERROR Argument `SimpleNamespace` is not assignable to parameter `application_generate_entity` with type `RagPipelineGenerateEntity` in function `core.app.apps.pipeline.pipeline_runner.PipelineRunner.__init__` [bad-argument-type]
- --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:203:37
+ --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:194:37
ERROR Argument `SimpleNamespace` is not assignable to parameter `application_generate_entity` with type `RagPipelineGenerateEntity` in function `core.app.apps.pipeline.pipeline_runner.PipelineRunner.__init__` [bad-argument-type]
- --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:266:37
+ --> tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py:253:37
ERROR `Literal['generated-conversation-id']` is not assignable to attribute `id` with type `Never` [bad-assignment]
--> tests/unit_tests/core/app/apps/test_advanced_chat_app_generator.py:53:22
ERROR `Literal['generated-message-id']` is not assignable to attribute `id` with type `Never` [bad-assignment]
@@ -3873,7 +3873,7 @@
ERROR Object of class `BlobChunkMessage` has no attribute `text`
ERROR Object of class `BlobChunkMessage` has no attribute `json_object`
ERROR No matching overload found for function `list.__init__` called with arguments: (Generator[Unknown] | None) [no-matching-overload]
- --> tests/unit_tests/core/datasource/test_datasource_file_manager.py:404:20
+ --> tests/unit_tests/core/datasource/test_datasource_file_manager.py:390:20
ERROR Object of class `FunctionType` has no attribute `assert_called_once` [missing-attribute]
--> tests/unit_tests/core/datasource/test_datasource_manager.py:52:5
ERROR Argument `SimpleNamespace` is not assignable to parameter `datasource_type` with type `DatasourceProviderType` in function `core.datasource.datasource_manager.DatasourceManager.get_datasource_plugin_provider` [bad-argument-type]
@@ -5184,13 +5184,13 @@
ERROR Missing required key `enable` for TypedDict `SummaryIndexSettingDict` [bad-typed-dict-key]
--> tests/unit_tests/core/rag/indexing/processor/test_qa_index_processor.py:333:78
ERROR Argument `None` is not assignable to parameter `state` with type `InstanceState[Any]` in function `sqlalchemy.orm.exc.ObjectDeletedError.__init__` [bad-argument-type]
- --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:934:84
+ --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:924:84
ERROR Argument `Literal['completed']` is not assignable to parameter `after_indexing_status` with type `IndexingStatus` in function `core.indexing_runner.IndexingRunner._update_document_index_status` [bad-argument-type]
- --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1045:13
+ --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1038:13
ERROR Argument `Literal['completed']` is not assignable to parameter `after_indexing_status` with type `IndexingStatus` in function `core.indexing_runner.IndexingRunner._update_document_index_status` [bad-argument-type]
- --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1060:71
+ --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1053:71
ERROR Argument `Literal['completed']` is not assignable to parameter `after_indexing_status` with type `IndexingStatus` in function `core.indexing_runner.IndexingRunner._update_document_index_status` [bad-argument-type]
- --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1071:71
+ --> tests/unit_tests/core/rag/indexing/test_indexing_runner.py:1064:71
ERROR Object of class `FunctionType` has no attribute `assert_called_once` [missing-attribute]
--> tests/unit_tests/core/rag/rerank/test_reranker.py:1578:9
ERROR Object of class `FunctionType` has no attribute `call_args` [missing-attribute]
@asukaminato0721 Please review. Thanks.
Pull request overview
Refactors ORM usage in the core RAG indexing/pipeline/datasource paths to SQLAlchemy 2.0 “select()/session.get()” style, and updates unit-test mocks accordingly to improve typing and modernize query patterns.
Changes:
- Replaced `db.session.query(...)` lookups with `session.get(...)`, `session.scalar(select(...).limit(1))`, and `session.scalars(select(...))` across core modules.
- Migrated bulk delete/update/count patterns to `session.execute(delete(...))`, `session.execute(update(...).values(...))`, and `select(func.count())`.
- Updated unit-test mock plumbing to match the new Session APIs.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| api/core/indexing_runner.py | Migrates dataset/user/document/segment queries and bulk updates/deletes to SQLAlchemy 2.0 style. |
| api/core/datasource/datasource_file_manager.py | Switches UploadFile/MessageFile/ToolFile retrieval to session.get() PK lookups. |
| api/core/app/apps/pipeline/pipeline_runner.py | Switches EndUser/Pipeline retrieval to session.get() and workflow/document fetches to scalar(select(...)). |
| api/core/app/apps/pipeline/pipeline_generator.py | Switches workflow fetch to session.get() inside generation path. |
| api/tests/unit_tests/core/rag/indexing/test_indexing_runner.py | Updates DB mocking to use session.get()/session.scalar() for IndexingRunner tests. |
| api/tests/unit_tests/core/datasource/test_datasource_file_manager.py | Updates DB mocking to use session.get() for file retrieval tests. |
| api/tests/unit_tests/core/app/apps/pipeline/test_pipeline_runner.py | Updates some DB mocking to use session.get()/session.scalar() in PipelineRunner tests. |
| api/tests/unit_tests/core/app/apps/pipeline/test_pipeline_generator.py | Updates workflow-not-found test to mock session.get() instead of query chaining. |
    db.session.scalar(
        select(func.count())
        .select_from(DatasetDocument)
        .where(DatasetDocument.id == document_id, DatasetDocument.is_paused == True)
For nullable boolean columns, prefer Document.is_paused.is_(True) instead of == True to avoid SQLAlchemy boolean-comparison warnings and to match the project’s existing pattern (e.g., api/services/dataset_service.py:1259).
    - .where(DatasetDocument.id == document_id, DatasetDocument.is_paused == True)
    + .where(DatasetDocument.id == document_id, DatasetDocument.is_paused.is_(True))
      end_user = MagicMock(session_id="sess")

      session = MagicMock()
    - session.query.side_effect = [query_end_user, query_pipeline]
    + session.get.side_effect = [end_user, pipeline]
      mocker.patch.object(module.db, "session", session)
This test file still contains cases mocking db.session.query(...) (e.g., test_run_pipeline_not_found / test_run_workflow_not_initialized), but PipelineRunner.run() now uses db.session.get(...). Because MagicMock auto-creates missing attributes, those tests can accidentally pass without exercising the intended branch. Please update the remaining tests to mock session.get return values/side_effects so they fail for the right reason (pipeline missing vs workflow missing).
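
The false-positive risk described in this comment can be reproduced with the standard library alone. The sketch below is an assumption-laden illustration: `load_pipeline` is a hypothetical stand-in for code that calls `session.get()`, not the actual `PipelineRunner.run()` implementation.

```python
from unittest.mock import MagicMock


def load_pipeline(session, pipeline_id):
    """Hypothetical code under test: looks up a row by primary key."""
    pipeline = session.get("Pipeline", pipeline_id)
    if pipeline is None:
        raise ValueError("Pipeline not found")
    return pipeline


# Stale mock: only the old query chain is configured to return None.
stale = MagicMock()
stale.query.return_value.where.return_value.first.return_value = None

# session.get() is auto-created and returns a truthy MagicMock,
# so the "not found" branch is never reached.
stale_result = load_pipeline(stale, "p1")

# Updated mock: session.get() itself returns None.
fresh = MagicMock()
fresh.get.return_value = None
try:
    load_pipeline(fresh, "p1")
    raised = False
except ValueError:
    raised = True
```

With the stale mock, a test that expects an exception fails or, worse, a test that expects success passes without ever touching the configured `query` chain.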
      session = MagicMock()
    - session.query.return_value.where.return_value.first.return_value = None
    + session.get.return_value = None
      mocker.patch.object(module.db, "session", session)
PipelineGenerator._generate() now uses db.session.get(Workflow, workflow_id), but this test module still has at least one test (test_generate_success_returns_converted) mocking session.query(...).where(...).first(). Since MagicMock will provide a truthy session.get() by default, that test can become a false positive. Update it to set session.get.return_value = workflow (and remove/avoid the unused query mock) so the test validates the correct behavior.
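
One possible way to make leftover query-style mocks fail loudly, rather than pass silently, is to restrict the mock's attribute surface with `spec`. This is a sketch, not something the PR does: the allow-list and the `make_session_mock` helper are hypothetical names introduced here for illustration.

```python
from unittest.mock import MagicMock

# Hypothetical allow-list of Session methods the refactored code may call.
ALLOWED_SESSION_API = ["get", "scalar", "scalars", "execute"]


def make_session_mock(**returns):
    """Build a session mock that rejects any attribute outside the allow-list."""
    session = MagicMock(spec=ALLOWED_SESSION_API)
    for name, value in returns.items():
        getattr(session, name).return_value = value
    return session


workflow = object()
session = make_session_mock(get=workflow)
found = session.get("Workflow", "wf-1")

# Configuring the old query chain now raises instead of being silently accepted.
try:
    session.query.return_value.where.return_value.first.return_value = None
    stale_setup_allowed = True
except AttributeError:
    stale_setup_allowed = False
```

The trade-off is that the allow-list has to be maintained by hand; passing `spec=Session` from `sqlalchemy.orm` would not help here, since the legacy `query` method still exists on that class.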
Summary
- Migrate `db.session.query()` calls to SQLAlchemy 2.0 `select()` style across 4 files: `pipeline_runner.py`, `pipeline_generator.py`, `datasource_file_manager.py`, and `indexing_runner.py`
- Use `session.get()` for PK lookups, `scalar(select(...).limit(1))` for non-PK filtered queries, `scalars(select(...))` for multi-row queries, `execute(delete(...))` for bulk deletes, `execute(update(...).values(...))` for bulk updates, and `scalar(select(func.count()))` for count queries

Test plan
- `make type-check` passes for all 4 changed source files (basedpyright)

Part of #22668