refactor: select in external_knowledge_service#34493
asukaminato0721 merged 2 commits into langgenius:main from
Pyrefly Diff
base → PR
--- /tmp/pyrefly_base.txt 2026-04-03 03:04:05.833578513 +0000
+++ /tmp/pyrefly_pr.txt 2026-04-03 03:03:57.184574213 +0000
@@ -6262,7 +6262,7 @@
ERROR Argument `None` is not assignable to parameter `response` with type `Response` in function `httpx._exceptions.HTTPStatusError.__init__` [bad-argument-type]
--> tests/unit_tests/services/enterprise/test_plugin_manager_service.py:61:26
ERROR Argument `SimpleNamespace` is not assignable to parameter `metadata_condition` with type `MetadataCondition | None` in function `services.external_knowledge_service.ExternalDatasetService.fetch_external_knowledge_retrieval` [bad-argument-type]
- --> tests/unit_tests/services/external_dataset_service.py:849:36
+ --> tests/unit_tests/services/external_dataset_service.py:851:36
ERROR Cannot index into `list[Unknown]` [bad-index]
--> tests/unit_tests/services/hit_service.py:430:20
ERROR Cannot index into `object` [bad-index]
@@ -6547,11 +6547,11 @@
ERROR Argument `None` is not assignable to parameter `api_settings` with type `dict[Unknown, Unknown]` in function `services.external_knowledge_service.ExternalDatasetService.validate_api_list` [bad-argument-type]
--> tests/unit_tests/services/test_external_dataset_service.py:401:54
ERROR Argument `str | None` is not assignable to parameter `s` with type `bytearray | bytes | str` in function `json.loads` [bad-argument-type]
- --> tests/unit_tests/services/test_external_dataset_service.py:893:31
+ --> tests/unit_tests/services/test_external_dataset_service.py:880:31
ERROR `None` is not subscriptable [unsupported-operation]
- --> tests/unit_tests/services/test_external_dataset_service.py:1478:16
+ --> tests/unit_tests/services/test_external_dataset_service.py:1417:16
ERROR `None` is not subscriptable [unsupported-operation]
- --> tests/unit_tests/services/test_external_dataset_service.py:1479:16
+ --> tests/unit_tests/services/test_external_dataset_service.py:1418:16
ERROR Argument `Literal['invalid']` is not assignable to parameter `session_factory` with type `Engine | sessionmaker[Unknown] | None` in function `services.file_service.FileService.__init__` [bad-argument-type]
--> tests/unit_tests/services/test_file_service.py:48:41
ERROR `in` is not supported between `Literal['form_id=test-form']` and `None` [not-iterable]
Pull request overview
Refactors ExternalDatasetService DB access in external_knowledge_service.py to use SQLAlchemy 2.x select() / Session.scalar() patterns, updating unit-test mocks accordingly as part of the repo-wide effort to improve typing and modernize ORM usage (Issue #22668).
Changes:
- Migrated multiple `db.session.query(...).filter_by(...).first()`/`count()` usages to `select(...).where(...)` with `db.session.scalar(...)`.
- Updated count queries to `select(func.count(...))` and ensured scalar results are handled safely.
- Updated unit tests’ DB-session mocks to match the new `scalar()`-based query style.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| api/services/external_knowledge_service.py | Replaces legacy Query API usage with SQLAlchemy 2.x select() / scalar() patterns for external knowledge API/binding lookups. |
| api/tests/unit_tests/services/test_external_dataset_service.py | Updates mocks from session.query() chains to session.scalar() return values/side effects. |
| api/tests/unit_tests/services/external_dataset_service.py | Updates mocks to use session.scalar() and adapts multi-call scenarios via side_effect. |
    select(ExternalKnowledgeApis)
    .where(ExternalKnowledgeApis.id == external_knowledge_binding.external_knowledge_api_id)
In fetch_external_knowledge_retrieval, the API template lookup only filters by ExternalKnowledgeApis.id and does not constrain by tenant_id. Since there are no FK constraints between external_knowledge_bindings.external_knowledge_api_id and external_knowledge_apis.id, a malformed/corrupted binding could cause a cross-tenant API template read. Consider adding ExternalKnowledgeApis.tenant_id == tenant_id (or external_knowledge_binding.tenant_id) to the where() clause to enforce tenant isolation at query time.
Suggested change:
-   select(ExternalKnowledgeApis)
-   .where(ExternalKnowledgeApis.id == external_knowledge_binding.external_knowledge_api_id)
+   select(ExternalKnowledgeApis).where(
+       ExternalKnowledgeApis.id == external_knowledge_binding.external_knowledge_api_id,
+       ExternalKnowledgeApis.tenant_id == tenant_id,
+   )
      def test_fetch_external_knowledge_retrieval_non_200_status_returns_empty_list(self, mock_db_session: MagicMock):
          """
          Non‑200 responses should be treated as an empty result set.
          """
          binding = ExternalDatasetTestDataFactory.create_external_binding()
          api = Mock(spec=ExternalKnowledgeApis)
          api.settings = '{"endpoint":"https://example.com","api_key":"secret"}'

-         mock_db_session.query.return_value.filter_by.return_value.first.side_effect = [
+         mock_db_session.scalar.side_effect = [
              binding,
              api,
          ]
This test asserts that a non-200 response is treated as an empty result set, but ExternalDatasetService.fetch_external_knowledge_retrieval() currently raises ValueError(response.text) for non-200 (see api/services/external_knowledge_service.py:350-351). Either update the test to expect the exception, or update the service implementation to match the documented/tested behavior (alternatively mark the test xfail with a clear reason if it’s intentionally tracking a known issue).
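As a rough illustration of the first option (expecting the exception), the sketch below uses only the standard library. `fetch_retrieval` is a hypothetical, heavily simplified stand-in for `fetch_external_knowledge_retrieval`; it only mirrors the two `scalar()` lookups and the non-200 `ValueError` described in this thread, and the injected `http_post` callable is an assumption made to keep the example self-contained.

```python
from types import SimpleNamespace
from unittest.mock import MagicMock


def fetch_retrieval(session, http_post):
    """Hypothetical stand-in: two scalar() lookups, then an HTTP call."""
    binding = session.scalar("binding query")  # first scalar() call
    api = session.scalar(f"api query for {binding.external_knowledge_api_id}")  # second call
    response = http_post(api.settings)
    if response.status_code != 200:
        # Mirrors the behaviour the review describes for non-200 responses.
        raise ValueError(response.text)
    return response.json()


session = MagicMock()
# side_effect with a list hands out one value per call, in order.
session.scalar.side_effect = [
    SimpleNamespace(external_knowledge_api_id="api-1"),
    SimpleNamespace(settings='{"endpoint":"https://example.com"}'),
]
http_post = MagicMock(return_value=SimpleNamespace(status_code=500, text="boom"))

try:
    fetch_retrieval(session, http_post)
    raised = None
except ValueError as exc:
    raised = str(exc)
```

In a pytest suite the same expectation would typically be written with `pytest.raises(ValueError)` rather than a manual `try`/`except`.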
Thanks. 😊
Co-authored-by: autofix-ci[bot] <114827586+autofix-ci[bot]@users.noreply.github.com>
Summary
- Migrate `db.session.query()` calls to SQLAlchemy 2.x `select()` style in `external_knowledge_service.py`
- Replace `.filter_by().first()` with `db.session.scalar(select().where().limit(1))`
- Replace `.filter_by().count()` with `db.session.scalar(select(func.count()).where())`
- Update `test_external_dataset_service.py` and `external_dataset_service.py` to match the new patterns (unavoidable)

Note: `test_fetch_external_knowledge_retrieval_non_200_status_returns_empty_list` is a pre-existing failure on main (the test expects an empty list, but the code raises `ValueError`).

Test plan
Part of #22668