fix(api): preserve dataset metadata filters #35700

Merged
fatelei merged 1 commit into langgenius:main from princepal9120:fix/dataset-retrieve-metadata-filter
May 1, 2026

Conversation

@princepal9120
Contributor

Summary

Fixes #35666.

The Service API hit-testing/retrieve endpoints already pass retrieval_model into HitTestingService.retrieve, and that service already knows how to apply metadata_filtering_conditions when present. However, the request schema model for retrieval_model did not include metadata_filtering_conditions, so Pydantic validation/model dumping dropped the field before it reached the retrieval layer.

This adds metadata_filtering_conditions to the shared RetrievalModel schema and adds a service API regression test proving the filter survives request parsing and is forwarded to HitTestingService.retrieve.
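
As a rough illustration of the regression test described above, the sketch below checks that a metadata filter in the request payload reaches the retrieval call unchanged. It is not the test added in this PR: `DECLARED_FIELDS`, `parse_retrieval_model`, and `handle_retrieve` are hypothetical stand-ins for the schema and the controller, and the service is a `MagicMock` rather than the real HitTestingService.

```python
from unittest.mock import MagicMock

# Hypothetical stand-in for the fields declared on the retrieval_model schema.
# Removing "metadata_filtering_conditions" here reproduces the original bug.
DECLARED_FIELDS = {
    "search_method",
    "reranking_enable",
    "top_k",
    "score_threshold_enabled",
    "metadata_filtering_conditions",
}


def parse_retrieval_model(raw: dict) -> dict:
    """Keep only declared fields, mimicking a schema that ignores unknown keys."""
    return {key: value for key, value in raw.items() if key in DECLARED_FIELDS}


def handle_retrieve(payload: dict, service) -> None:
    """Hypothetical controller: parse the payload, then forward it to the service."""
    retrieval_model = parse_retrieval_model(payload["retrieval_model"])
    service.retrieve(query=payload["query"], retrieval_model=retrieval_model)


def test_metadata_filter_is_forwarded() -> None:
    conditions = {
        "logical_operator": "and",
        "conditions": [
            {"name": "category", "comparison_operator": "is", "value": "finance"}
        ],
    }
    payload = {
        "query": "some query",
        "retrieval_model": {
            "search_method": "semantic_search",
            "top_k": 4,
            "metadata_filtering_conditions": conditions,
        },
    }
    service = MagicMock()  # stands in for the retrieval service

    handle_retrieve(payload, service)

    # The filter must arrive at the service exactly as it was sent.
    forwarded = service.retrieve.call_args.kwargs["retrieval_model"]
    assert forwarded["metadata_filtering_conditions"] == conditions


test_metadata_filter_is_forwarded()
```

Asserting on the mocked call's keyword arguments keeps the check focused on the request-parsing boundary, which is where the field was being lost.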

Why this matters

Requests like this should filter by metadata:

{
  "query": "some query",
  "retrieval_model": {
    "search_method": "semantic_search",
    "reranking_enable": false,
    "top_k": 4,
    "score_threshold_enabled": false,
    "metadata_filtering_conditions": {
      "logical_operator": "and",
      "conditions": [
        {"name": "category", "comparison_operator": "is", "value": "finance"}
      ]
    }
  }
}

Before this change, the metadata filter was silently removed during payload validation.
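
The silent removal can be reproduced with a minimal sketch, assuming Pydantic v2 and its default behavior of ignoring undeclared fields. `RetrievalModelBefore` and `RetrievalModelAfter` are hypothetical, simplified stand-ins; the real RetrievalModel declares more fields and likely uses a more specific type than a plain dict for the filter.

```python
from typing import Any, Optional

from pydantic import BaseModel


class RetrievalModelBefore(BaseModel):
    # Simplified stand-in for the schema before the fix:
    # metadata_filtering_conditions is not declared.
    search_method: str
    reranking_enable: bool
    top_k: int
    score_threshold_enabled: bool


class RetrievalModelAfter(RetrievalModelBefore):
    # Declaring the field lets it survive validation and dumping.
    # A plain dict is used here for brevity (an assumption, not the real type).
    metadata_filtering_conditions: Optional[dict[str, Any]] = None


raw = {
    "search_method": "semantic_search",
    "reranking_enable": False,
    "top_k": 4,
    "score_threshold_enabled": False,
    "metadata_filtering_conditions": {
        "logical_operator": "and",
        "conditions": [
            {"name": "category", "comparison_operator": "is", "value": "finance"}
        ],
    },
}

# Undeclared keys are ignored by default, so the filter vanishes without an error.
before = RetrievalModelBefore.model_validate(raw).model_dump()
after = RetrievalModelAfter.model_validate(raw).model_dump()

print("metadata_filtering_conditions" in before)  # False
print(after["metadata_filtering_conditions"]["conditions"][0]["value"])  # finance
```

Because the default configuration raises no validation error for the extra key, the caller receives unfiltered results with no indication that the filter was discarded.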

Validation

  • python3 -m py_compile api/services/entities/knowledge_entities/knowledge_entities.py api/tests/unit_tests/controllers/service_api/dataset/test_hit_testing.py
  • git diff --check

I also attempted the focused pytest command:

  • UV_CACHE_DIR=/tmp/uv-cache uv run --frozen pytest tests/unit_tests/controllers/service_api/dataset/test_hit_testing.py -q

but the local runner could not complete dependency setup because this workspace ran out of disk while uv was downloading/building Dify's full API dependency graph.

@dosubot (bot) added the size:XS label (This PR changes 0-9 lines, ignoring generated files.) on Apr 29, 2026
@github-actions
Contributor

github-actions Bot commented May 1, 2026

Pyrefly Diff

base → PR
--- /tmp/pyrefly_base.txt	2026-05-01 09:12:24.451911780 +0000
+++ /tmp/pyrefly_pr.txt	2026-05-01 09:12:11.678801237 +0000
@@ -1845,7 +1845,7 @@
 ERROR Missing argument `app_model` in function `handler` [missing-argument]
   --> tests/unit_tests/controllers/console/app/test_wraps.py:43:16
 ERROR Cannot set item in `OrderedDict[str, bool | list[str] | str]` [unsupported-operation]
-   --> tests/unit_tests/controllers/console/app/workflow_draft_variables_test.py:137:47
+   --> tests/unit_tests/controllers/console/app/workflow_draft_variables_test.py:134:47
 ERROR `None` is not subscriptable [unsupported-operation]
    --> tests/unit_tests/controllers/console/auth/test_login_logout.py:516:16
 ERROR `None` is not subscriptable [unsupported-operation]
@@ -5433,21 +5433,21 @@
 ERROR Argument `_FakeSession` is not assignable to parameter `session` with type `Session | None` in function `core.workflow.human_input_forms.load_form_tokens_by_form_id` [bad-argument-type]
    --> tests/unit_tests/core/workflow/test_human_input_forms.py:101:17
 ERROR Argument `object` is not assignable to parameter `node_data_memory` with type `MemoryConfig | None` in function `core.workflow.node_factory.fetch_memory` [bad-argument-type]
-   --> tests/unit_tests/core/workflow/test_node_factory.py:116:30
+   --> tests/unit_tests/core/workflow/test_node_factory.py:110:30
 ERROR Argument `object` is not assignable to parameter `node_data_memory` with type `MemoryConfig | None` in function `core.workflow.node_factory.fetch_memory` [bad-argument-type]
-   --> tests/unit_tests/core/workflow/test_node_factory.py:152:30
+   --> tests/unit_tests/core/workflow/test_node_factory.py:146:30
 ERROR Argument `SimpleNamespace` is not assignable to parameter `graph_init_params` with type `GraphInitParams` in function `core.workflow.node_factory.DifyNodeFactory.__init__` [bad-argument-type]
-   --> tests/unit_tests/core/workflow/test_node_factory.py:344:35
+   --> tests/unit_tests/core/workflow/test_node_factory.py:338:35
 ERROR `SimpleNamespace` is not assignable to attribute `graph_runtime_state` with type `GraphRuntimeState` [bad-assignment]
-   --> tests/unit_tests/core/workflow/test_node_factory.py:407:39
+   --> tests/unit_tests/core/workflow/test_node_factory.py:401:39
 ERROR `SimpleNamespace` is not assignable to attribute `_dify_context` with type `DifyRunContext` [bad-assignment]
-   --> tests/unit_tests/core/workflow/test_node_factory.py:408:33
+   --> tests/unit_tests/core/workflow/test_node_factory.py:402:33
 ERROR `form_repository` may be uninitialized [unbound-name]
-   --> tests/unit_tests/core/workflow/test_node_factory.py:548:49
+   --> tests/unit_tests/core/workflow/test_node_factory.py:542:49
 ERROR `SimpleNamespace` is not assignable to attribute `_dify_context` with type `DifyRunContext` [bad-assignment]
-   --> tests/unit_tests/core/workflow/test_node_factory.py:770:33
+   --> tests/unit_tests/core/workflow/test_node_factory.py:686:33
 ERROR `SimpleNamespace` is not assignable to attribute `graph_runtime_state` with type `GraphRuntimeState` [bad-assignment]
-   --> tests/unit_tests/core/workflow/test_node_factory.py:771:39
+   --> tests/unit_tests/core/workflow/test_node_factory.py:687:39
 ERROR Object of class `ExternalRecipient` has no attribute `reference_id` [missing-attribute]
   --> tests/unit_tests/core/workflow/test_node_runtime.py:88:12
 ERROR Argument `object` is not assignable to parameter `method` with type `EmailDeliveryMethod | InteractiveSurfaceDeliveryMethod` in function `core.workflow.node_runtime.apply_dify_debug_email_recipient` [bad-argument-type]

@dosubot (bot) added the lgtm label (This PR has been approved by a maintainer) on May 1, 2026
@fatelei fatelei added this pull request to the merge queue May 1, 2026
Merged via the queue into langgenius:main with commit 54bde0b May 1, 2026
27 checks passed