feat: add Valkey as a vector store backend #35337

Open
daric93 wants to merge 7 commits into langgenius:main from daric93:feat/add-valkey-vector-store

Conversation

daric93 commented Apr 16, 2026

Fixes #35229

Summary

Adds Valkey as a supported vector database backend for knowledge base embeddings and retrieval in Dify, using valkey-glide (the official Valkey Python client) and the valkey-search module for vector similarity search.

What's included:

  • VALKEY added to the VectorType enum
  • ValkeyVector implementing the full BaseVector interface (create, add, search, delete)
  • ValkeyVectorFactory registered via entry point in pyproject.toml
  • ValkeyConfig Pydantic settings with env vars: VALKEY_HOST, VALKEY_PORT, VALKEY_PASSWORD, VALKEY_DB, VALKEY_USE_SSL, VALKEY_DISTANCE_METRIC
  • Valkey service in docker-compose.yaml using valkey/valkey-bundle:9.1.0-rc1 (includes valkey-search module 1.2.0)
  • CI integration: Valkey added to both vdb-tests.yml (smoke) and vdb-tests-full.yml (weekly)
  • 35 unit tests (pure logic, no mocks) + 31 integration tests (live Valkey)
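
As a rough illustration of how the listed environment variables could map onto a settings object, here is a minimal standard-library sketch. It is not the PR's ValkeyConfig (which is described as Pydantic settings); the class name, the loader function, and every default value below are assumptions made for the example. Only the variable names and the three metric names come from this description.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

# Metric names taken from the PR description (COSINE, L2, IP).
ALLOWED_METRICS = {"COSINE", "L2", "IP"}


@dataclass(frozen=True)
class ValkeySettingsSketch:
    """Hypothetical stand-in for the PR's ValkeyConfig; not the real class."""

    host: str
    port: int
    password: Optional[str]
    db: int
    use_ssl: bool
    distance_metric: str


def load_valkey_settings(env: Optional[Mapping[str, str]] = None) -> ValkeySettingsSketch:
    """Build settings from VALKEY_* variables; all defaults here are assumed."""
    env = os.environ if env is None else env
    metric = env.get("VALKEY_DISTANCE_METRIC", "COSINE").upper()
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"Unsupported VALKEY_DISTANCE_METRIC: {metric}")
    return ValkeySettingsSketch(
        host=env.get("VALKEY_HOST", "localhost"),
        port=int(env.get("VALKEY_PORT", "6379")),
        # Treat an empty password as "no password".
        password=env.get("VALKEY_PASSWORD") or None,
        db=int(env.get("VALKEY_DB", "0")),
        use_ssl=env.get("VALKEY_USE_SSL", "false").lower() in {"1", "true", "yes"},
        distance_metric=metric,
    )
```

Passing a plain dict instead of relying on os.environ keeps the sketch easy to exercise in isolation, for example `load_valkey_settings({"VALKEY_HOST": "valkey", "VALKEY_DISTANCE_METRIC": "l2"})`.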

Key design decisions:

  • Uses the typed ft.create / ft.search glide API where available, and client.hset / client.exists / client.delete for CRUD
  • HNSW vector index with configurable distance metric (COSINE, L2, IP)
  • Dimensions auto-detected from first embedding
  • Distributed locking via Dify's existing Redis client (ext_redis) for index creation — same pattern as Qdrant backend
  • Cosine distance conversion follows the valkey-search spec: similarity = 1 - distance/2, where distance lies in the range [0, 2]
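
The cosine conversion in the last bullet can be sketched as a small helper. The function name is hypothetical, and the clamping step is an assumption added for this example to absorb floating-point drift; the description only states the formula itself.

```python
def cosine_distance_to_similarity(distance: float) -> float:
    """Map a cosine distance in [0, 2] to a similarity score in [0, 1]."""
    similarity = 1.0 - distance / 2.0
    # Assumption for this sketch: clamp so tiny float errors stay in [0, 1].
    return max(0.0, min(1.0, similarity))
```

Under this mapping, identical vectors (distance 0) score 1.0, orthogonal vectors (distance 1) score 0.5, and opposite vectors (distance 2) score 0.0.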

Dependencies:

  • valkey-glide>=1.3.0 — official async Valkey client
  • Valkey server with valkey-search module >= 1.2.0

Checklist

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly: docs: add Valkey vector store environment variables dify-docs#750
  • I ran make lint && make type-check (backend) and cd web && pnpm exec vp staged (frontend) to appease the lint gods

Signed-off-by: Daria Korenieva <daric2612@gmail.com>
@dosubot (bot) added the size:XL label (This PR changes 500-999 lines, ignoring generated files.) Apr 16, 2026
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
@github-actions
Contributor

Pyrefly Diff

base → PR
--- /tmp/pyrefly_base.txt	2026-04-17 03:41:53.732644455 +0000
+++ /tmp/pyrefly_pr.txt	2026-04-17 03:41:42.198612951 +0000
@@ -552,6 +552,8 @@
   --> providers/vdb/vdb-upstash/tests/unit_tests/test_upstash_vector.py:33:5
 ERROR Object of class `ModuleType` has no attribute `Index` [missing-attribute]
   --> providers/vdb/vdb-upstash/tests/unit_tests/test_upstash_vector.py:34:5
+ERROR Cannot set item in `list[list[float] | str]` [unsupported-operation]
+   --> providers/vdb/vdb-valkey/tests/integration_tests/test_valkey.py:248:9
 ERROR Object of class `ModuleType` has no attribute `SimpleConnectionPool` [missing-attribute]
   --> providers/vdb/vdb-vastbase/tests/unit_tests/test_vastbase_vector.py:27:5
 ERROR Object of class `ModuleType` has no attribute `execute_values` [missing-attribute]
@@ -4592,7 +4594,7 @@
 ERROR Argument `list[Document | SimpleNamespace]` is not assignable to parameter `texts` with type `list[Document]` in function `core.rag.datasource.vdb.vector_base.BaseVector._get_uuids` [bad-argument-type]
   --> tests/unit_tests/core/rag/datasource/vdb/test_vector_base.py:87:30
 ERROR Class member `_Expr.__eq__` overrides parent class `object` in an inconsistent manner [bad-override]
-   --> tests/unit_tests/core/rag/datasource/vdb/test_vector_factory.py:203:13
+   --> tests/unit_tests/core/rag/datasource/vdb/test_vector_factory.py:204:13
 ERROR Argument `list[str]` is not assignable to parameter `docs` with type `Sequence[Document]` in function `core.rag.docstore.dataset_docstore.DatasetDocumentStore.add_documents` [bad-argument-type]
    --> tests/unit_tests/core/rag/docstore/test_dataset_docstore.py:307:37
 ERROR Argument `None` is not assignable to parameter `orig` with type `BaseException` in function `sqlalchemy.exc.DBAPIError.__init__` [bad-argument-type]

Contributor

Copilot AI left a comment

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Adds Valkey as a supported vector store backend for embeddings + retrieval, wiring it through configuration, packaging/entrypoints, Docker env, and CI workflows.

Changes:

  • Introduces a new ValkeyVector backend (factory + config), backed by valkey-glide and valkey-search.
  • Adds Valkey configuration/env examples and registers the provider in the API workspace.
  • Adds unit + integration tests and includes Valkey in vdb CI workflows.

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 12 comments.

| File | Description |
| --- | --- |
| docker/docker-compose.yaml | Adds Valkey env vars to the shared API/worker environment block. |
| docker/.env.example | Documents valkey as a vector store option and adds Valkey-specific env vars. |
| api/.env.example | Documents valkey as a vector store option and adds Valkey-specific env vars. |
| api/core/rag/datasource/vdb/vector_type.py | Adds VALKEY to the VectorType enum. |
| api/configs/middleware/vdb/valkey_config.py | Introduces Pydantic settings for Valkey connection + metric selection. |
| api/configs/middleware/__init__.py | Registers ValkeyConfig in the middleware config mixin. |
| api/pyproject.toml | Adds dify-vdb-valkey to workspace deps and feature groups. |
| api/tests/unit_tests/core/rag/datasource/vdb/test_vector_factory.py | Registers Valkey factory module path in vector factory tests. |
| api/providers/vdb/vdb-valkey/pyproject.toml | Defines the new provider package + entrypoint. |
| api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py | Implements Valkey vector backend + helpers + factory. |
| api/providers/vdb/vdb-valkey/tests/unit_tests/test_valkey_vector.py | Adds unit tests for helper functions + config defaults. |
| api/providers/vdb/vdb-valkey/tests/integration_tests/test_valkey.py | Adds integration tests against a live Valkey+valkey-search instance. |
| .github/workflows/vdb-tests.yml | Adds Valkey to smoke vdb integration matrix and test paths. |
| .github/workflows/vdb-tests-full.yml | Adds Valkey to weekly/full vdb integration matrix. |

Comment thread api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py
Comment thread api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py Outdated
Comment thread api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py Outdated
Comment thread api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py
Comment thread api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py
Comment thread api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py
Comment thread api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py Outdated
Comment thread api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py
Comment thread api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py Outdated
Comment thread docker/docker-compose.yaml Outdated
@crazywoola
Member

Please mark those comments as resolved if you think it's ok.

@github-actions
Contributor

Pyrefly Diff

base → PR
--- /tmp/pyrefly_base.txt	2026-04-17 04:50:08.566547899 +0000
+++ /tmp/pyrefly_pr.txt	2026-04-17 04:49:57.595535056 +0000
@@ -552,6 +552,8 @@
   --> providers/vdb/vdb-upstash/tests/unit_tests/test_upstash_vector.py:33:5
 ERROR Object of class `ModuleType` has no attribute `Index` [missing-attribute]
   --> providers/vdb/vdb-upstash/tests/unit_tests/test_upstash_vector.py:34:5
+ERROR Cannot set item in `list[list[float] | str]` [unsupported-operation]
+   --> providers/vdb/vdb-valkey/tests/integration_tests/test_valkey.py:248:9
 ERROR Object of class `ModuleType` has no attribute `SimpleConnectionPool` [missing-attribute]
   --> providers/vdb/vdb-vastbase/tests/unit_tests/test_vastbase_vector.py:27:5
 ERROR Object of class `ModuleType` has no attribute `execute_values` [missing-attribute]
@@ -4592,7 +4594,7 @@
 ERROR Argument `list[Document | SimpleNamespace]` is not assignable to parameter `texts` with type `list[Document]` in function `core.rag.datasource.vdb.vector_base.BaseVector._get_uuids` [bad-argument-type]
   --> tests/unit_tests/core/rag/datasource/vdb/test_vector_base.py:87:30
 ERROR Class member `_Expr.__eq__` overrides parent class `object` in an inconsistent manner [bad-override]
-   --> tests/unit_tests/core/rag/datasource/vdb/test_vector_factory.py:203:13
+   --> tests/unit_tests/core/rag/datasource/vdb/test_vector_factory.py:204:13
 ERROR Argument `list[str]` is not assignable to parameter `docs` with type `Sequence[Document]` in function `core.rag.docstore.dataset_docstore.DatasetDocumentStore.add_documents` [bad-argument-type]
    --> tests/unit_tests/core/rag/docstore/test_dataset_docstore.py:307:37
 ERROR Argument `None` is not assignable to parameter `orig` with type `BaseException` in function `sqlalchemy.exc.DBAPIError.__init__` [bad-argument-type]

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 15 out of 16 changed files in this pull request and generated 3 comments.


Comment thread api/providers/vdb/vdb-valkey/src/dify_vdb_valkey/valkey_vector.py
Comment thread .github/workflows/vdb-tests.yml
Comment thread .github/workflows/vdb-tests-full.yml
daric93 added 3 commits April 17, 2026 11:01
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
Signed-off-by: Daria Korenieva <daric2612@gmail.com>
Development

Successfully merging this pull request may close these issues.

Add Valkey as a Vector Store Backend

3 participants