fix(api): inherit REDIS_USE_CLUSTERS for event bus pubsub config (#35291) 🤖🤖🤖 #35338
Open
jeanibarz wants to merge 3 commits into langgenius:main
The event bus pub/sub config (`PUBSUB_REDIS_*`/`EVENT_BUS_REDIS_*`) inherits connection details from the generic Redis config (HOST, PORT, USERNAME, PASSWORD, DB, USE_SSL) via `normalized_pubsub_redis_url`, but `PUBSUB_REDIS_USE_CLUSTERS` did not fall back to `REDIS_USE_CLUSTERS`. With the old `False` default, `_create_pubsub_client` called `redis.Redis.from_url` instead of `RedisCluster.from_url` whenever operators set `REDIS_USE_CLUSTERS` alone, causing the event bus to fail in cluster-only deployments.

Change the `PUBSUB_REDIS_USE_CLUSTERS` default from `False` to `None`, add a `normalized_pubsub_use_clusters` property that mirrors the existing `normalized_pubsub_redis_url` fallback pattern, and route the single `ext_redis.py` call site through it. Explicitly set values still win over the inherited flag; non-cluster deployments remain unchanged.
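The fallback described in this commit can be pictured with a minimal sketch. `PubSubSettings` is a hypothetical, dependency-free stand-in reduced to the two flags; it is not the actual Dify config class, which is Pydantic-based and carries many more fields.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PubSubSettings:
    """Hypothetical stand-in for the real config class (assumption)."""

    REDIS_USE_CLUSTERS: bool = False
    # None means "not set by the operator": inherit REDIS_USE_CLUSTERS.
    PUBSUB_REDIS_USE_CLUSTERS: Optional[bool] = None

    @property
    def normalized_pubsub_use_clusters(self) -> bool:
        # An explicit True/False always wins over the inherited flag.
        if self.PUBSUB_REDIS_USE_CLUSTERS is not None:
            return self.PUBSUB_REDIS_USE_CLUSTERS
        return self.REDIS_USE_CLUSTERS


inherited = PubSubSettings(REDIS_USE_CLUSTERS=True)
overridden = PubSubSettings(REDIS_USE_CLUSTERS=True, PUBSUB_REDIS_USE_CLUSTERS=False)
pubsub_only = PubSubSettings(REDIS_USE_CLUSTERS=False, PUBSUB_REDIS_USE_CLUSTERS=True)
untouched = PubSubSettings()
```

Using `None` as the "unset" marker is what lets the property tell "operator said false" apart from "operator said nothing", which a plain `False` default cannot do.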
The Python-level fallback added in the previous commit is bypassed by
the docker-compose template, which injects
`EVENT_BUS_REDIS_USE_CLUSTERS: ${EVENT_BUS_REDIS_USE_CLUSTERS:-false}`
into every container's environment. With a literal "false" string
reaching the API container, Pydantic coerces it to `False` before the
new inheritance logic can fire, so the issue reproducer (operator sets
`REDIS_USE_CLUSTERS=true` without touching the event-bus flag) still
fails for docker deployments — the primary reported context of langgenius#35291.
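To illustrate why the template default shadows the fallback, here is a small simulation of the `${VAR:-default}` substitution (default used when the variable is unset or empty). The helper `compose_default` and the `operator_env` dictionary are illustrative assumptions, not part of docker-compose or of this PR.

```python
from typing import Mapping


def compose_default(env: Mapping[str, str], name: str, default: str) -> str:
    """Mimic `${NAME:-default}`: fall back when NAME is unset or empty."""
    value = env.get(name)
    return default if value is None or value == "" else value


# The operator enables cluster mode for the main Redis only.
operator_env = {"REDIS_USE_CLUSTERS": "true"}

# Old template: `${EVENT_BUS_REDIS_USE_CLUSTERS:-false}`
old_template = compose_default(operator_env, "EVENT_BUS_REDIS_USE_CLUSTERS", "false")

# New template: `${EVENT_BUS_REDIS_USE_CLUSTERS:-}`
new_template = compose_default(operator_env, "EVENT_BUS_REDIS_USE_CLUSTERS", "")
```

With the old template the container receives the string `"false"`, which looks like an explicit operator choice; with the new one it receives an empty string, which the config layer can treat as "not set".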
- Drop the hard-coded `false` default from .env.example and document
that leaving the value empty inherits `REDIS_USE_CLUSTERS`.
- Regenerate docker-compose.yaml so the compose-level default becomes
an empty string, matching the new .env.example.
- Add a `field_validator` on `PUBSUB_REDIS_USE_CLUSTERS` that coerces
empty / whitespace-only env values to `None`, because Pydantic
otherwise rejects `""` for `bool | None`.
- Add unit tests for the empty-string path and for the "pubsub-only
cluster" configuration (main Redis standalone, event bus clustered).
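The empty-string handling can be sketched without Pydantic as a pre-parsing step. In the PR this is done by a `field_validator` on `PUBSUB_REDIS_USE_CLUSTERS`; the function names below and the accepted boolean spellings are assumptions chosen for illustration, and the parsing is simpler than Pydantic's own.

```python
from typing import Optional

# Assumed spellings; Pydantic's real boolean parsing accepts a similar set.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def blank_to_none(value: object) -> object:
    """Map empty or whitespace-only strings to None, pass others through."""
    if isinstance(value, str) and value.strip() == "":
        return None
    return value


def parse_optional_bool(value: object) -> Optional[bool]:
    """Parse an env-style value into True, False, or None (unset)."""
    value = blank_to_none(value)
    if value is None or isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    raise ValueError(f"not a valid boolean: {value!r}")
```

Running the coercion before the boolean parse is the important part: without it, `""` reaches the parser and is rejected instead of being read as "inherit".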
Pyrefly Diff: No changes detected.
The property tests on `DifyConfig` pin the config-layer semantics but don't catch regressions in the one-line call-site swap at `init_app`: if someone later restores `dify_config.PUBSUB_REDIS_USE_CLUSTERS` in place of `normalized_pubsub_use_clusters`, all existing unit tests would still pass while the langgenius#35291 bug silently returns.

Add three integration-style tests that patch the redis client factories and assert `_create_pubsub_client` receives the correct `use_clusters` kwarg for:
- inheritance path (`normalized_pubsub_use_clusters=True`)
- explicit-false override (`normalized_pubsub_use_clusters=False`)
- no pubsub URL configured (client never constructed)
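The shape of such call-site tests can be sketched with `unittest.mock`. `FakeExtRedis` is a hypothetical, heavily reduced stand-in for the extension module, and `run_case` is an illustrative helper; the real tests patch the actual factories in `ext_redis.py`.

```python
from types import SimpleNamespace
from unittest import mock


class FakeExtRedis:
    """Hypothetical stand-in for the extension module (assumption)."""

    @staticmethod
    def _create_pubsub_client(url: str, use_clusters: bool) -> object:
        raise AssertionError("the real factory must be patched in tests")

    @classmethod
    def init_app(cls, config) -> None:
        url = config.normalized_pubsub_redis_url
        if not url:
            return  # no pubsub URL configured: never build a client
        # The call site under test: it must read the normalized property.
        cls._create_pubsub_client(url, config.normalized_pubsub_use_clusters)


def run_case(url, use_clusters):
    """Run init_app with a patched factory and return the mock."""
    config = SimpleNamespace(
        normalized_pubsub_redis_url=url,
        normalized_pubsub_use_clusters=use_clusters,
    )
    with mock.patch.object(FakeExtRedis, "_create_pubsub_client") as factory:
        FakeExtRedis.init_app(config)
    return factory


URL = "redis://localhost:6379/0"
inherit_case = run_case(URL, True)
explicit_false_case = run_case(URL, False)
no_url_case = run_case(None, True)
```

Asserting on the arguments the factory receives, rather than on the config object, is what ties the test to the call site itself.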
Pyrefly Diff (base → PR)
--- /tmp/pyrefly_base.txt 2026-04-16 20:51:04.258946249 +0000
+++ /tmp/pyrefly_pr.txt 2026-04-16 20:50:53.477839681 +0000
@@ -5777,7 +5777,7 @@
ERROR Cannot index into `object` [bad-index]
--> tests/unit_tests/extensions/otel/test_celery_sqlcommenter.py:140:20
ERROR Object of class `Retry` has no attribute `_retries` [missing-attribute]
- --> tests/unit_tests/extensions/test_redis.py:34:16
+ --> tests/unit_tests/extensions/test_redis.py:35:16
ERROR Argument `dict[str, bytes | str]` is not assignable to parameter `headers` with type `Headers | Mapping[bytes, bytes] | Mapping[str, str] | Sequence[tuple[bytes, bytes]] | Sequence[tuple[str, str]] | None` in function `httpx._models.Response.__init__` [bad-argument-type]
--> tests/unit_tests/factories/test_build_from_mapping.py:75:21
ERROR Object of class `NoneType` has no attribute `storage_key` [missing-attribute]
Summary
Fixes #35291.
The event bus pub/sub config (`PUBSUB_REDIS_*`/`EVENT_BUS_REDIS_*`) already inherits connection details from the generic Redis config (`REDIS_HOST`, `REDIS_PORT`, `REDIS_USERNAME`, `REDIS_PASSWORD`, `REDIS_DB`, `REDIS_USE_SSL`) via the existing `normalized_pubsub_redis_url` property, but `PUBSUB_REDIS_USE_CLUSTERS` was not covered by that fallback. With the old `False` default, `_create_pubsub_client` at `api/extensions/ext_redis.py:454` always invoked `redis.Redis.from_url(...)`, never `RedisCluster.from_url(...)`, even when the operator had set `REDIS_USE_CLUSTERS=true` for the main Redis. In cluster-only deployments this makes the event bus fail.

Three commits, all scoped to the fix:
- `fix(api)`: extend `RedisConfigDefaults` with `REDIS_USE_CLUSTERS`, change the `PUBSUB_REDIS_USE_CLUSTERS` default from `False` to `None`, add a computed `normalized_pubsub_use_clusters` property mirroring `normalized_pubsub_redis_url`, and route the single call site through it.
- `fix(docker)`: without this follow-up the Python-level fallback is shadowed by `docker-compose.yaml` injecting `EVENT_BUS_REDIS_USE_CLUSTERS: ${EVENT_BUS_REDIS_USE_CLUSTERS:-false}` into every container. That makes Pydantic see the literal string `"false"` before the new inheritance logic can fire, reproducing the bug for docker users (the reported deployment context). The `.env.example` and generated `docker-compose.yaml` defaults are now empty; a small `field_validator` coerces empty/whitespace-only env values to `None` so Pydantic does not reject `""`.
- `test(ext_redis)`: pin the call-site swap with three integration-style tests that patch the client factories and assert `_create_pubsub_client` receives the correct `use_clusters` kwarg. Without these, a future refactor that restores the raw `PUBSUB_REDIS_USE_CLUSTERS` read at `init_app` would silently reintroduce the bug with all existing property tests still passing.

Explicitly setting either `PUBSUB_REDIS_USE_CLUSTERS` or `EVENT_BUS_REDIS_USE_CLUSTERS` still wins over the inherited flag. Non-cluster deployments with both flags unset keep the current behavior (cluster off).

Verification
Unit tests: 8 new, all passing locally (config layer + runtime routing).

The 5 config-layer tests cover: inheritance, explicit override, pubsub-only cluster, empty-string env (docker-compose `${VAR:-}` pattern), and the no-flag default. The 3 `ext_redis` tests assert the runtime-path swap: `init_app` calls `_create_pubsub_client(url, True)` when `normalized_pubsub_use_clusters` is True, `...(url, False)` when explicitly False, and skips pubsub client construction when the URL is absent.

End-to-end verification against a live Redis cluster: `docker run -p 7000-7005:7000-7005 grokzen/redis-cluster` (3 masters + 3 replicas, all 16384 slots assigned, `cluster_state:ok`), then ran the same `init_app` code path with the operator's reported env:
- `REDIS_USE_CLUSTERS=true`, `EVENT_BUS_REDIS_USE_CLUSTERS=""` (the post-fix docker-compose default)
  - upstream/main: `ValidationError: Input should be a valid boolean, unable to interpret input [input_value='']`, API fails to start
  - this PR: `normalized_pubsub_use_clusters=True`, pubsub client is `redis.cluster.RedisCluster`, live PUBLISH reaches the cluster
- `REDIS_USE_CLUSTERS=true`, `EVENT_BUS_REDIS_USE_CLUSTERS="false"` (the pre-fix docker-compose default)
  - `use_clusters=False`, pubsub client is standalone `redis.Redis`

Against a vanilla Redis Cluster, `redis-py`'s standalone client transparently follows MOVED redirects for data commands, and plain `PUBLISH` gossip-propagates across cluster nodes, so the misrouting does not always produce a hard failure in vanilla tests. The consequential failures reported in #35291 happen on managed cluster-only endpoints (AWS ElastiCache cluster-mode, GCP Memorystore cluster-mode, Kubernetes deployments behind a single LB endpoint) that reject non-cluster-protocol connections outright. The fix is to always use the correct client type when the operator has opted into cluster mode for the main Redis.

Static checks: `ruff format` + `check` clean on all three files, `basedpyright` and `mypy` clean on `configs/middleware/cache/redis_pubsub_config.py` and `extensions/ext_redis.py` (`Success: no issues found in 2 source files`).

Screenshots
N/A — backend-only config, docker template, and test change.
Checklist
`make lint && make type-check` (backend) and `cd web && pnpm exec vp staged` (frontend) to appease the lint gods

From Claude Code