Valkey Sentinel: Fixing Auth/ACL gaps in TLS environments and security hardening (Enhancements for PR #137) by jose-10000 · Pull Request #1 · khtee/valkey-helm

jose-10000 · 2026-04-03T08:26:41Z

Your implementation of Sentinel in PR valkey-io#137 is a great addition to the chart. However, during testing in secured environments, I found that the stack encounters critical issues when the default user is disabled (a common security requirement).

Without these fixes, Sentinels are unable to discover each other or the master in Auth+TLS scenarios (the "empty array" issue). This PR addresses those gaps and introduces several security hardening measures and schema fixes to make the implementation production-ready.

Key Improvements & Bug Fixes

1. Security & ACL Logic (Auth + TLS Interop)

Native Inter-Sentinel Auth: Added support for sentinel-user and sentinel-pass. This is the missing link that allows Sentinels to form a quorum and gossip when TLS and ACLs are both active.
Role Segregation (Least Privilege): Introduced monitorUser for Sentinel-to-Valkey communication. This allows separating the Replication User (sync only) from the Monitor User (Pub/Sub and health), following the principle of least privilege.
Valkey 9 Compatibility: Added conditional logic to omit auth-user and sentinel-user when the user is default, preventing fatal config errors.
Defense in Depth: Verified support for disabling the default user (off resetchannels -@all) without breaking cluster discovery.

2. Credential Protection & Logging

Leak Prevention: Removed the cat "$SENTINEL_CONF" command from the init script to prevent passwords from being exposed in Kubernetes logs.
Log Redirection: Modified the log() function to output to stderr (>&2). This prevents log messages from polluting command substitutions or leaking sensitive data into configuration files.
SHA256 Hashing: Inbound Sentinel ACLs are now securely hashed with SHA256 before being written to the config, ensuring no plaintext passwords reside in the generated files.

3. Helm Schema & Configuration Fixes

Schema Validation: Fixed a missing comma in values.schema.json and added the sentinelAclUsers definition to pass validation.
HAProxy Tag Fix: Converted the HAProxy image tag from a number (2.9) to a string ("2.9") in values.yaml, fixing YAML parsing errors and ensuring helm lint passes.

Testing Status

Verified: Full Sentinel discovery and failover in a cluster with ACL enabled and TLS active.
Verified: No sensitive credentials leaked in kubectl logs.
Work in Progress: Currently reviewing the HAProxy watcher configuration with credentials. I will be performing further validation in the coming days to ensure the sidecar can properly authenticate and update backends in this secured setup.

fix(chart): resolve schema validation and template errors

- Updating HAProxy watcher for near-instant IP-based failover. - Refactoring init scripts to support dynamic topology and universal auth/TLS injection. - Adding smart L7 health checks in HAProxy to handle ACL-protected nodes. - Fully parameterizing Service and ConfigMap ports for end-to-end flexibility.

jose-10000 · 2026-04-07T12:37:54Z

Hi @khtee,

Following my initial comments on this PR, I have pushed an update to stabilize the Sentinel High Availability architecture. After testing in environments (with TLS and ACLs enabled), I’ve identified and fixed several edge cases where the cluster would fail to recover during failovers.

Here is a summary of the key improvements included in this update:

1. Sentinel-Aware Topology Management (`init.sh`)

Dynamic Role Assignment: Removed hardcoded replicaof during initial boot. Nodes now start in a "clean" state, allowing Sentinel to manage the topology without conflicting with static configuration.
State Persistence: The script now captures and restores existing replication states (SAVED_REPLICA_LINE) from persistent volumes. This ensures that Sentinel's runtime decisions survive pod restarts.
Universal Credential Injection: masterauth, masteruser, and tls-replication are now injected into all nodes (including the initial master). This is critical because any master can be demoted to a replica by Sentinel; without these credentials pre-configured, demoted nodes were failing to reconnect to the new primary.

2. Enhanced HAProxy Sentinel Watcher

Hostname-Based Tracking: Updated the watcher logic to track the Master by hostname instead of Sentinel index. This ensures accurate failover detection even if the Sentinel quorum is partially unstable.
DNS Bypass via Runtime API: The watcher now resolves hostnames to IPs and injects them directly into HAProxy's memory using socat. This prevents HAProxy from entering a "DNS NX maintenance" state, which is a common point of failure in Kubernetes when Pod IPs change.

3. Robust L7 Health Checks

Auth-Agnostic Validation: Rewrote the tcp-check using a smart regex: ^(\+PONG|-[Nn][Oo][Aa][Uu][Tt][Hh]|-[Ee][Rr][Rr]). This allows the read pool to stay healthy even when nodes require authentication, ensuring traffic isn't dropped due to -NOAUTH responses during the health check phase.
Fail-Fast Timeouts: Added specific backend timeouts to ensure health checks are more aggressive than client traffic, speeding up failure detection.

4. Full Parameterization

Dynamic Port Mapping: Fully parameterized both write and read ports across the Service, HAProxy ConfigMap, and Values. This removes the hardcoded 6379 targetPort constraint.

…notations in haproxy-deployment

jose-10000 · 2026-04-12T16:49:52Z

Hey! I've updated this branch with the correct DCO sign-offs and fixed the author email. Since you already merged the previous version, you need to force-push my new changes into your branch to clear the DCO error in the main PR.

You can do this by running:
git fetch origin
git reset --hard origin/fix/sentinel-acl-hardening-and-schema
git push origin [su-rama] --force

Have a nice day
Jose

jose added 4 commits April 2, 2026 16:45

feat(sentinel): add native ACL support with secure password handling

2421963

fix(chart): resolve schema validation and template errors

add Inter-Sentinel and Sentinel-Monitor authentication configuration

fccfbbc

updated

0e9d6f3

Clenning haproxy-service and added support for commonLabels and podAn…

d6b55d5

…notations in haproxy-deployment

deejay1 approved these changes Apr 9, 2026

View reviewed changes

khtee merged commit d8f854d into khtee:main Apr 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Valkey Sentinel: Fixing Auth/ACL gaps in TLS environments and security hardening (Enhancements for PR #137)#1

Valkey Sentinel: Fixing Auth/ACL gaps in TLS environments and security hardening (Enhancements for PR #137)#1
khtee merged 5 commits into
khtee:mainfrom
jose-10000:fix/sentinel-acl-hardening-and-schema

jose-10000 commented Apr 3, 2026

Uh oh!

jose-10000 commented Apr 7, 2026

Uh oh!

jose-10000 commented Apr 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

jose-10000 commented Apr 3, 2026

Key Improvements & Bug Fixes

1. Security & ACL Logic (Auth + TLS Interop)

2. Credential Protection & Logging

3. Helm Schema & Configuration Fixes

Testing Status

Uh oh!

jose-10000 commented Apr 7, 2026

1. Sentinel-Aware Topology Management (init.sh)

2. Enhanced HAProxy Sentinel Watcher

3. Robust L7 Health Checks

4. Full Parameterization

Uh oh!

jose-10000 commented Apr 12, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

1. Sentinel-Aware Topology Management (`init.sh`)