sudo kubectl logs -f pao-redis-new-server-1 -n pod-services
Defaulted container "redis" out of: redis, sentinel, split-brain-fix, aof-repair, config-init (init)
1:C 06 Apr 2026 12:00:42.127 * oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
1:C 06 Apr 2026 12:00:42.127 * Redis version=7.2.4, bits=64, commit=00000000, modified=0, pid=1, just started
1:C 06 Apr 2026 12:00:42.127 * Configuration loaded
1:S 06 Apr 2026 12:00:42.128 * monotonic clock: POSIX clock_gettime
1:S 06 Apr 2026 12:00:42.128 * Running mode=standalone, port=6379.
1:S 06 Apr 2026 12:00:42.128 # WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.
1:S 06 Apr 2026 12:00:42.128 * Server initialized
1:S 06 Apr 2026 12:00:42.134 * Reading RDB base file on AOF loading...
1:S 06 Apr 2026 12:00:42.134 * Loading RDB produced by version 7.2.4
1:S 06 Apr 2026 12:00:42.134 * RDB age 302 seconds
1:S 06 Apr 2026 12:00:42.134 * RDB memory usage when created 2831.30 Mb
1:S 06 Apr 2026 12:00:42.134 * RDB is base AOF
1:S 06 Apr 2026 12:01:05.564 * Done loading RDB, keys loaded: 7422, keys expired: 0.
1:S 06 Apr 2026 12:01:05.564 * DB loaded from base file appendonly.aof.8.base.rdb: 23.433 seconds
1:S 06 Apr 2026 12:01:07.469 * DB loaded from incr file appendonly.aof.8.incr.aof: 1.905 seconds
1:S 06 Apr 2026 12:01:07.469 * DB loaded from append only file: 25.338 seconds
1:S 06 Apr 2026 12:01:07.469 * Opening AOF incr file appendonly.aof.8.incr.aof on server start
1:S 06 Apr 2026 12:01:07.469 * Ready to accept connections tcp
1:S 06 Apr 2026 12:01:07.977 * Connecting to MASTER 192.168.254.227:6379
1:S 06 Apr 2026 12:01:07.977 * MASTER <-> REPLICA sync started
1:S 06 Apr 2026 12:01:07.978 * Non blocking connect for SYNC fired the event.
1:S 06 Apr 2026 12:01:07.978 * Master replied to PING, replication can continue...
1:S 06 Apr 2026 12:01:07.978 * Partial resynchronization not possible (no cached master)
1:S 06 Apr 2026 12:01:12.677 * Full resync from master: 469d4f3b5479b4e010bdf9509f32661966fba63c:4148981196
1:S 06 Apr 2026 12:01:12.780 * MASTER <-> REPLICA sync: receiving streamed RDB from master with EOF to disk
1:signal-handler (1775476900) Received SIGTERM scheduling shutdown...
1:S 06 Apr 2026 12:01:41.059 * User requested shutdown...
1:S 06 Apr 2026 12:01:41.059 * Calling fsync() on the AOF file.
1:S 06 Apr 2026 12:01:41.059 # Redis is now ready to exit, bye bye...
~ # sudo kubectl describe po redis-new-server-1 -n services
Name: redis-new-server-1
Namespace: services
Priority: 4000
Priority Class Name: critical-services
Service Account: redis-new
Node: server3/192.168.11.3
Start Time: Mon, 06 Apr 2026 12:00:34 +0000
Labels: app=redis-ha
apps.kubernetes.io/pod-index=1
controller-revision-hash=redis-new-server-67744877f9
redis-new=replica
release=redis-new
statefulset.kubernetes.io/pod-name=redis-new-server-1
Annotations: checksum/init-config: dab9fb7c0beaa52722cd1ff7ccad151f8ab6bb8c7bd9830141a28c9e7f269c8e
Status: Running
IP: 192.168.250.219
IPs:
IP: 192.168.250.219
Controlled By: StatefulSet/redis-new-server
Init Containers:
config-init:
Container ID: containerd://82b4727be07282a7c78e78ef6f12d7cf8dff6115dc14010711c9738933cb9829
Image: public.ecr.aws/docker/library/redis:7.2.4-alpine
Image ID: public.ecr.aws/docker/library/redis@sha256:c8bb255c3559b3e458766db810aa7b3c7af1235b204cfdb304e79ff388fe1a5a
Port: <none>
Host Port: <none>
SeccompProfile: RuntimeDefault
Command:
sh
Args:
/readonly-config/init.sh
State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 06 Apr 2026 12:00:41 +0000
Finished: Mon, 06 Apr 2026 12:00:41 +0000
Ready: True
Restart Count: 0
Environment:
SENTINEL_ID_0: 4ec322c9018270bd52453c319fc8868df0f8727b
SENTINEL_ID_1: fcf3a9e314980721aa93e8ca6314b03a2143682d
SENTINEL_ID_2: 5c4bc6e744473d9b673b8e9a89867a8334a3f023
AUTH: <set to the key 'redis-password' in secret 'pao-redis-secret'> Optional: false
SENTINELAUTH: <set to the key 'redis-password' in secret 'pao-redis-secret'> Optional: false
Mounts:
/data from data (rw)
/readonly-config from config (ro)
Containers:
redis:
Container ID: containerd://e816ef254f85aa1634cbed9a62973e2d12b6a4973e59a6f8d426a050ed89ef45
Image: public.ecr.aws/docker/library/redis:7.2.4-alpine
Image ID: public.ecr.aws/docker/library/redis@sha256:c8bb255c3559b3e458766db810aa7b3c7af1235b204cfdb304e79ff388fe1a5a
Port: 6379/TCP
Host Port: 0/TCP
SeccompProfile: RuntimeDefault
Command:
redis-server
Args:
/data/conf/redis.conf
State: Running
Started: Mon, 06 Apr 2026 12:01:41 +0000
Last State: Terminated
Reason: Completed
Exit Code: 0
Started: Mon, 06 Apr 2026 12:00:42 +0000
Finished: Mon, 06 Apr 2026 12:01:41 +0000
Ready: False
Restart Count: 1
Limits:
cpu: 8
memory: 8000Mi
Requests:
cpu: 1
memory: 300Mi
Liveness: exec [sh -c /health/redis_liveness.sh] delay=20s timeout=6s period=5s #success=1 #failure=5
Readiness: exec [sh -c /health/redis_readiness.sh] delay=20s timeout=2s period=5s #success=1 #failure=5
Startup: exec [sh -c /health/redis_readiness.sh] delay=30s timeout=5s period=10s #success=1 #failure=3
Environment:
AUTH: <set to the key 'redis-password' in secret 'pao-redis-secret'> Optional: false
Mounts:
/data from data (rw)
/health from health (rw)
/readonly-config from config (ro)
sentinel:
Container ID: containerd://6dd56f036cc0ca82ddc7a813e315eb9d51bb7d872b3bee1ff3bebc2cb18c21d0
Image: public.ecr.aws/docker/library/redis:7.2.4-alpine
Image ID: public.ecr.aws/docker/library/redis@sha256:c8bb255c3559b3e458766db810aa7b3c7af1235b204cfdb304e79ff388fe1a5a
Port: 26379/TCP
Host Port: 0/TCP
SeccompProfile: RuntimeDefault
Command:
redis-sentinel
Args:
/data/conf/sentinel.conf
State: Running
Started: Mon, 06 Apr 2026 12:00:42 +0000
Ready: True
Restart Count: 0
Limits:
cpu: 1
memory: 1000Mi
Requests:
cpu: 50m
memory: 300Mi
Liveness: exec [sh -c /health/sentinel_liveness.sh] delay=20s timeout=6s period=5s #success=1 #failure=5
Readiness: exec [sh -c /health/sentinel_liveness.sh] delay=20s timeout=2s period=5s #success=3 #failure=5
Startup: exec [sh -c /health/sentinel_liveness.sh] delay=30s timeout=15s period=10s #success=1 #failure=3
Environment:
AUTH: <set to the key 'redis-password' in secret 'pao-redis-secret'> Optional: false
SENTINELAUTH: <set to the key 'redis-password' in secret 'pao-redis-secret'> Optional: false
Mounts:
/data from data (rw)
/health from health (rw)
split-brain-fix:
Container ID: containerd://261f112415883e9d5fe28c98e16c089d63ae402e2aab734252c5960a9c80b25a
Image: public.ecr.aws/docker/library/redis:7.2.4-alpine
Image ID: public.ecr.aws/docker/library/redis@sha256:c8bb255c3559b3e458766db810aa7b3c7af1235b204cfdb304e79ff388fe1a5a
Port: <none>
Host Port: <none>
SeccompProfile: RuntimeDefault
Command:
sh
Args:
/readonly-config/fix-split-brain.sh
State: Running
Started: Mon, 06 Apr 2026 12:00:42 +0000
Ready: True
Restart Count: 0
Liveness: exec [cat /readonly-config/redis.conf] delay=30s timeout=15s period=15s #success=1 #failure=5
Readiness: exec [sh -c test -d /proc/1] delay=30s timeout=15s period=15s #success=1 #failure=5
Environment:
SENTINEL_ID_0: 4ec322c9018270bd52453c319fc8868df0f8727b
SENTINEL_ID_1: fcf3a9e314980721aa93e8ca6314b03a2143682d
SENTINEL_ID_2: 5c4bc6e744473d9b673b8e9a89867a8334a3f023
AUTH: <set to the key 'redis-password' in secret 'pao-redis-secret'> Optional: false
SENTINELAUTH: <set to the key 'redis-password' in secret 'pao-redis-secret'> Optional: false
Mounts:
/data from data (rw)
/readonly-config from config (ro)
aof-repair:
Container ID: containerd://96d948fe8dca892d0ff929822e54b50790e843d327afd22335e59d001ca511b3
Image: public.ecr.aws/docker/library/redis:7.2.4-alpine
Image ID: public.ecr.aws/docker/library/redis@sha256:c8bb255c3559b3e458766db810aa7b3c7af1235b204cfdb304e79ff388fe1a5a
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
echo "Starting AOF check sidecar..."
while true; do
if ls /data/appendonlydir/*.aof* 1> /dev/null 2>&1; then
echo "AOF files detected. Attempting repair...";
for file in /data/appendonlydir/*.aof*; do
yes | redis-check-aof --fix "$file" && echo "redis-check-aof --fix success for $file";
done
else
echo "No AOF files found in /data/appendonlydir/, skipping repair.";
fi;
sleep 300
done
State: Running
Started: Mon, 06 Apr 2026 12:00:42 +0000
Ready: True
Restart Count: 0
Environment: <none>
Mounts:
/data from data (rw)
Conditions:
Type Status
PodReadyToStartContainers True
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: data-redis-new-server-1
ReadOnly: false
config:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: redis-new-configmap
Optional: false
health:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: redis-new-health-configmap
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 82s default-scheduler Successfully assigned services/redis-new-server-1 to pod-server3-49-6155-502-7kd2
Normal Pulled 77s kubelet Container image "public.ecr.aws/docker/library/redis:7.2.4-alpine" already present on machine
Normal Created 77s kubelet Created container config-init
Normal Started 76s kubelet Started container config-init
Normal Started 75s kubelet Started container split-brain-fix
Normal Created 75s kubelet Created container split-brain-fix
Normal Started 75s kubelet Started container aof-repair
Normal Pulled 75s kubelet Container image "public.ecr.aws/docker/library/redis:7.2.4-alpine" already present on machine
Normal Created 75s kubelet Created container sentinel
Normal Started 75s kubelet Started container sentinel
Normal Pulled 75s kubelet Container image "public.ecr.aws/docker/library/redis:7.2.4-alpine" already present on machine
Normal Created 75s kubelet Created container aof-repair
Normal Pulled 75s kubelet Container image "public.ecr.aws/docker/library/redis:7.2.4-alpine" already present on machine
Warning Unhealthy 17s (x3 over 37s) kubelet Startup probe failed: role=slave; repl=sync
Normal Killing 17s kubelet Container redis failed startup probe, will be restarted
Normal Pulled 16s (x2 over 76s) kubelet Container image "public.ecr.aws/docker/library/redis:7.2.4-alpine" already present on machine
Normal Created 16s (x2 over 76s) kubelet Created container redis
Normal Started 16s (x2 over 75s) kubelet Started container redis
# sudo kubectl get po -A | grep redis
services redis-new-server-0 4/4 Running 2 (13m ago) 17m
services redis-new-server-1 3/4 Running 1 (27s ago) 94s
services redis-new-server-2 4/4 Running 0 53m
Describe the bug
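After a restart, a Redis instance cannot finish reloading a dataset of 2 GB or more. The redis container is killed by its startup probe ("Startup probe failed: role=slave; repl=sync") while the replica is still doing a full resync from the master.
As a result, the cluster does not form a quorum cleanly.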
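As a rough check of the timing, the startup probe settings from the describe output can be compared with the AOF load time in the log. This is only a sketch: it assumes the kubelet allows roughly initialDelaySeconds + failureThreshold x periodSeconds before restarting the container (exact timing depends on probe scheduling), and it is a reading of the logs rather than a confirmed root cause.

```shell
# Values copied from the redis container's Startup probe in the describe output.
initial_delay=30      # delay=30s
period=10             # period=10s
failure_threshold=3   # #failure=3

# Approximate time the container gets before the kubelet restarts it (assumed formula).
startup_budget=$((initial_delay + failure_threshold * period))

# AOF load time from the log (25.338 seconds), rounded up to whole seconds.
aof_load=26

echo "startup budget: ${startup_budget}s"
echo "left for the full resync after AOF load: $((startup_budget - aof_load))s"
```

The computed budget of about 60 s is close to the 59 s between the container start (12:00:42) and the SIGTERM (12:01:41) in the log, which leaves roughly half a minute for a full resync of a multi-gigabyte dataset.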
To Reproduce
Steps to reproduce the behavior:
Deploy the dandydev/redis-ha chart:
helm install redis-new dandydev/redis-ha -n pod-services -f values.yaml
Once deployed, push random data into the Redis master using the command below:
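# Write 8200 keys of 256 KiB random data each using the raw Redis protocol.
for i in $(seq 1 8200); do
  key="randkey:$i"
  keylen=${#key}
  {
    # Single quotes keep "$3" and "$262144" literal; inside double quotes the
    # shell expands them as positional parameters and corrupts the protocol.
    printf '*3\r\n'
    printf '$3\r\nSET\r\n'
    printf '$%d\r\n%s\r\n' "$keylen" "$key"
    printf '$262144\r\n'
    head -c 262144 /dev/urandom
    printf '\r\n'
  }
done | kubectl exec -i -n services redis-new-server-0 -c redis -- redis-cli -a redis --pipe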
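For reference, the payload size written by the loop above can be worked out directly. This is plain arithmetic on the two constants in the command (8200 keys, 262144 bytes per value); it ignores key names and Redis memory overhead, so the in-memory size reported by Redis will be larger.

```shell
keys=8200            # loop count from the command above
value_bytes=262144   # bytes read from /dev/urandom per key (256 KiB)

# Raw value payload only; key names and Redis overhead are not counted.
total_bytes=$((keys * value_bytes))
total_mib=$((total_bytes / 1024 / 1024))

echo "payload: ${total_bytes} bytes (${total_mib} MiB)"
```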
Once all data is copied and the cluster is stable, delete the master pod and check whether all instances rejoin the cluster.
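The delete-and-check step can be sketched as a small helper. The names `run` and `failover_check` are hypothetical, introduced only for illustration. The sketch assumes the `app=redis-ha` label and the `AUTH` environment variable shown in the pod description, and it only prints the commands when `DRY_RUN=1` is set.

```shell
# Hypothetical helper: print each command, and run it unless DRY_RUN=1.
run() {
  echo "+ $*"
  [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

# Delete one pod, list the redis-ha pods, then print the replication state
# of the remaining pods.
# Arguments: namespace, pod to delete, then the pods to inspect afterwards.
failover_check() {
  ns=$1
  victim=$2
  shift 2
  run kubectl delete pod "$victim" -n "$ns"
  run kubectl get pods -n "$ns" -l app=redis-ha
  for pod in "$@"; do
    # AUTH is expanded inside the container, where the pod spec sets it
    # from the secret.
    run kubectl exec -n "$ns" "$pod" -c redis -- \
      sh -c 'redis-cli -a "$AUTH" info replication'
  done
}
```

For example, `DRY_RUN=1 failover_check services redis-new-server-0 redis-new-server-1 redis-new-server-2` prints the commands without touching the cluster. Without `DRY_RUN`, the `info replication` output shows each pod's role and, for replicas, the master link status.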
Expected behavior
All instances rejoin the cluster after the master pod is deleted, including replicas that must reload a dataset of 2 GB or more.
Additional context
The logs and pod description are included above.