Redis server pods not replaced upon change of auth Secret in RedisFailover resource #658

SaberStrat · 2023-09-04T11:31:25Z

Expected behaviour

We wanted to fix a wrong Secret being used under spec.auth.secretPath for a RedisFailover resource that we deploy with a Helm chart. We created a new chart version referencing the correct Secret and deployed it over the existing release via helm upgrade ....

We expected the Redis server pods to be replaced with new pods that use the updated Secret path.

Actual behaviour

The Helm chart upgrade completed successfully. However, the Redis server pods were still the old ones, with the incorrect Secret path, even though the RedisFailover resource itself was updated with the new Secret path.

As a result, the application deployed by the Helm chart failed to start, because it could not connect to Redis with the wrong auth password.

Steps to reproduce the behaviour

1. Deployed a RedisFailover resource with one Secret path via Helm chart.
2. Upgraded the Helm chart release, where the only difference is a new Secret path used in the RedisFailover resource.

Environment

Redis Operator version 1.2.1
Kubernetes version 1.25.6

Logs

Because the RedisFailover remains deployed with mismatched Secrets between the RedisFailover resource and the Redis server pods, the operator controller's pod logs are spammed with this group of entries every 30s:

# kubectl logs redis-operator-54966b4cdb-kl5hm -n redis | grep error
...
time="2023-09-04T11:16:19Z" level=error msg="Get redis info failed, maybe this node is not ready, pod ip: 10.233.66.71" src="checker.go:112"
time="2023-09-04T11:16:19Z" level=error msg="Get redis info failed, maybe this node is not ready, pod ip: 10.233.67.220" src="checker.go:112"
time="2023-09-04T11:16:19Z" level=error msg="Get redis info failed, maybe this node is not ready, pod ip: 10.233.66.61" src="checker.go:112"
time="2023-09-04T11:16:19Z" level=error msg="Make new master failed, master ip: 10.233.66.71, error: WRONGPASS invalid username-password pair or user is disabled." src="checker.go:135"
time="2023-09-04T11:16:19Z" level=error msg="Make slave failed, slave pod ip: 10.233.67.220, master ip: 10.233.66.71, error: WRONGPASS invalid username-password pair or user is disabled." src="checker.go:135"
time="2023-09-04T11:16:19Z" level=error msg="Make slave failed, slave pod ip: 10.233.66.61, master ip: 10.233.66.71, error: WRONGPASS invalid username-password pair or user is disabled." src="checker.go:135"
time="2023-09-04T11:16:19Z" level=error msg="Get redis info failed, maybe this node is not ready, pod ip: 10.233.66.71" src="checker.go:149"
time="2023-09-04T11:16:19Z" level=error msg="Get redis info failed, maybe this node is not ready, pod ip: 10.233.67.220" src="checker.go:149"
time="2023-09-04T11:16:19Z" level=error msg="Get redis info failed, maybe this node is not ready, pod ip: 10.233.66.61" src="checker.go:149"
time="2023-09-04T11:16:19Z" level=error msg="error on object processing: number of redis nodes known as master is different than 1" controller-id=redisfailover object-key=default/changeme-redis operator=redisfailover service=kooper.controller src="controller.go:279"
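Since the same group of entries repeats every 30s, it can help to condense the log before reading it. The following is only a rough sketch: it assumes the key=value text format shown above, and the function names parse_fields and summarise_errors are made up for illustration. It counts error-level messages by their leading phrase and counts how many mention WRONGPASS.

```python
import re
from collections import Counter

# Matches key=value pairs where the value is either a quoted string or a bare token.
FIELD_RE = re.compile(r'([\w-]+)=("(?:[^"\\]|\\.)*"|\S+)')


def parse_fields(line):
    """Split one key=value log line into a dict (hypothetical helper)."""
    fields = {}
    for key, value in FIELD_RE.findall(line):
        # Strip the surrounding quotes from quoted values.
        if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
            value = value[1:-1]
        fields[key] = value
    return fields


def summarise_errors(lines):
    """Count error-level messages by leading phrase, plus WRONGPASS occurrences."""
    by_phrase = Counter()
    wrongpass = 0
    for line in lines:
        fields = parse_fields(line)
        if fields.get("level") != "error":
            continue
        msg = fields.get("msg", "")
        # The text before the first comma is used as a coarse category.
        by_phrase[msg.split(",")[0]] += 1
        if "WRONGPASS" in msg:
            wrongpass += 1
    return by_phrase, wrongpass


# Two lines copied from the log excerpt in this issue.
SAMPLE = [
    'time="2023-09-04T11:16:19Z" level=error msg="Get redis info failed, maybe this node is not ready, pod ip: 10.233.66.71" src="checker.go:112"',
    'time="2023-09-04T11:16:19Z" level=error msg="Make new master failed, master ip: 10.233.66.71, error: WRONGPASS invalid username-password pair or user is disabled." src="checker.go:135"',
]

phrases, wrongpass_count = summarise_errors(SAMPLE)
for phrase, count in phrases.most_common():
    print(count, phrase)
print("WRONGPASS entries:", wrongpass_count)
```
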

We've taken a look at the operator's source code, searching for the error strings. It looks like this might have to do with the logic used to get the MasterIP. It uses the password from the Secret referenced in the RedisFailover resource - which was successfully updated in our case - to access the Redis servers - which are still using the old password.

If the controller tries to do this before replacing the old Redis server pods, then it looks like "hot-swapping" a Secret for Redis auth might not be supported.
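To make the suspected failure mode concrete, here is a toy model in Python. It is not the operator's code: the RedisPod class, count_masters and check_single_master are hypothetical names, and the behaviour is an assumption based only on the log messages above. The idea is that a master lookup which authenticates with the new password finds zero masters when every pod still runs with the old one.

```python
from dataclasses import dataclass


@dataclass
class RedisPod:
    """Stand-in for a running Redis server pod (hypothetical)."""
    ip: str
    password: str  # password the server process was started with
    role: str      # "master" or "slave"

    def info_role(self, supplied_password):
        # Mimic Redis rejecting a client that authenticates with the wrong password.
        if supplied_password != self.password:
            raise PermissionError(
                "WRONGPASS invalid username-password pair or user is disabled."
            )
        return self.role


def count_masters(pods, operator_password):
    """Count pods that report the master role when queried with operator_password."""
    masters = 0
    for pod in pods:
        try:
            if pod.info_role(operator_password) == "master":
                masters += 1
        except PermissionError:
            # The role of this pod cannot be determined, so it is not counted.
            continue
    return masters


def check_single_master(pods, operator_password):
    """Raise if the number of visible masters is not exactly one."""
    if count_masters(pods, operator_password) != 1:
        raise RuntimeError(
            "number of redis nodes known as master is different than 1"
        )


# Pods were started with the old Secret; the resource now points at the new one.
pods = [
    RedisPod("10.233.66.71", "old-password", "master"),
    RedisPod("10.233.67.220", "old-password", "slave"),
    RedisPod("10.233.66.61", "old-password", "slave"),
]

print(count_masters(pods, "old-password"))
print(count_masters(pods, "new-password"))
```

In this model the check can never succeed until the pods are restarted with the new password, which would match the error repeating on every reconciliation.
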

ggramal · 2023-09-13T11:16:59Z

I think we have the exact same issue. The difference is that we wanted to add auth to Redis instances that were created without it, and the rfr-<redis_name> StatefulSet was never changed (REDIS_PASSWORD was not added).
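One way to confirm this condition is to inspect the StatefulSet manifest for the environment variable. The sketch below assumes the manifest is available as a Python dict (for example loaded with json.load from the output of kubectl get statefulset rfr-<redis_name> -o json). The helper name containers_missing_env and the container names in the demo are hypothetical. Not every container necessarily needs the variable, so the result is only a hint.

```python
def containers_missing_env(statefulset, env_name="REDIS_PASSWORD"):
    """Return the names of containers in a StatefulSet manifest that lack env_name."""
    pod_spec = statefulset.get("spec", {}).get("template", {}).get("spec", {})
    missing = []
    for container in pod_spec.get("containers", []):
        # "env" may be absent or null when a container defines no variables.
        defined = {entry.get("name") for entry in container.get("env") or []}
        if env_name not in defined:
            missing.append(container.get("name", "<unnamed>"))
    return missing


# Minimal hand-written manifest for illustration only.
example = {
    "kind": "StatefulSet",
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {"name": "redis", "env": [{"name": "OTHER_VAR", "value": "x"}]},
                    {"name": "sidecar"},
                ]
            }
        }
    },
}

print(containers_missing_env(example))
```
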

Redis Operator version 1.2.4
Kubernetes version 1.26.5

github-actions · 2023-11-11T01:48:20Z

This issue is stale because it has been open for 45 days with no activity.

github-actions · 2023-11-25T01:48:26Z

This issue was closed because it has been inactive for 14 days since being marked as stale.

github-actions bot added the stale label Nov 11, 2023

github-actions bot closed this as not planned Nov 25, 2023
