[stable/redis-ha] improvement: refactor of redis-ha #7323
Conversation
Force-pushed from a662a78 to 46b8dd5
/assign @unguiculus
In my opinion there is no need for a masterGroupName different from previous installations (mymaster); keeping it unchanged saves a bit of configuring in applications that make use of it. From your PR I have made a local fork to test whether it solves our issues. I will update if I run into any new issues, but so far it seems to be running well. Thanks for your efforts so far!
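For illustration, keeping the historical default would look something like this (assuming the group name is exposed as the `redis.masterGroupName` value; that name is my reading of this PR, not confirmed documentation):

```bash
# Install the chart while pinning the sentinel master group name to the
# pre-refactor default so existing clients need no reconfiguration.
helm install stable/redis-ha --set redis.masterGroupName=mymaster
```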
Thanks for the PR, but it doesn't seem to work well in my cluster.
Fixes issues: race condition with masters; announce service no longer working; PVC possibilities; redis-ha doesn't fail over properly. Signed-off-by: Salim <salim.salaues@scality.com>
Signed-off-by: Salim <salim.salaues@scality.com>
Force-pushed from dcf1be7 to dfb6fdc
@KoviaX You are totally right; I had this configured for my own testing and accidentally left it in. I just pushed updates to fix this, rebased to sign off the commits, and fixed the merge conflicts. @ceshihao It looks like your slaves are down for some reason, which is why you're getting that error.
Yes, 3 pods were scheduled, and I cannot find any reason in the redis master/slave logs.
Redis slave log and redis master log attached.
@ceshihao From the logs, it looks to be working as intended; they are replicating to one another.
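For anyone who wants to verify replication the same way, it can be checked from any pod (pod and container names below are illustrative, not the chart's exact ones):

```bash
# Exec into a Redis pod and inspect its replication state.
kubectl exec -it my-release-redis-ha-server-0 -c redis -- redis-cli info replication
# A healthy master reports "role:master" with a nonzero "connected_slaves";
# replicas report "role:slave" and "master_link_status:up".
```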
Couple of problems we came across in testing:
@xsm74 In my experience, client libraries are typically able to discover the Redis master via the Sentinels. Since the Sentinels keep track of the Redis master, they can be queried for the current master, and in the case of a failover they will update the clients accordingly. And regarding the base64 discrepancy, it seems whoever originally wrote the instructions must have been running on a Mac.
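As a quick sketch of that discovery (assuming the default sentinel port 26379 and the default master group name mymaster), any sentinel can be asked for the current master:

```bash
# Ask a sentinel for the ip/port of the current master; after a failover,
# the same query returns the newly promoted master.
redis-cli -p 26379 sentinel get-master-addr-by-name mymaster
```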
command: ["redis-cli", "-p", "{{ .Values.sentinel.port }}", "ping"] | ||
initialDelaySeconds: 15 | ||
periodSeconds: 5 | ||
readiness: |
This should be `readinessProbe`
command: ["redis-cli", "ping"] | ||
initialDelaySeconds: 15 | ||
periodSeconds: 5 | ||
readiness: |
Ditto
@unguiculus I removed them for a couple of reasons:
If it makes it easier to merge, then I will drop that commit, as that is not the objective of this PR and was just an attempt to help more in the charts community.
OK, I guess it's fine then, given they haven't objected. After all, this PR has been open for quite some time.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ssalaues, unguiculus

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing `/approve` in a comment.
Thanks @unguiculus
@lakano Yeah, I just did some basic functionality tests with Redis 5.0 and it seems to be working as expected.
* improvement: refactor of redis-ha. Fixes issues: race condition with masters; announce service no longer working; PVC possibilities; redis-ha doesn't fail over properly. Signed-off-by: Salim <salim.salaues@scality.com>
* cleanup and default to mymaster. Signed-off-by: Salim <salim.salaues@scality.com>
* fixes: requested changes (readiness typo, move security context to pod level, and remove -x flag from init script). Signed-off-by: Salim <salim.salaues@scality.com>
* fix sentinel var name. Signed-off-by: Salim <salim.salaues@scality.com>
* docs: upgrade notes and refinements. Signed-off-by: Salim <salim.salaues@scality.com>
* improvement: update docs, various fixes, simpler init script, better resiliency. Fix notes; force failover in the event that the existing master is not accessible; simpler script, better failover, values update; fix corner cases where upgrades can fail; fix auth issue; clean up unused code; default to auth disabled to prevent a new password being generated on each upgrade. Signed-off-by: Salim <salim.salaues@scality.com>
* Fixed auth issues. Switched 'exit 0' to 'return 0' so that auth can be configured correctly. Fixed `SENTINEl` typo. Signed-off-by: Salim <salim.salaues@scality.com>
* fixes: update with best practices. Signed-off-by: Salim <salim.salaues@scality.com>
* updates: improved doc, consistent style, and added options for custom configmap files. Signed-off-by: Salim <salim.salaues@scality.com>
* add auth info to README. Signed-off-by: Salim <salim.salaues@scality.com>

Signed-off-by: Patrick Montanari <patrick.montanari@gmail.com>
The fact that we can't access the master directly via a service is a bit problematic, as @xsm74 mentioned, since we should be using services, rather than the pods themselves, to connect to the master for writes. @ssalaues, is there a way to get our clients to work well while connecting to the service?
Hello,
Apologies @pmontanari, I mis-copied; fixed now.
@alexvicegrab While there is no "master only" service after this PR, have you tried pointing your application at the sentinel port of the Redis service? Most Redis libraries support the use of sentinels and simply need to be pointed at the sentinel port instead of the Redis port. If the library natively supports Sentinel, it will query any sentinel to return the ip:port of the current master. This allows for failover at the application level, not only the Kubernetes level. I still like the idea of the "master only" service, but it will need some more thought before implementing, as there were some issues with the prior implementation.
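Beyond a one-shot query, a client or sidecar can also subscribe to Sentinel's event channels to react to failovers as they happen (the service name below is illustrative):

```bash
# Subscribe to failover announcements on any sentinel. On failover, Sentinel
# publishes the old and new master addresses on the +switch-master channel:
#   mymaster <old-ip> 6379 <new-ip> 6379
redis-cli -h my-release-redis-ha -p 26379 subscribe +switch-master
```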
How about a binary (maybe in Go) that listens for events from sentinel and changes the label on a pod, coupled with a service that targets the labelled pod (similar to v2 of this chart)? But I agree that the load-balanced kube service in front of the sentinel instances is the preferred way.
Thanks @ssalaues, I'll try to test with the Sentinels.
@ssalaues If clients running outside Kubernetes want to access Redis inside Kubernetes, the internal addresses of K8s pods returned by the sentinels cannot be used by those clients. Is such a use case supported? Clients can easily interact with K8s services from outside.
@alexvicegrab @rainhacker We have been using an HAProxy-based K8s service to expose the Redis cluster to external clients and also to provide master-only or slave-only access for "dumb" clients. You can see the chart in this fork. If it's worthy, I can look into making a PR.
@rainhacker Currently there is no support within the chart for this use case, but I think the HAProxy or a similar method would be a good approach.
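For reference, the usual HAProxy approach to master-only routing is a TCP health check that only passes on the node reporting role:master. This is a minimal sketch with illustrative server names (for example, per-pod announce services), not the linked chart's exact config:

```bash
# Write a minimal HAProxy backend that health-checks each Redis endpoint and
# routes traffic only to the instance currently reporting role:master.
cat > haproxy.cfg <<'EOF'
backend redis_master
    mode tcp
    option tcp-check
    tcp-check send PING\r\n
    tcp-check expect string +PONG
    tcp-check send info\ replication\r\n
    tcp-check expect string role:master
    server redis-0 redis-ha-announce-0:6379 check inter 1s
    server redis-1 redis-ha-announce-1:6379 check inter 1s
    server redis-2 redis-ha-announce-2:6379 check inter 1s
EOF
```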
@prodriguezdefino @ssalaues I was trying to use the haproxy-redis chart that @prodriguezdefino linked, but without any success so far. My knowledge of how HAProxy works is currently limited, so sorry for this question in advance; I just want to make sure I'm going in the right direction. I guess the haproxy-redis chart is not meant to be used as-is and was built for your specific needs, right (because sentinel doesn't seem to be exposed in it)? Could you let me know if I've got the correct idea in order to have it work with the current redis-ha implementation? My main point being that I don't have to talk to the master/slaves directly; my Redis client knows how to talk to sentinel:
Does that make sense? I think it would be worth including some documentation for this use case; in my experience this is a common way to set up Redis, especially on large projects with geo-redundancy (and I guess when people look at redis-ha it's usually because of specific needs for a resilient, large-scale architecture). And by the way, thanks for this refactor; I was using the chart in its initial version, and the way it works now is much better in my opinion.
Let me see if I can answer your questions, @Albi34: "I guess the haproxy-redis chart is not meant to be used as-is and was built for your specific needs, right (because sentinel doesn't seem to be exposed in it)? Could you let me know if I've got the correct idea in order to have it work with the current redis-ha implementation?" Of course this should work when configured correctly, so if you want to, you can open an issue on my repo and I can help you there =).
What this PR does / why we need it:
There are many issues with this chart, and the simple fact is that in reality it is not highly available as stated in the chart description/name. This refactor brings a simpler approach to a Redis master/slave configuration with Sentinel management, as the sentinels are simply deployed as sidecar containers alongside each Redis instance.
This provides native Redis management, failover, and election.
This also removes dependencies on very specific Redis images, thus allowing the use of any Redis image. All init scripting is now managed from a ConfigMap within this chart (see the sketch below).
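To give a feel for the approach, here is a rough sketch of the kind of bootstrap logic such an init script performs; the variable names, hostnames, and the mymaster group are illustrative, not the chart's exact script:

```bash
#!/bin/sh
# Ask any sentinel (reached via a service; SENTINEL_HOST is an assumption
# for this sketch) whether a master is already known.
MASTER="$(redis-cli -h "$SENTINEL_HOST" -p 26379 \
  sentinel get-master-addr-by-name mymaster | head -n 1)"

if [ -n "$MASTER" ]; then
    # A master exists: configure this instance as its replica.
    echo "slaveof $MASTER 6379" >> /data/conf/redis.conf
elif [ "$(hostname)" = "redis-ha-server-0" ]; then
    # No master known yet and this is the first pod: bootstrap as master.
    echo "pod 0 starting as initial master"
else
    # No master yet and not pod 0: fail and let Kubernetes restart us
    # until the sentinels have learned of a master.
    exit 1
fi
```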
Which issue this PR fixes (optional, in `fixes #<issue number>(, fixes #<issue_number>, ...)` format, will close that issue when PR gets merged): fixes #5441, fixes #3197, fixes #3403, fixes #2780, fixes #8240, fixes #8062, fixes #7968
Special notes for your reviewer:
Clarification edit:
~~
From my understanding of the redis-ha chart in its current state, it uses Docker images built from the repo here https://github.com/smileisak/docker-images/tree/master/redis/alpine, which uses an assortment of scripts to update labels, find masters/slaves, and start elections/promotions via kubectl commands, which require the appropriate roles and accounts to function in a typical RBAC-enabled environment. This is super cool in theory but has been the source of many issues (multiple masters being labeled and no failover being the primary ones I personally encountered).
The approach I took in this refactor is more of a Redis-native approach, where I tried to remove much of the complexity of the scripts and allow all election/promotion to be done through the Redis sentinels, with a small init script hosted here as a ConfigMap. I feel this also makes the chart easier to maintain and more dynamic, as it can be used with the official Redis image or really any image. While I don't think it's perfect, I think this puts the chart in a much more stable category than its current state.
As a result, I removed the RBAC roles and accounts; however, if they are necessary for other aspects that I did not encounter or immediately see, please feel free to point them out.
~~