Can't create a basic redis failover, fail to sync with master #93
Comments
Hi, you just have to wait for the operator to bring the failover to the desired state. At the moment, it waits when a new redis-failover is created, for security reasons. After a few minutes, is the failover running?
Ok, it's working fine. We had an issue with an istio-proxy injected automatically, which was breaking the liveness probe. My bad. Thank you
Could you elaborate on why istio-proxy would break the probe and how you fixed it?
So, to put it back in context: we inject an istio-proxy by default in all our pods unless we add a specific annotation, because we have mostly HTTP services and we want it for those. But for redis, we don't actually need it. I can't really tell exactly why it breaks the probe (the probe is just a ping to hostname:port), but removing the istio-proxy did fix it; we might investigate further.
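The opt-out described above is usually done through Istio's per-pod injection annotation. A minimal sketch, assuming the workload's pod template can be annotated (the resource name here is hypothetical):

```yaml
# Hypothetical pod template excerpt. sidecar.istio.io/inject: "false"
# is the standard Istio annotation that opts a pod out of automatic
# sidecar injection in a namespace where injection is enabled.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-redis            # hypothetical name
spec:
  selector:
    matchLabels:
      app: my-redis
  template:
    metadata:
      labels:
        app: my-redis
      annotations:
        sidecar.istio.io/inject: "false"   # skip the istio-proxy sidecar
    spec:
      containers:
      - name: redis
        image: redis:3.2-alpine
```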
@francoispqt create a redis service (the operator only creates a service for sentinel) and it'll work. See https://github.com/istio/istio/blob/master/pilot/cmd/pilot-agent/status/ready/probe.go#L41-L65 for details. You will probably not use this service, but it'll get istio working.
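A hedged sketch of such a dummy redis service, assuming the operator's default labels (the service name and selector below are assumptions, not the operator's exact output):

```yaml
# Dummy service so Istio registers listeners for the redis pods.
# Name and selector labels are assumptions; adjust to match the
# labels the operator actually puts on the rfr-* pods.
apiVersion: v1
kind: Service
metadata:
  name: rfr-redisfailover
spec:
  selector:
    app: redis-failover      # assumed label on the redis pods
  ports:
  - name: redis              # Istio infers the protocol from the port name prefix
    port: 6379
    targetPort: 6379
```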
I'm having the exact same problem as the author.
The dummy service created looks as follows:
EDIT: After adding clusterIP: None to the dummy service, the redis master-slave replication was able to happen.
Which is strange, because a service already exists for them.
@Paic what version of the operator are you using? That error appears when the operator tries to connect directly to the redis instances to check their status. It happens when the redis container is not started yet, so it should disappear after a few moments. In the latest version, that behaviour has changed and it only checks the running pods, so it should not appear again.
I'm using the latest operator version (docker image :latest). As stated in my edit, the replication is now ok, but the same error occurs with sentinels. One thing I noticed is that the port in the sentinel service is named "sentinel". I tried renaming it to redis-sentinel (for Istio), but I don't know if the envoy proxies picked up the change (@iroller). EDIT: Looking at a sentinel's stats (INFO command), the total_connections_received counter kept growing: https://snag.gy/NeAgDb.jpg, which means the connection to redis can at least be made. After installing redis-cli in the operator, I was also able to connect to a sentinel (by DNS or service IP), but every command was resetting the connection: https://snag.gy/Io270F.jpg
The operator does not use the service to connect to the redis/sentinel pods, so it makes no difference whether a service exists for the redises or not. The operator ensures that what it creates remains unchanged, so if you edit it, it will be reverted.
I see. After further testing, that's why redis-cli using DNS works inside the operator (Istio created a listener for the sentinel service IP) but fails with a direct pod IP (there is no listener for each sentinel pod IP). If you create a dummy headless sentinel service, the operator is able to connect to them directly via the pod IP and everything looks fine. Thanks for the help :) EDIT: slaves are registered in sentinel as 127.0.0.1, which causes sentinel to detect them as down (obviously)
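The dummy headless sentinel service described above could look roughly like this (a sketch, assuming the operator's default sentinel port and labels; the name and selector are assumptions):

```yaml
# Headless service (clusterIP: None) so DNS resolves directly to the
# sentinel pod IPs instead of a virtual service IP, letting the
# operator reach each sentinel pod through Istio.
apiVersion: v1
kind: Service
metadata:
  name: rfs-redisfailover-headless   # hypothetical name
spec:
  clusterIP: None
  selector:
    app: redis-failover              # assumed label on the sentinel pods
  ports:
  - name: redis-sentinel             # Istio-friendly port name
    port: 26379
    targetPort: 26379
```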
I was able to solve with:
Expected behaviour
What do you want to achieve?
Create a basic RedisFailover, using the helm chart for the operator and the minimal example
Actual behaviour
What is happening? Are all the pieces created? Can you access to the service?
The operator is created. When creating the RedisFailover, the rfr-redisfailover-0 pod fails, which causes all rfr and rfs pods to fail.
Pod rfr-redisfailover-0 logs output:
Steps to reproduce the behaviour
Then created the RedisFailover with the following config:
Environment
How are the pieces configured?
Redis Operator version
0.5.2
Kubernetes version
v1.10.5
Kubernetes configuration used (eg: Is RBAC active?)
RBAC is active
Logs
From rfr redis container
From operator: