Describe the bug
We're currently running an AWS ElastiCache Redis cluster with multiple shards and nodes, which works great until a scheduled update event, when we start receiving a lot of Predis\Connection\ConnectionException: Connection timed out [tls://clustercfg.##############.eh01hp.use1.cache.amazonaws.com:6379] errors while one of the nodes in a shard is being updated/patched.
It appears that even though there are other nodes in the shard and the rest of the cluster is healthy, requests fail until the server is updated and returned to rotation. Our current Predis setup passes the AWS ElastiCache configuration endpoint as the host value when initialising Predis. The ElastiCache cluster configuration endpoint is essentially a CNAME that returns the A records of all the nodes in the cluster (including the node that has been taken out of service).
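For reference, a minimal sketch of the setup described above, assuming Predis v1.x with the `'cluster' => 'redis'` option; the endpoint hostname is a placeholder, not the real one:

```php
<?php
// Sketch: pass the ElastiCache cluster configuration endpoint as the
// single host, letting Predis speak the Redis Cluster protocol.
// The hostname below is a placeholder.
require 'vendor/autoload.php';

$client = new Predis\Client(
    ['tls://clustercfg.example.use1.cache.amazonaws.com:6379'],
    ['cluster' => 'redis'] // use the Redis Cluster backend
);

$client->set('foo', 'bar');
```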
I suspect what's happening is that Predis (due to the rotating order of the A records) is attempting to query a single node when making the request to the cluster and is unable to autodiscover the other nodes. If that's the case, the fix may be as simple as changing the way we initialise Predis so that it contains all the nodes returned by the configuration endpoint.
On that note, does Predis cache the autodiscovered nodes from the cluster for a period of time, as phpredis does? I wasn't able to find anything in the documentation.
Lastly, if I'm on the right track about the DNS issue being the problem, would you be open to having the library enumerate the nodes from a DNS name that returns multiple A records, either via a flag at initialisation or by default?
I've hunted for any real documentation on this specific issue and come up with basically nothing.
To Reproduce
Steps to reproduce the behavior:
Run an ElastiCache cluster using the cluster configuration endpoint as the host value for Predis. Then perform patching/maintenance on a node.
Expected behavior
Predis should continue serving traffic to the other nodes in the cluster (when they're promoted) and, at the very least, not return a timeout error.
Versions (please complete the following information):
v1.1.7
Code sample
If applicable, a small snippet of code that reproduces the issue.
Additional context
Add any other context about the problem here.
Well, it doesn't necessarily need to be implemented in Predis; it could be up to the wrapping code to enumerate these values. I guess what I'm trying to determine is whether the aforementioned issue is the expected result, and if so, that's ok.
I think it could be dangerous to change this behaviour if others are expecting it to behave consistently.
It could be a case of just updating the docs to call this out, with an example of how to enumerate the records before handing them off.
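For example, the wrapping code could resolve the configuration endpoint up front with PHP's `dns_get_record()` and hand every A record to Predis, along these lines (a sketch only: the hostname is a placeholder and error handling is omitted):

```php
<?php
// Resolve the configuration endpoint to all of its A records and pass
// every node to Predis, instead of relying on whichever single record
// DNS rotation happens to return first.
require 'vendor/autoload.php';

$endpoint = 'clustercfg.example.use1.cache.amazonaws.com'; // placeholder

$hosts = [];
foreach (dns_get_record($endpoint, DNS_A) as $record) {
    $hosts[] = 'tls://' . $record['ip'] . ':6379';
}

$client = new Predis\Client($hosts, ['cluster' => 'redis']);
```

One caveat with this approach: connecting to raw IPs over TLS may fail certificate peer-name verification, since the certificate is issued for the hostname, so the connection parameters would likely also need the expected peer name set.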