Redis client can't connect to server #6018

Open
Supporterino opened this issue Jan 5, 2023 Discussed in #6013 · 11 comments

Comments

@Supporterino

Discussed in #6013

Originally posted by Supporterino January 3, 2023
Hello guys,

I am updating my Thanos stack to v0.30.0 and want to switch to Redis as the cache provider. I set up a Redis cluster on v7 with the Bitnami Helm chart. I am using the following cache configuration (the query-range cache, as an example):

config:
  addr: "redis-redis-cluster-0.redis-redis-cluster-headless:6379,redis-redis-cluster-1.redis-redis-cluster-headless:6379,redis-redis-cluster-2.redis-redis-cluster-headless:6379"
  password: "SECURE-PASSWORD"
  db: 0
  dial_timeout: 5s
  read_timeout: 3s
  write_timeout: 3s
  pool_size: 100
  min_idle_conns: 10
  idle_timeout: 5m0s
  max_conn_age: 0s
  max_get_multi_concurrency: 100
  get_multi_batch_size: 100
  max_set_multi_concurrency: 100
  set_multi_batch_size: 100
  tls_enabled: false
  cache_size: 1GiB
type: "REDIS"

But my Redis instance isn't getting any load, and the query frontend just logs the following:

level=error ts=2023-01-03T10:41:45.346637454Z caller=redis_cache.go:46 msg="error connecting to redis" name=redis err="got 4 elements in cluster info address, expected 2 or 3"
level=info ts=2023-01-03T10:41:45.347196342Z caller=query_frontend.go:339 msg="starting query frontend"
level=info ts=2023-01-03T10:41:45.347215738Z caller=intrumentation.go:56 msg="changing probe status" status=ready
level=info ts=2023-01-03T10:41:45.347303382Z caller=intrumentation.go:75 msg="changing probe status" status=healthy
level=info ts=2023-01-03T10:41:45.34734474Z caller=http.go:73 service=http/server component=query-frontend msg="listening for requests and metrics" address=0.0.0.0:9090
level=info ts=2023-01-03T10:41:45.347625667Z caller=tls_config.go:232 service=http/server component=query-frontend msg="Listening on" address=[::]:9090
level=info ts=2023-01-03T10:41:45.347645684Z caller=tls_config.go:235 service=http/server component=query-frontend msg="TLS is disabled." http2=false address=[::]:9090
level=error ts=2023-01-03T10:45:24.948875993Z caller=redis_cache.go:75 msg="failed to get from redis" name=redis err="got 4 elements in cluster info address, expected 2 or 3"
level=error ts=2023-01-03T10:45:25.407642957Z caller=redis_cache.go:103 msg="failed to put to redis" name=redis err="got 4 elements in cluster info address, expected 2 or 3"
level=error ts=2023-01-03T10:45:25.426421268Z caller=redis_cache.go:75 msg="failed to get from redis" name=redis err="got 4 elements in cluster info address, expected 2 or 3"
level=error ts=2023-01-03T10:45:25.519582802Z caller=redis_cache.go:75 msg="failed to get from redis" name=redis err="got 4 elements in cluster info address, expected 2 or 3"
level=error ts=2023-01-03T10:45:25.612233941Z caller=redis_cache.go:75 msg="failed to get from redis" name=redis err="got 4 elements in cluster info address, expected 2 or 3"
level=error ts=2023-01-03T10:45:25.709052914Z caller=redis_cache.go:75 msg="failed to get from redis" name=redis err="got 4 elements in cluster info address, expected 2 or 3"

What exactly am I missing?

@yeya24
Contributor

yeya24 commented Jan 5, 2023

Answered this in that discussion post.
It is related to redis/go-redis#2085. Redis v7 needs go-redis v9, but we are currently using go-redis v8.

For this use case we should upgrade the go-redis library. However, v9 does not look backward compatible, so upgrading would break Redis 6.x.
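
To illustrate the error in the logs above: to my understanding, Redis 7 added a fourth element (networking metadata such as hostnames) to each node entry in the CLUSTER SLOTS reply, and a parser that accepts only 2 or 3 elements rejects it. The sketch below is not go-redis code. `parseNodeEntry` is a hypothetical helper, and the `[]interface{}` shape of the reply is an assumption, used only to show a parser that tolerates extra trailing elements.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// parseNodeEntry is a hypothetical helper (not part of go-redis or Thanos).
// It reads one node entry of a CLUSTER SLOTS style reply: host, port, then an
// optional node ID. Any further elements (such as the metadata element that
// Redis 7 appends) are ignored instead of being treated as an error.
func parseNodeEntry(entry []interface{}) (addr string, id string, err error) {
	if len(entry) < 2 {
		return "", "", fmt.Errorf("got %d elements in node entry, expected at least 2", len(entry))
	}
	host, ok := entry[0].(string)
	if !ok {
		return "", "", fmt.Errorf("host has type %T, expected string", entry[0])
	}
	port, ok := entry[1].(int64)
	if !ok {
		return "", "", fmt.Errorf("port has type %T, expected int64", entry[1])
	}
	if len(entry) >= 3 {
		// The node ID is optional; a non-string value leaves id empty.
		id, _ = entry[2].(string)
	}
	return net.JoinHostPort(host, strconv.FormatInt(port, 10)), id, nil
}

func main() {
	// Illustrative entries only: a 3-element and a 4-element node entry.
	threeElems := []interface{}{"10.0.0.1", int64(6379), "node-a"}
	fourElems := []interface{}{"10.0.0.1", int64(6379), "node-a", []interface{}{"hostname", "redis-0"}}
	for _, e := range [][]interface{}{threeElems, fourElems} {
		addr, id, err := parseNodeEntry(e)
		fmt.Println(addr, id, err)
	}
}
```

Both entries resolve to the same address here. A strict length check would accept only the first one, which matches the failure reported against Redis 7.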

@Supporterino
Author

As a temporary workaround I downgraded our Redis cluster to 6.2.8, since it is only used for Thanos. Now it is working like a charm. Thanks for your help.
It might be useful to add a short note about this to the Redis cache section.

@kforsthoevel

Why does the Store Gateway work with Redis 7 when the Query Frontend does not?

@Schmitze333

Would it be an option to use this Redis client (https://github.com/rueian/rueidis) via the cacheutils internal package?

@yeya24
Contributor

yeya24 commented Feb 1, 2023

Yeah it would be great to use the same rueidis client in query frontend redis cache as well.

@Schmitze333

@yeya24 I'm working on a PR to use rueidis as the Redis client in the query-frontend as well, but something puzzles me about the Redis configs. I wonder whether this issue is the right place to discuss it, or rather a WIP PR.

@michalschott

michalschott commented Jun 8, 2023

Hi,

I recently tried to enable the Redis cache for the query-frontend component, and it failed with this error:

{"caller":"redis_cache.go:75","err":"ERR unknown command 'select', with args beginning with: '1' ","level":"error","msg":"failed to get from redis","name":"redis","ts":"2023-06-07T15:58:53.129616978Z"}

@douglascamata suggested there might be an incompatibility between client and server, so I tried these Redis versions, but none of them worked (same error):

  • 7.0.8
  • 6.2.12
  • 6.0.19

I was unable to test with <6.0 because the operator I'm using to deploy Redis to k8s does not support such old versions ;)

Thanos 0.31.0

@douglascamata
Contributor

After some back and forth with @michalschott in Slack, he found out that most of his problems come from using a Redis Cluster for HA.

So for anyone out there using Redis Cluster: you have to leave the DB unset, otherwise it'll fail with an error like so: "ERR SELECT is not allowed in cluster mode", which comes from the DB selection command.
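
The rule above could be caught before connecting. Redis Cluster only supports database 0, so any non-zero `db` makes the client send a SELECT that the server rejects. This is a minimal sketch under assumptions: `cacheConfig`, its `ClusterMode` field, and `validate` are hypothetical names for illustration and are not Thanos options or APIs.

```go
package main

import (
	"errors"
	"fmt"
)

// cacheConfig is a hypothetical stand-in for a Redis cache configuration.
// ClusterMode is an assumed flag for illustration, not a Thanos option.
type cacheConfig struct {
	DB          int  // database index; 0 means "unset" here
	ClusterMode bool // true when the address points at a Redis Cluster
}

// validate rejects a non-zero DB when targeting a Redis Cluster, because
// SELECT is not allowed in cluster mode.
func validate(c cacheConfig) error {
	if c.ClusterMode && c.DB != 0 {
		return errors.New("db must be left unset when using Redis Cluster: SELECT is not allowed in cluster mode")
	}
	return nil
}

func main() {
	fmt.Println(validate(cacheConfig{DB: 1, ClusterMode: true}))
	fmt.Println(validate(cacheConfig{DB: 0, ClusterMode: true}))
}
```

In the YAML configuration this corresponds to omitting the `db` key when the address points at a cluster.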

@dschaaff

I get errors using a v6 Redis cluster with the query frontend, even with the db unset. Example errors:

msg="failed to get from redis" name=redis err="MOVED 10784 10.0.200.62:6379"
msg="failed to put to redis" name=redis err="EXECABORT Transaction discarded because of previous errors."

This occurs when pointing query frontend at the same AWS Elasticache cluster I use for the store component. Happy to open a separate issue if needed.
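
For context on the first error: in Redis Cluster, a `MOVED <slot> <host:port>` reply means the key's hash slot is served by another node. A cluster-aware client follows the redirect, while a client treating the endpoint as a single node surfaces it as an error, which suggests the latter is happening here. The sketch below only parses such a reply. `redirect` and `parseRedirect` are hypothetical names, not part of go-redis, rueidis, or Thanos.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// redirect holds the fields of a Redis Cluster MOVED or ASK error reply.
type redirect struct {
	Kind string // "MOVED" or "ASK"
	Slot int    // hash slot, 0..16383
	Addr string // host:port of the node that serves the slot
}

// parseRedirect is a hypothetical helper. It returns false when msg is not a
// well-formed MOVED or ASK redirection.
func parseRedirect(msg string) (redirect, bool) {
	fields := strings.Fields(msg)
	if len(fields) != 3 || (fields[0] != "MOVED" && fields[0] != "ASK") {
		return redirect{}, false
	}
	slot, err := strconv.Atoi(fields[1])
	if err != nil || slot < 0 || slot > 16383 {
		return redirect{}, false
	}
	return redirect{Kind: fields[0], Slot: slot, Addr: fields[2]}, true
}

func main() {
	// Error text taken from the log line quoted in this comment.
	r, ok := parseRedirect("MOVED 10784 10.0.200.62:6379")
	fmt.Println(r.Kind, r.Slot, r.Addr, ok)
}
```

A cluster-aware client would retry the command against `r.Addr` and refresh its slot map after a MOVED reply.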

@douglascamata
Contributor

Hey folks, can you try again after #6520 got merged? Should be fixed, I believe.

@calvinbui

> Hey folks, can you try again after #6520 got merged? Should be fixed, I believe.

Not working for me with the exact same config as for the store gateway.


8 participants