replicating redis 7.0.2 cause a sync loop #1680

Closed
cecchigl opened this issue Aug 9, 2023 · 4 comments · Fixed by #1744

cecchigl commented Aug 9, 2023

Describe the bug
Setting Dragonfly as a replica of a Redis 7.0.2 instance causes a sync loop.

To Reproduce
Steps to reproduce the behavior:

  1. Start redis 7.0.2 server
  2. Start dragonfly v1.8.0
  3. Run the command replicaof host port on Dragonfly (host and port are the Redis address)
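
Step 3 can also be scripted. The following is a minimal sketch, assuming Dragonfly accepts the standard RESP wire protocol on its client port; the helper names encode_resp_command and send_replicaof are hypothetical and not part of any Dragonfly or Redis tooling.

```python
import socket


def encode_resp_command(*args):
    """Encode a command as a RESP array of bulk strings."""
    parts = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = str(arg).encode()
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def send_replicaof(replica_host, replica_port, master_host, master_port):
    """Send REPLICAOF to the replica and return the raw first reply bytes.

    Hypothetical helper: it does no reply parsing and no retries.
    """
    payload = encode_resp_command("REPLICAOF", master_host, master_port)
    with socket.create_connection((replica_host, replica_port), timeout=5) as conn:
        conn.sendall(payload)
        return conn.recv(4096)
```

For example, send_replicaof("localhost", 6380, "10.42.6.106", 6379) would ask a Dragonfly instance assumed to listen on localhost:6380 to replicate from the Redis address seen in the logs below.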

Expected behavior
Full sync with redis db

Environment (please complete the following information):

  • Dragonfly container: docker.dragonflydb.io/dragonflydb/dragonfly:v1.8.0
  • Containerized?: Kubernetes
  • Dragonfly Version: 1.8.0
  • Redis Docker image: redis:7-alpine

Additional context
Dragonfly logs:

server_family.cc:2008] Replicating 10.42.6.106:6379
I20230809 14:14:36.637770 10 protocol_client.cc:219] Resetting endpoint! 10.42.6.106, 6379
I20230809 14:14:36.754384 10 replica.cc:331] Starting full sync
W20230809 14:15:42.008188 10 rdb_load.cc:2226] Unrecognized RDB AUX field: 'aof-base'
I20230809 14:21:38.797895 10 replica.cc:511] Transitioned into stable sync
E20230809 14:21:38.798534 10 protocol_client.cc:291] Could not shutdown socket system:107
E20230809 14:21:38.976694 10 protocol_client.cc:321] Socket error system:103
W20230809 14:21:38.976796 10 replica.cc:220] Error stable sync with 10.42.6.106:6379 system:103 Software caused connection abort
I20230809 14:21:39.476967 10 protocol_client.cc:219] Resetting endpoint! 10.42.6.106, 6379
I20230809 14:21:41.242125 10 replica.cc:331] Starting full sync
W20230809 14:22:55.181116 10 rdb_load.cc:2226] Unrecognized RDB AUX field: 'aof-base'
I20230809 14:27:31.509820 10 replica.cc:511] Transitioned into stable sync
E20230809 14:27:31.510350 10 protocol_client.cc:291] Could not shutdown socket system:107
E20230809 14:27:32.164361 10 protocol_client.cc:321] Socket error system:103
W20230809 14:27:32.164419 10 replica.cc:220] Error stable sync with 10.42.6.106:6379 system:103 Software caused connection abort
I20230809 14:27:32.664480 10 protocol_client.cc:219] Resetting endpoint! 10.42.6.106, 6379
I20230809 14:27:32.759276 10 replica.cc:331] Starting full sync
W20230809 14:28:44.290982 10 rdb_load.cc:2226] Unrecognized RDB AUX field: 'aof-base'
I20230809 14:33:20.009881 10 replica.cc:511] Transitioned into stable sync
E20230809 14:33:20.010550 10 protocol_client.cc:291] Could not shutdown socket system:107
E20230809 14:33:20.337633 10 protocol_client.cc:321] Socket error system:103
W20230809 14:33:20.337699 10 replica.cc:220] Error stable sync with 10.42.6.106:6379 system:103 Software caused connection abort
I20230809 14:33:20.839649 10 protocol_client.cc:219] Resetting endpoint! 10.42.6.106, 6379
I20230809 14:33:26.584765 10 replica.cc:331] Starting full sync
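
The loop in the log above can be measured by pairing each "Starting full sync" line with the following "Transitioned into stable sync" line. The sketch below is only an illustration: it assumes the glog-style prefix seen in these lines (severity letter, date, time, thread id, file:line), and the function name full_sync_durations is hypothetical.

```python
import re
from datetime import datetime

# Assumed glog-style prefix: severity, YYYYMMDD, HH:MM:SS.ffffff, thread id, file:line]
LINE_RE = re.compile(
    r"^(?P<sev>[IWEF])(?P<date>\d{8}) (?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+"
    r"(?P<tid>\d+) (?P<src>[\w.]+:\d+)\] (?P<msg>.*)$"
)


def full_sync_durations(lines):
    """Return the seconds spent in each full sync that reached stable sync."""
    durations = []
    started = None
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines whose prefix is truncated or missing
        stamp = datetime.strptime(
            match["date"] + " " + match["time"], "%Y%m%d %H:%M:%S.%f"
        )
        if match["msg"] == "Starting full sync":
            started = stamp
        elif match["msg"] == "Transitioned into stable sync" and started is not None:
            durations.append((stamp - started).total_seconds())
            started = None
    return durations
```

More than one entry in the returned list for a single replicaof call suggests that the replica keeps restarting the full sync.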
cecchigl added the bug label Aug 9, 2023
@romange
Collaborator

romange commented Aug 9, 2023

Looks like this field is set by Redis to "0" in the regular (non-AOF) case: https://github.com/redis/redis/blob/1b6bdff48dcc333affab02b945821eb865529de2/src/rdb.c#L3156

@royjacobson
Contributor

            /* AUX: generic string-string fields. Use to add state to RDB
             * which is backward compatible. Implementations of RDB loading
             * are required to skip AUX fields they don't understand.

So it appears we aren't doing that :) Do you think we should?
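
The rule in the quoted Redis comment (a loader must skip AUX fields it does not understand) can be illustrated with a toy parser. This is a simplified sketch and not Dragonfly's rdb_load.cc: it assumes AUX records are introduced by opcode 0xFA followed by two length-prefixed strings, handles only 6-bit and 14-bit length prefixes (integer and compressed string encodings are left out), and the KNOWN_AUX_FIELDS set and function names are illustrative assumptions.

```python
import io

RDB_OPCODE_AUX = 0xFA  # RDB opcode that introduces an AUX key/value pair

# Illustrative subset of AUX keys this toy loader acts on (an assumption).
KNOWN_AUX_FIELDS = {"redis-ver", "redis-bits", "ctime", "used-mem"}


def read_length(stream):
    """Read a 6-bit or 14-bit RDB length prefix; other encodings are omitted."""
    first = stream.read(1)
    if not first:
        raise EOFError("truncated length prefix")
    kind = first[0] >> 6
    if kind == 0:
        return first[0] & 0x3F
    if kind == 1:
        second = stream.read(1)
        if not second:
            raise EOFError("truncated 14-bit length")
        return ((first[0] & 0x3F) << 8) | second[0]
    raise NotImplementedError("longer lengths and special encodings are not handled here")


def read_string(stream):
    """Read one length-prefixed string and decode it as text."""
    length = read_length(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("truncated string")
    return data.decode()


def load_aux_fields(stream):
    """Consume consecutive AUX records; return (applied fields, skipped keys)."""
    applied, skipped = {}, []
    while True:
        opcode = stream.read(1)
        if not opcode:
            break
        if opcode[0] != RDB_OPCODE_AUX:
            stream.seek(-1, io.SEEK_CUR)  # leave the next opcode for the caller
            break
        key, value = read_string(stream), read_string(stream)
        if key in KNOWN_AUX_FIELDS:
            applied[key] = value
        else:
            skipped.append(key)  # unknown field: consume it and carry on
    return applied, skipped
```

An unknown key such as 'aof-base' ends up in the skipped list while parsing continues, which matches a loader that only logs a warning for it.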

@romange
Collaborator

romange commented Aug 9, 2023

seems like we should :)

@royjacobson
Contributor

Oops, we ARE correctly ignoring this field. When running locally I get the same messages:

I20230814 13:46:53.659343  5182 server_family.cc:2027] Replicating localhost:6379
I20230814 13:46:53.697757  5182 protocol_client.cc:219] Resetting endpoint! 127.0.0.1, 6379
I20230814 13:46:53.699290  5182 replica.cc:331] Starting full sync
W20230814 13:46:58.237557  5182 rdb_load.cc:2226] Unrecognized RDB AUX field: 'aof-base'
I20230814 13:46:58.288884  5182 replica.cc:511] Transitioned into stable sync

But no reconnection loop. @cecchigl, would you mind posting your Redis configuration and checking the Redis server's logs? To me it looks like the connection might have been closed from Redis's side for some reason.

romange added a commit that referenced this issue Aug 25, 2023
If an empty buffer is passed to the socket Recv function, it returns error 103.
Even if we returned success, this would lead to an endless loop, since the parser
requires more data to parse the load.
Fixes #1680

Signed-off-by: Roman Gershman <roman@dragonflydb.io>
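
The commit message describes a read loop that must never hand a zero-length buffer to the socket receive call while the parser still needs more data. The sketch below shows that general guard in Python; it is not the actual Dragonfly patch (which is C++), and read_until_complete and is_complete are hypothetical names. Here the guard grows the buffer; other strategies, such as consuming already-parsed bytes first, are equally possible.

```python
import socket


def read_until_complete(sock, is_complete, initial_capacity=4):
    """Receive into a buffer until is_complete(data) is true.

    The buffer is grown whenever it is full, so recv_into is never
    called with an empty destination.
    """
    buf = bytearray(initial_capacity)
    used = 0
    while True:
        if used == len(buf):
            # No free space left: allocate a larger buffer instead of
            # passing a zero-length view to the receive call.
            buf = buf + bytearray(len(buf))
        received = sock.recv_into(memoryview(buf)[used:])
        if received == 0:
            raise ConnectionError("peer closed the connection")
        used += received
        data = bytes(buf[:used])
        if is_complete(data):
            return data
```

A caller might pass is_complete=lambda d: d.endswith(b"\n") to read a single newline-terminated message.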
romange added a commit that referenced this issue Aug 25, 2023
…1744)
kostasrim pushed a commit that referenced this issue Aug 29, 2023
…1744)