Dynomite immediately dropping connections #456

Open
Aatish04 opened this issue Jun 21, 2017 · 17 comments

Aatish04 commented Jun 21, 2017

I am trying to configure Dynomite across two datacenters running on different machines; each datacenter has one rack with one node, and each node connects to a Redis instance running on its own machine. I am starting Dynomite with gossip enabled.
After starting Dynomite on both servers, it crashes as soon as the first request is made.

I am able to connect to Redis directly, but not to Dynomite:

On server 1
$ redis-cli -p 8102
127.0.0.1:8102> ping
(error) ERR Storage: Connection timed out
(3.00s)
127.0.0.1:8102> ping
Could not connect to Redis at 127.0.0.1:8102: Connection refused

On server 2
$ redis-cli -p 8102
127.0.0.1:8102> ping
Error: Connection reset by peer
$ redis-cli -p 8102
Could not connect to Redis at 127.0.0.1:8102: Connection refused

Below is my configuration

Server 1
dyn_o_mite:
  datacenter: dc1
  rack: rack1
  dyn_listen: 0.0.0.0:8101
  dyn_read_timeout: 200000
  dyn_seed_provider: simple_provider
  dyn_seeds:
  - dynomite-host2:8101:rack2:dc2:1383429731
  listen: 0.0.0.0:8102
  preconnect: true
  servers:
  - 127.0.0.1:22122:1
  auto_eject_hosts: true
  server_retry_timeout: 3000
  timeout: 3000
  tokens: '12345678'
  secure_server_option: datacenter
  pem_key_file: conf/dynomite.pem
  data_store: 0

Server 2
dyn_o_mite:
  datacenter: dc2
  rack: rack2
  dyn_listen: 0.0.0.0:8101
  dyn_read_timeout: 200000
  dyn_seed_provider: simple_provider
  dyn_seeds:
  - dynomite-host1:8101:rack1:dc1:12345678
  listen: 0.0.0.0:8102
  preconnect: true
  servers:
  - 127.0.0.1:22123:1
  auto_eject_hosts: true
  server_retry_timeout: 3000
  timeout: 3000
  tokens: '1383429731'
  secure_server_option: datacenter
  pem_key_file: conf/dynomite.pem
  data_store: 0
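
(For reference, each dyn_seeds entry above appears to follow the simple_provider format host:port:rack:dc:token, so a node's own tokens value should match the token its peer lists for it — Server 1's tokens: '12345678' matches the :12345678 suffix in Server 2's seed line, and vice versa. An annotated reading of Server 1's seed entry, purely as an illustration:

dyn_seeds:
- dynomite-host2:8101:rack2:dc2:1383429731   # peer-host : dnode-port : rack : dc : peer-token
)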

Below is the log for Server 1

[2017-06-21 12:50:58.137] conf_validate_pool:2156 setting read_consistency to default value:dc_one
[2017-06-21 12:50:58.137] conf_validate_pool:2163 setting write_consistency to default value:dc_one
[2017-06-21 12:50:58.137] conf_validate_pool:2198 setting env to default value:aws
[2017-06-21 12:50:58.137] conf_validate_pool:2208 setting reconciliation key file to default value:conf/recon_key.pem
[2017-06-21 12:50:58.137] conf_validate_pool:2213 setting reconciliation IV file to default value:conf/recon_iv.pem
[2017-06-21 12:50:58.137] load_private_rsa_key_by_file:68 Private RSA structure filled
[2017-06-21 12:50:58.138] stats_listen:1362 m 5 listening on '0.0.0.0:22222'
[2017-06-21 12:50:58.138] entropy_key_iv_load:364 Key File name: conf/recon_key.pem - IV File name: conf/recon_iv.pem
[2017-06-21 12:50:58.138] entropy_key_iv_load:419 key loaded: 0123456789012345
[2017-06-21 12:50:58.138] entropy_key_iv_load:427 iv loaded: 0123456789012345
[2017-06-21 12:50:58.138] entropy_listen:328 anti-entropy m 8 listening on '127.0.0.1:8105'
[2017-06-21 12:50:58.138] event_base_create:68 e 9 with nevent 1024
[2017-06-21 12:50:58.138] conn_connect:536 connecting to '127.0.0.1:22122:1' on p 10
[2017-06-21 12:50:58.138] proxy_init:124 p 11 listening on '0.0.0.0:8102' in redis pool 'dyn_o_mite'
[2017-06-21 12:50:58.138] dnode_init:112 dyn: p 12 listening on '0.0.0.0:8101' in redis pool 'dyn_o_mite' with 34832512 servers
[2017-06-21 12:50:58.138] preselect_remote_rack_for_replication:1838 my rack index 0
[2017-06-21 12:50:58.138] preselect_remote_rack_for_replication:1865 Selected rack rack2 for replication to remote region dc2
[2017-06-21 12:50:58.138] server_connected:450 connected on s 10 to server '127.0.0.1:22122:1'
[2017-06-21 12:51:15.730] proxy_accept:220 accepted CLIENT 15 on PROXY 11 from '127.0.0.1:50592'
[2017-06-21 12:51:15.731] _msg_get:290 alloc_msg_count: 1 caller: req_get conn: CLIENT sd: 15
[2017-06-21 12:51:15.731] redis_parse_req:1230 parsed unsupported command 'COMMAND'
[2017-06-21 12:51:15.731] redis_parse_req:1829 parsed bad req 1 res 1 type 0 state 5
00000000 2a 31 0d 0a 24 37 0d 0a 43 4f 4d 4d 41 4e 44 0d |*1..$7..COMMAND.|
00000010 0a |.|
[2017-06-21 12:51:15.731] core_recv:309 recv on CLIENT 15 failed: Invalid argument
[2017-06-21 12:51:15.731] core_close_log:343 close CLIENT 15 '127.0.0.1:50592' on event 00FF eof 0 done 0 rb 17 sb 0: Invalid argument
[2017-06-21 12:51:15.731] client_close:204 close c 15 discarding pending req 1 len 17 type 0
[2017-06-21 12:51:15.731] client_unref_internal_try_put:101 unref conn 0x2146f30 owner 0x21380a0 from pool 'dyn_o_mite'
[2017-06-21 12:51:17.640] proxy_accept:220 accepted CLIENT 15 on PROXY 11 from '127.0.0.1:50594'
[2017-06-21 12:51:20.641] core_timeout:417 req 2 on SERVER 10 timedout, timeout was 3000
[2017-06-21 12:51:20.641] core_close_log:343 close SERVER 10 '127.0.0.1:22122' on event FF00 eof 0 done 0 rb 0 sb 0: Connection timed out
[2017-06-21 12:51:20.641] _msg_get:290 alloc_msg_count: 2 caller: server_ack_err conn: SERVER sd: 10
[2017-06-21 12:51:20.641] server_ack_err:351 close SERVER 10 req 2:0 len 14 type 86 from c 15: Connection timed out
[2017-06-21 12:51:20.641] client_handle_response:256 ASSERTION FAILED: response 3:0 has peer set
[2017-06-21 12:51:20.641] dn_assert:342 assert '!rsp->peer' failed @ (dyn_client.c, 256)
[2017-06-21 12:51:20.641] msg_local_one_rsp_handler:979 Req 2:0 selected_rsp 3:0
[2017-06-21 12:51:20.641] server_close:405 close SERVER 10 Dropped 0 outqueue & 1 inqueue requests
[2017-06-21 12:51:28.138] gossip_loop:821 I am still joining the ring!
[2017-06-21 12:51:34.256] conn_connect:536 connecting to '127.0.0.1:22122:1' on p 10
[2017-06-21 12:51:34.256] conn_connect:536 connecting to 'dynomite-host2:8101:rack2:dc2:1383429731' on p 16
[2017-06-21 12:51:34.257] dn_stacktrace:326 [0] /lib64/libpthread.so.0() [0x3fb740f7e0]
[2017-06-21 12:51:34.259] dn_stacktrace:326 [1] src/dynomite(mbuf_remove+0x2b) [0x41eb2b]
[2017-06-21 12:51:34.262] dn_stacktrace:326 [2] src/dynomite(dnode_peer_gossip_forward+0x116) [0x416a26]
[2017-06-21 12:51:34.264] dn_stacktrace:326 [3] src/dynomite(dnode_peer_handshake_announcing+0x182) [0x416102]
[2017-06-21 12:51:34.266] dn_stacktrace:326 [4] src/dynomite(core_loop+0x6e) [0x40a26e]
[2017-06-21 12:51:34.268] dn_stacktrace:326 [5] src/dynomite(main+0x6d8) [0x42f338]
[2017-06-21 12:51:34.271] dn_stacktrace:326 [6] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3fb701ed1d]
[2017-06-21 12:51:34.272] dn_stacktrace:326 [7] src/dynomite() [0x4088f9]
[2017-06-21 12:51:34.275] signal_handler:132 signal 11 (SIGSEGV) received, core dumping

Below is the log for Server 2

[2017-06-21 12:51:10.390] conf_validate_pool:2156 setting read_consistency to default value:dc_one
[2017-06-21 12:51:10.390] conf_validate_pool:2163 setting write_consistency to default value:dc_one
[2017-06-21 12:51:10.390] conf_validate_pool:2198 setting env to default value:aws
[2017-06-21 12:51:10.390] conf_validate_pool:2208 setting reconciliation key file to default value:conf/recon_key.pem
[2017-06-21 12:51:10.390] conf_validate_pool:2213 setting reconciliation IV file to default value:conf/recon_iv.pem
[2017-06-21 12:51:10.390] load_private_rsa_key_by_file:68 Private RSA structure filled
[2017-06-21 12:51:10.390] stats_listen:1362 m 5 listening on '0.0.0.0:22223'
[2017-06-21 12:51:10.390] entropy_key_iv_load:364 Key File name: conf/recon_key.pem - IV File name: conf/recon_iv.pem
[2017-06-21 12:51:10.390] entropy_key_iv_load:419 key loaded: 0123456789012345
[2017-06-21 12:51:10.390] entropy_key_iv_load:427 iv loaded: 0123456789012345
[2017-06-21 12:51:10.390] entropy_listen:328 anti-entropy m 8 listening on '127.0.0.1:8105'
[2017-06-21 12:51:10.391] event_base_create:68 e 9 with nevent 1024
[2017-06-21 12:51:10.391] conn_connect:536 connecting to '127.0.0.1:22123:1' on p 10
[2017-06-21 12:51:10.391] proxy_init:124 p 11 listening on '0.0.0.0:8102' in redis pool 'dyn_o_mite'
[2017-06-21 12:51:10.391] dnode_init:112 dyn: p 12 listening on '0.0.0.0:8101' in redis pool 'dyn_o_mite' with 18210944 servers
[2017-06-21 12:51:10.391] preselect_remote_rack_for_replication:1838 my rack index 0
[2017-06-21 12:51:10.391] preselect_remote_rack_for_replication:1865 Selected rack rack1 for replication to remote region dc1
[2017-06-21 12:51:10.391] server_connected:450 connected on s 10 to server '127.0.0.1:22123:1'
[2017-06-21 12:51:34.247] dnode_accept:172 Accepting client connection from 10.131.12.47/55326 on sd 15
[2017-06-21 12:51:34.247] dnode_accept:217 dyn: accepted LOCAL_PEER_CLIENT 15 on PEER_PROXY 12 from '10.131.12.47:55326'
[2017-06-21 12:51:34.468] _msg_get:290 alloc_msg_count: 1 caller: req_get conn: LOCAL_PEER_CLIENT sd: 15
[2017-06-21 12:51:34.468] conn_recv_data:632 recv on sd 15 eof rb 0 sb 0
[2017-06-21 12:51:34.468] req_recv_next:346 c 15 is done
[2017-06-21 12:51:40.391] gossip_loop:821 I am still joining the ring!
[2017-06-21 12:51:44.040] proxy_accept:220 accepted CLIENT 16 on PROXY 11 from '127.0.0.1:38642'
[2017-06-21 12:51:44.040] conn_connect:536 connecting to 'dynomite-host1:8101:rack1:dc1:12345678' on p 17
[2017-06-21 12:51:44.041] dn_stacktrace:326 [0] /lib64/libpthread.so.0() [0x3fb740f7e0]
[2017-06-21 12:51:44.044] dn_stacktrace:326 [1] src/dynomite(mbuf_remove+0x2b) [0x41eb2b]
[2017-06-21 12:51:44.047] dn_stacktrace:326 [2] src/dynomite(dnode_peer_gossip_forward+0x116) [0x416a26]
[2017-06-21 12:51:44.049] dn_stacktrace:326 [3] src/dynomite(dnode_peer_handshake_announcing+0x182) [0x416102]
[2017-06-21 12:51:44.052] dn_stacktrace:326 [4] src/dynomite(core_loop+0x6e) [0x40a26e]
[2017-06-21 12:51:44.055] dn_stacktrace:326 [5] src/dynomite(main+0x6d8) [0x42f338]
[2017-06-21 12:51:44.057] dn_stacktrace:326 [6] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3fb701ed1d]
[2017-06-21 12:51:44.059] dn_stacktrace:326 [7] src/dynomite() [0x4088f9]
[2017-06-21 12:51:44.061] signal_handler:132 signal 11 (SIGSEGV) received, core dumping

It looks like gossip is not working, even though both hosts are reachable from each other.
I am using the master branch.


ipapapa commented Jun 21, 2017

A couple of questions:

  • Regarding [2017-06-21 12:51:15.731] redis_parse_req:1230 parsed unsupported command 'COMMAND' — what version of redis-cli are you using?
  • What version of Dynomite are you using?
  • Have you built Dynomite in Debug Mode?


Aatish04 commented Jun 22, 2017

What version of redis-cli are you using?
3.2.9

What version of Dynomite are you using?
This is dynomite-4757009 (built from the latest master).

Have you built Dynomite in Debug Mode?
Debug logs are enabled and assertions are disabled.

@Aatish04 (Author)

Hi @ipapapa, any clue about the above issue?
Is anything wrong with my configuration?


ipapapa commented Jun 26, 2017

A couple of things:

  1. Use the latest dev branch.
  2. Replace the DNS names with IP addresses; that should resolve the segfault (see the sketch after this list).
  3. Use an older version of redis-cli. I think the latest one issues a command we do not support (parsed unsupported command 'COMMAND'); we probably need to add support for it.
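
For illustration, item 2 applied to Server 1's config would look roughly like this; the 10.0.0.2 address is a placeholder for dynomite-host2's actual IP:

dyn_o_mite:
  ...
  dyn_seeds:
  - 10.0.0.2:8101:rack2:dc2:1383429731   # peer IP instead of the DNS name dynomite-host2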

@Aatish04 (Author)

Thanks @ipapapa.
Could you please let me know which version of Redis is compatible with Dynomite?
Also, I have configured DynoJedisClient to communicate with Dynomite and observed that all read operations work without issues, but whenever a write request comes in, one of the nodes crashes. However, the data that was written is available on that node.
Is this happening because of the Redis version?

Below is the stack trace:

[2017-06-27 06:19:52.683] conf_validate_pool:2163 setting write_consistency to default value:dc_one
[2017-06-27 06:19:52.683] conf_validate_pool:2198 setting env to default value:aws
[2017-06-27 06:19:52.683] conf_validate_pool:2208 setting reconciliation key file to default value:conf/recon_key.pem
[2017-06-27 06:19:52.683] conf_validate_pool:2213 setting reconciliation IV file to default value:conf/recon_iv.pem
[2017-06-27 06:19:52.684] load_private_rsa_key_by_file:68 Private RSA structure filled
[2017-06-27 06:19:52.684] stats_listen:1362 m 5 listening on '0.0.0.0:22223'
[2017-06-27 06:19:52.684] entropy_key_iv_load:364 Key File name: conf/recon_key.pem - IV File name: conf/recon_iv.pem
[2017-06-27 06:19:52.684] entropy_key_iv_load:419 key loaded: 0123456789012345
[2017-06-27 06:19:52.684] entropy_key_iv_load:427 iv loaded: 0123456789012345
[2017-06-27 06:19:52.684] entropy_listen:328 anti-entropy m 8 listening on '127.0.0.1:8105'
[2017-06-27 06:19:52.684] event_base_create:68 e 9 with nevent 1024
[2017-06-27 06:19:52.684] conn_connect:536 connecting to '127.0.0.1:22123:1' on p 10
[2017-06-27 06:19:52.684] proxy_init:124 p 11 listening on '0.0.0.0:8102' in redis pool 'dyn_o_mite'
[2017-06-27 06:19:52.685] dnode_init:112 dyn: p 12 listening on '0.0.0.0:8101' in redis pool 'dyn_o_mite' with 3004351808 servers
[2017-06-27 06:19:52.685] conn_connect:536 connecting to 'dynomite-host1:8101:rack1:dc1:12345678' on p 13
[2017-06-27 06:19:52.685] preselect_remote_rack_for_replication:1838 my rack index 0
[2017-06-27 06:19:52.685] preselect_remote_rack_for_replication:1865 Selected rack rack1 for replication to remote region dc1
[2017-06-27 06:19:52.685] server_connected:450 connected on s 10 to server '127.0.0.1:22123:1'
[2017-06-27 06:19:52.685] dnode_peer_connected:1013 dyn: peer connected on sd 13 to server 'dynomite-host1:8101:rack1:dc1:12345678'
[2017-06-27 06:19:52.759] proxy_accept:220 accepted CLIENT 16 on PROXY 11 from '175.100.189.106:52289'
[2017-06-27 06:19:52.830] proxy_accept:220 accepted CLIENT 17 on PROXY 11 from '175.100.189.106:52290'
[2017-06-27 06:19:58.289] core_close_log:343 close CLIENT 16 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:19:58.289] client_unref_internal_try_put:101 unref conn 0x1a6d130 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:19:58.296] core_close_log:343 close CLIENT 17 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:19:58.296] client_unref_internal_try_put:101 unref conn 0x1a6d370 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:20:02.364] _msg_get:290 alloc_msg_count: 1 caller: rsp_get conn: REMOTE_PEER_SERVER sd: 13
[2017-06-27 06:20:02.364] conn_recv_data:632 recv on sd 13 eof rb 0 sb 0
[2017-06-27 06:20:32.286] proxy_accept:220 accepted CLIENT 16 on PROXY 11 from '175.100.189.106:52315'
[2017-06-27 06:20:32.354] proxy_accept:220 accepted CLIENT 17 on PROXY 11 from '175.100.189.106:52316'
[2017-06-27 06:20:32.419] proxy_accept:220 accepted CLIENT 18 on PROXY 11 from '175.100.189.106:52317'
[2017-06-27 06:20:35.091] core_close_log:343 close CLIENT 18 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:20:35.091] client_unref_internal_try_put:101 unref conn 0x1a6d760 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:20:35.093] core_close_log:343 close CLIENT 16 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:20:35.093] client_unref_internal_try_put:101 unref conn 0x1a6d370 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:20:35.094] core_close_log:343 close CLIENT 17 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:20:35.094] client_unref_internal_try_put:101 unref conn 0x1a6d130 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:20:50.541] dnode_accept:172 Accepting client connection from dynomite-host1/57272 on sd 16
[2017-06-27 06:20:50.541] dnode_accept:217 dyn: accepted LOCAL_PEER_CLIENT 16 on PEER_PROXY 12 from 'dynomite-host1:57272'
[2017-06-27 06:21:02.490] proxy_accept:220 accepted CLIENT 17 on PROXY 11 from '175.100.189.106:52341'
[2017-06-27 06:21:02.559] proxy_accept:220 accepted CLIENT 18 on PROXY 11 from '175.100.189.106:52342'
[2017-06-27 06:21:02.630] proxy_accept:220 accepted CLIENT 19 on PROXY 11 from '175.100.189.106:52343'
[2017-06-27 06:21:32.694] proxy_accept:220 accepted CLIENT 20 on PROXY 11 from '175.100.189.106:52354'
[2017-06-27 06:21:32.760] proxy_accept:220 accepted CLIENT 21 on PROXY 11 from '175.100.189.106:52355'
[2017-06-27 06:21:32.833] proxy_accept:220 accepted CLIENT 22 on PROXY 11 from '175.100.189.106:52356'
[2017-06-27 06:21:36.380] core_close_log:343 close CLIENT 21 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:21:36.380] client_unref_internal_try_put:101 unref conn 0x1a6dcd0 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:21:36.381] core_close_log:343 close CLIENT 17 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:21:36.381] client_unref_internal_try_put:101 unref conn 0x1a6d370 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:21:36.381] core_close_log:343 close CLIENT 18 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:21:36.381] client_unref_internal_try_put:101 unref conn 0x1a6d760 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:21:36.382] core_close_log:343 close CLIENT 20 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:21:36.382] client_unref_internal_try_put:101 unref conn 0x1a6db00 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:21:36.384] core_close_log:343 close CLIENT 19 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:21:36.384] client_unref_internal_try_put:101 unref conn 0x1a6d930 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:21:36.387] core_close_log:343 close CLIENT 22 'unknown' on event FF00FF eof 0 done 0 rb 0 sb 0: Connection reset by peer
[2017-06-27 06:21:36.387] client_unref_internal_try_put:101 unref conn 0x1a6dea0 owner 0x1a5e0a0 from pool 'dyn_o_mite'
[2017-06-27 06:22:03.186] proxy_accept:220 accepted CLIENT 17 on PROXY 11 from '175.100.189.106:52362'
[2017-06-27 06:22:03.251] proxy_accept:220 accepted CLIENT 18 on PROXY 11 from '175.100.189.106:52363'
[2017-06-27 06:22:03.312] proxy_accept:220 accepted CLIENT 19 on PROXY 11 from '175.100.189.106:52364'
[2017-06-27 06:22:33.380] proxy_accept:220 accepted CLIENT 20 on PROXY 11 from '175.100.189.106:52380'
[2017-06-27 06:22:33.443] proxy_accept:220 accepted CLIENT 21 on PROXY 11 from '175.100.189.106:52381'
[2017-06-27 06:22:33.507] proxy_accept:220 accepted CLIENT 22 on PROXY 11 from '175.100.189.106:52382'
[2017-06-27 06:22:33.924] proxy_accept:220 accepted CLIENT 23 on PROXY 11 from '175.100.189.106:52383'
[2017-06-27 06:22:33.992] proxy_accept:220 accepted CLIENT 24 on PROXY 11 from '175.100.189.106:52385'
[2017-06-27 06:22:34.058] proxy_accept:220 accepted CLIENT 25 on PROXY 11 from '175.100.189.106:52388'
[2017-06-27 06:22:44.161] server_pool_update:511 update pool 'dyn_o_mite' to add 0 servers
[2017-06-27 06:22:44.161] _msg_get:290 alloc_msg_count: 2 caller: rsp_get conn: SERVER sd: 10
[2017-06-27 06:22:44.161] server_rsp_forward:916 c_conn 0x1a6d130 2:0 <-> 3:0
[2017-06-27 06:22:44.161] dn_assert:342 assert 'req->done && !req->swallow' failed @ (dyn_dnode_client.c, 531)
[2017-06-27 06:22:44.161] dn_stacktrace:326 [0] src/dynomite() [0x413113]
[2017-06-27 06:22:44.165] dn_stacktrace:326 [1] src/dynomite() [0x425158]
[2017-06-27 06:22:44.168] dn_stacktrace:326 [2] src/dynomite(msg_send+0x9a) [0x4252a0]
[2017-06-27 06:22:44.170] dn_stacktrace:326 [3] src/dynomite() [0x40aa73]
[2017-06-27 06:22:44.172] dn_stacktrace:326 [4] src/dynomite(core_core+0x1f6) [0x40b1c4]
[2017-06-27 06:22:44.174] dn_stacktrace:326 [5] src/dynomite(event_wait+0x18f) [0x44e50d]
[2017-06-27 06:22:44.177] dn_stacktrace:326 [6] src/dynomite(core_loop+0x30) [0x40b8e8]
[2017-06-27 06:22:44.179] dn_stacktrace:326 [7] src/dynomite() [0x43e11b]
[2017-06-27 06:22:44.181] dn_stacktrace:326 [8] src/dynomite(main+0x11c) [0x43e27f]
[2017-06-27 06:22:44.184] dn_stacktrace:326 [9] /lib64/libc.so.6(__libc_start_main+0xfd) [0x3fb701ed1d]


ipapapa commented Jun 27, 2017

Any Redis version is compatible... The problem is that the new redis-cli issues an extra command that we need to add support for. We use 3.0.7 on our systems. I will let @shailesh33 comment on what might have caused the segmentation fault. I see the following assert statement:

[2017-06-27 06:22:44.161] dn_assert:342 assert 'req->done && !req->swallow' failed @ (dyn_dnode_client.c, 531)

@shailesh33 (Contributor)

Have you tried v0.5.9? That is the most recent stable release. I believe I fixed a bunch of assert-related issues there.

@Aatish04 (Author)

@shailesh33 Thanks! I have now tried v0.5.9 and I am no longer getting the above error.
One doubt with this version: is gossip enabled by default in this release? It does not accept the -g parameter the way the earlier version did.


ipapapa commented Jun 29, 2017

Does using the enable_gossip: false parameter in the YAML work better?
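
A minimal sketch of where that setting would sit, assuming it goes at the top level of the pool alongside the other options:

dyn_o_mite:
  enable_gossip: false
  datacenter: dc2
  ...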

@shailesh33 (Contributor)

@Aatish04 can you check the parameter in the YAML file?


fozboz commented Jan 16, 2018

@shailesh33 Should this issue already be fixed? I am getting the same parsed unsupported command 'COMMAND' error when using redis-cli.

I'm unable to issue inline commands to Redis. E.g. redis-cli -h 192.168.1.1 -p 8102 set foo "bar" returns Error: Server closed the connection.

However, if I run redis-cli -h 192.168.1.1 -p 8102 I get the CLI prompt, can issue any commands, and everything works fine, though I still get the unsupported command error on the initial connection.
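
As a side note, one way to take redis-cli's startup behavior out of the picture when testing is to send the raw RESP bytes directly — the same *N/$len framing shown in the hex dumps above. A sketch, assuming netcat is installed:

$ (printf '*1\r\n$4\r\nPING\r\n'; sleep 1) | nc 192.168.1.1 8102

A healthy proxy should answer +PONG; this avoids the client ever sending COMMAND.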

This is the full error:

[2018-01-16 14:09:45.263] proxy_accept:214 <CONN_PROXY 0x12d0220 11 listening on '192.168.1.1:8102'> accepted <CONN_CLIENT 0x12d4960 20 from '192.168.1.2:39672'>
[2018-01-16 14:09:45.263] redis_parse_req:1263 parsed unsupported command 'COMMAND'
[2018-01-16 14:09:45.263] redis_parse_req:1888 parsed bad req 24 res 1 type 0 state 5
00000000 2a 31 0d 0a 24 37 0d 0a 43 4f 4d 4d 41 4e 44 0d |*1..$7..COMMAND.|
00000010 0a |.|
[2018-01-16 14:09:45.263] core_close:417 close <CONN_CLIENT 0x12d4960 20 from '192.168.1.2:39672'> on event FFFF eof 0 done 0 rb 17 sb 0: Invalid argument
[2018-01-16 14:09:45.263] client_unref_internal_try_put:94 <CONN_CLIENT 0x12d4960 -1 from '192.168.1.2:39672'> unref owner <POOL 0x12c20f0 'dyn_o_mite'>

Dynomite v0.6.2-5-ge811564
redis-server v3.2.3
redis-cli v3.2.3

Thanks!

@shailesh33 (Contributor)

@fozboz Yes, I am able to reproduce the issue. I presumed the fix would be about five lines of code, but it looks like the Redis response parser is unable to parse the multi-bulk response properly. I will try to give it a stab.

@shailesh33 (Contributor)

@fozboz It seems the response parser was not implemented properly; it fails on multi-level array responses.
Fixing this is a good amount of work. Is this critical for you, and could you give it a shot?


fozboz commented Jan 18, 2018

@shailesh33 Not a big issue for us, as we're going to use it as a back-end for Conductor. Can you point me in the direction of where I might start trying to fix this? Would it be something in dynomite/src/proto/dyn_redis.c?

@shailesh33 (Contributor)

Yes, that is the place to start; begin with parse_rsp.
Essentially, the parser starts well: it recognizes the multi-bulk response and reads the number of elements correctly. Then it starts to read the first element, which is itself a multi-bulk response, reads that element's "number of elements", overwrites the previously read value, and everything falls apart.

Essentially, it should keep track of the fact that it is reading a nested element, and so on recursively.
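
A minimal sketch of that idea in C — not Dynomite's actual code; the struct and function names here are hypothetical. The point is to keep a stack of remaining-element counts, so a nested *N header pushes a new frame instead of overwriting the outer count:

#include <stdio.h>

#define MAX_DEPTH 16

/* Hypothetical parser state: one "elements left" counter per
 * nesting level of a RESP multi-bulk (array) response. */
struct rsp_parser {
    int nleft[MAX_DEPTH]; /* remaining elements at each depth */
    int depth;            /* current nesting depth, -1 = no open array */
};

/* On a '*N' array header: push a new frame instead of
 * clobbering the outer element count. */
static int on_array_header(struct rsp_parser *p, int nelem)
{
    if (p->depth + 1 >= MAX_DEPTH)
        return -1; /* nested too deep */
    p->nleft[++p->depth] = nelem;
    return 0;
}

/* On completing one element (bulk string, integer, ...): consume it
 * from the current frame; a frame that reaches zero is itself one
 * finished element of its parent, so pop and repeat. */
static void on_element_done(struct rsp_parser *p)
{
    while (p->depth >= 0 && --p->nleft[p->depth] == 0)
        p->depth--;
}

int main(void)
{
    struct rsp_parser p = { .depth = -1 };

    /* Walk the shape of a nested reply *2 [ *2 [ $a $b ] $c ]: */
    on_array_header(&p, 2); /* outer *2 */
    on_array_header(&p, 2); /* first element is itself *2 */
    on_element_done(&p);    /* $a */
    on_element_done(&p);    /* $b: inner array done, consumed from outer */
    on_element_done(&p);    /* $c: outer array done */

    printf("depth after parse: %d (-1 means all arrays closed)\n", p.depth);
    return 0;
}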


shailesh33 commented Jan 18, 2018

@fozboz If you are excited enough to rewrite the entire parser, I will not stop you :)


xwiz commented Aug 24, 2020

In my case, I keep getting the following whenever I issue any command via redis-cli:

dn_assert:301 assert 'ncontinuum != 0' failed @ (dyn_vnode.c, 129)
dn_assert:301 assert 'idx < a->nelem' failed @ (../dyn_array.c, 135)

Using the php-redis client instead leads to:

redis_parse_req:1583 parsed unsupported command 'COMMAND'
redis_parse_req:2383 parsed bad req 1 res 1 type 0 state 5

This is on Ubuntu Focal Fossa with Dynomite v0.6.21rc2.

I'm not exactly sure I understand the proposed solutions above, @shailesh33.
