-
Notifications
You must be signed in to change notification settings - Fork 103
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Safe handling of websockets drop #1797
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got panic running run_tests.py ws.test_ws_ping.CacheTest.test_ping_websockets
Looks like we use the server even if its connection closed. tfw_ws_conn_release
should have "backend management" code.
[20093.217396] kernel BUG at /home/constantine/projects/tempesta/fw/sock.c:456!
[20093.217940] invalid opcode: 0000 [#1] SMP NOPTI
[20093.218276] CPU: 2 PID: 7450 Comm: python3 Kdump: loaded Tainted: G OE 5.10.35+ #1
[20093.218867] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.1-1-1 04/01/2014
[20093.219542] RIP: 0010:ss_send+0x2bd/0x310 [tempesta_fw]
[20093.219905] Code: 89 44 24 08 eb 98 e8 52 ea d0 e0 85 c0 0f 85 5c 74 01 00 41 bc f4 ff ff ff e9 55 ff ff ff 48 c7 03 00 00 00 00 e9 f8 fe ff ff <0f> 0b be 01 00 00 00 48 89 ef e8 b4 de a7 e0 e9 39 fe ff ff 0f 0b
[20093.221189] RSP: 0018:ffffc900000f4a00 EFLAGS: 00010246
[20093.221557] RAX: 0000000000000000 RBX: ffffc900000f4b28 RCX: ffff88817530f878
[20093.222072] RDX: 0000000000000002 RSI: ffff88817530f030 RDI: 0000000000000000
[20093.222562] RBP: ffff88817530f020 R08: ffff8881602dd598 R09: 0000000000000001
[20093.223052] R10: 0000000000000001 R11: 000000000000000c R12: ffff8881602dd578
[20093.223542] R13: ffff8881602dd220 R14: ffff8881602dd330 R15: ffff8881602dd220
[20093.224037] FS: 00007f6f52f15740(0000) GS:ffff888277d00000(0000) knlGS:0000000000000000
[20093.224591] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[20093.224987] CR2: 00007f6f4fcf4018 CR3: 000000018eca8000 CR4: 0000000000350ee0
[20093.225477] Call Trace:
[20093.225654] <IRQ>
[20093.225813] tfw_http_req_fwd_send.constprop.0+0x51/0x2d0 [tempesta_fw]
[20093.226282] ? tfw_http_msg_hdr_xfrm+0xb8/0xf0 [tempesta_fw]
[20093.226677] tfw_http_conn_fwd_unsent+0xe7/0x3d0 [tempesta_fw]
[20093.227089] tfw_http_req_cache_cb+0x37c/0x640 [tempesta_fw]
[20093.227483] ? tfw_http_req_redir+0x7e0/0x7e0 [tempesta_fw]
[20093.227871] tfw_cache_process+0x146/0x290 [tempesta_fw]
[20093.228250] tfw_http_req_process+0x39d/0x820 [tempesta_fw]
[20093.228682] ? __coalesce_frag+0x99/0x150 [tempesta_fw]
[20093.229039] ? ss_skb_queue_coalesce_tail+0xe7/0x120 [tempesta_fw]
[20093.229457] tfw_connection_recv+0x53/0x90 [tempesta_fw]
[20093.229822] ss_tcp_process_data+0x1f1/0x400 [tempesta_fw]
[20093.230196] ss_tcp_data_ready+0x3f/0xa0 [tempesta_fw]
[20093.230552] tcp_data_queue+0x8d3/0xdf0
[20093.230820] tcp_rcv_established+0x234/0x650
[20093.231112] ? sk_filter_trim_cap+0xd2/0x220
[20093.231407] tcp_v4_do_rcv+0x127/0x1e0
[20093.231666] tcp_v4_rcv+0xc49/0xdb0
[20093.231906] ip_protocol_deliver_rcu+0x2b/0x1c0
[20093.232215] ip_local_deliver_finish+0x44/0x50
[20093.232517] __netif_receive_skb_one_core+0x85/0xa0
[20093.232851] process_backlog+0x96/0x170
[20093.233112] net_rx_action+0x142/0x3e0
[20093.233369] __do_softirq+0xd0/0x28a
[20093.233618] asm_call_irq_on_stack+0x12/0x20
[20093.233914] </IRQ>
[20093.234064] do_softirq_own_stack+0x32/0x40
[20093.234388] do_softirq+0x5e/0x70
[20093.234621] __local_bh_enable_ip+0x4b/0x50
[20093.234906] ip_finish_output2+0x193/0x540
[20093.235185] __ip_queue_xmit+0x17a/0x400
[20093.235454] ? __local_bh_enable_ip+0x4b/0x50
[20093.235751] __tcp_transmit_skb+0x9f0/0xb90
[20093.236037] tcp_write_xmit+0x408/0x1270
[20093.236305] __tcp_push_pending_frames+0x32/0xe0
[20093.236620] tcp_sendmsg_locked+0xa1d/0xb80
[20093.236907] tcp_sendmsg+0x27/0x40
[20093.237153] sock_sendmsg+0x54/0x60
[20093.237400] __sys_sendto+0xde/0x120
[20093.237649] ? init_wait_entry+0x30/0x30
[20093.237922] ? __sys_connect+0x8c/0xb0
[20093.238182] ? switch_fpu_return+0x40/0xb0
[20093.238463] __x64_sys_sendto+0x20/0x30
[20093.238728] do_syscall_64+0x33/0x80
[20093.238979] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[20093.239327] RIP: 0033:0x7f6f5329c44c
[20093.239574] Code: 89 02 48 c7 c0 ff ff ff ff eb b5 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 19 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 64 c3 0f 1f 00 55 48 83 ec 20 48 89 54 24 10
[20093.240835] RSP: 002b:00007fff37a14e58 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[20093.241346] RAX: ffffffffffffffda RBX: 00007fff37a14f10 RCX: 00007f6f5329c44c
[20093.241831] RDX: 0000000000000217 RSI: 0000000003969810 RDI: 0000000000000005
[20093.242314] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
[20093.242796] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
[20093.243277] R13: ffffffffc4653600 R14: 00007fff37a14f10 R15: 0000000000000001
[20093.243765] Modules linked in: tempesta_fw(OE) tempesta_db(OE) tempesta_tls(OE) tempesta_lib(OE) sha256_ssse3(E) sha512_ssse3(E) sha512_generic(E) uinput(E) joydev(E) hid_generic(E) usbhid(E) hid(E) rfkill(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) libaes(E) crypto_simd(E) cryptd(E) glue_helper(E) snd_hda_codec_generic(E) ledtrig_audio(E) snd_hda_intel(E) snd_intel_dspcfg(E) iTCO_wdt(E) soundwire_intel(E) intel_pmc_bxt(E) soundwire_generic_allocation(E) iTCO_vendor_support(E) snd_soc_core(E) watchdog(E) virtio_net(E) xhci_pci(E) snd_compress(E) net_failover(E) soundwire_cadence(E) snd_hda_codec(E) i2c_i801(E) xhci_hcd(E) snd_hda_core(E) i2c_smbus(E) pcspkr(E) usbcore(E) failover(E) virtiofs(E) virtio_balloon(E) snd_hwdep(E) ahci(E) soundwire_bus(E) libahci(E) virtio_console(E) snd_pcm(E) libata(E) snd_timer(E) snd(E) soundcore(E) scsi_mod(E) lpc_ich(E) qemu_fw_cfg(E) button(E) msr(E) parport_pc(E) ppdev(E) lp(E) parport(E) fuse(E) configfs(E) virtio_rng(E) rng_core(E)
[20093.243797] ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) virtio_gpu(E) virtio_dma_buf(E) drm_kms_helper(E) cec(E) drm(E) virtio_blk(E) evdev(E) crct10dif_pclmul(E) crct10dif_common(E) crc32c_intel(E) psmouse(E) serio_raw(E) virtio_pci(E) virtio_ring(E) virtio(E) [last unloaded: tempesta_lib]
[ 0.041690] Tempesta: cannot allocate 256 continous huge pages at node 0
[ 0.041929] Tempesta: cannot vmalloc area of 536870912 bytes at node 0
@const-t, i can't reproduce your crash with the same test |
The fix looks good, though i'm worried about the aforementioned crash. |
Yeah, I also tried The call stack has no any WS function and it seems the regular HTTP code path is broken. The crash is in the request processing and I don't see any changes except 5c2226e ... Need to dig deeper into the logic. If the bug can be reproduced, then it'll simplify debugging a lot. @const-t could you please share you config and any specs about your setup? |
@krizhanovsky @s0nx I think problem in this line. We must clear To reproduce I building tempesta with no debug flags. Also, maybe worth to change this line to config: const-t-vm-config.txt |
I got crash, after two hours tests, but i can't tell is it relates to this issue (i want to run tests all night long to check that it will not reproduced) : |
I reproduced exactly the same crash as @const-t reproduced on his kernel config. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
req_sent might be NULL and tfw_http_conn_on_hold() checks this, but still may return true and we may crash on NULL pointer dereference, so check req_sent before entering the if condition.
corresponding to the clinet connection pair. The reference counter is set under the server socket spinlock. This makes the server connection management safe and the server connection can no pass away from under the client connection. Client connection can not be freed until the upgrade response is received and processed (either response can be dropped or tfw_http_websocket_upgrade() called), so we never get freed client connection in tfw_http_websocket_upgrade(). The only possible problem is that tfw_http_websocket_upgrade() makes the connections pair, but does not acquire one more reference count for the client connection. This patch adds the client connection reference count and decrements it in tfw_ws_conn_drop(), when a server connection is dead.
about HTTP and is mostly about connections handling. Now we can export higher level tfw_http_websocket_upgrade() than tfw_ws_srv_new_steal_sk(); Add asserttion that websocket upgrade happens under server socket lock.
forwarding a response. This is required to keep the request in the seq_queue that the client connection will be referenced by at least the request paired with currently processed response. Move tfw_ws_cli_mod_timer() above the first user. More accurate and safe connection initialization and putting.
and client websocket connections: * tfw_ws_srv_ss_hook_drop() is used for server connections only, so it should use the same sequence of calls as tfw_sock_clnt_drop() for client connections. * on upgrade time both the server and client connections must have reference count at least 2: for pairing pointers and for downcall destruction. Only client connection may have refcounter > 2 (in case of pipelined HTTP requests after the upgrade request). * tfw_ws_conn_unpair() is responsible now for it's own connection only since it doesn't know the state of the pair. This makes the code robust against concurrent drop of the paired connections. * obviously, we should put connection only after closing it. * Small coding style cleanups and additional tracing.
has just recovered after destructor cull in tfw_ws_srv_new_steal_sk(). In this case tfw_ws_srv_new_steal_sk() might clear TFW_CONN_B_UNSCHED after setting it in concurrent tfw_http_req_fwd_send(), which just make the connection available for transmission. We still need clear_bit(TFW_CONN_B_UNSCHED) in normal operation since we have fixed and limited number of server connections and must make them available after the socket stealing.
DEBUG=3 DBG_HTTP=3 DBG_TLS=3 DBG_SS=3 DBG_WS=3 make kvmalloc_node() and kvfree() symbols can't be found.
just raises the reconnection timer and the srv_conn remains with sk == NULL. So this isn't a good place to clear the TFW_CONN_B_UNSCHED bit. This patch clears the TFW_CONN_B_UNSCHED on error in tfw_ws_srv_new_steal_sk() or when a server connection is established in tfw_sock_srv_connect_complete().
Fix for #1647 (see commit messages). Now
passes for local tests.