
[RESEARCH] High memory consumption after frang configuration #2098

Open · ykargin opened this issue Apr 8, 2024 · 5 comments
Labels: bug, question (Questions and support tasks)
Milestone: 0.8 - Beta

@ykargin (Contributor) commented Apr 8, 2024

Motivation

After the frang configuration change in PR 598, tests started to fail with:

 [ 6570.228871] ksoftirqd/0: page allocation failure: order:9, mode:0x40a20(GFP_ATOMIC|__GFP_COMP), nodemask=(null),cpuset=/,mems_allowed=0
 [ 6570.229960] CPU: 0 PID: 12 Comm: ksoftirqd/0 Tainted: G           OE     5.10.35.tfw-04d37a1 #1
 [ 6570.230476] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.15.0-0-g2dd4b9b3f840-prebuilt.qemu.org 04/01/2014
 [ 6570.231459] Call Trace:
 [ 6570.231964]  dump_stack+0x74/0x92
 [ 6570.232458]  warn_alloc.cold+0x7b/0xdf

Research needed

Tests to reproduce: tls/test_tls_integrity.ManyClients and tls/test_tls_integrity.ManyClientsH2 with the -T 1 option and a body larger than 16 KB (on a machine with 8 GB of memory).

@krizhanovsky (Contributor) commented:

Looks like not enough memory.

krizhanovsky added the question (Questions and support tasks) label on Apr 8, 2024
krizhanovsky added this to the 0.8 - Beta milestone on Apr 8, 2024
@RomanBelozerov (Contributor) commented:

I receive a lot of "Warning: cannot alloc memory for TLS encryption." messages and the following traceback:

[26347.797820] CPU: 6 PID: 50 Comm: ksoftirqd/6 Kdump: loaded Tainted: P        W  OE     5.10.35.tfw-04d37a1 #1
[26347.797821] Hardware name: Micro-Star International Co., Ltd. GF63 Thin 11UC/MS-16R6, BIOS E16R6IMS.10D 06/23/2022
[26347.797822] Call Trace:
[26347.797829]  dump_stack+0x74/0x92
[26347.797831]  warn_alloc.cold+0x7b/0xdf
[26347.797834]  __alloc_pages_slowpath.constprop.0+0xd2e/0xd60
[26347.797835]  ? prep_new_page+0xcd/0x120
[26347.797837]  __alloc_pages_nodemask+0x2cf/0x330
[26347.797839]  alloc_pages_current+0x87/0xe0
[26347.797841]  kmalloc_order+0x2c/0x100
[26347.797842]  kmalloc_order_trace+0x1d/0x80
[26347.797843]  __kmalloc+0x3e9/0x470
[26347.797857]  tfw_tls_encrypt+0x7a2/0x820 [tempesta_fw]
[26347.797860]  ? memcpy_fast+0xe/0x10 [tempesta_lib]
[26347.797867]  ? tfw_strcpy+0x1ae/0x2b0 [tempesta_fw]
[26347.797870]  ? irq_exit_rcu+0x42/0xb0
[26347.797872]  ? sysvec_apic_timer_interrupt+0x48/0x90
[26347.797873]  ? asm_sysvec_apic_timer_interrupt+0x12/0x20
[26347.797880]  ? tfw_h2_make_frames+0x1da/0x370 [tempesta_fw]
[26347.797886]  ? tfw_h2_make_data_frames+0x19/0x20 [tempesta_fw]
[26347.797892]  ? tfw_sk_prepare_xmit+0x69c/0x7b0 [tempesta_fw]
[26347.797898]  tfw_sk_write_xmit+0x6a/0xc0 [tempesta_fw]
[26347.797900]  tcp_tfw_sk_write_xmit+0x36/0x80
[26347.797902]  tcp_write_xmit+0x2a9/0x1210
[26347.797903]  __tcp_push_pending_frames+0x37/0x100
[26347.797904]  tcp_push+0xfc/0x100
[26347.797910]  ss_tx_action+0x492/0x670 [tempesta_fw]
[26347.797912]  net_tx_action+0x9c/0x250
[26347.797914]  __do_softirq+0xd9/0x291
[26347.797915]  run_ksoftirqd+0x2b/0x40
[26347.797916]  smpboot_thread_fn+0xd0/0x170
[26347.797918]  kthread+0x114/0x150
[26347.797918]  ? sort_range+0x30/0x30
[26347.797919]  ? kthread_park+0x90/0x90
[26347.797921]  ret_from_fork+0x1f/0x30
[26347.797923] Mem-Info:
[26347.797925] active_anon:132045 inactive_anon:1833119 isolated_anon:0
                active_file:492217 inactive_file:119308 isolated_file:0
                unevictable:199 dirty:23 writeback:0
                slab_reclaimable:45118 slab_unreclaimable:41418
                mapped:244887 shmem:205996 pagetables:15978 bounce:0
                free:758589 free_pcp:3043 free_cma:0

@RomanBelozerov (Contributor) commented:

I get a memory leak for these tests with Tempesta commit 10b38e0. I used a remote setup (Tempesta on a separate VM) and the command ./run_tests.py -T 1 tls/test_tls_integrity.ManyClientsH2 with MTU 80. I ran this test with 16 KB, 64 KB, and 200 KB bodies; each run consumed all available memory (6 GB on my Tempesta VM) and leaked ~1 GB after the test.

It looks like this is fixed in #2105: with that PR I can no longer reproduce the leak, but I still see all available memory being used. I think Tempesta uses an unexpectedly large amount of memory in these tests: with 10 clients and a 64 KB request/response body, Python uses ~400 MB, but Tempesta uses ~5 GB. Why?

@biathlon3 (Contributor) commented:

Here is the situation for the 64 KB test.

In this test, Tempesta FW receives a 65536-byte request body from each of 10 clients, routes the requests to a server, gets the responses from the server, and sends them back to the clients.
With the -T 1 option, each request and response is split byte by byte.
The key point is that even when Tempesta FW receives only one byte, it consumes a full skb (about 900 bytes).

Tempesta FW therefore receives at least 655,360 skbs from the clients, which amounts to 655,360 * 900 = 589,824,000 bytes.
Tempesta FW copies all of these skbs in ss_skb_unroll() because they are all marked as cloned. Since the original skbs are marked as SKB_FCLONE_CLONE, they are not freed by consume_skb() at this point.
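
Schematically, the copy-on-clone step looks like this (a simplified sketch of the behavior described above, not the actual ss_skb_unroll() source):

    /* Simplified sketch: a cloned skb shares its data area with its
     * clone, so a private copy is needed before the data can be
     * modified in place. Not the real ss_skb_unroll() code. */
    #include <linux/skbuff.h>

    static struct sk_buff *unroll_one(struct sk_buff *skb)
    {
            if (skb_cloned(skb)) {
                    /* pskb_copy() duplicates the header and linear data.
                     * consume_skb() drops our reference, but for
                     * SKB_FCLONE_CLONE skbs the companion clone keeps the
                     * underlying allocation alive, so no memory is
                     * returned here. */
                    struct sk_buff *copy = pskb_copy(skb, GFP_ATOMIC);

                    if (!copy)
                            return NULL;
                    consume_skb(skb);
                    return copy;
            }
            return skb;
    }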

Next, before routing these skbs to the server, Tempesta FW clones them in ss_send() so that they can be resent if something goes wrong.
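
Schematically (again a simplified sketch; the queue and the transmit hook are made-up names, not Tempesta's real ones):

    /* Simplified sketch of the clone-for-retransmission pattern.
     * tcp_stack_xmit() is a hypothetical stand-in for handing the skb
     * to the TCP stack. */
    #include <linux/skbuff.h>

    int tcp_stack_xmit(struct sock *sk, struct sk_buff *skb); /* hypothetical */

    static int send_keep_original(struct sock *sk, struct sk_buff *skb,
                                  struct sk_buff_head *resend_queue)
    {
            struct sk_buff *clone = skb_clone(skb, GFP_ATOMIC);

            if (!clone)
                    return -ENOMEM;
            /* Keep the original for a possible resend; the clone goes
             * down the stack. Both share one data area until freed. */
            __skb_queue_tail(resend_queue, skb);
            return tcp_stack_xmit(sk, clone);
    }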

After the server has responded, Tempesta FW receives the same number of skbs from it as it did from the clients.
And since all of those skbs are also marked as cloned, it copies them as well.

At this point we have allocated at least 589,824,000 * 5 = 2,949,120,000 bytes (the original client skbs, their unroll copies, the ss_send() clones, the server response skbs, and their unroll copies), and only when Tempesta FW starts sending responses to the clients does it start freeing skbs.

@krizhanovsky (Contributor) commented:

@biathlon3 thank you for the detailed analysis! I still have a couple of questions and would appreciate your elaboration on them:

  1. skb_cloned() in ss_skb_unroll() comes under unlikely(), and IIRC this is because modern HW NICs form skbs with data in pages only (unfortunately, I don't remember why clones appear otherwise). So please research why clones appear in the network stack. Does moving to different virtual adapters (e.g. virtio-net or SR-IOV) help to avoid the clones? Please see https://tempesta-tech.com/knowledge-base/Hardware-virtualization-performance/ . Since virtual environments aren't rare, we probably need to remove the unlikely(), add comments to the code explaining why clones appear, and rework our wiki recommendations for virtual environments.
  2. What does an sk_buff spend 900 bytes on? Could you please write down how much memory each part of the SKB consumes and which Linux kernel compilation options may reduce the memory footprint (see the sketch below for a starting point)? This could probably be documented in our wiki.
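
For question 2, a starting point (a sketch, not Tempesta code): the kernel's own per-skb accounting is SKB_TRUESIZE() in include/linux/skbuff.h, which adds the cache-line-aligned sizes of struct sk_buff and struct skb_shared_info on top of the data buffer. A throwaway module can print the numbers for the running kernel and config:

    #include <linux/module.h>
    #include <linux/skbuff.h>

    /* Prints the fixed per-skb overhead: struct sk_buff (~224 B on
     * 5.10/x86-64) plus struct skb_shared_info (~320 B), both rounded
     * up to SMP_CACHE_BYTES by SKB_DATA_ALIGN(), on top of the
     * kmalloc-rounded data buffer. Exact sizes depend on the kernel
     * config (e.g. CONFIG_XFRM and CONFIG_NET_SCHED add sk_buff
     * fields). */
    static int __init skb_size_init(void)
    {
            pr_info("sk_buff=%zu shared_info=%zu truesize(1 byte)=%zu\n",
                    sizeof(struct sk_buff),
                    sizeof(struct skb_shared_info),
                    (size_t)SKB_TRUESIZE(1));
            return 0;
    }

    static void __exit skb_size_exit(void)
    {
    }

    module_init(skb_size_init);
    module_exit(skb_size_exit);
    MODULE_LICENSE("GPL");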
