OpenOnload on non solarflare adapter #28

Open
hjastenger opened this issue May 29, 2021 · 55 comments

Comments

@hjastenger

I feel like an idiot for asking this here, but I could not find a conclusive answer anywhere. Is it possible to run OpenOnload on non-Solarflare / Xilinx hardware? I.e., if I set up my hosting on AWS, is it possible to configure OpenOnload to work with the ENA driver that automatically comes with that machine?

@rhughes-xilinx

Hi. Yes, Onload has an AF_XDP backend, which makes it possible to run it on any NIC.

AF_XDP support comes in 3 flavours, in increasing performance: generic, in-driver, and zero-copy. Last time I looked (admittedly a while ago) the ena driver supported neither in-driver nor zero-copy, so you're stuck with the slowest option, provided generically by Linux. Onload should still be faster than native sockets, though.
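
One rough way to see which of these flavours a given driver can do (a sketch, not an Onload-specific check; prog.o stands for any compiled XDP object and eth0 for the interface under test) is to try attaching a program in each mode with iproute2:

ip link set dev eth0 xdpdrv obj prog.o sec xdp      # native / in-driver XDP; fails if the driver lacks support
ip link set dev eth0 xdpgeneric obj prog.o sec xdp  # generic (skb-mode) XDP; works with any driver
ip link set dev eth0 xdp off                        # detach the program again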

@hjastenger
Author

Spent a couple of hours today trying to get the setup working. Noticed that the default ENA driver provided by Amazon was a bit out of date. After updating it, compiling from this repository and running it, I had the onload_cp_server running. Adding my eth0 yielded the following issue in dmesg: [ 905.128865] [sfc efhw] xdp_set_link: eth0 does not support XDP. I'm a bit stuck currently, any idea why this is happening?

I was under the impression that after updating both the kernel (Red Hat) and the ENA driver I should have XDP support. I'm also a bit unsure how to check whether my current setup supports XDP; grepping for CONFIG_XDP_SOCKET in my boot config suggests it should be enabled. I was following this guide to check / enable XDP on ena: https://trying2adult.com/what-is-xdp-and-how-do-you-use-it-in-linux-amazon-ec2-example/

Any idea why it looks like my eth0 doesn't support XDP? Any help is appreciated, I'm a bit stuck at the moment. Thanks in advance.
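
For anyone hitting the same wall, a few quick sanity checks (hedged; the paths assume a typical RHEL-style /boot layout):

grep CONFIG_XDP_SOCKETS /boot/config-$(uname -r)   # AF_XDP socket support in the kernel
grep CONFIG_BPF_SYSCALL /boot/config-$(uname -r)   # BPF syscall enabled
ethtool -i eth0                                    # driver name and version actually bound to eth0
ip link show dev eth0                              # an attached XDP program shows up as xdp / xdpgeneric here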

@rhughes-xilinx

I'm sorry, it looks like I gave you bad info - re-checking the code, we do currently need in-driver support for XDP. As I see it you have a few options:

  1. Give up
  2. Enhance Onload to make it work with the generic XDP. It's possible that this might prove difficult if, for example, the necessary kernel APIs do not exist for Onload to get the details it needs.
  3. Add XDP support to the ena driver. A minimal implementation may not be a massive amount of work
  4. Switch to an instance type with an Intel NIC. See https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/enhanced-networking.html
  5. Switch to Azure or GCP, both of which also offer NICs with better drivers

@rhughes-xilinx

OK, perhaps I ought to stop talking, since every time I (figuratively) open my mouth I give you bad advice, but I'm going to try again:

The current ena driver (in Linus's tree) does have in-driver support for XDP, so it should work with Onload. The specific error logging you gave (eth0 does not support XDP) can only be coming out if you have an ena driver without support, so I suggest you check that.
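
A quick way to confirm which ena driver is actually in use (for reference; output format varies a little by distro):

ethtool -i eth0                                # driver/version reported for the live interface
modinfo ena | grep -E '^(filename|version)'    # version of the ena module on disk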

@hjastenger
Author

hjastenger commented Jun 2, 2021

Hi Richard, first of all thanks for the detailed response! Really appreciate you reaching out on this GH issue! Lots of information already for me to try out. Bit of background info on the previously mentioned points:

For the ENA driver I'm currently running version 2.5.0g, which should have support for XDP. So either I'm doing something incorrect when compiling and configuring the newer ENA driver, or something went south with that specific version. I'll recheck the Amazon driver repository for any directions.

One last thing I noticed is the following when starting up open onload and loading the drivers into the kernel:

unload.sh: /sbin/rmmod onload
unload.sh: /sbin/rmmod sfc_char
unload.sh: /sbin/rmmod sfc_resource
unload.sh: /sbin/rmmod sfc_driverlink
NET_OPT is
CHAR_OPT is
modprobe: FATAL: Module mtdchar not found in directory /lib/modules/4.18.0-305.el8.x86_64
ERROR: Did not find sfc_control in /proc/devices
sfc is a RELEASE driver
RESOURCE_OPT is
CHAR_OPT is
ONLOAD_OPT is

The process appears to be running fine; I'm just not sure whether these FATAL & ERROR logs might be a potential culprit.

@hjastenger
Author

hjastenger commented Jun 2, 2021

PS: from a different issue on the amazon driver repository (amzn/amzn-drivers#173) I read the following:

The driver doesn't support AF_XDP yet and the application falls back to SKB mode (kernel's generic implementation for XDP processing) which implies copying.

So even though XDP is supported by the ENA driver, it doesn't support AF_XDP yet? That might be the reason why it's currently failing.

@eugeneia

eugeneia commented Jun 2, 2021

@hjastenger Fwiw (Hi from amzn/amzn-drivers#173), the XDP-related APIs should work on EC2 assuming you have recent kernel and ENA versions (I have tested a 5.11 kernel recently with ENA 2.5.0). It's just that it's not particularly fast at this time due to the issue you quoted.

@hjastenger
Author

hjastenger commented Jun 2, 2021

Hi @eugeneia, thanks for the info. I'm currently on RHEL with kernel 4.18, the latest kernel from RH. I've tried upgrading the kernel to 5.12.9 using ELRepo, but that results in not being able to build onload properly. Any idea?

@rhughes-xilinx

not sure whether this FATAL & ERROR logs might be a potential culprit

We really ought to find some way of hiding those messages. They confuse everybody and they're absolutely harmless.

The eth0 does not support XDP comes out specifically iff dev->netdev_ops->ndo_bpf == NULL. You can see at https://github.com/amzn/amzn-drivers/blob/1c30884cea0704df0850019b15045961eca975a6/kernel/linux/ena/ena_netdev.c#L4076 that this isn't the case, so my first guess would be that ENA_XDP_SUPPORT has somehow ended up off. Perhaps try sticking a #error at that line to be sure it's getting compiled-in.
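
A cruder alternative to the #error trick, assuming a local checkout of the amzn-drivers tree (the path mirrors the file linked above), is simply to grep for the macro:

grep -rn "ENA_XDP_SUPPORT" kernel/linux/ena/ | head
# if the macro is never defined for your kernel, the XDP ndo hooks are compiled
# out and the "does not support XDP" message is the expected result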

@ol-alexandra
Contributor

I've tried upgrading the kernel using something like Elrepo to 5.12.9 but this results in not being able to build onload properly.

Onload builds and works with linux-5.12 if CONFIG_VDPA is not set. I've filed an internal bug for the CONFIG_VDPA issue. It usually takes a few weeks to fix such an issue; in the best case we'll get a fix next Tuesday.
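
To check whether the kernel being built against has that option set (a workaround sketch, per the note above):

grep CONFIG_VDPA /boot/config-$(uname -r)      # is VDPA enabled in the running kernel?
# when building a custom 5.12 kernel, it can be switched off from the source tree:
scripts/config --disable VDPA && make olddefconfig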

@hjastenger
Author

hjastenger commented Jun 3, 2021

Thanks for the replies all! Super helpful. I reverted my kernel upgrade back to 4.18 and swapped out the ENA for Intel (Intel(R) 10 Gigabit Virtual Function Network Driver, ixgbevf). Adding my eth0 interface no longer yields the eth0 does not support XDP error. It currently tells me:

[  450.666568] [sfc efrm] efrm_nondl_register_device: register eth0
[  450.673665] [sfc efrm] Using VI range 0+(0-1)<<0
[  450.673667] [sfc efrm] eth0 type=4:
[  450.678906] [sfc efrm] efrm_driverlink_resume:
[  450.691265] [sfc efrm] ? hardware init failed (-22, attempt 1 of 1)
[  450.691266] [sfc efrm] ?: ERROR: hardware init failed rc=-22
[  450.697517] [sfc efrm] eth0 index=0 ifindex=2
[  450.702383] [onload] oo_nic_add: ifindex=2 oo_index=0

I hope hardware init failed simply because I'm currently not using one of the Solarflare NICs? Running something like onload ping 8.8.8.8 still doesn't display the 'success' banner for me. It's still printing the following, which makes sense if this is happening due to not using a Solarflare NIC:

oo:ping[6050]: netif_tcp_helper_alloc_u: ENODEV.
This error can occur if no Solarflare network interfaces
are active/UP, or they are running packed stream
firmware, are disabled or lack Onload activation keys.
Please check your configuration. To obtain activation
keys, please contact your sales representative.
oo:ping[6050]: __citp_netif_alloc: failed to construct netif (19)
oo:ping[6050]: citp_netif_alloc_and_init: failed to create netif (19)
oo:ping[6050]: citp_udp_socket: failed (errno:19) - PASSING TO OS

@ol-alexandra
Contributor

Have you typed echo enp1s0f0 |sudo tee /sys/module/sfc_resource/afxdp/register for your NIC?

@hjastenger
Author

You are referring to the step 'adding your interface' with echo ens2f0 > /sys/module/sfc_resource/afxdp/register, right? I did that for my eth0 interface and found the entry in dmesg: [ 450.666568] [sfc efrm] efrm_nondl_register_device: register eth0

@ol-alexandra
Contributor

Do you see Accelerating ens2f0: RX 1 TX 1 message in syslog from onload_cp_server? Have you brought the interface up?

@hjastenger
Author

So I guess you only add ens2f0 if you have that interface, which is why I'm adding eth0 instead of the placeholder from the docs. syslog does contain that entry:

xxx onload_cp_server[2870]: Accelerating eth0: RX 1 TX 1

@ol-alexandra
Contributor

Cool. Is the interface up?

@hjastenger
Author

Yes the interface is up

[xx]# cat /sys/class/net/eth0/operstate
up

@hjastenger
Author

I'm still seeing a lot of context switches with perf stat -e 'sched:sched_switch' -a -A --timeout 10000 when running something with onload, so I have the impression onload isn't functioning properly?

@rhughes-xilinx

The canonical way to determine whether Onload is functioning properly is with the onload_stackdump tool. onload_stackdump dump while the Onloaded app is still running should print out some info for each socket that's accelerated, each under a heading vaguely akin to TCP 0:2036 lcl=192.168.138.3:48087 rmt=192.168.138.2:12345 ESTABLISHED.

Note that ping's sockets will never be accelerated by Onload (we have yet to find a use-case where anybody needed ICMP to be very efficient). For minimal testing I tend to use netcat; for performance testing iperf or its ilk are fine.
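
A minimal check along those lines might look like this (a sketch; netcat option syntax varies between flavours and the port is arbitrary):

onload nc -l 12345 &          # accelerated listener on one host (traditional netcat wants -l -p 12345)
onload nc <server-ip> 12345   # accelerated client on the other host
onload_stackdump dump         # run while both are up; accelerated sockets appear under
                              # TCP ... ESTABLISHED headings as described above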

@hjastenger
Author

hjastenger commented Jun 3, 2021

Note that ping's sockets will never be accelerated by Onload (we have yet to find a use-case where anybody needed ICMP to be very efficient).

I was only running it with ping to verify onload was running by checking the banner message. I read somewhere on the onload forum that you could easily check onload was functioning by looking for the success info banner.

The canonical way to determine whether Onload is functioning properly is with the onload_stackdump tool. onload_stackdump dump while the Onloaded app is still running should print out some info for each socket that's accelerated, each under a heading vaguely akin to TCP 0:2036 lcl=192.168.138.3:48087 rmt=192.168.138.2:12345 ESTABLISHED.

I have been trying out the code used by Cloudflare in one of their blog posts (https://blog.cloudflare.com/how-to-receive-a-million-packets), but I'll try to validate this with the onload_stackdump you describe. Thanks again for the effort all!
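
For what it's worth, validating that Cloudflare test under Onload might look something like this (hedged; the receiver binary name and port are illustrative, taken from the blog post's examples):

onload ./udpreceiver1 0.0.0.0:4321 &   # run the test receiver under Onload
onload_stackdump dump                  # an accelerated UDP socket should be listed while it runs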

@ronenhamias

This thread is really helpful, so thanks!

Just a small question: where can I find onload_stackdump?

@hjastenger
Author

hjastenger commented Jun 4, 2021

@ronenhamias I think (assumption) he might be referring to onload_tcpdump?

@rhughes-xilinx

No, I meant onload_stackdump. It should be in build/gnu_x86_64/tools/ip

@ronenhamias

thank you @rhughes-xilinx @hjastenger

@monergeim

Good day, gentlemen!
Regarding onload_stackdump: I'm using Amazon Linux 2 with kernel 5.4.117 and ena ver 2.5.0, but I'm getting a strange error during onload install, which has led me nowhere:

make[2]: Entering directory `/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools'
make -C /usr/src/kernels/5.4.117-58.216.amzn2.x86_64 NDEBUG=1 GCOV= CC=cc  symverfile=/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools/Module.symvers KBUILD_EXTMOD=/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools _module_/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools
make[3]: Entering directory `/usr/src/kernels/5.4.117-58.216.amzn2.x86_64'
make[4]: *** No rule to make target `_module_/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools'.  Stop.
make[3]: *** [sub-make] Error 2
make[3]: Leaving directory `/usr/src/kernels/5.4.117-58.216.amzn2.x86_64'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools'

Does somebody know the reason why it could appear?
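
Not a diagnosis, but a couple of generic sanity checks for an out-of-tree module build on Amazon Linux 2 (package name assumed):

sudo yum install -y kernel-devel-$(uname -r)   # matching kernel build tree for the running kernel
ls -l /lib/modules/$(uname -r)/build           # should resolve to a populated source/build directory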

@ronenhamias

Hello again

Following this thread - when I am running onload_stackdump dump as suggested

I am getting:

ci onload_stackdump should not itself be run under onload acceleration.

Am I missing anything?

@rhughes-xilinx

ci onload_stackdump should not itself be run under onload acceleration.

Is the question "I definitely think I'm not running this command under Onload", "I don't understand what this is saying" or "how can I tell whether I am or not?"? My first guess would be that somewhere way up your terminal you ran onload bash and forgot.

@ronenhamias

I was actually trying to verify that the installation works.

But I made a mistake and was trying to install it from https://support-nic.xilinx.com/wp/onload?sd=SF-109585-LS-35&pe=SF-122921-DH-4
and that version is probably for Solarflare NICs only, which is why it's not working.

But when trying to build it from source, the build fails with this error:

./openonload/scripts/onload_install

make[2]: Entering directory `/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools'
make -C /usr/src/kernels/5.4.117-58.216.amzn2.x86_64 NDEBUG=1 GCOV= CC=cc  symverfile=/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools/Module.symvers KBUILD_EXTMOD=/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools _module_/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools
make[3]: Entering directory `/usr/src/kernels/5.4.117-58.216.amzn2.x86_64'
make[4]: *** No rule to make target `_module_/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools'.  Stop.
make[3]: *** [sub-make] Error 2
make[3]: Leaving directory `/usr/src/kernels/5.4.117-58.216.amzn2.x86_64'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/home/ec2-user/openonload/build/x86_64_linux-5.4.117-58.216.amzn2.x86_64/lib/citools'

still trying to figure out why it fails to build - any ideas?

@ol-alexandra
Contributor

Any ideas how I can download that Amazon kernel, 5.4.117-58.216.amzn2? Or at least please provide the full build log; sometimes there are hints before the fatal error.

NB Onload is tested with Ubuntu's 5.4.0-73-generic, so I'd expect that any linux-5.4 works, or at least builds. Probably Amazon patches their kernel, and I have no idea how. Google refused to provide me any details or a download link for that Amazon kernel. Maybe my google-fu is not sufficient.

@ceeaspb

ceeaspb commented Nov 13, 2021

Doesn't fully answer your question, as I am not sure 5.4 is available now, but you can download the kernel source from within a running instance:
yumdownloader --source kernel-5.10.75-79.358.amzn2
It includes the upstream source and patches.
May be of use to test the Amazon Linux 2 image as it rolls forward.
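
If the patched source itself is needed out of that SRPM, standard RPM tooling applies (filename as produced by the yumdownloader command above):

rpm2cpio kernel-5.10.75-79.358.amzn2.src.rpm | cpio -idmv   # unpack the upstream tarball plus Amazon's patches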

@aneagoe

aneagoe commented Jan 19, 2022

@hjastenger have you had any success with this? Are you able to share an update of the current state of things and a rough guide on getting this up and running?

@aneagoe

aneagoe commented Jan 19, 2022

OK, so I managed to compile, install and enable the latest code (commit 7b76edae6b7f7b7276c1496b2a6ed1420fab5306) against kernel 4.18.0-348.7.1.el8_5.x86_64 on a CentOS 8 EC2 instance. I had to lower the MTU and adjust the channels (i.e. ethtool -L eth0 4), after which I was able to successfully enable onload:

Jan 19 15:20:46 localhost kernel: Solarflare NET driver v5.3.12.1010
Jan 19 15:20:46 localhost kernel: Efx driverlink registering resource driver
Jan 19 15:20:46 localhost kernel: [onload] Onload 7b76edae 2022-01-19 master 
Jan 19 15:20:46 localhost kernel: [onload] Copyright 2019-2022 Xilinx, 2006-2019 Solarflare Communications, 2002-2005 Level 5 Networks 
Jan 19 15:20:46 localhost onload_cp_server[156356]: Spawned daemon process 156374
Jan 19 15:20:46 localhost onload_cp_server[156374]: Onload Control Plane server 7b76edae 2022-01-19 master  started: id 0, pid 156374 
Jan 19 15:20:51 localhost kernel: [sfc efrm] efrm_nondl_register_device: register eth0
Jan 19 15:20:51 localhost kernel: [sfc efrm] eth0 type=4:\x000 ifindex=2
Jan 19 15:20:51 localhost kernel: ena 0000:00:05.0 eth0: XDP program is set, changing the max_mtu from 9216 to 3498
Jan 19 15:20:51 localhost kernel: [sfc efrm] eth0 index=0 ifindex=2
Jan 19 15:20:51 localhost kernel: [onload] oo_nic_add: ifindex=2 oo_index=0 
Jan 19 15:20:51 localhost onload_cp_server[156374]: Accelerating eth0: RX 1 TX 1

Onload is then seemingly accelerating applications. However, when inspecting onload_stackdump lots, I noticed there are no u_polls and only k_polls is increasing. Any idea why?

@rhughes-xilinx @hjastenger @eugeneia

PS:
I've set EF_POLL_USEC=1000000 and nothing changed.
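
For reference, the MTU and channel adjustments mentioned above would look something like this (a sketch; 3498 is the max_mtu that ena reports once an XDP program is attached):

sudo ip link set dev eth0 mtu 3498     # stay under the XDP max_mtu
sudo ethtool -L eth0 combined 4        # set the channel count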

@ol-alexandra
Contributor

Please share onload_stackdump lots to start with.

@aneagoe

aneagoe commented Jan 19, 2022

@ol-alexandra please see below. There are no accelerated sockets... Comparatively, running the same application (just a udpsender) on a machine with an X2522 card shows an accelerated socket, and u_polls is increasing as expected.

[root@localhost ~]# onload_stackdump lots 
============================================================
ci_netif_dump_to_logger: stack=0 name=
  cplane_pid=156374
  namespace=net:[4026531992]
  Onload 7b76edae 2022-01-19 master  uid=0 pid=174172 ns_flags=a0
  creation_time=2022-01-19 15:59:15 (delta=27secs)
  lock=0 UNLOCKED  nics=1 primed=1
  sock_bufs: max=8192 n_allocated=2048 free=2048
  aux_bufs: free=0
  aux_bufs[syn-recv state]: n=0 max=2048
  aux_bufs[syn-recv bucket]: n=0 max=8192
  aux_bufs[epoll3 state]: n=0 max=8192
  aux_bufs[unknown]: n=0 max=8192
  pkt_sets: pkt_size=2048 set_size=1024 max=32 alloc=32
  pkt_set[0]: free=493 current
  pkt_set[1]: free=1024
  pkt_set[2]: free=1024
  pkt_set[3]: free=1024
  pkt_set[4]: free=1024
  pkt_set[5]: free=1024
  pkt_set[6]: free=1024
  pkt_set[7]: free=1024
  pkt_set[8]: free=1024
  pkt_set[9]: free=1024
  pkt_set[10]: free=1024
  pkt_set[11]: free=1024
  pkt_set[12]: free=1024
  pkt_set[13]: free=1024
  pkt_set[14]: free=1024
  pkt_set[15]: free=1024
  pkt_set[16]: free=1024
  pkt_set[17]: free=1024
  pkt_set[18]: free=1024
  pkt_set[19]: free=1024
  pkt_set[20]: free=1024
  pkt_set[21]: free=1024
  pkt_set[22]: free=1024
  pkt_set[23]: free=1024
  pkt_set[24]: free=1024
  pkt_set[25]: free=1024
  pkt_set[26]: free=1024
  pkt_set[27]: free=1024
  pkt_set[28]: free=1024
  pkt_set[29]: free=1024
  pkt_set[30]: free=1024
  pkt_set[31]: free=1024
  pkt_bufs: max=32768 alloc=32768 free=32237 async=0
  pkt_bufs: rx=531 rx_ring=499 rx_queued=0 pressure_pool=32
  pkt_bufs: tx=0 tx_ring=0 tx_oflow=0
  pkt_bufs: in_loopback=0 in_sock=0
  pkt_bufs: rx_reserved=0
  time: netif=541ad3 poll=541ad3 now=541aea (diff=0.032sec)
  active cache: hit=0 avail=0 cache=EMPTY pending=EMPTY
  passive scalable cache: cache=EMPTY pending=EMPTY
ci_netif_dump_vi: stack=0 intf=0 dev=? hw=4
  vi=0 pd_owner=1 channel=0 tcpdump=off vi_flags=a00000 oo_vi_flags=0
  evq: cap=-1 current=0 is_32_evs=1 is_ev=0
  evq: sync_major=ffffffff sync_minor=0 sync_min=0
  evq: sync_synced=0 sync_flags=0
  rxq: cap=511 lim=511 spc=12 level=499 total_desc=45
  txq: cap=511 lim=511 spc=511 level=0 pkts=0 oflow_pkts=0
  txq: pio_buf_size=0 tot_pkts=0 bytes=0
  txq: ts_nsec=40000000
  clk: 
  last_rx_stamp: 0:0
  ctpio: max_frame_len=0 frame_len_check=0 ct_thresh=4294940120
ci_netif_dump_extra_to_logger: stack=0
  in_poll=0 post_poll_list_empty=1 poll_did_wake=0
  rx_defrag_head=-1 rx_defrag_tail=-1
  tx_may_alloc=1 can=1 nonb_pool=0 send_may_poll=0 is_spinner=0,0
  hwport_to_intf_i=0,-1,-1,-1,-1,-1,-1,-1 intf_i_to_hwport=0,0,0,0,0,0,0,0
  uk_intf_ver=0602ac26bf92fd37e3e56f1bf6ad2f6c
  deferred count 0/32
  numa nodes: creation=0 load=0
  numa node masks: packet alloc=1 sock alloc=1 interrupt=0
pids: 
--------------------- sockets ------------------------------
-------------------- ci_netif_stats: 0 ---------------------
k_polls: 378
u_polls: 0
ioctl_evq_polls: 0
rx_evs: 45
tx_evs: 0
periodic_polls: 336
periodic_evs: 0
periodic_lock_contends: 0
interrupts: 41
interrupt_polls: 41
interrupt_evs: 0
interrupt_wakes: 0
interrupt_primes: 41
interrupt_no_events: 0
interrupt_lock_contends: 0
interrupt_budget_limited: 0
deferred_polls: 0
timeout_interrupts: 0
timeout_interrupt_when_primed: 0
timeout_interrupt_polls: 0
timeout_interrupt_evs: 0
timeout_interrupt_wakes: 0
timeout_interrupt_no_events: 0
timeout_interrupt_lock_contends: 0
muxer_primes: 0
pkt_wait_primes: 0
sock_sleeps: 0
sock_sleep_primes: 0
sock_wakes_rx: 0
sock_wakes_tx: 0
sock_wakes_rx_os: 0
sock_wakes_tx_os: 0
sock_wakes_signal: 0
pkt_huge_pages: 0
pkt_nonb: 0
pkt_nonb_steal: 0
pkt_wakes: 0
pkt_scramble0: 0
pkt_scramble1: 0
pkt_scramble2: 0
pkt_wait_spin: 0
unlock_slow: 380
unlock_slow_pkt_waiter: 0
unlock_slow_socket_list: 0
unlock_slow_need_prime: 0
unlock_slow_prime_ul: 0
unlock_slow_wake: 0
unlock_slow_swf_update: 0
unlock_slow_close: 1
unlock_slow_xdp_change: 0
unlock_slow_syscall: 1
lock_wakes: 0
stack_lock_buzz: 0
deferred_work: 0
sock_lock_sleeps: 0
sock_lock_buzz: 0
tcp_send_nonb_pool_empty: 0
tcp_send_ni_lock_contends: 0
tcp_send_fail_noroute: 0
udp_send_ni_lock_contends: 0
getsockopt_ni_lock_contends: 0
setsockopt_ni_lock_contends: 0
udp_send_mcast_loop: 0
udp_send_mcast_loop_drop: 0
active_opens: 0
tcp_handover_socket: 0
tcp_handover_bind: 0
tcp_handover_listen: 0
tcp_accept_os: 0
tcp_handover_connect: 0
tcp_handover_setsockopt: 0
udp_handover_socket: 0
udp_handover_bind: 1
udp_handover_connect: 0
udp_handover_setsockopt: 0
udp_bind_no_filter: 1
udp_connect_no_filter: 0
timewait_reap: 0
timewait_reap_filter: 0
table_max_hops: 1
table_mean_hops: 1
table_n_entries: 0
table_n_slots: 0
tcp_rtos: 0
tail_drop_probe_retrans: 0
tail_drop_probe_sendq: 0
tail_drop_probe_unnecessary: 0
tail_drop_probe_success: 0
rst_recv_acceptq: 0
rst_recv_synrecv: 0
rst_recv_synrecv_paws_rejected: 0
rst_recv_after_fin: 0
rst_recv_has_recvq: 0
rst_recv_has_sendq: 0
rst_recv_has_unack: 0
rst_recv_unacceptable: 0
rst_sent_unacceptable_ack: 0
rst_sent_synrecv_bad_syn: 0
rst_sent_synrecv_bad_ack: 0
rst_sent_listen_got_ack: 0
rst_sent_bad_options: 0
rst_sent_bad_seq: 0
rst_sent_no_match: 0
no_match_pass_to_kernel_batches: 41
no_match_pass_to_kernel_tcp: 45
no_match_pass_to_kernel_udp: 0
no_match_pass_to_kernel_ip_other: 0
no_match_pass_to_kernel_non_ip: 0
no_match_bad_intf_i: 0
no_match_bad_netdev: 0
no_match_oom: 0
no_match_corrupted: 0
no_match_dropped: 0
no_match_in_active_wild: 0
unacceptable_acks: 0
tcp_cant_fin: 0
tcp_cant_fin_resolved: 0
tcp_cant_fin_dropped: 0
synrecv_retransmits: 0
synrecv_send_fails: 0
synrecv_timeouts: 0
synrecv_purge: 0
syn_drop_busy: 0
syn_drop_no_return_route: 0
listen2synrecv: 0
synrecv2established: 0
ul_accepts: 0
accept_eagain: 0
aux_alloc_fails: 0
aux_bucket_alloc_fails: 0
accepts_deferred: 0
acks_sent: 0
wnd_updates_sent: 0
rx_slow: 0
rx_out_of_order: 0
rx_rob_non_empty: 0
retransmits: 0
invalid_ack_limited: 0
challenge_ack_sent: 0
challenge_ack_out_of_pkts: 0
challenge_ack_limited: 0
tx_defer_pkt: 0
tx_defer_pkt_fast: 0
tx_defer_pkt_sent: 0
tx_defer_pkt_drop_failed: 0
tx_defer_pkt_drop_arp_failed: 0
tx_defer_pkt_drop_timeout: 0
tx_defer_pkt_drop_limited: 0
tx_error_events: 0
rx_discard_csum_bad: 0
rx_discard_inner_csum_bad: 0
rx_discard_mcast_mismatch: 0
rx_discard_crc_bad: 0
rx_discard_trunc: 0
rx_discard_rights: 0
rx_discard_ip_options_bad: 0
rx_discard_len_err: 0
rx_discard_other: 0
rx_refill_recv: 0
rx_future: 0
rx_future_rollback: 0
reap_rx_limited: 0
reap_buf_limited: 0
pkts_reaped: 0
refill_rx_limited: 0
refill_buf_limited: 0
defer_work_limited: 0
defer_work_contended_unsafe: 0
tx_dma_max: 0
tx_dma_doorbells: 0
bufset_alloc_fails: 0
bufset_alloc_nospace: 0
mss_limitations: 0
memory_pressure_enter: 0
memory_pressure_exit_poll: 0
memory_pressure_exit_recv: 0
memory_pressure_drops: 0
udp_rx_no_match_drops: 0
udp_free_with_tx_active: 0
sw_filter_insert_table_full: 0
pio_pkts: 0
no_pio_err: 0
ctpio_pkts: 0
ctpio_dma_fallbacks: 0
tcp_sendpages: 0
poll_no_pkt: 0
sockcache_cached: 0
sockcache_contention: 0
passive_sockcache_stacklim: 0
active_sockcache_stacklim: 0
active_sockcache_non_ip4: 0
active_attach_fd: 0
active_attach_fd_reuse: 0
active_attach_fd_fail: 0
sockcache_socklim: 0
sockcache_hit: 0
sockcache_hit_reap: 0
sockcache_miss_intmismatch: 0
activecache_cached: 0
activecache_stacklim: 0
activecache_hit: 0
activecache_hit_reap: 0
sock_attach_fd_detach: 0
sock_attach_fd_detach_fail_soft: 0
sock_attach_fd_detach_fail_hard: 0
sock_attach_fd_more: 0
sock_attach_fd_more_fail: 0
epoll_fd_uncache: 0
accept_attach_fd_retry: 0
tcp_rcvbuf_abused: 0
tcp_rcvbuf_abused_rob_guilty: 0
tcp_rcvbuf_abused_recv_coalesced: 0
tcp_rcvbuf_abused_recv_guilty: 0
tcp_rcvbuf_abused_rob_desperate: 0
tcp_rcvbuf_abused_badly: 0
tcp_listen_synack_retrans_no_buffer: 0
proactive_packet_allocation: 0
stack_locks_deferred: 41
tcp_shared_local_ports_used: 0
tcp_shared_local_ports_reused_tw: 0
tcp_shared_local_ports_grow: 0
tcp_shared_local_ports_grow_failed: 0
tcp_shared_local_ports_exhausted: 0
epoll_add_non_home: 0
epoll_sb_state_alloc_failed: 0
sock_attach_fd_alloc_fail: 0
mac_filter_shares: 0
tcp_shared_local_ports_skipped: 0
tcp_shared_local_ports_skipped_in_use: 0
tcp_syn_retrans_once: 0
tcp_syn_retrans_twice: 0
tcp_syn_retrans_thrice: 0
tcp_syn_retrans: 0
tcp_connect_timedout: 0
tcp_connect_etimedout: 0
tcp_connect_icmp: 0
tcp_connect_ehostunreach: 0
tcp_connect_enetunreach: 0
tcp_connect_enetdown: 0
tcp_connect_refused: 0
tcp_connect_econnrefused: 0
tcp_connect_eother: 0
tcp_connect_eother_val: 0
tcp_seq_table_hits: 0
tcp_seq_table_insertions: 0
tcp_seq_table_expiries: 0
tcp_seq_table_purgations: 0
tcp_seq_table_steps: 0
tcp_seq_table_short_expiry: 0
tcp_seq_table_avoided: 0
tcp_urgent_ignore_rx: 0
tcp_urgent_process_rx: 0
ip_options: 0
tcpdump_missed: 0
lowest_free_pkts: 32225
rx_xdp_pass: 0
rx_xdp_drop: 0
rx_xdp_tx: 0
rx_xdp_redirect: 0
rx_xdp_aborted: 0
rx_xdp_unknown: 0
rx_icmp: 0
rx_icmp_dropped: 0
rx_icmp_enomem: 0
rx_icmp_handler_batches: 0
rx_icmp_hooked: 0
-------------------- more_stats: 0 -------------------------
TCP_CLOSED: 0
TCP_LISTEN: 0
TCP_SYN_SENT: 0
TCP_ESTABLISHED: 0
TCP_CLOSE_WAIT: 0
TCP_LAST_ACK: 0
TCP_FIN_WAIT1: 0
TCP_FIN_WAIT2: 0
TCP_CLOSING: 0
TCP_TIME_WAIT: 0
TCP_STATE_FREE: 2048
TCP_STATE_UDP: 0
TCP_STATE_PIPE: 0
TCP_STATE_AUXBUF: 0
TCP_STATE_ACTIVE_WILD: 0
BAD_STATE: 0
sock_orphans: 0
sock_wake_needed_rx: 0
sock_wake_needed_tx: 0
tcp_has_recvq: 0
tcp_recvq_bytes: 0
tcp_recvq_pkts: 0
tcp_has_recv_reorder: 0
tcp_recv_reorder_pkts: 0
tcp_has_sendq: 0
tcp_sendq_bytes: 0
tcp_sendq_pkts: 0
tcp_has_inflight: 0
tcp_inflight_bytes: 0
tcp_inflight_pkts: 0
tcp_n_in_listenq: 0
tcp_n_in_acceptq: 0
udp_has_recvq: 0
udp_recvq_pkts: 0
udp_has_sendq: 0
udp_sendq_bytes: 0
udp_tot_recv_pkts_ul: 0
udp_tot_recv_drops_ul: 0
udp_tot_recv_pkts_os: 0
udp_tot_send_pkts_ul: 0
udp_tot_send_pkts_os: 0
ef_vi_rx_ev_lost: 0
ef_vi_rx_ev_bad_desc_i: 0
ef_vi_rx_ev_bad_q_label: 0
ef_vi_evq_gap: 0
--------------------- ci_ip_stats: 0 -----------------------
in_recvs: 45
in_hdr_errs: 0
in_discards: 0
in_delivers: 45
-------------------- ci_tcp_stats: 0 -----------------------
tcp_active_opens: 0
tcp_passive_opens: 0
tcp_estab_resets: 0
tcp_curr_estab: 0
tcp_in_segs: 45
tcp_out_segs: 0
tcp_retran_segs: 0
tcp_out_rsts: 0
-------------------- ci_tcp_ext_stats: 0 -------------------
syncookies_sent: 0
syncookies_recv: 0
syncookies_failed: 0
embrionic_rsts: 0
prune_called: 0
rcv_pruned: 0
ofo_pruned: 0
out_of_window_icmps: 0
lock_dropped_icmps: 0
arp_filter: 0
time_waited: 0
time_wait_recycled: 0
time_wait_killed: 0
paws_passive_rejected: 0
paws_active_rejected: 0
paws_estab_rejected: 0
tso_missing: 0
delayed_ack: 0
delayed_ack_locked: 0
delayed_ack_lost: 0
listen_overflows: 0
listen_no_pkts: 0
listen_drops: 0
tcp_prequeued: 0
tcp_direct_copy_from_backlog: 0
tcp_direct_copy_from_prequeue: 0
tcp_prequeue_dropped: 0
tcp_hp_hits: 0
tcp_hp_hits_to_user: 0
tcp_pure_acks: 0
tcp_hp_acks: 0
tcp_reno_recovery: 0
tcp_sack_recovery: 0
tcp_sack_reneging: 0
tcp_fack_reorder: 0
tcp_sack_reorder: 0
tcp_reno_reorder: 0
tcp_ts_reorder: 0
tcp_full_undo: 0
tcp_partial_undo: 0
tcp_loss_undo: 0
tcp_sack_undo: 0
tcp_loss: 0
tcp_lost_retransmit: 0
tcp_reno_failures: 0
tcp_sack_failures: 0
tcp_loss_failures: 0
tcp_timeouts: 0
tcp_reno_recovery_fail: 0
tcp_sack_recovery_fail: 0
tcp_fast_retrans: 0
tcp_forward_retrans: 0
tcp_slow_start_retrans: 0
tcp_scheduler_failures: 0
tcp_rcv_collapsed: 0
tcp_dsack_old_sent: 0
tcp_dsack_ofo_sent: 0
tcp_dsack_recv: 0
tcp_dsack_ofo_recv: 0
tcp_abort_on_syn: 0
tcp_abort_on_data: 0
tcp_abort_on_close: 0
tcp_abort_on_memory: 0
tcp_abort_on_timeout: 0
tcp_abort_on_linger: 0
tcp_abort_on_delegated_send: 0
tcp_abort_failed: 0
tcp_memory_pressures: 0
-------------------- ci_udp_stats: 0 -----------------------
udp_in_dgrams: 0
udp_no_ports: 0
udp_in_errs: 0
udp_out_dgrams: 0
--------------------- config opts --------------------------
                        NDEBUG: 1
                    EF_URG_RFC: 0
               EF_TCP_URG_MODE: 1
                    EF_TX_PUSH: 1
             EF_POLL_ON_DEMAND: 1
                 EF_INT_DRIVEN: 0 (default: 1)
             EF_POLL_IN_KERNEL: 0
                   EF_XDP_MODE: 0
                EF_INT_REPRIME: 0
                 EF_MCAST_RECV: 1
       EF_FORCE_SEND_MULTICAST: 1
         EF_MULTICAST_LOOP_OFF: 1
                 EF_MCAST_SEND: 0
         EF_MCAST_RECV_HW_LOOP: 1
        EF_TCP_LISTEN_HANDOVER: 0
       EF_TCP_CONNECT_HANDOVER: 0
       EF_UDP_CONNECT_HANDOVER: 1
          EF_FORCE_TCP_NODELAY: 0
          EF_UDP_SEND_UNLOCKED: 1
              EF_UNCONFINE_SYN: 1
      EF_BINDTODEVICE_HANDOVER: 0
    EF_MCAST_JOIN_BINDTODEVICE: 0
              EF_TCP_RX_CHECKS: 0
                  EF_USE_DSACK: 1
     EF_TIMESTAMPING_REPORTING: 0
            EF_RX_TIMESTAMPING: 0
   EF_RX_TIMESTAMPING_ORDERING: 0
            EF_TX_TIMESTAMPING: 0
             EF_TCP_TSOPT_MODE: 2
             EF_CLUSTER_IGNORE: 1 (default: 0)
               EF_VALIDATE_ENV: 1
            EF_TAIL_DROP_PROBE: 0 (default: 1)
       EF_TCP_RST_DELAYED_CONN: 0
            EF_TCP_SNDBUF_MODE: 1
     EF_TCP_COMBINE_SENDS_MODE: 0
   EF_TCP_SOCKBUF_MAX_FRACTION: 1
             EF_TCP_SYNCOOKIES: 0
EF_TCP_SEND_NONBLOCK_NO_PACKETS_MODE: 0
          EF_TCP_RCVBUF_STRICT: 0
EF_UDP_SEND_NONBLOCK_NO_PACKETS_MODE: 0
            EF_TCP_RCVBUF_MODE: 0
       EF_HIGH_THROUGHPUT_MODE: 0
EF_TCP_TIME_WAIT_ASSASSINATION: 1
        EF_MCAST_JOIN_HANDOVER: 0
         EF_PACKET_BUFFER_MODE: 0
        EF_TCP_SERVER_LOOPBACK: 0
        EF_TCP_CLIENT_LOOPBACK: 0
             EF_USE_HUGE_PAGES: 1
        EF_COMPOUND_PAGES_MODE: 0
                        EF_PIO: 1
                      EF_CTPIO: 1
                 EF_CTPIO_MODE: 1
               EF_TCP_SYN_OPTS: 7
      EF_TCP_ADV_WIN_SCALE_MAX: 14
             EF_TCP_TCONST_MSL: 25
            EF_TCP_FIN_TIMEOUT: 5 (default: 60)
           EF_TCP_RX_LOG_FLAGS: 0
       EF_RETRANSMIT_THRESHOLD: 9 (default: 15)
EF_RETRANSMIT_THRESHOLD_ORPHAN: 8
   EF_RETRANSMIT_THRESHOLD_SYN: 6 (default: 4)
EF_RETRANSMIT_THRESHOLD_SYNACK: 5
                 EF_SHARE_WITH: 0
                   EF_RXQ_SIZE: 512
                   EF_TXQ_SIZE: 512
           EF_SEND_POLL_THRESH: 64
          EF_SEND_POLL_MAX_EVS: 96
     EF_UDP_SEND_UNLOCK_THRESH: 1500
      EF_UDP_PORT_HANDOVER_MIN: 2
      EF_UDP_PORT_HANDOVER_MAX: 1
     EF_UDP_PORT_HANDOVER2_MIN: 2
     EF_UDP_PORT_HANDOVER2_MAX: 1
     EF_UDP_PORT_HANDOVER3_MIN: 2
     EF_UDP_PORT_HANDOVER3_MAX: 1
              EF_DELACK_THRESH: 1
         EF_DYNAMIC_ACK_THRESH: 16
        EF_CHALLENGE_ACK_LIMIT: 1000
      EF_INVALID_ACK_RATELIMIT: 500
           EF_SOCKET_CACHE_MAX: 0
       EF_PER_SOCKET_CACHE_MAX: 0 (default: -1)
        EF_ACCEPTQ_MIN_BACKLOG: 1
       EF_NONAGLE_INFLIGHT_MAX: 50
           EF_DEFER_WORK_LIMIT: 32
                   EF_IRQ_CORE: -1
                EF_IRQ_CHANNEL: -1
                  EF_RXQ_LIMIT: 511 (default: 65535)
               EF_EVS_PER_POLL: 64
                  EF_SPIN_USEC: 100001 (default: 0)
                  EF_BUZZ_USEC: 100 (default: 0)
                EF_HELPER_USEC: 500
          EF_HELPER_PRIME_USEC: 250
                EF_MAX_PACKETS: 32768
             EF_MAX_RX_PACKETS: 24576
             EF_MAX_TX_PACKETS: 24576
                    EF_RXQ_MIN: 256
           EF_MIN_FREE_PACKETS: 100
           EF_PREFAULT_PACKETS: 1
           EF_PREALLOC_PACKETS: 0
              EF_MAX_ENDPOINTS: 8192
    EF_ENDPOINT_PACKET_RESERVE: 0
              EF_DEFER_ARP_MAX: 128
          EF_DEFER_ARP_TIMEOUT: 60
EF_TCP_SNDBUF_ESTABLISHED_DEFAULT: 131072
EF_TCP_RCVBUF_ESTABLISHED_DEFAULT: 131072
                 EF_TCP_SNDBUF: 0
                 EF_TCP_RCVBUF: 0
                 EF_UDP_SNDBUF: 0
                 EF_UDP_RCVBUF: 0
            EF_TCP_BACKLOG_MAX: 512 (default: 256)
            EF_TCP_SYNRECV_MAX: 2048 (default: 1024)
           EF_TCP_INITIAL_CWND: 0
          EF_TCP_LOSS_MIN_CWND: 0
               EF_TCP_MIN_CWND: 0
         EF_TCP_FASTSTART_INIT: 65536
         EF_TCP_FASTSTART_IDLE: 65536
         EF_TCP_FASTSTART_LOSS: 65536
       EF_TCP_EARLY_RETRANSMIT: 1
            EF_RFC_RTO_INITIAL: 1000
                EF_RFC_RTO_MIN: 200
                EF_RFC_RTO_MAX: 120000
             EF_KEEPALIVE_TIME: 237000 (default: 7200000)
            EF_KEEPALIVE_INTVL: 13000 (default: 75000)
           EF_KEEPALIVE_PROBES: 7 (default: 9)
        EF_BURST_CONTROL_LIMIT: 0
      EF_CONG_AVOID_SCALE_BACK: 0
 EF_FREE_PACKETS_LOW_WATERMARK: 256 (default: 0)
              EF_PIO_THRESHOLD: 1514
        EF_CTPIO_MAX_FRAME_LEN: 500 (default: 0)
            EF_CTPIO_CT_THRESH: 64
        EF_CTPIO_SWITCH_BYPASS: 0
          EF_TX_PUSH_THRESHOLD: 100
                        EF_LOG: 27
     EF_TCP_SHARED_LOCAL_PORTS: 0
EF_TCP_SHARED_LOCAL_PORTS_REUSE_FAST: 0
 EF_TCP_SHARED_LOCAL_PORTS_MAX: 100
EF_TCP_SHARED_LOCAL_PORTS_NO_FALLBACK: 0
EF_TCP_SHARED_LOCAL_PORTS_PER_IP: 0
EF_TCP_SHARED_LOCAL_PORTS_PER_IP_MAX: 0
EF_TCP_SHARED_LOCAL_PORTS_STEP: 1
           EF_SCALABLE_FILTERS: ""
EF_SCALABLE_FILTERS_IFINDEX_ACTIVE: 0
EF_SCALABLE_FILTERS_IFINDEX_PASSIVE: 0
      EF_SCALABLE_FILTERS_MODE: 0 (default: -1)
         EF_PERIODIC_TIMER_CPU: -1
    EF_SCALABLE_FILTERS_ENABLE: 0
       EF_SCALABLE_LISTEN_MODE: 0
EF_SCALABLE_ACTIVE_WILDS_NEED_FILTER: 1 (default: 0)
        EF_INTERFACE_WHITELIST: ""
        EF_INTERFACE_BLACKLIST: ""
  EF_KERNEL_PACKETS_BATCH_SIZE: 1
  EF_KERNEL_PACKETS_TIMER_USEC: 500
               EF_TCP_ISN_MODE: 1
    EF_TCP_ISN_INCLUDE_PASSIVE: 0
               EF_TCP_ISN_2MSL: 240
             EF_TCP_ISN_OFFSET: 65537
         EF_TCP_ISN_CACHE_SIZE: 0
            EF_AF_XDP_ZEROCOPY: 0
                  EF_ICMP_PKTS: 64
--------------------- stack time ---------------------------
          sched_ticks: 541ad3
ci_ip_time_real_ticks: 541ad3
                  frc: 1506b4ce92de
  ci_ip_time_frc2tick: 22
    ci_ip_time_frc2us: 12
   ci_ip_time_frc2isn: 10
   ci_ip_time_tick2ms: 1.398118
   ci_ip_time_ms2tick: 0.715247 (b71a7400)
                  khz: 2999965
time constants for this CPU
  rto_initial: 715ticks (1000ms)
  rto_min: 144ticks (200ms)
  rto_max: 85829ticks (120000ms)
  delack: 35ticks (50ms)
  idle: 35ticks (50ms)
  keepalive_time: 169513ticks (237000ms)
  keepalive_intvl: 9298ticks (13000ms)
  keepalive_probes: 7
  zwin_max: 85829ticks (120000ms)
  paws_idle: 1483136993ticks (2073600000ms)
  PMTU slow discover: 407691ticks (570000ms)
  PMTU fast discover: 85829ticks (120000ms)
  PMTU recover: 21457ticks (30000ms)
  RFC 5961 challenge ack limit: 2 per tick, 1000 per sec

  Time between ACKs sent as a response to invalid incoming TCP packets: 358ticks (500ms)

  Intrumentation: 1ticks (0ms)
--------------------- process env --------------------------
--------------------------------------------

@aneagoe

aneagoe commented Jan 19, 2022

@ol-alexandra OK, I managed to find something more. When I run the application with onload, I can see the following in /var/log/messages:

Jan 19 16:11:11 localhost kernel: [onload] [1]: WARNING: huge pages are incompatible with AF_XDP. Disabling hugepage support.
Jan 19 16:11:11 localhost kernel: ena 0000:00:05.0 eth0: Command parameter 46 is not supported
Jan 19 16:11:11 localhost kernel: [sfc efhw] rxclass_get_dev_info: rxclass: Cannot get RX class rule count 
Jan 19 16:11:11 localhost kernel: [onload] __oof_socket_add_wild: 1:2047 ERROR: FILTER UDP 172.29.28.32:65500 0.0.0.0:0 failed (-95)
Jan 19 16:11:14 localhost kernel: [sfc efhw] af_xdp_flush_rx_dma_channel: FIXME AF_XDP
Jan 19 16:11:14 localhost kernel: [sfc efhw] af_xdp_flush_tx_dma_channel: FIXME AF_XDP

@ol-alexandra
Contributor

ethtool -i eth0 please

@aneagoe

aneagoe commented Jan 19, 2022

@ol-alexandra

[root@localhost ~]# ethtool -i eth0
driver: ena
version: 4.18.0-348.7.1.el8_5.x86_64
firmware-version:
expansion-rom-version:
bus-info: 0000:00:05.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

@ol-alexandra
Contributor

ena does not support n-tuple filtering, which is required for Onload-over-AF_XDP.

@abower-xilinx @maciejj-xilinx I do not see n-tuple filtering being mentioned in https://github.com/Xilinx-CNS/onload/blob/master/README.md
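
A quick way to check whether a NIC/driver exposes n-tuple filtering (for ena this currently reports off [fixed]):

ethtool -k eth0 | grep ntuple-filters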

@aneagoe

aneagoe commented Jan 20, 2022

@ol-alexandra thanks a lot for clarifying. Indeed, the driver doesn't seem to support n-tuple filtering, as per https://lore.kernel.org/lkml/1486646499-13682-2-git-send-email-netanel@annapurnalabs.com/

@maciejj-xilinx
Contributor

In principle, n-tuple support would not be required if the interface were configured with a single channel.

  • ethtool -l output shows how many RX channels there are, and ethtool -L can be used to set it (see the example below).

Currently, Onload strictly requires n-tuple support. However, it would be a pretty simple change to allow it to run without filters when a single channel is set on the NIC. Would this be your use case?
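
For example (interface name and channel counts are illustrative):

ethtool -l eth0                     # show current and maximum channel counts
sudo ethtool -L eth0 combined 1     # restrict the NIC to a single channel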

@aneagoe

aneagoe commented Jan 20, 2022

@maciejj-xilinx it's definitely worth a shot. Our goal is to optimize the EC2 network stack on top of ENA adapters by using kernel bypass. It's very difficult to assess how big of an improvement we'd be looking at without testing first. If this is an easy fix, I'm happy to test and report back.
I've also opened a feature request with the team responsible for the ena driver (amzn/amzn-drivers#206)

@aneagoe

aneagoe commented Feb 24, 2022

@maciejj-xilinx is there any chance of dropping the n-tuple hard requirement in case the interface is configured with a single channel?

@maciejj-xilinx
Contributor

Hi aneagoe,

Thanks for getting in touch.

I have submitted a change that allows disabling use of ntuple filters.

Disabling ntuple filter can be achieved with
echo 0 > /sys/module/sfc_resource/parameters/enable_af_xdp_flow_filters

We have not tested the change specifically with ENA, but we are hopeful that this should get you further.

Hope that helps and let us know whether this works for you.

Maciej
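
To make that setting persist across module reloads, the usual modprobe.d mechanism should work (a sketch; the .conf filename is arbitrary):

echo "options sfc_resource enable_af_xdp_flow_filters=0" | sudo tee /etc/modprobe.d/onload-afxdp.conf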

@aneagoe

aneagoe commented Feb 28, 2022

@maciejj-xilinx thanks a lot for this, I'll test with it shortly.

@aneagoe

aneagoe commented Mar 2, 2022

That definitely did the trick and now I see u_polls being ~10x k_polls. So acceleration definitely works. However, it seems to be slower than without onload by ~1us (just observational testing with udpserver/udpclient and isolcpus/pinning). In both cases (with and without onload) I've used the same channel configuration (1).

@maciejj-xilinx
Contributor

We have not characterized Onload on ENA specifically; however, there are some general tuning guidelines to optimize Onload for latency.

On the onload side, the most important latency-wise tunables are:

EF_POLL_USEC=100000
EF_UL_EPOLL=3
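
Applied per-process, that might look like this (a sketch; the server binary and core number are placeholders):

EF_POLL_USEC=100000 EF_UL_EPOLL=3 onload taskset -c 2 ./my_server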

From our observations with Onload over AF_XDP, latency was better than with the kernel stack, though this will depend largely on the internals of the particular network driver and its settings, such as interrupt moderation.

In general we saw really good results with applications such as Redis or an Nginx proxy. With these the focus is throughput, or latency under load.

Can I ask what sort of application you are working with?

@aneagoe

aneagoe commented Mar 8, 2022

The only testing that I've done (where I observed the difference in performance) was using udpclient/udpserver from here: https://github.com/majek/dump/tree/master/how-to-receive-a-packet. I haven't tested with EF_UL_EPOLL=3, but since onload uses the latency profile by default, EF_POLL_USEC=100000 was set (and confirmed via onload_stackdump lots).
In contrast, the same test on two back-to-back servers with X2522 adapters showed an improvement of about ~3.5us when running with onload:

[root@aloha ~]# taskset -c 5 udpclient 10.0.3.3:10000 --polling --timestamp
[*] Sending to 10.0.3.3:10000, polling=1, src_port=65500
pps= 47340 avg= 20.310us dev=  1.583us min= 0.000us  (packet=1.184us/0.333)  
pps= 47674 avg= 20.147us dev=  1.419us min=12.651us  (packet=1.185us/0.358)  
pps= 47691 avg= 20.139us dev=  1.456us min=12.631us  (packet=1.180us/0.330)  
pps= 47550 avg= 20.195us dev=  1.345us min=12.645us  (packet=1.198us/0.289)  
pps= 49298 avg= 19.442us dev=  1.519us min=12.234us  (packet=1.182us/0.334)  
pps= 48211 avg= 19.912us dev=  1.408us min=12.673us  (packet=1.182us/0.340)   
^C
[root@aloha ~]# taskset -c 5 onload udpclient 10.0.3.3:10000 --polling --timestamp
[*] Sending to 10.0.3.3:10000, polling=1, src_port=65500
oo:udpclient[1565109]: Using OpenOnload 7.1.1.75 [6]
oo:udpclient[1565109]: Copyright 2019-2021 Xilinx, 2006-2019 Solarflare Communications, 2002-2005 Level 5 Networks
pps= 65540 avg= 14.303us dev=  0.530us min= 0.000us  (packet=0.205us/0.115)  
pps= 64646 avg= 14.702us dev=  1.230us min= 9.197us  (packet=0.280us/0.099)  
pps= 64635 avg= 14.717us dev=  1.046us min= 9.121us  (packet=0.307us/0.250)  
pps= 66688 avg= 14.243us dev=  0.521us min= 8.918us  (packet=0.292us/0.166)  
pps= 65226 avg= 14.575us dev=  1.109us min= 9.080us  (packet=0.296us/0.076)  
pps= 64746 avg= 14.674us dev=  0.964us min= 9.156us  (packet=0.299us/0.096)  

Both on AWS and on the physical servers I've used isolcpus and kernel parameter optimizations.
I can imagine that running in single-channel mode has some detrimental effects, but even when not using onload the adapter was still set to a single channel, so the comparison is quite fair.

@shirshen12

shirshen12 commented Mar 23, 2022

@aneagoe I have been able to run Onload-on-AF_XDP on Mellanox NICs (Connect X 4 Lx and beyond with mlx5_core and AF_XDP ZC support) and Intel NICs (ixgbe/i40e) on OpenStack and bare metal (Azure in progress), and the numbers are amazing for Redis and Memcached.

Redis: 2X vis-a-vis kernel
Memcached: 35% better vis-a-vis kernel

One major issue is that Onload-on-AF_XDP by default does not use all of the hidden queues which are mapped to Onload, the way it does for Onload-on-ef_vi on Solarflare NICs.

You can check that by:

ethtool -S <ifname> | grep xdp_redirect

and you will see just one queue with all packets being bounced into Onload.

Auto sensing of multi-threaded apps is a problem in AF_XDP by design.

Someone from Onload team, please correct me if I am wrong.

@mikhail-lapshin

Hi maciejj-xilinx

I'm trying the latest Onload with ENA, and this does not seem to be working for me.

Disabling ntuple filter can be achieved with
echo 0 > /sys/module/sfc_resource/parameters/enable_af_xdp_flow_filters

$ ethtool -S ens5 | grep xdp
     queue_0_rx_xdp_aborted: 3281
     queue_0_rx_xdp_drop: 0
     queue_0_rx_xdp_pass: 388380
     queue_0_rx_xdp_tx: 0
     queue_0_rx_xdp_invalid: 0
     queue_0_rx_xdp_redirect: 495933

queue_0_rx_xdp_redirect is growing when I run some traffic. queue_0_rx_xdp_pass is growing as well. Does this imply that it works?

dmesg still has FIXME AF_XDP in it:

[ 5902.191771] [onload] Onload  
[ 5902.194635] [onload] Copyright 2019-2022 Xilinx, 2006-2019 Solarflare Communications, 2002-2005 Level 5 Networks
[ 5902.304378] onload_cp_server[36188]: Spawned daemon process 36208
[ 5907.916669] [sfc efrm] efrm_nondl_register_device: register ens5
[ 5907.920690] [sfc efrm] ens5 type=4:\x000 ifindex=2
[ 5907.922772] ena 0000:00:05.0 ens5: XDP program is set, changing the max_mtu from 9216 to 3498
[ 5907.922786] [sfc efrm] ens5 index=0 ifindex=2
[ 5907.922798] [onload] oo_nic_add: ifindex=2 oo_index=0
[ 5984.419463] [onload] [0]: WARNING: huge pages are incompatible with AF_XDP. Disabling hugepage support.
[ 5994.476638] [sfc efhw] af_xdp_flush_rx_dma_channel: FIXME AF_XDP
[ 5994.480669] [sfc efhw] af_xdp_flush_tx_dma_channel: FIXME AF_XDP

My measurements with and without onload are the same, so I strongly believe onload is not working in my case.

How do I troubleshoot this?

@shirshen12

shirshen12 commented May 18, 2022

Ignore those FIXME AF_XDP messages. You need to see whether the XDP_REDIRECT clause is implemented in the ENA driver or not. This is the same problem I see in Azure.

Onload works on all drivers where the XDP_REDIRECT clause is implemented, like Intel (ixgbe/i40e) and Mellanox (mlx5_core).

You can try in AWS with EC2 instances which give PCI passthrough to Intel 82599 NICs.
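
A rough way to check that, given a kernel source tree (the path is for the in-tree ena driver; the out-of-tree amzn-drivers layout differs):

grep -rn "XDP_REDIRECT" drivers/net/ethernet/amazon/ena/   # no hits would mean the action is never handled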

@mikhail-lapshin

EC2 with an Intel NIC is not working either.
It fails on registration:
echo ens3 | sudo tee /sys/module/sfc_resource/afxdp/register

[ 1839.705182] [onload] Onload  
[ 1839.707833] [onload] Copyright 2019-2022 Xilinx, 2006-2019 Solarflare Communications, 2002-2005 Level 5 Networks
[ 1839.819177] onload_cp_server[35494]: Spawned daemon process 35514
[ 1862.899986] [sfc efrm] nondl_register_store: cannot detect number of channels for device ens3 assuming 1
[ 1862.899991] [sfc efrm] efrm_nondl_register_device: register ens3
[ 1862.906456] [sfc efrm] ens3 type=4:\x000 ifindex=2
[ 1862.906732] [sfc efrm] ?: ERROR: hardware init failed rc=-22
[ 1862.912687] [sfc efrm] ens3 index=0 ifindex=2
[ 1862.912742] [onload] oo_nic_add: ifindex=2 oo_index=0

And it's not just a log line to ignore, onload reports the same error.

$ sudo onload  iperf -c 172.31.36.232 -M 64 -t 10 -i 1
oo:iperf[35753]: netif_tcp_helper_alloc_u: ENODEV.
This error can occur if:
 - no Solarflare network interfaces are active/UP, or they are running
   packed stream firmware or are disabled, and
 - there are no AF_XDP interfaces registered with sfc_resource
Please check your configuration.
WARNING: attempt to set TCP maxmimum segment size to 64 failed.
Setting the MSS may not be implemented on this OS.

I would still prefer to sort out the issue with ENA. These Intel NIC instances don't look good for our needs.

@shirshen12

shirshen12 commented May 19, 2022

Please use this setup in AWS with Ubuntu 21.04 and report back. Maybe I can help from there.

NIC: Intel 82599 - 10GbE
Driver: ixgbe
OS: Ubuntu 21.04

### upgrade to latest OS kernel
apt update -y
apt upgrade -y
apt full-upgrade -y

### reboot into new kernel
reboot

apt install build-essential net-tools unzip libcap-dev linux-tools-common linux-tools-generic netperf libevent-dev libnl-route-3-dev tk bison tcl libnl-3-dev flex libnl-route-3-200 dracut python2 libpcap-dev -y
apt install initramfs-tools -y

### build the intel driver, ixgbe
wget https://downloadmirror.intel.com/682680/ixgbe-5.13.4.tar.gz
tar zxf ixgbe-5.13.4.tar.gz
cd ixgbe-5.13.4/src/
make install

### The binary will be installed as:
/lib/modules/<KERNEL VER>/updates/drivers/net/ethernet/intel/ixgbe/ixgbe.ko

### Load the ixgbe module using the modprobe command.
rmmod ixgbe; modprobe ixgbe

### update the initrd/initramfs file to prevent the OS loading old versions of the ixgbe driver.
update-initramfs -u

### reboot again, just for safety
reboot

### Install Onload:
git clone https://github.com/Xilinx-CNS/onload.git
cd onload
scripts/onload_mkdist --release
cd onload-<version>/scripts/
./onload_install
./onload_tool reload

### register the NiC with AF_XDP driver interface
echo enp1s0 > /sys/module/sfc_resource/afxdp/register

### turn on Intel Flow Director
ethtool --features enp1s0 ntuple on

@Ventus5566

@shirshen12

Hi,
which AWS EC2 instance type did you use?
On AWS EC2, I use Ubuntu 20.04 LTS, but the network adapter is ixgbevf, not ixgbe.

There are three commands that give error messages:

  1. ./onload_tool reload
    error message like this:
    "modprobe: ERROR: could not insert 'sfc': Unknown symbol in module, or unknown parameter (see dmesg)
    modprobe: ERROR: ../libkmod/libkmod-module.c:990 command_do() Error running install command '/sbin/modprobe --ignore-install sfc && if [ -e /lib/modules/$(uname -r)/extra/sfc_aoe.ko ]; then /sbin/modprobe sfc_aoe; fi' for module sfc: retcode 1
    modprobe: ERROR: could not insert 'sfc': Invalid argument
    onload_tool: FAILED: /sbin/modprobe sfc

  2. echo enp1s0 > /sys/module/sfc_resource/afxdp/register
    after I run this command, the register file is still empty
    I also ran "ulimit -l unlimited", but it did not seem to help

  3. ethtool --features ens4 ntuple on
    the error message shows
    "Cannot change ntuple-filters"
    "Could not change any device features"
    and "ethtool -k ens4" shows "ntuple-filters: off [fixed]"

@shirshen12

@Ventus5566 yes, it's ixgbevf and not ixgbe directly! You will need PCI passthrough access to Intel NICs at minimum, which means ixgbe is required. I am not sure about ixgbevf. @rhughes-xilinx once said that ixgbevf needs patches.

I have tested on:

  1. Bare-metal Intel 82599 and XL710 (ixgbe and i40e)
  2. Bare-metal Mellanox Connect X 4Lx, 5X and 6X (mlx5_core 5.4.1 onwards)
  3. VMs with PCI passthrough/SR-IOV access to Intel 82599 and XL710 (ixgbe and i40e) and Mellanox Connect X 4Lx, 5X and 6X (mlx5_core 5.4.1 onwards)

@lparkersc

Hi all. This post is a bit stale. I've run into a few issues with setting up onload with ENA on AWS. I've created a new issue here. Any help would be much appreciated.
