
How to test XDP_TX performance using Linux traffic gen? #639

Closed
williamtu opened this issue Oct 18, 2021 · 21 comments
Labels: documentation (Improvements or additions to documentation), triaged (Discussed in a triage meeting)

williamtu commented Oct 18, 2021

Hi,

I have a Linux sender machine and a Windows receiver machine. I followed the xdp_tests.exe guide
https://github.com/microsoft/ebpf-for-windows/blob/master/docs/GettingStarted.md#xdp_testsexe
to load the XDP program using netsh:

netsh ebpf>add program reflect_packet.o xdp
netsh ebpf>show programs

    ID  Pins  Links  Mode       Type           Name
======  ====  =====  =========  =============  ====================
 65539     1      1  JIT        xdp            reflect_packet

And I have another Linux machine running DPDK testpmd to generate traffic:

root@t3600-2:~/dpdk/build/app# ./dpdk-testpmd -a 0000:02:00.0 -- -i --port-topology=chained --forward-mode=txonly --eth-peer=0,18:66:da:a2:62:6c --tx-ip=10.20.114.118,10.20.114.115
Waiting for lcores to finish...

  ---------------------- Forward statistics for port 0  ----------------------
  RX-packets: 0              RX-dropped: 0             RX-total: 0
  TX-packets: 512830304      TX-dropped: 748000        TX-total: 513578304
  ----------------------------------------------------------------------------

At Windows, I saw RX packets, but no TX packets.
[screenshot: Task Manager network counters, 2021-10-18 6:37 AM]

Questions:
How do I know which Windows interface the XDP program binds to?
Is there a tool or command to see the XDP_TX packet rate?
(Or any pointer to the source code for me to read.)

Thank you
William

@shankarseal (Collaborator)

The reflect_packet program only reflects UDP datagrams destined to REFLECTION_TEST_PORT:

#define REFLECTION_TEST_PORT 8989

I will update the MD with the above information.

Currently the XDP eBPF program binds to all interfaces. Since the packets are intercepted and reflected back at a low layer (just above the NDIS miniport driver), I doubt taskmgr will be able to show them, though.

@shankarseal shankarseal self-assigned this Oct 18, 2021
@shankarseal shankarseal added this to the 2110 milestone Oct 18, 2021
@dthaler dthaler added triaged Discussed in a triage meeting documentation Improvements or additions to documentation labels Oct 18, 2021
@williamtu (Author)

Thank you!
I changed my traffic gen to use UDP port 8989, and unfortunately hit a blue screen:
"DRIVER IRQL NOT LESS THAN OR EQUAL"
[screenshot: blue screen, 2021-10-18 9:22 AM]

How do I share the error info or debug this issue?
Thanks
William

@Alan-Jowett (Member)

To make this easier to debug, can you run this in interpret mode with driver verifier enabled? Due to limitations of the Windows kernel, the debugger can't unwind through generated code in the kernel.

To enable driver verifier run:
verifier /standard /bootmode persistent /driver ebpfcore.sys netebpfext.sys sample_ebpf_ext.sys

Once you have it running in interpret mode, attach a kernel-mode debugger and capture a stack backtrace to better understand where it's crashing.
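If no kernel debugger is attached yet, network (KDNET) debugging is usually the quickest path. A sketch, where the host IP, port, and key are placeholders to substitute for your own setup:

```shell
# On the target (crashing) machine, from an elevated prompt:
bcdedit /debug on
bcdedit /dbgsettings net hostip:10.20.114.1 port:50000 key:1.2.3.4
shutdown /r /t 0

# On the host machine, attach WinDbg over the network with the same key:
windbg -k net:port=50000,key=1.2.3.4
```

Once attached, `!analyze -v` and `k` after the crash should give the stack backtrace.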


williamtu commented Oct 18, 2021

Thanks!
I enabled interpret mode, rebooted, and tested again; it still blue-screens.
I'm new to kernel-mode debugger setup.
Is there an easier way to do it? I was trying to set up a kernel-mode debugger using Visual Studio but didn't succeed. It will take some time to get back to you.

PS C:\> verifier /standard /bootmode persistent /driver ebpfcore.sys netebpfext.sys sample_ebpf_ext.sys

Verifier Flags: 0x001209bb

  Standard Flags:

    [X] 0x00000001 Special pool.
    [X] 0x00000002 Force IRQL checking.
    [X] 0x00000008 Pool tracking.
    [X] 0x00000010 I/O verification.
    [X] 0x00000020 Deadlock detection.
    [X] 0x00000080 DMA checking.
    [X] 0x00000100 Security checks.
    [X] 0x00000800 Miscellaneous checks.
    [X] 0x00020000 DDI compliance checking.

  Additional Flags:

    [ ] 0x00000004 Randomized low resources simulation.
    [ ] 0x00000200 Force pending I/O requests.
    [ ] 0x00000400 IRP logging.
    [ ] 0x00002000 Invariant MDL checking for stack.
    [ ] 0x00004000 Invariant MDL checking for driver.
    [ ] 0x00008000 Power framework delay fuzzing.
    [ ] 0x00010000 Port/miniport interface checking.
    [ ] 0x00040000 Systematic low resources simulation.
    [ ] 0x00080000 DDI compliance checking (additional).
    [ ] 0x00200000 NDIS/WIFI verification.
    [ ] 0x00800000 Kernel synchronization delay fuzzing.
    [ ] 0x01000000 VM switch verification.
    [ ] 0x02000000 Code integrity checks.

  Internal Flags:

    [X] 0x00100000 Extended Verifier flags (internal).

    [X] Indicates flag is enabled.

  Boot Mode:

    Persistent

  Rules:

    All rules are using default settings

  Extensions:

    wdm: rules.default

  Verified Drivers:

    ebpfcore.sys
    netebpfext.sys
    sample_ebpf_ext.sys
The system reboot is required for the changes to take effect.

@Alan-Jowett (Member)

Alternatively, you can grab the memory dump from %windir%\memory.dmp, plus the driver files (ebpfcore.sys + netebpfext.sys) and the matching PDB files. That should be enough to debug it.

@williamtu (Author)

Hi Alan,
Thanks, it looks like an mlx5 kernel driver issue.
However, I don't have the mlx5 driver symbols to decode the full stack trace.

Microsoft (R) Windows Debugger Version 10.0.19041.685 AMD64
Copyright (c) Microsoft Corporation. All rights reserved.


Loading Dump File [C:\Windows\MEMORY.DMP]
Kernel Bitmap Dump File: Kernel address space is available, User address space may not be available.

Symbol search path is: srv*
Executable search path is: 
Page 200002d2d too large to be in the dump file.
Page 2000040d7 too large to be in the dump file.
Page 2000040d7 too large to be in the dump file.
Windows 10 Kernel Version 17763 MP (12 procs) Free x64
Product: Server, suite: TerminalServer DataCenter SingleUserTS
Built by: 17763.1.amd64fre.rs5_release.180914-1434
Machine Name:
Kernel base = 0xfffff805`3801d000 PsLoadedModuleList = 0xfffff805`3843c9b0
Debug session time: Mon Oct 18 18:11:37.935 2021 (UTC - 7:00)
System Uptime: 0 days 0:04:55.868
Page 200002d2d too large to be in the dump file
...
...Page 20084f138 too large to be in the dump file.
.
Loading User Symbols

Loading unloaded module list
.......
For analysis of this file, run !analyze -v
4: kd> !analyze -v
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************

DRIVER_IRQL_NOT_LESS_OR_EQUAL (d1)

An attempt was made to access a pageable (or completely invalid) address at an
interrupt request level (IRQL) that is too high.  This is usually
caused by drivers using improper addresses.
If kernel debugger is available get stack backtrace.
Arguments:
Arg1: 0000000000000020, memory referenced
Arg2: 0000000000000002, IRQL
Arg3: 0000000000000008, value 0 = read operation, 1 = write operation
Arg4: 0000000000000020, address which referenced memory

Debugging Details:
------------------

Page 20010df25 too large to be in the dump file.

KEY_VALUES_STRING: 1

    Key  : Analysis.CPU.Sec
    Value: 2

    Key  : Analysis.DebugAnalysisProvider.CPP
    Value: Create: 8007007e on WIN-JLNCA019LNG

    Key  : Analysis.DebugData
    Value: CreateObject

    Key  : Analysis.DebugModel
    Value: CreateObject

    Key  : Analysis.Elapsed.Sec
    Value: 10

    Key  : Analysis.Memory.CommitPeak.Mb
    Value: 71

    Key  : Analysis.System
    Value: CreateObject


BUGCHECK_CODE:  d1

BUGCHECK_P1: 20

BUGCHECK_P2: 2

BUGCHECK_P3: 8

BUGCHECK_P4: 20

READ_ADDRESS:  0000000000000020 

PROCESS_NAME:  System

BLACKBOXBSD: 1 (!blackboxbsd)


TRAP_FRAME:  ffffa2013f252f50 -- (.trap 0xffffa2013f252f50)
NOTE: The trap frame does not contain all registers.
Some register values may be zeroed or incorrect.
rax=0000000000000020 rbx=0000000000000000 rcx=fffff801acde2ed0
rdx=ffffa8011ebe4690 rsi=0000000000000000 rdi=0000000000000000
rip=0000000000000020 rsp=ffffa2013f2530e8 rbp=ffffa2013f253159
 r8=0000000000000001  r9=0000000000000001 r10=ffffa8011ebe4690
r11=ffffa8011a8451a0 r12=0000000000000000 r13=0000000000000000
r14=0000000000000000 r15=0000000000000000
iopl=0         nv up ei pl zr na po nc
00000000`00000020 ??              ???
Resetting default scope

FAILED_INSTRUCTION_ADDRESS: 
+0
00000000`00000020 ??              ???

STACK_TEXT:  
ffffa201`3f252e08 fffff805`381df869 : 00000000`0000000a 00000000`00000020 00000000`00000002 00000000`00000008 : nt!KeBugCheckEx
ffffa201`3f252e10 fffff805`381dbc8e : ffffa201`3f2530c8 fffff801`ac629cfd 00000000`00000000 ffffa201`3f253019 : nt!KiBugCheckDispatch+0x69
ffffa201`3f252f50 00000000`00000020 : fffff801`ab27194d ffffa801`1b2cb010 ffffa801`1ebe4690 ffffa201`3f253280 : nt!KiPageFault+0x44e
ffffa201`3f2530e8 fffff801`ab27194d : ffffa801`1b2cb010 ffffa801`1ebe4690 ffffa201`3f253280 00000000`00000001 : 0x20
ffffa201`3f2530f0 fffff801`ab2766a3 : ffffa801`1a8451a0 00000000`00000000 ffffa801`00000001 00000000`00000000 : NDIS!ndisMSendCompleteNetBufferListsInternal+0x22d
ffffa201`3f2531c0 fffff801`ab271f91 : 00000000`00000000 ffffa801`1bce2e50 ffffa801`1a3ea040 00000000`00000000 : NDIS!ndisCallSendCompleteHandler+0x33
ffffa201`3f253200 fffff801`ace67fc7 : ffffa801`1a8451a0 ffffa801`1ebe4690 ffffa801`1a8917e0 fffff805`388c8100 : NDIS!NdisMSendNetBufferListsComplete+0x301
ffffa201`3f253310 fffff801`ace6603e : 00000000`00000002 ffffa801`1b632870 fffff801`ab2f93d0 fffff801`ac773f10 : mlx5+0x87fc7
ffffa201`3f253340 fffff801`ace5fb2c : ffffa801`1b65b001 00000000`00000000 00000000`00000000 fffff805`000007ff : mlx5+0x8603e
ffffa201`3f253450 fffff801`ace5febd : 00000003`905ebc57 00000000`00000002 ffffa201`3f2536a8 00000000`00000000 : mlx5+0x7fb2c
ffffa201`3f253620 fffff801`acecbcdb : ffffa801`1b621058 ffffa801`1a902000 00000000`00000000 00000000`00000000 : mlx5+0x7febd
ffffa201`3f253680 fffff801`acec5a1a : 00000000`00000000 ffffa801`1adacd70 ffffa801`1adacd70 ffffa201`3f25393c : mlx5+0xebcdb
ffffa201`3f2536f0 fffff801`acec6b26 : ffffa801`00000000 ffffa801`1adacd70 ffffa801`00000080 ffffa801`1a902000 : mlx5+0xe5a1a
ffffa201`3f253780 fffff801`acec2c5c : ffffa801`1adacd70 ffffa201`3f2538a9 ffffa801`1a8451a0 ffffa801`1a902000 : mlx5+0xe6b26
ffffa201`3f2537b0 fffff801`ab276838 : ffffa801`1aebd000 000000cd`a5cbf013 00000000`00000000 ffffa801`1a506010 : mlx5+0xe2c5c
ffffa201`3f2537e0 fffff805`3807fda7 : 00000000`00000000 ffffa201`3f25393c 00000000`1f300000 ffffbc00`4e4cb180 : NDIS!ndisInterruptDpc+0x188
ffffa201`3f253910 fffff805`3807f3ee : 00000000`0000001a 00000000`00989680 ffffbc00`4e4db300 00000000`00000019 : nt!KiExecuteAllDpcs+0x2e7
ffffa201`3f253a50 fffff805`381d1a5a : ffffffff`00000000 ffffbc00`4e4cb180 00000000`00000000 ffffbc00`4e4db300 : nt!KiRetireDpcList+0x1ae
ffffa201`3f253c60 00000000`00000000 : ffffa201`3f254000 ffffa201`3f24e000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x5a


SYMBOL_NAME:  mlx5+87fc7

MODULE_NAME: mlx5

IMAGE_NAME:  mlx5.sys

STACK_COMMAND:  .thread ; .cxr ; kb

BUCKET_ID_FUNC_OFFSET:  87fc7

FAILURE_BUCKET_ID:  AV_VRF_CODE_AV_NULL_IP_mlx5!unknown_function

OS_VERSION:  10.0.17763.1

BUILDLAB_STR:  rs5_release

OSPLATFORM_TYPE:  x64

OSNAME:  Windows 10

FAILURE_ID_HASH:  {7a60976f-829a-ec80-6ee3-725a87250f98}

Followup:     MachineOwner
---------

@Alan-Jowett (Member)

Interesting. @shankarseal can you take a look? I think something might be corrupted in the NBL as it's not finding the right send completion handler for the packet.


shankarseal commented Oct 18, 2021

@williamtu can you maybe upload the kernel crash dump?
Also, can you give repro steps for the traffic generation (preferably from a Windows machine) so that I can repro this locally?

@williamtu (Author)

Hi @shankarseal,

Thanks! What traffic generation software should I use on Windows?

On my side, the traffic generator Linux machine and the Windows machine are connected back-to-back. It's hard to re-image the traffic gen machine to run Windows. I will try, but it might take some time.
I attached the MEMORY.DMP 350MB file:
https://drive.google.com/file/d/1ooRvUXtZHBJknzWHIRJOrUmdZxIuhXy3/view?usp=sharing

@shankarseal (Collaborator)

Thanks for sharing the crash dump. Can you please also upload c:\ebpf-for-windows\x64\Debug\NetEbpfExt.pdb ?

@williamtu (Author)

Please see netebpfext.pdb below, thanks
NetEbpfExt.pdb.zip


shankarseal commented Oct 19, 2021

Thanks for the symbols. I could get some diagnostic information from the crash dump, but I could not root-cause it. I will try to get a repro myself. I am assuming you just blasted the target machine with a lot of UDP datagrams with dst port == 8989? The packet that caused the crash is this:

    00000000 1c 34 da 64 3b b4 1c 34 da 64 3b a8 08 00 45 00  ·4·d;··4·d;···E·
    00000010 00 32 00 00 00 00 40 11 32 29 c4 fe 5f 4b c4 fe  ·2····@·2)··_K··
    00000020 5f 4a 23 1d 23 1d 00 1e 00 00 00 00 00 00 00 00  _J#·#···········
    00000030 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ················

30 bytes of zeros. Is that right? I will try to do the same.

@williamtu (Author)

Thanks a lot!
Yes, the packet content you posted is correct; it's from the DPDK testpmd app.


shankarseal commented Oct 19, 2021

Update: I tried ntttcp and flooded the target VM running reflect_packet with 20K datagrams, and saw no crash. Are you using reflect_packet or encap_reflect_packet? I will try testpmd as well, but I wonder how much difference the traffic gen tool will make.

@williamtu (Author)

I'm using reflect_packet:
netsh ebpf>add program reflect_packet.o xdp
Does the packet rate matter? I'm sending 30 Mpps. Should I send slower?

@shankarseal (Collaborator)

Never mind, I have a repro.
I was earlier running on a different Windows version; I have a repro on the Server 2019 release. I can easily investigate now that I have an on-demand repro.

@williamtu (Author)

That's great! Thank you.


dthaler commented Oct 21, 2021

Should be fixed by #641. Please reopen if not.

@dthaler dthaler closed this as completed Oct 21, 2021

shankarseal commented Oct 21, 2021

@williamtu - The bug has been fixed. Please pull the latest changes and give it a try.

Note that this implementation of XDP is more a prototype than a finished product. We just implemented the basic functionality and helpers rather than focusing on perf.

With the current implementation, using the NTTTCP tool, I observed up to 100K packets sent/received per second on the VM running the XDP eBPF program, with one core being busy. I tried but could not spread the traffic across the other cores of the VM; I will keep trying to do that.

You can measure the packets per second by running perfmon.exe and adding a counter under Network Interface -> Packets Sent/Packets Received.
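If you prefer the command line over the perfmon GUI, typeperf can sample the same counters. A sketch, sampling every NIC once a second for 30 samples (the counter paths are the standard Network Interface set; the wildcard can be replaced with a specific adapter name):

```shell
typeperf "\Network Interface(*)\Packets Sent/sec" "\Network Interface(*)\Packets Received/sec" -si 1 -sc 30
```

The output is CSV, so it is easy to log to a file with -o for later comparison between runs.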

@williamtu (Author)

@shankarseal
Thank you! I'm rerunning my test and will get back to you.

@williamtu (Author)

@shankarseal, thanks, I pulled the latest code and it works OK!
I used perfmon.exe and got around 104 Kpps of XDP_TX using a single core.

[screenshot: perfmon, 2021-10-21 7:26 PM]
