padding? lose packets when offload is enabled on mellanox #993

AnatoliChe · 2023-01-21T16:01:57Z

Hi! I believe this is a problem with the kernel/driver/firmware, but maybe you can help.
When hw offload is enabled, all packets shorter than 12 bytes are lost.

The problem appears with an MT28841 (Mellanox/Nvidia ConnectX-6 Dx EN adapter card, 25GbE, Dual-port SFP28, PCIe 4.0 x8, Crypto and Secure Boot MCX621102AC-ADAT)
kernel 5.15.89
driver: mlx5_core
firmware-version: 22.32.1010
With Intel cards there are no problems.
With hardware offload enabled I can do
ping 192.168.44.1
with 0% packet loss, and I can do
ping 192.168.44.1 -s 4
12 bytes from 192.168.4.1: icmp_seq=1 ttl=64
with 0% packet loss

but
ping 192.168.44.101 -s 3
it's 11 bytes 0%
I can see the outgoing packet with length 44
and no incoming ESP at the other side.

As soon as the ESP length is 48 bytes, it works.

With offload disabled everything works.
Looks like a problem with padding?

Could you help please?
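
For reference, the expected ESP size for these pings can be worked out by hand. The sketch below is only a reference calculation under stated assumptions: the RFC 4303 ESP layout in transport mode, an 8-byte IV and a 16-byte ICV (as for AES-GCM with a 128-bit ICV), and payload plus trailer padded to a 4-byte boundary. The function name `esp_length` is hypothetical. How the length 44 reported above was measured is not stated, so the result is not a diagnosis.

```python
def esp_length(payload_len, iv_len=8, icv_len=16, align=4):
    """Expected ESP length in bytes, from the SPI through the ICV.

    Assumptions (not taken from the report): RFC 4303 layout,
    8-byte IV, 16-byte ICV, 4-byte alignment of payload + trailer.
    """
    header = 4 + 4                 # SPI + sequence number
    trailer = 2                    # pad-length byte + next-header byte
    unpadded = payload_len + trailer
    pad = (-unpadded) % align      # padding bytes needed to reach the alignment
    return header + iv_len + unpadded + pad + icv_len


ICMP_HEADER = 8                    # ICMP echo header size in bytes

for size in (3, 4):
    icmp_len = ICMP_HEADER + size  # this is what ping reports as "N bytes from"
    print(f"ping -s {size}: ICMP {icmp_len} bytes, ESP {esp_length(icmp_len)} bytes")
```

Under these assumptions both `-s 3` (11 bytes of ICMP, 3 bytes of padding) and `-s 4` (12 bytes of ICMP, 2 bytes of padding) come out at 48 bytes of ESP, so the two cases differ only in how much padding is needed.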

letoams · 2023-01-24T02:14:21Z

I forwarded your message about what looks like a kernel/hardware bug in IPsec to our ipsec-devel list, where the kernel and hardware people are. They might respond here; otherwise I will forward any responses I get to this issue.

AnatoliChe · 2023-01-24T12:14:38Z

Oh! Many thanks!!!

I tried upgrading the firmware to the latest one.
Now it is
firmware-version: 22.35.2302 (MT_0000000430)
but it didn't help.
Maybe I will try newer drivers than the ones in kernel 5.15.89...

letoams · 2023-01-26T08:33:38Z

On Mon, Jan 23, 2023 at 09:14:14PM -0500, Paul Wouters via Devel wrote:
> Forwarding a message on what looks like a kernel/hardware bug in IPsec
> See #993

We tried to reproduce it on kernel v6.2-rc5 with latest released FW and everything worked as expected without any packet drops. Thanks

paulwouters · 2023-01-26T14:24:15Z

We got a response from nvidia/mellanox:

We tried to reproduce it on kernel v6.2-rc5 with latest released FW and
everything worked as expected without any packet drops.

letoams · 2023-01-26T14:32:44Z

On Thu, 26 Jan 2023, Leon Romanovsky wrote:
> On Mon, Jan 23, 2023 at 09:14:14PM -0500, Paul Wouters via Devel wrote:
> > Forwarding a message on what looks like a kernel/hardware bug in IPsec
> > See #993
>
> We tried to reproduce it on kernel v6.2-rc5 with latest released FW and everything worked as expected without any packet drops.

Thanks for the quick response. We have relayed this reply. Probably they will wait for the 6.2 release kernel for testing, but if/once I hear something, I will let you know. Paul

AnatoliChe · 2023-01-27T22:18:14Z

Thanks!
I get this error with 6.2.0-rc5 and the latest FW too, so it means the problem is in my kernel config.
I must have enabled some option that causes this weird problem
(some option that works fine with Intel but causes trouble for mlx).
So it's only my problem and I should find the solution myself.
Thank you again for the help.

letoams · 2023-01-28T01:49:54Z

> Thanks! I have this error with 6.2.0-rc5 and last FW too so it mean problem with my kernel's config. It means I enabled some options with I have this weird problem. (Some option that Intel can work good and brings troubles to mlx).

Odd. You did not set compress=yes, right? That might cause a difference in processing for smaller and bigger packets. Can you show /proc/net/xfrm_stat after you cause some dropped packets? All values should be 0; if not, that might indicate where to look. Paul
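
A small helper can make the nonzero counters easier to spot. This is a minimal sketch, assuming each line of /proc/net/xfrm_stat is a counter name followed by an integer; the function names and the sample text are hypothetical and only illustrate the format.

```python
def nonzero_xfrm_counters(text):
    """Return {counter_name: value} for every nonzero line of xfrm_stat text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue                      # skip blank or unexpected lines
        name, value = parts
        if value.isdigit() and int(value) != 0:
            counters[name] = int(value)
    return counters


def read_xfrm_stat(path="/proc/net/xfrm_stat"):
    """Read the kernel file (Linux only) and keep the nonzero counters."""
    with open(path) as f:
        return nonzero_xfrm_counters(f.read())


# Made-up sample text, for illustration only.
sample = "XfrmInError 0\nXfrmInNoStates 1\nXfrmOutError 0\n"
print(nonzero_xfrm_counters(sample))
```

Running `read_xfrm_stat()` before and after reproducing the drops and comparing the two results shows which counters moved during the test.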

AnatoliChe · 2023-01-28T06:49:48Z

compress=no
I had problems with compression, so for the last 5 years I have been switching it off right away.

Yes, there are errors in /proc/net/xfrm_stat:

XfrmInNoStates 1
XfrmOutNoStates 2
XfrmAcquireError 19

letoams · 2023-01-28T13:32:16Z

On Jan 28, 2023, at 01:50, AnatoliChe wrote:
> compress=no
> I had problems with compression so last 5 years I'm switching it off at once.

Good.

> Yes there are errors in /proc/net/xfrm_stat
> XfrmInNoStates 1
> XfrmOutNoStates 2
> XfrmAcquireError 19

Interesting. Can you redo this, show the nonzero values, and then “ip xfrm policy”, “ip xfrm state” and “ipsec status”? There is a verbose option too (ip -v with the above commands?) that might provide insights. (Not at a computer now)

AnatoliChe · 2023-01-29T14:08:22Z

sure
ip xfrm policy
src 192.168.44.102/32 dst 192.168.44.101/32
dir out priority 1753281
tmpl src 0.0.0.0 dst 0.0.0.0
proto esp reqid 16389 mode transport
src 192.168.44.101/32 dst 192.168.44.102/32
dir in priority 1753281
tmpl src 0.0.0.0 dst 0.0.0.0
proto esp reqid 16389 mode transport

ip xfrm state
src 192.168.44.101 dst 192.168.44.102
proto esp spi 0xcf79b7a1 reqid 16389 mode transport
replay-window 0
aead rfc4106(gcm(aes)) 0x58523991707387b604d14b9589aa51ed0dfed611 128
lastused 2023-01-29 15:44:54
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
crypto offload parameters: dev eth7 dir in
sel src 192.168.44.101/32 dst 192.168.44.102/32
src 192.168.44.102 dst 192.168.44.101
proto esp spi 0xa197e8e4 reqid 16389 mode transport
replay-window 0
aead rfc4106(gcm(aes)) 0x02508fede9b7982e3c8a0856f7fce8ef865660d9 128
lastused 2023-01-29 15:44:54
anti-replay context: seq 0x0, oseq 0xa9e, bitmap 0x00000000
crypto offload parameters: dev eth7 dir out
sel src 192.168.44.102/32 dst 192.168.44.101/32

from other side:
src 192.168.44.102 dst 192.168.44.101
proto esp spi 0x8762a7cb reqid 16445 mode transport
replay-window 0
aead rfc4106(gcm(aes)) 0x8e0ac059958c3fcb2afb12bf2c2f910db541f16b 128
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
crypto offload parameters: dev eth2 dir in
sel src 192.168.44.102/32 dst 192.168.44.101/32
src 192.168.44.101 dst 192.168.44.102
proto esp spi 0xecd56543 reqid 16445 mode transport
replay-window 0
aead rfc4106(gcm(aes)) 0x88e99cdf7bb4c3135a13818c2891860a79cdfe6e 128
anti-replay context: seq 0x0, oseq 0x8a, bitmap 0x00000000
crypto offload parameters: dev eth2 dir out
sel src 192.168.44.101/32 dst 192.168.44.102/32

src 192.168.44.101/32 dst 192.168.44.102/32
dir out priority 1753281
tmpl src 0.0.0.0 dst 0.0.0.0
proto esp reqid 16445 mode transport
src 192.168.44.102/32 dst 192.168.44.101/32
dir in priority 1753281
tmpl src 0.0.0.0 dst 0.0.0.0
proto esp reqid 16445 mode transport
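
To check quickly that every SA in an `ip xfrm state` listing like the ones above is bound to the expected device and direction, the text can be parsed. This is a minimal sketch that assumes the line shapes shown in this comment ("src A dst B", "proto esp spi 0x...", "crypto offload parameters: dev X dir Y"); the function name `offloaded_sas` is hypothetical, and key material is deliberately left out of the sample.

```python
import re


def offloaded_sas(text):
    """Parse `ip xfrm state` text into a list of dicts, one per SA."""
    sas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        # A bare "src A dst B" line starts a new SA ("sel src ..." does not match).
        m = re.match(r"src (\S+) dst (\S+)$", line)
        if m:
            current = {"src": m.group(1), "dst": m.group(2),
                       "spi": None, "dev": None, "dir": None}
            sas.append(current)
            continue
        if current is None:
            continue
        m = re.search(r"\bspi (0x[0-9a-fA-F]+)", line)
        if m:
            current["spi"] = m.group(1)
        m = re.match(r"crypto offload parameters: dev (\S+) dir (\S+)", line)
        if m:
            current["dev"], current["dir"] = m.group(1), m.group(2)
    return sas


sample = """\
src 192.168.44.101 dst 192.168.44.102
proto esp spi 0xcf79b7a1 reqid 16389 mode transport
crypto offload parameters: dev eth7 dir in
sel src 192.168.44.101/32 dst 192.168.44.102/32
src 192.168.44.102 dst 192.168.44.101
proto esp spi 0xa197e8e4 reqid 16389 mode transport
crypto offload parameters: dev eth7 dir out
sel src 192.168.44.102/32 dst 192.168.44.101/32
"""

for sa in offloaded_sas(sample):
    print(sa)
```

An SA whose `dev` stays `None` after parsing has no offload line in the listing, which would mean it is handled in software.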

ethtool -i eth7
driver: mlx5_core
version: 6.2.0-rc5-ipsec-core2-smp
firmware-version: 22.35.2302 (MT_0000000430)
expansion-rom-version:
bus-info: 0000:03:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

ethtool -k eth7 | grep esp
tx-esp-segmentation: on
esp-hw-offload: on [fixed]
esp-tx-csum-hw-offload: on [fixed]
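
The same check can be scripted when comparing several interfaces or hosts. This is a minimal sketch, assuming the "name: on|off [fixed]" line shape of the `ethtool -k` output shown here; the function name `esp_features` is hypothetical.

```python
def esp_features(text):
    """Return {feature: {"on": bool, "fixed": bool}} for ESP-related ethtool features."""
    features = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep or "esp" not in name:
            continue                       # not a "name: value" line or not ESP-related
        fields = rest.split()
        if not fields:
            continue                       # header line such as "Features for eth7:"
        features[name.strip()] = {
            "on": fields[0] == "on",
            "fixed": "[fixed]" in fields,
        }
    return features


sample = """\
tx-esp-segmentation: on
esp-hw-offload: on [fixed]
esp-tx-csum-hw-offload: on [fixed]
"""

for name, state in esp_features(sample).items():
    print(name, state)
```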

ethtool --show-offload eth7
Features for eth7:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-tunnel-remcsum-segmentation: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: on
tx-udp-segmentation: on
tx-gso-list: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: on
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: on [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: on
esp-hw-offload: on [fixed]
esp-tx-csum-hw-offload: on [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
rx-gro-list: off
macsec-hw-offload: off [fixed]
rx-udp-gro-forwarding: off
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]

ipsec status
000 using kernel interface: xfrm
000
000 interface lo UDP [::1]:4500
000 interface lo UDP [::1]:500
000 interface lo UDP 127.0.0.1:4500
000 interface lo UDP 127.0.0.1:500
000 interface eth7 UDP 192.168.44.102:500
000
000 fips mode=disabled;
000 SElinux=disabled
000 seccomp=unsupported
000
000 config setup options:
000
000 configdir=/etc, configfile=/etc/ipsec.conf, secrets=/etc/ipsec.secrets, ipsecdir=/etc/ipsec.d
000 nssdir=/var/lib/ipsec/nss, dumpdir=/dev/shm/, statsbin=unset
000 dnssec-rootkey-file=/usr/share/dns/root.key, dnssec-trusted=
000 sbindir=/usr/local/sbin, libexecdir=/usr/local/libexec/ipsec
000 pluto_version=5.0, pluto_vendorid=OE-Libreswan-5.0, audit-log=yes
000 nhelpers=-1, uniqueids=yes, dnssec-enable=yes, logappend=yes, logip=yes, shuntlifetime=900s, xfrmlifetime=30s
000 ddos-cookies-threshold=25000, ddos-max-halfopen=50000, ddos-mode=auto, ikev1-policy=accept
000 ikebuf=0, msg_errqueue=yes, crl-strict=no, crlcheckinterval=0, listen=, nflog-all=0
000 ocsp-enable=no, ocsp-strict=no, ocsp-timeout=2, ocsp-uri=
000 ocsp-trust-name=
000 ocsp-cache-size=1000, ocsp-cache-min-age=3600, ocsp-cache-max-age=86400, ocsp-method=get
000 global-redirect=no, global-redirect-to=
000 secctx-attr-type=
000 debug:
000
000 nat-traversal=yes, keep-alive=20, nat-ikeport=4500
000 virtual-private (%priv):
000
000 Kernel algorithms supported:
000
000 algorithm ESP encrypt: name=3DES_CBC, keysizemin=192, keysizemax=192
000 algorithm ESP encrypt: name=AES_CBC, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_CCM_12, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_CCM_16, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_CCM_8, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_CTR, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_GCM_12, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_GCM_16, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_GCM_8, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=CAMELLIA_CBC, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=CHACHA20_POLY1305, keysizemin=256, keysizemax=256
000 algorithm ESP encrypt: name=NULL, keysizemin=0, keysizemax=0
000 algorithm ESP encrypt: name=NULL_AUTH_AES_GMAC, keysizemin=128, keysizemax=256
000 algorithm AH/ESP auth: name=AES_CMAC_96, key-length=128
000 algorithm AH/ESP auth: name=AES_XCBC_96, key-length=128
000 algorithm AH/ESP auth: name=HMAC_MD5_96, key-length=128
000 algorithm AH/ESP auth: name=HMAC_SHA1_96, key-length=160
000 algorithm AH/ESP auth: name=HMAC_SHA2_256_128, key-length=256
000 algorithm AH/ESP auth: name=HMAC_SHA2_256_TRUNCBUG, key-length=256
000 algorithm AH/ESP auth: name=HMAC_SHA2_384_192, key-length=384
000 algorithm AH/ESP auth: name=HMAC_SHA2_512_256, key-length=512
000 algorithm AH/ESP auth: name=NONE, key-length=0
000
000 IKE algorithms supported:
000
000 algorithm IKE encrypt: v1id=5, v1name=OAKLEY_3DES_CBC, v2id=3, v2name=3DES, blocksize=8, keydeflen=192
000 algorithm IKE encrypt: v1id=8, v1name=OAKLEY_CAMELLIA_CBC, v2id=23, v2name=CAMELLIA_CBC, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=-1, v1name=n/a, v2id=20, v2name=AES_GCM_C, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=-1, v1name=n/a, v2id=19, v2name=AES_GCM_B, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=-1, v1name=n/a, v2id=18, v2name=AES_GCM_A, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=13, v1name=OAKLEY_AES_CTR, v2id=13, v2name=AES_CTR, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=7, v1name=OAKLEY_AES_CBC, v2id=12, v2name=AES_CBC, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=-1, v1name=n/a, v2id=28, v2name=CHACHA20_POLY1305, blocksize=16, keydeflen=256
000 algorithm IKE PRF: name=HMAC_MD5, hashlen=16
000 algorithm IKE PRF: name=HMAC_SHA1, hashlen=20
000 algorithm IKE PRF: name=HMAC_SHA2_256, hashlen=32
000 algorithm IKE PRF: name=HMAC_SHA2_384, hashlen=48
000 algorithm IKE PRF: name=HMAC_SHA2_512, hashlen=64
000 algorithm IKE PRF: name=AES_XCBC, hashlen=16
000 algorithm IKE DH Key Exchange: name=MODP1536, bits=1536
000 algorithm IKE DH Key Exchange: name=MODP2048, bits=2048
000 algorithm IKE DH Key Exchange: name=MODP3072, bits=3072
000 algorithm IKE DH Key Exchange: name=MODP4096, bits=4096
000 algorithm IKE DH Key Exchange: name=MODP6144, bits=6144
000 algorithm IKE DH Key Exchange: name=MODP8192, bits=8192
000 algorithm IKE DH Key Exchange: name=DH19, bits=512
000 algorithm IKE DH Key Exchange: name=DH20, bits=768
000 algorithm IKE DH Key Exchange: name=DH21, bits=1056
000 algorithm IKE DH Key Exchange: name=DH31, bits=256
000
000 stats db_ops: {curr_cnt, total_cnt, maxsz} :context={0,0,0} trans={0,0,0} attrs={0,0,0}
000
000 Connection list:
000
000 "peer1-peer210g1": 192.168.44.102[@peer110gf1.ipsec.tkb]...192.168.44.101[@peer210g1.ipsec.tkb]; erouted; eroute owner: #3
000 "peer1-peer210g1": oriented; my_ip=unset; their_ip=unset; my_updown=ipsec _updown;
000 "peer1-peer210g1": xauth us:none, xauth them:none, my_username=[any]; their_username=[any]
000 "peer1-peer210g1": our auth:rsasig(RSASIG+RSASIG_v1_5), their auth:RSASIG+RSASIG_v1_5, our autheap:none, their autheap:none;
000 "peer1-peer210g1": modecfg info: us:none, them:none, modecfg policy:push, dns:unset, domains:unset, cat:unset;
000 "peer1-peer210g1": sec_label:unset;
000 "peer1-peer210g1": ike_life: 28800s; ipsec_life: 1200s; ipsec_max_bytes: 2^63B; ipsec_max_packets: 2^63; replay_window: 0; rekey_margin: 300s; rekey_fuzz: 50%; keyingtries: 0;
000 "peer1-peer210g1": retransmit-interval: 500ms; retransmit-timeout: 60s; iketcp:no; iketcp-port:4500;
000 "peer1-peer210g1": initial-contact:no; cisco-unity:no; fake-strongswan:no; send-vendorid:no; send-no-esp-tfc:no;
000 "peer1-peer210g1": policy: IKEv2+RSASIG+RSASIG_v1_5+ENCRYPT+PFS+UP+IKE_FRAG_ALLOW+ESN_NO;
000 "peer1-peer210g1": v2-auth-hash-policy: SHA2_256+SHA2_384+SHA2_512;
000 "peer1-peer210g1": conn_prio: 32,32; interface: eth7; metric: 0; mtu: unset; sa_prio:auto; sa_tfc:none;
000 "peer1-peer210g1": nflog-group: unset; mark: unset; vti-iface:unset; vti-routing:no; vti-shared:no; nic-offload:yes;
000 "peer1-peer210g1": our idtype: ID_FQDN; our id=@peer110gf1.ipsec.tkb; their idtype: ID_FQDN; their id=@peer210g1.ipsec.tkb
000 "peer1-peer210g1": liveness: active; dpdaction:restart; dpddelay:30s; retransmit-timeout:60s
000 "peer1-peer210g1": nat-traversal: encaps:auto; keepalive:20s
000 "peer1-peer210g1": newest IKE SA: #1; newest IPsec SA: #3; conn serial: $1;
000 "peer1-peer210g1": IKE algorithms: AES_GCM_16_128-HMAC_SHA2_512+HMAC_SHA2_256-MODP2048
000 "peer1-peer210g1": IKEv2 algorithm newest: AES_GCM_16_128-HMAC_SHA2_512-MODP2048
000 "peer1-peer210g1": ESP algorithms: AES_GCM_16_128-NONE-MODP2048
000 "peer1-peer210g1": ESP algorithm newest: AES_GCM_16_128-NONE; pfsgroup=
000
000 Total IPsec connections: loaded 1, active 1
000
000 State Information: DDoS cookies not required, Accepting new IKE connections
000 IKE SAs: total(1), half-open(0), open(0), authenticated(1), anonymous(0)
000 IPsec SAs: total(1), authenticated(1), anonymous(0)
000
000 #1: "peer1-peer210g1":500 STATE_V2_ESTABLISHED_IKE_SA (established IKE SA); REKEY in 27270s; REPLACE in 27711s; newest; idle;
000 #3: "peer1-peer210g1":500 STATE_V2_ESTABLISHED_CHILD_SA (established Child SA); LIVENESS in 0s; REKEY in 426s; REPLACE in 870s; newest; eroute owner; IKE SA #1; idle;
000 #3: "peer1-peer210g1" esp.8762a7cb@192.168.44.101 esp.ecd56543@192.168.44.102 Traffic: ESPin=5KB ESPout=7KB ESPmax=2^63B
000
000 Bare Shunt list:
000

cat /proc/net/xfrm_stat
XfrmInNoStates 1
XfrmAcquireError 45

but I guess it can be related to the ipsec restart.

AnatoliChe · 2023-03-20T07:07:17Z

It's not related to padding; it's a DX6 problem.

paulwouters · 2024-02-21T19:33:32Z

I again confirmed, I cannot reproduce this issue:
root@tundra:~# ipsec traffic
#2: "test", type=ESP(nic-offload=packet), add_time=1708543865, inBytes=696, outBytes=1170, maxBytes=2^63B, id='10.0.1.1'

root@tundra:~# ping 10.0.1.1 -s 3
PING 10.0.1.1 (10.0.1.1) 3(31) bytes of data.
11 bytes from 10.0.1.1: icmp_seq=1 ttl=64
11 bytes from 10.0.1.1: icmp_seq=2 ttl=64
11 bytes from 10.0.1.1: icmp_seq=3 ttl=64
11 bytes from 10.0.1.1: icmp_seq=4 ttl=64
11 bytes from 10.0.1.1: icmp_seq=5 ttl=64
11 bytes from 10.0.1.1: icmp_seq=6 ttl=64
^C
--- 10.0.1.1 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5045ms

root@tundra:~# ping 10.0.1.1 -s 4
PING 10.0.1.1 (10.0.1.1) 4(32) bytes of data.
12 bytes from 10.0.1.1: icmp_seq=1 ttl=64
12 bytes from 10.0.1.1: icmp_seq=2 ttl=64
12 bytes from 10.0.1.1: icmp_seq=3 ttl=64
12 bytes from 10.0.1.1: icmp_seq=4 ttl=64
12 bytes from 10.0.1.1: icmp_seq=5 ttl=64
^C
--- 10.0.1.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4044ms

sorry, nothing we can do.

cagney added the offload label Oct 26, 2023

paulwouters closed this as completed Feb 21, 2024
