padding? lose packets when offload is enabled on mellanox #993

Closed
AnatoliChe opened this issue Jan 21, 2023 · 12 comments

@AnatoliChe

AnatoliChe commented Jan 21, 2023

Hi! I believe this is a problem with the kernel/driver/firmware, but maybe you can help.
When hw offload is enabled, all packets shorter than 12 bytes are lost.

The problem appears with the MT28841 (Mellanox/Nvidia ConnectX-6 Dx EN adapter card, 25GbE, Dual-port SFP28, PCIe 4.0 x8, Crypto and Secure Boot MCX621102AC-ADAT)
kernel 5.15.89
driver: mlx5_core
firmware-version: 22.32.1010
With Intel cards there are no problems.
With hardware offload enabled I can run
ping 192.168.44.1
with 0% packet loss, and I can run
ping 192.168.44.1 -s 4
12 bytes from 192.168.4.1: icmp_seq=1 ttl=64
with 0% packet loss

but
ping 192.168.44.101 -s 3
it's 11 bytes 0%
I can see the outgoing packet with length 44,
but no incoming ESP on the other side.

As soon as the ESP length is 48 bytes, it works.

With offload disabled, everything works.
Looks like a problem with padding?

Could you help please?
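
For reference, the ESP lengths mentioned above (44 and 48 bytes) can be compared with what the usual sizing rules would predict. The sketch below is only an illustration under stated assumptions that are not taken from the thread: an 8-byte ESP header, the 8-byte IV and 16-byte ICV of AES-GCM in ESP (RFC 4106), and RFC 4303 padding so that the payload plus the 2-byte trailer ends on a 4-byte boundary. The function name `esp_length` is hypothetical.

```python
# Sketch: expected ESP length for a transport-mode AES_GCM_16 SA.
# All constants below are assumptions based on RFC 4303 / RFC 4106,
# not values confirmed by the people in this thread.

ESP_HEADER = 8   # SPI (4 bytes) + sequence number (4 bytes)
GCM_IV = 8       # explicit IV carried in every packet
GCM_ICV = 16     # integrity check value for AES_GCM_16
ESP_TRAILER = 2  # pad-length byte + next-header byte
ICMP_HEADER = 8  # ICMP echo header in front of the `ping -s` payload


def esp_length(ping_size: int) -> int:
    """Expected ESP length (header through ICV) for `ping -s ping_size`."""
    inner = ICMP_HEADER + ping_size
    # Round payload + trailer up to the next multiple of 4 bytes.
    padded = (inner + ESP_TRAILER + 3) // 4 * 4
    return ESP_HEADER + GCM_IV + padded + GCM_ICV


if __name__ == "__main__":
    for size in (0, 3, 4):
        print(f"ping -s {size}: ICMP {ICMP_HEADER + size} bytes, "
              f"expected ESP {esp_length(size)} bytes")
```

Under these assumptions both `-s 3` and `-s 4` would give a 48-byte ESP packet, while 44 bytes would correspond to an 8-byte ICMP message. A capture taken on the sending host with offload enabled may not show exactly what the NIC puts on the wire, so this comparison is only indicative.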

@letoams
Member

letoams commented Jan 24, 2023 via email

@AnatoliChe
Author

Oh! Many thanks!!!

I tried upgrading the firmware to the latest version.
Now it is
firmware-version: 22.35.2302 (MT_0000000430)
but that didn't help.
Maybe I'll try newer drivers, not the ones from kernel 5.15.89...

@letoams
Member

letoams commented Jan 26, 2023 via email

@paulwouters
Member

We got a response from nvidia/mellanox:

We tried to reproduce it on kernel v6.2-rc5 with latest released FW and
everything worked as expected without any packet drops.

@letoams
Member

letoams commented Jan 26, 2023 via email

@AnatoliChe
Author

Thanks!
I get this error with 6.2.0-rc5 and the latest FW too, so it must be a problem with my kernel config.
It means I enabled some option that causes this weird problem
(some option that works fine with Intel but causes trouble for mlx).
So it's only my problem and I should find the solution myself.
Thank you again for the help.

@letoams
Member

letoams commented Jan 28, 2023 via email

@AnatoliChe
Author

compress=no
I had problems with compression, so for the last 5 years I have been switching it off right away.

Yes, there are errors in /proc/net/xfrm_stat

XfrmInNoStates 1
XfrmOutNoStates 2
XfrmAcquireError 19
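
A quick way to see only the counters that have moved is to parse the file and drop the zero entries. This is a minimal sketch, assuming each line of /proc/net/xfrm_stat is a counter name followed by an integer; the helper names `parse_xfrm_stat` and `nonzero` are hypothetical.

```python
from pathlib import Path


def parse_xfrm_stat(text: str) -> dict:
    """Parse 'Name value' lines into a {name: int} dictionary."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        # Ignore anything that is not exactly a name plus a number.
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def nonzero(counters: dict) -> dict:
    """Keep only the counters that are greater than zero."""
    return {name: value for name, value in counters.items() if value}


if __name__ == "__main__":
    path = Path("/proc/net/xfrm_stat")
    if path.exists():
        for name, value in nonzero(parse_xfrm_stat(path.read_text())).items():
            print(name, value)
```

Running it before and after a failing ping and comparing the two results would show whether any counter increases with the lost packets, rather than only at tunnel setup.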

@letoams
Member

letoams commented Jan 28, 2023 via email

@AnatoliChe
Author

sure
ip xfrm policy
src 192.168.44.102/32 dst 192.168.44.101/32
dir out priority 1753281
tmpl src 0.0.0.0 dst 0.0.0.0
proto esp reqid 16389 mode transport
src 192.168.44.101/32 dst 192.168.44.102/32
dir in priority 1753281
tmpl src 0.0.0.0 dst 0.0.0.0
proto esp reqid 16389 mode transport

ip xfrm state
src 192.168.44.101 dst 192.168.44.102
proto esp spi 0xcf79b7a1 reqid 16389 mode transport
replay-window 0
aead rfc4106(gcm(aes)) 0x58523991707387b604d14b9589aa51ed0dfed611 128
lastused 2023-01-29 15:44:54
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
crypto offload parameters: dev eth7 dir in
sel src 192.168.44.101/32 dst 192.168.44.102/32
src 192.168.44.102 dst 192.168.44.101
proto esp spi 0xa197e8e4 reqid 16389 mode transport
replay-window 0
aead rfc4106(gcm(aes)) 0x02508fede9b7982e3c8a0856f7fce8ef865660d9 128
lastused 2023-01-29 15:44:54
anti-replay context: seq 0x0, oseq 0xa9e, bitmap 0x00000000
crypto offload parameters: dev eth7 dir out
sel src 192.168.44.102/32 dst 192.168.44.101/32

from other side:
src 192.168.44.102 dst 192.168.44.101
proto esp spi 0x8762a7cb reqid 16445 mode transport
replay-window 0
aead rfc4106(gcm(aes)) 0x8e0ac059958c3fcb2afb12bf2c2f910db541f16b 128
anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000
crypto offload parameters: dev eth2 dir in
sel src 192.168.44.102/32 dst 192.168.44.101/32
src 192.168.44.101 dst 192.168.44.102
proto esp spi 0xecd56543 reqid 16445 mode transport
replay-window 0
aead rfc4106(gcm(aes)) 0x88e99cdf7bb4c3135a13818c2891860a79cdfe6e 128
anti-replay context: seq 0x0, oseq 0x8a, bitmap 0x00000000
crypto offload parameters: dev eth2 dir out
sel src 192.168.44.101/32 dst 192.168.44.102/32

src 192.168.44.101/32 dst 192.168.44.102/32
dir out priority 1753281
tmpl src 0.0.0.0 dst 0.0.0.0
proto esp reqid 16445 mode transport
src 192.168.44.102/32 dst 192.168.44.101/32
dir in priority 1753281
tmpl src 0.0.0.0 dst 0.0.0.0
proto esp reqid 16445 mode transport
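
To check at a glance which SAs are actually bound to the NIC, the state dump above can be reduced to one record per SA. This is a minimal sketch that only understands the line shapes shown in this thread (`src ... dst ...`, `proto esp spi ...`, `crypto offload parameters: dev ... dir ...`); other iproute2 versions may print different wording, and the function name `parse_offload` is hypothetical.

```python
import re

# Shortened sample in the shape of the `ip xfrm state` output above.
SAMPLE = """\
src 192.168.44.101 dst 192.168.44.102
    proto esp spi 0xcf79b7a1 reqid 16389 mode transport
    replay-window 0
    crypto offload parameters: dev eth7 dir in
    sel src 192.168.44.101/32 dst 192.168.44.102/32
src 192.168.44.102 dst 192.168.44.101
    proto esp spi 0xa197e8e4 reqid 16389 mode transport
    replay-window 0
    crypto offload parameters: dev eth7 dir out
    sel src 192.168.44.102/32 dst 192.168.44.101/32
"""


def parse_offload(text: str) -> list:
    """Return one dict per SA with src, dst, spi and (dev, dir) offload info."""
    sas = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("src "):
            # A new SA starts; "sel src ..." lines do not match this test.
            fields = line.split()
            current = {"src": fields[1], "dst": fields[3],
                       "spi": None, "offload": None}
            sas.append(current)
        elif current is not None and line.startswith("proto "):
            match = re.search(r"spi (0x[0-9a-fA-F]+)", line)
            if match:
                current["spi"] = match.group(1)
        elif current is not None and line.startswith("crypto offload parameters:"):
            match = re.search(r"dev (\S+) dir (\S+)", line)
            if match:
                current["offload"] = (match.group(1), match.group(2))
    return sas


if __name__ == "__main__":
    for sa in parse_offload(SAMPLE):
        print(sa["spi"], sa["src"], "->", sa["dst"], "offload:", sa["offload"])
```

An SA whose `offload` field stays `None` would be handled in software, which is a useful thing to rule out before blaming the card.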

ethtool -i eth7
driver: mlx5_core
version: 6.2.0-rc5-ipsec-core2-smp
firmware-version: 22.35.2302 (MT_0000000430)
expansion-rom-version:
bus-info: 0000:03:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: yes

ethtool -k eth7 | grep esp
tx-esp-segmentation: on
esp-hw-offload: on [fixed]
esp-tx-csum-hw-offload: on [fixed]
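
The same feature listing can be turned into a small lookup table, which makes it easier to compare two hosts or two cards. This is a minimal sketch, assuming the `name: on|off [fixed]` line format shown in this thread; the function name `parse_features` is hypothetical.

```python
SAMPLE = """\
Features for eth7:
tx-esp-segmentation: on
esp-hw-offload: on [fixed]
esp-tx-csum-hw-offload: on [fixed]
large-receive-offload: off
"""


def parse_features(text: str) -> dict:
    """Map each feature name to {'on': bool, 'fixed': bool}."""
    features = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        # Skip blank lines and the "Features for <dev>:" header (no value).
        if not sep or not rest.strip():
            continue
        words = rest.split()
        features[name.strip()] = {
            "on": words[0] == "on",
            "fixed": "[fixed]" in words,
        }
    return features


if __name__ == "__main__":
    for name, state in parse_features(SAMPLE).items():
        if "esp" in name:
            print(name, state)
```

Features marked `[fixed]` cannot be changed with ethtool, so for `esp-hw-offload` on this card a comparison with offload disabled has to be made some other way than toggling that flag.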

ethtool --show-offload eth7
Features for eth7:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-tunnel-remcsum-segmentation: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: on
tx-udp-segmentation: on
tx-gso-list: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off
rx-all: on
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: on [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: on
esp-hw-offload: on [fixed]
esp-tx-csum-hw-offload: on [fixed]
rx-udp_tunnel-port-offload: on
tls-hw-tx-offload: off [fixed]
tls-hw-rx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
rx-gro-list: off
macsec-hw-offload: off [fixed]
rx-udp-gro-forwarding: off
hsr-tag-ins-offload: off [fixed]
hsr-tag-rm-offload: off [fixed]
hsr-fwd-offload: off [fixed]
hsr-dup-offload: off [fixed]

ipsec status
000 using kernel interface: xfrm
000
000 interface lo UDP [::1]:4500
000 interface lo UDP [::1]:500
000 interface lo UDP 127.0.0.1:4500
000 interface lo UDP 127.0.0.1:500
000 interface eth7 UDP 192.168.44.102:500
000
000 fips mode=disabled;
000 SElinux=disabled
000 seccomp=unsupported
000
000 config setup options:
000
000 configdir=/etc, configfile=/etc/ipsec.conf, secrets=/etc/ipsec.secrets, ipsecdir=/etc/ipsec.d
000 nssdir=/var/lib/ipsec/nss, dumpdir=/dev/shm/, statsbin=unset
000 dnssec-rootkey-file=/usr/share/dns/root.key, dnssec-trusted=
000 sbindir=/usr/local/sbin, libexecdir=/usr/local/libexec/ipsec
000 pluto_version=5.0, pluto_vendorid=OE-Libreswan-5.0, audit-log=yes
000 nhelpers=-1, uniqueids=yes, dnssec-enable=yes, logappend=yes, logip=yes, shuntlifetime=900s, xfrmlifetime=30s
000 ddos-cookies-threshold=25000, ddos-max-halfopen=50000, ddos-mode=auto, ikev1-policy=accept
000 ikebuf=0, msg_errqueue=yes, crl-strict=no, crlcheckinterval=0, listen=, nflog-all=0
000 ocsp-enable=no, ocsp-strict=no, ocsp-timeout=2, ocsp-uri=
000 ocsp-trust-name=
000 ocsp-cache-size=1000, ocsp-cache-min-age=3600, ocsp-cache-max-age=86400, ocsp-method=get
000 global-redirect=no, global-redirect-to=
000 secctx-attr-type=
000 debug:
000
000 nat-traversal=yes, keep-alive=20, nat-ikeport=4500
000 virtual-private (%priv):
000
000 Kernel algorithms supported:
000
000 algorithm ESP encrypt: name=3DES_CBC, keysizemin=192, keysizemax=192
000 algorithm ESP encrypt: name=AES_CBC, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_CCM_12, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_CCM_16, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_CCM_8, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_CTR, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_GCM_12, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_GCM_16, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=AES_GCM_8, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=CAMELLIA_CBC, keysizemin=128, keysizemax=256
000 algorithm ESP encrypt: name=CHACHA20_POLY1305, keysizemin=256, keysizemax=256
000 algorithm ESP encrypt: name=NULL, keysizemin=0, keysizemax=0
000 algorithm ESP encrypt: name=NULL_AUTH_AES_GMAC, keysizemin=128, keysizemax=256
000 algorithm AH/ESP auth: name=AES_CMAC_96, key-length=128
000 algorithm AH/ESP auth: name=AES_XCBC_96, key-length=128
000 algorithm AH/ESP auth: name=HMAC_MD5_96, key-length=128
000 algorithm AH/ESP auth: name=HMAC_SHA1_96, key-length=160
000 algorithm AH/ESP auth: name=HMAC_SHA2_256_128, key-length=256
000 algorithm AH/ESP auth: name=HMAC_SHA2_256_TRUNCBUG, key-length=256
000 algorithm AH/ESP auth: name=HMAC_SHA2_384_192, key-length=384
000 algorithm AH/ESP auth: name=HMAC_SHA2_512_256, key-length=512
000 algorithm AH/ESP auth: name=NONE, key-length=0
000
000 IKE algorithms supported:
000
000 algorithm IKE encrypt: v1id=5, v1name=OAKLEY_3DES_CBC, v2id=3, v2name=3DES, blocksize=8, keydeflen=192
000 algorithm IKE encrypt: v1id=8, v1name=OAKLEY_CAMELLIA_CBC, v2id=23, v2name=CAMELLIA_CBC, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=-1, v1name=n/a, v2id=20, v2name=AES_GCM_C, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=-1, v1name=n/a, v2id=19, v2name=AES_GCM_B, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=-1, v1name=n/a, v2id=18, v2name=AES_GCM_A, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=13, v1name=OAKLEY_AES_CTR, v2id=13, v2name=AES_CTR, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=7, v1name=OAKLEY_AES_CBC, v2id=12, v2name=AES_CBC, blocksize=16, keydeflen=128
000 algorithm IKE encrypt: v1id=-1, v1name=n/a, v2id=28, v2name=CHACHA20_POLY1305, blocksize=16, keydeflen=256
000 algorithm IKE PRF: name=HMAC_MD5, hashlen=16
000 algorithm IKE PRF: name=HMAC_SHA1, hashlen=20
000 algorithm IKE PRF: name=HMAC_SHA2_256, hashlen=32
000 algorithm IKE PRF: name=HMAC_SHA2_384, hashlen=48
000 algorithm IKE PRF: name=HMAC_SHA2_512, hashlen=64
000 algorithm IKE PRF: name=AES_XCBC, hashlen=16
000 algorithm IKE DH Key Exchange: name=MODP1536, bits=1536
000 algorithm IKE DH Key Exchange: name=MODP2048, bits=2048
000 algorithm IKE DH Key Exchange: name=MODP3072, bits=3072
000 algorithm IKE DH Key Exchange: name=MODP4096, bits=4096
000 algorithm IKE DH Key Exchange: name=MODP6144, bits=6144
000 algorithm IKE DH Key Exchange: name=MODP8192, bits=8192
000 algorithm IKE DH Key Exchange: name=DH19, bits=512
000 algorithm IKE DH Key Exchange: name=DH20, bits=768
000 algorithm IKE DH Key Exchange: name=DH21, bits=1056
000 algorithm IKE DH Key Exchange: name=DH31, bits=256
000
000 stats db_ops: {curr_cnt, total_cnt, maxsz} :context={0,0,0} trans={0,0,0} attrs={0,0,0}
000
000 Connection list:
000
000 "peer1-peer210g1": 192.168.44.102[@peer110gf1.ipsec.tkb]...192.168.44.101[@peer210g1.ipsec.tkb]; erouted; eroute owner: #3
000 "peer1-peer210g1": oriented; my_ip=unset; their_ip=unset; my_updown=ipsec _updown;
000 "peer1-peer210g1": xauth us:none, xauth them:none, my_username=[any]; their_username=[any]
000 "peer1-peer210g1": our auth:rsasig(RSASIG+RSASIG_v1_5), their auth:RSASIG+RSASIG_v1_5, our autheap:none, their autheap:none;
000 "peer1-peer210g1": modecfg info: us:none, them:none, modecfg policy:push, dns:unset, domains:unset, cat:unset;
000 "peer1-peer210g1": sec_label:unset;
000 "peer1-peer210g1": ike_life: 28800s; ipsec_life: 1200s; ipsec_max_bytes: 2^63B; ipsec_max_packets: 2^63; replay_window: 0; rekey_margin: 300s; rekey_fuzz: 50%; keyingtries: 0;
000 "peer1-peer210g1": retransmit-interval: 500ms; retransmit-timeout: 60s; iketcp:no; iketcp-port:4500;
000 "peer1-peer210g1": initial-contact:no; cisco-unity:no; fake-strongswan:no; send-vendorid:no; send-no-esp-tfc:no;
000 "peer1-peer210g1": policy: IKEv2+RSASIG+RSASIG_v1_5+ENCRYPT+PFS+UP+IKE_FRAG_ALLOW+ESN_NO;
000 "peer1-peer210g1": v2-auth-hash-policy: SHA2_256+SHA2_384+SHA2_512;
000 "peer1-peer210g1": conn_prio: 32,32; interface: eth7; metric: 0; mtu: unset; sa_prio:auto; sa_tfc:none;
000 "peer1-peer210g1": nflog-group: unset; mark: unset; vti-iface:unset; vti-routing:no; vti-shared:no; nic-offload:yes;
000 "peer1-peer210g1": our idtype: ID_FQDN; our id=@peer110gf1.ipsec.tkb; their idtype: ID_FQDN; their id=@peer210g1.ipsec.tkb
000 "peer1-peer210g1": liveness: active; dpdaction:restart; dpddelay:30s; retransmit-timeout:60s
000 "peer1-peer210g1": nat-traversal: encaps:auto; keepalive:20s
000 "peer1-peer210g1": newest IKE SA: #1; newest IPsec SA: #3; conn serial: $1;
000 "peer1-peer210g1": IKE algorithms: AES_GCM_16_128-HMAC_SHA2_512+HMAC_SHA2_256-MODP2048
000 "peer1-peer210g1": IKEv2 algorithm newest: AES_GCM_16_128-HMAC_SHA2_512-MODP2048
000 "peer1-peer210g1": ESP algorithms: AES_GCM_16_128-NONE-MODP2048
000 "peer1-peer210g1": ESP algorithm newest: AES_GCM_16_128-NONE; pfsgroup=
000
000 Total IPsec connections: loaded 1, active 1
000
000 State Information: DDoS cookies not required, Accepting new IKE connections
000 IKE SAs: total(1), half-open(0), open(0), authenticated(1), anonymous(0)
000 IPsec SAs: total(1), authenticated(1), anonymous(0)
000
000 #1: "peer1-peer210g1":500 STATE_V2_ESTABLISHED_IKE_SA (established IKE SA); REKEY in 27270s; REPLACE in 27711s; newest; idle;
000 #3: "peer1-peer210g1":500 STATE_V2_ESTABLISHED_CHILD_SA (established Child SA); LIVENESS in 0s; REKEY in 426s; REPLACE in 870s; newest; eroute owner; IKE SA #1; idle;
000 #3: "peer1-peer210g1" esp.8762a7cb@192.168.44.101 esp.ecd56543@192.168.44.102 Traffic: ESPin=5KB ESPout=7KB ESPmax=2^63B
000
000 Bare Shunt list:
000

cat /proc/net/xfrm_stat
XfrmInNoStates 1
XfrmAcquireError 45

but I guess it can be related to the ipsec restart.

@AnatoliChe
Author

It's not related to padding; it's the DX6's problem.

@cagney added the offload label Oct 26, 2023
@paulwouters
Member

I confirmed again that I cannot reproduce this issue:
root@tundra:~# ipsec traffic
#2: "test", type=ESP(nic-offload=packet), add_time=1708543865, inBytes=696, outBytes=1170, maxBytes=2^63B, id='10.0.1.1'

root@tundra:~# ping 10.0.1.1 -s 3
PING 10.0.1.1 (10.0.1.1) 3(31) bytes of data.
11 bytes from 10.0.1.1: icmp_seq=1 ttl=64
11 bytes from 10.0.1.1: icmp_seq=2 ttl=64
11 bytes from 10.0.1.1: icmp_seq=3 ttl=64
11 bytes from 10.0.1.1: icmp_seq=4 ttl=64
11 bytes from 10.0.1.1: icmp_seq=5 ttl=64
11 bytes from 10.0.1.1: icmp_seq=6 ttl=64
^C
--- 10.0.1.1 ping statistics ---
6 packets transmitted, 6 received, 0% packet loss, time 5045ms

root@tundra:~# ping 10.0.1.1 -s 4
PING 10.0.1.1 (10.0.1.1) 4(32) bytes of data.
12 bytes from 10.0.1.1: icmp_seq=1 ttl=64
12 bytes from 10.0.1.1: icmp_seq=2 ttl=64
12 bytes from 10.0.1.1: icmp_seq=3 ttl=64
12 bytes from 10.0.1.1: icmp_seq=4 ttl=64
12 bytes from 10.0.1.1: icmp_seq=5 ttl=64
^C
--- 10.0.1.1 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4044ms

Sorry, there is nothing we can do.
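
For anyone who wants to repeat the comparison on their own hardware, the two manual runs above can be generalized into a sweep over payload sizes. This is a minimal sketch, assuming the iputils `ping` summary line format shown above; the function names `parse_loss` and `sweep` are hypothetical, and `sweep` needs a reachable peer.

```python
import re
import subprocess


def parse_loss(output: str):
    """Return (transmitted, received, loss_percent) from ping output, or None."""
    match = re.search(
        r"(\d+) packets transmitted, (\d+) received.*?([\d.]+)% packet loss",
        output,
    )
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2)), float(match.group(3))


def sweep(host: str, sizes, count: int = 3) -> dict:
    """Run `ping -c count -s size host` for each size and collect the loss."""
    results = {}
    for size in sizes:
        proc = subprocess.run(
            ["ping", "-c", str(count), "-s", str(size), host],
            capture_output=True,
            text=True,
        )
        results[size] = parse_loss(proc.stdout)
    return results


if __name__ == "__main__":
    summary = "6 packets transmitted, 6 received, 0% packet loss, time 5045ms"
    print(parse_loss(summary))
```

Calling `sweep("10.0.1.1", range(0, 9))` on a host with the tunnel up would show the payload size at which loss starts, if it starts at all.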
