[Windows] Sequence number of a packet after DNAT+SNAT is incorrect #229

Closed
lzhecheng opened this issue Oct 12, 2021 · 5 comments
Comments

lzhecheng commented Oct 12, 2021

Hello, I ran into the following problem with OVS on Windows.

In our project, an HTTP packet goes through the OVS pipeline with DNAT+SNAT and is output at a tunnel port (encapsulated with Geneve). For the reply packet, the TCP sequence number is changed after the packet is decapsulated. As a result, the packet did not reach the target immediately.

The Wireshark captures are shown below. The sequence number differs between the two captures for the packet with the same IP ID.

Uplink (decapsulation): [screenshot: uplink]

Output port: [screenshot: gw0]
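
The comparison described above (same IP ID, different TCP sequence number) can also be scripted once the two captures are exported. The following is a minimal Python sketch, assuming each capture has already been reduced to `(ip_id, tcp_seq)` pairs for the inner packets; the function name and the sample values are hypothetical and are not taken from the screenshots.

```python
def changed_sequences(uplink, output):
    """Map ip_id -> (seq at uplink, seq at output port) for packets whose
    TCP sequence number differs between the two capture points."""
    seq_at_uplink = {ip_id: seq for ip_id, seq in uplink}
    return {
        ip_id: (seq_at_uplink[ip_id], seq)
        for ip_id, seq in output
        if ip_id in seq_at_uplink and seq_at_uplink[ip_id] != seq
    }


# Made-up sample values, for illustration only.
uplink_capture = [(0x1A2B, 1000), (0x1A2C, 2460)]
output_capture = [(0x1A2B, 1000), (0x1A2C, 48879)]

mismatches = changed_sequences(uplink_capture, output_capture)
for ip_id, (seen_in, seen_out) in mismatches.items():
    print(f"ip.id={ip_id:#06x}: seq {seen_in} -> {seen_out}")
```

Matching on the IP ID works here because NAT rewrites addresses and ports but is not expected to touch the IP identification field.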

ua1422 commented Oct 12, 2021

Can you please provide a minimal setup that will reproduce your issue?

By sequence number, do you mean the sequence number in the GRE header?

@wenyingd

For the setup: the client and the HTTP server are located on two different hosts, and the HTTP traffic between them goes over a Geneve tunnel. We installed OVS on both hosts and created a tunnel port on OVS. We also created OVS ports to connect the client and the server. For the overlay traffic, we perform both DNAT and SNAT on the HTTP packets. In our tests, the client fails to access the server.

After capturing the packets, we see that the TCP connection is set up correctly and that the HTTP reply packet is received correctly on the host uplink interface. However, the TCP sequence number of the HTTP reply packet has been changed by the time OVS forwards the packet to the client port.

@aserdean
Member

Can you please provide the output of:
ovs-vsctl show
ovs-ofctl dump-flows <bridge> (on all the bridges)
systeminfo

On both hosts?

ovsrobot pushed a commit to ovsrobot/ovs that referenced this issue Oct 13, 2021
Currently the layers info propagated to ProcessDeferredActions may be
incorrect. Because of this, any subsequent use of layers might result
in undesired behavior. This patch therefore adds the related layers to
the deferred action so that the layers stay consistent with the
related NBL.

In issue 229, we hit a problem when decapsulating a Geneve packet and
then applying NAT twice (via two flow tables): the TCP sequence number
of the HTTP packet was changed.

After debugging, we found that the issue is caused by the layers values
isTcp and isUdp not being updated in the Geneve decapsulation case.

The related function call chain is listed below:

OvsExecuteDpIoctl -> OvsActionsExecute -> OvsDoExecuteActions -> OvsTunnelPortRx
-> OvsDoExecuteActions -> NAT ct action and recirc action
-> OvsActionsExecute -> deferred actions processing for the NAT and recirc actions

When decapsulating a Geneve packet, the datapath first sets the layers for
the UDP packet. It then runs the OVS flow extract to get the inner packet
layers and processes the first NAT action and the first recirc action.
After that, the datapath processes the deferred actions in
OvsActionsExecute, and this processing inherits the incorrect Geneve packet
layers values (isTcp 0 and isUdp 1). As a result, the second NAT action
gets the wrong TCP header in OvsUpdateAddressAndPort, and when it updates
the related checksum field it changes the packet's TCP sequence value
instead.

Reported-at: openvswitch/ovs-issues#229
Signed-off-by: Wilson Peng <pweisong@vmware.com>
Signed-off-by: 0-day Robot <robot@bytheb.org>
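
To illustrate why a stale isUdp flag can end up changing the TCP sequence number, here is a minimal Python sketch based only on the standard header layouts: the UDP checksum occupies bytes 6..7 of the L4 header, while the TCP sequence number occupies bytes 4..7. The helper `write_udp_checksum` and all field values are hypothetical; the commit message above does not spell out which bytes the Windows datapath actually wrote, so treat this as an illustration of the overlap rather than a reproduction of the driver code.

```python
import struct

# First 8 bytes of each L4 header (standard layouts):
#   UDP: src port (2) | dst port (2) | length (2) | checksum (2)
#   TCP: src port (2) | dst port (2) | sequence number (4)
UDP_CHECKSUM_OFFSET = 6   # bytes 6..7 of a UDP header
TCP_SEQ_OFFSET = 4        # bytes 4..7 of a TCP header


def tcp_seq(l4: bytes) -> int:
    """Read the 32-bit TCP sequence number from an L4 header."""
    return struct.unpack_from("!I", l4, TCP_SEQ_OFFSET)[0]


def write_udp_checksum(l4: bytearray, checksum: int) -> None:
    """Write a 16-bit value where a UDP checksum would live.

    Hypothetical stand-in for a NAT helper that trusts a stale isUdp flag.
    """
    struct.pack_into("!H", l4, UDP_CHECKSUM_OFFSET, checksum)


# A minimal 20-byte TCP header with made-up values.
header = bytearray(struct.pack(
    "!HHIIHHHH",
    80, 40000,        # src port, dst port
    0x11223344,       # sequence number
    0,                # acknowledgment number
    0x5010,           # data offset = 5 words, ACK flag set
    65535, 0, 0,      # window, checksum, urgent pointer
))

before = tcp_seq(header)
write_udp_checksum(header, 0xBEEF)   # header treated as UDP by mistake
after = tcp_seq(header)

print(hex(before), "->", hex(after))  # 0x11223344 -> 0x1122beef
```

In this sketch only the low 16 bits of the sequence number change, because that is the part of the TCP header that overlaps the UDP checksum position.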

@lzhecheng
Author

ovs-vsctl show

PS C:\cygwin\home\Administrator\antrea> ovs-vsctl show
aad49e0a-0637-42b3-91d2-90df545f0b42
    Bridge br-int
        datapath_type: system
        Port br-int
            Interface br-int
                type: internal
        Port antrea-gw0
            Interface antrea-gw0
                type: internal
        Port antrea-tun0
            Interface antrea-tun0
                type: geneve
                options: {csum="true", key=flow, local_ip="10.176.26.112", remote_ip=flow}
        Port "Ethernet0 2"
            Interface "Ethernet0 2"
    ovs_version: "2.15.2"

Here are those NAT related flows:

# DNAT
 cookie=0x7040000000000, duration=2824.609s, table=42, n_packets=0, n_bytes=0, idle_age=2824, priority=200,tcp,reg3=0xc0a8f903,reg4=0x20035/0x7ffff actions=ct(commit,table=45,zone=65520,nat(dst=192.168.249.3:53),exec(load:0x21->NXM_NX_CT_MARK[]))
# SNAT
 cookie=0x7000000000000, duration=2800.589s, table=106, n_packets=6, n_bytes=396, priority=210,ct_state=+new+trk,ip,reg1=0x2 actions=ct(commit,table=108,zone=65521,nat(src=169.254.169.253))
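
The register matches in the DNAT flow can be decoded by hand. The Python sketch below assumes that reg3 carries the endpoint IPv4 address and that the low 16 bits of reg4 carry the port; this is an inference from the values lining up with `nat(dst=192.168.249.3:53)`, not something stated in this thread, and the meaning of the remaining bits under the `0x7ffff` mask is left open. The function name is hypothetical.

```python
import ipaddress


def decode_endpoint(reg3: int, reg4: int) -> tuple:
    """Decode (ip, port), assuming reg3 = IPv4 address and reg4[0..15] = port."""
    ip = str(ipaddress.IPv4Address(reg3))
    port = reg4 & 0xFFFF
    return ip, port


ip, port = decode_endpoint(0xC0A8F903, 0x20035)
print(f"{ip}:{port}")  # 192.168.249.3:53
```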

systeminfo

PS C:\cygwin\home\Administrator\antrea> systeminfo

Host Name:                 A-MS-2006-WIN-1
OS Name:                   Microsoft Windows Server 2019 Standard
OS Version:                10.0.17763 N/A Build 17763
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Standalone Server
OS Build Type:             Multiprocessor Free
Registered Owner:          Windows User
Registered Organization:
Original Install Date:     1/2/2020, 4:04:05 AM
System Boot Time:          10/11/2021, 8:25:50 PM
System Manufacturer:       VMware, Inc.
System Model:              VMware7,1
System Type:               x64-based PC
Processor(s):              8 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 85 Stepping 4 GenuineIntel ~2195 Mhz
                           [02]: Intel64 Family 6 Model 85 Stepping 4 GenuineIntel ~2195 Mhz
                           [03]: Intel64 Family 6 Model 85 Stepping 4 GenuineIntel ~2195 Mhz
                           [04]: Intel64 Family 6 Model 85 Stepping 4 GenuineIntel ~2195 Mhz
                           [05]: Intel64 Family 6 Model 85 Stepping 4 GenuineIntel ~2195 Mhz
                           [06]: Intel64 Family 6 Model 85 Stepping 4 GenuineIntel ~2195 Mhz
                           [07]: Intel64 Family 6 Model 85 Stepping 4 GenuineIntel ~2195 Mhz
                           [08]: Intel64 Family 6 Model 85 Stepping 4 GenuineIntel ~2195 Mhz
Windows Directory:         C:\Windows
System Directory:          C:\Windows\system32
Boot Device:               \Device\HarddiskVolume2
System Locale:             en-us;English (United States)
Time Zone:                 (UTC-08:00) Pacific Time (US & Canada)
Total Physical Memory:     16,383 MB
Available Physical Memory: 11,648 MB
Virtual Memory: Max Size:  23,986 MB
Virtual Memory: Available: 19,538 MB
Page File Location(s):     C:\pagefile.sys
Logon Server:              \\A-MS-2006-WIN-1
Hotfix(s):                 6 Hotfix(s) Installed.
                           [01]: KB4576949
                           [02]: KB4512577
                           [04]: KB4561600
                           [05]: KB4570332
                           [06]: KB4570333
Network Card(s):           5 NIC(s) Installed.
                           [01]: vmxnet3 Ethernet Adapter
                                 Connection Name: Ethernet0 2
                                 DHCP Enabled:    Yes
                                 DHCP Server:     N/A
                                 IP address(es)
                           [02]: Hyper-V Virtual Ethernet Adapter
                                 Connection Name: vEthernet (8f70e431c182fa8)
                                 DHCP Enabled:    No
                                 IP address(es)
                                 [01]: 172.30.240.1
                                 [02]: fe80::bd25:d3af:fa51:61b3
                           [03]: Hyper-V Virtual Ethernet Adapter
                                 Connection Name: vEthernet (HNS Internal NIC)
                                 DHCP Enabled:    No
                                 IP address(es)
                                 [01]: 10.110.200.41
                           [04]: Hyper-V Virtual Ethernet Adapter
                                 Connection Name: br-int
                                 DHCP Server:     10.172.40.5
                                 IP address(es)
                                 [01]: 10.176.26.112
                                 [02]: fe80::f0de:4edf:4a45:de4c
                                 [03]: 2620:124:6020:1006:f0de:4edf:4a45:de4c
                           [05]: Hyper-V Virtual Ethernet Adapter
                                 Connection Name: antrea-gw0
                                 DHCP Enabled:    No
                                 [01]: 192.168.250.1
                                 [02]: fe80::6401:fe23:fa6f:95bb
Hyper-V Requirements:      A hypervisor has been detected. Features required for Hyper-V will not be displayed.

aserdean pushed a commit to openvswitch/ovs that referenced this issue Oct 19, 2021
Currently the layers info propagated to ProcessDeferredActions may be
incorrect. Because of this, any subsequent use of layers might result
in undesired behavior. This patch therefore adds the related layers to
the deferred action so that the layers stay consistent with the
related NBL.

In issue 229, we hit a problem when decapsulating a Geneve packet and
then applying NAT twice (via two flow tables): the TCP sequence number
of the HTTP packet was changed.

After debugging, we found that the issue is caused by the layers values
isTcp and isUdp not being updated in the Geneve decapsulation case.

The related function call chain is listed below:

OvsExecuteDpIoctl -> OvsActionsExecute -> OvsDoExecuteActions -> OvsTunnelPortRx
-> OvsDoExecuteActions -> NAT ct action and recirc action
-> OvsActionsExecute -> deferred actions processing for the NAT and recirc actions

When decapsulating a Geneve packet, the datapath first sets the layers for
the UDP packet. It then runs the OVS flow extract to get the inner packet
layers and processes the first NAT action and the first recirc action.
After that, the datapath processes the deferred actions in
OvsActionsExecute, and this processing inherits the incorrect Geneve packet
layers values (isTcp 0 and isUdp 1). As a result, the second NAT action
gets the wrong TCP header in OvsUpdateAddressAndPort, and when it updates
the related checksum field it changes the packet's TCP sequence value
instead.

Reported-at: openvswitch/ovs-issues#229
Signed-off-by: Wilson Peng <pweisong@vmware.com>
Signed-off-by: Alin-Gabriel Serdean <aserdean@ovn.org>
lzhecheng added a commit to lzhecheng/antrea that referenced this issue Nov 8, 2021
It includes AntreaProxy related fixes.
openvswitch/ovs-issues#229
openvswitch/ovs-issues#231

Signed-off-by: Zhecheng Li <lzhecheng@vmware.com>
tnqn pushed a commit to antrea-io/antrea that referenced this issue Nov 9, 2021
It includes AntreaProxy related fixes.
openvswitch/ovs-issues#229
openvswitch/ovs-issues#231

Signed-off-by: Zhecheng Li <lzhecheng@vmware.com>