sNAT via conntrack ACLs #4104

Closed
courtland opened this issue Jun 14, 2022 · 5 comments

@courtland
Hello,

I am hoping to get some feedback on using Faucet + Open vSwitch as a basic IPv4 NAT router on an Ubuntu 22.04 system with untagged VLANs. After looking through some of your integration tests and the conntrack/OVS docs, I managed to put the following together. It seems to work, but the throughput is poor. It's not the most powerful hardware, but without conntrack in play I get line rate of about 1Gb/s routing between VLANs using Faucet; with conntrack sNAT it drops to about 200Mb/s. I'm wondering if I am doing something wrong, or if you have any ACL/flow optimization suggestions. Is there anything inherently wrong with NATing to a Faucet VIP?

vlans:
  wan:
    vid: 4095
    description: "wan native"
    faucet_mac: "0e:00:00:00:10:02"
    routes:
      - route:
          ip_dst: "0.0.0.0/0"
          ip_gw: "10.8.8.1"
    faucet_vips:
      - 10.8.8.254/24
  lan:
    vid: 1
    description: "lan native"
    faucet_mac: "0e:00:00:00:10:01"
    faucet_vips:
      - 10.0.1.254/24
routers:
  vlan-router:
    vlans: [lan, wan]
acls:
  nat_out:
    - rule:
        eth_type: 0x0800 # ipv4
        ct_state: 0/0x20 # untracked
        actions:
          ct:
            flags: 1
            table: 1
            zone: 1 # unique snat zone
            nat:
              flags: 1
              range_ipv4_min: 10.8.8.254
              range_ipv4_max: 10.8.8.254
    - rule:
        actions:
          allow: 1
  nat_in:
    - rule:
        eth_type: 0x0800 # ipv4
        ct_state: 0/0x20 # untracked
        actions:
          ct:
            flags: 1
            table: 1
            zone: 1 # unique snat zone
            nat:
              flags: 1
    - rule:
        actions:
          allow: 1
dps:
  dp1:
    dp_id: 0x1
    hardware: "Open vSwitch"
    interfaces:
      2:
        name: "eno2"
        description: "eno2"
        native_vlan: wan
        acl_in: nat_in
      3:
        name: "eno3"
        description: "eno3"
        native_vlan: lan
        acl_in: nat_out
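The `ct_state: 0/0x20` matches in the ACLs above use OVS's value/mask encoding: `0x20` is the `trk` (tracked) bit, so `0/0x20` matches packets whose `trk` bit is clear, i.e. packets that have not yet been through conntrack. A quick shell sketch of that match logic (the packet state values are made-up illustrations, not real OVS output):

```shell
# OVS ct_state bits: new=0x01 est=0x02 rel=0x04 rpl=0x08 inv=0x10 trk=0x20
mask=0x20   # only test the trk bit
value=0x00  # require it to be clear (untracked)

# matches <pkt_ct_state>: would this packet hit the 0/0x20 rule?
matches() {
  [ $(( $1 & mask )) -eq $(( value )) ]
}

matches 0x00 && echo "untracked packet: matches, gets sent through ct()"
matches 0x23 || echo "tracked+est packet: no match, falls through"
```

Only untracked packets are handed to the `ct()` action; once tracked, they fall through to the allow rule.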
 ovs-vsctl show
c543bc62-82f3-416d-9e8e-3af4642499a4
    Bridge dp1
        Controller "tcp:127.0.0.1:6653"
            is_connected: true
        fail_mode: secure
        Port eno2
            Interface eno2
        Port eno3
            Interface eno3
        Port dp1
            Interface dp1
                type: internal
    Bridge br-int
        fail_mode: secure
        datapath_type: system
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "2.17.0"
ovs-ofctl dump-flows dp1 -O OpenFlow13 | grep =ct
 cookie=0x5adc15c0, duration=2053.907s, table=0, n_packets=246678, n_bytes=66885166, priority=20480,ct_state=-trk,ip,in_port=2 actions=ct(commit,table=1,zone=1,nat(src))
 cookie=0x5adc15c0, duration=1171.222s, table=0, n_packets=219635, n_bytes=856634967, priority=20480,ct_state=-trk,ip,in_port=3 actions=ct(commit,table=1,zone=1,nat(src=10.8.8.254))

Thanks!

@gizmoguy
Member

So in conntrack mode Faucet shouldn't touch your packets at all; everything should be processed inside the Open vSwitch dataplane.

I haven't benchmarked OVS NATing myself, so I'm not sure what performance you can expect. Stateful connection tracking is inherently more work than regular routing, though, so on a slow CPU some slowdown is to be expected.

You can rule out faucet as the culprit of the slow performance by setting up NAT outside of faucet using this OVS tutorial: https://docs.openvswitch.org/en/latest/tutorials/ovs-conntrack/
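Following that tutorial, a minimal SNAT setup with raw OpenFlow rules might look like the sketch below. This is untested here, and the bridge name `br0`, the port numbers, and the reuse of `10.8.8.254` as the SNAT address are assumptions for illustration:

```shell
# LAN side (port 1): commit new IPv4 flows to zone 1 and SNAT them.
ovs-ofctl -O OpenFlow13 add-flow br0 \
  'table=0,priority=100,ip,in_port=1,ct_state=-trk,actions=ct(commit,zone=1,nat(src=10.8.8.254),table=1)'
# WAN side (port 2): run packets through conntrack so replies are un-NATed.
ovs-ofctl -O OpenFlow13 add-flow br0 \
  'table=0,priority=100,ip,in_port=2,ct_state=-trk,actions=ct(zone=1,nat,table=1)'
# After tracking, forward tracked traffic with normal L2 processing.
ovs-ofctl -O OpenFlow13 add-flow br0 \
  'table=1,priority=100,ip,ct_state=+trk,actions=normal'

# Inspect the conntrack entries created in the SNAT zone:
ovs-appctl dpctl/dump-conntrack zone=1
```

If this standalone setup shows the same throughput as the Faucet-generated flows, the bottleneck is in the OVS/kernel datapath rather than anything Faucet is doing.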

Also if you post your hardware specs and OVS version, we might be able to help you tune things to get better performance.

@courtland
Author

Thanks for the quick response.

It turns out the performance problem was actually a regression in the OVS code in the 5.15 LTS kernel that currently ships with Ubuntu 22.04 LTS (5.15.0-39-generic). It was generating a kernel WARN stack trace for every packet passed through conntrack. The warning did not break functionality but really hampered performance. Upgrading the kernel to the latest mainline fixed the problem. Throughput is line rate 1Gb/s again for a basic NAT flow.

Here's the relevant kernel warning and the upstream discussion for anyone who's curious:

WARNING: CPU: 4 PID: 0 at include/net/netfilter/nf_conntrack.h:175 __ovs_ct_lookup+0x36c/0x3e0 [openvswitch]

https://lore.kernel.org/netdev/YmaJlOS1YtPaq0EC@kroah.com/T/

I'd still be interested in general performance-tuning advice or relevant links. I'm on OVS 2.17.0 with plans to run on a variety of hardware in campus environments, but the following is the lab system in question.

Machine:
  Type: Kvm System: Supermicro product: SYS-E300-8D
  Mobo: Supermicro model: X10SDV-TP8F
Memory:
  RAM: total: 15.53 GiB used: 1.22 GiB (7.8%)
Network:
  Device-1: Intel Ethernet X552 10 GbE SFP+ driver: ixgbe
  Device-2: Intel Ethernet X552 10 GbE SFP+ driver: ixgbe
  Device-3: Intel I210 Gigabit Network driver: igb
  Device-4: Intel I210 Gigabit Network driver: igb
  Device-5: Intel I350 Gigabit Network driver: igb
  Device-6: Intel I350 Gigabit Network driver: igb
  Device-7: Intel I350 Gigabit Network driver: igb
  Device-8: Intel I350 Gigabit Network driver: igb

I'd also like to say that Faucet is a really great project. I've worked in the "SDN" space for a long time and Faucet so far seems to have a great balance between flexibility and approachability compared to the other popular "open" SDN controllers out there. I hope the community can keep the momentum going.

Are you open to doc PRs? I would like to add the NAT example to your "Recipe Book", but I noticed so far you seem to have only full-blown tutorials.

@gizmoguy
Member

Great detective work. I'm not running any Ubuntu 22.04 machines yet, so I hadn't come across this issue.

We are of course happy to take doc-only PRs, especially in the area of conntrack support for Faucet since it's such a new feature. I'm happy to work with you to get this landed.

I'm working on some stateful firewall rulesets for Faucet. They're not yet at a stage where I can publish them, but I'm happy to share my working ruleset if you're interested in doing more than NAT.

@courtland
Author

I am definitely interested in doing more than NAT, so any stateful rules you're willing to share would be very helpful! That was actually next on my list of things to try...

I will try to put together a PR with some conntrack docs, hopefully sometime next week. Since you're working on stateful rules, do you think it makes more sense to put together a new tutorial about conntrack, or to just begin populating the Recipe Book?

@courtland
Author


FWIW here's a link to the Ubuntu bug for the kernel issue along with a link to the fix.

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1978865
