security groups: Do not use conntrack when it is not required #10594
base: 4.20
The conntrack is disabled if the security group allows all traffic. Also, refactored the code a little.
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## 4.20 #10594 +/- ##
============================================
- Coverage 16.00% 16.00% -0.01%
- Complexity 13104 13105 +1
============================================
Files 5651 5651
Lines 495870 495870
Branches 60049 60049
============================================
- Hits 79370 79361 -9
- Misses 407638 407652 +14
+ Partials 8862 8857 -5
This is really great! I don't have a test env now, but can you confirm that you have tested this in your environment and verified it works?
Well, I tested it fairly extensively: single IPv4; single IPv4 + IPv6; IPv4 + IPv6 with additional IPv4 and IPv6 addresses; multiple NICs; and different IPv4 + IPv6 rule combinations. It looks good to me. However, I wouldn't trust me on this entirely: it is quite hard not to make a mistake with such a complicated script.
@@ -607,12 +622,16 @@ def default_network_rules(vm_name, vm_id, vm_ip, vm_ip6, vm_mac, vif, brname, se
    if vm_ip:
        execute("iptables -A " + vmchain_default + " -m physdev --physdev-is-bridged --physdev-in " + vif + " -m set ! --match-set " + vmipsetName + " src -j DROP")
        execute("iptables -A " + vmchain_default + " -m physdev --physdev-is-bridged --physdev-out " + vif + " -m set ! --match-set " + vmipsetName + " dst -j DROP")
        execute("iptables -A " + vmchain_default + " -m physdev --physdev-is-bridged --physdev-in " + vif + " -m set --match-set " + vmipsetName + " src -p udp --dport 53 -j RETURN ")
@phsm
I think we should not change it.
RETURN means that rules in other chains will still be checked. They will not be checked if this is changed to ACCEPT.
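To make the RETURN versus ACCEPT distinction concrete, here is a toy Python model of chain traversal. It is not the real netfilter engine, and the chain names and the `traverse`, `evaluate`, and `build_chains` helpers are made up for illustration. The only point is that RETURN hands the packet back to the calling chain, whose later rules still apply, while ACCEPT ends traversal immediately.

```python
# Toy model of iptables chain traversal (illustrative only, not netfilter).
# A chain is a list of (predicate, target) rules. A target is "ACCEPT",
# "DROP", "RETURN", or the name of another chain to jump to.

def traverse(chains, chain_name, packet):
    """Return "ACCEPT" or "DROP" for a terminal verdict, or None when the
    chain returns to its caller (explicit RETURN or end of chain)."""
    for matches, target in chains[chain_name]:
        if not matches(packet):
            continue
        if target in ("ACCEPT", "DROP"):
            return target
        if target == "RETURN":
            return None
        verdict = traverse(chains, target, packet)  # jump to a sub-chain
        if verdict is not None:
            return verdict
    return None


def evaluate(chains, packet, policy="ACCEPT"):
    """Run a packet through the hypothetical FORWARD chain."""
    verdict = traverse(chains, "FORWARD", packet)
    return policy if verdict is None else verdict


def build_chains(dns_target):
    """Hypothetical two-level ruleset: FORWARD jumps to a per-VM chain and
    drops whatever comes back; the per-VM chain handles DNS."""
    def always(packet):
        return True

    def is_dns(packet):
        return packet["dport"] == 53

    return {
        "FORWARD": [(always, "vm-def"), (always, "DROP")],
        "vm-def": [(is_dns, dns_target)],
    }
```

In this toy ruleset, `evaluate(build_chains("RETURN"), {"dport": 53})` yields `DROP` because the caller's later rule decides the packet's fate, while `evaluate(build_chains("ACCEPT"), {"dport": 53})` yields `ACCEPT`.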
btw: I did not look into the changes. Correct me if I am wrong.
This script is very important for public cloud providers, we have to be very careful
Thanks for scrutinizing my PR; I understand how critical that file is. The more eyes that look at it (and test it), the better.
Let me show examples of the current and proposed security group rules to make this clearer.
First, let's look at the rules for a single virtual machine under the current security group implementation:
:FORWARD ACCEPT [0:0]
# a set of rules like that for every shared network bridge (VLAN)
# in/out traffic forwarded via this bridge always passes into an individual "per bridge" chain
-A FORWARD -o brbond0-304 -m physdev --physdev-is-bridged -j BF-brbond0-304
-A FORWARD -i brbond0-304 -m physdev --physdev-is-bridged -j BF-brbond0-304
# Everything that was not explicitly accepted or dropped in the "per bridge" chain gets returned here
# and dropped
-A FORWARD -o brbond0-304 -j DROP
-A FORWARD -i brbond0-304 -j DROP
:BF-brbond0-304 - [0:0]
# The early accept of packets belonging to established and related connections.
# The goal of my changes is to eliminate conntrack as much as possible, therefore my changes do not have this rule
-A BF-brbond0-304 -m state --state RELATED,ESTABLISHED -j ACCEPT
-A BF-brbond0-304 -m physdev --physdev-is-in --physdev-is-bridged -j BF-brbond0-304-IN
-A BF-brbond0-304 -m physdev --physdev-is-out --physdev-is-bridged -j BF-brbond0-304-OUT
-A BF-brbond0-304 -m physdev --physdev-out bond0.304 --physdev-is-bridged -j ACCEPT
:BF-brbond0-304-IN - [0:0]
# rules for other VMs are omitted
-A BF-brbond0-304-IN -m physdev --physdev-in vnet152 --physdev-is-bridged -j i-2-104-def
:BF-brbond0-304-OUT - [0:0]
# rules for other VMs are omitted
-A BF-brbond0-304-OUT -m physdev --physdev-out vnet152 --physdev-is-bridged -j i-2-104-def
# At this point we can safely claim that the traffic concerning a single VM always reaches the VM '-def' chain.
# If anything is returned from the '-def' chain, it traverses back to the previous chains until it gets dropped in the FORWARD chain
# Except the outgoing traffic to the Internet due to an unexpected rule '-A BF-brbond0-304 -m physdev --physdev-out bond0.304 --physdev-is-bridged -j ACCEPT'
:i-2-104-def - [0:0]
-A i-2-104-def -m state --state RELATED,ESTABLISHED -j ACCEPT <- unnecessary, the same rule exists earlier
-A i-2-104-def -p udp -m physdev --physdev-in vnet152 --physdev-is-bridged -m udp --sport 68 --dport 67 -j ACCEPT
-A i-2-104-def -p udp -m physdev --physdev-out vnet152 --physdev-is-bridged -m udp --sport 67 --dport 68 -j ACCEPT
-A i-2-104-def -p udp -m physdev --physdev-in vnet152 --physdev-is-bridged -m udp --sport 67 -j DROP
-A i-2-104-def -m physdev --physdev-in vnet152 --physdev-is-bridged -m set ! --match-set i-2-104-VM src -j DROP
-A i-2-104-def -m physdev --physdev-out vnet152 --physdev-is-bridged -m set ! --match-set i-2-104-VM dst -j DROP
# The traffic matching the following 2 rules gets returned into the previous chains, which I believe triggers the rule
# '-A BF-brbond0-304 -m physdev --physdev-out bond0.304 --physdev-is-bridged -j ACCEPT'
# effectively always allowing DNS, regardless of whether the security group denies all traffic.
-A i-2-104-def -p udp -m physdev --physdev-in vnet152 --physdev-is-bridged -m set --match-set i-2-104-VM src -m udp --dport 53 -j RETURN
-A i-2-104-def -p tcp -m physdev --physdev-in vnet152 --physdev-is-bridged -m set --match-set i-2-104-VM src -m tcp --dport 53 -j RETURN
-A i-2-104-def -m physdev --physdev-in vnet152 --physdev-is-bridged -m set --match-set i-2-104-VM src -j i-2-104-VM-eg
-A i-2-104-def -m physdev --physdev-out vnet152 --physdev-is-bridged -j i-2-104-VM
:i-2-104-VM - [0:0]
# the incoming traffic to the VM gets into this chain
# The test VM has only one rule there: accept all protocols from 0.0.0.0/0
-A i-2-104-VM -j ACCEPT
-A i-2-104-VM -j RETURN
:i-2-104-VM-eg - [0:0]
# the outgoing traffic from the VM gets into this chain
# the test VM has no rules there:
-A i-2-104-VM-eg -j ACCEPT
Now, let's look at the rules that my changes produce:
*raw
:PREROUTING ACCEPT [0:0]
# Everything in the cs_notrack ipset gets excluded from conntrack
-A PREROUTING -m set --match-set cs_notrack dst -j NOTRACK
-A PREROUTING -m set --match-set cs_notrack src -j NOTRACK
:FORWARD ACCEPT [0:0]
-A FORWARD -o brbond0-302 -m physdev --physdev-is-bridged -j BF-brbond0-302
-A FORWARD -i brbond0-302 -m physdev --physdev-is-bridged -j BF-brbond0-302
-A FORWARD -o brbond0-302 -j DROP
-A FORWARD -i brbond0-302 -j DROP
:BF-brbond0-302 - [0:0]
-A BF-brbond0-302 -m physdev --physdev-is-in --physdev-is-bridged -j BF-brbond0-302-IN
-A BF-brbond0-302 -m physdev --physdev-is-out --physdev-is-bridged -j BF-brbond0-302-OUT
-A BF-brbond0-302 -m physdev --physdev-out bond0.302 --physdev-is-bridged -j ACCEPT
:BF-brbond0-302-IN - [0:0]
-A BF-brbond0-302-IN -m physdev --physdev-in vnet4 --physdev-is-bridged -j i-2-340-def
:BF-brbond0-302-OUT - [0:0]
-A BF-brbond0-302-OUT -m physdev --physdev-out vnet4 --physdev-is-bridged -j i-2-340-def
:i-2-340-def - [0:0]
-A i-2-340-def -p udp -m physdev --physdev-in vnet4 --physdev-is-bridged -m udp --sport 68 --dport 67 -j ACCEPT
-A i-2-340-def -p udp -m physdev --physdev-out vnet4 --physdev-is-bridged -m udp --sport 67 --dport 68 -j ACCEPT
-A i-2-340-def -p udp -m physdev --physdev-in vnet4 --physdev-is-bridged -m udp --sport 67 -j DROP
-A i-2-340-def -m physdev --physdev-in vnet4 --physdev-is-bridged -m set ! --match-set i-2-340-VM src -j DROP
-A i-2-340-def -m physdev --physdev-out vnet4 --physdev-is-bridged -m set ! --match-set i-2-340-VM dst -j DROP
# In the following 2 rules: instead of passing the traffic all the way back to the BF-brbond0-302 chain to accept it, I accept it here.
-A i-2-340-def -p udp -m physdev --physdev-in vnet4 --physdev-is-bridged -m set --match-set i-2-340-VM src -m udp --dport 53 -j ACCEPT
-A i-2-340-def -p tcp -m physdev --physdev-in vnet4 --physdev-is-bridged -m set --match-set i-2-340-VM src -m tcp --dport 53 -j ACCEPT
-A i-2-340-def -m physdev --physdev-in vnet4 --physdev-is-bridged -m set --match-set i-2-340-VM src -j i-2-340-VM-eg
-A i-2-340-def -m physdev --physdev-out vnet4 --physdev-is-bridged -j i-2-340-VM
# The following 2 rules: the last resort rule before dropping the packet: check the conntrack
-A i-2-340-def -m physdev --physdev-in vnet4 --physdev-is-bridged -m state --state RELATED,ESTABLISHED -j ACCEPT
-A i-2-340-def -m physdev --physdev-out vnet4 --physdev-is-bridged -m state --state RELATED,ESTABLISHED -j ACCEPT
# The following 2 rules: drop any other traffic concerning this vNIC
# this is done to support multi-NIC, as the same set of rules will be added below for the other NIC if the VM has one
-A i-2-340-def -m physdev --physdev-in vnet4 --physdev-is-bridged -j DROP
-A i-2-340-def -m physdev --physdev-out vnet4 --physdev-is-bridged -j DROP
My intention in reworking the logic was to reduce implicit chain returns, where you do not know what is going to happen to a packet once it is returned.
So I made the traffic flow into the VM '-def' chain and never return from it, so that it is clear what happens to a packet just by looking at the '-def' chain.
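As a rough illustration of the terminal '-def' chain layout shown in the listing above, the per-NIC rule set could be generated along these lines. The `def_chain_rules` helper and its `prefix` parameter are hypothetical and are not taken from the PR's code; the sketch only reproduces the rule order from the listing (explicit matchers first, conntrack as the last resort, then the per-vNIC DROP).

```python
def def_chain_rules(prefix, vif):
    """Build iptables-save style rules of the '-def' chain for one vNIC.

    Hypothetical helper: `prefix` is the VM chain prefix (e.g. "i-2-340"),
    `vif` is the tap device name (e.g. "vnet4").
    """
    chain = f"{prefix}-def"
    ipset = f"{prefix}-VM"
    vif_in = f"-m physdev --physdev-in {vif} --physdev-is-bridged"
    vif_out = f"-m physdev --physdev-out {vif} --physdev-is-bridged"
    established = "-m state --state RELATED,ESTABLISHED"
    rules = [
        # DHCP: client requests from the VM, server replies to the VM,
        # and no DHCP server traffic originating from the VM.
        f"-p udp {vif_in} -m udp --sport 68 --dport 67 -j ACCEPT",
        f"-p udp {vif_out} -m udp --sport 67 --dport 68 -j ACCEPT",
        f"-p udp {vif_in} -m udp --sport 67 -j DROP",
        # Drop packets whose VM-side address is not in the VM ipset.
        f"{vif_in} -m set ! --match-set {ipset} src -j DROP",
        f"{vif_out} -m set ! --match-set {ipset} dst -j DROP",
        # DNS is accepted here instead of being returned to the caller.
        f"-p udp {vif_in} -m set --match-set {ipset} src -m udp --dport 53 -j ACCEPT",
        f"-p tcp {vif_in} -m set --match-set {ipset} src -m tcp --dport 53 -j ACCEPT",
        # User-defined egress and ingress rules.
        f"{vif_in} -m set --match-set {ipset} src -j {prefix}-VM-eg",
        f"{vif_out} -j {prefix}-VM",
        # Last resort before dropping: consult conntrack.
        f"{vif_in} {established} -j ACCEPT",
        f"{vif_out} {established} -j ACCEPT",
        # Nothing concerning this vNIC ever leaves the chain.
        f"{vif_in} -j DROP",
        f"{vif_out} -j DROP",
    ]
    return [f"-A {chain} {rule}" for rule in rules]
```

For a multi-NIC VM, the same call would simply be repeated per interface and the results concatenated, e.g. `def_chain_rules("i-2-340", "vnet4") + def_chain_rules("i-2-340", "vnet5")`, where `vnet5` is a made-up second interface.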
@phsm
Thanks for the reply.
As far as I remember, both ingress rules and egress rules should be checked. This is why I questioned it.
Update regarding the missing '-j DROP' rules in the FORWARD chain:
I double-checked: those drop rules are present in the chain. It seems I just grep-ed wrongly while preparing the listings for the previous post.
I edited the rules listing in the previous post to contain these rules.
cc @loth @kriegsmanj
Description
This PR changes the behavior of Security Groups to disable connection tracking when it is not needed.
The idea is that VMs that have an "allow all" rule can open as many connections as they want without straining the host system. This change may be beneficial for VPS hosters, where VM behavior is not under the control of the server administrator.
The list of changes:
Introduced two new ipsets: cs_notrack for IPv4 and cs_notrack6 for IPv6, which contain the VM IP addresses that do not need to be tracked. When a security group contains a rule allowing all protocols from 0.0.0.0/0 (IPv4) or ::/0 (IPv6), all the IPv4 and/or IPv6 addresses of the VM are added to these ipsets.
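A minimal sketch of that decision, for illustration only. The `notrack_members` helper and the `(protocol, cidr)` rule shape are assumptions made here, not the PR's actual data structures, and whether the check looks at ingress rules, egress rules, or both is not stated in this description.

```python
import ipaddress


def notrack_members(rules, vm_ips):
    """Decide which VM addresses go into cs_notrack / cs_notrack6.

    rules:  iterable of (protocol, cidr) tuples (hypothetical shape),
            e.g. ("all", "0.0.0.0/0").
    vm_ips: iterable of address strings belonging to the VM.
    """
    rules = list(rules)
    allow_all_v4 = any(p == "all" and c == "0.0.0.0/0" for p, c in rules)
    allow_all_v6 = any(p == "all" and c == "::/0" for p, c in rules)

    members = {"cs_notrack": [], "cs_notrack6": []}
    for ip in vm_ips:
        addr = ipaddress.ip_address(ip)
        if addr.version == 4 and allow_all_v4:
            members["cs_notrack"].append(str(addr))
        elif addr.version == 6 and allow_all_v6:
            members["cs_notrack6"].append(str(addr))
    return members
```

With only an IPv4 "allow all" rule, only the VM's IPv4 addresses would be exempted from tracking; its IPv6 addresses would stay tracked until a matching ::/0 rule exists.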
Rules are added to the PREROUTING chain of the iptables raw table.
The iptables matchers -m state --state NEW are removed, as they are not needed for several reasons.
The -m state --state ESTABLISHED,RELATED rules are only placed at the end of the VM '-def' chain, as the last-resort rule before the final decision to drop the packet. The goal is to try explicit matchers as much as possible.
The behavior of the '-VM' chain that contains user-defined rules was modified:
[...] '-def' chain, where it is checked to belong to an existing connection, otherwise dropped.
[...] -m state NEW are removed.
Since the VM '-def' chain is populated with rules for each NIC, and there is no place in the code to inject the final unconditional -j DROP, I had to resort to blocking traffic matching each VM network interface at the end of each set of interface-specific rules.
A minor refactoring is done:
split_ips_by_family() now takes one or more arguments, each of which can be either a ';'-separated string or any other type that can be parsed by Python's ipaddress.ip_address() function. The function splits ';'-separated strings when it encounters them and removes empty elements and '0' literals (they indicate an empty IP address list for some reason). As a result, it returns a tuple containing a list of IPv4 addresses and a list of IPv6 addresses. Therefore, the function is backwards compatible with the previous behavior.
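For readers who want a concrete picture of that contract, here is a minimal sketch written from the description alone. It is not the PR's implementation, and the real function may differ in details such as how it normalizes or validates addresses.

```python
import ipaddress


def split_ips_by_family(*addresses):
    """Sketch of the described behavior: split the inputs into
    (ipv4_list, ipv6_list). Each argument is either a ';'-separated
    string or anything ipaddress.ip_address() can parse."""
    ip4s, ip6s = [], []
    for item in addresses:
        # Only strings are split; other types are treated as one address.
        parts = item.split(";") if isinstance(item, str) else [item]
        for part in parts:
            if isinstance(part, str):
                part = part.strip()
                # Empty elements and the '0' literal mean "no address".
                if part in ("", "0"):
                    continue
            addr = ipaddress.ip_address(part)
            (ip4s if addr.version == 4 else ip6s).append(str(addr))
    return ip4s, ip6s
```

A call such as `split_ips_by_family("10.0.0.5;0;;2001:db8::1", ipaddress.ip_address("10.0.0.6"))` would return `(["10.0.0.5", "10.0.0.6"], ["2001:db8::1"])` with this sketch, and a lone `"0"` yields two empty lists.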
[...] split_ips_by_family(), are removed.
add_to_ipset() uses the '-!' flag, which silently ignores the addition of an element that already exists in the ipset, or the removal of one that does not exist in it. It will still crash if the requested ipset does not exist. This change makes ipset add calls idempotent.
Types of changes
Feature/Enhancement Scale or Bug Severity
Feature/Enhancement Scale
Bug Severity
Screenshots (if appropriate):
How Has This Been Tested?
Tested:
How did you try to break this feature and the system with this change?