Kubernetes network policy support #48
Conversation
This is great. I am hoping to give feedback soon. But, if you have a newer version, please feel free to update the pull request.
So much code! Will continue to look at it.
ovn_k8s/modes/overlay.py
Outdated
match_items.add(item_match_str)
ports_match = "\ ||\ ".join(match_items)
if ports_match and src_match:
    policy_match = "%s\ &&\ %s" % (ports_match, src_match)
'man ovn-sb' says that when we use "||" and "&&" in the same match, there have to be parentheses. I have not tested this, but it looks like the parentheses are missing.
Looking back at the yml policy files I used for testing, this code path went untested. You are therefore right, I think.
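For illustration, a minimal sketch of the fix being discussed: parenthesize the '||' disjunction before combining it with '&&', since OVN match expressions bind '&&' tighter than '||'. The values below are made up and the ovn-nbctl space-escaping used in the real code is omitted:
# Illustrative fragments only; the real match_items and src_match are built elsewhere.
match_items = ["tcp.dst == 80", "tcp.dst == 443"]
src_match = "ip4.src == $some_rule_address_set"
ports_match = " || ".join(match_items)
if ports_match and src_match:
    # yields "(tcp.dst == 80 || tcp.dst == 443) && ip4.src == $some_rule_address_set"
    policy_match = "(%s) && %s" % (ports_match, src_match)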
rule_address_set = self._build_rule_address_set_name(
    policy_data['metadata']['name'],
    namespace,
    rule)
Do we need a check here to see whether the address set actually exists? Same in the function below, on removal.
I thought I added this check, but as I refactored this code many times I might have removed it.
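A rough sketch of the kind of existence check being discussed, assuming the plugin shells out to ovn-nbctl (the real helper used by the code may differ):
import subprocess

def address_set_exists(name):
    # True if an Address_Set row with this name is present in the NB DB.
    out = subprocess.check_output(
        ['ovn-nbctl', '--bare', '--columns=_uuid',
         'find', 'Address_Set', 'name=%s' % name])
    return bool(out.strip())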
I went through the commits. I am OK with the architecture. Since this is still a beta feature in K8s, I did not concentrate too much on micro-details. We can work on fixing bugs and TODOs as we go forward.
Thanks Guru! I am a bit busy at the moment with other projects, but I've spotted some things I want to improve. I should be able to provide an improved patchset (with all unit tests) early next week.
Add functions to watch and retrieve kubernetes namespaces and network policies. The patch also includes unit tests to cover the newly added functions. Signed-off-by: Salvatore Orlando <salv.orlando@gmail.com>
Force-pushed from d844bd7 to db62339.
I gave this patch series a decent test and fixed quite a number of errors I found in the process.
Force-pushed from db62339 to cd52a9d.
response = requests.get(url, headers=headers, params=query_params)
if not response:
    if response.status_code == 404:
        raise exceptions.NotFound(resource_type='namespace',
This is a little unrelated, but I think the exceptions class does not work correctly. You likely need a call in __init__ of OvnK8sException:
super(Exception, self).__init__(self.message)
good point. I'll check and push a separate patch in the morning
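For reference, a sketch of the kind of fix being suggested, assuming the exceptions module follows the usual message-template pattern; the actual class layout and field names may differ:
class OvnK8sException(Exception):
    message = "An unknown exception occurred."

    def __init__(self, **kwargs):
        try:
            self.message = self.message % kwargs
        except KeyError:
            pass
        # Without this call str(exc) would be empty and the formatted
        # message would be lost when the exception is raised.
        super(OvnK8sException, self).__init__(self.message)

class NotFound(OvnK8sException):
    message = "Resource %(resource_type)s not found"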
I am still reviewing one commit at a time. A few trivial comments.
def remove_pseudo_acls(self, policy_id):
    self.pseudo_acls.pop(policy_id, None)

def _find_policies_for_pod(self, pod_data, policies):
A comment of the form 'This function returns all the NetworkPolicy objects that apply to the pod in question' as a summary will likely make it easier to read.
I agree.
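Something along these lines, perhaps: a self-contained sketch of the suggested summary plus a plain matchLabels subset check. This is illustrative only, not the actual implementation (which is a method on the overlay mode class):
def _find_policies_for_pod(pod_data, policies):
    """Return all NetworkPolicy objects whose podSelector matches this pod."""
    pod_labels = pod_data['metadata'].get('labels', {})
    pod_policies = []
    for policy in policies:
        selector = policy['spec'].get('podSelector', {}).get('matchLabels', {})
        # An empty selector selects every pod in the namespace.
        if all(pod_labels.get(k) == v for k, v in selector.items()):
            pod_policies.append(policy)
    return pod_policies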
pod_policies.append(policy)
return pod_policies

def apply_pod_policy_acls(self, pod_data, policies):
A summary comment for the function of the form "Get all the K8s NetworkPolicy objects that affect this pod. For each of the NetworkPolicy objects, create pseudo-acls and apply them to the pod as acls." will be useful.
I think similar summary comments for all the functions this commit adds will be useful. This can come in another commit. But I think it helps in browsing code.
I will add proper docstrings for all the functions which are supposed to be part of the public interface, and simple comments for the others
rule_address_set, 'addresses', pod_ip)

def build_pseudo_acls(self, policy_data):
    # Build pseudo ACL list for policy
It will help if we have a comment that describes what a pseudo acl is or an example.
sure
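For example, a comment along these lines could go above the function; the field order shown is only a guess at what build_pseudo_acls emits:
# A "pseudo ACL" is a policy ingress rule reduced to OVN match fragments that
# are not yet bound to any logical port, e.g.:
#   (1000,                                # ACL priority
#    "tcp.dst == 80 || tcp.dst == 443",   # match built from the rule's 'ports'
#    "ip4.src == $ns_policy_rule_set",    # match built from the rule's 'from'
#    "allow-related")                     # ACL action
# It becomes a concrete OVN ACL only when combined with a pod's logical port.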
ovn_k8s/modes/overlay.py
Outdated
match = self._create_acl_match(lport_name, pseudo_acl)
vlog.dbg("Pod: %s (Namespace:%s): ACL match: %s" % (
    pod_name, pod_ns, match))
ovn_acl_data = (pseudo_acl[0], match, pseudo_acl[3])
It is a little hard to look back to see what pseudo_acl[0] and pseudo_acl[3] are. A comment will help.
I've been using the tuple out of laziness. I could just use a named tuple indeed.
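A quick sketch of the named-tuple idea; the field names here are made up, not the module's real layout:
import collections

PseudoACL = collections.namedtuple(
    'PseudoACL', ['priority', 'ports_match', 'src_match', 'action'])

pseudo_acl = PseudoACL(1000, "tcp.dst == 80", "ip4.src == $rule_set",
                       "allow-related")
match = 'outport == "some_lport" && %s' % pseudo_acl.src_match  # illustrative
# pseudo_acl.priority / pseudo_acl.action read better than pseudo_acl[0] / [3]
ovn_acl_data = (pseudo_acl.priority, match, pseudo_acl.action)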
policy_pseudo_acls = self.pseudo_acls.get(
    policy['metadata']['uid'])
if not policy_pseudo_acls:
    policy_pseudo_acls = self.build_pseudo_acls(policy)
I am looking at the code one commit at a time. As of commit 2, I don't see the address_set being created when a pod gets created. The flow looks like: create a pod -> create logical port -> get all network policies that apply to that logical port -> for each network policy, get the network policy match and convert it into pseudo acls. For each pseudo acl, convert it into an OVN ACL and apply it.
It probably gets fixed in a future commit, though, or maybe it gets created when a network policy is created?
The address set is actually created for every rule in the policy.
There is a bug, actually: the address set is also created when it's not needed. I will double check your comment, however, as I would not be surprised if you found another bug.
Otherwise I will add inline comments to clarify how address sets are created.
There is a good explanation for why you did not see the address set being created - that is done in the processor which is completed after commit number 2 -> 0cf76dd#diff-0530b779b91cf681368939077d832059R122
ovn_k8s/modes/overlay.py
Outdated
@@ -217,6 +237,16 @@ def delete_logical_port(self, event):

vlog.info("deleted logical port %s" % logical_port)

if delete_network_policies:
Would it have been easier to delete the acls first and then delete the logical port? Probably does not make a difference.
It does not, indeed, since the pod at this stage is already deleted. I'll see if removing ACLs first simplifies the logic.
I made the change. The logic looks pretty much the same, but perhaps the code is now a bit easier to debug.
rule_hash = hasher.hexdigest()
return "%s_%s_%s" % (policy_name, namespace, rule_hash)

def create_address_set(self, namespace, policy_data):
A comment here could help on what 'policy_data' is. Is it a NetworkPolicy object?
yup, I will add a comment
# It could be useful for debugging
self._create_acl(pod_id, lswitch_name, lport_name, *ovn_acl_data)

def _build_rule_address_set_name(self, policy_name, namespace, rule_data):
It will be useful to know in a comment what 'rule_data' is. I guess, because of multi-layering, it is easy to get lost in a maze. So a comment here lets you look at the function without having to go back and see what 'rule_data' is.
makes sense
ovn_k8s/modes/overlay.py
Outdated
namespace = data['metadata']['namespace']
pod_name = data['metadata']['name']

ip_address_mask = "%s/%s" % (ip_address, mask)
if create_network_policies:
Do you think the following comment here makes sense (if it does not, would you mind writing the correct one):
When a pod gets created, we need to do two things:
- Look at all network policies and see whether that network policy's pod selector matches this pod. If it does, add ingress control ACLs based on the rules in that policy object.
- Look at all the network policies and see whether this pod matches any network policy's 'from' clause. If it does match, then we need to add this pod IP to that address set.
That's correct, that is exactly what we do. I will add the comment.
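A condensed sketch of those two steps; the helper names here (matches, apply_ingress_acls, add_ip_to_address_set) are hypothetical stand-ins passed in as parameters, not the module's real API:
def on_pod_created(pod_data, policies, matches,
                   apply_ingress_acls, add_ip_to_address_set):
    for policy in policies:
        spec = policy['spec']
        # 1) The policy's podSelector matches this pod: build ingress ACLs
        #    for the pod from this policy's rules.
        if matches(spec.get('podSelector', {}), pod_data):
            apply_ingress_acls(pod_data, policy)
        # 2) The pod matches a rule's 'from' clause: add the pod IP to that
        #    rule's address set so peers' ACLs start admitting it.
        for rule in spec.get('ingress', []):
            for peer in rule.get('from', []):
                if matches(peer.get('podSelector', {}), pod_data):
                    add_ip_to_address_set(pod_data, policy, rule)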
A few more trivial comments. It has started getting more obvious how all the processors and watchers come together, but I think any new person reading this will start feeling like they are in a maze. So probably more comments will help.
I am writing notes as I go along. Maybe I will write an architecture document which explains the entire workflow to make a beginner (or my own self a couple of months down the line) understand this better.
ovn_k8s/processor/__init__.py
Outdated
@@ -33,15 +33,27 @@ def __init__(self, event_type, source, metadata):
    self.metadata = metadata


class NSEvent(Event):
A comment above the new classes that expands the class names of NS and NP will likely help.
sure
log = vlog.Vlog('policy_processor')
pw = ovn_k8s.watcher.policy_watcher
The commit message explains the policy processor. But I think it will be more useful to have a description about what the policy processor is, what its responsibilities are and who calls it in the file here.
definitely. I will also add this information to the markdown doc I added in the last commit
ovn_k8s/watcher/namespace_watcher.py
Outdated
def _send_event(self, ns_name, event_type):
    event = ovn_k8s.processor.NSEvent(event_type,
                                      source=ns_name,
                                      metadata=self.ns_cache[ns_name])
A comment here that we are adding this to a policy processor's event queue will make it very explicit that namespace events are only processed by policy processor.
sure
def process_events(self, events):
    log.dbg("Processing %d events from queue" % len(events))
    for event in events[:]:
If we add a comment here about where these different events are coming from, it makes it easy. E.g. namespace events come from the namespace watcher.
I am done!
if event.event_type == 'ADDED':
    log.dbg("Namespace %s added - spawning policy watcher" % namespace)
    watcher_thread = pw.create_namespace_watcher(
        namespace, self._pool)
    self._np_watcher_threads[namespace] = watcher_thread
    # Upon restart this event will be receive for existing pods.
s/receive/received
# Pods are affected only if the namespace is isolated
if not kubernetes.is_namespace_isolated(
        variables.K8S_API_SERVER, pod_data['metadata']['namespace']):
    log.dbg("Namespace %s for pod %s is not isolated, no further "
The debug message here states that since the pod's namespace is not isolated, you don't need any processing. But, we don't return and instead do some processing. Is this wrong?
Nope, it's not wrong. I wrote the dbg stmt before I realized we needed some processing...
I will fix this.
There is actually a chance the processing done after the log stmt is wasteful.
I will double check and if necessary push an additional PR to amend - adding TODO comments for now.
policies = ns_policy_map.get(
    pod_ns, kubernetes.get_network_policies(
        variables.K8S_API_SERVER,
        pod_data['metadata']['namespace']))
pod_data['metadata']['namespace'] can be replaced with pod_ns
will be done.
@@ -28,34 +30,192 @@ def __init__(self, pool):
    self._pool = pool
    self._np_watcher_threads = {}

def _process_ns_event(self, event):
def _process_ns_event(self, event, affected_pods, pod_events):
A comment which gives a summary of what this function does will be useful. It looks like the goal is to find all the pods affected by this namespace event and put them in the affected_pods dict.
"no processing necessary" % | ||
(pod_name. namespace, isolated_final)) | ||
return | ||
self.mode.remove_acls(pod_id) |
Any single namespace event, and we remove all the ACLs for the pod above. Is that not bad? If not, can we add a comment on why it is not a big deal?
I think you're right, that is pretty bad. I think I can easily make that better by leveraging priorities.
There is plenty of space for improvement! (That's a nice way of saying I cared mostly about functionality and very little about performance at scale in this first patch series.)
adding a TODO comment for now, will push improvements like this in different PRs.
        for event in pod_events[pod_id]])))
    self._apply_pod_acls(pod_data, pod_events[pod_id], policies)

log.info("Event processing terminated.")
Wouldn't the above message keep getting printed a lot of times?
Oh yes, and it's not the only one.
I will remove this as a starter, but I was thinking of a different PR to remove log clutter.
# in the namespace are affected
scan_pods()

def _process_pod_event(self, event, affected_pods, pod_events):
If I understand right, this function is only called when a pod's label changes. A comment of that form will be useful.
will be done
self.mode.whitelist_pod_traffic(pod_id, namespace, pod_name)

def _apply_pod_acls(self, pod_data, pod_events, policies):
    # Do shallow copy of pod events as the list will be modified
At a quick glance, I could not find where the list is being modified.
You might easily be correct: since I changed this routine several times, there might no longer be any need to do the copy.
    self._process_ns_event(event, affected_pods, pod_events)
if isinstance(event, processor.NPEvent):
    # policy add -> create ACLs for affected pods
    # policy delete -> remove ACLs for affected pods
The function _process_np_event does not actually create and destroy ACLs for the pods, right? All it does is mark all the affected pods, and then remove/create address sets and remove/create pseudo ACLs.
Thanks for your thorough review @shettyg !!!
Add methods to ovn_k8s.modes.overlay.OvnNB for implementing kubernetes network policies as ovn ACLs. These methods include:
- whitelisting pod traffic for non-isolated namespaces
- creating ACL match rules from policies' ingress spec
- applying ACL rules for policies to a given pod in an isolated namespace (including a default drop-all rule)
- removing ACL rules for a given pod
- managing OVN address sets for K8S policies' ingress rules
- adding/removing pods' IP addresses from address sets
Unit test coverage is provided for the methods added as a part of this patch.
Signed-off-by: Salvatore Orlando <salv.orlando@gmail.com>
In order to handle kubernetes network policies, both namespaces and network policies must be monitored. As policies are always namespaced, a distinct policy watcher thread is spawned for each namespace. This is achieved using a "policy processor". The namespace event is sent to the policy processor, which starts the policy watcher thread. The same policy processor will also handle (not in this patch) translation of k8s network policies into OVN ACLs.
Also, the pod watcher now checks for changes in pod labels: updates, additions, and deletions. When a change is detected, it generates a policy event so that:
1 - network policies' from clauses can be recalculated
2 - the appropriate network policies can be applied to the pod according to pod selectors
Signed-off-by: Salvatore Orlando <salv.orlando@gmail.com>
In this way event processors will not have to look into event metadata to gain more insights into what originated the events, but can just look at the event class. This enables watchers to set whatever makes sense into event metadata without having to worry about assumptions about what data event processors will find in metadata.
Signed-off-by: Salvatore Orlando <salv.orlando@gmail.com>
Add logic for processing network policy objects. This includes:
- ensuring traffic is whitelisted for pods in non-isolated namespaces
- ensuring a default 'drop' ACL is added for each pod in an isolated namespace
- translating kubernetes network policies into a pseudo-acls representation of OVN match rules
- determining which policies apply to a pod and translating pseudo-acls into actual ACLs upon pod creation
- removing all ACLs for a pod upon pod deletion
- creating and maintaining OVN address sets for the IP addresses of pods that match the from clause of network policies' rules
- monitoring transitions in the namespace isolation property and reacting accordingly
Also, the pod watcher will keep track of pod-IP mappings. As the pod IP address is removed before the pod DELETED event, when this event occurs neither pod data nor their cached version will contain a pod IP. The pod IP is required to update the address set for the network policy upon pod deletion.
As a part of this patch, the signature of the create_logical_port and delete_logical_port methods in the ovn_k8s.modes.overlay.OvnNB class has been changed to accept pod data (and pod IP for the delete method) rather than an event.
Signed-off-by: Salvatore Orlando <salv.orlando@gmail.com>
Add a markdown document discussing the kubernetes network policy integration with OVN ACLs which also includes some implementation details, and some example policies which can be used to test the feature. Signed-off-by: Salvatore Orlando <salv.orlando@gmail.com>
Force-pushed from cd52a9d to e85a11f.
Do let me know once you want me to take a second look. Force pushes don't produce notifications.
Yes, I will, but do not review this new patchset yet, as I introduced a bug in the creation of the match rule.
There is actually a weird issue with the watcher, as I'm not receiving DELETED events for pods anymore. I am not sure whether it has something to do with the code in this PR, since once I delete a pod the following GET returns no results:
whereas a watch API query will return an ADDED event for the pod I just deleted.
There is a chance this could be an upstream bug - I need to verify this.
That is weird. I haven't seen issues like this as I have been using k8s 1.3.x.
@salv-orlando @shettyg Is there any plan of moving this to golang instead? I'd prefer using golang since #110 has gone in.
Yes. That's the goal. I am looking at porting this to golang.
    policy_data['metadata']['name'],
    policy_data['metadata']['namespace'],
    rule)
src_pod_ips = 'address_set(%s)' % rule_address_set
I don't think this will work. According to openvswitch/ovs@ea38256 this line should be written as:
src_pod_ips = '$%s' % rule_address_set
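i.e. with an illustrative (made-up) set name, the resulting match fragment would look like:
rule_address_set = "default_mypolicy_a1b2c3"   # illustrative name
src_pod_ips = '$%s' % rule_address_set
match = 'ip4.src == %s' % src_pod_ips          # ip4.src == $default_mypolicy_a1b2c3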
We are comfortable with the golang watcher now. So it is time to retire this pull request. Thanks salv for all the work here!
NOTE: Testing is not yet complete!!!
Well, I'm pretty sure at this stage you will not put this code into any production environment... anyway, if you try out this code it's likely something will not work as expected, or not work at all. In that case, I would be extremely grateful if you'd let me know what went wrong, and I'll fix it.
Besides not being tested, this work is not yet complete as it lacks:
I am opening this PR to get early code reviews, so that I can start addressing feedback while the above steps (and testing of course) are completed.