Openshift Multi-tenant Sandbox Hooks #600
Conversation
|
Changes Unknown when pulling 30dae9d on shawn-hurley:network-iso-impl into ** on openshift:master**. |
8e2b60a
to
d8a6f0b
Compare
|
Changes Unknown when pulling 8e2b60a on shawn-hurley:network-iso-impl into ** on openshift:master**. |
|
Changes Unknown when pulling d8a6f0b on shawn-hurley:network-iso-impl into ** on openshift:master**. |
d8a6f0b
to
b3df86b
Compare
|
Changes Unknown when pulling b3df86b on shawn-hurley:network-iso-impl into ** on openshift:master**. |
|
This PR is related to openshift/openshift-ansible#6536. They do not block each other, but they are needed to work together. |
b3df86b
to
63b6aac
Compare
|
Changes Unknown when pulling 63b6aac on shawn-hurley:network-iso-impl into ** on openshift:master**. |
pkg/clients/openshift.go
Outdated
| } | ||
|
|
||
| // JoinNamespacesNetworks - Will take the net namespace to be added to a network, | ||
| // and the namespace ID of that netwok. |
make this more clear that the namespace id is the network that we are joining to.
pkg/runtime/openshift.go
Outdated
| pluginName, err := ocli.GetClusterNetworkPlugin() | ||
| log.Debugf("plugin for the network - %v", pluginName) | ||
| if err != nil { | ||
| // Plugins could not be defined oc cluster up or other use cases. |
make this statement more clear
pkg/runtime/runtime.go
Outdated
| @@ -172,6 +186,16 @@ func (p provider) CreateSandbox(podName string, | |||
| } | |||
|
|
|||
| log.Info("Successfully created apb sandbox: [ %s ], with %s permissions in namespace %s", podName, apbRole, namespace) | |||
| log.Info("Running post create sandbox fuctions if defined.") | |||
| for i, f := range p.postSandboxCreate { | |||
| log.Debug("Running function: %v", i) | |||
Debug -> Debugf
Also, make this more useful.
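The `Debug -> Debugf` note matters because a non-formatting log call generally does not interpolate the `%v` verb. A minimal sketch of a more descriptive message, using the standard library `fmt` package as a stand-in for the broker's logger (the `hookMessage` helper, its parameters, and its wording are hypothetical, not from the PR):

```go
package main

import "fmt"

// hookMessage builds a debug line for a post-create sandbox hook.
// Hypothetical helper: the broker's real logger and message text may differ.
func hookMessage(index, total int, podName, namespace string) string {
	// Sprintf interpolates the verbs, which is what a Debugf-style call does.
	return fmt.Sprintf("running post-create sandbox hook %d of %d for pod %s in namespace %s",
		index+1, total, podName, namespace)
}

func main() {
	fmt.Println(hookMessage(0, 2, "apb-1234", "apb-sandbox"))
}
```

Including the pod name and namespace makes the line useful when several sandboxes are created concurrently.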
pkg/runtime/sandbox_hooks.go
Outdated
| // return error. | ||
| type PostSandboxCreate func(string, string, []string, string) error | ||
|
|
||
| // AddPostCreateSandbox - Adds a post create sandbox to the runtime. |
Adds a post create sandbox to the runtime -> Adds a post create sandbox function to the runtime.
pkg/runtime/sandbox_hooks.go
Outdated
| } | ||
|
|
||
| // PostSandboxDestroy - The post sand box destroy function will be called | ||
| // after the sandbox is destory. This could mean the namespace is kept around |
destroy -> destroyed
pkg/runtime/sandbox_hooks.go
Outdated
|
|
||
| // PostSandboxDestroy - The post sand box destroy function will be called | ||
| // after the sandbox is destory. This could mean the namespace is kept around | ||
| // if the error and configuration conditions are met. This function should not |
error/APB errors ....
63b6aac
to
ca3ff56
Compare
|
Changes Unknown when pulling ca3ff56 on shawn-hurley:network-iso-impl into ** on openshift:master**. |
|
@smarterclayton had some concerns with this PR, let's hold off on merging until we learn more about the concerns and can address. |
|
Adding some additional background to help others understand the context of this PR. Background of the issue: A typical workflow is:
In a multi-tenant setup, the namespace networks are isolated. This means the APB logic is unable to talk to the deployed services. The goal of the PR referenced is to allow the transient namespace to communicate to the target namespace. |
Nits
pkg/clients/openshift.go
Outdated
| } | ||
|
|
||
| // JoinNamespacesNetworks - Will take the net namespace to be added to a network, | ||
| // and the namespace ID of the netwok that is being added to. |
netwok -> network
pkg/clients/openshift.go
Outdated
| } | ||
|
|
||
| // IsolateNamespacesNetworks - Will take the net namespace to be added to a network, | ||
| // and the namespace ID of the netwok that is being added to.. |
netwok -> network
|
Summing up the bluejeans conversation:
|
|
@danwinship 1 clarification
we could potentially keep the transient namespace alive so that a cluster admin could diagnose issues if an APB fails. In that case, we should re-isolate the network correct? |
| const ( | ||
| // ChangePodNetworkAnnotation - Annotation used for changing a pod | ||
| // network, used to join networks together. | ||
| ChangePodNetworkAnnotation = "pod.network.openshift.io/multitenant.change-network" |
ah, right... I was going to say "that constant should be available in networkoapi", but it's not, because we had decided that this is "internal API", not public API. So maybe that's an argument that you should use oc adm pod-network instead? (But the API is still guaranteed to keep existing, since we have to preserve compatibility between old/new clients/servers, so it's safe for you to depend on it if you'd rather keep the code this way.)
I think that we are attempting to remove shelling out in our code, therefore I think on a whole we would prefer to keep using the API endpoints.
@jmrodri @eriknelson can you guys weigh in as far as broker code base is concerned? It sounds like the question is if we have to duplicate some constants and some logic around checking if an annotation has been updated vs shelling out to the oc command.
@danwinship Is that a fair characterization of the concern here? Do you see more code duplication than what is mentioned above?
pkg/clients/openshift.go
Outdated
| // GetClusterNetworkPlugin - Get cluster network | ||
| func (o OpenshiftClient) GetClusterNetworkPlugin() (string, error) { | ||
| net := &networkoapi.ClusterNetwork{} | ||
| err := o.networkClient.Get().Resource("clusternetworks").Name("default").Do().Into(net) |
Name(networkoapi.ClusterNetworkDefault)
| } | ||
| return wait.ExponentialBackoff(backoff, func() (bool, error) { | ||
| return didAnnotationUpdate("join", netns.NetName) | ||
| }) |
as mentioned above, this could fail if the ClusterNetwork object is stale and mistakenly claims that they're using multitenant when really they're using a third-party plugin now
| // Case insensitive check here because want to prepare if things change. | ||
| if strings.ToLower(pluginName) == "redhat/openshift-ovs-multitenant" { | ||
| log.Debugf("stating that the pluginname is multitenant - %v", pluginName) | ||
| return true, addPodNetworks, isolatePodNetworks |
as mentioned above, there's really no need to re-isolate the transient network on cleanup
You wouldn't want to keep things in the same state as when the APB ran? I guess there are arguments both ways. Hm... I was about to say "leaving the networks joined won't hurt anything", but I just realized that's not actually the case: joining networks breaks multicast and EgressNetworkPolicy. For multicast you can make things work by having the setup process set the "multicast-enabled" annotation on the transient NetNamespace if it's set on the target NetNamespace (https://docs.openshift.org/latest/admin_guide/managing_networking.html#admin-guide-networking-multicast). (Ideally you should not just set it unconditionally; the nodes will log warnings if you join two namespaces and one of them has the multicast annotation but the other doesn't.) But for EgressNetworkPolicy, there's no way to make things work; if the target namespace has any EgressNetworkPolicies defined, then joining another namespace to it will break all cluster-external traffic from that namespace. EgressNetworkPolicy isn't that widely used though, and we're thinking about deprecating it, so maybe you can just punt on that and say that APBs can't be deployed into a namespace that has any EgressNetworkPolicies defined in it, if the cluster is using the multitenant plugin. |
We are currently removing the role bindings and such even if we are keeping the namespace. I figured that we would want to do the same here, and if the cluster admin wants to re-join the networks then they can. @jwmatthews Do you have any preference here? I don't think it hurts to leave them attached; I just thought the right thing was to remove our failed run from the cluster as much as possible.
Sorry, I have not read about this policy yet, do you think that a large documentation warning on this would be sufficient? Should the broker be able to look up some value and error if a setting/condition is met? |
852f565
to
91cd0ef
Compare
|
Changes Unknown when pulling 91cd0ef on shawn-hurley:network-iso-impl into ** on openshift:master**. |
* Post destroy sandbox hook to isolate the networks
* Post create sandbox hook to join the networks.
* Adding implementations for these hooks.
* Adding the new permissions to the "production" deployment temp.
* Addressing wording and spelling issues.
* Adding network policy object creation and addressing comments.
* Adding ability to delete network policy on sandbox deletion.
91cd0ef
to
746ff2f
Compare
|
@enj @danwinship @knobunc Can you guys double check this PR. I believe that I have updated it to match the comments and process that was agreed upon. Thanks! |
|
Changes Unknown when pulling 746ff2f on shawn-hurley:network-iso-impl into ** on openshift:master**. |
|
We log the error, but because of how the post hook works, a failure will not cause the rest of the APB execution to quit. I think we do not need to change this one, and we can add documentation about the message.
It is a UUID so it does have some uniqueness guarantee, as far as I understand? |
|
Changes Unknown when pulling ec2971c on shawn-hurley:network-iso-impl into ** on openshift:master**. |
| @@ -171,7 +186,44 @@ func (p provider) CreateSandbox(podName string, | |||
| } | |||
| } | |||
|
|
|||
| // Must create a Network policy to allow for comunication from the APB pod to the target namespace. | |||
| networkPolicy := &networkingv1.NetworkPolicy{ | |||
is there a "deny all ingress" network policy created in the apb pod namespace that will prevent the user from initiating communication in the other direction?
Correct me if I am wrong, but if the network policy does not exist to allow the traffic the base state of the plugin ovs-networkpolicy is to disallow traffic.
If you are using ovs-multitenant plugin I don't think that you can.
@danwinship are these statements correct?
The base state of the networkpolicy plugin is to allow all traffic. But you said before there's nothing listening to any ports in the temp namespace, so this shouldn't matter?
But if you want to be paranoid, you could add a policy:
    kind: NetworkPolicy
    apiVersion: extensions/v1beta1
    metadata:
      name: deny-all
    spec:
      podSelector:
      ingress: []
I think that if we ever open a port for the APB to listen on then we would want to control the network with NetworkPolicy as @danwinship said. I don't think it is a concern as there is nothing exposed to the network in the apb pod namespace.
@liggitt and I think @enj you had the same concern, does the above sound acceptable to you?
yes
OK
Yeah, that's good. Not predictable, and an attacker can't trick you into using a value of their choice. |
Just some additional nits
pkg/runtime/openshift.go
Outdated
| log.Debugf("adding pod networks together namespace: %v, target namespaces: %v", ns, targetNS) | ||
| // Check to make sure that we have a target namespace. | ||
| if len(targetNS) < 1 { | ||
| return fmt.Errorf("Can not find target namespace to add to its networ") |
its networ -> its network
pkg/runtime/openshift.go
Outdated
| // Check to make sure that we have a target namespace. | ||
| if len(targetNS) < 1 { | ||
| return fmt.Errorf("Can not find target namespace to add to its networ") | ||
| } |
its networ->its network
|
Changes Unknown when pulling 2e4121e on shawn-hurley:network-iso-impl into ** on openshift:master**. |
|
I tested this (albeit before changes to support the tech preview network-policy plugin) and I was able to update an APB in a multi-tenant environment. |
ACK. When I tested I was able to update an APB that required talking to the deployed pod in a multi-tenant environment.
| if err != nil { | ||
| return nil, err | ||
| } | ||
| return result, nil |
Why return result if it's not used by the caller?
no good reason. eventually, you may need it and you can ignore it when you call the method.
do you have strong feelings on this one?
It doesn't matter to me if you think it will be used eventually I'm fine with it.
I would double check, though, that result isn't going to be the same as the netns param. I would have to do some testing to verify, but I don't think NetName and NetId will change.
| if err != nil { | ||
| return nil, err | ||
| } | ||
| return result, nil |
same as above.
| @@ -3,3 +3,7 @@ package runtime | |||
| func (k kubernetes) getRuntime() string { | |||
| return "kubernetes" | |||
| } | |||
|
|
|||
| func (k kubernetes) shouldJoinNetworks() (bool, PostSandboxCreate, PostSandboxDestroy) { | |||
| return false, nil, nil | |||
Confirming for my own edification -- this is strictly an Openshift feature?
The openshift plugin for the multitenant environment is only for openshift.
The network policy handling applies to both k8s and openshift, so it is a part of the actual create sandbox and destroy sandbox methods.
Thanks.
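The split discussed in this thread, where only the OpenShift runtime with the multitenant plugin returns network hooks, can be sketched as below. The `coe` interface, the simplified no-argument `hook` type, and the `pluginName` field are assumptions for illustration; the plugin name string is the one quoted in the review:

```go
package main

import (
	"fmt"
	"strings"
)

// hook is reduced to a no-argument function to keep the sketch short;
// the hooks in the PR take the sandbox details as parameters.
type hook func() error

// multitenantPlugin is the plugin name quoted in the review.
const multitenantPlugin = "redhat/openshift-ovs-multitenant"

// coe is a hypothetical interface for the cluster runtime.
type coe interface {
	shouldJoinNetworks() (bool, hook, hook)
}

type kubernetes struct{}

// Plain Kubernetes has no multitenant plugin, so no hooks are returned.
func (kubernetes) shouldJoinNetworks() (bool, hook, hook) { return false, nil, nil }

type openshift struct {
	pluginName string // assumed to be read from the ClusterNetwork object
}

func (o openshift) shouldJoinNetworks() (bool, hook, hook) {
	// EqualFold compares case-insensitively without building a lowered copy.
	if strings.EqualFold(o.pluginName, multitenantPlugin) {
		join := func() error { return nil }    // placeholder for joining networks
		isolate := func() error { return nil } // placeholder for re-isolating
		return true, join, isolate
	}
	return false, nil, nil
}

func main() {
	runtimes := []coe{
		kubernetes{},
		openshift{pluginName: "redhat/openshift-ovs-multitenant"},
		openshift{pluginName: "redhat/openshift-ovs-subnet"},
	}
	for _, r := range runtimes {
		ok, _, _ := r.shouldJoinNetworks()
		fmt.Printf("%T join=%v\n", r, ok)
	}
}
```

Returning the hooks alongside the boolean lets the caller register them only when they apply, so the generic create and destroy paths stay free of plugin-specific checks.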
VISIACK, don't see anything that sticks out to me, just a minor spelling comment.
| } | ||
|
|
||
| // PostSandboxDestroy - The post sand box destroy function will be called | ||
| // after the sandbox is destoryed. This could mean the namespace is kept around |
destoryed -> destroyed
ACK, I like the PostHooks
* adding openshift-api to vendor
* Adding ability to add sandbox hooks to join pod networks.
* Post destroy sandbox hook to isolate the networks
* Post create sandbox hook to join the networks.
* Adding implementations for these hooks.
* Adding the new permissions to the "production" deployment temp.
* Addressing wording and spelling issues.
* Adding network policy object creation and addressing comments.
* Adding ability to delete network policy on sandbox deletion.
* re-adding isolation of net namespaces.
Describe what this PR does and why we need it:
Implementation of #572
Changes proposed in this pull request
shouldJoinNetworks method to retrieve the hooks to be used.