Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Openshift Multi-tenant Sandbox Hooks #600

Merged
merged 4 commits into from Jan 11, 2018

Conversation

shawn-hurley
Copy link
Contributor

Describe what this PR does and why we need it:
Implementation of #572
Changes proposed in this pull request

@openshift-ci-robot openshift-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Dec 19, 2017
@shawn-hurley
Copy link
Contributor Author

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 30dae9d on shawn-hurley:network-iso-impl into ** on openshift:master**.

@shawn-hurley shawn-hurley force-pushed the network-iso-impl branch 2 times, most recently from 8e2b60a to d8a6f0b Compare December 19, 2017 21:28
@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 8e2b60a on shawn-hurley:network-iso-impl into ** on openshift:master**.

@coveralls
Copy link

coveralls commented Dec 19, 2017

Coverage Status

Changes Unknown when pulling d8a6f0b on shawn-hurley:network-iso-impl into ** on openshift:master**.

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling b3df86b on shawn-hurley:network-iso-impl into ** on openshift:master**.

@shawn-hurley
Copy link
Contributor Author

This PR is related to openshift/openshift-ansible#6536.

They do not block each other, but they are needed to work together.

@coveralls
Copy link

coveralls commented Dec 20, 2017

Coverage Status

Changes Unknown when pulling 63b6aac on shawn-hurley:network-iso-impl into ** on openshift:master**.

}

// JoinNamespacesNetworks - Will take the net namespace to be added to a network,
// and the namespace ID of that netwok.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this more clear that the namespace id is the network that we are joining to.

pluginName, err := ocli.GetClusterNetworkPlugin()
log.Debugf("plugin for the network - %v", pluginName)
if err != nil {
// Plugins could not be defined oc cluster up or other use cases.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

make this statement more clear

@@ -172,6 +186,16 @@ func (p provider) CreateSandbox(podName string,
}

log.Info("Successfully created apb sandbox: [ %s ], with %s permissions in namespace %s", podName, apbRole, namespace)
log.Info("Running post create sandbox fuctions if defined.")
for i, f := range p.postSandboxCreate {
log.Debug("Running function: %v", i)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Debug -> Debugf

Also, make this more useful.

// return error.
type PostSandboxCreate func(string, string, []string, string) error

// AddPostCreateSandbox - Adds a post create sandbox to the runtime.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adds a post create sandbox to the runtime -> Adds a post create sandbox function to the runtime.

}

// PostSandboxDestroy - The post sand box destroy function will be called
// after the sandbox is destory. This could mean the namespace is kept around
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

destroy -> destroyed


// PostSandboxDestroy - The post sand box destroy function will be called
// after the sandbox is destory. This could mean the namespace is kept around
// if the error and configuration conditions are met. This function should not
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

error/APB errors ....

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling ca3ff56 on shawn-hurley:network-iso-impl into ** on openshift:master**.

@jwmatthews jwmatthews added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 4, 2018
@jwmatthews
Copy link
Member

@smarterclayton had some concerns with this PR, let's hold off on merging until we learn more about the concerns and can address.

@jwmatthews
Copy link
Member

Adding some additional background to help others understand the context of this PR.

Background of the issue:
With APBs there will be 2 namespaces at play.
"target" namespace, which is the namespace the deployed services will run in
"transient" namespace, the temporary namespace the broker creates for the APB to run
only the APB code runs in here, then the namespace is destroyed.

A typical workflow is:

  1. 'target' namespace exists
  2. User runs an action that triggers an APB action
  3. Broker creates a new namespace, the 'transient' namespace
  4. Broker runs the APB in the 'transient' namespace
    a.. The actual ansible logic for the APB runs in the 'transient' namespace
    OCP Resources are manipulated in the 'target' namespace
  5. Broker destroys the 'transient' namespace

In a multi-tenant setup, the namespace networks are isolated. This means the APB logic is unable to talk to the deployed services.

The goal of the PR referenced is to allow the transient namespace to communicate to the target namespace.

Copy link

@knobunc knobunc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nits

}

// JoinNamespacesNetworks - Will take the net namespace to be added to a network,
// and the namespace ID of the netwok that is being added to.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

netwok -> network

}

// IsolateNamespacesNetworks - Will take the net namespace to be added to a network,
// and the namespace ID of the netwok that is being added to..
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

netwok -> network

@danwinship
Copy link

Summing up the bluejeans conversation:

  • The only obvious alternative approach would be to have the APB deploy a helper pod in the target namespace and then use "oc exec" to run commands through that. But that would probably require lots of additional complexity in every APB that needed these features, whereas the proposal here puts all of the extra work in the infrastructure and makes things easier for the individual APBs.
  • If the cluster is using the multitenant plugin, then joining the networks together is the right answer. I would have suggested just using oc adm pod-network ... but using the API directly works as well. Also, there's no need to re-isolate them afterward since "isolating" just means giving the transient NetNamespace a unique NetID again, but there's no point in doing that since you're about to delete it anyway.
  • If the cluster is using the redhat/openshift-ovs-networkpolicy plugin, or basically any third-party plugin, then you can't "join networks", and you'll instead have to create a NetworkPolicy in the target namespace, allowing access from the transient namespace. (Which will require putting some sort of label on the transient Namespace object first so that you can match it with a LabelSelector from the NetworkPolicy.) (If the cluster is using the multitenant plugin, you can still do this anyway; it will just be a no-op.)
  • Looking at the "default" ClusterNetwork record is the best currently-available way to figure out what network plugin is in use, but it's only guaranteed to be right if one of the openshift plugins is being used; otherwise, it's possible that it will contain stale data. (Our "how to change network plugins" documentation tells people to delete the ClusterNetwork object when switching to a third-party plugin, but they might not.). So the safe/right way to set up cross-namespace network access is:
    • always create a NetworkPolicy
    • if the default ClusterNetwork claims that multitenant is in use, then also do the join-projects hack, but it's possible that the ClusterNetwork record is wrong, so don't actually error out if the master ignores the request to join the networks. (But maybe warn about it?)

@shawn-hurley
Copy link
Contributor Author

@danwinship 1 clarification

Also, there's no need to re-isolate them afterward since "isolating" just means giving the transient NetNamespace a unique NetID again, but there's no point in doing that since you're about to delete it anyway.

we could potentially keep the transient namespace alive so that a cluster admin could diagnose issues if an APB fails. In that case, we should re-isolate the network correct?

const (
// ChangePodNetworkAnnotation - Annotation used for changing a pod
// network, used to join networks together.
ChangePodNetworkAnnotation = "pod.network.openshift.io/multitenant.change-network"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah, right... I was going to say "that constant should be available in networkoapi", but it's not, because we had decided that this is "internal API", not public API. So maybe that's an argument that you should use oc adm pod-network instead? (But the API is still guaranteed to keep existing, since we have to preserve compatibility between old/new clients/servers, so it's safe for you to depend on it if you'd rather keep the code this way.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we are attempting to remove shelling out in our code, therefore I think on a whole we would prefer to keep using the API endpoints.

@jmrodri @eriknelson can you guys weigh in as far as broker code base is concerned? It sounds like the question is if we have to duplicate some constants and some logic around checking if an annotation has been updated vs shelling out to the oc command.

@danwinship Is that a fair characterization of the concern here? Do you see more code duplication then the above mentioned?

// GetClusterNetworkPlugin - Get cluster network
func (o OpenshiftClient) GetClusterNetworkPlugin() (string, error) {
net := &networkoapi.ClusterNetwork{}
err := o.networkClient.Get().Resource("clusternetworks").Name("default").Do().Into(net)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Name(networkoapi.ClusterNetworkDefault)

}
return wait.ExponentialBackoff(backoff, func() (bool, error) {
return didAnnotationUpdate("join", netns.NetName)
})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as mentioned above, this could fail if the ClusterNetwork object is stale and mistakenly claims that they're using multitenant when really they're using a third-party plugin now

// Case insensitive check here because want to prepare if things change.
if strings.ToLower(pluginName) == "redhat/openshift-ovs-multitenant" {
log.Debugf("stating that the pluginname is multitenant - %v", pluginName)
return true, addPodNetworks, isolatePodNetworks

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as mentioned above, there's really no need to re-isolate the transient network on cleanup

@danwinship
Copy link

we could potentially keep the transient namespace alive so that a cluster admin could diagnose issues if an APB fails. In that case, we should re-isolate the network correct?

You wouldn't want to keep things in the same state as when the APB ran? I guess there are arguments both ways.

Hm... I was about to say "leaving the networks joined won't hurt anything", but I just realized that's not actually the case: joining networks breaks multicast and EgressNetworkPolicy.

For multicast you can make things work by having the setup process set the "multicast-enabled" plugin on the transient NetNamespace if it's set on the target NetNamespace (https://docs.openshift.org/latest/admin_guide/managing_networking.html#admin-guide-networking-multicast). (Ideally you should not just set it unconditionally; the nodes will log warnings if you join two namespaces and one of them has the multicast annotation but the other doesn't.)

But for EgressNetworkPolicy, there's no way to make things work; if the target namespace has any EgressNetworkPolicies defined, then joining another namespace to it will break all cluster-external traffic from that namespace.

EgressNetworkPolicy isn't that widely used though, and we're thinking about deprecating it, so maybe you can just punt on that and say that APBs can't be deployed into a namespace that has any EgressNetworkPolicies defined in it, if the cluster is using the multitenant plugin.

@shawn-hurley
Copy link
Contributor Author

You wouldn't want to keep things in the same state as when the APB ran? I guess there are arguments both ways.

We are currently removing the role bindings and such even if we are keeping the namespace. I figured that we would want to do the same and if the cluster admin would want to re-add the networks together then they could. @jwmatthews Do you have any preference here. I don't think it hurts to leave them attached, just thought the right thing was to remove our failed run from the cluster as much as possible?

EgressNetworkPolicy isn't that widely used though, and we're thinking about deprecating it, so maybe you can just punt on that and say that APBs can't be deployed into a namespace that has any EgressNetworkPolicies defined in it, if the cluster is using the multitenant plugin.

Sorry, I have not read about this policy yet, do you think that a large documentation warning on this would be sufficient? Should the broker be able to look up some value and error if a setting/condition is met?

@shawn-hurley shawn-hurley removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 5, 2018
@coveralls
Copy link

coveralls commented Jan 5, 2018

Coverage Status

Changes Unknown when pulling 91cd0ef on shawn-hurley:network-iso-impl into ** on openshift:master**.

* Post destroy sandbox hook to isolate the networks
* Post create sandbox hook to join the networks.
* Adding implementations for these hooks.
* Adding the new permissions to the "production" deployment temp.
* Addressing wording and spelling issues.
* Adding network policy object creation and adressing comments.
* Adding ability to delete network policy on sandbox deletion.
@shawn-hurley
Copy link
Contributor Author

@enj @danwinship @knobunc Can you guys double check this PR. I believe that I have updated it to match the comments and process that was agreed upon. Thanks!

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 746ff2f on shawn-hurley:network-iso-impl into ** on openshift:master**.

@shawn-hurley shawn-hurley removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 8, 2018
@danwinship
Copy link

  • You removed the "re-isolate" part at the end, but I think you should put it back. In particular, joining the temp namespace to the target namespace could cause disruption of multicast and EgressNetworkPolicy, so you want that to be undisrupted when you finish installing.
  • addPodNetworks() still always returns an error if the join fails, but there's the problem that the user might be using a third-party plugin and ClusterNetwork has stale info, in which case joining would time out (because the controller that watches for the "join" annotation wouldn't be running). Maybe the right fix here is just to document it though: "if you're using a third-party network plugin, and get such-and-such error message, then the fix is to run oc delete clusternetwork default to delete the stale config object that claims you're running the multitenant plugin".
  • The new NetworkPolicy-related code looks correct. (I assume that that value you're sticking in apb-pod-name is unique-ish?)

@shawn-hurley
Copy link
Contributor Author

addPodNetworks() still always returns an error if the join fails, but there's the problem that the user might be using a third-party plugin and ClusterNetwork has stale info, in which case joining would time out (because the controller that watches for the "join" annotation wouldn't be running). Maybe the right fix here is just to document it though: "if you're using a third-party network plugin, and get such-and-such error message, then the fix is to run oc delete clusternetwork default to delete the stale config object that claims you're running the multitenant plugin".

We are logging the error but the way in which the post hook works is it will not cause the rest of the APB execution to quit. I think that this one we do not need to change and can add that documentation regarding the message.

The new NetworkPolicy-related code looks correct. (I assume that that value you're sticking in apb-pod-name is unique-ish?)

It is a UUID so it does have some uniqueness guarantee, as far as I understand?

@coveralls
Copy link

coveralls commented Jan 8, 2018

Coverage Status

Changes Unknown when pulling ec2971c on shawn-hurley:network-iso-impl into ** on openshift:master**.

@rthallisey rthallisey added the 3.10 | release-1.2 Kubernetes 1.10 | Openshift 3.10 | Broker release-1.2 label Jan 9, 2018
@shawn-hurley shawn-hurley added 3.9 | release-1.1 Kubernetes 1.9 | Openshift 3.9 | Broker release-1.1 and removed 3.10 | release-1.2 Kubernetes 1.10 | Openshift 3.10 | Broker release-1.2 labels Jan 9, 2018
@@ -171,7 +186,44 @@ func (p provider) CreateSandbox(podName string,
}
}

// Must create a Network policy to allow for comunication from the APB pod to the target namespace.
networkPolicy := &networkingv1.NetworkPolicy{
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a "deny all ingress" network policy created in the apb pod namespace that will prevent the user from initiating communication in the other direction?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I am wrong, but if the network policy does not exist to allow the traffic the base state of the plugin ovs-networkpolicy is to disallow traffic.

If you are using ovs-multitenant plugin I don't think that you can.

@danwinship are these statements correct?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The base state of the networkpolicy plugin is to allow all traffic. But you said before there's nothing listening to any ports in the temp namespace, so this shouldn't matter?

But if you want to be paranoid, you could add a policy:

kind: NetworkPolicy
apiVersion: extensions/v1beta1
metadata:
  name: deny-all
spec:
  podSelector:
ingress: []

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that if we ever open a port for the APB to listen on then we would want to control the network with NetworkPolicy as @danwinship said. I don't think it is a concern as there is nothing exposed to the network in the apb pod namespace.

@liggitt and I think @enj you had the same concern, does the above sound acceptable to you?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

@danwinship
Copy link

We are logging the error but the way in which the post hook works is it will not cause the rest of the APB execution to quit.

OK

It is a UUID so it does have some uniqueness guarantee, as far as I understand?

Yeah, that's good. Not predictable, and an attacker can't trick you into using a value of their choice.

Copy link
Contributor

@rthallisey rthallisey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some additional nits

log.Debugf("adding pod networks together namespace: %v, target namespaces: %v", ns, targetNS)
// Check to make sure that we have a target namespace.
if len(targetNS) < 1 {
return fmt.Errorf("Can not find target namespace to add to its networ")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its networ -> its network

// Check to make sure that we have a target namespace.
if len(targetNS) < 1 {
return fmt.Errorf("Can not find target namespace to add to its networ")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its networ->its network

@coveralls
Copy link

Coverage Status

Changes Unknown when pulling 2e4121e on shawn-hurley:network-iso-impl into ** on openshift:master**.

@jmontleon
Copy link
Contributor

I tested this (albeit before changes to support the tech preview network-policy plugin) and I was able to update an APB in a multi-tenant environment.

Copy link
Contributor

@jmontleon jmontleon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK. When I tested I was able to update an APB that required talking to the deployed pod in a multi-tenant environment.

if err != nil {
return nil, err
}
return result, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why return result if it's not used by the caller?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no good reason. eventually, you may need it and you can ignore it when you call the method.

do you have strong feelings on this one?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't matter to me if you think it will be used eventually I'm fine with it.

I would double check though that result isn't going to be the same as thenetns param. I would have to do some testing to verify, but I don't think NetName and NetId will change.

if err != nil {
return nil, err
}
return result, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as above.

@@ -3,3 +3,7 @@ package runtime
func (k kubernetes) getRuntime() string {
return "kubernetes"
}

func (k kubernetes) shouldJoinNetworks() (bool, PostSandboxCreate, PostSandboxDestroy) {
return false, nil, nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirming for my own edification -- this is strictly an Openshift feature?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The openshift plugin for the multitenant environment is only for openshift.

The network policy stuff is for both k8s and openshift so that is apart of the actual create sandbox and destroy sandbox methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks.

Copy link
Contributor

@eriknelson eriknelson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VISIACK, don't see anything that sticks out to me, just a minor spelling comment.

}

// PostSandboxDestroy - The post sand box destroy function will be called
// after the sandbox is destoryed. This could mean the namespace is kept around
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

destoryed -> destroyed

Copy link
Contributor

@jmrodri jmrodri left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK, I like the PostHooks

@jmrodri jmrodri merged commit fba586e into openshift:master Jan 11, 2018
jianzhangbjz pushed a commit to jianzhangbjz/ansible-service-broker that referenced this pull request May 17, 2018
* adding openshift-api to vendor
* Adding ability to add sandbox hooks to join pod networks.
* Post destroy sandbox hook to isolate the networks
* Post create sandbox hook to join the networks.
* Adding implementations for these hooks.
* Adding the new permissions to the "production" deployment temp.
* Addressing wording and spelling issues.
* Adding network policy object creation and addressing comments.
* Adding ability to delete network policy on sandbox deletion.
* re-adding isolation of net namespaces.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.9 | release-1.1 Kubernetes 1.9 | Openshift 3.9 | Broker release-1.1 needs-review size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet