Add romana to built-in CNI options #3290

Merged: 1 commit merged into kubernetes:master on Sep 15, 2017

cgilmour
Contributor

This PR adds romana as a networking option for kops.

It installs the latest "preview" release of Romana v2.0, which provides the expected features in terms of IP allocations and route configuration. Network policy features are being ported to 2.0 and will be in the final release. (We intend to submit a followup PR for kops as part of rolling out that release.)

Note: in this setup, we're using the etcd cluster that kops deploys for k8s. This isn't ideal, but some possibilities (eg: StatefulSets) aren't practical for the CNI itself, and creating a parallel etcd cluster via manifests seemed to be a more-intrusive approach than using the existing one.
If this is a concern or problem, then I'm very open to discussing alternatives and implementing one based on your suggestions.

Also, some functionality is exclusive to AWS environments. Other cloud platforms are on Romana's roadmap but not developed yet. Let me know if that restriction needs to be enforced in code or directly documented.

@k8s-ci-robot added the cncf-cla: yes label on Aug 28, 2017
@k8s-ci-robot
Contributor

Hi @cgilmour. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-ok-to-test and size/XL labels on Aug 28, 2017
@chrislovecnm
Contributor

/ok-to-test

You mind squashing your commits?

@k8s-ci-robot removed the needs-ok-to-test label on Aug 28, 2017
Contributor

@chrislovecnm left a comment

Thanks for the PR! Some questions and changes.


// Romana declares that we want Romana networking
type RomanaNetworkingSpec struct {
DaemonServiceIP string `json:"daemonServiceIP,omitempty"`
Contributor

Can we get a description on the struct members?
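
As an illustration of what the requested field descriptions could look like, here is a minimal sketch. `DaemonServiceIP` and its JSON tag come from the diff above, and `EtcdServiceIP` is referenced by the manifest template later in this review; the comment wording, the `etcdServiceIP` tag, the `encodeSpec` helper, and the example address are assumptions, not the text that was merged.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// RomanaNetworkingSpec declares that we want Romana networking.
// The field comments below are illustrative wording only.
type RomanaNetworkingSpec struct {
	// DaemonServiceIP is the static, reserved cluster IP used for the
	// Romana daemon service.
	DaemonServiceIP string `json:"daemonServiceIP,omitempty"`
	// EtcdServiceIP is the static, reserved cluster IP used for the etcd
	// service that Romana components connect to.
	// The JSON tag here is an assumption.
	EtcdServiceIP string `json:"etcdServiceIP,omitempty"`
}

// encodeSpec is a hypothetical helper that renders the spec as JSON.
func encodeSpec(s RomanaNetworkingSpec) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// 100.64.0.99 is an example value, not taken from the PR.
	out, err := encodeSpec(RomanaNetworkingSpec{DaemonServiceIP: "100.64.0.99"})
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	fmt.Println(out)
}
```

Because both fields use `omitempty`, an unset address is left out of the serialized spec rather than written as an empty string.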


#### Installing Romana on a new Cluster

The following command sets up a cluster with Kube-router as the CNI, service proxy and networking policy provider
Contributor

Can we document that etcd has to be open to the nodes?

effect: NoSchedule
containers:
- name: romana-agent
image: quay.io/romana/agent:v2.0-preview.2
Contributor

Is this released as GA? If not, we may want to add a note to the docs.

Contributor Author

Added a note into the Romana section of networking doc.

args:
- --service-cluster-ip-range={{ .ServiceClusterIPRange }}
securityContext:
privileged: true
Contributor

Does it need full privileged?

Contributor Author

This one does, yes.

tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
Contributor

Can I add CPU and mem limits please?

Contributor Author

CPU and memory limits added for all containers in the spec.

imagePullPolicy: Always
args:
- --etcd_use_v2
- --etcd_addr={{ .Networking.Romana.EtcdServiceIP }}
Contributor

How is calico setting this? Can we do hostname(s) instead of IP address?

Contributor Author

This component, romana-vpcrouter, is involved in creating route entries so that pods can be reached, and it runs on the host network instead of as a plain pod.
We'd originally configured it as a regular pod using a DNS address for etcd, but that didn't work. It provides the routing to pods, such as kube-dns. When the routes don't exist yet, it can't find etcd to get the information it needs to add those routes.

For those reasons, it uses hostNetworking and a predeclared service IP for the etcd address.

@@ -468,6 +468,27 @@ func (b *BootstrapChannelBuilder) buildManifest() (*channelsapi.Addons, map[stri
}
}

if b.cluster.Spec.Networking.Romana != nil {
Contributor

Should we test that k8s version is 1.6 or greater?

Contributor Author

I'm fine adding this. Is there an example to base it on?

Member

It's probably best to add it to cluster validation. A similar example (where a networking mode is not supported on particular k8s versions) is here: https://github.com/kubernetes/kops/blob/master/pkg/apis/kops/validation/legacy.go#L400

(And I'm sure you could make romana work with k8s 1.5, but I agree that it probably isn't worth it :-) )

Contributor

@cgilmour did you make this change?

Contributor Author

Yes, here

tolerations:
- key: node-role.kubernetes.io/master
effect: NoSchedule
containers:
Contributor

CPU and mem limits please

@cgilmour
Contributor Author

Commits have been squashed, and updates made for the requested changes.

Member

@justinsb left a comment

Generally LGTM. I'm happy to merge and iterate on these!

if b.Cluster.Spec.Networking.Romana != nil {
// Romana needs to access etcd
glog.Warningf("Opening etcd port on masters for access from the nodes, for romana. This is unsafe in untrusted environments.")
tcpPorts = append(tcpPorts, 4001)
Member

It looks like you open etcd on 12379, not 4001.

Contributor Author

Kind of. The service port is 12379, but that gets translated by kube-proxy's iptables rules to a specific IP and port 4001. So the security group sees connections to port 4001.
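
The translation described above can be modeled with a small sketch. This is a deliberately simplified stand-in for kube-proxy's iptables rules, not how kube-proxy is implemented; the `endpoint` and `dnatTable` types and both IP addresses are hypothetical, and only the ports 12379 and 4001 come from the discussion.

```go
package main

import "fmt"

// endpoint is an IP address and TCP port pair.
type endpoint struct {
	IP   string
	Port int
}

// dnatTable maps a service address to the backend it is rewritten to.
// It is a toy model of destination NAT, used only to show the idea.
type dnatTable map[endpoint]endpoint

// translate returns the address a connection actually goes to after the
// rewrite. Destinations without a rule pass through unchanged.
func (t dnatTable) translate(dst endpoint) endpoint {
	if backend, ok := t[dst]; ok {
		return backend
	}
	return dst
}

func main() {
	// Hypothetical addresses: a service IP and a master's etcd address.
	svc := endpoint{IP: "100.64.0.88", Port: 12379}
	table := dnatTable{svc: {IP: "172.20.32.10", Port: 4001}}

	// A client dials the service port, but the packet that reaches the
	// master (and its security group) carries the backend port.
	seen := table.translate(svc)
	fmt.Printf("client dials %s:%d, security group sees %s:%d\n",
		svc.IP, svc.Port, seen.IP, seen.Port)
}
```

In this model the rewrite happens before the packet leaves the node, which is why the security group rule has to name the backend port rather than the service port.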

@@ -69,5 +69,18 @@ func (b *NetworkingOptionsBuilder) BuildOptions(o interface{}) error {
}
}

if networking.Romana != nil {
daemonIP, err := WellKnownServiceIP(clusterSpec, 99)
Member

You may want to set these only if they aren't already set, to allow the user to override them if they want to. That said I suspect we don't want to allow the user to override them?

Contributor Author

Our main need for the address is that it's static and reserved, because in this "layer" we can't depend on kube-dns pods being reachable.
I don't think it'd be an issue if users changed them, because they'd get substituted in all the necessary places, but it's not really a valuable thing to do.
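
One way to get a static, reserved address is to take a fixed offset into the service CIDR. The sketch below shows that idea under stated assumptions: `wellKnownServiceIP` is a hypothetical helper written for this example and is not the `WellKnownServiceIP` function in kops, whose implementation is not shown in this review. The CIDR `100.64.0.0/13` is an example value; the offset 99 is the one passed in the diff above.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
)

// wellKnownServiceIP returns the address at the given offset from the
// start of an IPv4 service CIDR. Hypothetical helper for illustration.
func wellKnownServiceIP(serviceCIDR string, offset int) (net.IP, error) {
	_, cidr, err := net.ParseCIDR(serviceCIDR)
	if err != nil {
		return nil, fmt.Errorf("parsing service CIDR %q: %v", serviceCIDR, err)
	}
	base := cidr.IP.To4()
	if base == nil {
		return nil, fmt.Errorf("service CIDR %q is not IPv4", serviceCIDR)
	}
	// Number of addresses in the range, used to reject offsets that
	// would fall outside it.
	ones, bits := cidr.Mask.Size()
	size := uint64(1) << uint(bits-ones)
	if offset < 1 || uint64(offset) >= size {
		return nil, fmt.Errorf("offset %d is outside %s", offset, cidr)
	}
	ip := make(net.IP, net.IPv4len)
	binary.BigEndian.PutUint32(ip, binary.BigEndian.Uint32(base)+uint32(offset))
	return ip, nil
}

func main() {
	ip, err := wellKnownServiceIP("100.64.0.0/13", 99)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ip)
}
```

Because the result depends only on the CIDR and the offset, every component that needs the address can be given the same value without a DNS lookup.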

Contributor

@justinsb thoughts?

name: romana
namespace: kube-system
spec:
clusterIP: {{ .Networking.Romana.DaemonServiceIP }}
Member

Is this actually changeable? I note that the clients aren't templated with this value, so I'm wondering how they reach the service.

Contributor Author

Most of the containers take the address from the environment variables that get added to the pod automatically for services. Those ones don't need to have the value passed in explicitly via flags.

if b.Cluster.Spec.Networking.Romana != nil {
// Romana needs to access etcd
glog.Warningf("Opening etcd port on masters for access from the nodes, for romana. This is unsafe in untrusted environments.")
tcpRanges = []portRange{{From: 1, To: 4001}, {From: 4003, To: 65535}}
Member

This is inconsistent with the above opened ports (e.g. 9600 is missing)

Contributor Author

This was based on what another CNI does. Port 9600 is covered by the second range (4003-65535).
I'm OK with making this more specific in both places.
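
The coverage claim is easy to check with a small sketch. The `portRange` fields and the two ranges are copied from the diff above; the `covered` helper is hypothetical and exists only for this check.

```go
package main

import "fmt"

// portRange is an inclusive range of TCP ports, as in the diff above.
type portRange struct {
	From int
	To   int
}

// covered reports whether port falls inside any of the given ranges.
// Hypothetical helper for illustration.
func covered(ranges []portRange, port int) bool {
	for _, r := range ranges {
		if port >= r.From && port <= r.To {
			return true
		}
	}
	return false
}

func main() {
	tcpRanges := []portRange{{From: 1, To: 4001}, {From: 4003, To: 65535}}
	for _, port := range []int{4001, 4002, 9600} {
		fmt.Printf("port %d covered: %v\n", port, covered(tcpRanges, port))
	}
}
```

With these two ranges, 4002 is the only TCP port left out, so 9600 and the etcd port 4001 are both included.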

@@ -393,6 +393,12 @@ func ValidateCluster(c *kops.Cluster, strict bool) *field.Error {
}
}

if kubernetesRelease.LT(semver.MustParse("1.6.0")) {
Contributor Author

This adds the version check to require v1.6.0 or higher for romana networking.
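
The diff uses a semver library for this comparison. As a rough, standard-library-only sketch of the same rule (Kubernetes 1.6.0 or newer), the code below compares only the major and minor components; the `atLeast` helper is hypothetical and does not reproduce the validation code in kops.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// atLeast reports whether version (for example "1.6.2" or "v1.7.0") is
// greater than or equal to major.minor. It ignores the patch component
// and is a simplified stand-in for a real semver comparison.
func atLeast(version string, major, minor int) (bool, error) {
	parts := strings.SplitN(strings.TrimPrefix(version, "v"), ".", 3)
	if len(parts) < 2 {
		return false, fmt.Errorf("version %q has no minor component", version)
	}
	maj, err := strconv.Atoi(parts[0])
	if err != nil {
		return false, fmt.Errorf("bad major version in %q: %v", version, err)
	}
	mnr, err := strconv.Atoi(parts[1])
	if err != nil {
		return false, fmt.Errorf("bad minor version in %q: %v", version, err)
	}
	if maj != major {
		return maj > major, nil
	}
	return mnr >= minor, nil
}

func main() {
	for _, v := range []string{"1.5.7", "1.6.0", "1.7.4"} {
		ok, err := atLeast(v, 1, 6)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("kubernetes %s supports romana networking: %v\n", v, ok)
	}
}
```

Ignoring the patch component is enough for a 1.6.0 floor, but a real semver library is the safer choice once pre-release tags are involved.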

@cgilmour
Contributor Author

Are there any other changes that you'd like in this PR, or other questions that need an answer?

@k8s-github-robot added the needs-rebase label on Sep 13, 2017
@k8s-github-robot

@cgilmour PR needs rebase

@k8s-github-robot removed the needs-rebase label on Sep 13, 2017
@chrislovecnm
Contributor

/lgtm

@k8s-ci-robot added the lgtm label on Sep 15, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: chrislovecnm

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot added the approved label on Sep 15, 2017
@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue

@k8s-github-robot merged commit 5cb443d into kubernetes:master on Sep 15, 2017