-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cni ipvlan vpc k8s support #4762
Cni ipvlan vpc k8s support #4762
Conversation
pkg/apis/kops/networking.go
Outdated
Kuberouter *KuberouterNetworkingSpec `json:"kuberouter,omitempty"` | ||
Romana *RomanaNetworkingSpec `json:"romana,omitempty"` | ||
AmazonVPC *AmazonVPCNetworkingSpec `json:"amazonvpc,omitempty"` | ||
AmazonVPCIPVlan *AmazonVPCIPVlanNetworkingSpec `json:"amazonvpcipvlan,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name cni-ipvlan-vpc-k8s doesn't really roll off the tongue so AmazonVPCIPVlan was the best I could come up with
nodeup/pkg/model/network.go
Outdated
@@ -61,6 +60,17 @@ func (b *NetworkBuilder) Build(c *fi.ModelBuilderContext) error { | |||
} | |||
} | |||
|
|||
if networking.AmazonVPCIPVlan != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hooked downloading of the binaries into this ModelBuilder. It seems a little awkward and I'm wondering if treating the download as another asset the same as the other cni binaries is the right way to go. I had to modify the Archive task to support gzip archives which seems like is already handled in the asset store.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that other CNI providers install binaries by mounting the hostPath and copying them in, possibly from the image. It's a more generic solution, in that it works on any cluster. I'd probably be more in favor of that approach, but we can also try this (designating it as alpha in the docs) and see how it goes. But also I don't know if they copy binaries or just config files.
I do agree that an asset might probably be better than introducing another code path.
That said, I think that getting a manifest would likely be best of all - going to comment on that at top level though
upup/pkg/fi/cloudup/tagbuilder.go
Outdated
@@ -127,6 +127,10 @@ func buildNodeupTags(role api.InstanceGroupRole, cluster *api.Cluster, clusterTa | |||
tags.Insert("_networking_cni") | |||
} | |||
|
|||
if networking.AmazonVPCIPVlan != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hooked the cni conflist template into the Loader with this tag. I'm not sure if that is the right place for it but it seemed to follow established patterns in the code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: We're trying to get rid of tags in general, because a function tends to be clearer. I don't think this actually does anything (it looks like you're just using it on a resource, where I don't think tags act as a filter, but I could very well be wrong)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The treewalker ignores tag directories that are unknown
I spent a fair bit of time looking at how templates get handled in the code and this seemed to be the way it's done in nodeup, though there do appear to be other ways in other places. Is there another example of doing templates in nodeup without the tree walker / tags that I've overlooked and could emulate?
upup/pkg/fi/nodeup/command.go
Outdated
@@ -241,6 +246,14 @@ func (c *NodeUpCommand) Run(out io.Writer) error { | |||
loader.Builders = append(loader.Builders, &model.CalicoBuilder{NodeupModelContext: modelContext}) | |||
} | |||
|
|||
loader.TemplateFunctions["MapJson"] = func(m map[string]string) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my text/template foo is not very strong and this is what I came up with to get my template to render correctly. I'm sure there's a better approach. This function seems out of place here
dba2164
to
d934201
Compare
Hi, I saw this PR and I am also looking into deploying lyft cni with kops. |
/ok-to-test |
I considered it and I definitely see an argument for it. To me though, the do something once semantics didn't feel like a good fit for daemonset. Having the containers that had done the downloading just hanging around felt not right and somewhat counter to one of the designed benefits of the plugin. But the thing that really tipped the scales for me personally was, having the thing configured before the kubelet even starts means there's no waiting around for cni containers to be up and running. |
yeah, I am thinking about adding prometheus+jaeger agent to the deamon set so we get more visibility into it in production + it would kind of simplify this work you are doing
|
a658dbb
to
75c6027
Compare
cmd/kops/create_cluster.go
Outdated
@@ -872,6 +872,8 @@ func RunCreateCluster(f *util.Factory, out io.Writer, c *CreateClusterOptions) e | |||
cluster.Spec.Networking.AmazonVPC = &api.AmazonVPCNetworkingSpec{} | |||
case "cilium": | |||
cluster.Spec.Networking.Cilium = &api.CiliumNetworkingSpec{} | |||
case "amazonvpcipvlan": |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are going to need to figure out names that don't cause people confusion :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lyft should be in the name
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how bout LyftVPC ? it's short and sweet
nodeup/pkg/model/network.go
Outdated
@@ -61,6 +60,17 @@ func (b *NetworkBuilder) Build(c *fi.ModelBuilderContext) error { | |||
} | |||
} | |||
|
|||
if networking.AmazonVPCIPVlan != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that other CNI providers install binaries by mounting the hostPath and copying them in, possibly from the image. It's a more generic solution, in that it works on any cluster. I'd probably be more in favor of that approach, but we can also try this (designating it as alpha in the docs) and see how it goes. But also I don't know if they copy binaries or just config files.
I do agree that an asset might probably be better than introducing another code path.
That said, I think that getting a manifest would likely be best of all - going to comment on that at top level though
pkg/apis/kops/networking.go
Outdated
Romana *RomanaNetworkingSpec `json:"romana,omitempty"` | ||
AmazonVPC *AmazonVPCNetworkingSpec `json:"amazonvpc,omitempty"` | ||
Cilium *CiliumNetworkingSpec `json:"cilium,omitempty"` | ||
AmazonVPCIPVlan *AmazonVPCIPVlanNetworkingSpec `json:"amazonvpcipvlan,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the JSON here should probably be amazonVpcIpvlan
, but on the other hand none of the others have done that (I'm thinking kuberouter should have been kubeRouter and amazonvpc should have been amazonVpc)
upup/pkg/fi/cloudup/tagbuilder.go
Outdated
@@ -127,6 +127,10 @@ func buildNodeupTags(role api.InstanceGroupRole, cluster *api.Cluster, clusterTa | |||
tags.Insert("_networking_cni") | |||
} | |||
|
|||
if networking.AmazonVPCIPVlan != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: We're trying to get rid of tags in general, because a function tends to be clearer. I don't think this actually does anything (it looks like you're just using it on a resource, where I don't think tags act as a filter, but I could very well be wrong)
var extractArgs = "xf" | ||
|
||
if e.Gzip { | ||
extractArgs = "xzf" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine. Some alternatives you might want to consider: What we did in assetstore.go is we look at the extension, and try to guess whether we should expand it: if strings.HasSuffix(file, ".tar.gz") || strings.HasSuffix(file, ".tgz")
. I also think tar
can usually figure it out, but we might not want to rely on that. Also because we are in nodeup and so we have the file at this point, we could detect the file type (e.g. run file
).
But this is good enough for now, unless you're incline to change it!
pkg/apis/kops/networking.go
Outdated
// LyftIpVlanNetworkingSpec declares that we want to use the cni-ipvlan-vpc-k8s CNI networking | ||
type AmazonVPCIPVlanNetworkingSpec struct { | ||
// Tags that subnets must have to be used for pod IPs | ||
SubnetTags map[string]string `json:"subnetTags,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So do we need all of these in order to start trying it out? I'm not sure how far along you are, but the challenge is that once we put them in the API, they're part of our contract. Can we default these or live without them, at least for now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SecurityGroupIds is set to the sgs on the primary ENI if nothing overrides it so we could just move that into a template function and not allow overrides to start
InterfaceIndex can just be 1. It's the suggested way of doing it
RouteToVPCPeers could just be true. I imagine there are situations where people want to not have this turned on
The SubnetTags could just do Type = pod. Then in the future maybe if kops creates subnets for the pod network it could add this tag to them.
BinariesDownloadURL will get removed altogether since it's handled as an asset now.
This is nice work - I had a few comments. In general I'd like to get to a model where you're able to specify a manifest for the CNI provider which nodeup/protokube then just runs I'm a bit worried about establishing an API that we can't continue to offer in future, so any fields we can default (at least to start) would be great. A few arguments for a manifest:
Some arguments against:
|
I see we are downloading stuff. Does lyft not have it in a daemonset? |
75c6027
to
a5ebe6a
Compare
@chrislovecnm Lyft doesn't have a daemonset for this. It's kind of one of the touted features that you don't have to run one.
|
9964329
to
b2faa28
Compare
@polarbizzle thanx for the update. Just looked into the commit - shouldn't cniVersion in 10-cni-ipvlan-vpc-k8s.conflist.template also be set to "0.4.2"? |
@merlineus 0.3.1 is actually the version of the cni spec |
@polarbizzle oh, I see, sorry for that :) |
254c19a
to
9781eaf
Compare
9781eaf
to
2d66ae8
Compare
2d66ae8
to
db0ffaa
Compare
https://github.com/lyft/cni-ipvlan-vpc-k8s This cni solution is slightly different in that it doesn't require running a daemonset It requires: * a config file in /etc/cni/net.d * the binaries in /opt/cni/bin * adding the --node-ip param to the kubelet This code is modeled after the AmazonVPC cni bits. I've left the setup of the required subnets as an exercise to the reader.
edcc309
to
1ceaa45
Compare
@justinsb , @chrislovecnm could you have another look and perhaps we could land this one? |
@@ -964,7 +966,7 @@ func RunCreateCluster(f *util.Factory, out io.Writer, c *CreateClusterOptions) e | |||
|
|||
case api.TopologyPrivate: | |||
if !supportsPrivateTopology(cluster.Spec.Networking) { | |||
return fmt.Errorf("Invalid networking option %s. Currently only '--networking kopeio-vxlan (or kopeio)', '--networking weave', '--networking flannel', '--networking calico', '--networking canal', '--networking kube-router', '--networking romana', '--networking amazon-vpc-routed-eni' are supported for private topologies", c.Networking) | |||
return fmt.Errorf("Invalid networking option %s. Currently only '--networking kopeio-vxlan (or kopeio)', '--networking weave', '--networking flannel', '--networking calico', '--networking canal', '--networking kube-router', '--networking romana', '--networking amazon-vpc-routed-eni' '--networking lyftvpc' are supported for private topologies", c.Networking) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really need to generate this list :-)
networking := cluster.Spec.Networking | ||
|
||
if networking == nil { | ||
return nil, fmt.Errorf("Networking is not set, and should not be nil here") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure you needed to do this - it probably would have sufficed to just do
if cluster.Spec.Networking != nil && cluster.Spec.Networking.LyftVPC != nil {
} | ||
|
||
if networking.LyftVPC != nil { | ||
tags.Insert("_lyft_vpc_cni") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You probably aren't using this tag - we're trying to get rid of tags altogether... but it's harmless
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This tag is how the cni config file template gets rendered
/upup/models/nodeup/resources/_lyft_vpc_cni/files/etc/cni/net.d/10-cni-ipvlan-vpc-k8s.conflist.template
if you can point me to a better place to handle rendering that template I could refactor accordingly
return nil, fmt.Errorf("error querying ec2 metadata service (for az/region): %v", err) | ||
} | ||
|
||
sgNames, err := metadata.GetMetadata("security-groups") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm missing something, but we can compute the security groups for any instance, because we set them up. I'll try to do a PoC so we can discuss, unless you know that I'm missing something :-)
Sorry for the delay @polarbizzle ... it's additive, and we've delayed on this for a long time, so I'm going to approve it! Hopefully should make it easier for others to make progress on it also (e.g. I want to try to compute the security groups)! /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: justinsb, polarbizzle The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Hi @polarbizzle, thanks for your work! |
We want to try out the cni plugin Lyft has put out https://github.com/lyft/cni-ipvlan-vpc-k8s
It has some desirable characteristics like no extra pods to run, less overhead by using ipvlan etc.
I've been on a journey that started with just baking the extra config and binaries on top of the kops ami which was less than savory since I had to do acrobatics to try and get --node-ip into the kubelet args and get the templating of the cni conflist done at just the right time during instance startup. It worked but was quite unsatisfying.
I considered porting that work over to using the additional user-data / hooks capabilities but that also left me feeling unsatisfied with basically doing the same acrobatics in a harder to work with format and prone to being broken by upstream changes in the future
I decided then to undertake adding support for the plugin in kops using the AmazonVPC work as a starting point.
I've arrived at this and it works great for our clusters.
This does break slightly with the other CNI plugins in that it's not really an addon. Which I think is a positive thing overall.
There are some things that I've done in the code that might not be in quite the right place. I'll call them out inline
I wanted to get this work out there to solicit feedback and hopefully get this to a state that can be adopted so we don't have to maintain this in a fork forever. I did some googling and don't see any mention of it anywhere but welcome pointers to any previous discussion of the topic.