-
Notifications
You must be signed in to change notification settings - Fork 4.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Calculate IPv6 subnet CIDR based on cluster CIDR #11523
Conversation
/hold for more feedback on approach |
|
True. Still, exposing this as a function would be something totally new. |
One other idea coming from @rifelpet would be to use a struct field to describe this: type IPv6CIDRSubnet struct {
NewBits int
NetNum int
} |
Could we use a syntax like I figure if we have use cases for more precise allocation in future we can add those. I did a prototype #11579 and the experience feels intuitive. (My prototype uses "::/64" so it's a valid CIDR, but that's not a great experience because We'd have to figure out how we assign to terraform consistently. We could do something like
I think it's reasonable to require this specification for terraform. It would be nice if |
Another possible syntax for absolute positioning: (I don't know if |
I like the |
I don't see any particular reason not to support auto-assignment for existing clusters. Why would would we require admins of existing clusters to manually do this themselves? What's the problem with Terraform? Can't we query the existing state of the subnets in order to specify that state in the configuration when it exists? |
The reason I see not to support auto-assignment after cluster creation in the complications in logic for something that is not needed to make IPv6 work. It is nice to have, but would gladly live without it for now. |
/hold cancel |
Final implementations uses the |
upup/pkg/fi/utils/cidr.go
Outdated
|
||
// ParseCidrSubnet parses a string in the format "/<newSize>#<netNum>" | ||
// and returns the values of <newSize> and <netNum>. | ||
func ParseCidrSubnet(subnet string) (int, int64, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
While cidrsubnet()
is the Terraform name of a related function, I think a name more like ParseGeneratedCIDR()
would be more appropriate. The Terraform function takes as input a CIDR and subnets it, whereas this thing takes the implicit generated block and carves out a subnet of it. The term "subnet" is problematic because it could be confused with the Subnet
object that this is being used to fill the IPv6CIDR
field in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, "generated" is not ideal either. It is just a notation.
When using networking as the context, "subnet" has a broader meaning.
I would see it more as a ParseSubnetNotation()
name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's not necessarily generated. It's cut out from the VPC's singular CIDR.
(except VPCs, especially shared ones, can technically have multiple CIDRs.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does ParseCIDRNotation()
sound better?
@@ -98,6 +99,14 @@ func (e *Subnet) Find(c *fi.Context) (*Subnet, error) { | |||
} | |||
|
|||
actual.IPv6CIDR = association.Ipv6CidrBlock | |||
if e.IPv6CIDR != nil && strings.HasPrefix(aws.StringValue(e.IPv6CIDR), "/") { | |||
subnetIPv6CIDR, err := calculateSubnetCIDR(e.VPC.IPv6CIDR, e.IPv6CIDR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like there's a corner case that isn't handled: on the first run after a kOps upgrade, when the VPC and Subnets already exist, but there is not yet an associated Amazon-provided IPv6 association.
I don't see VPCAmazonIPv6CIDRBlock.RenderAWS()
writing to its e.VPC.IPv6CIDR
and I don't see any dependency of the Subnet
task on the VPCAmazonIPv6CIDRBlock
task.
(I do find it surprising that Find()
methods write to e
. And if they depend on values set by the Task.Run()
methods of their dependencies, how is that going to work with TargetDryRun
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VPCAmazonIPv6CIDRBlock.RenderAWS()
cannot write the CIDR, because the request doesn't actually return which block it associates. Because of that, there is no dependency and Subnet.RenderAWS()
waits for the association to be done, using fi.NewTryAgainLaterError()
.
I agree that Find()
writing to e
is awkward, but it was that or writing the e.IPv6CIDR
notation to a.IPv6CIDR
.
Also makes it easier later in FindDeletions()
. I could change that to set a.IPv6CIDR = e.IPv6CIDR
, if it makes things less awkward.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think Find()
writing to e
is okay, I'm just having to figure out what the rules and model are. It seems Find()
can refine the expected based on existing state, including state produced by tasks listed as dependencies. But it has to deal with those dependencies not running due to TargetDryRun
.
There's an opportunity here to improve the developer documentation.
This could be used in the future to support a /64
notation where the particular sub-block doesn't need to be specified.
But there is a dependency on VPCAmazonIPv6CIDRBlock
as the association won't happen unless that task is run. And there shouldn't be two blocks of code when one would do, especially not if the two blocks of code implement different semantics.
@@ -196,6 +205,22 @@ func (_ *Subnet) RenderAWS(t *awsup.AWSAPITarget, a, e, changes *Subnet) error { | |||
} | |||
} | |||
|
|||
if e.IPv6CIDR != nil && strings.HasPrefix(aws.StringValue(e.IPv6CIDR), "/") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment would be helpful to indicate this handles the case where the VPC's IPv6CIDR was not known at the time e.Find()
is run.
And actually, e.Find()
should NewTryAgainLaterError
until it can get the association. Otherwise the code path is timing-dependent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This correct here, as it handles the case of TargetDryRun
. We only care about the exact CIDR at render time, when we know that the VPC will eventually get and IPv6 CIDR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TargetDryRun
will never call RenderAWS()
, so that case will never happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Find()
has nothing to do with VPC.IPv6CIDR
. It just needs to find the current state of the subnet.
If by the time it gets to RenderAWS()
is still missing the VPC.IPv6CIDR
, NewTryAgainLaterError
waits for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Find()
is using VPC.IPv6CIDR
to get the block to make allocations from.
Find()
and RenderAWS()
are using different rules for making the allocation and which one gets run depends on timing.
@@ -196,6 +205,22 @@ func (_ *Subnet) RenderAWS(t *awsup.AWSAPITarget, a, e, changes *Subnet) error { | |||
} | |||
} | |||
|
|||
if e.IPv6CIDR != nil && strings.HasPrefix(aws.StringValue(e.IPv6CIDR), "/") { | |||
vpcIPv6CIDR, err := findVPCIPv6CIDR(t.Cloud, e.VPC.ID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why limit this to Amazon-provided IPv6 associations? Why not just use e.VPC.IPv6CIDR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot test with customer owned CIDRs. The AWS docs regarding IPv6 are not exactly clear either.
So, for now, this is meant to only support AWS owned CIDRs, where I know that can be exactly 1 per subnet with a /64 size and from exactly 1 bloc of /56 size.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's a customer-specified VPC.IPv6CIDR
, the intended behavior is obvious: take the slice off of the specified VPC IPv6CIDR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, it's obvious that it means extra work, that is unrelated. For now, this is experimental, with limitations.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could test using a shared VPC with a site-local IPv6 CIDR association.
It make sense to support only AWS-provided CIDRs in managed VPCs, especially for egress. But we shouldn't design out the shared VPC case. There's already code that pulls from VPC.IPv6CIDR
; we should be consistent about using that when it's available.
@@ -196,6 +205,22 @@ func (_ *Subnet) RenderAWS(t *awsup.AWSAPITarget, a, e, changes *Subnet) error { | |||
} | |||
} | |||
|
|||
if e.IPv6CIDR != nil && strings.HasPrefix(aws.StringValue(e.IPv6CIDR), "/") { | |||
vpcIPv6CIDR, err := findVPCIPv6CIDR(t.Cloud, e.VPC.ID) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no dependency on the VPCAmazonIPv6CIDRBlock
task, so RunTasks
might schedule the VPCAmazonIPv6CIDRBlock
task in a later slot.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no value in the dependency, as VPCAmazonIPv6CIDRBlock
will not return the associated block.
This is the reason for using NewTryAgainLaterError
in Subnet.RenderAWS
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But unless VPCAmazonIPv6CIDRBlock.Run()
is called, the IPv6 association won't ever arrive. If there's no dependency, RunTasks
might schedule VPCAmazonIPv6CIDRBlock
to a later tranche than Subnet
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The IPv6 association won't ever arriveThere is a dependency in VPC, so it will arrive.
There is a dependency in VPC, so it will arrive, sooner or later.
RunTasks might schedule VPCAmazonIPv6CIDRBlock to a later tranche than Subnet.
This is why NewTryAgainLaterError
waits for it to appear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It won't appear due to how RunTasks
schedules tasks. If tasks b
and c
depend on a
, d
depends on b
, and e
depends on c
, the scheduler effectively puts false dependencies of d
depending on c
and e
depending on b
.
We should support this syntax on the |
return err | ||
} | ||
if vpcIPv6CIDR == nil { | ||
return fi.NewTryAgainLaterError("waiting for the VPC IPv6 CIDR to be assigned") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's also possible it's a shared VPC with no IPv6 allocation forthcoming.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, there will be an error from VPCAmazonIPv6CIDRBlock
that says: "IPv6 CIDR block provided by Amazon not found".
I think we might need to make significant changes to |
I believe there are two outstanding issues:
For issue (1):
It then runs all the tasks in a given tranche in parallel. It waits for all tasks in a tranche to complete (without retries) before running any tasks in the next tranche.
If there were a subsequent change that required There also might be a race. If I'll write up issue (2) later. |
@johngmyers I addressed (1). |
At the moment, task order due to dependencies is the following:
In the subnet
In the subnet
@johngmyers Do you see any possible race condition anymore? |
The initial look seems quite promising. It will likely be 24 hours before I can give it a proper review. |
Thanks, no rush. |
checkNoChanges(t, cloud, allTasks) | ||
} | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing a test for the case where we have to wait for VPCAmazonIPv6CIDRBlock
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johngmyers The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/retest |
Ref: #8432 (comment)
Inspired by #11163.
This PR makes it possible to set subnet CIDR based on VPC CIDR on cluster creation similarly to:
There is some feedback from @johngmyers in #11442 (comment), which suggests using a
cidrsubnet(8,1)
syntax instead of acidrsubnet-8-1
syntax for configuring a subnet.The reason I chose the
cidrsubnet-8-1
is to maybe allow using this as a flag on create cluster and that it seems similar to other places, like https://kops.sigs.k8s.io/cluster_spec/#egress./cc @justinsb @johngmyers @rifelpet