dualstack services #1679
Headless services will operate using the following conditions:
1. api-server does not default `ipFamily` field. User may optionally choose to set it.
2. If `ipFamily` field is set by the user, then the value of `preferredIPFamily` will follow accordingly. Example: setting `ipFamily` to `IPv6` then `preferredIPFamily` will have `IPv6` as a value.
3. If `ipFamily` field is not set, then the value of `preferredIPFamily` will carry the values set in `-- service-cluster-ip-range`, which can be a single value in even in a dual stack cluster.
nit!
3. If `ipFamily` field is not set, then the value of `preferredIPFamily` will carry the values set in `-- service-cluster-ip-range`, which can be a single value in even in a dual stack cluster.
3. If `ipFamily` field is not set, then the value of `preferredIPFamily` will carry the values set in `--service-cluster-ip-range`, which can be a single value even in a dual-stack cluster.
Wait, you're allowed to have single-stack `--service-cluster-ip-range` in a dual-stack cluster? I didn't realize that. (That ought to be called out explicitly if so, since network plugins have to deal with that.)
I will make sure it is clear in the KEP and our docs. Do you recommend other places where we should make that call out?
Do you mind sharing how this impacts CNIs?
well, like, I specifically added a unit test to ensure we reject this as an invalid configuration in ovn-kubernetes: https://github.com/ovn-org/ovn-kubernetes/pull/1189/files#diff-173cba3f6f9206af564ac9ed53992a91R982 🙂
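For illustration, a configuration check along those lines might look like the following. This is a hypothetical sketch, not the actual ovn-kubernetes code; `validateRanges` and `isDualStack` are invented names, and real plugins validate parsed config structs rather than raw strings:

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// isDualStack reports whether a comma-separated CIDR list contains both an
// IPv4 and an IPv6 range.
func isDualStack(cidrs string) bool {
	var v4, v6 bool
	for _, c := range strings.Split(cidrs, ",") {
		_, ipnet, err := net.ParseCIDR(strings.TrimSpace(c))
		if err != nil {
			continue // ignore malformed entries in this sketch
		}
		if ipnet.IP.To4() != nil {
			v4 = true
		} else {
			v6 = true
		}
	}
	return v4 && v6
}

// validateRanges rejects the surprising combination discussed above:
// dual-stack cluster (pod) CIDRs paired with a single-stack service range.
func validateRanges(clusterCIDRs, serviceCIDRs string) error {
	if isDualStack(clusterCIDRs) && !isDualStack(serviceCIDRs) {
		return fmt.Errorf("dual-stack cluster CIDRs require dual-stack service CIDRs")
	}
	return nil
}

func main() {
	// Dual-stack pod CIDRs with a single-stack service range: rejected.
	fmt.Println(validateRanges("10.128.0.0/14,fd01::/48", "172.30.0.0/16"))
	// Dual-stack on both: accepted.
	fmt.Println(validateRanges("10.128.0.0/14,fd01::/48", "172.30.0.0/16,fd02::/112"))
}
```

The point of the debate in this thread is that Kubernetes itself does not forbid this combination, so a plugin enforcing such a check is being stricter than the platform.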
@smarterclayton was the one who clarified this kubernetes/kubernetes#86895 (comment):
is it possible to enable dual-stack without configuring two service CIDRs? I mean configuring only a service-cidr for IPv6 and not for IPv4?
The feature gate flag allows dual stack config. Enabling the flag doesn’t require you to use dual stack.
The text says "`--service-cluster-ip-range` ... can be a single value even in a dual-stack cluster". It should not say "a dual-stack cluster" if it means "any cluster in which the `IPv6DualStack` feature gate is active, regardless of configuration".

Saying "you can enable the `IPv6DualStack` feature gate and not be obligated to configure dual-stack networking" is unnecessary, because that's a hard requirement of the feature process. (Otherwise we'd never be able to get rid of the feature gate and declare the feature GA.)

So if the text means "`--service-cluster-ip-range` can be a single value even when `IPv6DualStack` is enabled" then that doesn't need to be said. But if it means "`--service-cluster-ip-range` can be single-stack even when the pod network is dual-stack", then that's surprising.
thanks for the clarification, I was always assuming that `--service-cluster-ip-range` can be a single value even when `IPv6DualStack` is enabled
@aojea Your assumption is correct.
Do you want to talk somewhere about the requirement for constancy of the primary family?
After we get these changes in, I'd like to maybe do a pass over the doc and make it clearer which phases things are in. Unless you read the whole thing at once, it can be confusing.
Are we demanding that `--cluster-cidr` and `--service-cluster-ip-range` have the same ordering? How about `node.spec.podCIDR`?
A topic that we've gone back and forth on, and is not covered here: ipFamilies controller. If ipFamilies is not stored (is live-derived from apiserver flags), then clients can get different answers depending on which apiserver they ask (while a single->dual upgrade is in progress). That's bad.
Consider these 2 cases:

Case 1: upgrading to dual-stack at the same time as enabling the gate (likely when it goes beta):
t0: user created a service in single-stack v4 before dual-stack was available (so no `ipFamily`)
t1: Kubernetes BETAs dual-stack support
t2: cluster upgrades version and configures dual-stack at the same time, begins
t3: reading the service populates `ipFamilies` with either nil or "v4,v6" depending on which apiserver you hit
t4: controllers do not comprehend `ipFamilies` anyway
t5: pods have not been restarted, so v6 traffic is a black hole
t6: cluster upgrade completes
t7: reading the service populates `ipFamilies` with "v4,v6"
t8: pods may or may not have been restarted, so v6 traffic is still a partial black hole

Case 2: upgrading to dual-stack when it goes GA:
t0: user created a service in single-stack v4 before dual-stack was available (so no `ipFamily`)
t1: Kubernetes BETAs dual-stack support
t2: cluster upgrades version, does not configure dual-stack
t3: reading the service populates `ipFamilies` with either nil or "v4" depending on which apiserver you hit
t4: controllers do not comprehend `ipFamilies` anyway
t5: cluster upgrade completes
t6: reading the service populates `ipFamilies` with "v4" only
t7: Kubernetes GAs dual-stack support
t8: cluster upgrade to dual-stack begins
t9: reading the service populates `ipFamilies` with either "v4" or "v4,v6" depending on which apiserver you hit
t10: controllers thrash
t11: kube-proxies are inconsistent with each other
t12: pods have not been restarted, so v6 traffic is a black hole anyway
t13: cluster upgrade to dual-stack completes
t14: reading the service populates `ipFamilies` with "v4,v6"
t15: pods may or may not have been restarted, so v6 traffic is still a partial black hole
Both seem problematic. It seems better to instead have a controller which writes that value to the service status.
Let's look at the new timelines:
Case 1: upgrading to dual-stack at the same time as enabling the gate (likely when it goes beta):
t0: user created a service in single-stack v4 before dual-stack was available (so no `ipFamily`)
t1: Kubernetes BETAs dual-stack support
t2: cluster upgrades version and configures dual-stack at the same time, begins
t3: reading the service populates `ipFamilies` with nil
t4: controllers which comprehend `ipFamilies` do whatever they did before
t5: cluster upgrade completes
t6: controller iterates all services and writes `ipFamilies` as "v4"

Case 2: upgrading to dual-stack when it goes GA:
t0: user created a service in single-stack v4 before dual-stack was available (so no `ipFamily`)
t1: Kubernetes BETAs dual-stack support
t2: cluster upgrades version, does not configure dual-stack
t3: reading the service populates `ipFamilies` with nil
t4: controllers which comprehend `ipFamilies` do whatever they did before
t5: cluster upgrade completes
t6: controller iterates all services and writes `ipFamilies` as "v4"
t7: Kubernetes GAs dual-stack support
t8: cluster upgrade to dual-stack begins
t9: reading the service returns `ipFamilies` as "v4"
t10: cluster upgrade to dual-stack completes
t11: reading the service returns `ipFamilies` as "v4"
That raises 2 new issues:
- How do you upgrade an existing service from "v4" to "v4,v6"? You can only really do that after all your pods are restarted.
- Is that really status any more? Or did I just convert it into `spec`? It's not really recoverable by observation...
What if (thinking out loud now):
- change `spec.ipFamily` to `spec.ipFamilies`
- drop `status.ipFamilies`
- on CREATE:
  - if user provided a value, use it
  - else set it from cluster-cidrs
- on GET:
  - if stored value, use it
  - else default from cluster-cidrs[0]

Then the answer to question 1 (how to upgrade single->dual) is simply "add a value to `ipFamilies`". Allow mutations which add a value, but disallow mutations that change [0].
Single-stack controllers (endpoints) look at `ipFamilies[0]`. Dual-stack controllers (endpoint-slice) look at all entries.
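A minimal Go sketch of that CREATE/GET defaulting plus the append-only mutation rule (hypothetical helper names, not the actual apiserver code):

```go
package main

import "fmt"

// IPFamily mirrors the proposed Service field values.
type IPFamily string

const (
	IPv4 IPFamily = "IPv4"
	IPv6 IPFamily = "IPv6"
)

// defaultIPFamilies applies the proposed rules: on CREATE, use the
// user-provided value or fall back to all configured cluster CIDR families;
// on GET, fill an empty stored value from cluster-cidrs[0].
// Assumes the cluster has at least one configured family.
func defaultIPFamilies(stored []IPFamily, isCreate bool, clusterCIDRFamilies []IPFamily) []IPFamily {
	if len(stored) > 0 {
		return stored // user-provided (CREATE) or persisted (GET): use as-is
	}
	if isCreate {
		return clusterCIDRFamilies // set from cluster-cidrs
	}
	return clusterCIDRFamilies[:1] // default old stored objects from cluster-cidrs[0]
}

// validateUpdate allows mutations that append a family but rejects any
// change to existing entries, including ipFamilies[0].
func validateUpdate(old, updated []IPFamily) error {
	if len(updated) < len(old) {
		return fmt.Errorf("ipFamilies entries may only be added, not removed")
	}
	for i := range old {
		if updated[i] != old[i] {
			return fmt.Errorf("existing ipFamilies entries (including [0]) are immutable")
		}
	}
	return nil
}

func main() {
	cluster := []IPFamily{IPv4, IPv6}
	fmt.Println(defaultIPFamilies(nil, true, cluster))  // CREATE with no value
	fmt.Println(defaultIPFamilies(nil, false, cluster)) // GET of an old stored object
	fmt.Println(validateUpdate([]IPFamily{IPv4}, []IPFamily{IPv4, IPv6})) // append: allowed
	fmt.Println(validateUpdate([]IPFamily{IPv4}, []IPFamily{IPv6, IPv4})) // reorder: rejected
}
```

The GET branch is what lets old stored objects present a non-empty `ipFamilies` without running a separate controller.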
No controller needed. The timelines become:
Case 1: upgrading to dual-stack at the same time as enabling the gate (likely when it goes beta):
t0: user created a service in single-stack v4 before dual-stack was available (so no `ipFamily`)
t1: Kubernetes BETAs dual-stack support
t2: cluster upgrades version and configures dual-stack at the same time, begins
t3: reading the service populates `ipFamilies` with either nil or "v4" depending on which apiserver you hit
t4: controllers which comprehend `ipFamilies` either do whatever they did before or get "v4"
t5: cluster upgrade completes
t6: reading the service populates `ipFamilies` with "v4"
t7: user upgrades service to dual-stack by writing `ipFamilies = [v4,v6]`

Case 2: upgrading to dual-stack when it goes GA:
t0: user created a service in single-stack v4 before dual-stack was available (so no `ipFamily`)
t1: Kubernetes BETAs dual-stack support
t2: cluster upgrades version, does not configure dual-stack
t3: reading the service populates `ipFamilies` with either nil or "v4" depending on which apiserver you hit
t4: controllers which comprehend `ipFamilies` either do whatever they did before or get "v4"
t5: cluster upgrade completes
t6: reading the service populates `ipFamilies` with "v4"
t7: Kubernetes GAs dual-stack support
t8: cluster upgrade to dual-stack begins
t9: reading the service returns `ipFamilies` as "v4"
t10: cluster upgrade to dual-stack completes
t11: reading the service returns `ipFamilies` as "v4"
t12: user upgrades service to dual-stack by writing `ipFamilies = [v4,v6]`
It really seems like the decision about whether a Service is "v4", "v6", or "v4,v6" should just be defined by the Endpoints, not the Service. If the Endpoints addresses are all IPv4, then the Service should be "v4" (for NodePort, etc. purposes) without you having to do anything else. Likewise with "v6" if the Endpoints are all IPv6. If there are both IPv4 and IPv6 Endpoints addresses (whether because the Service points to dual-stack pods or because it points to a mix of single-stack IPv4 and single-stack IPv6 pods) then the Service should be "v4,v6". That will fail in the context of a pod that is launched as dual-stack but actually only listens on one address family, but we had already talked about needing to indicate We'd still need
There is no
@thockin this scenario assumes that
what is the whole discussion about "ipFamilies" then?
I'm close to LGTM on this delta, after which I want to do an overall edit pass to make sure it is still coherent at a macro level.
There are a few things that need addressing.
https://github.com/kubernetes/enhancements/pull/1679/files#r434536119
https://github.com/kubernetes/enhancements/pull/1679/files#r436273707 (the second proposal) or
https://github.com/kubernetes/enhancements/pull/1679/files#r434023994
https://github.com/kubernetes/enhancements/pull/1679/files#r435267286
2. dualstack: assigned from both the first and the second cidr entries; in this scenario Services will have a dual assigned `ClusterIPs`

The above is achieved using the following changes to the Service api.
1. Add `spec.ipFamilyPolicy` field with possible values of `SingleStack`, `DualStack`, or `nil`.
My inclination is to leave this as an enum, but it's a soft pref. It's unlikely we'll ever have another value here. But I have been wrong before.
Does the naming here present real confusion to people?
This reads like I expected. @danwinship this adopts your proposed change for bool. It leaves the dependency on EPSlice and I am OK with that - it just prevents a path that is almost certainly not what the user wanted anyway.
What we should do now is boil it down to a smaller form (slides?) and get an outside API reviewer to read it. I can sign up to condense it if you want, Kal
#### Type Headless services

\<TBD\>
1. For Headless with selector, the value of `spec.preferDualStack` and `spec.ipFamilies` will drive the endpoint selection. Endpoint slice controller will select endpoints with ip families that match the value of these two flags.
Actually, only `ipFamilies` matters to controllers. Nobody except the service allocator should have to look at the `prefer` flag, right?
> What we should do now is boil it down to a smaller form (slides?) and get an outside API reviewer to read it. I can sign up to condense it if you want, Kal

@thockin yes please condense. I will attend the api review as well.
> Actually, only `ipFamilies` matters to controllers. Nobody except the service allocator should have to look at the `prefer` flag, right?

Yes. Because there will be cases where the user wants dual stack (`preferDualStack: true`) but won't get it, and `ipFamilies` is the field that reflects that fact (and `ClusterIPs`, but that would require testing ips for families wherever a controller is concerned).
But the only reason a service would `preferDualStack` but not get it is because the cluster isn't dual-stack. Or at least "doesn't have dual-stack service CIDRs". I suppose in theory we could support "I want a cluster with dual-stack pods and dual-stack headless services but only single-stack ClusterIP services" but that's weird, and I think it's better to say that nobody except the service allocator looks at `preferDualStack`.
This is a relatively minor point but `ipFamilies[]` has all the information it needs. E.g.

time +0: user creates a headless service with `prefer=true`, but cluster is single-stack. They get `ipFamilies = [ v4 ]`.
time +1: endpoint controllers see `ipFamilies = [ v4 ]` and only consider v4 addresses
time +2: admin upgrades cluster to dual-stack
time +3: endpoint controllers see `ipFamilies = [ v4 ]` and only consider v4 addresses

The point being that the controllers should NOT change at T+3, because the service pre-existed dual-stack enablement.
5. The first cidr entry of `--service-cluster-ip-range=<first cidr entry>, <secondary cidr entry>` is considered cluster default ip address family.
6. Attempting to allocate `ClusterIP` or `ClusterIPs` from an ip address family that is not configured on the cluster will result in an error.

> The below discussion assumes that `Endpoint Slice and Endpoint Slice Controller` will allow dual stack endpoints, the existing endpoint controller will remain single stack. That means enabling dual stack services by having more than one entry in `--service-cluster-ip-range` without enabling `Endpoint Slice` feature gate will result in validation failures during startup.
OK, but even when using EndpointSliceController, we still also run the EndpointController and generate Endpoints for compatibility with old components, correct? So we still need to specify how EndpointController behaves in a dual-stack cluster... (Actually, we need to specify how EndpointSliceController behaves in a dual-stack cluster too... you didn't update that part of the KEP...)
@robscott Was weighing options, but I think we have a canonical answer unless we choose to change it
So it sounds like this is suggesting Endpoints resources would be single stack only. As far as the Endpoints controller, it would be relatively straightforward to drop any IPs not matching the family of the Endpoints resource.
If we want to say that the Endpoints resource is guaranteed to be single stack, that may be more difficult. Of course the Endpoints controller is not the only thing that can produce/manage Endpoints. There is a chance that someone is already writing dual-stack Endpoints in a cluster somewhere. Would we need to add some kind of field and/or validation to enforce this single stack behavior? Maybe it's fine to only update the behavior of the controller here and leave the API alone?
For the EndpointSlice controller I think the answer is somewhat simpler. Assuming we maintain the IPv4 and IPv6 address types, the EndpointSlice controller could just be modified to detect the address type and add IP addresses to EndpointSlices with the appropriate address type. So if all Pods backing a Service have both v4 and v6 IPs, we'd end up with 2x the number of EndpointSlices for a dual stack Service, half for the v4 addresses and the other half for the v6 addresses.
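The per-family split described above could be sketched as follows. This is illustrative only; the real EndpointSlice controller works with typed API objects and batches addresses into slices carrying an `addressType`, whereas here the grouping is reduced to plain strings:

```go
package main

import (
	"fmt"
	"net"
)

// AddressType mirrors the EndpointSlice address types discussed above.
type AddressType string

const (
	AddressTypeIPv4 AddressType = "IPv4"
	AddressTypeIPv6 AddressType = "IPv6"
)

// partitionByFamily splits backing pod IPs into per-family groups, so a
// dual-stack Service ends up with separate IPv4 and IPv6 EndpointSlices.
func partitionByFamily(podIPs []string) map[AddressType][]string {
	out := map[AddressType][]string{}
	for _, s := range podIPs {
		ip := net.ParseIP(s)
		if ip == nil {
			continue // skip unparseable addresses
		}
		if ip.To4() != nil {
			out[AddressTypeIPv4] = append(out[AddressTypeIPv4], s)
		} else {
			out[AddressTypeIPv6] = append(out[AddressTypeIPv6], s)
		}
	}
	return out
}

func main() {
	// Dual-stack pods contribute one address to each group.
	groups := partitionByFamily([]string{"10.0.0.5", "fd00::5", "10.0.0.6", "fd00::6"})
	fmt.Println(groups[AddressTypeIPv4]) // [10.0.0.5 10.0.0.6]
	fmt.Println(groups[AddressTypeIPv6]) // [fd00::5 fd00::6]
}
```

With this shape, pods backed by both families naturally yield twice the slices for a dual-stack Service, half per address type, as described above.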
@robscott yeah, let's be clear - when an `Endpoints` is managed by the Endpoints controller, it will be single-stack, corresponding to `ipFamilies[0]`. `Endpoints` managed by anything else is out of scope.

For EPSlice - I'm still looking to you for thoughts on whether you think merging back to a single type (IP) or keeping distinct IPv4, IPv6 makes sense. Or both, which gives creators freedom, but forces consumers to have to handle both. Once you GA, you have to keep whatever you choose.
Users who want to create a new single stack Service can create it using one of the following methods:
1. Set `spec.preferDualStack: nil` and `spec.ipFamilies: nil` fields. api-server will default `spec.preferDualStack: false` and `spec.ipFamilies: ['default ip family of the cluster']`.
2. Set `spec.preferDualStack: false`; api-server will default `spec.ipFamilies: ['cluster default ip address family']`.
Do we really need to distinguish `nil` from `false` for `preferDualStack`?

In particular, the distinction is not needed for the mutating admission hook case like it was with `ipFamilyPolicy`, because if you want to say "I can't be mutated to dual-stack" you do that by setting `ipFamilies`, not by setting `preferDualStack` (because explicitly setting `preferDualStack: false` but not setting `ipFamilies` doesn't make sense).

If we do need both `nil` and `false` then the two examples should be consistent. (Specifically "default ip family of the cluster" vs "cluster default ip address family".)
I don't think we need to handle nil. We can safely use defaulting to convert nil to false (or make it a non-pointer bool)
Only at ip-alloc, not in the defaulting logic, to allow webhooks to change "single stack by default" to "dual stack by default".
I'm not sure exactly which part of which comment you're referring to, but what I was saying is that we do not need to have a `nil` vs `false` distinction, at any point, including to handle the webhook case.

`preferDualStack: false` means what `ipFamilyPolicy: nil` used to, which means that a webhook can change a `preferDualStack: false` Service to `preferDualStack: true` as long as `ipFamilies` or `clusterIPs` isn't set, regardless of whether it's `preferDualStack: false` because the user specified that explicitly or because the value was filled in by default.
The basic idea is to allow webhooks to identify:

`preferDualStack: nil` && (nil families || single family entry) => no preference. Can do both. I don't have code that depends on single/multiple IPs.
`preferDualStack: false` && (nil families || single family entry) => I can not work with dual stack; I will fail if it is dual stack.
`preferDualStack: true` && (nil families || single family entry) => I can work with dual stack if available.
`preferDualStack: true` && (two ip families) => I have to have dual stack.

A webhook can not and must not change `false` => `true` because:
- This is not the caller's intention.
- It can not tell if the caller can actually deal with dual stack or not.
- The same applies to `true` + two families, because the intention is to have dual stack: either api-server can fulfill it or fail it, but it must not respond with a single-family service.

Forcing `false` => `true` might be applicable to some use cases, but it won't be the default use case.

The ip-alloc comment above is wrt where the defaulting will have to take effect: it will be in the ip allocation phase, not at the defaulting phase, because the defaulting logic does not know the flags set on the cluster (at least not yet).
> preferDualStack: false && (nil families || single family entry) => I can not work with dual stack; I will fail if it is dual stack.

No. This was the point of getting rid of `ipFamilyPolicy: SingleStack` and switching to a two-valued field (`false`/`true`) rather than a three-valued field (`nil`/`SingleStack`/`DualStack`). There is no "I can't do dual stack" any more. You have to say either "I can only do single-stack IPv4" (`preferDualStack: false, ipFamilies: ["IPv4"]`) or "I can only do single-stack IPv6" (`preferDualStack: false, ipFamilies: ["IPv6"]`). It doesn't make sense to say "I can't do dual-stack but I have no preference about IP families". Thus, there's no reason to treat `preferDualStack: nil, ipFamilies: nil` and `preferDualStack: false, ipFamilies: nil` as meaning different things, and thus there is no reason to even have `preferDualStack: nil`.
Or whatever. We're still alpha. We can still fix this in 1.20 if it merges one way now and we agree later it was wrong. We kind of need to get this merged soon though.
Reviewing now, but the problem is defaulting.

We want to be able to assert that consumers only need to look at `ipFamilies` and that they can rely on `ipFamilies` to not be empty.

That means that an old, stored service MUST be defaulted properly on read.

For a headful service, I can rely on `clusterIP`, but for a headless Service I can't.

So when defaulting code observes `{ clusterIP: None, ipFamilies: nil }` it has to convert that to `{ clusterIP: None, ipFamilies: [ $default-family ] }`. Defaulting can't tell the difference between reading an old saved instance and a new create op. Admission control runs AFTER defaulting, so this hypothetical admission controller can not know whether the user meant to say "I require single-stack v4" or "I don't care, but the defaults have been set".

We either need a tri-state (blech) or we need to distinguish read from create when applying defaults.

Kal was looking at this last option last I spoke to him.
ipFamilies: # defaulted
- IPv4
clusterIP: 1.2.3.4 # allocated
clusterIPs: # allocated
It might be clearer if you explicitly showed what the "input" yaml was
2. set `spec.ClusterIPs: true` and set `spec.ClusterIPs` to any two dual stack ips in any order.
3. set `spec.ipFamilies` to any two dual stack ip families in any order.
4. Set `spec.CluterIPs` to any two dual stack ip families in any order.
- If the user prefer dual stack if available (service creation will not fail if the cluster is not configured for dual stack) then they can do one of the following
It is weird that this is under "Creating a New Dual Stack Service", since it may not result in a dual-stack service...
I think you're torn between half wanting to describe how the code behaves and half wanting to provide end-user use cases, but it gets muddled.
I'm not sure we really need to fix this now, but maybe a better organization would be to first give the encyclopedic description of how the apiserver responds to each set of fields, and then after give examples of things like "what happens with a legacy service that doesn't specify any of the new fields", "how to create a service that will get allocated a single IP in a single-stack cluster and dual IPs in a dual-stack cluster", "how to create an IPv4-only service", "how to create an IPv6-primary dual-stack service even in an IPv4-primary dual-stack cluster", "how to manually specify dual cluster IPs", etc. And then you can show both the input yaml and the result after defaulting/allocation.
But that doesn't have to block this merging.
I want to do a full edit pass just because it has been patched enough times now :)
💭 If you wanted to find EndpointSlices for a particular address family, one way to achieve that (if it's defined before GA) is to use a selector. To make that feasible, have the control plane become responsible for getting the labels right (maybe an admission controller?)
@sftim Is there a requirement to be able to watch only slices of a certain family (rather than filtering the WATCH client-side)? The requirement I know of is that any given service's Endpoints might be single-stack, but different services might choose different families.
I'm LGTM on this. There are some small points to be cleaned up in followup, and one notable point to debate - pointerness of the bool. I'll open a distinct comment on that.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: khenidak, thockin

The full list of commands accepted by this bot can be found here. The pull request process is described here.
OK, let's talk about what I think is the only major sticking point left. I mentioned it in my last review, but let's just take it here. The previous proposal had `ipFamilyPolicy`. It seems there's a not-quite-aligned interpretation of this field, though. Kal and I debated a bit and there's arguments in both directions, summarized as "does the bool need to be a tristate or not". (See footnote 1 for a remark on conventions for optional fields.)

My argument (and @danwinship's, I think) is that plain bi-state is sufficient.
Kal argues that

Dan's point in proposing the bool was that "I require single-stack, but I don't care which family" is a nonsense statement. There's literally no way to code that in an app - either you handle both families or you handle one SPECIFIC family. The correct expression of "I require single-stack" must define WHICH family.

Kal's counter, which I find some merit in, is that having 2 service IPs represents "exposed surface" that some users will actively want to prevent, e.g. for simpler auditing purposes. I find it hard to refute "this doesn't make sense, but I want to do it anyway".

So this is what I want to discuss. I'll characterize these as 3 models:
In all three models, a user who sets

In model 1 that means changing

In model 2 that means changing

In model 3 that means changing

My personal feeling after writing this up is that model 3 is sort of the worst choice. It uses a bool (which can be very limiting) and forces API consumers to interpret it as a tri-state. It is isomorphic to model 1. I would argue we either go with model 2 (and tell users who care about the extra surface area to NOT run the optional admission controller) or revert the design to model 1 (which is the clearest expression of intention, I think).

Footnote 1:
Yeah, this is why I said "let's just commit it like this and we can argue more later". I'm fine if we get locked into having a `*bool` in the API even if we eventually decide we didn't need it.
The issue at hand is not really whether it's spelled `bool` or `*bool` - it's about whether `nil` is a value that an end-user can ever see on a read. Model 3 says yes, and is (IMO) a poorer form of model 1. Model 2 says no.

On Fri, Jun 19, 2020 at 1:45 PM Dan Winship wrote:

> We may still make it a pointer because that is the convention for optional fields, but the defaulting can simply turn nil into false.
>
> Yeah, this is why I said "let's just commit it like this and we can argue more later". I'm fine if we get locked into having a *bool in the API even if we eventually decide we didn't need it.
I mean, if you're going to make claims like that then my brain is automatically going to go into "devil's advocate" mode and realize that a pod could learn its pod IP via the downward API and then explicitly bind to just that one address. 🤦♂️ In that case, it would work in either single-stack IPv4 or single-stack IPv6 clusters, but would not be capable of doing dual-stack. (This wouldn't make a lot of sense for golang-based servers where binding on either
Sure, but the
If you want to write a helm chart or just generic YAML that can run in either cluster, and you only want half the surface area, this doesn't help. That said, "don't run the optional admission controller" seems like a practical answer. In that case (false, nil, nil) does mean "I want single-stack and I don't care which".
Yeah, "I want to force Services to become dual-stack but also want Services to not be dual-stack for security reasons" isn't a very compelling user story. 🙂
Updates the dualstack KEP as per the comments in kubernetes/kubernetes#86895, in prep for the next dualstack phase.
@thockin + @aojea + @lachie83 - I think we need to revise the `CoreDNS Operation` section. The resolution is a function of how the client asks for an address. DNS servers typically respond with all the records they have for a name, irrespective of the protocol version used to make the query.
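To illustrate that point with a toy lookup (not CoreDNS code; `record` and `answer` are invented for this sketch): the zone holds both A and AAAA records, and it is the query type chosen by the client, not the transport, that selects which family comes back.

```go
package main

import "fmt"

// record is a toy DNS record: a type ("A" or "AAAA") and an address.
type record struct {
	typ  string
	addr string
}

// answer returns every record of the requested type, mimicking a DNS
// server that serves whatever it has for a name, regardless of which
// IP version the query itself arrived over.
func answer(zone []record, qtype string) []string {
	var out []string
	for _, r := range zone {
		if r.typ == qtype {
			out = append(out, r.addr)
		}
	}
	return out
}

func main() {
	zone := []record{
		{"A", "10.0.0.10"},
		{"AAAA", "fd00::10"},
	}
	fmt.Println(answer(zone, "A"))    // [10.0.0.10]
	fmt.Println(answer(zone, "AAAA")) // [fd00::10]
}
```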