WIP Refactor and Remove KubeletConfiguration from componentconfig #44252
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: mtaufen
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing
I am not sure what to do with this PR. What is a `.gogo` file? :)
ManifestURLHeader string `json:"manifestURLHeader"`

// maxOpenFiles is the number of files that can be opened by the Kubelet process.
MaxOpenFiles int64 `json:"maxOpenFiles"`
you're sure this shouldn't be a cluster-wide param?
As it's just for the Kubelet process, I'd configure it in node-level config.
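As an aside on what "node-level" means operationally: here is a minimal sketch (not kubelet code) of how a daemon could honor a maxOpenFiles setting via setrlimit on Linux. The helper name and the clamp-to-hard-limit behavior are assumptions for illustration only.

```go
package main

import (
	"fmt"
	"syscall"
)

// applyMaxOpenFiles adjusts the soft RLIMIT_NOFILE of the current process,
// the way a kubelet-like daemon could honor maxOpenFiles from node-level
// config. The soft limit is clamped to the hard limit, since an
// unprivileged process cannot raise it beyond that.
func applyMaxOpenFiles(maxOpenFiles uint64) (uint64, error) {
	var rl syscall.Rlimit
	if err := syscall.Getrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, err
	}
	if maxOpenFiles > rl.Max {
		maxOpenFiles = rl.Max // clamp to the hard limit
	}
	rl.Cur = maxOpenFiles
	if err := syscall.Setrlimit(syscall.RLIMIT_NOFILE, &rl); err != nil {
		return 0, err
	}
	return rl.Cur, nil
}

func main() {
	cur, err := applyMaxOpenFiles(1024)
	if err != nil {
		panic(err)
	}
	fmt.Println("soft RLIMIT_NOFILE now:", cur)
}
```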
// mounts, etc.).
RootDirectory string `json:"rootDirectory"`

// containerRuntime is the container runtime to use.
why wouldn't these be cluster-wide?
ContainerRuntime is probably the same cluster-wide. This is also really a CRI detail and I should probably move this to the section of stuff that is subsumed by CRI.
The RemoteRuntime/RemoteImage endpoints are paths to socket files on the node, for communicating with CRI. Probably the same cluster-wide if all of your nodes have the same filesystem layout, but it's a node-level config parameter and should be treated as such.
In the true CRI world, this field, ContainerRuntime, will not exist. Since it's a legacy field, let's not include this in the kubelet configuration (and kubelet can't transition from using docker to rkt properly anyway).
As for the runtime/image endpoints, I think it's perfectly fine to have a hybrid cluster composed of docker and rkt nodes if you'd like. It's also risky to give the impression that you can simply change the endpoint in your cluster-wide KubeletConfiguration to use another CRI implementation without any consequence, so I'd not want to encourage people to use this.
agree that this should not be cluster-wide.
type FeatureGates struct {
    // featureGates is a string of comma-separated key=value pairs that describe feature
    // gates for alpha/experimental features.
    FeatureGates string `json:"featureGates,omitempty"`
Is this something that really wants to be cluster-config? E.g. master and node components need this
Yes, it does. We ought to put this somewhere central, because so many components use it. Where that central place should be, I've yet to really think about.
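For reference, the comma-separated key=value format described in the field comment can be parsed in a few lines of Go. This is a simplified sketch, not the actual Kubernetes feature-gate parser, and the function name is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// parseFeatureGates parses a comma-separated key=value string
// (e.g. "DynamicKubeletConfig=true,Accelerators=false") into a map.
// Simplified sketch only; the real parser lives in Kubernetes itself.
func parseFeatureGates(s string) (map[string]bool, error) {
	gates := map[string]bool{}
	if s == "" {
		return gates, nil
	}
	for _, pair := range strings.Split(s, ",") {
		kv := strings.SplitN(strings.TrimSpace(pair), "=", 2)
		if len(kv) != 2 {
			return nil, fmt.Errorf("invalid feature gate %q", pair)
		}
		switch strings.ToLower(kv[1]) {
		case "true":
			gates[kv[0]] = true
		case "false":
			gates[kv[0]] = false
		default:
			return nil, fmt.Errorf("feature gate %q must be true or false", pair)
		}
	}
	return gates, nil
}

func main() {
	g, err := parseFeatureGates("DynamicKubeletConfig=true,Accelerators=false")
	if err != nil {
		panic(err)
	}
	fmt.Println(g["DynamicKubeletConfig"], g["Accelerators"])
}
```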
@thockin this isn't intended to be merged. .gogo is a made-up extension so my tools don't try to parse that file. This PR is just so that we can talk about the categories in a file that only contains the stuff from KubeletConfiguration.
At a glance, the categories in the file look too fine-grained to me.
> @thockin this isn't intended to be merged. .gogo is a made-up extension so my tools don't try to parse that file. This PR is just so that we can talk about the categories in a file that only contains the stuff from KubeletConfiguration.
A markdown file may be better -- we'd get the colored text for the code :-)
type KubeletDebugConfig struct {
    // enableDebuggingHandlers enables server endpoints for log collection
    // and local running of containers and commands
    EnableDebuggingHandlers *bool `json:"enableDebuggingHandlers"`
Caveat: EnableDebuggingHandlers is enabled by default because it's required for exec, attach, logs, portforward, etc. to work. It's probably a misnomer, and I am not sure it should really be grouped into the KubeletDebugConfig.
Um, does that mean that debug-only endpoints like configz are also served in production clusters? We really shouldn't do that...
This would also be a good opportunity for us to change the name (and split debug from exec, attach, etc.), since this config API is still alpha.
Further, since this is required for kubectl to interact directly with containers, what are the specific cases where you'd want to turn it off?
On the master nodes.
KubeAPIBurst int32 `json:"kubeAPIBurst"`
// syncFrequency is the max period between synchronizing running
// containers and config
SyncFrequency metav1.Duration `json:"syncFrequency"` // TODO(kc-refactor): This feels right here but take another look later.
SyncFrequency is only related to the kubelet's internal implementation and is not directly related to how the kubelet interacts with the apiserver (if this is what KubeAPIConfig is about).
Ah, ok then KubeAPIConfig would not be the appropriate place for it.
I think I'm going to stick it in PodLifecycleConfig. That said, it does still have some indirect effect on the frequency of API calls, right?
Very indirectly. If you sync the pods and the statuses remain unchanged, no updates will be sent.
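That behavior can be sketched as follows. The types and function are hypothetical stand-ins for the kubelet's status reporting, just to illustrate why syncFrequency only indirectly affects API traffic: each sync re-examines every pod, but an update is only "sent" when a status actually changed.

```go
package main

import "fmt"

// PodStatus is a stand-in for the status a kubelet-like agent reports.
type PodStatus struct {
	Pod   string
	Phase string
}

// syncLoopIteration models one sync (bounded by syncFrequency): every pod is
// re-examined, but a status update is only sent when the status changed
// since the last report.
func syncLoopIteration(current []PodStatus, lastReported map[string]string) (updatesSent int) {
	for _, s := range current {
		if lastReported[s.Pod] != s.Phase {
			lastReported[s.Pod] = s.Phase // "send" the update
			updatesSent++
		}
	}
	return updatesSent
}

func main() {
	last := map[string]string{}
	pods := []PodStatus{{"a", "Running"}, {"b", "Running"}}
	fmt.Println(syncLoopIteration(pods, last)) // first sync: both statuses are new
	fmt.Println(syncLoopIteration(pods, last)) // nothing changed: no API traffic
}
```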
}

type NodeStatusConfig struct {
Do we really need separate structs for NodeRegistrationConfig and NodeStatusConfig?
i guess i am wondering if we would ever have more than one field in this struct?
// dockerEndpoint is the path to the docker endpoint to communicate with.
DockerEndpoint string `json:"dockerEndpoint"`
// enableCustomMetrics enables gathering custom metrics.
EnableCustomMetrics bool `json:"enableCustomMetrics"` // for Docker
Don't think EnableCustomMetrics belongs in the DockerOptions. We support this through cadvisor, and may eventually move this out of kubelet.
CNIBinDir string `json:"cniBinDir"`
// podInfraContainerImage is the image whose network/ipc namespaces
// containers in each pod will use.
PodInfraContainerImage string `json:"podInfraContainerImage"`
This is a docker-specific field.
/* BEGIN RUNTIME REMOVAL */

// I believe the plan is to remove ALL of these from the Kubelet/Node configuration
// e.g. they become flags-only for now and are immediately marked deprecated, for later removal.
I don't think we can deprecate these flags anytime soon because there are no alternatives to specify these settings. We can only deprecate them when/if we deprecate the docker integration and/or enable running dockershim as a separate process.
Why can't we deprecate the integration now and tell people to move to CRI as the replacement?
type CgroupsConfig struct {
    // kubeletCgroups is the absolute name of cgroups to isolate the kubelet in.
    KubeletCgroups string `json:"kubeletCgroups"`
it would be nice if this could be made optional.
// for no container. Rolling back the flag requires a reboot.
SystemCgroups string `json:"systemCgroups"`
// runtimeCgroups are cgroups that container runtime is expected to be isolated in.
RuntimeCgroups string `json:"runtimeCgroups"` // TODO(kc-refactor): Does this become part of CRI now?
i think this would be deprecated, the kubelet should cease to re-parent runtimes.
cool, the more we can yank the better
RuntimeCgroups string `json:"runtimeCgroups"` // TODO(kc-refactor): Does this become part of CRI now?
// cgroupRoot is the root cgroup to use for pods. This is handled by the
// container runtime on a best effort basis.
CgroupRoot string `json:"cgroupRoot"`
i thought we updated this text. with cgroups per qos enabled, this is the cgroup that will be the parent of kubepods cgroup for all end-user pods.
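To illustrate the hierarchy being described, here is a hypothetical helper; the path components are illustrative and are not the kubelet's actual cgroup naming scheme.

```go
package main

import (
	"fmt"
	"path"
)

// podCgroupPath sketches the cgroups-per-QoS layout: cgroupRoot is the
// parent of a kubepods cgroup, under which per-QoS and per-pod cgroups
// are nested. Names are illustrative only.
func podCgroupPath(cgroupRoot, qosClass, podUID string) string {
	return path.Join(cgroupRoot, "kubepods", qosClass, "pod"+podUID)
}

func main() {
	fmt.Println(podCgroupPath("/", "burstable", "1234"))
}
```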
KubeReserved map[string]string `json:"kubeReserved"`
// This flag helps kubelet identify absolute name of top level cgroup used to enforce `SystemReserved` compute resource reservation for OS system daemons.
// Refer to [Node Allocatable](https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md) doc for more information.
SystemReservedCgroup string `json:"systemReservedCgroup,omitempty"`
this is where the separate struct for CgroupsConfig feels awkward. can we combine them into a common NodeAllocatableConfig struct?
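For context, the Node Allocatable arithmetic these fields feed into is roughly the following sketch, per the linked proposal. Eviction thresholds are omitted for brevity, and the function name and integer units are illustrative.

```go
package main

import "fmt"

// allocatable computes what pods may use: node capacity minus the
// kube-reserved and system-reserved amounts. Values are in arbitrary
// integer units (e.g. MiB of memory); eviction thresholds omitted.
func allocatable(capacity, kubeReserved, systemReserved int64) int64 {
	a := capacity - kubeReserved - systemReserved
	if a < 0 {
		return 0 // reservations should never exceed capacity
	}
	return a
}

func main() {
	// 8 GiB node, 512 MiB reserved for kube daemons, 256 MiB for the OS.
	fmt.Println(allocatable(8192, 512, 256)) // 7424
}
```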
I'll post a comment when I push that update.
I restructured into these categories:
Additional:
Also wrt FeatureGates, that mechanism is unversioned -- it should be made versioned and not coupled to other components' versioned APIs. We shouldn't have to rev the API version of every component every time feature gates are added or removed; components from a new K8s version should be able to convert an old FeatureGates object to what they expect.
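A sketch of what such a versioned, convertible FeatureGates object could look like. All names here (FeatureGatesV1, convertFeatureGates, the group strings) are hypothetical; the point is that a newer component carries known gates forward and records unknown ones instead of failing.

```go
package main

import "fmt"

// FeatureGatesV1 is a hypothetical versioned feature-gates object.
type FeatureGatesV1 struct {
	APIVersion string
	Gates      map[string]bool
}

// convertFeatureGates converts an older object to the newer version, keeping
// gates the current release still knows about and reporting the rest as
// dropped rather than erroring out.
func convertFeatureGates(old FeatureGatesV1, known map[string]bool) (converted FeatureGatesV1, dropped []string) {
	converted = FeatureGatesV1{APIVersion: "featuregates/v2", Gates: map[string]bool{}}
	for name, enabled := range old.Gates {
		if known[name] {
			converted.Gates[name] = enabled
		} else {
			dropped = append(dropped, name) // gate removed or renamed since the old version
		}
	}
	return converted, dropped
}

func main() {
	old := FeatureGatesV1{APIVersion: "featuregates/v1", Gates: map[string]bool{
		"DynamicKubeletConfig": true,
		"LongRemovedGate":      false,
	}}
	known := map[string]bool{"DynamicKubeletConfig": true}
	converted, dropped := convertFeatureGates(old, known)
	fmt.Println(len(converted.Gates), dropped)
}
```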
I made some progress on this today but it won't build yet as some test code relies on bits of the alpha dynamic config API that this PR removes. I'll probably continue this work in the 1.8 timeframe, as I really need to focus on the alpha implementation of the new dynamic config mechanism for 1.7, which matters more than the API right now. Updates:
Commit text from the latest squash:
This moves KubeletConfiguration from componentconfig to pkg/kubelet/apis/nodeconfig.

The types themselves still need a lot of refactoring work. For example, a serious audit of what can SAFELY be dynamically updated should be done. Part of this commit moves a lot of parameters out of the KubeletConfiguration type and into KubeletFlags, because they need more thought before we allow them to be dynamic (like changing cgroups on the fly).

Unfortunately this somewhat reduces the utility of dynamic config for testing, and presently the pods_container_manager_test in e2e_node is a blocker because it relied on dynamic config to change cgroups parameters. I probably shouldn't be surprised at how many dependencies on dynamic config have sprung up in the last ~year, but it is kind of incredible. After all, the current implementation is barely alpha, and the API is still subject to violent changes like this commit.
@mtaufen: The following test(s) failed:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.
@mtaufen PR needs rebase
)

// GroupName is the group name used in this package
const GroupName = "nodeconfig"
Why is this called nodeconfig and not kubeletconfig? Should it include configuration for other node agents like node problem detector or kube-proxy?
The KubeletConfiguration (and decomposition) describe how you want your node to behave, but the Kubelet is really just an implementation detail of how these node-level things are configured, so it doesn't make sense to marry the name to the Kubelet (even though the Kubelet tree is, at present, the best location for the types - because the Kubelet is the only thing that implements them).
Those node-agents should expose their configuration types from their own trees. Interesting idea about putting them in the same group though. Can you get name collisions between components' API groups during registration (like if kube-proxy also called its API group nodeconfig and had some dependency on the Kubelet)? Come to think of it this seems possible... maybe it would be better to call the group kubeletconfig just to be safe... This might also be less confusing for people.
My bias is toward trying to design higher-level APIs, and "configure general node behavior" is a higher level concept than "configure the Kubelet". I think it also gets people to think in terms of what's really necessary for achieving the node behavior they want, rather than putting all kinds of unnecessary stuff in the configuration API.
WDYT?
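The registration-collision worry can be illustrated with a toy registry. This is a sketch of the idea only, not how runtime.Scheme actually stores or resolves API groups.

```go
package main

import "fmt"

// groupRegistry is a toy model of API-group registration: if two components
// both register a group named "nodeconfig" into a shared scheme, the second
// registration conflicts with the first.
type groupRegistry map[string]string // group name -> owning component

func (r groupRegistry) register(group, component string) error {
	if owner, exists := r[group]; exists && owner != component {
		return fmt.Errorf("group %q already registered by %s", group, owner)
	}
	r[group] = component
	return nil
}

func main() {
	reg := groupRegistry{}
	fmt.Println(reg.register("nodeconfig", "kubelet"))    // first registration succeeds
	fmt.Println(reg.register("nodeconfig", "kube-proxy")) // collision error
}
```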
@mtaufen this wasn't /lgtm'd prior to codefreeze and seems like WIP. Removing the PR from the milestone
Yup that's the correct motion. This work should be targeted for 1.8.
This PR hasn't been active in 62 days. It will be closed in 27 days (Sep 4, 2017). cc @bgrant0607 @dchen1107 @derekwaynecarr @justinsb @mtaufen @smarterclayton @vishh @yujuhong You can add the 'keep-open' label to prevent this from happening, or add a comment to keep it open another 90 days.
Automatic merge from submit-queue (batch tested with PRs 50198, 49051, 48432)

move KubeletConfiguration out of componentconfig API group

I'm splitting #44252 into more manageable steps. This step moves the types and updates references. To reviewers: the most important changes are the removals from pkg/apis/componentconfig and additions to pkg/kubelet/apis/kubeletconfig. Almost everything else is an import or name update.

I have one unanswered question: should I create a whole new API scheme for Kubelet APIs rather than register e.g. a kubeletconfig group with the default runtime.Scheme instance? This feels like the right thing, as the Kubelet should be exposing its own API, but there's a big fat warning not to do this in `pkg/api/register.go`. Can anyone answer this?
I'm going to do something similar across a few PRs.
I've finished my initial pass at grouping the KubeletConfiguration types into substructures and I'm looking for feedback.

The proposed groupings are in a temporary file for now (pkg/apis/componentconfig/v1alpha1/temp_v1a1_refactor.gogo), to make it easy to see the original type side-by-side (pkg/apis/componentconfig/v1alpha1/types.go) and to make it easier to grep just the stuff from KubeletConfiguration. The .gogo extension on the temp file is just so tools don't get confused by a malformed go file.

There are a few TODO(kc-refactor) comments noting questions I have or things I still have to think about.

@yujuhong Please let me know if I effectively covered the set of flags-only-eventually-remove parameters related to container runtimes, or if I went too far.

@timstclair Please take a look at the PodSecurityConfig struct and let me know if you like grouping things this way.

@bgrant0607 @thockin @dchen1107 @vishh @dashpole @lavalamp @Random-Liu @derekwaynecarr @justinsb @smarterclayton @ncdc @mikedanese