
kube-proxy: We should be able to override bind addresses for healthz and metrics #108737

Open
TeddyAndrieux opened this issue Mar 16, 2022 · 32 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/network Categorizes an issue or PR as relevant to SIG Network. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@TeddyAndrieux

What happened?

When kube-proxy is run with a configuration file, we cannot override the bind addresses for healthz and metrics via CLI flags.

What did you expect to happen?

I expect the kube-proxy CLI argument to override the value from the configuration file.

How can we reproduce it (as minimally and precisely as possible)?

A typical use case: I want to expose the kube-proxy metrics only on the host IP, so I want to use the following pod template:

      containers:
      - command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf
        - --hostname-override=$(NODE_NAME)
        - --healthz-bind-address=$(HOST_IP):10256
        - --metrics-bind-address=$(HOST_IP):10249
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP

But today this does not work: kube-proxy just ignores the --healthz-bind-address and --metrics-bind-address arguments.

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-02-16T12:38:05Z", GoVersion:"go1.17.7", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-02-16T12:32:02Z", GoVersion:"go1.17.7", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@TeddyAndrieux TeddyAndrieux added the kind/bug Categorizes issue or PR as related to a bug. label Mar 16, 2022
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 16, 2022
@TeddyAndrieux
Author

/sig network

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 16, 2022
@aojea
Member

aojea commented Mar 16, 2022

this has been discussed so many times that I honestly lost track of the current status; at one point all the flags were going to be deprecated 😄 IIRC

kubernetes/enhancements#2158 (comment)

the current situation is that the kube-proxy config takes precedence over flags

@TeddyAndrieux
Author

TeddyAndrieux commented Mar 16, 2022

Ok, but here it's not only about "precedence", since even if the bind addresses are not defined in the config, the flags are still ignored.
And using only the config, we cannot easily bind to the node IP, since the ConfigMap is shared across the whole cluster.
To me, we need at least a special case for those bind addresses, even if it's not the case for other flags.

@aojea
Member

aojea commented Mar 16, 2022

Ok, but here it's not only about "precedence", since even if the bind addresses are not defined in the config, the flags are still ignored.

if the addresses are not defined, they are defaulted... so they are always "defined": if not by the user, then by kube-proxy itself

if len(obj.BindAddress) == 0 {
    obj.BindAddress = "0.0.0.0"
}
if obj.HealthzBindAddress == "" {
    obj.HealthzBindAddress = fmt.Sprintf("0.0.0.0:%v", ports.ProxyHealthzPort)
} else if !strings.Contains(obj.HealthzBindAddress, ":") {
    obj.HealthzBindAddress += fmt.Sprintf(":%v", ports.ProxyHealthzPort)
}
if obj.MetricsBindAddress == "" {
    obj.MetricsBindAddress = fmt.Sprintf("127.0.0.1:%v", ports.ProxyStatusPort)
} else if !strings.Contains(obj.MetricsBindAddress, ":") {
    obj.MetricsBindAddress += fmt.Sprintf(":%v", ports.ProxyStatusPort)
}
And using only the config we cannot easily bind to the node IP since the ConfigMap is shared across the whole cluster

yeah, I remember someone reporting this problem too, but I think he found a way to solve it; let me try to find the issue... I agree that this flag/config situation is problematic, but if you read the discussion I linked, there are trade-offs in all the solutions

@TeddyAndrieux
Author

if the addresses are not defined, they are defaulted... so they are always "defined": if not by the user, then by kube-proxy itself

Yes, sure, but to me those defaults should be applied after the flags.

yeah, I remember someone reporting this problem too, but I think he found a way to solve it; let me try to find the issue... I agree that this flag/config situation is problematic, but if you read the discussion I linked, there are trade-offs in all the solutions

As for a way to solve it: something is doable with an init container that basically writes the kube-proxy config, but it's sad to need that kind of approach just to set the bind addresses.

I agree that it does not necessarily make sense to provide flags for all config options, but it does make sense to provide flags for instance-specific settings, like the bind addresses.
It's like hostnameOverride: it could be set in the config, but we need it as a flag as well, and that's what we have today and I think we will keep it.
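
For reference, the init-container workaround could look roughly like this (a hypothetical sketch: the container name, template file, placeholder, and mount paths are all assumptions), rendering a per-node config from the downward-API host IP before kube-proxy starts:

```yaml
      initContainers:
      - name: render-kube-proxy-config   # hypothetical name
        image: busybox
        command:
        - sh
        - -c
        # Substitute the node's host IP into a config template (paths assumed)
        - sed "s/__HOST_IP__/${HOST_IP}/" /var/lib/kube-proxy/config.conf.tmpl > /rendered/config.conf
        env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        volumeMounts:
        - name: rendered-config   # an emptyDir shared with the kube-proxy container
          mountPath: /rendered
```

kube-proxy would then be started with --config=/rendered/config.conf pointing at the same emptyDir volume.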

@aojea
Member

aojea commented Mar 17, 2022

As for a way to solve it: something is doable with an init container that basically writes the kube-proxy config, but it's sad to need that kind of approach just to set the bind addresses

Or setting the bind address to 0.0.0.0 in the configuration

..., but it makes sense to provide some flags for instance-specific stuff, like bind addresses.
It's like hostnameOverride: it could be set in the config, but we need it as a flag as well, and that's what we have today and I think we will keep it

The problem I see here is that flags allow dynamic values automatically, while configuration does not; that leaves us with only one choice, as you say.

@neolit123 is there a simple way to have configuration customized by node using the downward API?

@TeddyAndrieux
Author

Or settings the bind address to 0.0.0.0 in the configuration

That's not a solution for me; I do not want to expose metrics on all IPs.

TBH I really don't get why we would keep hostnameOverride as a flag but not the bind addresses

@aojea
Member

aojea commented Mar 17, 2022

TBH I really don't get why we would keep hostnameOverride as a flag and not the bind addresses

yeah, I was trying to provide a workaround while we sort out the other problem :)
I agree that it seems incongruous... but there seems to be a lot of history on kube-proxy config/flags that I don't have context on; I can find a lot of issues and discussions about it, but no clear reference to this

@neolit123
Member

neolit123 commented Mar 17, 2022

is there a simple way to have configuration customized by node using the downward API?

IIUC this is a problem only if kube-proxy is run as a DaemonSet. If you are managing it in other ways, e.g. systemd, you could pass a unique config to separate instances.

The OP already has a good attempt at using the downward API. You could also try creating and mounting N ConfigMaps that have node-specific suffixes in their names. But this sort of defeats the purpose of a DaemonSet.

Apparently both kube-proxy and kube-scheduler made the decision to have config override the flags. This is especially bad given that k8s components do not have a clear instance-vs-global config story. Flags overriding config has been the interim solution for the kubelet. But if kube-proxy is DaemonSet-managed, the instance-specific "fix" using flags is a no-go.

@TeddyAndrieux
Author

Ok, fine, but then how do we manage instance-specific args? And why are we still able to use the flag for hostnameOverride?

@neolit123
Member

neolit123 commented Mar 17, 2022

IIRC some kube-proxy flags were overriding config and some weren't, but I don't know the details.
If a component config causes some key flags to be ignored, the only way to apply these instance options is to pass a separate config.

@TeddyAndrieux
Author

It's really sad 😞

@aojea
Member

aojea commented Mar 17, 2022

will bring this topic to today's sig-network meeting https://docs.google.com/document/d/1_w77-zG_Xj0zYvEMfQZTQ-wPP4kXkpGD8smVtW_qqWM/edit , let's see what the general opinion is

@TeddyAndrieux
Author

Cool 👍 thank you @aojea

@dcbw
Member

dcbw commented Mar 17, 2022

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 17, 2022
@dcbw
Member

dcbw commented Mar 17, 2022

/assign dcbw
/assign aojea

@thockin
Member

thockin commented Mar 18, 2022

The short story here is "nobody got around to doing something reasonable here". The component-standardization process fizzled out and nobody stepped up to help drive it.

I think the choices are, in preference order:

  1. Flags should override config. If we can find a way to do that without breaking anyone, I'd be happy.

  2. Alternatively, we could define the existing config/flag as being (string-encoded) either:
    a) an IP (with optional port)
    b) a CIDR (with optional port)
    c) a list of (a) or (b)

    When we choose to bind to a port, we would get all the local interface addresses, intersect that set with the addresses in this flag, and listen on all of the matches. We would need a "multi-listen" abstraction and good tests. But then you could say --healthz-bind-address=10.240.0.0/16:10256 or even --healthz-bind-address=10.240.0.0/24:10256,10.240.0.6/24:10256,192.168.9.0/26:10256.

  3. Alternatively, we could add a new flag that is specifically an override.
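
The CIDR-intersection idea in option 2 could be sketched roughly as follows. This is a hypothetical helper (listenAddrsForCIDR is not an existing kube-proxy function), assuming the local addresses have already been enumerated, e.g. via net.InterfaceAddrs():

```go
package main

import (
	"fmt"
	"net"
)

// listenAddrsForCIDR returns the subset of local addresses that fall inside
// the given CIDR, each paired with the port. Hypothetical sketch only.
func listenAddrsForCIDR(cidr string, port int, localAddrs []string) ([]string, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, a := range localAddrs {
		if ip := net.ParseIP(a); ip != nil && ipnet.Contains(ip) {
			out = append(out, fmt.Sprintf("%s:%d", a, port))
		}
	}
	return out, nil
}

func main() {
	// In a real implementation localAddrs would come from net.InterfaceAddrs().
	addrs, _ := listenAddrsForCIDR("10.240.0.0/16", 10256, []string{"10.240.0.6", "192.168.9.1"})
	fmt.Println(addrs) // [10.240.0.6:10256]
}
```

A real "multi-listen" implementation would also need to re-evaluate this set when interface addresses change.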

@aojea
Member

aojea commented Mar 18, 2022

@thockin what about moving those fields from string to *string on the config type?

@thockin
Member

thockin commented Mar 19, 2022

would that allow flags to override?

@aojea
Member

aojea commented Mar 21, 2022

would that allow flags to override?

it would allow us to differentiate when users want the field to be defaulted (see #104624 for reference), so they can "force" the configuration option to be ""

However, looking at the code, this seems to solve THIS problem but doesn't solve the general problem, and we can end up in a more inconsistent situation than we have now if we keep adding exceptions
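
To sketch what the *string change buys: with a pointer field, defaulting can distinguish "not set" from "explicitly empty". This is hypothetical code, not the actual KubeProxyConfiguration type:

```go
package main

import "fmt"

// proxyConfig is a hypothetical stand-in: nil means the user did not set the
// field, while a non-nil pointer (even to "") means it was explicitly set.
type proxyConfig struct {
	HealthzBindAddress *string
}

func resolveHealthz(c proxyConfig) string {
	if c.HealthzBindAddress == nil {
		return "0.0.0.0:10256" // field absent: apply the default
	}
	return *c.HealthzBindAddress // field present: honor it, even if ""
}

func main() {
	forced := ""
	fmt.Println(resolveHealthz(proxyConfig{}))                            // prints the default
	fmt.Println(resolveHealthz(proxyConfig{HealthzBindAddress: &forced})) // prints an empty line
}
```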

// Complete completes all the required options.
func (o *Options) Complete() error {
    if len(o.ConfigFile) == 0 && len(o.WriteConfigTo) == 0 {
        klog.InfoS("Warning, all flags other than --config, --write-config-to, and --cleanup are deprecated, please begin using a config file ASAP")
        o.config.HealthzBindAddress = addressFromDeprecatedFlags(o.config.HealthzBindAddress, o.healthzPort)
        o.config.MetricsBindAddress = addressFromDeprecatedFlags(o.config.MetricsBindAddress, o.metricsPort)
    }
    // Load the config file here in Complete, so that Validate validates the fully-resolved config.
    if len(o.ConfigFile) > 0 {
        c, err := o.loadConfigFromFile(o.ConfigFile)
        // ...

Is there any history about referencing the Downward API from configuration files? That would solve the issue without having to use flags:

        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP

@thockin
Member

thockin commented Mar 21, 2022

the downward API may require the thing to be up and running before it can be resolved, so that seems like a bad idea to me.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 20, 2022
@TeddyAndrieux
Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 20, 2022
@k8s-triage-robot

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 18, 2022
@gdemonet

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 23, 2022
@k8s-triage-robot

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 22, 2022
@TeddyAndrieux
Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 16, 2023
@VigneshSP94

@aojea is there any direction on this issue? If so, I'd like to contribute if possible.

@aojea
Member

aojea commented Aug 22, 2023

kube-proxy config vs flags is a pre-existing problem (#98302) that we actually hit recently (#119864).

We cannot change behaviors: configurations that were working until now must not start behaving differently, or we'll create a lot of confusion and break the compatibility of those configs.

The main problem with the configuration is that it is defaulted. I think overriding is out of scope at this point, but someone proposed some time ago (I can't find it) that we merge the flags and the passed config options, with the configuration taking precedence to keep backwards compatibility, and only default after that... @thockin @danwinship do you remember who proposed that? Maybe it was Tim? I think that could be a valid solution
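
The "merge, then default" idea could look roughly like this (a hypothetical sketch with an invented type and field set; config takes precedence over flags for compatibility, and defaulting is applied last, only to fields nobody set):

```go
package main

import "fmt"

// proxyConfig is a hypothetical stand-in for the kube-proxy configuration.
type proxyConfig struct {
	HealthzBindAddress string
	MetricsBindAddress string
}

// merge starts from the flag values, lets any field set in the config file
// win, and only then applies defaults to fields that remained unset.
func merge(flags, file proxyConfig) proxyConfig {
	out := flags
	if file.HealthzBindAddress != "" { // config takes precedence for compatibility
		out.HealthzBindAddress = file.HealthzBindAddress
	}
	if file.MetricsBindAddress != "" {
		out.MetricsBindAddress = file.MetricsBindAddress
	}
	// Defaulting happens last, only for fields neither flags nor config set.
	if out.HealthzBindAddress == "" {
		out.HealthzBindAddress = "0.0.0.0:10256"
	}
	if out.MetricsBindAddress == "" {
		out.MetricsBindAddress = "127.0.0.1:10249"
	}
	return out
}

func main() {
	flags := proxyConfig{HealthzBindAddress: "10.0.0.6:10256"}
	file := proxyConfig{} // config leaves healthz unset, so the flag survives
	fmt.Println(merge(flags, file).HealthzBindAddress) // 10.0.0.6:10256
}
```

Under this scheme the OP's downward-API flags would take effect whenever the shared ConfigMap leaves those fields unset, without changing the result for configs that do set them.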

@thockin
Member

thockin commented Aug 22, 2023

I think I have paged out all the context on this - it needs a fresh look.

Personally, I feel like flags should probably win, but I don't know what other components do. Better would be to error on conflicting values, but defaults make that weird.

@aojea
Member

aojea commented Aug 23, 2023

@aojea is there any direction on this issue? If so, I'd like to contribute if possible.

@VigneshSP94 the direction is to do some archaeology to understand all the historical context (I threw some links above), summarize it, and make a proposal as a KEP on how to implement kube-proxy config and flags and their behaviors in a backwards-compatible way... it is a considerable amount of work. My advice: before the KEP, create a Google doc to socialize your idea and get early feedback; you can share it on the sig-network mailing list

@danwinship
Contributor

We can not change behaviors

The config file is v1alpha1... If we created a v1alpha2 config format, we could also have cmd/kube-proxy implement different CLI-vs-config rules for the new format.

The new format could either be identical to v1alpha1 except for the CLI-overriding behaviors, or we could try to improve it at the same time... (#117909 has some ideas on that front).

(I don't think we're ready to jump to v1beta1, either with or without major changes...)

Development

No branches or pull requests

11 participants