
kube-proxy: We should be able to override bind addresses for healthz and metrics #108737

Open
TeddyAndrieux opened this issue Mar 16, 2022 · 32 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. sig/network Categorizes an issue or PR as relevant to SIG Network. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@TeddyAndrieux

What happened?

When kube-proxy is run with a configuration file, we cannot override the bind addresses for healthz and metrics via CLI flags.

What did you expect to happen?

I expect the kube-proxy CLI argument to override the value from the configuration file.

How can we reproduce it (as minimally and precisely as possible)?

A typical use case: I want to expose the kube-proxy metrics only on the host IP, so I want to use the following pod template:

      containers:
      - command:
        - /usr/local/bin/kube-proxy
        - --config=/var/lib/kube-proxy/config.conf
        - --hostname-override=$(NODE_NAME)
        - --healthz-bind-address=$(HOST_IP):10256
        - --metrics-bind-address=$(HOST_IP):10249
        env:
        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP

But today this does not work: kube-proxy just ignores the --healthz-bind-address and --metrics-bind-address arguments.

Anything else we need to know?

No response

Kubernetes version

$ kubectl version
Client Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-02-16T12:38:05Z", GoVersion:"go1.17.7", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.4", GitCommit:"e6c093d87ea4cbb530a7b2ae91e54c0842d8308a", GitTreeState:"clean", BuildDate:"2022-02-16T12:32:02Z", GoVersion:"go1.17.7", Compiler:"gc", Platform:"linux/amd64"}

Cloud provider

OS version

# On Linux:
$ cat /etc/os-release
# paste output here
$ uname -a
# paste output here

# On Windows:
C:\> wmic os get Caption, Version, BuildNumber, OSArchitecture
# paste output here

Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@TeddyAndrieux TeddyAndrieux added the kind/bug Categorizes issue or PR as related to a bug. label Mar 16, 2022
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 16, 2022
@TeddyAndrieux
Author

/sig network

@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 16, 2022
@aojea
Member

aojea commented Mar 16, 2022

this has been discussed so many times that I honestly lost track of the current status; at one point all the flags were going to be deprecated 😄 IIRC

kubernetes/enhancements#2158 (comment)

the current situation is that the kube-proxy config takes precedence over flags

@TeddyAndrieux
Author

TeddyAndrieux commented Mar 16, 2022

Ok, but here it's not only about "precedence", since even if the bind addresses are not defined in the config, the flags are still ignored.
And using only the config, we cannot easily bind to the node IP, since the ConfigMap is shared across the whole cluster.
To me, we need at least a special case for those bind addresses, even if it's not the case for other flags.

@aojea
Member

aojea commented Mar 16, 2022

Ok, but here it's not only about "precedence", since even if the bind addresses are not defined in the config, the flags are still ignored.

if the addresses are not defined, they are defaulted... so they are always "defined": if not by the user, then by kube-proxy itself

if len(obj.BindAddress) == 0 {
    obj.BindAddress = "0.0.0.0"
}
if obj.HealthzBindAddress == "" {
    obj.HealthzBindAddress = fmt.Sprintf("0.0.0.0:%v", ports.ProxyHealthzPort)
} else if !strings.Contains(obj.HealthzBindAddress, ":") {
    obj.HealthzBindAddress += fmt.Sprintf(":%v", ports.ProxyHealthzPort)
}
if obj.MetricsBindAddress == "" {
    obj.MetricsBindAddress = fmt.Sprintf("127.0.0.1:%v", ports.ProxyStatusPort)
} else if !strings.Contains(obj.MetricsBindAddress, ":") {
    obj.MetricsBindAddress += fmt.Sprintf(":%v", ports.ProxyStatusPort)
}
And using only the config we cannot easily bind to the node IP since the ConfigMap is shared across the whole cluster

yeah, I remember someone reporting this problem too, but I think he found a way to solve it; let me try to find the issue... I agree that this flag/config situation is problematic, but if you read the discussion I linked, there are trade-offs in all the solutions

@TeddyAndrieux
Author

if the addresses are not defined, they are defaulted... so they are always "defined": if not by the user, then by kube-proxy itself

Yes, sure, but to me those defaults should be applied after the flags.

yeah, I remember someone reporting this problem too, but I think he found a way to solve it; let me try to find the issue... I agree that this flag/config situation is problematic, but if you read the discussion I linked, there are trade-offs in all the solutions

As for a way to solve it: something is doable with an init container that basically writes the kube-proxy config, but it's sad to need that kind of approach just to set the bind addresses.

I agree that it does not necessarily make sense to provide flags for all config options, but it does make sense to provide flags for instance-specific settings, like the bind addresses.
It's like hostnameOverride: it could be set in the config, but we need it as a flag as well, and that's what we have today and I think we will keep it.
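
For reference, the init-container workaround could look roughly like this (a hypothetical sketch: the container name, template file, placeholder, and mount paths are all assumptions), rendering a per-node config from the downward-API host IP before kube-proxy starts:

```yaml
      initContainers:
      - name: render-kube-proxy-config   # hypothetical name
        image: busybox
        command:
        - sh
        - -c
        # Substitute the node's host IP into a config template (paths assumed)
        - sed "s/__HOST_IP__/${HOST_IP}/" /var/lib/kube-proxy/config.conf.tmpl > /rendered/config.conf
        env:
        - name: HOST_IP
          valueFrom:
            fieldRef:
              fieldPath: status.hostIP
        volumeMounts:
        - name: rendered-config   # an emptyDir shared with the kube-proxy container
          mountPath: /rendered
```

kube-proxy would then be started with --config=/rendered/config.conf pointing at the same emptyDir volume.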

@aojea
Member

aojea commented Mar 17, 2022

As for a way to solve it: something is doable with an init container that basically writes the kube-proxy config, but it's sad to need that kind of approach just to set the bind addresses

Or setting the bind address to 0.0.0.0 in the configuration

..., but it makes sense to provide some flags for instance-specific stuff, like bind addresses.
It's like hostnameOverride: it could be set in the config, but we need it as a flag as well, and that's what we have today and I think we will keep it

The problem I see here is that flags allow dynamic values automatically, while configuration does not; that leaves us with only one choice, as you say.

@neolit123 is there a simple way to have configuration customized by node using the downward API?

@TeddyAndrieux
Author

Or settings the bind address to 0.0.0.0 in the configuration

That's not a solution for me; I do not want to expose metrics on all IPs.

TBH I really don't get why we would keep hostnameOverride as a flag but not the bind addresses

@aojea
Member

aojea commented Mar 17, 2022

TBH I really don't get why we would keep hostnameOverride as a flag and not the bind addresses

yeah, I was trying to provide a workaround while we sort out the other problem :)
I agree that it seems incongruous... but there seems to be a lot of history on kube-proxy config/flags that I don't have context on; I can find a lot of issues and discussions about it, but no clear reference to this

@neolit123
Member

neolit123 commented Mar 17, 2022

is there a simple way to have configuration customized by node using the downward API?

IIUC this is a problem only if kube-proxy is run as a DaemonSet. If you are managing it in other ways, e.g. systemd, you could pass a unique config to separate instances.

The OP already has a good attempt at using the downward API. You could also try creating and mounting N ConfigMaps that have node-specific suffixes in their names. But this sort of defeats the purpose of a DaemonSet.

Apparently both kube-proxy and kube-scheduler made the decision to have config override the flags. This is especially bad given that k8s components do not have a clear instance-vs-global config story. Flags overriding config has been the interim solution for the kubelet. But if kube-proxy is DaemonSet-managed, the instance-specific "fix" using flags is a no-go.

@TeddyAndrieux
Author

Ok, fine, but then how do we manage instance-specific args? And why are we still able to use the flag for hostnameOverride?

@neolit123
Member

neolit123 commented Mar 17, 2022

IIRC some kube-proxy flags were overriding config and some weren't, but I don't know the details.
If a component config causes some key flags to be ignored, the only way to apply these instance options is to pass a separate config.

@TeddyAndrieux
Author

It's really sad 😞

@aojea
Member

aojea commented Mar 17, 2022

will bring this topic to today's sig-network meeting https://docs.google.com/document/d/1_w77-zG_Xj0zYvEMfQZTQ-wPP4kXkpGD8smVtW_qqWM/edit , let's see what the general opinion is

@TeddyAndrieux
Author

Cool 👍 thank you @aojea

@dcbw
Member

dcbw commented Mar 17, 2022

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 17, 2022
@dcbw
Member

dcbw commented Mar 17, 2022

/assign dcbw
/assign aojea

@thockin
Member

thockin commented Mar 18, 2022

The short story here is "nobody got around to doing something reasonable here". The component-standardization process fizzled out and nobody stepped up to help drive it.

I think the choices are, in preference order:

  1. Flags should override config. If we can find a way to do that without breaking anyone, I'd be happy.

  2. Alternatively, we could define the existing config/flag as being (string-encoded) either:
    a) an IP (with optional port)
    b) a CIDR (with optional port)
    c) a list of (a) or (b)

    When we choose to bind to a port, we would get all the local interface addresses, intersect that set with the addresses in this flag, and listen on all of the matches. We would need a "multi-listen" abstraction and good tests. But then you could say --healthz-bind-address=10.240.0.0/16:10256 or even --healthz-bind-address=10.240.0.0/24:10256,10.240.0.6/24:10256,192.168.9.0/26:10256.

  3. Alternatively, we could add a new flag that is specifically an override.
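
The CIDR-intersection idea in option 2 could be sketched roughly as follows. This is a hypothetical helper (listenAddrsForCIDR is not an existing kube-proxy function), assuming the local addresses have already been enumerated, e.g. via net.InterfaceAddrs():

```go
package main

import (
	"fmt"
	"net"
)

// listenAddrsForCIDR returns the subset of local addresses that fall inside
// the given CIDR, each paired with the port. Hypothetical sketch only.
func listenAddrsForCIDR(cidr string, port int, localAddrs []string) ([]string, error) {
	_, ipnet, err := net.ParseCIDR(cidr)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, a := range localAddrs {
		if ip := net.ParseIP(a); ip != nil && ipnet.Contains(ip) {
			out = append(out, fmt.Sprintf("%s:%d", a, port))
		}
	}
	return out, nil
}

func main() {
	// In a real implementation localAddrs would come from net.InterfaceAddrs().
	addrs, _ := listenAddrsForCIDR("10.240.0.0/16", 10256, []string{"10.240.0.6", "192.168.9.1"})
	fmt.Println(addrs) // [10.240.0.6:10256]
}
```

A real "multi-listen" implementation would also need to re-evaluate this set when interface addresses change.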

@aojea
Member

aojea commented Mar 18, 2022

@thockin what about moving those fields from string to *string on the config type?

@thockin
Member

thockin commented Mar 19, 2022

would that allow flags to override?

@aojea
Member

aojea commented Mar 21, 2022

would that allow flags to override?

it would allow us to differentiate when users want the field to be defaulted (see #104624 for reference), so they can "force" the configuration option to be ""

However, looking at the code, this seems to solve THIS problem but doesn't solve the general problem, and we can end up in a more inconsistent situation than we have now if we keep adding exceptions
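
To sketch what the *string change buys: with a pointer field, defaulting can distinguish "not set" from "explicitly empty". This is hypothetical code, not the actual KubeProxyConfiguration type:

```go
package main

import "fmt"

// proxyConfig is a hypothetical stand-in: nil means the user did not set the
// field, while a non-nil pointer (even to "") means it was explicitly set.
type proxyConfig struct {
	HealthzBindAddress *string
}

func resolveHealthz(c proxyConfig) string {
	if c.HealthzBindAddress == nil {
		return "0.0.0.0:10256" // field absent: apply the default
	}
	return *c.HealthzBindAddress // field present: honor it, even if ""
}

func main() {
	forced := ""
	fmt.Println(resolveHealthz(proxyConfig{}))                            // prints the default
	fmt.Println(resolveHealthz(proxyConfig{HealthzBindAddress: &forced})) // prints an empty line
}
```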

// Complete completes all the required options.
func (o *Options) Complete() error {
    if len(o.ConfigFile) == 0 && len(o.WriteConfigTo) == 0 {
        klog.InfoS("Warning, all flags other than --config, --write-config-to, and --cleanup are deprecated, please begin using a config file ASAP")
        o.config.HealthzBindAddress = addressFromDeprecatedFlags(o.config.HealthzBindAddress, o.healthzPort)
        o.config.MetricsBindAddress = addressFromDeprecatedFlags(o.config.MetricsBindAddress, o.metricsPort)
    }
    // Load the config file here in Complete, so that Validate validates the fully-resolved config.
    if len(o.ConfigFile) > 0 {
        c, err := o.loadConfigFromFile(o.ConfigFile)
        // ...

Is there any history about referencing the Downward API from configuration files? That would solve the issue without having to use flags:

        - name: NODE_NAME
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: spec.nodeName
        - name: HOST_IP
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: status.hostIP

@thockin
Member

thockin commented Mar 21, 2022

the downward API may require the thing to be up and running before it can be resolved, so that seems like a bad idea to me.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 20, 2022
@TeddyAndrieux
Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 20, 2022
@k8s-triage-robot

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 18, 2022
@gdemonet

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 23, 2022
@k8s-triage-robot

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 22, 2022
@TeddyAndrieux
Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 16, 2023
@VigneshSP94

@aojea is there any direction on this issue? If so, I'd like to contribute if possible.

@aojea
Member

aojea commented Aug 22, 2023

kube-proxy config vs flags is a pre-existing problem (#98302) that we actually hit recently (#119864).

We cannot change behaviors: configurations that were working until now must not start behaving differently, or we'll create a lot of confusion and break the compatibility of those configs.

The main problem with the configuration is that it is defaulted. I think overriding is out of scope at this point, but someone proposed some time ago (I can't find it) that we merge the flags and the passed config options, with the configuration taking precedence to keep backwards compatibility, and only default after that... @thockin @danwinship do you remember who proposed that? Maybe it was Tim? I think that could be a valid solution
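
The "merge, then default" idea could look roughly like this (a hypothetical sketch with an invented type and field set; config takes precedence over flags for compatibility, and defaulting is applied last, only to fields nobody set):

```go
package main

import "fmt"

// proxyConfig is a hypothetical stand-in for the kube-proxy configuration.
type proxyConfig struct {
	HealthzBindAddress string
	MetricsBindAddress string
}

// merge starts from the flag values, lets any field set in the config file
// win, and only then applies defaults to fields that remained unset.
func merge(flags, file proxyConfig) proxyConfig {
	out := flags
	if file.HealthzBindAddress != "" { // config takes precedence for compatibility
		out.HealthzBindAddress = file.HealthzBindAddress
	}
	if file.MetricsBindAddress != "" {
		out.MetricsBindAddress = file.MetricsBindAddress
	}
	// Defaulting happens last, only for fields neither flags nor config set.
	if out.HealthzBindAddress == "" {
		out.HealthzBindAddress = "0.0.0.0:10256"
	}
	if out.MetricsBindAddress == "" {
		out.MetricsBindAddress = "127.0.0.1:10249"
	}
	return out
}

func main() {
	flags := proxyConfig{HealthzBindAddress: "10.0.0.6:10256"}
	file := proxyConfig{} // config leaves healthz unset, so the flag survives
	fmt.Println(merge(flags, file).HealthzBindAddress) // 10.0.0.6:10256
}
```

Under this scheme the OP's downward-API flags would take effect whenever the shared ConfigMap leaves those fields unset, without changing the result for configs that do set them.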

@thockin
Member

thockin commented Aug 22, 2023

I think I have paged out all the context on this - it needs a fresh look.

Personally, I feel like flags should probably win, but I don't know what other components do. Better would be to error on conflicting values, but defaults make that weird.

@aojea
Member

aojea commented Aug 23, 2023

@aojea is there any direction on this issue? If so, I'd like to contribute if possible.

@VigneshSP94 the direction is to do some archaeology to understand all the historical context (I threw some links above), summarize it, and make a proposal as a KEP on how to implement kube-proxy config and flags and their behaviors in a backwards-compatible way... it is a considerable amount of work. My advice: before the KEP, create a Google doc to socialize your idea and get early feedback; you can share it on the sig-network mailing list

@danwinship
Contributor

We can not change behaviors

The config file is v1alpha1... If we created a v1alpha2 config format, we could also have cmd/kube-proxy implement different CLI-vs-config rules for the new format.

The new format could either be identical to v1alpha1 except for the CLI-overriding behaviors, or we could try to improve it at the same time... (#117909 has some ideas on that front).

(I don't think we're ready to jump to v1beta1, either with or without major changes...)

Development

No branches or pull requests

11 participants