
kube-proxy needs to be configured to override hostname in some environments #857

Closed
detiber opened this issue May 24, 2018 · 19 comments · Fixed by kubernetes/kubernetes#69340 or kubernetes/kubernetes#71283
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/bug Categorizes issue or PR as related to a bug. kind/documentation Categorizes issue or PR as related to documentation. lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. sig/network Categorizes an issue or PR as relevant to SIG Network.
Milestone

Comments

@detiber
Member

detiber commented May 24, 2018

Currently, in environments where a user must configure --hostname-override for the kubelet (such as AWS), kube-proxy is being deployed in a degraded state. Specifically, Services of type NodePort and LoadBalancer with externalTrafficPolicy: Local are affected.

Since we are deploying kube-proxy as a DaemonSet, the only options available are to override the command arguments using the downward API, or to use an init container to mutate the config. This is further complicated because the kube-proxy command-line options are marked as deprecated in favor of the component config.
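For reference, the downward-API option means exposing the scheduled Node's name to the container as an environment variable; a minimal sketch (the env var name NODE_NAME is illustrative):

```yaml
# Pod spec fragment: expose the Node name via the downward API.
env:
- name: NODE_NAME        # illustrative name
  valueFrom:
    fieldRef:
      fieldPath: spec.nodeName
```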

@detiber
Member Author

detiber commented May 24, 2018

related kube-proxy issue: kubernetes/kubernetes#57518

@timothysc timothysc added kind/bug Categorizes issue or PR as related to a bug. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels May 25, 2018
@timothysc timothysc added this to the v1.11 milestone May 25, 2018
@timothysc
Member

@luxas This is a nasty config issue; we need to chat with the @kubernetes/sig-network-bugs folks on this b/c the UX and workarounds are really ugly.

@k8s-ci-robot k8s-ci-robot added the sig/network Categorizes an issue or PR as relevant to SIG Network. label May 25, 2018
@timothysc timothysc modified the milestones: v1.11, v1.12 Jun 12, 2018
@timothysc
Member

dlipovetsky pushed a commit to platform9/nodeadm that referenced this issue Jul 27, 2018
@dlipovetsky

I tried to apply the workaround, but ran into two issues.

First, the patch appears to have a typo. After patching, I found that the NODE_NAME variable was not expanded in the command line:

# pgrep -laf kube-proxy
8463 /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=${NODE_NAME}

I fixed this by changing the curly braces ${NODE_NAME} to parentheses $(NODE_NAME), as per the docs:

Note: The environment variable appears in parentheses, "$(VAR)". This is required for the variable to be expanded in the command or args field.

-- https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#use-environment-variables-to-define-arguments
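In other words, in a container's command or args, only the $(VAR) form is substituted by the kubelet; a sketch of the difference:

```yaml
# args fragment, assuming NODE_NAME is defined under env:
args:
- --config=/var/lib/kube-proxy/config.conf
- --hostname-override=$(NODE_NAME)    # expanded by the kubelet before the process starts
# - --hostname-override=${NODE_NAME}  # shell syntax: passed through literally, never expanded
```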

Second, kube-proxy appears not to use the value of the --hostname-override flag when it gets its Node from the API server, instead using the hostname reported by the OS. I edited the kube-proxy DaemonSet to increase the log verbosity, and found that kube-proxy is reading the --hostname-override flag:

I0728 02:09:00.385519       1 flags.go:27] FLAG: --alsologtostderr="false"
I0728 02:09:00.385634       1 flags.go:27] FLAG: --bind-address="0.0.0.0"
I0728 02:09:00.385644       1 flags.go:27] FLAG: --cleanup="false"
I0728 02:09:00.385658       1 flags.go:27] FLAG: --cleanup-iptables="false"
I0728 02:09:00.385665       1 flags.go:27] FLAG: --cleanup-ipvs="true"
I0728 02:09:00.385670       1 flags.go:27] FLAG: --cluster-cidr=""
I0728 02:09:00.385678       1 flags.go:27] FLAG: --config="/var/lib/kube-proxy/config.conf"
I0728 02:09:00.385685       1 flags.go:27] FLAG: --config-sync-period="15m0s"
I0728 02:09:00.385698       1 flags.go:27] FLAG: --conntrack-max="0"
I0728 02:09:00.385705       1 flags.go:27] FLAG: --conntrack-max-per-core="32768"
I0728 02:09:00.385712       1 flags.go:27] FLAG: --conntrack-min="131072"
I0728 02:09:00.385718       1 flags.go:27] FLAG: --conntrack-tcp-timeout-close-wait="1h0m0s"
I0728 02:09:00.385724       1 flags.go:27] FLAG: --conntrack-tcp-timeout-established="24h0m0s"
I0728 02:09:00.385734       1 flags.go:27] FLAG: --feature-gates=""
I0728 02:09:00.385744       1 flags.go:27] FLAG: --healthz-bind-address="0.0.0.0:10256"
I0728 02:09:00.385750       1 flags.go:27] FLAG: --healthz-port="10256"
I0728 02:09:00.385756       1 flags.go:27] FLAG: --help="false"
I0728 02:09:00.385762       1 flags.go:27] FLAG: --hostname-override="192.0.2.24"
I0728 02:09:00.385779       1 flags.go:27] FLAG: --iptables-masquerade-bit="14"
I0728 02:09:00.385787       1 flags.go:27] FLAG: --iptables-min-sync-period="0s"
I0728 02:09:00.385794       1 flags.go:27] FLAG: --iptables-sync-period="30s"
I0728 02:09:00.385802       1 flags.go:27] FLAG: --ipvs-exclude-cidrs="[]"
I0728 02:09:00.385821       1 flags.go:27] FLAG: --ipvs-min-sync-period="0s"
I0728 02:09:00.385832       1 flags.go:27] FLAG: --ipvs-scheduler=""
I0728 02:09:00.385838       1 flags.go:27] FLAG: --ipvs-sync-period="30s"
I0728 02:09:00.385844       1 flags.go:27] FLAG: --kube-api-burst="10"
I0728 02:09:00.385850       1 flags.go:27] FLAG: --kube-api-content-type="application/vnd.kubernetes.protobuf"
I0728 02:09:00.385857       1 flags.go:27] FLAG: --kube-api-qps="5"
I0728 02:09:00.385870       1 flags.go:27] FLAG: --kubeconfig=""
I0728 02:09:00.385876       1 flags.go:27] FLAG: --log-backtrace-at=":0"
I0728 02:09:00.385885       1 flags.go:27] FLAG: --log-dir=""
I0728 02:09:00.385891       1 flags.go:27] FLAG: --log-flush-frequency="5s"
I0728 02:09:00.385897       1 flags.go:27] FLAG: --logtostderr="true"
I0728 02:09:00.385907       1 flags.go:27] FLAG: --masquerade-all="false"
I0728 02:09:00.385913       1 flags.go:27] FLAG: --master=""
I0728 02:09:00.385918       1 flags.go:27] FLAG: --metrics-bind-address="127.0.0.1:10249"
I0728 02:09:00.385924       1 flags.go:27] FLAG: --nodeport-addresses="[]"
I0728 02:09:00.385931       1 flags.go:27] FLAG: --oom-score-adj="-999"
I0728 02:09:00.385941       1 flags.go:27] FLAG: --profiling="false"
I0728 02:09:00.385947       1 flags.go:27] FLAG: --proxy-mode=""
I0728 02:09:00.385955       1 flags.go:27] FLAG: --proxy-port-range=""
I0728 02:09:00.385962       1 flags.go:27] FLAG: --resource-container="/kube-proxy"
I0728 02:09:00.385968       1 flags.go:27] FLAG: --stderrthreshold="2"
I0728 02:09:00.385978       1 flags.go:27] FLAG: --udp-timeout="250ms"
I0728 02:09:00.385984       1 flags.go:27] FLAG: --v="4"
I0728 02:09:00.385990       1 flags.go:27] FLAG: --version="false"
I0728 02:09:00.385998       1 flags.go:27] FLAG: --vmodule=""
I0728 02:09:00.386005       1 flags.go:27] FLAG: --write-config-to=""
I0728 02:09:00.389945       1 feature_gate.go:230] feature gates: &{map[]}
I0728 02:09:00.412872       1 iptables.go:603] couldn't get iptables-restore version; assuming it doesn't support --wait
I0728 02:09:00.413128       1 iptables.go:200] Could not connect to D-Bus system bus: dial unix /var/run/dbus/system_bus_socket: connect: no such file or directory
W0728 02:09:00.477508       1 server_others.go:287] Flag proxy-mode="" unknown, assuming iptables proxy
I0728 02:09:00.479201       1 server_others.go:140] Using iptables Proxier.
W0728 02:09:00.496071       1 server.go:605] Failed to retrieve node info: nodes "daniel-ubuntu16" not found
W0728 02:09:00.496570       1 proxier.go:306] invalid nodeIP, initializing kube-proxy with 127.0.0.1 as nodeIP

Note that kube-proxy tries to get the node with the name daniel-ubuntu16, which happens to be the hostname of the node where the kube-proxy Pod is scheduled, but the node is registered as 192.0.2.24. The string daniel-ubuntu16 does not appear in the config file (/var/lib/kube-proxy/config.conf), so kube-proxy must be getting this value from the OS.

I'll file a PR for the first issue, but I'm still tracking down the root cause for the second.

@dlipovetsky

dlipovetsky commented Jul 28, 2018

I was able to work around the second issue by changing the hostnameOverride field in the kube-proxy ConfigMap to 192.0.2.24, but of course that's not a correct configuration, because the ConfigMap is consumed by kube-proxy Pods scheduled on all nodes.

I believe the root of the second issue is that the kube-proxy config file's hostnameOverride value overrides the flag value. I edited the kube-proxy DaemonSet and removed the flag --config=/var/lib/kube-proxy/config.conf. After this change, kube-proxy respected the --hostname-override flag.

The config file defines hostnameOverride: "". However, removing the field from the config file is not a workaround: the field is a string, so its zero value is the empty string, and an empty value likewise causes kube-proxy to ask the OS for the hostname.
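For reference, the relevant excerpt of the mounted config file (a sketch; surrounding fields are omitted, and the apiVersion shown is the v1alpha1 component config of that era):

```yaml
# /var/lib/kube-proxy/config.conf (excerpt)
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
hostnameOverride: ""   # "" (or an absent field) falls back to the OS hostname
```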

Looking at the kube-proxy code, I find that the flags update a config struct, but the configuration file is later unmarshalled into a new config struct, and the pointer to the original struct is overwritten, effectively discarding all the flag values. (Maybe this is by design, since the flags are deprecated.)

@dlipovetsky

dlipovetsky commented Jul 28, 2018

I was able to set hostnameOverride by modifying the config in an init container. I updated the patch in the docs. Please see the above PR.
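A sketch of that kind of init-container approach (the image, volume names, mount paths, and sed command here are illustrative, not the exact patch from the docs): the init container rewrites hostnameOverride into a writable copy of the config that the main container then reads.

```yaml
initContainers:
- name: set-node-name
  image: busybox            # illustrative image
  command:
  - sh
  - -c
  - 'sed "s/hostnameOverride: .*/hostnameOverride: $NODE_NAME/" /var/lib/kube-proxy/config.conf > /var/lib/kube-proxy-patched/config.conf'
  env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        fieldPath: spec.nodeName
  volumeMounts:
  - name: kube-proxy              # the ConfigMap-backed volume (read-only)
    mountPath: /var/lib/kube-proxy
  - name: patched-config          # illustrative emptyDir shared with the main container
    mountPath: /var/lib/kube-proxy-patched
```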

dlipovetsky pushed a commit to platform9/nodeadm that referenced this issue Jul 28, 2018
dlipovetsky pushed a commit to platform9/nodeadm that referenced this issue Jul 29, 2018
dlipovetsky pushed a commit to platform9/nodeadm that referenced this issue Jul 31, 2018
dlipovetsky pushed a commit to platform9/nodeadm that referenced this issue Aug 2, 2018
@timothysc timothysc assigned liztio and unassigned detiber Aug 21, 2018
@timothysc timothysc modified the milestones: v1.12, v1.13 Aug 21, 2018
@timothysc timothysc added lifecycle/active Indicates that an issue or PR is actively being worked on by a contributor. and removed priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. labels Aug 25, 2018
@timothysc timothysc modified the milestones: v1.12, v1.13 Sep 18, 2018
@timothysc
Member

Sadly we are way too late in the cycle to make this change.

@fabriziopandini
Member

@timothysc
How can this new flag be set via kubeadm?

@timothysc timothysc reopened this Oct 25, 2018
@timothysc
Member

I'm re-opening b/c we will need to update docs and other details on our side.

@Klaven

Klaven commented Oct 31, 2018

I will pick this up.

@neolit123
Member

@Klaven thanks
/kind documentation
^ adding this kind too.

@k8s-ci-robot k8s-ci-robot added the kind/documentation Categorizes issue or PR as related to documentation. label Oct 31, 2018
@anguslees
Member

anguslees commented Nov 5, 2018

I think kubernetes/kubernetes#69340 makes the --hostname-override flag "undeprecated" again. This makes this bug super-easy to fix from kubeadm's pov:

we should always set kube-proxy's --hostname-override to spec.nodeName (via downward API)

i.e. we should drop "in some environments" from this bug's description. On older versions (before 69340 gets rolled out), the override flag is ignored in the presence of --config, but whenever it is not ignored, this is always the correct value to set.
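The proposal sketched as a DaemonSet container fragment (a sketch, not the exact manifest kubeadm generates):

```yaml
containers:
- name: kube-proxy
  command:
  - /usr/local/bin/kube-proxy
  - --config=/var/lib/kube-proxy/config.conf
  - --hostname-override=$(NODE_NAME)   # $(VAR) form so the kubelet expands it
  env:
  - name: NODE_NAME
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: spec.nodeName
```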

@seh

seh commented Nov 11, 2018

It would also be nice if it were possible to set the node name during kubeadm join with less hassle. There's the --node-name flag, but kubeadm join ignores it silently if also supplied with a --config flag. (@SataQiu further documented this behavior in the code.)

At present, in order to set the node name, I have to patch the configuration file supplied to kubeadm join on each host, which is fragile with YAML. Fortunately, kubeadm init and kubeadm join will happily read JSON as well, so patching a JSON configuration file with jq is a little easier than the sed script I'm using now against the YAML. Still, given that we have the --node-name flag in place, I was wondering whether this patching is the desired user experience, or whether ignoring the flag value is an accident. Ignoring that flag silently is cruel.

Writing today, I noticed the following comment written by @luxas, so there is hope:

Nb. --config overrides command line flags, TODO: fix this

Is there an open issue for that problem?

@neolit123
Member

@seh that's a valid observation for a UX problem.

we should have node-name overriding the config, but i don't think we can do that for 1.13.
there is an issue in the k/kubeadm repo (don't remember the name) about the "meta-problem" of flag overrides, but we don't have good universal solutions yet.

@seh

seh commented Nov 19, 2018

Understood. Thank you for acknowledging the pain. Dealing with the current situation is possible with shell script facilities, but it would have been much easier to resolve had it been more obvious how the flag and the configuration file's presence interact. Documenting the current behavior—even if undesirable in the long term—would save a lot of confusion and frustration.

@timothysc timothysc removed their assignment Nov 20, 2018
@timothysc timothysc added the help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. label Nov 20, 2018
@bart0sh

bart0sh commented Nov 20, 2018

we should have node-name overriding the config

@seh, @neolit123, @timothysc here is a possible fix for this: kubernetes/kubernetes#71270

Please, review.

@seh

seh commented Nov 21, 2018

It looks like kubeadm init still ignores the --node-name flag when using a configuration file. So close...

@rnsv

rnsv commented Oct 24, 2019

I was able to work around this issue as follows:

- name: kube-proxy
  command:
  - sh
  - -c
  - |
    hostname $HOSTNAME
    /usr/local/bin/kube-proxy --config=/etc/kubernetes/kube-proxy.conf
  env:
  - name: HOSTNAME
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: spec.nodeName

@thibault-ketterer

For me it took the following to work; somehow kube-proxy was ignoring $HOSTNAME if not forced to use it:

- command:
  - /bin/sh
  - -c
  - /usr/local/bin/kube-proxy --config=/var/lib/kube-proxy/config.conf --hostname-override=$HOSTNAME
  env:
  - name: HOSTNAME
    valueFrom:
      fieldRef:
        apiVersion: v1
        fieldPath: spec.nodeName
