nodeup error downloading assets from storage.googleapis.com when using official Ubuntu 20.04 AMI #10206
Comments
@CheyiLin just out of curiosity, what AWS region are you using?
@hakman I'm using

I also found that PR #9136 hard-coded the HTTP timeout:

// Create a client with a shorter timeout
httpClient := http.Client{
	Timeout: 2 * time.Minute,
}

So nodes will never get ready if the download takes longer than that timeout.
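For anyone who wants to reproduce the symptom from an affected instance, a hedged sketch of timing the download directly (the release version in the URL below is a placeholder, not taken from this issue):

# kubelet is roughly 100 MB; on a healthy instance this should finish well inside nodeup's 2-minute timeout
time wget -O /dev/null https://storage.googleapis.com/kubernetes-release/release/v1.18.12/bin/linux/amd64/kubelet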
Hmm, this is extremely slow. I will ask around whether k8s has other, faster mirrors. Expecting 100 MB in 2 minutes is very reasonable in this century. It is quite puzzling that one download finished in 4 min and the other in 4 sec.
@hakman Thank you. I just realized why switching back to the Debian Stretch AMI works as expected. The nodeup in the Debian Stretch AMI downloads only a few assets, like
UPDATE: The Debian Stretch AMI I used

So it is weird that even the Debian Stretch AMI still needs to download

[1] kubernetes-sigs/image-builder@54c1026
Can't say I can reproduce this even on a t3.small instance. Any special config on your side?
I am facing the exact same issue and I am afraid it is not related to kops. We are experiencing slow download speeds in a private subnet when connecting to an Internet gateway, and the slowness is only observed when downloading from Google-hosted servers:

This can be reproduced in our environment (subnet) on a standalone EC2 instance not managed by a kops ASG. It does not happen 100% of the time, but probably 90%. All existing, old EC2 instances in the same subnet still have fast download speeds, and this only affects newly created instances. All instances use a pinned Ubuntu 20.04 AMI, and new worker nodes were always able to download kubelet quickly until this week. We captured some pcaps and found that the new, slow instances are using too large a TCP MSS for some reason (8961 bytes), which causes fragmentation/reassembly. On the existing old, fast instances, the TCP MSS is 1430. On the slow instances, if we manually adjust the MSS (for instance,

Also, we haven't seen this issue with Debian or Ubuntu 18.04, which all come with older kernels.
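For anyone wanting to verify this on an instance, a hedged sketch of the check and the temporary MTU adjustment described above (the interface name is a placeholder; the exact commands the commenter used aren't shown here):

ip link show dev eth0                 # AWS VPC jumbo-frame MTU is 9001, which yields the ~8961-byte MSS
sudo ip link set dev eth0 mtu 1500    # temporary test: a 1500-byte MTU advertises an ~1460-byte MSS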
Interesting @yuha0. Any idea in which Ubuntu AMI this was changed?
Last I heard, this was potentially introduced somewhere after kernel versions

While this behaviour still exists, and occurs at some random frequency (i.e. a lot of focal images worked okay, then after a certain point all new instances using focal started to exhibit this behaviour), we don't have any conclusive evidence of the exact cause yet. Even later builds of the bionic image exhibit the same intermittent behaviour. But the one that has been okay so far is:
Hey,

This behaviour was introduced in kernel 4.19.86 because of https://lore.kernel.org/patchwork/patch/1157936/.

Workaround: set net.ipv4.tcp_rmem back to its previous default.

sysctl -a | grep -i net.ipv4.tcp_rmem
net.ipv4.tcp_rmem = 4096 131072 6291456

sysctl -w net.ipv4.tcp_rmem="4096 87380 6291456"
net.ipv4.tcp_rmem = 4096 87380 6291456

sysctl -a | grep -i net.ipv4.tcp_rmem
net.ipv4.tcp_rmem = 4096 87380 6291456

Technical explanation:

- Before https://lore.kernel.org/patchwork/patch/1157936/ was merged, we were relying on the tcp_fixup_rcvbuf function to increase the receive buffer while the connection is running. In that case receive-buffer auto-tuning starts after receiving one advertised window's worth of data; however, after the initial receive buffer was raised by patch a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB"), the receive buffer may take too long to start rising.
- After patch a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB") was merged, we no longer have tcp_fixup_rcvbuf, which raised the receive buffer by multiplying rcvmem by 4 when sysctl_tcp_moderate_rcvbuf is enabled. That indeed gave the client more freedom to send data, which is reflected in the download speed from the start of the connection.
- Regarding why decreasing the MTU or the receive buffer improves performance: going through the code, the dynamic receive window now relies on the socket receive space, which is the minimum of tp->rcv_wnd (controlled by the tcp_rmem parameter) and TCP_INIT_CWND * tp->advmss, i.e. 10 * MSS (controlled by the MTU). Decreasing either of these parameters (MTU or tcp_rmem) decreases the initial rcvq_space.space, so the dynamic scaling kicks in earlier, because the window is scaled whenever the receive space is not enough to accommodate the received data. The patch https://www.spinics.net/lists/netdev/msg526390.html actually tried to solve this, but it relies on a low advertised MSS, which we can't have with jumbo frames configured, as in this case.

Links:
http://lkml.iu.edu/hypermail/linux/kernel/1912.2/01978.html
https://www.spinics.net/lists/netdev/msg526390.html

Kernel 4.19.86:
Kernel 4.19.85: https://elixir.bootlin.com/linux/v4.19.85/source/net/ipv4/tcp_input.c#L451

7 - After I disable net.ipv4.tcp_moderate_rcvbuf on a 4.14 kernel, rcv_space is clearly capped, because tcp_fixup_rcvbuf won't increase rcvmem and the socket rcvbuf: https://elixir.bootlin.com/linux/v4.19.85/source/net/ipv4/tcp_input.c#L441

static void tcp_fixup_rcvbuf(struct sock *sk)
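For anyone wanting to keep that tcp_rmem workaround across reboots, a hedged sketch (the drop-in file name below is an arbitrary choice, not taken from this thread):

echo 'net.ipv4.tcp_rmem = 4096 87380 6291456' | sudo tee /etc/sysctl.d/99-tcp-rmem-workaround.conf
sudo sysctl --system   # reload all sysctl configuration, including the new drop-in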
For awareness, this is a kernel bug. I have submitted patch [1] upstream to get this fixed, and [2] should be in the stable queue at the moment. You can also apply the workaround above until we have the fix merged.

Links:
[1] https://lore.kernel.org/netdev/20201204180622.14285-1-abuehaze@amazon.com/
Very nice. Thanks a lot for the fix and for the explanations @abuehaze. Probably this will be picked up by Ubuntu pretty soon.
By default kops overrides a set of sysctls from the OS defaults: kops/nodeup/pkg/model/sysctls.go, lines 67 to 82 at d5d08a4 (and there are more in that file!). In particular:

Do I understand correctly that we should reduce the size of the default buffer, because it's preventing TCP autotuning?

Also, if any of the TCP experts on this thread have any input on those values as defaults, that would be greatly appreciated. They haven't been touched since about 2017 as far as I can see, and 4 years is a long time in Kubernetes! The values were originally based on nginx performance recommendations (provided here).
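For reference, the buffer-related overrides being discussed look roughly like the following (a sketch only: these values mirror the workaround posted later in this thread, not a verbatim copy of sysctls.go at d5d08a4):

net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 87380 16777216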
Although of course I've realized that these TCP settings are applied by nodeup, so they don't apply to the download of nodeup itself. I don't think we have a dependency configured so that we download only after applying the sysctls, so I would guess they won't apply to the download of kubelet etc. A potential workaround could therefore be to explicitly set

I would still appreciate any guidance on whether the values we're setting make sense, but this feels like a fairly easy and safe fix!
This ensures that we're using our settings for downloading nodeup itself and any assets that nodeup downloads. This is a workaround for reported problems with the initial download on some kernels otherwise. Issue kubernetes#10206
As a workaround, I set sysctl parameters by using additionalUserData:

- name: custom-sysctl.sh
  type: text/x-shellscript
  content: |
    #!/bin/bash
    set -xe
    cat <<EOF > /etc/sysctl.d/999-k8s-custom.conf
    net.core.rmem_max = 16777216
    net.core.wmem_max = 16777216
    net.ipv4.tcp_rmem = 4096 87380 16777216
    net.ipv4.tcp_wmem = 4096 87380 16777216
    EOF
    sysctl -p /etc/sysctl.d/999-k8s-custom.conf
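A possible way to roll this out, assuming the snippet above is placed under the instance group's spec.additionalUserData (cluster and instance group names below are placeholders, not taken from this thread):

kops edit ig nodes --name my-cluster-name                  # add the additionalUserData block to the IG spec
kops update cluster --name my-cluster-name --yes           # push the updated launch configuration / user data
kops rolling-update cluster --name my-cluster-name --yes   # replace instances so new nodes boot with the sysctls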
I'm still having this issue intermittently even with the newest Ubuntu AMI and kops 1.19 that includes this fix #10654.
Issues go stale after 90d of inactivity. If this issue is safe to close now, please do so with /close. Send feedback to sig-contributor-experience at kubernetes/community.
We detected a similar error when launching two nodes at the same time. The other one logs:
Is this error related?
@fvasco yes, those are the same symptoms :(
/remove-lifecycle stale
Believe this one to be solved now. /close |
@olemarkus: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
1. What kops version are you running?
Version 1.18.2 (git-84495481e4)
2. What Kubernetes version are you running?
3. What cloud provider are you using?
AWS
4. What commands did you run? What is the simplest way to reproduce this issue?
kops rolling-update cluster --name my-cluster-name --yes
5. What happened after the commands executed?
The new node hasn't joined the cluster after more than 20 minutes.
kops logs
kops-configuration logs
6. What did you expect to happen?
The new node joins the cluster within a few minutes.
7. Please provide your cluster manifest
kops cluster & ig manifests
8. Please run the commands with most verbose logging by adding the -v 10 flag. Paste the logs into this report, or in a gist and provide the gist link here.
I think it might be a network performance issue in Ubuntu 20.04 on AWS, or a nodeup issue, not related to the kops command-line tool.
9. Anything else do we need to know?
I don't have this issue when using the official Debian Stretch AMI kope.io/k8s-1.17-debian-stretch-amd64-hvm-ebs-2020-07-20.
UPDATE
I just SSHed to that node and used wget on the file that nodeup was trying to download, and it took almost 5 minutes.
Other nodes (not rolling-updated yet, still using the Debian Stretch AMI) within the same VPC don't have this issue: