This repository was archived by the owner on Oct 16, 2020. It is now read-only.

Paced TCP downloads in 1745.6.0 break down #2457

@fwiesel

Issue Report

It looks like any application on a CoreOS 1745.6.0 node that does not process incoming data as fast as the source can provide it suffers a collapse in transfer speed, down to far less than the rate at which the application actually consumes the data.

Docker image downloads were affected first, but the problem is easy to reproduce with curl.

After reverting to the prior version, 1745.5.0, the same host no longer exhibits the behaviour.
Going back to 1745.6.0 brings it back.
It also happens on other machines of the same type.

Bug

Container Linux Version

$ cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1745.6.0
VERSION_ID=1745.6.0
BUILD_ID=2018-06-08-0926
PRETTY_NAME="Container Linux by CoreOS 1745.6.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"

Environment

Baremetal
Intel E7- 4870
2x Cisco VIC ENIC in a bond (active-backup) mtu 9000, 10000baseT/Full

Expected Behavior

As in the prior version (1745.5.0), same machine.

curl --limit-rate 1M -o /dev/null http://<high-speed-low-latency-source>
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0  956M    0 8439k    0     0  1023k      0  0:15:57  0:00:08  0:15:49 1020k

The download proceeds at (more or less) the limited speed.
The problem is not limited to curl: it also affects downloads by the docker daemon, and presumably others.
On 1745.6.0, whenever the client does not process the data as fast as the network delivers it, the speed breaks down.

Actual Behavior

curl --limit-rate 1M -o /dev/null http://<high-speed-low-latency-source>
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                Dload  Upload   Total   Spent    Left  Speed
 0  956M    0 6482k    0     0  41779      0  6:40:17  0:02:38  6:37:39   623

The throughput quickly drops far below the speed limit.
This also happens with Kubernetes stopped and iptables cleared.

Reproduction Steps

  1. Run curl with a speed limit lower than the source can deliver. The faster the source, the better.
  2. Wait for the traffic to drop far below the limit, presumably once some buffer is full (in our case after ~6MiB).
  3. Reboot the same host into the prior version (1745.5.0).
  4. Run the same command and see the expected behaviour.
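
The first step does not depend on curl specifically; any client that reads more slowly than the source sends should do. Below is a minimal sketch of such a slow reader, assuming a plain TCP socket that is already connected to the source. The function name `paced_read` and its parameters are made up for illustration, and the pacing loop only roughly imitates a rate limit, it is not curl's actual implementation.

```python
import socket
import time


def paced_read(sock, rate_bytes_per_s, chunk_size=16384, max_bytes=None):
    """Read from sock while keeping the average rate near rate_bytes_per_s.

    Hypothetical helper: it drains the socket slower than the sender can
    deliver, so the kernel receive buffer fills up and the advertised TCP
    window has to shrink. Returns (bytes_read, elapsed_seconds).
    """
    start = time.monotonic()
    total = 0
    while max_bytes is None or total < max_bytes:
        want = chunk_size if max_bytes is None else min(chunk_size, max_bytes - total)
        data = sock.recv(want)
        if not data:  # peer closed the connection
            break
        total += len(data)
        # Sleep until the elapsed time matches what the target rate allows.
        delay = total / rate_bytes_per_s - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)
    return total, time.monotonic() - start
```

On an affected node one would expect the measured rate (bytes read divided by elapsed time) to fall well below `rate_bytes_per_s` after the first few MiB, as in the curl output above; on 1745.5.0 it should stay close to the target.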

Other Information

A tcpdump seems to indicate that the client advertises a window size of 384 bytes, at roughly 10 packets per second.
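
As a rough plausibility check, the numbers from the tcpdump can be compared with the curl output. The sketch below assumes that each of the ~10 packets per second lets the sender transmit at most one 384-byte window, and that curl's `1M` and `956M` are 1024-based; both are assumptions, not something the capture proves.

```python
MIB = 1024 * 1024


def window_limited_rate(window_bytes, updates_per_s):
    """Upper bound in bytes/s if every window update admits one window of data."""
    return window_bytes * updates_per_s


def eta_hours(total_bytes, rate_bytes_per_s):
    """Time in hours to transfer total_bytes at a constant rate."""
    return total_bytes / rate_bytes_per_s / 3600


limit = 1 * MIB                       # curl --limit-rate 1M (assumed 1024-based)
bound = window_limited_rate(384, 10)  # values read off the tcpdump
print(bound)                          # 3840 bytes/s
print(f"{bound / limit:.2%} of the configured limit")
# Average speed reported by curl on 1745.6.0 was 41779 bytes/s for a 956M file.
print(f"{eta_hours(956 * MIB, 41779):.2f} hours at the observed average")
```

Under these assumptions the window alone would cap the transfer at a few KB/s, which is the same order of magnitude as the collapsed "Current Speed" column rather than the 1M limit.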

Prior version:

cat /etc/os-release
NAME="Container Linux by CoreOS"
ID=coreos
VERSION=1745.5.0
VERSION_ID=1745.5.0
BUILD_ID=2018-05-31-0701
PRETTY_NAME="Container Linux by CoreOS 1745.5.0 (Rhyolite)"
ANSI_COLOR="38;5;75"
HOME_URL="https://coreos.com/"
BUG_REPORT_URL="https://issues.coreos.com"
COREOS_BOARD="amd64-usr"
