1.5: No TCP connectivity between docker container and qemu/dynamips #444

Closed
ghost opened this issue Feb 24, 2016 · 29 comments
@ghost commented Feb 24, 2016

GNS3 version: latest 1.5.0dev1 on Linux (64-bit) with Python 3.4.2 and Qt 5.3.2.

I've got a very strange issue and currently I'm completely puzzled.

I want to exchange data between a docker container and non-docker VMs, e.g. dynamips or qemu.

I set up the following project:
[screenshot: docker_project]

alpine-socat's Dockerfile:

FROM alpine:3.3
RUN apk add --update socat

ubuntu-socat's Dockerfile:

FROM ubuntu:14.04
RUN apt-get update && \
    apt-get install -y socat && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

Extremely simple, isn't it? I start the docker container with the command "sh".

MicroCore-1 is a qemu VM, R1 a 3725 dynamips router.

After configuring the interfaces with their IP address, I have full connectivity with ping. Every device can ping every other.

But I can't set up a TCP connection between the docker VMs and the other network elements. I start a socat TCP-LISTEN:1234 - listener on the docker container and then try to connect from the other devices via telnet <destination IP> 1234, but it fails (timeout). Telnet from the docker containers to the dynamips/qemu VMs fails as well. Between the docker containers there are no problems, and between dynamips and qemu there are no problems either.
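For illustration, the failing test looks roughly like this (the 10.0.0.x address is a made-up example, not taken from the project):

# on the docker container (e.g. alpine-socat), after configuring its interface
socat TCP-LISTEN:1234 -

# on R1 or MicroCore-1, which can already ping the container (here assumed to be 10.0.0.1)
telnet 10.0.0.1 1234    # hangs and eventually times out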

I made a Wireshark trace on R1:
[screenshot: wireshark]
You can see that R1 sends out a SYN, which is answered by a SYN-ACK. But R1 doesn't seem to process the SYN-ACK; it re-sends the SYN. The same happens on the qemu VM.

Any ideas?

@ghost (Author) commented Feb 24, 2016

I don't have a firewall active, so I didn't check that. But it seems that docker starts one:

behlers@dell:~$ sudo /sbin/iptables -L -v
Chain INPUT (policy ACCEPT 37062 packets, 6752K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain FORWARD (policy ACCEPT 0 packets, 0 bytes)
 pkts bytes target     prot opt in     out     source               destination         
44573   67M DOCKER-ISOLATION  all  --  any    any     anywhere             anywhere            
25262   66M DOCKER     all  --  any    docker0  anywhere             anywhere            
25262   66M ACCEPT     all  --  any    docker0  anywhere             anywhere             ctstate RELATED,ESTABLISHED
19311 1039K ACCEPT     all  --  docker0 !docker0  anywhere             anywhere            
    0     0 ACCEPT     all  --  docker0 docker0  anywhere             anywhere            

Chain OUTPUT (policy ACCEPT 36939 packets, 2797K bytes)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER (1 references)
 pkts bytes target     prot opt in     out     source               destination         

Chain DOCKER-ISOLATION (1 references)
 pkts bytes target     prot opt in     out     source               destination         
44573   67M RETURN     all  --  any    any     anywhere             anywhere            
behlers@dell:~$ 

But after flushing the table with sudo iptables -F the problem is still there.

@ghost (Author) commented Feb 24, 2016

I think I found the cause! The docker interfaces have TCP checksum offloading enabled, but no one actually calculates the checksum! After activating the Wireshark option that verifies the TCP checksum, it shows this clearly:

The first TCP session is dynamips to qemu with correct TCP checksums, the second session is dynamips to docker with invalid checksums from docker.

[screenshot: wireshark_tcp_checksum]

Now the question is: how do we fix it?

  • @noplay Do you know an option in the veth networking to disable offloading? If yes, can you try that?
  • Otherwise we have to try ethtool (must be installed) in the container to disable the offloading, e.g. as sketched below.
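A minimal sketch of that in-container approach, assuming the container interface is eth0 and the image actually ships ethtool:

# inside the container
ethtool -k eth0           # show the current offload settings (lowercase -k)
ethtool -K eth0 tx off    # disable TX checksum offloading (uppercase -K)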

julien-duponchelle added this to the 1.5 milestone Feb 24, 2016
@julien-duponchelle (Contributor) commented:

I think we need to fix that here:
https://github.com/GNS3/ubridge/blob/master/hypervisor_docker.c#L115

@julien-duponchelle (Contributor) commented:

This is discussed in the docker issue tracker:
moby/moby#18776

And the kernel issue:
http://thread.gmane.org/gmane.linux.kernel/2111961

@julien-duponchelle (Contributor) commented:

@Ehlers do you confirm that it is the same issue?

@AJNOURI (Contributor) commented Feb 24, 2016

@Ehlers, @noplay
I experienced the same issue: container-to-container TCP communication works, but not between a container and VMs (VirtualBox/VMware) or Cisco devices.
[screenshot: selection_930]

According to Wireshark, the issue is in the TCP handshake:

  • the container as a client doesn't respond with an ACK to complete the 3-way handshake
  • the container as a server doesn't respond with a SYN/ACK

[screenshot: selection_931]

[screenshot: selection_932]

@ghost (Author) commented Feb 24, 2016

I think it's not exactly my problem. The bug reports talk about bad packets from real hardware that are routed to veth interfaces. These interfaces don't check the checksum and deliver the packets even if the checksum is bad.

My issue is a docker container creating TCP packets. As TCP offloading seems to be active, no one creates the checksum. Container-to-container communication seems to work because the receiving container doesn't check the checksum, so it doesn't matter.

The problem only arises when the other end checks the checksum, as qemu and dynamips do.

@julien-duponchelle (Contributor) commented:

Yeah, after reading all the docs I think it's a different issue.

@julien-duponchelle (Contributor) commented:

With ethtool the fix seems to be:

ethtool -K eth0 rx off

@julien-duponchelle (Contributor) commented:

This could explain why running IOU in a container doesn't work.

@ghost (Author) commented Feb 24, 2016

ethtool -K eth0 gso off seems to have some influence as well, see http://www.linuxquestions.org/questions/linux-networking-3/help-needed-disabling-tcp-udp-checksum-offloading-in-debian-880233/

Quite confusing... I can run some tests with ethtool in the container tomorrow. But a solution outside the containers would be better; otherwise we would have to inject an ethtool binary into every container.

julien-duponchelle self-assigned this Feb 24, 2016
@julien-duponchelle (Contributor) commented:

Yeah, I think we should be able to find a solution outside.

@julien-duponchelle (Contributor) commented:

A trick for running ethtool from the outside (requires root):

Start the gns3server with --debug

Locate a line like this:
returned result ['gns3-veth2int moved to namespace 12247']

This is the PID of the container's process. You can also find it with ps.

vagrant@gns3$~$ pid=12247
vagrant@gns3$ sudo ln -s /proc/$pid/ns/net /var/run/netns/$pid
vagrant@gns3$ sudo ip netns exec 12247 ethtool

@ghost (Author) commented Feb 24, 2016

Just tested ethtool in the docker container: ethtool -K eth0 tx off works; rx off and gso off don't help.

Before:

# ethtool -k eth0
Features for eth0:
rx-checksumming: on
tx-checksumming: on
    tx-checksum-ipv4: off [fixed]
    tx-checksum-ip-generic: on
    tx-checksum-ipv6: off [fixed]
    tx-checksum-fcoe-crc: off [fixed]
    tx-checksum-sctp: off [fixed]
scatter-gather: on
    tx-scatter-gather: on
    tx-scatter-gather-fraglist: on
tcp-segmentation-offload: on
    tx-tcp-segmentation: on
    tx-tcp-ecn-segmentation: on
    tx-tcp6-segmentation: on
udp-fragmentation-offload: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: on
rx-vlan-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: on [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: on
tx-ipip-segmentation: on
tx-sit-segmentation: on
tx-udp_tnl-segmentation: on
tx-mpls-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: on
rx-vlan-stag-hw-parse: on
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
busy-poll: off [fixed]

And now the change:

# ethtool -K eth0 tx off
Actual changes:
tx-checksumming: off
    tx-checksum-ip-generic: off
tcp-segmentation-offload: off
    tx-tcp-segmentation: off [requested on]
    tx-tcp-ecn-segmentation: off [requested on]
    tx-tcp6-segmentation: off [requested on]
udp-fragmentation-offload: off [requested on]

@ghost (Author) commented Feb 25, 2016

Just tried it from the outside with julien's trick (ip netns exec $pid ethtool -K eth0 tx off) and it resolves the issue as well.

And here is a minimal project to reproduce the issue:
[screenshot: docker_project]

Add an IP address to both the dynamips router and the docker VM, and verify connectivity by pinging the remote device.
Now, from the docker VM, issue telnet <router IP> 1234. As the router runs no service on port 1234, the telnet should immediately return "Connection refused". But because the TCP packet from the docker VM has a bad checksum, the router ignores the TCP SYN from the docker VM and the telnet command hangs. After using the ethtool command, it should work.

Running this test from the router (to docker) is always successful, as the TCP RST answer from the docker VM always has a correct checksum. So testing from router to docker is useless.

@ghost (Author) commented Feb 25, 2016

As a workaround, here is a fix-up script. It has only been tested on Debian jessie and has to be started with sudo.

#!/bin/sh
# disable TCP checksum offloading on docker interfaces

if [ "$USER" != "root" ]; then
    echo "Must be run as root, start with sudo" >&2
    exit 1
fi

daemon=`cat /var/run/docker.pid 2> /dev/null`
if [ -z "$daemon" ]; then
    echo "Docker daemon not running" >&2
    exit 1
fi

mkdir -p /var/run/netns

# the containers run as child processes of the docker daemon;
# link their network namespaces so "ip netns exec" can enter them
pgrep -P $daemon | while read pid; do
    echo "docker process $pid..."
    ln -sf /proc/$pid/ns/net /var/run/netns/$pid
    # disable TX checksum offloading on every ethN interface in the container
    sed -n 's/^ *\(eth[0-9]*\):.*/\1/p' < /proc/$pid/net/dev | while read dev; do
        echo "Fixing $dev..."
        ip netns exec $pid ethtool -K $dev tx off
    done
    echo
done
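To apply it, save the script under any name (fix-docker-offload.sh below is just a placeholder) and run it through sudo while the affected containers are running:

sudo sh fix-docker-offload.sh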

@AJNOURI (Contributor) commented Feb 25, 2016

I have different results depending on who is the client and who is the server:

- client (container) ==> http server (VM):
  none of the rx/tx checksum combinations works

- client (VM) ==> http server (container):
  works only when tx is off
  rx:off / tx:off works
  rx:on / tx:off works

@AJNOURI (Contributor) commented Feb 25, 2016

Looks like a similar issue has been reported to docker:
moby/moby#16841

@ghost (Author) commented Mar 3, 2016

For me the "ethtool tx off" solves my issues.

Here is a tiny C program that does this specific ioctl:
tx-chksum-off.c
You can find the list of ethtool ioctls in /usr/include/linux/ethtool.h.
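For illustration, here is a minimal sketch of what such a program can look like. It is an approximation only; the attached tx-chksum-off.c may differ in details such as argument handling, and the interface name is taken from argv[1] here.

/*
 * Sketch of a minimal tx-chksum-off style tool: disable TX checksum
 * offloading on one interface, the programmatic equivalent of
 * "ethtool -K <interface> tx off". Illustration only.
 */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <net/if.h>
#include <linux/ethtool.h>
#include <linux/sockios.h>

int main(int argc, char *argv[])
{
    struct ifreq ifr;
    struct ethtool_value eval;
    int sock, rc;

    if (argc != 2) {
        fprintf(stderr, "usage: %s <interface>\n", argv[0]);
        return 1;
    }

    /* any IPv4 datagram socket works as a handle for the SIOCETHTOOL ioctl */
    sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_IP);
    if (sock < 0) {
        perror("socket");
        return 1;
    }

    memset(&ifr, 0, sizeof(ifr));
    strncpy(ifr.ifr_name, argv[1], sizeof(ifr.ifr_name) - 1);

    eval.cmd = ETHTOOL_STXCSUM;   /* set TX checksum offloading ... */
    eval.data = 0;                /* ... to "off"                   */
    ifr.ifr_data = (char *)&eval;

    rc = ioctl(sock, SIOCETHTOOL, &ifr);
    if (rc < 0)
        perror("ioctl(SIOCETHTOOL)");

    close(sock);                  /* don't leak the file descriptor */
    return rc < 0 ? 1 : 0;
}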

@julien-duponchelle (Contributor) commented:

Thanks!


julien-duponchelle added a commit to GNS3/ubridge that referenced this issue Mar 4, 2016
@julien-duponchelle (Contributor) commented:

Could you test with the latest git version of ubridge (via git or https://github.com/GNS3/ubridge/archive/master.zip)?

@ghost (Author) commented Mar 4, 2016

I tested with the following topology:
[screenshot: docker-project]
FTP from dynamips router R1 to docker ftpd-1 works, as does TFTP (UDP) traffic.
Telnet from docker ftpd-1 to R1 also works.

None of that worked previously. So dynamips to docker (and vice versa) works fine, perfect!

But I have some comments on your changes in ubridge.

  • In my tiny program I had missed a close(sock); I've corrected that. In my program it doesn't matter, as it terminates right afterwards. But in ubridge it leaves one file descriptor open on every call of turn_off_cx, so you would slowly run out of file descriptors.
  • Printing the error message with perror doesn't seem right in ubridge; errors are normally sent to the client with hypervisor_send_reply. So I would remove the perror and add a hypervisor_send_reply, e.g.:
if (turn_off_cx(if2)) {
    hypervisor_send_reply(conn, HSC_ERR_CREATE, 1, "could not turn off TX checksum offloading");
    // maybe the veth pair should be destroyed as well,
    // or the above error could be changed into a warning
    goto out;
}

@ghost (Author) commented Mar 4, 2016

Just tested between docker and a VMware VM (tinycore): also successful, and in both directions.

@ghost (Author) commented Mar 4, 2016

Here are my proposed changes to ubridge. I decided to emit only a warning, because I don't want to roll back the veth pair.

diff --git a/hypervisor_docker.c b/hypervisor_docker.c
index 8b4bb64..316d4e3 100644
--- a/hypervisor_docker.c
+++ b/hypervisor_docker.c
@@ -125,10 +125,8 @@ static int turn_off_cx(char *ifname) {
     int rc;

     sock = socket(PF_INET, SOCK_DGRAM, IPPROTO_IP);
-    if (sock < 0) {
-        perror("socket");
+    if (sock < 0)
         return sock;
-    }

     strncpy(ifr.ifr_name, ifname, sizeof(ifr.ifr_name));
     ifr.ifr_data = (char *)&eval;
@@ -137,11 +135,10 @@ static int turn_off_cx(char *ifname) {
     eval.data = 0;

     rc = ioctl(sock, SIOCETHTOOL, &ifr);
-    if (rc < 0) {
-        perror("ioctl");
-        return rc;
-    }
-    return 0;
+
+    close(sock);
+
+    return rc;
 }


@@ -200,7 +197,7 @@ static int cmd_create_veth_pair(hypervisor_conn_t *conn, int argc, char *argv[])
     }

     if (turn_off_cx(if2)) {
-        goto out;
+        hypervisor_send_reply(conn, HSC_INFO_MSG, 0, "Warning: could not turn off checksum");
     }

     hypervisor_send_reply(conn, HSC_INFO_OK, 1, "veth pair created: %s and %s", if1, if2);

@julien-duponchelle (Contributor) commented:

Patch merged!


@ghost (Author) commented Mar 4, 2016

Thanks for merging. For me the issue is resolved.

As this already has a long history, I suggest that we close it. If we experience other connectivity problems, a new issue should be opened.

@julien-duponchelle (Contributor) commented:

Thanks a lot for your contribution, it's very precious.

@hellt commented Nov 3, 2020

Since the OP has been deleted, may I ask @noplay: was the original issue that the Linux networking stack used by the container left the packets going out to dynamips without checksums, which then caused dynamips to terminate the TCP connection?

@julien-duponchelle (Contributor) commented:

Not really. I think it was an issue due to the checksum not matching on the sending machine, but I don't remember.
