curl/git don't work properly inside an lxd container on wsl/ubuntu instance due to connection reset by peer. #9427

Closed
1 of 2 tasks
phr34k opened this issue Jan 3, 2023 · 4 comments


phr34k commented Jan 3, 2023

Version

Microsoft Windows [Version 10.0.19044.2364]

WSL Version

  • WSL 2
  • WSL 1

Kernel Version

5.15.79.1

Distro Version

Ubuntu 22.10

Other Software

LXD (5.9-76c110d)
curl (7.81.0)

Repro Steps

I'm using snapcraft in WSL to develop snaps. Internally, snapcraft uses LXD (i.e. snapcraft --use-lxd) to run commands inside containers. When a command fetches a large network resource, such as a git clone or flutter precache, I almost always get peer disconnects. For brevity, I've reproduced this behavior with a minimal example. For more details see: https://stackoverflow.com/questions/74989653/how-to-fix-unstable-network-for-lxc-lxd-container-that-causes-curl-git-commands

# enable the systemd daemon, because snapcraft requires it.
echo -e "[boot]\nsystemd=true" | sudo tee /etc/wsl.conf > /dev/null
sudo apt-get update && sudo apt-get upgrade
# shutdown and reboot the vm
wsl.exe --shutdown
# install and configure lxd with default configuration i.e. install network bridge lxdbr0 and run curl command.
sudo snap install lxd
sudo lxd init
sudo lxc launch images:ubuntu/focal wired-bluejay
sudo lxc exec wired-bluejay -- curl -o dart-sdk-linux-x64.zip https://storage.googleapis.com/flutter_infra_release/flutter/472e34cbbcd461c748973e7e735558ab200d4f5e/dart-sdk-linux-x64.zip
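
Since the first command above overwrites /etc/wsl.conf, it can be worth confirming after the restart that the setting is actually in place before installing LXD. The following is a minimal sketch: the function name systemd_enabled_in is made up for illustration, and it only inspects the file, not the running system.

```shell
# systemd_enabled_in [FILE]: succeed if FILE (default /etc/wsl.conf)
# contains "systemd=true" inside its [boot] section.
systemd_enabled_in() {
  awk '
    # Track which section we are in; only [boot] counts.
    /^\[/ { in_boot = ($0 ~ /^\[boot\]/) }
    in_boot && /^[ \t]*systemd[ \t]*=[ \t]*true[ \t]*$/ { found = 1 }
    END { exit !found }
  ' "${1:-/etc/wsl.conf}"
}
```

Calling systemd_enabled_in without arguments checks the default path; its exit status is zero only when the setting is found under [boot].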

Expected Behavior

curl and git exit without problems e.g.

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  246M  100  246M    0     0  2896k      0  0:01:27  0:01:27 --:--:-- 2295k

Actual Behavior

The behavior is intermittent, but curl, git and other tools abort their downloads with a "connection reset by peer" error. It seems to depend on the internet connection: on one wireless network the problem doesn't occur at all, while on two others it occurs persistently.

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  8  246M    8 20.8M    0     0  2780k      0  0:01:30  0:00:07  0:01:23 2951k
curl: (56) OpenSSL SSL_read: Connection reset by peer, errno 104

The same reset shows up when running snapcraft itself:

LXD is required but not installed. Do you wish to install LXD and configure it with the defaults? [y/N]: y
lxd 5.9-76c110d from Canonical✓ installed
Launching instance... \ (436.5s)
Traceback (most recent call last):
  File "/snap/snapcraft/8638/lib/python3.8/site-packages/craft_providers/actions/snap_installer.py", line 320, in inject_from_host
    executor.execute_run(
  File "/snap/snapcraft/8638/lib/python3.8/site-packages/craft_providers/lxd/lxd_instance.py", line 289, in execute_run
    return self.lxc.exec(
  File "/snap/snapcraft/8638/lib/python3.8/site-packages/craft_providers/lxd/lxc.py", line 329, in exec
    return runner(final_cmd, **kwargs)  # pylint: disable=subprocess-run-check
  File "/snap/snapcraft/8638/usr/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['lxc', '--project', 'snapcraft', 'exec', 'local:snapcraft-liquid-pos-on-amd64-for-amd64-8725724278304174', '--', 'env', 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin', 'SNAPCRAFT_MANAGED_MODE=1', 'snap', 'install', '/tmp/snapcraft.snap', '--classic']' returned non-zero exit status 1.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/snap/snapcraft/8638/lib/python3.8/site-packages/craft_providers/bases/buildd.py", line 480, in _install_snaps
    snap_installer.inject_from_host(
  File "/snap/snapcraft/8638/lib/python3.8/site-packages/craft_providers/actions/snap_installer.py", line 328, in inject_from_host
    raise SnapInstallationError(
craft_providers.actions.snap_installer.SnapInstallationError: failed to install snap 'snapcraft'
* Command that failed: 'lxc --project snapcraft exec local:snapcraft-liquid-pos-on-amd64-for-amd64-8725724278304174 -- env PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/snap/bin SNAPCRAFT_MANAGED_MODE=1 snap install /tmp/snapcraft.snap --classic'
* Command exit code: 1
* Command output: b'2023-01-05T01:52:09Z INFO Waiting for automatic snapd restart...\n'
* Command standard error output: b'error: cannot perform the following tasks:\n- Download snap "core20" (1778) from channel "stable" (read tcp 10.137.228.108:54806->91.189.91.43:443: read: connection reset by peer)\n'
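
Because the failure is intermittent, a single run says little. A rough way to put a number on it is to repeat the download several times and count how many attempts fail. This is only a sketch: count_failures is a hypothetical helper, and the number of attempts is arbitrary.

```shell
# count_failures N CMD...: run CMD N times and print how many runs
# exited with a non-zero status. The body runs in a subshell, so the
# helper variables do not leak into the calling shell.
count_failures() (
  attempts=$1
  shift
  failures=0
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Discard the command's output; only the exit status matters here.
    "$@" >/dev/null 2>&1 || failures=$((failures + 1))
    i=$((i + 1))
  done
  echo "$failures"
)
```

For example, with URL set to the download link from the repro steps, count_failures 5 sudo lxc exec wired-bluejay -- curl -sS -o /dev/null "$URL" prints how many of five downloads inside the container were reset.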

Diagnostic Logs

What I've already tried:

  • verified the MTU sizes (1500) and tried different values, without improvement
  • verified the problem isn't due to an unstable internet connection (i.e. the same curl download works outside the LXD container, on both the Ubuntu host and the Windows host)
  • disabled all network adapters except WSL and wireless internet
  • disabled all profiles in Windows Firewall
  • verified the problem is specific to WSL by installing VirtualBox with Ubuntu 22.10 and replicating the minimal example there
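
For anyone repeating the MTU check from the list above, the values can be read straight from sysfs. A minimal sketch, assuming a Linux host; list_mtus is a hypothetical helper name, not an existing tool.

```shell
# list_mtus [SYSFS_NET_DIR]: print "<interface> <mtu>" for every
# network interface found under the given directory
# (default: /sys/class/net).
list_mtus() (
  net_dir=${1:-/sys/class/net}
  for dev in "$net_dir"/*; do
    # Skip entries that are not interfaces (no readable mtu file).
    [ -r "$dev/mtu" ] || continue
    printf '%s %s\n' "$(basename "$dev")" "$(cat "$dev/mtu")"
  done
)
```

Running list_mtus on the WSL host shows eth0 and lxdbr0 side by side. Inside the container, sudo lxc exec wired-bluejay -- cat /sys/class/net/eth0/mtu gives the matching value, assuming the default eth0 interface name. A different bridge MTU can be tried with sudo lxc network set lxdbr0 bridge.mtu 1400, where 1400 is just an example value.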

phr34k commented Jan 4, 2023

I captured tcpdumps to analyze what is happening. Here's a brief overview, with links to the pcap files: https://gist.github.com/phr34k/334c1c34fd519165cfede0eeaf382e7b. The traffic between Windows and Ubuntu looks quite normal, with packet retransmissions between 2% and 3%. Between Ubuntu and the LXC container, however, something odd is happening: the number of packets doubles and retransmissions spike to about 23%.

This seems consistent with my findings that:

  • curl download on windows works
  • curl download on ubuntu (wsl) works
  • curl download on lxc in ubuntu (wsl) does not work

Is there anything on the WSL side, e.g. the network stack or WSL-specific network drivers, that could explain what I'm seeing?
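
For reference, retransmission rates like the ones quoted above can be estimated from a capture roughly as follows. This is a sketch under assumptions, not necessarily how the numbers in the gist were produced: it assumes tshark is installed, and both function names are made up for illustration.

```shell
# retrans_pct TOTAL RETRANS: print retransmitted packets as a
# percentage of all packets, with one decimal place.
retrans_pct() {
  awk -v total="$1" -v retrans="$2" 'BEGIN {
    if (total == 0) { print "0.0"; exit }
    printf "%.1f\n", 100 * retrans / total
  }'
}

# pcap_retrans_pct FILE: count TCP packets and TCP retransmissions in
# a capture with tshark (one output line per matching packet) and
# report the retransmission rate.
pcap_retrans_pct() (
  total=$(tshark -r "$1" -Y tcp | wc -l)
  retrans=$(tshark -r "$1" -Y tcp.analysis.retransmission | wc -l)
  retrans_pct "$total" "$retrans"
)
```

Captures for both hops can be taken with sudo tcpdump -i eth0 -w wsl-eth0.pcap and sudo tcpdump -i lxdbr0 -w lxdbr0.pcap (the file names are placeholders) while the download runs, then compared with pcap_retrans_pct wsl-eth0.pcap and pcap_retrans_pct lxdbr0.pcap.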

@phr34k phr34k changed the title curl/git don't work properly inside an lxd container on wsl/ubuntu instance and disconnects frequently. curl/git don't work properly inside an lxd container on wsl/ubuntu instance due to connection reset by peer. Jan 5, 2023
@LongLiveCHIEF

Related to #8358?


phr34k commented Mar 5, 2023

@LongLiveCHIEF not sure. From the description of that issue, it looks like curl hangs directly on the WSL instance, whereas mine distinctly fails only inside the LXD container. I suspect something is going on with the network stack, which may have been modified for WSL.


This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request.

Thank you!
