
vsock crash #50

Closed
tailhook opened this issue Aug 4, 2016 · 24 comments
Comments

@tailhook

tailhook commented Aug 4, 2016

Hi,

We have an annoying crash of Docker for Mac which, AFAIU, boils down to an invalid ringbuf check here.

Log:

Aug  4 19:28:49 estus Docker[42833] <Error>: Socket.TCPV4.read connected TCP: caught Uwt.Uwt_error(Uwt.ECONNRESET, "uwt_read", "") returning Eof
Aug  4 19:28:49 estus Docker[42833] <Error>: Socket.TCPV4.write connected TCP: caught Uwt.Uwt_error(Uwt.EPIPE, "write", "") returning Eof
Aug  4 19:28:49 estus Docker[42833] <Error>: Socket.Stream: caught Uwt.Uwt_error(Uwt.ENOTCONN, "shutdown", "")
Aug  4 19:28:49 estus Docker[42833] <Error>: Socket.TCPV4.write connected TCP: caught Uwt.Uwt_error(Uwt.EPIPE, "write", "") returning Eof
Aug  4 19:28:49 estus Docker[42833] <Error>: Socket.Stream: caught Uwt.Uwt_error(Uwt.ENOTCONN, "shutdown", "")
Aug  4 19:28:55 estus Docker[42833] <Error>: Socket.TCPV4.write connected TCP: caught Uwt.Uwt_error(Uwt.EPIPE, "write", "") returning Eof
Aug  4 19:28:55 estus Docker[42833] <Error>: Socket.Stream: caught Uwt.Uwt_error(Uwt.ENOTCONN, "shutdown", "")
Aug  4 19:28:55 estus Docker[42833] <Error>: Socket.TCPV4.write connected TCP: caught Uwt.Uwt_error(Uwt.EPIPE, "write", "") returning Eof
Aug  4 19:28:55 estus Docker[42833] <Error>: Socket.Stream: caught Uwt.Uwt_error(Uwt.ENOTCONN, "shutdown", "")
Aug  4 19:28:55 estus Docker[42832] <Error>: Fatal unexpected exception: Socket.Closed
Aug  4 19:28:55 estus Docker[42834] <Notice>: virtio-net-vpnkit: initialising, opts="uuid=01b794ee-68a9-4b92-9c79-b2b44e5f271e,path=/Users/free2use/Library/Containers/com.docker.docker/Data/s50,macfile=/Users/free2use/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/mac.0"
Aug  4 19:28:55 estus Docker[42834] <Notice>: Error reading vsyslog length: EOF
Aug  4 19:28:55 estus Docker[42834] <Notice>: Assertion failed: (sc->reply_cons != sc->reply_prod), function send_response_common, file src/pci_virtio_sock.c, line 879.

As far as I can see the error is here:

    if (sc->reply_prod == VTSOCK_REPLYRINGSZ)
        sc->reply_prod = 0;
    /* check for reply ring overflow */
    /* XXX correct check? */
    DPRINTF(("TX: QUEUING REPLY IN SLOT %x (prod %x, cons %x)\n",
         slot, sc->reply_prod, sc->reply_cons));
    assert(sc->reply_cons != sc->reply_prod);

I.e. someone should address that "XXX correct check?" comment :)
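
For context, the standard "keep one slot free" ring-index discipline that the "XXX correct check?" comment is presumably worrying about looks roughly like the sketch below. This is purely illustrative: the names and size are made up, and it is not the hyperkit code.

    #include <stdbool.h>
    #include <stdint.h>

    #define RINGSZ 8u                 /* illustrative size, not VTSOCK_REPLYRINGSZ */

    struct ring {
        uint32_t prod;                /* next slot the producer will fill */
        uint32_t cons;                /* next slot the consumer will drain */
        int slots[RINGSZ];
    };

    /* Empty when the indices coincide. */
    static bool ring_empty(const struct ring *r)
    {
        return r->prod == r->cons;
    }

    /* Full when advancing prod would make it collide with cons; one slot is
     * deliberately left unused so that "full" and "empty" stay distinguishable. */
    static bool ring_full(const struct ring *r)
    {
        return (r->prod + 1) % RINGSZ == r->cons;
    }

    /* Refuses to overwrite unconsumed data instead of asserting. */
    static bool ring_push(struct ring *r, int v)
    {
        if (ring_full(r))
            return false;
        r->slots[r->prod] = v;
        r->prod = (r->prod + 1) % RINGSZ;
        return true;
    }

Judging by the quoted snippet, the hyperkit check runs after the producer index has already been advanced, so reply_prod == reply_cons at that point would mean the producer has caught up with the consumer, i.e. the ring is full, which is what the assert guards against.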

Alternatively, if it's hard to fix, could we make VTSOCK_REPLYRINGSZ bigger (if that would at least make the error rarer)?

@ijc
Collaborator

ijc commented Aug 8, 2016

Hi @tailhook & thanks for your report. Sorry for the delay replying, I've been AFK travelling.

The assertion is checking whether the reply ring is full. I'm reasonably confident the check is correct, although as the comment indicates a second pair of eyes wouldn't go amiss ;-). However, this is slightly moot in light of seeing the assertion actually fire: even if the check were wrong it would only be off by one, which would mean either that the ring had but a single slot left or that it had overflowed by one slot (the latter is obviously worse, but as I'll explain it is still a bit moot).

The issue is that if the reply ring has filled up (+/- 1 if the check is wrong, but one slot either way isn't going to save us or change what follows) then, due to a protocol bug in the vsock protocol spec, there is actually nothing we can do other than crash or deadlock. The assert turns the deadlock into a crash, which is obviously not ideal either, but better than some other component silently locking up. There is a description of the deadlock in 1ea0301, which is where the reply ring was introduced (previously the deadlock was much more obvious; there are also some subsequent follow-up fixes but that is the main one). This all stems from the assumption in the design that vsock is reliable and therefore cannot drop anything, while also not including any back-pressure mechanism, while also including cross-talk between the rx and tx rings (i.e. a tx request can result in a requirement to have a free rx slot).
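
To make the "crash or deadlock" trade-off concrete, here is a hedged sketch of the only options a backend has when its bounded reply ring fills up under these constraints. It reuses the toy struct ring from the earlier sketch, and none of the names are the real hyperkit API.

    #include <assert.h>

    enum full_policy { POLICY_ASSERT, POLICY_DROP, POLICY_BLOCK };

    /* Sketch only: what queueing a reply could do when the ring is full. */
    static int queue_reply(struct ring *r, int reply, enum full_policy p)
    {
        if (ring_push(r, reply))
            return 0;                     /* common case: a slot was free */

        switch (p) {
        case POLICY_ASSERT:
            /* What the code does today: turn the latent deadlock into a loud,
             * debuggable crash. */
            assert(!"reply ring overflow");
            return -1;
        case POLICY_DROP:
            /* Not an option: vsock is specified as reliable, so nothing may be
             * silently dropped. */
            return -1;
        case POLICY_BLOCK:
            /* Waiting for the guest to drain replies can deadlock, because the
             * guest may itself be blocked on this very TX path. */
            while (!ring_push(r, reply))
                ;                         /* spins forever in the bad case */
            return 0;
        }
        return -1;
    }

None of the branches is acceptable in the long run, which is why the fix ultimately has to come from the protocol spec itself.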

I raised this deadlock upstream a while back and had a discussion with the guys doing the protocol specification (as well as the Linux frontend implementation), and they were going to look at remedying it in the next version. I see in my inbox this morning that I have a new set of frontend patches (RFC v6) and a new version of the specification proposal (v7; NB the spec proposal and the frontend implementation are independently numbered, but v6 of the patches corresponds to v7 of the spec) with a changelog which specifically mentions this issue, so hopefully something has changed (or been more tightly specified) which will resolve this issue at the root.

That's going to take a little time to digest and to make the necessary updates on both ends. In the meantime, increasing VTSOCK_REPLYRINGSZ is not a bad idea; as you can see in 1ea0301 I already doubled it once to 2x the ring size, and going to 4x can't hurt.
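
(For reference, the tweak being described is roughly a one-line constant bump along the lines of the sketch below; only VTSOCK_REPLYRINGSZ itself is confirmed by the logs above, and the name of the underlying ring-size constant is an assumption.)

    /* Hypothetical sketch of the bump discussed above, in src/pci_virtio_sock.c;
     * VTSOCK_QSIZE stands in for whatever the real virtio ring-size constant
     * is called. */
    #define VTSOCK_REPLYRINGSZ (4 * VTSOCK_QSIZE)    /* was 2 * VTSOCK_QSIZE */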

If you could give details of your repro case (ideally in the form of a Dockerfile or docker-compose.yml and a simple docker build/run or docker-compose up) that would be handy, since your workload seems to be even better at provoking this behaviour than the ones I've found previously; to be hitting the assert it must be provoking many outstanding vsock tx requests compared with the ring size.

@tailhook
Author

tailhook commented Aug 8, 2016

If you could give details of your repro case

It's hard to do. We have a huge web project in Python. We run nodemon (to start the Python process) and webpack --watch to monitor filesystem events. There are literally tens of thousands of inotify watches (the default 8k limit doesn't work for us). Both run on the (osxfs) volume.

The Python process is a web application with pretty large HTML pages (~10 MB). The crash usually happens when we refresh a page in a web browser. In fact, when we use the minified version of the page (which is about 200 KB, i.e. 50x smaller) the failure is rarer, so it probably depends on the amount of traffic between the host system and the containers. Each page visit also fetches data from a database that is outside of this physical box.

Also, each web page touches a session file which lives on the shared volume. And it may happen (I don't know how to verify it) that some inotify watches follow that file's changes, if we have misconfigured the watchers for whatever reason.

This is the story. I can't publish all of our sources, but if you give me some hints about which exact operations influence ringbuf usage, we can come up with a smaller script to reproduce the issue: do we need filesystem operations? Do network connections or packets fill the ringbuf? What about inotify?

@gluxon

gluxon commented Aug 8, 2016

This is the story. I can't publish all of our sources, but if you give me some hints about which exact operations influence ringbuf usage, we can come up with a smaller script to reproduce the issue: do we need filesystem operations? Do network connections or packets fill the ringbuf? What about inotify?

tailhook

This is very well said. My project is in a similar situation. While I open source all my personal projects, the one this bug affects is at my job, so the source code is difficult to share. I attempted to start an nginx image that loads an HTML page returning many large images and wasn't able to reproduce the bug. The workaround I used, until I discovered I could increase the reply ring size, was just disabling images. I agree that knowing exactly what causes reply ring entries to accumulate would definitely help me upload a test case.

@ijc25 Thank you for your detailed comment! Reading it was very interesting after looking at the source code myself. I appreciate a developer who explains how an issue occurs so the rest of us can learn.

@dsheets
Collaborator

dsheets commented Aug 19, 2016

@ijc any phrase like "fix [issue ref]" will close an issue even if preceded by "not". :-P

@dsheets dsheets reopened this Aug 19, 2016
@ijc
Collaborator

ijc commented Aug 19, 2016

@dsheets: thanks! I was careful with my wording in #51 but apparently got sloppy when I pulled the changes over.

@kudos

kudos commented Sep 10, 2016

I'm seeing a similar crash:

11/09/2016 00:45:09.939 Docker[39599]: Assertion failed: (!REPLY_RING_EMPTY(sc)), function send_response_common, file src/pci_virtio_sock.c, line 950.
11/09/2016 00:45:09.942 Docker[39599]: virtio-net-vpnkit: initialising, opts="uuid=38112b0d-c8d1-483d-8240-07cfe3f720af,path=/Users/jonathan/Library/Containers/com.docker.docker/Data/s50,macfile=/Users/jonathan/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/mac.0"
11/09/2016 00:45:09.943 Docker[39599]: Interface will have uuid 38112b0d-c8d1-483d-8240-07cfe3f720af
11/09/2016 00:45:09.943 Docker[39599]: Connection established with MAC=c0:ff:ee:c0:ff:ee and MTU 1500
11/09/2016 00:45:09.943 Docker[39599]: virtio-9p: initialising path=/Users/jonathan/Library/Containers/com.docker.docker/Data/s40,tag=db
11/09/2016 00:45:09.943 Docker[39599]: virtio-9p: initialising path=/Users/jonathan/Library/Containers/com.docker.docker/Data/s51,tag=port
11/09/2016 00:45:09.943 Docker[39599]: linkname /Users/jonathan/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty
11/09/2016 00:45:09.943 Docker[39599]: COM1 connected to /dev/ttys002
11/09/2016 00:45:09.944 Docker[39599]: COM1 linked to /Users/jonathan/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty
11/09/2016 00:45:09.947 Docker[39598]: PPP.listen: closing connection
11/09/2016 00:45:09.948 Docker[39597]: Fatal unexpected exception: Socket.Closed
11/09/2016 00:45:09.951 Docker[39598]: Socket.Stream: caught Uwt.Uwt_error(Uwt.ENOTCONN, "shutdown", "")
11/09/2016 00:45:09.953 Docker[21966]: Reap com.docker.osxfs (pid 39597): exit status 1
11/09/2016 00:45:10.011 Docker[21966]: Reap com.docker.driver.amd64-linux (pid 39599): exit status 0
11/09/2016 00:45:10.013 Docker[21976]: VM shutdown at 2016-09-11 00:45:09 +0100 IST

Unlike @tailhook, I was running a single container without any volumes mounted.

@telecoda

Experiencing the same issue here. Using docker-compose with multiple containers. It only occurs with a large amount of I/O.

@yacn

yacn commented Sep 12, 2016

I'm seeing the same crash as @kudos with the latest beta, running two containers (portal and a DB) with the code mounted into the container.

9/12/16 12:00:27.181 PM Docker[47588]: Assertion failed: (!REPLY_RING_EMPTY(sc)), function send_response_common, file src/pci_virtio_sock.c, line 950.
9/12/16 12:00:27.181 PM Docker[47588]: virtio-net-vpnkit: initialising, opts="uuid=32529f86-e3f9-40d5-a388-c47224df3252,path=/Users/iboehman/Library/Containers/com.docker.docker/Data/s50,macfile=/Users/iboehman/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/mac.0"
9/12/16 12:00:27.181 PM Docker[47588]: Interface will have uuid 32529f86-e3f9-40d5-a388-c47224df3252
9/12/16 12:00:27.181 PM Docker[47588]: Connection established with MAC=c0:ff:ee:c0:ff:ee and MTU 1500
9/12/16 12:00:27.181 PM Docker[47588]: virtio-9p: initialising path=/Users/iboehman/Library/Containers/com.docker.docker/Data/s40,tag=db
9/12/16 12:00:27.182 PM Docker[47588]: virtio-9p: initialising path=/Users/iboehman/Library/Containers/com.docker.docker/Data/s51,tag=port
9/12/16 12:00:27.182 PM Docker[47588]: linkname /Users/iboehman/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty
9/12/16 12:00:27.182 PM Docker[47588]: COM1 connected to /dev/ttys001
9/12/16 12:00:27.183 PM Docker[47588]: COM1 linked to /Users/iboehman/Library/Containers/com.docker.docker/Data/com.docker.driver.amd64-linux/tty
9/12/16 12:00:27.183 PM Docker[47588]: received sigusr1, pausing
9/12/16 12:00:27.183 PM Docker[47588]: received sigusr2, unpausing
9/12/16 12:00:27.183 PM Docker[47588]: received sigusr1, pausing
9/12/16 12:00:27.183 PM Docker[47588]: received sigusr2, unpausing
9/12/16 12:00:27.188 PM Docker[47587]: PPP.listen: closing connection
9/12/16 12:00:27.189 PM Docker[47586]: Fatal unexpected exception: Socket.Closed
9/12/16 12:00:27.201 PM Docker[37616]: Reap com.docker.osxfs (pid 47586): exit status 1
9/12/16 12:00:27.222 PM Docker[37623]: VM shutdown at 2016-09-12 12:00:27 -0400 EDT
9/12/16 12:00:27.229 PM Docker[37616]: Reap com.docker.driver.amd64-linux (pid 47588): exit status 0
9/12/16 12:00:27.244 PM Docker[47587]: Socket.Stream: caught Uwt.Uwt_error(Uwt.ENOTCONN, "shutdown", "")

ijc pushed a commit to ijc/hyperkit that referenced this issue Sep 13, 2016
… of rx

v7 of the virtio vsock spec proposal adds a requirement to continue processing
the TX ring so long as "additional resources" are available, in order to avoid
deadlock. In this implementation those additional resources take the form of
the reply_ring.

In order to avoid deadlock we must ensure that we process the reply ring in
preference to actual data RX frequently enough that we avoid filling the reply
ring too. Since the reply ring is 2x the virtio ring size I _think_ it should
be sufficient to ensure that we check for new reply entries periodically
(something less than the virtio ring size) rather than every time.

This is discussed in several replies to the spec posting at
<1470324277-19300-1-git-send-email-stefanha@redhat.com>.

A more formal consideration of how deadlock is avoided using this scheme should
be forthcoming (see the above thread). Worst case, this change doesn't avoid the
deadlock; it can't make things any worse.

This should hopefully address moby#50.

Signed-off-by: Ian Campbell <ian.campbell@docker.com>
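
A rough sketch of the scheduling idea the commit message describes, i.e. draining the reply ring at least once every N TX descriptors with N kept below the virtio ring size. Every name, helper, and constant below is an assumption for illustration, not the actual hyperkit code.

    /* All names below are illustrative stand-ins for the real hyperkit state
     * and helpers. */
    struct vsock_ctx;                                    /* opaque device state */
    int  tx_ring_has_work(struct vsock_ctx *sc);
    int  reply_ring_full(struct vsock_ctx *sc);
    void handle_one_tx_descriptor(struct vsock_ctx *sc); /* may queue a reply */
    void flush_reply_ring(struct vsock_ctx *sc);         /* sends queued replies to the guest */

    #define ASSUMED_VTSOCK_RINGSZ 256
    #define REPLY_CHECK_PERIOD (ASSUMED_VTSOCK_RINGSZ / 2)  /* must stay < ring size */

    void process_tx_ring(struct vsock_ctx *sc)
    {
        unsigned int handled = 0;

        /* Keep consuming TX so long as the "additional resources" (reply ring
         * slots) are available, per the v7 spec wording quoted above. */
        while (tx_ring_has_work(sc) && !reply_ring_full(sc)) {
            handle_one_tx_descriptor(sc);

            /* Checking every descriptor is not required; draining at least once
             * per REPLY_CHECK_PERIOD descriptors keeps the 2x-sized reply ring
             * from filling up between checks. */
            if (++handled % REPLY_CHECK_PERIOD == 0)
                flush_reply_ring(sc);
        }

        /* Flush whatever replies accumulated before returning to data RX. */
        flush_reply_ring(sc);
    }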
@ijc
Collaborator

ijc commented Sep 16, 2016

I've merged the changes in #59 and the new Linux frontend patches into Docker for Mac's master branch, so they will be present in the next beta, which will be beta27, due (hopefully) next week (but possibly as long as the week after, depending on other bugs etc.).

I think these changes will fix this issue so I am going to close it now while it is on my mind.

Once beta27 is released (if you install beta26 now you should get it via autoupdate) please do retest and file a bug against https://github.com/docker/for-mac (via the "Diagnose & Feedback" entry in the whale menu) if the issue persists or recurs in a different form.

Thanks for your report(s) and your patience.

Ian.

@ijc ijc closed this as completed Sep 16, 2016
@telecoda

@ijc25 - nice work. I have been testing today and I'm not experiencing the implosion of Docker I was getting before with heavy I/O 👍

@ijc
Collaborator

ijc commented Sep 16, 2016

@telecoda the changes referred to here are not yet in any released version of docker for mac (they will be in beta27 when it is released). So I don't know what change has fixed things for you, but I'm glad everything is ok!

@telecoda

@ijc25 sorry, I was a little premature with my congrats. That'll explain why I still have the same issue…

@bryanhelmig

Now that beta27 has landed, it looks like I'm still getting some sort of error in this class; here are some syslog -k Sender Docker logs: https://gist.github.com/bryanhelmig/dca8725d5fcd4649432fe4e5d068dd51.

@ijc
Collaborator

ijc commented Sep 29, 2016

@bryanhelmig Yes, that does look rather similar :-(.

Please could you open a fresh ticket against https://github.com/docker/for-mac with the full output of diagnostics. Any sort of reproduction case would also be useful in tracking this down.

I understand that many of the repro cases involve building proprietary software, but if someone were able to pare their case down to just the basic build-system skeleton required to trigger the issue (removing all the proprietary stuff), or to find a subset of their code which they were willing/able to post, then we'll stand a much better chance of nailing this once and for all.

@bryanhelmig

I will try and set up a build that does the following things we're doing:

  1. a mounted volume backing both js and python
  2. a handful of very large, non-minified debug-build JavaScript files that are served through a Django dev server
  3. a non-trivial Django view utilizing many imports

I will see if we can reproduce it. Would a github repo with a docker-compose.yml be sufficient?

@mpauly

mpauly commented Oct 5, 2016

In a similar configuration (also Django, with django-static-precompiler and less) we encountered similar problems. However, these turned out to be extremely difficult to replicate in a more minimalistic setup. We were able to get around this by re-enabling sendfile in our uwsgi config, which we had previously disabled in our dev setup to work around a bug in an older version of Docker for Mac.
Commenting out disable-sendfile = true in our uwsgi config .ini file solved the problem - I hope this helps.

@bryanhelmig

@mpauly curious to hear a bit about your uwsgi + dev server setup.

@bryanhelmig

@ijc25 Hey Ian - I have a reproducible sample project here: https://github.com/bryanhelmig/docker-compose-crash-repro. It should be as simple as cloning, building and running the container to reproduce. You can tweak the pressure up/down as needed.

Can you take a peek and let me know if it happens for you? Happy to go further - this is a massive pain point for us now.

@mpauly

mpauly commented Oct 7, 2016

Unfortunately I was a bit early on this - even after enabling sendfile the bug recurred, only less frequently. I also created a sample project to reproduce it: https://github.com/mpauly/dockerbug. It looks fairly similar to what @bryanhelmig did.

@bryanhelmig

@ijc25 thanks from us as well - looking forward to giving beta28 a spin.

@bryanhelmig

First pass with beta28 and it looks 👍. If you don't see me complain/link to a new issue, this is resolved in the wild for us!

@mpauly

mpauly commented Oct 18, 2016

Same here, beta28 solves the problem. Thanks!
