Caught segfault during siege #289
Comments
I have a feeling this is related to something trying to access something on disk... cannot reproduce with simple-errors. Trying to make a better reprex |
CC @wch does this look like an httpuv issue? @shapenaji can you include the |
Hi @trestletech sure, here's session_info Session Info
we have a couple of internal packages there which are loading everything else, xxxxxx and yyyyyy, I'm digging through them to be sure. As far as I can tell though, the plumber function involved only does data.frame lookups... still digging that out at the moment |
Got it: hit the endpoint with good requests first: good_urls
Then hit it with a lot of 404s: bad_urls
From R on the API host:
EDIT: also, got this when I tried again: (the exact combination of bad requests and good requests is unclear, I just alternate till I get the segfault)
At the time of the crash I was using this plumber file: plumber.R
EDIT: updated (and smaller) session_info: session_info
|
I haven't seen this error before, so I don't know if it's related to httpuv. The fact that those errors are different each time suggests that there's some memory corruption going on.
Has this server been upgraded from older versions of R, or is it a fresh server? If it has been upgraded, it's possible that some old packages were compiled against an old version of R and are not compatible. If you run:
pkgs <- as.data.frame(installed.packages(), stringsAsFactors = FALSE, row.names = FALSE)
pkgs[, c("Package", "Version", "Built")]
The Built column should ideally be 3.4.4 for all packages, or at least 3.4.0. You can update the packages with:
update.packages(ask = FALSE, checkBuilt = TRUE)
See http://shiny.rstudio.com/articles/upgrade-R.html for more information.
@shapenaji If you're feeling bold, you can try running your code in this Docker image: https://hub.docker.com/r/wch1/r-debug/ The Docker image contains a bunch of builds of R that help catch memory problems. Running under Another option is to run R under gdb and see if that helps catch the problem.
If you can provide very simple instructions for reproducing the problem, I'd appreciate it! |
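For anyone following along, the Built check suggested above can be narrowed to just the packages that need attention. This is only a sketch: the `min_built` threshold and the `stale` name are assumptions for illustration, with 3.4.0 taken from the "at least 3.4.0" advice in this thread.

```r
# List installed packages along with the R version each one was built under.
pkgs <- as.data.frame(installed.packages(), stringsAsFactors = FALSE)
rownames(pkgs) <- NULL

# Assumed threshold: the thread suggests every package be built under R >= 3.4.0.
min_built <- package_version("3.4.0")

# Keep only the packages built under an older R; these are candidates for reinstalling.
is_stale <- package_version(pkgs$Built) < min_built
stale <- pkgs[is_stale, c("Package", "Version", "Built")]

print(stale)
```

If `stale` has zero rows, mismatched builds are probably not the cause and the memory-debugging route (the r-debug image or gdb) is the next thing to try.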
Server is reasonably fresh, all are on 3.4.0 or higher: I'll upgrade all the same and try again (current versions below) packageVersions
Will try running it off that container, and then under gdb |
@wch Seems to be specifically an R 3.4.4 issue. I cannot reproduce using R 3.5, but as soon as I reinstalled R 3.4.4 I got the problem again. Inside the container I only get 3.5, so I was not able to reproduce with the given tools. Going to try with gdb and R 3.4.4.
In-depth details:
Host:
Outside of container:
plumber.R
Client:
Using two URL files (if you want, you can replace the /test1/ endpoint and just use a port, but in my network I need to use port 80):
minigood:
minibad:
siegerc:
siege commands (I alternate between these):
Extra notes: with RDcsan, plumber doesn't actually start:
|
R 3.4.4 with gdb
Backtrace:
could be triggered by cancelling the siege... seems to happen as soon as I hit CTRL+C |
The R 3.5 release notes appear to list a large number of changes to buffered connections. Related? |
@shapenaji and I worked on this for a while, but we haven't been able to come up with an easily reproducible test case. However, I'm quite sure that I've found the source of the segmentation fault: Here's a walkthrough of what this looks like for us in gdb. Note that I've compiled R 3.4.4 (which is the latest version for Ubuntu) with debug symbols. I'm running an extremely basic I've also set a breakpoint on
I think After a while:
Eventually there's a broken socket, but for some reason the signal ends up on thread 2, as you can see above. Further, the
Since R is not thread-safe, and
Where the segmentation fault occurs seems to be a bit random; often it is in the line above, but we saw it in several other places. It probably just depends on where the main thread happens to be when SIGPIPE shows up. I can provide full backtrace output but I think it's irrelevant in this case. Does this information help? Are we doing something wrong configuring |
@atheriel Thanks for the detailed investigation! httpuv starts another thread when a server is started (which is what plumber does). A httpuv application may also use the In the stack traces you provided, thread 2 was running httpuv code. If the third thread is running code from later, it will look something like this:
FWIW, in a clean R session, you can cause later to start a new thread with the following: later(function() cat("hello\n")) I'm not sure at this point exactly how the thread and signal handling is working, but if it is the case that the SIGPIPE is caused by something that happens on thread 2 (httpuv), and then it causes that thread to execute internal R code, that could trigger the kind of problems you've seen. R is not thread-safe, so we have tried very hard in httpuv to not accidentally execute internal R code on a background thread. But a signal handler could conceivably bypass the safeguards we've put in place to avoid these problems. |
In my debugging it was always the second thread (the one executing It might be possible to prevent the threads started by |
@atheriel You're right: previously, httpuv did ignore SIGPIPE (rstudio/httpuv@fe95d0d8). It's possible this workaround got lost when we did a big overhaul of httpuv's internals. |
Remaining actions:
|
remaining actions moved to PR above. Still waiting on |
Merged to master. Currently, httpuv is still a remote. @atheriel Sorry for the delay, I thought httpuv would be released sooner. |
Is there a direct way to resolve the segmentation fault? I have tried subsetting the GTF file, but it throws the same error that @shapenaji reported. |
@arpankbasak If you're encountering a similar problem, please file a new issue, preferably with a reproducible example. |
Migrating from here:
https://community.rstudio.com/t/stressing-the-plumber/12327/4
I can reproduce a segfault by sending bad requests to a plumber API.
API Host:
Ubuntu 16.04 running on Azure
4 vCPUs
64 GB RAM
Steps to reproduce:
Set up a plumber API on port XYZ
Siege the API with bad requests
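A minimal sketch of how a URL file for the second step could be generated from R. The host, port, path pattern, and file location here are all hypothetical placeholders, not the values used in this report; the only assumption about siege is that it can read one target URL per line from a file passed with -f.

```r
# Hypothetical host and port; substitute wherever the plumber API is listening.
base_url <- "http://127.0.0.1:8000"

# Paths the API is assumed not to define, so each request should come back as a 404.
bad_paths <- sprintf("/no-such-endpoint-%03d", 1:50)
bad_urls <- paste0(base_url, bad_paths)

# Write one URL per line so the file can be handed to siege with -f.
url_file <- file.path(tempdir(), "bad_urls.txt")
writeLines(bad_urls, url_file)

cat("Wrote", length(bad_urls), "URLs to", url_file, "\n")
```

A matching file of valid URLs can be built the same way, which makes it easier to alternate good and bad runs until the crash shows up.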
Message:
Traceback