This repository has been archived by the owner on Oct 23, 2023. It is now read-only.

Proposal: Log SystemExit in wsgi wrapper #675

Closed
mattrobenolt opened this issue Oct 27, 2015 · 11 comments

Comments

@mattrobenolt
Contributor

When using gunicorn, at least, and a worker times out, gunicorn sends a SIGABRT signal to the worker process. This manifests itself as a SystemExit exception that gets raised in the worker.

When this happens in Django, Django completely ignores SystemExit and lets it re-raise so the process can exit, without ever firing the got_request_exception signal. In our case, we never see the exception, and nobody knows anything happened since the process exited without a fuss.

My proposal would be to explicitly catch SystemExit, log it, then re-raise.

imo this seems like a reasonable thing to do within a middleware. If something is trying to shut down the process in the middle of a request, that's probably worth logging.

In gunicorn, this wouldn't log for a normal shutdown, since that sends a SIGTERM instead of SIGABRT.

Would need to be tested against other WSGI runners, specifically uwsgi.

Refs:
https://github.com/django/django/blob/37ea3cb03e80de80380009a7a7939bc48d75abe9/django/core/handlers/base.py#L223-L225
https://github.com/benoitc/gunicorn/blob/e1912db59a2f1df36387dd28bb3515202cc03098/gunicorn/arbiter.py#L443
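For anyone unfamiliar with the mechanism: a Python-level signal handler that calls sys.exit() raises SystemExit inside whatever frame the worker happens to be executing, which is how gunicorn's handle_abort interrupts a request mid-flight. A minimal self-contained demonstration of that behavior (POSIX only; this simulates the arbiter by signaling our own process):

```python
import os
import signal
import sys

def handle_abort(signum, frame):
    # Mirrors what gunicorn's worker does on SIGABRT: sys.exit() raises
    # SystemExit, which then propagates through the interrupted frame.
    sys.exit(1)

signal.signal(signal.SIGABRT, handle_abort)

try:
    os.kill(os.getpid(), signal.SIGABRT)  # simulate the arbiter's kill
except SystemExit as e:
    print("SystemExit surfaced in-line, code:", e.code)
```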

@dcramer
Member

dcramer commented Oct 27, 2015

this sounds like noise to me -- how is it different from things like the client disconnecting before the response finishes streaming?

@mattrobenolt
Contributor Author

That shouldn't trigger a SystemExit.

This timeout is specifically when gunicorn kills the worker because it was taking too long to respond. Maybe a busy loop or something like that. So this goes unnoticed.

If the client disconnects, it should be an IOError or something akin to that. I'd have to investigate exactly what happens in that case a bit more. But it's unlikely that it's a SystemExit since that's telling the process to die.

@dcramer
Member

dcramer commented Oct 27, 2015

The problem is that SystemExit is not a generic error. It'd be akin to us logging an error whenever you called sys.exit(1), which doesn't make much sense. We have the syshook handlers, which capture things that bubble up (and a raised SystemExit should), so I'm not sure what case this is specifically trying to cater to.

@mattrobenolt
Contributor Author

For posterity:

# timeout.py
from time import sleep

def application(env, start_response):
    try:
        sleep(5)  # longer than gunicorn's timeout, so the worker gets aborted
    except BaseException:
        # A plain "except Exception" would miss SystemExit entirely.
        import traceback
        traceback.print_exc()
        raise
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'done\n']

Run with gunicorn timeout:application -t 1

@mattrobenolt
Contributor Author

It'd be akin to us logging an error whenever you called sys.exit(1), which doesn't make much sense

I think this is different since it's happening within the wsgi middleware, so the sys.exit() would have to have been called at some point during the request. Which, imo, is probably worth logging, but I could be wrong. Not sure when someone would ever legitimately do that.

We have the syshook handlers which capture things that get bubbled up (and raise SystemExit should)

I'll investigate this and see if that's triggered. I forgot about that.

so I'm not sure what case this is specifically trying to cater towards.

The specific case is a view in your app that times out: the process gets killed without you ever knowing about it since nothing is logged. This would log the exception and where the process was killed from.

For example, the stacktrace looks like this:

[2015-10-27 00:38:40 -0700] [13684] [CRITICAL] WORKER TIMEOUT (pid:13687)
Traceback (most recent call last):
  File ".../timeout.py", line 5, in application
    sleep(5)
  File ".../gunicorn/workers/base.py", line 159, in handle_abort
    sys.exit(1)
SystemExit: 1

@dcramer
Member

dcramer commented Oct 27, 2015

Oh, is this specifically because we ignore that error today?

I personally don't like catering to gunicorn, but it's probably fine to capture all in the middleware

@mattrobenolt
Contributor Author

We don't ignore it specifically, but it's not caught by except Exception since SystemExit stems from BaseException, and Django explicitly ignores it, so it's never caught by its signal handler.

We specifically miss it here: https://github.com/getsentry/raven-python/blob/master/raven/middleware.py#L36, https://github.com/getsentry/raven-python/blob/master/raven/middleware.py#L43, and https://github.com/getsentry/raven-python/blob/master/raven/middleware.py#L52 since it's not an Exception.
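The class hierarchy is the crux here; a quick illustrative check confirms that an "except Exception" clause can never see SystemExit:

```python
# SystemExit inherits from BaseException, not Exception, so the usual
# "except Exception" catch-all never sees it.
print(issubclass(SystemExit, Exception))      # False
print(issubclass(SystemExit, BaseException))  # True

try:
    raise SystemExit(1)
except Exception:
    caught = "Exception"
except BaseException:
    caught = "BaseException"
print(caught)  # BaseException
```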

My proposal would be to basically add:

except Exception:
    self.handle_exception(environ)
except SystemExit:
    self.handle_exception(environ)
    raise

@dcramer
Member

dcramer commented Oct 27, 2015

except (Exception, SystemExit), but that's a minor detail

You're sure other processes don't cause it to raise SystemExit when shutting down? Otherwise seems fine.

@mattrobenolt
Contributor Author

Or actually, probably just:

except (Exception, SystemExit):
    self.handle_exception(environ)
    raise
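Put together, a standalone sketch of what the wrapper would look like (report_exception here is an illustrative stand-in, not raven's actual API):

```python
import sys

def report_exception(environ):
    # Stand-in reporter: a real client would ship sys.exc_info() to Sentry.
    exc_type, exc_value, _ = sys.exc_info()
    print("captured %s: %r" % (exc_type.__name__, exc_value))

class Sentry(object):
    """WSGI wrapper that reports SystemExit before letting it propagate."""

    def __init__(self, application):
        self.application = application

    def __call__(self, environ, start_response):
        try:
            return self.application(environ, start_response)
        except (Exception, SystemExit):
            report_exception(environ)
            raise  # re-raise so the worker still dies (or the error propagates)
```

The re-raise is the important part: the middleware only observes the shutdown, it must not swallow it.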

@mattrobenolt
Contributor Author

Ah, so on further investigation, we can narrow this down a bit:

A normal shutdown of gunicorn from SIGINT sends sys.exit(0) to the worker to exit cleanly, but sys.exit(1) when being killed.

Gonna test against uwsgi to see what this does.
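That exit-code distinction suggests the narrowing: only report a SystemExit whose code signals failure. A sketch of that filter (should_report is a hypothetical helper, not anything in raven):

```python
def should_report(exc):
    """Decide whether an exception caught in the WSGI wrapper is worth logging.

    SystemExit.code may be None or 0 (clean shutdown, e.g. gunicorn on
    SIGINT) or nonzero (failure, e.g. gunicorn aborting a timed-out worker).
    """
    if isinstance(exc, SystemExit):
        return bool(exc.code)
    return isinstance(exc, Exception)

print(should_report(SystemExit(0)))    # False -- clean shutdown, ignore
print(should_report(SystemExit(1)))    # True  -- worker killed, report
print(should_report(ValueError("x")))  # True  -- ordinary error, report
```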

mattrobenolt added a commit that referenced this issue Nov 13, 2015
SystemExit does not subclass Exception, so we don't catch this.
SystemExit manifests itself (at least) from gunicorn when gunicorn
attempts to gracefully shut down a worker process. See GH-675 for more
background.

Fixes: GH-675
@ThunderTrent

I'm having this exact issue: I'm seeing SystemExit(1) errors in Sentry and can't trace what is actually going on.
