Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Have a global signalling mechanism for request/response lifecycle #1348

Closed
untitaker opened this issue Sep 6, 2019 · 12 comments
Closed

Have a global signalling mechanism for request/response lifecycle #1348

untitaker opened this issue Sep 6, 2019 · 12 comments

Comments

@untitaker
Copy link

untitaker commented Sep 6, 2019

Core Django, Flask and some other web frameworks have a way to register a callback whenever a request is handled or a response is sent out. This registry exists globally and independently of application state. It's invaluable to monitoring tools like Newrelic and Sentry, as their SDKs/Agents can just subscribe to signals at any point in time without requiring the user to register a middleware or really do any other kind of configuration (example). For frameworks/libraries where such signals don't exist, there is still the possibility of monkeypatching (example). I am affiliated with Sentry, as I work on the Sentry SDK for Python.

Channels offers no possibility to safely hook into the request/response lifecycle of the ASGI application it exports. Like most frameworks it has no global signals, but the routing mechanism is left up to the user in a way that makes it very hard to wrap the outermost ASGI application even through monkeypatching. Django also allows the user to wrap the Django-exported WSGI application in arbitrarily many layers of WSGI middleware that we cannot instrument, but Channels makes it a common pattern for the user to use ASGI middleware while it's uncommon enough in Django to ignore the issue there.

What I would like to see is the following:

  1. Modify channels.routing.get_default_application() to wrap the discovered ASGI application in a middleware ChannelsApp that does nothing but forward to the inner application. This already allows us to monkeypatch to achieve our goal.
  2. (optional) Add new global signals that are emitted by this middleware. We are fine with using internal APIs to instrument channels, but official APIs are always better.

Even just the first change enables Sentry and similar tools to hook into channels before Django settings have been loaded. This point is crucial because:

  • Sentry's SDK is initialized as part of loading settings in the first place
  • NewRelic's Agent attempts to register signals and monkeypatch shortly after process startup

The status-quo solution to this problem is to do one of the following:

  • Patch AsgiHandler exclusively if you only care about HTTP

  • Patch ProtocolTypeRouter under the assumption that most people use this type to wrap everything else

  • Use some sort of trampoline to look into Django settings to discover ASGI_APPLICATION, and then patch that object. This opens two more choices:

    • Patch Django to know when settings are loaded
    • Do it on the first request where you can be sure that settings are loaded

    This is a bit involved because ASGI_APPLICATION can be any kind of object that might be callable without providing you a settable __call__. For example it might be a primitive function, in which case patching func_code/__code__ might do the trick.

Would love to know what you think about this. I am happy to prepare the patch for this.

Django's upcoming ASGI support seems to do much better in this regard, as there only ASGIHandler needs to be patched.

@carltongibson
Copy link
Member

Hmmm. My initial thought here is, can’t you already just wrap the ASGI app in a Sentry provided app/middleware (which in ASGI is the same thing really)?

@untitaker
Copy link
Author

@carltongibson no serious apm tool gives the user something to configure, as sooner or later it has to monkeypatch anyway for things that cannot be instrumented through official APIs.

From my op:

their SDKs/Agents can just subscribe to signals at any point in time without requiring the user to register a middleware or really do any other kind of configuration

@carltongibson
Copy link
Member

I’m not sure I quite follow you. (Surely I have to configure something...)

Why not just say “Edit asgi.py to wrap the default application in the Sentry middleware, and you’re done! 💃”?

(Just trying to understand what you’re proposing.)

@untitaker
Copy link
Author

@carltongibson Take a look here and here to see what is actually configured, and take a look at the code I linked in the OP to see how Django is then instrumented.

Ideally things "just work" by calling init() so we don't have to document a 12-step process to get sentry up and running (and then that process might depend on the framework's version too). Following this dogma has worked quite well so far. The inverse extreme of this would be to let the user hook everything explicitly, which puts the maintenance burden of accessing potentially undocumented APIs on them.

Django's and Flask's global signalling mechanism are the kind of API I'd ultimatively like to see, so that's what I am proposing.

@untitaker
Copy link
Author

Note that channels already has a signals module that contains some definitions, but those signals seem to be never fired.

@hishnash
Copy link
Contributor

@carltongibson wrapping in a sentry middleware does not help that much since scope is immutable knowing what application the request was routed to is hard.

as an example here is the middleware I have used: https://gist.github.com/hishnash/4e44f2b617a67379cfc63129e141f9c1

this groups issues by the URL (that is not optimal since 2 most of our urls have variables within the path)

to enable grouping on a per-handler (like sentry does for vanilla Django) we need to either wrap each consumer or Patch/Subclass the PathRouter middleware.

this makes adding sentry to an application a big code change in comparison to adding sentry to vanilla Django were you just init sentry and it hooks into Django to capture info but you do not need to modify any other code (very useful if you only want to have sentry in the loop in some deployments but not others)

@carltongibson
Copy link
Member

Yeah... At this point, I'm still not sure what to say — I haven't had the time to follow the various links and internalise all that, so...?

I'd suggest the best way forward would be a PoC PR saying, "This is what it would look like, and these are the benefits it would bring".

If the footprint were small enough (relative the gain) why not?

(It's not that I'm anti any proposal here, but rather I'm not yet sure what's involved.)

@untitaker
Copy link
Author

@carltongibson this would be the minimal patch:

diff --git a/channels/routing.py b/channels/routing.py
index 62d7bd5..a7bd06d 100644
--- a/channels/routing.py
+++ b/channels/routing.py
@@ -39,7 +39,17 @@ def get_default_application():
         raise ImproperlyConfigured(
             "Cannot find %r in ASGI_APPLICATION module %s" % (name, path)
         )
-    return value
+    return DefaultApplication(value)
+
+
+class DefaultApplication:
+    __slots__ = ("inner",)
+
+    def __init__(self, inner):
+        self.inner
+
+    def __call__(self, scope):
+        return self.inner(scope)
 
 
 class ProtocolTypeRouter:

With that alone all the APM tools can monkeypatch DefaultApplication.__call__ to read request data. This is absolutely sufficient for me.

The nicer solution would be to emit some signals in DefaultApplication that those tools can subscribe to.

@carltongibson
Copy link
Member

OK, but then this is exactly what I first suggested, except I'd have you do this:

# project/asgi.py

...

application = SentryWrapper(get_default_application())

Which is clear and explicit, and doesn't need any monkey patching.

I don't then understand this comment at all:

@carltongibson wrapping in a sentry middleware does not help that much since scope is immutable knowing what application the request was routed to is hard.

What am I missing here?

@untitaker
Copy link
Author

except I'd have you do this

That's the difference I care about. This is the entire premise of this issue:

It's invaluable to monitoring tools like Newrelic and Sentry, as their SDKs/Agents can just subscribe to signals at any point in time without requiring the user to register a middleware or really do any other kind of configuration

I believe @hishnash's issue is unrelated.

@carltongibson
Copy link
Member

OK, I understand.

In that case, no, this isn't something I'd want to add. Given the choice between explicitly wrapping the ASGI application or opaquely monkey patching in, I would always opt for the former. I think we do no-one any real favours otherwise.

Explicit beats implicit, and ASGI's one obvious way to do things is wrapping applications in this manner.

Sorry.

@untitaker
Copy link
Author

The third choice is to have global signals like Flask. I mentioned this possibility four times now. Honestly it's frustrating to engage with someone who doesn't bother to read anything that I write (and doesn't ask for clarification either in case my comments were too vague).

Be that as it may, Django 3 will have the signals we need, so I guess time will solve this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants