Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ASGI Support #162

Closed
tomchristie opened this issue Nov 1, 2018 · 19 comments
Closed

ASGI Support #162

tomchristie opened this issue Nov 1, 2018 · 19 comments
Labels
Enhancement New feature or request Help wanted Extra attention is needed New Integration Integrating with a new framework or library

Comments

@tomchristie
Copy link

Hi folks. I'm considering tackling an ASGI integration, that'd bring support to Starlette, Responder, Quart.

Because we'd need to monkey-path a generic application __call__ method, rather than a known class, you'd need to pass the app instance to the integration instantiation.

import sentry_sdk
from sentry_sdk.integrations.asgi import ASGIIntegration

app = Starlette()

sentry_sdk.init(
    dsn="https://<key>@sentry.io/<project>",
    integrations=[ASGIIntegration(app)]
)

Alternatively, you could design a generic ASGI implementation, and just have very light individual framework-specific integrations that just plugged that into starlette's Starlette.__call__, responder's API.__call__, etc...

When handled properly ASGI exceptions are still propagated out of the stack, even when a technical 500 page has been sent by the framework, so I think that along with logging there's plenty of generic context that can be captured here.

I guess my initial questions would be:

  • Should I be thinking of this in terms of a third-party library, or looking to build it in for the official sentry-python package as an integration?
  • What important context am I likely to be missing out on by working at this level, rather than at a framework specific level? (It's possible that we might get some loose patterns around where things like user state or routed endpoint should get kept in the ASGI scope dictionary, so it's feasible we might be able to capture some of that, as well as the basic request info)
@untitaker
Copy link
Member

untitaker commented Nov 1, 2018

We haven't figured out how to expose the existing WSGI integration yet, because right now all integrations monkeypatch whichever framework they are integrating with globally. This is strongly preferred over passing app objects to the SDK (initialization order suddenly doesn't matter anymore), but it's obv. not possible for pure WSGI apps. I think the best way would be to export a middleware without using the integrations interface at all, which operates on Hub.current. I guess all of this applies to ASGI.

What important context am I likely to be missing out on by working at this level, rather than at a framework specific level?

I am not sure how you get user info into the ASGI scope, but if capturing this stuff works for the most popular frameworks I think it'd be fine.

We generally try to capture request data in the same way the framework parsed it, not how it actually looks. E.g. if there's a framework that decoded the request body (Django Rest Framework), we should try to use that decoded thing and not try to parse it on our own. The two major advantages we see here are:

  • Frameworks are usually seeing structured data more often than the SDK would see at the WSGI/ASGI level. This allows for prettier formatting in Sentry's UI, better payload truncation, and maybe also PII stripping sometime.
  • Performance: decode once vs twice

When handled properly ASGI exceptions are still propagated out of the stack

This is interesting, because for WSGI we usually would loose any exception information by the time the error page has been rendered. This is why we currently have framework-specific integrations and don't see exposing the WSGI middleware as a big priority. Not having 500s lead to an error out of the box is kind of a dealbreaker for most people.

Should I be thinking of this in terms of a third-party library, or looking to build it in for the official sentry-python package as an integration?

My personal take on this is: We're concerned about having to maintain too many integrations by ourselves, at the same time separating into too many packages will degrade UX, especially given the state of Python's packaging.

Right now I am mostly interested in a PoC because I don't quite understand how you would have a usable framework-agnostic integration with an ASGI middleware that is able to capture 500s properly.

That said I haven't looked into that ecosystem very much. It would be nice to have more async frameworks supported just for testing if the core client is actually usable in those environments. At the same time there's not a ton of users on asyncio it seems

@tomchristie
Copy link
Author

I think the best way would be to export a middleware without using the integrations interface at all, which operates on Hub.current. I guess all of this applies to ASGI.

Still getting to grips with the terminology and usage, but as I understand it, it'd be a sensible approach to create a new hub for each individual request/response cycle right? Or not? I can see from the Sanic case that 3.7's contextvars support is needed, which enables sentry to tie otherwise separate events such as logging back into the same context as the main exception recording.

Right now I am mostly interested in a PoC

Indeed. Based on this convo, I think that implementing a (third party) ASGI middleware is the approach I'd take for demonstrating this initially.

At the same time there's not a ton of users on asyncio it seems

Yeah no doubt. We'll need to see how things mature, but there's certainly lots of work in this area. One other thing that will start to happen is ASGI frameworks that also support regular sync views that run in a threadpool. That means you get a ton of ASGI benefits such as WebSockets, SSE, HTTP/2 server push, managed background/timer/clock tasks, while still working with standard thread-concurrency ORMs, and be able to drop into async views in cases where it's really necessary. If existing WSGI frameworks are able to make the transition that'd be a big deal too.

@tomchristie
Copy link
Author

tomchristie commented Nov 2, 2018

Oh right, use the current hub but push a new scope onto it. (And make sure it's Python 3.7+ because otherwise it's not going to be properly task-context-aware anyways)

@untitaker
Copy link
Member

Wrt hubs see #147, you need a new hub in asyncio for now

@tomchristie
Copy link
Author

I have this... https://github.com/encode/sentry-asgi

That works great for me when capturing exceptions from Starlette.

One bit I haven't been able to make work is the integration with using capture_message inside the codebase (You'll see I have a test cases there marked as expected failure.)

For whatever reason my registered event processor doesn't appear to fire in that case.

@untitaker
Copy link
Member

untitaker commented Nov 6, 2018 via email

@tomchristie
Copy link
Author

I would prefer that the middleware doesn't call the SDK's init though.

Done.

Feel free to add a link to this page

Sure. I'll wait 'til we've thrashed things out just a little more.

I'll take a look at the failing test sometime this week

Okay. Happy to collaborate there or whatever.

@untitaker
Copy link
Member

untitaker commented Nov 7, 2018 via email

@tomchristie
Copy link
Author

Apparently the request handler of your starlette app is running in a different thread than the middleware.

Gotcha yes. So this'll be a common pattern in ASGI frameworks - synchronous views ought to run within a threadpool executor. Having updated the test to use an async view I have everything passing.

Does the Sentry SDK have some indication of "child" contexts? Eg. suppose I was creating a sub-task rather than a thread, do events within that get attached to the same Hub as the original task? Is there some generic way that we can indicate to the context that the threadpool executor is a child context of the main thread?

@tomchristie
Copy link
Author

@tomchristie
Copy link
Author

@untitaker
Copy link
Member

Sentry has no concept of tasks, it just uses contextvars. The child context has to be created explicitly, basically the with Hub(old_hub) you did.

You can do the copy_context thing as well, but you need to create a new hub for each request anyway.

I would attach the new hub to the asgi scope or wherever, and do with myhub in the same thread as the request handler

@tomchristie
Copy link
Author

Righty, all resolved: https://github.com/encode/sentry-asgi

I'll actually switch things around at some point so that this is handled by the framework automatically, and propose the fix to other frameworks.

They won't need to overriding the default executor, but they should ensure that submitted jobs run within the correct context.

I would attach the new hub to the asgi scope or wherever

Really it's a general issue. ThreadPools really ought to support running within the same context, but until that becomes a language feature it'll remain an implementation issue for "This is the pattern that async frameworks ought to use when spinning sync work off to threads".

Related: The Sentry SDK could bring async context support to earlier Python versions, with https://github.com/MagicStack/contextvars in order to support 3.5, 3.6 properly.

@untitaker
Copy link
Member

untitaker commented Nov 7, 2018

Sounds good! Another thing I noticed is that send_default_pii is not honored. Concretely we don't send IP addresses, cookies, etc (see wsgi code)

@tomchristie
Copy link
Author

Okay, the latest release of sentry_asgi leaves it up to the framework to handle the contextvars support correctly.

The middleware will still work on 3.5/3.6, but obviously won't properly tie in logging etc... to the request context.

ContextVars issue resolved in Starlette: encode/starlette#192
Proposed in Responder: kennethreitz/responder#211
Noted against ASGI repo: django/asgiref#71

Another thing I noticed is that send_default_pii is not honored. Concretely we don't send IP addresses, cookies, etc (see wsgi code)

Gotcha. I'll add an issue for that.

@untitaker
Copy link
Member

Cool stuff! I'm going to leave this issue open until we at least link to the ASGI middleware in the docs.

@untitaker untitaker added Enhancement New feature or request Help wanted Extra attention is needed New Integration Integrating with a new framework or library labels Nov 10, 2018
@untitaker
Copy link
Member

@tomchristie So this came up again in the context of https://stackoverflow.com/questions/53541652/asgi-sentry-errors-appear-to-show-mixed-tracebacks-from-different-requests. I wonder if we can figure out how to automatically enable the ASGI middleware for frameworks that optionally use ASGI and we already support. Not sure how to hook things up without mashing everything into one package.

@tomchristie
Copy link
Author

@untitaker I don't know anything about how existing Django channels projects are typically configured with Sentry.

I've issues this PR to asgiref (a channels dependency), which will ensure that contextvar is properly supported. It's possible that would be sufficient to resolve the S/O issue. (Or else they need both that to land in channels, and to be using this ASGI Sentry support, instead of however the existing integration is setup)

@untitaker
Copy link
Member

0.10.2 adds an ASGI middleware: https://docs.sentry.io/platforms/python/asgi/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement New feature or request Help wanted Extra attention is needed New Integration Integrating with a new framework or library
Projects
None yet
Development

No branches or pull requests

2 participants