Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix parsing of Django path patterns #2452

Merged
merged 18 commits into from
Oct 25, 2023
Merged

Fix parsing of Django path patterns #2452

merged 18 commits into from
Oct 25, 2023

Conversation

sentrivana
Copy link
Contributor

@sentrivana sentrivana commented Oct 17, 2023

TL;DR: Our way of parsing named group regexes out of Django URL patterns is not ideal. If someone is using (Django 2.0+) <converter:parameter> path patterns, we can parse those directly without turning them into regexes first and potentially hitting our limitations there.

Background

When instrumenting Django for performance monitoring, the Sentry Python SDK will by default name the transactions based on the URL pattern rule that was used to serve the request.

There are two main ways of creating a URL pattern in Django: by using regexes in the rule directly, or (since Django 2.0) by using the custom <converter:parameter> syntax, like so:

urlpatterns = [
    path('articles/<int:year>/<int:month>/', views.month_archive),
]

These <converter:parameter> style patterns are also regexes in the background. Each of the converters defines its own regex (see e.g. the IntConverter) that is then used for matching URLs to the patterns.

The SDK is parsing both types of patterns with the aim of replacing all named groups with placeholders, so that e.g. the URL pattern above creates transactions with the name articles/{year}/{month}. To do this, the URL pattern is first turned into its regex form and then parsed with the RavenResolver.

This is not a problem with the default set of converters Django offers, but if folks define their own custom converters whose regexes don't play nice with our RavenResolver, we fail to extract the transaction name correctly. This can be prevented by parsing the original, simpler URL pattern instead without turning it into its regex form.

Note that this doesn't solve all transaction name problems as re_paths are still vulnerable, but it should make new-style paths work.

Closes #2446

@sentrivana sentrivana changed the title Django resolver experiments Fix parsing of new-style Django path patterns Oct 24, 2023
@sentrivana sentrivana changed the title Fix parsing of new-style Django path patterns Fix parsing of Django path patterns Oct 24, 2023
@sentrivana sentrivana marked this pull request as ready for review October 24, 2023 14:09
@sentrivana sentrivana marked this pull request as draft October 24, 2023 14:41
@sentrivana sentrivana marked this pull request as ready for review October 24, 2023 15:00
Copy link
Member

@sl0thentr0py sl0thentr0py left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice work! don't really know how the resolver works in detail but looks ok high level

sentry_sdk/integrations/django/transactions.py Outdated Show resolved Hide resolved
@sentrivana sentrivana enabled auto-merge (squash) October 25, 2023 14:38
@sentrivana sentrivana merged commit 0ce9021 into master Oct 25, 2023
284 checks passed
@sentrivana sentrivana deleted the ivana/django-matcher branch October 25, 2023 14:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Django URL patterns broken in 1.32.0
2 participants