Use OTel for performance instrumentation (PoC) #2272

sentrivana · 2023-07-26T12:26:01Z

This PR powers our performance monitoring with OTel instead of our custom Sentry instrumentation. Note that this is a proof of concept that we may or may not end up using, depending on how well it addresses the issues we've identified with our OTel compatibility.

See the associated issue for more details on the motivation behind this.

On OTel's `opentelemetry-instrument`

As the goal was to make this work automatically without requiring the user to set anything up, the autoinstrumentation builds on what the official opentelemetry-instrument tool does, but without having to actually use it to run a program (opentelemetry-instrument python app.py).

To make autoinstrumentation work, opentelemetry-instrument uses the built-in site mechanism to run the instrumentation code before anything else. In simplified terms, site is a Python module loaded at interpreter start that looks for a file called sitecustomize.py in specific locations and, if it finds one, loads and runs it before any user code. sitecustomize.py is usually meant to be put in site-packages. opentelemetry-instrument works around this by first doing some path manipulation and then execl-ing, which forces site to pick up its own sitecustomize.py, which in turn does the autoinstrumentation. Ensuring that the autoinstrumentation runs before any user code is handy because by the time user code executes, everything necessary has already been monkeypatched in the background.
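The site mechanism described above can be tried out in isolation. The following is a minimal sketch, not what opentelemetry-instrument actually runs: it writes a throwaway sitecustomize.py into a temporary directory, puts that directory on PYTHONPATH, and starts a fresh interpreter with subprocess (the real tool uses execl instead). The helper name `run_with_sitecustomize` and the printed strings are made up for illustration.

```python
import os
import subprocess
import sys
import tempfile


def run_with_sitecustomize(user_code):
    """Run user_code in a fresh interpreter that finds our sitecustomize.py first."""
    with tempfile.TemporaryDirectory() as tmp:
        # The file must be called sitecustomize.py for site to import it.
        with open(os.path.join(tmp, "sitecustomize.py"), "w") as f:
            f.write('print("sitecustomize ran")\n')

        env = dict(os.environ)
        existing = env.get("PYTHONPATH")
        # Prepend our directory so it shadows any other sitecustomize module.
        env["PYTHONPATH"] = tmp + os.pathsep + existing if existing else tmp

        result = subprocess.run(
            [sys.executable, "-c", user_code],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout


if __name__ == "__main__":
    print(run_with_sitecustomize('print("user code ran")'))
```

The line printed by sitecustomize.py shows up before the line printed by the user code, which is the ordering guarantee the autoinstrumentation relies on.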

We can't use the sitecustomize.py hack unless we also want to execl the user program, which seems dangerous. This means we need to require folks to sentry_sdk.init() before importing any class that's supposed to get instrumented. This PR tries to work around that by looking at sys.modules after the autoinstrumentation has finished patching, searching for unpatched classes, and patching them.

Differences to `opentelemetry-instrument`

We're doing the same things as OTel does for autoinstrumenting, with some differences.

The autoinstrumentation itself is taken from here. We do the same except for calling load_configurators(), which as far as I can tell does this in the background. OTel describes configurators as follows: "Configurators are used to configure SDKs (i.e. TracerProvider, MeterProvider, Processors...) to reduce the amount of manual configuration required." The autoinstrumentation seems to work fine without this, and we do some minimal configuration ourselves.

We also don't do any PYTHONPATH manipulations since these seem to only have to do with making the sitecustomize trick described above work and are then reverted right away.

Closes #2251

sentry_sdk/integrations/opentelemetry/integration.py

sl0thentr0py · 2023-08-10T11:15:33Z

+    inheriting from it). In those cases it's still necessary to sentry_sdk.init()
+    before importing anything that's supposed to be instrumented.
+    """
+    for module_name, module in sys.modules.copy().items():


I don't understand the outermost loop here. Why do we need to go through sys.modules and then original_classes in each of them? Can you explain with a simple flask example?

sentrivana · 2023-08-10

So imagine a user has something like this:

```python
# app.py
from flask import Flask
import sentry_sdk

sentry_sdk.init(...)  # potel enabled

app = Flask(__name__)
```

The OTel autoinstrumentation runs during sentry_sdk.init(), patching the Flask class with its own _InstrumentedFlask, but since the user imported Flask before that, the name Flask in app.py still refers to the old unpatched class, and app is an instance of it.

The loop goes through all modules loaded so far via sys.modules and does the following for each of the original classes:

* Tries reimporting the class to see whether what we import is instrumented. This checks whether the autoinstrumentation was successful in the first place, because if it wasn't, we don't want to patch any leftover unpatched classes. (This check needs to be moved out of this loop since it only needs to happen once per class; I will do that.)
* After having verified that the patching was successful, goes through the vars of each module in sys.modules and checks whether the original type is in scope, and if so, replaces it with the patched type.

So in the above example, it would find the app.py module in sys.modules, it would check it for occurrences of the original Flask type, and replace them with _InstrumentedFlask.
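To make the stale-reference problem and the post-patching loop concrete without needing Flask installed, here is a self-contained sketch that uses two fake modules. It is not the code from this PR: `post_patch`, `demo`, and the `potel_demo_*` module names are hypothetical, and the "autoinstrumentation" is simulated by swapping one attribute.

```python
import sys
import types


def post_patch(original, patched):
    """Rebind every module-level name that still points at `original`."""
    patched_modules = []
    for module_name, module in sys.modules.copy().items():
        try:
            namespace = vars(module)
        except Exception:
            continue  # e.g. None placeholders or unusual module objects
        for var_name, value in list(namespace.items()):
            if value is original:
                namespace[var_name] = patched
                patched_modules.append(module_name)
    return patched_modules


def demo():
    # Fake "framework" module exposing a class, like flask.Flask.
    framework = types.ModuleType("potel_demo_framework")
    framework.App = type("App", (), {})

    # Fake user module that did `from potel_demo_framework import App`
    # before any instrumentation happened.
    user_app = types.ModuleType("potel_demo_user_app")
    user_app.App = framework.App

    sys.modules[framework.__name__] = framework
    sys.modules[user_app.__name__] = user_app
    try:
        original = framework.App
        instrumented = type("_InstrumentedApp", (original,), {})

        # Stand-in for the autoinstrumentation: only the framework is patched.
        framework.App = instrumented
        stale_before = user_app.App is original

        patched_modules = post_patch(original, instrumented)
        fixed_after = user_app.App is instrumented
        return stale_before, fixed_after, patched_modules
    finally:
        sys.modules.pop(framework.__name__, None)
        sys.modules.pop(user_app.__name__, None)


if __name__ == "__main__":
    print(demo())
```

Note that this only rebinds module-level names. Instances created before patching, or subclasses of the original class, are not touched, which matches the limitation mentioned in the docstring under review.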

sl0thentr0py · 2023-08-17T16:10:24Z

ok just thinking out loud because this way we'll need to maintain the dict INSTRUMENTED_CLASSES and also it's kinda hard to reason about what patching is happening when.

The BaseInstrumentor class in otel (that all their instrumentors derive from) has the following:

* `instrument`
* `uninstrument`
* `_is_instrumented_by_opentelemetry` attribute

can we somehow make the logic a bit cleaner (and also possibly more general) by inspecting / triggering those methods ?

sentrivana · 2023-08-18T07:39:25Z

> ok just thinking out loud because this way we'll need to maintain the dict INSTRUMENTED_CLASSES and also it's kinda hard to reason about what patching is happening when.
>
> The BaseInstrumentor class in otel (that all their instrumentors derive from) has the following:
>
> * `instrument`
> * `uninstrument`
> * `_is_instrumented_by_opentelemetry` attribute
>
> can we somehow make the logic a bit cleaner (and also possibly more general) by inspecting / triggering those methods ?

This is a good point. I tried using those directly in the beginning but then dropped that approach in favor of this more automagic approach. This was at a time when I wasn't doing any post-patching, but with that now added on top, there are essentially two ways to patch the same thing, which is not great.

So my plan now:

* see if I can make the whole thing (auto patching + the post patching for things imported earlier) use the methods/attrs directly and make it more consistent
* if not, get rid of the post-patching entirely (meaning people will have to init sentry_sdk before importing anything that should be instrumented), but still see if we can do the initial patching by using the methods/attrs directly
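A rough sketch of what driving the patching through those methods/attrs could look like, so that there is a single code path instead of a hand-maintained dict. This uses a toy stand-in and not OTel's real BaseInstrumentor: `ToyInstrumentor`, `ensure_instrumented`, and `demo` are hypothetical names, and only the three member names (`instrument`, `uninstrument`, `_is_instrumented_by_opentelemetry`) are borrowed from the comment above.

```python
import types


class ToyInstrumentor:
    """Toy stand-in exposing the three members mentioned above."""

    _is_instrumented_by_opentelemetry = False

    def __init__(self, module, attr_name, patched):
        self._module = module
        self._attr_name = attr_name
        self._patched = patched
        self._original = None

    def instrument(self):
        if self._is_instrumented_by_opentelemetry:
            return  # already patched, nothing to do
        self._original = getattr(self._module, self._attr_name)
        setattr(self._module, self._attr_name, self._patched)
        self._is_instrumented_by_opentelemetry = True

    def uninstrument(self):
        if not self._is_instrumented_by_opentelemetry:
            return
        setattr(self._module, self._attr_name, self._original)
        self._is_instrumented_by_opentelemetry = False


def ensure_instrumented(instrumentors):
    """Instrument whatever isn't instrumented yet; return the names we touched."""
    newly_instrumented = []
    for name, instrumentor in instrumentors.items():
        if not instrumentor._is_instrumented_by_opentelemetry:
            instrumentor.instrument()
            newly_instrumented.append(name)
    return newly_instrumented


def demo():
    framework = types.SimpleNamespace(App=type("App", (), {}))
    original = framework.App
    patched = type("_InstrumentedApp", (original,), {})
    instrumentors = {"framework": ToyInstrumentor(framework, "App", patched)}

    first_pass = ensure_instrumented(instrumentors)
    second_pass = ensure_instrumented(instrumentors)  # idempotent
    is_patched = framework.App is patched

    instrumentors["framework"].uninstrument()
    is_restored = framework.App is original
    return first_pass, second_pass, is_patched, is_restored


if __name__ == "__main__":
    print(demo())
```

Checking the flag before calling `instrument` keeps repeated calls harmless, which is what would let the initial patching and any later pass share the same entry point.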

No need to reimport the _InstrumentedClass again to replace the old class. We already have a ref to the fully set up _InstrumentedClass from before.

sentrivana added 12 commits August 4, 2023 12:49

Sort integrations

e7452f1

Fix test

c6585a7

Fix flake8 warning

b8a0339

Add basic otel integration

f044604

mypy

9ca2add

mypy

628af21

add more otel instrumentation libraries

fcd44cf

work around import order

ee4d016

minor tweaks

7609159

naming

c402deb

rudimentary tests

1d5b6ad

add django

4c6e310

sentrivana force-pushed the ivana/performance-powered-by-otel branch from ce9dbf4 to 4c6e310 Compare August 4, 2023 10:50

sentrivana added 8 commits August 4, 2023 14:40

properly check if something has been instrumented

dc01799

add missing annotation

1629463

improve txn mapping

8eabfff

set op for txns too

91b65c8

tweak logger msgs

b748978

tweak text

2acbef6

remove debug

d22bb6a

remove extra newline

f71d43e

sentrivana marked this pull request as ready for review August 7, 2023 10:57

sentrivana requested a review from sl0thentr0py August 7, 2023 14:20

sentrivana added 3 commits August 8, 2023 14:52

add to Experiments

cd68f99

Merge branch 'master' into ivana/performance-powered-by-otel

686d9ef

Merge branch 'master' into ivana/performance-powered-by-otel

a6ac7f9

sl0thentr0py requested changes Aug 10, 2023

sentrivana added 3 commits August 10, 2023 14:08

review fixes

ce097af

fix

6333c89

fix

30446eb

sentrivana added 2 commits August 10, 2023 14:19

Merge branch 'master' into ivana/performance-powered-by-otel

2faa56f

lint fix

a3659bd

sentrivana requested a review from sl0thentr0py August 16, 2023 11:42

sentrivana added 4 commits August 16, 2023 13:42

Merge branch 'master' into ivana/performance-powered-by-otel

5f88fc8

Merge branch 'master' into ivana/performance-powered-by-otel

6341ae7

add aiohttp-client otel package

8ee0f5f

Merge branch 'master' into ivana/performance-powered-by-otel

c1514bb

sentrivana added 2 commits August 18, 2023 09:39

Merge branch 'master' into ivana/performance-powered-by-otel

69dc6b7

Use the already imported instrumented OTel class

b991755

No need to reimport the _InstrumentedClass again to replace the old class. We already have a ref to the fully set up _InstrumentedClass from before.

sl0thentr0py approved these changes Aug 28, 2023

Merge branch 'master' into ivana/performance-powered-by-otel

3192771

sentrivana enabled auto-merge (squash) August 28, 2023 09:05

sentrivana merged commit 3d2517d into master Aug 28, 2023
245 of 246 checks passed

sentrivana deleted the ivana/performance-powered-by-otel branch August 28, 2023 09:10
