Remove obfuscatory Trace, DebugTrace, and _TraceMeta classes #62154

Open

dstandish wants to merge 3 commits into apache:main from astronomer:remove-bad-otel-stuff

Conversation

@dstandish (Contributor) commented Feb 19, 2026

These classes did basically nothing but obfuscate.

As far as I can tell, they only existed as confusing wrappers to call get_otel_tracer conditionally.

I don't think I changed behavior meaningfully, but it's hard to really tell.

Main action is in airflow-core/src/airflow/observability/trace.py and the equivalent copy-pasted sdk one.

Side note

Ultimately I think we should also remove the OtelTrace class, but this PR is a first step in cleaning things up. I do not think the OtelTrace class should exist: it just slightly tweaks some of the methods in the upstream API, and it is inconsistent with EmptyTrace and their shared (but not actually implemented) "Tracer" protocol. This slight tweaking is confusing for anyone who wants to do any work here, since it deviates from the official API for not much good reason. It's confusing for AI agents as well.

cc @xBis7 @ashb @jedcunningham @nickstenning

@boring-cyborg bot added labels area:DAG-processing, area:Executors-core (LocalExecutor & SequentialExecutor), area:Scheduler (including HA (high availability) scheduler), area:task-sdk, area:Triggerer on Feb 19, 2026

@dstandish changed the title from "Remove obfuscatory Trace and DebugTrace classes" to "Remove obfuscatory Trace, DebugTrace, and _TraceMeta classes" on Feb 19, 2026
@nickstenning (Contributor) left a comment

This looks like an improvement to me but I also agree with your assessment that in general this code is

deviat[ing] from the official API for not much good reason

Having a single tracer for the entire application is very much not how the OTel libraries are intended to be used and will make reasoning about where traces came from (and, for example, making sampling decisions) unnecessarily hard.

The intent of the design of the OTel packages is that instrumenting code doesn't need to know anything about whether tracing is configured or how. It should use the opentelemetry-api package, e.g.

from opentelemetry import trace

tracer = trace.get_tracer(__name__)

...


def some_function():
    # Add attributes on a preexisting span
    span = trace.get_current_span()
    span.set_attributes(
        {
            "agent.id": id_,
            "agent.uid": str(uid),
        }
    )

    # Start a new span
    with tracer.start_as_current_span("service.agent_supervisor.run_once"):
        do_work()

Only the code responsible for configuring the tracer provider, span processors, exporter, etc., would usually have to touch the opentelemetry-sdk package.
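
For concreteness, here is a rough sketch of that SDK-side bootstrap. This is not code from this PR; the function name configure_tracing and the choice of an OTLP exporter are just assumptions for illustration:

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter


def configure_tracing() -> None:
    # Run once at process startup; instrumented code only ever imports opentelemetry-api.
    provider = TracerProvider()
    provider.add_span_processor(BatchSpanProcessor(OTLPSpanExporter()))
    trace.set_tracer_provider(provider)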

Reading between the lines I think there might be two separate issues here:

  1. A desire to provide a from airflow.* import that provides a shared abstraction over the tracing implementation rather than having from opentelemetry.* imports throughout the codebase.
  2. A desire to have traces and spans which are emitted conditionally, in order to increase or decrease the level of detail.

Personally, I'm not super sympathetic to (1) and would prefer we just instrumented things as above. I think if we really want to have our own interface to tracing, it should be a package that mirrors opentelemetry.trace to some extent.

from airflow.observability import trace 

tracer = trace.get_tracer(__name__)
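
As a purely hypothetical sketch of that idea (not something this PR adds), such a module could mostly just re-export the upstream API:

# airflow/observability/trace.py, hypothetical sketch: delegate to opentelemetry-api
from opentelemetry import trace as _otel_trace

get_tracer = _otel_trace.get_tracer
get_current_span = _otel_trace.get_current_span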

On (2) I think there's a more fundamental issue, which is that if the desire is to provide configurable verbosity/detail levels for the traces, it doesn't make sense to try and achieve that at the level of a trace. It has to be a span-by-span decision, so that you can have optional "detailed" spans emitted within traces that are always emitted.

I think the most "native" way to implement this would be by writing a custom head-sampler and defining standard attributes which cause spans to be dropped if the configured detail level is too low, e.g.

from opentelemetry import trace

from airflow.observability.attributes import DETAIL_LEVEL

tracer = trace.get_tracer(__name__)

...

with tracer.start_as_current_span("myspan", attributes={DETAIL_LEVEL: 2}):
    do_work()

(Note that if you do this, it will need to be based on attributes provided at span creation time as head samplers do not have access to attributes created while the span is in flight.)
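
To make the head-sampling idea concrete, here is a rough sketch of such a sampler; the DETAIL_LEVEL attribute module and the threshold wiring are assumptions, not existing Airflow code:

from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.sampling import Decision, Sampler, SamplingResult

from airflow.observability.attributes import DETAIL_LEVEL  # hypothetical, as above


class DetailLevelSampler(Sampler):
    """Drop spans whose DETAIL_LEVEL attribute exceeds the configured maximum."""

    def __init__(self, max_detail_level: int):
        self._max_detail_level = max_detail_level

    def should_sample(self, parent_context, trace_id, name, kind=None,
                      attributes=None, links=None, trace_state=None) -> SamplingResult:
        level = (attributes or {}).get(DETAIL_LEVEL, 0)
        if level > self._max_detail_level:
            return SamplingResult(Decision.DROP)
        return SamplingResult(Decision.RECORD_AND_SAMPLE, attributes)

    def get_description(self) -> str:
        return f"DetailLevelSampler(max={self._max_detail_level})"


# Wired in wherever the provider is configured, e.g.:
provider = TracerProvider(sampler=DetailLevelSampler(max_detail_level=1))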

@xBis7 (Contributor) left a comment

@dstandish I'm in favor of removing the Trace metaclass and following a simpler approach. But we have to make sure that it's easy to extend in the future if we need to support more tracers. For example, Stats has 4 different implementations, including the default no-op. We shouldn't assume that traces will always be either OTel or no-op.

I've added some comments in places that I've wanted to clean up for a while but I haven't had the time to do it.

Another topic for cleanup is the debug traces. Anything that used to go through DebugTrace exposes internal operations. These might be useful to a developer, which is why I added a debug flag instead of removing them entirely. But they are certainly confusing to users who don't care about the individual steps of the scheduler's loop.

Here is the question: do you find it useful to see dozens of spans, each named after a method, all with similar or different timings? I think we should either add more info to make them useful or remove them entirely. But that's out of scope for this PR.

return result


def gen_link_from_traceparent(traceparent: str):

+1 on this change.


And these as well aren't used and should be removed:

https://github.com/astronomer/airflow/blob/e604a20a21447579488cfd8fe48061dd625e0259/shared/observability/src/airflow_shared/observability/traces/utils.py#L42-L62

They are doing what extract() and inject() are designed to do.
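
For reference, a minimal sketch (not taken from the linked file) of what the upstream propagation API already covers:

from opentelemetry import trace
from opentelemetry.propagate import extract, inject

tracer = trace.get_tracer(__name__)

# Serialize the current span context into a plain dict carrier (writes a "traceparent" key).
carrier: dict[str, str] = {}
inject(carrier)

# Later, possibly in another process: rebuild the context and continue the same trace.
ctx = extract(carrier)
with tracer.start_as_current_span("continued-work", context=ctx):
    ...

Both helpers live in opentelemetry.propagate and handle W3C traceparent headers end to end.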

)


class AirflowOtelIdGenerator(IdGenerator):

This should also be removed entirely. Generating a span_id and a trace_id are internal operations of the SDK and we shouldn't mess with them.
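
For context, the SDK already ships a random ID generator and uses it by default; a sketch of the equivalent explicit wiring:

from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.id_generator import RandomIdGenerator

# Passing RandomIdGenerator explicitly is equivalent to the SDK's default behavior.
provider = TracerProvider(id_generator=RandomIdGenerator())

In other words, dropping the custom generator just falls back to what the SDK already does.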

@@ -17,16 +17,44 @@
from __future__ import annotations

There has been a plan to refactor this, add an initialize method and move it under shared, similarly to Stats.

Also related
#62127

cc @potiuk @amoghrajesh

@xBis7 (Contributor) commented Feb 19, 2026

If the integration tests here pass, then it means that the behavior has stayed the same.

https://github.com/astronomer/airflow/blob/remove-bad-otel-stuff/airflow-core/tests/integration/otel/test_otel.py

Also, you can test the behavior manually like so

  • start breeze with otel

    breeze shell --backend postgres --integration redis --integration otel
    
  • under the breeze shell, export this var

    export use_otel=true
    
  • then run each integration test

    pytest -s airflow-core/tests/integration/otel/test_otel.py::TestOtelIntegration::test_same_scheduler_processing_the_entire_dag
    
  • check jaeger

    http://localhost:36686/
    

@xBis7 (Contributor) commented Feb 19, 2026

Hi @nickstenning,

A desire to provide a from airflow.* import that provides a shared abstraction over the tracing implementation rather than having from opentelemetry.* imports throughout the codebase.

We can't use opentelemetry.* imports throughout the codebase, and let me explain why. Airflow is a complicated project composed of a bunch of independent processes and subprocesses. There is no central in-memory constructor that initializes a set of objects and shares them across all processes. By processes I mean the scheduler, a worker, the dag processor, etc. Each one of these needs to configure the SDK and get its own instance of a tracer. That's why we need a common file with the SDK initialization logic that can be reused.

Additionally, the current implementation with the metaclass makes sure that the SDK is initialized only once per process.
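
(As an aside, once-per-process initialization does not require a metaclass; here is a sketch of a simpler guard, shown purely as an assumption of how it could look, not what this PR does:)

import functools

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider


@functools.cache
def init_tracing() -> None:
    # functools.cache makes repeat calls within the same process a no-op, so every
    # entrypoint (scheduler, worker, dag processor, triggerer, ...) can call this safely.
    trace.set_tracer_provider(TracerProvider())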

Apart from the initialization, Airflow is very hard to auto-instrument. Many operations depend on db updates, e.g. once something has changed in the db, take this action. For that reason, we have to handle span context changes manually and start and end spans manually.

A desire to have traces and spans which are emitted conditionally, in order to increase or decrease the level of detail.

On (2) I think there's a more fundamental issue, which is that if the desire is to provide configurable verbosity/detail levels for the traces, it doesn't make sense to try and achieve that at the level of a trace.

The goal is for users to have continuous visibility into tasks: to be able to figure out which step of execution a task is at, and also to understand how long each step took.

Apart from the dag_run traces, which also allow the creation of task sub-spans, IMO everything else should be removed. The debug spans provide details about internal operations and are only useful to developers, who have better tools at their disposal such as an IDE debugger or a flamegraph profiler. @nickstenning, if I understand correctly, you also agree on this.

@nickstenning (Contributor) commented Feb 19, 2026

That's why we need a common file with the SDK initialization logic that can be reused.

It's certainly true that we need common logic for SDK initialization, but I don't think it's a given that that should happen through calls to the tracing API (I'm using these terms in the sense of opentelemetry-api and opentelemetry-sdk).

It seems to me that it would be reasonable to have common SDK initialization logic which is called at bootstrap time for each process.

Separately, there's the question of what the instrumentation API looks like for instrumented code. It certainly sounds like there are some more complicated use cases where trace context will have to be retrieved from an external system, spans started and ended explicitly, etc., but none of this is unfamiliar territory for opentelemetry-api, and all the API methods support this natively: Tracer.start_span takes an optional Context, for example.
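
A small sketch of that pattern (the function and span names here are made up for illustration):

from opentelemetry import trace
from opentelemetry.propagate import extract

tracer = trace.get_tracer(__name__)


def resume_work(carrier: dict[str, str]) -> None:
    # `carrier` holds a "traceparent" entry, e.g. read back from the database.
    ctx = extract(carrier)
    span = tracer.start_span("scheduler.follow_up_work", context=ctx)
    try:
        ...  # work spread across callbacks or DB round-trips
    finally:
        span.end()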

I'm not arguing that we shouldn't have some helper packages to make instrumenting Airflow easier, but I think the current implementation is somewhat mixing up and confusing the instrumentation API with the SDK. I think that should be avoidable and I'd recommend avoiding it.

The debug spans provide details about internal operations and are only useful to developers, who have better tools at their disposal such as an IDE debugger or a flamegraph profiler. @nickstenning, if I understand correctly, you also agree on this.

I'm not super familiar with what is instrumented with debug spans today. I suspect you are probably right.

There may still be good arguments for a configurable level of detail in the traces emitted by Airflow. If we want to do that I'd still suggest going the head-sampling route.


Labels

area:DAG-processing, area:Executors-core (LocalExecutor & SequentialExecutor), area:Scheduler (including HA (high availability) scheduler), area:task-sdk, area:Triggerer
