Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIP-49] Airflow OpenTelemetry Provider #37989

Open
wants to merge 9 commits into
base: main
Choose a base branch
from

Conversation

howardyoo
Copy link
Contributor

closes #37628


Airflow OpenTelemetry Provider is a part of the Airflow OpenTelemetry feature that will allow DAG writers to emit opentelemetry metrics and traces conveniently within the framework of Airflow's provider. User who do not wish to use Airflow's core otel feature, or having Airflow version that does NOT have the otel feature available can use the provider package to send user specific metrics and traces.

The provider package also comes with listener plugin that will, when configured, automatically emit traces on DAG runs and TaskInstance runs. You can also combine the user-generated spans as part of the DAG run, giving users a complete control over how they would like to generate OpenTelemetry data while running DAGs.

pyproject.toml Show resolved Hide resolved
@potiuk
Copy link
Member

potiuk commented Mar 14, 2024

Finally made first pass.

@howardyoo
Copy link
Contributor Author

howardyoo commented Mar 14, 2024 via email

@hussein-awala
Copy link
Member

The PR was broken with around 100 unrelated commits, I just squashed all related commits into a single commit and rebased main as it was impossible to review the PR. Could you please check it?

@howardyoo
Copy link
Contributor Author

The PR was broken with around 100 unrelated commits, I just squashed all related commits into a single commit and rebased main as it was impossible to review the PR. Could you please check it?

sure, willl do. Appreciate doing it.

@howardyoo
Copy link
Contributor Author

The PR was broken with around 100 unrelated commits, I just squashed all related commits into a single commit and rebased main as it was impossible to review the PR. Could you please check it?

Just checked it, the merged commit looks good!

# Get all triggers that have no task instances depending on them...
ids = session.scalars(
# Get all triggers that have no task instances depending on them and delete them
ids = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's unrelated ? (bad merge?)

#
# END PROVIDER EXTRAS HERE

# PLEASE DO NOT MODIFY THIS SECTION MANUALLY. IT WILL BE OVERWRITTEN BY PRE-COMMIT !!
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad merge here. It should be regenerated after the recent change for dynamic properties in pyproject toml. Copying the current version from main and re-running pre-commit should fix it.

The `devel` extras are not available in the released packages. They are only available when you install
Airflow from sources in `editable` installation - i.e. one that you are usually using to contribute to
Airflow. They provide tools such as `pytest` and `mypy` for general purpose development and testing.
aiobotocore, airbyte, alibaba, all, all-core, all-dbs, amazon, apache-atlas, apache-beam, apache-
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bad merge here - should be replaced with the one from main and pre-commit run.

@potiuk
Copy link
Member

potiuk commented Apr 9, 2024

It looks good in general - I left a few comments where bad rebase/squash occured. Tests and Docs need to be fixed of course.

I think we need to add more description (separate page in the "guides" section) explaining the usage of the provider - basically what you have described in the PR description here - but likely with some examples and explanation how the provider can be configured and used.

It's very useful to be able to have traces automatically generated with listeners - this will enable a lot of opentelemetry configuration for earlier versions of airflow - particularly I think we need an example showing how to configure earlier version of Airflow (2.7.0+) and show few examples of how open-telemetry custom events can be emitted.

"""Check whether otel listener is disabled."""
return (
is_otel_traces_enabled() is not True
and os.getenv("OTEL_LISTENER_DISABLED", "false").lower() == "false"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be a provider-specific configuration in provider.yaml rather than environment variable in [opentelemetry] section

@potiuk
Copy link
Member

potiuk commented Apr 9, 2024

One more comment here - I think we need to clearly describe what is the relation between the provider's traces that plugin interface provides - in the future when we merge the #37752 . Will traces replace those "plugin" events ? Or wil they be running side-by-side ? it's not clear for me at this moment what is the setup for:

  • 2.7.0 - 2.9.0 Airflow with the Opentelemetry provider and traces/metrics enabled
  • 2.10.0 with provider and traces/metrics enabled.

@howardyoo
Copy link
Contributor Author

One more comment here - I think we need to clearly describe what is the relation between the provider's traces that plugin interface provides - in the future when we merge the #37752 . Will traces replace those "plugin" events ? Or wil they be running side-by-side ? it's not clear for me at this moment what is the setup for:

  • 2.7.0 - 2.9.0 Airflow with the Opentelemetry provider and traces/metrics enabled
  • 2.10.0 with provider and traces/metrics enabled.

So, in order to explain this a little bit more, the airflow's core otel trace will effectively 'replace' the provider's plugin events in case the provider detects existing OTEL tracing is currently enabled and running. yes, should be more clearer.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Airflow Provider for OpenTelemetry
3 participants