Add spark.openlineage.appName tag to spark.application spans#11246

Merged
gh-worker-dd-mergequeue-cf854d[bot] merged 3 commits into master from adrien.boitreaud/ol-jobname
May 4, 2026

@aboitreaud aboitreaud commented Apr 30, 2026

What Does This Do

Captures the OpenLineage (OL) job name on spark.application spans. The job name can be overridden in OL via spark.openlineage.appName, so we need to capture that value and set it on the span.

Motivation

OpenLineage names the application job using spark.openlineage.appName if set, falling back to spark.app.name otherwise.
The gap: when a user sets spark.openlineage.appName to something different from spark.app.name, OL uses that override as its job name, but Datadog never reads that key. As a result, job_name on the span is spark.app.name while OL's job name is spark.openlineage.appName, and correlation with our job_name attribute breaks.
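
A minimal sketch of the mismatch described above, not the code from this PR. A plain `Map` stands in for the Spark configuration, and the class name, method names, and the choice of `spark.openlineage.appName` as the span tag key (inferred from the PR title) are assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration: a Map stands in for SparkConf.
class OlJobNameSketch {
  static final String OL_APP_NAME = "spark.openlineage.appName";
  static final String SPARK_APP_NAME = "spark.app.name";

  // Job name as OpenLineage resolves it per the description above:
  // the override if set, otherwise spark.app.name.
  static String openLineageJobName(Map<String, String> conf) {
    String override = conf.get(OL_APP_NAME);
    return override != null ? override : conf.get(SPARK_APP_NAME);
  }

  // Span tags with the extra tag added: job_name still comes from
  // spark.app.name, and the OL override is recorded alongside it when present.
  // The tag key used here is an assumption based on the PR title.
  static Map<String, String> spanTags(Map<String, String> conf) {
    Map<String, String> tags = new LinkedHashMap<>();
    tags.put("job_name", conf.get(SPARK_APP_NAME));
    String override = conf.get(OL_APP_NAME);
    if (override != null) {
      tags.put(OL_APP_NAME, override);
    }
    return tags;
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put(SPARK_APP_NAME, "etl-nightly");
    conf.put(OL_APP_NAME, "orders_pipeline");
    System.out.println(openLineageJobName(conf));
    System.out.println(spanTags(conf));
  }
}
```

With both keys set, the OL job name and `job_name` differ, and the extra tag is what would let the two be correlated.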

Note: Once your PR is ready to merge, add it to the merge queue by commenting /merge. /merge -c cancels the queue request. /merge -f --reason "reason" skips all merge queue checks; please use this judiciously, as some checks do not run at the PR-level. For more information, see this doc.

@pawel-big-lebowski pawel-big-lebowski left a comment

nice tests 🥇

@aboitreaud aboitreaud marked this pull request as ready for review May 4, 2026 07:17
@aboitreaud aboitreaud requested a review from a team as a code owner May 4, 2026 07:17
github-actions Bot commented May 4, 2026

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

@aboitreaud

/merge

gh-worker-devflow-routing-ef8351 Bot commented May 4, 2026

View all feedbacks in Devflow UI.

2026-05-04 07:17:59 UTC ℹ️ Start processing command /merge


2026-05-04 07:18:09 UTC ℹ️ MergeQueue: waiting for PR to be ready

This pull request is not mergeable according to GitHub. Common reasons include pending required checks, missing approvals, or merge conflicts — but it could also be blocked by other repository rules or settings.
It will be added to the queue as soon as checks pass and/or get approvals. View in MergeQueue UI.
Note: if you pushed new commits since the last approval, you may need additional approval.
You can remove it from the waiting list with /remove command.


2026-05-04 08:02:14 UTC ℹ️ MergeQueue: merge request added to the queue

The expected merge time in master is approximately 1h (p90).


2026-05-04 09:29:05 UTC ℹ️ MergeQueue: This merge request was merged

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c4bac7b335

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

}

private void captureOpenlineageJobInfo(AgentTracer.SpanBuilder builder) {
  String olAppName = sparkConf.get("spark.openlineage.appName", null);
[P2] Read OpenLineage app name from OL listener config

captureOpenlineageJobInfo reads spark.openlineage.appName only from sparkConf, but this codebase already documents and handles cases where Spark config is not captured (see LiveListenerBusAdvice, which reflects getConf() and stores openLineageSparkConf for that reason). In those environments (notably the Databricks path called out in AbstractSparkInstrumentation), OpenLineage can still resolve an app name while this span tag remains missing, so the intended OL/DD correlation is still broken. Prefer reading from openLineageSparkConf when available, with fallback to sparkConf.
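
A hedged sketch of the lookup order this comment suggests: try the OpenLineage listener config first, then fall back to the regular Spark config. Plain `Map` instances stand in for both configs, and `OlAppNameLookupSketch` and `resolveOlAppName` are hypothetical names; the real field types and accessors in the tracer may differ.

```java
import java.util.Map;

// Hypothetical illustration: Maps stand in for openLineageSparkConf and sparkConf.
class OlAppNameLookupSketch {
  static final String OL_APP_NAME = "spark.openlineage.appName";

  // Prefer the config captured from the OpenLineage listener when it is
  // available and has the key; otherwise fall back to the Spark config.
  // Returns null when neither source provides a value.
  static String resolveOlAppName(
      Map<String, String> openLineageSparkConf, Map<String, String> sparkConf) {
    if (openLineageSparkConf != null) {
      String fromListener = openLineageSparkConf.get(OL_APP_NAME);
      if (fromListener != null) {
        return fromListener;
      }
    }
    return sparkConf != null ? sparkConf.get(OL_APP_NAME) : null;
  }
}
```

Returning null when the key is absent from both sources lets the caller skip the tag entirely rather than emit an empty value.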

@aboitreaud aboitreaud added the inst: apache spark and type: enhancement labels May 4, 2026
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot merged commit bfbfa5b into master May 4, 2026
570 of 574 checks passed
@gh-worker-dd-mergequeue-cf854d gh-worker-dd-mergequeue-cf854d Bot deleted the adrien.boitreaud/ol-jobname branch May 4, 2026 09:29
@github-actions github-actions Bot added this to the 1.62.0 milestone May 4, 2026