Spark: Add SparkApplicationDetailsFacet #2688

Merged

Conversation

dolfinus
Contributor

@dolfinus dolfinus commented May 10, 2024

Problem

Discussion: #2589 (comment)

Solution

Add run facet containing Spark application information:

  • master: string
  • appName: string
  • applicationId: string
  • deployMode: string
  • userName: string
  • driverHost: string
  • webUiUrl: string | null - URL of the Spark driver WebUI. It is null if spark.ui.enabled is false.
  • proxyUrl: string | null - URL of the Spark driver WebUI when it is served behind a reverse proxy (e.g. K8s ingress, YarnUI proxy), if any.
  • historyUrl: string | null - URL of the Spark History Server, if any. It can be set only on Yarn, because K8s and standalone Spark instances do not populate any option with the Spark History Server address, only with the event log location.
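
To make the nullability rules above concrete, here is a minimal Python sketch of the facet payload. It is not the code from this PR (the integration itself is written in Java). The `SparkApplicationDetails` class and `build_details` helper are hypothetical names used only for illustration. The sketch also assumes the values come from the standard Spark options `spark.master`, `spark.app.name`, `spark.submit.deployMode`, `spark.driver.host` and `spark.ui.enabled`, with the URLs passed in by the caller.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SparkApplicationDetails:
    """Hypothetical stand-in for the facet payload, not the OpenLineage class."""

    master: str
    appName: str
    applicationId: str
    deployMode: str
    userName: str
    driverHost: str
    webUiUrl: Optional[str] = None
    proxyUrl: Optional[str] = None
    historyUrl: Optional[str] = None


def build_details(
    conf: dict,
    application_id: str,
    user_name: str,
    web_ui_url: Optional[str] = None,
    proxy_url: Optional[str] = None,
    history_url: Optional[str] = None,
) -> SparkApplicationDetails:
    """Assemble the payload from a plain dict of Spark options (illustrative helper)."""
    master = conf.get("spark.master", "")
    # Spark enables the UI by default, so a missing option counts as "true".
    ui_enabled = str(conf.get("spark.ui.enabled", "true")).lower() == "true"
    return SparkApplicationDetails(
        master=master,
        appName=conf.get("spark.app.name", ""),
        applicationId=application_id,
        deployMode=conf.get("spark.submit.deployMode", "client"),
        userName=user_name,
        driverHost=conf.get("spark.driver.host", ""),
        # webUiUrl is null when the UI is disabled.
        webUiUrl=web_ui_url if ui_enabled else None,
        proxyUrl=proxy_url,
        # Assumption: the "only on Yarn" rule is applied by checking the master.
        historyUrl=history_url if master.startswith("yarn") else None,
    )


if __name__ == "__main__":
    details = build_details(
        {
            "spark.master": "yarn",
            "spark.app.name": "etl",
            "spark.submit.deployMode": "cluster",
            "spark.driver.host": "driver.example",
            "spark.ui.enabled": "false",
        },
        application_id="application_123",
        user_name="alice",
        web_ui_url="http://driver.example:4040",
        history_url="http://history.example:18080",
    )
    print(json.dumps(asdict(details), indent=2))
```

Keeping the three URL fields optional mirrors the `string | null` types in the list: a consumer can tell "not available" apart from an empty string.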

One-line summary:

Add SparkApplicationDetailsFacet to runEvents emitted on Spark application start.

Checklist

  • You've signed-off your work
  • Your pull request title follows our guidelines
  • Your changes are accompanied by tests (if relevant)
  • Your change contains a small diff and is self-contained
  • You've updated any relevant documentation (if relevant)
  • Your comment includes a one-liner for the changelog about the specific purpose of the change (not required for changes to tests, docs, or CI config)
  • You've versioned the core OpenLineage model or facets according to SchemaVer (if relevant)
  • You've added a header to source files (if relevant)

SPDX-License-Identifier: Apache-2.0
Copyright 2018-2023 contributors to the OpenLineage project

@dolfinus
Contributor Author

dolfinus commented May 10, 2024

This facet is added only to SparkListenerApplicationStart events, but such events are currently emitted with no facets, so this PR is waiting for #2677.

@dolfinus dolfinus force-pushed the feature/spark-application-details branch 3 times, most recently from 1cd4a2f to d37c147 Compare May 10, 2024 20:32
@dolfinus dolfinus force-pushed the feature/spark-application-details branch 3 times, most recently from 46805e0 to 37e2137 Compare May 15, 2024 08:16
@dolfinus dolfinus marked this pull request as ready for review May 15, 2024 08:33
Signed-off-by: Martynov Maxim <martinov_m_s_@mail.ru>
@dolfinus dolfinus force-pushed the feature/spark-application-details branch from 37e2137 to 71b1a69 Compare May 15, 2024 11:39
@dolfinus
Contributor Author

dolfinus commented May 17, 2024

Can someone review, please? @pawel-big-lebowski @mobuchowski

Member

@mobuchowski mobuchowski left a comment

LGTM. Thanks for the contribution, @dolfinus.

@mobuchowski mobuchowski merged commit fabaa23 into OpenLineage:main May 17, 2024
33 checks passed
@dolfinus dolfinus deleted the feature/spark-application-details branch May 17, 2024 13:59
2 participants