metrics, spark: add support for telemetry mechanism in Spark integration #2528

Merged
merged 1 commit into from Apr 4, 2024

mobuchowski
Member

This PR follows the previous metrics PR #2496 and adds instrumentation to the Spark integration. It comprises:

  • setting global tags for the Spark version, the OpenLineage-Spark version, and disabled facets
  • emitting counters for received events (Application, Job, SQL Start/End)
  • adding timers around RunFacets
  • adding serialized metrics to DebugFacet
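
For readers unfamiliar with this kind of instrumentation, the four items above can be sketched roughly as follows. This is only an illustration in Python, not the code from this PR: `SketchRegistry`, the metric names, the tag keys, and the `x.y.z` placeholder values are all hypothetical, and the real integration is written in Java with its own metrics API.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class SketchRegistry:
    """Hypothetical in-memory metrics registry (not the OpenLineage API)."""

    def __init__(self, clock=time.perf_counter):
        self.global_tags = {}
        self.counters = defaultdict(int)
        self.timers = defaultdict(list)
        self._clock = clock

    def set_global_tags(self, **tags):
        # Tags attached to every metric, e.g. versions and disabled facets.
        self.global_tags.update(tags)

    def increment(self, name):
        # Count one received event of the given kind.
        self.counters[name] += 1

    @contextmanager
    def timed(self, name):
        # Record how long the wrapped block took, even if it raises.
        start = self._clock()
        try:
            yield
        finally:
            self.timers[name].append(self._clock() - start)

    def serialize(self):
        # Plain-dict snapshot that could be attached to a debug payload.
        return {
            "tags": dict(self.global_tags),
            "counters": dict(self.counters),
            "timers": {
                name: {"count": len(samples), "total_seconds": sum(samples)}
                for name, samples in self.timers.items()
            },
        }


registry = SketchRegistry()
# Placeholder values; real versions would come from the running environment.
registry.set_global_tags(
    spark_version="x.y.z",
    openlineage_spark_version="x.y.z",
    disabled_facets="none",
)
registry.increment("event.sql.start")
with registry.timed("facet.build.example"):
    sum(range(1000))  # stand-in for building a run facet
registry.increment("event.sql.end")
debug_payload = registry.serialize()
```

Here `debug_payload` holds the tags, the per-event counters, and a count and total duration per timer, which is the sort of snapshot the last bullet describes adding to the DebugFacet.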

Contributor

@pawel-big-lebowski pawel-big-lebowski left a comment

I left a few individual comments above. Overall I am really happy with the progress you are making on the PR and the shape it is taking. Amazing job @mobuchowski 🥇

@mobuchowski mobuchowski force-pushed the openlineage-metrics-spark branch 5 times, most recently from 19ced62 to 4ccf656 Compare March 28, 2024 12:51
Contributor

@pawel-big-lebowski pawel-big-lebowski left a comment

This code looks really good, and I appreciate the solution of extracting timers for facet builders.

Did you experience any problems with testing? I couldn't find any new tests added. Would it make sense to see a debug facet filled with timers for each visitor and builder?

@mobuchowski mobuchowski force-pushed the openlineage-metrics-spark branch 3 times, most recently from d7d2637 to 43d6deb Compare April 3, 2024 18:10
Contributor

@pawel-big-lebowski pawel-big-lebowski left a comment

Minor cosmetic comments, plus a question about Thread.sleep, which I would love to avoid.
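
The conversation does not show where the sleep is used. Assuming it sits in a test that waits so a timer records a non-zero duration, one common alternative is to inject a controllable clock. The sketch below is in Python purely for illustration; `FakeClock` and `SketchTimer` are hypothetical names, not classes from this PR.

```python
class FakeClock:
    """Hypothetical test clock: time only moves when advance() is called."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


class SketchTimer:
    """Hypothetical timer that reads time from an injected clock."""

    def __init__(self, clock):
        self._clock = clock
        self.total_seconds = 0.0
        self.count = 0

    def record(self, fn):
        # Measure fn() with the injected clock instead of the wall clock.
        start = self._clock()
        try:
            return fn()
        finally:
            self.total_seconds += self._clock() - start
            self.count += 1


clock = FakeClock()
timer = SketchTimer(clock)
# The "work" advances the fake clock instead of sleeping for real.
result = timer.record(lambda: clock.advance(0.25) or "done")
```

With this setup a test can check the recorded duration deterministically, without waiting on real time.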

@mobuchowski mobuchowski force-pushed the openlineage-metrics-spark branch 2 times, most recently from a411c2c to 6d93dfe Compare April 4, 2024 11:17
Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
@mobuchowski mobuchowski merged commit a353a75 into main Apr 4, 2024
42 checks passed
@mobuchowski mobuchowski deleted the openlineage-metrics-spark branch April 4, 2024 12:28
blacklight pushed a commit to blacklight/OpenLineage that referenced this pull request Apr 4, 2024
OpenLineage#2528)

Signed-off-by: Maciej Obuchowski <obuchowski.maciej@gmail.com>
Signed-off-by: Fabio Manganiello <fabio@manganiello.tech>

2 participants