@pawel-big-lebowski pawel-big-lebowski commented Oct 29, 2025

What Does This Do

Spark listeners on Databricks are instantiated slightly differently than on vanilla Spark setups. This PR ensures that the Datadog listener instruments the OpenLineage listener correctly in the Databricks environment.

Motivation

Additional Notes

Contributor Checklist

Jira ticket: [PROJ-IDENT]

datadog-datadog-prod-us1 bot commented Oct 29, 2025

🎯 Code Coverage
Patch Coverage: 100.00%
Total Coverage: 73.62% (+14.04%)

🔗 Commit SHA: 7f18c4e

pr-commenter bot commented Oct 29, 2025

Benchmarks

Startup

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master pawel.leszczynski/databricks-openlineage-support
git_commit_date 1763031873 1763035955
git_commit_sha b112f74 7f18c4e
release_version 1.56.0-SNAPSHOT~b112f74d8e 1.56.0-SNAPSHOT~7f18c4e09e
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1763037772 1763037772
ci_job_id 1232366518 1232366518
ci_pipeline_id 82236025 82236025
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-1-0e4yrzao 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-1-0e4yrzao 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
module Agent Agent
parent None None

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 58 metrics, 7 unstable metrics.

Startup time reports for insecure-bank
gantt
    title insecure-bank - global startup overhead: candidate=1.56.0-SNAPSHOT~7f18c4e09e, baseline=1.56.0-SNAPSHOT~b112f74d8e

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.102 s) : 0, 1101545
Total [baseline] (8.843 s) : 0, 8842514
Agent [candidate] (1.103 s) : 0, 1103296
Total [candidate] (8.841 s) : 0, 8841278
section iast
Agent [baseline] (1.241 s) : 0, 1241417
Total [baseline] (9.529 s) : 0, 9528607
Agent [candidate] (1.239 s) : 0, 1239234
Total [candidate] (9.565 s) : 0, 9565486
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.102 s -
Agent iast 1.241 s 139.872 ms (12.7%)
Total tracing 8.843 s -
Total iast 9.529 s 686.092 ms (7.8%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.103 s -
Agent iast 1.239 s 135.938 ms (12.3%)
Total tracing 8.841 s -
Total iast 9.565 s 724.209 ms (8.2%)
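The Δ tracing column can be reproduced from the chart values above, assuming the second number on each gantt line is the duration in microseconds (an inference from labels such as "1.102 s" next to 1101545, not something the report states). A minimal Java sketch; the class and method names are hypothetical:

```java
import java.util.Locale;

/** Sketch: recompute the "Δ tracing" column from raw microsecond durations. */
final class StartupOverhead {

  /** Absolute overhead of a variant over the tracing-only run, in microseconds. */
  static long deltaMicros(long variantMicros, long tracingMicros) {
    return variantMicros - tracingMicros;
  }

  /** Overhead relative to the tracing-only run, in percent. */
  static double deltaPercent(long variantMicros, long tracingMicros) {
    return 100.0 * deltaMicros(variantMicros, tracingMicros) / tracingMicros;
  }

  /** Formats the overhead as milliseconds (three decimals) plus a percentage (one decimal). */
  static String format(long variantMicros, long tracingMicros) {
    return String.format(
        Locale.ROOT,
        "%.3f ms (%.1f%%)",
        deltaMicros(variantMicros, tracingMicros) / 1000.0,
        deltaPercent(variantMicros, tracingMicros));
  }

  public static void main(String[] args) {
    // Agent rows for insecure-bank: iast variant vs. tracing variant.
    System.out.println(format(1241417L, 1101545L)); // baseline
    System.out.println(format(1239234L, 1103296L)); // candidate
  }
}
```

With the baseline Agent values, format(1241417, 1101545) gives "139.872 ms (12.7%)", and with the candidate values, format(1239234, 1103296) gives "135.938 ms (12.3%)", matching the tables. The Total rows differ from this calculation by about a microsecond, presumably because the report works from unrounded durations.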
gantt
    title insecure-bank - break down per module: candidate=1.56.0-SNAPSHOT~7f18c4e09e, baseline=1.56.0-SNAPSHOT~b112f74d8e

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.454 ms) : 0, 1454
crashtracking [candidate] (1.453 ms) : 0, 1453
BytebuddyAgent [baseline] (704.28 ms) : 0, 704280
BytebuddyAgent [candidate] (705.539 ms) : 0, 705539
GlobalTracer [baseline] (248.004 ms) : 0, 248004
GlobalTracer [candidate] (248.455 ms) : 0, 248455
AppSec [baseline] (32.574 ms) : 0, 32574
AppSec [candidate] (32.271 ms) : 0, 32271
Debugger [baseline] (68.145 ms) : 0, 68145
Debugger [candidate] (68.254 ms) : 0, 68254
Remote Config [baseline] (656.78 µs) : 0, 657
Remote Config [candidate] (634.253 µs) : 0, 634
Telemetry [baseline] (8.169 ms) : 0, 8169
Telemetry [candidate] (8.276 ms) : 0, 8276
Flare Poller [baseline] (3.698 ms) : 0, 3698
Flare Poller [candidate] (3.713 ms) : 0, 3713
section iast
crashtracking [baseline] (1.454 ms) : 0, 1454
crashtracking [candidate] (1.449 ms) : 0, 1449
BytebuddyAgent [baseline] (829.159 ms) : 0, 829159
BytebuddyAgent [candidate] (827.672 ms) : 0, 827672
GlobalTracer [baseline] (237.955 ms) : 0, 237955
GlobalTracer [candidate] (237.493 ms) : 0, 237493
AppSec [baseline] (32.123 ms) : 0, 32123
AppSec [candidate] (29.412 ms) : 0, 29412
Debugger [baseline] (64.941 ms) : 0, 64941
Debugger [candidate] (64.892 ms) : 0, 64892
Remote Config [baseline] (548.201 µs) : 0, 548
Remote Config [candidate] (549.092 µs) : 0, 549
Telemetry [baseline] (7.6 ms) : 0, 7600
Telemetry [candidate] (7.652 ms) : 0, 7652
Flare Poller [baseline] (3.477 ms) : 0, 3477
Flare Poller [candidate] (3.566 ms) : 0, 3566
IAST [baseline] (29.457 ms) : 0, 29457
IAST [candidate] (31.989 ms) : 0, 31989
Startup time reports for petclinic
gantt
    title petclinic - global startup overhead: candidate=1.56.0-SNAPSHOT~7f18c4e09e, baseline=1.56.0-SNAPSHOT~b112f74d8e

    dateFormat X
    axisFormat %s
section tracing
Agent [baseline] (1.108 s) : 0, 1107913
Total [baseline] (10.88 s) : 0, 10879518
Agent [candidate] (1.11 s) : 0, 1109521
Total [candidate] (10.728 s) : 0, 10727594
section appsec
Agent [baseline] (1.298 s) : 0, 1298413
Total [baseline] (11.238 s) : 0, 11238493
Agent [candidate] (1.28 s) : 0, 1280178
Total [candidate] (11.034 s) : 0, 11033784
section iast
Agent [baseline] (1.243 s) : 0, 1242654
Total [baseline] (11.221 s) : 0, 11220803
Agent [candidate] (1.252 s) : 0, 1252206
Total [candidate] (11.339 s) : 0, 11339291
section profiling
Agent [baseline] (1.245 s) : 0, 1244846
Total [baseline] (11.166 s) : 0, 11166457
Agent [candidate] (1.229 s) : 0, 1229479
Total [candidate] (10.993 s) : 0, 10992932
  • baseline results
Module Variant Duration Δ tracing
Agent tracing 1.108 s -
Agent appsec 1.298 s 190.5 ms (17.2%)
Agent iast 1.243 s 134.741 ms (12.2%)
Agent profiling 1.245 s 136.934 ms (12.4%)
Total tracing 10.88 s -
Total appsec 11.238 s 358.975 ms (3.3%)
Total iast 11.221 s 341.285 ms (3.1%)
Total profiling 11.166 s 286.939 ms (2.6%)
  • candidate results
Module Variant Duration Δ tracing
Agent tracing 1.11 s -
Agent appsec 1.28 s 170.657 ms (15.4%)
Agent iast 1.252 s 142.686 ms (12.9%)
Agent profiling 1.229 s 119.959 ms (10.8%)
Total tracing 10.728 s -
Total appsec 11.034 s 306.19 ms (2.9%)
Total iast 11.339 s 611.697 ms (5.7%)
Total profiling 10.993 s 265.338 ms (2.5%)
gantt
    title petclinic - break down per module: candidate=1.56.0-SNAPSHOT~7f18c4e09e, baseline=1.56.0-SNAPSHOT~b112f74d8e

    dateFormat X
    axisFormat %s
section tracing
crashtracking [baseline] (1.462 ms) : 0, 1462
crashtracking [candidate] (1.477 ms) : 0, 1477
BytebuddyAgent [baseline] (707.194 ms) : 0, 707194
BytebuddyAgent [candidate] (709.528 ms) : 0, 709528
GlobalTracer [baseline] (250.174 ms) : 0, 250174
GlobalTracer [candidate] (250.446 ms) : 0, 250446
AppSec [baseline] (32.635 ms) : 0, 32635
AppSec [candidate] (32.512 ms) : 0, 32512
Debugger [baseline] (69.219 ms) : 0, 69219
Debugger [candidate] (68.449 ms) : 0, 68449
Remote Config [baseline] (632.664 µs) : 0, 633
Remote Config [candidate] (623.974 µs) : 0, 624
Telemetry [baseline] (8.182 ms) : 0, 8182
Telemetry [candidate] (8.048 ms) : 0, 8048
Flare Poller [baseline] (3.821 ms) : 0, 3821
Flare Poller [candidate] (3.693 ms) : 0, 3693
section appsec
crashtracking [baseline] (1.489 ms) : 0, 1489
crashtracking [candidate] (1.451 ms) : 0, 1451
BytebuddyAgent [baseline] (741.314 ms) : 0, 741314
BytebuddyAgent [candidate] (729.985 ms) : 0, 729985
GlobalTracer [baseline] (244.212 ms) : 0, 244212
GlobalTracer [candidate] (240.953 ms) : 0, 240953
AppSec [baseline] (175.797 ms) : 0, 175797
AppSec [candidate] (174.584 ms) : 0, 174584
Debugger [baseline] (61.949 ms) : 0, 61949
Debugger [candidate] (60.767 ms) : 0, 60767
Remote Config [baseline] (679.023 µs) : 0, 679
Remote Config [candidate] (652.964 µs) : 0, 653
Telemetry [baseline] (8.432 ms) : 0, 8432
Telemetry [candidate] (8.359 ms) : 0, 8359
Flare Poller [baseline] (4.019 ms) : 0, 4019
Flare Poller [candidate] (3.852 ms) : 0, 3852
IAST [baseline] (25.398 ms) : 0, 25398
IAST [candidate] (24.825 ms) : 0, 24825
section iast
crashtracking [baseline] (1.46 ms) : 0, 1460
crashtracking [candidate] (1.467 ms) : 0, 1467
BytebuddyAgent [baseline] (829.476 ms) : 0, 829476
BytebuddyAgent [candidate] (837.275 ms) : 0, 837275
GlobalTracer [baseline] (238.275 ms) : 0, 238275
GlobalTracer [candidate] (239.466 ms) : 0, 239466
AppSec [baseline] (28.517 ms) : 0, 28517
AppSec [candidate] (29.451 ms) : 0, 29451
Debugger [baseline] (65.795 ms) : 0, 65795
Debugger [candidate] (65.864 ms) : 0, 65864
Remote Config [baseline] (549.547 µs) : 0, 550
Remote Config [candidate] (557.419 µs) : 0, 557
Telemetry [baseline] (7.629 ms) : 0, 7629
Telemetry [candidate] (7.778 ms) : 0, 7778
Flare Poller [baseline] (3.519 ms) : 0, 3519
Flare Poller [candidate] (3.597 ms) : 0, 3597
IAST [baseline] (32.774 ms) : 0, 32774
IAST [candidate] (31.971 ms) : 0, 31971
section profiling
crashtracking [baseline] (1.439 ms) : 0, 1439
crashtracking [candidate] (1.43 ms) : 0, 1430
BytebuddyAgent [baseline] (735.515 ms) : 0, 735515
BytebuddyAgent [candidate] (728.479 ms) : 0, 728479
GlobalTracer [baseline] (224.728 ms) : 0, 224728
GlobalTracer [candidate] (221.542 ms) : 0, 221542
AppSec [baseline] (33.651 ms) : 0, 33651
AppSec [candidate] (32.15 ms) : 0, 32150
Debugger [baseline] (67.93 ms) : 0, 67930
Debugger [candidate] (67.424 ms) : 0, 67424
Remote Config [baseline] (661.456 µs) : 0, 661
Remote Config [candidate] (631.569 µs) : 0, 632
Telemetry [baseline] (8.133 ms) : 0, 8133
Telemetry [candidate] (7.886 ms) : 0, 7886
Flare Poller [baseline] (3.871 ms) : 0, 3871
Flare Poller [candidate] (3.695 ms) : 0, 3695
ProfilingAgent [baseline] (98.703 ms) : 0, 98703
ProfilingAgent [candidate] (96.747 ms) : 0, 96747
Profiling [baseline] (99.292 ms) : 0, 99292
Profiling [candidate] (97.317 ms) : 0, 97317

Load

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master pawel.leszczynski/databricks-openlineage-support
git_commit_date 1763031873 1763035955
git_commit_sha b112f74 7f18c4e
release_version 1.56.0-SNAPSHOT~b112f74d8e 1.56.0-SNAPSHOT~7f18c4e09e
See matching parameters
Baseline Candidate
application insecure-bank insecure-bank
ci_job_date 1763038339 1763038339
ci_job_id 1232366519 1232366519
ci_pipeline_id 82236025 82236025
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-7xi3s5dh 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-7xi3s5dh 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 3 performance improvements and 0 performance regressions! Performance is the same for 18 metrics, 15 unstable metrics.

scenario:load:insecure-bank:iast_GLOBAL:high_load
  • Δ mean agg_http_req_duration_p50: better, [-170.900µs; -66.003µs] or [-5.987%; -2.312%]
  • Δ mean agg_http_req_duration_p95: unsure, [-510.270µs; -150.167µs] or [-6.320%; -1.860%]
  • Δ mean throughput: unstable, [-53.585op/s; +206.023op/s] or [-4.287%; +16.481%]
  • candidate mean: agg_http_req_duration_p50 2.736ms, agg_http_req_duration_p95 7.744ms, throughput 1326.281op/s
  • baseline mean: agg_http_req_duration_p50 2.855ms, agg_http_req_duration_p95 8.074ms, throughput 1250.062op/s
scenario:load:insecure-bank:iast_FULL:high_load
  • Δ mean agg_http_req_duration_p50: better, [-452.557µs; -275.630µs] or [-8.427%; -5.133%]
  • Δ mean agg_http_req_duration_p95: better, [-1029.229µs; -314.399µs] or [-8.115%; -2.479%]
  • Δ mean throughput: unstable, [-19.898op/s; +130.523op/s] or [-2.626%; +17.224%]
  • candidate mean: agg_http_req_duration_p50 5.006ms, agg_http_req_duration_p95 12.011ms, throughput 813.094op/s
  • baseline mean: agg_http_req_duration_p50 5.370ms, agg_http_req_duration_p95 12.683ms, throughput 757.781op/s
Request duration reports for petclinic
gantt
    title petclinic - request duration [CI 0.99] : candidate=1.56.0-SNAPSHOT~7f18c4e09e, baseline=1.56.0-SNAPSHOT~b112f74d8e
    dateFormat X
    axisFormat %s
section baseline
no_agent (18.727 ms) : 18537, 18917
.   : milestone, 18727,
appsec (18.777 ms) : 18584, 18970
.   : milestone, 18777,
code_origins (17.561 ms) : 17383, 17739
.   : milestone, 17561,
iast (17.663 ms) : 17487, 17839
.   : milestone, 17663,
profiling (19.973 ms) : 19765, 20181
.   : milestone, 19973,
tracing (17.639 ms) : 17467, 17812
.   : milestone, 17639,
section candidate
no_agent (18.08 ms) : 17895, 18265
.   : milestone, 18080,
appsec (18.876 ms) : 18688, 19065
.   : milestone, 18876,
code_origins (17.869 ms) : 17689, 18049
.   : milestone, 17869,
iast (17.7 ms) : 17523, 17878
.   : milestone, 17700,
profiling (19.713 ms) : 19514, 19911
.   : milestone, 19713,
tracing (17.474 ms) : 17302, 17646
.   : milestone, 17474,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 18.727 ms [18.537 ms, 18.917 ms] -
appsec 18.777 ms [18.584 ms, 18.97 ms] 50.017 µs (0.3%)
code_origins 17.561 ms [17.383 ms, 17.739 ms] -1.166 ms (-6.2%)
iast 17.663 ms [17.487 ms, 17.839 ms] -1.064 ms (-5.7%)
profiling 19.973 ms [19.765 ms, 20.181 ms] 1.245 ms (6.7%)
tracing 17.639 ms [17.467 ms, 17.812 ms] -1.088 ms (-5.8%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 18.08 ms [17.895 ms, 18.265 ms] -
appsec 18.876 ms [18.688 ms, 19.065 ms] 796.122 µs (4.4%)
code_origins 17.869 ms [17.689 ms, 18.049 ms] -210.936 µs (-1.2%)
iast 17.7 ms [17.523 ms, 17.878 ms] -379.851 µs (-2.1%)
profiling 19.713 ms [19.514 ms, 19.911 ms] 1.632 ms (9.0%)
tracing 17.474 ms [17.302 ms, 17.646 ms] -606.346 µs (-3.4%)
Request duration reports for insecure-bank
gantt
    title insecure-bank - request duration [CI 0.99] : candidate=1.56.0-SNAPSHOT~7f18c4e09e, baseline=1.56.0-SNAPSHOT~b112f74d8e
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.191 ms) : 1178, 1203
.   : milestone, 1191,
iast (3.184 ms) : 3148, 3220
.   : milestone, 3184,
iast_FULL (6.105 ms) : 5998, 6213
.   : milestone, 6105,
iast_GLOBAL (3.678 ms) : 3595, 3761
.   : milestone, 3678,
profiling (2.053 ms) : 2036, 2071
.   : milestone, 2053,
tracing (1.782 ms) : 1768, 1797
.   : milestone, 1782,
section candidate
no_agent (1.199 ms) : 1187, 1210
.   : milestone, 1199,
iast (3.245 ms) : 3200, 3289
.   : milestone, 3245,
iast_FULL (5.686 ms) : 5629, 5743
.   : milestone, 5686,
iast_GLOBAL (3.454 ms) : 3411, 3497
.   : milestone, 3454,
profiling (2.019 ms) : 2001, 2037
.   : milestone, 2019,
tracing (1.87 ms) : 1853, 1887
.   : milestone, 1870,
  • baseline results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.191 ms [1.178 ms, 1.203 ms] -
iast 3.184 ms [3.148 ms, 3.22 ms] 1.994 ms (167.4%)
iast_FULL 6.105 ms [5.998 ms, 6.213 ms] 4.915 ms (412.8%)
iast_GLOBAL 3.678 ms [3.595 ms, 3.761 ms] 2.487 ms (208.9%)
profiling 2.053 ms [2.036 ms, 2.071 ms] 862.85 µs (72.5%)
tracing 1.782 ms [1.768 ms, 1.797 ms] 591.476 µs (49.7%)
  • candidate results
Variant Request duration [CI 0.99] Δ no_agent
no_agent 1.199 ms [1.187 ms, 1.21 ms] -
iast 3.245 ms [3.2 ms, 3.289 ms] 2.046 ms (170.7%)
iast_FULL 5.686 ms [5.629 ms, 5.743 ms] 4.487 ms (374.3%)
iast_GLOBAL 3.454 ms [3.411 ms, 3.497 ms] 2.255 ms (188.2%)
profiling 2.019 ms [2.001 ms, 2.037 ms] 820.405 µs (68.4%)
tracing 1.87 ms [1.853 ms, 1.887 ms] 671.36 µs (56.0%)

Dacapo

Parameters

Baseline Candidate
baseline_or_candidate baseline candidate
git_branch master pawel.leszczynski/databricks-openlineage-support
git_commit_date 1763031873 1763035955
git_commit_sha b112f74 7f18c4e
release_version 1.56.0-SNAPSHOT~b112f74d8e 1.56.0-SNAPSHOT~7f18c4e09e
See matching parameters
Baseline Candidate
application biojava biojava
ci_job_date 1763038046 1763038046
ci_job_id 1232366520 1232366520
ci_pipeline_id 82236025 82236025
cpu_model Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz Intel(R) Xeon(R) Platinum 8259CL CPU @ 2.50GHz
kernel_version Linux runner-zfyrx7zua-project-304-concurrent-0-u6mrbspe 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux Linux runner-zfyrx7zua-project-304-concurrent-0-u6mrbspe 6.8.0-1031-aws #33~22.04.1-Ubuntu SMP Thu Jun 26 14:22:30 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux

Summary

Found 0 performance improvements and 0 performance regressions! Performance is the same for 10 metrics, 2 unstable metrics.

Execution time for biojava
gantt
    title biojava - execution time [CI 0.99] : candidate=1.56.0-SNAPSHOT~7f18c4e09e, baseline=1.56.0-SNAPSHOT~b112f74d8e
    dateFormat X
    axisFormat %s
section baseline
no_agent (15.569 s) : 15569000, 15569000
.   : milestone, 15569000,
appsec (14.596 s) : 14596000, 14596000
.   : milestone, 14596000,
iast (18.731 s) : 18731000, 18731000
.   : milestone, 18731000,
iast_GLOBAL (17.985 s) : 17985000, 17985000
.   : milestone, 17985000,
profiling (15.216 s) : 15216000, 15216000
.   : milestone, 15216000,
tracing (14.571 s) : 14571000, 14571000
.   : milestone, 14571000,
section candidate
no_agent (15.025 s) : 15025000, 15025000
.   : milestone, 15025000,
appsec (14.89 s) : 14890000, 14890000
.   : milestone, 14890000,
iast (18.306 s) : 18306000, 18306000
.   : milestone, 18306000,
iast_GLOBAL (17.894 s) : 17894000, 17894000
.   : milestone, 17894000,
profiling (15.026 s) : 15026000, 15026000
.   : milestone, 15026000,
tracing (14.68 s) : 14680000, 14680000
.   : milestone, 14680000,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.569 s [15.569 s, 15.569 s] -
appsec 14.596 s [14.596 s, 14.596 s] -973.0 ms (-6.2%)
iast 18.731 s [18.731 s, 18.731 s] 3.162 s (20.3%)
iast_GLOBAL 17.985 s [17.985 s, 17.985 s] 2.416 s (15.5%)
profiling 15.216 s [15.216 s, 15.216 s] -353.0 ms (-2.3%)
tracing 14.571 s [14.571 s, 14.571 s] -998.0 ms (-6.4%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 15.025 s [15.025 s, 15.025 s] -
appsec 14.89 s [14.89 s, 14.89 s] -135.0 ms (-0.9%)
iast 18.306 s [18.306 s, 18.306 s] 3.281 s (21.8%)
iast_GLOBAL 17.894 s [17.894 s, 17.894 s] 2.869 s (19.1%)
profiling 15.026 s [15.026 s, 15.026 s] 1.0 ms (0.0%)
tracing 14.68 s [14.68 s, 14.68 s] -345.0 ms (-2.3%)
Execution time for tomcat
gantt
    title tomcat - execution time [CI 0.99] : candidate=1.56.0-SNAPSHOT~7f18c4e09e, baseline=1.56.0-SNAPSHOT~b112f74d8e
    dateFormat X
    axisFormat %s
section baseline
no_agent (1.477 ms) : 1465, 1488
.   : milestone, 1477,
appsec (3.727 ms) : 3510, 3945
.   : milestone, 3727,
iast (2.211 ms) : 2148, 2275
.   : milestone, 2211,
iast_GLOBAL (2.255 ms) : 2191, 2319
.   : milestone, 2255,
profiling (2.09 ms) : 2036, 2144
.   : milestone, 2090,
tracing (2.034 ms) : 1984, 2084
.   : milestone, 2034,
section candidate
no_agent (1.48 ms) : 1468, 1492
.   : milestone, 1480,
appsec (3.698 ms) : 3481, 3915
.   : milestone, 3698,
iast (2.199 ms) : 2136, 2262
.   : milestone, 2199,
iast_GLOBAL (2.252 ms) : 2188, 2316
.   : milestone, 2252,
profiling (2.515 ms) : 2343, 2687
.   : milestone, 2515,
tracing (2.025 ms) : 1975, 2074
.   : milestone, 2025,
  • baseline results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.477 ms [1.465 ms, 1.488 ms] -
appsec 3.727 ms [3.51 ms, 3.945 ms] 2.251 ms (152.4%)
iast 2.211 ms [2.148 ms, 2.275 ms] 734.487 µs (49.7%)
iast_GLOBAL 2.255 ms [2.191 ms, 2.319 ms] 777.953 µs (52.7%)
profiling 2.09 ms [2.036 ms, 2.144 ms] 613.122 µs (41.5%)
tracing 2.034 ms [1.984 ms, 2.084 ms] 557.422 µs (37.7%)
  • candidate results
Variant Execution Time [CI 0.99] Δ no_agent
no_agent 1.48 ms [1.468 ms, 1.492 ms] -
appsec 3.698 ms [3.481 ms, 3.915 ms] 2.218 ms (149.9%)
iast 2.199 ms [2.136 ms, 2.262 ms] 719.425 µs (48.6%)
iast_GLOBAL 2.252 ms [2.188 ms, 2.316 ms] 772.147 µs (52.2%)
profiling 2.515 ms [2.343 ms, 2.687 ms] 1.035 ms (69.9%)
tracing 2.025 ms [1.975 ms, 2.074 ms] 544.681 µs (36.8%)

@pawel-big-lebowski pawel-big-lebowski force-pushed the pawel.leszczynski/databricks-openlineage-support branch 4 times, most recently from f8f26bf to cef2b46 October 31, 2025 13:56
@pawel-big-lebowski pawel-big-lebowski marked this pull request as ready for review October 31, 2025 14:38
@pawel-big-lebowski pawel-big-lebowski requested a review from a team as a code owner October 31, 2025 14:38
github-actions bot commented Oct 31, 2025

Hi! 👋 Thanks for your pull request! 🎉

To help us review it, please make sure to:

  • Add at least one type, and one component or instrumentation label to the pull request

If you need help, please check our contributing guidelines.

@pawel-big-lebowski pawel-big-lebowski added the inst: apache spark (Apache Spark instrumentation) and type: enhancement (Enhancements and improvements) labels Oct 31, 2025
Comment on lines +137 to +148
try {
  log.debug("Getting OpenLineage conf from the listener");
  Object openLineageConf = listener.getClass().getMethod("getConf").invoke(listener);
  if (openLineageConf != null) {
    InstanceStore.of(SparkConf.class)
        .put("openLineageSparkConf", (SparkConf) openLineageConf);
  }
} catch (IllegalAccessException | NoSuchMethodException | InvocationTargetException e) {
  log.warn(
      "Issue when obtaining OpenLineage conf (possibly unsupported OpenLineage version): {}",
      e.getMessage());
}
Contributor

Maybe instead of using reflection, we should find the exact Databricks-forked class and write an instrumentation for it, similar to what we do for SparkConf itself?

Contributor Author

I can try this @mobuchowski.
Reflection is a workaround for LiveListenerBusAdvice not working on Databricks.

Contributor Author

I've done another round of tests, and OpenLineageSparkListenerAdvice does not work in the Databricks environment. It fails even when the sparkConf argument is not taken into account. There are several possible reasons: perhaps the OpenLineage listener is created before Datadog's instrumentation loads, or Databricks instantiates listeners in some way other than calling regular constructors (it may use reflection as well). To wrap up: there is no easy or known solution to this.

For OpenLineage connector >= 1.39 we could use a regular method call (listener.getConf()) to obtain the conf. However, the code also has to work with older connector versions, where the getConf method does not exist. That's why I think reflection is necessary.
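The version-compatibility argument above can be illustrated with a small, self-contained sketch. It is not the code from this PR: NewListener and OldListener are hypothetical stand-ins for connector versions with and without getConf, and the type check is an addition that guards against a getConf returning something other than the expected conf type.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Optional;

/** Sketch: read a listener's conf reflectively so that older listener versions still work. */
final class ReflectiveConfLookup {

  /** Hypothetical stand-in for a newer listener that exposes getConf(). */
  public static final class NewListener {
    public Object getConf() {
      return "conf-from-listener";
    }
  }

  /** Hypothetical stand-in for an older listener without getConf(). */
  public static final class OldListener {}

  /**
   * Calls a public no-arg getConf() if the listener has one. Returns Optional.empty() when the
   * method is missing, inaccessible, throws, or returns null or a value of another type.
   */
  static <T> Optional<T> readConf(Object listener, Class<T> confType) {
    try {
      Method getConf = listener.getClass().getMethod("getConf");
      Object conf = getConf.invoke(listener);
      // isInstance(null) is false, so a null conf also ends up as Optional.empty().
      return confType.isInstance(conf) ? Optional.of(confType.cast(conf)) : Optional.empty();
    } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) {
      // Expected for older versions: the caller continues without the listener's conf.
      return Optional.empty();
    }
  }

  public static void main(String[] args) {
    System.out.println(readConf(new NewListener(), String.class).isPresent());
    System.out.println(readConf(new OldListener(), String.class).isPresent());
  }
}
```

A direct call to listener.getConf() would fail to link against an older connector, whereas the reflective lookup degrades to an empty result, which is the trade-off the comment argues for.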

Contributor

> There may be different reasons behind that: perhaps OpenLineage listener gets created before datadog's instrumentation loads or databricks initiates listeners other way than calling regular constructors (they can use reflection as well). To wrap up: there ain't no easy nor known solution to this.

Maybe we can test that by throwing an exception in the constructor, to see the stack trace?

Contributor Author

Caused by: java.lang.RuntimeException: some exception
	at io.openlineage.spark.agent.OpenLineageSparkListener.<init>(OpenLineageSparkListener.java:99)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at java.base/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:500)
	at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:481)
	at org.apache.spark.util.Utils$.$anonfun$loadExtensions$1(Utils.scala:3610)
	at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:293)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at scala.collection.TraversableLike.flatMap(TraversableLike.scala:293)
	at scala.collection.TraversableLike.flatMap$(TraversableLike.scala:290)
	at scala.collection.AbstractTraversable.flatMap(Traversable.scala:108)
	at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:3602)
	at org.apache.spark.SparkContext.$anonfun$setupAndStartListenerBus$1(SparkContext.scala:3683)
	at org.apache.spark.SparkContext.$anonfun$setupAndStartListenerBus$1$adapted(SparkContext.scala:3680)
	at scala.Option.foreach(Option.scala:407)
	at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:3680)
	... 42 more
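The trace shows the listener being created through java.lang.reflect.Constructor.newInstance from Spark's Utils.loadExtensions. The diagnostic itself can be sketched in isolation as follows; ProbedListener and the helper names are hypothetical, and the sketch only demonstrates the technique of throwing from a constructor and inspecting the frames, not the Databricks behaviour.

```java
import java.lang.reflect.InvocationTargetException;

/** Sketch: throw from a constructor to learn how the class gets instantiated. */
final class ConstructorProbe {

  /** Hypothetical listener whose constructor fails on purpose. */
  public static final class ProbedListener {
    public ProbedListener() {
      throw new RuntimeException("some exception");
    }
  }

  /** Instantiates the class reflectively and returns the exception thrown by its constructor. */
  static Throwable probe(Class<?> type) {
    try {
      type.getDeclaredConstructor().newInstance();
    } catch (InvocationTargetException e) {
      // Reflection wraps the constructor's exception; the cause carries the full call chain.
      return e.getCause();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("could not invoke constructor", e);
    }
    throw new IllegalStateException("constructor did not throw");
  }

  /** True if any frame of the stack trace belongs to the given class name. */
  static boolean calledVia(Throwable t, String className) {
    for (StackTraceElement frame : t.getStackTrace()) {
      if (frame.getClassName().equals(className)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    Throwable cause = probe(ProbedListener.class);
    cause.printStackTrace();
    System.out.println(calledVia(cause, "java.lang.reflect.Constructor"));
  }
}
```

Checking the frames for java.lang.reflect.Constructor is one way to confirm reflective instantiation programmatically instead of reading the trace by eye.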

@aboitreaud aboitreaud force-pushed the pawel.leszczynski/databricks-openlineage-support branch from d6565eb to 85ed98d October 31, 2025 17:18
@pawel-big-lebowski pawel-big-lebowski force-pushed the pawel.leszczynski/databricks-openlineage-support branch from 85ed98d to c917bfc November 13, 2025 09:34
@pawel-big-lebowski pawel-big-lebowski force-pushed the pawel.leszczynski/databricks-openlineage-support branch from c917bfc to 7f18c4e November 13, 2025 12:12
@pawel-big-lebowski pawel-big-lebowski merged commit c99a7da into master Nov 13, 2025
703 of 704 checks passed
@pawel-big-lebowski pawel-big-lebowski deleted the pawel.leszczynski/databricks-openlineage-support branch November 13, 2025 13:10
@github-actions github-actions bot added this to the 1.56.0 milestone Nov 13, 2025

Labels

inst: apache spark (Apache Spark instrumentation), type: enhancement (Enhancements and improvements)

3 participants