DATAFLINT-5041: dataflint-spark4-databricks shaded artifact for DBR 17.3+ (#73)
Merged
Databricks Runtime 17.3 ships javax.servlet instead of jakarta.servlet, crashing the standard Spark 4 plugin at startup with NoClassDefFoundError on jakarta/servlet/Servlet (issue #47). Add a parallel SBT module pluginspark4databricks that source-shares with pluginspark4 but applies ShadeRule.rename("jakarta.servlet.**" -> "javax.servlet.@1") at assembly time, producing io.dataflint:dataflint-spark4-databricks_2.13. A Spark4DatabricksPageFactory subclass inverts the Databricks UI gate so the new jar enables the UI only on DBR (and silently degrades to listeners-only if accidentally installed on stock Spark 4); the original Spark4PageFactory is unchanged. Drop the Maven-Central verify step in cd.yml — it only checked spark_2.12 and didn't work for snapshots. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
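A minimal sketch of how such a source-sharing shaded module can be wired up with sbt-assembly. Only the `ShadeRule.rename` call is taken from the PR; the module name, directory layout, and other settings are illustrative, not the PR's exact build definition:

```scala
// build.sbt — hypothetical sketch of a source-sharing shaded sibling module.
lazy val pluginspark4databricks = (project in file("pluginspark4databricks"))
  .settings(
    name := "dataflint-spark4-databricks",
    // Reuse pluginspark4's sources instead of duplicating them:
    Compile / unmanagedSourceDirectories +=
      (ThisBuild / baseDirectory).value / "pluginspark4" / "src" / "main" / "scala",
    // Rewrite jakarta.servlet references to javax.servlet at assembly time,
    // matching what DBR 17.3 actually ships:
    assembly / assemblyShadeRules := Seq(
      ShadeRule.rename("jakarta.servlet.**" -> "javax.servlet.@1").inAll
    )
  )
```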
…ize metrics Databricks Runtime 17.x rewrites SQLMetrics with explicit overloads instead of Scala default args, so the bytecode-level helpers createTimingMetric$default$3 and createSizeMetric$default$3 don't exist on DBR. Calling SQLMetrics.createTimingMetric(sc, name) compiles to a $default$3 fetch + 3-arg call — the helper fetch NoSuchMethodErrors before the metric is ever created, the catch falls through to step 3 (SQLMetrics.createMetric, a SUM metric), and the TimedExec "duration" surfaces in the Spark UI as a bare number with no unit (e.g. "1058") instead of "5s (1s, 2s, 3s)". The DataFlint React UI then crashes parsing the bare number with "Unsupported time unit: 58". Pass -1L explicitly so the bytecode emits a direct 3-arg invokevirtual, matching the runtime overload that exists on both stock Spark 4 and DBR 17.x. -1L is the same value stock Spark uses as the default — semantics unchanged. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
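A minimal sketch of the change, assuming Spark 4's 3-parameter `createTimingMetric`/`createSizeMetric` overloads described above (the helper object name is illustrative, not necessarily DataFlint's):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}

object MetricsUtilsSketch { // illustrative name
  // Before: SQLMetrics.createTimingMetric(sc, name)
  //   -> bytecode fetches createTimingMetric$default$3() first, which
  //      NoSuchMethodErrors on DBR 17.x (no Scala default-arg helpers there).
  // After: pass stock Spark's own default value explicitly:
  def createTimingMetric(sc: SparkContext, name: String): SQLMetric =
    SQLMetrics.createTimingMetric(sc, name, -1L) // direct 3-arg invokevirtual

  def createSizeMetric(sc: SparkContext, name: String): SQLMetric =
    SQLMetrics.createSizeMetric(sc, name, -1L) // same change for size metrics
}
```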
timeStringToMilliseconds slices the last 2 chars of a metric value as the
unit and throws "Unsupported time unit: ${unit}" if it isn't ms/s/m/h.
Some Spark forks (Databricks) return duration-named metrics as bare numbers
with no unit suffix — the slice then picks up digit pairs (e.g. "58") and
the throw bubbles up through the React render, blanking the SQL plan page.
Return undefined instead. Every caller (SqlReducer, GraphDurationAttribution)
already handles undefined with `?? 0`, so missing duration data degrades
gracefully and the rest of the page keeps rendering. Logs a console warning
so the malformed value is still discoverable in DevTools.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
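The fix itself lives in the React/TypeScript UI; as an illustration of the same lenient-parse semantics in the repo's primary language, a Scala analog might look like the following (function name and unit table are illustrative, not DataFlint's actual code):

```scala
// Scala analog of the lenient UI-side parser; illustrative only.
object DurationParse {
  // Ordered so "ms" is tried before "s" (both are valid suffixes of "500ms").
  private val unitToMs = Seq("ms" -> 1L, "s" -> 1000L, "m" -> 60000L, "h" -> 3600000L)

  // Returns None for bare numbers or malformed values instead of throwing,
  // so callers can degrade with .getOrElse(0L) — the `?? 0` pattern in TS.
  def timeStringToMilliseconds(value: String): Option[Long] = {
    val t = value.trim
    unitToMs
      .collectFirst { case (unit, factor) if t.endsWith(unit) =>
        (t.dropRight(unit.length).trim, factor)
      }
      .flatMap { case (num, factor) =>
        scala.util.Try((num.toDouble * factor).toLong).toOption
      }
      .orElse {
        Console.err.println(s"Malformed duration metric value: '$value'")
        None
      }
  }
}
```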
Scaladoc couldn't resolve [[Spark4DatabricksPageFactory]] when generating docs in CD because of how the new module compiles via source-share. Use backticks (markdown code) instead of doc links — same readability, no linker pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
menishmueli
approved these changes
May 7, 2026
Fixes #47
Two distinct issues that surface together when running DataFlint OSS on Databricks Runtime 17.3+:
1. `NoClassDefFoundError: jakarta/servlet/Servlet`. DBR 17.3 is Spark 4–based but ships `javax.servlet`, not `jakarta.servlet`. Fixed by publishing a separate shaded artifact.
2. `TimedExec` duration metrics lose their timing type (no `5s (1s, 2s, 3s)` formatting), and the DataFlint React UI crashes with `Unsupported time unit: 58`. Fixed by passing the metric `initValue` explicitly so the bytecode skips a Scala-generated default-arg helper that DBR doesn't have.

## 1. Separate shaded artifact for Databricks
```mermaid
flowchart LR
    Stock["dataflint-spark4_2.13<br/>(jakarta.servlet)"]:::stock
    Spark4["Stock Spark 4.x<br/>(jakarta.servlet)"]:::ok
    DBR["Databricks Runtime 17.3<br/>(javax.servlet)"]:::bad
    Stock -->|"works ✓"| Spark4
    Stock -->|"NoClassDefFoundError ✗"| DBR
    classDef stock fill:#e8eef5,stroke:#36c
    classDef ok fill:#dff7e0,stroke:#2a7
    classDef bad fill:#fde2e2,stroke:#c33
```

Build a second artifact `io.dataflint:dataflint-spark4-databricks_2.13` from the same sources as `pluginspark4` with sbt-assembly's `ShadeRule.rename("jakarta.servlet.**" -> "javax.servlet.@1").inAll`. Same plugin class (`io.dataflint.spark.SparkDataflintPlugin`); only the jar coordinate differs.

```mermaid
flowchart LR
    subgraph Sources [shared sources]
        S1[plugin/]
        S2[pluginspark4/]
    end
    subgraph Modules [sbt modules]
        M1[pluginspark4]
        M2["pluginspark4databricks<br/>(shade rule)"]
    end
    subgraph Artifacts [published jars]
        A1["dataflint-spark4_2.13<br/>(jakarta.servlet)"]:::stock
        A2["dataflint-spark4-databricks_2.13<br/>(javax.servlet — shaded)"]:::dbr
    end
    subgraph Runtimes [target runtimes]
        R1[Stock Spark 4.x]:::ok
        R2[Databricks Runtime 17.3+]:::ok
    end
    Sources --> M1
    Sources --> M2
    M1 -->|sbt assembly| A1
    M2 -->|"sbt assembly + ShadeRule"| A2
    A1 --> R1
    A2 --> R2
    classDef stock fill:#e8eef5,stroke:#36c
    classDef dbr fill:#fff3cd,stroke:#a80
    classDef ok fill:#dff7e0,stroke:#2a7
```

### Class hierarchy + UI gate

```mermaid
classDiagram
    class DataflintPageFactory {
        <<abstract>>
        +isUISupported(ui) Boolean = true
    }
    class Spark4PageFactory {
        +isUISupported(ui) = !isDatabricks
    }
    class Spark4DatabricksPageFactory {
        +isUISupported(ui) = isDatabricks
    }
    DataflintPageFactory <|-- Spark4PageFactory
    Spark4PageFactory <|-- Spark4DatabricksPageFactory
    note for Spark4PageFactory "in pluginspark4/<br/>UI on stock Spark 4 only"
    note for Spark4DatabricksPageFactory "in pluginspark4databricks/<br/>UI on Databricks only"
```

Symmetric checks → each jar serves the UI exclusively on its intended runtime; a misinstall silently degrades to "listeners run, no UI" instead of crashing.
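A sketch of the symmetric gates from the class diagram above. The `isDatabricks` detection shown here is a placeholder, not DataFlint's actual check, and the parameter type is simplified:

```scala
// Illustrative sketch of the inverted UI gate; names follow the diagram.
abstract class DataflintPageFactory {
  protected def isDatabricks: Boolean =
    sys.env.contains("DATABRICKS_RUNTIME_VERSION") // placeholder detection
  def isUISupported(ui: AnyRef): Boolean = true
}

// Shipped in dataflint-spark4_2.13: UI everywhere except Databricks.
class Spark4PageFactory extends DataflintPageFactory {
  override def isUISupported(ui: AnyRef): Boolean = !isDatabricks
}

// Shipped in dataflint-spark4-databricks_2.13: the same gate, inverted.
// On a misinstall this returns false, so listeners run but no UI attaches.
class Spark4DatabricksPageFactory extends Spark4PageFactory {
  override def isUISupported(ui: AnyRef): Boolean = isDatabricks
}
```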
- `dataflint-spark4_2.13` — stock Spark 4.x
- `dataflint-spark4-databricks_2.13` — Databricks Runtime 17.3+

## 2. `SQLMetrics` default-arg bytecode bug on DBR

The `TimedExec` "duration" metric was rendering as a bare number on DBR's Spark UI (no `total (min, med, max)` format). The DataFlint React UI then sliced the last 2 chars as a unit and threw `Unsupported time unit: 58`.

### Root cause
Stock Spark 4 source:

The Scala compiler emits `createTimingMetric$default$3()` as a separate static helper for the default value `-1L`. Calling `SQLMetrics.createTimingMetric(sc, name)` therefore compiles to a `$default$3` fetch followed by a 3-arg call.

Databricks rewrote `SQLMetrics` with explicit overloads instead of default args, so on DBR there's no `$default$3` helper. The `invokevirtual` on it throws `NoSuchMethodError`, and our `catch` falls through:

```mermaid
flowchart TD
    Call["SQLMetrics.createTimingMetric(sc, name)"]
    D3["createTimingMetric$default$3()"]
    Created["TIMING SQLMetric<br/>'5s (1s, 2s, 3s)'"]:::ok
    Step2["new SQLMetric('timing', 0L)"]
    SQL2arg["2-arg SQLMetric ctor"]
    Step3["SQLMetrics.createMetric(sc, name)<br/>SUM metric '1058' ❌"]:::bad
    UI["React: timeStringToMilliseconds<br/>throws 'Unsupported time unit: 58'"]:::bad
    Call --> D3
    D3 -- exists, stock Spark 4 --> Created
    D3 -- "NoSuchMethodError on DBR" --> Step2
    Step2 --> SQL2arg
    SQL2arg -- exists, stock Spark --> Created
    SQL2arg -- "NoSuchMethodError on DBR<br/>(only 3-arg ctor)" --> Step3
    Step3 --> UI
    classDef ok fill:#dff7e0,stroke:#2a7
    classDef bad fill:#fde2e2,stroke:#c33
```

Verified the assumed `NoSuchMethodError`s on a real DBR 17.3 cluster via the REST API:

### Fix
Pass `initValue` explicitly so the bytecode emits a direct 3-arg `invokevirtual` with no `$default$3` fetch:

(Same change for `createSizeMetric`.) `-1L` matches stock Spark's default — runtime semantics are unchanged on stock + EMR + Spark 3.5; the only effect is the bytecode now skips the `$default$3` helper and lands on a 3-arg overload that exists on every target runtime. Verified by disassembling the rebuilt jar — `$default$3` is gone.

## Files changed
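The catch-and-fall-through chain from the root-cause section, together with the fix, can be sketched as follows. This is a hypothetical helper; DataFlint's actual `MetricsUtils` may differ in names and metric registration, and the 2-arg `SQLMetric` constructor call is shown only to mirror the diagram:

```scala
import org.apache.spark.SparkContext
import org.apache.spark.sql.execution.metric.{SQLMetric, SQLMetrics}

object TimingMetrics { // illustrative helper
  def create(sc: SparkContext, name: String): SQLMetric =
    try {
      // Explicit -1L => direct 3-arg invokevirtual, no $default$3 fetch.
      // Resolves on stock Spark 4 and on DBR 17.x's rewritten overloads.
      SQLMetrics.createTimingMetric(sc, name, -1L)
    } catch {
      case _: NoSuchMethodError =>
        try new SQLMetric("timing", 0L) // 2-arg ctor: absent on DBR
        catch {
          case _: NoSuchMethodError =>
            SQLMetrics.createMetric(sc, name) // last resort: SUM metric
        }
    }
}
```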
- `spark-plugin/plugin/.../MetricsUtils.scala`: pass `-1L` to `createTimingMetric`/`createSizeMetric` (fix for issue #2)
- `spark-plugin/build.sbt`: `pluginspark4databricks` SBT module: shade rule + source-share + loader-exclude filter (fix for issue #1)
- `spark-plugin/pluginspark4databricks/.../api/Spark4DatabricksPageFactory.scala`: extends `Spark4PageFactory`; inverts `isUISupported`
- `spark-plugin/pluginspark4databricks/.../DataflintSparkUILoader.scala`
- `spark-plugin/clean-and-setup.sh`
- `README.md`
- `.github/workflows/cd.yml`
- `pluginspark4` source files: untouched (other than the shared `MetricsUtils` fix in `plugin/`)

## Test plan
### Done locally

- `sbt "plugin/compile; pluginspark3/compile; pluginspark4/compile; pluginspark4databricks/compile"`: all `[success]`
- `sbt pluginspark4databricks/assembly`: jar built
- `renderJson` in the new jar references `javax.servlet.http.HttpServletRequest` (shade applied)
- `pageFactory` field is typed `Spark4DatabricksPageFactory`
- `/dataflint/applicationinfo/json/` returns HTTP 200
- `MetricsUtils` bytecode no longer references `createTimingMetric$default$3`/`createSizeMetric$default$3`
- on DBR 17.3, `createTimingMetric$default$3` does NOT exist, but the 3-arg overload does, so the explicit `-1L` will route correctly

### Pending (your side)
- duration metrics render with `5s (1s, 2s, 3s)` formatting, no React `Unsupported time unit` error
- `spark_2.12`, `dataflint-spark4_2.13`, `dataflint-spark4-databricks_2.13` all publish

## Out of scope
`MetricsUtils` improvement).

🤖 Generated with Claude Code