[Bug] openlineage-spark crashes spark driver on Databricks #2499

Closed
efthymiosh opened this issue Mar 8, 2024 · 4 comments

@efthymiosh

We are in the process of configuring the io.openlineage.spark.agent.OpenLineageSparkListener for our Databricks compute.

Running workloads crashes the Spark driver, with the following stack trace logged:

24/03/07 14:14:05 ERROR Utils: throw uncaught fatal error in thread spark-listener-group-shared
java.lang.NoClassDefFoundError: com/databricks/sdk/scala/dbutils/DbfsUtils
	at io.openlineage.spark.agent.facets.builder.DatabricksEnvironmentFacetBuilder.getDbfsUtils(DatabricksEnvironmentFacetBuilder.java:124)
	at io.openlineage.spark.agent.facets.builder.DatabricksEnvironmentFacetBuilder.getDatabricksEnvironmentalAttributes(DatabricksEnvironmentFacetBuilder.java:92)
	at io.openlineage.spark.agent.facets.builder.DatabricksEnvironmentFacetBuilder.build(DatabricksEnvironmentFacetBuilder.java:58)
	at io.openlineage.spark.agent.facets.builder.DatabricksEnvironmentFacetBuilder.build(DatabricksEnvironmentFacetBuilder.java:32)
	at io.openlineage.spark.api.CustomFacetBuilder.accept(CustomFacetBuilder.java:40)
	at io.openlineage.spark.agent.lifecycle.OpenLineageRunEventBuilder.lambda$null$27(OpenLineageRunEventBuilder.java:508)
	at java.lang.Iterable.forEach(Iterable.java:75)
	at io.openlineage.spark.agent.lifecycle.OpenLineageRunEventBuilder.lambda$buildRunFacets$28(OpenLineageRunEventBuilder.java:508)
	at java.util.ArrayList.forEach(ArrayList.java:1259)
	at io.openlineage.spark.agent.lifecycle.OpenLineageRunEventBuilder.buildRunFacets(OpenLineageRunEventBuilder.java:508)
	at io.openlineage.spark.agent.lifecycle.OpenLineageRunEventBuilder.populateRun(OpenLineageRunEventBuilder.java:328)
	at io.openlineage.spark.agent.lifecycle.OpenLineageRunEventBuilder.buildRun(OpenLineageRunEventBuilder.java:304)
	at io.openlineage.spark.agent.lifecycle.OpenLineageRunEventBuilder.buildRun(OpenLineageRunEventBuilder.java:265)
	at io.openlineage.spark.agent.lifecycle.SparkSQLExecutionContext.start(SparkSQLExecutionContext.java:221)
	at io.openlineage.spark.agent.OpenLineageSparkListener.lambda$null$13(OpenLineageSparkListener.java:178)
	at io.openlineage.client.circuitBreaker.NoOpCircuitBreaker.run(NoOpCircuitBreaker.java:27)
	at io.openlineage.spark.agent.OpenLineageSparkListener.lambda$onJobStart$14(OpenLineageSparkListener.java:176)
	at java.util.Optional.ifPresent(Optional.java:159)
	at io.openlineage.spark.agent.OpenLineageSparkListener.onJobStart(OpenLineageSparkListener.java:172)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent(SparkListenerBus.scala:37)
	at org.apache.spark.scheduler.SparkListenerBus.doPostEvent$(SparkListenerBus.scala:28)
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:42)
	at org.apache.spark.scheduler.AsyncEventQueue.doPostEvent(AsyncEventQueue.scala:42)
	at org.apache.spark.util.ListenerBus.postToAll(ListenerBus.scala:118)
	at org.apache.spark.util.ListenerBus.postToAll$(ListenerBus.scala:102)
	at org.apache.spark.scheduler.AsyncEventQueue.super$postToAll(AsyncEventQueue.scala:114)
	at org.apache.spark.scheduler.AsyncEventQueue.$anonfun$dispatch$1(AsyncEventQueue.scala:114)
	at scala.runtime.java8.JFunction0$mcJ$sp.apply(JFunction0$mcJ$sp.java:23)
	at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62)
	at org.apache.spark.scheduler.AsyncEventQueue.org$apache$spark$scheduler$AsyncEventQueue$$dispatch(AsyncEventQueue.scala:109)
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.$anonfun$run$1(AsyncEventQueue.scala:105)
	at org.apache.spark.util.Utils$.tryOrStopSparkContext(Utils.scala:1493)
	at org.apache.spark.scheduler.AsyncEventQueue$$anon$2.run(AsyncEventQueue.scala:105)
Caused by: java.lang.ClassNotFoundException: com.databricks.sdk.scala.dbutils.DbfsUtils
	at java.net.URLClassLoader.findClass(URLClassLoader.java:387)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:419)
	at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:352)
	... 33 more

We have reproduced this with openlineage-spark 1.9.1 on Databricks Runtime 14.3 and 14.2.
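
One way to narrow down a `NoClassDefFoundError` like the one above is to check whether any jar on the driver actually ships the missing class. The following is only a minimal sketch: it assumes `unzip` is installed, `find_class_in_jars` is a hypothetical helper name, and the `/databricks/jars` directory in the usage comment is an assumption that this thread does not confirm.

```shell
#!/bin/bash
# Hypothetical helper: print every jar in a directory that contains a class.
# Usage: find_class_in_jars <jar-dir> <fully.qualified.ClassName>
# Returns 0 if at least one jar contains the class, 1 otherwise.
find_class_in_jars() {
  local jar_dir="$1"
  # Convert the dotted class name to its path inside a jar.
  local class_path="${2//.//}.class"
  local found=1
  local jar
  for jar in "$jar_dir"/*.jar; do
    # Skip the unexpanded glob when the directory has no jars.
    [[ -e "$jar" ]] || continue
    if unzip -l "$jar" 2>/dev/null | grep -qF "$class_path"; then
      echo "$jar"
      found=0
    fi
  done
  return "$found"
}

# Example (the directory is an assumption; adjust it to your runtime):
# find_class_in_jars /databricks/jars com.databricks.sdk.scala.dbutils.DbfsUtils
```

If the function prints nothing and returns non-zero, no jar in that directory provides the class, which would be consistent with the `ClassNotFoundException` at the bottom of the trace.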

We initialize the listener using the following init script:

#!/bin/bash
VERSION="1.9.1"
SCALA_VERSION="2.12"
wget -O /mnt/driver-daemon/jars/openlineage-spark_${SCALA_VERSION}-${VERSION}.jar https://repo1.maven.org/maven2/io/openlineage/openlineage-spark_${SCALA_VERSION}/${VERSION}/openlineage-spark_${SCALA_VERSION}-${VERSION}.jar
SPARK_DEFAULTS_FILE="/databricks/driver/conf/00-openlineage-defaults.conf"

if [[ $DB_IS_DRIVER = "TRUE" ]]; then
  cat > $SPARK_DEFAULTS_FILE <<- EOF
    [driver] {
      "spark.extraListeners"                                        = "com.databricks.backend.daemon.driver.DBCEventLoggingListener,io.openlineage.spark.agent.OpenLineageSparkListener"
      "spark.openlineage.version"                                   = "v1"
      "spark.openlineage.transport.type"                            = "http"
      "spark.openlineage.transport.url"                             = "https://some.url/"
      "spark.openlineage.dataset.removePath.pattern"                = "(\/[a-z]+[-a-zA-Z0-9]+)+(?<remove>.*)"
      "spark.openlineage.namespace"                                 = "some_namespace"
    }
EOF
fi

boring-cyborg bot commented Mar 8, 2024

Thanks for creating your first OpenLineage issue! Your feedback is valuable and improves the project. If you haven't already, please be sure to follow the issue template!

@algorithmy1
Contributor

algorithmy1 commented Mar 18, 2024

I don't think the Spark integration works anymore on any Databricks environment, not only on runtime 14.x.


The issue comes from this change:
89beaad#diff-5542f4a02d45d03e2fdc704d8bf3aab869f919d72a4785a077c52acf86c3aa6a

@efthymiosh changed the title from "[Bug] openlineage-spark crashes spark driver on Databricks Runtime 14.x" to "[Bug] openlineage-spark crashes spark driver on Databricks" on Mar 19, 2024
@pawel-big-lebowski
Contributor

@efthymiosh Hi, could you verify whether the issue is still present in the most recently released version, now that #2537 has been merged?

@efthymiosh
Author

> @efthymiosh Hi, could you verify if the issue is still present in recently released version after merging -> #2537

Running a job with 1.11.3 active, I can verify that the issue is no longer present and that events are being received on the transport URL. Thank you so much for your work on this!
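
For anyone applying the same fix, the change to the init script from the original report amounts to bumping `VERSION`. A minimal sketch, reusing the Maven Central URL pattern from that script; `openlineage_jar_url` is a hypothetical helper name introduced here for illustration:

```shell
#!/bin/bash
# Hypothetical helper: build the Maven Central URL for an openlineage-spark jar,
# following the URL pattern used in the init script from this issue.
# Usage: openlineage_jar_url <scala-version> <openlineage-version>
openlineage_jar_url() {
  local scala_version="$1"
  local version="$2"
  local artifact="openlineage-spark_${scala_version}"
  echo "https://repo1.maven.org/maven2/io/openlineage/${artifact}/${version}/${artifact}-${version}.jar"
}

# 1.11.3 is the version confirmed as working in this thread.
openlineage_jar_url "2.12" "1.11.3"
```

The printed URL can then be passed to the same `wget -O` call that the original init script uses.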
