What's the use case?
Spark Connect is a different Spark architecture that's now used by some vendor runtimes, like Databricks Serverless.
There are some breaking Spark changes with Spark Connect. For example, this code that accesses the sparkContext (`self.spark_session.sparkContext`) will not work.
You should be able to restructure the code so that it works with both traditional Spark & Spark Connect.
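Here's a rough sketch of what that restructuring could look like. It's not taken from the Dagster codebase: the helper names `is_spark_connect_session` and `get_spark_context` are hypothetical, and the check assumes that Spark Connect sessions are instances of a class defined under the `pyspark.sql.connect` package. The idea is to never touch `sparkContext` on a Spark Connect session and let callers handle a missing context.

```python
from typing import Any, Optional


def is_spark_connect_session(spark_session: Any) -> bool:
    """Best-effort check for a Spark Connect session (hypothetical helper).

    Assumption: Spark Connect session classes are defined in a module
    under `pyspark.sql.connect`. Checking the module name avoids importing
    pyspark's connect extras when they are not installed.
    """
    module = type(spark_session).__module__ or ""
    return module.startswith("pyspark.sql.connect")


def get_spark_context(spark_session: Any) -> Optional[Any]:
    """Return the SparkContext on traditional Spark, or None on Spark Connect.

    The sparkContext attribute is only accessed when the session is not a
    Spark Connect session, so this never triggers the unsupported call.
    """
    if is_spark_connect_session(spark_session):
        return None
    return spark_session.sparkContext
```

Code that currently reads `self.spark_session.sparkContext` directly could call something like `get_spark_context(self.spark_session)` instead and skip or replace the context-dependent logic when it gets `None` back.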
Ideas of implementation
You can install Spark Connect with `pip install "pyspark[connect]"` and see what's breaking. You can also just try out Dagster on a serverless Databricks cluster to see if it works as expected. It looks like Dagster accepts Spark RDDs, and those aren't supported by Spark Connect, so that might be a place where the code will break.
Additional information
No response
Message from the maintainers
Impacted by this issue? Give it a 👍! We factor engagement into prioritization.