-
Notifications
You must be signed in to change notification settings - Fork 16.5k
Description
Apache Airflow Provider(s)
apache-spark
Versions of Apache Airflow Providers
apache-airflow-providers-apache-spark 5.0.0
pyspark 3.5.4
Apache Airflow version
2.10.3
Operating System
ubuntu server 22.04
Deployment
Other
Deployment details
NA
What happened
Hi
SparkSubmitOperator fails when interacting with a spark standalone server in deploy-mode=cluster.
In such condition, the _resolve_should_track_driver_status function in /usr/local/lib/python3.12/site-packages/airflow/providers/apache/spark/hooks/spark_submit.py returns True...
I have exactly the same pb as the one discussed here :
#21799
I did find a case opened for this issue.
What you think should happen instead
I have an error when _resolve_should_track_driver_status returns True.
If I modify the function to return False, testing the task passes but in such case the final spark job status is not confirmed by the spark driver.
How to reproduce
cat sparkoperator_test.py
import airflow
from datetime import datetime, timedelta
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator
default_args = {
'owner': 'xxxxxxxx',
'start_date': datetime(2025, 1, 31),
}
with airflow.DAG('sparkoperator_test',
default_args=default_args,
schedule_interval='@hourly',
description='test of sparkoperator in airflow by Omar and Regis',
) as dag:
spark_conf={
'spark.standalone.submit.waitAppCompletion':'true',
}
spark_compute_pi = SparkSubmitOperator(
task_id='spark-compute-pi',
conn_id='spark_qa1',
java_class='org.apache.spark.examples.SparkPi',
application='/opt/spark/examples/jars/spark-examples_2.12-3.5.3.jar',
name='compute_pi_from_airflow',
application_args=["30"],
total_executor_cores=4,
executor_memory='1g',
executor_cores=1,
verbose='true',
conf=spark_conf,
execution_timeout=timedelta(minutes=10),
retries=0,
)
spark_compute_pi
spark version : 3.5.3
1 single node (hosting the master + 1 worker node)
The error happens when testing the task with airflow tasks test ...
Anything else
No response
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project's Code of Conduct