Extract spark from airflow container #47
…airflow container
:-)
Weird thing. I tried to run the s3-external-spark example, but I get the following error in the SparkSubmitOperator:
Job 3: Subtask fetch_csv_from_s3_and_update_postgres airflow.exceptions.AirflowException: Cannot execute: ['spark-submit', '--master', 'spark://sparkmaster:7077', '--conf', 'spark.hadoop.fs.s3a.impl=org.apache.hadoop.fs.s3a.S3AFileSystem', '--conf', 'spark.hadoop.fs.s3a.access.key=bar', '--conf', 'spark.hadoop.fs.s3a.secret.key=foo', '--conf', 'spark.hadoop.fs.s3a.endpoint=s3server:4563', '--conf', 'spark.hadoop.fs.s3a.connection.ssl.enabled=false', '--conf', 'spark.hadoop.fs.s3a.path.style.access=true', '--conf', 'spark.hadoop.fs.s3.impl=org.apache.hadoop.fs.s3a.S3AFileSystem', '--name', 'airflow-spark', '--queue', 'root.default', '--deploy-mode', 'client', '/usr/local/airflow/dags/spark-s3-to-postgres/spark//s3topostgres.py', '-f', 's3://demo-s3-output/input/data/demo/spark/20190613/', '-t', 'demo']. Error code is: 127.
[2019-06-13 19:12:32,252] {logging_mixin.py:95} INFO - [2019-06-13 19:12:32,252] {jobs.py:186} DEBUG - [heartbeat]
[2019-06-13 19:12:32,253] {logging_mixin.py:95} INFO - [2019-06-
I logged into the airflow container, and the spark-submit executable is available on the PATH there. Any idea what it could be?
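Exit status 127 is the conventional shell code for "command not found", so one way to narrow this down is to check which executables actually resolve under the PATH that the Airflow task process sees, which may differ from the PATH of an interactive shell in the container. The sketch below is only a diagnostic aid under assumptions: the helper names `resolve_commands` and `shell_exit_code` are hypothetical, and the list `["spark-submit", "java"]` is a guess at what the launcher needs, not something confirmed by this thread.

```python
import os
import shutil
import subprocess


def resolve_commands(names, search_path=None):
    """Map each command name to the executable found on search_path, or None."""
    if search_path is None:
        # Default to the PATH of the current process (e.g. the Airflow worker).
        search_path = os.environ.get("PATH", "")
    return {name: shutil.which(name, path=search_path) for name in names}


def shell_exit_code(command, search_path):
    """Run command through the shell with only PATH set; return its exit status."""
    completed = subprocess.run(
        command,
        shell=True,
        env={"PATH": search_path},
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return completed.returncode


if __name__ == "__main__":
    # Assumed command names; adjust to whatever the launcher script calls.
    for name, location in resolve_commands(["spark-submit", "java"]).items():
        print(f"{name}: {location or 'NOT FOUND'}")
```

Running this from inside a task (for example via a PythonOperator) rather than from a `docker exec` shell would show the environment the operator really uses.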
docker/airflow-python/Dockerfile
Outdated
@@ -0,0 +1,90 @@
FROM python:3.6-slim as airflow-base
So this ignores the PYTHON_VERSION env var? Is it possible to use it in the tag?
Will check.
Works. Pushed final changes.
Looking great though!
Maybe create a tag of the current master and then delete the
No description provided.