DominoSparkOperator #80

Merged 7 commits on Nov 2, 2020

wseaton (Contributor) commented Oct 22, 2020

Add a new DominoSparkOperator that supports the v4 job start function signature, along with associated unit tests.

wseaton (Contributor Author) commented Oct 22, 2020

@abhijeet2096 For my personal development environment I've been using poetry to manage package dependencies (and to pull in test dependencies like pytest). I can also push those files if you're interested.

change the template field to be a proper tuple
Add RunFailedException as a possible throwable from the polling function.
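
The first commit above addresses a common Python pitfall: parentheses alone do not make a tuple, so a one-element `template_fields` written without a trailing comma is just a string. A minimal sketch of the difference follows. The field name `command` is an assumption for illustration; the value actually changed in the PR is not shown here.

```python
# Hypothetical field name; the actual value changed in the PR is not shown.
not_a_tuple = ("command")    # parentheses only: this is the str "command"
proper_tuple = ("command",)  # trailing comma: a one-element tuple


def field_names(template_fields):
    """Iterate the way a templating loop would, collecting each entry."""
    return [name for name in template_fields]


print(type(not_a_tuple).__name__)   # str
print(type(proper_tuple).__name__)  # tuple
print(field_names(not_a_tuple))     # ['c', 'o', 'm', 'm', 'a', 'n', 'd']
print(field_names(proper_tuple))    # ['command']
```

Because Airflow iterates over `template_fields` when rendering templates, the string form would be read one character at a time instead of as a single attribute name.
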
@@ -377,6 +377,8 @@ def job_start_blocking(self, poll_freq: int = 5, max_poll_time: int = 6000, **kw
        def get_job_status(job_identifier):
            status = self.job_status(job_identifier)
            self.log.info(f"Polling Job: {job_identifier} status is completed: {status['statuses']['isCompleted']}")
            if status['statuses']['executionStatus'] == "Failed":
Contributor Author

@abhijeet2096 I'm throwing an exception here if the status is failed so that it can be handled downstream by other functions, such as the Airflow operator. I noticed failures were happening silently, since we only check for job completion, and job_complete != job_success.
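
The point that job_complete != job_success can be sketched as a polling loop that checks for failure before it checks for completion. This is a minimal, self-contained illustration, not the library's implementation: `poll_until_done`, the `get_status` callable, and the local `RunFailedException` class are stand-ins assumed for the example. Only the `statuses`, `isCompleted`, and `executionStatus` keys, the "Failed" value, and the default poll settings come from the diff above.

```python
import time


class RunFailedException(Exception):
    """Local stand-in for the exception named in the commit message."""


def poll_until_done(get_status, job_id, poll_freq=5, max_poll_time=6000):
    """Poll get_status(job_id) until the job completes, fails, or times out.

    get_status is a hypothetical callable returning a dict shaped like
    {"statuses": {"isCompleted": bool, "executionStatus": str}}.
    """
    deadline = time.monotonic() + max_poll_time
    while True:
        statuses = get_status(job_id)["statuses"]
        # Check for failure first: a failed job may also report as completed.
        if statuses["executionStatus"] == "Failed":
            raise RunFailedException(f"Job {job_id} failed")
        if statuses["isCompleted"]:
            return statuses
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job {job_id} did not finish in {max_poll_time}s")
        time.sleep(poll_freq)


if __name__ == "__main__":
    # Fake status source for demonstration; a real caller would query the API.
    def always_failed(job_id):
        return {"statuses": {"isCompleted": True, "executionStatus": "Failed"}}

    try:
        poll_until_done(always_failed, "example-job", poll_freq=0)
    except RunFailedException as exc:
        print(f"Handled downstream: {exc}")
```

A caller such as an Airflow operator can then catch the exception (or let it propagate) so that the task is marked as failed instead of passing silently.
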

dag=dag,
task_id="foo",
project=TEST_PROJECT,
command="test_spark.py",
Contributor

Hey @wseaton, can we have Python/bash commands instead of the names of the test files (test_spark.py, test_spark_fail.sh)?

Contributor Author

Not here, since I can't provide a direct command via the v4 API; before, I could do this via isDirect. I'd also prefer not to have to use wrapper scripts 🙂

Contributor

Hey @wseaton can you add these sample test files (test_spark.py, test_spark_fail.sh) here, in this folder?

Contributor Author

@abhijeet2096 this has been done

abhijeet2096 (Contributor) left a comment

This looks good. Can you also update the Airflow section of the associated README?

wseaton (Contributor Author) commented Oct 28, 2020

@abhijeet2096 the README has been updated

abhijeet2096 (Contributor) left a comment

🚀

abhijeet2096 merged commit ef1fbd1 into dominodatalab:master on Nov 2, 2020