
Add sensor for AWS Batch (#19850) #19885

Merged — 2 commits merged into apache:main on Dec 5, 2021

Conversation

@yeshbash (Contributor) commented Nov 30, 2021

Adds a sensor implementation to poll the status of an
AWS Batch job. The sensor enables DAGs to wait for the
Batch job to reach a terminal state before proceeding to the
downstream tasks.

closes: #19850



boring-cyborg bot added the area:providers and provider:amazon-aws labels Nov 30, 2021

boring-cyborg bot commented Nov 30, 2021

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst).
Here are some useful points:

  • Pay attention to the quality of your code (flake8, mypy and type annotations). Our pre-commits will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide, and consider adding an example DAG that shows how users should use it.
  • Consider using the Breeze environment for testing locally; it's a heavy Docker image, but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or the final approval from Committers.
  • Please follow the ASF Code of Conduct for all communication, including (but not limited to) comments on Pull Requests, the mailing list and Slack.
  • Be sure to read the Airflow coding style.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts, contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://s.apache.org/airflow-slack

Comment on lines 36 to 44:

    INTERMEDIATE_STATES = (
        'SUBMITTED',
        'PENDING',
        'RUNNABLE',
        'STARTING',
        'RUNNING',
    )
    FAILURE_STATES = ('FAILED',)
    SUCCESS_STATES = ('SUCCEEDED',)
Contributor:

I think it would be better to manage the statuses on the hook level.
The hook itself also has a function that requires the same definitions:

    if job_status == "SUCCEEDED":
        self.log.info("AWS batch job (%s) succeeded: %s", job_id, job)
        return True
    if job_status == "FAILED":
        raise AirflowException(f"AWS Batch job ({job_id}) failed: {job}")
    if job_status in ["SUBMITTED", "PENDING", "RUNNABLE", "STARTING", "RUNNING"]:

So we should have one place to define it for both.

@yeshbash (Contributor, Author) replied Nov 30, 2021:

Good point - I will define these in AwsBatchClientHook and reference it.

While making the change, I noticed that the hook also uses two additional combinations of state values for its own methods

  1. poll_for_job_running: checks for ["RUNNING", "SUCCEEDED", "FAILED"].
  2. poll_for_job_complete: checks for ["SUCCEEDED", "FAILED"].

I'm planning to create the following state definitions to cater to all cases. Let me know if you think otherwise and would prefer to leave the above two methods as they are.

    FAILURE_STATE = 'FAILED'
    SUCCESS_STATE = 'SUCCEEDED'
    RUNNING_STATE = 'RUNNING'
    INTERMEDIATE_STATES = (
        'SUBMITTED',
        'PENDING',
        'RUNNABLE',
        'STARTING',
        RUNNING_STATE,
    )
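The proposed hook-level definitions can be exercised with a standalone sketch of the sensor's poke logic. This is an illustration, not the provider's actual code: AirflowException is swapped for a plain RuntimeError so the snippet runs without Airflow installed.

```python
# Standalone sketch of a poke() reusing hook-level state definitions.
# Hypothetical: RuntimeError stands in for airflow.AirflowException.

FAILURE_STATE = 'FAILED'
SUCCESS_STATE = 'SUCCEEDED'
RUNNING_STATE = 'RUNNING'
INTERMEDIATE_STATES = (
    'SUBMITTED',
    'PENDING',
    'RUNNABLE',
    'STARTING',
    RUNNING_STATE,
)


def poke(job_status: str) -> bool:
    """Return True on success, False while in flight, raise on failure."""
    if job_status == SUCCESS_STATE:
        return True
    if job_status == FAILURE_STATE:
        raise RuntimeError(f'Batch sensor failed. Batch Job Status: {job_status}')
    if job_status in INTERMEDIATE_STATES:
        return False
    raise RuntimeError(f'Unknown Batch Job Status: {job_status}')


print(poke('SUCCEEDED'))  # True
print(poke('RUNNING'))    # False
```

Defining the single-value states separately and building INTERMEDIATE_STATES from them keeps each string defined exactly once, which is the deduplication the review comment asked for.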

Comment on lines 68 to 62:

    if state in self.FAILURE_STATES:
        raise AirflowException(f'Batch sensor failed. Batch Job Status: {state}')

    if state in self.INTERMEDIATE_STATES:
        return False
Contributor:

I'm not overly familiar with Batch yet. Is there a use case for setting a target state? Rather than just returning success/working/failed, is it desirable to have the option to check if it is currently RUNNING, for example?

@yeshbash (Contributor, Author) replied:

This sensor's objective is to poke until the job reaches a logical end - success/failure. In this context, I can't think of a use-case where matching against any other target state can be useful. There could be other places where we want to check against a target state and the AwsBatchClientHook already has a method for this.

    def poll_job_status(self, job_id: str, match_status: List[str]) -> bool:

Let me know if I misunderstood your comment.
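As a rough illustration of the target-state matching that poll_job_status provides, here is a standalone mock. The AWS API call is stubbed with an in-memory dict and the job IDs are invented; only the matching logic mirrors the pattern discussed.

```python
from typing import List

# Invented in-memory job table standing in for the AWS Batch
# describe-jobs call; real code would go through the hook's boto3 client.
_FAKE_JOBS = {'job-1': 'RUNNING', 'job-2': 'SUCCEEDED'}


def poll_job_status(job_id: str, match_status: List[str]) -> bool:
    """Return True if the job's current status is one of match_status."""
    status = _FAKE_JOBS.get(job_id)
    return status in match_status


print(poll_job_status('job-1', ['RUNNING', 'SUCCEEDED', 'FAILED']))  # True
print(poll_job_status('job-2', ['FAILED']))  # False
```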

@yeshbash (Contributor, Author) added:

I was looking at emr_sensor and that helped me understand your question better. Are you asking whether the terminal states should be configurable, so that the sensor can return true/false based on them?

I don't know a definitive answer, but this is my thought process:

  • I could see it being useful for DAGs that follow a fire-and-forget pattern to launch the job.
  • In fire-and-forget scenarios, would there be a need to use a Sensor? I don't think so.
  • In the case of EMR, a RUNNING cluster could be a logical terminal state for long-running clusters. By contrast, Batch is for ephemeral tasks, and the same might not make sense.
  • I see AwsBatchOperator providing this option through the waiter argument.

With this, I'm inclined to say that a configurable terminal state might not be useful here. Let me know your thoughts.

Contributor replied:

I appreciate the detailed reply; that makes sense.

As an example of what I was thinking: the EKS Nodegroup sensor can be used to wait for the nodegroup to go "creating" so it can send the command to start spinning up a second, or wait to go "active" and trigger a task to run on it, or wait to go "nonexistent" so the cluster it is attached to could be deleted. I was wondering if Batch had a usecase for adding similar logic, but it sounds like that may not be the case.

Carry on, and sorry for the distraction. 😄
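For contrast, the configurable-target-state pattern discussed above (as in the EKS sensors) could be sketched roughly like this. This is hypothetical and not what the PR implements; the function and constant names are invented for illustration.

```python
from typing import Tuple

# Hypothetical target-state-configurable check, the alternative
# discussed (and ultimately not adopted) in this thread.
TERMINAL_STATES: Tuple[str, ...] = ('SUCCEEDED', 'FAILED')


def poke_for_target(job_status: str,
                    target_states: Tuple[str, ...] = TERMINAL_STATES) -> bool:
    """Return True once the job reaches any of target_states."""
    return job_status in target_states


print(poke_for_target('RUNNING'))                # False (not terminal)
print(poke_for_target('RUNNING', ('RUNNING',)))  # True  (custom target)
```

A caller could wait for RUNNING to trigger a dependent task, or for the default terminal states to gate downstream work, which is the flexibility the EKS Nodegroup sensor offers.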

Yeshwanth Balachander added 2 commits November 30, 2021 14:39:

  1. Adds a sensor implementation to poll the status of an AWS Batch job. The sensor will enable DAGs to wait for the Batch job to reach a terminal state before proceeding to the downstream tasks.
  2. Different AWS Batch job statuses are used in both AwsBatchClientHook and BatchSensor. This change defines the status strings inside AwsBatchClientHook so that they can be reused across other classes.
@eladkal (Contributor) left a review comment:

LGTM.
@yeshbash, would you mind also adding an example DAG and docs?
You can use the GCS docs as a reference.

github-actions bot added the okay to merge label (it's OK to merge this PR as it does not require more tests) Dec 5, 2021

github-actions bot commented Dec 5, 2021

The PR is likely OK to be merged with just a subset of tests for the default Python and database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. You should then rebase onto the latest main, or amend the last commit of the PR, and push it with --force-with-lease.

@potiuk potiuk merged commit af28b41 into apache:main Dec 5, 2021
boring-cyborg bot commented Dec 5, 2021

Awesome work, congrats on your first merged pull request!

Labels: area:providers, okay to merge, provider:amazon-aws

Projects: None yet

Development: Successfully merging this pull request may close the issue "AWS Batch Job Sensor".

4 participants