Check whether the task finishes before deferring the task for KubernetesPodOperatorAsync #1104
Conversation
Force-pushed from e6c08cb to f867f2d.
Codecov Report. Patch coverage:

Additional details and impacted files:

```diff
@@           Coverage Diff           @@
##             main    #1104   +/- ##
=======================================
  Coverage   98.58%   98.58%
=======================================
  Files          90       90
  Lines        5377     5383      +6
=======================================
+ Hits         5301     5307      +6
  Misses         76       76
```

☔ View full report in Codecov by Sentry.
Force-pushed from f867f2d to a2dc9aa.
Review comment on astronomer/providers/cncf/kubernetes/operators/kubernetes_pod.py (outdated; resolved).
Force-pushed from a2dc9aa to 9bda075.
Review comment on astronomer/providers/cncf/kubernetes/operators/kubernetes_pod.py (outdated; resolved).
"status": "failed", | ||
"message": self.pod.status.message, | ||
} | ||
return self.trigger_reentry(context=context, event=event) |
Is the event payload that we send to trigger_reentry handled correctly? I am not sure it is consistent with the shape we return from the triggerer, or that it gets handled correctly.
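For reference, a minimal sketch of how a reentry handler could dispatch on this payload shape. The function name and body here are assumptions for illustration, not the provider's actual `trigger_reentry` implementation:

```python
from typing import Optional

from airflow.exceptions import AirflowException


def handle_reentry_event(event: Optional[dict]) -> None:
    # Illustrative only: dispatch on the {"status", "message"} payload shape
    # built above; the real trigger_reentry may differ.
    if event and event.get("status") == "failed":
        raise AirflowException(event.get("message") or "Pod failed")
    # Any other status would fall through to normal post-execution handling.
```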
Force-pushed from 9bda075 to 2503200.
```python
self.pod_request_obj = self.build_pod_request_obj(context)
self.pod: k8s.V1Pod = self.get_or_create_pod(self.pod_request_obj, context)
pod_status = self.pod.status.phase
if pod_status in PodPhase.terminal_states or not container_is_running(
    pod=self.pod, container_name=self.BASE_CONTAINER_NAME
```
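As a standalone illustration, the pre-deferral probe in the diff amounts to the helper below. The names mirror the diff; the packaging into a function and the `BASE_CONTAINER_NAME` value are my assumptions:

```python
from kubernetes.client import models as k8s

from airflow.providers.cncf.kubernetes.utils.pod_manager import (
    PodPhase,
    container_is_running,
)

BASE_CONTAINER_NAME = "base"  # assumed: KubernetesPodOperator's main container name


def pod_already_finished(pod: k8s.V1Pod) -> bool:
    # Mirrors the condition in the diff above: the pod reached a terminal
    # phase, or its base container is no longer running. Note that this
    # reads pod.status.phase directly, exactly as the diff does.
    return pod.status.phase in PodPhase.terminal_states or not container_is_running(
        pod=pod, container_name=BASE_CONTAINER_NAME
    )
```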
Will there always be a base container in every pod? Also, if there is a base container, what does it do? If there can be multiple containers in the pod, should we not check the running status of all of them (sketched below)? Or is the base container meant to keep track of the running status of the other containers in the pod?
If it's not possible to check the status of all containers, I think we could just remove the `or` condition that checks the container status; then the PR looks good to me.
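A sketch of the alternative suggested here, treating the pod as running while any of its containers still reports a running state. The function name and packaging are illustrative:

```python
from kubernetes.client import models as k8s


def any_container_running(pod: k8s.V1Pod) -> bool:
    # Illustrative alternative to the base-container check: the pod counts
    # as running while *any* container reports a running state.
    statuses = pod.status.container_statuses if pod.status else None
    return any(s.state and s.state.running for s in statuses or [])
```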
I think so 🤔 That's how we check when deferred.
https://github.com/astronomer/astronomer-providers/blob/main/astronomer/providers/cncf/kubernetes/operators/kubernetes_pod.py#L69
https://github.com/astronomer/astronomer-providers/blob/main/astronomer/providers/cncf/kubernetes/triggers/wait_container.py#L109
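For context, the linked `container_is_running` helper boils down to roughly the following. This is paraphrased from the provider code, so treat it as approximate rather than authoritative:

```python
from kubernetes.client import models as k8s


def container_is_running(pod: k8s.V1Pod, container_name: str) -> bool:
    # Approximate gist of the provider helper linked above: find the named
    # container's status entry and check whether it reports a running state.
    statuses = (pod.status.container_statuses if pod and pod.status else None) or []
    status = next((s for s in statuses if s.name == container_name), None)
    return bool(status and status.state and status.state.running)
```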
I'm afraid the implementation there is wrong with respect to the questions above. It's done a bit differently in the OSS provider, no?
yep, I think we implement the logic in different ways
Force-pushed from 2503200 to 75841ed.
I don't have a strong reservation on this. Since it is consistent with the existing trigger implementation, the other issues would be outside the scope of this PR.
Thanks! I'll merge it for now. Now that deferrable mode has been added to OSS Airflow, I think we'll eventually do something like #1169.
…peratorAsync (#1104)" This reverts commit 89ccc7e. PR #1104 adds logic to poke for Pod status before putting the status check on deferral. However, it is observed that our DAG fails always when we go on checking the status.phase on pods as Pod.status is None before the pod gets scheduled on a None. So, in most scenarios it does not make sense to check for the pod status immediately to verify that the pod has completed it's desired execution and hence, we revert this poke in case of Kubernetes Pod operator.
…peratorAsync (#1104)" (#1209) This reverts commit 89ccc7e. PR #1104 adds logic to poke for Pod status before putting the status check on deferral. However, it is observed that our DAG fails always when we go on checking the status.phase on pods as Pod.status is None before the pod gets scheduled on a node. So, in most scenarios it does not make sense to check for the pod status immediately to verify that the pod has completed its desired execution and hence, we revert this poke in case of Kubernetes Pod operator. closes: #1208
Like the SnowflakeOperatorAsync, we try to verify whether the submitted job has already completed before deferring, to prevent unnecessary deferrals. This way, we can skip deferring the task if it has already finished.
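A minimal sketch of the check-before-defer pattern this describes, with hypothetical `_probe` and `_build_trigger` helpers standing in for the operator-specific pieces:

```python
from airflow.models.baseoperator import BaseOperator


class CheckThenDeferOperator(BaseOperator):
    """Illustrative only: probe the job once, and defer only if it is still
    running, so already-finished work never round-trips through the triggerer."""

    def execute(self, context):
        result = self._probe()  # hypothetical cheap, synchronous status check
        if result is not None:  # job already finished: skip deferral
            return result
        self.defer(  # otherwise free the worker slot until the trigger fires
            trigger=self._build_trigger(),
            method_name="execute_complete",
        )

    def execute_complete(self, context, event=None):
        return event

    def _probe(self):
        """Hypothetical: return the final result if the job finished, else None."""
        return None

    def _build_trigger(self):
        """Hypothetical: build the trigger that waits for completion."""
        raise NotImplementedError
```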