[AIRFLOW-1310] Basic operator to run docker container on Kubernetes #2456

mdziemianko · 2017-07-18T15:55:02Z

Dear Airflow maintainers,

This PR contains an implementation of a hook and an operator that allow creating,
deploying and tracking docker containers in Kubernetes pods.
The hook supports creating the pod specification, deploying the pod,
and deleting it, while the operator offers two execution modes:
fire-and-forget, or wait for completion.
The feature was requested here: https://issues.apache.org/jira/browse/AIRFLOW-1310

Suitable test cases are provided.
The operator has been used in our internal deployments for the last few weeks without any major issues.

codecov-io · 2017-07-18T16:32:27Z

Codecov Report

Merging #2456 into master will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #2456   +/-   ##
=======================================
  Coverage   69.99%   69.99%           
=======================================
  Files         146      146           
  Lines       11454    11454           
=======================================
  Hits         8017     8017           
  Misses       3437     3437

Last update 0bc248f...e65f7ee.

bolkedebruin · 2017-07-19T11:28:58Z

What's the relationship with #2414 ?

mdziemianko · 2017-07-19T11:36:55Z

As far as I understand, #2414 is a larger effort to provide a native K8s executor. This is a simple operator that allows spawning any docker-based container on a (remote) K8s cluster. It does not require the K8s executor, so you can stay on the local/celery executor for general operation and deploy only selected containers to K8s. I imagine that once #2414 is ready, this becomes less useful for people who use the K8s executor, as they will be able to run docker containers directly. I suppose the two PRs could be integrated, or this one replaced, when #2414 is ready.

stroop23 · 2017-07-19T17:14:04Z

airflow/contrib/operators/kubernetes_operator.py

+
+                sleep(self.poke_interval)
+        finally:
+            if (self.wait and self.should_do_cleanup(status)):


Status may still be None.

mdziemianko · 2017-07-20

The loop should not exit normally while the status is None. The only way that can happen is an uncaught exception, in which case the whole DAG blows up anyway. I was wondering what to do in that case: the simplest way is to set the status to Failed, but that's a lazy solution. Good suggestions are welcome!

stroop23 · 2017-07-21

The concern is more that the pod will never be cleaned up if an exception of some sort occurs, which will require manual intervention if pod names are not forced to be unique.
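
A minimal sketch of one way to address the cleanup concern above: the pod is deleted in the `finally` block even when polling raises before any status was read. `FakeHook`, `delete_pod`, `wait_for_pod` and the `cleanup` flag are hypothetical names for illustration only; the PR's actual hook API and its `should_do_cleanup` logic are not shown in this thread.

```python
from time import sleep

# Assumed terminal pod phases; the PR's own exit_statuses list is not shown here.
EXIT_STATUSES = ("Succeeded", "Failed")


class FakeHook:
    """Hypothetical stand-in for the PR's hook, used only to make the sketch runnable."""

    def __init__(self, states):
        self._states = iter(states)
        self.deleted = []

    def get_pod_state(self, pod):
        state = next(self._states)
        if isinstance(state, Exception):
            raise state
        return state

    def delete_pod(self, pod):
        self.deleted.append(pod)


def wait_for_pod(hook, pod, poke_interval=0.0, cleanup=True):
    """Poll until the pod reaches an exit status; always attempt cleanup."""
    status = None
    try:
        while status not in EXIT_STATUSES:
            status = hook.get_pod_state(pod)
            if status not in EXIT_STATUSES:
                sleep(poke_interval)
        return status
    finally:
        # status is still None only if polling raised before the first result.
        # Deleting the pod in that case too avoids leaving an orphan behind.
        if cleanup:
            hook.delete_pod(pod)
```

With this shape the original exception still propagates and fails the task, but the pod no longer blocks a later run that reuses the same name.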

stroop23 · 2017-07-19T17:16:07Z

airflow/contrib/operators/kubernetes_operator.py

+        try:
+            status = None
+            while not (status in exit_statuses):
+                status = hook.get_pod_state(pod)


Pod logs should be captured.

stroop23 · 2017-07-19T17:22:21Z

airflow/contrib/operators/kubernetes_operator.py

+            poke_interval=3,
+            *args, **kwargs):
+        super(KubernetesPodOperator, self).__init__(*args, **kwargs)
+        self.image = image


It's probably more flexible to allow passing in a pod manifest, so you can template more complex pods in the DAG file.

mdziemianko · 2017-07-20

I am planning to add that in the next step. Currently it supports the basic case where you want to run a simple container in a simple pod.
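
A rough sketch of how both cases could coexist: a user-supplied manifest is used when given, otherwise a minimal single-container pod is built from `name` and `image`. The function name `build_pod_manifest` and its parameters are assumptions for illustration, not part of this PR; the dictionary keys follow the standard Kubernetes v1 Pod schema.

```python
import copy


def build_pod_manifest(name, image, namespace="default", manifest=None):
    """Return a pod manifest dict (hypothetical helper, not the PR's API).

    If `manifest` is given it is copied and returned, with the pod name
    filled in only when the manifest does not set one. Otherwise a minimal
    single-container pod is generated.
    """
    if manifest is not None:
        pod = copy.deepcopy(manifest)  # never mutate the caller's template
        pod.setdefault("metadata", {}).setdefault("name", name)
        return pod
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Run-to-completion semantics: do not restart the container.
            "restartPolicy": "Never",
            "containers": [{"name": name, "image": image}],
        },
    }
```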

stroop23 · 2017-07-19T17:29:30Z

airflow/contrib/operators/kubernetes_operator.py

+            *args, **kwargs):
+        super(KubernetesPodOperator, self).__init__(*args, **kwargs)
+        self.image = image
+        self.name = name


Pod names must be unique within the namespace. If a pod failed and wasn't cleaned up, this might block execution. Not sure whether guaranteeing uniqueness should be the executor's task or whether the user should take care of it in the DAG.

mdziemianko · 2017-07-20

As you say, it is hard to tell whose responsibility it is. I guess the best solution is to provide two params, "name" and "name_prefix" or something like that: one is used directly, while the other gets a random string appended. This way the developer can choose which behaviour they want.

stroop23 · 2017-07-21

A unique_name flag that defaults to True is probably a clearer solution. If it's true, name may then even be optional; you can just construct it from the available DAG data.
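
A minimal sketch of the `unique_name` idea, assuming the name is derived from the DAG and task ids when not given explicitly. `make_pod_name` is a hypothetical helper, and the 63-character cap is a conservative assumption (the DNS label length) rather than something taken from this PR.

```python
import re
import uuid

# Conservative assumption: keep names within the DNS label length.
MAX_NAME_LENGTH = 63


def make_pod_name(dag_id, task_id, name=None, unique_name=True):
    """Build a pod name, optionally made unique with a random suffix."""
    base = name or "{}-{}".format(dag_id, task_id)
    # Keep only lowercase alphanumerics and '-', as DNS-style names require.
    base = re.sub(r"[^a-z0-9-]+", "-", base.lower()).strip("-")
    if not base:
        base = "pod"
    if not unique_name:
        return base[:MAX_NAME_LENGTH].rstrip("-")
    suffix = uuid.uuid4().hex[:8]
    # Leave room for the separator and the suffix.
    head = base[:MAX_NAME_LENGTH - len(suffix) - 1].rstrip("-")
    return "{}-{}".format(head, suffix)
```

With `unique_name=True` a leftover pod from a failed run cannot collide with the next run; with `unique_name=False` the caller keeps full control over the name and takes responsibility for cleanup.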

stale · 2018-12-10T23:50:11Z

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

mdziemianko force-pushed the AIRFLOW-1310-kubernetes branch 3 times, most recently from 197cca8 to 42bd04a, on July 19, 2017 09:54

stroop23 reviewed Jul 19, 2017

mdziemianko force-pushed the AIRFLOW-1310-kubernetes branch 4 times, most recently from 8efe55e to 8810119, on August 6, 2017 14:02

mdziemianko force-pushed the AIRFLOW-1310-kubernetes branch from b20951f to 52a45d5 on August 11, 2017 13:51

mdziemianko force-pushed the AIRFLOW-1310-kubernetes branch from 52a45d5 to e65f7ee on August 11, 2017 14:55

stale bot added the stale label Dec 10, 2018

stale bot closed this Dec 18, 2018
