[AIRFLOW-1310] Basic operator to run docker container on Kubernetes #2456
Conversation
Codecov Report
@@           Coverage Diff           @@
##           master    #2456   +/-   ##
=======================================
  Coverage   69.99%   69.99%
=======================================
  Files         146      146
  Lines       11454    11454
=======================================
  Hits         8017     8017
  Misses       3437     3437

Continue to review full report at Codecov.
What's the relationship with #2414?
As far as I understand, #2414 is a larger effort to provide a native Kubernetes executor. This is a simple operator that allows spawning any Docker-based container on a (remote) Kubernetes cluster. It does not require the Kubernetes executor, so you can stay on your local/Celery executor for general operation and deploy only selected containers to Kubernetes. I imagine that when #2414 is ready this becomes less useful for people who use the Kubernetes executor, as they will be able to run Docker containers directly. I suppose both PRs might be integrated, or this one replaced, when #2414 is ready.
        sleep(self.poke_interval)
finally:
    if (self.wait and self.should_do_cleanup(status)):
Status may still be None
The loop should not exit normally if the status is None. The only way that can happen is an uncaught exception, in which case the whole DAG blows up anyway. I was thinking about what to do in that case; the simplest way is to set the status to Failed, but that's a lazy solution. A good suggestion is welcome!
It's more that the pod will never be cleaned up if there was an exception of some sort, which will require manual intervention if pod names are not forced to be unique.
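The poll-then-clean-up pattern under discussion can be sketched in plain Python. This is an illustrative stand-in, not the PR's actual code: `hook` is a hypothetical object with `get_pod_state()` and `delete_pod()` methods, and cleanup here is gated only on a `wait` flag rather than the PR's `should_do_cleanup()`. The point is that the `finally` block runs even when an exception escapes the loop and `status` is still None, so the pod gets deleted either way:

```python
import time

SUCCEEDED, FAILED = "Succeeded", "Failed"
EXIT_STATUSES = {SUCCEEDED, FAILED}


def run_and_cleanup(hook, pod, poke_interval=3, wait=True):
    """Poll pod state until it reaches a terminal status, then clean up.

    `hook` is a hypothetical stand-in for the PR's Kubernetes hook.
    """
    status = None
    try:
        while status not in EXIT_STATUSES:
            status = hook.get_pod_state(pod)
            if status not in EXIT_STATUSES:
                time.sleep(poke_interval)
        return status
    finally:
        # status may still be None here if an exception escaped the
        # loop; delete the pod anyway so its name can be reused.
        if wait:
            hook.delete_pod(pod)
```

With this shape, an uncaught exception still propagates (and fails the task), but it no longer leaves an orphaned pod behind that would block a rerun under the same name.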
try:
    status = None
    while not (status in exit_statuses):
        status = hook.get_pod_state(pod)
Pod logs should be captured.
                 poke_interval=3,
                 *args, **kwargs):
        super(KubernetesPodOperator, self).__init__(*args, **kwargs)
        self.image = image
It's probably more flexible to allow passing in a pod manifest, so you can template more complex pods in the DAG file.
I am planning to add it in the next step; currently this supports the basic case where you want to run a simple container in a simple pod.
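The reviewer's suggestion of passing a full pod manifest can be illustrated with a plain dict in the shape the Kubernetes v1 Pod API expects. The helper below is a hypothetical sketch, not code from this PR; it builds the simple single-container case the operator currently supports, and a caller could instead hand the operator a richer dict of the same shape templated from the DAG file:

```python
def build_pod_manifest(name, image, namespace="default", command=None):
    """Build a minimal v1 Pod manifest as a dict (illustrative sketch)."""
    container = {"name": name, "image": image}
    if command is not None:
        container["command"] = command
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "containers": [container],
            # run-to-completion semantics: don't restart the container
            "restartPolicy": "Never",
        },
    }
```

Accepting a manifest keeps the operator's constructor small while still allowing complex pods (volumes, resource limits, multiple containers) without new operator parameters for every field.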
                 *args, **kwargs):
        super(KubernetesPodOperator, self).__init__(*args, **kwargs)
        self.image = image
        self.name = name
Pod names must be unique within the namespace. If a pod failed and wasn't cleaned up, this might block execution. Not sure if it should be the executor's task to guarantee uniqueness or if the user should take care of this in the DAG.
As you say, it's hard to tell whose responsibility it is. I guess the best solution is to provide two params, "name" and "name_prefix" or something like that: one is used directly, while the other is used with a random string appended. This way the developer has a choice of what to do.
Probably a unique_name flag which defaults to True is a clearer solution. If it's true, name may then even be optional; you can just construct it from available DAG data.
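The unique_name idea could look like the following sketch, a hypothetical helper rather than the PR's code. It appends a random lowercase suffix when uniqueness is requested, and truncates to keep the result a valid Kubernetes object name (pod names are DNS subdomain names, at most 253 characters):

```python
import random
import string

MAX_POD_NAME_LEN = 253  # Kubernetes object names: DNS subdomain, <= 253 chars


def unique_pod_name(base, unique=True, suffix_len=8):
    """Return `base`, optionally suffixed with a random lowercase string.

    With unique=True (the proposed default), a failed pod from a
    previous run that wasn't cleaned up cannot block the next run
    by colliding on the same name.
    """
    base = base.lower().rstrip("-")
    if not unique:
        return base
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(random.choice(alphabet) for _ in range(suffix_len))
    return "{0}-{1}".format(base, suffix)[:MAX_POD_NAME_LEN]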
The commit contains an implementation of the hook and operator that allow creation, deployment, and tracking of Docker containers in Kubernetes pods. The hook supports creating the specification, deploying, and deleting the pod, while the operator allows fire-and-forget or wait-for-completion execution modes.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Dear Airflow maintainers,
The PR contains an implementation of the hook and operator that allow creation, deployment, and tracking of Docker containers in Kubernetes pods. The hook supports creating the specification, deploying, and deleting the pod, while the operator allows fire-and-forget or wait-for-completion execution modes.
The feature was requested here: https://issues.apache.org/jira/browse/AIRFLOW-1310
Suitable test cases are provided.
The operator has been used for the last few weeks without any major issues in our internal deployments.