
Add option to apply caps only on alive pods #252

Merged
merged 1 commit into jenkinsci:master from nehaljwani:cap-not-undead-pods on May 29, 2018

nehaljwani commented Nov 17, 2017

A pod can be in one of five phases: 'running', 'pending', 'succeeded',
'failed', or 'unknown'. Pods that are not in the 'running'/'pending' phase
do not consume any CPU/memory resources; they just need to be GC'ed. When
a cap has been specified on a k8s cloud (single namespace) and, say, 40% of
the pods are in the 'failed' phase while 60% are 'running'/'pending', no
new pods will be spawned in that namespace: the plugin counts all of those
pods against the instance/pod cap, so even though it could have spawned 40%
more pods, it won't, and jobs will starve. This patch adds an option to
count only pods in the 'running' or 'pending' phase against the cap, at the
cloud level and at the pod template level.
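
For illustration, the change amounts to filtering the pod list by phase before it is counted against the cap. Here is a minimal sketch against the fabric8 kubernetes-client API the plugin already uses; client, templateNamespace, and labelsMap come from the surrounding method in the diff below, while containerCap is a hypothetical name for the configured limit:

    import io.fabric8.kubernetes.api.model.Pod;
    import java.util.List;
    import java.util.stream.Collectors;

    // Fetch the pods the plugin manages in this namespace.
    List<Pod> allItems = client.pods()
            .inNamespace(templateNamespace)
            .withLabels(labelsMap)
            .list()
            .getItems();

    // Count only pods that are alive ('Running') or about to be
    // ('Pending'); 'Succeeded', 'Failed' and 'Unknown' pods are ignored.
    List<Pod> aliveItems = allItems.stream()
            .filter(pod -> {
                String phase = pod.getStatus().getPhase();
                return "Running".equals(phase) || "Pending".equals(phase);
            })
            .collect(Collectors.toList());

    boolean belowCap = aliveItems.size() < containerCap; // hypothetical cap value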

@nehaljwani nehaljwani force-pushed the nehaljwani:cap-not-undead-pods branch 3 times, most recently from 5ee08a9 to dcb8b00 Nov 19, 2017

carlossg commented Nov 21, 2017

the problem I see is that the errored pods fail for a reason (e.g. a wrong image), and this would just enter an infinite loop of spawning new pods

nehaljwani commented Nov 21, 2017

Well, the option is opt-in.

Perhaps the plugin should optionally also provide a strategy (like exponential back-off) to delay the launch of new pods?
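
As a purely illustrative sketch of such a strategy (nothing here is part of this PR; the class and every name in it are hypothetical):

    /** Illustrative exponential back-off for delaying new pod launches. */
    class LaunchBackoff {
        private static final long BASE_MILLIS = 1_000L;     // first retry after ~1s
        private static final long MAX_MILLIS = 5 * 60_000L; // never wait more than 5 min

        /** How long to wait before the next launch attempt. */
        static long delayMillis(int consecutiveFailures) {
            long backoff = BASE_MILLIS << Math.min(consecutiveFailures, 20);
            long capped = Math.min(backoff, MAX_MILLIS);
            long jitter = (long) (Math.random() * capped * 0.2); // spread out retries
            return capped + jitter;
        }
    }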

nehaljwani commented Nov 28, 2017

@carlossg Should I add a warning in the description of the checkbox that this can lead to an infinite loop of spawning new pods?

carlossg left a comment

I think it looks mostly OK, see my comment, but capOnlyOnUnDeadPods sounds really weird to me; wouldn't capOnlyOnAlivePods be more intuitive?

@@ -424,6 +436,22 @@ private boolean addProvisionedSlave(@Nonnull PodTemplate template, @CheckForNull
PodList namedList = client.pods().inNamespace(templateNamespace).withLabels(labelsMap).list();
List<Pod> namedListItems = namedList.getItems();

if (this.isCapOnlyOnUnDeadPods()) {

carlossg Dec 3, 2017

instead of getting the full list and then filtering here, can't it be done using something like withFields("status.phase", "running|pending")? not sure what the right syntax would be, but it should be possible based on kubernetes/kubernetes#49387

nehaljwani Dec 3, 2017


I can do

./kubectl get pods --field-selector=status.phase='Succeeded'

I can also do:

./kubectl get pods --field-selector=status.phase='Failed'

But I can't do:

./kubectl get pods --field-selector=status.phase='Succeeded|Failed'

Neither can I do:

./kubectl get pods --field-selector='status.phase in (Succeeded, Failed)'

I couldn't find any official documentation on field selectors 😢 https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#

I also looked at the implementation of withFields and getFieldQueryParam and it doesn't seem like a selector has been implemented yet. 😭

nehaljwani Dec 3, 2017


However, I can do:

./kubectl get pods --field-selector='status.phase!=Failed,status.phase!=Succeeded,status.phase!=Unknown'

But again, we really need a function withFieldSelector() at https://github.com/fabric8io/kubernetes-client/blob/v3.1.0/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/Filterable.java
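
For reference, here is roughly what the existing equality-only filtering allows, and why it cannot express the filter this PR needs (a sketch against kubernetes-client 3.1.0; client, templateNamespace, and labelsMap are assumed from the surrounding method):

    import io.fabric8.kubernetes.api.model.PodList;
    import java.util.Collections;

    // withFields() supports equality on a field, so one phase per
    // query is possible...
    PodList running = client.pods()
            .inNamespace(templateNamespace)
            .withLabels(labelsMap)
            .withFields(Collections.singletonMap("status.phase", "Running"))
            .list();

    // ...but there is no way to say "status.phase != Failed, != Succeeded,
    // != Unknown" in a single call, which is why the patch filters the
    // full list client-side instead.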

@nehaljwani nehaljwani force-pushed the nehaljwani:cap-not-undead-pods branch from dcb8b00 to 020eaf6 Dec 3, 2017

Add option to apply caps only on alive pods
A pod can be in one of five phases: 'running', 'pending', 'succeeded',
'failed', or 'unknown'. Pods that are not in the 'running'/'pending' phase
do not consume any CPU/memory resources; they just need to be GC'ed. When
a cap has been specified on a k8s cloud (single namespace) and, say, 40% of
the pods are in the 'failed' phase while 60% are 'running'/'pending', no
new pods will be spawned in that namespace: the plugin counts all of those
pods against the instance/pod cap, so even though it could have spawned 40%
more pods, it won't, and jobs will starve. This patch adds an option to
count only pods in the 'running' or 'pending' phase against the cap, at the
cloud level and at the pod template level, so that the cap applies only to
those pods which are alive or about to be.

Change-Id: Id77e837aa9a42742618cd3c543e7b99f9d38a10a

@nehaljwani nehaljwani force-pushed the nehaljwani:cap-not-undead-pods branch from 020eaf6 to a615156 Dec 3, 2017

nehaljwani commented Dec 3, 2017

I've updated the PR with a less weird name 😉

@nehaljwani nehaljwani changed the title from "Add option to apply caps only on undead pods" to "Add option to apply caps only on alive pods" on Dec 3, 2017

@carlossg carlossg merged commit d90c9c9 into jenkinsci:master May 29, 2018

1 of 2 checks passed

continuous-integration/jenkins/pr-merge: This commit cannot be built
continuous-integration/jenkins/pr-head: This commit looks good
nehaljwani commented May 29, 2018

Finally, after 5 months! 🎉

carlossg commented May 29, 2018

sorry, thanks!
