
Add option to apply caps only on alive pods #252

Merged
merged 1 commit into from May 29, 2018

Conversation

@nehaljwani
Contributor

@nehaljwani nehaljwani commented Nov 17, 2017

A pod can be in one of five phases: running, pending, succeeded, failed
or unknown. Pods that are not in the running/pending phase do not
consume any CPU/memory resources; they just need to be GC'ed. If a cap
has been specified on a k8s cloud (single namespace) and, say, 40% of
the counted pods are in the 'failed' phase while 60% are
'running/pending', no new pods will be spawned in that namespace,
because the plugin counts all of those pods against the instance/pod
cap. Even though it could have spawned 40% more pods, it won't, and
jobs will starve. This patch adds an option, at the cloud level and at
the pod template level, to count only pods in the 'running' or
'pending' phase against the cap.
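
As a concrete illustration of the starvation described above, here is a small sketch with made-up numbers; the variable names are illustrative only and are not taken from the plugin:

// Illustrative: cap of 10 pods in the namespace, 4 pods already Failed
// (consuming nothing, just waiting to be GC'ed), 6 pods Running/Pending.
int podCap = 10;
int deadPods = 4;
int alivePods = 6;

// Current behaviour: every pod counts, so the cap looks full and jobs starve.
boolean canSpawn = (deadPods + alivePods) < podCap;        // false

// With the proposed opt-in: only alive pods count, so 4 more pods could start.
boolean canSpawnWithOption = alivePods < podCap;           // true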

@nehaljwani nehaljwani force-pushed the cap-not-undead-pods branch 3 times, most recently from 5ee08a9 to dcb8b00 Nov 19, 2017
@carlossg
Contributor

@carlossg carlossg commented Nov 21, 2017

The problem I see is that the errored pods fail for a reason (i.e. a wrong image), and this would just enter an infinite loop of spawning new pods.

@nehaljwani
Contributor Author

@nehaljwani nehaljwani commented Nov 21, 2017

Well, the option is opt-in.

Perhaps the plugin should optionally also provide a strategy (like exponential back-off) to delay the launch of new pods?
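
Purely as a sketch of that idea (nothing like this exists in the plugin; names and numbers are made up):

// Hypothetical back-off: double a base delay per consecutive failed launch,
// up to a ceiling, instead of immediately re-spawning a pod that keeps failing.
static long launchDelaySeconds(int consecutiveFailures) {
    long baseDelaySeconds = 5;
    long maxDelaySeconds = 300;
    return Math.min(maxDelaySeconds,
            baseDelaySeconds * (1L << Math.min(consecutiveFailures, 16)));
}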

@nehaljwani
Contributor Author

@nehaljwani nehaljwani commented Nov 28, 2017

@carlossg Should I add a warning in the description of the checkbox that this can lead to an infinite loop of spawning new pods?

Contributor

@carlossg carlossg left a comment

I think it looks mostly OK, see my comment, but capOnlyOnUnDeadPods sounds really weird to me; wouldn't capOnlyOnAlivePods be more intuitive?

@@ -424,6 +436,22 @@ private boolean addProvisionedSlave(@Nonnull PodTemplate template, @CheckForNull
PodList namedList = client.pods().inNamespace(templateNamespace).withLabels(labelsMap).list();
List<Pod> namedListItems = namedList.getItems();

if (this.isCapOnlyOnUnDeadPods()) {
Contributor

@carlossg carlossg Dec 3, 2017

Instead of getting the full list and then filtering here, can't it be done with something like withFields("status.phase", "running|pending")? I'm not sure what the right syntax would be, but it should be possible based on kubernetes/kubernetes#49387
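
For reference, a sketch of what a server-side filter with the fabric8 client could look like, assuming the withFields(Map) method discussed below; field selectors built this way only express key=value equality, so the "running|pending" OR form would not be accepted as-is:

import io.fabric8.kubernetes.api.model.PodList;
import java.util.Collections;

// Sketch only: one equality field selector per query, so each phase would need
// its own request (Running and Pending listed separately, then combined).
PodList runningPods = client.pods()
        .inNamespace(templateNamespace)
        .withFields(Collections.singletonMap("status.phase", "Running"))
        .list();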

Contributor Author

@nehaljwani nehaljwani Dec 3, 2017

I can do:

./kubectl get pods --field-selector=status.phase='Succeeded'

I can also do:

./kubectl get pods --field-selector=status.phase='Failed'

But I can't do:

./kubectl get pods --field-selector=status.phase='Succeeded|Failed'

Neither can I do:

./kubectl get pods --field-selector='status.phase in (Succeeded, Failed)'

I couldn't find any official documentation on field selectors 😢 https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#

I also looked at the implementation of withFields and getFieldQueryParam, and it doesn't seem like selector syntax has been implemented yet. 😭

Contributor Author

@nehaljwani nehaljwani Dec 3, 2017

However, I can do:

./kubectl get pods --field-selector='status.phase!=Failed,status.phase!=Succeeded,status.phase!=Unknown'

But again, we really need a function withFieldSelector() at https://github.com/fabric8io/kubernetes-client/blob/v3.1.0/kubernetes-client/src/main/java/io/fabric8/kubernetes/client/dsl/Filterable.java
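
In the absence of such a method, a minimal client-side equivalent of that kubectl field selector could look like the sketch below; namedListItems follows the name used in the diff above, the rest is illustrative:

import io.fabric8.kubernetes.api.model.Pod;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Drop pods whose phase is Failed, Succeeded or Unknown, keeping only the
// Running/Pending ones, after the full list has been fetched from the API server.
List<String> deadPhases = Arrays.asList("Failed", "Succeeded", "Unknown");
List<Pod> alivePods = namedListItems.stream()
        .filter(pod -> !deadPhases.contains(pod.getStatus().getPhase()))
        .collect(Collectors.toList());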

@nehaljwani nehaljwani force-pushed the cap-not-undead-pods branch from dcb8b00 to 020eaf6 Dec 3, 2017
A pod can be in one of five phases: running, pending, succeeded, failed
or unknown. Pods that are not in the running/pending phase do not
consume any CPU/memory resources; they just need to be GC'ed. If a cap
has been specified on a k8s cloud (single namespace) and, say, 40% of
the counted pods are in the 'failed' phase while 60% are
'running/pending', no new pods will be spawned in that namespace,
because the plugin counts all of those pods against the instance/pod
cap. Even though it could have spawned 40% more pods, it won't, and
jobs will starve. This patch adds an option, at the cloud level and at
the pod template level, to count only pods in the 'running' or
'pending' phase against the cap, so that the cap applies only to those
pods which are alive or about to be.

Change-Id: Id77e837aa9a42742618cd3c543e7b99f9d38a10a
@nehaljwani nehaljwani force-pushed the cap-not-undead-pods branch from 020eaf6 to a615156 Dec 3, 2017
@nehaljwani
Contributor Author

@nehaljwani nehaljwani commented Dec 3, 2017

I've updated the PR with a less weird name 😉

@nehaljwani nehaljwani changed the title Add option to apply caps only on undead pods Add option to apply caps only on alive pods Dec 3, 2017
@carlossg carlossg merged commit d90c9c9 into jenkinsci:master May 29, 2018
1 of 2 checks passed
@nehaljwani
Contributor Author

@nehaljwani nehaljwani commented May 29, 2018

Finally, after 5 months! 🎉

@carlossg
Contributor

@carlossg carlossg commented May 29, 2018

sorry, thanks!
