Skip to content

Forbid health checks longer than kill time#1498

Merged
ssalinas merged 3 commits intomasterfrom
forbid-impossible-healthcheck-delay
Apr 26, 2017
Merged

Forbid health checks longer than kill time#1498
ssalinas merged 3 commits intomasterfrom
forbid-impossible-healthcheck-delay

Conversation

@PtrTeixeira
Copy link
Contributor

Singularity is configured to automatically kill tasks which are
nonresponsive for a certain amount of time after startup
(killAfterTasksDoNotRunDefaultSeconds, set in the config file). If a
task is configured such that it won't start accepting healthchecks for
longer than that interval, Singularity will just kill it off, without
ever sending health checks. This makes it so that that task
configuration will be rejected ahead of time by Singularity.

Singularity is configured to automatically kill tasks which are
nonresponsive for a certain amount of time after startup
(`killAfterTasksDoNotRunDefaultSeconds`, set in the config file).  If a
task is configured such that it won't start accepting healthchecks for
longer than that interval, Singularity will just kill it off, without
ever sending health checks. This makes it so that that task
configuration will be rejected ahead of time by Singularity.
int startUpDelay = deploy.getHealthcheck().get().getStartupDelaySeconds().get();

checkBadRequest(startUpDelay < defaultKillAfterNotHealthySeconds,
String.format("Health check startup delay time must be less than kill after wait time %s (was %s)", defaultKillAfterNotHealthySeconds, startUpDelay));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably not as important for the user to know that we call it the 'kill after wait time'. Maybe just Health check startup delay time must be less than {time} seconds (was {time} seconds)

Other than that LGTM

There were two fields on the new deploy form that were labeled "HC
startup delay." This renames one of them to "HC startup timeout." They
bot functioned correctly and pointed to the the correct fields in the
deploy JSON; the label on one of them was just mixed up.
Was previously giving perhaps too much information to the user (ie, that
the upper limit was coming from how long we would wait to kill a task
that didn't appear to be starting). Not it just reflects the maximum
amount of time that you are allowed to put down.

WebApplicationException exn = (WebApplicationException) catchThrowable(() -> validator.checkDeploy(request, deploy));
assertThat((String) exn.getResponse().getEntity())
.contains("Health check startup delay");
Copy link
Contributor

@matush-v matush-v Apr 12, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whoa, this catch is very cool

@ssalinas ssalinas modified the milestone: 0.15.0 Apr 19, 2017
@ssalinas ssalinas merged commit d9515d8 into master Apr 26, 2017
@ssalinas ssalinas deleted the forbid-impossible-healthcheck-delay branch April 26, 2017 13:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants