New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Task alerts #721

Merged
merged 18 commits into from Oct 15, 2015

Conversation

Projects
None yet
2 participants
@kwm4385
Contributor

kwm4385 commented Oct 6, 2015

Show an alert on the task page if it is a scheduled task that has been running for longer than twice the average for tasks in the request. /cc @tpetr

screen shot 2015-10-06 at 3 59 44 pm

kwm4385 added some commits Oct 6, 2015

Sp
@kwm4385

This comment has been minimized.

Show comment
Hide comment
@kwm4385

kwm4385 Oct 8, 2015

Contributor

Alert if a task is killed due to slave decommissioning.

screen shot 2015-10-08 at 11 53 59 am

Contributor

kwm4385 commented Oct 8, 2015

Alert if a task is killed due to slave decommissioning.

screen shot 2015-10-08 at 11 53 59 am

Show outdated Hide outdated SingularityUI/app/controllers/TaskDetail.coffee
getAlerts: (requestTaskHistory) =>
task = @models.task
alerts = []
# Is this a scheduled task that has been running much longer than previous ones?

This comment has been minimized.

@tpetr

tpetr Oct 14, 2015

Member

instead of calculating this yourself, you should use averageRuntimeMillis in the SingularityDeployStatistics object, which can be grabbed from the deploy history endpoint (.../api/history/request/REQUEST_ID/deploy/DEPLOY_ID). it would also be wise to surface warnIfScheduledJobIsRunningPastNextRunPct from https://github.com/HubSpot/Singularity/blob/master/SingularityService/src/main/java/com/hubspot/singularity/config/SingularityConfiguration.java#L172 into the window.config object so that we don't have to hardcode the threshold value

@tpetr

tpetr Oct 14, 2015

Member

instead of calculating this yourself, you should use averageRuntimeMillis in the SingularityDeployStatistics object, which can be grabbed from the deploy history endpoint (.../api/history/request/REQUEST_ID/deploy/DEPLOY_ID). it would also be wise to surface warnIfScheduledJobIsRunningPastNextRunPct from https://github.com/HubSpot/Singularity/blob/master/SingularityService/src/main/java/com/hubspot/singularity/config/SingularityConfiguration.java#L172 into the window.config object so that we don't have to hardcode the threshold value

Show outdated Hide outdated SingularityUI/app/controllers/TaskDetail.coffee
title: 'Warning:',
message: 'This scheduled task has been running longer than twice the average for the request and may be stuck.',
level: 'warning'
# Was this task killed by a discomissioning slave?

This comment has been minimized.

@tpetr

tpetr Oct 14, 2015

Member

typo

@tpetr

tpetr Oct 14, 2015

Member

typo

@kwm4385

This comment has been minimized.

Show comment
Hide comment
@kwm4385

kwm4385 Oct 14, 2015

Contributor

Updated

Contributor

kwm4385 commented Oct 14, 2015

Updated

Show outdated Hide outdated ...ervice/src/main/java/com/hubspot/singularity/config/UIConfiguration.java
@@ -63,6 +64,8 @@ public String getFinishedTaskLogPath() {
private boolean hideNewDeployButton = false;
private boolean hideNewRequestButton = false;
private Optional<Integer> warnIfScheduledJobIsRunningPastNextRunPct;

This comment has been minimized.

@tpetr

tpetr Oct 14, 2015

Member

i don't think this is necessary -- using the value inside SingularityConfiguration is probably good enough

@tpetr

tpetr Oct 14, 2015

Member

i don't think this is necessary -- using the value inside SingularityConfiguration is probably good enough

Show outdated Hide outdated SingularityUI/app/controllers/TaskDetail.coffee
if current > (avg * threshold)
alerts.push
title: 'Warning:',
message: 'This scheduled task has been running longer than twice the average for the request and may be stuck.',

This comment has been minimized.

@tpetr

tpetr Oct 14, 2015

Member

can you fix the message to mention the percentage instead of "twice" ?

@tpetr

tpetr Oct 14, 2015

Member

can you fix the message to mention the percentage instead of "twice" ?

Show outdated Hide outdated SingularityUI/config.coffee
@@ -53,6 +53,7 @@ exports.config =
runningTaskLogPath: process.env.SINGULARITY_RUNNING_TASK_LOG_PATH ? "stdout"
finishedTaskLogPath: process.env.SINGULARITY_FINISHED_TASK_LOG_PATH ? "stdout"
commonHostnameSuffixToOmit: process.env.SINGULARITY_COMMON_HOSTNAME_SUFFIX_TO_OMIT ? ""
warnIfScheduledJobIsRunningPastNextRunPct: process.env.WARN_IF_SCHEDULED_JOB_IS_RUNNING_PAST_NEXT_RUN_PCT ? 200

This comment has been minimized.

@tpetr

tpetr Oct 14, 2015

Member

prefix this with SINGULARITY_ to match all the others

@tpetr

tpetr Oct 14, 2015

Member

prefix this with SINGULARITY_ to match all the others

tpetr and others added some commits Oct 14, 2015

Plz
Show outdated Hide outdated SingularityUI/app/controllers/TaskDetail.coffee
deployInfo = new DeployDetails
deployId: deployId
requestId: requestId
deployInfo.fetch().success =>

This comment has been minimized.

@tpetr

tpetr Oct 14, 2015

Member

technically we could only make this call only if it's needed (i.e. if the task is scheduled + currently running)

@tpetr

tpetr Oct 14, 2015

Member

technically we could only make this call only if it's needed (i.e. if the task is scheduled + currently running)

tpetr added a commit that referenced this pull request Oct 15, 2015

@tpetr tpetr merged commit 6d6a23b into master Oct 15, 2015

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
continuous-integration/travis-ci/push The Travis CI build passed
Details

@tpetr tpetr deleted the task_alerts branch Oct 15, 2015

@tpetr tpetr added this to the 0.4.6 milestone Oct 16, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment