This repository has been archived by the owner on Apr 24, 2023. It is now read-only.

Refactor straggler handler to not use global compute cluster #1251

Merged
merged 2 commits into from Oct 21, 2019
3 changes: 1 addition & 2 deletions scheduler/src/cook/mesos.clj
@@ -184,8 +184,7 @@
 ; Many of these should look at the compute-cluster of the underlying jobs, and not use driver at all.
 (cook.scheduler.scheduler/lingering-task-killer mesos-datomic-conn compute-cluster
                                                 task-constraints lingering-task-trigger-chan)
-(cook.scheduler.scheduler/straggler-handler mesos-datomic-conn compute-cluster
-                                            straggler-trigger-chan)
+(cook.scheduler.scheduler/straggler-handler mesos-datomic-conn straggler-trigger-chan)
 (cook.scheduler.scheduler/cancelled-task-killer mesos-datomic-conn
                                                 cancelled-task-trigger-chan)
 (cook.mesos.heartbeat/start-heartbeat-watcher! mesos-datomic-conn mesos-heartbeat-chan)
6 changes: 4 additions & 2 deletions scheduler/src/cook/scheduler/scheduler.clj
@@ -1053,10 +1053,12 @@
 (defn straggler-handler
   "Periodically checks for running jobs that are in groups and runs the associated
    straggler handler."
-  [conn compute-cluster trigger-chan]
+  [conn trigger-chan]
   (util/chime-at-ch trigger-chan
                     (fn straggler-handler-event []
-                      (handle-stragglers conn #(cc/kill-task compute-cluster (:instance/task-id %))))
+                      (handle-stragglers conn (fn [task-ent]
+                                                (cc/kill-task (cook.task/task-ent->ComputeCluster task-ent)
+                                                              (:instance/task-id task-ent)))))
                     {:error-handler (fn [e]
                                       (log/error e "Failed to handle stragglers"))}))
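
The change above moves cluster selection from the call site into the kill callback: instead of closing over one global compute-cluster, the callback looks up the cluster that owns each task entity. A minimal, self-contained sketch of that pattern follows. It does not use Cook's code: the `ComputeCluster` protocol, `RecordingCluster`, the `clusters` registry, the `:sketch/cluster-name` key, and `kill-stragglers!` are all hypothetical stand-ins chosen for illustration, and only `:instance/task-id` is taken from the diff.

```clojure
;; Hypothetical stand-in for a compute-cluster abstraction (not Cook's protocol).
(defprotocol ComputeCluster
  (kill-task [this task-id] "Ask this cluster to kill the given task."))

;; Toy cluster that records the task ids it was asked to kill.
(defrecord RecordingCluster [cluster-name killed]
  ComputeCluster
  (kill-task [_ task-id]
    (swap! killed conj task-id)))

(defn make-cluster [cluster-name]
  (->RecordingCluster cluster-name (atom [])))

;; Hypothetical registry of known clusters, keyed by name.
(def clusters
  {"cluster-a" (make-cluster "cluster-a")
   "cluster-b" (make-cluster "cluster-b")})

(defn task-ent->compute-cluster
  "Hypothetical analogue of the per-task lookup used in the diff:
   resolves the cluster that owns a task entity."
  [task-ent]
  (get clusters (:sketch/cluster-name task-ent)))

(defn kill-stragglers!
  "Kills each straggler on the cluster that owns it, rather than on a
   single cluster passed in by the caller."
  [straggler-task-ents]
  (doseq [task-ent straggler-task-ents]
    (kill-task (task-ent->compute-cluster task-ent)
               (:instance/task-id task-ent))))

;; Two stragglers that live on different clusters.
(def stragglers
  [{:instance/task-id "task-1" :sketch/cluster-name "cluster-a"}
   {:instance/task-id "task-2" :sketch/cluster-name "cluster-b"}])

(kill-stragglers! stragglers)
```

After the call, each toy cluster has recorded only its own task id. With a single global cluster, as in the removed lines, both kill requests would have gone to the same cluster regardless of where the task ran. The sketch does not handle a task whose cluster is missing from the registry.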
