Scheduling fix for jobs with specific nodes required
If a job requires specific nodes and can not run due to those nodes being
busy, the main scheduling loop will block those specific nodes rather than
the entire queue/partition.
bug 595
jette committed Feb 20, 2014
1 parent 5db06c0 commit eafc0a4
Showing 2 changed files with 19 additions and 1 deletion.
5 changes: 4 additions & 1 deletion NEWS
@@ -11,8 +11,11 @@ documents those changes that are of interest to users and admins.
  -- Fix issue where if using munge and munge wasn't running and a slurmd
     needed to forward a message, the slurmd would core dump.
  -- Update srun.1 man page documenting the PMI2 support.
- -- Fix slurmctld core dump when a jobs gets its qos updated but there
+ -- Fix slurmctld core dump when a jobs gets its QOS updated but there
     is not a corresponding association.
+ -- If a job requires specific nodes and can not run due to those nodes being
+    busy, the main scheduling loop will block those specific nodes rather than
+    the entire queue/partition.
 
 * Changes in Slurm 2.6.6
 ========================
15 changes: 15 additions & 0 deletions src/slurmctld/job_scheduler.c
@@ -1008,6 +1008,21 @@ next_part: part_ptr = (struct part_record *)
              * scheduler is configured. */
             if (!backfill_sched)
                 fail_by_part = false;
+#else
+            if (job_ptr->details &&
+                job_ptr->details->req_node_bitmap &&
+                (bit_set_count(job_ptr->details->
+                               req_node_bitmap) >=
+                 job_ptr->details->min_nodes)) {
+                fail_by_part = false;
+                /* Do not schedule more jobs on nodes required
+                 * by this job, but don't block the entire
+                 * queue/partition. */
+                bit_not(job_ptr->details->req_node_bitmap);
+                bit_and(avail_node_bitmap,
+                        job_ptr->details->req_node_bitmap);
+                bit_not(job_ptr->details->req_node_bitmap);
+            }
 #endif
             if (fail_by_part) {
                 /* do not schedule more jobs in this partition
