Fix bug that under-allocated compute nodes #149

Merged
daniel-thom merged 1 commit into main from fix/under-allocation-compute on Feb 21, 2026

@daniel-thom
Collaborator

The scheduling algorithm was not correctly accounting for walltime in this case and scheduled half the required number of nodes.

Contributor

Copilot AI left a comment

Pull request overview

Fixes an under-allocation bug in the Slurm scheduler planning logic where sequential batch capacity (“time slots”) was computed using the partition max walltime rather than the actual walltime that would be requested for the allocation under the selected walltime strategy.

Changes:

  • Update allocation counting to compute time_slots from the actual allocation walltime (derived from WalltimeStrategy), not partition.max_walltime_secs.
  • Update/expand HPC scheduler-generation tests to match the corrected allocation math and add a regression test for long-running whole-node jobs.
  • Adjust the auto-merge CLI test to explicitly use --walltime-strategy max-partition-time so it continues to validate runtime-aware merging behavior.
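
The allocation math described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/client/scheduler_plan.rs: the function names `time_slots` and `nodes_needed`, their signatures, and the whole-node-job simplification are all assumptions made for the example.

```rust
// Hypothetical sketch; names and signatures are illustrative and are not
// taken from src/client/scheduler_plan.rs.

/// Number of jobs that can run back-to-back on one node within a single
/// allocation of the given walltime. Always at least 1.
fn time_slots(allocation_walltime_secs: u64, job_runtime_secs: u64) -> u64 {
    // Guard against a zero runtime to avoid division by zero.
    let runtime = job_runtime_secs.max(1);
    (allocation_walltime_secs / runtime).max(1)
}

/// Nodes required to run `num_jobs` whole-node jobs, given how many
/// sequential slots each node provides within the allocation.
fn nodes_needed(num_jobs: u64, time_slots: u64) -> u64 {
    let slots = time_slots.max(1);
    // Ceiling division: a partially filled node still counts as a node.
    (num_jobs + slots - 1) / slots
}
```

With assumed numbers (10 whole-node jobs of 24 hours each, a partition max walltime of 48 hours, and a requested allocation walltime of 24 hours), computing slots from the partition max gives 2 slots and 5 nodes, while computing them from the requested walltime gives 1 slot and 10 nodes. That would be consistent with the "half the required number of nodes" symptom in the description.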

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| src/client/scheduler_plan.rs | Uses computed allocation walltime for time_slots so allocation counts reflect the requested walltime strategy. |
| tests/test_hpc.rs | Updates expected allocation counts and adds a regression test covering the previously under-allocating long-runtime case. |
| tests/test_slurm_commands.rs | Makes the test’s walltime assumptions explicit via --walltime-strategy max-partition-time. |

@daniel-thom merged commit 4f31239 into main on Feb 21, 2026
8 checks passed
@daniel-thom deleted the fix/under-allocation-compute branch on February 21, 2026 03:48