SCHED-138: add SOPERATOR_NODE_SETS_ON=true for static worker configuration #1638

Merged: Uburro merged 2 commits into main from uburro/SCHED-138 on Oct 9, 2025
Uburro (Collaborator) commented Oct 8, 2025

Problem

Currently, the slurmd daemon runs with dynamic node configuration: the --conf flag passes the node parameters (NodeHostname, NodeAddr, RealMemory, Gres, Features) at runtime. This approach does not support the new structured partition configuration feature, in which node configuration is pre-defined in slurm.conf via NodeSets.

Solution

Added conditional logic to slurmd_entrypoint.sh that switches between two modes based on the SOPERATOR_NODE_SETS_ON environment variable:

  • When SOPERATOR_NODE_SETS_ON=true: runs slurmd without the --conf flag, relying on the node configuration in slurm.conf (generated by AddNodeSetsToSlurmConfig and AddNodesToSlurmConfig)
  • Otherwise: keeps the existing behavior, passing dynamic node configuration through the --conf flag

Refactored the script to use a bash array for slurmd arguments, eliminating code duplication and improving readability.
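
The mode switch described above can be sketched roughly as follows. This is a minimal illustration, not the actual contents of slurmd_entrypoint.sh: the function name `build_slurmd_args`, the array name `SLURMD_ARGS`, and the variables `NODE_HOSTNAME`, `NODE_ADDR`, and `REAL_MEMORY` are hypothetical, and the real script may pass additional flags and node parameters (such as Gres and Features).

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the two startup modes; not the real slurmd_entrypoint.sh.

# Fills the global array SLURMD_ARGS with the arguments for the selected mode.
build_slurmd_args() {
    # Arguments shared by both modes. -D keeps slurmd in the foreground.
    SLURMD_ARGS=(-D)

    if [[ "${SOPERATOR_NODE_SETS_ON:-}" == "true" ]]; then
        # Static mode: the node definition is read from slurm.conf,
        # so no --conf argument is appended.
        echo "Starting slurmd with static node configuration" >&2
    else
        # Dynamic mode: node parameters are passed at runtime via --conf.
        # NODE_HOSTNAME, NODE_ADDR and REAL_MEMORY are assumed variable names.
        SLURMD_ARGS+=(--conf "NodeHostname=${NODE_HOSTNAME:-} NodeAddr=${NODE_ADDR:-} RealMemory=${REAL_MEMORY:-}")
    fi
}

# Typical use at the end of an entrypoint (commented out so the sketch
# can be sourced without launching slurmd):
#   build_slurmd_args
#   exec slurmd "${SLURMD_ARGS[@]}"
```

Building the argument list once and expanding it with `"${SLURMD_ARGS[@]}"` means there is a single slurmd invocation for both modes, and the quoted `--conf` value stays one argument even though it contains spaces.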

Testing

  • Manual testing with SOPERATOR_NODE_SETS_ON=true to verify slurmd starts without --conf flag
  • Manual testing without the variable set to verify backward compatibility with existing dynamic node configuration
  • Verified slurmd logs show correct startup messages for both modes

Release Notes

Feature: Added support for structured partition configuration mode in slurmd. When the SOPERATOR_NODE_SETS_ON environment variable is set to true, slurmd uses the static node configuration from slurm.conf instead of dynamic configuration, enabling NodeSet-based partition management. This change is backward compatible: existing clusters continue to work without modification.

Uburro added the docker and feature labels Oct 8, 2025
Uburro changed the title from "SCHED-136: del gres structure from nodesets.spec (#1636)" to "SCHED-138: add SOPERATOR_NODE_SETS_ON=true for static worker configuration" Oct 8, 2025
Uburro marked this pull request as ready for review October 8, 2025 19:29
theyoprst requested a review from Copilot October 9, 2025 07:38

Copilot AI (Contributor) left a comment

Pull Request Overview

This PR adds support for structured partition configuration in slurmd by introducing the SOPERATOR_NODE_SETS_ON environment variable. When set to true, slurmd uses static node configuration from slurm.conf instead of dynamic configuration, enabling NodeSet-based partition management.

Key changes:

  • Added conditional logic to switch between static and dynamic node configuration modes
  • Refactored slurmd argument handling using bash arrays for better maintainability
  • Maintained backward compatibility with existing dynamic configuration

Uburro merged commit 86addf7 into main Oct 9, 2025
4 checks passed