Slurm Bridge

Run Slurm as a Kubernetes scheduler. A Slinky project.

Overview

Slurm and Kubernetes are workload managers originally designed for different kinds of workloads. In broad strokes: Kubernetes excels at scheduling workloads that typically run for an indefinite amount of time, with potentially vague resource requirements, on a single node, with loose policy, and it can scale its resource pool to meet demand. Slurm excels at quickly scheduling workloads that run for a finite amount of time, with well-defined resource requirements and topology, on multiple nodes, with strict policy, over a known resource pool.

This project combines the strengths of both workload managers. It contains a Kubernetes scheduler that lets Slurm schedule selected Kubernetes workloads.

[Figure: Slurm Bridge architecture]

For additional architectural notes, see the architecture docs.

Features

Slurm

Slurm is a full-featured HPC workload manager. To highlight a few features:

  • Priority: assigns priorities to jobs upon submission and on an ongoing basis (e.g. as they age).
  • Preemption: stops one or more low-priority jobs to let a high-priority job run.
  • QoS: sets of policies affecting scheduling priority, preemption, and resource limits.
  • Fairshare: distributes resources equitably among users and accounts based on historical usage.

Requirements

  • Kubernetes Version: >= v1.29
  • Slurm Version: >= 25.05
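The minimum versions above can be checked with a small shell helper before installing. This is a minimal sketch, not part of slurm-bridge: `version_ge` is a hypothetical function name, and it assumes a `sort` that supports `-V` (version sort), as GNU coreutils does. How the installed version strings are obtained is left to the cluster's own tooling.

```shell
#!/bin/sh
# Hypothetical helper (not provided by slurm-bridge):
# succeeds when version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available.
version_ge() (
  # Strip a leading "v" so "v1.29" and "1.29" compare alike.
  have="${1#v}"
  want="${2#v}"
  # After a version sort, the required version must come first (or be equal).
  [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
)

# Example with literal version strings; substitute the versions reported
# by your own Kubernetes and Slurm installations.
version_ge "v1.31.2" "v1.29" && echo "Kubernetes version meets the requirement"
version_ge "24.11" "25.05" || echo "Slurm 24.11 is older than the required 25.05"
```

The function body runs in a subshell, so its `have` and `want` variables do not leak into the calling shell.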

Limitations

  • Exclusive, whole-node allocations are made for each pod.
  • Annotations may be used to set CpusPerTask and MemoryPerNode for placeholder jobs.

Installation

Create a secret for slurm-bridge to communicate with Slurm:

export $(scontrol token username=slurm lifespan=infinite)
kubectl create namespace slinky
kubectl create secret generic slurm-bridge-jwt-token --namespace=slinky --from-literal="auth-token=$SLURM_JWT" --type=Opaque
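Before creating the secret, it can help to confirm that `SLURM_JWT` holds a bare token. `scontrol token` prints its result in `SLURM_JWT=<token>` form, so a value that still carries that prefix, or an empty value, would be stored verbatim in the secret. The sketch below is an illustration only: `is_jwt_shaped` is a hypothetical helper, and it assumes the token uses the usual compact JWT layout of three dot-separated base64url segments. It checks the shape of the token, not its validity.

```shell
#!/bin/sh
# Hypothetical helper (not provided by slurm-bridge):
# succeeds when $1 looks like a compact JWT, i.e. three non-empty
# base64url segments separated by dots. It does not verify the signature.
is_jwt_shaped() {
  printf '%s\n' "$1" |
    grep -Eq '^[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+$'
}

# Report on the token currently exported in this shell, if any.
if is_jwt_shaped "${SLURM_JWT:-}"; then
  echo "SLURM_JWT looks like a bare JWT"
else
  echo "SLURM_JWT is empty or not a bare JWT; check it before creating the secret" >&2
fi
```

A value such as `SLURM_JWT=aaa.bbb.ccc` fails the check because `=` is not a base64url character, which catches the case where the variable name was captured along with the token.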

Install the slurm-bridge scheduler:

helm install slurm-bridge oci://ghcr.io/slinkyproject/charts/slurm-bridge \
  --namespace=slinky --create-namespace

For additional instructions, see the quickstart guide.

Documentation

Project documentation is located in the docs directory of this repository.

Slinky documentation can be found here.

License

Copyright (C) SchedMD LLC.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this project except in compliance with the License.

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
