Main branch v2026-05-11
What's Changed
- SCHED-1183: Retry transient Nebius API failures in TF provider and disk cleanup by @theyoprst in #945
- Merge to soperator-release-4.0: SCHED-1183: Retry transient Nebius API failures in TF provider and disk cleanup by @github-actions[bot] in #946
- Merge to main: Merge to soperator-release-4.0: SCHED-1183: Retry transient Nebius API failures in TF provider and disk cleanup by @github-actions[bot] in #947
- SCHED-1557 fix absent NVIDIA_DRIVER_CAPABILITIES env var on worker pods by @asteny in #942
- Merge to main: SCHED-1557 fix absent NVIDIA_DRIVER_CAPABILITIES env var on worker pods by @github-actions[bot] in #948
- NOTIC: Don't recommend to use Mk8s default version in tfvars by @rdjjke in #950
- soperator/example: add missing variable validations by @realvz in #944
- k8s-training: make B200 CUDA driverfull preset - move B200 to CUDA 13 by @dasbatta in #943
- SCHED-1223: adjust system node group size for big clusters by @itechdima in #949
- adjust jail logs cleaner schedule for big clusters by @itechdima in #952
- Add forbid_deletion aka deletion protection support for shared filesy… by @aaronbfagan in #951
- soperator/example: forbid /home in filestore_jail_submounts by @dorukozturk in #953
- SCHED-301: Scale KSM scrape limit for large clusters by @ChessProfessor in #957
- SCHED-1583: Limit max pods for Soperator worker node groups by @ChessProfessor in #956
- SCHED-292: Configure OpenTelemetry batch settings by @ChessProfessor in #959
- SCHED-1570 remove impossible platforms for b300 in UK by @asteny in #960
- Merge to main: SCHED-1570 remove impossible platforms for b300 in UK by @github-actions[bot] in #961
- SCHED-1631: Support BIC region in Terraform by @Uburro in #963
- SCHED-1628: Support adding SSH keys to VMs without public IP addresses by @Uburro in #964
- [CHORE] Changed CI project and GPU preset by @roman-iurkov in #941
- SCHED-1626: add support for sending logs to same region by @Uburro in #965
- Harden k8s-training CI failure handling and bucket cleanup by @aaronbfagan in #958
New Contributors
- @dorukozturk made their first contribution in #953
Full Changelog: main-v2026-04-27...main-v2026-05-11