v1.92.0

@sudheer-quad released this 04 Jun 15:05
dbd2806

What's Changed

Key New Features 🎉

Breaking Changes 🚨

  • Transition to Slurm native auth with resilient workbench key distribution by @arpit974 in #5695
  • Default to sauth for newer deployments in h4d and a3mega-gcsfuse blueprints by @arpit974 in #5707

New Modules 🧱

Module Improvements 🔨

  • Add native K8s annotations and GKE cluster enhancements by @arpit974 in #5610
  • Default Kueue config for Pathways by @scaliby in #5628

Improvements 🛠

  • [Telemetry] Get blueprint even from deployment directory by @kadupoornima in #5656
  • [Telemetry] Capture exit code upon fatal command failures by @kadupoornima in #5658
  • (gke) Remove additional network settings from A3U blueprint by @agrawalkhushi18 in #5652
  • (gke) Remove additional networks from A4 and A4X family blueprints by @agrawalkhushi18 in #5682
  • (gke) Remove additional network settings from TPU v6e, 7x and g4 by @agrawalkhushi18 in #5692
  • [Telemetry] Add support to merge vars from deployment files and CLI --vars by @kadupoornima in #5694
  • [Telemetry] Add support for collecting CPU machines and default machines when unset in the module by @kadupoornima in #5696
  • Make Managed Lustre the default in A3u and A3m series Slurm blueprints by @saara-tyagi27 in #5396
  • [Telemetry] Add a retry mechanism for getting the GCP project information to mitigate transient issues by @kadupoornima in #5702
  • [Telemetry] Add an atomic flag to ensure the telemetry event is not triggered repeatedly by @kadupoornima in #5705
  • Pin DCGM to version 4.5.3 by @shubpal07 in #5721
  • feat(gke): expose monitoring components as a parameter by @cboneti in #5722
  • feat(job submission): Dynamic topology routing for gke jobs by @Neelabh94 in #5664

Deprecations 💤

Version Updates ⏫

  • Fix A3 HighGPU test by pinning GKE version to 1.33 to resolve COS incompatibility by @kadupoornima in #5673
  • Update minimum required Packer version to 1.15.3 by @AdarshK15 in #5701

Bug fixes 🐞

Full Changelog: v1.91.0...v1.92.0