[slurm version in AutoScale script] #543

Closed
imangohari1 opened this issue Oct 21, 2021 · 3 comments
Labels
bug Something isn't working

@imangohari1

Describe the bug
The AutoScale example script pins Slurm 19.05.5, which can no longer be downloaded from SchedMD.
https://www.schedmd.com/archives.php/news.php?id=253

To Reproduce

$ wget https://download.schedmd.com/slurm/slurm-19.05.5.tar.bz2
--2021-10-21 08:36:23--  https://download.schedmd.com/slurm/slurm-19.05.5.tar.bz2
Resolving proxy.jf.intel.com (proxy.jf.intel.com)... 10.7.211.16
Connecting to proxy.jf.intel.com (proxy.jf.intel.com)|10.7.211.16|:911... connected.
Proxy request sent, awaiting response... 404 Not Found
2021-10-21 08:36:24 ERROR 404: Not Found.

Additional context
source: https://github.com/Azure/azurehpc/blob/master/examples/slurm_autoscale/scripts/slurmctl.sh#L7-L17

Here are proposed changes:

--- a/examples/slurm_autoscale/scripts/slurmctl.sh
+++ b/examples/slurm_autoscale/scripts/slurmctl.sh
@@ -4,12 +4,14 @@ yum install -y epel-release screen

 yum install perl-ExtUtils-MakeMaker gcc mariadb-devel openssl openssl-devel pam-devel rpm-build numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool-devel ncurses-devel man2html libibmad libibumad -y

-if [ ! -f "slurm-19.05.5.tar.bz2" ]; then
-  wget https://download.schedmd.com/slurm/slurm-19.05.5.tar.bz2
+slurm_version=${1:-20.11.8}
+slurm_tarball=slurm-${slurm_version}.tar.bz2
+if [ ! -f "$slurm_tarball" ]; then
+  wget "https://download.schedmd.com/slurm/$slurm_tarball"
 fi

 if [ ! -f "/apps/rpms/slurm*.rpm" ]; then
-  rpmbuild -ta slurm-19.05.5.tar.bz2
+  rpmbuild -ta "$slurm_tarball"
   mkdir -p /apps/rpms
   cp /root/rpmbuild/RPMS/x86_64/slurm-* /apps/rpms/
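For reference, the idea behind the proposed change can be sketched as standalone helpers. This is only an illustration, not the repository script: the function names `slurm_tarball_name`, `rpms_present`, and `fetch_slurm` are hypothetical. The sketch also replaces the quoted glob test, because `[ ! -f "/apps/rpms/slurm*.rpm" ]` looks for a file literally named `slurm*.rpm` and is therefore always true.

```shell
# Hypothetical helpers illustrating the parameterized download; not taken from slurmctl.sh.

# Print the tarball file name for a Slurm version (default as in the proposed diff).
slurm_tarball_name() {
  printf 'slurm-%s.tar.bz2\n' "${1:-20.11.8}"
}

# Succeed if at least one slurm*.rpm file exists in the directory given as $1.
# The glob is left unquoted so the shell expands it; an unmatched pattern stays
# literal, which the -e test then rejects.
rpms_present() {
  for f in "$1"/slurm*.rpm; do
    [ -e "$f" ] && return 0
  done
  return 1
}

# Download the tarball only when it is not already in the current directory.
fetch_slurm() {
  tarball=$(slurm_tarball_name "$1")
  if [ ! -f "$tarball" ]; then
    wget "https://download.schedmd.com/slurm/$tarball"
  fi
}
```

With helpers like these, the build step could be guarded with `if ! rpms_present /apps/rpms; then ... fi` so that existing RPMs are actually detected.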
@imangohari1 imangohari1 added the bug Something isn't working label Oct 21, 2021
@vgamayunov
Contributor

Older versions have been removed from the SchedMD website. Due to a security vulnerability (CVE-2021-31215), all versions of Slurm prior to 20.11.7 or 20.02.7 are no longer available for download.
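A script could check a requested version against that rule before attempting a download. The following is a rough sketch, assuming GNU `sort -V` is available and assuming that releases newer than the 20.11 series also carry the fix. The function names `version_ge` and `has_cve_2021_31215_fix` are hypothetical.

```shell
# Succeed if version $1 is greater than or equal to version $2,
# using the version ordering of GNU sort -V.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Succeed if the version is at or above the fixed release named in the comment above:
# 20.02.7 for the 20.02 series, 20.11.7 otherwise.
# Assumption: anything newer than 20.11.7 includes the fix.
has_cve_2021_31215_fix() {
  case $1 in
    20.02.*) version_ge "$1" 20.02.7 ;;
    *)       version_ge "$1" 20.11.7 ;;
  esac
}
```

For example, `has_cve_2021_31215_fix 19.05.5` fails, while `has_cve_2021_31215_fix 20.11.8` succeeds. Passing this check does not guarantee the tarball is still hosted, only that it is not excluded by the rule quoted above.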

@vgamayunov vgamayunov self-assigned this Oct 21, 2021
@imangohari1
Author

@vgamayunov
The install worked with version 20.11.8.
After it completed, I ran into this issue:

[hpcadmin@headnode ~]$ sinfo
sinfo: fatal: SallocDefaultCommand has been removed. Please consider setting LaunchParameters=use_interactive_step instead.
[hpcadmin@headnode ~]$ squeue
squeue: fatal: SallocDefaultCommand has been removed. Please consider setting LaunchParameters=use_interactive_step instead.
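The error message itself names the replacement setting. One possible way to update a configuration file accordingly is sketched below. This is not necessarily what the eventual fix did. The function name `migrate_salloc_default` is hypothetical, the path to `slurm.conf` is left to the caller, and the match assumes the option is spelled exactly `SallocDefaultCommand`.

```shell
# Hypothetical helper: comment out SallocDefaultCommand in the config file given
# as $1 and add LaunchParameters=use_interactive_step if no LaunchParameters
# line is present. An existing LaunchParameters line is left untouched and would
# need to be edited by hand.
migrate_salloc_default() {
  conf=$1
  tmp=$(mktemp) || return 1
  # Prefix any active SallocDefaultCommand line with "# ".
  sed 's/^[[:space:]]*SallocDefaultCommand=/# &/' "$conf" > "$tmp" || return 1
  # Write back through cat so the original file keeps its owner and mode.
  cat "$tmp" > "$conf"
  rm -f "$tmp"
  if ! grep -q '^[[:space:]]*LaunchParameters=' "$conf"; then
    printf 'LaunchParameters=use_interactive_step\n' >> "$conf"
  fi
}
```

The function is idempotent: a second run finds the option already commented out and the `LaunchParameters` line already present, so the file is unchanged.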

@vgamayunov
Contributor

Hi @imangohari1,
this is fixed with #544.
