# Installing and Configuring a Slurm Master Instance

In [None]:
hostname

**Change hostname** (if not already set)

In [None]:
hostnamectl set-hostname slurmmaster

In [None]:
systemctl restart network

In [None]:
hostname

Edit */etc/hosts* (for example with `vim /etc/hosts`) and make sure it contains the localhost entry:

127.0.0.1 localhost

**Setup NTPD**

In [None]:
yum install ntp

In [None]:
systemctl enable ntpd.service

In [None]:
systemctl start ntpd

**Setup Global Users**

Check if *slurm* or *munge* users already exist.  If they do, make sure their GID and UID are the same on all nodes.  If they don't, create them.

In [None]:
cat /etc/passwd | grep slurm

In [None]:
cat /etc/passwd | grep munge

Pick unused UIDs and GIDs.

In [None]:
export MUNGEUID=991

In [None]:
export SLURMUID=1001

In [None]:
export MUNGEGID=985

In [None]:
export SLURMGID=1001
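Before committing to these values, it can help to confirm that nothing already uses them. The following is a minimal sketch: the helper name `id_in_use` is hypothetical, and it only inspects local files such as */etc/passwd* and */etc/group*, so IDs handed out by LDAP or another directory service would need `getent` instead.

```shell
# Hypothetical helper: succeed (exit 0) if numeric ID $1 appears in the
# third field of the passwd- or group-style file $2.
id_in_use() {
    awk -F: -v id="$1" '$3 == id { found = 1 } END { exit !found }' "$2"
}

# Report on the IDs chosen above (falls back to this guide's values
# when the variables are not exported in the current shell).
for pair in "${MUNGEUID:-991}:/etc/passwd" "${SLURMUID:-1001}:/etc/passwd" \
            "${MUNGEGID:-985}:/etc/group"  "${SLURMGID:-1001}:/etc/group"; do
    id=${pair%%:*}      # part before the first colon
    file=${pair#*:}     # part after the first colon
    if id_in_use "$id" "$file"; then
        echo "ID $id is already taken in $file"
    else
        echo "ID $id is free in $file"
    fi
done
```

Run the same check on every node, since the IDs have to be free (or already assigned to the same account) cluster-wide.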

In [None]:
groupadd -g $MUNGEGID munge

In [None]:
groupadd -g $SLURMGID slurm

In [None]:
useradd  -m -c "Munge" -d /var/lib/munge -u $MUNGEUID -g munge -s /sbin/nologin munge

In [None]:
useradd -m -c "SLURM workload manager" -d /var/lib/slurm -u $SLURMUID -g slurm -s /bin/bash slurm

You should now see the following two entries in */etc/passwd*:
```
munge:x:991:985:Munge:/var/lib/munge:/sbin/nologin
slurm:x:1001:1001:SLURM workload manager:/var/lib/slurm:/bin/bash
...
```

**Install and Configure MariaDB**

In [None]:
yum install mariadb-server mariadb-devel -y

In [None]:
systemctl enable mariadb

In [None]:
systemctl start mariadb

In [None]:
mysql_secure_installation

Start a mysql shell:

In [None]:
mysql -p

In [None]:
MariaDB [(none)]> create database slurm_acct_db;

In [None]:
MariaDB [(none)]> grant all on slurm_acct_db.* TO 'slurm'@'localhost' identified by 'somepasswd';

**Configure and Enable SLURMDBD**

Create the following *slurmdbd.conf* file and place it in */etc/slurm/*. The `StoragePass` value must match the password of the `slurm` database user.
```
AuthType=auth/munge
DbdAddr=localhost
DbdHost=localhost
SlurmUser=slurm
DebugLevel=4
LogFile=/var/log/slurm/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
StorageType=accounting_storage/mysql
StorageHost=localhost
StoragePass=somepasswd
StorageUser=slurm
StorageLoc=slurm_acct_db
```
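Because this file holds the database password in clear text, it should not be world-readable. As far as I know, recent Slurm releases also refuse to start `slurmdbd` unless the file is owned by the `SlurmUser` and has mode 600. A minimal sketch of writing it that way follows; the function name `write_slurmdbd_conf` is hypothetical, and `somepasswd` is the placeholder from the listing above.

```shell
# Hypothetical helper: write the slurmdbd.conf shown above to path $1
# with mode 600, so the StoragePass line is not world-readable.
write_slurmdbd_conf() {
    conf=$1
    (
        umask 077   # inside a subshell: a new file gets no group/other access
        cat > "$conf" <<'EOF'
AuthType=auth/munge
DbdAddr=localhost
DbdHost=localhost
SlurmUser=slurm
DebugLevel=4
LogFile=/var/log/slurm/slurmdbd.log
PidFile=/var/run/slurmdbd.pid
StorageType=accounting_storage/mysql
StorageHost=localhost
StoragePass=somepasswd
StorageUser=slurm
StorageLoc=slurm_acct_db
EOF
    )
    chmod 600 "$conf"   # also tightens a pre-existing file
}

# On the master, as root (the slurm user was created earlier):
#   write_slurmdbd_conf /etc/slurm/slurmdbd.conf
#   chown slurm: /etc/slurm/slurmdbd.conf
```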

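The `slurmdbd` service is provided by the Slurm RPMs and authenticates through Munge, so run the next two commands only after both are installed (see the sections below).

In [None]:
systemctl enable slurmdbd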

In [None]:
systemctl start slurmdbd

**Install and Configure Munge**

In [None]:
yum install epel-release

In [None]:
yum install munge munge-libs munge-devel -y

After installing Munge, create a secret key. Do this only on the Slurm server; the other nodes receive a copy of the same key.

In [None]:
yum install rng-tools -y

In [None]:
rngd -r /dev/urandom

In [None]:
/usr/sbin/create-munge-key -r

In [None]:
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key

In [None]:
chown munge: /etc/munge/munge.key

In [None]:
chmod 400 /etc/munge/munge.key
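A quick check of the result can catch a key that was truncated or left with loose permissions. This sketch assumes the 1024-byte key written by the `dd` command above and GNU `stat`; the function name `check_munge_key` is hypothetical.

```shell
# Hypothetical helper: verify that key file $1 has mode 400 and is
# exactly 1024 bytes long (the size produced by the dd command above).
check_munge_key() {
    key=$1
    mode=$(stat -c %a "$key") || return 1   # octal permissions, e.g. 400
    size=$(stat -c %s "$key") || return 1   # size in bytes
    if [ "$mode" != "400" ]; then
        echo "unexpected mode $mode on $key (want 400)" >&2
        return 1
    fi
    if [ "$size" -ne 1024 ]; then
        echo "unexpected size $size on $key (want 1024)" >&2
        return 1
    fi
    echo "$key looks OK"
}

# Usage on the master, as root:
#   check_munge_key /etc/munge/munge.key
```

The same function can be reused on each node after the key has been copied there.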

Create an SSH key pair and add the public key to all nodes.

In [None]:
ssh-keygen -t rsa

In [None]:
cat ~/.ssh/id_rsa.pub | ssh root@nfs-1 'cat >> ~/.ssh/authorized_keys'

Copy the Munge key to all nodes you want to partake in the cluster

In [None]:
scp /etc/munge/munge.key root@nfs-1:/etc/munge

Log in to every node in the cluster, fix the ownership and permissions of the Munge directories, and then start Munge

In [None]:
chown -R munge: /etc/munge/ /var/log/munge/

In [None]:
chmod 0700 /etc/munge/ /var/log/munge/

In [None]:
systemctl enable munge

In [None]:
systemctl start munge

Test Communication Between Munge Clients

In [None]:
munge -n

In [None]:
munge -n | unmunge

In [None]:
munge -n | ssh nfs-1 unmunge

In [None]:
remunge

**Install and Configure Slurm**

Install Slurm Dependencies:

In [None]:
yum install -y rrdtool hwloc munge-devel munge-libs readline-devel perl-ExtUtils-MakeMaker openssl-devel pam-devel rpm-build perl-DBI perl-Switch munge mariadb-devel

Copy Slurm RPMs to all cluster nodes:

In [None]:
scp -r /root/slurm_rpms nfs-1:/root/slurm_rpms

In [None]:
rpm -Uvh /root/slurm_rpms/*.rpm

Create Slurm Home Dir (if needed)

In [None]:
usermod -d /home/slurm slurm

In [None]:
mkdir /home/slurm

In [None]:
chown -R slurm: /home/slurm

Open Ports

In [None]:
firewall-cmd --permanent --zone=public --add-port=6817/udp

In [None]:
firewall-cmd --permanent --zone=public --add-port=6817/tcp

In [None]:
firewall-cmd --permanent --zone=public --add-port=6818/udp

In [None]:
firewall-cmd --permanent --zone=public --add-port=6818/tcp

In [None]:
firewall-cmd --permanent --zone=public --add-port=7321/udp

In [None]:
firewall-cmd --permanent --zone=public --add-port=7321/tcp

In [None]:
firewall-cmd --reload
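The six `--add-port` calls above follow one pattern, so they can also be generated in a loop. The sketch below only prints the commands so that they can be reviewed first. The function name `slurm_firewall_cmds` is hypothetical, and the port list is simply the one used in this guide.

```shell
# Hypothetical helper: print the firewall-cmd invocations for the
# ports used in this guide, followed by the reload.
slurm_firewall_cmds() {
    for port in 6817 6818 7321; do
        for proto in udp tcp; do
            echo "firewall-cmd --permanent --zone=public --add-port=${port}/${proto}"
        done
    done
    echo "firewall-cmd --reload"
}

slurm_firewall_cmds          # review the generated commands
# slurm_firewall_cmds | sh   # then apply them as root
```

Printing instead of executing keeps the helper safe to run on a machine without `firewalld`, and makes it easy to adjust the port list if *slurm.conf* uses different ports.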

Start Slurm on All Compute Nodes First

In [None]:
systemctl start slurmd

Start the *slurmctld* on the Slurm Master Node

In [None]:
systemctl enable slurmctld

In [None]:
systemctl start slurmctld