# Slurm jobs on NeSI GPUs

In this section, we revisit our first model and run it non-interactively on a GPU by submitting a Slurm job.

## Create a Python script

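The goal here is to turn the model code from the interactive notebook into a standalone script that Slurm can run without a notebook session.

The cell below uses the `%%writefile` magic to save its contents as `train_model.py` instead of executing them. The script loads the CIFAR-10 dataset, builds a small convolutional network, trains it for 20 epochs, saves the trained model to `trained_model_cifar10`, and prints the test accuracy.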

In [None]:
%%writefile train_model.py
import tensorflow as tf
from tensorflow.keras import datasets, layers, models

# load CIFAR-10 and scale pixel values to the [0, 1] range
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()
train_images, test_images = train_images / 255.0, test_images / 255.0
input_shape = train_images.shape[1:]

# build a small convolutional network
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation="relu", input_shape=input_shape))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation="relu"))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation="relu"))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation="relu"))
# output raw logits for the 10 classes (no softmax)
model.add(layers.Dense(10))

model.summary()

# from_logits=True matches the final layer, which has no softmax activation
model.compile(
    optimizer="adam",
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=["accuracy"],
)

# train, validating on the test set after each epoch
history = model.fit(
    train_images, train_labels, epochs=20, validation_data=(test_images, test_labels)
)

# save the trained model
model.save("trained_model_cifar10")

test_loss, test_acc = model.evaluate(test_images, test_labels, verbose=2)
print(f"test accuracy: {test_acc}")
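
Because the job runs without a notebook session, the `history` object returned by `model.fit` is gone once the script exits. One option, sketched here as an assumption and not part of the original script, is to write its per-epoch metrics to a JSON file so they can be inspected after the job ends. The helper name `save_history` and the file name `history.json` are hypothetical.

```python
import json
from pathlib import Path


def save_history(history_dict, path="history.json"):
    """Write per-epoch metrics to a JSON file and return the file path.

    `history_dict` is expected to map metric names to lists of numbers,
    which is the shape of `history.history` in Keras.
    """
    # convert every value to a plain float so that json can serialize it
    serializable = {
        name: [float(value) for value in values]
        for name, values in history_dict.items()
    }
    path = Path(path)
    path.write_text(json.dumps(serializable, indent=2))
    return path
```

In `train_model.py` this could be called as `save_history(history.history)` right after `model.fit`.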

## Create a Slurm job script

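The goal here is to describe the resources the job needs and the commands it should run.

A Slurm job script is a shell script whose `#SBATCH` comment lines are read by `sbatch` as options. In the script below they request:

- `--account`: the project the job is charged to (`nesi99999`)
- `--time`: a wall-time limit of 10 minutes
- `--ntasks` and `--cpus-per-task`: one task with 2 CPU cores
- `--mem`: 8 GB of memory
- `--gpus-per-node`: one A100 GPU

The rest of the script loads the Miniconda3 and cuDNN/CUDA modules, activates the conda environment, and runs `train_model.py`.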

In [None]:
%%writefile train_model.sl
#!/usr/bin/env bash
#SBATCH --account=nesi99999
#SBATCH --time=00-00:10:00
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=2
#SBATCH --mem=8GB
#SBATCH --gpus-per-node=A100:1

# load required environment modules
module purge
module load Miniconda3 cuDNN/8.1.1.33-CUDA-11.2.0

# activate the conda environment
source "$(conda info --base)/etc/profile.d/conda.sh"
# ignore packages installed in ~/.local so they cannot shadow the environment
export PYTHONNOUSERSITE=1
conda activate /scale_wlg_persistent/filesets/project/nesi99999/ml102_jupyter_kernel_env

# execute the script
python train_model.py
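
To confirm from inside the job that the requested resources were granted, the Python script could print a few environment variables. The sketch below assumes that Slurm sets `SLURM_JOB_ID` and `SLURM_CPUS_PER_TASK` for the job and that `CUDA_VISIBLE_DEVICES` is set when a GPU is allocated; whether each one is present depends on the cluster configuration. The function name `describe_allocation` is hypothetical.

```python
import os

# variables to report; any of them may be missing outside a Slurm job
REPORTED_VARIABLES = ("SLURM_JOB_ID", "SLURM_CPUS_PER_TASK", "CUDA_VISIBLE_DEVICES")


def describe_allocation(env=None):
    """Return the reported variables, with a placeholder for unset ones."""
    env = os.environ if env is None else env
    return {name: env.get(name, "<not set>") for name in REPORTED_VARIABLES}


# print one NAME=value line per variable into the job output
for name, value in describe_allocation().items():
    print(f"{name}={value}")
```

Run outside a Slurm job, this prints `<not set>` for the Slurm variables, which makes it easy to see whether the script is running under the scheduler.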

## Submit a Slurm job and check results

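The goal here is to hand the job script to the scheduler and inspect what it produced.

`sbatch` submits the script and prints the ID of the new job; the job then waits in the queue until the requested resources are free. `squeue --me` lists your pending and running jobs; a job that no longer appears has ended.

Unless the script sets `--output`, Slurm writes the job's output to `slurm-<jobid>.out` in the directory the job was submitted from. Once the job has ended, that file should contain the model summary, the per-epoch training log, and the final test accuracy line.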

In [None]:
!sbatch train_model.sl

In [None]:
!squeue --me
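
To locate the job's output file, the job ID can be extracted from the message printed by `sbatch`. This is a minimal sketch, assuming the usual `Submitted batch job <id>` message and the default `slurm-<jobid>.out` file name (no `--output` option in the job script). The helper names `parse_job_id` and `default_log_name` are hypothetical.

```python
import re


def parse_job_id(sbatch_output):
    """Extract the numeric job ID from the text printed by sbatch."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"no job ID found in: {sbatch_output!r}")
    return int(match.group(1))


def default_log_name(job_id):
    """Return the default Slurm output file name for a batch job."""
    return f"slurm-{job_id}.out"
```

In a notebook, the output of the submission can be captured with `output = !sbatch train_model.sl` and passed on as `parse_job_id("\n".join(output))`; the resulting file can then be displayed with `!cat` once the job has ended.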