Custom batch_submit_cmd with conditional expressions ( for gres ) #304

Closed
wixaw opened this issue Mar 29, 2024 · 2 comments

wixaw commented Mar 29, 2024

Hello,

I am using OptionsFormSpawner to offer a form to my users.

I need to be able to change the sbatch command depending on whether we are on a partition with GPUs or not.

I tried the following code:

c.BatchSpawnerBase.batch_submit_cmd = """
if [ "{partition}" = "GPUNodes" -o "{partition}" = "RTX6000Node" -o "{partition}" = "GPUNodes1080-dev" ]
then
sudo -E -u {username} sbatch --parsable --gres-flags=enforce-binding --gres=gpu:{gpu_number}
else
sudo -E -u {username} sbatch --parsable
fi
"""

But it is not working.

If I choose the GPUNodes partition in my form, it never enters the condition, and my nvidia-smi command does not return any GPU.
If I put "sudo -E -u {username} sbatch --parsable --gres-flags=enforce-binding --gres=gpu:{gpu_number}" in my else statement, it works correctly.
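One way to debug this kind of symptom is to render the template by hand and read the exact shell text that would be executed. The sketch below assumes the placeholders are filled in with Python `str.format`-style substitution; the helper name `render_submit_cmd` and the sample values (`alice`, one GPU) are hypothetical.

```python
# Template as in the excerpt above: one conditional choosing the sbatch flags.
submit_template = """\
if [ "{partition}" = "GPUNodes" -o "{partition}" = "RTX6000Node" -o "{partition}" = "GPUNodes1080-dev" ]
then
sudo -E -u {username} sbatch --parsable --gres-flags=enforce-binding --gres=gpu:{gpu_number}
else
sudo -E -u {username} sbatch --parsable
fi
"""


def render_submit_cmd(template, **values):
    """Fill the {placeholders} the same way str.format would (assumption)."""
    return template.format(**values)


# Hypothetical form values for a user who picked the GPUNodes partition.
rendered = render_submit_cmd(
    submit_template, partition="GPUNodes", username="alice", gpu_number=1
)
print(rendered)
```

Reading the printed text makes it easier to spot a test line that is broken before the shell ever evaluates it.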

This used to work in JupyterHub 1.x.

Package versions:

batchspawner              1.3.1.dev0      /local/batchspawner       /usr/local/lib/python3.9/site-packages   pip
jupyterhub                4.0.2                                     /usr/local/lib/python3.9/site-packages   pip
optionsspawner            0.1.0                                     /usr/local/lib/python3.9/site-packages   pip
wrapspawner               1.0.2.dev0                                /usr/local/lib/python3.9/site-packages   pip

My complete config:

import os
import importlib.machinery
import batchspawner

c = get_config()  #noqa
#======================================================
# GENERAL
#======================================================

# Base config
c.Authenticator.admin_users = {'xxx','xxx'}
c.JupyterHub.hub_ip = 'r9jupyter.domain.fr'
c.JupyterHub.bind_url = 'http://0.0.0.0:8000'
# Reset the sessions on every restart
c.JupyterHub.reset_db = True

#======================================================
# FORM
#======================================================
from optionsspawner.forms import (
    FormField,
    TextInputField,
    NumericalInputField,
    CheckboxInputField,
    SelectField,
)


partition_select = SelectField('req_partition',
    label='Select a partition',
    attr_required=True,
    choices=[
        ('GPUNodes', "GPUNodes"),
        ('RTX6000Node', "RTX6000Node"),
        ('24CPUNodes', "24CPUNodes"),
        ('48CPUNodes', "48CPUNodes"),
        ('GPUNodes1080-dev', "GPUNodes1080-dev"),
    ],
    default='GPUNodes1080-dev'
)


runtime_input = TextInputField('req_runtime',
    label='Specify runtime (HH:MM:SS format, 19hr max)',
    attr_required=True,
    attr_value='01:00:00',
    attr_pattern="[01]{1}[0-9]{1}:[0-5]{1}[0-9]{1}:[0-5]{1}[0-9]{1}"
)

gpu_number_input = NumericalInputField('req_gpu_number',
    label='Specify number of GPUs (2 max per server), only for GPUNodes or RTX6000Node partitions',
    attr_required=True,
    attr_value=1,
    attr_min=1,
    attr_max=2
)

cpu_task_input = NumericalInputField('req_cpu_task',
    label='Specify number of CPUs per task',
    attr_required=True,
    attr_value=4,
    attr_min=1,
    attr_max=64
)

#======================================================
# For every file in the <cuda_11> directory
#list_container_cuda11 = []
#cuda_11 = "/apps/containerCollections/CUDA11"
#for basename in os.listdir(cuda_11):
#    # If the file name ends with ".sif"
#    if basename[-4:] == ".sif":
#        # build the full path
#        path = os.path.join(cuda_11, basename)
#        list_container_cuda11.append( (path, "CUDA-11 " + basename) )

# For every file in the <cuda_12> directory
cuda_12 = "/apps/containerCollections/CUDA12"
list_container_cuda12 = []
for basename in os.listdir(cuda_12):
    # If the file name ends with ".sif"
    if basename[-4:] == ".sif":
        # build the full path
        path = os.path.join(cuda_12, basename)
        list_container_cuda12.append( (path, "CUDA-12 " + basename) )

#======================================================

image_select = SelectField('req_image_path',
    label='Select a singularity image in the list',
    attr_required=True,
    #choices=list_container_cuda11 + list_container_cuda12 + [("autre", "autre")],
    choices=list_container_cuda12 + [("autre", "autre")],
    default='autre'
)

image_input = TextInputField('req_specificimage_path',
    label="or specify your own image (which MUST contain the jupyterhub and jupyterlab packages)",
    attr_placeholder='Path to your singularity image on OSIRIM (/users/.../your-image.sif or /projets/.../your-image.sif)',
)

form_fields = [
    image_select,
    image_input,
    runtime_input,
    partition_select,
    gpu_number_input,
    cpu_task_input,
]

#======================================================
# SPAWNER
#======================================================


c.BatchSpawnerBase.batch_script = '''#!/bin/bash
#SBATCH --time={runtime}
#SBATCH --output={homedir}/occidata-jupyter-%j.log
#SBATCH --error={homedir}/occidata-jupyter-%j.error
#SBATCH --job-name=jupyterlab
#SBATCH --export={keepvars}
#SBATCH --cpus-per-task={cpu_task}
#SBATCH --partition={partition}
#SBATCH --chdir={homedir}
#SBATCH --get-user-env=L
#SBATCH --ntasks=1

set -x

trap 'echo SIGTERM received' TERM

{prologue}

if [ "{image_path}" = "autre" ]
then
export SINGULARITYENV_CONTAINER_PATH={specificimage_path}
else
export SINGULARITYENV_CONTAINER_PATH={image_path}
fi

singularity exec --bind $PWD:/run/user $SINGULARITYENV_CONTAINER_PATH {cmd}

echo "jupyterhub-singleuser ended gracefully"

{epilogue}
'''


#sudo -E -u {username} sbatch --parsable --gres-flags=enforce-binding --gres=gpu:{gpu_number}
# Remove sudo here so it can be put back inside the conditions below
c.BatchSpawnerBase.exec_prefix = ""
# Depending on the partition, request GPUs or not
c.BatchSpawnerBase.batch_submit_cmd = """if [ "{partition}" = "GPUNodes" -o "{partition}" = "RTX6000Node" -o "{partition}" = "GPUNodes1
080-dev"]
then
sudo -E -u {username} sbatch --parsable --gres-flags=enforce-binding --gres=gpu:{gpu_number}
else
sudo -E -u {username} sbatch --parsable
fi"""

# Select form mode
c.JupyterHub.spawner_class = 'optionsspawner.OptionsFormSpawner'
# Define the spawner; details at https://github.com/jupyterhub/batchspawner/blob/main/batchspawner/batchspawner.py
c.OptionsFormSpawner.child_class = 'batchspawner.SlurmSpawner'
# Give it the fields defined earlier
c.OptionsFormSpawner.form_fields = form_fields

# BatchSpawner config
c.BatchSpawnerBase.req_host = 'r9jupyter.domain.fr'
c.BatchSpawnerBase.req_runtime = '12:00:00'
c.BatchSpawnerBase.req_nprocs = '2'
c.BatchSpawnerBase.req_queue = 'r9jupyter'

# Spawner config
c.Spawner.default_url = '/lab'
c.Spawner.notebook_dir = '~'
c.Spawner.start_timeout = 300
c.Spawner.http_timeout = 300
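
In the complete config above, the `if` line of `batch_submit_cmd` is wrapped in the middle of `GPUNodes1080-dev`, and the closing `]` has no space before it. A POSIX `[` test needs `]` as its own final argument, so a test written that way fails and the shell falls through to `else`, which matches the symptom described. Assuming the config is reproduced here as it is on disk, the following sketch flags such lines before the hub is restarted. The function name `find_unclosed_tests` is hypothetical, and the check only understands the one-statement-per-line layout used in this config.

```python
import re

# An if/elif line that opens a `[` test.
OPENS_TEST = re.compile(r"^(if|elif)\s+\[\s")
# The same line must end with a standalone `]`, optionally followed by `; then`.
CLOSES_TEST = re.compile(r"\s\](\s*;\s*then)?$")


def find_unclosed_tests(submit_cmd):
    """Return the if/elif lines whose `[ ... ]` test is not closed on that line."""
    unclosed = []
    for line in submit_cmd.splitlines():
        stripped = line.strip()
        if OPENS_TEST.match(stripped) and not CLOSES_TEST.search(stripped):
            unclosed.append(stripped)
    return unclosed


# Shortened version of the wrapped command from the config above.
wrapped_cmd = '''if [ "{partition}" = "GPUNodes" -o "{partition}" = "GPUNodes1
080-dev"]
then
sudo -E -u {username} sbatch --parsable --gres=gpu:{gpu_number}
else
sudo -E -u {username} sbatch --parsable
fi'''

# Same test kept on a single line, with a space before the closing bracket.
single_line_cmd = '''if [ "{partition}" = "GPUNodes" -o "{partition}" = "GPUNodes1080-dev" ]
then
sudo -E -u {username} sbatch --parsable --gres=gpu:{gpu_number}
else
sudo -E -u {username} sbatch --parsable
fi'''

print(find_unclosed_tests(wrapped_cmd))      # one flagged line
print(find_unclosed_tests(single_line_cmd))  # nothing flagged
```
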

Thank you in advance for your help.

Best regards,

@wixaw wixaw added the bug label Mar 29, 2024

welcome bot commented Mar 29, 2024

Thank you for opening your first issue in this project! Engagement like this is essential for open source projects! 🤗

If you haven't done so already, check out Jupyter's Code of Conduct. Also, please try to follow the issue template, as it helps other community members to contribute more effectively.
You can meet the other Jovyans by joining our Discourse forum. There is also an intro thread there where you can stop by and say Hi! 👋

Welcome to the Jupyter community! 🎉


wixaw commented Apr 4, 2024

Sorry, I found it!

The problem came from the "if" statement being too long and getting split across two lines. The following code works:

if [ "{partition}" = "GPUNodes" ]
then 
sudo -E -u {username} sbatch --parsable --gres-flags=enforce-binding --gres=gpu:{gpu_number}
elif [ "{partition}" = "RTX6000Node" ]
then
sudo -E -u {username} sbatch --parsable --gres-flags=enforce-binding --gres=gpu:{gpu_number}
elif [ "{partition}" = "GPUNodes1080-dev" ]
then
sudo -E -u {username} sbatch --parsable --gres-flags=enforce-binding --gres=gpu:{gpu_number}
else
sudo -E -u {username} sbatch --parsable
fi
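
Since the three GPU branches above run the same command, the chain can also be generated in the config file from a list of partition names, which keeps every test line short. This is only a sketch: `GPU_PARTITIONS`, `GPU_SUBMIT`, `CPU_SUBMIT`, and `build_submit_cmd` are hypothetical names, and it assumes the resulting string is later filled in with `str.format`-style placeholders.

```python
# Partitions that should receive the GPU flags (taken from the form choices).
GPU_PARTITIONS = ["GPUNodes", "RTX6000Node", "GPUNodes1080-dev"]

GPU_SUBMIT = (
    "sudo -E -u {username} sbatch --parsable "
    "--gres-flags=enforce-binding --gres=gpu:{gpu_number}"
)
CPU_SUBMIT = "sudo -E -u {username} sbatch --parsable"


def build_submit_cmd(gpu_partitions):
    """Build an if/elif/else chain with one short test line per GPU partition."""
    if not gpu_partitions:
        # No GPU partitions: no conditional needed.
        return CPU_SUBMIT
    lines = []
    for index, name in enumerate(gpu_partitions):
        keyword = "if" if index == 0 else "elif"
        # Doubled braces keep {partition} as a placeholder for later substitution.
        lines.append(f'{keyword} [ "{{partition}}" = "{name}" ]')
        lines.append("then")
        lines.append(GPU_SUBMIT)
    lines += ["else", CPU_SUBMIT, "fi"]
    return "\n".join(lines)


submit_cmd = build_submit_cmd(GPU_PARTITIONS)
print(submit_cmd)
```

In the config, the result would then be assigned with `c.BatchSpawnerBase.batch_submit_cmd = build_submit_cmd(GPU_PARTITIONS)`, so adding a GPU partition later only means extending the list.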

@wixaw wixaw closed this as completed Apr 4, 2024