diff --git a/doc/source/examples.rst b/doc/source/examples.rst
index c1dc5aa..fd046f9 100644
--- a/doc/source/examples.rst
+++ b/doc/source/examples.rst
@@ -216,7 +216,74 @@ This is the same example, but with an explicit setup to ask for 2 full nodes.
 Job Arrays
 ----------
 
-.. warning:: **WORK IN PROGRESS!**
+Job arrays are a handy way to submit multiple jobs that differ only in, e.g., some parameters of the calculation. SLURM's documentation has a very well-written page on job arrays; I suggest you take a look at it for more details and examples: https://slurm.schedmd.com/job_array.html. Here I'll just show a couple of examples.
+
+A job array is specified via the :data:`--array=<range>` option (see :ref:`Partition, Walltime and Output`), which takes a range of integers as ``<range>``. This range can be given as an interval, e.g. ``1-10`` (numbers from 1 to 10), as a sequence, e.g. ``3,5,23``, or as a combination of both, e.g. ``1-5,13`` (numbers from 1 to 5, then 13).
+
+For example, if one uses the option :data:`--array=1-5,13`, then SLURM will generate 6 different jobs, each of them containing the following environment variables:
+
+.. table::
+   :width: 100%
+   :widths: auto
+
+   +-----------------------------+-----------------------------------------+
+   | Environment Variable        | Value                                   |
+   +=============================+=========================================+
+   | ``$SLURM_ARRAY_TASK_ID``    | One of the following: ``1,2,3,4,5,13``. |
+   +-----------------------------+-----------------------------------------+
+   | ``$SLURM_ARRAY_TASK_COUNT`` | ``6`` (number of jobs in the array)     |
+   +-----------------------------+-----------------------------------------+
+   | ``$SLURM_ARRAY_TASK_MAX``   | ``13`` (max of given range)             |
+   +-----------------------------+-----------------------------------------+
+   | ``$SLURM_ARRAY_TASK_MIN``   | ``1`` (min of given range)              |
+   +-----------------------------+-----------------------------------------+
+
+In other words, each of these 6 jobs will have a variable ``$SLURM_ARRAY_TASK_ID`` containing one (and only one) of the numbers given to :data:`--array`. This variable can then be used to generate one or more parameters of the simulation, in a way that's completely up to you.
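+
+For instance, a minimal sketch of this idea (assuming, purely as a hypothetical naming scheme, that your input files are called ``input_1.dat``, ``input_2.dat``, and so on) could use the task ID to pick one input file per job:
+
+.. code-block:: bash
+
+   # Hypothetical sketch: each array task reads its own input file,
+   # e.g. input_1.dat for task 1, input_2.dat for task 2, etc.
+   INPUT="input_${SLURM_ARRAY_TASK_ID}.dat"
+
+   # Run the calculation on that file
+   ./my_program.x "$INPUT"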
+
+.. note:: Array ranges can additionally be specified with a step. For example, to generate multiples of 3 up to 21, you can use :data:`--array=0-21:3`.
+
+.. note:: You can also specify the maximum number of jobs of the array that are allowed to run at the same time. For example, :data:`--array=1-20%4` generates 20 jobs, but only 4 of them are allowed to run simultaneously.
+
+Serial Job Array
+^^^^^^^^^^^^^^^^
+
+.. code-block:: bash
+   :caption: Serial job array asking for 2 threads (bound to 1 physical core), 990 MB of memory and 6 hours for each of the 32 jobs, on ``regular2``. The output and error filenames are in TORQUE style.
+   :linenos:
+
+   #!/usr/bin/env bash
+   #
+   #SBATCH --job-name=Array_Job
+   #SBATCH --mail-type=ALL
+   #SBATCH --mail-user=jdoe@sissa.it
+   #
+   #SBATCH --ntasks=1
+   #SBATCH --cpus-per-task=2
+   #SBATCH --ntasks-per-core=1
+   #
+   #SBATCH --mem-per-cpu=990mb
+   #
+   #SBATCH --array=1-32
+   #SBATCH --partition=regular2
+   #SBATCH --time=06:00:00
+   #SBATCH --output=%x.o%A-%a
+   #SBATCH --error=%x.e%A-%a
+   #
+
+   ## YOUR CODE GOES HERE (load the modules and do the calculations)
+   ## Sample code:
+
+   # Make sure it's the same module used at compile time
+   module load intel
+
+   # Calculate the parameter of the calculation based on the array index,
+   # e.g. in this case as 5 times the array index
+   PARAM=$((${SLURM_ARRAY_TASK_ID}*5))
+
+   # Run the calculation
+   ./my_program.x $PARAM
+
+.. note:: This workload is tailored to the specifics of the ``regular2`` nodes: with these numbers the 32 jobs should fit even on a single node, if one is available; but hey, you are nonetheless running 32 calculations at the same time! 😄
 
 Dependencies
 ------------
diff --git a/doc/source/extra-tips.rst b/doc/source/extra-tips.rst
index 9b7172d..9554dd5 100644
--- a/doc/source/extra-tips.rst
+++ b/doc/source/extra-tips.rst
@@ -48,6 +48,56 @@ If you want to **totally disable** Hyper-Threading, you can use
 .. code-block:: console
 
    $ sbatch --hint=nomultithread --cpu-bind=cores send_job.sh
+
+Automatic Login
+---------------
+
+If you're on a **trusted computer**, you can avoid entering your password every time you log in to Ulysses.
+
+First, generate an SSH keypair via:
+
+.. code-block:: console
+
+   $ ssh-keygen
+
+Then, upload your credentials to Ulysses:
+
+.. code-block:: console
+
+   $ ssh-copy-id username@frontend2.hpc.sissa.it
+
+You'll be asked for your password for the last time. 🙃
+
+You can further shorten the login procedure by opening (or creating) the file ``~/.ssh/config`` and adding the following lines (replace ``username`` with your SISSA username and ``sissacluster2`` with the name you prefer):
+
+.. code-block:: console
+
+   Host sissacluster2
+       User username
+       HostName frontend2.hpc.sissa.it
+       IdentityFile ~/.ssh/id_rsa
+       ServerAliveInterval 120
+       ServerAliveCountMax 60
+
+Then, in order to log in, you will just need:
+
+.. code-block:: console
+
+   $ ssh sissacluster2
+
+An even shorter way to log in is to open or create the file ``~/.bash_profile`` and, at the end, add the following line (replace ``cluster2`` with some name you like):
+
+.. code-block:: bash
+
+   alias cluster2='ssh sissacluster2'
+
+At this point, logging in to Ulysses becomes a matter of executing the command
+
+.. code-block:: console
+
+   $ cluster2
+
+in a terminal (you might need to close and reopen the terminal first).
 
 Explore Files in a User-Friendly Way
 ------------------------------------