# Santos Dumont (SD) - Numba CPU MPI B710

In [2]:
# Show the login node's resources
! lscpu | head -n 15 | grep "Model \|CPU(s):\|Thre\|Core\|NUMA\|MHz"

CPU(s):                24
Thread(s) per core:    1
Core(s) per socket:    12
NUMA node(s):          2
Model name:            Intel(R) Xeon(R) CPU E5-2695 v2 @ 2.40GHz
CPU MHz:               2865.820
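
As a quick sanity check on the listing above, the 24 CPUs should equal sockets × cores per socket × threads per core. The sketch below parses the lines shown above. It assumes one socket per NUMA node, because the `Socket(s)` line was filtered out by the `grep`; the `scontrol` output at the end of this notebook (`Sockets=2`) is consistent with that assumption.

```python
# lscpu lines copied from the output above
LSCPU = """\
CPU(s):                24
Thread(s) per core:    1
Core(s) per socket:    12
NUMA node(s):          2
"""

def parse_lscpu(text):
    """Return a dict mapping each lscpu field name to its integer value."""
    fields = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = int(value.strip())
    return fields

info = parse_lscpu(LSCPU)
# Assumption: one socket per NUMA node (Socket(s) is not in the filtered output)
sockets = info["NUMA node(s)"]
logical_cpus = sockets * info["Core(s) per socket"] * info["Thread(s) per core"]
print(logical_cpus == info["CPU(s)"])
```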


### Test the run

In [4]:
%%bash
module load intel_psxe/2020
source /opt/intel/parallel_studio_xe_2020/intelpython3/etc/profile.d/conda.sh
unset I_MPI_PMI_LIBRARY
time mpiexec -n 1 python -m cProfile -s cumtime numbampib710.py > numbampi.txt


real	0m8.299s
user	0m11.938s
sys	0m1.350s


In [6]:
! head numbampi.txt

Heat: 750.0000 | Time: 0.6470 | MPISize: 1
         2638012 function calls (2405034 primitive calls) in 7.454 seconds

   Ordered by: cumulative time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
    910/1    0.009    0.000    7.457    7.457 {built-in method builtins.exec}
        1    0.033    0.033    7.457    7.457 numbampi.py:1(<module>)
   629/32    0.005    0.000    4.783    0.149 <frozen importlib._bootstrap>:978(_find_and_load)
   629/32    0.004    0.000    4.782    0.149 <frozen importlib._bootstrap>:948(_find_and_load_unlocked)
   605/32    0.004    0.000    4.760    0.149 <frozen importlib._bootstrap>:663(_load_unlocked)
   898/32    0.001    0.000    4.750    0.148 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
   513/61    0.002    0.000    4.118    0.068 {built-in method builtins.__import__}
8657/5866    0.011    0.000    4.109    0.001 <frozen importlib._bootstrap>:1009(_handle_fromlist)
   520/30    0.002    0.000    3.710    0.
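
In the listing above, about 4.78 s of the 7.45 s total is cumulative time inside `importlib`, while the program itself reports `Time: 0.6470`. To look at the compute part without the import overhead, one option is to profile a single function in-process with the standard `cProfile` and `pstats` modules. This is only a sketch: `work` is a hypothetical stand-in, since the contents of `numbampib710.py` are not shown here.

```python
import cProfile
import io
import pstats

def work(n=200000):
    """Hypothetical stand-in for the compute kernel (not the real solver)."""
    return sum(i * i for i in range(n))

# Profile only the call of interest, leaving imports outside the measurement
profiler = cProfile.Profile()
profiler.enable()
result = work()
profiler.disable()

# Sort by cumulative time, as `-s cumtime` does on the command line
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```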

Test with 16 processes:

In [7]:
%%bash
module load intel_psxe/2020
source /opt/intel/parallel_studio_xe_2020/intelpython3/etc/profile.d/conda.sh
unset I_MPI_PMI_LIBRARY
mpiexec -n 16 python numbampib710.py

Heat: 750.0000 | Time: 1.3539 | MPISize: 16


### Copy the Python source file to /scratch

In [8]:
! cp  numbampib710.py  /scratch${PWD#/prj}
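
The `${PWD#/prj}` expansion removes a leading `/prj` from the current directory, so the file lands in the matching directory under `/scratch`. The same mapping can be sketched in Python; the function name `to_scratch` and the example path are hypothetical.

```python
def to_scratch(path, project_root="/prj", scratch_root="/scratch"):
    """Mirror the shell expression /scratch${PWD#/prj} for a given path."""
    # Strip the project prefix only if the path starts with it
    if path.startswith(project_root):
        path = path[len(project_root):]
    return scratch_root + path

print(to_scratch("/prj/myproject/user/numba"))
```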

### Slurm batch script

In [9]:
%%writefile numbampi.srm
#!/bin/bash
#SBATCH --ntasks=96            #Total number of tasks
#SBATCH --job-name numbampi    #Job name, 8 characters
#SBATCH --partition cpu_dev    #Queue (partition) to use
#SBATCH --time=00:01:00        #Maximum run time
#SBATCH --exclusive            #Exclusive use of the nodes

echo '- Job ID:' $SLURM_JOB_ID
echo '- Tarefas por no:' $SLURM_NTASKS_PER_NODE
echo '- Qtd. de nos:' $SLURM_JOB_NUM_NODES
echo '- Tot. de tarefas:' $SLURM_NTASKS
echo '- Nos alocados:' $SLURM_JOB_NODELIST
nodeset -e $SLURM_JOB_NODELIST

#Modules
module load intel_psxe/2020
source /opt/intel/parallel_studio_xe_2020/intelpython3/etc/profile.d/conda.sh

#Change to the working directory
cd /scratch${PWD#/prj}

#Executable
EXEC='python numbampib710.py'

#Launch the run
srun --mpi=pmi2  -n $SLURM_NTASKS  $EXEC

Writing numbampi.srm
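
The script requests only a task count and leaves the node count to Slurm. Assuming 24 cores per node (the `CPUTot=24` reported by `scontrol` later in this notebook) and one CPU per task, 96 tasks fill 4 nodes, which matches the `NODES` column that `squeue` prints after submission. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
import math

def nodes_needed(ntasks, cores_per_node=24, cpus_per_task=1):
    """Smallest number of nodes that can hold ntasks tasks."""
    tasks_per_node = cores_per_node // cpus_per_task
    return math.ceil(ntasks / tasks_per_node)

print(nodes_needed(96))
```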


## Submit to the dev queue

In [11]:
%%bash
sbatch numbampi.srm
squeue --user $(whoami) -h -r | wc -l
squeue --partition=cpu_dev -h -r | wc -l
squeue --start --name=numbampi --format "%S %.8i %.9P %.5j %.2t %.5M %.5D %.4C"

Submitted batch job 1360854
1
4
START_TIME    JOBID PARTITION  NAME ST  TIME NODES CPUS
N/A  1360854   cpu_dev numba PD  0:00     4   96


Check whether the job has already run:

In [12]:
! squeue --start --name=numbampi --format "%S %.8i %.9P %.5j %.2t %.5M %.5D %.4C"

START_TIME    JOBID PARTITION  NAME ST  TIME NODES CPUS
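
A listing with only the header row means the job is no longer pending or running. The sketch below makes that check explicit. The helper `job_in_queue` is hypothetical; it takes the text printed by `squeue` as input so that it can be tried without access to a cluster, using the two listings shown in this notebook.

```python
def job_in_queue(squeue_output, job_id):
    """True if job_id appears in the JOBID column of an squeue listing."""
    lines = [ln for ln in squeue_output.splitlines() if ln.strip()]
    if not lines:
        return False
    column = lines[0].split().index("JOBID")
    for row in lines[1:]:
        parts = row.split()
        if len(parts) > column and parts[column] == str(job_id):
            return True
    return False

# Listings copied from the two squeue calls above
PENDING = """\
START_TIME    JOBID PARTITION  NAME ST  TIME NODES CPUS
N/A  1360854   cpu_dev numba PD  0:00     4   96
"""
FINISHED = "START_TIME    JOBID PARTITION  NAME ST  TIME NODES CPUS\n"

print(job_in_queue(PENDING, 1360854), job_in_queue(FINISHED, 1360854))
```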


Show the file containing the output:

In [13]:
! cat /scratch${PWD#/prj}/slurm-1360854.out

- Job ID: 1360854
- Tarefas por no:
- Qtd. de nos: 4
- Tot. de tarefas: 96
- Nos alocados: sdumont[1263-1266]
sdumont1263 sdumont1264 sdumont1265 sdumont1266
Heat: 602.6262 | Time: 3.2346 | MPISize: 96
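
The three runs in this notebook can be compared by parsing their result lines. The sketch below uses the lines printed above for 1, 16 and 96 processes. The last column is the one-process time divided by each run's time, so values below 1 mean the run was slower than the single-process run. The comparison is only indicative: the first two runs were made on the login node (the first one under `cProfile`), and the third on compute nodes.

```python
import re

# Result lines copied from the three runs shown in this notebook
RUNS = """\
Heat: 750.0000 | Time: 0.6470 | MPISize: 1
Heat: 750.0000 | Time: 1.3539 | MPISize: 16
Heat: 602.6262 | Time: 3.2346 | MPISize: 96
"""

PATTERN = re.compile(
    r"Heat:\s*([\d.]+)\s*\|\s*Time:\s*([\d.]+)\s*\|\s*MPISize:\s*(\d+)"
)

def parse_runs(text):
    """Return a list of (mpi_size, heat, seconds) tuples."""
    return [(int(m.group(3)), float(m.group(1)), float(m.group(2)))
            for m in PATTERN.finditer(text)]

runs = parse_runs(RUNS)
base_heat, base_time = runs[0][1], runs[0][2]
for size, heat, seconds in runs:
    ratio = base_time / seconds
    # Flag any run whose heat value differs from the one-process result
    flag = "" if abs(heat - base_heat) < 1e-4 else "  <- heat differs"
    print(f"{size:3d} procs  heat={heat:9.4f}  time={seconds:.4f}s  t1/tN={ratio:.2f}{flag}")
```

With these numbers neither multi-process run is faster than the single-process one, and the 96-process heat value differs from the other two. Both points are worth investigating before scaling further.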


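In this case we submitted the job to `cpu_dev`, a "fast" queue intended for tests and small jobs. The `Tarefas por no` (tasks per node) line in the output is empty because the script sets only `--ntasks`; Slurm defines `SLURM_NTASKS_PER_NODE` only when `--ntasks-per-node` is given.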

## Analyzing past jobs

In [30]:
! sacct --jobs=1360854 --format=jobname,ncpus,nnodes,maxrss,maxrssnode%13,start,elapsed,cputime

   JobName      NCPUS   NNodes     MaxRSS    MaxRSSNode               Start    Elapsed    CPUTime 
---------- ---------- -------- ---------- ------------- ------------------- ---------- ---------- 
  numbampi         96        4                          2021-09-23T21:13:08   00:00:15   00:24:00 
     batch         24        1          0   sdumont1263 2021-09-23T21:13:08   00:00:15   00:06:00 
    python         96        4          0   sdumont1266 2021-09-23T21:13:10   00:00:13   00:20:48 
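
In this listing `CPUTime` is the elapsed time multiplied by the number of allocated CPUs, not the CPU time the processes actually consumed. The three rows above can be checked with a few lines of Python (the helper name `to_seconds` is hypothetical):

```python
def to_seconds(hms):
    """Convert an HH:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds

# (JobName, NCPUS, Elapsed, CPUTime) copied from the sacct output above
ROWS = [
    ("numbampi", 96, "00:00:15", "00:24:00"),
    ("batch", 24, "00:00:15", "00:06:00"),
    ("python", 96, "00:00:13", "00:20:48"),
]

# For each row, test whether NCPUS * Elapsed equals CPUTime
checks = {name: ncpus * to_seconds(elapsed) == to_seconds(cputime)
          for name, ncpus, elapsed, cputime in ROWS}
print(checks)
```

The job therefore accounts for 24 CPU-minutes, although the solver reported about 3.2 s of run time; the rest is presumably start-up cost such as module loading and Python imports.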


In [23]:
! scontrol show node sdumont1263

NodeName=sdumont1263 Arch=x86_64 CoresPerSocket=12
   CPUAlloc=0 CPUErr=0 CPUTot=24 CPULoad=0.01
   AvailableFeatures=(null)
   ActiveFeatures=(null)
   Gres=(null)
   NodeAddr=sdumont1263 NodeHostName=sdumont1263 Version=17.02
   OS=Linux RealMemory=64000 AllocMem=0 FreeMem=61622 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
   Partitions=cpu_dev 
   BootTime=2020-12-05T16:53:44 SlurmdStartTime=2021-03-10T20:07:55
   CfgTRES=cpu=24,mem=62.50G
   AllocTRES=
   CapWatts=n/a
   Socket_CapWatts=n/a
   CurrentWatts=7 LowestJoules=210 ConsumedJoules=189874440
   ExtSensorsJoules=n/s ExtSensorsWatts=0 ExtSensorsTemp=n/s
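
The `Key=Value` layout of `scontrol show node` is easy to turn into a dictionary. The sketch below parses a few lines copied from the output above and cross-checks them: sockets × cores × threads against `CPUTot`, and `RealMemory` (in megabytes) against the `mem=62.50G` shown in `CfgTRES`. Splitting on whitespace is a simplifying assumption; it would break on fields whose values contain spaces.

```python
# Lines copied from the scontrol output above
SCONTROL = """\
NodeName=sdumont1263 Arch=x86_64 CoresPerSocket=12
   CPUAlloc=0 CPUErr=0 CPUTot=24 CPULoad=0.01
   OS=Linux RealMemory=64000 AllocMem=0 FreeMem=61622 Sockets=2 Boards=1
   State=IDLE ThreadsPerCore=1 TmpDisk=0 Weight=1 Owner=N/A MCS_label=N/A
"""

def parse_scontrol(text):
    """Split whitespace-separated Key=Value tokens into a dict of strings."""
    fields = {}
    for token in text.split():
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields

node = parse_scontrol(SCONTROL)
cores = (int(node["Sockets"]) * int(node["CoresPerSocket"])
         * int(node["ThreadsPerCore"]))
mem_gib = int(node["RealMemory"]) / 1024  # RealMemory is given in megabytes
print(cores, int(node["CPUTot"]), mem_gib)
```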
   

