I was concerned about running some MATLAB jobs on Blanca: the same job could take
30 minutes on one node and 4h30 on another, and I didn't understand why.
Eventually I found out that, despite submitting jobs with --ntasks-per-node=N, MATLAB was
starting a parallel pool of only n workers, with n < N; for instance, N = 28 but n = 16.
This is actually normal, since MATLAB launches at most as many workers as the number of
physical cores.
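The mismatch the user describes can be checked directly on a node. A minimal sketch (standard Linux tools only; with hyperthreading enabled, the logical CPU count Slurm reports can be twice the physical core count MATLAB will use for workers):

```shell
# Compare logical CPUs (what Slurm counts) with physical cores (what MATLAB uses).
logical=$(nproc)
threads_per_core=$(lscpu | awk -F: '/^Thread\(s\) per core/ {gsub(/ /,"",$2); print $2}')
physical=$(( logical / threads_per_core ))
echo "logical=$logical physical=$physical threads_per_core=$threads_per_core"
```

On a node like the one the user mentions, this would report 28 logical CPUs but only 16 physical cores, matching the 16-worker pool MATLAB created.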
So, without realizing it, I was in a situation where I was allocated 28 logical cores but could
have only 16 workers, corresponding to the 16 physical cores of the node (e.g. bnode0101),
which significantly limited the extent of my parallelization.
I found that the solution, to be allocated the number of physical cores I actually want, is to
add the option --cores-per-socket 28 to my sbatch command.
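If confirmed, a job script along these lines would reproduce the user's workaround (job name, partition, and MATLAB invocation are hypothetical placeholders; per the Slurm documentation, --cores-per-socket restricts node selection to nodes with at least that many cores per socket, which is how it would steer the job away from the 16-physical-core nodes):

```shell
#!/bin/bash
# Hypothetical sbatch script sketching the user's reported fix.
#SBATCH --job-name=matlab_pool          # placeholder name
#SBATCH --ntasks-per-node=28
#SBATCH --cores-per-socket=28           # only consider nodes with >= 28 cores per socket

module load matlab
# Request a pool sized to the allocation rather than MATLAB's default.
matlab -batch "parpool('local', 28)"
```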
I haven't run any tests on Alpine yet, but I noticed that some Alpine nodes also have a reduced
number of physical cores.
I think it would be helpful for other users to add a note about all this to the document.
A user stated the above. We should confirm the behavior they describe; if confirmed, we should add this information to our MATLAB page.