While checking the latest test reports, I realized that there must have been a misunderstanding about my specification for the heavy-test job.
This is the regular new backend ScalableTestSuite job report:

and this is the corresponding heavy-test new backend ScalableTestSuite job report:

All the code generation tasks take 40% less time in the heavy-test job, presumably because there are not many cores competing for memory access on one or two DMA memory channels. However, the C-compilation task takes about 7 times longer, presumably because many clang instances are not running in parallel on the available cores. Incidentally, this explains why the heavy-tests job takes so long to run.
My intention was for the simulations in the heavy-tests job to run as if they were run standalone on the workstation: one by one, but with all the available memory and cores at their disposal. For some reason, this has probably ended up with just one core being used.
Can you please fix that?
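
To make the intent concrete, here is a rough sketch of the scheduling I have in mind: models are processed strictly one after another, but the C compilation of each model is allowed to use every core. It is only an illustration, not the actual job script. The function names, the per-model build directories, and the assumption that each model is compiled through a generated makefile with `make -j` are all hypothetical.

```python
import os
import subprocess


def compile_command(build_dir, n_jobs=None):
    """Return a make invocation that compiles one model with n_jobs parallel jobs.

    Hypothetical helper: assumes each model has its own build directory
    containing a generated makefile.
    """
    if n_jobs is None:
        # Use every core reported by the OS; fall back to 1 if unknown.
        n_jobs = os.cpu_count() or 1
    return ["make", "-C", build_dir, f"-j{n_jobs}"]


def run_heavy_tests(build_dirs, runner=subprocess.run):
    """Process the models one by one; each compilation may use all cores."""
    for build_dir in build_dirs:
        # Sequential across models, parallel within a single compilation.
        runner(compile_command(build_dir), check=True)
```

For example, `run_heavy_tests(["model_a", "model_b"])` would compile `model_a` on all cores, wait for it to finish, and only then start `model_b`.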