go test ./... -bench=BenchmarkParallelMatMulOnHeap -benchmem -cpu 1,2,4 -run=^#
What did you expect to see?
It looks like the first benchmark runs with 4 threads instead of 1 (which is what I expected), so the results of the first and last benchmarks are the same.
I suppose this happens because my benchmark takes a long time, so it executes only once (if I make the matrices smaller, it executes several times and the results are correct). It is probably caused by the code here: