numpy is most likely parallelizing operations (we need to look into how to control the number of threads here), so we should as well.