OpenBLAS and MKL behave very differently when used in a multithreaded Julia process. Understandably, this confuses users. Given that `using MKL` is seen, and also marketed, as the easiest way to switch from OpenBLAS to MKL, I think a warning in the README.md about the different behavior in multithreaded settings is warranted.
Example:
The current MKL default for `BLAS.get_num_threads()` is the number of cores, i.e. 40 on my machine. While this is good for serial Julia, it's pretty bad when you run your code with `julia -t N`: each of the N Julia threads that calls into BLAS can use up to 40 MKL threads, so you will immediately oversubscribe your cores by a factor of N. This is not the case for OpenBLAS, which will still use 40 BLAS threads in total. However, even there it is generally better to set `OPENBLAS_NUM_THREADS=1`.
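For illustration, here is a minimal sketch of how one might check the current setting and guard against the oversubscription described above. It only uses `Threads.nthreads()`, `BLAS.get_num_threads()` and `BLAS.set_num_threads()` from the standard library. Dropping to a single BLAS thread whenever Julia runs with more than one thread is my assumption about a safe default, not something MKL.jl does for you, and the variable names are purely illustrative.

```julia
using LinearAlgebra

# Julia threads (set with `julia -t N`) and the current BLAS thread setting.
njulia = Threads.nthreads()
nblas  = BLAS.get_num_threads()
println("Julia threads: ", njulia, ", BLAS threads: ", nblas)

# Assumption: if BLAS is going to be called from several Julia threads,
# restrict each BLAS call to a single thread to avoid oversubscription.
if njulia > 1
    BLAS.set_num_threads(1)
end

println("BLAS threads now: ", BLAS.get_num_threads())
```

With OpenBLAS, calling `BLAS.set_num_threads(1)` at runtime should have roughly the same effect as starting Julia with `OPENBLAS_NUM_THREADS=1`. If your code calls BLAS only from a single Julia thread, you probably want to keep the multithreaded BLAS default instead.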