BUG: Numpy freezes/reboot on a i9-7980XE 18cores machine #13300

Closed
ll-portes opened this issue Apr 11, 2019 · 6 comments
@ll-portes

If I run the simple code below, one of our computers completely freezes for 5-10 seconds and then reboots. This happens only on a machine with an Intel i9-7980XE 18-core CPU and Anaconda3-2019.03-Linux-x86_64.sh installed (running Ubuntu 18.04.2 LTS). The same Conda/Ubuntu environment on an i7-7700 4-core CPU has no problems. It happens both at the Python command line and in a Jupyter notebook.

Reproducing code example:

import numpy as np
A = np.matrix([[1.], [3.]])
B = np.matrix([[2., 3.]])
np.dot(A, B)

Note 1: For one week, I tried several combinations of (re)installing different Anaconda versions and updating and downgrading mkl, blas, etc., without success. I started to find a "solution" in the references below, but in my case it's not a virtual environment:

A "partial" solution that worked for me was:

conda install nomkl numpy scipy scikit-learn numexpr
conda remove mkl mkl-service

And only then use this code in my Jupyter notebooks (or in the Python terminal):

import os
os.environ['OPENBLAS_CORETYPE']='Haswell'

It is a "partial" solution because I just discovered (yesterday) that if I use the Python Igraph package, the computer starts to freeze/reboot again. I know nothing about programming compared to you guys, so I'm sorry if I'm reporting this in the wrong way. I don't know where else to report it (I was not even aware of the difference between BLAS and OpenBLAS...). Since this started before I removed the mkl packages, I suppose I should report it here as well, because it is not only a problem in OpenBLAS (I suppose I'm using OpenBLAS only after the conda install nomkl step).

Error message:

I have no error messages. The 18-core computer just completely freezes for 5-10 seconds and then reboots. Sometimes (but not always), it froze/rebooted on a simple np.__version__.

Note 2: the effect of setting os.environ['OPENBLAS_CORETYPE']='Haswell' has been completely deterministic. For example, just after a freeze/reboot, I can run the same code (but now with this command as the first one) and the computer works fine. It works even with joblib and all 18 cores at 100%. So the computer can be cold or hot; the only thing that decides whether it freezes/reboots is whether the code above has been run. (Note: as of yesterday, when using Igraph, it freezes/reboots anyway...)
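A minimal sketch of the "run it as the first command" requirement from Note 2. It assumes (this is not confirmed in this thread) that OpenBLAS reads OPENBLAS_CORETYPE only when the library is loaded, so the variable has to be set before the first import of numpy in the process. The helper name force_openblas_coretype is hypothetical and used only for illustration.

```python
import os
import sys


def force_openblas_coretype(coretype="Haswell"):
    """Set OPENBLAS_CORETYPE and report whether it was set in time.

    Returns True if numpy had not been imported yet in this process,
    False if numpy was already loaded (in which case, under the
    assumption above, the setting is probably ignored).
    """
    numpy_already_loaded = "numpy" in sys.modules
    os.environ["OPENBLAS_CORETYPE"] = coretype
    return not numpy_already_loaded


set_before_numpy = force_openblas_coretype("Haswell")
if not set_before_numpy:
    print("numpy was imported earlier; restart the interpreter or kernel "
          "and set OPENBLAS_CORETYPE first")

# Only after this point: import numpy as np
```

In a Jupyter notebook this would go in the first cell of a freshly started kernel, before any cell that imports numpy or a package that depends on it.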

Numpy/Python version information:

The current output from print(numpy.__version__, sys.version) (i9-7980XE 18 cores cpu):

1.16.2 3.7.3 (default, Mar 27 2019, 22:11:17) [GCC 7.3.0]

Note 3: this output is the same as on the other computer (i7-7700 4-core CPU), which has no problems and still uses the mkl build of numpy.
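Since the version string is identical on both machines, it cannot show whether numpy is linked against MKL or OpenBLAS. A small sketch of how that could be checked uses numpy.show_config(), which prints the build and linkage information of the installed numpy. The function name describe_numpy_build is hypothetical, and the exact output format varies between numpy versions.

```python
import importlib.util
import sys


def describe_numpy_build():
    """Print the Python version, numpy version and numpy build configuration.

    Returns True if numpy was found in this environment, False otherwise.
    """
    print(sys.version)
    if importlib.util.find_spec("numpy") is None:
        print("numpy is not installed in this environment")
        return False
    import numpy as np
    print(np.__version__)
    # Lists the BLAS/LAPACK libraries numpy was built against (e.g. mkl or openblas).
    np.show_config()
    return True


numpy_found = describe_numpy_build()
```

Running this in both the conda environment and the pip environment would make the comparison between the two installs explicit.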

Thank you all in advance!

@mattip changed the title from "Numpy freezes/reboot only on a i9-7980XE 18cores machine" to "BUG: Numpy freezes/reboot in dot on a i9-7980XE 18cores machine" on Apr 11, 2019
@matthew-brett
Contributor

Would you mind trying your tests using the system (apt-get) python, and numpy installed via pip? I mean, without using conda?

Something like:

sudo apt-get install python python-pip
pip install --user numpy

@ll-portes
Author

Hi @matthew-brett, thank you for your help. Sorry for the delay, but I live in Australia...

Without using conda, it worked! But now the Python version is 2.7, instead of the previous 3.7.

print(np.__version__, sys.version)
('1.16.2', '2.7.15rc1 (default, Nov 12 2018, 14:31:15) \n[GCC 7.3.0]')

So, I made the same test after installing Python3 and Numpy, and it worked!

print(np.__version__, sys.version)
1.16.2 3.6.7 (default, Oct 22 2018, 11:32:17) [GCC 8.2.0]

Due to this result, should I report to Anaconda? May I do anything to help?

Note: @mattip , I saw you changed the title of the issue with a focus on np.dot. However, (I don't know if the following information can help) I had this issue as well by running an SVD decomposition:

import numpy as np
nr=1000;nc=10000
X=np.random.rand(nr,nc)
u,s,vt=np.linalg.svd(X)

After the pip installation, with both Python2 and 3, both np.dot and SVD are working.
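For comparing installs, the two reproducers from this thread (np.dot and SVD) could be wrapped in a small smoke test that runs in a child interpreter with a timeout. This is only a sketch: run_smoke_test is a hypothetical helper, the matrix sizes are reduced from the ones above to keep it quick, and a timeout can only catch a hung process, not a whole-machine freeze/reboot like the one reported here.

```python
import subprocess
import sys

# Reduced-size versions of the np.dot and SVD reproducers from this thread.
SNIPPET = """
import numpy as np
A = np.array([[1.], [3.]])
B = np.array([[2., 3.]])
assert np.dot(A, B).shape == (2, 2)
X = np.random.rand(50, 200)
u, s, vt = np.linalg.svd(X, full_matrices=False)
assert s.shape == (50,)
"""


def run_smoke_test(timeout_s=60):
    """Run the reproducers in a separate Python process.

    Returns "ok", "failed" (non-zero exit, e.g. numpy missing or a crash)
    or "timeout" (the child did not finish within timeout_s seconds).
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", SNIPPET],
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return "timeout"
    return "ok" if proc.returncode == 0 else "failed"


status = run_smoke_test()
print(status)
```

The child process uses the same interpreter as the caller (sys.executable), so running the script once from the conda Python and once from the system Python tests each install separately.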

@matthew-brett
Contributor

No problem about the delay. Yes, please do report this to the Anaconda folks. I am sure they will ask for your help debugging, as they may not have a CPU like yours to test on.

@mattip changed the title from "BUG: Numpy freezes/reboot in dot on a i9-7980XE 18cores machine" to "BUG: Numpy freezes/reboot on a i9-7980XE 18cores machine" on Apr 12, 2019
@matthew-brett
Contributor

Oh, also: if you do report to Anaconda, I think this is the right place: https://github.com/ContinuumIO/anaconda-issues

Could you leave a reference to your Anaconda issue here, for our reference?

@ll-portes
Author

Thank you @matthew-brett ! The link is ContinuumIO/anaconda-issues#10832

@charris
Member

charris commented Sep 5, 2019

Closing, seems to be an Anaconda issue.
