segfault when doing index.search using faiss-cpu #15

Closed
rom1504 opened this issue Jul 6, 2020 · 15 comments

@rom1504

rom1504 commented Jul 6, 2020

Running the faiss example

import numpy as np
d = 64                           # dimension
nb = 100000                      # database size
nq = 10000                       # nb of queries
np.random.seed(1234)             # make reproducible
xb = np.random.random((nb, d)).astype('float32')
xb[:, 0] += np.arange(nb) / 1000.
xq = np.random.random((nq, d)).astype('float32')
xq[:, 0] += np.arange(nq) / 1000.

import faiss                   # make faiss available
index = faiss.IndexFlatL2(d)   # build the index
print(index.is_trained)
index.add(xb)                  # add vectors to the index
print(index.ntotal)

k = 4                          # we want to see 4 nearest neighbors
D, I = index.search(xb[:5], k) # sanity check
print(I)
print(D)
D, I = index.search(xq, k)     # actual search
print(I[:5])                   # neighbors of the 5 first queries
print(I[-5:])                  # neighbors of the 5 last queries

I'm getting a segfault when doing the search.

OS: ubuntu 20.04
faiss-cpu version: 1.6.3

Any idea what could be wrong?

@kyamagu
Owner

kyamagu commented Jul 6, 2020

Not sure; it is likely an ABI compatibility issue at runtime. Do you have a stack trace?

@rom1504
Author

rom1504 commented Jul 6, 2020

#0  0x00007fffe814e28d in ?? () from /media/donnees/non_sauvegarde/image_embeddings_env/lib/python3.7/site-packages/faiss/_swigfaiss.cpython-37m-x86_64-linux-gnu.so
#1  0x00007fffe8174dc0 in ?? () from /media/donnees/non_sauvegarde/image_embeddings_env/lib/python3.7/site-packages/faiss/_swigfaiss.cpython-37m-x86_64-linux-gnu.so
#2  0x00000000005c8663 in _PyMethodDef_RawFastCallKeywords ()
#3  0x0000000000535990 in ?? ()
#4  0x000000000053c5a1 in _PyEval_EvalFrameDefault ()
#5  0x00000000005c916b in _PyFunction_FastCallKeywords ()
#6  0x0000000000535880 in ?? ()
#7  0x0000000000538713 in _PyEval_EvalFrameDefault ()
#8  0x00000000005365e7 in _PyEval_EvalCodeWithName ()
#9  0x00000000005c9468 in _PyFunction_FastCallKeywords ()
#10 0x0000000000535880 in ?? ()
#11 0x0000000000538713 in _PyEval_EvalFrameDefault ()
#12 0x00000000005365e7 in _PyEval_EvalCodeWithName ()
#13 0x000000000064cbb3 in PyEval_EvalCode ()
#14 0x00000000006402a3 in ?? ()
#15 0x0000000000640357 in PyRun_FileExFlags ()
#16 0x000000000064110a in PyRun_SimpleFileExFlags ()

Not sure if that really helps
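
Since the native frames above carry no symbols, a Python-level traceback may help locate the crashing call. A minimal sketch, assuming a POSIX system: it runs a script in a child interpreter with the standard faulthandler module enabled and checks whether the child was killed by SIGSEGV. The helper name run_with_faulthandler is hypothetical.

```python
import signal
import subprocess
import sys


def run_with_faulthandler(python_args):
    """Run `python -X faulthandler <python_args>` in a child process.

    Returns (segfaulted, stderr_text). With faulthandler enabled, the child
    prints the Python traceback of every thread to stderr before it dies.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", *python_args],
        capture_output=True,
        text=True,
    )
    # On POSIX, a return code of -N means the child was killed by signal N.
    segfaulted = proc.returncode == -signal.SIGSEGV
    return segfaulted, proc.stderr
```

For example, run_with_faulthandler(["test.py"]) would return the traceback text in the second element if the search call crashes.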

@rom1504
Author

rom1504 commented Jul 6, 2020

Ah, interesting...
It seems I'm hitting the same issue as #1.
In my environment I had tensorflow 2.2.0.
If I make an environment with only faiss-cpu, it works.

@rom1504
Author

rom1504 commented Jul 6, 2020

Do you think a fix similar to #1 (comment) would be possible?

@rom1504
Author

rom1504 commented Jul 6, 2020

OK, no, I identified the actual issue.
Installing tensorflow 2 requires a newer pip, so I had updated pip to 20.1.1
(by running pip install -U pip in a venv).

These commands reproduce my issue:

python3 -m venv env
source env/bin/activate
pip install -U pip
pip install faiss-cpu
python test.py

test.py contains the script from the first post.

I'm using Python 3.7.5, if that makes a difference.

Can you reproduce it? Any idea why that would happen?

@kyamagu
Owner

kyamagu commented Jul 7, 2020

There could be many causes, but I could not reproduce this on a Docker ubuntu 20.04 image. It could be a bad interaction between multiple package managers (pip and conda), a missing CPU feature (avx2), etc.

@kyamagu
Owner

kyamagu commented Jul 7, 2020

Also, if you import tensorflow as tf, always do that before importing other packages.

@rom1504
Author

rom1504 commented Jul 7, 2020

rom1504@rom1504-W35-37ET:~$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 58
model name      : Intel(R) Core(TM) i5-3230M CPU @ 2.60GHz
stepping        : 9
microcode       : 0x21
cpu MHz         : 1197.274
cache size      : 3072 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 2
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts md_clear flush_l1d
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds swapgs itlb_multihit srbds
bogomips        : 5188.15
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management:

Indeed, it looks like my CPU does not support avx2.

What confuses me is that if I do not update pip (keeping pip 18.1 with python 3.7), everything works.
If I update pip to 20.1.1, I get the segfault, even without installing tensorflow.
I don't use conda.

@rom1504
Author

rom1504 commented Jul 7, 2020

Ah I think I found something.
If I use pip 18.1 (working) it's using faiss_cpu-1.6.3-cp37-cp37m-manylinux1_x86_64.whl
whereas if I use pip 20.1.1 (not working) it's using faiss_cpu-1.6.3-cp37-cp37m-manylinux2010_x86_64.whl
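
The two filenames differ only in their platform tag. A minimal sketch of splitting a wheel filename into its fields, following the naming convention in PEP 427 (distribution, version, optional build tag, python tag, ABI tag, platform tag). WheelTags and parse_wheel_filename are illustrative names, not part of pip or faiss.

```python
from typing import NamedTuple


class WheelTags(NamedTuple):
    distribution: str
    version: str
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelTags:
    """Split a wheel filename into its PEP 427 fields (build tag dropped)."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # The optional build tag sits between version and python tag.
        del parts[2]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    return WheelTags(*parts)
```

Comparing the platform_tag of both files shows manylinux1_x86_64 for the working install and manylinux2010_x86_64 for the failing one; everything else is identical.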

Are there differences in this second file that might explain my result?
Maybe this second file requires avx2?

If that's the case, would it be possible to provide a non-avx2 build, or do you advise that I build it myself?

(and do you know a way to identify if the issue is missing avx2 ?)
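
One way to check for AVX2 on Linux is to look at the flags line of /proc/cpuinfo, as in the output posted earlier. A minimal sketch, assuming an x86 Linux machine; the function names cpu_flags and has_avx2 are hypothetical.

```python
from pathlib import Path


def cpu_flags(cpuinfo_text: str) -> set:
    """Return the feature flags of the first CPU in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def has_avx2(cpuinfo_path: str = "/proc/cpuinfo") -> bool:
    """Report whether the 'avx2' flag is listed for this machine."""
    return "avx2" in cpu_flags(Path(cpuinfo_path).read_text())
```

On the i5-3230M above, the flags include avx but not avx2, so has_avx2() would return False. This only tells you what the CPU supports, not whether a given binary actually uses AVX2 instructions.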

@rom1504
Author

rom1504 commented Jul 7, 2020

I tried using python 3.8 instead of 3.7 as per #8 (comment) but this makes no difference in my case

@rom1504
Author

rom1504 commented Jul 7, 2020

I tried to build it myself but got other errors when following https://github.com/kyamagu/faiss-wheels#linux, so I opened a separate issue for this (#17).
I think it may make sense for the lib I'm building to simply not support hardware without AVX2, and to advise using google colab in that case.

@kyamagu
Owner

kyamagu commented Jul 8, 2020

manylinux1 and manylinux2010 have different glibc version dependency, and they are usually forward compatible: manylinux1 works on centos 5 environment and above, manylinux2010 works on centos 6 and above. ubuntu 20.04 should be fine with both.
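
To illustrate the glibc dependency: PEP 513 specifies glibc 2.5 as the baseline for manylinux1, and PEP 571 specifies glibc 2.12 for manylinux2010. A minimal sketch that lists which of the two tags a given glibc version satisfies; the function names are hypothetical, and this is not how pip itself decides (as far as I know, pip only started accepting manylinux2010 wheels in 19.0, which would explain why pip 18.1 picked the manylinux1 file).

```python
import platform

# Minimum glibc versions from PEP 513 (manylinux1) and PEP 571 (manylinux2010).
MANYLINUX_MIN_GLIBC = {
    "manylinux1": (2, 5),
    "manylinux2010": (2, 12),
}


def compatible_manylinux_tags(glibc_version):
    """Return the manylinux tags whose glibc baseline is met by glibc_version."""
    return [
        tag
        for tag, minimum in MANYLINUX_MIN_GLIBC.items()
        if glibc_version >= minimum
    ]


def detect_glibc_version():
    """Return (major, minor) of the running glibc, or None if not on glibc."""
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        return None
    return tuple(int(part) for part in version.split(".")[:2])
```

On a system with a recent glibc, compatible_manylinux_tags(detect_glibc_version()) returns both tags, which matches the statement that both wheel variants should load there.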

Hmm, it makes sense to support an option to enable or disable avx2.

@kyamagu
Owner

kyamagu commented Jul 8, 2020

If you build from source, you can also try the official approach:

cd faiss
./configure
make
make -C python

@rom1504
Author

rom1504 commented Jul 8, 2020

So, as discussed in the GitHub Actions PR, the new version will not use avx2 instructions. That's great. I understand that it's not possible to change an existing release and that you follow faiss versioning, which makes sense.
In the meantime, I forked your repo (https://github.com/rom1504/faiss-wheels) and, thanks to GitHub Actions, I was able to release the package https://pypi.org/project/faiss-cpu-noavx2 just by changing NAME in setup.py, which completely solves my problem.

I will use it until faiss releases a new version, at which point I'll be able to go back to faiss-cpu.

Again, thanks for the support and for building this simple and useful repo. At some point you could consider simply opening a PR against the faiss main repo; they might just accept it and maintain it in the future.

rom1504 closed this as completed Jul 8, 2020

@kyamagu
Owner

kyamagu commented Jul 9, 2020

@rom1504 Great.

As a side note, faiss maintainers seem not to be interested in supporting PyPI: facebookresearch/faiss#170 (comment)
This is why this unofficial repo exists.
