faiss build failed #520

Closed
danny1984 opened this issue Jul 14, 2018 · 8 comments

@danny1984

danny1984 commented Jul 14, 2018

Summary

g++ -std=c++11 -DFINTEGER=int -fPIC -m64 -Wall -g -O3 -fopenmp -Wno-sign-compare -mavx -msse4 -mpopcnt -c index_io.cpp -o index_io.o
index_io.cpp: In function 'void faiss::read_ArrayInvertedLists_sizes(faiss::IOReader*, std::vector&)':
index_io.cpp:572:12: warning: unused variable 'nlist' [-Wunused-variable]
size_t nlist = sizes.size();
^
g++ -std=c++11 -DFINTEGER=int -fPIC -m64 -Wall -g -O3 -fopenmp -Wno-sign-compare -mavx -msse4 -mpopcnt -c IndexScalarQuantizer.cpp -o IndexScalarQuantizer.o
In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:81:0,
from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:46,
from /usr/include/x86_64-linux-gnu/c++/5/bits/opt_random.h:33,
from /usr/include/c++/5/random:50,
from /usr/include/c++/5/bits/stl_algo.h:66,
from /usr/include/c++/5/algorithm:62,
from IndexScalarQuantizer.cpp:14:
/usr/lib/gcc/x86_64-linux-gnu/5/include/f16cintrin.h: In function 'uint16_t faiss::{anonymous}::encode_fp16(float)':
/usr/lib/gcc/x86_64-linux-gnu/5/include/f16cintrin.h:67:1: error: inlining failed in call to always_inline '__m128i _mm_cvtps_ph(__m128, int)': target specific option mismatch
_mm_cvtps_ph (__m128 __A, const int __I)
^
IndexScalarQuantizer.cpp:129:58: error: called from here
xf, _MM_FROUND_TO_NEAREST_INT |_MM_FROUND_NO_EXC);
^
In file included from /usr/lib/gcc/x86_64-linux-gnu/5/include/immintrin.h:81:0,
from /usr/lib/gcc/x86_64-linux-gnu/5/include/x86intrin.h:46,
from /usr/include/x86_64-linux-gnu/c++/5/bits/opt_random.h:33,
from /usr/include/c++/5/random:50,
from /usr/include/c++/5/bits/stl_algo.h:66,
from /usr/include/c++/5/algorithm:62,
from IndexScalarQuantizer.cpp:14:
/usr/lib/gcc/x86_64-linux-gnu/5/include/f16cintrin.h:67:1: error: inlining failed in call to always_inline '__m128i _mm_cvtps_ph(__m128, int)': target specific option mismatch
_mm_cvtps_ph (__m128 __A, const int __I)
^
IndexScalarQuantizer.cpp:129:58: error: called from here
xf, _MM_FROUND_TO_NEAREST_INT |_MM_FROUND_NO_EXC);
^
Makefile:27: recipe for target 'IndexScalarQuantizer.o' failed
make: *** [IndexScalarQuantizer.o] Error 1

Platform

Ubuntu 16.04.4 LTS

My makefile.inc looks like this:

CXX = g++ -std=c++11
CXXFLAGS = -fPIC -m64 -Wall -g -O3 -fopenmp -Wno-sign-compare
CPUFLAGS = -mavx -msse4 -mpopcnt
LDFLAGS = -fPIC -fopenmp

# common linux flags
SHAREDEXT = so
SHAREDFLAGS = -shared
MKDIR_P = mkdir -p

prefix ?= /root/anaconda2/
exec_prefix ?= ${prefix}
libdir = ${exec_prefix}/lib
includedir = ${prefix}/include

# 2. Openblas
#
# The library contains both BLAS and Lapack. About 30% slower than MKL. Please see
# https://github.com/facebookresearch/faiss/wiki/Troubleshooting#slow-brute-force-search-with-openblas
# to fix performance problemes with OpenBLAS

# for Ubuntu 16:
# sudo apt-get install libopenblas-dev python-numpy python-dev
BLASFLAGS=/usr/lib/libopenblas.so.0

# for Ubuntu 14:
# sudo apt-get install libopenblas-dev liblapack3 python-numpy python-dev

CPPFLAGS += -DFINTEGER=int
LIBS += -lopenblas -llapack -L/root/anaconda2/lib

# SWIG executable. This should be at least version 3.x
SWIG = swig

PYTHONCFLAGS = -I/root/anaconda2/include/python2.7/ -I/root/anaconda2/lib/python2.7/site-packages/numpy/core/include/
PYTHONLIB = -lpython

Running on:

  • [ √ ] CPU
    Ubuntu16-docker

Interface:

  • [ √ ] C++
  • [ √ ] Python
@danny1984
Author

It was solved by running "./configure LIBS=-lgomp".

@Purg

Purg commented Jul 27, 2018

I have also just hit the above build error on current master and on the v1.3.0 tag. Adding LIBS += -lgomp to the file made no difference for me. To build successfully, I had to comment out the #ifdef USE_AVX parts of IndexScalarQuantizer.cpp (obviously not an optimal fix).
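For context on the error itself: `_mm_cvtps_ph` is an F16C intrinsic, and GCC only allows it to be inlined when F16C is enabled for the target (for example with `-mf16c`, or an `-march` value that includes it). The flags in the failing command, `-mavx -msse4 -mpopcnt`, do not enable it, which is what "target specific option mismatch" reports. The sketch below shows one way such a call could be guarded. `encode_fp16_guarded` is a hypothetical name used for illustration, not a function in faiss.

```cpp
#include <cassert>
#include <cstdint>

#ifdef __F16C__
#include <immintrin.h>  // declares _mm_cvtps_ph when F16C is enabled
#endif

// Hypothetical helper (not faiss code): converts x to an IEEE half-precision
// bit pattern using the F16C intrinsic when the compiler has it enabled.
// Returns true if the hardware path was compiled in, false otherwise
// (in which case *out is left untouched).
bool encode_fp16_guarded(float x, std::uint16_t* out) {
    assert(out != nullptr);
#ifdef __F16C__
    __m128 xf = _mm_set1_ps(x);
    __m128i xi = _mm_cvtps_ph(xf, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
    *out = static_cast<std::uint16_t>(_mm_cvtsi128_si32(xi) & 0xffff);
    return true;
#else
    (void)x;  // no F16C at compile time: the caller needs a scalar fallback
    return false;
#endif
}
```

Built with `-mavx -msse4 -mpopcnt`, this compiles and returns false. Adding `-mf16c` selects the intrinsic path.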

Summary

g++ -std=c++11 -DFINTEGER=int -fPIC -m64 -Wall -g -O3 -fopenmp -Wno-sign-compare -mavx -msse4 -mpopcnt -c IndexScalarQuantizer.cpp -o IndexScalarQuantizer.o                       
In file included from /usr/lib/gcc/x86_64-redhat-linux/7/include/immintrin.h:87:0,
                 from IndexScalarQuantizer.cpp:18:
/usr/lib/gcc/x86_64-redhat-linux/7/include/f16cintrin.h: In function ‘uint16_t faiss::{anonymous}::encode_fp16(float)’:                                                            
/usr/lib/gcc/x86_64-redhat-linux/7/include/f16cintrin.h:67:1: error: inlining failed in call to always_inline ‘__m128i _mm_cvtps_ph(__m128, int)’: target specific option mismatch 
 _mm_cvtps_ph (__m128 __A, const int __I)
 ^~~~~~~~~~~~
IndexScalarQuantizer.cpp:129:58: note: called from here
          xf, _MM_FROUND_TO_NEAREST_INT |_MM_FROUND_NO_EXC);
                                                          ^
In file included from /usr/lib/gcc/x86_64-redhat-linux/7/include/immintrin.h:87:0,
                 from IndexScalarQuantizer.cpp:18:
/usr/lib/gcc/x86_64-redhat-linux/7/include/f16cintrin.h:67:1: error: inlining failed in call to always_inline ‘__m128i _mm_cvtps_ph(__m128, int)’: target specific option mismatch 
 _mm_cvtps_ph (__m128 __A, const int __I)
 ^~~~~~~~~~~~
IndexScalarQuantizer.cpp:129:58: note: called from here
          xf, _MM_FROUND_TO_NEAREST_INT |_MM_FROUND_NO_EXC);
                                                          ^
make: *** [Makefile:27: IndexScalarQuantizer.o] Error 1

Platform

$ cat /etc/redhat-release
Fedora release 27 (Twenty Seven)

$ uname -a
Linux nexus 4.17.3-100.fc27.x86_64 #1 SMP Tue Jun 26 14:19:03 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

$ gcc --version
gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-6)
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ cat /proc/cpuinfo
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 62
model name      : Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz
stepping        : 4
microcode       : 0x42c
cpu MHz         : 1312.199
cache size      : 12288 KB
physical id     : 0
siblings        : 12
core id         : 0
cpu cores       : 6
apicid          : 0
initial apicid  : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 13
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm cpuid_fault epb pti ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm arat pln pts
bugs            : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass
bogomips        : 6983.70
clflush size    : 64
cache_alignment : 64
address sizes   : 46 bits physical, 48 bits virtual
power management:

...

Makefile

# -*- makefile -*-
# Copyright (c) 2015-present, Facebook, Inc.
# All rights reserved.
#
# This source code is licensed under the BSD+Patents license found in the
# LICENSE file in the root directory of this source tree.

# tested on CentOS 7, Ubuntu 16 and Ubuntu 14, see below to adjust flags to distribution.


CXX      = g++ -std=c++11
CXXFLAGS = -fPIC -m64 -Wall -g -O3 -fopenmp -Wno-sign-compare
CPUFLAGS = -mavx -msse4 -mpopcnt
LDFLAGS  = -fPIC -fopenmp

# common linux flags
SHAREDEXT   = so
SHAREDFLAGS = -shared
MKDIR_P = mkdir -p

#prefix      ?= /usr/local
prefix      ?= /home/purg/miniconda/envs/smqtk_py36
exec_prefix ?= ${prefix}
libdir       = ${exec_prefix}/lib
includedir   = ${prefix}/include

##########################################################################
# Uncomment one of the 4 BLAS/Lapack implementation options
# below. They are sorted # from fastest to slowest (in our
# experiments).
##########################################################################

#
# 1. Intel MKL
#
# This is the fastest BLAS implementation we tested. Unfortunately it
# is not open-source and determining the correct linking flags is a
# nightmare. See
#
#   https://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
#
# The latest tested version is MKL 2017.0.098 (2017 Initial Release) and can
# be downloaded here:
#
#   https://registrationcenter.intel.com/en/forms/?productid=2558&licensetype=2
#
# The following settings are working if MKL is installed on its default folder:
#
# MKLROOT   = /opt/intel/compilers_and_libraries/linux/mkl/
#
# LDFLAGS  += -Wl,--no-as-needed -L$(MKLROOT)/lib/intel64
# LIBS     += -lmkl_intel_ilp64 -lmkl_core -lmkl_gnu_thread -ldl -lpthread
#
# CPPFLAGS += -DFINTEGER=long
#
# You may have to set the LD_LIBRARY_PATH=$MKLROOT/lib/intel64 at runtime.
#
# If at runtime you get the error:
#   Intel MKL FATAL ERROR: Cannot load libmkl_avx2.so or libmkl_def.so
# you may set
#   LD_PRELOAD=$MKLROOT/lib/intel64/libmkl_core.so:$MKLROOT/lib/intel64/libmkl_sequential.so
# at runtime as well.

#
# 2. Openblas
#
# The library contains both BLAS and Lapack. About 30% slower than MKL. Please see
#   https://github.com/facebookresearch/faiss/wiki/Troubleshooting#slow-brute-force-search-with-openblas
# to fix performance problemes with OpenBLAS

# for Ubuntu 16:
# sudo apt-get install libopenblas-dev python-numpy python-dev

# for Ubuntu 14:
# sudo apt-get install libopenblas-dev liblapack3 python-numpy python-dev

CPPFLAGS += -DFINTEGER=int
LIBS     += -lopenblas -llapack
BLASLDFLAGS=/usr/lib64/libopenblas.so

# 3. Atlas
#
# Automatically tuned linear algebra package. As the name indicates,
# it is tuned automatically for a give architecture, and in Linux
# distributions, it the architecture is typically indicated by the
# directory name, eg. atlas-sse3 = optimized for SSE3 architecture.
#
# BLASCFLAGS=-DFINTEGER=int
# BLASLDFLAGS=/usr/lib64/atlas-sse3/libptf77blas.so.3 /usr/lib64/atlas-sse3/liblapack.so

#
# 4. reference implementation
#
# This is just a compiled version of the reference BLAS
# implementation, that is not optimized at all.
#
# CPPFLAGS += -DFINTEGER=int
# LIBS += /usr/lib64/libblas.so.3 /usr/lib64/liblapack.so.3.2
#


##########################################################################
# SWIG and Python flags
##########################################################################

# SWIG executable. This should be at least version 3.x
SWIG = swig

# The Python include directories for a given python executable can
# typically be found with
#
# python -c "import distutils.sysconfig; print distutils.sysconfig.get_python_inc()"
# python -c "import numpy ; print numpy.get_include()"
#
# or, for Python 3, with
#
# python3 -c "import distutils.sysconfig; print(distutils.sysconfig.get_python_inc())"
# python3 -c "import numpy ; print(numpy.get_include())"
#

# Defaults
#PYTHONCFLAGS = -I/usr/include/python2.7/ -I/usr/lib64/python2.7/site-packages/numpy/core/include/
#PYTHONLIB    = -lpython

# Miniconda 3.6.6 SMQTK env
PYTHONCFLAGS = -I/home/purg/miniconda/envs/smqtk_py36/include/python3.6/ \
						   -I/home/purg/miniconda/envs/smqtk_py36/lib/python3.6/site-packages/numpy/core/include/
PYTHONLIB    = -L/home/purg/miniconda/envs/smqtk_py36/lib -lpython


###########################################################################
# Cuda GPU flags
###########################################################################



# root of the cuda 8 installation
#CUDAROOT     = /usr/local/cuda-8.0
CUDAROOT     = /usr/local/cuda-9.2
NVCC         = $(CUDAROOT)/bin/nvcc
NVCCLDFLAGS  = -L$(CUDAROOT)/lib64
NVCCLIBS     = -lcudart -lcublas -lcuda
CUDACFLAGS   = -I$(CUDAROOT)/include
NVCCFLAGS    = -I $(CUDAROOT)/targets/x86_64-linux/include/ \
-Xcompiler -fPIC \
-Xcudafe --diag_suppress=unrecognized_attribute \
-gencode arch=compute_35,code="compute_35" \
-gencode arch=compute_50,code="compute_50" \
-gencode arch=compute_52,code="compute_52" \
-gencode arch=compute_50,code="compute_53" \
-gencode arch=compute_60,code="compute_60" \
-lineinfo \
-ccbin $(CXX) -DFAISS_USE_FLOAT16

@hijoe320

I encountered the same problem, and "./configure LIBS=-lgomp" does not fix it.

@keything

keything commented Aug 13, 2018

./configure LIBS=-lgomp doesn't work.

I fixed it with Purg's method, so that USE_AVX is never defined.

In IndexScalarQuantizer.cpp, lines 41-43:
#ifdef __AVX__
#define COMMENT_USE_AVX
#endif
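A narrower variant of this workaround is to define `USE_AVX` only when F16C is enabled too. This is a sketch under two assumptions, neither verified here against the faiss source: that the guard at those lines tests the compiler-defined `__AVX__` macro, and that the file's `USE_AVX` sections are the only ones calling F16C intrinsics. `scalar_quantizer_path` is a hypothetical helper added purely to make the selected path visible.

```cpp
#include <string>

// Assumed replacement for the guard: enable the AVX code path only when the
// compiler also has F16C enabled, since that path calls _mm_cvtps_ph.
#if defined(__AVX__) && defined(__F16C__)
#define USE_AVX
#endif

// Hypothetical helper (not faiss code): reports which path the guard selected.
std::string scalar_quantizer_path() {
#ifdef USE_AVX
    return "avx+f16c";
#else
    return "scalar";
#endif
}
```

With `-mavx -msse4 -mpopcnt` this falls back to the scalar path instead of failing to compile. With `-mavx -mf16c` it keeps the AVX path.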

@ahappycutedog

> ./configure LIBS=-lgomp doesn't work.
>
> I fixed it with Purg's method, so that USE_AVX is never defined.
>
> In IndexScalarQuantizer.cpp, lines 41-43:
> #ifdef __AVX__
> #define COMMENT_USE_AVX
> #endif

Which file should be modified? makefile.inc? I can't find a solution.

@esdotzed
Contributor

@ahappycutedog Not sure if you are still working on this, but I fixed it using Purg's method. The branch is here:
https://github.com/esdotzed/faiss/tree/temp-fix

@bduclaux

You might want to use -march=native in your makefile.inc:
CPUFLAGS = -march=native -mpopcnt
instead of:
CPUFLAGS = -mavx -msse4 -mpopcnt

This worked for me.
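One way to check what a given CPUFLAGS setting actually enables is to look at the macros the compiler predefines. The sketch below reports the ones relevant to this thread (the function name `enabled_cpu_macros` is made up for this example). On a CPU whose /proc/cpuinfo flags include f16c, `-march=native` would be expected to define `__F16C__`, whereas `-mavx -msse4 -mpopcnt` does not.

```cpp
#include <string>

// Hypothetical helper: lists which of the instruction-set macros discussed
// in this thread were predefined by the compiler for the current flags.
std::string enabled_cpu_macros() {
    std::string out;
#ifdef __SSE4_2__
    out += "sse4.2 ";
#endif
#ifdef __POPCNT__
    out += "popcnt ";
#endif
#ifdef __AVX__
    out += "avx ";
#endif
#ifdef __F16C__
    out += "f16c ";
#endif
    return out;  // empty when none of the four macros is defined
}
```

If "f16c" is missing from the result, any code that calls `_mm_cvtps_ph` will hit the same "target specific option mismatch" error under those flags.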

@lucasjinreal

This error is caused by example/makefile.inc.Linux. Don't copy it; just run:

./configure --without-cuda
make

Everything works fine.
