running demo_ivfpq_indexing_gpu: Segmentation fault #67

Closed
caydyn-skd opened this Issue Apr 5, 2017 · 11 comments


caydyn-skd commented Apr 5, 2017

1. The output is as follows:

[0.382 s] Generating 100000 vectors in 128D for training
[0.540 s] Training the index
Training IVF quantizer on 100000 vectors in 128D
Clustering 100000 points in 128D to 1788 clusters, redo 1 times, 10 iterations
  Preprocessing in 0.032731 s
  Iteration 0 (0.12 s, search 0.09 s): objective=1.43954e+06 imbalance=2.907 nsplit=0
  Iteration 9 (3.68 s, search 3.61 s): objective=930934 imbalance=1.255 nsplit=0
computing residuals
training 4 x 256 product quantizer on 16384 vectors in 128D
Training PQ slice 0/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00141504 s
  Iteration 24 (2.73 s, search 2.45 s): objective=27271.5 imbalance=1.018 nsplit=0       
Training PQ slice 1/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.000438965 s
  Iteration 24 (2.37 s, search 2.06 s): objective=27193.4 imbalance=1.016 nsplit=0       
Training PQ slice 2/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.000931885 s
  Iteration 24 (2.59 s, search 2.25 s): objective=27230.8 imbalance=1.021 nsplit=0       
Training PQ slice 3/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.000437988 s
  Iteration 24 (1.96 s, search 1.78 s): objective=27174 imbalance=1.023 nsplit=0         
[14.164 s] storing the pre-trained index to /tmp/index_trained.faissindex
[14.186 s] Building a dataset of 200000 vectors to index
[14.506 s] Adding the vectors to the index
Segmentation fault (core dumped)

2. Library dependencies (ldd output):

        linux-vdso.so.1 =>  (0x00007ffc876da000)
	libopenblas.so.0 => /usr/lib/libopenblas.so.0 (0x00007f1f42cfb000)
	libcublas.so.8.0 => /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcublas.so.8.0 (0x00007f1f40263000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f1f4005a000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f1f3fe3d000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f1f3fc39000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f1f3f8b6000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f1f3f5ad000)
	libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f1f3f38b000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f1f3f174000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f1f3edab000)
	/lib64/ld-linux-x86-64.so.2 (0x0000564584980000)
	libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007f1f3ea80000)
	libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007f1f3e840000)

3. Stack backtrace (gdb):

  #0  0x00007fffe9828c9a in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
  #1  0x00007fffe974f696 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
  #2  0x00007fffe9884992 in cuEventDestroy_v2 ()
   from /usr/lib/x86_64-linux-gnu/libcuda.so.1
  #3  0x00000000004f62f4 in cudart::cudaApiEventDestroy(CUevent_st*) ()
  #4  0x0000000000524b94 in cudaEventDestroy ()
  #5  0x0000000000440af8 in faiss::gpu::streamWaitBase<std::vector<CUstream_st*, std::allocator<CUstream_st*> >, std::initializer_list<CUstream_st*> > (
      listWaiting=std::vector of length 2, capacity 2 = {...}, listWaitOn=...)
      at impl/../utils/DeviceUtils.h:131
  #6  0x0000000000479294 in faiss::gpu::streamWait<std::vector<CUstream_st*,   std::allocator<CUstream_st*> > > (b=..., a=std::vector of length 2, capacity 2 = {...})
      at impl/../utils/DeviceUtils.h:140
  #7  faiss::gpu::runL2Distance<float> (resources=0x7fffffffe780, centroids=..., 
    centroidNorms=centroidNorms@entry=0xb9e1a30, queries=..., k=k@entry=1, 
    outDistances=..., outIndices=..., ignoreOutDistances=true, tileSize=256)
    at impl/Distance.cu:110
  #8  0x000000000047032e in faiss::gpu::runL2Distance (resources=<optimized out>, 
    vectors=..., vectorNorms=vectorNorms@entry=0xb9e1a30, queries=..., k=k@entry=1, 
    outDistances=..., outIndices=..., ignoreOutDistances=<optimized out>, tileSize=-1)
    at impl/Distance.cu:307
  #9  0x000000000042e574 in faiss::gpu::FlatIndex::query (this=0xb9e1970, vecs=..., 
    k=k@entry=1, outDistances=..., outIndices=..., 
    exactDistance=exactDistance@entry=false, tileSize=-1) at impl/FlatIndex.cu:121
  #10 0x00000000004432c3 in faiss::gpu::IVFPQ::classifyAndAddVectors (this=0xb9e2d80, 
    vecs=..., indices=...) at impl/IVFPQ.cu:138
  #11 0x0000000000424fd0 in faiss::gpu::GpuIndexIVFPQ::add_with_ids (this=0x7fffffffe8d0, 
    n=200000, x=0x7fffcf580010, xids=0xc0ab390) at GpuIndexIVFPQ.cu:355
  #12 0x000000000042092b in faiss::gpu::GpuIndexIVF::add (this=0x7fffffffe8d0, n=200000, 
    x=0x7fffcf580010) at GpuIndexIVF.cu:254
  #13 0x000000000040e8bf in main () at test/demo_ivfpq_indexing_gpu.cpp:114

4. Hardware information (lspci):

        01:00.0 3D controller: NVIDIA Corporation GM206M [GeForce GTX 965M] (rev a1)
	DeviceName: NVIDIA N16E-GR
	Subsystem: Hewlett-Packard Company GM206M [GeForce GTX 965M]
	Flags: bus master, fast devsel, latency 0, IRQ 134
	Memory at a3000000 (32-bit, non-prefetchable) [size=16M]
	Memory at 90000000 (64-bit, prefetchable) [size=256M]
	Memory at a0000000 (64-bit, prefetchable) [size=32M]
	I/O ports at 4000 [size=128]
	[virtual] Expansion ROM at a4000000 [disabled] [size=512K]
	Capabilities: <access denied>
	Kernel driver in use: nvidia
	Kernel modules: nvidiafb, nouveau, nvidia_375_drm, nvidia_375

mdouze added the duplicate label on Apr 5, 2017


Contributor

mdouze commented Apr 5, 2017

Hi @caydyn-skd

Thanks for the extensive bug report. I believe this is linked to the low-mem issue #66 mentioned here

#66 (comment)

Please stay tuned until we have a fix.


Contributor

mdouze commented Apr 6, 2017

Could you try with the current version? It has better low-mem GPU support.


eduj36 commented Apr 6, 2017

I also get an error running demo_ivfpq_indexing_gpu, shown below. I'm on the most recent version, running on a TITAN X (Maxwell) with 12 GB of memory.

$ ./test/demo_ivfpq_indexing_gpu
[0.561 s] Generating 100000 vectors in 128D for training
[0.699 s] Training the index
Training IVF quantizer on 100000 vectors in 128D
Clustering 100000 points in 128D to 1788 clusters, redo 1 times, 10 iterations
  Preprocessing in 0.01 s
  Iteration 9 (0.34 s, search 0.26 s): objective=930934 imbalance=1.255 nsplit=0
computing residuals
training 4 x 256 product quantizer on 16384 vectors in 128D
Training PQ slice 0/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (1.89 s, search 1.52 s): objective=27271.5 imbalance=1.018 nsplit=0
Training PQ slice 1/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (1.99 s, search 1.62 s): objective=27193.4 imbalance=1.016 nsplit=0
Training PQ slice 2/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (1.97 s, search 1.60 s): objective=27230.8 imbalance=1.021 nsplit=0
Training PQ slice 3/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (1.49 s, search 1.20 s): objective=27174 imbalance=1.023 nsplit=0
[8.526 s] storing the pre-trained index to /tmp/index_trained.faissindex
[8.573 s] Building a dataset of 200000 vectors to index
[8.841 s] Adding the vectors to the index
Faiss assertion err == CUBLAS_STATUS_SUCCESS failed in void faiss::gpu::runMatrixMult(faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, faiss::gpu::Tensor<T, 2, true>&, bool, float, float, cublasHandle_t, cudaStream_t) [with T = float; cublasHandle_t = cublasContext*; cudaStream_t = CUstream_st*] at utils/MatrixMult.cu:141
Aborted (core dumped)
caydyn-skd commented Apr 7, 2017

Hi @mdouze, I tried running the latest version, but I still have the same problem:

caydyn@dev:/home/caydyn/faiss$ git log -1
commit 7abe81b4f6abad56731ec1c27968173c8ce0d322
Author: matthijs <matthijs@fb.com>
Date:   Thu Apr 6 04:33:41 2017 -0700

    Better support for low-mem GPUs
    avoid reading beyond the end of an array in fvec_L2sqr and related functions

#0  0x00007fffe9828c9a in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#1  0x00007fffe974f696 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2  0x00007fffe9884992 in cuEventDestroy_v2 ()
   from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3  0x00000000004f9124 in cudart::cudaApiEventDestroy(CUevent_st*) ()
#4  0x00000000005279c4 in cudaEventDestroy ()
#5  0x0000000000442458 in faiss::gpu::streamWaitBase<std::vector<CUstream_st*, std::allocator<CUstream_st*> >, std::initializer_list<CUstream_st*> > (
    listWaiting=std::vector of length 2, capacity 2 = {...}, listWaitOn=...)
    at impl/../utils/DeviceUtils.h:131
#6  0x000000000047a906 in faiss::gpu::streamWait<std::vector<CUstream_st*, std::allocator<CUstream_st*> > > (b=..., a=std::vector of length 2, capacity 2 = {...})
    at impl/../utils/DeviceUtils.h:140
#7  faiss::gpu::runL2Distance<float> (resources=0x7fffffffe770, centroids=..., 
    centroidsTransposed=0x0, centroidNorms=centroidNorms@entry=0xb9e8570, queries=..., 
    k=k@entry=1, outDistances=..., outIndices=..., ignoreOutDistances=true, 
    tileSizeOverride=-1) at impl/Distance.cu:145
#8  0x0000000000471aee in faiss::gpu::runL2Distance (resources=<optimized out>, 
    vectors=..., vectorsTransposed=<optimized out>, 
    vectorNorms=vectorNorms@entry=0xb9e8570, queries=..., k=k@entry=1, outDistances=..., 
    outIndices=..., ignoreOutDistances=<optimized out>, tileSizeOverride=-1)
    at impl/Distance.cu:349
#9  0x000000000042f402 in faiss::gpu::FlatIndex::query (this=0xb9e8420, input=..., 
    k=k@entry=1, outDistances=..., outIndices=..., 
    exactDistance=exactDistance@entry=false, tileSize=-1) at impl/FlatIndex.cu:124
#10 0x0000000000444c23 in faiss::gpu::IVFPQ::classifyAndAddVectors (this=0xc37f3c0, 
    vecs=..., indices=...) at impl/IVFPQ.cu:138
#11 0x0000000000425abb in faiss::gpu::GpuIndexIVFPQ::addImpl_ (this=0x7fffffffe8c0, 
    n=200000, x=<optimized out>, xids=<optimized out>) at GpuIndexIVFPQ.cu:352
#12 0x0000000000417424 in faiss::gpu::GpuIndex::addInternal_ (this=0x7fffffffe8c0, 
    n=200000, x=0x7fffcfd81010, ids=0xbae4260) at GpuIndex.cu:74
#13 0x000000000042134b in faiss::gpu::GpuIndexIVF::add (this=0x7fffffffe8c0, n=200000, 
    x=0x7fffcfd81010) at GpuIndexIVF.cu:259
#14 0x000000000040ea8f in main () at test/demo_ivfpq_indexing_gpu.cpp:114

(gdb) f 14
#14 0x000000000040ea8f in main () at test/demo_ivfpq_indexing_gpu.cpp:114
114	        index.add (nb, database.data());
(gdb) l
109	        }
110	
111	        printf ("[%.3f s] Adding the vectors to the index\n",
112	                elapsed() - t0);
113	
114	        index.add (nb, database.data());
115	
116	        printf ("[%.3f s] done\n", elapsed() - t0);
117	
118	        // remember a few elements from the database as queries


[0.379 s] Generating 100000 vectors in 128D for training
[0.526 s] Training the index
Training IVF quantizer on 100000 vectors in 128D
Clustering 100000 points in 128D to 1788 clusters, redo 1 times, 10 iterations
  Preprocessing in 0.03 s
  Iteration 0 (0.15 s, search 0.12 s): objective=1.43954e+06 imbalance=2.907 nsplit=0
  Iteration 9 (3.68 s, search 3.59 s): objective=930934 imbalance=1.255 nsplit=0
computing residuals
training 4 x 256 product quantizer on 16384 vectors in 128D
Training PQ slice 0/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (3.20 s, search 2.67 s): objective=27271.5 imbalance=1.018 nsplit=0       
Training PQ slice 1/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (2.95 s, search 2.49 s): objective=27193.4 imbalance=1.016 nsplit=0       
Training PQ slice 2/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (3.24 s, search 2.75 s): objective=27230.8 imbalance=1.021 nsplit=0       
Training PQ slice 3/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (2.43 s, search 2.15 s): objective=27174 imbalance=1.023 nsplit=0         
[16.307 s] storing the pre-trained index to /tmp/index_trained.faissindex
[16.353 s] Building a dataset of 200000 vectors to index
[16.649 s] Adding the vectors to the index
Segmentation fault (core dumped)
linux-vdso.so.1 =>  (0x00007fff637dd000)
libopenblas.so.0 => /usr/lib/libopenblas.so.0 (0x00007f0182d83000)
libcublas.so.8.0 => /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcublas.so.8.0 (0x00007f01802eb000)
librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007f01800e2000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f017fec5000)
libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007f017fcc1000)
libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f017f93e000)
libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f017f635000)
libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007f017f413000)
libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f017f1fc000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f017ee33000)
/lib64/ld-linux-x86-64.so.2 (0x00005628e2e86000)
libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007f017eb08000)
libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007f017e8c8000)


Contributor

wickedfoo commented Apr 19, 2017

@eduj36 Can you run nvidia-smi and copy the output here? What version is your driver? Does it match your CUDA SDK version (8.0)?

@caydyn-skd can you run nvidia-smi and copy the output here as well?

wag commented Apr 24, 2017

I'm experiencing the same issue with a GTX 970 (4 GB) on the latest version (commit 2816831):

dev:~/build/faiss/gpu/test$ gdb ./demo_ivfpq_indexing_gpu
[...]
[0.365 s] Generating 100000 vectors in 128D for training
[0.483 s] Training the index
Training IVF quantizer on 100000 vectors in 128D
Clustering 100000 points in 128D to 1788 clusters, redo 1 times, 10 iterations
  Preprocessing in 0.03 s
  Iteration 9 (0.43 s, search 0.37 s): objective=930934 imbalance=1.255 nsplit=0            
computing residuals
training 4 x 256 product quantizer on 16384 vectors in 128D
Training PQ slice 0/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (4.50 s, search 3.79 s): objective=27271.5 imbalance=1.018 nsplit=0       
Training PQ slice 1/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (4.59 s, search 3.97 s): objective=27193.4 imbalance=1.016 nsplit=0       
Training PQ slice 2/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (4.72 s, search 4.07 s): objective=27230.8 imbalance=1.021 nsplit=0       
Training PQ slice 3/4
Clustering 16384 points in 32D to 256 clusters, redo 1 times, 25 iterations
  Preprocessing in 0.00 s
  Iteration 24 (4.60 s, search 3.93 s): objective=27174 imbalance=1.023 nsplit=0         
[19.516 s] storing the pre-trained index to /tmp/index_trained.faissindex
[19.550 s] Building a dataset of 200000 vectors to index
[19.785 s] Adding the vectors to the index

Thread 1 "demo_ivfpq_inde" received signal SIGSEGV, Segmentation fault.
0x00007fffe1825caa in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
(gdb) bt
#0  0x00007fffe1825caa in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#1  0x00007fffe174c696 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2  0x00007fffe1881962 in cuEventDestroy_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3  0x00000000004f9124 in cudart::cudaApiEventDestroy(CUevent_st*) ()
#4  0x00000000005279c4 in cudaEventDestroy ()
#5  0x0000000000442458 in faiss::gpu::streamWaitBase<std::vector<CUstream_st*, std::allocator<CUstream_st*> >, std::initializer_list<CUstream_st*> > (
    listWaiting=std::vector of length 2, capacity 2 = {...}, listWaitOn=...) at impl/../utils/DeviceUtils.h:131
#6  0x000000000047a906 in faiss::gpu::streamWait<std::vector<CUstream_st*, std::allocator<CUstream_st*> > > (b=..., a=std::vector of length 2, capacity 2 = {...})
    at impl/../utils/DeviceUtils.h:140
#7  faiss::gpu::runL2Distance<float> (resources=0x7fffffffd970, centroids=..., centroidsTransposed=0x0, centroidNorms=centroidNorms@entry=0xb523740, queries=..., k=k@entry=1, 
    outDistances=..., outIndices=..., ignoreOutDistances=true, tileSizeOverride=-1) at impl/Distance.cu:145
#8  0x0000000000471aee in faiss::gpu::runL2Distance (resources=<optimized out>, vectors=..., vectorsTransposed=<optimized out>, vectorNorms=vectorNorms@entry=0xb523740, queries=..., 
    k=k@entry=1, outDistances=..., outIndices=..., ignoreOutDistances=<optimized out>, tileSizeOverride=-1) at impl/Distance.cu:349
#9  0x000000000042f402 in faiss::gpu::FlatIndex::query (this=0xb5235f0, input=..., k=k@entry=1, outDistances=..., outIndices=..., exactDistance=exactDistance@entry=false, tileSize=-1)
    at impl/FlatIndex.cu:124
#10 0x0000000000444c23 in faiss::gpu::IVFPQ::classifyAndAddVectors (this=0xbebae20, vecs=..., indices=...) at impl/IVFPQ.cu:138
#11 0x0000000000425abb in faiss::gpu::GpuIndexIVFPQ::addImpl_ (this=0x7fffffffdac0, n=200000, x=<optimized out>, xids=<optimized out>) at GpuIndexIVFPQ.cu:352
#12 0x0000000000417424 in faiss::gpu::GpuIndex::addInternal_ (this=0x7fffffffdac0, n=200000, x=0x7fffc364f010, ids=0xb61f440) at GpuIndex.cu:74
#13 0x000000000042134b in faiss::gpu::GpuIndexIVF::add (this=0x7fffffffdac0, n=200000, x=0x7fffc364f010) at GpuIndexIVF.cu:259
#14 0x000000000040ea8f in main ()
dev:~/build/faiss/gpu/test$ ldd demo_ivfpq_indexing_gpu
	linux-vdso.so.1 =>  (0x00007fff112e6000)
	libopenblas.so.0 => /usr/lib/libopenblas.so.0 (0x00007fb9fb90a000)
	libcublas.so.8.0 => /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcublas.so.8.0 (0x00007fb9f8e72000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fb9f8c69000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fb9f8a4c000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb9f8848000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fb9f84c5000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb9f81bc000)
	libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007fb9f7f9a000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb9f7d83000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb9f79ba000)
	/lib64/ld-linux-x86-64.so.2 (0x000055c5d2d2d000)
	libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007fb9f768f000)
	libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007fb9f744f000)
dev:~$ lspci -vvv -s 01:00.0
01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Elitegroup Computer Systems GM204 [GeForce GTX 970]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 319
	Region 0: Memory at de000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Region 3: Memory at d0000000 (64-bit, prefetchable) [size=32M]
	Region 5: I/O ports at e000 [size=128]
	[virtual] Expansion ROM at df000000 [disabled] [size=512K]
	Capabilities: <access denied>
	Kernel driver in use: nvidia
	Kernel modules: nvidiafb, nouveau, nvidia_375_drm, nvidia_375
dev:~$ nvidia-smi 
Mon Apr 24 09:41:45 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 970     Off  | 0000:01:00.0      On |                  N/A |
| 35%   30C    P8    19W / 151W |    872MiB /  4036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1986    G   /usr/lib/xorg/Xorg                             658MiB |
|    0      4119    G   ...s-passed-by-fd --v8-snapshot-passed-by-fd   212MiB |
+-----------------------------------------------------------------------------+

#14 0x000000000040ea8f in main ()
dev:~/build/faiss/gpu/test$ ldd demo_ivfpq_indexing_gpu
	linux-vdso.so.1 =>  (0x00007fff112e6000)
	libopenblas.so.0 => /usr/lib/libopenblas.so.0 (0x00007fb9fb90a000)
	libcublas.so.8.0 => /usr/local/cuda-8.0/targets/x86_64-linux/lib/libcublas.so.8.0 (0x00007fb9f8e72000)
	librt.so.1 => /lib/x86_64-linux-gnu/librt.so.1 (0x00007fb9f8c69000)
	libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007fb9f8a4c000)
	libdl.so.2 => /lib/x86_64-linux-gnu/libdl.so.2 (0x00007fb9f8848000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007fb9f84c5000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007fb9f81bc000)
	libgomp.so.1 => /usr/lib/x86_64-linux-gnu/libgomp.so.1 (0x00007fb9f7f9a000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007fb9f7d83000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007fb9f79ba000)
	/lib64/ld-linux-x86-64.so.2 (0x000055c5d2d2d000)
	libgfortran.so.3 => /usr/lib/x86_64-linux-gnu/libgfortran.so.3 (0x00007fb9f768f000)
	libquadmath.so.0 => /usr/lib/x86_64-linux-gnu/libquadmath.so.0 (0x00007fb9f744f000)
dev:~$ lspci -vvv -s 01:00.0
01:00.0 VGA compatible controller: NVIDIA Corporation GM204 [GeForce GTX 970] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: Elitegroup Computer Systems GM204 [GeForce GTX 970]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 319
	Region 0: Memory at de000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Region 3: Memory at d0000000 (64-bit, prefetchable) [size=32M]
	Region 5: I/O ports at e000 [size=128]
	[virtual] Expansion ROM at df000000 [disabled] [size=512K]
	Capabilities: <access denied>
	Kernel driver in use: nvidia
	Kernel modules: nvidiafb, nouveau, nvidia_375_drm, nvidia_375
dev:~$ nvidia-smi 
Mon Apr 24 09:41:45 2017       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 970     Off  | 0000:01:00.0      On |                  N/A |
| 35%   30C    P8    19W / 151W |    872MiB /  4036MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
|    0      1986    G   /usr/lib/xorg/Xorg                             658MiB |
|    0      4119    G   ...s-passed-by-fd --v8-snapshot-passed-by-fd   212MiB |
+-----------------------------------------------------------------------------+
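For reference, frames #4–#6 of the backtrace point at Faiss's event-based cross-stream synchronization in utils/DeviceUtils.h. A minimal sketch of that pattern follows (illustrative names, not Faiss's actual code); note that the crash above happens inside the final cudaEventDestroy call:

```cpp
// Sketch of event-based stream synchronization (cf. streamWaitBase in
// utils/DeviceUtils.h): streams in `waiting` do not proceed past this
// point until the work currently queued on `waitOn` has completed.
#include <cuda_runtime.h>
#include <vector>

void streamWaitSketch(const std::vector<cudaStream_t>& waiting,
                      cudaStream_t waitOn) {
  cudaEvent_t event;
  // cudaEventDisableTiming makes the event cheaper to record and wait on.
  cudaEventCreateWithFlags(&event, cudaEventDisableTiming);

  // Mark the point in `waitOn` that the dependent streams must reach.
  cudaEventRecord(event, waitOn);

  // Queue a wait on every dependent stream (asynchronous; returns at once).
  for (cudaStream_t s : waiting) {
    cudaStreamWaitEvent(s, event, 0);
  }

  // Destroying immediately is legal: CUDA defers the actual release until
  // the pending stream waits have consumed the event. This is the call
  // that segfaults in the backtrace (frame #4, cudaEventDestroy).
  cudaEventDestroy(event);
}
```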
wag commented Apr 26, 2017

Just tried the same on a system with 2 GTX 1070 (8GB each) without any problems.

Contributor

wickedfoo commented Apr 26, 2017

@wag Can you try it on the GTX 970 without any other processes (such as X) running on the GPU, e.g. straight from the console? There appear to be two processes using resources on the GPU. Internally we've only used server GPUs with nothing else running on them and no other resources consumed, so I'm curious whether something conflicts on lower-memory GPUs with 4 GB.

eduj36 commented Apr 26, 2017

@wickedfoo
Hi, sorry for the late response. Here is the output of nvidia-smi:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 375.39                 Driver Version: 375.39                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX TIT...  Off  | 0000:02:00.0     Off |                  N/A |
| 22%   52C    P8    16W / 250W |      2MiB / 12207MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX TIT...  Off  | 0000:03:00.0     Off |                  N/A |
| 22%   50C    P8    16W / 250W |      2MiB / 12207MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

And from nvcc --version

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2016 NVIDIA Corporation
Built on Tue_Jan_10_13:22:03_CST_2017
Cuda compilation tools, release 8.0, V8.0.61
wag commented Apr 27, 2017

@wickedfoo Same segfault without X or any other processes running on the GPU.

Contributor

mdouze commented Jun 23, 2017

Closing for now. Please re-open if the bug occurs with the current version of Faiss.

@mdouze mdouze closed this Jun 23, 2017
