
GPU Search billion vectors failed #379

Closed
0DF0Arc opened this issue Mar 26, 2018 · 5 comments
0DF0Arc commented Mar 26, 2018

Summary

Platform

OS: Ubuntu 14.04

Faiss version:
Running on :

  • GPU: P40 × 2

Reproduction instructions

Using an IVF10000,PQ24 index: when I add 1 billion randomly generated vectors to the index and then search, I get the following error:

WARN: increase temp memory to avoid cudaMalloc, or decrease query/add size (alloc 18446744065779875840 B, highwater 0 B)
Faiss assertion 'err == cudaSuccess' failed in char* faiss::gpu::StackDeviceMemory::Stack::getAlloc(size_t, cudaStream_t) at /root/workspace/backup/back/SimilaritySearch/src/gpu/utils/StackDeviceMemory.cpp:77; details: cudaMalloc error 2 on alloc size 18446744065770205184

With 400 million vectors, the same index and code work fine.

code:
faiss::Index* cpu_index = faiss::read_index("/home/zxin10/index08b/index_800m.index", false);
faiss::Index* gpu_index = faiss::gpu::index_cpu_to_gpu_multiple(
    (std::vector<faiss::gpu::GpuResources*>&)gpu_memory.res_mul, devices, cpu_index,
    gpu_memory.options_mul);
faiss::Index::idx_t* indices = new faiss::Index::idx_t[nq * k];
float* distances = new float[nq * k];
gpu_index->search(nq, query_vecs.data(), k, distances, indices);

@wickedfoo
Contributor

Your code snippet is not complete, as it won't compile. Are you enabling sharding?

400 million * 24 bytes per vector will fit on a single GPU, whereas 1 billion will not. Can 500 million fit on a single GPU? If so, then sharding should allow it to fit on 2 GPUs.

@wickedfoo
Contributor

Since your GPU appears to have 24 GB of memory, note that 18% of it is eaten up by default for temp scratch space in StandardGpuResources; you can reduce this to 1.5 GB or so. Also note that there are overheads.

Try aiming for 700 million or so on a single GPU and see if that works.

@0DF0Arc
Author

0DF0Arc commented Mar 27, 2018

@wickedfoo Hi, actually I tried sharding 800 million vectors across the 2 GPUs with shard mode = 1; GPU memory consumption was around 18 GB on each GPU after index_cpu_to_gpu_multiple, and I still hit the same issue. Could this have something to do with the index?

@mdouze mdouze added the GPU label Mar 27, 2018
@ZhuoranLyu

@0DF0Arc Same issue. I tried to use one P4 (8 GB) to index 30 million 128-D vectors with PQ8. However, it fails while adding the vectors to the index. Here is the backtrace:

#0 memmove_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:1848
#1 0x00007fffdff48b4f in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#2 0x00007fffe00f1faf in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#3 0x00007fffdfff563e in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#4 0x00007fffdfff63bc in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#5 0x00007fffdff184c8 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#6 0x00007fffdff199e0 in ?? () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#7 0x00007fffe0059612 in cuMemcpyHtoDAsync_v2 () from /usr/lib/x86_64-linux-gnu/libcuda.so.1
#8 0x00007ffff247c8cc in ?? () from /usr/local/cuda-8.0/lib64/libcudart.so.8.0
#9 0x00007ffff2458b5b in ?? () from /usr/local/cuda-8.0/lib64/libcudart.so.8.0
#10 0x00007ffff2492b08 in cudaMemcpyAsync () from /usr/local/cuda-8.0/lib64/libcudart.so.8.0
#11 0x0000000000433f73 in faiss::gpu::Tensor<float, 2, true, int, faiss::gpu::traits::DefaultPtrTraits>::copyFrom(faiss::gpu::Tensor<float, 2, true, int, faiss::gpu::traits::DefaultPtrTraits>&, CUstream_st*) ()
#12 0x0000000000432357 in faiss::gpu::DeviceTensor<float, 2, true, int, faiss::gpu::traits::DefaultPtrTraits> faiss::gpu::toDevice<float, 2>(faiss::gpu::GpuResources*, int, float*, CUstream_st*, std::initializer_list) ()
#13 0x000000000043c833 in faiss::gpu::GpuIndexIVFPQ::addImpl_(long, float const*, long const*) ()
#14 0x00000000004393aa in faiss::gpu::GpuIndex::addInternal_(long, float const*, long const*) ()
#15 0x000000000043910c in faiss::gpu::GpuIndex::add_with_ids(long, float const*, long const*) ()
#16 0x000000000040ce65 in CwAnnTopkImpl::add_with_batch_gpu (this=0x7fffffffc240,
vec_feats=0x7ff42a0e9010, feat_num=31061938, ids=0x7fff3c56c010) at CwAnnTopkImpl.cpp:219
#17 0x000000000040e02d in CwAnnTopkImpl::add_with_ids_cwfeat_gpu (this=0x7fffffffc240,
vec_feats=0x7ff42a0e9010, feat_num=31061938, feat_dim=128, ids=0x7fff3c56c010)
at CwAnnTopkImpl.cpp:559
#18 0x000000000040a060 in main (argc=1, argv=0x7fffffffe2d8) at test/test_cwimpl_testhitrate.cpp:143

Appreciate any help.

@mdouze
Contributor

mdouze commented May 11, 2018

No activity, closing.

@mdouze mdouze closed this as completed May 11, 2018