
Fails to search topk vectors among 20000 same vectors #484

Closed

ZhuoranLyu opened this issue Jun 7, 2018 · 9 comments

ZhuoranLyu commented Jun 7, 2018

Summary

I was trying to search for the top-k nearest neighbors among 20000 identical vectors and hit a segmentation fault.

Platform

OS: Ubuntu 14.04

Running on:

  • GPU: Tesla P4

Reproduction instructions

Briefly, I was searching over 20000 identical 128-D vectors. I used GpuIndexIVFPQ (PQ8) on a Tesla P4 with 8 GB of memory. It crashes when I search for the top 100 nearest neighbors. Here are the WARN messages and the gdb backtrace.

WARN: increase temp memory to avoid cudaMalloc, or decrease query/add size (alloc 3473344000 B, highwater 0 B)
WARN: increase temp memory to avoid cudaMalloc, or decrease query/add size (alloc 3473344000 B, highwater 3473344000 B)
Faiss assertion 'err == cudaSuccess' failed in char* faiss::gpu::StackDeviceMemory::Stack::getAlloc(size_t, cudaStream_t) at utils/StackDeviceMemory.cpp:77; details: cudaMalloc error 2 on alloc size 3473344000

gdb backtrace:

Program received signal SIGABRT, Aborted.
[Switching to Thread 0x7ffdf3dff700 (LWP 8711)]
0x00007fffeea9fc37 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
56 ../nptl/sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) bt
#0 0x00007fffeea9fc37 in __GI_raise (sig=sig@entry=6)
at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007fffeeaa3028 in __GI_abort () at abort.c:89
#2 0x00000000004f258b in faiss::gpu::StackDeviceMemory::Stack::getAlloc (
this=0xd145a90, size=3473344000, stream=0x36059a0)
at utils/StackDeviceMemory.cpp:75
#3 0x00000000004f2fb4 in faiss::gpu::StackDeviceMemory::getMemory (this=0xd145a80,
stream=0x36059a0, size=3473344000) at utils/StackDeviceMemory.cpp:207
#4 0x000000000046bce0 in faiss::gpu::DeviceTensor<float, 1, true, int, faiss::gpu::traits::DefaultPtrTraits>::DeviceTensor (this=0x7ffdf3dfe0e0, m=..., sizes=...,
stream=0x36059a0, space=faiss::gpu::Device)
at impl/../utils/DeviceTensor-inl.cuh:132
#5 0x0000000000464f0d in faiss::gpu::runPQScanMultiPassPrecomputed (queries=...,
precompTerm1=..., precompTerm2=..., precompTerm3=..., topQueryToCentroid=...,
useFloat16Lookup=true, bytesPerCode=8, numSubQuantizers=8,
numSubQuantizerCodes=256, listCodes=..., listIndices=...,
indicesOptions=faiss::gpu::INDICES_64_BIT, listLengths=..., maxListLength=217084,
k=100, outDistances=..., outIndices=..., res=0x3605840)
at impl/PQScanMultiPassPrecomputed.cu:488
#6 0x00000000004521e1 in faiss::gpu::IVFPQ::runPQPrecomputedCodes (
this=0x7ffdec36e140, queries=..., coarseDistances=..., coarseIndices=..., k=100,
outDistances=..., outIndices=...) at impl/IVFPQ.cu:661
#7 0x0000000000451903 in faiss::gpu::IVFPQ::query (this=0x7ffdec36e140, queries=...,
nprobe=500, k=100, outDistances=..., outIndices=...) at impl/IVFPQ.cu:551
#8 0x000000000044b952 in faiss::gpu::GpuIndexIVFPQ::searchImpl (this=0xd0bfb10,
n=1000, x=0x7ffdec000c40, k=100, distances=0x7ffdec204670, labels=0x7ffdec141160)
at GpuIndexIVFPQ.cu:380
#9 0x0000000000447cef in faiss::gpu::GpuIndex::search (this=0xd0bfb10, n=1000,
x=0x7ffdec000c40, k=100, distances=0x7ffdec204670, labels=0x7ffdec141160)
at GpuIndex.cu:143
#10 0x0000000000410776 in CwAnnShardImpl::search_cw_feat_unit (this=0x1bc86a0,
vec_query_feats=0x7fffa13b5010, feat_num=1000, feat_dim=130,
vec_added_feats=0xd145ce0, topk=100, res_dists=0x7ffe599de010,
res_nns=0x7ffe03c9d010) at CwAnnMTImpl.cpp:1102
#11 0x00000000004103c0 in CwAnnShardImpl::search_cw_feats_with_batch_gpu (
this=0x1bc86a0, vec_query_feats=0x7fffa13b5010, feat_num=1000, feat_dim=130,
vec_added_feats=0xd145ce0, k=100, res_dists=0x7ffe599de010, res_nns=0x7ffe03c9d010)
at CwAnnMTImpl.cpp:1026
#12 0x000000000040ebce in CwAnnShardMTImpl::__lambda9::operator() (__closure=0xd1340e0)
at CwAnnMTImpl.cpp:725
#13 0x000000000041a0dd in std::_Function_handler<void(), CwAnnShardMTImpl::search_cw_batch_unit(float const*, int, int, int, float*, long int*)::__lambda9>::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/4.8/functional:2071
#14 0x00000000004b3810 in std::function<void ()>::operator()() const (
this=0x7ffdf3dfecd0) at /usr/include/c++/4.8/functional:2471
#15 0x00000000004b0f49 in faiss::gpu::WorkerThread::threadLoop (this=0xd0be020)
at utils/WorkerThread.cpp:100
#16 0x00000000004b0d70 in faiss::gpu::WorkerThread::threadMain (this=0xd0be020)
at utils/WorkerThread.cpp:69
#17 0x00000000004b0a97 in faiss::gpu::WorkerThread::__lambda5::operator() (
__closure=0x35d92a0) at utils/WorkerThread.cpp:31
#18 0x00000000004b2140 in std::_Bind_simple<faiss::gpu::WorkerThread::startThread()::__lambda5()>::_M_invoke<>(std::_Index_tuple<>) (this=0x35d92a0)
at /usr/include/c++/4.8/functional:1732
#19 0x00000000004b2097 in std::_Bind_simple<faiss::gpu::WorkerThread::startThread()::__lambda5()>::operator()(void) (this=0x35d92a0) at /usr/include/c++/4.8/functional:1720
#20 0x00000000004b2030 in std::thread::_Impl<std::_Bind_simple<faiss::gpu::WorkerThread::startThread()::__lambda5()> >::_M_run(void) (this=0x35d9288)
at /usr/include/c++/4.8/thread:115
#21 0x00007fffef60ea60 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#22 0x00007ffff2237184 in start_thread (arg=0x7ffdf3dff700) at pthread_create.c:312
#23 0x00007fffeeb6703d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

Appreciate any help.

Contributor

mdouze commented Jun 8, 2018

Hi,
This is a corner case, but we may want to look into it.
Could you post minimal code that reproduces it, preferably in C++?

@ZhuoranLyu
Author

@mdouze I may not be able to provide the entire code. However, the main procedure is almost the same as this demo, except that I fill the search database with the same constant value instead of drand48():

https://github.com/facebookresearch/faiss/blob/1fe2872013685092d697f08a2a48e110acd25b2b/gpu/test/demo_ivfpq_indexing_gpu.cpp

I use 100000 128-D vectors in the search database with PQ8, on one P4 GPU with 8 GB of memory.
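Something like this (a sketch, not my production code: the constant 0.5f, the include paths, and the exact parameters here are illustrative):

```cpp
#include <vector>

#include <faiss/gpu/GpuIndexIVFPQ.h>
#include <faiss/gpu/StandardGpuResources.h>

int main() {
    int d = 128;        // vector dimension
    size_t nb = 100000; // database size
    size_t nq = 1000;   // queries per batch
    int nlist = 500;    // number of coarse IVF centroids
    int k = 100;        // top-k neighbors

    // Every database (and query) vector is identical -- this is the only
    // difference from the demo, which fills them with drand48().
    std::vector<float> xb(nb * d, 0.5f);
    std::vector<float> xq(nq * d, 0.5f);

    faiss::gpu::StandardGpuResources res;
    faiss::gpu::GpuIndexIVFPQ index(
        &res, d, nlist, 8 /* PQ8: 8 sub-quantizers */, 8 /* bits per code */,
        faiss::METRIC_L2);

    index.train(nb, xb.data());
    index.add(nb, xb.data());
    index.setNumProbes(500);

    std::vector<float> distances(nq * k);
    std::vector<faiss::Index::idx_t> labels(nq * k);
    index.search(nq, xq.data(), k, distances.data(), labels.data()); // crashes here
    return 0;
}
```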

Thank you very much!

@shutcode

It seems your GPU memory is exhausted, or the temp memory is not set.

@wickedfoo
Contributor

Disable precomputed codes on the index; it appears that you do not have enough memory to use precomputed codes.
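If it helps, disabling them looks roughly like this (a sketch against the GPU API; the dimensions and list counts are placeholders):

```cpp
#include <faiss/gpu/GpuIndexIVFPQ.h>
#include <faiss/gpu/StandardGpuResources.h>

void buildWithoutPrecomputedCodes(faiss::gpu::StandardGpuResources& res) {
    // Option 1: disable precomputed tables at construction time.
    faiss::gpu::GpuIndexIVFPQConfig config;
    config.usePrecomputedTables = false;
    faiss::gpu::GpuIndexIVFPQ index(
        &res, 128, 500, 8, 8, faiss::METRIC_L2, config);

    // Option 2: toggle it on an already-constructed index.
    index.setPrecomputedCodes(false);
}
```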

@wickedfoo
Contributor

Also, how many coarse IVF centroids does your index use?

@ZhuoranLyu
Author

@shutcode Thanks, but I do set the temp memory to 18%. You are right that the GPU memory is exhausted, but I do not think it should be. I am only searching 100000 vectors on a P4 GPU with 8 GB of memory, which should be more than enough.
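For reference, my resource setup is just this (a sketch; the byte count approximates 18% of the 8 GB card):

```cpp
#include <faiss/gpu/StandardGpuResources.h>

void configureResources(faiss::gpu::StandardGpuResources& res) {
    // Pin the temp-memory pool to ~18% of 8 GB (~1.47 GB).
    size_t total = 8ULL * 1024 * 1024 * 1024;
    res.setTempMemory(static_cast<size_t>(total * 0.18));
}
```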

@ZhuoranLyu
Author

@wickedfoo Hi wickedfoo, thank you for your reply. I have tried both disabling and enabling precomputed codes, but it makes no difference. I have also tried 100, 500, and 1000 coarse centroids, and none of them works.

@ZhuoranLyu
Author

The question is why it asks for so much memory when I am only searching 100000 vectors. If the vectors are not exactly the same (I add a random number to each vector), the search needs only a little memory. But if they are exactly identical, it requires this much.
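One back-of-the-envelope check against the backtrace (the 16-bytes-per-entry factor is inferred from the numbers, not read from the source):

```cpp
#include <cstdio>

int main() {
    // Values taken from the backtrace above.
    long long nq = 1000;              // queries per batch (n=1000)
    long long maxListLength = 217084; // longest inverted list
    long long bytesPerEntry = 16;     // inferred from the failed alloc size

    // Prints 3473344000 -- exactly the cudaMalloc size that fails.
    std::printf("%lld\n", nq * maxListLength * bytesPerEntry);
    return 0;
}
```

So the scan's temporary buffer appears to scale with nq × maxListLength. With identical vectors, (almost) the whole database lands in a single inverted list, so maxListLength is roughly the dataset size instead of roughly nb/nlist, which would explain the blow-up.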

Contributor

mdouze commented Aug 28, 2018

No activity, closing.

mdouze closed this as completed Aug 28, 2018