Segmentation fault when build_disk_index with Multithreading #4

Closed
lajiyuan opened this issue Mar 29, 2021 · 2 comments

@lajiyuan

What happened:

I generated a binary data file with a simple program. As written below, it produces 10 points with 10 int8 dimensions each.

#include <cstdint>
#include <fstream>
#include <iostream>
using namespace std;
int main()
{
  // Output file: two int header fields followed by the int8 vector data.
  ofstream myFile ("data.bin", ios::out | ios::binary);
  int mockNum = 10;    // number of points
  int dimension = 10;  // dimensions per point
  myFile.write((char*)&mockNum, sizeof(int));
  myFile.write((char*)&dimension, sizeof(int));
  for (int i = 0; i < mockNum; i++)
  {
    for (int j = 0; j < dimension; j++)
    {
      int8_t x = static_cast<int8_t>(i + j);
      myFile.write((char*)&x, sizeof(int8_t));
    }
  }
  myFile.close();
  return 0;
}
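Before suspecting the index build, it can help to confirm that the generated file has the layout the generator intends: two 32-bit integers (number of points, number of dimensions) followed by the raw values. The sketch below is a minimal check under that assumption; `bin_header_matches` is a hypothetical helper written for this issue, not a DiskANN function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>

// Hypothetical helper (not part of DiskANN). Assumes the layout used by the
// generator above: int32 point count, int32 dimension, then the raw values.
// Returns true when the file size equals 8 + npts * dims * bytes_per_value.
bool bin_header_matches(const std::string& path, std::size_t bytes_per_value,
                        std::int32_t& npts, std::int32_t& dims) {
  assert(bytes_per_value > 0);
  // Open at the end so tellg() reports the file size.
  std::ifstream in(path, std::ios::binary | std::ios::ate);
  if (!in) return false;
  const std::streamoff file_size = in.tellg();
  in.seekg(0, std::ios::beg);
  in.read(reinterpret_cast<char*>(&npts), sizeof(npts));
  in.read(reinterpret_cast<char*>(&dims), sizeof(dims));
  if (!in || npts < 0 || dims < 0) return false;
  const std::streamoff expected =
      8 + static_cast<std::streamoff>(npts) * dims *
              static_cast<std::streamoff>(bytes_per_value);
  return file_size == expected;
}
```

For the file produced above, calling `bin_header_matches("data.bin", 1, npts, dims)` should report 10 points and 10 dimensions for a 108-byte file, which agrees with the `size: 108` line in the log further down.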

However, build_disk_index crashes with a segmentation fault when I run it with more than one thread.

successful command (1 thread)

./tests/build_disk_index int8 ../cmake-build-debug/tests/data.bin output/index 60 75 10 10 1

core dump command (2 threads)

./tests/build_disk_index int8 ../cmake-build-debug/tests/data.bin output/index 60 75 10 10 2

message log

Starting index build: R=60 L=75 Query RAM budget: 1.0469e+10 Indexing ram budget: 10 T: 2
Compressing 10-dimensional data into 10 bytes per vector.
Opened: ../cmake-build-debug/tests/data.bin, size: 108, cache_size: 108
Training data loaded of size 10
Stat(output/index_pq_pivots.bin) returned: 0
Reading bin file output/index_pq_pivots.bin ...
Metadata: #pts = 256, #dims = 10...
PQ pivot file exists. Not generating again
Opened: ../cmake-build-debug/tests/data.bin, size: 108, cache_size: 108
Stat(output/index_pq_pivots.bin) returned: 0
Reading bin file output/index_pq_pivots.bin_centroid.bin ...
Metadata: #pts = 10, #dims = 1...
Reading bin file output/index_pq_pivots.bin_rearrangement_perm.bin ...
Metadata: #pts = 10, #dims = 1...
Reading bin file output/index_pq_pivots.bin_chunk_offsets.bin ...
Metadata: #pts = 11, #dims = 1...
Reading bin file output/index_pq_pivots.bin ...
Metadata: #pts = 256, #dims = 10...
Loaded PQ pivot information
Processing points [0, 10)...done.
Full index fits in RAM, building in one shot
Number of frozen points = 0
Reading bin file ../cmake-build-debug/tests/data.bin ...Metadata: #pts = 10, #dims = 10, aligned_dim = 16...allocating aligned memory, 160 bytes...done. Copying data... done.
Using AVX2 distance computation
Starting index build...
Number of syncs: 40
Completed (round: 0, sync: 1/40 with L 75) sync_time: 0.00149s; inter_time: 2.958e-05s
Completed (round: 0, sync: 3/40 with L 75) sync_time: 0.001415s; inter_time: 2.37e-05s
Segmentation fault (core dumped)

core trace

Starting index build...
Number of syncs: 40

Thread 1 "build_disk_inde" received signal SIGSEGV, Segmentation fault.
tcmalloc::SLL_PopRange (end=, start=, N=8, head=0x1387c60) at src/linked_list.h:88
88 tmp = SLL_Next(tmp);
(gdb) bt
#0 tcmalloc::SLL_PopRange (end=, start=, N=8, head=0x1387c60) at src/linked_list.h:88
#1 tcmalloc::ThreadCache::FreeList::PopRange (end=, start=, N=8, this=0x1387c60) at src/thread_cache.h:238
#2 tcmalloc::ThreadCache::ReleaseToCentralCache (this=this@entry=0x1387c40, src=src@entry=0x1387c60, cl=, N=8, N@entry=32) at src/thread_cache.cc:206
#3 0x00007f03767b878c in tcmalloc::ThreadCache::ListTooLong (this=0x1387c40, list=0x1387c60, cl=) at src/thread_cache.cc:164
#4 0x00000000004d8020 in __gnu_cxx::new_allocator::deallocate (this=0x7ffe91b27f10, __p=0x1b3c0c0) at /usr/include/c++/9/ext/new_allocator.h:128
#5 0x00000000004d3ab5 in std::allocator_traits<std::allocator >::deallocate (__a=..., __p=0x1b3c0c0, __n=2) at /usr/include/c++/9/bits/alloc_traits.h:469
#6 0x00000000004cfb7c in std::_Vector_base<unsigned int, std::allocator >::_M_deallocate (this=0x7ffe91b27f10, __p=0x1b3c0c0, __n=2) at /usr/include/c++/9/bits/stl_vector.h:351
#7 0x00000000004cde10 in std::_Vector_base<unsigned int, std::allocator >::~_Vector_base (this=0x7ffe91b27f10, __in_chrg=) at /usr/include/c++/9/bits/stl_vector.h:332
#8 0x00000000004cde61 in std::vector<unsigned int, std::allocator >::~vector (this=0x7ffe91b27f10, __in_chrg=) at /usr/include/c++/9/bits/stl_vector.h:680
#9 0x00000000005129ae in std::__shrink_to_fit_aux<std::vector<unsigned int, std::allocator >, true>::_S_do_it (__c=...) at /usr/include/c++/9/bits/allocator.h:265
#10 0x000000000050ea67 in std::vector<unsigned int, std::allocator >::_M_shrink_to_fit (this=0x1b3e240) at /usr/include/c++/9/bits/vector.tcc:693
#11 0x0000000000509896 in std::vector<unsigned int, std::allocator >::shrink_to_fit (this=0x1b3e240) at /usr/include/c++/9/bits/stl_vector.h:987
#12 0x000000000051c7bf in diskann::Index<signed char, int>::_ZN7diskann5IndexIaiE4linkERNS_10ParametersE._omp_fn.2(void) () at /workspace/DiskANN/src/index.cpp:875
#13 0x00007f036f6d08f8 in __kmp_api_GOMP_parallel (task=0x7, data=0x1b3c0c0, num_threads=0, flags=8) at ../../src/kmp_gsupport.cpp:1430
#14 0x00000000004f77cd in diskann::Index<signed char, int>::link (this=0x1b40200, parameters=...) at /workspace/DiskANN/src/index.cpp:867
#15 0x00000000004f02b2 in diskann::Index<signed char, int>::build (this=0x1b40200, parameters=..., tags=...) at /workspace/DiskANN/src/index.cpp:988
#16 0x00000000004ca33c in diskann::build_merged_vamana_index (base_file=..., _compareMetric=diskann::L2, L=75, R=60, sampling_rate=150000, ram_budget=10, mem_index_path=..., medoids_file=..., centroids_file=...)
at /workspace/DiskANN/src/aux_utils.cpp:371
#17 0x00000000004c7abc in diskann::build_disk_index (dataFilePath=0x7ffe91b2d217 "../cmake-build-debug/tests/data.bin", indexFilePath=0x7ffe91b2d23b "output/index", indexBuildParameters=0x7ffe91b2b140 "60 75 10 10 2",
_compareMetric=diskann::L2) at /workspace/DiskANN/src/aux_utils.cpp:712
#18 0x00000000004be7a7 in build_index (dataFilePath=0x7ffe91b2d217 "../cmake-build-debug/tests/data.bin", indexFilePath=0x7ffe91b2d23b "output/index", indexBuildParameters=0x7ffe91b2b140 "60 75 10 10 2")
at /workspace/DiskANN/tests/build_disk_index.cpp:15
#19 0x00000000004be1ad in main (argc=9, argv=0x7ffe91b2b3d8) at /workspace/DiskANN/tests/build_disk_index.cpp:34
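
Frames #11 to #13 show the fault occurring while `std::vector::shrink_to_fit` frees memory through tcmalloc inside an OpenMP parallel region. The sketch below is a standalone stress routine that exercises the same pattern; it is not DiskANN code, and the function name and parameters are made up for illustration. If a small program built around it also crashed when linked against the same tcmalloc and OpenMP runtime, that would suggest an environment problem rather than a bug in the index code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stress helper, not taken from DiskANN. Each iteration grows
// its own vector, trims it, and calls shrink_to_fit(), which in libstdc++
// reallocates and frees the old buffer through the linked allocator.
std::size_t shrink_in_parallel(std::size_t num_lists, std::size_t grow_to,
                               std::size_t keep) {
  assert(keep <= grow_to);
  std::vector<std::vector<unsigned>> lists(num_lists);
  // Without -fopenmp the pragma is ignored and the loop runs serially.
#pragma omp parallel for schedule(dynamic, 64)
  for (long long i = 0; i < static_cast<long long>(num_lists); i++) {
    // Every iteration touches a distinct element, so there is no data race.
    std::vector<unsigned>& list = lists[static_cast<std::size_t>(i)];
    list.reserve(grow_to);
    for (std::size_t j = 0; j < grow_to; j++) {
      list.push_back(static_cast<unsigned>(j));
    }
    list.resize(keep);
    list.shrink_to_fit();
  }
  // Sum the sizes so the caller can check that all lists were processed.
  std::size_t total = 0;
  for (const auto& list : lists) total += list.size();
  return total;
}
```

A call such as `shrink_in_parallel(100000, 64, 2)` from a small `main`, compiled with `-fopenmp` and linked with `-ltcmalloc` (assuming gperftools is installed under that library name), should return `num_lists * keep` on a healthy setup.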

@ShikharJ
Contributor

ShikharJ commented Mar 30, 2021

@lajiyuan I'm unable to reproduce the issue on my system with either command. Are you sure this isn't related to how the prerequisite libraries were installed? I suspect tcmalloc isn't set up correctly. Alternatively, it could be a system-related issue.

@lajiyuan
Author

@ShikharJ It was indeed a problem with the prerequisites. I used the Dockerfile at https://github.com/erikbern/ann-benchmarks/pull/230/files#diff-99a5298dde50b3aa8e4ca5133869cb9390f81cf1f09ef7934630bc55b531030f to build a Docker image and ran the build inside it, and I now get the expected result. Thanks a lot.
