This repository has been archived by the owner on Oct 31, 2023. It is now read-only.

dense_retriever -- MemoryError: std::bad_alloc #33

Closed
aarzchan opened this issue Jul 24, 2020 · 11 comments

Comments

@aarzchan

Hi! It seems that no matter what value I set index_buffer to, I get the following error when running dense_retriever.py:

Traceback (most recent call last):
  File "dense_retriever.py", line 331, in <module>
    main(args)
  File "dense_retriever.py", line 268, in main
    retriever.index_encoded_data(input_paths, buffer_size=index_buffer_sz)
  File "dense_retriever.py", line 100, in index_encoded_data
    self.index.index_data(buffer)
  File "/home/aarchan/qa-aug/qa-aug/dpr/indexer/faiss_indexers.py", line 93, in index_data
    self.index.add(vectors)
  File "/home/aarchan/anaconda2/envs/qa-aug/lib/python3.8/site-packages/faiss/__init__.py", line 138, in replacement_add
    self.add_c(n, swig_ptr(x))
  File "/home/aarchan/anaconda2/envs/qa-aug/lib/python3.8/site-packages/faiss/swigfaiss.py", line 1454, in add
    return _swigfaiss.IndexFlat_add(self, n, x)
MemoryError: std::bad_alloc

For reference, the machine I'm running this on has 128GB RAM, but it doesn't seem to be enough. Could you please help me with this issue? Thanks!
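For context on why tuning index_buffer may not help here: a flat FAISS index keeps every added vector resident in RAM, so a smaller buffer only shrinks each add() call, not the final footprint. A minimal sketch with made-up, scaled-down sizes (the real DPR setup is on the order of 21M passages × 768-dim vectors):

import numpy as np
import faiss

dim = 768
index = faiss.IndexFlatIP(dim)      # flat inner-product index, as DPR's flat indexer uses

for shard in range(10):             # stand-ins for the encoded passage files
    vectors = np.random.rand(1000, dim).astype(np.float32)
    index.add(vectors)              # each add() keeps its vectors resident in RAM

print(index.ntotal)                 # 10000 -- earlier chunks are never released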

@vlad-karpukhin
Contributor

Hi Aaron,
yes, unfortunately a 128GB server is not enough for the retriever inference-time setup, even with the flat index. The HNSW index requires even more memory (it alone takes ~160GB of RAM).
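For scale, a back-of-envelope estimate, assuming the standard DPR setting of roughly 21 million Wikipedia passages and 768-dimensional float32 embeddings (both numbers are assumptions, not measurements from this thread):

num_passages = 21_000_000   # approximate DPR Wikipedia passage count (assumption)
dim = 768                   # DPR embedding dimensionality
bytes_per_float = 4         # float32

flat_gb = num_passages * dim * bytes_per_float / 1024**3
print(f"raw vectors alone: ~{flat_gb:.0f} GB")   # ~60 GB

# The passage texts/ids held in memory plus FAISS/Python overhead push the total
# towards the ~95 GB peak reported below; HNSW additionally stores graph links
# per vector, which is why it needs considerably more.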

@aarzchan
Author

How much RAM is required for the flat index?

@vlad-karpukhin
Contributor

vlad-karpukhin commented Jul 24, 2020

The flat index setup showed a maximum of 95GB RAM consumption for the entire process (i.e., index + Wikipedia passage data in memory, etc.).

@aarzchan
Author

Hmm, I'm confused. Why is 128GB of RAM not enough for the flat index setup if it only requires 95GB?

@vlad-karpukhin
Contributor

I have no idea why 128GB is not enough; I ran it on a 512GB server and measured the highest RES consumption. The virtual max was 135GB.
Maybe you have some other processes consuming RAM, so you have less than the required ~95GB left for the retriever?

@aarzchan
Author

I've checked that, prior to running dense_retriever.py, my server is using less than 1GB of RAM, so it seems that all of the RAM usage is coming from the retriever script. I also have access to a server with 256GB RAM, so I'll try running on that machine.
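(A quick way to confirm the headroom right before launching; this assumes psutil is installed, and any equivalent of free -h works just as well:)

import psutil

available_gb = psutil.virtual_memory().available / 1024**3
print(f"available RAM before starting the retriever: {available_gb:.1f} GB")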

@soheeyang
Contributor

soheeyang commented Aug 9, 2020

I ran it on a 512GB server and measured the highest RES consumption. The virtual max was 135GB.

How many CPU cores does the server have, and how long does it take to run dense_retriever on it?

@vlad-karpukhin
Contributor

vlad-karpukhin commented Aug 10, 2020 via email

@soheeyang
Contributor

It has 64 cores

Thank you :D

@vlad-karpukhin
Contributor

Seems like this can be closed now.

@luomancs

I have the same MemoryError: std::bad_alloc issue when I use dense_retriever.py. The size of my encoded index data is 50GB and my server has 86GB of memory. I changed the function iterate_encoded_files in faiss_indexers.py as follows:

def iterate_encoded_files(vector_files: list) -> Iterator[Tuple[object, np.array]]:
    doc_vectors = []  # accumulate (db_id, doc_vector) pairs across all files
    for i, file in enumerate(vector_files):
        logger.info('Reading file %s', file)
        with open(file, "rb") as reader:
            doc_vectors.extend(pickle.load(reader))
        # original per-document streaming, now disabled:
        # for doc in doc_vectors:
        #     db_id, doc_vector = doc
        #     yield db_id, doc_vector
        # del doc_vectors
        # gc.collect()
    return doc_vectors  # return everything at once instead of yielding per document

The function loads all of the encoded data at once (since 50GB is less than 86GB, this is fine), so that the DenseIndexer indexes everything in a single index_data call rather than going through a buffer.

Now the MemoryError: std::bad_alloc is gone.
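For anyone reproducing this workaround, a rough sketch of the calling side (method names follow the traceback above; the exact signatures in your checkout may differ):

def index_encoded_data(self, vector_files: list):
    # iterate_encoded_files() now returns the full list of (db_id, doc_vector)
    # pairs instead of yielding them, so everything is indexed in a single call
    # rather than in index_buffer-sized chunks.
    all_docs = iterate_encoded_files(vector_files)
    self.index.index_data(all_docs)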
