
Numpy memory error #30

Closed

Deepakchawla opened this issue Aug 21, 2017 · 19 comments

@Deepakchawla

When I run the python scripts/retriever/interactive.py command, it shows me the error below.
root@ubuntu-2gb-nyc3-01:~/DrQA# python scripts/retriever/interactive.py
08/21/2017 08:13:28 AM: [ Initializing ranker... ]
08/21/2017 08:13:28 AM: [ Loading /root/DrQA/data/wikipedia/docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz ]
Traceback (most recent call last):
  File "scripts/retriever/interactive.py", line 27, in <module>
    ranker = retriever.get_class('tfidf')(tfidf_path=args.model)
  File "/root/DrQA/drqa/retriever/tfidf_doc_ranker.py", line 37, in __init__
    matrix, metadata = utils.load_sparse_csr(tfidf_path)
  File "/root/DrQA/drqa/retriever/utils.py", line 34, in load_sparse_csr
    matrix = sp.csr_matrix((loader['data'], loader['indices'],
  File "/root/anaconda3/lib/python3.6/site-packages/numpy/lib/npyio.py", line 233, in __getitem__
    pickle_kwargs=self.pickle_kwargs)
  File "/root/anaconda3/lib/python3.6/site-packages/numpy/lib/format.py", line 664, in read_array
    array = numpy.ndarray(count, dtype=dtype)
MemoryError
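For context on the failing step: per the traceback, load_sparse_csr rebuilds a SciPy CSR matrix from the arrays stored in the .npz archive. Below is a minimal sketch of that same save/load pattern on a toy 2x3 matrix (the file name and matrix values are made up for illustration); the real Wikipedia tf-idf index has the same structure at multi-gigabyte scale, which is what exhausts RAM here.

```python
import os
import tempfile

import numpy as np
import scipy.sparse as sp

# A tiny CSR matrix, round-tripped through .npz the way DrQA stores its index:
# the data/indices/indptr/shape arrays are saved and the matrix rebuilt on load.
matrix = sp.csr_matrix(np.array([[1, 0, 2], [0, 0, 3]]))
path = os.path.join(tempfile.gettempdir(), 'tiny-tfidf.npz')

np.savez(path, data=matrix.data, indices=matrix.indices,
         indptr=matrix.indptr, shape=matrix.shape)

loader = np.load(path)
# Each loader[key] access deserializes one full array into memory; with the
# real index, a MemoryError fires at this point when an array exceeds free RAM.
restored = sp.csr_matrix((loader['data'], loader['indices'], loader['indptr']),
                         shape=loader['shape'])
print(restored.toarray())
```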

I am using it without a GPU, and below is my system information.
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 4
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Stepping: 1
CPU MHz: 2199.998
BogoMIPS: 4399.99
Hypervisor vendor: KVM
Virtualization type: full
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 30720K
NUMA node0 CPU(s): 0-3
Can someone help me resolve this problem?

Thank You

@ajfisch
Contributor

ajfisch commented Aug 21, 2017

How much free RAM does your system have? Is it possible your download was interrupted and got corrupted?

@Deepakchawla
Author

Below is the output of free:
              total   used   free  shared  buff/cache  available
Mem:           7484     92   7176       9         215       7158
Swap:             0      0      0

@Deepakchawla
Author

I set /proc/sys/vm/overcommit_memory to 1 using echo 1 > /proc/sys/vm/overcommit_memory and ran interactive.py again; it shows me the message below:
deepakchawla35@deepak-server:~/DrQA$ python scripts/pipeline/interactive.py
08/21/2017 05:49:49 PM: [ Running on CPU only. ]
08/21/2017 05:49:49 PM: [ Initializing pipeline... ]
08/21/2017 05:49:49 PM: [ Initializing document ranker... ]
08/21/2017 05:49:49 PM: [ Loading /home/deepakchawla35/DrQA/data/wikipedia/docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz ]
Killed

Now what should I do?

@ajfisch
Contributor

ajfisch commented Aug 21, 2017

From your free output, it looks like you do not have enough RAM on your machine. You need at least around 15 GB, and it looks like you have 8 (if the units you posted are MB).

@Deepakchawla
Author

Okay, I will upgrade from 8 GB to 15 GB. But when I changed the overcommit value from 0 to 1, it stopped showing any memory-related error and ran smoothly until it printed that Killed message. What is the reason behind the Killed message?

@ajfisch
Contributor

ajfisch commented Aug 22, 2017

Setting the value from 0 to 1 enables overcommit, always. In overcommit mode, the Linux kernel always lets a memory allocation like malloc succeed. But then when your program actually uses that memory, you run out of space, and the kernel OOM killer kills the process (hence your Killed message).

On the other hand, if overcommit is not enabled, the kernel will not let programs allocate more virtual memory than is physically available. malloc will fail and the actual program (in this case numpy) will exit with an error (MemoryError).
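For reference, the current policy can be inspected without root; a small read-only sketch (Linux-only, mode meanings per the kernel's overcommit accounting):

```python
# Inspect the kernel's overcommit policy. Read-only, so safe to run
# unprivileged; changing it, as in the thread above, needs root:
#   echo 1 > /proc/sys/vm/overcommit_memory
MODES = {0: 'heuristic (default)', 1: 'always overcommit', 2: 'never overcommit'}
with open('/proc/sys/vm/overcommit_memory') as f:
    mode = int(f.read().strip())
print('vm.overcommit_memory =', mode, '->', MODES[mode])
```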

@Deepakchawla
Author

Deepakchawla commented Aug 22, 2017

Okay, got your point. I have now changed my RAM size; here is free -m before running the Python file:
              total   used   free  shared  buff/cache  available
Mem:          22099    148  21876      10          74      21708
Swap:             0      0      0
deepakchawla35@deepak-server:~/DrQA$ python scripts/pipeline/interactive.py
08/22/2017 03:17:25 AM: [ Running on CPU only. ]
08/22/2017 03:17:25 AM: [ Initializing pipeline... ]
08/22/2017 03:17:25 AM: [ Initializing document ranker... ]
08/22/2017 03:17:25 AM: [ Loading /home/deepakchawla35/DrQA/data/wikipedia/docs-tfidf-ngram=2-hash=16777216-tokenizer=simple.npz ]
08/22/2017 03:19:24 AM: [ Initializing document reader... ]
08/22/2017 03:19:24 AM: [ Loading model /home/deepakchawla35/DrQA/data/reader/multitask.mdl ]
08/22/2017 03:19:31 AM: [ Initializing tokenizers and document retrievers... ]
Traceback (most recent call last):
  File "scripts/pipeline/interactive.py", line 70, in <module>
    tokenizer=args.tokenizer
  File "/home/deepakchawla35/DrQA/drqa/pipeline/drqa.py", line 140, in __init__
    initargs=(tok_class, tok_opts, db_class, db_opts, fixed_candidates)
  File "/home/deepakchawla35/anaconda3/lib/python3.6/multiprocessing/context.py", line 119, in Pool
    context=self.get_context())
  File "/home/deepakchawla35/anaconda3/lib/python3.6/multiprocessing/pool.py", line 168, in __init__
    self._repopulate_pool()
  File "/home/deepakchawla35/anaconda3/lib/python3.6/multiprocessing/pool.py", line 233, in _repopulate_pool
    w.start()
  File "/home/deepakchawla35/anaconda3/lib/python3.6/multiprocessing/process.py", line 105, in start
    self._popen = self._Popen(self)
  File "/home/deepakchawla35/anaconda3/lib/python3.6/multiprocessing/context.py", line 277, in _Popen
    return Popen(process_obj)
  File "/home/deepakchawla35/anaconda3/lib/python3.6/multiprocessing/popen_fork.py", line 20, in __init__
    self._launch(process_obj)
  File "/home/deepakchawla35/anaconda3/lib/python3.6/multiprocessing/popen_fork.py", line 67, in _launch
    self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory

And while the Python file is running, free -m shows a different breakdown:
              total   used   free  shared  buff/cache  available
Mem:          22099    148  13961      10        7989      21628
Swap:             0      0      0

@ajfisch
Contributor

ajfisch commented Aug 22, 2017

Do you still have overcommit enabled? You might need that to run with the tokenizers, as it allocates (but doesn't use all) memory for the JVM for each tokenizer process.

You can also see if running with --tokenizer spacy works.
Edit: Try --tokenizer regexp first, as you'd need to pip install spacy && python -m spacy download en for the former

@Deepakchawla
Author

No, overcommit is currently disabled:
deepakchawla35@deepak-server:~/DrQA$ cat /proc/sys/vm/overcommit_memory
0

"You can also see if running with --tokenizer spacy works." => I don't get your point...

@ajfisch
Contributor

ajfisch commented Aug 22, 2017

  1. Try running with overcommit enabled (echo 1 > /proc/sys/vm/overcommit_memory)
  2. If that still errors, try running python scripts/pipeline/interactive.py --tokenizer regexp; it uses a less resource-intensive tokenizer (which is where your machine is failing).

@Deepakchawla
Author

okay, let me try...

@Deepakchawla
Author

Now it is working perfectly, thank you so much! But it gives me the wrong prediction for some questions:
>>> process('when facebook company ipo launched')
08/22/2017 03:49:42 AM: [ Processing 1 queries... ]
08/22/2017 03:49:42 AM: [ Retrieving top 5 docs... ]
08/22/2017 03:49:43 AM: [ Reading 323 paragraphs... ]
08/22/2017 03:49:51 AM: [ Processed 1 queries in 8.7226 (s) ]
Top Predictions:
+------+--------+-------------------------------------+--------------+-----------+
| Rank | Answer | Doc | Answer Score | Doc Score |
+------+--------+-------------------------------------+--------------+-----------+
| 1 | 2009 | Initial public offering of Facebook | 49060 | 248.07 |
+------+--------+-------------------------------------+--------------+-----------+

Contexts:
[ Doc = Initial public offering of Facebook ]
To ensure that early investors would retain control of the company, Facebook in 2009 instituted a dual-class stock structure. After the IPO, Zuckerberg was to retain a 22% ownership share in Facebook and was to own 57% of the voting shares. The document also stated that the company was seeking to raise $5 billion, which would make it one of the largest IPOs in tech history and the biggest in Internet history.

>>> process('when facebook company IPO launched')
08/22/2017 03:51:07 AM: [ Processing 1 queries... ]
08/22/2017 03:51:07 AM: [ Retrieving top 5 docs... ]
08/22/2017 03:51:07 AM: [ Reading 323 paragraphs... ]
08/22/2017 03:51:14 AM: [ Processed 1 queries in 6.7024 (s) ]
Top Predictions:
+------+--------+-------------------------------------+--------------+-----------+
| Rank | Answer | Doc | Answer Score | Doc Score |
+------+--------+-------------------------------------+--------------+-----------+
| 1 | 2012 | Initial public offering of Facebook | 4.8931e+05 | 248.07 |
+------+--------+-------------------------------------+--------------+-----------+

Contexts:
[ Doc = Initial public offering of Facebook ]
The social networking company Facebook held its initial public offering (IPO) on Friday, May 18, 2012. The IPO was the biggest in technology and one of the biggest in Internet history, with a peak market capitalization of over $104 billion. Media pundits called it a "cultural touchstone."

>>> process('who is father of deep learning')
08/22/2017 03:52:47 AM: [ Processing 1 queries... ]
08/22/2017 03:52:47 AM: [ Retrieving top 5 docs... ]
08/22/2017 03:52:48 AM: [ Reading 479 paragraphs... ]
08/22/2017 03:52:55 AM: [ Processed 1 queries in 7.3674 (s) ]
Top Predictions:
+------+---------------------+---------------+--------------+-----------+
| Rank | Answer | Doc | Answer Score | Doc Score |
+------+---------------------+---------------+--------------+-----------+
| 1 | Juergen Schmidhuber | Deep learning | 3.7192e+08 | 453.99 |
+------+---------------------+---------------+--------------+-----------+

Contexts:
[ Doc = Deep learning ]
Deep learning algorithms transform their inputs through more layers than shallow learning algorithms. At each layer, the signal is transformed by a processing unit, like an artificial neuron, whose parameters are 'learned' through training. A chain of transformations from input to output is a "credit assignment path" (CAP). CAPs describe potentially causal connections between input and output and may vary in length – for a feedforward neural network, the depth of the CAPs (thus of the network) is the number of hidden layers plus one (as the output layer is also parameterized), but for recurrent neural networks, in which a signal may propagate through a layer more than once, the CAP is potentially unlimited in length. There is no universally agreed upon threshold of depth dividing shallow learning from deep learning, but most researchers in the field agree that deep learning has multiple nonlinear layers (CAP > 2) and Juergen Schmidhuber considers CAP > 10 to be very deep learning.

@ajfisch
Contributor

ajfisch commented Aug 22, 2017

I am glad that it is working.

DrQA is just an AI research project -- of course there is no guarantee that it will answer all questions correctly (or in the case of this model be invariant to spelling, capitalization, or phrasing). In fact from our reported evaluations on several QA datasets, you can expect that DrQA will get most questions wrong (but also a fair amount correct). Hopefully this model can be a baseline for machine reading at scale that someone like you can beat 😉.

Then again, the answers to some of these questions are subjective. Perhaps Juergen wouldn't mind the answer to your question 3...

@Deepakchawla
Author

Okay. Are you working on improving its QA datasets to give more accurate answers? One more thing: currently it takes a lot of time to give answers, and I want it to take at most 3 seconds. What should I do to achieve this?

@ajfisch
Contributor

ajfisch commented Aug 22, 2017

Reading comprehension and open-domain QA is an active area of research, for FAIR and others.

To improve the runtime performance of DrQA you will need a machine with better specs. It also scales better with large batches (faster average time per question).

  • Ideally you will have a machine with a GPU and CUDNN. The higher quality the GPU, the better.
  • Having more CPU cores (especially if you are lacking a GPU) is also very helpful. The prediction pipeline runs on both CPU and GPU. >15 cores is good, more if not using a GPU.
  • Running in large batch sizes (say up to 1000 questions) is much more efficient than asking single questions. You can see how batching is done in scripts/pipeline/predict.py, for example.
  • As an immediate measure, you can reduce the number of documents DrQA reads per question (the n_docs parameter in process, default is 5). This will hurt your accuracy, however.
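The batching point can be illustrated with a toy model of per-call overhead. The process/process_batch functions below are hypothetical stand-ins with made-up costs, not DrQA's actual API: each call pays a fixed setup cost, so spreading it over many questions lowers the average time per question.

```python
import time

# Illustrative (made-up) costs: a fixed per-call overhead plus a small
# marginal cost per question.
FIXED_OVERHEAD = 0.01   # seconds per call
PER_QUESTION = 0.001    # seconds per question

def process(question):
    time.sleep(FIXED_OVERHEAD + PER_QUESTION)
    return 'answer'

def process_batch(questions):
    # One fixed overhead amortized across the whole batch.
    time.sleep(FIXED_OVERHEAD + PER_QUESTION * len(questions))
    return ['answer'] * len(questions)

questions = ['q%d' % i for i in range(20)]

start = time.time()
for q in questions:
    process(q)
single_total = time.time() - start

start = time.time()
process_batch(questions)
batch_total = time.time() - start

print('avg per question: %.4fs one-by-one vs %.4fs batched'
      % (single_total / len(questions), batch_total / len(questions)))
```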

@Deepakchawla
Author

Okay, so I will try with a GPU and try to reduce the execution time. Thanks a lot once again; you helped a lot and contributed to the accomplishment of my passion project... 😄

@ajfisch
Contributor

ajfisch commented Aug 22, 2017

You are very welcome!

@Deepakchawla
Author

Deepakchawla commented Aug 22, 2017

😊

@Deepakchawla Deepakchawla changed the title Numpu memory error Numpy memory error Aug 22, 2017
@ajfisch ajfisch closed this as completed Aug 28, 2017
@augmen

augmen commented Apr 17, 2018

Hi, I am having the same issue with 8 GB RAM and 4 CPU cores. Can you help us?
(pt) root@ml:~/DrQA# python3 scripts/pipeline/interactive.py --tokenizer regexp
Traceback (most recent call last):
  File "scripts/pipeline/interactive.py", line 16, in <module>
    from drqa import pipeline
ImportError: No module named 'drqa'
