"RuntimeError: release unlocked lock" in numpy mtrand.RandomState.randint while training doc2vec #1311

yangkky · 2017-05-09T21:21:00Z

Description

I'm trying to train a doc2vec model on a set of protein sequences divided into k-mers. With some datasets/divisions into k-mers, the training completes successfully. However, sometimes, I get a RuntimeError:

2017-05-09 13:37:56,183 : INFO : PROGRESS: at 96.88% examples, 409235 words/s, in_qsize 15, out_qsize 0
Exception in thread Thread-34:
Traceback (most recent call last):
  File "/Users/kevinyang/anaconda/lib/python3.5/threading.py", line 914, in _bootstrap_inner
    self.run()
  File "/Users/kevinyang/anaconda/lib/python3.5/threading.py", line 862, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/kevinyang/anaconda/lib/python3.5/site-packages/gensim/models/word2vec.py", line 822, in worker_loop
    tally, raw_tally = self._do_train_job(sentences, alpha, (work, neu1))
  File "/Users/kevinyang/anaconda/lib/python3.5/site-packages/gensim/models/doc2vec.py", line 717, in _do_train_job
    doctag_vectors=doctag_vectors, doctag_locks=doctag_locks)
  File "gensim/models/doc2vec_inner.pyx", line 428, in gensim.models.doc2vec_inner.train_document_dm (./gensim/models/doc2vec_inner.c:5444)
  File "mtrand.pyx", line 1266, in mtrand.RandomState.randint (numpy/random/mtrand/mtrand.c:15836)
RuntimeError: release unlocked lock

Steps/Code/Corpus to Reproduce

Unfortunately, I don't know how to reproduce this on a smaller corpus. I've attached the training code (in a Jupyter notebook).
bug_report.zip

What would be the best way to attach the corpus? It is around 170 MB.

Versions

Darwin-16.5.0-x86_64-i386-64bit
Python 3.5.3 |Anaconda custom (x86_64)| (default, Mar  6 2017, 12:15:08) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)]
NumPy 1.11.3
SciPy 0.18.1
gensim 1.0.1
FAST_VERSION 1


menshikh-iv · 2017-05-12T09:18:30Z

Thank you @yangkky. You can use an external file storage service such as Dropbox or Google Drive to share your corpus.

gojomo · 2017-05-14T23:08:13Z

Note the actual error is occurring inside numpy code, and gensim's use of numpy on that doc2vec_inner.pyx line (https://github.com/RaRe-Technologies/gensim/blob/fdc01ab1ee350ce223ab9209e911352fba5d4290/gensim/models/doc2vec_inner.pyx#L428) is pretty straightforward. Even though it's in cython, we haven't yet entered a nogil section. That makes me suspect a numpy/anaconda bug rather than gensim. (Gensim isn't managing any locking/unlocking of the random object, and is calling it in an acceptable way, so it seems whatever mess-up is happening with locking must be caused by non-gensim code.)
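For reference, the message in the traceback is the one Python's standard `threading.Lock` raises when `release()` is called on a lock that is not currently held. The minimal sketch below only illustrates where that wording comes from. It assumes, and does not show, that `RandomState` guards its internal state with such a lock, and it is not a reproduction of the failure reported here.

```python
import threading

# A plain standard-library lock, used here only for illustration.
lock = threading.Lock()

try:
    # The lock was never acquired, so releasing it is an error.
    lock.release()
except RuntimeError as exc:
    message = str(exc)
    print(message)
```

Seeing this error from inside a `with lock:` style block would suggest that the lock's state changed between acquire and release, which is consistent with the suspicion that the problem lies outside gensim's calling code.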

Does the problem recur with a later Numpy (0.12+), or with non-Anaconda Python 3.5.x?

yangkky · 2017-05-15T06:16:04Z

@gojomo Do you mean Numpy 1.12+?

I'll try updating jupyter and see if it still fails. I'll also upload the corpus soon.

gojomo · 2017-05-15T18:16:03Z

Yes, sorry, I meant Numpy 1.12 or later. (Besides a non-Anaconda Python, a later Anaconda release could also be worth trying while the problem is still recurring.)

yangkky · 2017-05-15T18:22:30Z

Here is a link to the corpus:
https://drive.google.com/file/d/0BzIF1ox2Vmq5RkUyU2dNWkswS2s/view?usp=sharing

yangkky · 2017-05-15T21:13:48Z

Upgrading to numpy 1.12.1 did not resolve the issue.

gojomo · 2017-05-15T22:14:10Z

I recommend reporting this separately on Numpy's issue tracker; I don't see any error in how gensim is calling this method, and the error arises in numpy's code, with regard to locks that numpy manages.

yangkky · 2017-05-15T22:25:53Z

@gojomo ok I will do that.

piskvorky · 2017-09-02T20:00:38Z

From the discussion in that numpy ticket (numpy/numpy#9192), the error seems to be somehow related to multiprocessing and numpy.

Closing here as unrelated -- please let us know if there's anything we could do on our side @yangkky .

gojomo changed the title ~~RuntimeError while training doc2vec~~ "RuntimeError: release unlocked lock" in numpy mtrand.RandomState.randint while training doc2vec May 15, 2017

yangkky mentioned this issue May 31, 2017

RuntimeError: "release unlocked lock" in numpy mtrand.RandomState.randint numpy/numpy#9192

Closed

piskvorky closed this as completed Sep 2, 2017
