Multicore NULL bug #376

Closed
ghost opened this Issue Jul 1, 2015 · 14 comments

4 participants
ghost commented Jul 1, 2015

ubuntu@ip-172-31-33-28:~$ python lda_en.py
2015-07-01 00:13:06,021 : INFO : initializing corpus reader from <bz2.BZ2File object at 0x7fd29bf1eb90>
2015-07-01 00:13:06,052 : INFO : accepted corpus with 3831719 documents, 100000 features, 595701551 non-zero entries
MmCorpus(3831719 documents, 100000 features, 595701551 non-zero entries)
2015-07-01 00:13:06,059 : INFO : using symmetric alpha at 0.0001
2015-07-01 00:13:06,060 : INFO : using serial LDA version on this node
2015-07-01 00:16:27,970 : INFO : running online LDA training, 10000 topics, 1 passes over the supplied corpus of 3831719 documents, updating every 32000 documents, evaluating every ~320000 documents, iterating 50x with a convergence threshold of 0.001000
2015-07-01 00:16:27,976 : INFO : training LDA model using 32 processes
2015-07-01 00:16:33,442 : INFO : PROGRESS: pass 0, dispatched chunk #0 = documents up to #1000/3831719, outstanding queue size 1
Traceback (most recent call last):
  File "/usr/lib/python2.7/multiprocessing/queues.py", line 264, in _feed
    send(obj)
SystemError: NULL result without error in PyObject_Call
2015-07-01 00:16:40,488 : INFO : PROGRESS: pass 0, dispatched chunk #1 = documents up to #2000/3831719, outstanding queue size 2
2015-07-01 00:16:45,457 : INFO : PROGRESS: pass 0, dispatched chunk #2 = documents up to #3000/3831719, outstanding queue size 3
2015-07-01 00:16:50,220 : INFO : PROGRESS: pass 0, dispatched chunk #3 = documents up to #4000/3831719, outstanding queue size 4


ghost commented Jul 1, 2015

This is related to this bug: https://bugs.python.org/issue17560
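For context, a minimal sketch of why a length-prefixed pipe message hits a hard size ceiling. It assumes, as the linked bug report suggests, that the message length is written as a signed 32-bit integer; `fits_in_header` is a hypothetical helper for illustration, not part of `multiprocessing`.

```python
import struct

MAX_INT32 = 2**31 - 1  # largest length a signed 32-bit header can carry


def fits_in_header(n_bytes):
    """Return True if n_bytes can be encoded as a signed 32-bit length prefix.

    Assumption: the transport writes the payload length with a "!i"-style
    header (network byte order, signed 32-bit int).
    """
    try:
        struct.pack("!i", n_bytes)
        return True
    except struct.error:
        return False


print(fits_in_header(MAX_INT32))      # one byte under 2 GiB
print(fits_in_header(MAX_INT32 + 1))  # exactly 2 GiB
```

Under that assumption, any single pickled object of 2 GiB or more cannot be described by the header, no matter how much memory the machine has.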


ghost commented Jul 1, 2015

Another version of this error:
mingus@lsa:~/Projects/MBTI> python lda_en.py
2015-06-30 21:13:34,692 : INFO : loaded corpus index from wiki_en_tfidf.mm.index.bz2
2015-06-30 21:13:34,692 : INFO : initializing corpus reader from wiki_en_tfidf.mm.bz2
2015-06-30 21:13:34,719 : INFO : accepted corpus with 3831719 documents, 100000 features, 595701551 non-zero entries
2015-06-30 21:13:34,726 : INFO : using symmetric alpha at 0.0004
2015-06-30 21:13:34,726 : INFO : using serial LDA version on this node
2015-06-30 21:14:20,875 : INFO : running online LDA training, 2500 topics, 1 passes over the supplied corpus of 3831719 documents, updating every 2000 documents, evaluating every ~20000 documents, iterating 50x with a convergence threshold of 0.001000
2015-06-30 21:14:20,906 : INFO : training LDA model using 2 processes
2015-06-30 21:14:29,270 : INFO : PROGRESS: pass 0, dispatched chunk #0 = documents up to #1000/3831719, outstanding queue size 1
Traceback (most recent call last):
  File "/usr/lib64/python2.7/multiprocessing/queues.py", line 266, in _feed
    send(obj)
IOError: bad message length
2015-06-30 21:14:38,933 : INFO : PROGRESS: pass 0, dispatched chunk #1 = documents up to #2000/3831719, outstanding queue size 2
2015-06-30 21:14:47,038 : INFO : PROGRESS: pass 0, dispatched chunk #2 = documents up to #3000/3831719, outstanding queue size 3
2015-06-30 21:14:54,713 : INFO : PROGRESS: pass 0, dispatched chunk #3 = documents up to #4000/3831719, outstanding queue size 4

Member

piskvorky commented Jul 1, 2015

Yes, I remember this error. It's a bug/limitation in CPython.

It's been reported here for gensim, but there's no (easy) workaround for now :(


ghost commented Jul 2, 2015

I tried the Stack Overflow monkeypatch. It results in this error:

https://gist.github.com/brianmingus/c58533bc690516a600f6

This error is not easily fixable, because it's a bug in C code:

https://github.com/python/cpython/blob/3.4/Modules/posixmodule.c#L8048-L8065


ghost commented Jul 2, 2015

I filed a bug for this issue: http://bugs.python.org/issue24550


ghost commented Jul 2, 2015

If you hack multiprocessing in Python 3.6 then ldamulticore works!


ghost commented Jul 2, 2015

It's working for 5k topics but at 10k a new bug emerges:

2015-07-02 07:36:40,642 : INFO : initializing corpus reader from wiki_en_tfidf.mm.bz2
2015-07-02 07:36:40,681 : INFO : accepted corpus with 3831719 documents, 100000 features, 595701551 non-zero entries
2015-07-02 07:36:40,702 : INFO : using symmetric alpha at 0.0001
2015-07-02 07:36:40,709 : INFO : using serial LDA version on this node
2015-07-02 07:40:24,386 : INFO : running online LDA training, 10000 topics, 1 passes over the supplied corpus of 3831719 documents, updating every 310000 documents, evaluating every ~3100000 documents, iterating 50x with a convergence threshold of 0.001000
2015-07-02 07:40:24,412 : INFO : training LDA model using 31 processes
2015-07-02 07:43:12,940 : INFO : PROGRESS: pass 0, dispatched chunk #0 = documents up to #10000/3831719, outstanding queue size 1
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/multiprocessing/queues.py", line 241, in _feed
    obj = ForkingPickler.dumps(obj)
  File "/usr/local/lib/python3.6/multiprocessing/reduction.py", line 50, in dumps
    cls(buf, protocol).dump(obj)
OverflowError: cannot serialize a bytes object larger than 4 GiB
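A rough back-of-the-envelope check of why 5k topics might fit while 10k does not. This is only a sketch: it assumes the object being pickled is dominated by one dense `num_topics` x `num_terms` matrix of 8-byte floats, which is a guess about the payload rather than something confirmed in this thread. `dense_matrix_bytes` is a hypothetical helper.

```python
GIB = 2**30          # bytes in one GiB
NUM_TERMS = 100000   # "100000 features" from the log above


def dense_matrix_bytes(num_topics, num_terms, bytes_per_value=8):
    """Size in bytes of a dense num_topics x num_terms matrix.

    Assumption: values are 8-byte floats (float64).
    """
    return num_topics * num_terms * bytes_per_value


for num_topics in (5000, 10000):
    size = dense_matrix_bytes(num_topics, NUM_TERMS)
    # Print the topic count, the size in GiB, and whether it exceeds 4 GiB.
    print(num_topics, round(size / GIB, 2), size > 4 * GIB)
```

Under these assumptions the 5k-topic matrix comes to about 3.73 GiB, just under the 4 GiB limit in the error message, while the 10k-topic matrix is about 7.45 GiB.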

Member

piskvorky commented Jul 2, 2015

Looks like you're on a quest, Brian :)

That's exciting on its own, and will push the boundaries of what people have tried with LDA in the past. Especially if coupled with a thorough analysis of the results (human eval, not perplexity). How useful are the 10k (or 5k) topics? What is the practical applicability of such models? There is an article or two waiting in there somewhere.

Contributor

tmylk commented Jan 23, 2016

@brianmingus Do you have any interesting results to share here about breaking the topic barrier?


koustuvsinha commented Jul 17, 2017

Any updates on this? It's still an issue.


ghost commented Jul 17, 2017

If you want many topics, use an autoencoder to implement LSA. Set the size of your hidden layer to the number of topics desired.


koustuvsinha commented Jul 17, 2017

I am only asking for 25 topics and it's still failing. My individual documents are large, which I suppose is the reason.

Member

menshikh-iv commented Aug 14, 2017

@koustuvsinha you can reduce the size of your vocabulary and your batch_size to avoid this issue.
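A library-independent sketch of the two knobs mentioned above: capping the vocabulary and sending documents in smaller batches, so that each object passed between processes is smaller. `prune_vocab` and `chunks` are hypothetical helpers written for illustration; in gensim the corresponding settings are presumably the dictionary's `filter_extremes(keep_n=...)` and the model's `chunksize` argument.

```python
from collections import Counter
from itertools import islice


def prune_vocab(docs, keep_n):
    """Keep only the keep_n most frequent tokens across all documents."""
    counts = Counter(token for doc in docs for token in doc)
    keep = {token for token, _ in counts.most_common(keep_n)}
    return [[token for token in doc if token in keep] for doc in docs]


def chunks(docs, chunk_size):
    """Yield successive lists of at most chunk_size documents."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, chunk_size))
        if not chunk:
            return
        yield chunk


# Toy corpus: each document is a list of tokens.
docs = [["a", "b", "a"], ["a", "c"], ["b", "a"]]
pruned = prune_vocab(docs, keep_n=2)   # drops the rarest token, "c"
batches = list(chunks(pruned, chunk_size=2))
print(pruned)
print(len(batches))
```

A smaller vocabulary shrinks the topic-term matrix, and a smaller batch shrinks each chunk of documents sent to a worker; which of the two matters more depends on the corpus.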

Member

menshikh-iv commented Oct 3, 2017

Unfortunately, this is a limitation of Python itself, so we can't fix it.

menshikh-iv closed this Oct 3, 2017
