LDAModel DeprecationWarning under Python 3 #494

Closed
mattilyra opened this issue Oct 20, 2015 · 3 comments

@mattilyra
Contributor

I noticed that when training an LDA model under Python 3 with NumPy 1.10 or 1.9, I get a very long list of DeprecationWarnings from NumPy, so many that it kills the browser running a notebook. This doesn't happen under Py2 with NumPy 1.10.

I think it's because ids is a list, not an ndarray. The warning refers to two lines in ldamodel.py.

Both use the ids list to index an ndarray. The ids are just feature ids from the current document (?), so I tried a quick fix of changing ids to an ndarray, which makes the warnings go away. I haven't tested whether it has other consequences, but I don't see why it would.
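The quick fix described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual gensim code: the array shape, the sample document, and the names `exp_elog_beta` and `doc` are made up, and it assumes the warning appears because the ids are not integer-typed by the time they are used as an index.

```python
import numpy as np

# Hypothetical stand-in for self.expElogbeta: 3 topics x 4 vocabulary terms.
exp_elog_beta = np.arange(12, dtype=float).reshape(3, 4)

# Hypothetical document as (feature id, count) pairs.
doc = [(0, 2.0), (3, 1.0)]

# Build the ids with an explicit integer dtype so they are valid indices.
ids = np.array([feature_id for feature_id, _ in doc], dtype=np.int64)
cts = np.array([count for _, count in doc], dtype=float)

# Same indexing pattern as the line flagged in the warning.
exp_elog_beta_d = exp_elog_beta[:, ids]
```

Keeping the ids and the counts in two separate arrays lets each have its own dtype, which a single array holding both columns cannot do.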

The DeprecationWarnings (just two of them; many more are actually produced):

/Volumes/LocalDataHD/conda/envs/py34/lib/python3.4/site-packages/gensim/models/ldamodel.py:375: DeprecationWarning: non integer (and non boolean) array-likes will not be accepted as indices in the future
  expElogbetad = self.expElogbeta[:, ids]
/Volumes/LocalDataHD/conda/envs/py34/lib/python3.4/site-packages/gensim/models/ldamodel.py:401: DeprecationWarning: non integer (and non boolean) array-likes will not be accepted as indices in the future
  sstats[:, ids] += numpy.outer(expElogthetad.T, cts / phinorm)
@piskvorky
Owner

I think this may be related to this discussion: #448 (comment)

Yes, we definitely want to get rid of these warnings!

@mattilyra
Contributor Author

So the offender is actually line 561, which passes the numpyified doc(s) into .inference():

for chunk_no, chunk in enumerate(utils.grouper(corpus, chunksize, as_numpy=True)):
    reallen += len(chunk)  # keep track of how many documents we've processed so far
    if eval_every and ((reallen == lencorpus) or ((chunk_no + 1) % (eval_every * self.numworkers) == 0)):
        self.log_perplexity(chunk, total_docs=lencorpus)

Changing to as_numpy=False seems to work; my suggested change above actually created some other weirdness.
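A possible reason the flag matters, sketched with plain NumPy rather than gensim itself. The assumption here (not confirmed in this thread) is that as_numpy=True turns each document into a single NumPy array; the sample `doc` is made up.

```python
import numpy as np

# Hypothetical document as (feature id, count) pairs.
doc = [(0, 2.0), (3, 1.0)]

# Assumed effect of as_numpy=True: the whole document becomes one array,
# so the integer ids are upcast to float alongside the counts.
as_array = np.array(doc)
float_ids = as_array[:, 0]

# With as_numpy=False the document stays a list of tuples,
# and the ids remain plain Python ints.
plain_ids = [feature_id for feature_id, _ in doc]
```

If that assumption holds, the float-typed ids would be what NumPy objects to when they are used as an index, while the plain integer ids would not be affected.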

@tmylk
Contributor

tmylk commented Jan 9, 2016

These DeprecationWarnings are not present in the latest build. Closing.
