ZeroDivisionError in Doc2Vec.build_vocab #518

Closed
jgc128 opened this issue Nov 10, 2015 · 3 comments

jgc128 commented Nov 10, 2015

When I call model.build_vocab(docs_iterator), I sometimes get a ZeroDivisionError:

Traceback (most recent call last):
  File "patient_notes_doc2vec.py", line 45, in <module>
    model.build_vocab(docs_iterator)
  File "/data2/aromanov/virt_env/theano/lib/python3.4/site-packages/gensim/models/word2vec.py", line 495, in build_vocab
    self.scan_vocab(sentences, trim_rule=trim_rule)  # initial survey
  File "/data2/aromanov/virt_env/theano/lib/python3.4/site-packages/gensim/models/doc2vec.py", line 634, in scan_vocab
    interval_rate = (total_words - interval_count) / (default_timer() - interval_start)
ZeroDivisionError: float division by zero

A few lines above in doc2vec.py, interval_start is defined as interval_start = default_timer(), and in the for loop we have

interval_rate = (total_words - interval_count) / (default_timer() - interval_start)

Perhaps my system is too fast 😃
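
For readers following along, the failure can be sketched in isolation. This is only an illustration, not gensim's code: `progress_rate` and the fake `clock` are hypothetical names, and the word counts are made up. It assumes a timer whose two consecutive readings are identical, which is what a coarse clock can produce.

```python
def progress_rate(total_words, interval_count, now, interval_start):
    # Same arithmetic as the line shown in the traceback.
    return (total_words - interval_count) / (now - interval_start)


# Hypothetical coarse clock: two consecutive readings return the same value.
readings = iter([100.0, 100.0])


def clock():
    return next(readings)


interval_start = clock()
raised = False
try:
    progress_rate(5000, 0, clock(), interval_start)
except ZeroDivisionError:
    # Elapsed time was exactly 0.0, so the division fails.
    raised = True
```

With a finer-grained clock the second reading would normally be slightly larger, which is why the error only shows up occasionally.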

gojomo self-assigned this Nov 10, 2015
gojomo (Collaborator) commented Nov 10, 2015

Too fast... or the clock too coarse!

This definitely shouldn't happen; I'll likely add a +0.001 (or similar). Then the rate-estimate may occasionally be off, but the code won't error.
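
A minimal sketch of the fudge idea described above, under stated assumptions: `safe_rate` and its parameters are hypothetical names for illustration, not the function gensim uses, and the default of 0.001 is just the value floated in this comment. The point is only that a small constant in the denominator keeps the division defined when the clock reports zero elapsed time.

```python
from timeit import default_timer


def safe_rate(word_delta, interval_start, now=None, fudge=0.001):
    """Words per second, with a small fudge so zero elapsed time cannot divide by zero.

    Hypothetical helper: the name, signature, and default fudge are assumptions.
    """
    if now is None:
        now = default_timer()
    elapsed = now - interval_start
    # The fudge slightly underestimates the rate but avoids ZeroDivisionError.
    return word_delta / (elapsed + fudge)


# Zero elapsed time no longer raises; the estimate is just very large.
rate_when_clock_stalls = safe_rate(5000, interval_start=100.0, now=100.0)
```

The trade-off is the one already noted: for very short intervals the estimate is skewed by the fudge, while for intervals of a second or more the effect is negligible.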

Thanks for the report!

gojomo (Collaborator) commented Nov 13, 2015

Also reported by @kikohs in #529.

tmylk (Contributor) commented Jan 9, 2016

@gojomo Pinging this issue. Is it an easy fix that can be included in this month's release?

tmylk added a commit that referenced this issue Jan 23, 2016
fixes #518: tiny 10µsec fudge against 0 elapsed