Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DocvecsArray getitem int64 bug #1231

Closed
knownact opened this issue Mar 22, 2017 · 3 comments · Fixed by #1254
Closed

DocvecsArray getitem int64 bug #1231

knownact opened this issue Mar 22, 2017 · 3 comments · Fixed by #1254

Comments

@knownact
Copy link

def __getitem__(self, index):
    """
    Accept a single key (int or string tag) or list of keys as input.

    If a single string or int, return designated tag's vector
    representation, as a 1D numpy array.

    If a list, return designated tags' vector representations as a
    2D numpy array: #tags x #vector_size.
    """
    if isinstance(index, string_types + (int,)):
        return self.doctag_syn0[self._int_index(index)]


    return vstack([self[i] for i in index])

should be

    if isinstance(index, integer):
        return self.doctag_syn0[self._int_index(index)]
@gojomo
Copy link
Collaborator

gojomo commented Mar 22, 2017

Why? Do you have an example where the chabge works better? Does the change still support indexed-lookup of string doctags?

@knownact
Copy link
Author

knownact commented Mar 23, 2017

it is because index here may be the numpy.int64 type, and the code here will hurt the indexed-lookup of string doctag, so maybe add a addition compare is ok, like:

from numpy import integer

if isinstance(index, string_types + (int,)) or isinstance(index, string_types + (integer,)) :
    return self.doctag_syn0[self._int_index(index)]

@gojomo
Copy link
Collaborator

gojomo commented Mar 23, 2017

I see. A PR with a test for the new case this enables, & the fix, would be welcome.

bogdanteleaga added a commit to bogdanteleaga/gensim that referenced this issue Mar 31, 2017
bogdanteleaga added a commit to bogdanteleaga/gensim that referenced this issue Apr 29, 2017
tmylk pushed a commit that referenced this issue May 2, 2017
* doc2vec: allow indexing with np.int64 -- fixes #1231

* doc2vec: use assertEqual instead of assertEquals

* Do integer checks using both `six.integer_types` and `np.integer`

* Add more tests for np.int64 indexing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants