WordEmbeddingsKeyedVectors.add() doesn't clear vectors_norm, causing IndexError on later most_similar() #2532

Open
@gojomo

As reported in a StackOverflow question/answer: https://stackoverflow.com/a/56641265/130288

An adapted version of the asker's minimal test case (which could become a unit test):

import numpy as np
from gensim.models.keyedvectors import WordEmbeddingsKeyedVectors

kv = WordEmbeddingsKeyedVectors(vector_size=3)
kv.add(entities=['a', 'b'],
       weights=[np.random.rand(3), np.random.rand(3)])
kv.most_similar('a')  # works

kv.add(entities=['c'], weights=[np.random.rand(3)])
kv.most_similar('c')  # fails with `IndexError`

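The first most_similar() call caches unit-normalized vectors in vectors_norm, and add() does not invalidate that cache, so the later lookup indexes into a stale array that lacks the newly added vector. Clearing the vectors_norm property inside add() (with either del or assignment to None) should be sufficient to trigger recalculation on the next most_similar() call.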
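To illustrate the proposed fix without depending on a particular gensim version, here is a minimal pure-Python sketch of the same lazy-cache pattern. `TinyKeyedVectors` and its methods are hypothetical stand-ins written for this illustration; they only borrow attribute names from the issue and are not gensim's actual implementation.

```python
import math


class TinyKeyedVectors:
    """Hypothetical toy stand-in for illustration; NOT gensim's code."""

    def __init__(self):
        self.index2entity = []
        self.vectors = []
        self.vectors_norm = None  # lazily built cache of unit-length vectors

    def add(self, entities, weights):
        for entity, weight in zip(entities, weights):
            self.index2entity.append(entity)
            self.vectors.append(list(weight))
        # The fix suggested in this issue: drop the stale cache so that it
        # is rebuilt (including the new rows) on the next similarity query.
        self.vectors_norm = None

    def init_sims(self):
        # Rebuild the cache only when it is missing.
        if self.vectors_norm is None:
            self.vectors_norm = [
                [x / math.sqrt(sum(y * y for y in v)) for x in v]
                for v in self.vectors
            ]

    def most_similar(self, entity):
        self.init_sims()
        query = self.vectors_norm[self.index2entity.index(entity)]
        scores = [
            (other, sum(a * b for a, b in zip(query, vec)))
            for other, vec in zip(self.index2entity, self.vectors_norm)
            if other != entity
        ]
        return sorted(scores, key=lambda pair: pair[1], reverse=True)


kv = TinyKeyedVectors()
kv.add(['a', 'b'], [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
kv.most_similar('a')  # builds the cache with two rows

kv.add(['c'], [[1.0, 1.0, 0.0]])
result = kv.most_similar('c')  # cache was cleared, so it is rebuilt with three rows
```

If the `self.vectors_norm = None` line in `add()` is removed, the second query in this toy fails with an `IndexError`, much like the report above. As a user-side workaround until a fix lands, the issue's suggestion implies that assigning `kv.vectors_norm = None` right after calling `add()` on the real object should have the same effect, though that has not been verified here.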

Labels: Hacktoberfest, bug, difficulty easy, good first issue, impact MEDIUM, reach LOW