Per user report on SO, neither assignment to a bracketed-access (as would be implemented by __setitem__()) nor use of the add() method will successfully mutate a Doc2VecKeyedVectors object.
Looking closer, it seems the superclass __setitem__() passes through to the superclass add(), which was only ever implemented for word-centric sets of vectors. That add() consults and updates properties like .vocab, which exist only as empty values in Doc2VecKeyedVectors because of the currently confused inheritance created by #1777.
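To make the suspected failure mode concrete, here is a toy sketch in plain Python. It is not gensim's code: the class names WordVectors and DocVectors and their internals are hypothetical stand-ins, assumed only to mirror the shape of the problem described above, where an inherited add() writes to word-oriented storage that the document-oriented lookup never reads.

```python
class WordVectors:
    """Toy stand-in for a word-centric vector store (hypothetical, not gensim)."""

    def __init__(self):
        self.vocab = {}    # key -> row index
        self.vectors = []  # one row per key

    def add(self, key, vector):
        # Only the word-oriented structures are updated here.
        self.vocab[key] = len(self.vectors)
        self.vectors.append(vector)

    def __setitem__(self, key, vector):
        # Bracketed assignment simply passes through to add().
        self.add(key, vector)


class DocVectors(WordVectors):
    """Toy stand-in for a doc-centric store that inherits the word-centric add()."""

    def __init__(self):
        super().__init__()
        self.doctags = {}       # tag -> row index
        self.vectors_docs = []  # one row per document tag

    def __getitem__(self, tag):
        # Lookups read the doc-oriented structures only.
        return self.vectors_docs[self.doctags[tag]]


docs = DocVectors()
docs["doc_0"] = [0.1, 0.2]  # lands in .vocab/.vectors via the inherited add()

try:
    docs["doc_0"]
    lookup_ok = True
except KeyError:
    lookup_ok = False  # .doctags was never updated, so the lookup fails
```

In this toy version the assignment appears to succeed, but the new entry is invisible to document lookups, which matches the symptom reported on SO.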
As an addition to the SO post, I want to add new documents to the model.
It seems this should be done with the add() method, but since that is not working, I worked out the following workaround:
import numpy as np
from gensim.models.doc2vec import Doc2Vec

# PATH_to_model, new_vec and new_identifier are assumed to be defined elsewhere
model = Doc2Vec.load(PATH_to_model)
# Add vector and identifier to original values
model.docvecs.vectors_docs = np.vstack([model.docvecs.vectors_docs, new_vec])
model.docvecs.index2entity.append(new_identifier)
# Test if new document is included
model.docvecs.most_similar(positive=[new_vec])
Calling the most_similar() method returns results including this new document, also after saving and loading the model. So it seems to work.
My question is whether this is a 'correct' way of working around this bug, or if I am missing something.
@ThijsKranenburg - If it works for your purposes, it's good enough! Note, though, that you've not yet done enough to look up the new vectors by identifier: that would also require adding entries to the model.docvecs.doctags dict. And the possible effects of such a workaround on any further training are unclear.
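As a rough illustration of that bookkeeping, the sketch below keeps the three structures named in this thread (vectors_docs, index2entity, doctags) in step on a toy object. Everything here is an assumption for illustration: ToyDocvecs, add_document and lookup are hypothetical helpers, plain lists stand in for the NumPy array, and doctags maps a tag straight to its row offset, which may be simpler than what the real doctags dict stores.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDocvecs:
    """Hypothetical stand-in for the three structures discussed above."""
    vectors_docs: list = field(default_factory=list)  # one row per document
    index2entity: list = field(default_factory=list)  # row offset -> tag
    doctags: dict = field(default_factory=dict)       # tag -> row offset (simplified)


def add_document(store, tag, vector):
    """Append a vector and register its tag in every parallel structure."""
    if tag in store.doctags:
        raise KeyError(f"tag already present: {tag!r}")
    offset = len(store.vectors_docs)
    store.vectors_docs.append(list(vector))
    store.index2entity.append(tag)
    store.doctags[tag] = offset  # the step the workaround above leaves out
    return offset


def lookup(store, tag):
    """Fetch a vector by tag; this only works if doctags was updated."""
    return store.vectors_docs[store.doctags[tag]]


store = ToyDocvecs()
first = add_document(store, "doc_a", [0.1, 0.2])
second = add_document(store, "doc_b", [0.3, 0.4])
```

With all three structures updated together, lookup(store, "doc_b") finds the new row by identifier. Whether the same approach is safe on a real model, particularly before further training, remains as unclear as noted above.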