Topic worth investigating over: 'vector rejection' #595

shirish93 · 2016-01-28T23:59:14Z

@benSchmidt has written an interesting blog post on the use of a method he calls 'vector rejection' to separate words with ambiguous meanings.

While experimenting with a Nepali news corpus, I found his method more useful for discarding unwanted vectors than the existing approach with most_similar.

I have recreated his method (which he wrote in R) in this gist and have been working with it for the last few days. In my (admittedly limited) series of experiments it seems to have quite a lot of value. Yoav Goldberg has a Twitter thread about the operation and the post here.
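For readers unfamiliar with the operation, here is a minimal sketch of vector rejection as it is usually defined in linear algebra: the part of a vector `a` that remains after removing its projection onto `b`. This is not the code from the gist or the blog post. The helper names (`dot`, `reject`) and the toy 3-d vectors are made up for illustration, and plain Python lists stand in for real word vectors.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))


def reject(a, b):
    """Return the component of `a` orthogonal to `b` (the rejection of a from b).

    Computed as a - ((a . b) / (b . b)) * b.
    """
    bb = dot(b, b)
    if bb == 0:
        raise ValueError("cannot reject from a zero vector")
    scale = dot(a, b) / bb
    return [x - scale * y for x, y in zip(a, b)]


# Hypothetical toy vectors, not taken from any trained model.
bank = [1.0, 1.0, 0.0]
river = [0.0, 1.0, 0.0]

bank_without_river = reject(bank, river)
print(bank_without_river)              # [1.0, 0.0, 0.0]
print(dot(bank_without_river, river))  # 0.0
```

The result is orthogonal to `river`, so ranking the vocabulary by similarity to it should no longer favour words that are close to `bank` only through the `river` direction. With a trained model, the same arithmetic would presumably be applied to the model's word vectors before the nearest-neighbour lookup.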

I bring this up in case someone wants to look it over and see whether it aligns with the project. Please close the issue if you believe otherwise.

Edit: corrected link.

piskvorky · 2016-01-29T00:50:42Z

This is very interesting, thanks for the tip @shirish93 !

menshikh-iv added the feature, difficulty medium, and wishlist labels on Oct 2, 2017
