Added Function relative_cosine_similarity in keyedvectors.py #2307

rsdel2007 · 2018-12-23T13:20:59Z

Fixes #2175.
I have implemented relative_cosine_similarity as a function, following the paper and @gojomo's suggestion in the #2175 discussion.
According to the paper:
rcs(top-n) = cosine_similarity(wordA, wordB) / (sum of cosine_similarities of the top-n words most similar to wordA)
To find the top-n similar words I have used the method similar_by_word(word, topn).
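
A minimal sketch of the formula above, in plain Python on hypothetical 2-d toy vectors. The helper names (cosine, most_similar, toy_relative_cosine_similarity) and the vectors themselves are illustrative assumptions, not gensim's API or the code in this PR.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(vectors, word, topn):
    """Return the topn (other_word, cosine) pairs for `word`, best first."""
    sims = [(other, cosine(vectors[word], vec))
            for other, vec in vectors.items() if other != word]
    sims.sort(key=lambda pair: pair[1], reverse=True)
    return sims[:topn]


def toy_relative_cosine_similarity(vectors, wa, wb, topn=10):
    """cs(wa, wb) divided by the summed cs of the topn words most similar to wa."""
    sims = most_similar(vectors, wa, topn)
    return cosine(vectors[wa], vectors[wb]) / sum(sim for _, sim in sims)


# Hypothetical toy vectors, invented for illustration only.
toy_vectors = {
    "good": [1.0, 0.0],
    "nice": [0.9, 0.1],
    "fine": [0.8, 0.3],
    "bad": [0.2, 0.9],
}

rcs_nice = toy_relative_cosine_similarity(toy_vectors, "good", "nice", topn=3)
rcs_bad = toy_relative_cosine_similarity(toy_vectors, "good", "bad", topn=3)
```

When wb ranges over exactly the top-n neighbours of wa, the scores sum to 1, so the measure expresses how large a share of the neighbourhood's total similarity one word takes.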

gojomo · 2018-12-23T17:21:30Z

gensim/models/keyedvectors.py

@@ -1384,7 +1384,42 @@ def init_sims(self, replace=False):
            else:
                self.vectors_norm = (self.vectors / sqrt((self.vectors ** 2).sum(-1))[..., newaxis]).astype(REAL)

+    def relative_cosine_similarity(self, wa, wb, topn=10):
+        """Compute the relative cosine similarity between two words given top-n similar words,
+        proposed by Artuur Leeuwenberg,Mihaela Vela,Jon Dehdari,Josef van Genabith


Spaces after commas.

gojomo · 2018-12-23T17:23:17Z

gensim/models/keyedvectors.py

+        "A Minimally Supervised Approach for Synonym Extraction with Word Embeddings"
+        <https://ufal.mff.cuni.cz/pbml/105/art-leeuwenberg-et-al.pdf>.
+        To calculate relative cosine similarity between two words, equation (1) of the paper is used.
+        For WordNet synonyms, if rcs(topn=10) is greater than 0.10 than wa and wb are more similar than


Second of three 'than's on this line should actually be 'then' (consequently) not 'than' (comparative).

Yeah...Thanks:).

gojomo · 2018-12-23T17:26:02Z

gensim/models/keyedvectors.py


+        return rcs


Need blank line before next method.

gojomo · 2018-12-23T17:27:48Z

gensim/models/keyedvectors.py

+        """
+
+        result = self.similar_by_word(wa, topn)
+        topn_words = []


topn_words is never needed to calculate results - so no good reason to create.

Okay...Done.

gojomo · 2018-12-23T17:29:13Z

gensim/models/keyedvectors.py

+            relative cosine similarity between wa and wb.
+        """
+
+        result = self.similar_by_word(wa, topn)


As this is a list of results, using a plural variable name would be slightly better. Also, it's common in the existing gensim code to call the list-of-most-similar-items sims (short for 'similars'), so I'd recommend that variable name here.

Done. sims is used as variable name.

gojomo · 2018-12-23T17:34:27Z

gensim/models/keyedvectors.py

+
+        topn_cosine = np.array(topn_cosine)
+
+        norm = np.sum(topn_cosine)


norm isn't a good name here, as it usually means something other than a sum.

But, there's not really a need to loop-append, convert-to-np-array, or put the sum calculation in a local variable. The sum can be a short, idiomatic calculation at the place where it's needed as the denominator of the final return-value calculation, for example just: sum(result[1] for result in sims).
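
A small sketch of the simplification suggested here, assuming sims is a list of (word, similarity) pairs. The pairs below are hypothetical values, and the loop version is a numpy-free stand-in for the draft's loop-append approach.

```python
# Hypothetical (word, similarity) pairs, shaped like a list of most-similar results.
sims = [("nice", 0.75), ("fine", 0.5), ("decent", 0.25)]

# Loop-append version: build an intermediate list, then sum it.
topn_cosine = []
for result in sims:
    topn_cosine.append(result[1])
loop_total = sum(topn_cosine)

# Generator version: the sum is computed right where it is needed.
generator_total = sum(result[1] for result in sims)
```

Both totals are identical; the generator form avoids the temporary list and the extra local variable.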

gojomo · 2018-12-23T17:38:13Z

Thanks for your contribution! I've added some line-by-line comments; there should also be a unit test method that confirms expected results in some minimal way.

Automatic tests are failing, but that appears unrelated to your changes - @menshikh-iv, it again looks related to some doc-building command.

rsdel2007 · 2018-12-24T04:07:18Z

Yeah, sure I will make the changes.

rsdel2007 · 2018-12-24T15:48:52Z

@gojomo can you help me with how to write a unit test?

gojomo · 2018-12-24T18:29:16Z

@rsdel2007 Take a look at the existing tests in https://github.com/RaRe-Technologies/gensim/blob/develop/gensim/test/test_keyedvectors.py as models – and make something that at least minimally uses the new function, and verifies its results according to some idea of where it's beneficial. (There might be ideas in the original paper proposing relative-cosine-similarity about the kinds of word-comparisons where it'd be meaningful. Matching any results claimed there, even in a tiny way, would both help confirm proper operation and highlight proper use.)

While you are developing your test locally, you can run all the tests in test_keyedvectors.py by executing the file itself – its main() runs its own tests. After you commit code, the multiple automatic testing services used by the gensim project will run it along with all others.

rsdel2007 · 2018-12-26T10:25:50Z

I have written the unit test in test_keyedvectors.py, but when I run the code I get this error.
Can you help me here, @gojomo?

gojomo · 2018-12-26T18:41:43Z

It is almost always better to cut & paste actual error text than to use a screenshot - it ensures the text will appear in search results, etc.

Are you sure you're executing the tests inside the same active environment (finding the same working copy of gensim) as where you added relative_cosine_similarity()? I presume you've run the method separately in your own ad-hoc tests... how did you run it then? Can you still run it there?

rsdel2007 · 2018-12-27T08:29:19Z

Yes, I am sure that both are in the same environment. I have checked twice, but I am still stuck on this.

horpto · 2018-12-27T10:21:27Z

@rsdel2007
Add your tests to the PR even if they do not pass yet.

rsdel2007 · 2018-12-27T12:21:26Z

@horpto, @gojomo I have added the test. Please tell me what changes I should make.

gojomo · 2018-12-27T18:16:37Z

The test is not succeeding - see for example https://travis-ci.org/RaRe-Technologies/gensim/jobs/472632483

But, you absolutely need to figure out locally how to run the tests without the has no attribute 'relative_cosine_similarity' error appearing. Without being at your system, I don't have further ideas, except to say that the error suggests that the python/library-search-directories you're running test_keyedvectors.py with are not those of your development python-env, but of some other one, such as the system default. (You should be using separate envs, as are typically created by virtualenv or venv, to keep projects' libraries separate during development.) You could try looking up other unit-test tutorials or virtual-env tutorials that may be specific to the "visual studio" tool you appear to be using.

The unit-test needs to pass and make sense according to the supposed benefits of this new calculation, ideally by matching the claims/examples of the paper in which it originated.
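
One way to check the environment mismatch suspected above is to ask Python which interpreter is running and which file a module would be imported from. This is a sketch using only the standard library; the helper names are hypothetical, and "json" stands in for the package under development so the snippet runs anywhere.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file Python would import `name` from, or None if it is not found."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


def describe_environment(name):
    """Report the running interpreter and the import location of `name`."""
    return {"interpreter": sys.executable, "module_path": locate_module(name)}


# "json" is a placeholder; in a real check, pass the package being developed.
print(describe_environment("json"))
```

If the reported module path does not point into the working copy that contains the new method, the tests are importing a different installation.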

rsdel2007 · 2018-12-28T17:08:48Z

Thanks @gojomo, I figured out the problem.
One problem I am running into while preparing the unit test is that the topn most similar words for a given word are not the same from most_similar as from WordNet. That's why I am getting some differences from the paper.
So what do you think I should do?

gojomo · 2018-12-29T00:17:12Z

Can you fashion a test which probes/demonstrates the same advantages the paper claims for this measure, even if the unit-test environment only has access to much smaller corpuses/vector-sets?

rsdel2007 · 2018-12-29T12:43:07Z

@gojomo I have added the test.

rsdel2007 · 2018-12-30T09:09:49Z

@gojomo @menshikh-iv, Please take a look.

piskvorky

Code style.

gensim/models/keyedvectors.py

rsdel2007 · 2018-12-31T12:25:14Z

@piskvorky I have made the changes. Please take a look.

gojomo · 2018-12-31T15:48:08Z

gensim/test/test_keyedvectors.py

+                cos_sim.append(self.vectors.similarity("good", wordnet_syn[i]))
+        cos_sim = sorted(cos_sim, reverse=True)  # cosine_similarity of "good" with wordnet_syn in decreasing order
+        # computing relative_cosine_similarity of two similar words
+        rcs_wordnet = self.vectors.similarity("good", "nice") / sum(cos_sim[i] for i in range(10))


I'm not sure what this is calculating. It's kind of like the relative_cosine_similarity() formula, but now with only WordNet synonyms as contributors to the denominator. And, only those synonyms which happen to be in this vector-set. Are all those words in the euclidean_vectors.bin test vectors set? As a result, I'm not sure what the following asserts really test. Is this matching something in the paper?

Actually, this is the problem I found while writing the test. There is no specific claim or exact result in the paper to match, and I can't find any way to confirm the measure on a corpus other than WordNet, so I think the best way is to compare the relative_cosine_similarity of WordNet synonyms with that of the most_similar words, within a tolerance of 0.125.

Let me explain the insights of the paper's section on relative cosine similarity:

Construct a set of the top 10 most (cosine) similar words for w1 (called topn in the paper).

Calculate a normalized score for each of the words in the topn, by dividing by the sum of the topn cosine similarity scores.

They mostly wanted to know whether the most similar word of w1 was a synonym or not, as opposed to a hypernym etc. They expected that if the most (cosine) similar word is a lot more (cosine) similar than the other words in the topn, it is more likely to be a synonym than if it is only slightly more similar. This is what rcs takes into account.
So the conclusion they reach, which is the only claim in the paper, is that if a word pair has an rcs greater than 0.10, then it is more similar than an arbitrary word pair.
0.10 can be used as a threshold, but this result is based on the WordNet corpus. On a small corpus the value may be lower. The threshold is nothing but the mean of the cosine similarities of the topn words, so on a small corpus it may be anything less than 0.10.

@gojomo, can you suggest a better way to test this, given the description above?

I am looking forward to contributing the tests.

I don't know offhand what's in euclidean_vectors.bin - if there's any overlap with the wordnet words you've chosen. But if there's one or more word-and-nearest-neighbor pairs in that set-of-word-vectors (or some other available-at-unit-testing set-of-word-vectors) that the RCS measure successfully identifies as synonyms, and one or more other word-and-nearest-neighbor pairs that the RCS measure also successfully rejects as synonyms, then having the test method show that functionality would be useful as a demonstration/confirmation of the RCS functionality. (And, at least a little, a guard against any future regressions where that breaks due to other changes... which seems unlikely here, but is one of the reasons for ensuring this kind of test coverage.)

Maybe @viplexke, who originally suggested this in #2175, has some other application/test ideas?
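
A sketch of the accept/reject style of test described above, on hypothetical toy data rather than euclidean_vectors.bin. The word names, the angles, and the 0.5 cut-off are all invented for illustration and do not come from the paper or from gensim. Only the structure is the point: one neighbour that stands out from its neighbourhood, and one that sits in a crowd of near-equals.

```python
import math
import unittest

# Hypothetical unit vectors, given as angles in degrees: "hub" has one standout
# neighbour ("twin"), while "c0" sits among several near-equal neighbours.
ANGLES = {"hub": 0, "twin": 5, "far1": 80, "far2": 85,
          "c0": 180, "c1": 190, "c2": 170, "c3": 195}


def cosine(wa, wb):
    """For unit vectors, cosine similarity is the cosine of the angle between them."""
    return math.cos(math.radians(ANGLES[wa] - ANGLES[wb]))


def relative_cosine_similarity(wa, wb, topn):
    """cs(wa, wb) divided by the summed cs of wa's topn nearest neighbours."""
    sims = sorted((cosine(wa, w) for w in ANGLES if w != wa), reverse=True)
    return cosine(wa, wb) / sum(sims[:topn])


class TestRelativeCosineSimilaritySketch(unittest.TestCase):
    THRESHOLD = 0.5  # hypothetical cut-off, chosen for this toy data only

    def test_standout_neighbour_is_accepted(self):
        self.assertGreater(relative_cosine_similarity("hub", "twin", 3), self.THRESHOLD)

    def test_crowded_neighbour_is_rejected(self):
        self.assertLess(relative_cosine_similarity("c0", "c1", 3), self.THRESHOLD)


def run_sketch_tests():
    """Run the tests above and return the result object without exiting."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestRelativeCosineSimilaritySketch)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

A real test would replace the toy angles with word pairs from the vector set available at unit-testing time, and would need a threshold justified by that data.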

gojomo · 2018-12-31T15:49:44Z

gensim/test/test_keyedvectors.py

+        self.assertTrue(np.allclose(rcs_wordnet, rcs, 0, 0.125))
+        # computing relative_cosine_similarity for two non-similar words
+        rcs = self.vectors.relative_cosine_similarity("good", "worst", 10)
+        self.assertTrue(rcs < 0.10)


Is 0.10 an important threshold from the paper, or just chosen because it works? Is this sort of contrast – between a word good and a near-antonym worst – the sort of thing RCS is supposed to be good for?

viplexke · 2019-01-04T17:42:57Z

Hi, according to the paper, the precision of rcs should go up as the similarity threshold increases, in contrast with plain cosine similarity, which was the main motivation behind rcs. So I thought that calculating such a precision curve for one or two top-10 cosine similarity sets could prove its usefulness. Testing on a few individual pairs may work as well, using the fact that rcs gives lower values than plain cs for co-hyponyms and related words (false positives). I hope that helps. I'll try to provide test cases.


rsdel2007 · 2019-01-06T14:23:27Z

@viplexke can you provide test cases?

viplexke · 2019-01-08T16:14:40Z

"""test cases in [word, synonym, antonym, co-hyponym, related word] form"""

Here are 5 dictionaries, each with 5 hierarchical category keys.

dic_mother = {"word": "mother", "synonym": "mom", "antonym": "father", "co-hyponym": "father", "related_word": "birth"}
dic_cautious = {"word": "cautious", "synonym": "careful", "antonym": "careless", "co-hyponym": "shrewd", "related_word": "danger"}
dic_guy = {"word": "guy", "synonym": "dude", "antonym": "girl", "co-hyponym": "bachelor", "related_word": "son"}
dic_white = {"word": "white", "synonym": "milky", "antonym": "black", "co-hyponym": "blue", "related_word": "blank"}
dic_bend = {"word": "bend", "synonym": "curl", "antonym": "straighten", "co-hyponym": "stretch", "related_word": "river"}

For each case, we can calculate 6 values (three ratios, each with CS and with RCS):

(R)CS(x.word, x.synonym) / (R)CS(x.word, x.antonym)
(R)CS(x.word, x.synonym) / (R)CS(x.word, x.co-hyponym)
(R)CS(x.word, x.synonym) / (R)CS(x.word, x.related_word)

where (R)CS is the (relative) cosine similarity, and each ratio with RCS should be higher than the respective ratio with CS:

CS(x.word, x.synonym) / CS(x.word, x.antonym) < RCS(x.word, x.synonym) / RCS(x.word, x.antonym)

I think this shows that it does what it was designed for, and that it's not broken. If this fails significantly, then I was wrong :)


viplexke · 2019-01-08T16:17:39Z

"each ratio with RCS should be higher than the respective ratio with CS" - of course I don't mean this has to hold universally. The more cases where it holds, the better.


rsdel2007 · 2019-01-09T17:17:19Z

@viplexke according to the paper,
rcs(wa, wb, topn) = cs(wa, wb) / (sum of cosine_similarities of the top-n words most similar to wa).
So, rcs(x.word, x.synonym) / rcs(x.word, x.antonym) will be equal to cs(x.word, x.synonym) / cs(x.word, x.antonym), because both rcs values share the same denominator.

CS(x.word, x.synonym) / CS(x.word, x.antonym) < RCS(x.word, x.synonym) / RCS(x.word, x.antonym)

So how can this prove the functionality?
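
The cancellation behind this objection can be written out explicitly. The symbols here are introduced only for illustration: w is the query word, s its synonym, a its antonym, and TOP_n(w) the set of the n words most similar to w.

```latex
\[
\frac{\mathrm{rcs}_n(w, s)}{\mathrm{rcs}_n(w, a)}
= \frac{\mathrm{cs}(w, s) \,/\, \sum_{c \in \mathrm{TOP}_n(w)} \mathrm{cs}(w, c)}
       {\mathrm{cs}(w, a) \,/\, \sum_{c \in \mathrm{TOP}_n(w)} \mathrm{cs}(w, c)}
= \frac{\mathrm{cs}(w, s)}{\mathrm{cs}(w, a)}
\]
```

Both rcs values use the same first word and the same n, so the two sums are identical and cancel. The ratio of rcs values therefore always equals the ratio of plain cosine similarities and cannot be strictly greater.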

viplexke · 2019-01-10T13:03:58Z

You're right, I skipped the sum. The formula is CS(x.word, x.synonym) / CS(x.word, x.antonym) < RCS(x.word, x.synonym, dictionary) / RCS(x.word, x.antonym, dictionary), where dictionary corresponds to topn.


rsdel2007 · 2019-01-10T13:09:44Z

@viplexke Can you please check again. It seems same to me again.

rsdel2007 · 2019-01-10T15:18:45Z

@viplexke Since this feature is not merged yet, you can post it here.
wdyt @gojomo ?

viplexke · 2019-01-10T15:48:09Z

@viplexke Can you please check again. It seems same to me again.

You're right, sorry about that.
I'll find another test and change the test function accordingly.

menshikh-iv

Nice, thanks @rsdel2007, I like the current PR. When @gojomo is satisfied, I'll merge it (ping me, Gordon).

menshikh-iv · 2019-01-11T09:50:46Z

gensim/models/keyedvectors.py

@@ -195,6 +195,7 @@ class Vocab(object):
    and for constructing binary trees (incl. both word leaves and inner nodes).

    """
+


Unrelated changes, please revert all of them (keep the PR compact).

menshikh-iv · 2019-01-11T09:51:51Z

gensim/models/keyedvectors.py

@@ -1384,12 +1387,42 @@ def init_sims(self, replace=False):
            else:
                self.vectors_norm = (self.vectors / sqrt((self.vectors ** 2).sum(-1))[..., newaxis]).astype(REAL)

+    def relative_cosine_similarity(self, wa, wb, topn=10):
+        """Compute the relative cosine similarity between two words given top-n similar words,
+        proposed by Artuur Leeuwenberg, Mihaela Vela, Jon Dehdari, Josef van Genabith


to make a proper link in the doc, please use

by `Artuur Leeuwenberg, ... <https://ufal.mff.cuni.cz/pbml/105/art-leeuwenberg-et-al.pdf>`_

Okay..Done.

menshikh-iv · 2019-01-11T09:52:38Z

gensim/models/keyedvectors.py

+        Parameters
+        ----------
+        wa: str
+            word for which we have to look top-n similar word.


Sentences should start with an uppercase letter.

menshikh-iv · 2019-01-11T11:01:55Z

gensim/models/keyedvectors.py

+        """
+        sims = self.similar_by_word(wa, topn)
+        assert sims, "Failed code invariant: list of similar words must never be empty."
+        rcs = (self.similarity(wa, wb)) / (sum(result[1] for result in sims))


No need to wrap the left part in ().

Actually, prepend float if this is meant to be a float division. Both to avoid potential errors due to integer operands in python2, and to make the intent clear.

Also, can you please unpack result into appropriately named variables, instead of writing result[1]?

Suggested change

-        rcs = (self.similarity(wa, wb)) / (sum(result[1] for result in sims))
+        rcs = float(self.similarity(wa, wb)) / sum(sim for _, sim in sims)

@menshikh-iv cool! You have to teach me how to do that :)

menshikh-iv · 2019-01-11T11:02:27Z

gensim/test/test_keyedvectors.py

+    def test_relative_cosine_similarity(self):
+        """Test relative_cosine_similarity returns expected results with an input of a word pair and topn"""
+        wordnet_syn = ['good', 'goodness', 'commodity', 'trade_good', 'full', 'estimable', 'honorable',
+        'respectable', 'beneficial', 'just', 'upright', 'adept', 'expert', 'practiced', 'proficient',


Format it properly, please, like

wordnet_syn = [
    'good', 'goodness', 'commodity', 'trade_good', 'full', 'estimable', 'honorable',
    'respectable', 'beneficial', 'just', 'upright', 'adept', 'expert', 'practiced', 'proficient',
    'skillful', 'skilful', 'dear', 'near', 'dependable', 'safe', 'secure', 'right', 'ripe', 'well',
    'effective', 'in_effect', 'in_force', 'serious', 'sound', 'salutary', 'honest', 'undecomposed',
    'unspoiled', 'unspoilt', 'thoroughly', 'soundly',
]

menshikh-iv · 2019-01-14T07:50:06Z

gensim/models/keyedvectors.py

@@ -1385,6 +1385,36 @@ def init_sims(self, replace=False):
            logger.info("precomputing L2-norms of word weight vectors")
            self.vectors_norm = _l2_norm(self.vectors, replace=replace)

+    def relative_cosine_similarity(self, wa, wb, topn=10):
+        """Compute the relative cosine similarity between two words given top-n similar words,
+        by Artuur Leeuwenberg, ... "A Minimally Supervised Approach for Synonym Extraction with Word Embeddings"


should be proper link rendered by sphinx, in this case

by `Artuur ..... <https://ufal.mff.cuni.cz/pbml/105/art-leeuwenberg-et-al.pdf>`_

No, this was an example; it is still incorrect (and still missing ` and _).

by `Artur <another authors, paper name> <URL_LINK>`_

Do I have to include the other authors and the paper name in < > and add the link in a separate < >?

Just replace it with this:

`Artuur Leeuwenberg, Mihaela Vela, Jon Dehdari, Josef van Genabith "A Minimally Supervised Approach for Synonym Extraction with Word Embeddings" <https://ufal.mff.cuni.cz/pbml/105/art-leeuwenberg-et-al.pdf>`_.

Thanks 😄
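
For reference, a docstring-only sketch of how the link form discussed above sits inside a Python docstring. The function name is a hypothetical stub, not the method from this PR. The reStructuredText pattern is a backquote, the link text, the URL in angle brackets, a closing backquote, and a trailing underscore.

```python
def relative_cosine_similarity_stub(wa, wb, topn=10):
    """Compute the relative cosine similarity between two words given top-n similar words,
    by `Artuur Leeuwenberg, Mihaela Vela, Jon Dehdari, Josef van Genabith
    "A Minimally Supervised Approach for Synonym Extraction with Word Embeddings"
    <https://ufal.mff.cuni.cz/pbml/105/art-leeuwenberg-et-al.pdf>`_.

    """
    # Hypothetical stub: only the docstring matters for this illustration.
    raise NotImplementedError("docstring-only illustration")
```

Sphinx renders everything before the angle brackets as the visible link text and the bracketed part as the target.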

menshikh-iv · 2019-01-14T07:51:10Z

@gojomo how do you feel about this PR? It looks good to me; I'm waiting for approval from you (or notes about what still needs to be fixed).

gojomo · 2019-01-14T21:21:19Z

Though the difficulty in demonstrating/testing the value of this calculation has me more doubtful than initially of its value, I suspect the simple implementation is correct, and the test is OK/passing, and there's little risk of harm to users who don't use it, so merging is OK with me!

menshikh-iv · 2019-01-15T02:53:57Z

congratz @rsdel2007 👍

rsdel2007 added 2 commits December 23, 2018 17:52

Added Function relative_cosine_similarity

a0ed2a7

Updated function relative_cosine_similarity

9095f3b

gojomo requested changes Dec 23, 2018

Updated Function relative_cosine_similarity

62015f6

Added test for relative_cosine_similarity

e637468

Added unit test for relative_cosine_similarity

96d2114

piskvorky requested changes Dec 31, 2018

rsdel2007 added 2 commits December 31, 2018 17:33

Update keyedvectors.py

ac7580d

Updated test_relative_cosine_similarity

1633f3f

gojomo reviewed Dec 31, 2018

rsdel2007 added 3 commits January 3, 2019 18:59

updated function relative_cosine_similarity

92d4444

Updated relative_cosine_similarity

80322a7

Update keyedvectors.py

ad3394c

piskvorky requested changes Jan 4, 2019

Update keyedvectors.py

cf6ee20

rsdel2007 added 2 commits January 9, 2019 18:11

Merge remote-tracking branch 'origin/develop' into rcs

613d20c

Merge remote-tracking branch 'origin/develop' into rcs

84ab7b7

menshikh-iv suggested changes Jan 11, 2019

rsdel2007 and others added 3 commits January 11, 2019 23:28

Update keyedvectors.py

778d01d

Update test_keyedvectors.py

04dec59

Merge remote-tracking branch 'origin/develop' into rcs

a89ee1e

menshikh-iv approved these changes Jan 14, 2019

menshikh-iv reviewed Jan 14, 2019

Added link properly

36903a3

piskvorky requested changes Jan 14, 2019

Update keyedvectors.py

10a28be

menshikh-iv merged commit a864e02 into piskvorky:develop Jan 15, 2019

rsdel2007 deleted the rcs branch January 15, 2019 12:49

