Thanks for the great code and paper! I have a question regarding the form of the h function. I have a huge dataset, so it's impossible to store all embeddings in memory; instead, I decided to increase the batch size and mine negatives from it.

So far so good, but from my understanding, because of the large dataset size the numerator almost equals the denominator and h approaches 1.

Do you think it's a good idea to replace h with an angular similarity between embeddings instead of the ratio proposed in the paper? Or could you kindly propose some other appropriate choice for h?
brotherofken changed the title from "h-function for infinite dataset" to "Form of the h function for infinite dataset" on Dec 27, 2019
Thanks for your interest. For eq. 19 in the paper, h will work automatically if you set N and M to the number of negatives paired with each positive and the dataset size, respectively. h approaches 1 at the beginning, but it is adjusted very quickly as training proceeds. This is how NCE works.
Angular similarity might also work, but it loses the spirit of the posterior probability in NCE.
I read the paper carefully and found that I had missed that the temperature in (19) has to be quite low (0.02-0.3 in your experiments) to compensate for the small value of the N/M ratio.
That is clear now. Thanks!
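To make the temperature point concrete, here is a minimal sketch of the kind of NCE posterior being discussed. The exact form of eq. (19) is not reproduced in this thread, so the function below assumes the generic NCE shape h = p / (p + N * p_n), with an unnormalized score p = exp(sim / tau) and a uniform noise probability p_n = 1/M; the function name and all numbers are illustrative only.

```python
import math

def h_posterior(sim, tau, n_negatives, dataset_size):
    """NCE-style posterior that a candidate is a true match rather than noise.

    Hypothetical reconstruction of the eq. (19) shape discussed above:
    h = p / (p + N * p_n), with unnormalized score p = exp(sim / tau)
    and uniform noise probability p_n = 1 / M.
    """
    p = math.exp(sim / tau)           # model score for the pair
    p_noise = 1.0 / dataset_size      # uniform noise probability (1/M)
    return p / (p + n_negatives * p_noise)

# With a large dataset M, the ratio N/M is tiny, so at tau = 1 even a
# negative pair (sim < 0) gets h close to 1 -- the behaviour raised
# in the question above.
print(h_posterior(-0.2, 1.0, 4096, 1_000_000))   # close to 1

# A low temperature shrinks exp(sim / tau) for negatives until it is
# comparable to N/M, pulling h back toward 0.
print(h_posterior(-0.2, 0.02, 4096, 1_000_000))  # close to 0
```

This is why the small N/M ratio alone makes h degenerate toward 1, and why a low temperature restores the contrast between positives and negatives.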