
applying regularisation #7

Closed
ruoyzhang opened this issue Oct 13, 2018 · 1 comment

Comments

@ruoyzhang

Hi theeluwin!

First of all thanks for the code, it was well written and helped me a ton in building my own word2vec model.

This is not an issue per se, but rather something I'm trying to add to the word2vec model using your code: the main idea is to apply regularisation to the embeddings in a temporal setting. I've run into trouble with the code and I'm wondering if you'd be so kind as to help out!

The main idea is that I'm training two models (model 0 and model 1) consecutively on two corpora that are temporally adjacent (say, news articles from 01 Jan and 02 Jan). During the training of model 1, I'd like to add a penalty term to the loss/cost function:
for every word in set(vocab_0) & set(vocab_1), I'd like to minimise the distance between that word's embeddings from periods 0 and 1.

I hope that makes sense!

So far I'm testing on embeddings of rather small dimension (~20), so I'm using Euclidean distance as the measure.
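Concretely, the penalty I have in mind looks something like this standalone sketch (the tensor names `emb0`, `emb1`, `shared_idx0`, `shared_idx1` are illustrative, not from the repo):

```python
import torch

# Illustrative tensors: emb0 holds the period-0 embeddings (kept fixed),
# emb1 holds the period-1 embeddings currently being trained.
emb0 = torch.randn(100, 20)
emb1 = torch.randn(100, 20, requires_grad=True)

# Row indices of the same shared words in each model's vocabulary.
shared_idx0 = torch.tensor([1, 5, 7])
shared_idx1 = torch.tensor([2, 5, 9])

lam = 3.0  # regularisation strength
# Squared Euclidean distance between each shared word's two embeddings;
# detach() keeps model 0 fixed so the penalty only pulls on model 1.
penalty = lam * ((emb1[shared_idx1] - emb0[shared_idx0].detach()) ** 2).sum()
penalty.backward()
```

After `backward()`, only the rows of `emb1` indexed by `shared_idx1` receive gradient from the penalty.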

Based on your code, I added a `forward_r` function to the `Word2Vec` class:

```python
def forward_r(self, data):
    if data is None:
        return None
    v = LT(data)
    v = v.cuda() if self.ivectors.weight.is_cuda else v
    return self.ivectors(v)
```

This function simply extracts the relevant embeddings (words from the intersection of the two vocabularies).

Then, in SGNS (for now I'm only testing on one particular embedding), I added the following loss calculation:

```python
rvectors = self.embedding.forward_r(rwords)
rloss = 3 * ((rvectors.squeeze() - self.vector3) ** 2).sum()
```

and finally it would return the following total loss:

```python
return -(oloss + nloss).mean() + rloss
```

However, the problem is that the loss gets stuck: it never updates, and it appears that backpropagation is not working properly.
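My guess is that the graph is somehow broken for `rloss`. In a minimal PyTorch sketch (standalone, not the repo's code), the difference between a loss that backpropagates and one that doesn't looks like this:

```python
import torch

w = torch.randn(3, requires_grad=True)

# Gradient flows: the loss is built directly from the leaf tensor.
loss_ok = ((w - 1.0) ** 2).sum()
loss_ok.backward()
assert w.grad is not None

# Gradient does NOT flow: detach() (or rebuilding the tensor, e.g. via
# .data or .tolist()) cuts the graph, so this loss can never update w.
loss_broken = ((w.detach() - 1.0) ** 2).sum()
assert not loss_broken.requires_grad
```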

As you can probably tell, I'm rather new to PyTorch, so I'd really appreciate a hand in figuring out what's happening!

Thank you so much in advance!

@ruoyzhang
Author

Hi!
I've managed to resolve the problem! Sorry for creating this superficial issue; I'll go ahead and close it now :)

Thanks again for your code! much appreciated :D
