Skip to content
This repository has been archived by the owner on Mar 19, 2024. It is now read-only.

Understanding and deactivating lrUpdateRate during word representation training #1217

Open
orestxherija opened this issue Aug 25, 2021 · 1 comment

Comments

@orestxherija
Copy link

I would like to learn word vectors in an unsupervised fashion using a large corpus. I am trying to understand a bit how the lrUpdateRate parameter works. It appears to play the role of a learning rate scheduler but most likely not in the traditional sense, where the scheduler essentially is a function from epochs or batch indices to a real number (the new learning rate). Here the default value for lrUpdateRate is 100 and I am not sure what that 100 refers to: is it number of tokens, number of batches, epochs? I inspected carefully the source code but since I am only superficially familiar with C++, I was not able to decipher the exact use of that parameter. I did, however, get the impression that it's being defined in a non-standard way. I was wondering if anybody can provide some insight about:

  1. What exactly does that default value for lrUpdateRate refer to?
  2. How can I deactivate lrUpdateRate during unsupervised training? Do I need to supply a really large number as its value or a really small one (e.g. 0)?

I have used fastText extensively so far and this is a good opportunity to learn even more about its inner workings. Unfortunately, there is next to nothing on the web about this particular parameter.

@orestxherija
Copy link
Author

orestxherija commented Aug 26, 2021

I believe I figured out what lrUpdateRate controls. Essentially, it is the factor by which one divides the learning rate. It's not clear to me how frequently this update occurs, most likely at the end of each epoch. If my understanding is correct, then one deactivates this parameter simply by setting it equal to 1.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant