run_ot and run_ot_write #17

Closed
kirefu opened this issue Aug 24, 2021 · 4 comments

@kirefu

kirefu commented Aug 24, 2021

Hi there,

I am trying to reconcile your code with the description in Algorithm 1 of your paper.

In the paper:

entropy, vocab = get_vocab(optimal matrix)
vocabularies.append(entropy,vocab)
Output v∗ from vocabularies satisfying Eq. 3

In the code (VOLT/ot_run.py, line 141 in c9f2e69):

scores[iter_number] = Gs-previous_entropy

However, run_ot does not store the transport matrix or the vocabulary set for each timestep t; it stores only the (vocab_size, entropy) pairs.

Then run_ot_write() takes the optimal vocab size and recalculates the transport matrix. I don't see how this differs from the calculation inside the for loop in run_ot; surely the same matrix is output? I also don't understand how run_ot_write() corresponds to "Output v∗ from vocabularies satisfying Eq. 3" in Algorithm 1, since no vocabularies are being compared.

Would be very grateful if you could help clarify the above, as I am keen to implement your work :)

@Jingjing-NLP
Owner

Hi. Thanks for your interest!

Yes! These are two equivalent operations. The original version stored all transport matrices, but because each matrix is large, this used a lot of memory, so we changed the implementation slightly. First, at each step, we compute the transport matrix and its Eq. 3 score, and we save only the score rather than the matrix. Second, once we have the best score, we take the corresponding step and re-run the OT commands to recover its transport matrix.
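The two-pass scheme described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in ot_run.py: `toy_solve` is a hypothetical stand-in for the per-step optimal transport solve, `toy_score` is a hypothetical stand-in for the Eq. 3 score, and `best_step` and `recover` are illustrative names. The point it shows is that when the solve is deterministic, keeping one scalar per step and recomputing the winning matrix gives the same result as storing every matrix.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Matrix = List[List[float]]


def toy_solve(size: int) -> Matrix:
    """Hypothetical, deterministic stand-in for the OT solve at one step."""
    return [[1.0 / (size * size)] * size for _ in range(size)]


def toy_score(matrix: Matrix) -> float:
    """Hypothetical stand-in for the Eq. 3 score; peaks at a 3x3 matrix."""
    return -float((len(matrix) - 3) ** 2)


def best_step(
    sizes: Sequence[int],
    solve: Callable[[int], Matrix],
    score: Callable[[Matrix], float],
) -> Tuple[int, Dict[int, float]]:
    """Pass 1: score every step, keeping one float per step, not the matrix."""
    scores: Dict[int, float] = {}
    for size in sizes:
        matrix = solve(size)          # large object, needed only briefly
        scores[size] = score(matrix)  # keep the scalar, drop the matrix
    best = max(scores, key=lambda s: scores[s])
    return best, scores


def recover(best: int, solve: Callable[[int], Matrix]) -> Matrix:
    """Pass 2: re-run the solve for the winning step only."""
    return solve(best)


sizes = [1, 2, 3, 4, 5]
best, scores = best_step(sizes, toy_solve, toy_score)
matrix = recover(best, toy_solve)
print(best, len(matrix))
```

Peak memory in this sketch is one matrix plus one float per step, at the cost of one extra solve for the selected step.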

Hope this can address your questions. If you have any other questions, please feel free to contact us.

@kirefu
Author

kirefu commented Aug 26, 2021

Thanks for your response. For clarification purposes:

"The inner arg max represents that the target is to find the vocabulary from V_S[t] with the maximum MUV scores. The outer arg max means that the target is to enumerate all timesteps and find the vocabulary with the maximum MUV scores."

Is the inner argmax the sinkhorn algorithm, and the outer argmax the for loop over the timesteps?

Best,
Faheem

@Jingjing-NLP
Owner

Yes, your understanding is right! There are two argmax operations here. The outer argmax means the maximum value over all timesteps.
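In symbols, the nested search confirmed above can be written schematically as below. This assumes $\mathrm{MUV}(v)$ denotes the Eq. 3 score of a candidate vocabulary $v$ and $\mathbb{V}_{S[t]}$ the candidate set at timestep $t$, following the quoted passage; the exact form of the score is defined in the paper and is not reproduced here.

```latex
v_t = \operatorname*{arg\,max}_{v \in \mathbb{V}_{S[t]}} \mathrm{MUV}(v),
\qquad
t^{*} = \operatorname*{arg\,max}_{t} \mathrm{MUV}(v_t),
\qquad
v^{*} = v_{t^{*}}
```

Per the exchange above, the first expression corresponds to the Sinkhorn solve at a single timestep, and the second to the for loop over timesteps.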

@kirefu
Author

kirefu commented Sep 10, 2021

Thanks!

@kirefu kirefu closed this as completed Sep 10, 2021