This repository has been archived by the owner on Jul 7, 2023. It is now read-only.

Doubts on the paper "universal transformers". #1215

Closed
futaoo opened this issue Nov 12, 2018 · 9 comments

Comments

@futaoo

futaoo commented Nov 12, 2018

Description

The detailed Figure 4 in the appendix does not seem to follow the iterative equations (4) and (5) in the paper. If I follow the figure, it should be H^t = LayerNorm(A^t + Transition(A^t)) and A^t = LayerNorm(H^(t-1) + P^t + MultiHeadSelfAttention(H^(t-1) + P^t)). It is very confusing. Could anyone help me figure this out? Thank you!
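For concreteness, here is a minimal NumPy sketch of one recurrent step as the figure reads. The single-head attention with identity projections, the identity transition function, and the zeroed coordinate embeddings are all simplifying assumptions for illustration, not the paper's actual setup:

```python
import numpy as np

def layer_norm(x, eps=1e-6):
    # Normalize over the feature (last) dimension.
    mu = x.mean(-1, keepdims=True)
    sigma = x.std(-1, keepdims=True)
    return (x - mu) / (sigma + eps)

def self_attention(x):
    # Single-head scaled dot-product self-attention with identity
    # projections (a placeholder for the learned W_Q, W_K, W_V).
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)
    weights = np.exp(scores - scores.max(-1, keepdims=True))
    weights /= weights.sum(-1, keepdims=True)
    return weights @ x

def transition(x):
    # Placeholder position-wise transition (identity in this sketch;
    # the paper uses a feed-forward or separable-convolution block).
    return x

def ut_step(h_prev, p_t):
    # One recurrent step as Figure 4 reads:
    #   A^t = LayerNorm((H^{t-1} + P^t) + SelfAttention(H^{t-1} + P^t))
    #   H^t = LayerNorm(A^t + Transition(A^t))
    x = h_prev + p_t
    a_t = layer_norm(x + self_attention(x))
    h_t = layer_norm(a_t + transition(a_t))
    return h_t

# Toy usage: 3 positions, model dimension 4, two recurrent steps.
h = np.random.randn(3, 4)
for t in range(2):
    p_t = np.zeros_like(h)  # coordinate embeddings omitted for brevity
    h = ut_step(h, p_t)
print(h.shape)  # (3, 4)
```

Note that the shape of each step is the same as the input, so the same block can be applied recurrently, which is the point of the UT recurrence.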

@senarvi
Contributor

senarvi commented Nov 12, 2018

I'm pretty sure there's a typo in equation 4.

@futaoo
Author

futaoo commented Nov 13, 2018

@senarvi Thanks, I think the same as you.

@lkluo

lkluo commented Nov 13, 2018

I believe Eq. (4) has a typo. Eq. (5) may have a typo as well, but it could also be a misinterpretation of Figure 4. You can check the code to figure it out.

@MostafaDehghani
Contributor

Yes! There are small typos, as well as a problem in Figure 4, in the current arXiv version of the paper. We'll update it soon. In the meantime, you can check the slides here and, as always, a better way to understand exactly what's going on is digging into the code :)

@futaoo
Author

futaoo commented Nov 16, 2018

@MostafaDehghani Very lucky to have the slides, thanks!

@colmantse

Hi @MostafaDehghani, thank you for the slides! They are really helpful. On a side note, may I ask whether UT and the Transformer both use the default EN-DE generator provided in the tensor2tensor library? I noticed the version is the same, but I want to be certain.

@MostafaDehghani
Contributor

Yes, we used problem=translate_ende_wmt32k for all the MT experiments, both with the Transformer and the Universal Transformer.

@colmantse

Thank you!

@afrozenator
Contributor

Thanks @MostafaDehghani and others.
