
Shared Embeddings #2

Closed
adrian-spataru opened this issue Dec 6, 2020 · 4 comments

Comments

@adrian-spataru
Contributor

Sharing the token_emb between the encoder and decoder is not enabled by default. Many transformers, such as BART and T5, use a shared encoder/decoder embedding.

from x_transformers import XTransformer

model = XTransformer(
    dim = 512,
    enc_num_tokens = 256,
    enc_depth = 6,
    enc_heads = 8,
    enc_max_seq_len = 1024,
    dec_num_tokens = 256,
    dec_depth = 6,
    dec_heads = 8,
    dec_max_seq_len = 1024,
    enc_num_memory_tokens = 0
)

model.decoder.token_emb = model.encoder.token_emb

Would this be enough?
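
For context, here is a minimal generic PyTorch sketch (toy modules, not the actual x-transformers internals) of the tying pattern I have in mind; the key point is that assigning the module object shares the parameter rather than copying it:

import torch.nn as nn

# toy stand-ins for the encoder/decoder, only to illustrate the tying pattern
class TinyEncoder(nn.Module):
    def __init__(self, num_tokens, dim):
        super().__init__()
        self.token_emb = nn.Embedding(num_tokens, dim)

class TinyDecoder(nn.Module):
    def __init__(self, num_tokens, dim):
        super().__init__()
        self.token_emb = nn.Embedding(num_tokens, dim)

enc, dec = TinyEncoder(256, 512), TinyDecoder(256, 512)

# tie by assigning the module itself - both now point to one weight tensor,
# so gradients from the encoder and decoder accumulate into the same parameter
dec.token_emb = enc.token_emb

assert dec.token_emb.weight is enc.token_emb.weight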

Furthermore, the Encoder/Decoder example in the README doesn't work out of the box; it also needs a value for enc_num_memory_tokens.

@lucidrains
Owner

@adrian-spataru yes! Would you be willing to submit a PR?

My desktop machine broke, and we were given stay-at-home orders because of the pandemic 👍

@adrian-spataru
Contributor Author

Sorry to hear that @lucidrains! Hopefully there was no data loss.
I will be running various experiments with these transformers over the coming weeks; if I find any issues, I'll submit a PR.

@lucidrains
Owner

thanks Adrian :)

@lucidrains
Owner

@adrian-spataru added it as a keyword argument, tie_token_emb, for XTransformer
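
Usage would look roughly like this (a sketch based on this thread; see the README for the exact argument name and placement):

from x_transformers import XTransformer

model = XTransformer(
    dim = 512,
    tie_token_emb = True,   # share the token embedding between encoder and decoder
    enc_num_tokens = 256,
    enc_depth = 6,
    enc_heads = 8,
    enc_max_seq_len = 1024,
    dec_num_tokens = 256,
    dec_depth = 6,
    dec_heads = 8,
    dec_max_seq_len = 1024
)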
