
Shared Embeddings #2

Closed
adrian-spataru opened this issue Dec 6, 2020 · 4 comments

Comments

@adrian-spataru
Contributor

Sharing the token_emb between the encoder and decoder is not enabled by default. Many transformers, such as BART and T5, use a shared encoder/decoder embedding.

from x_transformers import XTransformer

model = XTransformer(
    dim = 512,
    enc_num_tokens = 256,
    enc_depth = 6,
    enc_heads = 8,
    enc_max_seq_len = 1024,
    dec_num_tokens = 256,
    dec_depth = 6,
    dec_heads = 8,
    dec_max_seq_len = 1024,
    enc_num_memory_tokens = 0
)

model.decoder.token_emb = model.encoder.token_emb

Would this be enough?
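
For context, here is a minimal generic PyTorch sketch (toy modules, not the actual x-transformers internals) of the tying pattern I have in mind; the key point is that assigning the module object shares the parameter rather than copying it:

import torch.nn as nn

# toy stand-ins for the encoder/decoder, only to illustrate the tying pattern
class TinyEncoder(nn.Module):
    def __init__(self, num_tokens, dim):
        super().__init__()
        self.token_emb = nn.Embedding(num_tokens, dim)

class TinyDecoder(nn.Module):
    def __init__(self, num_tokens, dim):
        super().__init__()
        self.token_emb = nn.Embedding(num_tokens, dim)

enc, dec = TinyEncoder(256, 512), TinyDecoder(256, 512)

# tie by assigning the module itself - both now point to one weight tensor,
# so gradients from the encoder and decoder accumulate into the same parameter
dec.token_emb = enc.token_emb

assert dec.token_emb.weight is enc.token_emb.weight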

Furthermore, the Encoder/Decoder example in the README doesn't work out of the box; it also needs a value for enc_num_memory_tokens.

@lucidrains
Owner

@adrian-spataru yes! Would you be willing to submit a PR?

My desktop machine broke, and we were given stay-at-home orders because of the pandemic 👍

@adrian-spataru
Contributor Author

Sorry to hear that @lucidrains! Hopefully there was no data loss.
I will be running various experiments with these transformers over the coming weeks; if I find any issues, I'll submit a PR.

@lucidrains
Owner

thanks Adrian :)

@lucidrains
Owner

@adrian-spataru added it as a keyword argument, tie_token_emb, for XTransformer
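
Usage would look roughly like this (a sketch based on this thread; see the README for the exact argument name and placement):

from x_transformers import XTransformer

model = XTransformer(
    dim = 512,
    tie_token_emb = True,   # share the token embedding between encoder and decoder
    enc_num_tokens = 256,
    enc_depth = 6,
    enc_heads = 8,
    enc_max_seq_len = 1024,
    dec_num_tokens = 256,
    dec_depth = 6,
    dec_heads = 8,
    dec_max_seq_len = 1024
)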
