
Bert token type embedding #213

Closed
eyalmazuz opened this issue Dec 2, 2023 · 2 comments

Comments

eyalmazuz commented Dec 2, 2023

I was looking at the BERT example you give in the code, but unlike the original paper, I didn't see a way in the source code to add token type embeddings.

One solution is to manage it myself by doing something like:

import torch
from torch import nn
from x_transformers import ContinuousTransformerWrapper, Encoder

class Bert(nn.Module):
    def __init__(self, num_tokens, num_types, dim):
        super().__init__()

        # feature-in / feature-out wrapper, so embeddings are handled outside of it
        self.model = ContinuousTransformerWrapper(
            dim_in = dim,
            dim_out = dim,
            max_seq_len = 1024,
            attn_layers = Encoder(
                dim = dim,
                depth = 12,
                heads = 8
            )
        )

        # separate token and token type (segment) embedding tables
        self.type_emb = nn.Embedding(num_types, dim)
        self.token_emb = nn.Embedding(num_tokens, dim)

    def forward(self, tokens, types, mask):
        emb = self.token_emb(tokens)
        type_emb = self.type_emb(types)

        # sum both embeddings and feed the continuous wrapper directly
        out = self.model(emb + type_emb, mask = mask)

        return out
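
A minimal usage sketch of the workaround above (the vocabulary size, segment count, and shapes here are just illustrative):

bert = Bert(num_tokens = 20000, num_types = 2, dim = 512)

tokens = torch.randint(0, 20000, (1, 1024))
types = torch.randint(0, 2, (1, 1024))
mask = torch.ones(1, 1024).bool()

out = bert(tokens, types, mask) # (1, 1024, 512), since dim_out = dim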

But it seems odd to handle the word embedding matrix myself when I could just use the regular TransformerWrapper.
Is there a way to add token type embeddings to the model, so that I could just write something like:

import torch
from x_transformers import TransformerWrapper, Encoder

model = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    max_token_type = 3,
    attn_layers = Encoder(
        dim = 512,
        depth = 12,
        heads = 8
    )
).cuda()

x = torch.randint(0, 256, (1, 1024)).cuda()
token_type = torch.randint(0, 3, (1, 1024)).cuda()
mask = torch.ones_like(x).bool()

model(x, token_type = token_type, mask = mask) # (1, 1024, 20000)

lucidrains (Owner) commented Dec 2, 2023

@eyalmazuz hey Eyal! thanks for bringing this up

do you want to see if the following works for you in the latest version?

import torch
from x_transformers import TransformerWrapper, Decoder

model = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    embed_num_tokens = dict(type = 5),   # extra embedding table named 'type' with 5 possible ids
    attn_layers = Decoder(
        dim = 512,
        depth = 12,
        heads = 8
    )
)

x = torch.randint(0, 256, (1, 1024))
types = torch.randint(0, 5, (1, 1024))

logits = model(x, embed_ids = dict(type = types))
logits.shape # (1, 1024, 20000)
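
A BERT-style variant would presumably combine the same embed_num_tokens / embed_ids API with the bidirectional Encoder and padding mask from the original request. This is just a sketch stitched together from the two examples above, not a verified snippet from the library docs:

import torch
from x_transformers import TransformerWrapper, Encoder

model = TransformerWrapper(
    num_tokens = 20000,
    max_seq_len = 1024,
    embed_num_tokens = dict(type = 2),   # two segment types, as in BERT
    attn_layers = Encoder(               # bidirectional attention for BERT-style training
        dim = 512,
        depth = 12,
        heads = 8
    )
)

x = torch.randint(0, 20000, (1, 1024))
types = torch.randint(0, 2, (1, 1024))
mask = torch.ones_like(x).bool()

logits = model(x, embed_ids = dict(type = types), mask = mask)
logits.shape # (1, 1024, 20000)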

eyalmazuz (Author)

@lucidrains Hey Phil!
Thanks for the quick response and fix!
I think that solution is great and could work for me.

Thanks again for the feature. I'll close the issue now.
