
padding_idx=masking_idx in ByteNetLMTime instantiation arguments #32

Closed

cmarkak opened this issue Feb 21, 2024 · 1 comment

Comments


cmarkak commented Feb 21, 2024

The following code in train.py assigns padding_idx=masking_idx when the model is instantiated. This conflicts with the definition just above it, where padding_idx is set to a value different from masking_idx. Is this an oversight, or is there a particular reason for this assignment?

padding_idx = tokenizer.pad_id  # PROTEIN_ALPHABET.index(PAD)
masking_idx = tokenizer.mask_id
print('Using {} as padding index'.format(padding_idx))
print('Using {} as masking index'.format(masking_idx))
#if args.model_type == 'ByteNet':
model = ByteNetLMTime(n_tokens, d_embed, d_model, n_layers, kernel_size, r,
                      causal=causal, padding_idx=masking_idx, rank=weight_rank, dropout=args.dropout,
                      tie_weights=args.tie_weights, final_ln=args.final_norm, slim=slim, activation=activation,
                      timesteps=diffusion_timesteps)

Thank you in advance.

@sarahalamdari (Collaborator)

This is done on purpose; we follow how ESM handles mask tokens. Padding is handled with input_mask.
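
For context, here is a minimal sketch of the pattern described above, assuming a PyTorch-style pipeline; the variable names, token ids, and shapes are illustrative, not this repo's actual code. Setting the embedding's padding_idx to the mask-token id only zero-initializes the mask token's embedding row (as ESM does), while true padding positions are silenced separately by multiplying with a boolean input_mask:

import torch
import torch.nn as nn

n_tokens, d_embed = 31, 8
pad_id, mask_id = 0, 28  # hypothetical token ids, for illustration only

# padding_idx=mask_id zero-initializes (and stops gradient updates to) the
# mask token's embedding row; it does NOT silence pad positions.
embedder = nn.Embedding(n_tokens, d_embed, padding_idx=mask_id)

tokens = torch.tensor([[3, 5, mask_id, pad_id, pad_id]])  # right-padded batch
input_mask = (tokens != pad_id).unsqueeze(-1).float()     # 1 = real token, 0 = pad

x = embedder(tokens) * input_mask  # pad positions contribute zeros downstream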
