Adding custom tokens #29

ajaybabu20 · 2022-10-28T14:09:43Z

Hey guys!
I had fun reading the paper, and thanks for open-sourcing the model.

In the paper, you mention that [COL] and [VAL] are special tokens indicating the start of attribute names and values, respectively. I take that to mean [COL] and [VAL] should be added to the tokenizer as special tokens. However, in the repo (https://github.com/megagonlabs/ditto/blob/master/ditto_light/dataset.py#L12), they are not added as special tokens to the vocabulary of the pre-trained tokenizer.

Any reason why?
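
To make the question concrete, here is a minimal sketch of the difference I mean. It is not code from the repo: `serialize` and `toy_tokenize` are hypothetical helpers, the example record is made up, and the regex tokenizer is only a stand-in for a real subword tokenizer. It shows that an unregistered marker such as [COL] can be broken into several pieces, while a registered one stays a single token.

```python
import re


def serialize(record):
    """Serialize an attribute -> value dict as '[COL] name [VAL] value ...'."""
    return " ".join(f"[COL] {name} [VAL] {value}" for name, value in record.items())


def toy_tokenize(text, special_tokens=()):
    """Toy stand-in for a tokenizer (an assumption, not the real one).

    Registered special tokens are kept whole; everything else is split
    into runs of word characters and single punctuation characters.
    """
    if special_tokens:
        # Try the special tokens first so they win over the generic rules.
        specials = "|".join(re.escape(tok) for tok in special_tokens)
        pattern = rf"{specials}|\w+|[^\w\s]"
    else:
        pattern = r"\w+|[^\w\s]"
    return re.findall(pattern, text)


# Made-up record, for illustration only.
text = serialize({"title": "instant immersion spanish", "price": "36.11"})

without_specials = toy_tokenize(text)
with_specials = toy_tokenize(text, special_tokens=("[COL]", "[VAL]"))

print(text)
print(without_specials)  # markers split into '[', 'COL', ']'
print(with_specials)     # markers kept as '[COL]' and '[VAL]'
```

With a Hugging Face tokenizer, I would expect the registered case to correspond to calling `tokenizer.add_special_tokens({"additional_special_tokens": ["[COL]", "[VAL]"]})` followed by `model.resize_token_embeddings(len(tokenizer))`, but whether that helps or hurts here is exactly what I am asking.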
