Hey guys!
I had fun reading the paper and thanks for open-sourcing the model.
In the paper, you mention that "[COL] and [VAL] are special tokens for indicating the start of attribute names and values respectively." I take that to mean [COL] and [VAL] should be added to the tokenizer as special tokens. However, in the repo (https://github.com/megagonlabs/ditto/blob/master/ditto_light/dataset.py#L12), they are not added as special tokens to the vocabulary of the pre-trained tokenizer.
Any reason why?
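To make the question concrete, here is a minimal sketch of how I understand the serialization described in the paper. The `serialize` helper and the sample record are my own illustration (hypothetical names, not code from the repo); only the [COL] and [VAL] markers come from the paper.

```python
from typing import Dict

# Marker strings as described in the paper.
COL, VAL = "[COL]", "[VAL]"


def serialize(record: Dict[str, str]) -> str:
    """Flatten a record into '[COL] name [VAL] value ...' form.

    Hypothetical helper for illustration; attribute order follows
    the insertion order of the dict.
    """
    parts = []
    for name, value in record.items():
        # Each attribute contributes one '[COL] name [VAL] value' chunk.
        parts.append(f"{COL} {name} {VAL} {value}")
    return " ".join(parts)


if __name__ == "__main__":
    # Made-up sample record, not taken from any dataset in the repo.
    print(serialize({"title": "instant immersion spanish", "price": "36.11"}))
```

If the markers were registered with a Hugging Face tokenizer, I would expect a call along the lines of `tokenizer.add_special_tokens({"additional_special_tokens": ["[COL]", "[VAL]"]})` followed by `model.resize_token_embeddings(len(tokenizer))`. Without that, my assumption is that a WordPiece or BPE tokenizer splits each marker into several sub-tokens rather than treating it as a single unit.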