[Feature] How to modify the special token? I would like to keep <ref> and </ref> in the output, but skip other special tokens.

### Motivation

How to modify the special token set? I would like to keep \<ref> and \</ref> in the output, but skip other special tokens.

I modified the added_tokens.json, special_tokens_map.json by deleting the \<ref> and \</ref>. I also set the "special" attribute in \<ref> and \</ref> from tokenizer_config.json to be false. These approaches did not work.  It worked when I modified the "responses = tokenizer.batch_decode(generation_output, skip_special_tokens=false)" from modeling_internvl_chat.py, but I want to skip other special tokens.


UPDADE: It works after I remove all \<ref> and \</ref> in tokenizer_config.json. However, model outputs \<ref> and \</ref> with white space around them.

Model Output: 1   \<ref> car \</ref> <rbox>({<30.47><63.77><6.42><2.90>|<68>})</rbox>

Ground Truth:  1 \<ref>car\</ref><rbox>({<30.37><64.16><6.53><3.16>|<68>})</rbox>

What should I do? 

Thanks.

### Related resources

_No response_

### Additional context

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Feature] How to modify the special token? I would like to keep <ref> and </ref> in the output, but skip other special tokens. #803

Motivation

Related resources

Additional context

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

[Feature] How to modify the special token? I would like to keep <ref> and </ref> in the output, but skip other special tokens. #803

Description

Motivation

Related resources

Additional context

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions