In conversation.py, the Llama-3 chat prompt is built on line 107:
self.tokenizer.apply_chat_template(chat_template_messages, tokenize=False, add_generation_prompt=False)
This means the tokens <|start_header_id|> and <|end_header_id|> are inserted automatically by the tokenizer's chat template. However, <|start_header_id|> is also included in the roles (line 353):
roles=("<|start_header_id|>user", "<|start_header_id|>assistant"),
So the token <|start_header_id|> will be duplicated in the output like this:
<|start_header_id|><|start_header_id|>user<|end_header_id|>\n\n....<|eot_id|><|start_header_id|><|start_header_id|>assistant<|end_header_id|>\n\n...
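The duplication can be reproduced without loading the model. This is only a minimal sketch: `render_llama3` is a hypothetical, simplified stand-in for the tokenizer's Llama-3 chat template (it omits the BOS token and the generation prompt), and `strip_header` is a hypothetical helper showing one possible fix, not code from the repository.

```python
START = "<|start_header_id|>"
END = "<|end_header_id|>"
EOT = "<|eot_id|>"


def render_llama3(messages):
    """Simplified stand-in for the Llama-3 chat template (assumption).

    Each message is wrapped as: header tokens around the role, a blank
    line, the trimmed content, then the end-of-turn token.
    """
    return "".join(
        f"{START}{m['role']}{END}\n\n{m['content'].strip()}{EOT}"
        for m in messages
    )


def strip_header(role):
    """Hypothetical fix: drop a leading header token from the role name."""
    return role[len(START):] if role.startswith(START) else role


# Roles as defined on line 353 of conversation.py.
roles = ("<|start_header_id|>user", "<|start_header_id|>assistant")
messages = [
    {"role": roles[0], "content": "Hi"},
    {"role": roles[1], "content": "Hello"},
]

# The header token appears twice because both the role and the
# template contribute it.
duplicated = render_llama3(messages)

# Stripping the token from the role first leaves a single header token.
fixed = render_llama3(
    [{**m, "role": strip_header(m["role"])} for m in messages]
)

print(duplicated)
print(fixed)
```

If the real template behaves like this stand-in, either the roles should be plain `user` / `assistant`, or the header token should be stripped before the messages are passed to `apply_chat_template`.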
Is this the correct behavior?