Description
In the `apply_chat_template` function used for DPO training, a generation prompt appears to be added even when `add_generation_prompt` is not set to `True`. This results in repeated assistant turns in the Llama template, potentially affecting the training outcomes.
Steps to Reproduce
- Apply the `apply_chat_template` function as follows:

      example["text_chosen"] = tokenizer.apply_chat_template(chosen_messages, tokenize=False)
      example["text_rejected"] = tokenizer.apply_chat_template(rejected_messages, tokenize=False)
      example["text_prompt"] = tokenizer.apply_chat_template(prompt_messages, tokenize=False)

- Review the outputs in different parts of the dataset.
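
To narrow down whether the flag is being ignored, one option is to render the same messages with `add_generation_prompt` set explicitly to `False` and to `True` and compare the results. The sketch below assumes a tokenizer object exposing `apply_chat_template` with the `tokenize` and `add_generation_prompt` keyword arguments; the helper name `flag_is_respected` is hypothetical and not part of any library.

```python
def flag_is_respected(tokenizer, messages):
    """Return True if add_generation_prompt changes the rendered text.

    Hypothetical diagnostic helper: `tokenizer` is assumed to expose
    apply_chat_template(messages, tokenize=..., add_generation_prompt=...).
    """
    without_prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=False
    )
    with_prompt = tokenizer.apply_chat_template(
        messages, tokenize=False, add_generation_prompt=True
    )
    # Identical output for both settings suggests the chat template
    # appends (or omits) the assistant header regardless of the flag.
    return without_prompt != with_prompt
```

If this returns `False` for the tokenizer in question, the behavior most likely comes from the chat template itself rather than from the calling code, though that is an inference and not something confirmed in this report.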
Expected Behavior
The function should not add a generation prompt to the outputs unless explicitly requested with `add_generation_prompt=True`.
Observed Behavior
The outputs include repeated assistant turns in the Llama template, as shown in the examples below:
Prompt sample 14592 of the raw training set:
<|begin_of_text|><|start_header_id|>user<|end_header_id|>
xxxxxx<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Chosen sample 14592 of the raw training set:
<|begin_of_text|><|start_header_id|>assistant<|end_header_id|>
xxxxxx<|eot_id|><|start_header_id|>assistant<|end_header_id|>
Rejected sample 14592 of the raw training set:
<|begin_of_text|><|start_header_id|>assistant<|end_header_id|>
xxxx<|eot_id|><|start_header_id|>assistant<|end_header_id|>
This repetition of the assistant's turn, `<|start_header_id|>assistant<|end_header_id|>`, appears irrespective of the setting of `add_generation_prompt`.
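
As a stopgap while the cause is investigated, the trailing header can be detected and removed from the rendered chosen and rejected strings after templating. This is a minimal sketch of a workaround, not a fix for the underlying behavior; the helper names are hypothetical, and the header string is copied from the outputs above. Whether the prompt text should keep its trailing header depends on the training pipeline, so this is aimed at the completions only.

```python
# Header string as it appears in the observed outputs above.
ASSISTANT_HEADER = "<|start_header_id|>assistant<|end_header_id|>"


def has_trailing_generation_prompt(text):
    """Return True if the rendered text ends with an empty assistant header."""
    return text.rstrip().endswith(ASSISTANT_HEADER)


def strip_trailing_generation_prompt(text):
    """Remove a trailing assistant header (and trailing whitespace), if present."""
    stripped = text.rstrip()
    if stripped.endswith(ASSISTANT_HEADER):
        return stripped[: -len(ASSISTANT_HEADER)]
    # No trailing header: return the input unchanged.
    return text
```

Only a header at the very end of the string is removed; an assistant header that opens a turn with content, such as the one at the start of the chosen and rejected samples, is left untouched.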
Additional Information
- It is unclear whether this behavior affects the training outcomes negatively, but it certainly alters the intended structure of the training data.
- No custom modifications were made to the function; the issue persists with the default implementation.
Please investigate this issue, as it might be influencing the training process negatively. Any guidance on the expected outputs and on how to use `apply_chat_template` correctly would also be appreciated.