Skip to content

Allow dpo responses to have tool calls instead of content#280

Merged
connermanuel merged 2 commits intonextfrom
conner/tool-calls-for-dpo
Feb 27, 2026
Merged

Allow dpo responses to have tool calls instead of content#280
connermanuel merged 2 commits intonextfrom
conner/tool-calls-for-dpo

Conversation

@connermanuel
Copy link
Contributor

This PR allows DPO outputs to have tool calls instead of content.
Since the input uses existing messages validation, it already supports tool calls.

Side note: no changes are needed to together-py to support reasoning in assistant messages.

if "content" not in example[key][0] and "tool_calls" not in example[key][0]:
raise InvalidFileFormatError(
message=f"The dataset is malformed, the first element of `{key}` must have a 'content' field on line {idx + 1}.",
message=f"The dataset is malformed, the first element of `{key}` must have a 'content' or 'tool_calls' field on line {idx + 1}.",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logic above feels confusing when comparing to this message. I had to think it out to realize it is correct. But I wonder if the above could be changed to be:

if ("content" in example[key][0] or "tool_calls" in example[key][0]) == False:

Is that cleaner? open to either decision.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think for the best readability here, we can do something like

contains_response_field = "content" in example[key][0] or "tool_calls" in example[key][0]
if not contains_response_field:
...

what do you think?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I like that

@connermanuel connermanuel merged commit d05090c into next Feb 27, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants