Jintao-Huang
Collaborator

No description provided.

Contributor

Summary of Changes

Hello @Jintao-Huang, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the documentation related to custom dataset formats. It specifically addresses the "rejected_messages" field, providing crucial clarification for developers working with Agent scenarios by explicitly stating the requirement to include "rejected_tools" when using this flexible message structure. This update ensures that users have a clearer understanding of data formatting for advanced use cases.

Highlights

  • Documentation Update: Clarified the use of "rejected_tools" within "rejected_messages" for Agent scenarios in the custom dataset documentation.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

gemini-code-assist bot left a comment

Code Review

This pull request updates the documentation for custom datasets in both Chinese and English to include information about using rejected_tools in Agent scenarios for RLHF. The changes are accurate, but to improve clarity, I've suggested adding an example for the Agent scenario, as currently only a multimodal example is provided. This will help users better understand how to format their data.

Contributor

medium

The documentation now mentions rejected_tools for Agent scenarios, which is great. However, it only provides a multimodal example. To improve clarity, it would be beneficial to add an example for an Agent scenario using rejected_tools as well.

Also, you could change "多模态数据格式例子如下" (An example of the multimodal data format is as follows) to "数据格式例子如下" (An example of the data format is as follows).

Here is a sample DPO data format for an Agent scenario that you can use for reference:

{"tools": "[{\"type\": \"function\", \"function\": {\"name\": \"get_weather\", \"description\": \"获取天气信息\"}}]", "messages": [{"role": "user", "content": "北京的天气怎么样?"}, {"role": "tool_call", "content": "{\"name\": \"get_weather\", \"arguments\": {\"city\": \"北京\"}}"}], "rejected_messages": [{"role": "user", "content": "北京的天气怎么样?"}, {"role": "tool_call", "content": "{\"name\": \"get_stock_price\", \"arguments\": {\"company\": \"Alibaba\"}}"}], "rejected_tools": "[{\"type\": \"function\", \"function\": {\"name\": \"get_stock_price\", \"description\": \"获取股票价格\"}}]"}
Suggested change
- 当然,你也可以直接使用`rejected_messages`,而不是只提供`rejected_response`/`rejected_images`(需ms-swift>=3.8),这将提供更大的灵活度(例如多模态/agent场景)。若使用rejected_messages,在多模态场景下,你需要额外传入"rejected_images","rejected_audios","rejected_videos"等内容;在Agent场景下,你需要额外传入"rejected_tools"等内容。多模态数据格式例子如下
+ 当然,你也可以直接使用`rejected_messages`,而不是只提供`rejected_response`/`rejected_images`(需ms-swift>=3.8),这将提供更大的灵活度(例如多模态/agent场景)。若使用rejected_messages,在多模态场景下,你需要额外传入"rejected_images","rejected_audios","rejected_videos"等内容;在Agent场景下,你需要额外传入"rejected_tools"等内容。数据格式例子如下
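The sample record above nests JSON inside JSON: `tools`, `rejected_tools`, and each `tool_call` content are JSON-encoded strings inside the outer object, which is easy to get wrong when escaping by hand. A minimal sketch of generating such a line with the standard `json` module follows. The field names are taken from the sample; the helper name `build_agent_dpo_record` and its parameters are hypothetical, and this only mirrors the shape of the sample rather than anything ms-swift itself provides.

```python
import json


def build_agent_dpo_record(question, tools, chosen_call, rejected_tools, rejected_call):
    """Return one JSONL line shaped like the Agent DPO sample in this review.

    Hypothetical helper: field names come from the sample; the function
    itself is not part of ms-swift.
    """
    user_turn = {"role": "user", "content": question}

    def encode(obj):
        # Inner values are stored as JSON strings, as in the sample.
        return json.dumps(obj, ensure_ascii=False)

    record = {
        "tools": encode(tools),
        "messages": [user_turn, {"role": "tool_call", "content": encode(chosen_call)}],
        "rejected_messages": [user_turn, {"role": "tool_call", "content": encode(rejected_call)}],
        "rejected_tools": encode(rejected_tools),
    }
    # The outer dumps escapes the inner quotes, so nothing is escaped by hand.
    return json.dumps(record, ensure_ascii=False)


line = build_agent_dpo_record(
    question="What's the weather in Beijing?",
    tools=[{"type": "function", "function": {"name": "get_weather", "description": "Get weather information"}}],
    chosen_call={"name": "get_weather", "arguments": {"city": "Beijing"}},
    rejected_tools=[{"type": "function", "function": {"name": "get_stock_price", "description": "Get stock price"}}],
    rejected_call={"name": "get_stock_price", "arguments": {"company": "Google"}},
)
print(line)
```

Passing `ensure_ascii=False` keeps non-ASCII text such as the Chinese descriptions readable in the output file instead of turning it into `\uXXXX` escapes.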

> Note: RM additionally supports the margin column. For details, refer to the [RM documentation](../Instruction/RLHF.md#rm).
- Sure, you can also directly use `rejected_messages` instead of only providing `rejected_response` / `rejected_images` (requires ms-swift>=3.8), which offers greater flexibility (e.g., for multimodal or agent scenarios). In multimodal cases, if you use `rejected_messages`, you need to additionally provide fields such as `"rejected_images"`, `"rejected_audios"`, `"rejected_videos"`, etc. An example of the data format is as follows:
+ Sure, you can also directly use `rejected_messages` instead of only providing `rejected_response` / `rejected_images` (requires ms-swift>=3.8), which offers greater flexibility (e.g., for multimodal or agent scenarios). If you use "rejected_messages", then in multimodal scenarios you must also provide "rejected_images", "rejected_audios", "rejected_videos", etc.; in Agent scenarios you must also provide "rejected_tools", etc. An example of the multimodal data format is as follows:
Contributor

medium

The documentation now mentions rejected_tools for Agent scenarios, which is great. However, it only provides a multimodal example. To improve clarity, it would be beneficial to add an example for an Agent scenario using rejected_tools as well.

Also, you could change "An example of the multimodal data format is as follows:" to "An example of the data format is as follows:".

Here is a sample DPO data format for an Agent scenario that you can use for reference:

{"tools": "[{\"type\": \"function\", \"function\": {\"name\": \"get_weather\", \"description\": \"Get weather information\"}}]", "messages": [{"role": "user", "content": "What's the weather in Beijing?"}, {"role": "tool_call", "content": "{\"name\": \"get_weather\", \"arguments\": {\"city\": \"Beijing\"}}"}], "rejected_messages": [{"role": "user", "content": "What's the weather in Beijing?"}, {"role": "tool_call", "content": "{\"name\": \"get_stock_price\", \"arguments\": {\"company\": \"Google\"}}"}], "rejected_tools": "[{\"type\": \"function\", \"function\": {\"name\": \"get_stock_price\", \"description\": \"Get stock price\"}}]"}
Suggested change
- Sure, you can also directly use `rejected_messages` instead of only providing `rejected_response` / `rejected_images` (requires ms-swift>=3.8), which offers greater flexibility (e.g., for multimodal or agent scenarios). If you use "rejected_messages", then in multimodal scenarios you must also provide "rejected_images", "rejected_audios", "rejected_videos", etc.; in Agent scenarios you must also provide "rejected_tools", etc. An example of the multimodal data format is as follows:
+ Sure, you can also directly use `rejected_messages` instead of only providing `rejected_response` / `rejected_images` (requires ms-swift>=3.8), which offers greater flexibility (e.g., for multimodal or agent scenarios). If you use "rejected_messages", then in multimodal scenarios you must also provide "rejected_images", "rejected_audios", "rejected_videos", etc.; in Agent scenarios you must also provide "rejected_tools", etc. An example of the data format is as follows:
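The requirement stated in the suggested text (Agent records that use `rejected_messages` must also carry `rejected_tools`) can be linted before training. The sketch below is one possible pre-flight check, not ms-swift behavior: the function names are hypothetical, "Agent scenario" is approximated as "`rejected_messages` contains a `tool_call` turn", and the extra check that every called function is declared in the matching tool list is an assumption of this sketch.

```python
import json


def declared_names(tools_field):
    """Function names declared in a JSON-encoded tools list."""
    return {tool["function"]["name"] for tool in json.loads(tools_field)}


def called_names(messages):
    """Function names invoked by the tool_call turns of a message list."""
    return [
        json.loads(msg["content"])["name"]
        for msg in messages
        if msg.get("role") == "tool_call"
    ]


def check_agent_record(record):
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    # Assumption: a tool_call turn in rejected_messages marks an Agent record.
    if called_names(record.get("rejected_messages", [])) and "rejected_tools" not in record:
        problems.append("rejected_messages has tool_call turns but rejected_tools is missing")
    pairs = (("messages", "tools"), ("rejected_messages", "rejected_tools"))
    for messages_key, tools_key in pairs:
        if tools_key not in record:
            continue
        declared = declared_names(record[tools_key])
        for name in called_names(record.get(messages_key, [])):
            if name not in declared:
                problems.append(f"{messages_key} calls {name!r}, not declared in {tools_key}")
    return problems
```

Running `check_agent_record(json.loads(line))` on each line of a dataset file and reporting the non-empty results would surface records that omit `rejected_tools` before they reach training.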

Jintao-Huang merged commit 6588fa4 into modelscope:main on Sep 21, 2025
1 of 2 checks passed
Jintao-Huang added a commit that referenced this pull request on Sep 22, 2025
2 participants