Description
Hello, I am trying to run SFT based on agent-cpm. I built my data exactly following the format described in the README. A sample looks like this:
{'id': '0', 'image': 'GUI_Agent/data/data_source/ac/screenshot_10115_0.png', 'conversations': [{'role': 'system', 'content': '# Role\n你是一名熟悉安卓系统触屏GUI操作的智能体,将根据用户的问题,分析当前界面的GUI元素和布局,生成相应的操作。\n\n# Task\n针对用户问题,根据输入的当前屏幕截图,输出下一步的操作。\n\n# Rule\n- 以紧凑JSON格式输出\n- 输出操作必须遵循Schema约束\n\n# Schema\n{"type":"object","description":"执行操作并决定当前任务状态","additionalProperties":false,"required":["thought"],"properties":{"thought":{"type":"string","description":"智能体的思维过程"},"POINT":{"$ref":"#/$defs/Location","description":"点击屏幕上的指定位置"},"to":{"description":"移动,组合手势参数","oneOf":[{"enum":["up","down","left","right"],"description":"从当前点(POINT)出发,执行滑动手势操作,方向包括向上、向下、向左、向右"},{"$ref":"#/$defs/Location","description":"移动到某个位置"}]},"duration":{"type":"integer","description":"动作执行的时间或等待时间,毫秒","minimum":0,"default":200},"PRESS":{"type":"string","description":"触发特殊按键,HOME为回到主页按钮,BACK为返回按钮,ENTER为回车按钮","enum":["HOME","BACK","ENTER"]},"TYPE":{"type":"string","description":"输入文本"},"INTERACT":{"type":"string","description":"与用户发起交互的内容"},"STATUS":{"type":"string","description":"当前任务的状态。特殊情况:satisfied,无需操作;impossible,任务无法完成;interrupt,任务中断;need_feedback,需要用户反馈;","enum":["continue","finish","satisfied","impossible","interrupt","need_feedback"],"default":"continue"}},"$defs":{"Location":{"type":"array","description":"坐标为相对于屏幕左上角位原点的相对位置,并且按照宽高比例缩放到0~1000,数组第一个元素为横坐标x,第二个元素为纵坐标y","items":{"type":"integer","minimum":0,"maximum":1000},"minItems":2,"maxItems":2}}}'}, {'role': 'user', 'content': '<Question>Open the California Pizza app, then add a Miami Beast pizza in large size with a thin crust and make sure it is gluten-free, and add to the cart.</Question>\n当前屏幕截图:<image>'}, {'role': 'assistant', 'content': '{\'thought\': "I am given the task to add a Miami Beast pizza with large size, thin crust, and ensure it is gluten-free. 
However, in many pizza ordering contexts, thin crust and gluten-free crust are often mutually exclusive options—thin crust typically contains gluten, while gluten-free crust is a separate, specific type. The wording \'thin crust\' and \'gluten-free\' in the same request creates a direct conflict because they may not be simultaneously selectable. This requires confirmation to prioritize one constraint over the other, as proceeding without clarification risks violating the user\'s intent.", \'STATUS\': \'continue\', \'INTERACT\': \'Should I prioritize the thin crust or ensure the pizza is gluten-free?\'}'}], 'episode_id': 87, 'step_id': 0}
The error is raised by this check in conversation_to_ids:
if len(image_start_tokens) != len(image_end_tokens):
    logger.error("image start token != image end tokens")
    raise Exception(f"image start token != image end tokens, ({len(image_start_tokens)},{len(image_end_tokens)})")
The resulting error is: Exception: image start token != image end tokens, (4,3). I am using LLM_TYPE="qwen".
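To narrow this down, a check like the following could be run on the token ids just before the exception is raised. It is only a minimal sketch: the function name `find_unpaired_image_tokens` and the toy ids 101/102 are hypothetical, and the real start/end ids would have to be taken from the tokenizer in use. A count of (4,3) would be consistent with the sequence being cut off after the fourth start token (for example by a maximum-length truncation), but that is a guess and not something confirmed here.

```python
from typing import List, Sequence, Tuple


def find_unpaired_image_tokens(
    input_ids: Sequence[int], start_id: int, end_id: int
) -> Tuple[List[int], List[int]]:
    """Return positions of start tokens with no closing end token,
    and positions of end tokens with no preceding open start token."""
    open_starts: List[int] = []      # stack of positions of unclosed start tokens
    unmatched_ends: List[int] = []   # end tokens seen while nothing was open
    for pos, tok in enumerate(input_ids):
        if tok == start_id:
            open_starts.append(pos)
        elif tok == end_id:
            if open_starts:
                open_starts.pop()
            else:
                unmatched_ends.append(pos)
    return open_starts, unmatched_ends


# Toy example with hypothetical ids: 101 = image start, 102 = image end.
# Four starts and three ends, mirroring the (4,3) in the error message.
toy_ids = [101, 7, 102, 101, 7, 102, 101, 7, 102, 101, 7]
starts, ends = find_unpaired_image_tokens(toy_ids, start_id=101, end_id=102)
print("unclosed start positions:", starts)
print("unmatched end positions:", ends)
```

If the unclosed start sits near the very end of the sequence, comparing the sequence length with the configured maximum length would show whether truncation is what removed the last end token.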