
@coolmian
Contributor

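When data structures are defined with Pydantic, non-ASCII text in field descriptions is escaped into `\uXXXX` sequences while the model metadata is converted to a JSON schema, and those escaped strings end up in the prompt. This change keeps the original characters in the generated prompt.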
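As a rough illustration of the escaping described here, the sketch below uses only the standard library. It assumes the schema text is produced with `json.dumps`, which this description does not confirm; the `schema` dict is a hypothetical fragment, not dspy's actual internal structure.

```python
import json

# Hypothetical schema fragment with a non-ASCII description
# (an assumption for illustration, not dspy's internal data).
schema = {"type": "string", "description": "小说内容"}

# Default behaviour: ensure_ascii=True escapes every non-ASCII character.
escaped = json.dumps(schema)

# With ensure_ascii=False the original characters are kept as-is.
readable = json.dumps(schema, ensure_ascii=False)

print(escaped)
print(readable)
```

Both strings are valid JSON for the same object; only the second keeps the description readable when it is embedded in a prompt.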

My case:

from typing import Literal

import dspy
from pydantic import BaseModel, Field


class Narrative(BaseModel):
    type: Literal["dialogue", "narration", "voiceover"] = Field(description="表示内容类型(narration:叙述, dialogue:对话, voiceover:内心独白)")
    content: str = Field(description="具体内容(注意第一人称转换)")
    reaction: str | None = Field(default=None, description="情绪/反应(仅dialogue和voiceover需要)")
    name: str | None = Field(default=None, description="说话人名字(仅dialogue和voiceover需要)")


class StoryToJSON(dspy.Signature):
    """
    Convert story text into structured JSON format with specific fields for narration, dialogue, and voiceover.
    Make the performance more like a script or animation script style, help the performer better understand the character's emotions and reactions, and make the content more expressive and situational.
    NOTE: Convert each paragraph based on the story_text without skipping or omitting any content.
    """
    story_text = dspy.InputField(desc="小说内容")
    json_output: list[Narrative] = dspy.OutputField(desc="输出json list")

# Define the predictor.
predictor = dspy.Predict(StoryToJSON)

System prompt generated by dspy

before (scroll right to the JSON schema section and note the escaped description strings):

Your input fields are:
1. `story_text` (str): 小说内容

Your output fields are:
1. `json_output` (list[Narrative]): 输出json list

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## story_text ## ]]
{story_text}

[[ ## json_output ## ]]
{json_output}        # note: the value you produce must be pareseable according to the following JSON schema: {"type": "array", "$defs": {"Narrative": {"type": "object", "properties": {"type": {"type": "string", "description": "\u8868\u793a\u5185\u5bb9\u7c7b\u578b(narration:\u53d9\u8ff0, dialogue:\u5bf9\u8bdd, voiceover:\u5185\u5fc3\u72ec\u767d)", "enum": ["dialogue", "narration", "voiceover"], "title": "Type"}, "content": {"type": "string", "description": "\u5177\u4f53\u5185\u5bb9(\u6ce8\u610f\u7b2c\u4e00\u4eba\u79f0\u8f6c\u6362)", "title": "Content"}, "name": {"anyOf": [{"type": "string"}, {"type": "null"}], "default": null, "description": "\u8bf4\u8bdd\u4eba\u540d\u5b57(\u4ec5dialogue\u548cvoiceover\u9700\u8981)", "title": "Name"}, "reaction": {"anyOf": [{"type": "string"}, {"type": "null"}], "default": null, "description": "\u60c5\u7eea/\u53cd\u5e94(\u4ec5dialogue\u548cvoiceover\u9700\u8981)", "title": "Reaction"}}, "required": ["type", "content"], "title": "Narrative"}}, "items": {"$ref": "#/$defs/Narrative"}}

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Convert story text into structured JSON format with specific fields for narration, dialogue, and voiceover.
        Make the performance more like a script or animation script style, help the performer better understand the character's emotions and reactions, and make the content more expressive and situational.
        NOTE: Convert each paragraph based on the story_text without skipping or omitting any content.

after:

Your input fields are:
1. `story_text` (str): 小说内容

Your output fields are:
1. `json_output` (list[Narrative]): 输出json list

All interactions will be structured in the following way, with the appropriate values filled in.

[[ ## story_text ## ]]
{story_text}

[[ ## json_output ## ]]
{json_output}        # note: the value you produce must be pareseable according to the following JSON schema: {"type": "array", "$defs": {"Narrative": {"type": "object", "properties": {"type": {"type": "string", "description": "表示内容类型(narration:叙述, dialogue:对话, voiceover:内心独白)", "enum": ["dialogue", "narration", "voiceover"], "title": "Type"}, "content": {"type": "string", "description": "具体内容(注意第一人称转换)", "title": "Content"}, "name": {"anyOf": [{"type": "string"}, {"type": "null"}], "default": null, "description": "说话人名字(仅dialogue和voiceover需要)", "title": "Name"}, "reaction": {"anyOf": [{"type": "string"}, {"type": "null"}], "default": null, "description": "情绪/反应(仅dialogue和voiceover需要)", "title": "Reaction"}}, "required": ["type", "content"], "title": "Narrative"}}, "items": {"$ref": "#/$defs/Narrative"}}

[[ ## completed ## ]]

In adhering to this structure, your objective is: 
        Convert story text into structured JSON format with specific fields for narration, dialogue, and voiceover.
        Make the performance more like a script or animation script style, help the performer better understand the character's emotions and reactions, and make the content more expressive and situational.
        NOTE: Convert each paragraph based on the story_text without skipping or omitting any content.
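The two prompts differ only in how the schema text is written, not in what the schema means. A minimal check of that, using a one-field fragment taken from the schemas above (the variable names `before` and `after` are just for illustration):

```python
import json

# One description value as it appears in the "before" and "after" schemas.
before = r'{"description": "\u53d9\u8ff0"}'  # escaped form (raw string keeps the backslashes)
after = '{"description": "叙述"}'             # unescaped form

# Both texts decode to the same Python object.
same = json.loads(before) == json.loads(after)
print(same)

# The escaped text is longer, since each character becomes a six-character escape.
print(len(before), len(after))
```

A JSON parser treats the two forms identically, so the change affects only the text shown to the language model.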

@okhat okhat merged commit 793530c into stanfordnlp:main Nov 11, 2024
4 checks passed
2 participants