fix(llm): fix CN prompt output example format to match EN version #340

Closed
linmengmeng-1314 wants to merge 3 commits into apache:main from linmengmeng-1314:fix/cn-prompt-output-example

Conversation

linmengmeng-1314 (Contributor) commented May 20, 2026

Summary

  • Fix CN property graph extraction prompt output example: change from flat array [{...}, {...}] to structured format {"vertices":[...], "edges":[...]}
  • Add missing "id" field in edge example to keep CN/EN consistent

Problem

The CN prompt (graph_extract_prompt_CN) used a flat array format for the output example, while:

  • The EN prompt (graph_extract_prompt_EN) uses {"vertices":[...], "edges":[...]}
  • The actual parsing logic in property_graph_extract.py expects the structured format

This mismatch could cause the LLM to emit the wrong JSON structure when processing Chinese text.
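To make the mismatch concrete, here is a minimal sketch of the two shapes. Only the top-level layout and the presence of an "id" field on edges come from the description above; the other item fields (type, label, properties) and the sample values are illustrative assumptions, not copied from the prompts.

```python
import json

# Shape of the old CN example: one flat array mixing vertices and edges.
# The item fields here are hypothetical placeholders.
flat_output = json.loads("""
[
  {"type": "vertex", "label": "person", "properties": {"name": "Tom"}},
  {"type": "edge", "label": "knows", "properties": {}}
]
""")

# Shape of the EN example: an object with separate "vertices" and "edges" lists.
structured_output = json.loads("""
{
  "vertices": [{"id": "1:Tom", "label": "person", "properties": {"name": "Tom"}}],
  "edges": [{"id": "e1", "label": "knows", "properties": {}}]
}
""")


def has_structured_shape(data):
    """Return True if data is an object with list-valued "vertices" and "edges"."""
    return (
        isinstance(data, dict)
        and isinstance(data.get("vertices"), list)
        and isinstance(data.get("edges"), list)
    )
```

A parser that looks up the "vertices" and "edges" keys would accept the second value and reject the first, which is why the example in the prompt needs to show the structured form.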

Test plan

  • Verify that ruff format and ruff check pass
  • Confirm CN output example now matches EN structure

🤖 Generated with Claude Code

huaun-develop and others added 3 commits May 18, 2026 19:57
…tputs

Different LLMs return graph extraction results in varying formats:
- Some wrap JSON in markdown code blocks (```json ... ```)
- Some return a flat array of vertices/edges instead of a structured object

This causes json.JSONDecodeError when the greedy regex ({.*}) captures
invalid content from markdown-wrapped or array-formatted responses.

Changes:
- Strip markdown code blocks before JSON extraction
- Support both object ({...}) and array ([...]) JSON formats
- Auto-convert flat arrays to {"vertices": [...], "edges": [...]} format

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The CN property graph extraction prompt used a flat array format
[{...}, {...}] for the output example, while the EN version and the
actual parsing logic expect {"vertices":[...], "edges":[...]}.
Also add missing "id" field in edge example to keep CN/EN consistent.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
dosubot (bot) added the size:S and bug labels May 20, 2026
linmengmeng-1314 (Contributor, Author)

Closing this PR because of a branch conflict. Replaced by #341, which uses a clean branch.

Labels

bug, llm, size:S

2 participants