feat: add cot data generation pipeline #42
Pull Request Overview
This PR adds a Chain-of-Thought (CoT) data generation pipeline for community-based reasoning. The implementation provides templates and operators for generating structured reasoning paths from knowledge graph communities, enabling the creation of training data for chain-of-thought reasoning.
Key changes include:
- Addition of CoT template design and generation prompts for both Chinese and English
- Implementation of community detection and CoT generation operators
- Code cleanup and standardization across existing template files
Reviewed Changes
Copilot reviewed 13 out of 17 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| requirements-dev.txt | Adds pytest dependency for development testing |
| graphgen/templates/community/cot_template_design.py | Defines prompts for designing CoT reasoning path templates |
| graphgen/templates/community/cot_generation.py | Defines prompts for generating CoT data from templates |
| graphgen/templates/community/__init__.py | Exports CoT-related prompt constants |
| graphgen/operators/community/generate_cot.py | Implements the main CoT generation pipeline using community detection |
| graphgen/models/community/community_detector.py | Implements Leiden algorithm for community detection |
| graphgen/models/vis/community_visualizer.py | Provides visualization capabilities for community graphs |
| graphgen/templates/coreference_resolution.py | Renames template constant and removes pylint disable |
| graphgen/templates/answer_rephrasing.py | Removes extra blank lines and fixes trailing commas |
| graphgen/templates/__init__.py | Updates imports to reflect renamed constants |
| graphgen/operators/resolute_coreference.py | Updates to use renamed constant and improves formatting |
| graphgen/models/__init__.py | Exports new CommunityDetector class |
| README.md | Updates acknowledgements section |
        )
        reasoning_path = cot_template.split("Reasoning-Path Design:")[1].strip()
    else:
        raise ValueError("COT template format is incorrect.")
Copilot
AI
Aug 13, 2025
The error message is not helpful for debugging. It should include information about what format was expected and what was actually received.
Suggested change:

    - raise ValueError("COT template format is incorrect.")
    + raise ValueError(
    +     f"COT template format is incorrect. Expected to find either '问题:' and '推理路径设计:' or 'Question:' and 'Reasoning-Path Design:' in the template. Received: {repr(cot_template)}"
    + )
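For reference, here is a minimal, self-contained sketch of how the parsing and the more descriptive error could fit together. It is not the code in this PR: the function name `parse_cot_template`, the `MARKERS` table, and the 200-character preview limit are assumptions made for illustration. Only the marker strings come from the suggestion above. The sketch also assumes the question marker appears before the reasoning-path marker.

```python
from __future__ import annotations

# Marker pairs taken from the suggested error message; the table itself is a
# hypothetical structure, not something defined in the PR.
MARKERS = (
    ("问题:", "推理路径设计:"),
    ("Question:", "Reasoning-Path Design:"),
)


def parse_cot_template(cot_template: str) -> tuple[str, str]:
    """Split a template into (question, reasoning_path).

    Assumes the question marker precedes the reasoning-path marker.
    """
    for question_marker, path_marker in MARKERS:
        if question_marker in cot_template and path_marker in cot_template:
            # Everything after the first path marker is the reasoning path.
            head, _, reasoning_path = cot_template.partition(path_marker)
            # Everything between the question marker and the path marker
            # is the question.
            question = head.partition(question_marker)[2]
            return question.strip(), reasoning_path.strip()

    # No marker pair matched: say what was expected and what was received.
    expected = " or ".join(f"{q!r} and {p!r}" for q, p in MARKERS)
    preview = cot_template[:200]  # assumed limit to keep log lines bounded
    raise ValueError(
        f"COT template format is incorrect. Expected {expected}. "
        f"Received (first 200 chars): {preview!r}"
    )
```

Truncating the received text is a design choice, not part of the suggestion: LLM output can be long, and a bounded preview is usually enough to see which marker is missing.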
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
No description provided.