English | 한국어
Transform research papers into executable code with a structured, multi-phase pipeline.
Great paper-to-code tools such as DeepCode already exist, but every run incurs separate API costs. If you already use Claude Code with a subscription, why pay extra?
This skill brings the same structured, multi-phase approach directly into Claude Code: no additional API costs, just your existing subscription.
Paper2Code is a Claude Code Skill that systematically converts research papers (PDF/arXiv) into fully functional, reproducible code. Unlike naive approaches that simply feed papers to an LLM, this skill uses a structured intermediate representation (YAML) to ensure accuracy and completeness.
- 4+2 Phase Pipeline: Algorithm Extraction → Concept Analysis → Code Planning → Implementation (+ Reference Search & Memory Management)
- Structured YAML Intermediate Representation: Not just code generation, but systematic knowledge extraction
- Self-Check Mechanisms: Built-in verification at each phase to ensure completeness
- Behavioral Controls: DO/DON'T guidelines to prevent common implementation mistakes
- Reference-Augmented Generation: Optional reference code search for implementation quality
Option 1: Personal Installation (Recommended)
# Clone the repository
git clone https://github.com/issol14/paper2code-skill.git
# Copy to your Claude skills directory
cp -r paper2code-skill/skills/paper2code ~/.claude/skills/
Option 2: Project Installation
# Add to your project's .claude/skills directory
mkdir -p .claude/skills
cp -r paper2code-skill/skills/paper2code .claude/skills/
Option 3: Let Claude do it for you
Just paste this into Claude Code:
Install the paper2code skill from https://github.com/issol14/paper2code-skill
For LLM Agents
If you're an LLM Agent helping a user install paper2code-skill, follow these steps:
# Check if pdftotext is available (optional but recommended)
command -v pdftotext &> /dev/null && echo "pdftotext: OK" || echo "pdftotext: Not installed (optional)"
If pdftotext is not installed, inform the user:
pdftotext is recommended for PDF processing. Install with:
- Ubuntu/Debian: sudo apt install poppler-utils
- macOS: brew install poppler
- Windows: Install poppler and add to PATH
# Create the Claude skills directory if it doesn't exist
mkdir -p ~/.claude/skills
# Clone the repository
git clone https://github.com/issol14/paper2code-skill.git /tmp/paper2code-skill
# Copy skill files to Claude skills directory
cp -r /tmp/paper2code-skill/skills/paper2code ~/.claude/skills/
# Clean up
rm -rf /tmp/paper2code-skill
# Verify the skill files exist
ls ~/.claude/skills/paper2code/
Expected output should show:
01_algorithm_extraction.md
02_concept_analysis.md
03_code_planning.md
04_implementation_guide.md
05_reference_search.md
06_memory_management.md
README.md
SKILL.md
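The manual `ls` check above can also be scripted. The sketch below assumes that the eight file names listed above are the complete expected set. The function name `check_paper2code_install` is hypothetical and not part of the skill.

```shell
# Hypothetical helper: report any expected skill file that is missing.
# Usage: check_paper2code_install [skill_dir]
# Returns 0 when every file is present, 1 otherwise.
check_paper2code_install() {
    skill_dir="${1:-$HOME/.claude/skills/paper2code}"
    missing=0
    for f in \
        01_algorithm_extraction.md 02_concept_analysis.md \
        03_code_planning.md 04_implementation_guide.md \
        05_reference_search.md 06_memory_management.md \
        README.md SKILL.md
    do
        if [ ! -f "$skill_dir/$f" ]; then
            echo "missing: $f"
            missing=1
        fi
    done
    if [ "$missing" -eq 0 ]; then
        echo "paper2code: all files present"
    fi
    return "$missing"
}
```

Calling `check_paper2code_install` with no argument checks the personal installation path. Pass `.claude/skills/paper2code` to check a project installation instead.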
Tell the user:
paper2code-skill has been installed successfully!
You can now use it by providing a paper URL or PDF path:
- "https://arxiv.org/abs/2301.12345 implement this paper"
- "/path/to/paper.pdf implement this paper"
The skill will automatically activate when you request paper implementation.
Once installed, Claude Code will automatically activate the skill when you request paper implementation:
# From arXiv URL
"https://arxiv.org/abs/2301.12345 implement this paper"
# From PDF file
"/path/to/paper.pdf implement the algorithm from this paper"
# Specific section
"Implement only Algorithm 2 from Section 3 of this paper"
User: https://arxiv.org/abs/2312.00752 implement this paper
Claude: I'll analyze the paper and convert it to code.
[Phase 1: Extracting algorithms...]
→ Saved 01_algorithm_extraction.yaml
[Phase 2: Analyzing concepts...]
→ Saved 02_concept_analysis.yaml
[Phase 3: Creating implementation plan...]
→ Saved 03_implementation_plan.yaml
[Phase 4: Implementing code...]
→ Created config.py
→ Created models/network.py
→ ...
→ Created main.py
→ Created README.md
Implementation complete. Run with `python main.py`.
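If you would rather give the skill a local PDF than a URL, you can download the paper first. This is a minimal sketch that assumes the standard arXiv URL layout, where the paper at `/abs/<id>` has its PDF at `/pdf/<id>`. The function name `arxiv_pdf_url` is hypothetical, and the skill itself may fetch papers differently.

```shell
# Hypothetical helper: map an arXiv abstract URL to its PDF URL.
# Prints the PDF URL, or returns 1 for anything that is not an /abs/ URL.
arxiv_pdf_url() {
    case "$1" in
        https://arxiv.org/abs/*)
            echo "https://arxiv.org/pdf/${1#https://arxiv.org/abs/}"
            ;;
        *)
            echo "not an arXiv abstract URL: $1" >&2
            return 1
            ;;
    esac
}

# Example (requires network access, curl, and pdftotext):
#   curl -L -o paper.pdf "$(arxiv_pdf_url https://arxiv.org/abs/2312.00752)"
#   pdftotext -layout paper.pdf paper.txt
```

The resulting `paper.pdf` path can then be used in a request such as "/path/to/paper.pdf implement this paper".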
User: Implement this paper. First, search for similar implementations.
Claude: I'll search for reference code before implementing.
[Phase 0: Searching reference code...]
→ Found 5 related implementations
→ Saved reference_search.yaml
[Proceeding with Phase 1-4...]
User: Implement only the Self-Attention part from Algorithm 2
Claude: I'll focus on implementing Self-Attention from Algorithm 2.
[Extracting and implementing the specific algorithm...]
After implementation, you'll get:
paper_workspace/
├── 01_algorithm_extraction.yaml   # Extracted algorithms & equations
├── 02_concept_analysis.yaml       # Paper structure analysis
├── 03_implementation_plan.yaml    # Detailed implementation plan
└── src/
    ├── config.py                  # Hyperparameters & settings
    ├── models/
    │   ├── __init__.py
    │   └── network.py             # Neural network architecture
    ├── algorithms/
    │   └── core.py                # Main algorithm implementation
    ├── training/
    │   ├── losses.py              # Loss functions
    │   └── trainer.py             # Training loop
    ├── evaluation/
    │   └── metrics.py             # Evaluation metrics
    ├── main.py                    # Entry point
    ├── requirements.txt           # Dependencies
    └── README.md                  # Usage documentation
     [Paper Input: PDF/arXiv URL]
                   │
                   ▼
┌─────────────────────────────────────┐
│ Phase 0: Reference Search (Optional)│
│ → Find similar implementations      │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│ Phase 1: Algorithm Extraction       │
│ → Extract all algorithms, equations │
│ → Output: YAML specification        │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│ Phase 2: Concept Analysis           │
│ → Map paper structure               │
│ → Identify components & experiments │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│ Phase 3: Implementation Plan        │
│ → 5-section detailed plan           │
│ → File structure & dependencies     │
└─────────────────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────┐
│ Phase 4: Code Implementation        │
│ → File-by-file implementation       │
│ → Complete, runnable codebase       │
└─────────────────────────────────────┘
paper2code/
├── SKILL.md                       # Main skill entry point
├── 01_algorithm_extraction.md     # Phase 1: Algorithm extraction protocol
├── 02_concept_analysis.md         # Phase 2: Paper structure analysis
├── 03_code_planning.md            # Phase 3: Implementation planning
├── 04_implementation_guide.md     # Phase 4: Code generation guide
├── 05_reference_search.md         # Phase 0: Reference code search (optional)
└── 06_memory_management.md        # Context/memory management guide
| Aspect | Naive Approach | Paper2Code Skill |
|---|---|---|
| Process | Direct paper → code | Structured multi-phase pipeline |
| Intermediate | None | YAML knowledge representation |
| Verification | Manual | Built-in self-check at each phase |
| Completeness | Often partial | Systematic with checklists |
| Reproducibility | Inconsistent | Explicit success criteria |
DO:
✓ Implement exactly what the paper specifies
✓ Write simple, direct code
✓ Test each component immediately
✓ Move to next file without asking permission
DON'T:
✗ Ask "Should I implement the next file?"
✗ Over-engineer or add unnecessary abstractions
✗ Skip unclear parts (document in missing_but_critical)
✗ Guess parameter values not in the paper
- Completeness: No placeholders or TODOs
- Accuracy: Exact equations, parameters from paper
- Executability: Code runs without errors
- Reproducibility: Can reproduce paper results
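The completeness criterion above can be spot-checked from the command line. The following is a rough sketch, not part of the skill. The function name `find_placeholders` is hypothetical, the default directory is taken from the example output tree, and the three search patterns are an assumption about what a leftover placeholder looks like in Python code.

```shell
# Hypothetical helper: list lines that look like unfinished code.
# Returns 0 when nothing suspicious is found, 1 when matches are printed,
# and 2 when the directory does not exist.
find_placeholders() {
    src_dir="${1:-paper_workspace/src}"
    if [ ! -d "$src_dir" ]; then
        echo "no such directory: $src_dir" >&2
        return 2
    fi
    # -r: recurse, -n: show line numbers, -E: extended regex
    if grep -rnE 'TODO|FIXME|NotImplementedError' "$src_dir"; then
        return 1
    fi
    return 0
}
```

A clean result only shows that none of these markers are present. It does not show that the code runs or reproduces the paper.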
- Claude Code with a Claude subscription
- pdftotext (for PDF processing):
sudo apt install poppler-utils
Q: What types of papers work best?
Primarily optimized for ML/DL research papers, but works with any paper that has clearly described algorithms:
- Deep learning models (Transformer, CNN, GNN, etc.)
- Reinforcement learning algorithms
- Optimization algorithms
- Data processing pipelines
Q: What if the implementation differs from the paper?
- Check the generated YAML files to verify algorithm extraction accuracy
- Look for missing information in the missing_but_critical section
- Provide the paper's Appendix or Supplementary Material
- Request re-implementation of specific parts: "Re-implement the loss calculation in Algorithm 2"
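To locate the missing_but_critical entries mentioned above without opening each file, a plain `grep` over the workspace is usually enough. This sketch assumes the YAML files sit at the top level of `paper_workspace/`, as in the example output tree. The function name `show_missing_but_critical` is hypothetical, and the five lines of trailing context are an arbitrary choice because the layout under that key is not documented here.

```shell
# Hypothetical helper: print each missing_but_critical key with the
# lines that follow it, prefixed by file name and line number.
show_missing_but_critical() {
    workspace="${1:-paper_workspace}"
    # -n: line numbers, -A 5: five lines of context after each match
    grep -n -A 5 'missing_but_critical' "$workspace"/*.yaml
}
```

If nothing is printed, either no gaps were recorded or the key appears under a different name in your files.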
Q: Can it handle long papers?
Yes. The skill follows the guidelines in 06_memory_management.md:
- Section-by-section analysis
- Context management through intermediate YAML saves
- Recoverable checkpoints when needed
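Because each phase saves its YAML before the next one starts, the files on disk show roughly where an interrupted run stopped. The sketch below uses the file names from the example output tree and assumes a phase is finished once its file exists. That is a simplification, and the recovery procedure in 06_memory_management.md may differ. The function name `next_phase` is hypothetical.

```shell
# Hypothetical helper: print the number of the first phase whose
# YAML output is not yet present in the workspace.
# Prints 4 when all three intermediate files exist.
next_phase() {
    workspace="${1:-paper_workspace}"
    if [ ! -f "$workspace/01_algorithm_extraction.yaml" ]; then
        echo 1
    elif [ ! -f "$workspace/02_concept_analysis.yaml" ]; then
        echo 2
    elif [ ! -f "$workspace/03_implementation_plan.yaml" ]; then
        echo 3
    else
        echo 4
    fi
}
```

For example, if `next_phase paper_workspace` prints 3, the first two YAML files are already saved and you can ask Claude to continue from the implementation plan.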
Q: When should I use reference code search?
Useful when:
- The paper lacks implementation details
- You need specific framework patterns
- You want to reference implementation tricks for complex algorithms
Request it by saying "Also search for similar implementations" or "Find reference code first".
Q: How is code quality ensured?
Each Phase has built-in Self-Check mechanisms:
- Phase 1: Verify all algorithms/equations extracted
- Phase 2: Confirm component relationships and experiment requirements
- Phase 3: Check 5 required sections and content balance
- Phase 4: Final completion checklist (executability, reproducibility, etc.)
This skill was inspired by DeepCode from HKU Data Intelligence Lab, which pioneered the structured approach to paper-to-code conversion with multi-agent orchestration.
MIT License - See LICENSE for details.
Contributions are welcome! Please feel free to submit issues or pull requests.
Note: This skill is designed for use with Claude Code. For information about the Agent Skills standard, see agentskills.io.