Turn hard, dense papers into high-quality Obsidian notes you can actually understand, reuse, and keep.
When a paper is full of dense formulas, crowded architecture diagrams, tangled experimental design, and pages of ablations, the problem is often not that the paper is unimportant. It is that the paper is simply hard to digest.
DeepPaperNote is a Codex skill built for that problem. It is not trying to paraphrase the abstract one more time. It is trying to reorganize the parts of a paper that are actually worth understanding and keeping.
Instead of pretending to read the paper for you, it takes over the most time-consuming and error-prone layers of the workflow:
- 🤖 Model-led understanding: let the language model unpack mechanisms, method structure, key comparisons, and limitations.
- 🗂️ Automatic evidence collection: gather evidence from PDFs, metadata sources, and optional Zotero workflows.
- 💎 Reusable note output: generate structured Obsidian-native or plain Markdown notes that are worth revisiting later.
Let scripts handle the repetitive work. Save your attention for actual thinking.
> **Tip**
> If you already have an Obsidian or Zotero workflow, DeepPaperNote is not trying to replace it. It is trying to automate the most tedious parts of evidence gathering, structuring, and note drafting.
Most paper-summary tools stop after producing a neat-looking abstract rewrite. DeepPaperNote cares about two harder questions: did you actually understand the paper? and is the resulting note worth keeping?
| Capability | What pain it solves |
|---|---|
| 💡 Make complex mechanisms legible | Instead of paraphrasing the abstract, it breaks down the method backbone, key design choices, real contribution, and the most likely points of confusion. |
| 🧪 Go beyond surface-level summaries | It forces attention onto the research question, task definition, core results, and honest limitations. |
| 🖼️ Keep figure context intact | When figure extraction is unstable, it still preserves accurate figure placeholders and explanations so the reading flow does not collapse. |
| 🔗 Fit into your personal knowledge base | Each paper gets its own folder, local images/ directory, and Markdown note that works naturally inside Obsidian. |
| 📚 Local-library-first, more reliable and often faster | If the paper already exists in Zotero, DeepPaperNote can often reuse local records and attachments instead of rediscovering everything from the web. |
- People who regularly wrestle with hard, technical, cross-domain, or high-density papers
- People who often feel, "I know every word here, but I still do not understand the paragraph"
- People who do not want vague AI summaries and instead want to understand mechanisms, results, and boundaries
- People who want to build a reusable local paper-knowledge base in Obsidian
Clone this repository into your Codex skills directory:

```shell
git clone https://github.com/917Dhj/DeepPaperNote.git ~/.codex/skills/DeepPaperNote
```

Then restart Codex.
After that, just hand a paper to Codex. A title, DOI, URL, arXiv ID, or local PDF all work.
Typical prompts:
- "Generate a deep note for this paper"
- "Turn this article into an Obsidian note"
- "Read this paper and generate a Markdown note"
- "Turn this paper into a note I will actually come back to"
By default, DeepPaperNote will:
- resolve the paper identity
- gather metadata and PDF evidence
- plan figure placeholders and attempt high-confidence figure replacement
- generate the final Markdown note
- save it into Obsidian when configured, or automatically fall back to the current directory
If you want the Python dependencies for local development:

```shell
python3 -m pip install -e .
```

After installation, you can also ask Codex with short prompts such as:
- `/deeppapernote doctor`
- `/deeppapernote start`
- "Check whether deeppapernote is available"
- "What can deeppapernote do?"
In that mode, DeepPaperNote should explain its capabilities, inspect the current setup, and tell you what is already configured or still missing.
If you want a more explicit onboarding prompt, see ONBOARDING_PROMPT.md.
DeepPaperNote can be tried with zero configuration.
- if no Obsidian vault is configured, it can still save notes into the current working directory
- if you want an Obsidian-native long-term workflow, you should configure your vault path
- everything else in this section is optional and improves specific workflows
The cleanest setup is:

```shell
export DEEPPAPERNOTE_OBSIDIAN_VAULT="/absolute/path/to/your/Obsidian_Documents"
```

🛠️ Show advanced configuration (directories / Zotero MCP / Semantic Scholar / OCR)
If you want to customize paper output paths or intermediate artifact paths:
```shell
export DEEPPAPERNOTE_PAPERS_DIR="20_Research/Papers"
export DEEPPAPERNOTE_OUTPUT_DIR="tmp/DeepPaperNote"
```

| Variable | Required | Purpose |
|---|---|---|
| `DEEPPAPERNOTE_OBSIDIAN_VAULT` | Recommended | Root path of your Obsidian vault |
| `DEEPPAPERNOTE_PAPERS_DIR` | Optional | Vault-relative paper output folder, default: `20_Research/Papers` |
| `DEEPPAPERNOTE_OUTPUT_DIR` | Optional | Local temporary artifact directory, default: `tmp/DeepPaperNote` |
| `DEEPPAPERNOTE_WORKSPACE_OUTPUT_DIR` | Optional | Fallback output folder under the current working directory when no Obsidian vault is configured, default: `DeepPaperNote_output` |
Why the optional path settings can help:
- `DEEPPAPERNOTE_PAPERS_DIR`: useful if your vault does not store papers under `20_Research/Papers`, or if you want DeepPaperNote to fit an existing folder convention without extra manual moves.
- `DEEPPAPERNOTE_OUTPUT_DIR`: useful if you want all intermediate artifacts in a predictable location for debugging, cleanup, or experimentation.
DeepPaperNote can work without Zotero. But if you want Codex to search your local Zotero library first, you should configure a Zotero MCP option that Codex can actually use.
This is most worth setting up if you already use Zotero as your main paper-management or reading workflow.
Recommended ways to think about it:
| Option | Best for | Notes |
|---|---|---|
| kujenga/zotero-mcp | Lightweight read access | Closer to a minimal Zotero MCP server for search, metadata, and text access, but not natively designed for Codex, so it usually still needs some adaptation |
| 54yyyu/zotero-mcp | Richer research workflow features | More feature-rich, but also not natively built for Codex, so using it well in Codex usually requires additional adaptation |
Why it matters:
- local Zotero hits are usually the best identity anchor
- if the paper is already in your local Zotero library, DeepPaperNote can often reuse local records and attachments instead of searching and downloading again, which also tends to make note generation faster
- Codex can prefer your local paper library before internet search
- local attachments can reduce wrong-title matches
- it is especially helpful when you already curate papers in Zotero and do not want DeepPaperNote to rediscover the same paper from weaker web matches
- it also improves reliability for published papers whose title may collide with preprints, workshop versions, or mirrored pages
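The local-first preference above can be sketched as a simple resolution order. Function and field names here are illustrative, not DeepPaperNote's actual API:

```python
def resolve_paper(query: str, zotero_index: dict[str, dict]) -> dict:
    """Prefer a local Zotero hit before falling back to web search (sketch)."""
    local = zotero_index.get(query.strip().lower())
    if local is not None:
        # A local record is the strongest identity anchor: it already has
        # curated metadata and, usually, a trusted attachment.
        return {"source": "zotero", **local}
    # Only when the local library misses does the workflow go to the web.
    return {"source": "web", "title": query}
```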
Important note:
- DeepPaperNote does not require one specific Zotero MCP implementation
- for DeepPaperNote, the key capability is that Codex can search Zotero items, inspect metadata, and ideally read local full text
- the two routes above are not plug-and-play Codex-native options today, so stable use in Codex usually requires some adaptation on your side
This is not required, but if you have a Semantic Scholar API key you can expose it as:
```shell
export DEEPPAPERNOTE_SEMANTIC_SCHOLAR_API_KEY="your_api_key"
```

Why it can help:
- metadata lookup is usually more stable when Semantic Scholar is available
- title-based paper resolution can be more reliable for hard-to-match papers
- author, venue, and abstract backfill may be more complete in some cases
- it gives DeepPaperNote one more strong source before falling back to weaker guesses
OCR is not required for many modern PDFs. But it becomes useful when a paper is:
- a scanned PDF
- an image-based PDF with poor embedded text
- an older paper where direct text extraction is incomplete
Why DeepPaperNote uses OCR:
- to recover page text when direct PDF extraction is too weak
- to preserve method and results evidence that would otherwise be lost
- to improve page-level context around figures and captions
Current OCR logic in DeepPaperNote:
- DeepPaperNote first tries normal PDF text extraction with PyMuPDF
- for each page, it counts how much searchable text was extracted
- if a page has too little extracted text, it becomes an OCR fallback candidate
- OCR is then applied to that page only
- the recovered OCR text is mainly used as page context for later evidence handling and figure/page semantic matching
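The per-page fallback decision above can be sketched in a few lines. The 120-character threshold is an assumption for illustration, not the value DeepPaperNote actually uses:

```python
MIN_CHARS_PER_PAGE = 120  # illustrative threshold

def ocr_candidates(page_texts: list[str]) -> list[int]:
    """Return indices of pages whose extracted text is too thin to trust."""
    return [
        i for i, text in enumerate(page_texts)
        if len(text.strip()) < MIN_CHARS_PER_PAGE
    ]
```

Pages flagged this way would then be rendered to images and OCRed individually, leaving well-extracted pages untouched.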
Important scope note:
- OCR is currently a page-text fallback
- it is not the primary extraction path for all PDFs
- it is not used as a replacement for model-side understanding
- it is not used to "understand images" directly
Without OCR, DeepPaperNote still works well on normal digital PDFs, but scanned or low-quality PDFs may produce weaker evidence.
Required software and packages for OCR:
| Layer | Requirement | Purpose |
|---|---|---|
| System tool | `tesseract` | The actual OCR engine |
| Python package | `pytesseract` | Python bridge to `tesseract` |
| Python package | `Pillow` | Opens rendered page images before OCR |
| Existing PDF layer | `PyMuPDF` | Renders pages and extracts normal PDF text |
Install on macOS:

```shell
brew install tesseract
python3 -m pip install --user pytesseract Pillow
```

Install on Windows:

```shell
winget install UB-Mannheim.TesseractOCR
py -m pip install --user pytesseract Pillow
```

If winget is unavailable, install Tesseract OCR manually and then run:

```shell
py -m pip install --user pytesseract Pillow
```

Quick verification:

```shell
tesseract --version
python3 -c "import pytesseract, PIL; print('python_ok')"
python3 -c "import pytesseract; print(pytesseract.get_tesseract_version())"
```

For release-level updates, see CHANGELOG.md.
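As a complement to the shell checks, a small Python probe can confirm the whole OCR stack (package plus binary) in one call. This is a hedged sketch, not a DeepPaperNote API:

```python
def ocr_available() -> bool:
    """True only if both pytesseract and a working tesseract binary respond."""
    try:
        import pytesseract
        # Raises TesseractNotFoundError if the binary is missing from PATH.
        pytesseract.get_tesseract_version()
        return True
    except Exception:  # missing package or missing tesseract binary
        return False
```

A pipeline could call a check like this once at startup and simply skip the OCR fallback when it returns `False`, keeping normal digital PDFs working.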
| Version | Status | Highlights |
|---|---|---|
| v0.1.0-alpha | ✅ Released | First public alpha: Codex workflow, synthesis bundle pipeline, Zotero-first helpers, placeholder-first figure handling, workspace fallback, OCR fallback, tests, and CI |
| Unreleased | 🕒 No user-facing changes yet | No unreleased release-level changes at the moment |
Most paper-summary workflows stop too early:
- they overfit to the abstract
- they flatten technical details into generic bullets
- they silently skip figures when extraction is messy
- they produce notes that look neat but are not useful a week later
DeepPaperNote takes a different stance:
- scripts gather, normalize, and verify evidence
- Codex/GPT does the actual understanding and writing
- figure handling is placeholder-first
- text correctness matters more than image completeness
The goal is not "summarize a paper". The goal is "produce a note you would actually keep in a serious research vault".
| Feature | What it means in practice |
|---|---|
| Model-first understanding | Scripts do deterministic work and do not pretend to understand the paper better than the model. |
| Deep-reading notes | The note should reconstruct the paper's argument, not paraphrase the abstract. |
| Figure placeholder-first | Major figures and tables should stay in the note structure even when extraction is partial. |
| Obsidian-native output | Each paper gets its own folder with a note file and local images/ directory. |
| Zotero-first | If the paper exists in local Zotero, use that as the most reliable identity anchor first. |
The default workflow is:
- resolve the paper identity
- collect metadata
- acquire the PDF or strong full-text evidence
- extract evidence
- extract PDF image assets
- plan figure placement
- build a synthesis bundle
- let Codex/GPT write the note
- lint the final note
- write it into Obsidian
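The ten steps above can be sketched as an ordered pipeline. The step names follow the scripts listed later in this README; the runner itself is a placeholder, not the real `run_pipeline.py`:

```python
# Illustrative step order; real steps would mutate shared state
# (metadata, evidence, figure plan, synthesis bundle, ...).
PIPELINE = [
    "resolve_paper", "collect_metadata", "fetch_pdf", "extract_evidence",
    "extract_pdf_assets", "plan_figures", "build_synthesis_bundle",
    "write_note", "lint_note", "write_obsidian_note",
]

def run_pipeline(paper_id: str, steps=PIPELINE) -> dict:
    """Thread a shared state dict through each step, in order (sketch)."""
    state = {"paper": paper_id, "completed": []}
    for step in steps:
        state["completed"].append(step)
    return state
```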
Core principle:
- scripts gather evidence
- the model writes
- lint guards quality before save
See also:
DeepPaperNote uses a placeholder-first strategy.
If a major figure matters, the note should preserve it even when extraction is imperfect.
Preferred placeholder format:
```markdown
> [!figure] Fig. 3 Data distribution and quality assessment
> Suggested location: data and task definition
> Placement reason: this figure shows sample composition, dialogue-length statistics, and expert quality-check results together, making it one of the most important figures for understanding the boundaries of the `PsyInterview` data.
> Current status: placeholder kept; the current extraction only recovered partial subfigures and cannot reliably reconstruct a complete, independently interpretable figure.
```

Rule of thumb:
- figures may be partial
- figures may be missing
- text must stay accurate
See Figure Placement.
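Emitting such a callout is mechanical once the fields are known. A hedged sketch, with a hypothetical function name, following the preferred format shown above:

```python
def figure_placeholder(fig_id: str, caption: str, section: str, reason: str,
                       status: str = "placeholder kept") -> str:
    """Render a placeholder-first figure callout in Obsidian syntax (sketch)."""
    return "\n".join([
        f"> [!figure] {fig_id} {caption}",
        f"> Suggested location: {section}",
        f"> Placement reason: {reason}",
        f"> Current status: {status}",
    ])
```

Because the callout carries its own reason and status lines, the note stays readable even when the image file itself never materializes.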
DeepPaperNote is strict about what counts as a successful note.
The note should:
- distinguish research problem from task definition
- explain the real method or analysis flow
- include key numbers that actually matter
- point out what is easy to misread
- state at least one honest limitation
- use real heading levels: `#`, `##`, `###`
- avoid half-Chinese half-English prose lines
If evidence quality is too weak, the skill should fail closed or clearly degrade the output, not pretend it performed a true deep read.
See:
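Two of these quality gates can be sketched as lint checks. This is a minimal illustration of the fail-closed idea, not the full rule set of `lint_note.py`:

```python
import re

def lint_note(markdown: str) -> list[str]:
    """Return a list of problems; an empty list means the note passes (sketch)."""
    problems = []
    # Gate 1: the note must use real heading levels, not bold-as-heading.
    if not re.search(r"^#{1,3} ", markdown, flags=re.MULTILINE):
        problems.append("no real heading levels (#, ##, ###)")
    # Gate 2: at least one honest limitation must be stated.
    if "limitation" not in markdown.lower():
        problems.append("no honest limitation stated")
    return problems
```

A caller would refuse to save the note (or visibly downgrade it) when the returned list is non-empty, rather than pretending a deep read happened.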
```
DeepPaperNote/
├── SKILL.md
├── README.md
├── README.zh-CN.md
├── agents/
│   └── openai.yaml
├── assets/
│   ├── hero.png
│   ├── hero.svg
│   └── note_template.md
├── references/
│   ├── architecture.md
│   ├── deep-analysis.md
│   ├── evidence-first.md
│   ├── figure-placement.md
│   ├── final-writing.md
│   ├── metadata-sources.md
│   ├── model-synthesis.md
│   ├── note-quality.md
│   ├── obsidian-format.md
│   ├── paper-types.md
│   └── workflow.md
└── scripts/
    ├── build_synthesis_bundle.py
    ├── collect_metadata.py
    ├── common.py
    ├── contracts.py
    ├── create_input_record.py
    ├── extract_evidence.py
    ├── extract_pdf_assets.py
    ├── fetch_pdf.py
    ├── lint_note.py
    ├── locate_zotero_attachment.py
    ├── materialize_figure_asset.py
    ├── plan_figures.py
    ├── resolve_paper.py
    ├── run_pipeline.py
    └── write_obsidian_note.py
```
| Component | Status | Notes |
|---|---|---|
| Codex desktop / CLI | Recommended | Primary target environment |
| Python 3.10+ | Required | Runs the helper scripts |
| Obsidian vault | Required for note writing | Configure DEEPPAPERNOTE_OBSIDIAN_VAULT |
| Zotero + MCP | Optional | Best for local-library-first workflows |
| OCR tooling | Optional | Helpful for scanned PDFs |
This repository is in active early-stage development.
| Area | Current state |
|---|---|
| Single-paper preprocessing pipeline | ✅ Working |
| Synthesis bundle generation | ✅ Working |
| Zotero-first helper workflow | ✅ Working |
| Obsidian writing flow | ✅ Working |
| Placeholder-first figure planning | ✅ Working |
| Style and structure linting | ✅ Working |
| Public examples | Not added yet |
| Test suite | ✅ Minimal suite added |
| CI | ✅ GitHub Actions configured |
| Packaging metadata | Not added yet |
| Figure matching / OCR robustness | Needs improvement |
- Model-first: understanding belongs to the language model
- Evidence-first: writing should be grounded in extracted evidence
- Placeholder-first: missing figures must not erase note structure
- Truth over neatness: uncertain extraction should be stated honestly
- Research usefulness over summary polish: the note should remain valuable later
DeepPaperNote is currently a Codex skill.
The long-term direction is:
- keep the core workflow portable
- keep the Codex integration strong and clear
- later add adapters for other agent environments if the core remains stable
DeepPaperNote is informed by existing paper-reading and note-generation workflows that influenced its design.
What DeepPaperNote tries to do differently is stay strongly model-first:
- scripts gather evidence and assets
- the language model does the real paper understanding
- figure handling remains placeholder-first when extraction is uncertain
Contributions are welcome, especially around:
- README and examples
- tests and CI
- PDF/OCR robustness
- figure matching quality
- note quality evaluation
- multi-agent adapter design
This project is licensed under the MIT License.