Zotero Taxonomy Curator is a domain-configurable workflow for turning Zotero metadata, PDFs, annotations, highlights, and manual tags into structured Obsidian literature notes with a controlled taxonomy.
It is not an automatic paper summarizer. Its goal is to help researchers build a maintainable literature system:
Zotero evidence -> taxonomy mapping -> structured Obsidian notes -> literature index

- Read Zotero items and collections from a local `zotero.sqlite` database.
- Preserve Zotero manual tags as first-class evidence.
- Map original tags to a user-defined taxonomy.
- Separate stable `canonical_tags` from exploratory `candidate_tags`.
- Create structured Markdown notes for Obsidian.
- Support any research domain through YAML taxonomy configuration.
- Validate note tags against the taxonomy.
- Avoid hard-coded local paths by using `config/local.yaml`.

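Reading items and preserving manual tags, the first two points above, can be sketched with Python's standard `sqlite3` module. This is a minimal sketch, not the skill's own code: it assumes the Zotero schema exposes `items(itemID, key)`, `tags(tagID, name)`, and `itemTags(itemID, tagID, type)` with `type = 0` marking manual tags, and the function name `read_manual_tags` is hypothetical. The database is opened read-only so the library is never modified.

```python
import sqlite3
from pathlib import Path


def read_manual_tags(sqlite_path, item_key):
    """Return the manual tags of one Zotero item, sorted by name.

    Assumed schema (verify against your Zotero version):
    items(itemID, key), tags(tagID, name), itemTags(itemID, tagID, type),
    where type = 0 marks a manually assigned tag.
    """
    # Open read-only through a file: URI so nothing is ever written.
    uri = Path(sqlite_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            """
            SELECT tags.name
            FROM items
            JOIN itemTags ON itemTags.itemID = items.itemID
            JOIN tags ON tags.tagID = itemTags.tagID
            WHERE items.key = ? AND itemTags.type = 0
            ORDER BY tags.name
            """,
            (item_key,),
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

If Zotero is running, the database may be locked; pointing the function at a copy of `zotero.sqlite` is a common workaround.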

Repository layout:

    zotero-taxonomy-curator/
    ├─ README.md
    ├─ LICENSE
    ├─ .gitignore
    ├─ config/
    │  ├─ local.example.yaml
    │  ├─ empty-taxonomy.yaml
    │  └─ taxonomy.example.yaml
    ├─ skills/
    │  └─ zotero-taxonomy-curator/
    │     ├─ SKILL.md
    │     ├─ templates/
    │     │  ├─ empty-taxonomy.yaml
    │     │  └─ literature-note-template.md
    │     └─ scripts/
    │        ├─ curator_common.py
    │        ├─ init_curator_project.py
    │        ├─ extract_zotero_evidence.py
    │        ├─ list_zotero_collections.py
    │        └─ validate_taxonomy_notes.py
    ├─ templates/
    │  └─ literature-note-template.md
    ├─ examples/
    │  └─ vrp-taxonomy.example.yaml
    └─ docs/
       ├─ taxonomy-design.md
       ├─ zotero-setup.md
       └─ obsidian-workflow.md

After installing `skills/zotero-taxonomy-curator/` into your Codex skills directory, initialize a project from the project root:

    python ~/.codex/skills/zotero-taxonomy-curator/scripts/init_curator_project.py

On Windows, use the corresponding path, for example:

    python "$env:USERPROFILE\.codex\skills\zotero-taxonomy-curator\scripts\init_curator_project.py"

The initializer will:

- try to detect `zotero.sqlite` from environment variables, Zotero profile preferences, and common default locations;
- create `config/local.yaml`;
- create an empty `config/taxonomy.yaml`;
- copy the bundled close-reading note template to `templates/`;
- create `notes/literature/` and a starter `Literature Index.md`.

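The detection order described above might look roughly like the following. It is a sketch under stated assumptions, not the initializer's actual logic: the environment variable name `ZOTERO_SQLITE` and the function name `find_zotero_sqlite` are hypothetical, only the `~/Zotero` default data directory is checked, and the Zotero profile-preference lookup is omitted.

```python
import os
from pathlib import Path


def find_zotero_sqlite(env=None, home=None):
    """Return the first existing zotero.sqlite candidate, or None.

    ZOTERO_SQLITE is a hypothetical variable name used for illustration.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)

    candidates = []
    # 1. An explicit override from the environment is tried first.
    if env.get("ZOTERO_SQLITE"):
        candidates.append(Path(env["ZOTERO_SQLITE"]).expanduser())
    # 2. Fall back to Zotero's usual default data directory.
    candidates.append(home / "Zotero" / "zotero.sqlite")

    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

Returning `None` instead of raising leaves the caller free to fall back to the explicit path options.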
If Zotero is installed in a custom location, pass paths explicitly:

    python ~/.codex/skills/zotero-taxonomy-curator/scripts/init_curator_project.py \
      --zotero-sqlite "/path/to/zotero.sqlite" \
      --zotero-storage "/path/to/Zotero/storage"

To set up a project manually instead:

1. Copy the example local config: `cp config/local.example.yaml config/local.yaml`
2. Edit `config/local.yaml` and set your Zotero database paths.
3. Copy or create a taxonomy: `cp config/empty-taxonomy.yaml config/taxonomy.yaml`
4. Customize `config/taxonomy.yaml` for your own field.
5. Install `skills/zotero-taxonomy-curator/` into your Codex skills directory.
6. Use the skill on one Zotero item or a collection.

config/local.yaml is intentionally ignored by Git because it contains local
paths. Use config/local.example.yaml as the public template.

Minimal config:

    zotero:
      sqlite_path: "/path/to/zotero.sqlite"
      storage_dir: "/path/to/Zotero/storage"
    output:
      note_root: "notes/literature"
      index_file: "notes/literature/Literature Index.md"
    taxonomy:
      path: "config/taxonomy.yaml"
    template:
      path: "templates/literature-note-template.md"

The core idea is to build a controlled vocabulary for your field.

- `canonical_tags` are stable tags used for retrieval and comparison.
- `candidate_tags` are new or uncertain terms that should not be promoted too early.
- `aliases` map spelling variants and user tags to canonical forms.
- `candidate_terms` record emerging concepts for later review.

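To make the roles of these fields concrete, here is a small Python sketch of how original Zotero tags could be sorted into canonical and candidate tags. It is illustrative only and not the skill's own logic: the function name `map_tags`, the case-insensitive matching, and the sample tags are all assumptions.

```python
def map_tags(original_tags, canonical_tags, aliases):
    """Split original tags into (canonical, candidate) lists.

    canonical_tags: iterable of approved tag names.
    aliases: dict mapping a variant spelling to its canonical form.
    Matching is case-insensitive, which is an assumption of this sketch.
    """
    canonical_by_key = {tag.lower(): tag for tag in canonical_tags}
    alias_by_key = {variant.lower(): target for variant, target in aliases.items()}

    canonical, candidate = [], []
    for tag in original_tags:
        key = tag.strip().lower()
        if key in canonical_by_key:
            resolved = canonical_by_key[key]
        else:
            resolved = alias_by_key.get(key)
        if resolved is None:
            # Unknown terms stay candidates until someone reviews them.
            if tag not in candidate:
                candidate.append(tag)
        elif resolved not in canonical:
            canonical.append(resolved)
    return canonical, candidate


canonical, candidate = map_tags(
    ["VRPTW", "time windows", "quantum annealing"],
    canonical_tags=["vrp-time-windows", "heuristics"],
    aliases={"vrptw": "vrp-time-windows", "time windows": "vrp-time-windows"},
)
print(canonical)  # ['vrp-time-windows']
print(candidate)  # ['quantum annealing']
```

Both spelling variants collapse to one canonical tag, while the unfamiliar term is kept as a candidate rather than silently dropped or promoted.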
The default note template is a Chinese close-reading note scaffold inspired by
research-paper notes: source positioning, tag marking, core summary,
problem positioning, model or assumptions, methods, core formulas or mechanisms,
evidence/data/experiments, main findings, and personal reuse assessment. It keeps
only the necessary taxonomy traces: original Zotero tags, canonical_tags,
candidate_tags, and a compact tag-decision table. Full category definitions
belong in config/taxonomy.yaml, not in every note.
The template also includes optional review-paper fields (review_type,
review_scope_tags, review method and coverage) and writing discipline for
models and formulas: use LaTeX for mathematical objects, keep formulas tied to
source evidence, and mark unclear formulas for PDF verification rather than
guessing.
The default taxonomy is generic. The VRP taxonomy in examples/ shows how a
domain-specific taxonomy can be built without changing the skill itself.
The scripts are named as functional modules rather than generic helpers:

| Module | Script | Purpose |
|---|---|---|
| Curator Core | `curator_common.py` | Shared configuration, Zotero database, and attachment utilities. |
| Project Initializer | `init_curator_project.py` | Create first-run local config, empty taxonomy, note template, and note folder. |
| Evidence Collector | `extract_zotero_evidence.py` | Build a raw evidence packet from one Zotero item without summarizing it. |
| Collection Navigator | `list_zotero_collections.py` | Inspect Zotero collection paths before selecting items to process. |
| Taxonomy Gatekeeper | `validate_taxonomy_notes.py` | Validate Obsidian note tags against the configured taxonomy. |

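As a rough illustration of what the Taxonomy Gatekeeper step checks, the sketch below flags any tag in a note's `canonical_tags` that the taxonomy does not define. It is not the code in `validate_taxonomy_notes.py`: the function names are hypothetical, and it assumes tags are written as a simple `- tag` list under a `canonical_tags:` key in YAML front matter, parsed by hand here to avoid a YAML dependency.

```python
def read_front_matter_list(note_text, key):
    """Collect '- item' entries listed under `key:` in the front matter."""
    lines = note_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return []
    items, in_list = [], False
    for line in lines[1:]:
        stripped = line.strip()
        if stripped == "---":
            break  # end of front matter
        if stripped == f"{key}:":
            in_list = True
        elif in_list and stripped.startswith("- "):
            items.append(stripped[2:].strip().strip("\"'"))
        elif stripped:
            in_list = False  # a different key or entry started
    return items


def find_unknown_tags(note_text, allowed_tags):
    """Return canonical_tags used in the note but absent from the taxonomy."""
    allowed = set(allowed_tags)
    used = read_front_matter_list(note_text, "canonical_tags")
    return [tag for tag in used if tag not in allowed]


# The second tag is deliberately misspelled to trigger the check.
note = """---
title: Example note
canonical_tags:
  - vrp-time-windows
  - metaheuristcs
candidate_tags:
  - quantum annealing
---
Body text.
"""
print(find_unknown_tags(note, ["vrp-time-windows", "metaheuristics"]))
# ['metaheuristcs']
```

Only `canonical_tags` are checked in this sketch; candidate tags are exploratory by definition, so they are left alone.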
Initialize a project:

    python skills/zotero-taxonomy-curator/scripts/init_curator_project.py

List Zotero collections:

    python skills/zotero-taxonomy-curator/scripts/list_zotero_collections.py \
      --config config/local.yaml \
      --out output/collections.json

Extract one item:

    python skills/zotero-taxonomy-curator/scripts/extract_zotero_evidence.py \
      --config config/local.yaml \
      --item-key ABCD1234 \
      --out output/raw_zotero_item.json

Validate notes:

    python skills/zotero-taxonomy-curator/scripts/validate_taxonomy_notes.py \
      --taxonomy config/taxonomy.yaml \
      notes/literature

Do not commit:

- `config/local.yaml`
- Zotero databases
- Zotero `storage/`
- private PDFs
- generated notes from private projects
- temporary extraction files

Use `.gitignore` as the default safety net.