ganzoth/zotero-taxonomy-curator

Zotero Taxonomy Curator

Zotero Taxonomy Curator is a domain-configurable workflow for turning Zotero metadata, PDFs, annotations, highlights, and manual tags into structured Obsidian literature notes with a controlled taxonomy.

It is not an automatic paper summarizer. Its goal is to help researchers build a maintainable literature system:

Zotero evidence -> taxonomy mapping -> structured Obsidian notes -> literature index

Features

  • Read Zotero items and collections from a local zotero.sqlite database.
  • Preserve Zotero manual tags as first-class evidence.
  • Map original tags to a user-defined taxonomy.
  • Separate stable canonical_tags from exploratory candidate_tags.
  • Create structured Markdown notes for Obsidian.
  • Support any research domain through YAML taxonomy configuration.
  • Validate note tags against the taxonomy.
  • Avoid hard-coded local paths by using config/local.yaml.

Project structure

zotero-taxonomy-curator/
├─ README.md
├─ LICENSE
├─ .gitignore
├─ config/
│  ├─ local.example.yaml
│  ├─ empty-taxonomy.yaml
│  └─ taxonomy.example.yaml
├─ skills/
│  └─ zotero-taxonomy-curator/
│     ├─ SKILL.md
│     ├─ templates/
│     │  ├─ empty-taxonomy.yaml
│     │  └─ literature-note-template.md
│     └─ scripts/
│        ├─ curator_common.py
│        ├─ init_curator_project.py
│        ├─ extract_zotero_evidence.py
│        ├─ list_zotero_collections.py
│        └─ validate_taxonomy_notes.py
├─ templates/
│  └─ literature-note-template.md
├─ examples/
│  └─ vrp-taxonomy.example.yaml
└─ docs/
   ├─ taxonomy-design.md
   ├─ zotero-setup.md
   └─ obsidian-workflow.md

Quick start

Option A: first-run initializer

After installing skills/zotero-taxonomy-curator/ into your Codex skills directory, initialize a project from the project root:

python ~/.codex/skills/zotero-taxonomy-curator/scripts/init_curator_project.py

On Windows (PowerShell), use the corresponding path, for example:

python "$env:USERPROFILE\.codex\skills\zotero-taxonomy-curator\scripts\init_curator_project.py"

The initializer will:

  • try to detect zotero.sqlite from environment variables, Zotero profile preferences, and common default locations;
  • create config/local.yaml;
  • create an empty config/taxonomy.yaml;
  • copy the bundled close-reading note template to templates/;
  • create notes/literature/ and a starter Literature Index.md.
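The detection order described above can be sketched roughly as follows. This is an illustrative sketch, not the initializer's actual code: the function name `find_zotero_sqlite` and the environment variable `ZOTERO_SQLITE` are hypothetical, and the Zotero profile-preferences step is left out for brevity.

```python
import os
from pathlib import Path
from typing import Iterable, Mapping, Optional


def find_zotero_sqlite(
    env: Mapping[str, str],
    candidates: Iterable[Path],
    env_var: str = "ZOTERO_SQLITE",  # hypothetical variable name (assumption)
) -> Optional[Path]:
    """Return the first existing zotero.sqlite, or None if nothing is found."""
    # 1. An explicit environment variable wins if it points at a real file.
    override = env.get(env_var)
    if override and Path(override).is_file():
        return Path(override)
    # 2. Otherwise try the candidate locations in the order given.
    for path in candidates:
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    # ~/Zotero is the usual default data directory for recent Zotero versions.
    defaults = [Path.home() / "Zotero" / "zotero.sqlite"]
    print(find_zotero_sqlite(os.environ, defaults))
```

Returning `None` instead of raising lets the caller fall back to asking for explicit paths.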

If Zotero is installed in a custom location, pass paths explicitly:

python ~/.codex/skills/zotero-taxonomy-curator/scripts/init_curator_project.py \
  --zotero-sqlite "/path/to/zotero.sqlite" \
  --zotero-storage "/path/to/Zotero/storage"

Option B: manual setup

  1. Copy the example local config:

    cp config/local.example.yaml config/local.yaml
  2. Edit config/local.yaml and set the paths to your Zotero database (zotero.sqlite) and storage directory.

  3. Copy or create a taxonomy:

    cp config/empty-taxonomy.yaml config/taxonomy.yaml
  4. Customize config/taxonomy.yaml for your own field.

  5. Install skills/zotero-taxonomy-curator/ into your Codex skills directory.

  6. Use the skill on one Zotero item or a collection.

Configuration

config/local.yaml is intentionally ignored by Git because it contains local paths. Use config/local.example.yaml as the public template.

Minimal config:

zotero:
  sqlite_path: "/path/to/zotero.sqlite"
  storage_dir: "/path/to/Zotero/storage"

output:
  note_root: "notes/literature"
  index_file: "notes/literature/Literature Index.md"

taxonomy:
  path: "config/taxonomy.yaml"

template:
  path: "templates/literature-note-template.md"
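Before running the scripts, it can help to check that a parsed config contains every key shown above. The sketch below assumes the YAML has already been loaded into a plain dictionary and that all six keys are required. Both are assumptions made for illustration, and `missing_config_keys` is a hypothetical helper rather than part of the project.

```python
from typing import Any, Dict, List

# Keys taken from the minimal config above; treating all of them as
# required is an assumption of this sketch.
REQUIRED_KEYS = [
    ("zotero", "sqlite_path"),
    ("zotero", "storage_dir"),
    ("output", "note_root"),
    ("output", "index_file"),
    ("taxonomy", "path"),
    ("template", "path"),
]


def missing_config_keys(config: Dict[str, Any]) -> List[str]:
    """Return dotted names of required keys that are absent or empty."""
    missing = []
    for section, key in REQUIRED_KEYS:
        value = config.get(section)
        if not isinstance(value, dict) or not value.get(key):
            missing.append(f"{section}.{key}")
    return missing
```

An empty list means the config has the same shape as the minimal example. Anything else names the entries still to be filled in.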

Taxonomy-first workflow

The core idea is to build a controlled vocabulary for your field.

  • canonical_tags are stable tags used for retrieval and comparison.
  • candidate_tags are new or uncertain terms that should not be promoted too early.
  • aliases map spelling variants and user tags to canonical forms.
  • candidate_terms record emerging concepts for later review.
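A minimal sketch of how these pieces might interact is shown below. It assumes aliases are stored with lowercase keys and that any tag with no canonical match is kept as a candidate. The function `map_tags` and the sample tags are hypothetical and do not come from the skill.

```python
from typing import Dict, Iterable, List, Set, Tuple


def map_tags(
    original_tags: Iterable[str],
    canonical: Set[str],
    aliases: Dict[str, str],
) -> Tuple[List[str], List[str]]:
    """Split original tags into (canonical_tags, candidate_tags)."""
    canonical_tags: List[str] = []
    candidate_tags: List[str] = []
    for raw in original_tags:
        tag = raw.strip()
        # Resolve spelling variants through the alias table (lowercase keys).
        resolved = aliases.get(tag.lower(), tag)
        if resolved in canonical:
            if resolved not in canonical_tags:
                canonical_tags.append(resolved)
        elif tag and tag not in candidate_tags:
            # Unknown terms are kept for review, not promoted.
            candidate_tags.append(tag)
    return canonical_tags, candidate_tags


if __name__ == "__main__":
    canonical = {"vehicle-routing", "metaheuristics"}
    aliases = {"vrp": "vehicle-routing"}
    print(map_tags(["VRP", "metaheuristics", "drone delivery"], canonical, aliases))
```

The original tags themselves are not modified, so they can still be recorded in the note as evidence alongside the mapped result.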

The default note template is a Chinese close-reading note scaffold modeled on research-paper notes. Its sections cover source positioning, tag marking, core summary, problem positioning, model or assumptions, methods, core formulas or mechanisms, evidence/data/experiments, main findings, and personal reuse assessment. It keeps only the necessary taxonomy traces: original Zotero tags, canonical_tags, candidate_tags, and a compact tag-decision table. Full category definitions belong in config/taxonomy.yaml, not in every note.

The template also includes optional review-paper fields (review_type, review_scope_tags, review method and coverage) and writing guidelines for models and formulas: use LaTeX for mathematical objects, tie each formula to source evidence, and mark unclear formulas for PDF verification rather than guessing.

The default taxonomy is generic. The VRP taxonomy in examples/ shows how a domain-specific taxonomy can be built without changing the skill itself.

Scripts

The scripts are named as functional modules rather than generic helpers:

Module | Script | Purpose
Curator Core | curator_common.py | Shared configuration, Zotero database, and attachment utilities.
Project Initializer | init_curator_project.py | Create first-run local config, empty taxonomy, note template, and note folder.
Evidence Collector | extract_zotero_evidence.py | Build a raw evidence packet from one Zotero item without summarizing it.
Collection Navigator | list_zotero_collections.py | Inspect Zotero collection paths before selecting items to process.
Taxonomy Gatekeeper | validate_taxonomy_notes.py | Validate Obsidian note tags against the configured taxonomy.

Initialize a project:

python skills/zotero-taxonomy-curator/scripts/init_curator_project.py

List Zotero collections:

python skills/zotero-taxonomy-curator/scripts/list_zotero_collections.py \
  --config config/local.yaml \
  --out output/collections.json
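To illustrate what a collection path is, the sketch below builds full paths from a `collections` table with `collectionID`, `collectionName`, and `parentCollectionID` columns. That layout is an assumption about the Zotero schema made for this example, and `collection_paths` is a hypothetical helper, not the code in list_zotero_collections.py.

```python
import sqlite3
from typing import Dict, List, Optional, Tuple


def collection_paths(conn: sqlite3.Connection) -> List[str]:
    """Return every collection as a 'Parent/Child' path, sorted by name."""
    rows = conn.execute(
        "SELECT collectionID, collectionName, parentCollectionID FROM collections"
    ).fetchall()
    by_id: Dict[int, Tuple[str, Optional[int]]] = {
        cid: (name, parent) for cid, name, parent in rows
    }

    def path_of(cid: int) -> str:
        parts: List[str] = []
        current: Optional[int] = cid
        seen = set()  # guards against accidental parent cycles
        while current is not None and current in by_id and current not in seen:
            seen.add(current)
            name, current = by_id[current]  # step up to the parent
            parts.append(name)
        return "/".join(reversed(parts))

    return sorted(path_of(cid) for cid in by_id)
```

When pointing a sketch like this at a real library, working on a copy of zotero.sqlite or opening it read-only, for example with `sqlite3.connect("file:/path/to/zotero.sqlite?mode=ro", uri=True)`, is a reasonable precaution.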

Extract one item:

python skills/zotero-taxonomy-curator/scripts/extract_zotero_evidence.py \
  --config config/local.yaml \
  --item-key ABCD1234 \
  --out output/raw_zotero_item.json

Validate notes:

python skills/zotero-taxonomy-curator/scripts/validate_taxonomy_notes.py \
  --taxonomy config/taxonomy.yaml \
  notes/literature
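The kind of check involved can be sketched as below. The sketch assumes each note lists its tags in front matter as a single inline list, such as `canonical_tags: [a, b]`. That format, the regular expression, and the function `unknown_canonical_tags` are illustrative assumptions and may differ from what validate_taxonomy_notes.py actually does.

```python
import re
from typing import List, Set

# Matches one inline list such as: canonical_tags: [vehicle-routing, "tabu search"]
# (assumed front-matter format; block-style YAML lists are not handled here).
_INLINE_LIST = re.compile(r"^canonical_tags:\s*\[(.*?)\]\s*$", re.MULTILINE)


def unknown_canonical_tags(note_text: str, allowed: Set[str]) -> List[str]:
    """Return canonical_tags in the note that are not in the taxonomy."""
    match = _INLINE_LIST.search(note_text)
    if match is None:
        return []
    # Split on commas, then drop surrounding whitespace and quotes.
    tags = [t.strip().strip("\"'") for t in match.group(1).split(",")]
    return [t for t in tags if t and t not in allowed]
```

A non-empty result points to a tag that should be added to the taxonomy, mapped through an alias, or moved to candidate_tags.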

What should not be committed

Do not commit:

  • config/local.yaml
  • Zotero databases
  • Zotero storage/
  • private PDFs
  • generated notes from private projects
  • temporary extraction files

Use .gitignore as the default safety net.
