Releases: daniel-pittman/librarian

v1.8.1 — Polish: consistent MCP errors + stats/rollup convergence

@daniel-pittman daniel-pittman released this 12 Jun 18:13

Bug-fix / polish release following v1.8.0 (grant block + rollup). From #30 / #34.

Fixes

  • Consistent block_field errors across the MCP read surface. librarian_filter and librarian_rollup now both return ERROR: block_field must be in BLOCK.FIELD=VALUE form on a malformed block_field (previously filter silently dropped it and rollup raised) — matching how the rest of the read tools surface errors.
  • rollup --sum warns on an unknown field of a known block instead of silently returning 0.
  • stats and rollup agree on integer coercion. stats now uses the shared _intish helper, so a bool in an int field contributes 0 (not 1), matching rollup.
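
A minimal sketch of the behaviors described above, under stated assumptions. `parse_block_field` and `intish` are hypothetical stand-ins; the notes name a shared `_intish` helper but do not show its body, and whether it also coerces numeric strings is an assumption here.

```python
def parse_block_field(spec: str) -> tuple[str, str, str]:
    """Split 'BLOCK.FIELD=VALUE' into (block, field, value) or raise ValueError."""
    left, eq, value = spec.partition("=")
    block, dot, field = left.partition(".")
    if not (eq and dot and block and field):
        # Same message for every caller, so filter and rollup cannot diverge.
        raise ValueError("block_field must be in BLOCK.FIELD=VALUE form")
    return block, field, value


def intish(value: object) -> int:
    """Coerce a field value for summing; bools and non-numeric values count as 0."""
    # bool is a subclass of int in Python, so it has to be excluded explicitly.
    if isinstance(value, bool):
        return 0
    if isinstance(value, int):
        return value
    if isinstance(value, str):  # assumption: numeric strings are accepted
        try:
            return int(value.strip())
        except ValueError:
            return 0
    return 0
```

Routing both tools through one parser and one coercion helper is what keeps their error text and totals in agreement.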

v1.8.0 — Grant block + rollup aggregation

@daniel-pittman daniel-pittman released this 12 Jun 05:56

New

  • grant schema block (schemas/grant.yaml) — structured funding data on funding entries: amount (int, whole units), role, required status (awarded / pending / not-funded), sponsor, award_number, notes.
  • rollup command (CLI rollup + MCP librarian_rollup) — aggregate any block into a count and an optional integer-field sum, with --group-by and the same set-scoping filters as filter. Block-field totals (funding dollars, CPE credits by group, any int field) become derivable instead of hand-tallied. --json for downstream tooling.
librarian rollup grant --sum amount --group-by status
librarian rollup grant --sum amount --group-by role --block-field grant.status awarded
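
For readers who want the aggregation spelled out, here is a rough sketch of what a grouped count-and-sum over a block could look like. The entry shape (a dict with the block stored under its name) and the `rollup` function signature are assumptions for illustration, not librarian's internal API.

```python
from collections import defaultdict


def rollup(entries, block, sum_field=None, group_by=None):
    """Return {group: {"count": n, "sum": total}} for entries carrying `block`."""
    out = defaultdict(lambda: {"count": 0, "sum": 0})
    for entry in entries:
        data = entry.get(block)
        if not isinstance(data, dict):
            continue  # entry has no such block
        key = data.get(group_by, "(none)") if group_by else "(all)"
        out[key]["count"] += 1
        value = data.get(sum_field) if sum_field else None
        # Only real ints are summed; a bool in an int field contributes 0.
        if isinstance(value, int) and not isinstance(value, bool):
            out[key]["sum"] += value
    return dict(out)


# Hypothetical sample data.
entries = [
    {"id": "a", "grant": {"amount": 50000, "status": "awarded"}},
    {"id": "b", "grant": {"amount": 20000, "status": "pending"}},
    {"id": "c", "grant": {"amount": 30000, "status": "awarded"}},
    {"id": "d"},
]
totals = rollup(entries, "grant", sum_field="amount", group_by="status")
# totals["awarded"] == {"count": 2, "sum": 80000}
```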

Also included since v1.7.2

  • v1.7.3 — CITATION.cff for Zenodo archive integration.
  • CI — Semgrep migration, review-test gating, and the two-pass Claude-review merge gate.

Follow-up polish tracked in #30.

v1.7.3 — Add CITATION.cff for Zenodo archive integration

@daniel-pittman daniel-pittman released this 12 Jun 05:38
537a1ec

What's new

Adds a CITATION.cff describing librarian as research software, ahead of enabling Zenodo's GitHub archive integration on the repo. This is the first release that mints a Zenodo DOI for librarian; future releases will continue to archive automatically via the Zenodo webhook.

Other changes

  • tests/test_cli_read.py::test_version now reads the expected version from the repo-root VERSION file rather than a hard-coded literal, so future version bumps don't break the test.
  • VERSION and pyproject.toml both bumped to 1.7.3 (the single-source-of-truth invariant documented in librarian/__init__.py is preserved).
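
A sketch of the test pattern described above, assuming a plain-text `VERSION` file at the repo root. The helper names `read_version` and `version_matches` are hypothetical; the real test lives in `tests/test_cli_read.py` and may be shaped differently.

```python
from pathlib import Path


def read_version(repo_root: Path) -> str:
    """Read the expected version from the repo-root VERSION file."""
    return (repo_root / "VERSION").read_text(encoding="utf-8").strip()


def version_matches(cli_output: str, repo_root: Path) -> bool:
    """True when the CLI's version output contains the VERSION file's value."""
    return read_version(repo_root) in cli_output
```

Because the expectation is read at test time, a version bump only has to touch `VERSION` and `pyproject.toml`, not the test.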

No application logic changes

This is a metadata-only release. The CLI, MCP server, and schema engine are unchanged from v1.7.2.

v1.7.2

@daniel-pittman daniel-pittman released this 30 May 23:43
157dec3

Final v1.7.x cleanup

Closes every v1.7.1-deferred item and every round-1 review finding from PR #22. End of the v1.7.x development arc — no further point releases planned.

Items closed

v1.7.1-deferred HIGHs:

  • --label=--dry-run (equals-form) bypass — flag-shape guard now applied to both equals- and space-forms.
  • description : (space before colon) → IndexError in merge shape gates — both preview and commit paths split on the first : instead of the literal description: substring.
  • 0-byte / no-key pre-existing activities.yaml — line-anchored has_root_key scan. Empty / whitespace / comments-only files bootstrap safely. Files with real non-activities top-level content (e.g. a meta: block) are REFUSED with a clear error, preserving user bytes intact.
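
The bootstrap decision in the last item can be sketched roughly as follows. This is an illustration only: the notes name a line-anchored `has_root_key` scan, but the regex and the three-way `classify_activities_file` result below are assumptions.

```python
import re

# Line-anchored: matches "activities:" only at the start of a line (column 0).
_ROOT_KEY_RE = re.compile(r"^activities\s*:", re.MULTILINE)


def classify_activities_file(text: str) -> str:
    """Return 'has-key', 'bootstrap', or 'refuse' for a pre-existing file."""
    if _ROOT_KEY_RE.search(text):
        return "has-key"
    meaningful = [
        line for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    ]
    # Real content without an activities: key is left untouched and refused.
    return "refuse" if meaningful else "bootstrap"
```

Empty, whitespace-only, and comments-only files fall through to `bootstrap`; a file holding only a `meta:` block comes back as `refuse`.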

v1.7.1-round-6 MEDIUMs:

  • cmd_file_rehash ledger asymmetry: detection-only runs (path-drift / added-during / removed-during / vanished / missing / malformed) now write rows tagged detect-only=1 so --changed-since consumers can filter. save_files still gated on actual digest changes.
  • removed_during_hash computed inside the lock and reflected in the ledger detail — symmetric with added_during_hash.
  • atomic_replace durability: fh.flush() + fsync(fileno) before rename, fsync(dirfd) after. Post-rename dir-fsync failures are caught so append_ledger is never skipped after a successful write (closes the disk-has-write-but-no-ledger-row race).
  • _print_help_for adds a global-flags footer (--label / LIBRARIAN_SESSION_LABEL). cmd_create and cmd_merge docstrings extended to list every flag.
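
The durability sequence from the `atomic_replace` item can be sketched like this on a POSIX system. It is a simplified stand-in, not the project's helper: mode-bit preservation and symlink handling are omitted, and the temp-file prefix is an assumption.

```python
import os
import tempfile


def atomic_replace(path: str, data: bytes) -> None:
    """Write data to path via temp file + rename, fsyncing file and directory."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as fh:
            fh.write(data)
            fh.flush()             # push Python's buffer to the OS
            os.fsync(fh.fileno())  # push the OS cache to disk before renaming
        os.replace(tmp, path)      # atomic swap on the same filesystem
    except BaseException:
        # Remove the orphaned temp file, including on KeyboardInterrupt.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise
    # Post-rename directory fsync is best-effort: a failure here must not
    # propagate, or the caller would skip its ledger append after a good write.
    try:
        dirfd = os.open(directory, os.O_RDONLY)
        try:
            os.fsync(dirfd)
        finally:
            os.close(dirfd)
    except OSError:
        pass
```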

Test plan

  • Full pytest suite green on 3.10 / 3.11 / 3.12
  • 5 new tests pin the v1.7.2 fixes
  • Both claude-review and security-review pass on PR #22

Upgrade notes

No breaking changes. Existing entries, schemas, and ledger files are forward-compatible.

If you've hand-edited activities.yaml to contain a meta: block (or other top-level content) WITHOUT an activities: mapping key, librarian create will now refuse rather than silently produce malformed YAML. Add activities: as the top-level key (your other content is preserved) and creates will resume normally.

Known LOWs (will not be patched in v1.7.x)

  • Equals-form --label=-foo rejects values that start with - with a misleading "looks like another flag" message. Affects no in-tree caller; users adopting a leading-hyphen label convention should use the space-form --label -foo.

v1.7.1

@daniel-pittman daniel-pittman released this 30 May 22:41
9aa0faa

Highlights

v1.7.1 closes every follow-up item the v1.7.0 release notes flagged for a patch release, plus six rounds of automated-review hardening. No new commands, no breaking changes — quality + concurrency-safety + scanner-precision pass.

Lock-scope reduction

  • Writer decorators short-circuit <cmd> -h / <cmd> --help (anywhere in argv) to print usage and return 0 without acquiring the activities / files lock. No more lockfile materialization on --help against a fresh data home.
  • cmd_create inline-refactored: stdin read, schema-block validation, fuzzy-similarity scan, indent detection, and YAML rendering all happen OUTSIDE the lock. The lock now wraps only the re-load + id-uniqueness recheck + write.
  • cmd_file_rehash inline-refactored: SHA-256 hashing runs OUTSIDE the lock. The lock wraps only the apply-deltas + save phase. A multi-gigabyte --all rehash no longer blocks concurrent file-* writers.
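
The general shape of this lock-scope reduction, sketched for the rehash case: do the expensive work without the lock, then take the lock only to apply results. A `threading.Lock` stands in for librarian's cross-process write lock, and the record layout and `read_bytes` callback are assumptions for illustration.

```python
import hashlib
import threading

_lock = threading.Lock()  # stand-in for the real cross-process write lock


def rehash(records: dict, read_bytes) -> list:
    """Hash outside the lock; apply digest changes inside it. Returns changed ids."""
    # Brief locked snapshot of what to hash (assumed; the notes don't say).
    with _lock:
        snapshot = [(rid, rec["path"]) for rid, rec in records.items()]

    # Slow part, unlocked: concurrent writers are not blocked while hashing.
    digests = [
        (rid, path, hashlib.sha256(read_bytes(path)).hexdigest())
        for rid, path in snapshot
    ]

    # Short locked phase: re-check each record, then apply the delta.
    changed = []
    with _lock:
        for rid, path, digest in digests:
            rec = records.get(rid)
            if rec is None or rec["path"] != path:
                continue  # removed or moved while hashing; skip it
            if rec.get("sha256") != digest:
                rec["sha256"] = digest
                changed.append(rid)
    return changed
```

The re-check under the lock is what makes it safe to hash against a snapshot that may be stale by the time the write happens.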

Atomic writes

  • New shared atomic_replace helper: temp + os.replace with mode-bit preservation (chmod 600 survives round-trips), orphan-tmp cleanup on failure (including KeyboardInterrupt), symlink target honoring (the link is preserved instead of destroyed), and symlink-target-parent mkdir for not-yet-mounted destinations.
  • write_lines, save_files, and cmd_create's bootstrap all route through it.
  • write_lock sidecar anchored to the resolved symlink target so two aliases to the same file share one mutex.
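
Why anchoring the lock to the resolved target matters: if the sidecar path were derived from the path as given, two symlink aliases would lock different files and could write concurrently. A tiny sketch, where the `.lock` suffix and the `lock_path_for` name are assumptions:

```python
import os


def lock_path_for(path: str) -> str:
    """Derive the lock sidecar from the resolved target, not the alias."""
    return os.path.realpath(path) + ".lock"
```

Every alias of the same file then maps to one sidecar, so all writers contend on the same mutex.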

Merge polish

  • Dry-run preview total now equals the post-confirm ledger refs= count: preview counts steps 1, 1b, 2, and 5 instead of just step 1.
  • Description-shape gates (missing / empty / inline / folded >) hoisted to run BEFORE the dry-run return so the preview accurately reflects whether --confirm will commit.
  • Folded-scalar (description: >) targets are rejected when --append-sources is on (YAML would fold the appended bodies into one paragraph). Provenance-only is still permitted.
  • opt_out_advice joiner is and (not or) when both flags trigger the shape check.

cmd_file_rehash correctness

  • Phase 2 captures (id, path_snapshot, sha256) so phase 3 can verify the record's path didn't change underneath us via a concurrent file-move.
  • Empty / whitespace ids and empty / non-string paths are filtered with clear notices.
  • TOCTOU mid-hash (FileNotFoundError / OSError) is trapped; the rid lands in missing and the rehash continues for everything else.
  • --help (anywhere in argv) short-circuits to usage instead of falling through to id-lookup.
  • Concurrent file-add / file-delete between phases produce added_during_hash / removed_during_hash / scope_id_vanished notices.

Dangling-ref scanner

  • _HISTORICAL_PHRASES_RE regex tightened: consolidat(?:e|es|ed|ing)\s+from and consolidat...\s+(?:into|under) with list-continuation lookahead; merged\s+from and merged\s+into with list-continuation lookahead; originally\s+(?:tracked|...stored as|...) with completers; named\s+as (not bare named); was named / old id / former id dropped.
  • Backtick-pair scoping (vs raw rfind) so an unmatched stray backtick doesn't misroute the clause boundary.
  • Multi-line continuation lists handled (\r\n in the strip set).
  • Sentence-break lookback requires 3+ alphabetic chars before the period AND an uppercase letter starting the next word; decimals, abbreviations, and version numbers don't false-trigger.
  • Semicolon recognized as a list-continuation separator.
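
The sentence-break rule above (3+ alphabetic characters before the period, uppercase letter starting the next word) can be approximated with a single regex. This is an illustrative reconstruction, not the scanner's actual pattern.

```python
import re

# Period preceded by three letters and followed by whitespace + an uppercase letter.
SENTENCE_BREAK_RE = re.compile(r"(?<=[A-Za-z]{3})\.\s+(?=[A-Z])")


def has_sentence_break(text: str) -> bool:
    """True when text contains a period that looks like a real sentence end."""
    return SENTENCE_BREAK_RE.search(text) is not None
```

Under this sketch, "Moved here. Originally tracked" counts as a break, while "v1.0 Release", "costs 4.2 Million", "Dr. Smith", and "e.g. Foo" do not, because fewer than three letters directly precede the period.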

--label UX

  • Accepts the equals-form --label=foo.
  • Rejects duplicate --label flags.
  • Rejects flag-shaped values (--label --dry-run typo no longer silently swallows the next flag).
  • Rejects trailing --label with no value.
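
A hedged sketch of the four `--label` rules as one argv pre-pass. `extract_label` and its error messages are hypothetical; note that this sketch applies the flag-shape guard to both forms, which the notes only attribute to the equals-form from v1.7.2 onward.

```python
from typing import List, Optional, Tuple


def extract_label(argv: List[str]) -> Tuple[Optional[str], List[str]]:
    """Pull --label out of argv; return (label, remaining args) or raise ValueError."""
    label: Optional[str] = None
    rest: List[str] = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg == "--label":                      # space-form
            if i + 1 >= len(argv):
                raise ValueError("--label requires a value")
            value = argv[i + 1]
            i += 2
        elif arg.startswith("--label="):          # equals-form
            value = arg[len("--label="):]
            i += 1
        else:
            rest.append(arg)
            i += 1
            continue
        if label is not None:
            raise ValueError("--label given more than once")
        if value.startswith("-"):
            # Catches typos like `--label --dry-run` swallowing the next flag.
            raise ValueError(f"--label value {value!r} looks like another flag")
        label = value
    return label, rest
```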

Test plan

  • 30+ new tests cover the new behaviors
  • Full pytest suite green on 3.10 / 3.11 / 3.12
  • Six rounds of automated review on PR #20

Upgrade notes

No breaking changes. Existing entries, schemas, and ledger files are forward-compatible.

Known issues (v1.7.2 candidates)

  • --label=--dry-run (equals-form) bypasses the flag-shape value guard; the space-form is still guarded. Rare typo.
  • description : (space before colon) raises IndexError in the merge dry-run shape gate. Rare hand-edit pattern.
  • A 0-byte pre-existing activities.yaml (e.g. touch'd but never written) produces an unparseable root-list file on first cmd_create. Rare bootstrap pattern; the bug predates this PR.

Review trajectory

HIGHs found per review round on PR #20 = 2, 1, 2, 1, 2, 4. The round-6 cluster was mostly trivial edge cases plus one data-loss bug (write_lock symlink alias mutex). The data-loss bug is fixed; the rare-edge bugs are documented above. Review loop paused per ~/.claude/methodology/handling_claude_pr_reviews.md's symmetric-gap-chasing guidance.

v1.7.0

@daniel-pittman daniel-pittman released this 30 May 16:35
7daa8ea

Highlights

Four feature releases bundled (v1.4 → v1.7) plus extensive review hardening.

New commands

  • set-block (v1.4) — Add a schema block to an existing entry. Validates the block atomically against the active schema before write; coerces stringy primitives (int / bool / date) to native types.
  • delete --repoint-to <target> (v1.5) — Rewrite every inbound reference to the deleted entry id (backticked and plain-text, across descriptions and block string values) to point at <target> before removing the source. A shared repoint helper backs this and the merge rewriter.
  • merge (v1.6) — Atomic consolidation of N source entries into one target. Carries optional schema blocks (with --on-block-conflict keep-target | keep-source | abort), unions tags and docs, rewrites cross-references, deletes sources, writes one ledger entry.

v1.7 — deferred cleanup

  • Cross-process write lock with thread-local reentrancy and an os.register_at_fork hook so a forked child re-acquires its own flock.
  • Merge target-skip: backticked source-id mentions inside the target's prose are rewritten by step 1b; plain-text mentions are deliberately preserved for the author to edit.
  • merge --append-sources: folds source descriptions into the target's body with plain-text ## From <sid> headers. Backticked cross-source references are rewritten to target_id so step 6's source-delete doesn't strand dangling backticks.
  • Dangling-ref scanner softening: historical phrases (Originally tracked under, Previously known as, Consolidated from, etc.) suppress what would otherwise be flagged as broken cross-references. Sentence-scoped via lookbehind on alphabetic characters plus uppercase lookahead, so decimals (4.2), version numbers (v1.0), and short abbreviations (Dr., e.g.) don't false-trigger.

Cosmetic + UX polish

  • bool coerce strips whitespace before lowercasing
  • _render_scalar bool raises on unrecognized strings (no silent flip to false)
  • Bare librarian delete (no args) exits 1 with a usage hint (vs argparse exit 2)
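
The first two items amount to a strict bool parser: normalize whitespace and case, then refuse anything unrecognized instead of defaulting to false. A minimal sketch, where the accepted spellings and the `parse_bool` name are assumptions:

```python
_TRUE = {"true", "yes", "1"}    # assumed spellings
_FALSE = {"false", "no", "0"}   # assumed spellings


def parse_bool(raw: str) -> bool:
    """Strictly parse a boolean; raise on anything unrecognized."""
    text = raw.strip().lower()  # strip first, so " True\n" is accepted
    if text in _TRUE:
        return True
    if text in _FALSE:
        return False
    # No silent flip to False: a typo like "ture" is an error.
    raise ValueError(f"not a boolean: {raw!r}")
```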

Test plan

  • pytest green on 3.10 / 3.11 / 3.12
  • Lint + format clean
  • MCP server smoke test against real corpus (364 entries)
  • Eight rounds of automated review on the v1.7 cleanup PR

Upgrade notes

No breaking changes. Existing entries, schemas, and ledger files are forward-compatible.

Known follow-ups (v1.7.1 candidates)

Review-batch deferrals: lock-scope expansion to argparse / dry-run paths, dry-run preview ↔ ledger drift in merge, opt-out advice joiner, folded-scalar (>) handling in --append-sources, further regex tightening for consolidat\w* / merged, cmd_file_rehash --all lock scope, scanner edge cases (CRLF strip set, window-edge backtick straddle).

v1.3.0 — Discoverability: env, schema enum values, docs_optional

@daniel-pittman daniel-pittman released this 28 May 21:05
1cdf706

v1.3.0

Three introspection features so an agent or operator can discover the active setup and valid inputs without reading config files.

Features

  • env command (and librarian_env MCP tool) — prints each resolved data-home path (home, activities, files, ledger, schema, root, artifacts, memory_dir), which LIBRARIAN_* variable set it (or home/derived/default), and whether it exists. --json supported. Read-only; paths are local to the machine.
  • schema enumerates enum values — every enum field's allowed values are listed: plain enums inline, dependent enums as a per-parent map (e.g. category → subcategory), in both text and --json. No more reading schema.yaml to learn valid subcategory values.
  • docs_optional field — an entry that legitimately has no artifact can set docs_optional: true to suppress its NO DOCS validation warning, so genuine gaps stand out. Settable on create and via update-field <id> docs_optional true, written as a real YAML boolean, honored by validate. A shared strict bool parser keeps create and update-field in agreement and rejects typos.

Quality

  • 212-test suite green on Python 3.10 / 3.11 / 3.12; ruff lint + format clean.
  • Reviewed across two Claude code-review rounds + security-review.

Full changelog: v1.2.0...v1.3.0

v1.2.0 — Query what changed since you last pulled

@daniel-pittman daniel-pittman released this 28 May 17:57
7537c90

v1.2.0

Headline feature

--changed-since / --changed-until on search, filter, and list (and the matching MCP tool params). Intersect any query with the change ledger to find what's changed since a point in time — e.g. librarian filter --tag grant --changed-since 2026-05-01 — handy for exporting only-the-updates into an external system. Date-only bounds are inclusive of the whole day; entries with no ledger history are excluded.
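
One way to read "date-only bounds are inclusive of the whole day" is sketched below. The function names are hypothetical, and treating bare dates and naive timestamps as UTC is an assumption; the ledger's actual timezone handling is not described in these notes.

```python
from datetime import date, datetime, time, timezone
from typing import Optional


def parse_bound(text: str, *, end: bool) -> datetime:
    """Parse an ISO date or datetime into a timezone-aware bound."""
    if len(text) == 10:  # date-only, e.g. 2026-05-01
        day = date.fromisoformat(text)
        # since -> start of day, until -> end of day, so the whole day is included
        return datetime.combine(day, time.max if end else time.min, tzinfo=timezone.utc)
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # Comparing naive with aware datetimes raises TypeError, so normalize.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


def in_window(ts: datetime, since: Optional[str] = None, until: Optional[str] = None) -> bool:
    """True when an aware ledger timestamp falls within the optional bounds."""
    if since is not None and ts < parse_bound(since, end=False):
        return False
    if until is not None and ts > parse_bound(until, end=True):
        return False
    return True
```

Normalizing every bound to an aware datetime before comparing is also the kind of fix that avoids a naive-versus-aware comparison crash on a bare date.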

Fixes & internals since v1.1.0

  • Fixed a latent timezone-comparison crash in the changes --since ledger query (a bare date like 2026-05-01 previously crashed and surfaced as an empty result).
  • Same-indent YAML corruption fix in add-docs / add-tags (#8).
  • Blank-line / #-comment scan truncation fix across add-docs / add-tags / remove-tags (#9).
  • Extracted a shared _scan_list_items helper, collapsing three duplicate scan loops (#10).
  • Hardened MCP error surfacing — read tools now report CLI errors instead of returning empty output.
  • Entry-id validation: create and rename-id enforce ledger-safe slugs (lowercase, digits, hyphens; ≥2 chars).
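
The slug rule in the last item, read literally, fits in one regex. This is a sketch of the stated constraint only; the real validator may add rules (for example about leading or trailing hyphens) that the notes don't mention.

```python
import re

# Lowercase letters, digits, and hyphens; at least two characters.
_SLUG_RE = re.compile(r"[a-z0-9-]{2,}")


def is_valid_entry_id(entry_id: str) -> bool:
    """True when entry_id is a ledger-safe slug under the stated rule."""
    return _SLUG_RE.fullmatch(entry_id) is not None
```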

Quality

  • 192-test suite green on Python 3.10 / 3.11 / 3.12; ruff lint + format clean.
  • Every contributing PR passed Claude code-review + security-review gates.

Full changelog: v1.1.0...v1.2.0

v1.1.0 — Enhanced rolodex

@daniel-pittman daniel-pittman released this 26 May 07:29
d82a30a

Minor release that improves rolodex (librarian contact) correctness on activity descriptions that use ;-separated affiliation lists. Drop-in for 1.0.x users.

What's new

Six-case ;-boundary handling. The contact extractor's name-walk-back is now driven by one explicit property — a real-name token contains at least one lowercase letter; an institutional acronym does not — and handles six arrangement cases uniformly:

Bob Smith (email)                  → "Bob Smith"
Bob Smith; (email)                 → "Bob Smith"
ZX; Bob Smith (email)              → "Bob Smith"
Alice Garcia; ACM (email)          → "Alice Garcia"
Bob Smith ZX; (email)              → "Bob Smith"
Bob Smith ZX YZ; (email)           → "Bob Smith"

Name-suffix preservation. Generational and degree suffixes survive the post-loop acronym strip via the new _NAME_SUFFIXES allowlist:

  • Roman numerals: II, III, IV, V, VI, VII, VIII, IX, X, XI, XII
  • Generational: Jr, Jr., Sr, Sr., JR, JR., SR, SR.
  • All-caps degrees: MD, MA, BA, MS, BS, JD, MBA, DDS, DVM (mixed-case forms like PhD, MSc survive natively)

Two-pass post-loop strip. Collect trailing suffixes → strip acronyms → re-attach. Prevents a suffix at position 0 from blocking the acronym strip behind it.
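
A rough sketch of the lowercase-letter property and the two-pass strip, operating on already-tokenized name candidates. It covers only the post-loop strip, not the ;-boundary walk-back, and the suffix set below is an abbreviated subset; function names are hypothetical rather than taken from `_walk_back_for_name`.

```python
from typing import List

# Abbreviated subset of the allowlist described above.
NAME_SUFFIXES = {"II", "III", "IV", "Jr", "Jr.", "Sr", "Sr.", "MD", "MBA"}


def is_acronym(token: str) -> bool:
    """A real-name token has at least one lowercase letter; an acronym has none."""
    return not any(ch.islower() for ch in token)


def strip_trailing_acronyms(tokens: List[str]) -> List[str]:
    """Collect trailing suffixes, strip acronyms behind them, then re-attach."""
    tokens = list(tokens)
    suffixes: List[str] = []
    # Pass 1: set aside trailing suffixes so they cannot shield an acronym.
    while tokens and tokens[-1] in NAME_SUFFIXES:
        suffixes.insert(0, tokens.pop())
    # Pass 2: drop trailing acronyms, stopping at a name token or a suffix.
    while tokens and is_acronym(tokens[-1]) and tokens[-1] not in NAME_SUFFIXES:
        tokens.pop()
    return tokens + suffixes
```

With a single pass, `["Name", "ZX", "III"]` would stop at `III` and keep `ZX`; with the two passes it reduces to `["Name", "III"]`.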

18 regression tests anchor the property across all six arrangement cases plus the suffix-preservation and boundary edges. Full suite now at 155 tests.

Known limitations

Documented honestly in the source comment on _walk_back_for_name:

  • Comma-separated lists (Alice Smith, Bob Jones) still leak earlier authors
  • Internal-; tokens (III;some) bypass the boundary
  • 80-char snippet window — verbose affiliations can push the upstream ; out of view
  • Capitalised noise without an acronym marker may sweep into a "name"
  • All-caps last names (SMITH, CHEN) misclassified as acronyms by the lowercase-letter check

Upgrade

pip install --upgrade librarian-tracker

No API changes. No schema changes. No data migration needed. Only the rolodex contact output differs from 1.0.1, and the differences are bug fixes for real-world cases (DU, KIHA, and similar institutional-prefix leaks).

Verified

End-to-end against a live deployment's activities.yaml: six previously-buggy contacts (resolving as DU Lombe Chileshe, DU Hojjat Abdollahi, etc.) now resolve cleanly. The need for the two-pass design was caught by a regression test that exercises Name ZX III; (email) → "Name III".

v1.0.1 — Auto-trigger reviews + brand polish

@daniel-pittman daniel-pittman released this 26 May 03:00

What's new

Auto-trigger Claude reviews

The Claude code review and security review workflows now fire automatically on pull requests rather than waiting for a label. The outside-collaborator approval gate (Actions settings) is what bounds drive-by credit burn from random fork PRs.

  • claude-code-review.yml runs on every PR (opened / synchronize / ready-for-review / reopened).
  • claude-security-review.yml runs on every PR whose base is main or develop, plus workflow_dispatch for manual re-runs.

Brand polish

  • README H1 is now The Librarian to match the character framing of the project.
  • A 1280×640 social-preview banner (docs/img/social-preview.png) is included so the repo can show a proper OpenGraph card when shared on Slack / X / Discord. Upload via Settings → General → Social preview (UI-only).
  • CONTRIBUTING.md trimmed to contributor-facing essentials (the maintainer-setup and threat-model rationale stay in SECURITY.md).

Upgrade

Drop-in for 1.0.0 users. No code or schema changes.

pip install --upgrade librarian-tracker