v1.1.0 — Enhanced rolodex

@daniel-pittman released this 26 May 07:29
d82a30a

Minor release. Improves rolodex (librarian contact) correctness for activity descriptions that use ;-separated affiliation lists. Drop-in upgrade for 1.0.x users.

What's new

Six-case ;-boundary handling. The contact extractor's name-walk-back is now driven by one explicit property — a real-name token contains at least one lowercase letter; an institutional acronym does not — and handles six arrangement cases uniformly:

Bob Smith (email)                  → "Bob Smith"
Bob Smith; (email)                 → "Bob Smith"
ZX; Bob Smith (email)              → "Bob Smith"
Alice Garcia; ACM (email)          → "Alice Garcia"
Bob Smith ZX; (email)              → "Bob Smith"
Bob Smith ZX YZ; (email)           → "Bob Smith"
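
The six cases above can be reproduced with a small sketch of the lowercase-letter property. This is not the library's code: `sketch_walk_back` and `is_name_token` are hypothetical names, the input is assumed to be the text immediately before `(email)`, and suffix handling is left out on purpose.

```python
def is_name_token(token):
    """The property from the notes: a real-name token has a lowercase letter."""
    return any(ch.islower() for ch in token)


def sketch_walk_back(prefix):
    """Hypothetical walk-back over the text that precedes "(email)"."""
    name = []
    for raw in reversed(prefix.split()):
        ends_segment = raw.endswith(";")
        token = raw.rstrip(";")
        if name and ends_segment:
            # A ';' to the left of the collected name is a hard boundary.
            break
        if not is_name_token(token):
            if name:
                # An acronym to the left of the name also ends the walk.
                break
            # Acronyms trailing the name (with or without ';') are skipped.
            continue
        name.append(token)
    return " ".join(reversed(name))


cases = {
    "Bob Smith": "Bob Smith",
    "Bob Smith;": "Bob Smith",
    "ZX; Bob Smith": "Bob Smith",
    "Alice Garcia; ACM": "Alice Garcia",
    "Bob Smith ZX;": "Bob Smith",
    "Bob Smith ZX YZ;": "Bob Smith",
}
for prefix, expected in cases.items():
    assert sketch_walk_back(prefix) == expected
```

The sketch treats a `;` as a boundary only once at least one name token has been collected, which is what lets `Bob Smith;` and `Alice Garcia; ACM` resolve to the name on the left of the separator.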

Name-suffix preservation. Generational and degree suffixes survive the post-loop acronym strip via the new _NAME_SUFFIXES allowlist:

  • Roman numerals: II, III, IV, V, VI, VII, VIII, IX, X, XI, XII
  • Generational: Jr, Jr., Sr, Sr., JR, JR., SR, SR.
  • All-caps degrees: MD, MA, BA, MS, BS, JD, MBA, DDS, DVM (mixed-case forms like PhD, MSc survive natively)

Two-pass post-loop strip. Collect trailing suffixes → strip acronyms → re-attach. Prevents a suffix at position 0 from blocking the acronym strip behind it.
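
A minimal sketch of the collect, strip, re-attach order, assuming the candidate name is already a list of tokens. `NAME_SUFFIXES` copies the allowlist entries given in these notes, but the function names and the token-list representation are assumptions, not the library's internals. `single_pass_strip` shows one way a single pass could leave an acronym behind.

```python
# Entries copied from the release notes; the real _NAME_SUFFIXES may be
# structured differently.
NAME_SUFFIXES = frozenset({
    "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X", "XI", "XII",
    "Jr", "Jr.", "Sr", "Sr.", "JR", "JR.", "SR", "SR.",
    "MD", "MA", "BA", "MS", "BS", "JD", "MBA", "DDS", "DVM",
})


def is_acronym(token):
    """No lowercase letter means the token is treated as an acronym."""
    return not any(ch.islower() for ch in token)


def single_pass_strip(tokens):
    """Hypothetical single pass: an allowlisted suffix stops the strip early."""
    tokens = list(tokens)
    while tokens and is_acronym(tokens[-1]) and tokens[-1] not in NAME_SUFFIXES:
        tokens.pop()
    return tokens


def two_pass_strip(tokens):
    """Collect trailing suffixes, strip acronyms, then re-attach the suffixes."""
    tokens = list(tokens)
    suffixes = []
    # Pass 1: set aside allowlisted suffixes found at the end.
    while tokens and tokens[-1] in NAME_SUFFIXES:
        suffixes.insert(0, tokens.pop())
    # Pass 2: strip the acronyms that are now exposed at the end.
    while tokens and is_acronym(tokens[-1]):
        tokens.pop()
    return tokens + suffixes


print(single_pass_strip(["Name", "ZX", "III"]))  # ['Name', 'ZX', 'III']
print(two_pass_strip(["Name", "ZX", "III"]))     # ['Name', 'III']
```

In the single-pass version the trailing `III` is protected by the allowlist, so the loop exits before it reaches `ZX`. Setting the suffix aside first removes that obstacle.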

18 regression tests anchor the property across all six arrangement cases plus the suffix-preservation and boundary edges. Full suite now at 155 tests.

Known limitations

Documented honestly in the source comment on _walk_back_for_name:

  • Comma-separated lists (Alice Smith, Bob Jones) still leak earlier authors
  • Internal-; tokens (III;some) bypass the boundary
  • 80-char snippet window — verbose affiliations can push the upstream ; out of view
  • Capitalised noise without an acronym marker may sweep into a "name"
  • All-caps last names (SMITH, CHEN) misclassified as acronyms by the lowercase-letter check
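
Two of these limitations follow directly from the lowercase-letter property and can be seen in isolation. The snippet below is an illustrative sketch with a hypothetical `is_name_token` helper, not the extractor itself.

```python
def is_name_token(token):
    """The property from the notes: a real-name token has a lowercase letter."""
    return any(ch.islower() for ch in token)


# All-caps surnames contain no lowercase letter, so they look like acronyms.
for token in ["Smith", "SMITH", "CHEN", "ACM"]:
    print(token, is_name_token(token))

# An internal ';' does not end the token, so a check on the token's last
# character never sees a boundary, and the lowercase letters make it a "name".
token = "III;some"
print(token.endswith(";"), is_name_token(token))
```

Only `Smith` passes the check in the first loop. `SMITH` and `CHEN` are indistinguishable from `ACM` under this property.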

Upgrade

pip install --upgrade librarian-tracker

No API changes. No schema changes. No data migration needed. Only the rolodex contact output differs from 1.0.1, and the differences are bug fixes for real-world cases (DU, KIHA, and similar institutional-prefix leaks).

Verified

End-to-end against a live deployment's activities.yaml: six previously-buggy contacts (resolving as DU Lombe Chileshe, DU Hojjat Abdollahi, etc.) now resolve cleanly. The ordering bug that the two-pass design fixes was caught by a regression test that exercises Name ZX III; (email) → Name III.