Opportunities Annotation Ecosystem
14 tables and 1 SQL view added to the schema across three interconnected systems:
Fragment Joins (4 tables): join_groups groups N-way reconstructions; fragment_joins stores pairwise links that move through a proposed → verified → accepted (or rejected) pipeline; fragment_join_evidence and fragment_join_decisions provide the same trust infrastructure that readings and lemmatizations already have.
Scholarly Annotations (2 tables + 1 view): scholarly_annotations accepts commentary at any granularity (artifact, surface, line, token, sign, composite) via nullable FK columns, with a CHECK constraint enforcing exactly one target. It supports 12 annotation types, ranging from textual criticism to conservation notes. The scholarly_annotations_w3c view exports to W3C Web Annotation format for future federation.
Discussion Threads (6 tables): 5 per-entity thread tables hold strict FKs to their targets (token_readings, lemmatizations, translations, fragment_joins, scholarly_annotations). A shared discussion_posts table stores typed contributions (observation, counterargument, evidence, question, synthesis, endorsement).
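The exactly-one-target rule on scholarly_annotations can be sketched as follows. This is a minimal illustration, not the project's actual DDL: the six column names are assumptions derived from the granularities listed above, and the REFERENCES clauses to the target tables are omitted to keep the example self-contained.

```python
import sqlite3

# Hypothetical column names; the wiki lists the granularities, not the columns.
TARGETS = ["artifact_id", "surface_id", "line_id",
           "token_id", "sign_id", "composite_id"]

column_defs = ", ".join(f"{c} INTEGER" for c in TARGETS)
# In SQLite, (col IS NOT NULL) evaluates to 0 or 1, so the sum counts set targets.
exactly_one = " + ".join(f"({c} IS NOT NULL)" for c in TARGETS) + " = 1"

conn = sqlite3.connect(":memory:")
conn.execute(f"""
    CREATE TABLE scholarly_annotations (
        id INTEGER PRIMARY KEY,
        {column_defs},
        annotation_type TEXT NOT NULL,
        body TEXT NOT NULL,
        CHECK ({exactly_one})
    )
""")


def try_insert(**cols):
    """Insert one annotation; return False if the CHECK constraint rejects it."""
    names = ", ".join(cols)
    marks = ", ".join("?" for _ in cols)
    try:
        conn.execute(
            f"INSERT INTO scholarly_annotations ({names}) VALUES ({marks})",
            tuple(cols.values()),
        )
        return True
    except sqlite3.IntegrityError:
        return False


one_target = try_insert(line_id=7, annotation_type="note", body="ok")
two_targets = try_insert(line_id=7, sign_id=3, annotation_type="note", body="bad")
no_target = try_insert(annotation_type="note", body="bad")
```

Under these assumptions only the first insert succeeds; rows with zero or two targets are rejected by the database itself rather than by application code.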
| Decision | Trade | Gain |
| --- | --- | --- |
| Strict FKs everywhere | 5 thread tables instead of 1 polymorphic | DB-enforced referential integrity at every entity boundary |
| Nullable FK columns on scholarly_annotations | 6 nullable columns per row (5 of them NULL at any time), CHECK constraint | Real foreign keys to all 6 target types without a join table |
| Shared discussion_posts | thread_type + thread_id polymorphic on post side (no DB-level FK to threads) | Avoids 5 separate post tables; cross-thread queries stay simple |
| join_groups for N-way | Extra table + fragment_count bookkeeping | No transitive closure queries; explicit grouping for 3+ fragment tablets |
| W3C view over native W3C storage | Export-only interoperability, not full bidirectional federation | Internal schema stays optimized for SQLite; W3C is a projection |
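The first and third trade-offs can be seen side by side in a small sketch. The thread table names (fragment_join_threads, translation_threads) and all column names other than thread_type and thread_id are assumptions for illustration; only two of the five thread tables are shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE fragment_joins (id INTEGER PRIMARY KEY);
    CREATE TABLE translations (id INTEGER PRIMARY KEY);

    -- One thread table per entity, each with a real FK to its target.
    CREATE TABLE fragment_join_threads (
        id INTEGER PRIMARY KEY,
        fragment_join_id INTEGER NOT NULL REFERENCES fragment_joins(id)
    );
    CREATE TABLE translation_threads (
        id INTEGER PRIMARY KEY,
        translation_id INTEGER NOT NULL REFERENCES translations(id)
    );

    -- Shared posts: (thread_type, thread_id) names the thread, with no FK.
    CREATE TABLE discussion_posts (
        id INTEGER PRIMARY KEY,
        thread_type TEXT NOT NULL
            CHECK (thread_type IN ('fragment_join', 'translation')),
        thread_id INTEGER NOT NULL,
        post_type TEXT NOT NULL CHECK (post_type IN (
            'observation', 'counterargument', 'evidence',
            'question', 'synthesis', 'endorsement')),
        body TEXT NOT NULL
    );
""")

conn.execute("INSERT INTO fragment_joins (id) VALUES (1)")
conn.execute("INSERT INTO fragment_join_threads (id, fragment_join_id) VALUES (1, 1)")


def orphan_thread_rejected():
    """A thread pointing at a missing join is refused by the strict FK."""
    try:
        conn.execute("INSERT INTO fragment_join_threads (fragment_join_id) VALUES (999)")
        return False
    except sqlite3.IntegrityError:
        return True


insert_post = ("INSERT INTO discussion_posts (thread_type, thread_id, post_type, body) "
               "VALUES (?, ?, ?, ?)")
conn.execute(insert_post, ("fragment_join", 1, "observation", "Break edges match."))
# No translation thread 42 exists, yet the row is accepted: this is the trade.
conn.execute(insert_post, ("translation", 42, "question", "Which thread is this?"))

posts_by_type = conn.execute(
    "SELECT thread_type, COUNT(*) FROM discussion_posts "
    "GROUP BY thread_type ORDER BY thread_type"
).fetchall()
```

In this sketch the thread side is protected by the database, while the post side depends on application code to keep thread_id valid. In return, a single GROUP BY spans every thread type.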
For the data model: Every layer now has a complete trust chain. Structured data (readings, lemmas, translations) already supported competing interpretations, evidence, and decisions. Fragment joins now get the same rigor, and unstructured commentary gets a formal home with provenance.
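A rough sketch of what the trust chain for fragment joins might look like, using the four table names given above. Every column name here is an assumption, as is the decide helper; the real schema may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE join_groups (
        id INTEGER PRIMARY KEY,
        fragment_count INTEGER NOT NULL
    );
    CREATE TABLE fragment_joins (
        id INTEGER PRIMARY KEY,
        join_group_id INTEGER REFERENCES join_groups(id),
        fragment_a TEXT NOT NULL,
        fragment_b TEXT NOT NULL,
        status TEXT NOT NULL DEFAULT 'proposed'
            CHECK (status IN ('proposed', 'verified', 'accepted', 'rejected'))
    );
    CREATE TABLE fragment_join_evidence (
        id INTEGER PRIMARY KEY,
        fragment_join_id INTEGER NOT NULL REFERENCES fragment_joins(id),
        note TEXT NOT NULL
    );
    CREATE TABLE fragment_join_decisions (
        id INTEGER PRIMARY KEY,
        fragment_join_id INTEGER NOT NULL REFERENCES fragment_joins(id),
        new_status TEXT NOT NULL,
        rationale TEXT NOT NULL
    );
""")

# Three fragments A, B, C: two pairwise links share one join group.
conn.execute("INSERT INTO join_groups (id, fragment_count) VALUES (1, 3)")
conn.executemany(
    "INSERT INTO fragment_joins (join_group_id, fragment_a, fragment_b) VALUES (1, ?, ?)",
    [("A", "B"), ("B", "C")],
)
conn.execute(
    "INSERT INTO fragment_join_evidence (fragment_join_id, note) VALUES (1, ?)",
    ("Break edges align on the obverse.",),
)


def decide(join_id, new_status, rationale):
    """Record the decision, then move the join to its new status."""
    conn.execute(
        "INSERT INTO fragment_join_decisions (fragment_join_id, new_status, rationale) "
        "VALUES (?, ?, ?)",
        (join_id, new_status, rationale),
    )
    conn.execute("UPDATE fragment_joins SET status = ? WHERE id = ?", (new_status, join_id))


def fragments_in_group(group_id):
    """List a group's fragments with a flat query; no recursive traversal needed."""
    rows = conn.execute(
        """
        SELECT fragment_a FROM fragment_joins WHERE join_group_id = ?
        UNION
        SELECT fragment_b FROM fragment_joins WHERE join_group_id = ?
        ORDER BY 1
        """,
        (group_id, group_id),
    ).fetchall()
    return [r[0] for r in rows]


decide(1, "verified", "Collated against photographs.")
group_members = fragments_in_group(1)
```

Because both pairwise links carry the same join_group_id, the members of a three-fragment reconstruction come back from one flat query rather than from walking A to B to C.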
For the annotation silo problem: The three solutions form a graduated capture system. Structured scholarly claims (readings, joins) flow through high-rigor pipelines. Semi-structured commentary (paleographic notes, conservation observations) enters via scholarly_annotations. The discourse that produces consensus — currently lost in emails and conferences — gets captured in discussion threads with full attribution.
For federation: The W3C view is the interoperability surface. Internal storage stays SQLite-optimized, but any scholarly_annotations row with visibility = 'public' can be exported as a standards-compliant Web Annotation. This is the seed for the distributed ecosystem described in the opportunities doc, and it does not require Glintstone to become an annotation server.
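One way such a projection could work is a view that assembles the Web Annotation JSON with SQLite's json_object function. This is a sketch under assumptions: the single target_uri column stands in for the real nullable FK columns, the URN scheme for annotation ids and the conservation_note type name are invented for illustration, and the actual scholarly_annotations_w3c definition may look quite different. It relies on SQLite's JSON functions, which are built in by default from SQLite 3.38.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE scholarly_annotations (
        id INTEGER PRIMARY KEY,
        target_uri TEXT NOT NULL,          -- simplification of the six FK columns
        annotation_type TEXT NOT NULL,
        body TEXT NOT NULL,
        visibility TEXT NOT NULL DEFAULT 'private'
    );

    -- Projection only: nothing is stored in W3C form.
    CREATE VIEW scholarly_annotations_w3c AS
    SELECT id,
           json_object(
               '@context', 'http://www.w3.org/ns/anno.jsonld',
               'id', 'urn:example:annotation:' || id,
               'type', 'Annotation',
               'body', json_object('type', 'TextualBody', 'value', body),
               'target', target_uri
           ) AS annotation_json
    FROM scholarly_annotations
    WHERE visibility = 'public';
""")

insert_sql = ("INSERT INTO scholarly_annotations "
              "(target_uri, annotation_type, body, visibility) VALUES (?, ?, ?, ?)")
conn.execute(insert_sql, ("urn:example:artifact:1", "conservation_note",
                          "Salt damage on the reverse.", "public"))
conn.execute(insert_sql, ("urn:example:artifact:2", "conservation_note",
                          "Draft remark, not for release.", "private"))


def export_public():
    """Return every public annotation as a parsed Web Annotation document."""
    rows = conn.execute(
        "SELECT annotation_json FROM scholarly_annotations_w3c ORDER BY id"
    ).fetchall()
    return [json.loads(r[0]) for r in rows]


exported = export_public()
```

Only the public row appears in the export, and the private one never leaves the internal table. Since the view is read-only, this matches the export-only trade-off: nothing flows back in through the W3C surface.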
For academic credit: Every discussion post, every annotation, every join proposal traces back to a scholar via annotation_runs. The reasoning behind consensus decisions is no longer invisible. A graduate student who spots a fragment join gets a formal, citable record of that contribution.
Source: github.com/wittkensis/glintstone