v0.2.0

@github-actions github-actions released this 19 May 18:30
· 53 commits to master since this release
6bbbd90

What's Changed

Added

  • synto add SOURCE: imports PDF, Markdown, and text files as tracked source documents.
    The original is archived in .synto/sources/<id>/. For PDF files, segments are
    extracted immediately into heading-aware chunks and assembled into canonical raw/*.md
    notes for the ingest pipeline. Use --type to specify the document type and --force to
    re-import; --extend-pack remains reserved and is currently a safe no-op.
  • Source-type prompt system: built-in prompts for notes, textbook, paper, spec,
    api_docs, web_article, corp_docs, and transcript, plus an unknown_text fallback, are
    loaded during ingest analysis based on the declared source type, steering the fast model
    toward type-appropriate structure and terminology.
  • Compile lineage: compile_runs table records every compile job (models, token counts,
    timestamps). Published articles carry a lineage: frontmatter field listing their
    contributing sources and run ID. synto trace article <name> prints the full compile
    history for any article.
  • LLM response cache: llm_cache table stores SHA-256-keyed responses.
    synto maintain --clear-cache flushes all entries; --older-than N prunes entries older
    than N days.
  • Term extraction: extract_terms() and VaultReader.list_terms() added;
    concept_occurrences table (schema v13) links concepts to the source segments they
    appear in.
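
The SHA-256-keyed response cache described above can be illustrated with a minimal sketch. The release notes do not give the actual llm_cache schema or key inputs, so the column names (cache_key, response, created_at), the choice to hash the model name together with the prompt, and every function name here are assumptions for illustration only.

```python
import hashlib
import json
import sqlite3
import time


def make_key(model: str, prompt: str) -> str:
    """Derive a stable SHA-256 hex key from the request fields (assumed inputs)."""
    payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def open_cache(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create a hypothetical llm_cache table if missing."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS llm_cache ("
        "cache_key TEXT PRIMARY KEY, "
        "response TEXT NOT NULL, "
        "created_at REAL NOT NULL)"
    )
    return conn


def put(conn, model, prompt, response, now=None):
    """Store a response under its hash key, overwriting any earlier entry."""
    created = time.time() if now is None else now
    conn.execute(
        "INSERT OR REPLACE INTO llm_cache (cache_key, response, created_at) "
        "VALUES (?, ?, ?)",
        (make_key(model, prompt), response, created),
    )


def get(conn, model, prompt):
    """Return the cached response, or None on a cache miss."""
    row = conn.execute(
        "SELECT response FROM llm_cache WHERE cache_key = ?",
        (make_key(model, prompt),),
    ).fetchone()
    return row[0] if row else None


def prune_older_than(conn, days, now=None):
    """Delete entries older than `days` days and return how many were removed."""
    current = time.time() if now is None else now
    cutoff = current - days * 86400
    cur = conn.execute("DELETE FROM llm_cache WHERE created_at < ?", (cutoff,))
    return cur.rowcount
```

Hashing a canonical JSON payload rather than the raw prompt keeps the key stable when more request fields are added later; pruning by age mirrors the behavior the notes describe for --older-than N.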

Fixed

  • PDF import now preserves ToC preamble text, surfaces bibliographic metadata into raw-note
    frontmatter, closes extractor resources correctly, and detects duplicate imports by content
    hash instead of filename alone.
  • File-based imports are now atomic, and synto add --force correctly replaces prior raw,
    asset, and import state instead of leaving stale artifacts behind.
  • synto status now counts imported on-disk raw notes before ingest, so synto add output
    shows up immediately as Raw: new.
  • Structured-output recovery now repairs malformed JSON escapes more aggressively, fixing
    live compile failures caused by invalid backslash escapes and malformed \u... sequences.
  • Compile cleanup now strips stray [[wikilinks]... placeholder artifacts from generated
    article bodies before draft write.
  • Ingest invalidation now respects source-type prompt changes, and imported ### Media
    blocks are stripped before article synthesis.
  • Smoke coverage now matches current runtime behavior, including LM Studio model-id
    resolution and the intentional --extend-pack no-op.
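
To make the JSON escape repair mentioned above concrete, here is one possible approach, sketched under assumptions: it is not the project's actual recovery code, and the function name repair_json_escapes is hypothetical. It leaves valid JSON escapes untouched and doubles any backslash that does not start one, so an invalid escape or a truncated \u sequence becomes literal text instead of a parse error.

```python
import json
import re

# First alternative: a backslash followed by a valid JSON escape, consumed as a unit.
# Second alternative: any other backslash, which is invalid inside JSON strings.
_ESCAPE_RE = re.compile(r'\\(u[0-9a-fA-F]{4}|["\\/bfnrt])|\\')


def repair_json_escapes(text: str) -> str:
    """Double every backslash that does not begin a valid JSON escape."""

    def fix(match: re.Match) -> str:
        if match.group(1) is not None:
            return match.group(0)  # valid escape: keep unchanged
        return "\\\\"  # stray backslash: emit an escaped backslash

    return _ESCAPE_RE.sub(fix, text)


def loads_with_repair(text: str):
    """Try strict parsing first; fall back to escape repair only on failure."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return json.loads(repair_json_escapes(text))
```

Matching valid escapes as whole units matters: a simple negative lookahead would misread the second backslash of an already escaped pair as a stray one. Parsing strictly first means well-formed model output is never rewritten.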