Marginalia v0.1.1

@shenmintao shenmintao released this 28 May 13:06

This release focuses on hardening the end-to-end private knowledge-base workflow after seed-user testing: better long-document ingest, more reliable PDF/OCR handling, stricter evidence/citation behavior, scoped retrieval, and provider-specific LLM thinking controls.

Highlights

  • Improved long-document ingestion with chunk-level indexing, section summaries, coverage metadata, and parallel LLM calls for large text/PDF inputs.
  • Added OCR text block storage for scanned PDFs, with uncapped OCR support when configured.
  • Strengthened retrieval tools: search_metadata and search_journal now handle multi-keyword OR queries more reliably, and search_metadata supports catalog- and folder-scoped filtering.
  • Improved citation and export robustness, including tolerant parsing for Chinese commas, concatenated quoted fragments, and page=... variants.
  • Added provider-specific reasoning/thinking controls for Qwen, DeepSeek, Kimi, Gemini, OpenRouter, SiliconFlow, DashScope/Bailian, Together, NVIDIA, MiniMax, VolcEngine/BytePlus, Moonshot, and MiMo.
  • Disabled thinking output for ingest and Qwen VLM/image paths to avoid leaking <think> content into summaries, descriptions, or indexed metadata.
  • Made worker concurrency easier to reason about by aligning ingest task concurrency with WORKER_BATCH_SIZE.
  • Refined prompts across ingest, export, PDF reading, evidence discovery, and agent answer generation.
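
As an illustration of the <think> leakage problem mentioned above, here is a minimal defensive filter. It is a sketch only: the release disables thinking output at the provider level, and the helper name strip_think is hypothetical, not part of Marginalia. It shows how leftover reasoning blocks could be removed before text reaches a summary or index.

```python
import re

# Matches a complete <think>...</think> block, including newlines.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL | re.IGNORECASE)
# Matches an unterminated <think> block that runs to the end of the text.
_THINK_OPEN = re.compile(r"<think>.*\Z", re.DOTALL | re.IGNORECASE)


def strip_think(text: str) -> str:
    """Remove <think> reasoning blocks from model output (hypothetical helper)."""
    without_closed = _THINK_BLOCK.sub("", text)
    without_open = _THINK_OPEN.sub("", without_closed)
    return without_open.strip()
```

The second pattern covers responses that were cut off before the closing tag, which is a plausible failure mode when output is truncated by a token limit.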

Retrieval And Evidence

  • search_metadata supports text arrays for OR-style recall.
  • search_metadata can restrict results by catalog_id, catalog_subtree, folder_id, and folder_subtree.
  • list_catalogs treats an omitted or null parent as a request for the root listing.
  • Query logs and exported evidence are easier to inspect and reuse.
  • Footnote generation and citation locators are more robust across file types and quote/page variants.
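
The OR-style recall and subtree scoping described above can be sketched in memory as follows. This is an assumption-laden illustration, not Marginalia's implementation: the Doc type, the catalog_path representation, and the search_or function are hypothetical, and the real search_metadata tool may rank and filter differently.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    catalog_path: tuple  # hypothetical: path from root, e.g. ("research", "papers")
    metadata_text: str


def in_subtree(path: tuple, root: tuple) -> bool:
    """True if path is the root catalog itself or any descendant of it."""
    return path[:len(root)] == root


def search_or(docs, keywords, catalog_subtree=None):
    """Return ids of docs matching ANY keyword, most matched keywords first."""
    lowered = [k.strip().lower() for k in keywords if k.strip()]
    hits = []
    for doc in docs:
        if catalog_subtree is not None and not in_subtree(
            doc.catalog_path, tuple(catalog_subtree)
        ):
            continue  # outside the requested scope
        haystack = doc.metadata_text.lower()
        matched = sum(1 for k in lowered if k in haystack)
        if matched:
            hits.append((matched, doc.doc_id))
    # Sort by match count descending, then by id for a stable order.
    hits.sort(key=lambda h: (-h[0], h[1]))
    return [doc_id for _, doc_id in hits]
```

Passing a list of keywords widens recall, while the subtree argument narrows the candidate set before matching, so the two controls compose independently.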

PDF, OCR, And Long Documents

  • Long text/PDF indexing records coverage and partial-indexing metadata.
  • PDF OCR can preserve full OCR output instead of truncating useful source text.
  • OCR PDF ingest stores block/page-level text for later targeted reads.
  • read_files handles page-range reads and scanned documents better for PDFs.
  • Large ingest workloads can use parallel LLM calls to reduce wall-clock time.
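
To make the combination of chunk-level indexing, coverage metadata, and parallel LLM calls concrete, here is a minimal sketch. Everything in it is assumed for illustration: index_document, the summarize callback (standing in for an LLM call), and the coverage field names are hypothetical and do not reflect Marginalia's actual schema.

```python
from concurrent.futures import ThreadPoolExecutor


def split_chunks(text: str, size: int) -> list:
    """Split text into fixed-size character chunks (the last may be shorter)."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def index_document(text, summarize, chunk_size=1000, max_chunks=None, max_workers=4):
    """Summarize chunks in parallel and record how much of the text was covered."""
    chunks = split_chunks(text, chunk_size)
    selected = chunks if max_chunks is None else chunks[:max_chunks]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in input order, so summaries align with chunks.
        summaries = list(pool.map(summarize, selected))
    return {
        "summaries": summaries,
        "coverage": {
            "chunks_total": len(chunks),
            "chunks_indexed": len(selected),
            "chars_covered": sum(len(c) for c in selected),
            "chars_total": len(text),
            "partial": len(selected) < len(chunks),
        },
    }
```

Recording a partial flag alongside the counts lets a later reader of the index tell a fully indexed document from one that was cut off by a chunk cap.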

Agent And Runtime

  • Improved final-answer handling for long answers and continuation flows.
  • Reduced accidental reasoning leakage from model responses.
  • Tightened tool prompts and parser contracts.
  • Added task scheduling hardening and active-task deduplication.
  • Improved session/title handling and runtime guard behavior.
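
Active-task deduplication, mentioned in the list above, can be pictured as a registry that refuses to start a task whose key is already in flight. The sketch below is hypothetical: the TaskRegistry class and the (kind, target) key are illustrative choices, not the scheduler this release ships.

```python
import threading


class TaskRegistry:
    """Tracks in-flight tasks so the same (kind, target) pair is not run twice."""

    def __init__(self):
        self._active = set()
        self._lock = threading.Lock()

    def try_start(self, kind: str, target: str) -> bool:
        """Register the task and return True, or return False if already active."""
        key = (kind, target)
        with self._lock:
            if key in self._active:
                return False
            self._active.add(key)
            return True

    def finish(self, kind: str, target: str) -> None:
        """Mark the task as done so it can be scheduled again later."""
        with self._lock:
            self._active.discard((kind, target))
```

A caller would check try_start before enqueueing work and call finish in a finally block, so a failed task does not block future retries.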

GUI And Configuration

  • Added or refined settings for worker batch size and ingest concurrency.
  • Improved activity/status visibility during large ingest runs.
  • Added a lifecycle configuration switch for users whose file workflow should not auto-demote or archive files.

Documentation

  • Revised README, USAGE, DESIGN, and architecture notes to describe Marginalia as AI retrieval infrastructure for private heterogeneous knowledge bases.
  • Documented structured funnel retrieval, persistent investigation logs, recommender-style evidence discovery, and source verification.

Upgrade Notes

  • Existing ingested PDFs will not automatically gain the new OCR/block-level metadata. Reprocess important scanned PDFs if you want the new read behavior.
  • For heavy OCR or long-document imports, tune WORKER_BATCH_SIZE and ingest LLM concurrency according to provider rate limits.
  • This release does not require a manual database migration step beyond normal startup/bootstrap behavior.
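
As a rough starting point for the concurrency tuning advice above, the number of requests in flight can be estimated from a provider's rate limit and the typical call latency (Little's law: in-flight = arrival rate × latency). The function below is a hypothetical helper with an assumed safety margin; it is not a Marginalia setting, and real limits may also be expressed in tokens per minute.

```python
import math


def max_concurrency(requests_per_minute: int, avg_latency_seconds: float,
                    safety: float = 0.8) -> int:
    """Estimate how many parallel LLM calls fit under a requests-per-minute limit."""
    in_flight = requests_per_minute / 60.0 * avg_latency_seconds * safety
    # Always allow at least one call so ingest can make progress.
    return max(1, math.floor(in_flight))
```

For example, a limit of 60 requests per minute with calls averaging 10 seconds suggests about 8 concurrent calls with the default margin.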