Releases: shenmintao/marginalia
Marginalia v0.2.4
Highlights
- Citation footnotes now show the cited quote excerpt while hiding internal quote_status=... markers; source links and quote/page locators are still preserved.
- Switched py7zz back to the upstream PyPI package at >=1.3.1, replacing the temporary forked wheel URLs now that upstream publishes ARM64 wheels.
- Stable release for the 0.2.4 line, including the 0.2.4-rc.1 feature set.
Artifacts
Desktop bundles built from v0.2.4.
Desktop targets: Windows x64/arm64, macOS arm64, Linux x64/arm64.
Each bundle ships a self-contained Python runtime; no system Python required.
Docker image: ghcr.io/shenmintao/marginalia:v0.2.4 (linux/amd64, linux/arm64)
First-Launch Notes For Unsigned Binaries
- Windows: SmartScreen may say "Windows protected your PC". Click "More info" and then "Run anyway".
- macOS: Gatekeeper may refuse to open the .dmg. Run xattr -dr com.apple.quarantine /Applications/Marginalia.app after dragging it across.
Marginalia v0.2.4-rc.1
Highlights
Added
- Optional API bearer authentication via MARGINALIA_API_TOKEN, with CLI and desktop client support for sending the token.
- Auto chat mode now defaults new turns to planner-selected quick/standard/deep execution budgets, with visible budget upgrade notices when fresh evidence justifies continuing.
- marginalia eval ablation-run for candidate-pool component attribution across metadata, relation expansion, semantic recall, rerank, and full recall configurations.
- marginalia mcp / marginalia-mcp stdio server exposing the read-only retrieval tool set to MCP-capable clients.
- Python linting baseline with ruff check src tests in CI.
- Postgres metadata search now uses native text-search expressions with GIN indexes, and eval coverage now includes a tiny CJK short-term dataset path.
- Journal recall now annotates stale entry references caused by deletion or reprocessing, downgrades stale notes behind current notes, and hides rows invalidated by later contradictory reflections.
- MAINTENANCE_DAILY_TOKEN_BUDGET can cap rolling 24-hour background maintenance LLM usage and defer low-priority speculative tasks when spent.
- Relation discovery now vets directly hit unjudged edges lazily during /discover; periodic batch vet_relations is opt-in via RELATION_BACKGROUND_VETTING_ENABLED.
- Citation display now marks quote-bearing footnotes as quote_status=verified or quote_status=unverified after checking the cited entry's original readable text with whitespace/punctuation normalization.
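The quote check in the last item can be pictured as a normalized substring test. The following is a minimal sketch, not the project's implementation: it assumes that "whitespace/punctuation normalization" means Unicode NFKC folding followed by dropping all whitespace and punctuation characters, and the function names normalize and quote_status are hypothetical.

```python
import unicodedata


def normalize(text: str) -> str:
    """Fold width variants (NFKC), then drop whitespace and punctuation."""
    folded = unicodedata.normalize("NFKC", text)
    kept = []
    for ch in folded:
        category = unicodedata.category(ch)
        # Z* categories are separators, P* categories are punctuation.
        if ch.isspace() or category.startswith(("Z", "P")):
            continue
        kept.append(ch)
    return "".join(kept)


def quote_status(quote: str, source_text: str) -> str:
    """Return 'verified' when the normalized quote occurs in the normalized source."""
    needle = normalize(quote)
    if not needle:
        # An empty quote cannot be verified against anything.
        return "unverified"
    return "verified" if needle in normalize(source_text) else "unverified"
```

Under these assumptions a quote still verifies when the model changes spacing or swaps a full-width comma for an ASCII one, while a quote that does not occur in the source stays unverified.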
Changed
- Split the eval implementation into dataset, retrieval, metrics, reporting, prompt, and probe modules while keeping marginalia.eval.core as the compatibility import path.
Fixed
- query_sql now disables DuckDB external access before executing model-authored SQL, blocking path-literal, scan-function, and glob-style local file reads outside the loaded entries.
- E2E test temp directories are cleaned with a retrying Windows-aware helper.
- Docker compose now binds API and MinIO ports to localhost by default.
- OCR PDF VLM readback no longer counts PDF pages synchronously on the async read path.
- Mixed metadata queries keep short CJK terms via LIKE fallback instead of silently dropping them from trigram FTS.
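A trigram index generally cannot match a term shorter than three characters, which is why two-character CJK words need a separate path. The sketch below illustrates the idea only: it assumes terms are split on whitespace and that the cutoff is three characters, and the names split_terms, like_clause, and the text column are hypothetical rather than taken from the project.

```python
def split_terms(query: str, min_fts_chars: int = 3):
    """Partition query terms into trigram-FTS terms and LIKE-fallback terms."""
    fts_terms, like_terms = [], []
    for term in query.split():
        if len(term) >= min_fts_chars:
            fts_terms.append(term)
        else:
            like_terms.append(term)
    return fts_terms, like_terms


def like_clause(terms, column: str = "text"):
    """Build a parameterized OR-of-LIKE clause; `column` is a hypothetical name."""
    if not terms:
        return "", []

    def escape(term: str) -> str:
        # Escape LIKE wildcards so user text is matched literally.
        return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")

    sql = " OR ".join(f"{column} LIKE ? ESCAPE '\\'" for _ in terms)
    return sql, [f"%{escape(term)}%" for term in terms]
```

For a query such as "retrieval 检索", the first term would go to the full-text index and the second to the LIKE clause, so neither is dropped.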
Documentation
- Documented API token use, compose localhost binding, and the known risk of syncing a live MARGINALIA_HOME with file replication tools.
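For readers scripting against the API, a bearer token is normally sent in the Authorization header. The sketch below is an assumption-laden illustration, not documented client code: it assumes the client reads the token from the MARGINALIA_API_TOKEN environment variable, the request body field "message" is hypothetical, and the POST /v1/chat/{session_id} path with a "mode" field follows the v0.2.1 notes. The helper only builds the request and does not send it.

```python
import json
import os
import urllib.request


def build_chat_request(base_url: str, session_id: str, message: str, mode: str = "quick"):
    """Build (but do not send) a POST request that carries the bearer token if set."""
    headers = {"Content-Type": "application/json"}
    token = os.environ.get("MARGINALIA_API_TOKEN")
    if token:
        headers["Authorization"] = f"Bearer {token}"
    # "message" is an assumed field name; "mode" follows the v0.2.1 release notes.
    body = json.dumps({"message": message, "mode": mode}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/chat/{session_id}",
        data=body,
        headers=headers,
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would send it; when the variable is unset, no Authorization header is added.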
Artifacts
Desktop bundles built from v0.2.4-rc.1.
Desktop targets: Windows x64/arm64, macOS arm64, Linux x64/arm64.
Each bundle ships a self-contained Python runtime; no system Python required.
Docker image: ghcr.io/shenmintao/marginalia:v0.2.4-rc.1 (linux/amd64, linux/arm64)
First-Launch Notes For Unsigned Binaries
- Windows: SmartScreen may say "Windows protected your PC". Click "More info" and then "Run anyway".
- macOS: Gatekeeper may refuse to open the .dmg. Run xattr -dr com.apple.quarantine /Applications/Marginalia.app after dragging it across.
Marginalia v0.2.3
Highlights
- CLI chat mode control: /mode [quick|deep] now shows or switches the investigation mode, and CLI chat requests send the selected mode to /v1/chat.
- marginalia init now includes optional embedding, semantic recall, rerank, and evidence-selection settings in the generated starter .env.
- Desktop chat restores the latest quick/deep mode when returning to an active stream or reopening a historical session.
- Session list and transcript APIs now expose the latest recorded chat mode so the UI can replay sessions without silently falling back to deep mode.
- Final-answer continuation and Quick-mode forced-answer guardrails now ask the model to keep the same language as the user's latest message.
- recall_knowledge now prioritizes selected evidence entries before journal note-linked entries when building candidate_entry_ids, so rerank/quota evidence selection is preserved for follow-up verification and reads.
- Clarified internal search_metadata naming so local metadata signal ranking is not confused with the optional external reranker.
- GitHub release notes now pull the matching version section from CHANGELOG.md, keeping generated release notes aligned with prior releases.
- Added coverage for CLI quick/deep mode requests, starter .env retrieval settings, session mode restore, and selected-evidence candidate ordering.
- Main CI passed for the post-0.2.2 fixes before preparing this release.
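The candidate ordering described for recall_knowledge amounts to an order-preserving merge with de-duplication. A minimal sketch, assuming both inputs are plain lists of entry IDs; the function name and the optional limit parameter are hypothetical.

```python
def build_candidate_entry_ids(selected_evidence_ids, journal_linked_ids, limit=None):
    """Merge two ID lists, selected evidence first, dropping later duplicates."""
    # dict.fromkeys keeps the first occurrence of each ID in insertion order.
    ordered = list(dict.fromkeys([*selected_evidence_ids, *journal_linked_ids]))
    return ordered if limit is None else ordered[:limit]
```

Because selected evidence comes first, a truncated candidate list keeps the reranked entries and drops journal-linked ones instead.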
Artifacts
Desktop bundles built from v0.2.3.
Desktop targets: Windows x64/arm64, macOS arm64, Linux x64/arm64.
Each bundle ships a self-contained Python runtime; no system Python required.
Docker image: ghcr.io/shenmintao/marginalia:v0.2.3 (linux/amd64, linux/arm64)
First-Launch Notes For Unsigned Binaries
- Windows: SmartScreen may say "Windows protected your PC". Click "More info" and then "Run anyway".
- macOS: Gatekeeper may refuse to open the .dmg. Run xattr -dr com.apple.quarantine /Applications/Marginalia.app after dragging it across.
Marginalia v0.2.2
Highlights
- Added Settings UI and API controls for embedding, semantic recall, rerank, and evidence-selection configuration.
- Hardened citation footnote rendering so raw entry_id, quote, and reason metadata is hidden across more model output variants.
- Improved OpenAI-compatible model handling for DeepSeek-style DSML text tool calls.
- Added a quick-mode forced final-answer retry when the capped final turn still attempts another tool call.
Retrieval Settings
Embedding, semantic recall, rerank, and evidence-selection knobs are now available from the desktop Settings page and the settings API. This makes the retrieval stack testable without editing .env by hand for every toggle.
Citation Handling
Citation footnotes now tolerate quoted entry_id values and fields emitted in a different order. This prevents model-written metadata such as entry_id=..., quote=..., and reason=... from leaking into the rendered answer when the model uses a slightly different footnote shape.
Quick Mode Reliability
OpenAI-compatible adapters now convert DSML text tool-call blocks into real tool calls instead of letting pseudo-XML reach the user-facing answer. Quick mode also performs one forced final-answer retry if the final capped execute turn still asks for another tool call.
Validation
- Targeted backend tests passed: 41 passed.
- Frontend TypeScript check passed: npm run lint.
- uv sync --locked --extra dev passed after syncing uv.lock to 0.2.2.
- Local Windows x64 NSIS build passed.
- Local Windows x64 portable zip build passed and contains sidecar manifest package: 0.2.2.
- Main CI completed successfully after the lockfile sync fix.
- Release workflow completed successfully.
Artifacts
Desktop bundles built from v0.2.2.
Desktop targets: Windows x64/arm64, macOS arm64, Linux x64/arm64.
Each bundle ships a self-contained Python runtime; no system Python required.
Docker image: ghcr.io/shenmintao/marginalia:v0.2.2 (linux/amd64, linux/arm64)
First-Launch Notes For Unsigned Binaries
- Windows: SmartScreen may say "Windows protected your PC". Click "More info" and then "Run anyway".
- macOS: Gatekeeper may refuse to open the .dmg. Run xattr -dr com.apple.quarantine /Applications/Marginalia.app after dragging it across.
Marginalia v0.2.1
Highlights
- Added the Chat UI Quick / Deep mode switch.
- Added request-level chat mode selection: POST /v1/chat/{session_id} now accepts mode: "quick" | "deep".
- Added deterministic, non-LLM read_files result compression for long Agent reads.
- Added exact reopen support with read_files compress: false for omitted compressed ranges.
- Added runtime settings for read result compression, including a Settings-page toggle and .env defaults through READ_COMPRESSION_*.
- Broadened text routing for common code/config/data files, including JSON/YAML/TOML/XML/HTML/CSV, Python, JavaScript/TypeScript, Go, Rust, Java, SQL, and shell scripts.
Agent Reading
Long read_files results can now be trimmed before they enter the chat model while preserving page, line, and offset anchors. The compression is extractive and lossy, not an LLM summarizer. Visible text remains quoteable; omitted markers must be reopened before quoting or relying on that evidence.
Supported compressed shapes include large plain text, PDF text, JSON, logs, and code-like files. Precision reads are left uncompressed, including pattern hits, explicit line/paragraph ranges, and VLM reads.
Compression is enabled by default. It can be disabled from Settings or with READ_COMPRESSION_ENABLED=false; thresholds can be tuned with READ_COMPRESSION_MIN_CHARS, READ_COMPRESSION_TARGET_CHARS, and READ_COMPRESSION_CONTEXT_CHARS.
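To make the idea of extractive, anchor-preserving compression concrete, here is a rough sketch. It is not the project's algorithm: it assumes a simple head-and-tail strategy over lines, the default values for min_chars and target_chars are placeholders rather than the real READ_COMPRESSION_MIN_CHARS and READ_COMPRESSION_TARGET_CHARS defaults, and the marker wording is invented for illustration.

```python
def compress_read(text: str, min_chars: int = 4000, target_chars: int = 1000) -> str:
    """Keep numbered head and tail lines; replace the middle with a range marker."""
    if len(text) < min_chars:
        return text  # short reads pass through unchanged
    lines = text.splitlines()
    budget = target_chars // 2  # half of the budget for each end

    # Take lines from the top until the head budget is used up.
    head, used = 0, 0
    while head < len(lines) and used + len(lines[head]) <= budget:
        used += len(lines[head])
        head += 1

    # Take lines from the bottom until the tail budget is used up.
    tail, used = len(lines), 0
    while tail > head and used + len(lines[tail - 1]) <= budget:
        used += len(lines[tail - 1])
        tail -= 1

    out = [f"{i + 1}: {lines[i]}" for i in range(head)]
    if tail > head:
        # The marker records the omitted 1-based line range for a later exact reopen.
        out.append(f"[omitted lines {head + 1}-{tail}; reopen with compress: false]")
    out += [f"{i + 1}: {lines[i]}" for i in range(tail, len(lines))]
    return "\n".join(out)
```

The visible lines keep their original line numbers, so a quote can still be located, and the marker names exactly which range has to be reopened before it is cited.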
Quick Mode
Quick mode keeps the plan phase but caps execute to three LLM calls: the first two may gather evidence with tools, while the third disables tools and must answer from collected evidence. Deep mode keeps the existing full ReAct investigation budget.
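The three-call cap can be sketched as a small loop in which only the final call has tools disabled. This is an illustrative sketch under stated assumptions: call_model and run_tool are hypothetical callables standing in for the real LLM and tool layers, and the fallback message is invented.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical model call: takes (evidence, tools_enabled) and returns
# (tool_request, answer); one of the two is expected to be None.
ModelCall = Callable[[List[str], bool], Tuple[Optional[str], Optional[str]]]


def run_quick_execute(
    call_model: ModelCall,
    run_tool: Callable[[str], str],
    max_calls: int = 3,
) -> str:
    """Run at most max_calls model calls; the last one must answer without tools."""
    evidence: List[str] = []
    for turn in range(1, max_calls + 1):
        tools_enabled = turn < max_calls
        tool_request, answer = call_model(evidence, tools_enabled)
        if tool_request is not None and tools_enabled:
            # Evidence-gathering turn: run the tool and keep its result.
            evidence.append(run_tool(tool_request))
            continue
        if answer is not None:
            return answer
    return "No answer produced within the quick-mode budget."
```

With the default of three calls, a model that keeps asking for tools gets two tool results and is then forced to answer from them.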
Validation
- Backend test suite: 189 passed.
- Added unit coverage for PDF page, text, JSON, log, and code read compression.
- Added read_files e2e coverage for compressed reads and compress: false reopen behavior.
- Frontend production build passed.
- cargo check --locked --all-targets passed.
- CI and Release workflows completed successfully.
Artifacts
Desktop bundles built from v0.2.1.
Desktop targets: Windows x64/arm64, macOS arm64, Linux x64/arm64.
Each bundle ships a self-contained Python runtime; no system Python required.
Docker image: ghcr.io/shenmintao/marginalia:v0.2.1 (linux/amd64, linux/arm64)
First-Launch Notes For Unsigned Binaries
- Windows: SmartScreen may say "Windows protected your PC". Click "More info" and then "Run anyway".
- macOS: Gatekeeper may refuse to open the .dmg. Run xattr -dr com.apple.quarantine /Applications/Marginalia.app after dragging it across.
Marginalia v0.2.0
Marginalia 0.2.0 moves the project toward a personal-library research agent:
retrieval remains local-first and source-grounded, while optional semantic
recall, reranking, and evaluation commands make report-generation quality
measurable.
Added
- Optional semantic recall using OpenAI-compatible embeddings, with DashScope/Bailian text-embedding-v4 as the documented default.
- Optional sqlite-vec semantic-index backend, with file-index fallback.
- Optional second-stage reranking with separate RERANK_* credentials.
- Hybrid recall_knowledge evaluation support with batched recall, answer probes, answer-run aggregates, and report comparison.
- marginalia eval compare-report, which compares one-shot RAG reports with the full ReAct investigation workflow using blind pairwise judging.
- BEIR-style dataset import that runs ingest synchronously and supports resumed/concurrent imports.
- Entry metadata FTS expansion for richer lexical recall.
Changed
- Semantic recall and rerank are opt-in; no chat, vision, or ingest API key is reused implicitly for embedding or reranking.
- recall_knowledge can merge lexical and semantic candidates, apply RRF-style scoring, optionally rerank, and select evidence with source quotas.
- Evaluation reports distinguish candidate-pool retrieval metrics from final-answer/report metrics.
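For readers unfamiliar with reciprocal rank fusion (RRF), the merge of lexical and semantic candidates can be sketched as below. The notes only say "RRF-style", so this is the textbook form, not necessarily what recall_knowledge does: the constant k = 60 is the commonly used default and is an assumption here, as are the function name and the tie-breaking rule.

```python
from collections import defaultdict


def rrf_merge(rankings, k: int = 60):
    """Fuse ranked ID lists: score(d) = sum over lists of 1 / (k + rank of d)."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, entry_id in enumerate(ranking, start=1):
            scores[entry_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties by ID so the result is deterministic.
    return sorted(scores, key=lambda entry_id: (-scores[entry_id], entry_id))
```

An entry that appears near the top of both lists outranks one that tops only a single list, which is the property that makes the fusion useful before an optional rerank step.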
Validation
- SciFact 300 retrieval with rerank top-80 reached MRR 0.7226, hit@10 0.8800, and hit@100 0.9133 in local validation.
- SciFact 300 bounded answer-run with rerank top-80 and quota reached evidence hit 0.8667, citation hit 0.7133, and label accuracy 0.8085.
- A 30-query end-to-end report comparison favored the ReAct workflow over one-shot RAG in 26/30 cases, with 2 one-shot RAG wins, 2 ties, and 1 timeout.
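The retrieval figures above use standard definitions: MRR averages the reciprocal rank of the first relevant hit per query, and hit@k is the share of queries with at least one relevant hit in the top k. A small sketch of both follows; the function names are hypothetical and this is not the project's eval code.

```python
def reciprocal_rank(ranked_ids, relevant_ids) -> float:
    """Return 1/rank of the first relevant ID, or 0.0 if none is retrieved."""
    for rank, entry_id in enumerate(ranked_ids, start=1):
        if entry_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def hit_at_k(ranked_ids, relevant_ids, k: int) -> float:
    """Return 1.0 if any of the top-k IDs is relevant, else 0.0."""
    return 1.0 if any(e in relevant_ids for e in ranked_ids[:k]) else 0.0


def evaluate(runs, k: int = 10) -> dict:
    """Average both metrics over runs, a list of (ranked_ids, relevant_ids) pairs."""
    n = len(runs)
    mrr = sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / n
    hit = sum(hit_at_k(ranked, rel, k) for ranked, rel in runs) / n
    return {"mrr": mrr, f"hit@{k}": hit}
```

As a toy illustration with made-up data (not the SciFact runs), three queries whose first relevant hits sit at rank 1, rank 3, and nowhere give an MRR of (1 + 1/3 + 0) / 3.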
Downloads
- Desktop targets: Windows x64/arm64, macOS arm64, Linux x64/arm64.
- Docker image: ghcr.io/shenmintao/marginalia:v0.2.0 for linux/amd64 and linux/arm64.
- Desktop bundles ship a self-contained Python runtime; no system Python is required.
First-Launch Notes
- Windows: SmartScreen may say "Windows protected your PC". Click "More info" and then "Run anyway".
- macOS: Gatekeeper may refuse to open the .dmg. Run xattr -dr com.apple.quarantine /Applications/Marginalia.app after dragging it across.
Notes
- ReAct report generation improves report quality at substantially higher latency and token cost. It is best treated as a deep investigation mode, not as the default path for every quick lookup.
- Some OpenAI-compatible models may occasionally emit invalid JSON tool arguments; the runtime tolerates these failures, but they can waste tool turns and should be improved in later releases.
Marginalia v0.1.4
Marginalia v0.1.4 is a hotfix release for ingest recovery and ARM64 packaging.
Highlights
- Stops startup self-heal from reprocessing normal short summaries; automatic reprocess now targets only files whose ingest produced an empty/blank summary.
- Handles empty text and Markdown files deterministically without an LLM call, so empty files are marked done instead of looping through retry/self-heal.
- Removes the internal Session name: control line from streamed and replayed plan text while still using it for chat sidebar titles.
- Uses fork-built py7zz v1.1.5 wheels for macOS, Windows x64/ARM64, and Linux x86_64/ARM64.
- Publishes the Docker image as a multi-arch image for linux/amd64 and linux/arm64.
Docker image: ghcr.io/shenmintao/marginalia:v0.1.4
First-launch notes for unsigned binaries:
- Windows: SmartScreen may say "Windows protected your PC". Click "More info" -> "Run anyway".
- macOS: Gatekeeper may refuse to open the app. Run xattr -dr com.apple.quarantine /Applications/Marginalia.app after dragging it into Applications.
Marginalia v0.1.3
Highlights
- Improved ingest failure recovery in the desktop library: failed files now surface the error state more clearly and can be reprocessed without stale spinner state blocking the action.
- Added folder-level reprocess affordances and better parent-folder status propagation, so unfinished or failed work is visible higher in the file tree.
- Refined relation mining so stronger evidence is preserved and weak relationships can be upgraded when better evidence appears.
- Made long text ingest more resilient by retrying empty LLM summaries with a larger output budget and by tolerating providers that reject reasoning-control parameters.
- Fixed Markdown deep-link navigation from chat/search results back into the library viewer.
- Polished background task metrics and task-window layout for dense running histories.
- Continued CI and test cleanup, including pytest conversions, vision-test isolation, and SQLite metadata search optimization.
Downloads
Desktop bundles include a self-contained Python runtime; no system Python install is required.
- Windows: installer, MSI, and portable zip are attached below.
- macOS: Apple Silicon DMG and app archive are attached below.
- Linux: amd64 deb and rpm packages are attached below.
- Docker: ghcr.io/shenmintao/marginalia:v0.1.3
First-Launch Notes
- Windows: SmartScreen may show "Windows protected your PC". Click "More info", then "Run anyway".
- macOS: Gatekeeper may block the unsigned app. After dragging it to Applications, run: xattr -dr com.apple.quarantine /Applications/Marginalia.app
Full changelog: v0.1.2...v0.1.3
Marginalia v0.1.2
Highlights
- Added desktop UI internationalization with Auto, English, and Chinese language modes.
- Localized the main desktop surfaces: Chat, Library, Search, Settings, status bar, metadata panels, dialogs, markdown controls, and task activity.
- Reworked background task labels so internal kinds such as ingest_file, reflect_turn, and reflection_turn show human-readable names while keeping the raw kind in tooltips for debugging.
- Improved chat session behavior: sessions appear immediately on first message, titles refresh when the planner returns them, and active streams survive switching away and back.
- Optimized LLM prompts for prefix caching, including ingest and background task prompts, to better align with DeepSeek-style prompt cache behavior.
Fixes
- Fixed the Chat stop/send state after switching between running sessions.
- Fixed Library folder actions so upload/new-folder targets carry the real folder name instead of falling back to an id or placeholder.
- Added configurable agent execution budgets and surfaced the settings in the desktop UI.
- Fixed GUI version display and updated release metadata to 0.1.2.
- Added README community links and a desktop GUI screenshot.
Artifacts
- Desktop bundles include Windows installer and portable zip, macOS Apple Silicon DMG/app archive, and Linux deb/rpm packages.
- Docker image: ghcr.io/shenmintao/marginalia:v0.1.2 and ghcr.io/shenmintao/marginalia:latest.
First-launch Notes
- Windows: SmartScreen may warn. Click "More info" -> "Run anyway".
- macOS: Gatekeeper may block the app. After dragging it to Applications, run xattr -dr com.apple.quarantine /Applications/Marginalia.app.
Marginalia v0.1.1
This release focuses on hardening the end-to-end private knowledge-base workflow after seed-user testing: better long-document ingest, more reliable PDF/OCR handling, stricter evidence/citation behavior, scoped retrieval, and provider-specific LLM thinking controls.
Highlights
- Improved long-document ingestion with chunk-level indexing, section summaries, coverage metadata, and parallel LLM calls for large text/PDF inputs.
- Added OCR text block storage for scanned PDFs, with uncapped OCR support when configured.
- Strengthened retrieval tools: search_metadata and search_journal now handle multi-keyword OR queries more reliably, and metadata search supports catalog/folder scoped filtering.
- Improved citation and export robustness, including tolerant parsing for Chinese commas, concatenated quoted fragments, and page=... variants.
- Added provider-specific reasoning/thinking controls for Qwen, DeepSeek, Kimi, Gemini, OpenRouter, SiliconFlow, DashScope/Bailian, Together, NVIDIA, MiniMax, VolcEngine/BytePlus, Moonshot, and MiMo.
- Disabled thinking output for ingest and Qwen VLM/image paths to avoid leaking <think> content into summaries, descriptions, or indexed metadata.
- Made worker concurrency easier to reason about by aligning ingest task concurrency with WORKER_BATCH_SIZE.
- Refined prompts across ingest, export, PDF reading, evidence discovery, and agent answer generation.
Retrieval And Evidence
- search_metadata supports text arrays for OR-style recall.
- search_metadata can restrict results by catalog_id, catalog_subtree, folder_id, and folder_subtree.
- list_catalogs treats omitted/null parent as root listing.
- Query logs and exported evidence are easier to inspect and reuse.
- Footnote generation and citation locators are more robust across file types and quote/page variants.
PDF, OCR, And Long Documents
- Long text/PDF indexing records coverage and partial-indexing metadata.
- PDF OCR can preserve full OCR output instead of truncating useful source text.
- OCR PDF ingest stores block/page-level text for later targeted reads.
- read_files behavior for PDFs is improved for page-range reads and scanned documents.
- Large ingest workloads can use parallel LLM calls to reduce wall-clock time.
Agent And Runtime
- Improved final-answer handling for long answers and continuation flows.
- Reduced accidental reasoning leakage from model responses.
- Tightened tool prompts and parser contracts.
- Added task scheduling hardening and active-task deduplication.
- Improved session/title handling and runtime guard behavior.
GUI And Configuration
- Added or refined settings for worker batch size and ingest concurrency.
- Improved activity/status visibility during large ingest runs.
- Added lifecycle configuration switch for users whose personal file workflow should not auto-demote/archive files.
Documentation
- Revised README, USAGE, DESIGN, and architecture notes around Marginalia as an AI retrieval infrastructure for private heterogeneous knowledge bases.
- Documented structured funnel retrieval, persistent investigation logs, recommender-style evidence discovery, and source verification.
Upgrade Notes
- Existing ingested PDFs will not automatically gain the new OCR/block-level metadata. Reprocess important scanned PDFs if you want the new read behavior.
- For heavy OCR or long-document imports, tune WORKER_BATCH_SIZE and ingest LLM concurrency according to provider rate limits.
- This release does not require a manual database migration step beyond normal startup/bootstrap behavior.