Release: merge dev into main #12

Merged
rejojer merged 22 commits into main from dev
Apr 10, 2026

rejojer commented Apr 10, 2026

Summary

  • feat: concept dedup, compile pipeline refactor, and bidirectional backlinks
  • feat: brief system, per-page JSON sources, and unified query agent
  • feat: improve query agent with multimodal get_image tool
  • refactor: unify image paths and add pymupdf per-page extraction
  • fix: multiple bugfixes for compiler, indexer, concept names, and CLI
  • fix: config, CI improvements, and default model handling

Commits

  • Merge pull request #11: fix: compiler concept update bugs (bugfix/compiler-update-fixes)
  • Revert + re-fix exact concept index row matching
  • Fix concept index updates by section
  • fix: always replace concept body on update, not only when source is new
  • fix: preserve non-ASCII characters in concept name slugs
  • fix: update existing concept briefs in index.md instead of skipping
  • Merge pull request #10: feat: compile pipeline, query agent, and multimodal improvements (bugfix/compile-clean)
  • fix: sanitize concept names, pass doc_type/doc_brief, use json_repair
  • refactor: unify image paths and add pymupdf per-page extraction
  • feat: improve query agent with multimodal get_image tool
  • fix: improve init prompts, warning suppression, and CLI polish
  • fix: default model, API key warning, config and CI improvements
  • feat: brief system, per-page JSON sources, and unified query agent
  • feat: concept dedup, compile pipeline refactor, and bidirectional backlinks
  • bump version to 0.1.0.dev0
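
The non-ASCII slug fix in the commit list can be pictured with a small sketch. This is not the project's code: `slugify_concept` is a hypothetical helper, and the normalization rules (NFKC, lowercase, hyphen separators) are assumptions chosen for illustration.

```python
import re
import unicodedata


def slugify_concept(name: str) -> str:
    """Hypothetical slug helper that keeps non-ASCII letters (assumed rules)."""
    # NFKC folds compatibility forms (for example full-width letters)
    # without stripping accents or CJK characters.
    text = unicodedata.normalize("NFKC", name).strip().lower()
    # For str patterns in Python 3, \w is Unicode-aware, so accented and
    # CJK characters survive; punctuation, dots, and slashes are removed.
    text = re.sub(r"[^\w\s-]", "", text)
    # Collapse whitespace, underscores, and repeated hyphens into one hyphen.
    text = re.sub(r"[\s_-]+", "-", text)
    return text.strip("-")
```

An ASCII-only pattern such as `[^a-z0-9]+` would reduce a name like "注意力 机制" to an empty slug, which is the kind of failure this sketch avoids.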

Test plan

  • Verify all existing tests pass
  • Test concept compilation and dedup workflow
  • Test query agent with image retrieval
  • Verify CLI commands and config handling

rejojer and others added 22 commits April 8, 2026 21:16
…klinks

- Add concept dedup with briefs and _read_concept_briefs context
- Add concepts plan and update prompt templates with create/update/related paths
- Extract shared _compile_concepts from compile_short_doc and compile_long_doc
- Add bidirectional backlinks between summaries and concepts
- Code review fixes: security, robustness, tests, and CI hardening

Co-authored-by: Ray <mailtangyu@gmail.com>
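
A rough sketch of what bidirectional backlinks between a summary and a concept could look like. The `[[...]]` link syntax, the `## Related` heading, the page keys, and both function names are assumptions made for this example; the PR does not show the actual implementation.

```python
def add_backlink(body: str, target: str, heading: str = "## Related") -> str:
    """Append a link to `target` under a trailing heading, at most once.

    Assumes the heading, if present, is the last section of the page.
    """
    link = f"- [[{target}]]"
    lines = body.rstrip("\n").split("\n")
    if link in lines:
        return body  # already linked, so repeated compiles do not duplicate it
    if heading not in lines:
        lines += ["", heading]
    lines.append(link)
    return "\n".join(lines) + "\n"


def link_bidirectional(pages: dict[str, str], summary: str, concept: str) -> None:
    """Make the summary point at the concept and the concept point back."""
    pages[summary] = add_backlink(pages[summary], concept)
    pages[concept] = add_backlink(pages[concept], summary)
```

Keeping `add_backlink` idempotent matters because a compile step may run many times over the same pair of pages.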
- Add get_page_content tool and parse_pages helper for page-level access
- Store long doc sources as per-page JSON extracted by pymupdf
- Unify summary frontmatter to doc_type + full_text fields
- Update schema and tree renderer for new frontmatter format
- All image paths use sources/images/ prefix relative to wiki root

Co-authored-by: Ray <mailtangyu@gmail.com>
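
As an illustration of page-level access over per-page JSON sources, here is a minimal sketch. The names `parse_pages` and `get_page_content` come from the commit message, but the signatures, the `"1-3,5"` spec format, and the `page`/`content` JSON keys are assumptions, not the project's actual schema.

```python
import json


def parse_pages(spec: str) -> list[int]:
    """Expand a spec such as '1-3,5' into sorted, unique page numbers."""
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start_s, end_s = part.split("-", 1)
            start, end = int(start_s), int(end_s)
            if start > end:  # tolerate reversed ranges like '3-1'
                start, end = end, start
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    return sorted(pages)


def get_page_content(pages_json: str, spec: str) -> str:
    """Join the text of the requested pages, skipping pages that do not exist."""
    records = json.loads(pages_json)
    by_number = {rec["page"]: rec["content"] for rec in records}
    return "\n\n".join(by_number[n] for n in parse_pages(spec) if n in by_number)
```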
- Change default model to gpt-5.4-mini
- Warn when no LLM API key found instead of failing silently
- Fix CI publish workflow and test isolation

Co-authored-by: Ray <mailtangyu@gmail.com>
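
The "warn instead of failing silently" behavior might look roughly like the sketch below. The environment variable names and the function name are assumptions; the PR does not say which keys are checked or where the check lives.

```python
import os
import warnings

# Assumed variable names, for illustration only.
API_KEY_VARS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY")


def warn_if_no_api_key(env=None) -> bool:
    """Return True if any known key is set; otherwise emit a warning."""
    env = os.environ if env is None else env
    if any(env.get(name) for name in API_KEY_VARS):
        return True
    warnings.warn(
        "No LLM API key found; set one of: " + ", ".join(API_KEY_VARS),
        stacklevel=2,
    )
    return False
```

Passing the environment as a parameter keeps the check easy to test without touching the real process environment.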
- Move warning suppression after imports to avoid markitdown override
- Improve init prompts with explicit defaults
- Use American English throughout (initialized, normalized, Synthesize)
- Replace unicode ellipsis with ASCII
- Remove empty explorations/reports dirs from init
- Fix test isolation for _find_kb_dir

- Add get_image tool for viewing images referenced in source documents
- Use ToolOutputImage for proper image content in LLM context
- Update prompt: use full_text field, restrict get_page_content to pageindex
- Add self-talk before tool calls, enforce concise answers
- Prevent duplicate frontmatter in LLM-generated content via schema update

- Add convert_pdf_to_pages for per-page content+image extraction
- All image paths use sources/images/ prefix relative to wiki root
- Remove page marker comments from short doc source markdown

The _CONCEPT_UPDATE_USER prompt asks the LLM for a full rewrite, but
_write_concept was appending the rewrite to the existing body, causing
content duplication on every concept update.
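
The difference between appending and replacing can be shown with a small sketch. The function names and the `---` frontmatter layout are assumptions for illustration; this is not the real `_write_concept`.

```python
def split_frontmatter(text: str) -> tuple[str, str]:
    """Split a leading '---' block from the body (assumed page layout)."""
    if text.startswith("---\n"):
        end = text.find("\n---\n", 3)
        if end != -1:
            cut = end + len("\n---\n")
            return text[:cut], text[cut:]
    return "", text


def append_rewrite(existing: str, rewrite: str) -> str:
    """Old behavior as described: the full rewrite is added after the old body."""
    return existing.rstrip("\n") + "\n\n" + rewrite.strip() + "\n"


def replace_body(existing: str, rewrite: str) -> str:
    """Fixed behavior: keep the frontmatter, swap the body for the rewrite."""
    frontmatter, _old_body = split_frontmatter(existing)
    return frontmatter + rewrite.strip() + "\n"
```

Because the prompt asks for a complete rewrite, only `replace_body` is stable under repeated updates; `append_rewrite` grows the page by one copy per update.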
Replace hand-rolled fence stripping with json_repair to handle
malformed JSON, missing fences, and prose-wrapped responses from LLMs.
Also fixes str.index() ValueError on fenced blocks without newlines.
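
To illustrate the `str.index()` failure mode, the sketch below contrasts a naive fence stripper with a more tolerant extractor. Both functions are hypothetical. The tolerant version is a standard-library stand-in and not json_repair: it finds JSON wrapped in fences or prose, but unlike json_repair it does not repair malformed JSON.

```python
import json

FENCE = "`" * 3  # a Markdown code fence, built this way to keep the example readable


def strip_fence_naive(text: str) -> str:
    """Hand-rolled stripping in the style the commit describes (illustrative)."""
    if text.startswith(FENCE):
        first_newline = text.index("\n")  # ValueError if the fence has no newline
        text = text[first_newline + 1:]
        if text.rstrip().endswith(FENCE):
            text = text.rstrip()[:-3]
    return text


def extract_json(text: str):
    """Parse the first valid JSON object or array found anywhere in the text."""
    decoder = json.JSONDecoder()
    for i, ch in enumerate(text):
        if ch in "{[":
            try:
                value, _end = decoder.raw_decode(text, i)
                return value
            except json.JSONDecodeError:
                continue  # not valid JSON from this position, keep scanning
    raise ValueError("no JSON value found")
```

A response such as a fenced block with no line breaks makes `strip_fence_naive` raise, while `extract_json` still returns the parsed value.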
feat: compile pipeline, query agent, and multimodal improvements
rejojer merged commit 0291ec9 into main Apr 10, 2026
2 participants