feat: add MinerU document parsing SOP and helper #578

Open
GGzili wants to merge 1 commit into lsdefine:main from GGzili:feat/mineru-skill

GGzili commented Jun 7, 2026

What

Adds a MinerU document-parsing capability to memory/ as an SOP plus a helper, matching the repo's tool-paired SOP pattern (e.g. procmem_scanner_sop.md + procmem_scanner.py):

  • memory/mineru_sop.md: standard operating procedure (quick start, interface notes, limits)
  • memory/mineru.py: helper with no dependencies beyond requests. Submits a URL or local file to the MinerU API v4, polls until the task finishes, then downloads and extracts the resulting Markdown/JSON.
  • .gitignore: whitelist entries for the two files.
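
For reviewers, the poll-then-extract part of that flow might look roughly like the sketch below. It is not the code in memory/mineru.py: the function names, the `state` values (`"done"`, `"failed"`), the `err_msg` field, and the assumption that results arrive as a zip archive are all illustrative. The status lookup is passed in as a callable so the sketch makes no claim about the actual MinerU endpoints.

```python
import io
import time
import zipfile
from pathlib import Path
from typing import Callable, Dict, List


def poll_until_done(fetch_status: Callable[[], Dict], interval: float = 2.0,
                    timeout: float = 300.0) -> Dict:
    """Call fetch_status() until it reports a terminal state or time runs out.

    The "done"/"failed" state names and the "err_msg" key are assumptions.
    """
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        state = status.get("state")
        if state == "done":
            return status
        if state == "failed":
            raise RuntimeError(f"parse task failed: {status.get('err_msg', 'unknown error')}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {state!r} after {timeout} s")
        time.sleep(interval)


def extract_results(zip_bytes: bytes, out_dir: str) -> List[str]:
    """Write only the .md and .json members of a result archive into out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            # Keep only the base name so archive paths cannot escape out_dir.
            name = Path(info.filename).name
            if name.lower().endswith((".md", ".json")):
                (out / name).write_bytes(zf.read(info))
                written.append(name)
    return sorted(written)
```

In a real helper, `fetch_status` would wrap an authenticated `requests.get` against the task-status endpoint, and `zip_bytes` would be the body of the download response.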

Why

PDF / Office / image to Markdown+JSON parsing is a common, fundamental agent capability. This exposes MinerU (https://mineru.net) as a reusable core tool.

Testing

  • python -m py_compile memory/mineru.py passes.
  • Manual run: python memory/mineru.py <url|file> -o ./out (requires a MinerU token).
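
As a rough illustration of the command line shown in the manual run, a parser for `<url|file> -o ./out` could be sketched as follows. The `--output` long option, the `./out` default, and the `is_url` helper are assumptions made for this example and may not match what memory/mineru.py actually does.

```python
import argparse
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """Treat only http(s) sources with a host as URLs; anything else is a local path."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="mineru.py",
        description="Parse a document to Markdown/JSON via MinerU (illustrative CLI).",
    )
    parser.add_argument("source", help="document URL or local file path")
    # The long option name and the default directory are assumptions.
    parser.add_argument("-o", "--output", default="./out",
                        help="directory for the extracted Markdown/JSON")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    kind = "URL" if is_url(args.source) else "local file"
    print(f"would submit {kind} {args.source!r}, writing results to {args.output!r}")
```

Checking the scheme explicitly keeps Windows paths such as `C:\docs\report.pdf` from being mistaken for URLs.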

GGzili force-pushed the feat/mineru-skill branch from b402b58 to fbbd684 on June 7, 2026 05:03
GGzili changed the title from "feat: add MinerU document-extraction skill" to "feat: add MinerU document parsing SOP and helper" on Jun 7, 2026