somark-document-parser

Parse PDFs, images, Word, and PowerPoint files into clean Markdown or JSON using SoMark — the document intelligence API built for AI workflows.

Install

npx skills add https://github.com/SoMarkAI/somark-document-parser

Works with Claude Code, Cursor, Cline, OpenCode, and 40+ other agents.

What it does

When you share a document with your AI agent, SoMark parses it into structured Markdown or JSON that the agent can reason over: not just OCR'd text, but headings, tables, formulas, and layout.

Supported formats:

Type	Formats
Documents	PDF, DOC, DOCX, PPT, PPTX
Images	PNG, JPG, JPEG, BMP, TIFF, WEBP, HEIC, HEIF, GIF
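If you call SoMark from your own script rather than through an agent, it can help to reject unsupported files before uploading. The sketch below is illustrative only: the function name `classify` is hypothetical, and mapping each format name in the table to a lowercase file extension is an assumption, not something SoMark documents here.

```python
from pathlib import Path

# Extensions derived from the supported-formats table above (assumed mapping).
SUPPORTED_EXTENSIONS = {
    "document": {".pdf", ".doc", ".docx", ".ppt", ".pptx"},
    "image": {".png", ".jpg", ".jpeg", ".bmp", ".tiff", ".webp",
              ".heic", ".heif", ".gif"},
}


def classify(path):
    """Return 'document', 'image', or None if the extension is not listed."""
    suffix = Path(path).suffix.lower()  # case-insensitive match
    for kind, extensions in SUPPORTED_EXTENSIONS.items():
        if suffix in extensions:
            return kind
    return None
```

For example, `classify("report.PDF")` yields `"document"`, while a `.txt` file yields `None`.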

Example triggers:

"Parse this PDF for me"
"Extract the key clauses from this contract"
"Summarize the paper I just uploaded"
"Convert this document to Markdown"
"What does this image say?"

Setup

Get an API key at somark.tech, then set it as an environment variable:

export SOMARK_API_KEY=sk-your-api-key

Or add it to your agent's settings. The skill will guide you through setup on first use.
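A script that depends on the key can fail fast when it is missing. This is a minimal sketch that assumes only the `SOMARK_API_KEY` variable name shown above; the helper name `load_api_key` is hypothetical and not part of the skill.

```python
import os


def load_api_key(env=None):
    """Return the SoMark API key from the environment, or raise if unset."""
    env = os.environ if env is None else env  # allow injecting a dict for tests
    key = env.get("SOMARK_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "SOMARK_API_KEY is not set; export it before running."
        )
    return key
```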

Free quota: SoMark offers a free tier. Visit the purchase page and follow the instructions there to claim it.

Why SoMark

Most agents struggle with documents because text extracted naively from PDFs and images loses its structure. SoMark preserves:

Heading hierarchy — agents can understand document sections correctly
Tables — fully reconstructed instead of flattened into plain text
Formulas and diagrams — converted to LaTeX or described accurately
Multi-column layouts — reading order is preserved

The result: your agent gives accurate, context-aware answers instead of hallucinating from garbled text.

Limits

Constraint	Limit
Max file size	200 MB
Max pages	300 pages
QPS per account	1
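Batch jobs may want to enforce these limits on the client side. The sketch below shows one possible approach and is not SoMark's own client: `within_limits` and `Throttle` are hypothetical names, 1 MB is assumed to mean 1024 * 1024 bytes, and the page count must come from whatever PDF or Office library you already use.

```python
import time

MAX_FILE_BYTES = 200 * 1024 * 1024  # 200 MB, assuming binary megabytes
MAX_PAGES = 300
MIN_INTERVAL_SECONDS = 1.0  # 1 request per second per account


def within_limits(size_bytes, page_count=None):
    """Check a file against the documented size and page limits."""
    if size_bytes > MAX_FILE_BYTES:
        return False
    if page_count is not None and page_count > MAX_PAGES:
        return False
    return True


class Throttle:
    """Block so that successive wait() calls are at least min_interval apart."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self._min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self._min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

Call `within_limits(os.path.getsize(path))` before uploading, and `throttle.wait()` immediately before each request so a loop over many files stays under 1 QPS.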

License

MIT
