夕凪 (yunagi) is Japanese for "evening calm": the moment when the sea wind stills and the ocean surface becomes perfectly calm. Yunagi takes the turbulent sea of HTML and delivers serene, readable Markdown.
A zero-dependency HTML-to-Markdown converter that extracts main content from web pages. Available as a library, CLI tool, and MCP server.
- Content Extraction - Identifies and extracts main article content using readability heuristics
- HTML-to-Markdown - Converts extracted content to clean, formatted Markdown
- Metadata Extraction - Extracts page title, author, site name, and excerpt
- Site-Specific Filtering - Custom CSS selectors to remove/include/select content per domain
- robots.txt Support - Optional robots.txt compliance checking
- MCP Server - Exposes functionality via Model Context Protocol for LLM clients
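
Site-specific filtering can be pictured as merging the global selectors with the selectors of every rule whose domain matches the page. The sketch below is illustrative only: the type names, `ruleMatches`, and `resolveFilters` are hypothetical and not part of yunagi's API, and the hostname-based matching is an assumption about how a rule's `url` field might be compared.

```typescript
// Hypothetical types mirroring the filtering fields of yunagi.config.json.
interface SiteRule {
  url: string; // domain the rule applies to, e.g. "zenn.dev"
  remove?: string[];
  include?: string[];
  select?: string;
}

interface FilterConfig {
  remove?: string[];
  include?: string[];
  select?: string;
  siteRules?: SiteRule[];
}

interface ResolvedFilters {
  remove: string[];
  include: string[];
  select?: string;
}

// Assumption: a rule matches when the page hostname equals the rule's
// `url` or is a subdomain of it. yunagi's real matching may differ.
function ruleMatches(rule: SiteRule, pageUrl: string): boolean {
  const host = new URL(pageUrl).hostname;
  return host === rule.url || host.endsWith(`.${rule.url}`);
}

// Merge the global selectors with those of every matching site rule.
function resolveFilters(config: FilterConfig, pageUrl: string): ResolvedFilters {
  const resolved: ResolvedFilters = {
    remove: [...(config.remove ?? [])],
    include: [...(config.include ?? [])],
    select: config.select,
  };
  for (const rule of config.siteRules ?? []) {
    if (!ruleMatches(rule, pageUrl)) continue;
    resolved.remove.push(...(rule.remove ?? []));
    resolved.include.push(...(rule.include ?? []));
    // A site rule's `select` overrides the global one in this sketch.
    if (rule.select !== undefined) resolved.select = rule.select;
  }
  return resolved;
}

const filters = resolveFilters(
  {
    remove: [".sidebar", ".ad"],
    siteRules: [
      { url: "zenn.dev", remove: [".topic-badge"], include: [".article-content"] },
    ],
  },
  "https://zenn.dev/some/article",
);
console.log(filters);
```

Appending site rules to the global lists, rather than replacing them, keeps global removals such as `.ad` active on every domain; whether yunagi itself merges or overrides is not stated here.
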
| Package | Description |
|---|---|
| yunagi | Core library and CLI |
| @yunagi/mcp | MCP server |
bun install
bun run build

# Basic usage
yunagi https://example.com/article
# Output to file
yunagi https://example.com -o article.md
# With robots.txt compliance
yunagi https://example.com --respect-robots-txt
# Custom filtering
yunagi https://example.com --remove ".sidebar" --remove ".ad"
# Select specific content
yunagi https://example.com --select "article.main"
# From stdin
cat page.html | yunagi --stdin

import { toMarkdown, htmlToMarkdown } from 'yunagi'
// Fetch and convert
const result = await toMarkdown('https://example.com', {
  respectRobotsTxt: true,
  remove: ['.sidebar', '.ad'],
  converter: { headingStyle: 'setext' },
})
// From HTML string
const htmlResult = htmlToMarkdown(htmlString, options)

Create a yunagi.config.json in your project root:
{
  "keepImages": true,
  "respectRobotsTxt": false,
  "converter": {
    "headingStyle": "atx",
    "bulletListMarker": "-",
    "codeBlockStyle": "fenced",
    "linkStyle": "inlined"
  },
  "remove": [".sidebar", ".ad"],
  "include": [".main-content"],
  "select": "article.main",
  "siteRules": [
    {
      "url": "zenn.dev",
      "remove": [".topic-badge"],
      "include": [".article-content"]
    }
  ]
}

The MCP server exposes 3 tools:
- yunagi_convert - Fetch a URL, extract main content, and convert to Markdown
- yunagi_select - Extract elements matching a CSS selector as HTML
- yunagi_select_markdown - Extract elements matching a CSS selector as Markdown
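
MCP clients invoke tools like these by sending a JSON-RPC 2.0 request with the `tools/call` method. The sketch below only builds such a request to show its shape. `buildToolCall` is a hypothetical helper, and the argument names `url` and `selector` are assumptions; the actual input schema of each tool is defined by the server and can be discovered via `tools/list`.

```typescript
// Shape of a JSON-RPC 2.0 request for MCP's `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request.
function buildToolCall(
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Assumption: `url` and `selector` are illustrative argument names only.
const request = buildToolCall("yunagi_select_markdown", {
  url: "https://example.com/article",
  selector: "article.main",
});
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library handles this framing; the server is registered with a client through a configuration entry such as the following.
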
{
  "mcpServers": {
    "yunagi": {
      "command": "npx",
      "args": ["@yunagi/mcp"]
    }
  }
}

bun run dev # Watch mode
bun run test # Run tests
bun run typecheck # Type checking
bun run lint # Lint
bun run format # Format

MIT