Yunagi (夕凪)

夕凪 (yunagi) is Japanese for "evening calm": the moment when the sea wind stills and the ocean surface becomes perfectly calm. Yunagi takes the turbulent sea of HTML and delivers serene, readable Markdown.

A zero-dependency HTML-to-Markdown converter that extracts main content from web pages. Available as a library, CLI tool, and MCP server.

Features

Content Extraction - Identifies and extracts main article content using readability heuristics
HTML-to-Markdown - Converts extracted content to clean, formatted Markdown
Metadata Extraction - Extracts page title, author, site name, and excerpt
Site-Specific Filtering - Custom CSS selectors to remove/include/select content per domain
robots.txt Support - Optional robots.txt compliance checking
MCP Server - Exposes functionality via Model Context Protocol for LLM clients

Packages

Package	Description
yunagi	Core library and CLI
@yunagi/mcp	MCP server

Getting Started

bun install
bun run build

CLI Usage

# Basic usage
yunagi https://example.com/article

# Output to file
yunagi https://example.com -o article.md

# With robots.txt compliance
yunagi https://example.com --respect-robots-txt

# Custom filtering
yunagi https://example.com --remove ".sidebar" --remove ".ad"

# Select specific content
yunagi https://example.com --select "article.main"

# From stdin
cat page.html | yunagi --stdin
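
For readers curious what a robots.txt compliance check such as `--respect-robots-txt` involves, here is a minimal sketch. It is not yunagi's implementation. The function name `isAllowed` is hypothetical, and the logic is deliberately simplified: it uses prefix matching only (no `*` or `$` patterns) and merges rules from every group that applies to the agent.

```typescript
// Simplified robots.txt check. Illustration only, not yunagi's implementation.
// Limitations: prefix matching only, and rules from the `*` group are merged
// with rules from any group that names the agent.
type Rule = { allow: boolean; prefix: string }

function isAllowed(robotsTxt: string, path: string, userAgent = '*'): boolean {
  const ua = userAgent.toLowerCase()
  const rules: Rule[] = []
  let groupApplies = false
  let readingAgents = false

  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim() // strip comments
    const colon = line.indexOf(':')
    if (colon === -1) continue
    const field = line.slice(0, colon).trim().toLowerCase()
    const value = line.slice(colon + 1).trim()

    if (field === 'user-agent') {
      // Consecutive User-agent lines share one group of rules.
      if (!readingAgents) groupApplies = false
      readingAgents = true
      const agent = value.toLowerCase()
      if (agent === '*' || (agent !== '' && ua.includes(agent))) groupApplies = true
    } else if (field === 'allow' || field === 'disallow') {
      readingAgents = false
      // An empty value matches nothing, so it adds no rule.
      if (groupApplies && value !== '') {
        rules.push({ allow: field === 'allow', prefix: value })
      }
    }
  }

  // The longest matching prefix wins; Allow wins a tie. No match means allowed.
  let best: Rule | undefined
  for (const rule of rules) {
    if (!path.startsWith(rule.prefix)) continue
    if (
      !best ||
      rule.prefix.length > best.prefix.length ||
      (rule.prefix.length === best.prefix.length && rule.allow)
    ) {
      best = rule
    }
  }
  return best ? best.allow : true
}

const robots = [
  'User-agent: *',
  'Disallow: /private/',
  'Allow: /private/public.html',
].join('\n')

console.log(isAllowed(robots, '/article'))
console.log(isAllowed(robots, '/private/notes'))
console.log(isAllowed(robots, '/private/public.html'))
```

The longest-match rule is what lets a narrow `Allow` carve an exception out of a broader `Disallow`.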

Programmatic API

import { toMarkdown, htmlToMarkdown } from 'yunagi'

// Fetch and convert
const result = await toMarkdown('https://example.com', {
  respectRobotsTxt: true,
  remove: ['.sidebar', '.ad'],
  converter: { headingStyle: 'setext' },
})

// From an HTML string (htmlString and options are supplied by the caller)
const converted = htmlToMarkdown(htmlString, options)
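
The `headingStyle` option above chooses between the two Markdown heading syntaxes. The sketch below shows what each style looks like. It is not yunagi's converter: `renderHeading` is a hypothetical helper, and falling back to ATX for levels 3 and deeper is an assumption based on setext syntax only covering levels 1 and 2.

```typescript
// Illustration only: the two Markdown heading syntaxes, not yunagi's converter.
type HeadingStyle = 'atx' | 'setext'

function renderHeading(text: string, level: number, style: HeadingStyle): string {
  // Setext syntax only exists for levels 1 (=) and 2 (-); deeper levels use ATX here.
  if (style === 'setext' && level <= 2) {
    const underline = (level === 1 ? '=' : '-').repeat(Math.max(text.length, 3))
    return `${text}\n${underline}`
  }
  // ATX style: one '#' per heading level.
  return `${'#'.repeat(level)} ${text}`
}

console.log(renderHeading('Evening Calm', 1, 'atx'))
console.log(renderHeading('Evening Calm', 1, 'setext'))
```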

Configuration

Create a yunagi.config.json in your project root:

{
  "keepImages": true,
  "respectRobotsTxt": false,
  "converter": {
    "headingStyle": "atx",
    "bulletListMarker": "-",
    "codeBlockStyle": "fenced",
    "linkStyle": "inlined"
  },
  "remove": [".sidebar", ".ad"],
  "include": [".main-content"],
  "select": "article.main",
  "siteRules": [
    {
      "url": "zenn.dev",
      "remove": [".topic-badge"],
      "include": [".article-content"]
    }
  ]
}
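
One way to picture how `siteRules` could combine with the top-level filters is sketched below. This is an assumption about the semantics, not a description of yunagi's behavior. The type names and `resolveRules` are hypothetical. The sketch assumes that a rule applies when the page hostname equals `url` or is a subdomain of it, and that matching rules are appended to the global lists.

```typescript
// Hypothetical types mirroring the config keys shown above.
interface FilterRules {
  remove?: string[]
  include?: string[]
  select?: string
}
interface SiteRule extends FilterRules {
  url: string
}
interface Config extends FilterRules {
  siteRules?: SiteRule[]
}
interface ResolvedRules {
  remove: string[]
  include: string[]
  select: string | undefined
}

// Assumption: a site rule matches the exact hostname or any subdomain of it,
// and its selectors are appended to the top-level ones.
function resolveRules(config: Config, pageUrl: string): ResolvedRules {
  const host = new URL(pageUrl).hostname
  const remove = [...(config.remove ?? [])]
  const include = [...(config.include ?? [])]
  let select = config.select

  for (const rule of config.siteRules ?? []) {
    const matches = host === rule.url || host.endsWith(`.${rule.url}`)
    if (!matches) continue
    remove.push(...(rule.remove ?? []))
    include.push(...(rule.include ?? []))
    select = rule.select ?? select // a site-level select overrides the global one
  }
  return { remove, include, select }
}

const exampleConfig: Config = {
  remove: ['.sidebar', '.ad'],
  include: ['.main-content'],
  select: 'article.main',
  siteRules: [
    { url: 'zenn.dev', remove: ['.topic-badge'], include: ['.article-content'] },
  ],
}

console.log(resolveRules(exampleConfig, 'https://zenn.dev/some/article'))
console.log(resolveRules(exampleConfig, 'https://example.com/'))
```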

MCP Server

The MCP server exposes three tools:

yunagi_convert - Fetch a URL, extract main content, and convert to Markdown
yunagi_select - Extract elements matching a CSS selector as HTML
yunagi_select_markdown - Extract elements matching a CSS selector as Markdown
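
MCP clients invoke tools like these with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `yunagi_convert`. The helper `buildToolCall` is hypothetical, and the `url` argument name is an assumption; the real argument names come from each tool's input schema as reported by the server.

```typescript
// Shape of an MCP `tools/call` request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: '2.0'
  id: number
  method: 'tools/call'
  params: { name: string; arguments: Record<string, unknown> }
}

// Hypothetical helper that assembles the request object.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: '2.0', id, method: 'tools/call', params: { name, arguments: args } }
}

// Assumption: `yunagi_convert` accepts a `url` argument.
const request = buildToolCall(1, 'yunagi_convert', {
  url: 'https://example.com/article',
})
console.log(JSON.stringify(request, null, 2))
```

In practice an MCP client such as Claude Desktop constructs these messages itself, so this is only useful for understanding or debugging the traffic.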

Setup for Claude Desktop

{
  "mcpServers": {
    "yunagi": {
      "command": "npx",
      "args": ["@yunagi/mcp"]
    }
  }
}

Development

bun run dev            # Watch mode
bun run test           # Run tests
bun run typecheck      # Type checking
bun run lint           # Lint
bun run format         # Format

License

MIT
