A Hexo plugin that reads your posts aloud. It builds an MP3 for each post at generate-time using Microsoft Edge online TTS, caches the result, and injects a floating audio player into the rendered page.
The browser only ever loads a static <audio> file — no runtime TTS server,
no API key, no client-side JavaScript synthesis. Ideal for blogs that want
read-aloud without operating a backend or paying for a speech API at runtime.
- Preview
- Features
- Requirements
- Install
- Quick Start
- Configuration
- How it works
- Voices
- Cache
- Theming
- Troubleshooting
- Notes & limitations
- Development
- Related links
- License
Open preview.html in your browser to try the floating player
UI in light and dark mode — no Hexo site or TTS network call required.
- Build-time synthesis — no runtime TTS service, no API key required
- Content-hash cache — unchanged posts are not re-synthesized between builds
- Auto-injection into post pages, or manual `{% reader %}` tag for precise placement
- Floating player with play / pause / seek / playback speed (0.75× – 2×)
- Light & dark themes, keyboard accessible
- Smart text extraction — skips code blocks, scripts, styles, and embedded media
- Long-post safe — chunks input on sentence boundaries and concatenates MP3 frames
- Per-post opt-out via front-matter
- Node.js >= 18
- Hexo >= 5
- Network access to Microsoft Edge online TTS at build time
npm install hexo-tts-reader --save

or with yarn / pnpm:

yarn add hexo-tts-reader
pnpm add hexo-tts-reader

- Install the plugin.
- Add the following to your site's `_config.yml`:

  reader:
    enable: true

- Run `hexo clean && hexo generate` (or `hexo server`). Each post page will gain a floating player button (default label: 朗读本文, "Read this post aloud") in the bottom-right corner. Change `buttonLabel` in config for your locale.
That's it. On subsequent builds, the content-hash cache means only changed posts hit the TTS service.
Full config with defaults:
reader:
  enable: true
  autoInject: true
  voice: zh-CN-XiaoxiaoNeural
  rate: 0                          # -100..100, relative percent
  pitch: 0                         # -100..100, relative percent
  outputFormat: audio-24khz-48kbitrate-mono-mp3
  audioDir: audio                  # public audio output dir
  cacheDir: .hexo-reader-cache     # local cache dir (gitignore it)
  position: bottom-right           # bottom-right | bottom-left | top-right | top-left
  buttonLabel: 朗读本文
  chunkSize: 4000                  # split long posts into <= N chars per request
  maxTextLength: 100000            # hard upper bound per post
  failOnError: false               # if true, build fails when TTS fails
  timeoutMs: 60000                 # per-chunk TTS timeout (ms)
  skip: []                         # list of substrings to match against post.source

| Option | Type | Default | Notes |
|---|---|---|---|
| `enable` | boolean | `true` | Master switch. |
| `autoInject` | boolean | `true` | Append the player to every post automatically. Disable if you only want to use `{% reader %}`. |
| `voice` | string | `zh-CN-XiaoxiaoNeural` | Any Microsoft Edge online TTS voice id. |
| `rate` | number | `0` | Relative speech rate, -100..100. Out-of-range values are clamped. |
| `pitch` | number | `0` | Relative pitch, -100..100. Out-of-range values are clamped. |
| `outputFormat` | string | `audio-24khz-48kbitrate-mono-mp3` | Any format supported by `msedge-tts`. |
| `audioDir` | string | `audio` | Public path under the site root where MP3s are emitted. Path traversal is rejected. |
| `cacheDir` | string | `.hexo-reader-cache` | Local cache dir (resolved from your site's base dir). Survives across builds. |
| `position` | enum | `bottom-right` | One of `bottom-right`, `bottom-left`, `top-right`, `top-left`. |
| `buttonLabel` | string | `朗读本文` | Aria-label / tooltip of the toggle button. |
| `chunkSize` | number | `4000` | Max characters per TTS request, 200..8000. |
| `maxTextLength` | number | `100000` | Hard cap per post, 100..1000000. Longer text is truncated. |
| `failOnError` | boolean | `false` | When `true`, a TTS failure aborts the whole `hexo generate`. |
| `timeoutMs` | number | `60000` | Per-chunk WebSocket timeout, 5000..600000. |
| `skip` | string[] | `[]` | Substrings matched against `post.source` to skip selected posts. |
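The clamping rule for `rate` and `pitch` can be pictured with a small sketch. This is an illustration under assumptions, not the plugin's code: `normalizeProsody` and `clamp` are hypothetical names, and falling back to the default for non-numeric input is a guess at sensible behavior rather than documented behavior.

```javascript
// Hypothetical helpers; the plugin's real function names are not documented here.
const PROSODY_DEFAULTS = { rate: 0, pitch: 0 };
const PROSODY_RANGE = { min: -100, max: 100 };

function clamp(value, min, max) {
  return Math.min(max, Math.max(min, value));
}

function normalizeProsody(userConfig = {}) {
  const result = {};
  for (const key of Object.keys(PROSODY_DEFAULTS)) {
    const n = Number(userConfig[key] ?? PROSODY_DEFAULTS[key]);
    // Assumption: non-numeric input falls back to the documented default.
    result[key] = Number.isFinite(n)
      ? clamp(n, PROSODY_RANGE.min, PROSODY_RANGE.max)
      : PROSODY_DEFAULTS[key];
  }
  return result;
}

console.log(normalizeProsody({ rate: 250, pitch: -10 }));
// { rate: 100, pitch: -10 }
```

How the plugin treats out-of-range values for the other numeric options is not stated, so they are left out of the sketch.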
To opt a single post out, set `reader: false` in its front-matter:

---
title: My private post
reader: false
---

Insert the player anywhere in a post with the `{% reader %}` tag. When the tag is present, auto-injection is suppressed for that post so you only get one player:

Some intro text.
{% reader %}
The rest of the article.

To skip posts by source path, list substrings under `skip`:

reader:
  skip:
    - "draft/"
    - "_posts/private/"

Any post whose `post.source` contains one of the substrings above is skipped.
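The two opt-out mechanisms (front-matter `reader: false` and the `skip` list) amount to one predicate over the post. A minimal sketch, assuming the front-matter value is exposed as `post.reader` as Hexo does for front-matter fields; `shouldSkipPost` is a hypothetical name, not the plugin's API.

```javascript
// Hypothetical predicate; the plugin's internal function may look different.
function shouldSkipPost(post, config = {}) {
  // Front-matter opt-out: `reader: false` disables audio for this post.
  if (post.reader === false) return true;

  // Plain substring match against the post's source path.
  const source = String(post.source || '');
  const patterns = Array.isArray(config.skip) ? config.skip : [];
  return patterns.some(
    (pattern) => typeof pattern === 'string' && pattern !== '' && source.includes(pattern)
  );
}

console.log(shouldSkipPost({ source: '_posts/private/diary.md' }, { skip: ['_posts/private/'] }));
// true
```

Because the match is a substring test rather than a glob, a short pattern such as `"draft"` would also match `_posts/redrafted.md`; including the trailing slash keeps the match to directories.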
flowchart LR
A[Hexo renders post] --> B[Extract plain text]
B --> C{Cache hit?}
C -->|yes| E[Copy MP3 to public dir]
C -->|no| D[Edge TTS via WebSocket]
D --> E
E --> F[Inject player markup]
- After Hexo renders a post (`after_post_render` filter), the plugin extracts a TTS-friendly plain-text representation from the HTML. Code blocks, scripts, styles, and embedded media are removed.
- A SHA-1 of `{ text, voice, rate, pitch, format }` becomes the cache key.
- If `<cacheDir>/<key>.mp3` already exists, it is reused. Otherwise the plugin opens a WebSocket to Microsoft Edge TTS (via `msedge-tts`) and writes the result atomically (`*.tmp` → rename).
- Long inputs are chunked on sentence boundaries (Chinese & English punctuation), synthesized chunk-by-chunk, and concatenated as raw MP3 frames, which is safe for playback.
- The generator emits the MP3 under `<audioDir>/<key>.mp3`, plus a shared `reader.js` / `reader.css` under `assets/hexo-reader/`.
- The player markup and a `<link>` / `<script>` snippet are appended to the post's content so it ships with the static site.
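To make the text-extraction step concrete, here is a rough sketch of turning rendered HTML into TTS-friendly text. It is deliberately naive and regex-based; the plugin may well use a proper HTML parser. `extractPlainText` and the list of dropped elements are assumptions for illustration only.

```javascript
// Assumed list of elements that are dropped together with their contents.
const DROPPED_ELEMENTS = ['pre', 'script', 'style', 'iframe', 'video', 'audio'];

function extractPlainText(html) {
  let out = html;
  for (const tag of DROPPED_ELEMENTS) {
    // Remove the element and everything inside it (non-greedy, case-insensitive).
    const re = new RegExp(`<${tag}\\b[^>]*>[\\s\\S]*?<\\/${tag}>`, 'gi');
    out = out.replace(re, ' ');
  }
  return out
    .replace(/<[^>]+>/g, ' ') // strip the remaining tags, keep their text
    .replace(/&nbsp;/g, ' ')
    .replace(/&lt;/g, '<')
    .replace(/&gt;/g, '>')
    .replace(/&amp;/g, '&') // decoded last to avoid double-decoding
    .replace(/\s+/g, ' ')
    .trim();
}

const html = '<p>Hello &amp; welcome</p><pre><code>x = 1</code></pre><p>Bye</p>';
console.log(extractPlainText(html));
// Hello & welcome Bye
```

Replacing tags with a space keeps adjacent paragraphs from running together, at the cost of occasionally splitting a word that contains inline markup.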
Any voice supported by Microsoft Edge online TTS works. A few examples:
| Voice id | Locale | Notes |
|---|---|---|
| `zh-CN-XiaoxiaoNeural` | zh-CN | Default, female |
| `zh-CN-YunxiNeural` | zh-CN | Male |
| `zh-CN-YunyangNeural` | zh-CN | Male, news-style |
| `en-US-AriaNeural` | en-US | Female |
| `en-US-GuyNeural` | en-US | Male |
| `ja-JP-NanamiNeural` | ja-JP | Female |
| `ko-KR-SunHiNeural` | ko-KR | Female |
For a full list, see the upstream voice catalogue or run a voices query via
msedge-tts.
- The cache lives at `<site>/<cacheDir>` and survives across builds.
- It is keyed on the content, not the file name, so renames don't trigger re-synthesis and minor edits regenerate only the affected posts.
- To force a full re-synthesis, delete the cache directory.
- Recommended `.gitignore` entry: `.hexo-reader-cache/`
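The reason renames are free is that the file name never enters the hash input. A minimal sketch of such a key, assuming the fields are serialized as JSON in a fixed order; the exact serialization the plugin hashes is not documented, so real keys will likely differ from the ones this produces. `cacheKey` is a hypothetical name.

```javascript
const crypto = require('node:crypto');

// Assumption: JSON with a fixed key order is hashed. Only the text and the
// synthesis parameters take part, never the post's path or title.
function cacheKey({ text, voice, rate, pitch, format }) {
  const payload = JSON.stringify({ text, voice, rate, pitch, format });
  return crypto.createHash('sha1').update(payload, 'utf8').digest('hex');
}

const params = {
  voice: 'zh-CN-XiaoxiaoNeural',
  rate: 0,
  pitch: 0,
  format: 'audio-24khz-48kbitrate-mono-mp3',
};

// Same text and parameters give the same key, whatever the file is called.
console.log(cacheKey({ text: 'Hello', ...params }) === cacheKey({ text: 'Hello', ...params }));
// true
```

Changing the voice, rate, pitch, or output format changes the key as well, so switching voices re-synthesizes every post.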
The injected player uses CSS classes prefixed with hexo-reader__. To
customize colors, override them in your theme's stylesheet, for example:
.hexo-reader__toggle {
  background: #1f6feb;
  color: #fff;
}
.hexo-reader__panel {
  border-radius: 12px;
}

The player respects `prefers-color-scheme: dark` out of the box.
The build hangs or times out on hexo generate.
Your build needs network access to Edge TTS. If you're behind a firewall or
running offline, set failOnError: false (default) so failures degrade
gracefully, or skip the affected posts via skip.
A post has no player after build.
Check the Hexo log for hexo-reader: TTS failed for "<title>". If TTS failed
and failOnError is false, the player is silently skipped for that post.
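The interaction between `failOnError` and `timeoutMs` can be sketched as a wrapper around the synthesis call. Everything here is hypothetical (`withTimeout`, `synthesizeOrSkip`, and the `synthesize` callback stand in for whatever the plugin does internally); only the option names and the log prefix come from this README.

```javascript
// Rejects if `promise` does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// `synthesize` is a hypothetical async function that resolves to an MP3 Buffer.
async function synthesizeOrSkip(synthesize, title, options = {}) {
  const { failOnError = false, timeoutMs = 60000 } = options;
  try {
    return await withTimeout(synthesize(), timeoutMs);
  } catch (err) {
    if (failOnError) throw err; // abort the whole build
    console.warn(`hexo-reader: TTS failed for "${title}": ${err.message}`);
    return null; // caller skips player injection for this post
  }
}
```

A `null` result is what leads to a post without a player, which is why the log line is the first thing to check.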
The player appears twice in a post.
With `autoInject: true`, the plugin already suppresses auto-injection for any
post that contains the `{% reader %}` tag. If you still see two players, make
sure the tag is rendered (not commented out) and that no theme template adds
its own copy.
Player UI looks broken (overlapping play/pause icons) or won't play under PJAX themes.
Themes such as AnZhiYu and Butterfly swap #body-wrap on in-site navigation and
ignore <script defer> / <link> tags embedded in post content. Since 0.1.3,
this plugin injects reader.css / reader.js site-wide via Hexo injector and
re-initializes on pjax:complete through hexoReaderBoot(). Upgrade and run
hexo clean && hexo generate.
Audio doesn't play / 404.
Make sure your deployment includes the audio/ and assets/hexo-reader/
directories. If you set a non-default audioDir, ensure it's not blocked by
your CDN rules.
I want to ship the site without ever calling TTS.
Pre-populate <cacheDir> with previously-generated MP3 files. The plugin
re-uses them by content hash without hitting the network.
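Pre-populating works because a cache hit only requires `<cacheDir>/<key>.mp3` to exist. The sketch below shows that lookup together with the `*.tmp` → rename write mentioned under "How it works". `lookupCache` and `storeInCache` are hypothetical names, and synchronous file APIs are used only to keep the example short.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Returns the cached MP3 path for a key, or null on a cache miss.
function lookupCache(cacheDir, key) {
  const file = path.join(cacheDir, `${key}.mp3`);
  return fs.existsSync(file) ? file : null;
}

// Writes to a temporary sibling first, then renames it into place, so an
// interrupted build never leaves a truncated <key>.mp3 behind.
function storeInCache(cacheDir, key, audioBuffer) {
  fs.mkdirSync(cacheDir, { recursive: true });
  const file = path.join(cacheDir, `${key}.mp3`);
  const tmp = `${file}.tmp`;
  fs.writeFileSync(tmp, audioBuffer);
  fs.renameSync(tmp, file);
  return file;
}
```

Note that the pre-populated files must have been produced from the same text and synthesis parameters; otherwise the keys will not match and the plugin will go to the network.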
- Synthesis happens at build time and needs network access to Edge TTS.
- Requires `msedge-tts` >= 2.0.5 (bundled with this plugin). Microsoft changed the Edge Read Aloud API in late 2025; older clients receive an HTML error page instead of audio.
- Each post becomes one MP3; very long posts are chunked and concatenated.
- If TTS fails for a post and `failOnError` is `false` (default), the player is simply not injected for that post and the build continues.
- The cache key is a SHA-1 of plain text + synthesis params. It is used as a content identifier only, not for security.
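For the chunking of very long posts, a sentence-boundary splitter along these lines would keep each request within `chunkSize`. It is a sketch, not the plugin's algorithm: `chunkText` is a hypothetical name, the punctuation set is an assumption, and hard-splitting an overlong sentence is one possible fallback.

```javascript
// A "sentence" is a run of text ending in Chinese or English terminal
// punctuation (or a newline); trailing text without a terminator also counts.
const SENTENCE = /[^。！？!?.；;\n]*[。！？!?.；;\n]+|[^。！？!?.；;\n]+$/g;

function chunkText(text, chunkSize = 4000) {
  const sentences = text.match(SENTENCE) || [];
  const chunks = [];
  let current = '';
  for (const sentence of sentences) {
    // Start a new chunk when this sentence would overflow the current one.
    if (current !== '' && current.length + sentence.length > chunkSize) {
      chunks.push(current);
      current = '';
    }
    // Fallback: a single sentence longer than chunkSize is hard-split.
    let rest = sentence;
    while (rest.length > chunkSize) {
      chunks.push(rest.slice(0, chunkSize));
      rest = rest.slice(chunkSize);
    }
    current += rest;
  }
  if (current !== '') chunks.push(current);
  return chunks;
}

console.log(chunkText('你好。世界！Hello world. Bye?', 8));
// [ '你好。世界！', 'Hello wo', 'rld.', ' Bye?' ]
```

Joining the chunks reproduces the input exactly. The audio returned for each chunk could then be combined with `Buffer.concat`, in line with the raw-frame concatenation described under "How it works".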
git clone https://github.com/xichenx/hexo-tts-reader.git
cd hexo-tts-reader
npm install
npm test

Tests use the built-in Node test runner (`node --test`).
- npm package
- GitHub repository
- Report an issue
- msedge-tts — underlying TTS client
MIT © 刘明智(xichen)