Convert .docx files to Markdown from the command line. Optional image OCR via PaddleX.
Web version: word2md.net — drag & drop in browser, no install needed.
pnpm add -g word2md-cli
# or run without install
npx word2md-cli input.docx

word2md input.docx # → input.md next to source
word2md input.docx -o out.md # custom output
word2md input.docx --stdout # to stdout
word2md a.docx b.docx c.docx -d out/ # batch mode
word2md input.docx --format text # plain text (strip markdown)

Pass --ocr with PaddleX credentials to extract text from images inside the docx:
export PADDLEX_OCR_URL="https://..."
export PADDLEX_OCR_TOKEN="..."
word2md input.docx --ocr

Or pass flags directly:
word2md input.docx --ocr \
  --paddlex-url https://... \
  --paddlex-token xxx \
  --ocr-concurrency 4

Without --ocr, images are stripped.
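
If you script conversions, you may want OCR to switch on only when the PaddleX credentials are actually present. The following is a minimal POSIX shell sketch of that idea, not part of word2md itself. The helper names `ocr_flag` and `convert_dir` and the `WORD2MD_BIN` override are hypothetical, introduced here for illustration, and the sketch assumes that `--ocr` can be combined with batch mode (`-d`), which this README does not state explicitly.

```shell
# Sketch only: ocr_flag, convert_dir and WORD2MD_BIN are hypothetical helpers,
# not features of word2md.

# Print "--ocr" when both PaddleX variables are non-empty; print nothing otherwise.
ocr_flag() {
  if [ -n "${PADDLEX_OCR_URL:-}" ] && [ -n "${PADDLEX_OCR_TOKEN:-}" ]; then
    printf '%s\n' "--ocr"
  fi
}

# Convert every .docx directly inside directory $1, writing results to directory $2.
# Assumption: --ocr may be used together with batch mode (-d).
convert_dir() {
  src="$1"
  out="$2"
  # Replace the function's positional parameters with the matching files.
  set -- "$src"/*.docx
  # An unmatched glob stays literal, so check that the first entry exists.
  if [ ! -e "$1" ]; then
    echo "no .docx files in $src" >&2
    return 1
  fi
  mkdir -p "$out"
  flag="$(ocr_flag)"
  # ${flag:+"$flag"} expands to nothing when flag is empty.
  "${WORD2MD_BIN:-word2md}" "$@" -d "$out" ${flag:+"$flag"}
}
```

After sourcing these functions, `convert_dir docs out` would run a single batch conversion of `docs/*.docx` into `out/`, with `--ocr` appended only when both `PADDLEX_OCR_URL` and `PADDLEX_OCR_TOKEN` are exported. Without them the call falls back to the default behavior, where images are stripped.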
For local development:
pnpm install
pnpm dev -- sample.docx --stdout
pnpm build