Format-preserving DOCX translation pipeline. Extracts text, translates via your LLM provider, rebuilds the original formatting.
Install:

    npm install @docshift/core

Usage:

    import { translateDocxBuffer } from '@docshift/core';
    import type { CoreProvider } from '@docshift/core';

    const provider: CoreProvider = {
      async complete(prompt) {
        /* call your LLM, return response string */
        return ''; // placeholder
      },
      async translateWithBrief(segments, targetLang, readingNotes, onProgress, opts) {
        /* translate segments in chunks, inject opts.glossary + opts.rules */
        return segments.map(() => ''); // placeholder: one output string per segment
      },
    };

    const fs = await import('fs/promises');
    const file = await fs.readFile('document.docx');
    // A Node Buffer can be a view into a larger ArrayBuffer, so copy out exactly the file's bytes.
    const input = file.buffer.slice(file.byteOffset, file.byteOffset + file.byteLength);

    const { buffer } = await translateDocxBuffer(input, provider, 'English', {
      glossary: 'tổng hợp=combined, hợp nhất=consolidated',
      rules: 'Keep all numbers in original format.',
      onStage: stage => console.log('Stage:', stage),
      onProgress: (done, total) => console.log(`${done}/${total} segments`),
    });
    await fs.writeFile('document_en.docx', Buffer.from(buffer));

`translateDocxBuffer(input, provider, targetLang, options)` takes:

| Param | Type | Description |
|---|---|---|
| `input` | `ArrayBuffer` | Source .docx file bytes |
| `provider` | `CoreProvider` | LLM adapter; must implement `complete` and `translateWithBrief` |
| `targetLang` | `string` | Target language name, e.g. `"English"` |
| `options.glossary` | `string?` | Comma-separated `source=target` pairs |
| `options.rules` | `string?` | Free-text translation rules |
| `options.onStage` | `(stage: string) => void?` | Stage callback: extracting → priming → translating → reviewing → rebuilding → done |
| `options.onProgress` | `(done, total) => void?` | Per-segment progress |
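
The `translateWithBrief` contract (translate in chunks, inject `opts.glossary` and `opts.rules`) could be filled in roughly as follows. This is only a sketch under assumptions: the types are inferred from the usage example rather than taken from the package, segments are assumed to be plain strings, and `makeProvider`, `chunk`, `buildPrompt`, the prompt wording, and the JSON reply format are all hypothetical choices, not part of `@docshift/core`.

```typescript
// Assumed shapes: inferred from the usage example, not copied from the
// package, so they may differ from the real CoreProvider.
type TranslateOpts = { glossary?: string; rules?: string };
type LlmCall = (prompt: string) => Promise<string>;

// Split an array into consecutive chunks of at most `size` items.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Build the prompt for one chunk; glossary and rules are added only when set.
function buildPrompt(batch: string[], targetLang: string, notes: string, opts: TranslateOpts): string {
  const lines = [`Translate each numbered segment into ${targetLang}.`];
  if (notes) lines.push(`Context: ${notes}`);
  if (opts.glossary) lines.push(`Glossary (source=target): ${opts.glossary}`);
  if (opts.rules) lines.push(`Rules: ${opts.rules}`);
  lines.push('Reply with a JSON array of strings, one per segment, in order.');
  batch.forEach((s, i) => lines.push(`${i + 1}. ${s}`));
  return lines.join('\n');
}

// Hypothetical factory: wraps any LLM call into the two required methods.
function makeProvider(callLLM: LlmCall, chunkSize = 20) {
  return {
    complete: (prompt: string) => callLLM(prompt),
    async translateWithBrief(
      segments: string[],
      targetLang: string,
      readingNotes: string,
      onProgress?: (done: number, total: number) => void,
      opts: TranslateOpts = {},
    ): Promise<string[]> {
      const result: string[] = [];
      for (const batch of chunk(segments, chunkSize)) {
        const reply = await callLLM(buildPrompt(batch, targetLang, readingNotes, opts));
        let parsed: unknown;
        try { parsed = JSON.parse(reply); } catch { parsed = null; }
        // If the reply is malformed, keep the source text for this chunk so the
        // output always has exactly one entry per input segment.
        const ok = Array.isArray(parsed) && parsed.length === batch.length;
        result.push(...(ok ? (parsed as unknown[]).map(String) : batch));
        onProgress?.(result.length, segments.length);
      }
      return result;
    },
  };
}
```

Keeping the output the same length as the input is the one property the usage example implies (`segments.map(...)`), so the sketch falls back to the untranslated chunk rather than returning a shorter array.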
Returns `Promise<{ buffer: ArrayBuffer; filename: string }>`.
On rebuild failure, it returns the original buffer unchanged (it never throws).
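
Because a failed rebuild does not surface as an exception, a caller may want to check for the fallback explicitly. One possible approach, assuming "unchanged" means the returned bytes are identical to the input, is a byte-wise comparison; `sameBytes` is a hypothetical helper, not part of the package.

```typescript
// Compare two ArrayBuffers byte by byte.
function sameBytes(a: ArrayBuffer, b: ArrayBuffer): boolean {
  if (a.byteLength !== b.byteLength) return false;
  const x = new Uint8Array(a);
  const y = new Uint8Array(b);
  for (let i = 0; i < x.length; i++) {
    if (x[i] !== y[i]) return false;
  }
  return true;
}
```

After the call, `if (sameBytes(input, buffer)) { /* treat as not translated */ }` would let the caller log a warning or retry instead of silently writing out an untranslated copy.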
If this saved you time → buy me a coffee ☕
MIT © Tuan Khuc