prosemirror-docx

Available on npm and GitHub under the MIT License.

Export a prosemirror document to a Microsoft Word file, using docx.


Overview

prosemirror-docx has a similar structure to prosemirror-markdown, with a DocxSerializerState object that you write to as you walk the document. It is a light wrapper around docx (https://docx.js.org/), which performs the actual export. prosemirror-docx is currently write-only (it can export to *.docx but cannot read from it) and covers most of the basic nodes (see Supported Nodes below).

Curvenote uses this library to export from @curvenote/editor to Word documents, but it currently depends only on docx, prosemirror-model, and image-dimensions. As with prosemirror-markdown, the serialization schema can be edited externally (see Extended usage below).

Basic usage

import { defaultDocxSerializer, writeDocx } from 'prosemirror-docx';
import { EditorState } from 'prosemirror-state';
import { writeFileSync } from 'fs'; // Or some other way to write a file

// Set up your prosemirror state/document as you normally do
const state = EditorState.create({ schema: mySchema });

// If there are images, we will need to preload the buffers
const opts = {
  getImageBuffer(src: string) {
    return anImageBuffer;
  },
};

// Create a doc in memory, and then write it to disk
const wordDocument = defaultDocxSerializer.serialize(state.doc, opts);

const buffer = await writeDocx(wordDocument);
writeFileSync('HelloWorld.docx', buffer);
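The `getImageBuffer` option above returns a placeholder (`anImageBuffer`). One way to fill it in is to load every image ahead of time and look the bytes up by `src`. The following is a minimal sketch under that assumption; `ImageBufferMap`, `createImageOptions`, and the `figure-1.png` key are hypothetical names, not part of prosemirror-docx, and the `Uint8Array` return type is assumed from the asynchronous example later in this README.

```typescript
// Hypothetical helper: none of these names come from prosemirror-docx.
type ImageBufferMap = Map<string, Uint8Array>;

// Build the options object from buffers that were loaded ahead of time.
function createImageOptions(buffers: ImageBufferMap) {
  return {
    getImageBuffer(src: string): Uint8Array {
      const buffer = buffers.get(src);
      if (buffer === undefined) {
        // Fail loudly rather than exporting a document with a missing image.
        throw new Error(`No preloaded buffer for image: ${src}`);
      }
      return buffer;
    },
  };
}

// Example: one image registered under the same src that the document uses.
const preloaded: ImageBufferMap = new Map([
  ['figure-1.png', new Uint8Array([0x89, 0x50, 0x4e, 0x47])], // placeholder bytes, not a full PNG
]);
const imageOpts = createImageOptions(preloaded);
```

The resulting `imageOpts` object could then be passed where `opts` is used above, provided the keys in the map match the `src` attributes of the image nodes in your document.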

Advanced usage

If you need to access the underlying state and modify the final docx Document, pass a callback function as the last argument of serialize. The callback receives the DocxSerializerState.

The callback must return an IPropertiesOptions object, i.e. the config that is passed to a docx Document. Your options are spread with the default options, so you can override any of the defaults.

const wordDocument = defaultDocxSerializer.serialize(state.doc, opts, (state) => {
  return {
    numbering: {
      config: state.numbering,
    },
    fonts: [], // embed fonts,
    styles: {
      paragraphStyles,
      default: {
        heading1: paragraphStyles[1],
      },
    },
  };
});

See the docx documentation for more details on the options you can pass in.
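Because the returned options are spread with the defaults, it may help to see what a spread does to nested keys. The sketch below assumes the merge is a plain object spread, as the wording above suggests; the library may merge differently. `SketchOptions`, `sketchDefaults`, and `sketchOverrides` are illustrative stand-ins, not the real IPropertiesOptions type or the library's actual defaults.

```typescript
// Illustrative shapes only; the real defaults live inside prosemirror-docx.
interface SketchOptions {
  numbering?: { config: string[] };
  styles?: { paragraphStyles?: string[]; default?: Record<string, string> };
}

const sketchDefaults: SketchOptions = {
  numbering: { config: ['default-numbering'] },
  styles: { paragraphStyles: ['Normal'], default: { heading1: 'DefaultHeading1' } },
};

// What a callback might return: only `styles` is overridden.
const sketchOverrides: SketchOptions = {
  styles: { default: { heading1: 'MyHeading1' } },
};

// A plain spread is shallow: top-level keys from the override win outright.
const merged: SketchOptions = { ...sketchDefaults, ...sketchOverrides };

// `numbering` was not overridden, so the default value survives.
// `styles` was replaced as a whole, so `paragraphStyles` is no longer set.
```

Under that assumption, overriding a nested key such as `styles` means supplying the whole object, including any default entries you still want to keep.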

Extended usage

Instead of using the defaultDocxSerializer, you can override the default serializers or provide custom ones.

import { DocxSerializer, defaultNodes, defaultMarks } from 'prosemirror-docx';

const nodeSerializer = {
  ...defaultNodes,
  my_paragraph(state, node) {
    state.renderInline(node);
    state.closeBlock(node);
  },
};

export const myDocxSerializer = new DocxSerializer(nodeSerializer, defaultMarks);

The state is the DocxSerializerState and has helper methods to interact with docx.
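To make the walk-and-write pattern concrete, here is a self-contained toy model of how a serializer of this shape might dispatch each node to a handler keyed by its type name. It is a sketch of the general idea only, not the library's implementation: `ToyNode`, `ToyState`, `ToyHandler`, `toyDefaultNodes`, and `toySerialize` are all hypothetical, and the toy state collects plain strings instead of docx objects.

```typescript
// Toy model only: these types are hypothetical stand-ins, not the
// prosemirror-docx or prosemirror-model types.
interface ToyNode {
  type: string;
  text?: string;
  content?: ToyNode[];
}

type ToyHandler = (state: ToyState, node: ToyNode) => void;

class ToyState {
  blocks: string[] = []; // finished blocks, in document order
  current = ''; // inline text collected for the block being written
  handlers: Record<string, ToyHandler>;

  constructor(handlers: Record<string, ToyHandler>) {
    this.handlers = handlers;
  }

  // Dispatch a node to the handler registered under its type name.
  render(node: ToyNode): void {
    const handler = this.handlers[node.type];
    if (!handler) throw new Error(`No serializer for node type: ${node.type}`);
    handler(this, node);
  }

  // Render every child of a node, in order.
  renderContent(node: ToyNode): void {
    for (const child of node.content ?? []) this.render(child);
  }

  text(value: string): void {
    this.current += value;
  }

  // Finish the current block and start a new one.
  closeBlock(): void {
    this.blocks.push(this.current);
    this.current = '';
  }
}

const toyDefaultNodes: Record<string, ToyHandler> = {
  doc: (state, node) => state.renderContent(node),
  text: (state, node) => state.text(node.text ?? ''),
  paragraph: (state, node) => {
    state.renderContent(node);
    state.closeBlock();
  },
};

// Same override pattern as above: spread the defaults, then add a handler.
const toyNodes: Record<string, ToyHandler> = {
  ...toyDefaultNodes,
  my_paragraph(state, node) {
    state.renderContent(node);
    state.closeBlock();
  },
};

function toySerialize(doc: ToyNode, handlers: Record<string, ToyHandler>): string[] {
  const state = new ToyState(handlers);
  state.render(doc);
  return state.blocks;
}
```

In this toy, a node type with no registered handler raises an error, which is why a custom schema node such as `my_paragraph` needs its own entry in the handler map.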

If the exported content includes image links that require fetching the image data, you can use the asynchronous APIs. For example:

import { DocxSerializerAsync, defaultAsyncNodes, defaultMarks, writeDocx } from 'prosemirror-docx';
import { EditorState } from 'prosemirror-state';
import { writeFileSync } from 'fs';

const state = EditorState.create({ schema: mySchema });

export const docxSerializer = new DocxSerializerAsync(
  {
    ...defaultAsyncNodes,
    async image(state, node) {
      const { src } = node.attrs;
      await state.image(src, 70, 'center', undefined, 'png');
      state.closeBlock(node);
    },
  },
  defaultMarks,
);

// If there are images, fetch their buffers on demand
const opts = {
  async getImageBuffer(src: string) {
    const arrayBuffer = await fetch(src).then((res) => res.arrayBuffer());
    return new Uint8Array(arrayBuffer);
  },
};

// Create a doc in memory, and then write it to disk
const wordDocument = await docxSerializer.serializeAsync(state.doc, opts);

const buffer = await writeDocx(wordDocument);
writeFileSync('HelloWorld.docx', buffer);
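If the same image appears several times in a document, an asynchronous `getImageBuffer` like the one above may fetch it repeatedly. A possible refinement is to cache the in-flight promise per `src`. This is a minimal sketch, not something the library provides: `Fetcher`, `createCachedImageLoader`, and the demo fetcher are hypothetical, and the demo uses a stand-in instead of a real network call.

```typescript
// Hypothetical helper: not part of prosemirror-docx.
type Fetcher = (src: string) => Promise<Uint8Array>;

function createCachedImageLoader(fetchImage: Fetcher) {
  // Cache the promise, not the bytes, so concurrent requests share one fetch.
  const cache = new Map<string, Promise<Uint8Array>>();
  return {
    getImageBuffer(src: string): Promise<Uint8Array> {
      let pending = cache.get(src);
      if (pending === undefined) {
        pending = fetchImage(src).catch((error) => {
          // Drop failed entries so a later call can retry.
          cache.delete(src);
          throw error;
        });
        cache.set(src, pending);
      }
      return pending;
    },
  };
}

// Demo with a stand-in fetcher that counts calls; in practice the fetcher
// would wrap fetch(src) and return the response bytes as a Uint8Array.
let fetchCount = 0;
const demoLoader = createCachedImageLoader(async (src) => {
  fetchCount += 1;
  return new Uint8Array([src.length]);
});
```

The object returned by `createCachedImageLoader` has the same `getImageBuffer` shape as `opts` above, so it could be passed to `serializeAsync` in its place.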

Supported Nodes

  • text
  • paragraph
  • heading (levels)
    • TODO: Support numbering of headings
  • blockquote
  • code_block
    • TODO: No styles supported
  • horizontal_rule
  • hard_break
  • ordered_list
  • unordered_list
  • list_item
  • image
  • math
  • equations (numbered & unnumbered)
  • tables

Planned:

  • Internal References (e.g. see Table 1)

Supported Marks

  • em
  • strong
  • link
    • Note: docx treats links as nodes rather than marks, so the prosemirror mark itself is ignored, but links are still exported.
  • code
  • subscript
  • superscript
  • strikethrough
  • underline
  • smallcaps
  • allcaps
