[BUG] Internal page links with custom display text are dropped during markdown conversion #104

@tomerha

Describe the bug
When converting a page to markdown with confluence read <pageId> -f markdown, internal page links that have custom display text (via <ac:link-body>) are silently removed instead of being preserved as text. As a result, table cells (and other elements) that contain only such links appear empty in the output.

The storageToMarkdown method handles two <ac:link> patterns:

  1. External URL links with <ac:plain-text-link-body> → converted to [text](url)
  2. Internal page links without a link body → converted to [Page Title]

But internal page links with <ac:link-body> (the format Confluence uses when the author gives a link custom display text) aren't matched by either pattern, so they fall through to the catch-all removal on line 1428:

markdown = markdown.replace(/<ac:link>[\s\S]*?<\/ac:link>/g, '');

Additionally, that catch-all only matches <ac:link> with no attributes. Links that carry attributes like ac:anchor, ac:local-id, or ac:card-appearance (e.g. <ac:link ac:anchor="...">) are not caught by this regex either, so they survive as raw HTML in the output.
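
Both failure modes can be reproduced in isolation. The sketch below applies only the catch-all regex quoted above to two sample snippets; the variable names and the ac:anchor value are made up for illustration, and the real storageToMarkdown method does more than this single replacement.

```javascript
// The catch-all regex quoted in this issue (it allows no attributes on <ac:link>).
const catchAll = /<ac:link>[\s\S]*?<\/ac:link>/g;

// Illustrative sample: internal link with custom display text, no attributes.
const plainLink =
  '<ac:link><ri:page ri:content-title="Some Long Page Title" />' +
  '<ac:link-body>Short Name</ac:link-body></ac:link>';

// Illustrative sample: the same link with an ac:anchor attribute (value is made up).
const anchoredLink =
  '<ac:link ac:anchor="section-1"><ri:page ri:content-title="Some Long Page Title" />' +
  '<ac:link-body>Short Name</ac:link-body></ac:link>';

// Failure mode 1: the whole link is removed, display text included.
const strippedPlain = plainLink.replace(catchAll, '');

// Failure mode 2: the opening tag has attributes, so nothing matches
// and the raw storage markup is left in place.
const strippedAnchored = anchoredLink.replace(catchAll, '');

console.log(JSON.stringify(strippedPlain));     // ""
console.log(strippedAnchored === anchoredLink); // true
```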

To Reproduce
Use any Confluence page with a table whose cells contain internal links with custom display text. For example, a cell with storage format like:

<ac:link>
  <ri:page ri:content-title="Some Long Page Title" ri:version-at-save="28" />
  <ac:link-body>Short Name</ac:link-body>
</ac:link>

Then convert the page:

confluence read <pageId> -f markdown

The cell containing that link will be empty in the markdown table.

Expected behavior
The link's display text should be preserved. The above example should produce Short Name in the markdown output.

Environment:

  • confluence-cli version: 1.30.0
  • Node.js version: v22.20.0
  • OS: macOS

Suggested fix

Add a regex before the catch-all that extracts the display text from <ac:link-body>, and widen the catch-all to match <ac:link> tags with attributes:

// Convert internal page links with custom link body text
markdown = markdown.replace(
  /<ac:link[^>]*>[\s\S]*?<ac:link-body>([\s\S]*?)<\/ac:link-body>[\s\S]*?<\/ac:link>/g,
  '$1'
);

// Remove any remaining ac:link tags that weren't matched
markdown = markdown.replace(/<ac:link[^>]*>[\s\S]*?<\/ac:link>/g, '');
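
To check the two replacements outside the CLI, they can be wrapped in a small standalone function and run against the sample from "To Reproduce". This is only a sketch: convertAcLinks is a hypothetical helper name, not a function in confluence-cli, and the second sample link is made up to exercise the attribute case.

```javascript
// Hypothetical helper wrapping the two replacements suggested above.
function convertAcLinks(storage) {
  let markdown = storage;

  // Keep only the display text of links that have an <ac:link-body>.
  markdown = markdown.replace(
    /<ac:link[^>]*>[\s\S]*?<ac:link-body>([\s\S]*?)<\/ac:link-body>[\s\S]*?<\/ac:link>/g,
    '$1'
  );

  // Remove any remaining <ac:link> elements, with or without attributes.
  markdown = markdown.replace(/<ac:link[^>]*>[\s\S]*?<\/ac:link>/g, '');

  return markdown;
}

// Sample from "To Reproduce", placed inside a table cell.
const cell =
  '<td><ac:link>\n' +
  '  <ri:page ri:content-title="Some Long Page Title" ri:version-at-save="28" />\n' +
  '  <ac:link-body>Short Name</ac:link-body>\n' +
  '</ac:link></td>';

// Made-up link with an attribute on the opening tag.
const anchored =
  '<ac:link ac:anchor="intro"><ri:page ri:content-title="Guide" />' +
  '<ac:link-body>Intro</ac:link-body></ac:link>';

console.log(convertAcLinks(cell));     // <td>Short Name</td>
console.log(convertAcLinks(anchored)); // Intro
```

One caveat worth checking against real pages: because the wildcard between the opening tag and <ac:link-body> is unbounded, a link without a body that is still present when this runs could be merged with a later link that has one, dropping the text between them. Running this step after the existing bodiless-link conversion, or restricting the opening tag to <ac:link(?:\s[^>]*)?>, are possible mitigations.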


Labels: bug, released
