Skip to content

[EuiMarkdownFormat] Fix intraword underscore emphasis rendering#9408

Open
acstll wants to merge 5 commits intoelastic:mainfrom
acstll:fix/markdown-intraword-underscore-emphasis
Open

[EuiMarkdownFormat] Fix intraword underscore emphasis rendering#9408
acstll wants to merge 5 commits intoelastic:mainfrom
acstll:fix/markdown-intraword-underscore-emphasis

Conversation

@acstll
Copy link
Contributor

@acstll acstll commented Feb 24, 2026

Summary

Adds a remark AST transformer plugin (remark_intraword_underscore) that reverses incorrectly applied emphasis/strong nodes when underscore delimiters are flanked by alphanumeric characters on both sides (intraword context).

remark-parse v8 does not implement the CommonMark specification rule that underscore delimiters which are both left-flanking and right-flanking (i.e. surrounded by word characters) cannot open or close emphasis. This plugin walks the parsed AST post-parse, detects these cases by checking sibling text nodes and verifying the delimiter is _ (not *) via source position, and flattens the nodes back to plain text with underscores restored.

This is a temporary workaround until remark-parse is upgraded to a version that natively handles this rule.

Why are we making this change?

Closes #9404

Salesforce API identifiers use underscores and double underscores extensively (e.g. SBQQ__OrderProductBookings__c, Custom_Field__c). EuiMarkdownFormat renders these with bold/italic formatting instead of preserving them verbatim, making the text unreadable and unusable for copy-paste.

Impact to users

🟢 Intraword underscore sequences that were previously (incorrectly) rendered as bold or italic will now display as plain text, matching CommonMark spec behavior. Legitimate emphasis using __bold__ or _italic_ with whitespace boundaries continues to work as expected. Asterisk-based emphasis (**bold**, *italic*) is completely unaffected.

QA

Remove or strikethrough items that do not apply to your PR.

General checklist

  • Browser QA
    • Checked in both light and dark modes
    • Checked in both MacOS and Windows high contrast modes
    • Checked in mobile
    • Checked in Chrome, Safari, Edge, and Firefox
    • Checked for accessibility including keyboard-only and screenreader modes
  • Docs site QA
  • Code quality checklist
  • Release checklist
    • A changelog entry exists and is marked appropriately
    • If applicable, added the breaking change issue label
    • If the changes unblock an issue in a different repo, smoke tested carefully
  • Designer checklist

Made with Cursor

…tic#9404)

Add a remark AST transformer plugin that reverses incorrectly applied
emphasis/strong nodes when underscore delimiters are flanked by
alphanumeric characters on both sides (intraword context), matching the
CommonMark specification behavior that remark-parse v8 does not implement.

This prevents Salesforce-style identifiers like SBQQ__OrderProductBookings__c
from being rendered with bold formatting.

Co-authored-by: Cursor <cursoragent@cursor.com>
@acstll acstll requested a review from a team as a code owner February 24, 2026 13:06
@acstll acstll self-assigned this Feb 24, 2026
@acstll acstll added the skip-changelog Use on PRs to skip changelog requirement (Don't delete - used for automation) label Feb 24, 2026
@acstll acstll marked this pull request as draft February 24, 2026 13:07
@acstll acstll added support-duty-flywheel Label for PRs, see eui-private #310 and removed skip-changelog Use on PRs to skip changelog requirement (Don't delete - used for automation) labels Feb 24, 2026
Copy link
Member

@tkajtoch tkajtoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was initially thinking we'd handle that outside remark but having it as a plugin is also a valid solution :D So far the changes look good. Let me know when the PR is ready for final review!

@acstll
Copy link
Contributor Author

acstll commented Feb 25, 2026

I was initially thinking we'd handle that outside remark but having it as a plugin is also a valid solution :D So far the changes look good. Let me know when the PR is ready for final review!

nice! i'll go through it again and add a small unit test, and then i'll open 🙏

@acstll acstll marked this pull request as ready for review February 27, 2026 08:44
@acstll
Copy link
Contributor Author

acstll commented Feb 27, 2026

@tkajtoch this one is ready for review 🫡

@weronikaolejniczak
Copy link
Contributor

I would propose to close this one in favor of dropping CJS support (https://github.com/elastic/eui-private/issues/553). It's not the first issue we've had for remark that a simple package update would fix.

@tkajtoch
Copy link
Member

This issue has spun up a whole different discussion, and I believe at this point we just need to get things going. I'm going to run your code against remark test files to ensure there are no regressions.

Copy link
Member

@tkajtoch tkajtoch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested your PR against a CommonMark sample file and confirmed there are no regressions.

However, I've found one more thing related to underscores that this doesn't fix: Lorem__ipsum__ dolor sit amet should output a plaintext but instead it makes the ipsum bolded.

Here's the expected mdast:

{
  "type": "root",
  "children": [
    {
      "type": "paragraph",
      "children": [
        {
          "type": "text",
          "value": "Lorem__ipsum__ dolor sit amet"
        }
      ]
    }
  ]
}

@acstll
Copy link
Contributor Author

acstll commented Mar 20, 2026

The Lorem__ipsum__ dolor sit amet case is now handled, updated in 366c5f4

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a temporary remark AST transformer to undo incorrectly-parsed underscore emphasis/strong in “intraword” contexts (e.g., Salesforce API identifiers), aligning EuiMarkdownFormat output closer to the CommonMark rule for _ delimiters until remark-parse is upgraded.

Changes:

  • Added remark_intraword_underscore plugin to flatten misapplied emphasis/strong nodes back into plain text underscores in intraword scenarios.
  • Added RTL/Jest coverage validating Salesforce-style identifiers render without <em>/<strong> while standalone _italic_/__bold__ still work.
  • Registered the plugin in default parsing plugins, updated plugin count assertions, and added a changelog entry.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
packages/eui/src/components/markdown_editor/plugins/remark/remark_intraword_underscore.ts New remark plugin that rewrites intraword underscore emphasis/strong nodes into text.
packages/eui/src/components/markdown_editor/plugins/remark/remark_intraword_underscore.test.tsx Tests verifying identifiers with underscores are preserved as plain text.
packages/eui/src/components/markdown_editor/plugins/markdown_default_plugins/parsing_plugins.ts Registers the new parsing plugin in the default list.
packages/eui/src/components/markdown_editor/plugins/markdown_default_plugins/plugins.test.ts Updates default plugin count expectations.
packages/eui/changelogs/upcoming/9408.md Changelog entry for the bug fix.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +47 to +52
} else if (isEmphasisOrStrong(child)) {
inner += flattenToText(child);
} else {
// Contains non-text children (links, images, etc.) — leave emphasis intact
return delimiter + collectText(node) + delimiter;
}
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flattenToText() claims to "leave emphasis intact" when encountering non-text children, but it still returns a string and processParent() always replaces the node with a text node. This will strip nested nodes like inlineCode, link, etc. and can even drop their content (e.g. inlineCode has value but no children, so collectText() returns an empty string). Consider bailing out of the replacement when non-text descendants are present (e.g. have flattenToText return null and skip splice), or otherwise ensure all literal node types are preserved and non-text nodes are not lost.

Copilot uses AI. Check for mistakes.
Comment on lines +72 to +80
if (isEmphasisOrStrong(child) && isIntraword(parent, i, source)) {
const textValue = flattenToText(child);
const replacement: TextNode = {
type: 'text',
value: textValue,
} as TextNode;

parent.children.splice(i, 1, replacement);
modified = true;
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replacement text node created here drops the original node’s position. EuiMarkdownEditor relies on AST position.start/end.offset for cursor tracking and node replacement, so losing positions can break selection behavior. Consider copying child.position onto replacement (and ensuring it still reflects the same source span).

Copilot uses AI. Check for mistakes.
Comment on lines +177 to +179
(parent.children[i] as TextNode).value += (
parent.children[i + 1] as TextNode
).value;
Copy link

Copilot AI Mar 20, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mergeAdjacentText() concatenates adjacent text node values but does not update their position ranges. If the first node keeps its original position.end, the merged node’s offsets will no longer cover the full text span, which can break cursor-to-AST mapping in the markdown editor. When merging, update the kept node’s position.end (and related fields) based on the removed node’s end position.

Suggested change
(parent.children[i] as TextNode).value += (
parent.children[i + 1] as TextNode
).value;
const current = parent.children[i] as TextNode & { position?: any };
const next = parent.children[i + 1] as TextNode & { position?: any };
current.value += next.value;
// Preserve and extend positional information so the merged node
// covers the full text span of both original nodes.
if (current.position && next.position && next.position.end) {
current.position.end = next.position.end;
}

Copilot uses AI. Check for mistakes.
@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @acstll

@elasticmachine
Copy link
Collaborator

💚 Build Succeeded

History

cc @acstll

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

support-duty-flywheel Label for PRs, see eui-private #310

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[EuiMarkdownFormat] Renderer misinterprets Salesforce API object

5 participants