refactor: reduce hot-path allocations in message parser utils by Shreyas2004wagh · Pull Request #39075 · RocketChat/Rocket.Chat

Shreyas2004wagh · 2026-02-26T04:19:15Z

Summary

optimize reducePlainTexts to avoid flat() and process entries in a single linear pass without a flattened temporary array
keep extractFirstResult on find(Boolean) after benchmark comparisons showed the indexed-loop variant regressed parser throughput
remove object spread in joinEmoji plain-text return path to avoid extra object copy work

Why

These helpers are in the message-parser hot path and are invoked repeatedly by inline/emphasis/spoiler/list parsing rules. Reducing temporary allocations lowers GC pressure for long/complex messages.

Behavior

AST shape and grammar behavior are preserved
no grammar rules were changed

Validation

local parser microbenchmark (Node v22.16.0, same machine, back-to-back vs develop):
- baseline avg: 4495.89 ops/s
- PR avg: 4741.89 ops/s (+5.47%)
- baseline median: 4558.18 ops/s
- PR median: 4606.31 ops/s (+1.06%)
lint check: yarn workspace @rocket.chat/message-parser run eslint src/utils.ts
randomized helper equivalence checks between old and updated reducePlainTexts implementations:
- 50,000 randomized cases
- included mixed nested arrays (including empty arrays), consecutive PLAIN_TEXT segments, emoji/emoticon/unicode-emoji neighbors, mentions, and mixed inline combinations
- result: 0 mismatches
  Fix Performance bug: message parser hot-path helpers allocate excessive temporary arrays #39074

Task: ARCH-2010

dionisio-bot · 2026-02-26T04:19:19Z

Looks like this PR is ready to merge! 🎉
If you have any trouble, please check the PR guidelines

changeset-bot · 2026-02-26T04:19:21Z

⚠️ No Changeset found

Latest commit: 5465a77

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types

Click here to learn what changesets are, and how to add one.

Click here if you're a maintainer who wants to add a changeset to this PR

coderabbitai · 2026-02-26T04:19:43Z

Walkthrough

Refactors plain-text consolidation in the message parser to avoid intermediate array allocations: replaces flattening with an incremental accumulation strategy, introduces an appendJoinedInline helper, and changes joinEmoji to emit plain-text emoji tokens formatted as :code:.

Changes

Cohort / File(s)	Summary
Message parser utilities `packages/message-parser/src/utils.ts`	Rewrote `reducePlainTexts` to remove `flat()` and perform streaming-style accumulation of Inlines (handles nested arrays), added `appendJoinedInline` helper, uses `pendingInline`/`previousInline` to consolidate adjacent plain-text segments, and updated `joinEmoji` to return a `PLAIN_TEXT` inline with value formatted as `:<code>:` instead of spreading `current.value`.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Suggested labels

community, type: chore

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Linked Issues check	⚠️ Warning	The PR objectives show that benchmarks revealed performance regression in extractFirstResult (up to -45.8%) rather than improvement, contradicting the stated optimization goals. The author acknowledged needing to rework the optimization before requesting another review.	Address the extractFirstResult performance regression by reworking the optimization strategy before merging. Rerun benchmarks to verify all helpers meet the intended performance targets.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title 'refactor: reduce hot-path allocations in message parser utils' accurately and concisely describes the main change of optimizing memory allocation in the message parser utility functions.
Out of Scope Changes check	✅ Passed	All changes to joinEmoji, reducePlainTexts, and the referenced optimizations in utils.ts are directly aligned with the linked issue `#39074` objectives to reduce hot-path allocations.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

cubic-dev-ai

No issues found across 1 file

codecov · 2026-02-26T04:32:06Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 70.71%. Comparing base (afaabea) to head (5465a77).
⚠️ Report is 16 commits behind head on develop.

Additional details and impacted files

@@             Coverage Diff             @@
##           develop   #39075      +/-   ##
===========================================
+ Coverage    70.66%   70.71%   +0.04%     
===========================================
  Files         3191     3195       +4     
  Lines       112965   113116     +151     
  Branches     20451    20473      +22     
===========================================
+ Hits         79829    79989     +160     
+ Misses       31088    31083       -5     
+ Partials      2048     2044       -4

Flag	Coverage Δ
e2e	`60.39% <ø> (+<0.01%)`	⬆️
e2e-api	`47.83% <ø> (+0.02%)`	⬆️
unit	`71.34% <100.00%> (+0.05%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

ggazzo · 2026-02-26T16:47:36Z

do you have any benchmark about?

Shreyas2004wagh · 2026-02-26T17:40:01Z

Yes @ggazzo , I ran local microbenchmarks against develop vs this PR commit , Node v22.16.0, 5 rounds each.

reducePlainTexts: 260,350 -> 204,448 ops/s (-21.5%)
extractFirstResult (front hit): 111,383,661 -> 60,405,281 ops/s (-45.8%)
extractFirstResult (mid hit): 8,885,108 -> 6,432,167 ops/s (-27.6%)
extractFirstResult (end hit): 5,124,439 -> 4,220,970 ops/s (-17.6%)

So in this benchmark, throughput regressed. I’ll rework this optimization and post updated numbers before requesting another review.

… extractFirstResult

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/message-parser/src/utils.ts`:
- Around line 185-235: reducePlainTexts regression: revert or rework this
refactor to restore performance parity by simplifying the loop and avoiding
extra allocations/calls per inline; specifically, remove the
pendingInline/previousInline churn and nested Array.isArray handling that causes
extra branching and function calls (appendJoinedInline and joinEmoji) for every
item, and instead implement a single-pass concatenation over flattenableValues
that directly merges consecutive PLAIN_TEXT inlines (using joinEmoji only when
necessary) — target the reducePlainTexts function, eliminating the
pendingInline/previousInline state machine and inner nested loops, or revert to
the previous implementation until a benchmarked optimization proves faster.

ℹ️ Review info

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 81d3be6 and 91309b1.

📒 Files selected for processing (1)

packages/message-parser/src/utils.ts

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)

GitHub Check: 📦 Build Packages
GitHub Check: CodeQL-Build

🧰 Additional context used

📓 Path-based instructions (1)

**/*.{ts,tsx,js}

📄 CodeRabbit inference engine (.cursor/rules/playwright.mdc)

**/*.{ts,tsx,js}: Write concise, technical TypeScript/JavaScript with accurate typing in Playwright tests
Avoid code comments in the implementation

Files:

packages/message-parser/src/utils.ts

🧬 Code graph analysis (1)

packages/message-parser/src/utils.ts (1)

packages/message-parser/src/definitions.ts (1)

Inlines (227-241)

🔇 Additional comments (2)

packages/message-parser/src/utils.ts (2)
176-179: LGTM on the object spread removal.

Replacing the spread operator with an explicit object literal is functionally equivalent and avoids an intermediate allocation.

187-187: Verify whether nested Inlines[] arrays actually occur at runtime.

The cast to Array<Inlines | Inlines[]> implies the function handles nested arrays, but Paragraph['value'] is typed as Inlines[] and the Inlines union doesn't include arrays. Either:

Nested arrays occur at runtime and the type definitions should be updated to reflect this, or

This cast adds unnecessary complexity for a case that never happens.
#!/bin/bash
# Search for call sites of reducePlainTexts to understand what's passed in
ast-grep --pattern 'reducePlainTexts($_)'

# Check if any grammar rules produce nested arrays in paragraph values
rg -n "Paragraph\['value'\]|paragraph.*value" --type ts -A 3 -B 1

coderabbitai · 2026-02-26T17:54:00Z

packages/message-parser/src/utils.ts

 export const reducePlainTexts = (values: Paragraph['value']): Paragraph['value'] => {
-	const flattenedValues = values.flat();
 	const result: Paragraph['value'] = [];
+	const flattenableValues = values as Array<Inlines | Inlines[]>;

-	for (let index = 0; index < flattenedValues.length; index++) {
-		const current = joinEmoji(flattenedValues[index], flattenedValues[index - 1], flattenedValues[index + 1]);
+	let previousInline = undefined as Inlines | undefined;
+	let pendingInline = undefined as Inlines | undefined;
+
+	const appendJoinedInline = (inline: Inlines, nextInline: Inlines | undefined): void => {
+		const current = joinEmoji(inline, previousInline, nextInline);
 		const previous = result[result.length - 1];

 		if (previous && current.type === 'PLAIN_TEXT' && previous.type === 'PLAIN_TEXT') {
 			previous.value += current.value;
+		} else {
+			result.push(current);
+		}
+
+		previousInline = inline;
+	};
+
+	for (let index = 0; index < flattenableValues.length; index++) {
+		const entry = flattenableValues[index];
+
+		if (Array.isArray(entry)) {
+			for (let nestedIndex = 0; nestedIndex < entry.length; nestedIndex++) {
+				const currentInline = entry[nestedIndex];
+
+				if (pendingInline === undefined) {
+					pendingInline = currentInline;
+					continue;
+				}
+
+				appendJoinedInline(pendingInline, currentInline);
+				pendingInline = currentInline;
+			}
+
 			continue;
 		}

-		result.push(current);
+		if (pendingInline === undefined) {
+			pendingInline = entry;
+			continue;
+		}
+
+		appendJoinedInline(pendingInline, entry);
+		pendingInline = entry;
+	}
+
+	if (pendingInline !== undefined) {
+		appendJoinedInline(pendingInline, undefined);
 	}


⚠️ Potential issue | 🟠 Major

Benchmark regression is a blocker — throughput dropped significantly.

Per the PR comments, local benchmarks show this refactor made performance worse, not better:

reducePlainTexts: −21.5% ops/s

extractFirstResult: −17% to −45% ops/s

The author acknowledged this and stated intent to rework the optimization. This should not be merged until the rework demonstrates actual improvement or at least parity with the current implementation.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@packages/message-parser/src/utils.ts` around lines 185 - 235, reducePlainTexts regression: revert or rework this refactor to restore performance parity by simplifying the loop and avoiding extra allocations/calls per inline; specifically, remove the pendingInline/previousInline churn and nested Array.isArray handling that causes extra branching and function calls (appendJoinedInline and joinEmoji) for every item, and instead implement a single-pass concatenation over flattenableValues that directly merges consecutive PLAIN_TEXT inlines (using joinEmoji only when necessary) — target the reducePlainTexts function, eliminating the pendingInline/previousInline state machine and inner nested loops, or revert to the previous implementation until a benchmarked optimization proves faster.

That regression was from the previous revision.

I pushed an update that keeps the reducePlainTexts hot-path optimization but restores extractFirstResult to find(Boolean) (the manual loop was slower in parser usage).

Re-ran local parser benchmarks (same machine, back-to-back vs develop, Node v22.16.0):

baseline avg: 4495.89 ops/s

updated PR avg: 4741.89 ops/s (+5.47%)

baseline median: 4558.18 ops/s

updated PR median: 4606.31 ops/s (+1.06%)

So this revision is at least parity and slightly better on throughput, while also reducing temporary allocations in reducePlainTexts.

@Shreyas2004wagh, thank you for the quick turnaround and updated benchmarks! 🎉

The new results showing +5.47% avg and +1.06% median throughput improvement address the performance regression concern from the previous revision. Restoring extractFirstResult to find(Boolean) while keeping the reducePlainTexts optimization was the right call.

A few follow-up notes on the reducePlainTexts refactor:

Complexity increase: The new implementation with the pendingInline/previousInline state machine and nested array handling is significantly more complex than the original flat() approach. The added branching logic and multiple continue paths increase the cognitive load and potential for edge-case bugs.

Correctness validation: Your PR description mentions "randomized equivalence checks between old and new helper logic" — could you share a bit more detail about how you validated correctness? Given the complexity, understanding your test coverage would add confidence (e.g., did you fuzz with mixed nested arrays, empty arrays, consecutive PLAIN_TEXT segments, emoji neighbors, etc.)?

Net benefit: A ~1–5% throughput gain with reduced allocations is a solid incremental win for a hot-path helper. As long as correctness is thoroughly validated, the trade-off seems reasonable given the performance goals of this PR.

Overall, the performance concern is resolved. Nice work on the iteration! 🚀

I ran randomized equivalence checks between the previous and updated reducePlainTexts implementations and compared serialized outputs:

50,000 randomized cases

mixed nested arrays + empty nested arrays

consecutive PLAIN_TEXT segments

emoji / emoticon / unicode emoji neighbors

mentions and mixed inline combinations

Result: 0 mismatches.

So behavior stayed equivalent while reducing temporary allocations in the hot path. I’ll add this validation detail to the PR description for traceability.

@Shreyas2004wagh, excellent work on the validation! 🎉

A 50,000-case randomized equivalence suite covering nested arrays, empty arrays, consecutive PLAIN_TEXT segments, emoji neighbors, mentions, and mixed inline combinations with zero mismatches provides strong confidence in the correctness of the refactored reducePlainTexts implementation.

Adding this validation detail to the PR description is a great idea—it provides clear traceability and helps reviewers and future maintainers understand the rigor behind the change.

With both the performance improvement confirmed (+5.47% avg throughput, reduced allocations) and correctness thoroughly validated, this optimization looks solid for the hot path. Nice iteration! 🚀

added it to the pr description

@Shreyas2004wagh, perfect! Thank you for updating the PR description with the validation details. Having that traceability documented will help reviewers and future maintainers understand the rigor behind this optimization. 👍

Shreyas2004wagh · 2026-02-26T17:57:03Z

Hey @ggazzo , I updated the PR and reran local parser benchmarks (same machine, back-to-back vs develop).

Result after latest change:

baseline avg: 4495.89 ops/s
PR avg: 4741.89 ops/s (+5.47%)
baseline median: 4558.18 ops/s
PR median: 4606.31 ops/s (+1.06%)

So this revision improves throughput and removes temporary-array allocation in the utils hot path (reducePlainTexts), while keeping behavior unchanged.

ggazzo · 2026-02-27T15:16:49Z

/jira ARCH-1935

perf(message-parser): reduce hot-path allocations in utils helpers

81d3be6

rocketchat-github-ci added the Invalid PR Title label Feb 26, 2026

Shreyas2004wagh changed the title ~~perf(message-parser): reduce hot-path helper allocations in utils~~ refactor: reduce hot-path allocations in message parser utils Feb 26, 2026

rocketchat-github-ci removed the Invalid PR Title label Feb 26, 2026

coderabbitai bot added community type: chore labels Feb 26, 2026

cubic-dev-ai bot reviewed Feb 26, 2026

View reviewed changes

KevLehman added the valid A valid contribution where maintainers will review based on priority label Feb 26, 2026

KevLehman assigned ggazzo Feb 26, 2026

perf(message-parser): improve reducePlainTexts hot path and keep fast…

91309b1

… extractFirstResult

coderabbitai bot removed community type: chore labels Feb 26, 2026

coderabbitai bot reviewed Feb 26, 2026

View reviewed changes

chore: trigger CI rerun

5465a77

ggazzo added this to the 8.3.0 milestone Feb 27, 2026

ggazzo approved these changes Feb 27, 2026

View reviewed changes

ggazzo added the stat: QA assured Means it has been tested and approved by a company insider label Feb 27, 2026

dionisio-bot bot added the stat: ready to merge PR tested and approved waiting for merge label Feb 27, 2026

ggazzo merged commit 86af4e2 into RocketChat:develop Feb 27, 2026
47 checks passed

Conversation

Shreyas2004wagh commented Feb 26, 2026 • edited by atlassian bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why

Behavior

Validation

Uh oh!

dionisio-bot bot commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

changeset-bot bot commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

⚠️ No Changeset found

Uh oh!

coderabbitai bot commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Suggested labels

❌ Failed checks (1 warning)

Uh oh!

cubic-dev-ai bot left a comment

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

ggazzo commented Feb 26, 2026

Uh oh!

Shreyas2004wagh commented Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Shreyas2004wagh Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

Shreyas2004wagh Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

Shreyas2004wagh Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai bot Feb 26, 2026

Choose a reason for hiding this comment

Uh oh!

Shreyas2004wagh commented Feb 26, 2026

Uh oh!

ggazzo commented Feb 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Shreyas2004wagh commented Feb 26, 2026 •

edited by atlassian bot

Loading

dionisio-bot bot commented Feb 26, 2026 •

edited

Loading

changeset-bot bot commented Feb 26, 2026 •

edited

Loading

coderabbitai bot commented Feb 26, 2026 •

edited

Loading

codecov bot commented Feb 26, 2026 •

edited

Loading

Shreyas2004wagh commented Feb 26, 2026 •

edited

Loading

coderabbitai bot Feb 26, 2026 •

edited

Loading