🐛 [PANA-7217] Limit the size of the string table in session replay recordings #4544

Merged
sethfowler-datadog merged 2 commits into v7 from seth.fowler/PANA-7217-limit-the-size-of-the-string-table
Apr 29, 2026

@sethfowler-datadog
Contributor

Motivation

The new DOM serialization algorithm in v7 of the browser SDK constructs a string table, allowing strings to be deduplicated across records. This greatly improves the space efficiency of session replay recordings.

There's a known issue with this string table: if a view lasts for a very long time, the string table can grow quite large. We should cap the size of the string table to bound the amount of memory the recording process uses.

In the short term, the easiest approach is to just stop adding entries to the string table once it grows above an arbitrary maximum size, and that's what this PR implements. However, this is a bit of a bummer, because it means that views that hit this threshold will not benefit from the string table as much as they otherwise could. It would likely be better to reset the string table when it gets too large, repopulating it from that point. That change is a bit more complicated, since it requires a tweak to the data format (we should be explicit about when the reset happens to ensure that the recorder and the player agree), so I don't want to start with it, but I plan to revisit this issue in the near future.

Changes

This PR adds a StringIdConstants.SOFT_MAX_SIZE constant defining the size limit for the string table. This limit is set arbitrarily for now; I plan to add telemetry to gather data about how large these string tables are in practice, and I'll adjust the limit in the future based on that data.

The size limit is enforced by ChangeEncoder, which is the infrastructure that actually replaces strings in the recorded event stream with references to the string table. The data format was designed to allow either a literal string or a string table reference in any position where a string can be used, so it's fine if ChangeEncoder simply declines to replace strings once the size limit is hit. Even if the string table is maxed out, a string can still be replaced with a string table reference if the string is already in the table, so we still get value from the string table even in this situation.
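
To make the behavior described above concrete, here is a minimal sketch of a capped string table. It is not the SDK's actual code: `CappedStringTable`, `encode`, `StringOrRef`, and the limit of 3 are hypothetical names and values chosen for illustration, and the real `ChangeEncoder` and `StringIdConstants.SOFT_MAX_SIZE` may look quite different.

```typescript
// Illustrative sketch only. The class, method, and type names are hypothetical;
// the PR names only ChangeEncoder and StringIdConstants.SOFT_MAX_SIZE.
const SOFT_MAX_SIZE = 3 // tiny assumed value for the demo; the real limit is not stated in the PR

// Any position that accepts a string holds either the literal string
// or a numeric reference into the string table.
type StringOrRef = string | number

class CappedStringTable {
  private readonly ids = new Map<string, number>()

  get size(): number {
    return this.ids.size
  }

  encode(value: string): StringOrRef {
    // Strings already in the table are always replaced by a reference,
    // even after the table has reached its limit.
    const existing = this.ids.get(value)
    if (existing !== undefined) {
      return existing
    }
    // Once the limit is reached, stop adding entries and emit the literal.
    if (this.ids.size >= SOFT_MAX_SIZE) {
      return value
    }
    // Otherwise add the string and emit its new id. A real format would also
    // have to ship the new entry to the player; this sketch omits that step.
    const id = this.ids.size
    this.ids.set(value, id)
    return id
  }
}

const table = new CappedStringTable()
const encoded = ['div', 'span', 'div', 'p', 'a', 'span', 'a'].map((s) => table.encode(s))
console.log(encoded) // [0, 1, 0, 2, 'a', 1, 'a']
```

In this run, 'a' arrives after the table is full and stays a literal every time, while 'span' is still replaced by its existing reference. That is the residual benefit described above.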

Checklist

  • Tested locally
  • Tested on staging
  • Added unit tests for this change.
  • Added e2e/integration tests for this change.
  • Updated documentation and/or relevant AGENTS.md file

@sethfowler-datadog sethfowler-datadog requested review from a team as code owners April 29, 2026 11:25
Comment thread packages/rum/src/domain/record/serialization/changeEncoder.spec.ts Outdated
@cit-pr-commenter-54b7da

cit-pr-commenter-54b7da Bot commented Apr 29, 2026

Bundles Sizes Evolution

| 📦 Bundle Name | Base Size | Local Size | 𝚫 | 𝚫% | Status |
| --- | --- | --- | --- | --- | --- |
| Rum | 168.98 KiB | 168.98 KiB | 0 B | 0.00% | |
| Rum Profiler | 6.09 KiB | 6.09 KiB | 0 B | 0.00% | |
| Rum Recorder | 21.25 KiB | 21.28 KiB | +25 B | +0.11% | |
| Logs | 54.54 KiB | 54.54 KiB | 0 B | 0.00% | |
| Rum Slim | 127.34 KiB | 127.34 KiB | 0 B | 0.00% | |
| Worker | 22.99 KiB | 22.99 KiB | 0 B | 0.00% | |

🚀 CPU Performance

| Action Name | Base CPU Time (ms) | Local CPU Time (ms) | 𝚫% |
| --- | --- | --- | --- |
| RUM - add global context | 0.0022 | 0.0025 | +13.64% |
| RUM - add action | 0.0128 | 0.0154 | +20.31% |
| RUM - add error | 0.0117 | 0.0113 | -3.42% |
| RUM - add timing | 0.0005 | 0.0005 | 0.00% |
| RUM - start view | 0.0136 | 0.0116 | -14.71% |
| RUM - start/stop session replay recording | 0.0011 | 0.0007 | -36.36% |
| Logs - log message | 0.0264 | 0.0158 | -40.15% |

🧠 Memory Performance

| Action Name | Base Memory Consumption | Local Memory Consumption | 𝚫 |
| --- | --- | --- | --- |
| RUM - add global context | 38.14 KiB | 38.71 KiB | +579 B |
| RUM - add action | 64.23 KiB | 64.78 KiB | +564 B |
| RUM - add timing | 37.53 KiB | 38.42 KiB | +916 B |
| RUM - add error | 70.36 KiB | 68.56 KiB | -1.80 KiB |
| RUM - start/stop session replay recording | 41.54 KiB | 41.79 KiB | +254 B |
| RUM - start view | 477.13 KiB | 480.02 KiB | +2.89 KiB |
| Logs - log message | 54.31 KiB | 54.58 KiB | +281 B |

🔗 RealWorld

@datadog-datadog-prod-us1

Tests

🎉 All green!

❄️ No new flaky tests detected
🧪 All tests passed

🎯 Code Coverage (details)
Patch Coverage: 57.14%
Overall Coverage: 77.55% (-0.02%)

This comment will be updated automatically if new data arrives.
🔗 Commit SHA: 6a3474c

@sethfowler-datadog sethfowler-datadog merged commit ef59231 into v7 Apr 29, 2026
19 checks passed
@sethfowler-datadog sethfowler-datadog deleted the seth.fowler/PANA-7217-limit-the-size-of-the-string-table branch April 29, 2026 13:54
@github-actions github-actions Bot locked and limited conversation to collaborators Apr 29, 2026
3 participants