🐛 [PANA-7217] Limit the size of the string table in session replay recordings #4544
Merged
sethfowler-datadog merged 2 commits into v7 from Apr 29, 2026
BenoitZugmeyer
approved these changes
Apr 29, 2026
🎉 All green! ❄️ No new flaky tests detected. 🔗 Commit SHA: 6a3474c
mormubis
approved these changes
Apr 29, 2026
Motivation
The new DOM serialization algorithm in v7 of the browser SDK constructs a string table, allowing strings to be deduplicated across records. This greatly improves the space efficiency of session replay recordings.
There's a known issue with this string table: if a view lasts for a very long time, the string table can grow to be quite large. We should apply a limit to the size of the string table to ensure that we limit the amount of memory the recording process uses.
In the short term, the easiest approach is to just stop adding entries to the string table once it grows above an arbitrary maximum size, and that's what this PR implements. However, this is a bit of a bummer, because it means that views that hit this threshold will not benefit from the string table as much as they otherwise could. It would likely be better to reset the string table when it gets too large, repopulating it from that point. That change is a bit more complicated, since it requires a tweak to the data format (we should be explicit about when the reset happens to ensure that the recorder and the player agree), so I don't want to start with it, but I plan to revisit this issue in the near future.
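The short-term approach described above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual code: the class name CappedStringTable, the EncodedString type, and the encode method are hypothetical, and the limit of 2 is chosen only to keep the demonstration short. The sketch also assumes that a string is represented either as a literal or as a numeric table index, and it leaves out how table additions are communicated to the player.

```typescript
// Hypothetical sketch: none of these names are taken from the browser SDK.
// A string is encoded either as a literal or as an index into the table.
type EncodedString = string | number

class CappedStringTable {
  private readonly ids = new Map<string, number>()
  private readonly softMaxSize: number

  constructor(softMaxSize: number) {
    this.softMaxSize = softMaxSize
  }

  encode(value: string): EncodedString {
    // Strings already in the table can always be replaced by a reference,
    // even after the table has reached its limit.
    const existingId = this.ids.get(value)
    if (existingId !== undefined) {
      return existingId
    }
    // Once the limit is reached, stop adding entries and emit the literal.
    if (this.ids.size >= this.softMaxSize) {
      return value
    }
    const newId = this.ids.size
    this.ids.set(value, newId)
    return newId
  }

  get size(): number {
    return this.ids.size
  }
}

// Demonstration with an artificially small limit (assumption: 2 entries).
const demoTable = new CappedStringTable(2)
const demoOutput = ['div', 'span', 'p', 'div', 'p'].map((s) => demoTable.encode(s))
console.log(demoOutput) // [0, 1, 'p', 0, 'p']
```

Note that 'p' is emitted as a literal both times because it arrived after the table was full, which is the lost deduplication that a reset-based scheme would recover.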
Changes
This PR adds a StringIdConstants.SOFT_MAX_SIZE constant defining the size limit for the string table. This limit is set arbitrarily for now; I plan to add telemetry to gather data about how large these string tables are in practice, and I'll adjust the limit in the future based on that data.
The size limit is enforced by ChangeEncoder, which is the infrastructure that actually replaces strings in the recorded event stream with references to the string table. The data format was designed to allow either a literal string or a string table reference in any position where a string can be used, so it's fine if ChangeEncoder simply declines to replace strings once the size limit is hit. Even if the string table is maxed out, a string can still be replaced with a string table reference if the string is already in the table, so we still get value from the string table even in this situation.
Checklist