perf: cache backtrace line parsing and Line object creation#2905
Conversation
⚠️ Needs closer review: introduces class-level mutable caches.

Add two layers of caching to Backtrace::Line.parse to avoid redundant work when the same backtrace lines appear across multiple exceptions (which is the common case in production):

1. Parse data cache: caches the extracted (file, number, method, module_name) tuple by the raw unparsed line string. Avoids re-running the regex match and string extraction on a cache hit.
2. Line object cache: caches complete Line objects by (unparsed_line, in_app_pattern) pair. Avoids creating new Line objects entirely when the same line has been seen with the same pattern.

Both caches are bounded to 2048 entries and clear entirely when the limit is reached (simple, no LRU overhead). The compiled in_app_pattern Regexp in Backtrace.parse is also cached to avoid Regexp.new on every exception capture.

Safety: Line objects are effectively immutable after creation (all attributes are set in initialize and only read afterwards). The parse inputs are deterministic: the same unparsed_line always produces the same parsed data.
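The bound-and-clear strategy described above can be sketched as follows. This is a hypothetical illustration, not the PR's actual code; the `ParseCache` class and its API are invented for the example:

```ruby
# Hypothetical minimal sketch of the bound-and-clear caching strategy:
# the cache grows to a fixed limit, then is dropped wholesale rather
# than evicting entries one at a time.
class ParseCache
  MAX_SIZE = 2048

  def initialize
    @store = {}
  end

  # Returns the cached value for key, computing it with the block on a miss.
  # When the cache reaches MAX_SIZE it is cleared entirely (no LRU tracking).
  def fetch(key)
    @store.fetch(key) do
      @store.clear if @store.size >= MAX_SIZE
      @store[key] = yield
    end
  end

  def size
    @store.size
  end
end
```

On a hit, `Hash#fetch` returns without ever invoking the block, so the expensive parse work is skipped entirely; wholesale clearing trades occasional re-parsing for zero per-access bookkeeping.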
Reduce total allocated memory from 442k to 206k bytes (-53.5%) and
objects from 3305 to 1538 (-53.5%) per Rails exception capture.
All changes are internal optimizations with zero behavior changes.
Key optimizations:
- Cache longest_load_path and compute_filename results (class-level,
invalidated on $LOAD_PATH changes)
- Cache backtrace line parsing and Line/Frame object creation (bounded
at 2048 entries)
- Optimize LineCache with Hash#fetch, direct context setting, and
per-(filename, lineno) caching
- Avoid unnecessary allocations: indexed regex captures, match? instead
of =~, byteslice, single-pass iteration in StacktraceBuilder
- RequestInterface: avoid env.dup, cache header name transforms, ASCII
fast-path for encoding
- Scope/BreadcrumbBuffer: shallow dup instead of deep_dup where inner
values are not mutated after duplication
- Hub#add_breadcrumb: hint default nil instead of {} to avoid empty
hash allocation
See sub-PRs for detailed review by risk level:
- #2902 (low risk) — hot path allocation avoidance
- #2903 (low risk) — LineCache optimization
- #2904 (medium risk) — load path and filename caching
- #2905 (needs review) — backtrace parse caching
- #2906 (needs review) — Frame object caching
- #2907 (needs review) — Scope/BreadcrumbBuffer shallow dup
- #2908 (medium risk) — RequestInterface optimizations
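One of the allocation-avoidance patterns listed above, `match?` instead of `=~`, can be illustrated in isolation. The helper names below are hypothetical, not taken from the SDK; the underlying fact is that `Regexp#match?` answers yes/no without allocating a `MatchData` object or setting `$~`:

```ruby
# Regexp#match? returns a boolean without allocating MatchData,
# whereas =~ allocates a MatchData on every successful match.
NUMERIC = /\A\d+\z/

def numeric_with_match?(str)
  NUMERIC.match?(str)          # boolean only, no MatchData allocation
end

def numeric_with_tilde?(str)
  !(str =~ NUMERIC).nil?       # allocates MatchData, sets $~ as a side effect
end
```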
```ruby
@parse_cache = {}

# Cache complete Line objects by (unparsed_line, in_app_pattern) to avoid
# re-creating identical Line objects across exceptions.
@line_object_cache = {}
```
Bug: The new caching mechanisms in Backtrace::Line use non-thread-safe Hash objects, creating race conditions in multi-threaded environments during cache read, write, and clear operations.
Severity: MEDIUM
Suggested Fix
Replace the plain Hash objects used for @parse_cache and @line_object_cache with a thread-safe alternative. Since the project already depends on concurrent-ruby, using Concurrent::Map is the recommended approach, as it is already used for other caches in the codebase.
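A stdlib-only sketch of what a thread-safe replacement could look like. The review's actual recommendation is `Concurrent::Map` from concurrent-ruby; `SafeCache` here is a hypothetical Mutex-based stand-in that shows the invariant that matters: the size check, clear, and write must happen atomically.

```ruby
# Hypothetical Mutex-based bounded cache; in the SDK itself,
# Concurrent::Map would be the natural choice since concurrent-ruby
# is already a dependency.
class SafeCache
  def initialize(max_size)
    @max_size = max_size
    @store = {}
    @mutex = Mutex.new
  end

  # The whole check-size / clear / write sequence runs under one lock,
  # so concurrent callers cannot interleave and lose entries.
  def fetch(key)
    @mutex.synchronize do
      return @store[key] if @store.key?(key)
      @store.clear if @store.size >= @max_size
      @store[key] = yield
    end
  end
end
```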
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid.
Location: sentry-ruby/lib/sentry/backtrace/line.rb#L38-L42
Potential issue: The new caching mechanisms (`@parse_cache`, `@line_object_cache`) in `Backtrace::Line` use plain `Hash` objects, which are not thread-safe. In a multi-threaded environment, concurrent read/write operations can lead to race conditions. Specifically, the non-atomic sequence of checking the cache size, clearing it, and then writing a new entry can result in lost entries and an inconsistent cache state. This can degrade cache effectiveness and potentially cause errors on non-GIL Ruby implementations like JRuby.
Cursor Bugbot has reviewed your changes and found 2 potential issues.
```diff
- Regexp.new("^(#{project_root}/)?#{app_dirs_pattern}")
+ cache_key = app_dirs_pattern
+ in_app_pattern = @in_app_pattern_cache.fetch(cache_key) do
+   @in_app_pattern_cache[cache_key] = Regexp.new("^(#{project_root}/)?#{app_dirs_pattern}")
```
Cache key omits project_root, returning stale Regexp
High Severity
The @in_app_pattern_cache key is only app_dirs_pattern, but the cached Regexp incorporates both project_root and app_dirs_pattern. If project_root changes while app_dirs_pattern stays the same (e.g., across test runs, SDK reconfiguration, or different configurations), the cache returns a stale Regexp built with the old project_root. This causes incorrect in_app classification of backtrace frames, affecting error grouping and display in Sentry.
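A sketch of the direction this fix suggests: key the cache on every input that feeds the compiled Regexp, so a changed project_root can never return a stale pattern. The module and method names below are hypothetical, for illustration only:

```ruby
# Hypothetical illustration: using [project_root, app_dirs_pattern] as
# the composite cache key, since both values are interpolated into the
# compiled Regexp.
module PatternCache
  @in_app_pattern_cache = {}

  def self.in_app_pattern_for(project_root, app_dirs_pattern)
    cache_key = [project_root, app_dirs_pattern]
    @in_app_pattern_cache[cache_key] ||=
      Regexp.new("^(#{project_root}/)?#{app_dirs_pattern}")
  end
end
```

With the composite key, a reconfigured project_root compiles a fresh Regexp instead of reusing the one built for the old root.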
```ruby
# Cache complete Line objects by (unparsed_line, in_app_pattern) to avoid
# re-creating identical Line objects across exceptions.
@line_object_cache = {}
```
Thread-unsafe plain Hash caches risk corruption on JRuby
Medium Severity
The three new class-level caches (@parse_cache, @line_object_cache, @in_app_pattern_cache) use plain {} hashes, but they're accessed concurrently from multiple threads during exception capture. On JRuby (which the SDK explicitly supports via JAVA_INPUT_FORMAT), there is no GIL, and concurrent reads/writes to a plain Hash can corrupt its internal structure. The existing line_cache in the same file already uses Concurrent::Map for this exact reason.
Part of #2901 (reduce memory allocations by ~53%)
Changes
Add two layers of caching to
Backtrace::Line.parseto avoid redundant work when the same backtrace lines appear across multiple exceptions (which is the common case in production):Parse data cache: Caches the extracted
(file, number, method, module_name)tuple by the raw unparsed line string. Avoids re-running the regex match and string extraction on cache hit.Line object cache: Caches complete
Lineobjects by(unparsed_line, in_app_pattern)pair. Avoids creating new Line objects entirely when the same line has been seen with the same pattern.Both caches are bounded to 2048 entries and clear entirely when the limit is reached (simple, no LRU overhead).
Also caches the compiled
in_app_patternRegexp inBacktrace.parseto avoidRegexp.newon every exception capture.Review focus