
perf: cache backtrace line parsing and Line object creation #2905

Open
HazAT wants to merge 1 commit into master from perf/backtrace-parse-caching

Conversation


HazAT (Member) commented Mar 18, 2026

⚠️ Needs closer review — class-level mutable caches with bounded size

Part of #2901 (reduce memory allocations by ~53%)

Changes

Add two layers of caching to Backtrace::Line.parse to avoid redundant work when the same backtrace lines appear across multiple exceptions (which is the common case in production):

  1. Parse data cache: Caches the extracted (file, number, method, module_name) tuple by the raw unparsed line string. Avoids re-running the regex match and string extraction on cache hit.

  2. Line object cache: Caches complete Line objects by (unparsed_line, in_app_pattern) pair. Avoids creating new Line objects entirely when the same line has been seen with the same pattern.

Both caches are bounded to 2048 entries and clear entirely when the limit is reached (simple, no LRU overhead).

Also caches the compiled in_app_pattern Regexp in Backtrace.parse to avoid Regexp.new on every exception capture.
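The two cache layers described above can be condensed into a stdlib-only sketch. Class, method, and regex names here are illustrative stand-ins, not the SDK's actual internals:

```ruby
# Illustrative sketch of a two-layer bounded cache for backtrace line parsing.
class LineParser
  MAX_CACHE_SIZE = 2048

  # Simplified stand-in for the SDK's backtrace line format.
  RUBY_INPUT_FORMAT = /
    ^ \s* (?<file>[^:]+) : (?<number>\d+)
    (?: :in \s [`'] (?<method>.*) ['`] )? \s* $
  /x

  @parse_cache = {}        # raw line => [file, number, method]
  @line_object_cache = {}  # [raw line, pattern] => frozen parsed result

  class << self
    def parse(unparsed_line, in_app_pattern)
      key = [unparsed_line, in_app_pattern]
      cached = @line_object_cache[key]
      return cached if cached

      file, number, method = parse_data(unparsed_line)
      result = {
        file: file, number: number, method: method,
        in_app: !!(file && in_app_pattern.match?(file))
      }.freeze

      # Bounded: clear everything at the limit (simple, no LRU bookkeeping).
      @line_object_cache.clear if @line_object_cache.size >= MAX_CACHE_SIZE
      @line_object_cache[key] = result
    end

    private

    # Layer 1: cache the regex extraction by raw line string.
    def parse_data(unparsed_line)
      cached = @parse_cache[unparsed_line]
      return cached if cached

      match = RUBY_INPUT_FORMAT.match(unparsed_line)
      data = match ? [match[:file], match[:number].to_i, match[:method]] : []
      @parse_cache.clear if @parse_cache.size >= MAX_CACHE_SIZE
      @parse_cache[unparsed_line] = data
    end
  end
end
```

On a repeated line with the same pattern, the second call returns the identical frozen object, skipping both the regex and the allocation.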

Review focus

  • Are 2048 entries a reasonable bound? In most apps the unique backtrace lines will be far fewer.
  • The full-clear strategy on limit hit — acceptable simplicity tradeoff?
  • Line objects are effectively immutable after creation — is this assumption safe across all usage patterns?

Safety: Line objects are effectively immutable after creation (all
attributes are set in initialize and only read afterwards). The parse
inputs are deterministic — same unparsed_line always produces the same
parsed data.
HazAT added a commit that referenced this pull request Mar 18, 2026
Reduce total allocated memory from 442k to 206k bytes (-53.5%) and
objects from 3305 to 1538 (-53.5%) per Rails exception capture.

All changes are internal optimizations with zero behavior changes.

Key optimizations:
- Cache longest_load_path and compute_filename results (class-level,
  invalidated on $LOAD_PATH changes)
- Cache backtrace line parsing and Line/Frame object creation (bounded
  at 2048 entries)
- Optimize LineCache with Hash#fetch, direct context setting, and
  per-(filename, lineno) caching
- Avoid unnecessary allocations: indexed regex captures, match? instead
  of =~, byteslice, single-pass iteration in StacktraceBuilder
- RequestInterface: avoid env.dup, cache header name transforms, ASCII
  fast-path for encoding
- Scope/BreadcrumbBuffer: shallow dup instead of deep_dup where inner
  values are not mutated after duplication
- Hub#add_breadcrumb: hint default nil instead of {} to avoid empty
  hash allocation
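A few of the allocation-avoidance idioms listed above can be shown in isolation (the inputs here are made up for demonstration):

```ruby
line = "app/models/user.rb:10:in `save'"

# Regexp#match? returns true/false without allocating a MatchData
# or setting $~, unlike =~.
is_code_line = /\.rb:\d+/.match?(line)

# String#byteslice slices by byte offset, skipping the character-index
# scan String#[] performs on multibyte strings.
prefix = line.byteslice(0, 3)

# Hash#fetch with a block runs the default computation only on a miss,
# avoiding a lookup-then-branch with an intermediate nil check.
cache = {}
value = cache.fetch(line) { cache[line] = line.split(":").first }
```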

See sub-PRs for detailed review by risk level:
- #2902 (low risk) — hot path allocation avoidance
- #2903 (low risk) — LineCache optimization
- #2904 (medium risk) — load path and filename caching
- #2905 (needs review) — backtrace parse caching
- #2906 (needs review) — Frame object caching
- #2907 (needs review) — Scope/BreadcrumbBuffer shallow dup
- #2908 (medium risk) — RequestInterface optimizations
Comment on lines +38 to +42
@parse_cache = {}

# Cache complete Line objects by (unparsed_line, in_app_pattern) to avoid
# re-creating identical Line objects across exceptions.
@line_object_cache = {}
Bug: The new caching mechanisms in Backtrace::Line use non-thread-safe Hash objects, creating race conditions in multi-threaded environments during cache read, write, and clear operations.
Severity: MEDIUM

Suggested Fix

Replace the plain Hash objects used for @parse_cache and @line_object_cache with a thread-safe alternative. Since the project already depends on concurrent-ruby, using Concurrent::Map is the recommended approach, as it is already used for other caches in the codebase.
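The race is in the non-atomic check-clear-write sequence. The review recommends Concurrent::Map (already a dependency); as a stdlib-only illustration of the same point, a Mutex can make the whole sequence atomic. Names here are hypothetical:

```ruby
# Sketch of a thread-safe bounded cache; the size check, possible clear,
# and write all happen under one lock, so no entry is lost to interleaving.
class SafeBoundedCache
  def initialize(limit)
    @limit = limit
    @store = {}
    @mutex = Mutex.new
  end

  # Fetch-or-compute atomically; the block runs only on a miss.
  def fetch(key)
    @mutex.synchronize do
      return @store[key] if @store.key?(key)
      @store.clear if @store.size >= @limit
      @store[key] = yield
    end
  end

  def size
    @mutex.synchronize { @store.size }
  end
end
```

Concurrent::Map would avoid the global lock on reads, which is why the review prefers it for a hot path.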

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: sentry-ruby/lib/sentry/backtrace/line.rb#L38-L42

Potential issue: The new caching mechanisms (`@parse_cache`, `@line_object_cache`) in
`Backtrace::Line` use plain `Hash` objects, which are not thread-safe. In a
multi-threaded environment, concurrent read/write operations can lead to race
conditions. Specifically, the non-atomic sequence of checking the cache size, clearing
it, and then writing a new entry can result in lost entries and an inconsistent cache
state. This can degrade cache effectiveness and potentially cause errors on non-GIL Ruby
implementations like JRuby.



cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


Regexp.new("^(#{project_root}/)?#{app_dirs_pattern}")
cache_key = app_dirs_pattern
in_app_pattern = @in_app_pattern_cache.fetch(cache_key) do
@in_app_pattern_cache[cache_key] = Regexp.new("^(#{project_root}/)?#{app_dirs_pattern}")

Cache key omits project_root, returning stale Regexp

High Severity

The @in_app_pattern_cache key is only app_dirs_pattern, but the cached Regexp incorporates both project_root and app_dirs_pattern. If project_root changes while app_dirs_pattern stays the same (e.g., across test runs, SDK reconfiguration, or different configurations), the cache returns a stale Regexp built with the old project_root. This causes incorrect in_app classification of backtrace frames, affecting error grouping and display in Sentry.
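The fix the reviewer implies is to key the cache on both inputs that the compiled Regexp depends on. A minimal sketch (class and method names are illustrative):

```ruby
# Key the pattern cache on [project_root, app_dirs_pattern] so a changed
# project_root cannot return a Regexp compiled against the old root.
class PatternCache
  @in_app_pattern_cache = {}

  def self.in_app_pattern(project_root, app_dirs_pattern)
    cache_key = [project_root, app_dirs_pattern]
    @in_app_pattern_cache[cache_key] ||=
      Regexp.new("^(#{project_root}/)?#{app_dirs_pattern}")
  end
end
```

With the composite key, the same app_dirs_pattern under a different project_root compiles and caches a fresh Regexp instead of reusing the stale one.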



# Cache complete Line objects by (unparsed_line, in_app_pattern) to avoid
# re-creating identical Line objects across exceptions.
@line_object_cache = {}

Thread-unsafe plain Hash caches risk corruption on JRuby

Medium Severity

The three new class-level caches (@parse_cache, @line_object_cache, @in_app_pattern_cache) use plain {} hashes, but they're accessed concurrently from multiple threads during exception capture. On JRuby (which the SDK explicitly supports via JAVA_INPUT_FORMAT), there is no GIL, and concurrent reads/writes to a plain Hash can corrupt its internal structure. The existing line_cache in the same file already uses Concurrent::Map for this exact reason.

Additional Locations (1)
