chore: remove all heap allocation from the hot renderer loop#38
Conversation
📝 WalkthroughWalkthroughAdds per-frame scratch containers (RendererScratch, TraversalScratch) and in-place traversal; converts tessellated vertex buffers to Arc with TessellatedGeometry; reworks effect/texture recycling and effect output handling; introduces reusable readback and lyon buffer pool reuse APIs; updates renderer call sites to use and trim scratch state. Changes
Sequence Diagram(s)sequenceDiagram
rect rgba(120,180,120,0.5)
participant Client
end
rect rgba(80,160,240,0.5)
participant Renderer
participant TraversalScratch
participant EffectPipeline
participant OffscreenPool
participant GPU
end
Client->>Renderer: render_to_texture_view()
Renderer->>TraversalScratch: begin() / provide traversal_scratch
Renderer->>TraversalScratch: plan_traversal_in_place(draw_tree, traversal_scratch)
TraversalScratch-->>Renderer: traversal data (in-place)
Renderer->>EffectPipeline: compile/apply effects
EffectPipeline->>OffscreenPool: acquire/recycle PooledTexture(s)
OffscreenPool->>GPU: create/reuse textures
EffectPipeline->>GPU: dispatch pipelines / composite
GPU-->>Renderer: output textures / readback bytes
Renderer->>TraversalScratch: store scratch back (end frame)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/renderer/readback.rs`:
- Around line 129-139: The code should guard against empty or insufficient
readback bytes before calling copy_padded_readback_rows: after calling
Self::map_readback_buffer_into(&self.device, output_buffer, &mut readback_bytes)
check whether readback_bytes is empty (or its length is less than height *
padded_bytes_per_row) and if so restore self.scratch.readback_bytes =
readback_bytes and return early (same early‑return pattern used by
render_to_argb32) to avoid passing a too‑small buffer into
copy_padded_readback_rows; reference map_readback_buffer_into,
copy_padded_readback_rows, render_to_argb32 and self.scratch.readback_bytes when
making the change.
In `@src/renderer/types.rs`:
- Around line 165-180: trim_to_policy currently calls trim_vector_if_needed(&mut
self.readback_bytes, MAX_READBACK_BYTES_CAPACITY) but shrink_to cannot reduce
capacity below the current len, so large readbacks won't free memory; update
trim_to_policy to first reduce the length of self.readback_bytes when it exceeds
MAX_READBACK_BYTES_CAPACITY (e.g.,
self.readback_bytes.truncate(MAX_READBACK_BYTES_CAPACITY) or
self.readback_bytes.clear() depending on semantics) and then call
trim_vector_if_needed on readback_bytes so shrink_to can actually reclaim
capacity.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/renderer/readback.rs (1)
129-135: Replacechecked_mul().unwrap_or(usize::MAX)withsaturating_mul().
The pattern flags the Clippymanual_saturating_arithmeticlint. Usingsaturating_mul()is more idiomatic and directly expresses the intent.🔧 Suggested change
- let required_readback_len = (height as usize) - .checked_mul(padded_bytes_per_row as usize) - .unwrap_or(usize::MAX); + let required_readback_len = + (height as usize).saturating_mul(padded_bytes_per_row as usize);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/renderer/readback.rs` around lines 129 - 135, The multiplication that computes required_readback_len uses checked_mul().unwrap_or(usize::MAX), which triggers Clippy's manual_saturating_arithmetic lint; replace that expression with a direct saturating multiplication on usize (e.g. use (height as usize).saturating_mul(padded_bytes_per_row as usize)) in the readback length calculation so required_readback_len is computed without unwraps and the intent is explicit; update the calculation near where required_readback_len is defined (alongside readback_bytes and Self::map_readback_buffer_into calls) and remove the checked_mul()/unwrap_or() pattern.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/renderer/readback.rs`:
- Around line 129-135: The multiplication that computes required_readback_len
uses checked_mul().unwrap_or(usize::MAX), which triggers Clippy's
manual_saturating_arithmetic lint; replace that expression with a direct
saturating multiplication on usize (e.g. use (height as
usize).saturating_mul(padded_bytes_per_row as usize)) in the readback length
calculation so required_readback_len is computed without unwraps and the intent
is explicit; update the calculation near where required_readback_len is defined
(alongside readback_bytes and Self::map_readback_buffer_into calls) and remove
the checked_mul()/unwrap_or() pattern.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/renderer/readback.rs (1)
13-15: Fast-pathcopy_from_slicemay panic ifdata.len() != output_size.When
padded_bytes_per_row == unpadded_bytes_per_row, the function assumesdata.len()exactly equalsoutput_size. If the caller passes a buffer with extra trailing bytes,copy_from_slicewill panic due to length mismatch.Current callers validate
readback_bytes.len() >= required_readback_len, so the buffer could be larger than needed. Consider using a range slice to be defensive:🛡️ Defensive fix
if padded_bytes_per_row == unpadded_bytes_per_row { - output.copy_from_slice(data); + output.copy_from_slice(&data[..output_size]); return; }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/renderer/readback.rs` around lines 13 - 15, The fast-path assumes data.len() == output.len() and calls output.copy_from_slice(data), which can panic if callers pass a larger readback_bytes buffer (they only guarantee >= required_readback_len); change the fast-path in the branch where padded_bytes_per_row == unpadded_bytes_per_row to copy only the needed slice (use data[..output.len()] or equivalent) into output instead of passing the whole data, referencing the variables padded_bytes_per_row, unpadded_bytes_per_row, output.copy_from_slice(data), readback_bytes and required_readback_len to locate the code to modify.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/renderer/readback.rs`:
- Around line 310-316: The current code only guards against empty readback_bytes
and then blindly casts and slices into src_words, which can panic if
map_readback_buffer_into returned partial data; before calling
bytemuck::cast_slice and slicing src_words[..needed_len], validate that
readback_bytes.len() is at least needed_len * 4 (or that
bytemuck::cast_slice(&readback_bytes).len() >= needed_len) and handle the
short-read case (e.g., store the partial data in self.scratch.readback_bytes and
return or surface an error) so
out_pixels.copy_from_slice(&src_words[..needed_len]) is only executed when there
are enough u32 words; reference symbols: readback_bytes,
self.scratch.readback_bytes, src_words, out_pixels, needed_len, and
render_to_buffer for consistency with other checks.
---
Nitpick comments:
In `@src/renderer/readback.rs`:
- Around line 13-15: The fast-path assumes data.len() == output.len() and calls
output.copy_from_slice(data), which can panic if callers pass a larger
readback_bytes buffer (they only guarantee >= required_readback_len); change the
fast-path in the branch where padded_bytes_per_row == unpadded_bytes_per_row to
copy only the needed slice (use data[..output.len()] or equivalent) into output
instead of passing the whole data, referencing the variables
padded_bytes_per_row, unpadded_bytes_per_row, output.copy_from_slice(data),
readback_bytes and required_readback_len to locate the code to modify.
| if readback_bytes.is_empty() { | ||
| self.scratch.readback_bytes = readback_bytes; | ||
| return; | ||
| } | ||
|
|
||
| let src_words: &[u32] = bytemuck::cast_slice(&data); | ||
| let src_words: &[u32] = bytemuck::cast_slice(&readback_bytes); | ||
| out_pixels[..needed_len].copy_from_slice(&src_words[..needed_len]); |
There was a problem hiding this comment.
Missing length validation before casting and slicing.
The check on line 310 only guards against empty data, but render_to_buffer (lines 131-134) additionally validates that the buffer has sufficient length. If map_readback_buffer_into returns partial data, line 316 will panic when slicing src_words[..needed_len].
🛡️ Proposed fix to add length validation
- if readback_bytes.is_empty() {
+ let required_len = needed_len * std::mem::size_of::<u32>();
+ if readback_bytes.is_empty() || readback_bytes.len() < required_len {
self.scratch.readback_bytes = readback_bytes;
return;
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if readback_bytes.is_empty() { | |
| self.scratch.readback_bytes = readback_bytes; | |
| return; | |
| } | |
| let src_words: &[u32] = bytemuck::cast_slice(&data); | |
| let src_words: &[u32] = bytemuck::cast_slice(&readback_bytes); | |
| out_pixels[..needed_len].copy_from_slice(&src_words[..needed_len]); | |
| let required_len = needed_len * std::mem::size_of::<u32>(); | |
| if readback_bytes.is_empty() || readback_bytes.len() < required_len { | |
| self.scratch.readback_bytes = readback_bytes; | |
| return; | |
| } | |
| let src_words: &[u32] = bytemuck::cast_slice(&readback_bytes); | |
| out_pixels[..needed_len].copy_from_slice(&src_words[..needed_len]); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/renderer/readback.rs` around lines 310 - 316, The current code only
guards against empty readback_bytes and then blindly casts and slices into
src_words, which can panic if map_readback_buffer_into returned partial data;
before calling bytemuck::cast_slice and slicing src_words[..needed_len],
validate that readback_bytes.len() is at least needed_len * 4 (or that
bytemuck::cast_slice(&readback_bytes).len() >= needed_len) and handle the
short-read case (e.g., store the partial data in self.scratch.readback_bytes and
return or surface an error) so
out_pixels.copy_from_slice(&src_words[..needed_len]) is only executed when there
are enough u32 words; reference symbols: readback_bytes,
self.scratch.readback_bytes, src_words, out_pixels, needed_len, and
render_to_buffer for consistency with other checks.
Summary by CodeRabbit