Skip to content

Editor windows playback performance#1613

Merged
richiemcilroy merged 8 commits intomainfrom
cursor/editor-windows-playback-performance-f2d4
Feb 17, 2026
Merged

Editor windows playback performance#1613
richiemcilroy merged 8 commits intomainfrom
cursor/editor-windows-playback-performance-f2d4

Conversation

@richiemcilroy
Copy link
Member

@richiemcilroy richiemcilroy commented Feb 17, 2026

Significantly improve editor playback performance on Windows by enabling native MediaFoundation hardware decoding with zero-copy D3D11 textures, optimizing data transfer with NV12 output, and ensuring high-performance GPU selection.


Open in Cursor Open in Web

Greptile Summary

This PR significantly improves editor playback performance on Windows through three coordinated optimizations: (1) enables DirectX 12 backend preference across all GPU contexts for better hardware acceleration, (2) switches Windows video decoding to prefer native MediaFoundation with hardware acceleration and automatic FFmpeg fallback, and (3) implements D3D11 zero-copy texture sharing for NV12 frames to eliminate expensive memory transfers.

Key changes:

  • All GPU initialization now prefers DX12 backend on Windows (wgpu::Backends::DX12 | wgpu::Backends::VULKAN)
  • Frontend WebGPU requests high-performance GPU adapter
  • Windows decoder switched from FFmpeg-first to MediaFoundation-first with fallback logic
  • Editor renderer switched to NV12 output pipeline with RGBA fallback
  • Display layer implements D3D11 zero-copy path using shared texture handles for NV12 frames
  • Pipeline always flushes to prevent frame loss in the editor

Confidence Score: 4/5

  • This PR is safe to merge with careful testing on Windows systems
  • The changes are well-structured with proper platform-specific gating and comprehensive fallback paths. The decoder fallback logic ensures robustness when MediaFoundation fails. However, the performance impact should be verified on different Windows GPU configurations, and the .clone() of segment_frames in the editor loop may have performance implications worth investigating.
  • Pay close attention to crates/editor/src/editor.rs for the frame cloning overhead and crates/rendering/src/decoder/mod.rs for the decoder fallback logic

Important Files Changed

Filename Overview
apps/desktop/src/utils/webgpu-renderer.ts Requests high-performance GPU adapter for WebGPU in editor frontend, improving rendering performance
crates/editor/src/editor.rs Switches to NV12 rendering pipeline with RGBA fallback; always flushes NV12 pipeline after render to avoid frame loss
crates/rendering/src/decoder/mod.rs Switches Windows to prefer MediaFoundation hardware decoder with automatic FFmpeg fallback, improving performance and reliability
crates/rendering/src/layers/display.rs Adds D3D11 zero-copy NV12 texture path for Windows, falling back to memory copy paths when unavailable

Flowchart

flowchart TD
    A[Editor Render Frame] --> B{Try NV12 Pipeline}
    B -->|Success| C{Has D3D11 Handles?}
    C -->|Yes| D[Zero-Copy D3D11 Texture]
    C -->|No| E[CPU Memory Copy]
    D --> F[Output NV12 Frame]
    E --> F
    B -->|Failure| G[Fallback to RGBA]
    G --> H[Output RGBA Frame]
    
    I[Windows Decoder Init] --> J{Try MediaFoundation}
    J -->|Success| K[HW Accelerated MF]
    J -->|Fail| L[Fallback to FFmpeg]
    L --> M{FFmpeg Init}
    M -->|Success| N[FFmpeg Decoder]
    M -->|Fail| O[Error: Both Failed]
    
    P[GPU Context Init] --> Q{Platform?}
    Q -->|Windows| R[DX12 + Vulkan]
    Q -->|Other| S[Default Backends]
    R --> T[High-Performance Adapter]
    S --> T
Loading

Last reviewed commit: 62381d0

cursoragent and others added 6 commits February 17, 2026 21:55
Explicitly request DX12 | Vulkan backends in wgpu InstanceDescriptor on
Windows instead of using the default (all backends). DX12 provides optimal
D3D11↔D3D12 interop for shared texture handles from the MediaFoundation
decoder and is generally the most performant backend on Windows.

Non-Windows platforms continue using the default InstanceDescriptor.

Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
On Windows, try the native MediaFoundation hardware decoder first before
falling back to FFmpeg. This mirrors the macOS pattern (AVAssetReader first,
FFmpeg fallback).

The MF decoder provides:
- Native D3D11VA hardware-accelerated video decoding
- D3D11 shared texture handles (Y/UV planes) for zero-copy GPU texture sharing
- More reliable hardware acceleration than FFmpeg's D3D11VA path

If MF initialization fails for any reason, the existing FFmpeg path is used
as a fallback, preserving backward compatibility. The force_ffmpeg flag also
bypasses MF entirely when set.

Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
The prepare_with_encoder() method in DisplayLayer now checks for D3D11
shared texture handles (from the MediaFoundation decoder) before falling
back to CPU data upload. This enables zero-copy GPU texture sharing when
MF decoder provides shared Y/UV plane handles.

The non-Windows code path is completely unchanged - it is now in its own
#[cfg(not(target_os = "windows"))] block for clarity.

Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
Change the editor frame renderer to output NV12 frames instead of RGBA.
NV12 is ~33% smaller (1.5 bytes/pixel vs 4 bytes/pixel), significantly
reducing WebSocket bandwidth during playback.

The frontend WebSocket handling and WebGPU renderer already fully support
NV12 frames, so no frontend changes are needed. If NV12 rendering fails
for any reason, the renderer falls back to the existing RGBA path.

Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
Add powerPreference: 'high-performance' to both isWebGPUSupported() and
initWebGPU() calls. This ensures the discrete GPU is selected on systems
with both integrated and discrete GPUs, providing better frame rendering
performance in the editor preview canvas.

Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
Apply the same DX12 | Vulkan backend preference to all gpu-converter
crates for consistency with the rendering and gpu_context changes.

Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
@cursor
Copy link

cursor bot commented Feb 17, 2026

Cursor Agent can help with this pull request. Just @cursor in comments and I'll start working on changes in this branch.
Learn more about Cursor Agents

- Editor renderer now always flushes the NV12 pipeline after render_nv12()
  to ensure the current frame is emitted immediately. Previously, when
  render_nv12 returned Ok(Some(prev_frame)), the current frame would stay
  stuck in the GPU pipeline until the next render request arrived. This
  caused the last frame during playback to never be displayed, and preview
  frames could show stale content.

- Fixed potential collapsible_if clippy violation in the Windows D3D11
  zero-copy path in display.rs by combining nested if conditions into a
  single chained condition using let-chains.

- Applied cargo fmt formatting fix.

Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
@richiemcilroy richiemcilroy marked this pull request as ready for review February 17, 2026 22:53
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

12 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

current.uniforms,
let nv12_result = frame_renderer
.render_nv12(
current.segment_frames.clone(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.clone() on segment_frames may be expensive if DecodedFrame contains large buffers. Verify whether this clone is necessary or if the data can be moved/borrowed instead.

Prompt To Fix With AI
This is a comment left during a code review.
Path: crates/editor/src/editor.rs
Line: 152:152

Comment:
`.clone()` on `segment_frames` may be expensive if `DecodedFrame` contains large buffers. Verify whether this clone is necessary or if the data can be moved/borrowed instead.

How can I resolve this? If you propose a fix, please make it concise.

if let Some(prev) = pipeline_frame {
(self.frame_cb)(EditorFrameOutput::Nv12(prev));
}
if let Some(Ok(current_frame)) = frame_renderer.flush_pipeline_nv12().await {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flush_pipeline_nv12() can also yield Some(Err(_)); logging that case helps avoid silently dropping frames.

Suggested change
if let Some(Ok(current_frame)) = frame_renderer.flush_pipeline_nv12().await {
match frame_renderer.flush_pipeline_nv12().await {
Some(Ok(current_frame)) => {
(self.frame_cb)(EditorFrameOutput::Nv12(current_frame));
}
Some(Err(e)) => {
tracing::warn!(error = %e, "Failed to flush NV12 pipeline frame");
}
None => {}
}

status,
})
}
Ok(Ok(Err(e))) => Err(format!("MediaFoundation initialization failed: {e}")),
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These MediaFoundation error strings drop useful context (name/path) that the FFmpeg errors include; adding them makes debugging much easier.

Suggested change
Ok(Ok(Err(e))) => Err(format!("MediaFoundation initialization failed: {e}")),
Ok(Ok(Err(e))) => Err(format!(
"'{name}' MediaFoundation initialization failed: {e} ({path_display})"
)),
Ok(Err(e)) => Err(format!(
"'{name}' MediaFoundation channel closed: {e} ({path_display})"
)),

pub async fn new() -> Self {
todo!("implement UV downsampling for UYVYToNV12");

#[cfg(target_os = "windows")]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this function currently hard-todo!()s, the Windows-specific InstanceDescriptor block below is dead code and adds noise/duplication.

Suggested change
#[cfg(target_os = "windows")]
let instance = wgpu::Instance::new(&wgpu::InstanceDescriptor::default());

- Log flush_pipeline_nv12() errors instead of silently dropping them,
  making it easier to diagnose GPU pipeline issues during playback.

- Add name and path context to MediaFoundation decoder error messages,
  matching the pattern used by FFmpeg decoder errors for consistent
  debugging experience.

- Revert unnecessary Windows backend preference in uyvy_nv12 since the
  function body is unreachable (todo!() macro).

Co-authored-by: Richie McIlroy <richiemcilroy@users.noreply.github.com>
@richiemcilroy richiemcilroy merged commit f5e712d into main Feb 17, 2026
15 of 16 checks passed
Comment on lines +96 to 101
const adapter = await navigator.gpu.requestAdapter({
powerPreference: "high-performance",
});
if (!adapter) {
throw new Error("No WebGPU adapter available");
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same idea here: try high-performance, but fall back to default adapter request so initialization still works if the preference is unsupported.

Suggested change
const adapter = await navigator.gpu.requestAdapter({
powerPreference: "high-performance",
});
if (!adapter) {
throw new Error("No WebGPU adapter available");
}
let adapter = await navigator.gpu
.requestAdapter({
powerPreference: "high-performance",
})
.catch(() => null);
adapter ??= await navigator.gpu.requestAdapter();
if (!adapter) {
throw new Error("No WebGPU adapter available");
}

Comment on lines +528 to +542
match convert_result {
Ok(_) => {
if self.yuv_converter.output_texture().is_some() {
self.pending_copy = Some(PendingTextureCopy {
width: actual_width,
height: actual_height,
dst_texture_index: next_texture,
});
true
} else {
false
}
}
Err(_) => false,
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now conversion failures get silently collapsed to false, which makes it hard to debug why we’re falling back (especially on Windows where multiple paths exist). Consider logging the error at debug to keep it low-noise.

Suggested change
match convert_result {
Ok(_) => {
if self.yuv_converter.output_texture().is_some() {
self.pending_copy = Some(PendingTextureCopy {
width: actual_width,
height: actual_height,
dst_texture_index: next_texture,
});
true
} else {
false
}
}
Err(_) => false,
}
match convert_result {
Ok(_) => {
if self.yuv_converter.output_texture().is_some() {
self.pending_copy = Some(PendingTextureCopy {
width: actual_width,
height: actual_height,
dst_texture_index: next_texture,
});
true
} else {
false
}
}
Err(e) => {
tracing::debug!(error = %e, "NV12 GPU conversion failed");
false
}
}

Comment on lines +84 to +86
const adapter = await navigator.gpu.requestAdapter({
powerPreference: "high-performance",
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This option can throw or be ignored on some WebGPU implementations; falling back to a plain requestAdapter() keeps the capability check from returning false for supported devices.

Suggested change
const adapter = await navigator.gpu.requestAdapter({
powerPreference: "high-performance",
});
const adapter = await navigator.gpu
.requestAdapter({
powerPreference: "high-performance",
})
.catch(() => navigator.gpu.requestAdapter());

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants