
feat(layer): pre-record swapchain image copy commands #14

Merged: K1ngst0m merged 2 commits into main from dev/kingstom/optimize-present on Dec 25, 2025

@K1ngst0m K1ngst0m commented Dec 25, 2025

Records copy commands once per swapchain image at init

Performance Impact

| Operation | Before | After |
| --- | --- | --- |
| ResetCommandPool() | Every frame | Never |
| Command recording | Every frame (~80 lines of Vulkan calls) | Once at init |
| QueueSubmit() | Every frame | Every frame |

Summary by CodeRabbit

  • Refactor
    • Internal capture pipeline reworked to use pre-recorded copy command buffers, reducing per-frame overhead and improving synchronous and asynchronous capture reliability.
  • Bug Fixes
    • Improved tracking of in-flight copies to reduce race conditions and missed frames.
  • Chores
    • Clarified logging for capture initialization and lifecycle events.


@K1ngst0m K1ngst0m marked this pull request as ready for review December 25, 2025 05:03

coderabbitai bot commented Dec 25, 2025

Walkthrough

Replaces per-frame resources with copy command buffers that are pre-recorded once per swapchain image (CopyCmd). Recording of the image-copy commands and layout transitions moves to initialization (init_copy_cmds), and the recorded buffers are reused by both the async and sync capture paths. Synchronization now uses timeline semaphores plus a per-command busy flag.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Header types & APIs: src/capture/vk_layer/vk_capture.hpp | Removed FrameData; added CopyCmd { VkCommandPool pool, VkCommandBuffer cmd, uint64_t timeline_value, bool busy }. SwapData now holds std::vector<CopyCmd> copy_cmds (removed frames and frame_index). Renamed private methods: create_frame_resources → init_copy_cmds, destroy_frame_resources → destroy_copy_cmds. Removed record_copy_commands declaration. |
| Capture implementation & lifecycle: src/capture/vk_layer/vk_capture.cpp | Replaced per-frame resource flow with pre-recorded copy command buffers. on_present calls init_copy_cmds. capture_frame uses swap->copy_cmds, implements timeline semaphore sync and per-command busy tracking, and submits pre-recorded command buffers for both async and sync paths (sync waits on export fence and streams texture). destroy_copy_cmds is used in cleanup; the former record_copy_commands logic is inlined into initialization. Logging updated to reflect copy cmd init/destroy and counts. |
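
For orientation, the data layout described in the table can be sketched as below. This is a minimal illustration, not the code from the PR: the `FakeCommandPool` / `FakeCommandBuffer` aliases are stand-ins for `VkCommandPool` / `VkCommandBuffer` so the snippet compiles without the Vulkan headers, and `make_swap_data` is a hypothetical helper that does not appear in the change.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-ins for VkCommandPool / VkCommandBuffer (assumption: the real layer
// uses the Vulkan handle types here).
using FakeCommandPool = std::uint64_t;
using FakeCommandBuffer = std::uint64_t;

// One pre-recorded copy command per swapchain image.
struct CopyCmd {
    FakeCommandPool pool = 0;          // dedicated pool, not reset per frame
    FakeCommandBuffer cmd = 0;         // recorded once at init
    std::uint64_t timeline_value = 0;  // timeline value of the last submit
    bool busy = false;                 // true while a submit is in flight
};

struct SwapData {
    std::vector<CopyCmd> copy_cmds;    // indexed by swapchain image index
};

// Hypothetical helper: one default-initialized CopyCmd per swapchain image.
inline SwapData make_swap_data(std::size_t image_count) {
    SwapData swap;
    swap.copy_cmds.resize(image_count);
    return swap;
}
```

Indexing `copy_cmds` by swapchain image index is what lets the present path pick a command buffer without any per-frame allocation or recording.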

Sequence Diagram(s)

sequenceDiagram
    autonumber
    participant App as Application
    participant Layer as CaptureLayer
    participant GPU as GPU/Vulkan
    participant Host as Host IO

    Note over Layer,GPU: Initialization
    Layer->>GPU: init_copy_cmds(swap)  -- create pools & pre-record copy cmd buffers
    GPU-->>Layer: pre-recorded CmdBuffers

    Note over App,Layer: Presentation / Capture
    App->>Layer: on_present()
    Layer->>Layer: select CopyCmd (not busy)
    alt Async path
      Layer->>GPU: submit pre-recorded CmdBuffer + timeline semaphore increment
      GPU-->>Layer: timeline value updated (in-flight)
      Layer->>Layer: mark CopyCmd.busy
      par After GPU completes
        GPU->>Layer: timeline signal (timeline semaphore)
        Layer->>Host: schedule readback (async)
        Layer->>Layer: mark CopyCmd.busy = false
      end
    else Sync path
      Layer->>GPU: submit pre-recorded CmdBuffer + wait export fence
      GPU-->>Layer: export fence signaled
      Layer->>Host: stream texture data synchronously
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 I pre-record my hops before the race,
Commands in rows, each in its place.
Timelines hum and fences sing,
Copies bound for home they bring.
A rabbit clap—capture's graceful pace. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main objective of the changeset: pre-recording swapchain image copy commands at initialization instead of per-frame, which is the core refactoring described in the raw summary. |

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/capture/vk_layer/vk_capture.cpp (1)

653-659: No fallback if init_copy_cmds fails silently.

After init_export_image succeeds, init_copy_cmds is called unconditionally. If command pool/buffer creation fails inside init_copy_cmds, copy_cmds may be empty or contain invalid handles, causing capture_frame to either early-return (line 683) or crash.

Consider returning a boolean from init_copy_cmds and handling failure:

Suggested approach
-        init_copy_cmds(swap, dev_data);
+        if (!init_copy_cmds(swap, dev_data)) {
+            LAYER_DEBUG("Copy command buffer init FAILED");
+            // Optionally clean up export resources or mark as failed
+        }
🧹 Nitpick comments (3)
src/capture/vk_layer/vk_capture.hpp (1)

85-86: Consider returning bool from init_copy_cmds for error propagation.

Unlike init_export_image and init_sync_primitives which return bool, init_copy_cmds returns void. If Vulkan calls fail during command pool/buffer creation, the caller has no way to detect partial initialization and may proceed with invalid handles.

Suggested signature change
-    void init_copy_cmds(SwapData* swap, VkDeviceData* dev_data);
-    void destroy_copy_cmds(SwapData* swap, VkDeviceData* dev_data);
+    bool init_copy_cmds(SwapData* swap, VkDeviceData* dev_data);
+    void destroy_copy_cmds(SwapData* swap, VkDeviceData* dev_data);
src/capture/vk_layer/vk_capture.cpp (2)

530-533: Add error checking for command buffer recording.

BeginCommandBuffer and EndCommandBuffer (line 603) can fail. Consider checking their return values for robustness.

Suggested fix
-        funcs.BeginCommandBuffer(cmd.cmd, &begin_info);
+        res = funcs.BeginCommandBuffer(cmd.cmd, &begin_info);
+        if (res != VK_SUCCESS) {
+            LAYER_DEBUG("BeginCommandBuffer failed for image %zu: %d", i, res);
+            // Handle cleanup...
+            return false;
+        }

598-601: VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT in dstStageMask is a no-op.

Using TOP_OF_PIPE_BIT as a destination stage means "no stage waits for this barrier" which is effectively ignored. For the post-copy barrier before presentation, BOTTOM_OF_PIPE_BIT alone is sufficient and clearer.

Suggested simplification
         funcs.CmdPipelineBarrier(cmd.cmd, VK_PIPELINE_STAGE_TRANSFER_BIT,
-                                 VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT |
-                                     VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
+                                 VK_PIPELINE_STAGE_BOTTOM_OF_PIPE_BIT,
                                  0, 0, nullptr, 0, nullptr, 2, barriers);
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 82f59bd and cc10dad.

📒 Files selected for processing (2)
  • src/capture/vk_layer/vk_capture.cpp
  • src/capture/vk_layer/vk_capture.hpp
🧰 Additional context used
🧬 Code graph analysis (2)
src/capture/vk_layer/vk_capture.hpp (1)
src/render/backend/vulkan_backend.hpp (1)
  • struct FrameResources { (88-92)
src/capture/vk_layer/vk_capture.cpp (1)
src/capture/vk_layer/vk_capture.hpp (6)
  • swap (83-83)
  • swap (84-84)
  • swap (85-85)
  • swap (86-86)
  • swap (87-88)
  • swap (89-89)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Build and Test
  • GitHub Check: Static Analysis (clang-tidy)
🔇 Additional comments (5)
src/capture/vk_layer/vk_capture.hpp (1)

18-23: LGTM! Clean struct design for pre-recorded command buffers.

The CopyCmd struct appropriately encapsulates per-swapchain-image resources needed for the pre-recording optimization: dedicated pool (avoiding VK_COMMAND_POOL_CREATE_RESET_COMMAND_BUFFER_BIT), the pre-recorded command buffer, timeline tracking, and busy state.

src/capture/vk_layer/vk_capture.cpp (4)

609-628: LGTM! Proper cleanup with in-flight command handling.

The destruction logic correctly waits for busy command buffers via timeline semaphore before destroying pools. This prevents destroying resources that are still in use by the GPU.


683-698: LGTM! Correct bounds checking and synchronization for pre-recorded buffers.

The bounds check on line 683 guards against invalid indices, and the wait logic (lines 690-698) properly synchronizes with in-flight command buffers before reuse. The transition to using pre-recorded CopyCmd references is clean.
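
A rough sketch of the check-then-wait pattern described above, under stated assumptions: `MockTimeline` only imitates a timeline semaphore (its `wait` pretends the GPU has reached the requested value), and `acquire_copy_cmd` is a hypothetical name, not a function from the PR. Real code would wait with `vkWaitSemaphores` and handle timeouts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct CopyCmd {
    std::uint64_t timeline_value = 0;  // value signaled by the last submit
    bool busy = false;                 // true while that submit is in flight
};

// Mock of a timeline semaphore (assumption, for illustration only).
struct MockTimeline {
    std::uint64_t completed = 0;  // highest value the "GPU" has signaled
    int wait_calls = 0;
    void wait(std::uint64_t value) {
        ++wait_calls;
        if (completed < value) completed = value;  // pretend the GPU caught up
    }
};

// Returns the command for image_index, or nullptr when the index is invalid.
// A busy command is waited on before it is handed back for reuse.
CopyCmd* acquire_copy_cmd(std::vector<CopyCmd>& cmds, MockTimeline& timeline,
                          std::size_t image_index) {
    if (image_index >= cmds.size()) return nullptr;  // bounds check
    CopyCmd& cmd = cmds[image_index];
    if (cmd.busy) {
        if (timeline.completed < cmd.timeline_value) {
            timeline.wait(cmd.timeline_value);
        }
        cmd.busy = false;  // previous submit finished; safe to resubmit
    }
    return &cmd;
}
```

Waiting only when the command is still marked busy keeps the common case (the previous copy already finished) free of any blocking call.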


705-768: LGTM! Async capture path properly tracks command buffer state.

The async path correctly:

  1. Increments and assigns timeline_value for synchronization
  2. Submits the pre-recorded command buffer with timeline semaphore signaling
  3. Queues the capture item for the worker thread
  4. Marks cmd.busy = true to prevent reuse before completion
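
The four steps can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `CaptureState`, `CaptureItem`, and `capture_frame_async` are hypothetical names, the `submitted` vector stands in for the actual `QueueSubmit` call, and a busy command is simply rejected here where the real path waits on the timeline semaphore.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <vector>

struct CopyCmd {
    std::uint64_t timeline_value = 0;
    bool busy = false;
};

// Work item handed to the worker thread (hypothetical shape).
struct CaptureItem {
    std::uint32_t image_index;
    std::uint64_t timeline_value;
};

struct CaptureState {
    std::vector<CopyCmd> copy_cmds;
    std::uint64_t timeline_counter = 0;    // monotonically increasing
    std::deque<CaptureItem> worker_queue;  // consumed by the worker thread
    std::vector<std::uint32_t> submitted;  // stand-in for QueueSubmit calls
};

bool capture_frame_async(CaptureState& s, std::uint32_t image_index) {
    if (image_index >= s.copy_cmds.size()) return false;
    CopyCmd& cmd = s.copy_cmds[image_index];
    if (cmd.busy) return false;  // simplification: real code waits instead

    cmd.timeline_value = ++s.timeline_counter;                     // step 1
    s.submitted.push_back(image_index);                            // step 2
    s.worker_queue.push_back({image_index, cmd.timeline_value});   // step 3
    cmd.busy = true;                                               // step 4
    return true;
}
```

Because each submit gets its own timeline value, the worker can tell exactly which copy has completed without a fence per command buffer.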

807-807: LGTM! Cleanup correctly calls destroy_copy_cmds.

The cleanup path properly invokes destroy_copy_cmds before releasing other swap resources, ensuring command buffers are waited on and pools are destroyed in the correct order.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

♻️ Duplicate comments (1)
src/capture/vk_layer/vk_capture.cpp (1)

516-533: Good fix! Error handling added for command pool and buffer allocation.

The error handling for CreateCommandPool and AllocateCommandBuffers has been properly implemented, addressing the previous review concern. Both failures now trigger cleanup via destroy_copy_cmds and return false.
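
The rollback-on-failure shape described here might look like the sketch below. It is a simplified illustration, not the merged code: `MockDevice` is a hypothetical stand-in for the Vulkan dispatch table that fails pool creation after a configurable number of pools, and plain integers replace the Vulkan handles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct CopyCmd {
    std::uint64_t pool = 0;  // stand-in for VkCommandPool (0 = null handle)
    std::uint64_t cmd = 0;   // stand-in for VkCommandBuffer
};

struct SwapData {
    std::vector<CopyCmd> copy_cmds;
};

// Hypothetical device mock: pool creation fails once fail_after pools exist.
struct MockDevice {
    std::size_t fail_after;
    std::size_t live_pools = 0;
    std::uint64_t next_handle = 1;

    explicit MockDevice(std::size_t n) : fail_after(n) {}

    bool create_pool(std::uint64_t* out) {
        if (live_pools >= fail_after) return false;
        *out = next_handle++;
        ++live_pools;
        return true;
    }
    bool allocate_cmd(std::uint64_t* out) {
        *out = next_handle++;
        return true;
    }
    // Destroying a pool also frees the buffers allocated from it.
    void destroy_pool(std::uint64_t) { --live_pools; }
};

void destroy_copy_cmds(SwapData& swap, MockDevice& dev) {
    for (CopyCmd& c : swap.copy_cmds) {
        if (c.pool != 0) dev.destroy_pool(c.pool);
    }
    swap.copy_cmds.clear();
}

// Returns false and leaves no live resources behind if any step fails.
bool init_copy_cmds(SwapData& swap, MockDevice& dev, std::size_t image_count) {
    swap.copy_cmds.assign(image_count, CopyCmd{});
    for (CopyCmd& c : swap.copy_cmds) {
        if (!dev.create_pool(&c.pool) || !dev.allocate_cmd(&c.cmd)) {
            destroy_copy_cmds(swap, dev);  // roll back the partial init
            return false;
        }
    }
    return true;
}
```

Leaving `copy_cmds` empty after a failed init means a later bounds check on the image index rejects every capture attempt instead of touching a null handle.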

🧹 Nitpick comments (1)
src/capture/vk_layer/vk_capture.cpp (1)

541-541: Consider clarifying the "Reusable" comment.

The command buffer flags are set to 0, meaning the buffer can be reused but not while in the pending state (no VK_COMMAND_BUFFER_USAGE_SIMULTANEOUS_USE_BIT). Your busy flag synchronization correctly prevents concurrent submission. Consider clarifying the comment to reflect this, e.g., "Sequential reuse" or "Reusable after completion".

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between cc10dad and 00a6fa1.

📒 Files selected for processing (2)
  • src/capture/vk_layer/vk_capture.cpp
  • src/capture/vk_layer/vk_capture.hpp
🔇 Additional comments (7)
src/capture/vk_layer/vk_capture.hpp (3)

18-23: LGTM! Well-designed struct for pre-recorded commands.

The CopyCmd struct appropriately captures per-command-buffer state with safe default initialization. The busy flag enables proper synchronization to prevent resubmission of in-flight commands.


62-63: LGTM! Clear transition to per-swapchain-image command buffers.

The updated data structure aligns with the PR objective of pre-recording one command buffer per swapchain image.


85-86: LGTM! Clear and descriptive method names.

The renamed methods accurately reflect their purpose in the pre-recorded command buffer lifecycle.

src/capture/vk_layer/vk_capture.cpp (4)

619-638: LGTM! Proper cleanup with synchronization.

The function correctly waits for in-flight command buffers (using the busy flag and timeline values) before destroying resources. Destroying the command pool implicitly frees the command buffers, so the cleanup is complete.
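
As a hedged illustration of that ordering (wait first, then destroy), the sketch below records each call in a log so the sequence is visible. `MockDevice` and its `wait_timeline` / `destroy_pool` methods are hypothetical stand-ins for the timeline-semaphore wait and `vkDestroyCommandPool`; the actual function in the PR takes `SwapData*` and `VkDeviceData*`.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct CopyCmd {
    std::uint64_t pool = 0;            // stand-in for VkCommandPool
    std::uint64_t timeline_value = 0;  // value signaled by the last submit
    bool busy = false;                 // true while that submit is in flight
};

// Hypothetical mock that logs calls instead of touching a real device.
struct MockDevice {
    std::vector<std::string> log;
    void wait_timeline(std::uint64_t value) {
        log.push_back("wait " + std::to_string(value));
    }
    void destroy_pool(std::uint64_t pool) {
        log.push_back("destroy " + std::to_string(pool));
    }
};

void destroy_copy_cmds(std::vector<CopyCmd>& cmds, MockDevice& dev) {
    for (CopyCmd& c : cmds) {
        if (c.busy) {
            dev.wait_timeline(c.timeline_value);  // GPU must be done first
            c.busy = false;
        }
        if (c.pool != 0) {
            dev.destroy_pool(c.pool);  // frees the pool's command buffers too
        }
    }
    cmds.clear();
}
```

For a busy command the log always shows the wait entry before the matching destroy entry, which is the property the review comment is checking for.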


696-714: LGTM! Clean pre-recorded command buffer usage.

The bounds check, busy-wait synchronization, and pre-recorded command submission flow are all correctly implemented. This achieves the PR objective of eliminating per-frame command recording overhead.


669-672: LGTM! Proper initialization flow.

The copy commands are initialized at the appropriate point—after the export image is ready—ensuring all resources are available for command recording.


820-820: LGTM! Cleanup properly integrated.

The destroy_copy_cmds call ensures pre-recorded command buffers are cleaned up with proper synchronization during swapchain destruction.

@K1ngst0m K1ngst0m merged commit 195fc65 into main Dec 25, 2025
3 checks passed
@K1ngst0m K1ngst0m deleted the dev/kingstom/optimize-present branch December 25, 2025 05:35
zhangzhousuper pushed a commit that referenced this pull request Dec 25, 2025
Records copy commands once per swapchain image at init