Optimize time spent in GPU driver for GPU cache updates. #1390
r? @kvark This gives a big improvement in CPU time spent in the render thread, and removes all driver stall warnings when running with |
We keep a shadow copy of the GPU cache data in the render thread. After all updates (patches) have been applied for this frame, scan the rows of the texture for rows that have been modified. Upload each dirty row to the GPU via a PBO to ensure that there are no CPU-side stalls.

In the future, there are several more optimizations that can be made:
* Batch consecutive row updates into a single PBO / upload call.
* Perhaps track start/end of each row, to avoid a full row update for small changes.
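The scheme described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `ShadowCache`, `TEXTURE_WIDTH`, `Block`, and the `upload` callback are all hypothetical names, and the callback merely stands in for the real PBO upload path.

```rust
/// Width of the cache texture in blocks (hypothetical value for this sketch).
const TEXTURE_WIDTH: usize = 4;

/// One cache block; a stand-in for the real GPU block type.
type Block = [f32; 4];

/// CPU-side shadow copy of the GPU cache texture, with one dirty flag per row.
struct ShadowCache {
    blocks: Vec<Block>,
    dirty_rows: Vec<bool>,
}

impl ShadowCache {
    fn new(height: usize) -> Self {
        ShadowCache {
            blocks: vec![[0.0; 4]; TEXTURE_WIDTH * height],
            dirty_rows: vec![false; height],
        }
    }

    /// Apply one patch: copy its blocks into the shadow copy and mark the row dirty.
    /// Assumes (for this sketch) that a patch never crosses a row boundary.
    fn apply_patch(&mut self, row: usize, u: usize, patch: &[Block]) {
        assert!(u + patch.len() <= TEXTURE_WIDTH, "patch must fit in one row");
        let offset = row * TEXTURE_WIDTH + u;
        self.blocks[offset..offset + patch.len()].copy_from_slice(patch);
        self.dirty_rows[row] = true;
    }

    /// Resize the texture; every row is marked dirty so that it is
    /// uploaded again on the next flush.
    fn resize(&mut self, new_height: usize) {
        self.blocks.resize(TEXTURE_WIDTH * new_height, [0.0; 4]);
        self.dirty_rows.clear();
        self.dirty_rows.resize(new_height, true);
    }

    /// Call `upload(row, row_data)` once per dirty row, then clear the flag.
    /// Returns the number of rows that were uploaded.
    fn flush<F: FnMut(usize, &[Block])>(&mut self, mut upload: F) -> usize {
        let mut uploaded = 0;
        for (row, dirty) in self.dirty_rows.iter_mut().enumerate() {
            if *dirty {
                let start = row * TEXTURE_WIDTH;
                upload(row, &self.blocks[start..start + TEXTURE_WIDTH]);
                *dirty = false;
                uploaded += 1;
            }
        }
        uploaded
    }
}
```

In this sketch, any number of patches applied to the same row during a frame still result in a single upload call for that row at flush time.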
Rebased.
// Copy the blocks from the patch array in the shadow CPU copy.
let block_offset = row * MAX_VERTEX_TEXTURE_WIDTH + address.u as usize;
let data = &mut self.cpu_blocks[block_offset..(block_offset + block_count)];
for i in 0..block_count {
// If we had to resize the texture, just mark all rows
// as dirty so they will be uploaded to the texture
// during the next flush.
for row in &mut self.rows {
I'm not sure why we need to operate on rows as opposed to arbitrary slices of rows, but I can see this as a complication so the current approach is a nice approximation.
We operate on rows for now because the driver overhead of glTexSubImage2D was quite high when calling it for individual blocks. There are still some improvements we can make here. For instance, we could track the first and last dirty block in a row. Then, for mostly dirty rows, we can update several rows with one driver call, and for rows that have only a small number of changes, we could update just those blocks to avoid blitting the entire row.
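One way the follow-up idea above might look is sketched below. It is not part of this PR: `RowState`, `Upload`, `plan_uploads`, and the `full_row_threshold` parameter are hypothetical names, and the threshold for what counts as a "mostly dirty" row is an assumption chosen only for illustration. The function plans upload calls; it does not talk to the driver.

```rust
use std::ops::Range;

/// Per-row dirty tracking: the half-open range of blocks touched this frame.
#[derive(Clone, Debug, Default, PartialEq)]
struct RowState {
    dirty: Option<Range<usize>>,
}

impl RowState {
    /// Widen the dirty range so that it covers blocks `start..end`.
    fn mark(&mut self, start: usize, end: usize) {
        self.dirty = Some(match self.dirty.take() {
            Some(r) => r.start.min(start)..r.end.max(end),
            None => start..end,
        });
    }
}

/// One planned driver call.
#[derive(Debug, PartialEq)]
enum Upload {
    /// Upload `row_count` whole rows starting at `first_row`.
    Rows { first_row: usize, row_count: usize },
    /// Upload only the given blocks of a single row.
    Blocks { row: usize, blocks: Range<usize> },
}

/// Turn the per-row dirty ranges into a list of upload calls and clear them.
/// Rows with at least `full_row_threshold` dirty blocks are uploaded whole,
/// and consecutive whole rows are merged into a single call.
fn plan_uploads(rows: &mut [RowState], full_row_threshold: usize) -> Vec<Upload> {
    let mut uploads: Vec<Upload> = Vec::new();
    for (row, state) in rows.iter_mut().enumerate() {
        let range = match state.dirty.take() {
            Some(range) => range,
            None => continue,
        };
        if range.len() < full_row_threshold {
            // Only a few blocks changed: upload just that slice of the row.
            uploads.push(Upload::Blocks { row, blocks: range });
            continue;
        }
        // Mostly dirty row: extend the previous whole-row upload if adjacent.
        let extended = match uploads.last_mut() {
            Some(Upload::Rows { first_row, row_count }) if *first_row + *row_count == row => {
                *row_count += 1;
                true
            }
            _ => false,
        };
        if !extended {
            uploads.push(Upload::Rows { first_row: row, row_count: 1 });
        }
    }
    uploads
}
```

The trade-off is the one discussed in this thread: fewer, larger calls keep driver overhead down, while partial-row uploads avoid copying an entire row for a small change.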
@bors-servo retry
glennw commented Jun 16, 2017 (edited by larsbergstrom)
We keep a shadow copy of the GPU cache data in the render thread.
After all updates (patches) have been applied for this frame, scan
the rows of the texture for rows that have been modified.
Upload each dirty row to the GPU via a PBO to ensure that there
are no CPU-side stalls.
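In the future, there are several more optimizations that can be made:
* Batch consecutive row updates into a single PBO / upload call.
* Perhaps track start/end of each row, to avoid a full row update for small changes.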