Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize time spent in GPU driver for GPU cache updates. #1390

Merged
merged 1 commit into from Jun 20, 2017

Conversation

@glennw
Copy link
Member

glennw commented Jun 16, 2017

We keep a shadow copy of the GPU cache data in the render thread.
After all updates (patches) have been applied for this frame, scan
the rows of the texture for rows that have been modified.

Upload each dirty row to the GPU via a PBO to ensure that there
are no CPU-side stalls.

In the future, there's several more optimizations that can be made:

  • Batch consecutive row updates into a single PBO / upload call.
  • Perhaps track start/end of each row, to avoid a full row update for small changes.

This change is Reviewable

@glennw
Copy link
Member Author

glennw commented Jun 16, 2017

r? @kvark

This gives a big improvement in CPU time spent in the render thread, and removes all driver stall warnings when running with INTEL_DEBUG=perf.

@bors-servo
Copy link
Contributor

bors-servo commented Jun 16, 2017

The latest upstream changes (presumably #1392) made this pull request unmergeable. Please resolve the merge conflicts.

We keep a shadow copy of the GPU cache data in the render thread.
After all updates (patches) have been applied for this frame, scan
the rows of the texture for rows that have been modified.

Upload each dirty row to the GPU via a PBO to ensure that there
are no CPU-side stalls.

In the future, there's several more optimizations that can be made:
 * Batch consecutive row updates into a single PBO / upload call.
 * Perhaps track start/end of each row, to avoid a full row update for small changes.
@glennw glennw force-pushed the glennw:pbo branch from 748ad62 to c9c4ed6 Jun 16, 2017
@glennw
Copy link
Member Author

glennw commented Jun 16, 2017

Rebased.

@kvark
kvark approved these changes Jun 19, 2017
// Copy the blocks from the patch array in the shadow CPU copy.
let block_offset = row * MAX_VERTEX_TEXTURE_WIDTH + address.u as usize;
let data = &mut self.cpu_blocks[block_offset..(block_offset + block_count)];
for i in 0..block_count {
// If we had to resize the texture, just mark all rows
// as dirty so they will be uploaded to the texture
// during the next flush.
for row in &mut self.rows {

This comment has been minimized.

@kvark

kvark Jun 19, 2017

Member

nice :)

@kvark
Copy link
Member

kvark commented Jun 19, 2017

I'm not sure why we need to operate on rows as opposed to arbitrary slices of rows, but I can see this as a complication so the current approach is a nice approximation.
@bors-servo r+

@bors-servo
Copy link
Contributor

bors-servo commented Jun 19, 2017

📌 Commit c9c4ed6 has been approved by kvark

@glennw
Copy link
Member Author

glennw commented Jun 19, 2017

We operate on rows for now since the driver time overhead in glTexSubImage2D was quite high, when calling it for individual blocks.

There's still some improvements here we can make - for instance, we could track the first and last dirty block in a row. Then, for mostly dirty rows, we can update several rows with one driver call. And for rows that only have a small number of changes, we could update just those blocks to avoid blitting the entire row.

@bors-servo
Copy link
Contributor

bors-servo commented Jun 20, 2017

Testing commit c9c4ed6 with merge 760ac74...

bors-servo added a commit that referenced this pull request Jun 20, 2017
Optimize time spent in GPU driver for GPU cache updates.

We keep a shadow copy of the GPU cache data in the render thread.
After all updates (patches) have been applied for this frame, scan
the rows of the texture for rows that have been modified.

Upload each dirty row to the GPU via a PBO to ensure that there
are no CPU-side stalls.

In the future, there's several more optimizations that can be made:
 * Batch consecutive row updates into a single PBO / upload call.
 * Perhaps track start/end of each row, to avoid a full row update for small changes.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/1390)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Jun 20, 2017

💔 Test failed - status-travis

@glennw
Copy link
Member Author

glennw commented Jun 20, 2017

@bors-servo retry

@bors-servo
Copy link
Contributor

bors-servo commented Jun 20, 2017

Testing commit c9c4ed6 with merge 19ac69f...

bors-servo added a commit that referenced this pull request Jun 20, 2017
Optimize time spent in GPU driver for GPU cache updates.

We keep a shadow copy of the GPU cache data in the render thread.
After all updates (patches) have been applied for this frame, scan
the rows of the texture for rows that have been modified.

Upload each dirty row to the GPU via a PBO to ensure that there
are no CPU-side stalls.

In the future, there's several more optimizations that can be made:
 * Batch consecutive row updates into a single PBO / upload call.
 * Perhaps track start/end of each row, to avoid a full row update for small changes.

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/1390)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Jun 20, 2017

☀️ Test successful - status-travis
Approved by: kvark
Pushing 19ac69f to master...

@bors-servo bors-servo merged commit c9c4ed6 into servo:master Jun 20, 2017
3 checks passed
3 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.