Use CpuWriteGpuReadBuffer for all cpu->gpu transfers #1398

Wumpf · 2023-02-24T12:38:15Z

Followup on #1382

Turns out this does have some ripple effects on how we pass around RenderContext - generally we should try to pass immutable reference and takes locks internally where needed (as short lived as possible without too much hassle). This is a refactor we need to do anyways for better multitreadability

The text was updated successfully, but these errors were encountered:

jleibs · 2023-07-21T20:59:32Z

@Wumpf is this done?

Wumpf · 2023-07-22T07:59:52Z

No, afaik most RenderData is still using the queue and neither the LineBuilder nor the PointBuilder write to belt memory directly.

…ht perf improvements for large point clouds (#5229) ### What * Part of #1398 * Follow-up to #5207 Point cloud builder uses now `DataTextureSource` for all its data. `DataTextureSource` is now better at dealing with the max texture size: data writes are clamped ahead of time, meaning we no longer reserve memory for data that we can't write into the texture upon `DataTexturesSource::finish`. This bubbled up now since `PointCloudBuilder` so far used a vec for positions and checked that against the maximum. The changes here leave the check in place, but make `DataTextureSource` reserving & writing clamp to the maximum and deal with this robustly even if high level data structures (like `PointCloudBuilder`) fail to handle this. Artificially limiting the data texture size to 256 * 256 looks like this: <img width="1751" alt="image" src="https://github.com/rerun-io/rerun/assets/1220815/ff65944c-b499-4772-8993-9619daf8d6b0"> I tested performance before/after on these changes on two opf scenes on my mac to check for changes in perf and found the switch to using `DataTextureSource` for position/radius improved performance slightly: * olympic (1.5mio points): ~7.0ms -> ~6.8ms * rainwater (4.6mio points): ~17.1ms-> ~16.5ms (numbers via performance metrics, release build) ### Checklist * [x] I have read and agree to [Contributor Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and the [Code of Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md) * [x] I've included a screenshot or gif (if applicable) * [x] I have tested the web demo (if applicable): * Using newly built examples: [app.rerun.io](https://app.rerun.io/pr/5229/index.html) * Using examples from latest `main` build: [app.rerun.io](https://app.rerun.io/pr/5229/index.html?manifest_url=https://app.rerun.io/version/main/examples_manifest.json) * Using full set of examples from `nightly` build: [app.rerun.io](https://app.rerun.io/pr/5229/index.html?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json) * [x] The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG * [x] If applicable, add a new check to the [release checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)! - [PR Build Summary](https://build.rerun.io/pr/5229) - [Docs preview](https://rerun.io/preview/963956c93faa135571b48dc764f26ffea87e186b/docs)  - [Examples preview](https://rerun.io/preview/963956c93faa135571b48dc764f26ffea87e186b/examples)  - [Recent benchmark results](https://build.rerun.io/graphs/crates.html) - [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

### What * Fixes #1398 * Follow-up of #5229 * same changelog applies which is why I left it out: again this makes handling of I tried a scene with a very large number of line strips and the performance is about the same before after, hard to judge though since before perf was not very stable and both before/after we have a huge profiler overhead - there's some low hanging fruit in better batching how we add batches of line strips, but I left this intentionally out of this pr. Similar to the previous PR I also artificially lowered the data size texture limit to check that the error messages and truncating of lines works gracefully. ### Checklist * [x] I have read and agree to [Contributor Guide](https://github.com/rerun-io/rerun/blob/main/CONTRIBUTING.md) and the [Code of Conduct](https://github.com/rerun-io/rerun/blob/main/CODE_OF_CONDUCT.md) * [x] I've included a screenshot or gif (if applicable) * [x] I have tested the web demo (if applicable): * Using newly built examples: [app.rerun.io](https://app.rerun.io/pr/5240/index.html) * Using examples from latest `main` build: [app.rerun.io](https://app.rerun.io/pr/5240/index.html?manifest_url=https://app.rerun.io/version/main/examples_manifest.json) * Using full set of examples from `nightly` build: [app.rerun.io](https://app.rerun.io/pr/5240/index.html?manifest_url=https://app.rerun.io/version/nightly/examples_manifest.json) * [x] The PR title and labels are set such as to maximize their usefulness for the next release's CHANGELOG * [x] If applicable, add a new check to the [release checklist](https://github.com/rerun-io/rerun/blob/main/tests/python/release_checklist)! - [PR Build Summary](https://build.rerun.io/pr/5240) - [Docs preview](https://rerun.io/preview/6d57ed6f6cfce86d82218a33994437fd4fdb637e/docs)  - [Examples preview](https://rerun.io/preview/6d57ed6f6cfce86d82218a33994437fd4fdb637e/examples)  - [Recent benchmark results](https://build.rerun.io/graphs/crates.html) - [Wasm size tracking](https://build.rerun.io/graphs/sizes.html)

Wumpf added 🔺 re_renderer affects re_renderer itself 🧑‍💻 dev experience developer experience (excluding CI) 📉 performance Optimization, memory use, etc labels Feb 24, 2023

Wumpf self-assigned this Feb 24, 2023

This was referenced Feb 24, 2023

[re_renderer] Uniform buffer utility using CpuWriteGpuReadBelt #1400

Merged

Use write belt for mesh data gpu upload #1416

Merged

Wumpf removed their assignment Sep 26, 2023

Wumpf mentioned this issue Feb 19, 2024

More robust handling of maximum texture size for non-color data, slight perf improvements for large point clouds #5229

Merged

5 tasks

Wumpf mentioned this issue Feb 20, 2024

Use DataTextureSource for lines #5240

Merged

5 tasks

Wumpf closed this as completed in #5240 Feb 20, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use CpuWriteGpuReadBuffer for all cpu->gpu transfers #1398

Use CpuWriteGpuReadBuffer for all cpu->gpu transfers #1398

Wumpf commented Feb 24, 2023

jleibs commented Jul 21, 2023

Wumpf commented Jul 22, 2023

Use CpuWriteGpuReadBuffer for all cpu->gpu transfers #1398

Use CpuWriteGpuReadBuffer for all cpu->gpu transfers #1398

Comments

Wumpf commented Feb 24, 2023

jleibs commented Jul 21, 2023

Wumpf commented Jul 22, 2023