[Merged by Bors] - Directly copy moved Table components to the target location #5056

Closed
wants to merge 5 commits into from

Member

@james7132 james7132 commented Jun 20, 2022

Objective

Speed up entity moves between tables by reducing the number of copies performed. Currently each moved component is copied three times: src[index] -> swap scratch, src[last] -> src[index], and swap scratch -> dst[target]. The first and last copies can be merged into a single direct copy, src[index] -> dst[target], which can save a significant amount of time when the component(s) in question are large.
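To make the copy counts concrete, here is a minimal sketch in safe Rust. It is not Bevy's code: `Big`, `move_via_scratch`, and `move_direct` are hypothetical names invented for illustration, and a plain `Vec` stands in for a table column. Both functions assume `src` is non-empty and `index` is in bounds.

```rust
/// Hypothetical stand-in for a large component (256 bytes).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Big([u64; 32]);

/// Old path: the moved value passes through a scratch slot.
/// Returns the number of component-sized copies performed.
fn move_via_scratch(src: &mut Vec<Big>, index: usize, dst: &mut Vec<Big>) -> usize {
    let last = src.len() - 1;
    let scratch = src[index]; // copy 1: src[index] -> swap scratch
    src[index] = src[last];   // copy 2: src[last] -> src[index]
    src.truncate(last);       // drop the now-duplicated last slot
    dst.push(scratch);        // copy 3: swap scratch -> dst[target]
    3
}

/// New path: the moved value goes straight to its destination.
fn move_direct(src: &mut Vec<Big>, index: usize, dst: &mut Vec<Big>) -> usize {
    let last = src.len() - 1;
    dst.push(src[index]);     // copy 1: src[index] -> dst[target]
    src[index] = src[last];   // copy 2: src[last] -> src[index]
    src.truncate(last);
    2
}
```

Both functions leave `src` and `dst` in the same final state; the direct version just skips the intermediate scratch write.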

Solution

This PR does the following:

  • Adds BlobVec::swap_remove_unchecked(usize, PtrMut<'_>), which is identical to swap_remove_and_forget_unchecked, but skips the swap_scratch and directly copies the component into the provided PtrMut<'_>.
  • Builds Column::initialize_from_unchecked(&mut Column, usize, usize) on top of it, which directly initializes a row from another column.
  • Updates most of the table move APIs to use initialize_from_unchecked instead of a combination of swap_remove_and_forget_unchecked and initialize.
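The layering described in the list above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Bevy's implementation: `RawColumn` is a hypothetical type backed by a `Vec<u8>`, a raw `*mut u8` stands in for `PtrMut<'_>`, and `allocate_row`, `push_bytes`, and `get` are helpers invented for the example. Only the two method names mirror the PR.

```rust
use std::ptr;

/// Hypothetical type-erased column: `item_size` bytes per row, stored contiguously.
struct RawColumn {
    item_size: usize,
    data: Vec<u8>,
}

impl RawColumn {
    fn new(item_size: usize) -> Self {
        assert!(item_size > 0, "zero-sized items are out of scope for this sketch");
        Self { item_size, data: Vec::new() }
    }

    fn len(&self) -> usize {
        self.data.len() / self.item_size
    }

    fn push_bytes(&mut self, bytes: &[u8]) {
        assert_eq!(bytes.len(), self.item_size);
        self.data.extend_from_slice(bytes);
    }

    /// Reserves one zero-filled row and returns its index.
    fn allocate_row(&mut self) -> usize {
        let row = self.len();
        self.data.resize(self.data.len() + self.item_size, 0);
        row
    }

    fn get(&self, index: usize) -> &[u8] {
        &self.data[index * self.item_size..(index + 1) * self.item_size]
    }

    /// Copies row `index` into `dst`, then fills the hole with the last row.
    ///
    /// # Safety
    /// `index` must be in bounds, and `dst` must be valid for `item_size`
    /// bytes of writes and must not overlap this column's storage.
    unsafe fn swap_remove_unchecked(&mut self, index: usize, dst: *mut u8) {
        let size = self.item_size;
        let last = self.len() - 1;
        let base = self.data.as_mut_ptr();
        unsafe {
            // Copy 1: src[index] -> dst, with no scratch buffer in between.
            ptr::copy_nonoverlapping(base.add(index * size), dst, size);
            // Copy 2: src[last] -> src[index]. `ptr::copy` tolerates index == last.
            ptr::copy(base.add(last * size), base.add(index * size), size);
        }
        self.data.truncate(last * size);
    }

    /// Initializes row `dst_row` of `self` by moving row `src_row` out of `other`.
    ///
    /// # Safety
    /// Both rows must be in bounds and both columns must share `item_size`.
    unsafe fn initialize_from_unchecked(
        &mut self,
        other: &mut RawColumn,
        src_row: usize,
        dst_row: usize,
    ) {
        debug_assert_eq!(self.item_size, other.item_size);
        unsafe {
            let dst = self.data.as_mut_ptr().add(dst_row * self.item_size);
            other.swap_remove_unchecked(src_row, dst);
        }
    }
}
```

Because `self` and `other` are distinct `&mut` borrows, the borrow checker already guarantees the two buffers do not overlap, which is what makes `copy_nonoverlapping` sound for the first copy in this sketch.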

This is an alternative, though orthogonal, approach to achieving the same performance gains as seen in #4853. It (hopefully) shouldn't run into the Miri limitations that PR currently does. After this PR, swap_remove_and_forget_unchecked is still in use for Resources, and swap_scratch likely should still be removed, so #4853 remains useful even if this PR is merged.

Performance

TODO: Microbenchmark

This PR shows similar improvements to commands that add or remove table components and thereby trigger a table move. When tested on many_cubes sphere, some of the more command-heavy systems saw notable improvements. In particular, prepare_uniform_components<T> dropped from 1.35ms to 1.13ms (a 16.3% improvement) on my local machine, a similar if not slightly better gain than what #4853 showed.

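![image](https://user-images.githubusercontent.com/3137680/174570088-1c4c6fd7-3215-478c-9eb7-8bd9fe486b32.png)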

The command-heavy Extract stage also saw a smaller overall improvement:

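![image](https://user-images.githubusercontent.com/3137680/174572261-8a48f004-ab9f-4cb2-b304-a882b6d78065.png)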

Changelog

Added: BlobVec::swap_remove_unchecked.
Added: Column::initialize_from_unchecked.

@james7132 james7132 added A-ECS Entities, components, systems, and events C-Performance A change motivated by improving speed, memory usage or compile times labels Jun 20, 2022
@hymm
Contributor

hymm commented Jun 20, 2022

if #4853 gets merged, should the new functions get removed?

@james7132
Member Author

james7132 commented Jun 20, 2022

> if #4853 gets merged, should the new functions get removed?

No, these PRs are orthogonal, though this PR lessens the impact of that one, since swap_remove_and_forget_unchecked is no longer in the hot path for component addition/removal.

Member

@alice-i-cecile alice-i-cecile left a comment

Changes look good, and those perf results are excellent. Some suggestions on comments to help readers understand the pointer logic.

@james7132 james7132 requested a review from mockersf June 21, 2022 01:11
@alice-i-cecile
Member

bors r+

bors bot pushed a commit that referenced this pull request Jun 21, 2022
@alice-i-cecile
Member

bors r-

@bors
Contributor

bors bot commented Jun 21, 2022

Canceled.

@alice-i-cecile
Member

@BoxyUwU my impression is that this is about as simple as pointer code gets, but feel free to review this in the next few days if you want.

@alice-i-cecile alice-i-cecile added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Jun 21, 2022
Member

@alice-i-cecile alice-i-cecile left a comment

This is ready to go :)

@alice-i-cecile
Member

bors r+

bors bot pushed a commit that referenced this pull request Jun 27, 2022
@bors bors bot changed the title Directly copy moved Table components to the target location [Merged by Bors] - Directly copy moved Table components to the target location Jun 27, 2022
@bors bors bot closed this Jun 27, 2022
inodentry pushed a commit to IyesGames/bevy that referenced this pull request Aug 8, 2022
james7132 added a commit to james7132/bevy that referenced this pull request Oct 28, 2022
@james7132 james7132 deleted the direct-copy-movement branch December 12, 2022 06:44
ItsDoot pushed a commit to ItsDoot/bevy that referenced this pull request Feb 1, 2023