Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve CowData::insert performance #94353

Merged
merged 1 commit into from
Aug 16, 2024

Conversation

aaronp64
Copy link
Contributor

Update CowData::insert to call ptrw() before loop, to avoid calling _copy_on_write for each item in the array, as well as repeated index checks in set and get. For larger Vectors/Arrays, this makes inserts around 10x faster for ints, 3x faster for Strings, and 2x faster for Variants. Less of an impact on smaller Vectors/Arrays, as a larger percentage of the time is spent allocating.

Compared old/new code times with below gdscript:

extends Node2D

func _ready():
	test_insert_int()
	test_insert_str()
	test_insert_var()

func test_insert_int():
	var a : PackedInt32Array = []
	var start = Time.get_ticks_msec()
	for i in 10000:
		a.insert(0, i)
	var end = Time.get_ticks_msec()
	print("test_insert_int: %dms" % [end - start])
	
func test_insert_str():
	var a : PackedStringArray = []
	var s := "test"
	var start = Time.get_ticks_msec()
	for i in 10000:
		a.insert(0, s)
	var end = Time.get_ticks_msec()
	print("test_insert_str: %dms" % [end - start])

func test_insert_var():
	var a : Array = []
	var start = Time.get_ticks_msec()
	for i in 10000:
		a.insert(0, i)
	var end = Time.get_ticks_msec()
	print("test_insert_var: %dms" % [end - start])

old:

test_insert_int: 134ms
test_insert_str: 165ms
test_insert_var: 200ms

new:

test_insert_int: 12ms
test_insert_str: 50ms
test_insert_var: 91ms

@aaronp64 aaronp64 requested a review from a team as a code owner July 14, 2024 15:42
@Calinou Calinou added this to the 4.x milestone Jul 14, 2024
@clayjohn clayjohn modified the milestones: 4.x, 4.4 Jul 14, 2024
Copy link
Member

@RandomShaper RandomShaper left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, but let me suggest that while you're at it, you also avoid calling size() twice.

@aaronp64
Copy link
Contributor Author

Replacing the first two size() calls by just calling it once up front looks pretty straightforward, I'm not sure what to do with the third size() call though (for the for loop). I think if we use a value from before resize is called, it might technically change the behavior in the case where resize fails - if a realloc was required and failed, _ptr might point to memory smaller than the requested size. Without checking size() again, we might go past the end of the array if we assume resize was successful. Not positive I'm following the allocation/error logic correctly though, might not be an issue.

Should I leave the size() call as is in the for loop, or replace all of them?

@RandomShaper
Copy link
Member

I'd go for all of them. And if you want to handle an error from resize(), feel free. The current implementation is not doing it (just assuming the resize will never fail) and, although it's not that important because you're actually doomed if that fails, there's at least a theoretical correctness point in using it.

I'd go for something like this:

	Error insert(Size p_pos, const T &p_val) {
		Size curr_size = size();
		ERR_FAIL_INDEX_V(p_pos, curr_size + 1, ERR_INVALID_PARAMETER);
		Error err = resize(curr_size + 1);
		ERR_FAIL_COND_V(err, err);
		T *p = ptrw();
		for (Size i = curr_size; i > p_pos; i--) {
			p[i] = p[i - 1];
		}
		p[p_pos] = p_val;

		return OK;
	}

Update CowData::insert to call ptrw() before loop, to avoid calling _copy_on_write for each item in the array, as well as repeated index checks in set and get.  For larger Vectors/Arrays, this makes inserts around 10x faster for ints, 3x faster for Strings, and 2x faster for Variants.  Less of an impact on smaller Vectors/Arrays, as a larger percentage of the time is spent allocating.
@aaronp64
Copy link
Contributor Author

Updated and added error check

@MewPurPur
Copy link
Contributor

MewPurPur commented Jul 20, 2024

For a user who doesn't understand engine core stuff much, what will be sped up by this optimization?

@AThousandShips
Copy link
Member

AThousandShips commented Jul 20, 2024

This will improve things like Vector<T> insert (including all packed arrays, and Array which uses Vector)

@MewPurPur
Copy link
Contributor

Presumably this will affect things all over the engine where Vector.insert() is used?

@AThousandShips
Copy link
Member

Might not be significant given other performance considerations, but it should affect them at least somewhat

@akien-mga akien-mga merged commit 91bf992 into godotengine:master Aug 16, 2024
18 checks passed
@akien-mga
Copy link
Member

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants