Variant memory pools #61315

lawnjelly · 2022-05-23T11:54:47Z

Memory pools via PagedAllocator for Transform2D, Transform3D, Basis and AABB.

Simplified version of #58942 in response to PR meeting feedback, using a couple of bucket sizes - one for Transform2D and AABB (6 reals), and one for Basis and Transform3D (12 reals).
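
To make the two bucket sizes concrete, here is a minimal sketch of the size arithmetic, assuming `real_t` is a 32-bit float (a single-precision build). The `*Sketch` structs and `bucket_for()` are hypothetical stand-ins written for illustration, not the engine's classes or the code in this PR. Note that a Basis holds 9 reals, so in this sketch it shares the 12-real bucket with Transform3D and leaves 3 reals unused.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: real_t is a 32-bit float (single-precision build).
using real_t = float;

// Hypothetical stand-in layouts; only the number of reals per type matters.
struct Vector2Sketch { real_t x, y; };
struct Vector3Sketch { real_t x, y, z; };
struct Transform2DSketch { Vector2Sketch columns[3]; };                // 6 reals
struct AABBSketch { Vector3Sketch position, size; };                   // 6 reals
struct BasisSketch { Vector3Sketch rows[3]; };                         // 9 reals
struct Transform3DSketch { BasisSketch basis; Vector3Sketch origin; }; // 12 reals

// The two bucket sizes named in the description above.
constexpr std::size_t BUCKET_SMALL = 6 * sizeof(real_t);
constexpr std::size_t BUCKET_LARGE = 12 * sizeof(real_t);

static_assert(sizeof(Transform2DSketch) <= BUCKET_SMALL, "Transform2D fits the small bucket");
static_assert(sizeof(AABBSketch) <= BUCKET_SMALL, "AABB fits the small bucket");
static_assert(sizeof(BasisSketch) <= BUCKET_LARGE, "Basis fits the large bucket");
static_assert(sizeof(Transform3DSketch) <= BUCKET_LARGE, "Transform3D fits the large bucket");

// Returns the size of the smallest bucket that can hold `bytes`, or 0 if
// neither bucket is large enough.
constexpr std::size_t bucket_for(std::size_t bytes) {
	return bytes <= BUCKET_SMALL ? BUCKET_SMALL
			: (bytes <= BUCKET_LARGE ? BUCKET_LARGE : 0);
}
```

With double-precision reals the byte sizes would double, but the 6-real and 12-real grouping in this sketch would stay the same.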

Notes

In the PR meeting it was agreed to just pool the Variant sub types for now, using a couple of bucket sizes.
The only non-obvious part of this PR is that we needed a way of turning off the PagedAllocator error messages during the fork in detect_prime(), otherwise we would get false error messages. See "Use memory pools for Variant extended types" (#58633) for more details.
The _err_print_error change is to allow at least basic printing of errors outside the lifetime of OS (for instance if OS is destroyed before PagedAllocator pools). Order of construction / destruction requires some care with pools such as these as other modules depend on them being present during startup and shutdown.

akien-mga · 2022-06-28T13:18:27Z

@Calinou Would you be able to run quick benchmarks on this to see if it makes a significant change?

lawnjelly · 2022-06-28T13:19:33Z

PR Meeting:

print_string.h contains some similar globals; we can put these all together in one header, inside a namespace.

Update:
I have now created a core/core_globals.h file, and moved leak_reporting_enabled into there as well as the globals from print_string.h. As before, let me know any better ideas for naming etc for this.
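
As a rough illustration of the idea (not the contents of core/core_globals.h), a leak-reporting switch kept in one shared place could look something like the sketch below. The namespace `core_globals_sketch`, the accessor function and the `LeakCheckedPoolSketch` class are all hypothetical names invented for this example; only the flag name `leak_reporting_enabled` comes from the comment above.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical stand-in for a shared globals header.
namespace core_globals_sketch {
// Function-local static so the flag is valid whenever it is first touched,
// which sidesteps static initialization order problems.
inline bool &leak_reporting_enabled() {
	static bool enabled = true;
	return enabled;
}
} // namespace core_globals_sketch

// Hypothetical pool bookkeeping that consults the flag before complaining.
class LeakCheckedPoolSketch {
	int live = 0;

public:
	void note_alloc() { live++; }
	void note_free() { live--; }

	// True only if reporting is switched on AND something is still allocated.
	bool would_report_leak() const {
		return core_globals_sketch::leak_reporting_enabled() && live > 0;
	}

	~LeakCheckedPoolSketch() {
		if (would_report_leak()) {
			std::fprintf(stderr, "pool destroyed with %d live allocation(s)\n", live);
		}
	}
};
```

A forked child process that is about to exit without cleaning up could set `core_globals_sketch::leak_reporting_enabled() = false;` first, so that the pool destructors stay silent instead of printing false errors.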

Calinou · 2022-06-28T17:54:38Z

> @Calinou Would you be able to run quick benchmarks on this to see if it makes a significant change?

I'll try to do that this week.

Calinou · 2022-07-03T20:41:37Z

I ran a benchmark comparing this PR to the commit on master before this PR. Performance seems to be identical within margin of error:

SCons flags: tools=yes target=release_debug use_lto=yes

Opening and quitting project manager

Benchmark #1: bin/godot.linuxbsd.opt.tools.64 --quit
  Time (mean ± σ):      2.518 s ±  0.049 s    [User: 1.282 s, System: 0.192 s]
  Range (min … max):    2.451 s …  2.571 s    10 runs
 
Benchmark #2: bin/godot.linuxbsd.opt.tools.64.variant_bucket_pools --quit
  Time (mean ± σ):      2.500 s ±  0.055 s    [User: 1.282 s, System: 0.193 s]
  Range (min … max):    2.435 s …  2.584 s    10 runs
 
Summary
  'bin/godot.linuxbsd.opt.tools.64.variant_bucket_pools --quit' ran
    1.01 ± 0.03 times faster than 'bin/godot.linuxbsd.opt.tools.64 --quit'

Opening and quitting editor

Benchmark #1: bin/godot.linuxbsd.opt.tools.64 /tmp/4/project.godot --quit
  Time (mean ± σ):      4.224 s ±  0.444 s    [User: 2.990 s, System: 0.312 s]
  Range (min … max):    3.616 s …  4.625 s    10 runs
 
Benchmark #2: bin/godot.linuxbsd.opt.tools.64.variant_bucket_pools /tmp/4/project.godot --quit
  Time (mean ± σ):      4.246 s ±  0.433 s    [User: 2.982 s, System: 0.321 s]
  Range (min … max):    3.623 s …  4.633 s    10 runs
 
Summary
  'bin/godot.linuxbsd.opt.tools.64 /tmp/4/project.godot --quit' ran
    1.01 ± 0.15 times faster than 'bin/godot.linuxbsd.opt.tools.64.variant_bucket_pools /tmp/4/project.godot --quit'

lawnjelly · 2022-07-04T07:50:56Z

Yes I'll try and rebase this and also run some benchmarks.

I expected it to be pretty similar in a lot of cases, as malloc (on desktop Linux at least) is pretty good in the general case. At the least, pooling should not be slower than malloc, provided the implementation is sound, and your tests confirm this.

To quote from wikipedia on benefits:

Memory pools allow memory allocation with constant execution time. The memory release for thousands of objects in a pool is just one operation, not one by one if malloc is used to allocate memory for each object (Note: We don't use bulk deletion in PagedAllocator).
Memory pools can be grouped in hierarchical tree structures, which is suitable for special programming structures like loops and recursions.
Fixed-size block memory pools do not need to store allocation metadata for each allocation, describing characteristics like the size of the allocated block. Particularly for small allocations, this provides substantial space savings.
Allows deterministic behavior on real-time systems avoiding the out of memory errors.

For myself the main attraction is the O(1) constant time operation, and the lack of fragmentation, housekeeping data and padding. Essentially you can really hammer a memory pool in recursive functions etc and not expect any glitches, whereas the same is not guaranteed with malloc.
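
For readers unfamiliar with the technique, the sketch below shows one common way a fixed-size block pool gets O(1) allocation and release: free blocks are threaded into a singly linked list through their own storage, so no per-allocation header is needed. `FixedPoolSketch` and its members are hypothetical and deliberately simplified; this is not PagedAllocator's implementation (it is not thread safe and never returns pages early).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size block pool, for illustration only.
template <std::size_t BlockSize, std::size_t BlocksPerPage = 64>
class FixedPoolSketch {
	// A free block stores the link to the next free block in its own bytes.
	union Block {
		Block *next_free;
		alignas(std::max_align_t) unsigned char bytes[BlockSize];
	};

	std::vector<Block *> pages; // owned pages, released in the destructor
	Block *free_head = nullptr; // top of the free list
	std::size_t live = 0;       // blocks currently handed out

	// Grab one page from the system allocator and push its blocks on the list.
	void add_page() {
		Block *page = new Block[BlocksPerPage];
		pages.push_back(page);
		for (std::size_t i = 0; i < BlocksPerPage; i++) {
			page[i].next_free = free_head;
			free_head = &page[i];
		}
	}

public:
	FixedPoolSketch() = default;
	FixedPoolSketch(const FixedPoolSketch &) = delete;
	FixedPoolSketch &operator=(const FixedPoolSketch &) = delete;

	~FixedPoolSketch() {
		for (Block *page : pages) {
			delete[] page;
		}
	}

	// Pop the head of the free list: constant time unless a new page is needed.
	void *alloc() {
		if (!free_head) {
			add_page();
		}
		Block *b = free_head;
		free_head = b->next_free;
		live++;
		return b->bytes;
	}

	// Push the block back on the free list: always constant time.
	void release(void *p) {
		Block *b = static_cast<Block *>(p);
		b->next_free = free_head;
		free_head = b;
		live--;
	}

	std::size_t live_count() const { return live; }
	std::size_t page_count() const { return pages.size(); }
};
```

Because every block in a page has the same size, releasing and reallocating in any order cannot fragment the page, which is the property the paragraph above relies on.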

Also, although I'm leading with this on 4.x, I'm hoping to do the same on 3.x, where fragmentation can lead to problems, particularly on 32-bit OSes, e.g. #61835

Juan has also pointed out that using PagedAllocator means allocations of the same type will often be closer together in memory, which may have cache advantages. But this kind of thing is quite hard to quantify, because it depends on historical allocations.

lawnjelly · 2022-07-04T12:04:29Z

A quick, really naive benchmark from C++ suggests this PR is a little faster, taking about 3/4 of the time of the old version:

void BenchmarkVariant::run() {
	const int NUM_ITERATIONS = 10000000;

	// Accumulator that is printed after each test, so the optimizer cannot
	// discard the loops as dead code.
	AABB val;
	uint64_t before;

	// Test A: construct a Variant holding an AABB (pooled allocation with this PR).
	before = OS::get_singleton()->get_ticks_msec();
	for (int n = 0; n < NUM_ITERATIONS; n++) {
		Variant v2 = Variant(AABB());
		AABB aabb = v2;
		val.merge_with(aabb);
	}
	uint64_t takenA = OS::get_singleton()->get_ticks_msec() - before;
	print_line(val);
	print_line("took A " + itos(takenA) + " ms.");

	// Test B: allocate and free an AABB directly with memnew / memdelete,
	// approximating what the old Variant did for each AABB.
	before = OS::get_singleton()->get_ticks_msec();
	for (int n = 0; n < NUM_ITERATIONS; n++) {
		AABB *test = memnew(AABB());
		AABB aabb = *test;
		val.merge_with(aabb);
		memdelete(test);
	}
	uint64_t takenB = OS::get_singleton()->get_ticks_msec() - before;
	print_line(val);
	print_line("took B " + itos(takenB) + " ms.");

	// Test C: repeat of test A, to check for effects such as a hot cache.
	before = OS::get_singleton()->get_ticks_msec();
	for (int n = 0; n < NUM_ITERATIONS; n++) {
		Variant v2 = Variant(AABB());
		AABB aabb = v2;
		val.merge_with(aabb);
	}
	uint64_t takenC = OS::get_singleton()->get_ticks_msec() - before;
	print_line(val);
	print_line("took C " + itos(takenC) + " ms.");
}

Result:
[P: (0, 0, 0), S: (0, 0, 0)]
took A 303 ms.
[P: (0, 0, 0), S: (0, 0, 0)]
took B 408 ms.
[P: (0, 0, 0), S: (0, 0, 0)]
took C 303 ms.

Test B is there just to save me compiling everything twice: the old Variant news and deletes an AABB each time, so B does pretty much the same thing (maybe a little less, in fact, than the actual old Variant). Test C is just a confirmation repeat of A, to check for things like hot cache effects.

So on my system (Linux Mint) the PagedAllocator appears to be about 4/3 as fast as malloc (408 ms vs 303 ms, roughly 1.35x) in this situation of hammering the allocator. I'm assuming @Calinou's test before was just timing the startup, in which case it is unlikely to make a lot of difference. I'm also assuming the optimizer isn't doing something to throw off the timings, which is always a possibility with these things (the print statements are there to prevent the loops being optimized out).

But as I say the primary reason (in my mind) is for the O(1) constant time allocation / deallocation, any increase in speed is a nice bonus.

akien-mga

Looks good to me. Also approved by @reduz in PR meeting.

akien-mga · 2022-08-02T13:54:21Z

Thanks!

lawnjelly requested review from a team as code owners May 23, 2022 11:54

lawnjelly force-pushed the variant_bucket_pools branch from e3f06ec to 930228a Compare May 23, 2022 12:09

lawnjelly added enhancement topic:core performance labels May 23, 2022

lawnjelly added this to the 4.0 milestone May 23, 2022

lawnjelly added the for pr meeting label May 24, 2022

lawnjelly mentioned this pull request Jun 9, 2022

Occasional 32bit editor crashes in Windows: malloc returns null. Memory leak? #61835

Open

akien-mga reviewed Jun 28, 2022

core/templates/paged_allocator.h (resolved)

lawnjelly removed for pr meeting labels Jun 28, 2022

lawnjelly force-pushed the variant_bucket_pools branch from 930228a to 11a5f4d Compare June 28, 2022 15:15

lawnjelly requested a review from a team as a code owner June 28, 2022 15:15

lawnjelly force-pushed the variant_bucket_pools branch 2 times, most recently from 7a312bb to ab8c72b Compare June 28, 2022 15:26

lawnjelly requested a review from a team as a code owner June 28, 2022 15:26

lawnjelly force-pushed the variant_bucket_pools branch from ab8c72b to f76c6e2 Compare July 4, 2022 10:30

Variant memory pools

b221eab

Memory pools via PagedAllocator for Transform2D, Transform3D, Basis and AABB.

lawnjelly force-pushed the variant_bucket_pools branch from f76c6e2 to b221eab Compare July 4, 2022 11:02

akien-mga approved these changes Aug 2, 2022

akien-mga merged commit 33258d8 into godotengine:master Aug 2, 2022

akien-mga mentioned this pull request Aug 2, 2022

Memory pools for variant, physics pairing etc #58942

Closed

akien-mga mentioned this pull request Aug 2, 2022

Use memory pools for Variant extended types #58633

Closed

lawnjelly deleted the variant_bucket_pools branch August 2, 2022 15:33
