Replace the CafeOS default heap with a custom one #221

Merged
merged 3 commits into from Jun 30, 2022

GaryOderNichts
Contributor

Summary

wut currently uses wutmalloc as a wrapper around the default CafeOS heap functions (MEMAllocFromDefaultHeap / MEMFreeToDefaultHeap).
This default heap is very slow when handling large numbers of allocations, which causes noticeable slowdowns. Many retail games use a fast custom heap to avoid this issue.
This draft uses the malloc implementation in newlib instead and replaces the default heap functions with wrappers around the newlib functions (see wutdefaultheap).
It is currently marked as a draft since it is a fairly major change and may cause issues that I haven't thought of.

RPX files

RPX files now implement and export a __preinit_user function, which is called before any allocation is made so that the MEMAllocFromDefaultHeap / MEMFreeToDefaultHeap functions can be replaced (see memdefaultheap.h).
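
The hook mechanism described above can be pictured as a set of global function pointers that get reassigned before the first allocation happens. The following is a host-side sketch only: `default_heap_alloc`, `default_heap_free`, and `install_custom_heap` are hypothetical stand-ins rather than the real CafeOS symbols from memdefaultheap.h, and the pre-init hook is simulated by an ordinary function call.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-ins for the OS-provided implementations (hypothetical names). */
static void *slow_alloc(uint32_t size) { return malloc(size); }
static void slow_free(void *ptr) { free(ptr); }

/* Stand-ins for the replaceable default heap hooks (hypothetical names). */
static void *(*default_heap_alloc)(uint32_t size) = slow_alloc;
static void (*default_heap_free)(void *ptr) = slow_free;

/* Counts how many requests were served by the replacement allocator. */
static unsigned custom_alloc_calls = 0;

static void *custom_alloc(uint32_t size)
{
    custom_alloc_calls++;
    return malloc(size);
}

static void custom_free(void *ptr) { free(ptr); }

/* Simulates what a pre-init hook would do: swap the hooks before use. */
void install_custom_heap(void)
{
    default_heap_alloc = custom_alloc;
    default_heap_free = custom_free;
}

/* Returns the number of allocations that went through the replacement. */
unsigned hook_swap_demo(void)
{
    install_custom_heap();
    void *p = default_heap_alloc(64); /* callers still use the same hook */
    assert(p != NULL);
    default_heap_free(p);
    return custom_alloc_calls;
}
```

Because callers only ever go through the pointers, code that was written against the default heap picks up the replacement without being changed.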

In the preinit call, wut allocates all of the available space in the MEM2 heap for sbrk.
It then initializes wutdefaultheap, which replaces MEMAllocFromDefaultHeap / MEMAllocFromDefaultHeapEx / MEMFreeToDefaultHeap with wrappers around the newlib functions. As a result, CafeOS functions allocate from the newlib heap instead.
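
To illustrate the idea of handing one pre-reserved region to sbrk, here is a minimal host-side sketch. It is not the wut implementation: `arena`, `ARENA_SIZE`, and `arena_sbrk` are hypothetical names, and a small static array stands in for the block that would be taken from the MEM2 heap.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 4096u /* tiny stand-in for "all available MEM2 space" */

static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used = 0;

/* sbrk-style break adjustment over the fixed arena.
 * Returns the previous break on success, (void *)-1 on failure. */
void *arena_sbrk(ptrdiff_t incr)
{
    if (incr < 0) {
        if ((size_t)(-incr) > arena_used)
            return (void *)-1; /* cannot shrink below the arena start */
    } else if ((size_t)incr > ARENA_SIZE - arena_used) {
        return (void *)-1;     /* arena exhausted */
    }

    void *previous_break = arena + arena_used;
    arena_used = (size_t)((ptrdiff_t)arena_used + incr);
    return previous_break;
}

/* Exercises grow, exhaustion, and shrink. Returns 0 if the arena is empty again. */
int arena_sbrk_demo(void)
{
    void *first = arena_sbrk(1024);
    void *second = arena_sbrk(1024);
    void *too_big = arena_sbrk(ARENA_SIZE); /* only 2048 bytes remain */
    void *before_shrink = arena_sbrk(-2048);

    assert(first == (void *)arena);
    assert(second == (void *)(arena + 1024));
    assert(too_big == (void *)-1);
    assert(before_shrink == (void *)(arena + 2048));
    return arena_used == 0 ? 0 : 1;
}
```

A malloc implementation layered on such an sbrk never has to go back to the underlying OS heap, since every request is carved out of the region reserved up front.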

Overriding this behavior

The user can override this behavior by implementing their own __preinit_user function.
This skips the sbrk and wutdefaultheap initialization. __init_wut_malloc can then be called, which links in the old wrapper around the default heap.
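
A rough sketch of what such an override might look like follows. Everything that would normally come from wut is stubbed so that the snippet builds on a host: the `MEMHeapHandle` typedef and the body of `__init_wut_malloc` are stand-ins, and the parameter list of `__preinit_user` as well as the `void` signature of `__init_wut_malloc` are assumptions that should be checked against the wut headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in type; in a wut project this comes from the wut headers. */
typedef void *MEMHeapHandle;

/* Stub that records the call; wut provides the real function. */
static bool legacy_wrapper_selected = false;
void __init_wut_malloc(void) { legacy_wrapper_selected = true; }

/* User-supplied override. The parameter list is an assumption.
 * The heaps are left untouched and the old default heap wrapper is requested. */
void __preinit_user(MEMHeapHandle *mem1,
                    MEMHeapHandle *foreground,
                    MEMHeapHandle *mem2)
{
    (void)mem1;
    (void)foreground;
    (void)mem2;
    __init_wut_malloc();
}

/* Simulates the startup code invoking the hook. */
bool preinit_override_demo(void)
{
    MEMHeapHandle mem1 = NULL, foreground = NULL, mem2 = NULL;
    __preinit_user(&mem1, &foreground, &mem2);
    return legacy_wrapper_selected;
}
```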
See this code for an example.

RPL files

Since RPL files don't support __preinit_user (and shouldn't interfere with the default heap), they simply use wutmalloc, so their allocations come from the heap that the RPX has set up.

Speed comparisons

To test the speed, I wrote a simple tool that performs a large number of heap allocations of various sizes, frees them, and displays how long each step took.
This tool is probably not ideal for accurate timing, but it should be enough to show the performance increase.
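
The tool itself is not shown in this thread. As an illustration of the kind of measurement described, a minimal sketch could look like the one below. The names `bench_result`, `bench_malloc_free`, and `BENCH_COUNT`, the size pattern, and the use of the standard `clock()` timer are all assumptions, not details of the original tool.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#define BENCH_COUNT 10000

typedef struct {
    double alloc_us; /* time spent in the allocation loop, in microseconds */
    double free_us;  /* time spent in the free loop, in microseconds */
} bench_result;

/* Allocates BENCH_COUNT blocks of varying size, then frees them all.
 * Both fields are -1.0 if an allocation failed. */
bench_result bench_malloc_free(void)
{
    static void *blocks[BENCH_COUNT];
    bench_result result = { -1.0, -1.0 };
    size_t allocated = 0;

    clock_t t0 = clock();
    for (; allocated < BENCH_COUNT; allocated++) {
        /* Sizes cycle from 16 bytes up to 2032 bytes. */
        blocks[allocated] = malloc(16 + (allocated % 64) * 32);
        if (blocks[allocated] == NULL)
            break;
    }
    clock_t t1 = clock();
    for (size_t i = 0; i < allocated; i++)
        free(blocks[i]);
    clock_t t2 = clock();

    if (allocated == BENCH_COUNT) {
        result.alloc_us = (double)(t1 - t0) * 1e6 / CLOCKS_PER_SEC;
        result.free_us = (double)(t2 - t1) * 1e6 / CLOCKS_PER_SEC;
    }
    return result;
}
```

Timing the allocation and free loops separately mirrors the two columns reported in the results.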

Using the default CafeOS heap:

| Function | Allocations | Free |
| --- | --- | --- |
| malloc | 51321 µs | 25250 µs |
| memalign | 44195 µs | 29695 µs |

Total time: 150461 µs

Using the custom newlib heap:

| Function | Allocations | Free |
| --- | --- | --- |
| malloc | 6265 µs | 3410 µs |
| memalign | 15019 µs | 3676 µs |

Total time: 28370 µs

This is roughly a 5x total time improvement.

GaryOderNichts marked this pull request as ready for review on June 4, 2022, 20:26
@GaryOderNichts
Contributor Author

This has been tested in several applications and seems to work fine so far.
I'm marking this as ready and will wait for a code review now.

@GaryOderNichts
Contributor Author

Just pushed a commit that uses a spin lock instead of a mutex, which improves speed even further.
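
For readers unfamiliar with the technique, the sketch below shows a spin lock guarding an allocator on a host system. It is not the code from the commit: a C11 `atomic_flag` stands in for the OS spin lock, the lock is wrapped around the host `malloc` / `free` rather than hooked into newlib, and all function names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Stand-in for the OS spin lock protecting the heap. */
static atomic_flag heap_lock = ATOMIC_FLAG_INIT;

/* Busy-waits until the flag could be set, i.e. the lock was free. */
static void heap_lock_acquire(void)
{
    while (atomic_flag_test_and_set_explicit(&heap_lock, memory_order_acquire)) {
        /* spin */
    }
}

static void heap_lock_release(void)
{
    atomic_flag_clear_explicit(&heap_lock, memory_order_release);
}

void *locked_malloc(size_t size)
{
    heap_lock_acquire();
    void *p = malloc(size);
    heap_lock_release();
    return p;
}

void locked_free(void *p)
{
    heap_lock_acquire();
    free(p);
    heap_lock_release();
}

/* Reports whether the lock is currently free (restores the state if it was). */
bool heap_lock_is_free(void)
{
    bool was_held = atomic_flag_test_and_set(&heap_lock);
    if (!was_held)
        atomic_flag_clear(&heap_lock);
    return !was_held;
}
```

A spin lock avoids the overhead of putting a thread to sleep, which tends to pay off when the critical section is as short as a single allocator call. The trade-off is wasted CPU time under heavy contention.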

Using the newlib heap (with an OSUninterruptibleSpinLock):

| Function | Allocations | Free |
| --- | --- | --- |
| malloc | 4066 µs | 1994 µs |
| memalign | 8536 µs | 2296 µs |

Total time: 16892 µs

This is now almost a 9x total time improvement over the CafeOS heap.
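
The improvement factors quoted in this thread follow directly from the reported total times. A small helper (the name `speedup` is just for illustration) makes the arithmetic explicit:

```c
#include <assert.h>
#include <stdio.h>

/* Ratio of baseline time to improved time; larger means faster. */
double speedup(double baseline_us, double improved_us)
{
    assert(improved_us > 0.0);
    return baseline_us / improved_us;
}

/* Prints the factors for the totals reported in this thread. */
void print_speedups(void)
{
    const double cafeos_us = 150461.0;      /* default CafeOS heap */
    const double newlib_mutex_us = 28370.0; /* newlib heap, mutex */
    const double newlib_spin_us = 16892.0;  /* newlib heap, spin lock */

    printf("newlib (mutex) vs CafeOS:     %.2fx\n",
           speedup(cafeos_us, newlib_mutex_us)); /* prints 5.30x */
    printf("newlib (spin lock) vs CafeOS: %.2fx\n",
           speedup(cafeos_us, newlib_spin_us));  /* prints 8.91x */
}
```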

GaryOderNichts changed the title from "WIP: Replace the CafeOS default heap with a custom one" to "Replace the CafeOS default heap with a custom one" on Jun 11, 2022
fincs merged commit f1b5da9 into devkitPro:master on Jun 30, 2022
NessieHax pushed a commit to NessieHax/wut that referenced this pull request Sep 23, 2022
NessieHax pushed a commit to NessieHax/wut that referenced this pull request Feb 1, 2023

2 participants