Neat 0.5.2: Finally Fast-ish

@FeepingCreature FeepingCreature released this 05 Jan 01:03
· 96 commits to master since this release

I finally figured out why Neat was unexpectedly a lot slower than D. It turns out that passing 16 bytes (a D array) on AMD64 is very fast, since it's just passed in registers; passing 24 bytes (a Neat array), however, requires an alloca because the SysV ABI demands it be passed as a pointer. In string-heavy code, this forces a lot of allocas, and programs end up spending most of their time shuffling data around on the stack.

Luckily, while we need to pass structs conforming to the SysV ABI, arrays aren't actually structs and we can decide how we pass them. So structs, tuples and sumtypes are now passed as separate parameters. This alone speeds up the benchmark by roughly 2x and, hopefully, brings Neat within striking distance of D.
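
To make the difference concrete, here is a rough C sketch of the two calling styles. The type and function names are hypothetical, and the third field of the 24-byte struct is an assumption for illustration only (the notes above only say a Neat array is 24 bytes). It is not Neat's actual layout or codegen, just the shape of the problem under the SysV AMD64 rules.

```c
#include <stddef.h>

/* Hypothetical 16-byte slice, similar in shape to a D array: length + pointer. */
typedef struct {
    size_t length;
    const char *ptr;
} Slice16;

/* Hypothetical 24-byte array. The third field is an assumed extra pointer,
   added only to reach 24 bytes on a 64-bit target. */
typedef struct {
    size_t length;
    const char *ptr;
    void *base;
} Array24;

/* 16 bytes: fits in two eightbytes, so SysV AMD64 can pass it in two
   integer registers. */
size_t slice_length(Slice16 s) {
    return s.length;
}

/* 24 bytes: larger than two eightbytes, so SysV AMD64 passes it in memory;
   the caller has to materialize a copy on the stack for every call. */
size_t array_length_by_value(Array24 a) {
    return a.length;
}

/* The same three fields as separate scalar parameters: each one can travel
   in its own register, so no stack copy is needed. */
size_t array_length_split(size_t length, const char *ptr, void *base) {
    (void)ptr;   /* unused in this sketch */
    (void)base;  /* unused in this sketch */
    return length;
}
```

Compiling this with optimizations and comparing the call sites of `array_length_by_value` and `array_length_split` in the assembly output should show the stack copy in the first case and plain register moves in the second.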