
Brilirs #189

Merged
merged 5 commits into from
Apr 7, 2022

Conversation

Pat-Lafon
Contributor

Most of this PR is cleanup via:

  • 3539fd3: Mostly comments and stylistic changes.
  • 51779f9: Reworks build.rs to improve its control flow and to avoid re-creating the tab-completions file when it already exists.
  • 7c3822c: Addresses #![allow(clippy::too_many_arguments)] by refactoring the interpreter to use a State struct. This is similar to the State struct used in brili, except that there is only one instance, which is passed by &mut.
  • e03f0be: Previously, indexing was done by keeping track of u32 values and then casting to usize; it now uses usize values directly.
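
The State-struct refactor in 7c3822c can be illustrated with a minimal sketch. The field and function names below are assumptions for illustration, not brilirs's actual definitions:

```rust
// Hypothetical sketch of the clippy::too_many_arguments refactor:
// instead of threading many parameters through every interpreter
// helper, bundle them into one `State` and pass it by `&mut`.
// Names here are illustrative, not brilirs's real code.

struct State {
    env: Vec<i64>,          // variable values, indexed by slot
    instruction_count: u64, // e.g. a profiling counter
}

// Before: fn write_var(env: &mut Vec<i64>, count: &mut u64, ...).
// After: a single mutable borrow of the whole interpreter state.
fn write_var(state: &mut State, dest: usize, value: i64) {
    state.env[dest] = value;
    state.instruction_count += 1;
}

fn main() {
    let mut state = State { env: vec![0; 4], instruction_count: 0 };
    write_var(&mut state, 2, 42);
    assert_eq!(state.env[2], 42);
    assert_eq!(state.instruction_count, 1);
}
```

Since there is only one `State` instance, the `&mut` borrow threads through the call tree without needing clones or interior mutability.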

I've been sporadically adding more comments to the code base as I've forgotten what my past self meant by some of it... see https://xkcd.com/1421/

2dfee99 tackles the second half of optimizing function calls in brilirs, started in #188. Previously, functions were stored in a HashMap<String, BBFunction> in BBProgram, which meant that on each function call the string name had to be hashed and looked up in the map. Since all of the functions are known statically, we can instead create a Vec<BBFunction> to index into, similar to the optimization done for the variable names of instructions (originally introduced in the blog post https://www.cs.cornell.edu/courses/cs6120/2019fa/blog/faster-interpreter/).
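
The idea can be sketched as follows; the types and helper below are assumptions for illustration, not brilirs's actual implementation:

```rust
use std::collections::HashMap;

// Illustrative sketch (names are assumptions, not brilirs's real types):
// resolve call targets to Vec indices once, up front, so the hot
// interpreter loop indexes a Vec<BBFunction> instead of hashing a
// String on every dynamic function call.
struct BBFunction {
    name: String,
}

// One-time pass at load time: function name -> Vec index.
fn build_index(funcs: &[BBFunction]) -> HashMap<&str, usize> {
    funcs
        .iter()
        .enumerate()
        .map(|(i, f)| (f.name.as_str(), i))
        .collect()
}

fn main() {
    let funcs = vec![
        BBFunction { name: "main".into() },
        BBFunction { name: "helper".into() },
    ];
    let index_of = build_index(&funcs);

    // A `call helper` instruction is rewritten to hold this index,
    // so every dynamic call is a plain Vec access with no hashing.
    let target = index_of["helper"];
    assert_eq!(funcs[target].name, "helper");
}
```

The hashing cost is paid once per call site during setup instead of once per dynamic call.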

The performance on function_call.bril:

| brili | Old brilirs | Proposed brilirs |
| --- | --- | --- |
| 7.1 s ± 0.1 s | 328 ms ± 2 ms | 251 ms ± 2 ms |

In comparison to #188, there is a much bigger reduction in run time, which was surprising to me. It could be that hashing and indexing into a map are more expensive than I had thought. It could also be that this benchmark is too simple for there to be a significant benefit from #188. Since it just calls the same function repeatedly, each allocation is the same size (same number of variables on the stack), which lets the allocator be fast, whereas the performance of #188 shouldn't be as affected by changes in stack size.

Either way, good benchmarking is non-trivial and all of these performance numbers should be taken with a grain of salt.

@sampsyo
Owner

sampsyo commented Apr 7, 2022

Super nice; this all looks great! Since the win is pretty big for the "function calls only" benchmark, I guess it might be educational to know what the performance impact is on "normal" benchmarks as well… it might be measurable, despite the relative rarity of function calls in other benchmarks.

In any case, this is a great way to extend the index-based interpretation philosophy beyond just variables! 👏

@sampsyo sampsyo merged commit fa46e23 into sampsyo:main Apr 7, 2022
@Pat-Lafon
Contributor Author

So I'm only considering benchmarks with a dynamic instruction count greater than ~750,000, which is currently about what you need to push the lower end of brilirs's run time above 5 ms. One could presumably use a different benchmarking tool and a less noisy setup to be more precise.

| benchmark | brili | Old brilirs | Current brilirs |
| --- | --- | --- | --- |
| ackermann.bril | 252 ms ± 2 ms | 13 ms ± 1 ms | 11 ms ± 1 ms |
| eight-queens.bril | 226 ms ± 2 ms | 8 ms ± 1 ms | 7 ms ± 1 ms |
| mat-mul.bril | 404 ms ± 3 ms | 14 ms ± 1 ms | 14 ms ± 1 ms |
| function-call.bril | 7.1 s ± 0.1 s | 330 ms ± 2 ms | 252 ms ± 2 ms |
| alloc_large.bril | 1.2 s ± 0.1 s | 694 ms ± 3 ms | 692 ms ± 2 ms |

I've rounded everything up to the relevant significant figure. eight-queens and mat-mul both see no benefit, which makes sense since they use looping instead of recursion.

I've thrown in alloc_large.bril, which is actually a test case but has some very surprising performance behavior. It is unusual in that it is a small program that doesn't use recursion and creates a lot of memory allocations. brilirs's performance makes some sense, in that there hasn't been any attempt at optimizing how it manages the heap (it's basically a copy-paste of brili's implementation). However, I think it's important to note that function-call executes about 10× as many dynamic instructions, so it's both that brilirs does quite poorly and that brili does surprisingly well with this kind of program.

Basically, you are what you optimize for.

@Pat-Lafon Pat-Lafon deleted the brilirs branch April 7, 2022 16:52
@sampsyo
Owner

sampsyo commented Apr 7, 2022

That all makes a lot of sense! Thanks for adding the extra color here, though; that's definitely interesting to see.
