
Developer tasks (TODO)

mrbrdo edited this page Feb 15, 2013 · 10 revisions

You can ask me for more details.

“goto” method calling

Compile all Ruby methods into one C function, with a label at the start of each Ruby method. When calling a Ruby method that is compiled into the same binary, we can “goto” the label instead of calling a C function. The benefit comes from the fact that a C function call needs to set up a stack frame for local variables and parameters (the mrb state and self, which are redundant information). Depending on the number of parameters and local variables, this goto approach is about 100% faster. OP_RETURN needs to be changed accordingly.
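A minimal, self-contained sketch of the idea (the types and method bodies are stand-ins, not mruby API): two “Ruby methods” share one C function, and an internal call is a goto plus a simulated return address instead of a C call, so no new C stack frame is set up:

```c
/* mrb_value is a stand-in typedef here; the real one lives in mruby.h */
typedef int mrb_value;

static mrb_value run_methods(int entry, mrb_value arg)
{
    mrb_value result = 0;
    int ret_to = 0;              /* simulated return address */

    switch (entry) {
    case 0: goto meth_double;
    case 1: goto meth_double_plus_one;
    }

meth_double:                     /* def double(x) = x * 2 */
    result = arg * 2;
    if (ret_to == 1) goto after_inner_call;
    return result;

meth_double_plus_one:            /* def double_plus_one(x) = double(x) + 1 */
    ret_to = 1;
    goto meth_double;            /* "call" double without a C call */
after_inner_call:
    result = result + 1;
    return result;
}
```

The real version would keep locals in an explicit register/stack structure shared by all methods; the simulated return address is what OP_RETURN would have to consult.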

break from block

Breaking from a block may be faster using setjmp in mrbb_proc_call, but this depends on how fast or slow setjmp is.
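A minimal sketch of the setjmp approach, with stand-in functions (mrbb_proc_call is named in the note above, but nothing here is the real implementation): the caller setjmps before running the block, and a Ruby-level break longjmps back out:

```c
#include <setjmp.h>

static jmp_buf break_jmp;
static int break_value;

/* block body: "break 42" when x is negative, else compute */
static int block_body(int x)
{
    if (x < 0) {
        break_value = 42;
        longjmp(break_jmp, 1);   /* Ruby-level `break` */
    }
    return x * 2;
}

/* caller: runs the block over the inputs, catching a break */
static int call_with_block(const int *xs, int n)
{
    int sum = 0;
    if (setjmp(break_jmp))
        return break_value;      /* break unwound to here */
    for (int i = 0; i < n; i++)
        sum += block_body(xs[i]);
    return sum;
}
```

The cost to measure is the unconditional setjmp on every proc call, paid even when no break happens.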

symbols and strings (irep pool)

The best example is method calling. When calling a method (OP_SEND), the name of the method is provided in the opcode as a symbol. Right now, every time the method is called, mrb_intern is used to translate the name from a C string to an mruby symbol (basically a number). It would probably provide a slight performance increase to translate all these C strings into symbols at the entry point, and then use the symbols directly. One example setup would be similar to what the interpreter does: have a symbol table, then instead of doing mrb_intern(mrb, “str”) we would index into it (e.g. symbols[idx]). The table can be populated either at the entry point (shouldn’t take very long even with a lot of symbols), or initialized to 0 and then checked on every use to see if it still needs to be set. The first solution seems better.

Alternatively, if it were possible to predict/assign what the symbol (number) will be for each string, we could do that. However, this may be impossible when multiple .so files are loaded one after the other (the numbers would collide or be impossible to predict).
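The table approach above can be sketched as follows. fake_intern stands in for mrb_intern, and the enum names are made up; the point is that each OP_SEND site reads symbols[SYM_x] instead of interning a C string:

```c
#include <string.h>

typedef int sym_t;               /* stand-in for mrb_sym */

/* toy intern table standing in for the interpreter's symbol pool */
static const char *interned[16];
static int interned_count;

static sym_t fake_intern(const char *name)
{
    for (int i = 0; i < interned_count; i++)
        if (strcmp(interned[i], name) == 0)
            return i;            /* already interned: same number */
    interned[interned_count] = name;
    return interned_count++;
}

/* per-binary symbol table, filled once at the entry point */
enum { SYM_each, SYM_call, SYM_COUNT };
static sym_t symbols[SYM_COUNT];

static void init_symbols(void)
{
    symbols[SYM_each] = fake_intern("each");
    symbols[SYM_call] = fake_intern("call");
}
```

After init_symbols runs once, every call site does a plain array load rather than a string hash/compare.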

method cache

First, maybe look at other Ruby implementations (like Rubinius) to see if they found any optimizations for method dispatch.

Method calling is quite slow in Ruby and mruby. It might make sense to have a method cache for each class, instead of searching for the method through the class hierarchy each time. Preferably this should be toggleable on/off, because we are trading memory for performance.

Ideally, we would determine the index into the cache for each method name ahead of time (let’s ignore dynamically generated method names; the majority of methods are created using OP_METHOD with a static name). This may be hard or impossible because at compile time we don’t know which class the method will be defined on or called on. The other option is to use a hash table (but keyed by symbols, which are much faster to compare than C strings).

The cache should be either invalidated or corrected on OP_METHOD, when define_method is called, and when the class hierarchy changes (e.g. adding metaclasses). Somewhat hard but not impossible. The easiest way to start would be to just invalidate.

The question is what happens with methods that are called only once. The cache would probably cause a performance decrease for these, because we need to maintain it.

Some other options are to cache only the last few called methods (and the objects they were called on), or to cache only methods that are frequently used, etc.
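The “last few called methods” idea resembles a per-call-site inline cache. A minimal sketch with stand-in types (none of these names are mruby API): each call site remembers the last receiver class and the method it resolved to, and only does the slow hierarchy walk when the class changes:

```c
typedef int class_id;
typedef int (*method_fn)(int self);

struct call_site_cache {
    class_id cached_class;       /* 0 = empty */
    method_fn cached_fn;
};

static int slow_lookups;         /* counts full lookups, for illustration */

static int meth_a(int self) { return self + 1; }
static int meth_b(int self) { return self + 2; }

/* stand-in for walking the class hierarchy */
static method_fn slow_lookup(class_id cls)
{
    slow_lookups++;
    return cls == 1 ? meth_a : meth_b;
}

static int send_cached(struct call_site_cache *ic, class_id cls, int self)
{
    if (ic->cached_class != cls) {       /* cache miss: full lookup */
        ic->cached_fn = slow_lookup(cls);
        ic->cached_class = cls;
    }
    return ic->cached_fn(self);
}
```

Invalidation here is simple: resetting cached_class to 0 forces a re-lookup, which is what method (re)definition or hierarchy changes would have to do.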

We can take inspiration from existing things like the PE IAT (Import Address Table), how processors do branch prediction, etc. Modifications to these concepts are necessary because of how dynamic Ruby is (the number of methods a class has can change, methods can be removed, etc.). If the IAT concept were applicable, we could store the address of method_missing at idx 0, and the methods after that. Maybe we can use symbols here to do a faster translation from [class, method_name_symbol] -> method_address. The method_name_symbol can (more or less) be known at compile time, but the class cannot.

Also, if this cache is organized per class (one cache “table” for each class), we can either cache only the methods defined directly on the class, or also the methods of all its superclasses (faster, but the cache must be invalidated when the class hierarchy changes).

Brainstorm:

  • have a method cache hash: symbol => ptr

  • have a method cache array [[symbol, ptr], …] and use bisection; but then every time we add a method we have to re-sort, which is probably no good

  • have a base method cache of all methods

  • copy it for each class

  • call method: cache[symbol2idx[symbol]]

  • the length can differ per class (only up to the biggest symbol2idx of any method of this class)

  • when calling, check: if idx > mycachelen then method_missing

  • fill unimplemented methods in the cache with method_missing

  • a normal Rails project has ~2500 classes and ~2500 unique method names, so 2500 * 2500 * 8 B = 50 MB if each class had a full method table (so: not good)
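The brainstormed layout can be sketched like this (all names are stand-ins): every method name gets a global index via symbol2idx; each class keeps a function table only as long as its largest used index, with unimplemented slots pre-filled with method_missing and out-of-range indices falling back to it:

```c
typedef int (*meth_ptr)(int self);

static int method_missing(int self) { (void)self; return -1; }
static int meth_len(int self)       { return self * 10; }

/* per-class cache: table[idx] -> function, truncated at `len` */
struct klass_cache {
    meth_ptr *table;
    int len;
};

static int dispatch(const struct klass_cache *k, int idx, int self)
{
    if (idx >= k->len)
        return method_missing(self);  /* beyond this class's table */
    return k->table[idx](self);       /* holes already hold method_missing */
}
```

This keeps the hot path at one bounds check plus one indirect call; the 50 MB worst case above is why tables must be truncated per class rather than full-width.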

More concrete stuff

Range#each

First, why is it slower than mruby in interpreted mode? Investigate Integer#succ (maybe it would be faster if we actually defined it in Ruby, due to the opcodes generated; maybe inline it somehow). Also, <=> is used where < would suffice, and < is much, much faster for Integers, Strings and other base classes, because <, > and == are heavily optimized for them (each has its own opcode) while <=> is not (it is defined as a C function). These improvements might be possible to merge into the mruby interpreter too.

Calling mruby interpreted code from binary compiled code

Be especially careful with blocks/lambdas and breaking from them. Maybe we could dynamically generate MKOP(OP_JMP…). Also calling compiled code from interpreted code, especially lambdas again, but this should not be as problematic.

Technical

  • make a “stealstuff” script that copies everything necessary from mruby (opcodes, the TT_STRING etc. list, the opcode array/enum). It should be very verbose and raise an error if anything looks weird. It should be run when upgrading mruby

  • check that it is correct to assign target_class in EPUSH inside an ensure block. Will it always be the correct target class like this?