Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise builtin function dispatch #2529

Merged
merged 2 commits into from Oct 21, 2016

Conversation

dpgeorge
Copy link
Member

Builtin functions with a fixed number of arguments (0, 1, 2 or 3) are quite common. Before this patch the wrapper for such a function cost 3 machine words (uPy type, num args/kw, pointer to C function). After this patch it only takes 2 (uPy type, pointer to C function), which can reduce the code size by quite a bit (and pays off even more, the more functions are added). It also makes function dispatch slightly more efficient in CPU usage, and furthermore reduces stack usage for these cases. On x86 and Thumb archs the dispatch functions are now tail-call optimised by the compiler (but not on xtensa).

Stack usage by these builtin functions is decreased from 48 bytes to 0 on Thumb2 archs (due to tail-call opt), and is decreased from 48 to 16 bytes on xtensa.

Change in code size, in bytes, is:

   76    bare-arm/build/firmware.elf
  300    minimal/build/firmware.elf
-2384    unix/micropython (x86-64)
 -904    stmhal/build-PYBV10/firmware.elf
 -604    esp8266/build/firmware.elf
 -344    cc3200/build/WIPY/release/application.axf

As you can see the bare-arm and minimal builds increase, but all other "realistic" ports decrease by a lot. This is because they have a decent number of builtin functions which benefit from this optimisation. And the optimisation will pay off even more when more functions are added (as long as they have fixed number of args from 0-3).

To me this patch is definitely worth it, but I wanted to make a PR because it increases code size of the baseline ports.

@pfalcon
Copy link
Contributor

pfalcon commented Oct 18, 2016

This is of course great. Let's make sure we're aware of issues though. First, this makes bindings development harder. Then, I wondered how you solve problem of checking if an object is a function, and see you didn't much. That would lead to problems like type(max) is type(ord) is False. Given that even in CPython type(min) is type(lambda x:x) is False, it's not big deal either. So, I'm all for merging this ASAP.

@dpgeorge
Copy link
Member Author

That would lead to problems like type(max) is type(ord) is False.

Hmm, I didn't think of that, that all builtin functions should have the same type. I wonder if this point is specified in the CPython specs...?

First, this makes bindings development harder.

It would be easy to make it so there were only 2 distinct declaration macros, MP_DECLARE_CONST_FUN_OBJ_FIXED and MP_DECLARE_CONST_FUN_OBJ_VAR, corresponding to 2 separate data structures (the first is 2 words big, the second, 3 words).

But perhaps it's good to make the DECLARE macros mirror the DEFINE ones, as it allows to independently change the data structures in the future if needed.

@dpgeorge
Copy link
Member Author

That would lead to problems like type(max) is type(ord) is False.

Hmm, I didn't think of that, that all builtin functions should have the same type. I wonder if this point is specified in the CPython specs...?

There is types.BuiltinFunctionType which is supposed to be the type of all builtins, therefore all builtin functions should have the same type. With this PR this will no longer be the case. If it's really needed to have this behaviour then it's straightforward to modify type_make_new to check if the input argument to type(f) is a builtin, and if so return a unique type.

@stinos
Copy link
Contributor

stinos commented Oct 20, 2016

First, this makes bindings development harder

...

There is types.BuiltinFunctionType which is supposed to be the type of all builtins, therefore all builtin functions should have the same type. With this PR this will no longer be the case

I'm using MP_OBJ_IS_TYPE( arg, &mp_type_fun_builtin ) and functions taking mp_obj_fun_builtin_t*, so the latter would have to be changed to take all 5 overloads (it's C++) and the former to MP_OBJ_IS_TYPE( arg, &mp_type_fun_builtin_0 ) || MP_OBJ_IS_TYPE( arg, &mp_type_fun_builtin_1 ) ||... ?

@dpgeorge
Copy link
Member Author

I'm using MP_OBJ_IS_TYPE( arg, &mp_type_fun_builtin ) and functions taking mp_obj_fun_builtin_t*, so the latter would have to be changed to take all 5 overloads (it's C++)

I can easily change it so that there are only 2 variants of the structure (instead of 5), one for fixed args, the other for variable args.

and the former to MP_OBJ_IS_TYPE( arg, &mp_type_fun_builtin_0 ) || MP_OBJ_IS_TYPE( arg, &mp_type_fun_builtin_1 ) ||... ?

Yes, unfortunatelly. I was trying to think of a tricky way to do this and one option is to change the "name" of builtin function types to "builtin_function" (instead of "function").

@stinos
Copy link
Contributor

stinos commented Oct 20, 2016

I can easily change it so that there are only 2 variants of the structure (instead of 5), one for fixed args, the other for variable args.

The fixed args with the union then? That would simplify things yes. On the other hand it's no big deal, I only use it in 2 or 3 common pieces of code, I wouldn't go through hoops to make things tricky and possibly ofuscated if there's not much benefits.

@pfalcon
Copy link
Contributor

pfalcon commented Oct 20, 2016

With this PR this will no longer be the case. If it's really needed to have this behaviour then it's straightforward to modify

I'd say it's not important, because Python's doesn't have a single type for all functions. If there're separate functions for builtin vs Python functions, then a user can robustly check a type only of their own functions, because any other function can be either builtin or Python.

@pfalcon
Copy link
Contributor

pfalcon commented Oct 20, 2016

'm using MP_OBJ_IS_TYPE( arg, &mp_type_fun_builtin ) and functions taking mp_obj_fun_builtin_t*, so the latter would have to be changed

Yes, everyone who has C modules outside of core will need to do bunch of work. So, the sooner this merged, the sooner that work would complete.

In order to have more fine-grained control over how builtin functions are
constructed, the MP_DECLARE_CONST_FUN_OBJ macros are made more specific,
with suffix of _0, _1, _2, _3, _VAR, _VAR_BETEEN or _KW.  These names now
match the MP_DEFINE_CONST_FUN_OBJ macros.
Builtin functions with a fixed number of arguments (0, 1, 2 or 3) are
quite common.  Before this patch the wrapper for such a function cost
3 machine words.  After this patch it only takes 2, which can reduce the
code size by quite a bit (and pays off even more, the more functions are
added).  It also makes function dispatch slightly more efficient in CPU
usage, and furthermore reduces stack usage for these cases.  On x86 and
Thumb archs the dispatch functions are now tail-call optimised by the
compiler.

The bare-arm port has its code size increase by 76 bytes, but stmhal drops
by 904 bytes.  Stack usage by these builtin functions is decreased by 48
bytes on Thumb2 archs.
@dpgeorge dpgeorge merged commit 571e6f2 into micropython:master Oct 21, 2016
@dpgeorge dpgeorge deleted the opt-builtin-fun branch July 5, 2022 05:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants