cmd/compile: rewrite interface args by actual type args in functions called only with a single set of argument types #19165

Open
valyala opened this Issue Feb 17, 2017 · 8 comments

valyala commented Feb 17, 2017

Sometimes a program uses only a single set of interface implementations across all calls to a function that accepts interface args. In this case the compiler may safely rewrite the function so that it accepts the actual types instead of interfaces.
This brings the following benefits:

  • removal of interface conversion overhead;
  • removal of interface method call overhead;
  • fewer memory allocations, since escape analysis may keep variables that are passed to interface functions on the stack;
  • inlining short interface method calls and further optimizations based on the inlining.
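As a rough illustration of the rewrite, here is a minimal hand-written sketch. The names `Shape`, `Rect`, `ScaledArea` and `ScaledAreaRect` are hypothetical and exist only for this example; the second function shows what the compiler could produce if `Rect` were the only type ever passed to `ScaledArea`.

```go
package main

import "fmt"

// Shape is a hypothetical interface used only for illustration.
type Shape interface {
	Area() float64
}

// Rect is assumed to be the only Shape implementation the program passes.
type Rect struct{ W, H float64 }

func (r Rect) Area() float64 { return r.W * r.H }

// ScaledArea is the function as written: it takes an interface arg,
// so the Area call goes through dynamic dispatch.
func ScaledArea(s Shape, k float64) float64 {
	return s.Area() * k
}

// ScaledAreaRect is the proposed rewrite done by hand: the same body
// with the interface arg replaced by the concrete type. The Area call
// is now static and becomes a candidate for inlining.
func ScaledAreaRect(r Rect, k float64) float64 {
	return r.Area() * k
}

func main() {
	r := Rect{W: 2, H: 3}
	fmt.Println(ScaledArea(r, 2), ScaledAreaRect(r, 2))
}
```

Both calls compute the same value; only the calling convention and the optimizations available to the compiler differ.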

The idea may be extended further. For instance, the compiler may generate specialized functions for each unique set of arg types passed to interface args (aka "generic" functions :)). But a naive implementation may lead to uncontrolled code bloat and performance degradation, so it must be carefully designed beforehand. One possible idea is to specialize the function only if the following conditions are met:

  • all the interface methods for the arg implementation may be inlined (i.e. may lead to performance optimizations);
  • the function body is short (i.e. reducing generated code bloat).

This could inline and optimize sort.Interface implementations inside sort.Sort calls.
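To make the sort case concrete, here is a small sketch of what such a specialization could look like if written by hand. `insertionSort`, `insertionSortInts` and `sortBoth` are hypothetical helpers for this example, not functions from package sort, and insertion sort stands in for the real algorithm to keep the code short.

```go
package main

import (
	"fmt"
	"sort"
)

// insertionSort works on any sort.Interface: every Len, Less and Swap
// is an interface method call.
func insertionSort(data sort.Interface) {
	for i := 1; i < data.Len(); i++ {
		for j := i; j > 0 && data.Less(j, j-1); j-- {
			data.Swap(j, j-1)
		}
	}
}

// insertionSortInts is the same routine specialized for sort.IntSlice,
// with the short Len, Less and Swap methods inlined by hand.
func insertionSortInts(data sort.IntSlice) {
	for i := 1; i < len(data); i++ {
		for j := i; j > 0 && data[j] < data[j-1]; j-- {
			data[j], data[j-1] = data[j-1], data[j]
		}
	}
}

// sortBoth sorts two copies of xs, one through each routine.
func sortBoth(xs []int) (viaIface, viaConcrete []int) {
	viaIface = append([]int(nil), xs...)
	viaConcrete = append([]int(nil), xs...)
	insertionSort(sort.IntSlice(viaIface))
	insertionSortInts(viaConcrete)
	return viaIface, viaConcrete
}

func main() {
	a, b := sortBoth([]int{5, 2, 4, 1, 3})
	fmt.Println(a, b)
}
```

The specialized version meets both conditions listed above: the interface methods are trivially inlinable and the function body is short.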

Contributor

polarina commented Feb 18, 2017

This optimization is also known as devirtualization in C++ compilers.

Contributor

as commented Feb 19, 2017

How would this impact stack traces appearing after a panic that refer to rewritten funcs?

Contributor

josharian commented Feb 19, 2017

It's not obvious to me how this works with package-at-a-time compilation. For example, it would seem to require re-compiling package sort when a new type satisfying sort.Interface shows up.

This can be done semi-manually with a source code rewriter/specializer, which the sort package in fact has internally. :)

Member

mvdan commented Feb 19, 2017

@josharian what about unexported funcs?

Contributor

josharian commented Feb 19, 2017

How often does that happen? This optimization would (I think) take pretty significant effort to implement. And if all the code is in the same package, then it could be specialized manually or with a source code generator, like in package sort.

Member

mvdan commented Feb 19, 2017

> How often does that happen?

One example that comes to mind is funcs that you want to test, so you make them take interfaces to help with mocking. If a func takes an interface just for that, then when compiling in non-test mode the interface type could be swapped for the single specific type.

I don't think you could specialize this manually, and I think this pattern is fairly well regarded.
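A minimal sketch of that testing pattern, with hypothetical names throughout (`UserStore`, `mapStore`, `stubStore`, `greet`): the func takes an interface only so a test can pass a stub, while a non-test build only ever passes `mapStore`.

```go
package main

import "fmt"

// UserStore exists only so that tests can substitute a stub.
type UserStore interface {
	Name(id int) (string, bool)
}

// mapStore is the single implementation used in non-test builds.
type mapStore map[int]string

func (m mapStore) Name(id int) (string, bool) {
	n, ok := m[id]
	return n, ok
}

// stubStore is the kind of type that would normally live in a _test.go
// file; it is included here only to keep the example in one file.
type stubStore struct{}

func (stubStore) Name(int) (string, bool) { return "test-user", true }

// greet is unexported and takes the interface purely for testability.
// In a non-test build every call passes mapStore, so the arg type could
// in principle be swapped for mapStore.
func greet(s UserStore, id int) string {
	name, ok := s.Name(id)
	if !ok {
		return "hello, stranger"
	}
	return "hello, " + name
}

func main() {
	fmt.Println(greet(mapStore{1: "gopher"}, 1))
}
```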

I agree that this optimization would probably be far from trivial and most interface uses wouldn't benefit, though (unless inlining takes place?).

Contributor

valyala commented Feb 22, 2017

I forgot to mention that devirtualization may help escape analysis leave more variables on the stack. For instance, currently all the variables passed to interface functions escape to the heap, because the compiler doesn't know all the function implementations which may be used in the program. With the devirtualization step the compiler may leave some of these variables on the stack.
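The following sketch shows the kind of code this affects. `Summer`, `byteSummer`, `viaInterface` and `viaConcrete` are hypothetical names. Whether `buf` actually ends up on the heap depends on the compiler version and its inlining decisions, so the comments describe the conservative reasoning rather than a guaranteed outcome; `go build -gcflags=-m` prints the decisions the compiler really made.

```go
package main

import "fmt"

// Summer is a hypothetical interface with a method that receives a buffer.
type Summer interface {
	Sum(buf []byte) int
}

// byteSummer does not retain buf, but a caller holding only a Summer
// cannot know that.
type byteSummer struct{}

func (byteSummer) Sum(buf []byte) int {
	total := 0
	for _, b := range buf {
		total += int(b)
	}
	return total
}

// viaInterface passes buf to an interface method. The callee is unknown
// at compile time, so escape analysis has to assume buf may be retained.
func viaInterface(s Summer) int {
	buf := make([]byte, 64)
	for i := range buf {
		buf[i] = byte(i)
	}
	return s.Sum(buf)
}

// viaConcrete is the devirtualized form: the callee is known, so the
// compiler can see that Sum does not retain buf.
func viaConcrete(s byteSummer) int {
	buf := make([]byte, 64)
	for i := range buf {
		buf[i] = byte(i)
	}
	return s.Sum(buf)
}

func main() {
	fmt.Println(viaInterface(byteSummer{}), viaConcrete(byteSummer{}))
}
```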

> How would this impact stack traces appearing after a panic that refer to rewritten funcs?

This shouldn't change stack traces at all.

> It's not obvious to me how this works with package-at-a-time compilation. For example, it would seem to require re-compiling package sort when a new type satisfying sort.Interface shows up.

I don't know how the current compiler works, but here is a naive sketch of how the devirtualization may be implemented:

  • Step 1 - AST building. Each package is parsed into AST and the results are marshaled to cache files.
  • Step 2 - Analysis. Scan each AST and fill the calls map[funcName]set[argTypes] for all the functions accepting at least one interface argument.
    • funcName is fully qualified function name: path/to/package.funcName
    • argTypes - a tuple of real argument types used in each call to the function.
  • Step 3 - Rewriting. For each function that has only a single element in the set[argTypes], rewrite its AST by substituting interface args with real args from argTypes.
  • Step 4 - Compilation. Compile each package's AST.
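Steps 2 and 3 can be sketched roughly as follows. This is a toy model, not compiler code: the `call` record, `collect` and `rewritable` are hypothetical, type names are plain strings, and the call list is hard-coded instead of being extracted from real ASTs.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// call is a hypothetical record of one call site found during analysis.
type call struct {
	funcName string   // fully qualified: path/to/package.funcName
	argTypes []string // real types passed for the interface args
}

// collect builds the calls map[funcName]set[argTypes] from step 2.
// Each argTypes tuple is joined into one string to serve as a set key,
// which is good enough for a sketch.
func collect(calls []call) map[string]map[string]struct{} {
	m := make(map[string]map[string]struct{})
	for _, c := range calls {
		if m[c.funcName] == nil {
			m[c.funcName] = make(map[string]struct{})
		}
		m[c.funcName][strings.Join(c.argTypes, ",")] = struct{}{}
	}
	return m
}

// rewritable returns the functions eligible for step 3: those called
// with exactly one argTypes tuple. The result is sorted for stable output.
func rewritable(m map[string]map[string]struct{}) []string {
	var out []string
	for name, set := range m {
		if len(set) == 1 {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	calls := []call{
		{"sort.Sort", []string{"sort.IntSlice"}},
		{"sort.Sort", []string{"sort.StringSlice"}},
		{"example.com/app.greet", []string{"example.com/app.mapStore"}},
		{"example.com/app.greet", []string{"example.com/app.mapStore"}},
	}
	fmt.Println(rewritable(collect(calls)))
}
```

In this toy input `sort.Sort` is called with two different types and stays as is, while the hypothetical `example.com/app.greet` is called with a single type and would be rewritten.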