New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: coalesce consecutive calls to print/println #21417

Open
josharian opened this Issue Aug 12, 2017 · 7 comments

Comments

Projects
None yet
6 participants
@josharian
Contributor

josharian commented Aug 12, 2017

CLs 55095–55098 change how the compiler (and runtime) implement the built-ins print and println, in order to reduce the amount of code they generate. (The goal is to both reduce the runtime's binary footprint and to increase instruction density in important runtime routines.)

This issue spells out one further obvious optimization. Consider:

  println("a")
  println("b")

After CL 55098, this is gets compiled into: printlock, print "a\n", printunlock, printlock, print "b\n", printunlock. But it could get compiled into: printlock, print "a\nb\n", printunlock. In short, coalesce consecutive calls to print/println in a single printlock-protected sequence of calls, combining string constants as possible, taking care to correctly handle the spaces and newlines introduced by println.

Low priority, but might be an interesting learning exercise for someone interested in the mid-tier of the compiler. It's not super straightforward, since walkprint looks at one node at a time, but it also shouldn't too difficult.

@josharian josharian added the Suggested label Aug 12, 2017

@josharian josharian added this to the Unplanned milestone Aug 12, 2017

@josharian

This comment has been minimized.

Contributor

josharian commented Aug 12, 2017

Although special care is required if evaluating the arguments to the second println could panic. Maybe not a good starter issue after all.

@josharian josharian removed the Suggested label Aug 12, 2017

@robpike

This comment has been minimized.

Contributor

robpike commented Aug 12, 2017

I came here to say, "don't do that!" because of argument evaluation.

I use multiple printlns like this to diagnose certain kinds of failure. coalescing them into a single print would completely eliminate that possibility.

In any case, we shouldn't care of println is especially efficient. You shouldn't be using it in production code.

@josharian

This comment has been minimized.

Contributor

josharian commented Aug 13, 2017

I came here to say, "don't do that!" because of argument evaluation.

Of course; we can only do this if it is not detectable from user code. (This would be easier in SSA form, but it is still possible, using package gc's safeexpr.)

In any case, we shouldn't care of println is especially efficient. You shouldn't be using it in production code.

My goal here is shrinking runtime routines that use println, for smaller binaries and better instruction density. I would happily make println slower if it made the call sites smaller or use less stack.

@rasky

This comment has been minimized.

Member

rasky commented Aug 14, 2017

Tangential, but what about emitting blocks that contain print at the end of the function, as unlikely blocks? This would increase density of likely paths that never call print.

@dr2chase

This comment has been minimized.

Contributor

dr2chase commented Aug 14, 2017

This doesn't completely eliminate the ability to do that sort of debugging, though it does make it a lot more annoying:

//go:noinline
dbgPrintln(s string) {
   println(s)
}

But how much do we gain from doing this?

@thanm

This comment has been minimized.

Member

thanm commented Aug 14, 2017

Both clang and GCC do this optimization for C++ programs, for what that's worth. E.g.

  printf("a");
  printf("b");
  printf("c\n");

gets collapsed into a single 'puts("abc")'.

Q: why do this only with println? Why not apply the same coalescing to fmt.Printf calls?

@josharian

This comment has been minimized.

Contributor

josharian commented Aug 14, 2017

Tangential, but what about emitting blocks that contain print at the end of the function, as unlikely blocks? This would increase density of likely paths that never call print.

Good idea.

This doesn't completely eliminate the ability to do that sort of debugging

If done soundly (which was my intent), it doesn't impact it at all. It just limits the scope of the optimization.

But how much do we gain from doing this?

My simple optimizations cut 0.5% from hello world. Given that all of that is from the runtime, it seems low priority but still worthwhile (as I said initially).

Q: why do this only with println? Why not apply the same coalescing to fmt.Printf calls?

A few reasons. fmt is a bit high level for the compiler to be messing with. fmt's printing routines return values. In general, fmt's printing routines take care to make exactly one underlying write call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment