cmd/compile: coalesce consecutive calls to print/println #21417
CLs 55095–55098 change how the compiler (and runtime) implement the built-ins print and println, in order to reduce the amount of code they generate. (The goal is to both reduce the runtime's binary footprint and to increase instruction density in important runtime routines.)
This issue spells out one further obvious optimization. Consider two consecutive calls, println("a") followed by println("b").
After CL 55098, this gets compiled into: printlock, print "a\n", printunlock, printlock, print "b\n", printunlock. But it could get compiled into: printlock, print "a\nb\n", printunlock. In short, coalesce consecutive calls to print/println into a single printlock-protected sequence of calls, combining string constants where possible and taking care to correctly handle the spaces and newlines introduced by println.
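The string-constant part of this idea can be sketched in ordinary Go. This is only a toy model, not compiler code: the printCall type and the coalesce function are hypothetical names invented for illustration, and the model assumes every argument is a string constant. It relies on the documented behavior of the built-ins: print writes its arguments back to back, while println separates them with spaces and appends a newline.

```go
package main

import (
	"fmt"
	"strings"
)

// printCall is a hypothetical model of one print/println call whose
// arguments are all string constants.
type printCall struct {
	newline bool     // true for println, false for print
	args    []string // the constant string arguments, in order
}

// text returns the exact text the modeled call would write.
func (c printCall) text() string {
	if c.newline {
		// println: spaces between arguments, newline at the end.
		return strings.Join(c.args, " ") + "\n"
	}
	// print: arguments are written with no separators.
	return strings.Join(c.args, "")
}

// coalesce folds a run of consecutive calls into the single constant
// that one lock-protected print could write instead.
func coalesce(calls []printCall) string {
	var b strings.Builder
	for _, c := range calls {
		b.WriteString(c.text())
	}
	return b.String()
}

func main() {
	calls := []printCall{
		{newline: true, args: []string{"a"}},
		{newline: true, args: []string{"b"}},
	}
	fmt.Printf("%q\n", coalesce(calls)) // prints "a\nb\n"
}
```

A real implementation would also have to deal with non-constant arguments, which is where the evaluation-order concerns raised later in the thread come in.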
Low priority, but might be an interesting learning exercise for someone interested in the mid-tier of the compiler. It's not super straightforward, since walkprint looks at one node at a time, but it also shouldn't be too difficult.
I came here to say, "don't do that!" because of argument evaluation.
I use multiple printlns like this to diagnose certain kinds of failure. Coalescing them into a single print would completely eliminate that possibility.
In any case, we shouldn't care if println is especially efficient. You shouldn't be using it in production code.
Of course; we can only do this if it is not detectable from user code. (This would be easier in SSA form, but it is still possible, using package gc's safeexpr.)
My goal here is shrinking runtime routines that use println, for smaller binaries and better instruction density. I would happily make println slower if it made the call sites smaller or use less stack.
If done soundly (which was my intent), it doesn't impact it at all. It just limits the scope of the optimization.
My simple optimizations cut 0.5% from hello world. Given that all of that is from the runtime, it seems low priority but still worthwhile (as I said initially).
A few reasons. fmt is a bit high level for the compiler to be messing with. fmt's printing routines return values. In general, fmt's printing routines take care to make exactly one underlying write call.