cmd/compile: performance problems with many long, exported identifiers #18602
What version of Go are you using (`go version`)?
I was able to compress the ~130MB of code down to 10MB so I could attach it to this issue. See the link above. It's from a transpiler I am working on here. Simply extract the ZIP, navigate into the extracted folder, and run
I was afraid it was a generic, obvious issue caused by the extreme size of the code and the fact that it's all in one package. I don't expect the code to run (I figure it'd panic; I don't know, I never got it to compile). But if it's a nuanced issue, that would be fantastic.
Building Kubernetes produces several very large programs. There could be a reproducer there.
Just a suggestion....
Compile with memory profiling complete. It took almost 7 hours and 13GB of RAM:
Longest phases were:
The resulting object file was 1.14GB!
Memory profile, using
The only unusual things I see in the profile are:
As for the extremely long dumpobj time and giant object file (which will lead to massive link times and/or giant binaries), I suspect that it is again the result of very many very long exported identifiers. Hopefully when my CPU profile run completes in five hours, it'll be much clearer whether there are obvious non-linearities in dumpobj.
@cretz though there may be some improvements available in the compiler, I don't anticipate anything breathtaking; you will almost certainly also need to change your generated code as well. As a start, I suggest altering your code generator to (a) avoid exporting any identifiers you reasonably can and (b) give your symbols shorter names, perhaps using truncated hashes or a global numbering scheme. (You'll probably also then need to generate a lookup table or some such to help with demangling, for debugging. Sorry about that.) If you do those things, please report back with the results, and maybe with an updated code dump. Thanks!
@josharian - Thanks for the update. I figured my use case was just really extreme. I can unexport some things. I can also reduce the identifier size (I wanted my identifiers to help w/ runtime reflection, but no big deal, I can maintain a separate blob w/ that info). I was hoping that even though the obj file is big, that DCE would remove a lot of bloat from the final binary due to having most functions as methods, but I am unfamiliar w/ the internals of the linker.
"whether there are obvious non-linearities in dumpobj"
I think this might be a key point in general. Ideally some of the work can be streamed and not held in memory for the life of the package compile.
At least now y'all have a good stress test case.
If it's really very long identifiers that cause problems with the compile time, we should try to get to the bottom of this, rather than find work-arounds. Exporting long identifiers shouldn't cause undue compilation times - there's a bug somewhere.
As one data point: The export data writer uses a map to canonicalize strings - that is, a string that's been seen before (this includes exported identifiers) will only appear once in the export data. But the same identifiers may appear elsewhere in the obj file.
Here's a CPU profile:
Hello, gc.testdclstack. This is not the first time we've had problems with testdclstack. See #14781. Robert suggested only enabling it in a special debug mode in the compiler. It is probably time to do that, perhaps even for Go 1.8. I'll see about sending a CL soon.
With gc.testdclstack eliminated, the parse phase drops from 11m to 13s. Still waiting to see how much it helps elsewhere.
Eliminating gc.testdclstack won't help with memory usage, though. My compile is still at 7GB and growing.
I don't think it's just very long exported identifiers. It is also the sheer number of them, and probably also some structural things (per the other comments I've made). Squinting at the profiles, the long identifiers is maybe 10% of the memory issue; I just suggested it to @cretz as a good, straightforward first step (and experiment to confirm what I'm seeing).
After the CLs above, time is reduced to a half hour, and max rss is down a bit:
For reference, here's an alloc_space profile output:
Aside from the things I've already mentioned, disabling inlining would probably help noticeably now with memory usage.
There might be further optimizations available to further speed up dumpobj or shrink the object file size by reusing more strings somewhere, but I'll leave that to @griesemer (export info) and @crawshaw (reflect info).
Thanks for the new test case; I'll take a look at that later or tomorrow.
pushed a commit on Jan 11, 2017
At least on my machine, the new code you posted compiles ~10-15% faster, but memory usage doesn't shrink significantly; I guess I was wrong. The object file is still 856MB, though, so you're probably still going to have slow linking and a very large (and probably slow) binary.
I don't plan to investigate this further at the moment.