Join GitHub today
GitHub is home to over 36 million developers working together to host and review code, manage projects, and build software together.Sign up
runtime: revert unexpected random init order? #31636
This isn't randomizing the init order within a package; that is fully specified by the spec (modulo the uncertainty in #22326).
CL 170318 is about randomizing the initialization order of complete packages (while respecting dependencies), so there isn't a source order to differ from. Or am I missing something?
Right, this is only randomizing the order in which we initialize package dependencies, not global variables.
Perhaps you are suggesting that we must initialize subpackages in order of import declarations? I'm not sure how that would work. For example, the order may be different in two different source files in the same package.
Right, I understand that this is about package inits. The original implementation as I remember it followed the source order: if you had x.go importing A, B, C, then it ran A's init, B's init, C's init. I realize this is not in the spec, but it was consistently applied in imports-in-the-source order. If there are multiple files, they were still presented in the order of the source (one file at a time; the spec is already clear about file order mattering for func inits, it just also mattered for imported inits). I also understand that this is unspecified and it would be better if programs didn't accidentally depend on an unspecified order.
But when presented with "people are depending on an unspecified order that can at some point change", there are three possible things to do:
It's not a given that choice 1 is the right one.
Map order randomization and goroutine scheduler randomization with -race are examples where we did choice 1. There was obviously no way to determinize the order, but they looked a bit too deterministic in practice, which caused problems, so we randomized. Non-determinism is fundamental to those cases - obviously we need the flexibility to use a different hash function for different maps or on different architectures, and obviously scheduling is non-deterministic.
The rest of initialization is an example where we did choice 2. The spec goes into great detail about how to decide the relative order that variables are initialized and then the order in which init funcs are called, even to the point of noting that the order in which files are presented to the compiler matters. It also makes the guarantee that if A imports B then B is fully initialized before A starts initializing. We spent a lot of time on those rules, specifically adding determinism and user control, so that init-time work would behave predictably and consistently. This contrasts with C/C++ where init order is highly underspecified and difficult to work with. When problems with underspecified order mattering to real programs have come up, we've consistently updated the rules to make them more predictable and consistent.
Sorting is an example where we've done choice 3. sort.Sort has to make a choice about ordering "equal" (but distinguishable some other way) elements. When the sort algorithm changes, that decision can change. This has broken users in the past. We talked about adding randomness here (#13884) but ultimately decided that the benefits of a given sort implementation being self-consistent from run to run were larger than the benefits of forcing users to make their code "sort-independent".
Init order and sorting are (to date) both cases where we decided that determinism mattered. In init order we worked hard to add determinism, and in sorting we decided it was so important that we left a randomization opportunity on the table.
Personally, I think randomizing the imported-package init order is inconsistent with our previous efforts to make init order as deterministic as possible. Randomizing is effective for things that happen repeatedly in programs (like map iterations and scheduling decisions), where you can do things like go test -count=10 to shake it out. But init time happens once, and randomizing will have the effect of making programs that wander into undefined territory misbehave at startup 1/N of the time. Exacerbating the problem, the programs that do wander into undefined territory won't be simple tests. They will be complex binaries, making the inconsistent random misbehavior even more annoying. As a user, I don't want to deal with any of that. Startup-time initialization is not a good time for random behavior.
I would much rather see us continue our past approach of defining the init order more completely when these things come up. Then there are no random failures and we are back to predictable, consistent init behavior. For example we could define that, when not determined directly by dependency order, ties are broken by walking the imports in source order (like for variables), one file at a time, or we could define that ties are broken by walking the imports sorted by import path order.
The general argument that package import order should be consistent is persuasive.
In general initializing package P1 means that we have to initialize every package that P1 imports. We can do that in source order as suggested, such as by doing a depth first traversal of P1's imports in source order, where if some imported package has already been initialized we skip it (I think that is what the gc toolchain used to do before @randall77 's patch). The problem I see with that is that changes entirely outside of P1 and every package that it imports can change the initialization order, because some other package initialized before P1 can import some package P2 that P1 imports indirectly, and thus the order in which P2 is initialized can change even though P1 did not change. That is consistent within a program but it's not predictable by the programmer. The programmer can make it reliable by adding a (likely blank) import, but if the programmer actually notices the problem they will likely fix it in a clearer way. This is unlike variable initialization order, which cannot be affected by changes in some unrelated package.
So I would be in favor of the second approach suggested above, of breaking ties by import path order. To me this implies building a complete import graph of the program, then doing a breadth first walk of the graph starting at the runtime package, initializing each set of children (i.e., the packages that directly import the runtime package) in import path order. I think that will mean that any change in initialization order of P1 and the packages that it imports can only occur due to some change in those packages.
I'm ok with deciding that we don't want random at startup, and defining some sort of required initialization order (even if we don't actually add it to the spec). I'm not sure what that would be.
I agree with Ian that just using source order isn't satisfying. Just to make Ian's example more concrete, consider my example diamond with A importing B and C, both of which import D.
And then all of a sudden C gets initialized before B, breaking the invariant A wanted. There's no way for A to enforce its requirement or even notice that it has been violated (other than making C unimportable, for example by putting it in an internal directory, or saying so loudly in the package doc comment).
FYI CL 161337 changed the ordering from source order to sorted order. It wasn't really an intentional part of that change, it just made the implementation easier. It wouldn't be hard to change it back.
referenced this issue
May 15, 2019
Ian, I'm confused as to what initialization order you're proposing. You want to do breadth first pre-order from the bottom of the dependency graph instead of depth first post-order from the top (as we do now)?
To be concrete, I'm surmising you want to find the lexically earliest package that is not initialized yet, but has had all its dependencies initialized, Initialize that package, and repeat.
Am I close?
I'm also ok with Ian's suggestion, i.e., Keith's algorithmic interpretation. If I understand correctly, it will conserve the relative initialization order between packages that are not constrained by explicit dependencies (given their import paths being fixed).