Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: revert unexpected random init order? #31636

Open
rsc opened this issue Apr 23, 2019 · 12 comments

Comments

Projects
None yet
6 participants
@rsc
Copy link
Contributor

commented Apr 23, 2019

CL 170318 started randomizing the init order in -race mode.
I am not convinced the init order should ever differ from source code order.
We should discuss this before releasing it into the wild.

/cc @randall77 @josharian

@rsc rsc added this to the Go1.13 milestone Apr 23, 2019

@rsc rsc added the release-blocker label Apr 23, 2019

@josharian

This comment has been minimized.

Copy link
Contributor

commented Apr 23, 2019

This isn't randomizing the init order within a package; that is fully specified by the spec (modulo the uncertainty in #22326).

CL 170318 is about randomizing the initialization order of complete packages (while respecting dependencies), so there isn't a source order to differ from. Or am I missing something?

@randall77

This comment has been minimized.

Copy link
Contributor

commented Apr 23, 2019

Right, this is only randomizing the order in which we initialize package dependencies, not global variables.
It only matters in diamond cases, where package A depends on packages B and C, and each of B and C depend on (and, usually, register themselves with) package D.
There were a few cases inside google where people were relying on the initialization order (e.g. relying on the fact that B registered with D first), and they were broken by CL 161337. The intent of the randomization change is to keep that from happening again.

Perhaps you are suggesting that we must initialize subpackages in order of import declarations? I'm not sure how that would work. For example, the order may be different in two different source files in the same package.

@rsc

This comment has been minimized.

Copy link
Contributor Author

commented Apr 24, 2019

Right, I understand that this is about package inits. The original implementation as I remember it followed the source order: if you had x.go importing A, B, C, then it ran A's init, B's init, C's init. I realize this is not in the spec, but it was consistently applied in imports-in-the-source order. If there are multiple files, they were still presented in the order of the source (one file at a time; the spec is already clear about file order mattering for func inits, it just also mattered for imported inits). I also understand that this is unspecified and it would be better if programs didn't accidentally depend on an unspecified order.

But when presented with "people are depending on an unspecified order that can at some point change", there are three possible things to do:

  1. Make it change all the time.
  2. Specify the order so it stops changing.
  3. Do nothing.

It's not a given that choice 1 is the right one.

Map order randomization and goroutine scheduler randomization with -race are examples where we did choice 1. There was obviously no way to determinize the order, but they looked a bit too deterministic in practice, which caused problems, so we randomized. Non-determinism is fundamental to those cases - obviously we need the flexibility to use a different hash function for different maps or on different architectures, and obviously scheduling is non-deterministic.

The rest of initialization is an example where we did choice 2. The spec goes into great detail about how to decide the relative order that variables are initialized and then the order in which init funcs are called, even to the point of noting that the order in which files are presented to the compiler matters. It also makes the guarantee that if A imports B then B is fully initialized before A starts initializing. We spent a lot of time on those rules, specifically adding determinism and user control, so that init-time work would behave predictably and consistently. This contrasts with C/C++ where init order is highly underspecified and difficult to work with. When problems with underspecified order mattering to real programs have come up, we've consistently updated the rules to make them more predictable and consistent.

Sorting is an example where we've done choice 3. sort.Sort has to make a choice about ordering "equal" (but distinguishable some other way) elements. When the sort algorithm changes, that decision can change. This has broken users in the past. We talked about adding randomness here (#13884) but ultimately decided that the benefits of a given sort implementation being self-consistent from run to run were larger than the benefits of forcing users to make their code "sort-independent".

Init order and sorting are (to date) both cases where we decided that determinism mattered. In init order we worked hard to add determinism, and in sorting we decided it was so important that we left a randomization opportunity on the table.

Personally, I think randomizing the imported-package init order is inconsistent with our previous efforts to make init order as deterministic as possible. Randomizing is effective for things that happen repeatedly in programs (like map iterations and scheduling decisions), where you can do things like go test -count=10 to shake it out. But init time happens once, and randomizing will have the effect of making programs that wander into undefined territory misbehave at startup 1/N of the time. Exacerbating the problem, the programs that do wander into undefined territory won't be simple tests. They will be complex binaries, making the inconsistent random misbehavior even more annoying. As a user, I don't want to deal with any of that. Startup-time initialization is not a good time for random behavior.

I would much rather see us continue our past approach of defining the init order more completely when these things come up. Then there are no random failures and we are back to predictable, consistent init behavior. For example we could define that, when not determined directly by dependency order, ties are broken by walking the imports in source order (like for variables), one file at a time, or we could define that ties are broken by walking the imports sorted by import path order.

/cc @robpike @griesemer @ianlancetaylor although both Rob and Robert are away right now.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented Apr 24, 2019

The general argument that package import order should be consistent is persuasive.

In general initializing package P1 means that we have to initialize every package that P1 imports. We can do that in source order as suggested, such as by doing a depth first traversal of P1's imports in source order, where if some imported package has already been initialized we skip it (I think that is what the gc toolchain used to do before @randall77 's patch). The problem I see with that is that changes entirely outside of P1 and every package that it imports can change the initialization order, because some other package initialized before P1 can import some package P2 that P1 imports indirectly, and thus the order in which P2 is initialized can change even though P1 did not change. That is consistent within a program but it's not predictable by the programmer. The programmer can make it reliable by adding a (likely blank) import, but if the programmer actually notices the problem they will likely fix it in a clearer way. This is unlike variable initialization order, which cannot be affected by changes in some unrelated package.

So I would be in favor of the second approach suggested above, of breaking ties by import path order. To me this implies building a complete import graph of the program, then doing a breadth first walk of the graph starting at the runtime package, initializing each set of children (i.e., the packages that directly import the runtime package) in import path order. I think that will mean that any change in initialization order of P1 and the packages that it imports can only occur due to some change in those packages.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Apr 24, 2019

I'm ok with deciding that we don't want random at startup, and defining some sort of required initialization order (even if we don't actually add it to the spec). I'm not sure what that would be.

I agree with Ian that just using source order isn't satisfying. Just to make Ian's example more concrete, consider my example diamond with A importing B and C, both of which import D.
A requires that B's initializer runs before C's. Using source order A tries to enforce that. But then some other package does:

import (
    "C"
    "A"
)

And then all of a sudden C gets initialized before B, breaking the invariant A wanted. There's no way for A to enforce its requirement or even notice that it has been violated (other than making C unimportable, for example by putting it in an internal directory, or saying so loudly in the package doc comment).

FYI CL 161337 changed the ordering from source order to sorted order. It wasn't really an intentional part of that change, it just made the implementation easier. It wouldn't be hard to change it back.

@randall77

This comment has been minimized.

Copy link
Contributor

commented May 16, 2019

I think we need a decision here soon. Reverting the randomness is trivial. Going back to the previous ordering will be not quite as trivial but still not hard.

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented May 16, 2019

I still find my argument above to be persuasive. Anybody want to argue that we should go back to the earlier initialization order?

@randall77

This comment has been minimized.

Copy link
Contributor

commented May 17, 2019

Ian, I'm confused as to what initialization order you're proposing. You want to do breadth first pre-order from the bottom of the dependency graph instead of depth first post-order from the top (as we do now)?

To be concrete, I'm surmising you want to find the lexically earliest package that is not initialized yet, but has had all its dependencies initialized, Initialize that package, and repeat.

Am I close?

@ianlancetaylor

This comment has been minimized.

Copy link
Contributor

commented May 17, 2019

Yes, that sounds equivalent to what I am suggesting. Thanks.

@griesemer

This comment has been minimized.

Copy link
Contributor

commented May 17, 2019

I'm also ok with Ian's suggestion, i.e., Keith's algorithmic interpretation. If I understand correctly, it will conserve the relative initialization order between packages that are not constrained by explicit dependencies (given their import paths being fixed).

@randall77

This comment has been minimized.

Copy link
Contributor

commented May 17, 2019

Ok, I'll see if I can't hack up something that implements it efficiently.

@gopherbot

This comment has been minimized.

Copy link

commented May 21, 2019

Change https://golang.org/cl/178297 mentions this issue: runtime: revert init order changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.