spec: define initialization order more precisely #57411
Comments
CC @griesemer |
Given the widespread use of More generally, what if a package does:
while another does
or even different files in the same package? |
Depending on the implementation of this,
That is a great question... couldn't this raise a syntax error ? Like a "pick a lane bud, you're drunk !" |
packaged does the first one, packagee does the second are you just never allowed to import these two packages together? |
Without having thought about it too much (therefore don't take my word for it), I think I would simply raise a syntax error on both import lines (for packaged and packagee...) There is hardly any reason why one would want to have 2 different import orders, and if there was a reason, that reason doesn't exist as of today, because at the moment, package initialization order isn't guaranteed (and so we can eliminate that problem by raising a syntax error.) |
Actually, I see a potential problem with applications with huge dependency trees with open source packages. One would want to take this into consideration if something is to be done... |
You should not connect to a server from Instead, packagea should have a proper constructor that gets invoked. Each unit test is then responsible for calling it. This establishes proper isolation between each test case, which is a very useful property. Meanwhile, unit tests for other packages that depend on packagea should likely be using some interface instead of direct function calls, which allows for mocking. |
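A minimal sketch of the constructor-plus-interface approach suggested above. All names here (`Store`, `Client`, `NewClient`, `fakeStore`, `Greeting`) are hypothetical and are not taken from the packages under discussion; the point is only that nothing connects during package initialization and that dependents can be tested against a fake.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical interface that dependents of packagea would use
// instead of calling package-level functions directly.
type Store interface {
	Get(key string) (string, error)
}

// Client is a hypothetical stand-in for packagea's server connection.
type Client struct {
	url string
}

// Compile-time check that *Client satisfies Store.
var _ Store = (*Client)(nil)

// NewClient is an explicit constructor: nothing happens during package
// initialization, and the caller chooses the URL (production, test, ...).
func NewClient(url string) (*Client, error) {
	if url == "" {
		return nil, errors.New("empty server URL")
	}
	return &Client{url: url}, nil
}

// Get is a placeholder; a real client would talk to the server here.
func (c *Client) Get(key string) (string, error) {
	return "", fmt.Errorf("not connected to %s (sketch only)", c.url)
}

// fakeStore is an in-memory Store for unit tests.
type fakeStore map[string]string

func (f fakeStore) Get(key string) (string, error) {
	v, ok := f[key]
	if !ok {
		return "", fmt.Errorf("key %q not found", key)
	}
	return v, nil
}

// Greeting depends only on the interface, so a test can pass a fake.
func Greeting(s Store, user string) (string, error) {
	name, err := s.Get(user)
	if err != nil {
		return "", err
	}
	return "hello, " + name, nil
}

func main() {
	msg, err := Greeting(fakeStore{"u1": "gopher"}, "u1")
	fmt.Println(msg, err)
}
```

Each test builds its own fake (or calls the constructor with a test URL), so no test depends on which package happened to be initialized first.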
See also #31636. You can force the init order by ordering the imports accordingly and into "blocks" so that they don't get re-ordered (example). Note that this is not prescribed by the language, it's an implementation behavior. That said, it is much more robust to be explicit and call initialization functions explicitly if they cause side effects. |
I guess you are right, there are probably other ways to get succinct code without connecting to a server in the init() function. However the fundamental question regarding side effects is still of interest, for other use cases. |
it would be useful to have real usecases motivating these discussions. |
This proposal is only useful if package A's initialization is dependent on some other package B, despite the fact that A does not import B directly or indirectly. Writing code this way seems extremely confusing and error prone, especially because it will silently break if B imports A directly or indirectly, since then A would have to be initialized first no matter what. |
I'm glad that in the example you provided, the same assumptions are made; however while testing this I experienced some irregularities.
|
VSCode's goimport doesn't rearrange imports in different groups if they are separated by a blank line:
import (
	_ "b"
	// blank line
	_ "a"
)
(at least for me.) That said, let me emphasize again: it's much better to explicitly call initialization functions that have important side effects if you want robust code. |
Thanks for chiming in; as I've mentioned, this doesn't seem to affect import order consistently for unit tests. It initializes packages in the proper order during normal execution, but when running unit tests, this order is not followed. Let me share an example:
-- go.mod --
module somepackage
go 1.19
-- somepackage.go --
package somepackage
import (
	"os"
)
var ENV_VAR = os.Getenv("ENV_VAR")
-- mock/mock.go --
package mock
import "os"
const ENV_VAR_VAL = "localhost"
var _ = os.Setenv("ENV_VAR", ENV_VAR_VAL)
-- somepackage_test.go --
package somepackage_test
import (
	"testing"
	"somepackage/mock"
	_ "somepackage/mock"
	"somepackage"
)
func TestSomePackage(t *testing.T) {
	if somepackage.ENV_VAR != mock.ENV_VAR_VAL {
		t.Fail()
	}
}
If I run this code on my machine (i.e. run the unit test |
I also just confirmed that in the example above, running the tests in a separate test package causes the behavior to change:
// test/test_test.go
package test_test
import (
	"testing"
	"somepackage/mock"
	_ "somepackage/mock"
	"somepackage"
)
func TestTests(t *testing.T) {
	if somepackage.ENV_VAR != mock.ENV_VAR_VAL {
		t.Fail()
	}
}
In this case, the desired import order is the same, but the result is not the same, and the test passes. |
What matters here is also in which order the files are presented to the compiler by the But I cannot but emphasize one more time: depending on these orders leads to extremely fragile code. If there are important side-effects that your code depends on, they should be encoded explicitly by calling respective initialization functions in the desired order. |
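As a rough illustration of encoding the dependency explicitly, the sketch below replaces the two package-level initializers from the earlier example with functions that the caller invokes in the required order. The names `setupMock` and `initSomePackage` are hypothetical stand-ins for what the mock package and somepackage might export; in a test binary the calls could live in a `TestMain` function.

```go
package main

import (
	"fmt"
	"os"
)

// serverURL plays the role of somepackage.ENV_VAR, but it is filled in by an
// explicit call instead of a package-level initializer.
var serverURL string

// setupMock stands in for the side effect of the mock package (hypothetical).
func setupMock() error {
	return os.Setenv("ENV_VAR", "localhost")
}

// initSomePackage stands in for somepackage's initialization (hypothetical).
// It reads the environment when called, not when the package is initialized.
func initSomePackage() {
	serverURL = os.Getenv("ENV_VAR")
}

func main() {
	// The order is now spelled out in code and does not depend on how the
	// toolchain happens to order package initialization.
	if err := setupMock(); err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	initSomePackage()
	fmt.Println(serverURL)
}
```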
That is true, and I agree; the practice should be avoided in most instances. However, the problem of reproducibility and consistency still remains. The commands I'm using to run unit tests is
I'm just trying to understand why, with all my careful research and attentive reading of the Go spec, I ended up believing that I can have some side effects propagated during the import process, and then ended up realizing that the import order is inconsistent, and that the import for side effects functionality doesn't guarantee any import order and so should be avoided altogether. If I am to import some package for its side effects, but can't have anything from the import/initialization process depend on it (implicitly), then isn't the import for side effects functionality rendered useless, since we can't derive any assumption from it? I mean that, if the only code that can rely on side effects from imports is the code that sits in the package that is importing for side effects, couldn't the imported package (the one with the side effects) simply expose a method that is called in the init process of the importing package? I'm just thinking that if calling the initialization manually in my package is a better approach, then importing a package for its side effects isn't really all that useful and should probably be discouraged; am I missing something? Should a mention about this possible inconsistency be included in the spec, in the section where import for side effects is brought up? |
This proposal has been added to the active column of the proposals project |
If we do anything here I think we should do the algorithm discussed in #31636. That issue was closed with a note in the CL that we should consider implementing the more deterministic algorithm later. CC @randall77 |
Specifically, the algorithm proposed in #31636 is "find the lexically earliest package [as sorted by import path] that is not initialized yet, but has had all its dependencies initialized, initialize that package, and repeat." |
@the-zucc The |
My sense of this so far has been that A significant (but acceptable) consequence of that concession is that because (I will concede that there are some packages in stdlib that seem to be intentionally designed around the existence of initialization side-effects, such as how It seems to me that the existence of However, I'm very skeptical about the import order of dependencies of a particular package being significant. The author of a package fully controls what it directly depends on, and so can avoid using combinations of packages that don't make sense together, can directly import the packages whose side-effects they depend on, and avoid directly importing packages that contain known bugs (init-order-related or otherwise). I share the same reaction to others in this discussion that having an However, if we put aside that particular part of the problem then we have an example where the It feels like a bug to me that I feel skeptical that the specification needs to guarantee anything stronger than that. |
@apparentlymart I am definitely sympathetic to that point of view. Still, for a counter-argument, see #31636 (comment) . |
@griesemer Excuse my ignorance, and thank you for the clarification! @apparentlymart I agree that relying on the import order for side effects being propagated is likely a very bad design choice if we're talking specifically about the example I mentioned earlier. However, wouldn't bringing consistency to import initialization order lessen that fact? Wouldn't it be "okay" (not saying it would be excellent, far from that) to be able to rely on a behavior that we can predict and that is consistent? Also, the whole idea of relying on import order for side effect propagation, again, starts with the mention in the spec:
This is inconsistent with the very valid principle that everyone has outlined here, that one shouldn't rely on a specific package initialization order for side effect propagation. In my point of view, it comes back to the same options that were outlined back in 2019 in #31636 (comment), but with a twist: in addition to any decision regarding the options previously outlined, why wouldn't importing for side effects just be discouraged in the spec ? Or does this fall out of the scope of such a specification? I'm thinking, something can be done that is similar to what is done with dot imports, where they are allowed but discouraged, and it clearly outlines it in the standard linting/syntax highlighting tools for Go in VSCode ( |
This is inaccurate. You should not rely on a specific initialization order when there is no explicit dependency between the two packages. That is:
The suggestion to have the language spec mandate that packages be initialized in lexicographic order when there is a "tie" is useful to eliminate currently undefined behavior that could cause grief as the toolchain evolves. However, it would still be ill-advised to rely on this behavior. For example, suppose I have two packages - "mypackage/one" and "mypackage/two". In this hypothetical future, assuming neither imports the other, "mypackage/one" will be initialized first, and "mypackage/two" could implicitly rely on this fact. But then suppose some developer makes changes that cause "mypackage/one" to import "mypackage/two". Now the import order will silently be reversed, causing the application to break in some way. And to the developer making the change, the reason for the failure may not be immediately obvious. If however the original dependency had been made explicit by having "mypackage/two" import "mypackage/one", then the aforementioned change would have resulted in a clear compile-time failure due to the import loop. |
@rittneje If there is an explicit dependency between the packages, the init order between them is predictable, and so your reasoning is not valid for the concerns I am expressing here. |
Note that in the general case, "renaming a package" and "deleting one package and adding another" are not distinguishable. Thus I don't think this goal is fully achievable without having a way for the developer to define the initialization order outside the package as in the original proposal, which I don't think Go should encourage or support. For me, both (3) and (4) in your list are essential in order to meet the general expectation that "if I don't change the source code at all, the program still works the same way". With regards to (1), I feel it is more that the import order is not a good source to base the initialization order on, since (a) it makes it too easy for developers to try to rely on and control it, and (b) it leads to confusion when two different packages (or even two different files in the same package) arrange their imports in contradictory orders. |
Right now we do a recursive traversal of the package import graph, with the imports of a given package considered in source code order. The suggestion here is to completely determine the order, by doing a topological sort where ties are broken by alphabetical order of import path. That makes gofmt's sorting of imports a no-op, which it has never been. A point was made that renaming a package will change when it gets initialized (in the tiebreaks) but renaming usually results in re-sorting with gofmt too, so it was already changing. This new order is a valid current order and is completely deterministic and fairly easy to explain. That seems like a win. |
This seems like a good idea but maybe we want to do a prototype implementation to convince ourselves there are no hidden gotchas. Does anyone want to try that? |
The essence of my earlier reactions was that I worry about a guaranteed initialization order encouraging increased reliance on the (IMHO regrettable) "registration by importing" pattern and possibly other order-sensitive designs like the one that motivated this proposal. I don't relish having more opportunities for unexplained behavior changes as a result of seemingly-unrelated changes in my own packages, particularly if those changes were made quietly and automatically by a tool like However, on further reflection I can see that existing code is already effectively relying on the current implementation detail and so it would be undesirable to make the behavior any less predictable than it already is; that would risk breaking systems that are currently working. Therefore there must be some specific order in each implementation, and if that is true then it's better for all implementations to agree rather than for codebases to behave subtly differently when compiled with a different implementation. Being prescriptive in the specification while making sure the prescription is backward compatible with the main implementation seems like the best way to get there. Given that, I retract my earlier objections. The specification is a description of how a Go implementation is expected to behave, and is not a "best practices" document describing what is and is not recommended in shared libraries. I don't have a strong opinion about what exactly the specified ordering is. The proposals above for the details seem fine to me. |
@rsc I will try to come up with an implementation, however I can't guarantee a short timeline for it at the moment, simply because of the busy schedule of simultaneous full-time work and college studies.
@apparentlymart very well said ! Last open question about this (hopefully) ; did anyone notice any change in behavior going from an application (with a main function) to just a unit tests file ? Am I wrong in the analysis I outlined in my previous comment here ? |
I have a prototype CL for this at https://go-review.googlesource.com/c/go/+/462035 |
My one minor concern with this is how it interacts with the "plugin" package (or any sort of dynamic loading). Right now, the init order between the main application and any additional packages initialized by loading a plugin is consistent with the underconstrained init order in the spec. I'm not sure any total init order can be consistent with plugin loading (unless the rule accounts for plugins). Again, this is a minor concern, and the init order of plugins is already inherently different with respect to non-init code, so perhaps we simply don't care. |
Are there any objections to this change? |
Making init order more predictable seems desirable, just to reduce unknowns when debugging / minimizing issues. And breaking ties by package path seems as reasonable to me as any other options. (Caveat: The Go spec today only talks about "import paths," not "package paths.") My only suggestion is since this is backwards compatible with the current specified semantics and there doesn't seem to be any compelling use case for end users to start depending on the new semantics, we could always change the toolchain(s) first and then update the spec (if at all) to match once we're confident the new behavior doesn't introduce any problems of its own. |
The important difference between import path and package path arises when using pre-module vendoring as described at https://go.dev/s/go15vendor. It is a good question whether we want to try to explain all that in the spec just to specify the ordering. Since it only comes up with pre-module vendoring, we could just ignore that in the spec and keep saying import path, and understand that people using pre-module vendoring are getting a technical spec violation, but they are essentially opting into an older version of Go anyway. So I think we could probably edit the spec, say import path, and not worry about the fact that the import path in source code differs from "actual" import path in these rare cases. (The vendoring in the standard library is not rare but it's also not something that affects anyone but us.) Editing the spec has the benefit of providing something users can rely on, so probably we should if we can. |
Based on the discussion above, this proposal seems like a likely accept. |
No change in consensus, so accepted. |
Change https://go.dev/cl/462035 mentions this issue: |
Rolled back in https://go.dev/cl/474976 due to problems in Google's internal test suite and perhaps also the aix builder. |
Change https://go.dev/cl/478916 mentions this issue: |
Retry in CL 478916. Google internal ran into two issues with this change. I will explain one; the other is very similar. The problem is an undefined ordering between Package
Package
Typically package Critically for the issue at hand, Without even an indirect dependence, the initialization order is unspecified (before this CL, that is). If the initialization of The fundamental problem here is that we're trying to establish dependence by name (a string) and not by an actual language construct reference. It would be better if the only way to set a flag was to get the address of it somehow (which would require exporting it, at least in test mode). In any case, that ship has sailed with how We're working on fixing some of these implicit ordering dependence things inside Google so we can retry the CL. The question here is, how much do we expect external code to run into this? Anything we can do to mitigate it? One option to consider is having |
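The package names in the comment above did not survive, so the following is only a guess at the shape of the problem, assuming it involves the standard flag package: one piece of code defines a flag, another sets it by name, and nothing in the language ties the two together. The sketch uses separate `flag.FlagSet` values to show both orders; the flag name `endpoint` and its values are made up for illustration.

```go
package main

import (
	"flag"
	"fmt"
)

// setBeforeDefine mimics the unlucky order: the code that sets the flag by
// name runs before the code that defines it, so Set has nothing to update
// and reports an error.
func setBeforeDefine() error {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	return fs.Set("endpoint", "localhost")
}

// defineBeforeSet mimics the lucky order: the flag already exists when it is
// set by name, so the new value takes effect.
func defineBeforeSet() (string, error) {
	fs := flag.NewFlagSet("demo", flag.ContinueOnError)
	endpoint := fs.String("endpoint", "prod.example.invalid", "server endpoint")
	err := fs.Set("endpoint", "localhost")
	return *endpoint, err
}

func main() {
	fmt.Println("set before define:", setBeforeDefine())
	value, err := defineBeforeSet()
	fmt.Println("define before set:", value, err)
}
```

If both steps run during package initialization and neither package imports the other, which of the two outcomes occurs depends entirely on the initialization order, and an error ignored in the first case would leave the flag silently at its default.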
A possibility: if |
Change https://go.dev/cl/480215 mentions this issue: |
Context
Package Initialization
When importing a package, it is initialized according to the sequence described in the Go specification. If that package has dependencies, it makes sense that these are initialized first, and that is what is described in the spec. In many instances, this is very useful and helps to write very succinct code, eliminating much boilerplate code.
Importing for Side Effects
In the Go specification, there is mention that it is possible to import packages for their side-effects:
Small Issue
Currently, the import order is predictable for packages with import relations. This is very useful and works flawlessly. However, as I've come to realize, within a series of import statements in a single file, a package that is imported (for its side effects) early in the list of imports has no guarantee of early initialization. Consequently, the initializations are not run in order, which makes results unpredictable at times.
Example Use Case - Mocking An Execution Environment Before Running Unit Tests
The package names here are not necessarily representative of what has caused this issue, they are just to explain the use case; let's say I have a package, packagea, that initializes a connection to a server, either in its init() or via variable initializers.
In this case, I've observed 2 things:
packageb will be imported and initialized before packagea -- in my case, the package imported the earliest was initialized at the end, effectively not propagating its side effects early enough in the program execution for them to apply to other package initializations, as the environment variable was not set, which caused packagea to default to the "production code" server URL, and not the testing URL;
Proposed Solution
Enable some way of specifying that a package should be imported before any other package for its side effects. As an example, initialization behavior for grouped imports could remain the same, but for separate import statements, statement order could define the initialization order.
EDIT: as mentioned in the comments, the proposed solution wouldn't work, unfortunately; however it still gives an idea of what a candidate implementation could look like.