Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

spec: disallow anonymous interface cycles #56103

Open
mdempsky opened this issue Oct 8, 2022 · 29 comments
Open

spec: disallow anonymous interface cycles #56103

mdempsky opened this issue Oct 8, 2022 · 29 comments

Comments

@mdempsky
Copy link
Member

mdempsky commented Oct 8, 2022

TL;DR: We should disallow declarations like type I interface { m() interface { I } }. They cause a lot of trouble for tools, and no one uses them in practice.

Background

The Go spec allows interface types to embed other declared interfaces. For example, package io declares ReadCloser using the type literal interface { Reader; Closer }. This type literal is identical to interface { Read([]byte) (int, error); Close() error }, but also to interface { Reader; Close() error } and interface { Read([]byte) (int, error); Closer }.

Type identity is a core concept of the Go language, and it's useful for tools (e.g., type checkers, compilers, static analysis passes) to have a canonical representation that types can be referred to with (e.g., to ensure runtime type descriptors representing identical types in separate compilation units are deduplicated by the linker). For interface types, the natural canonical representation is the fully expanded form without any embedded interfaces.

However, this causes problems for some self-referential interface types. The simplest example is:

type I interface { m() interface { I } }

Tools would like to expand the interface { I } to a canonical, embedding-free representation. However, expanding it produces interface { m() interface { I } }, which again contains interface { I } and requires expansion. Handled naively, this process never terminates.

The (accepted) type aliases proposal stated: "In contrast, aliases must be possible to “expand out”, and there is no way to expand out an alias like type T = *T."

The issue there was the same: given type T = U, we want to replace (expand) all occurrences of T with U to find a canonical type description. But if U is *T, then this process never terminates either.

Proposal: I propose the same principle should apply to interface embedding: it should be possible to finitely expand all embedded interfaces, and we should disallow declarations like type I interface { m() interface { I } }.

Unused

Anonymous, cyclic interfaces appear unused in practice. I ran an analysis of every unique module path indexed by index.golang.org, and I found only 2 occurrences of cyclic interfaces:

However, neither of these packages appear to be imported anywhere, even within their own modules.

The first use case could be addressed by introducing a second named interface type like type MySQLTx interface { MySQLDb; Rollback(); Commit() }. (This would also allow https://github.com/gozelus/zelus_rest/blob/master/core/db/db.go#L48 to be changed to func (d *dbImp) Begin() MySQLTx { ... }, instead of requiring the interface type literal to be repeated.)

The second appears like it's meant to be a testdata file instead.

Broken

In the past, anonymous, cyclic interfaces have been a recurring issue that we've struggled to support: #10222, #16369, #25262, #29312.

Notably, packages whose package export data contained an anonymous interface cycle couldn't be imported prior to Go 1.7, because the old textual export data format couldn't handle them. (See #16369 (comment).)

x/tools/go/ssa.Hasher has been known to mishandle them since 2018 (#26863), yet no end users have clamored for it to be fixed. This feature underpins many x/tools features like callgraph construction, points-to analysis, SSA building, gopls completion suggestions, and staticcheck. Moreover, CL 439117 was started to fix this issue, but has stalled as yet more failure cases are identified. The options at the moment are either: (1) continue to ignore the issue, (2) simplify interface hashing (at the risk of introducing collisions in realistic Go code), or (3) implement a considerably more complex algorithm.

Finally, while waiting for my module analysis code to execute, I manually discovered many more implementation issues with anonymous, cyclic interfaces: #56045, #56046, #56055, #56056, #56057, #56059, #56061, #56062, #56063, #56065.

Why not tie to go.mod go version?

In #3939, it's been proposed to remove int->string conversions, because the semantics are surprising to new users. One recurring suggestion has been to tie this to the go line in go.mod files: old modules would continue to compile successfully, whereas new modules would get an error instead to protect users from misuse.

Could we do the same for anonymous interface cycles?

I argue no: the issue with anonymous interface cycles isn't that users accidentally use them (like int->string conversions), but that tools authors have to deal with them at all. As long as anonymous interface cycles are allowed anywhere, all tools authors have to deal with them (e.g., CL 439117 above). Users are better served by letting tools authors focus on features that users actually care about and use.

@gopherbot gopherbot added this to the Proposal milestone Oct 8, 2022
@mvdan
Copy link
Member

mvdan commented Oct 8, 2022

I've written several Go tools and I agree with your conclusion: I've never seen this kind of cycle in Go code, and I'm pretty sure that at least one of my tools would fail catastrophically if presented with one.

@dominikh
Copy link
Member

dominikh commented Oct 8, 2022

Anonymous, cyclic interfaces appear unused in practice

Note that this doesn't speak to their usefulness, only to how much code would break if this change were made. People might have used cyclic interfaces more often if they didn't break tools. The existence of bug reports suggests that people have at least tried to use them.

As a tool author I'm in favour of this change, but I am conveniently ignoring that this is a backwards incompatible change. But ISTM that recently we've taken to allowing those if they don't actually affect any existing code…

@go101
Copy link

go101 commented Oct 9, 2022

type I interface { m() interface { I } } is equivalent to type I interface { m() I }. If the latter one is allowed, then maybe the former should also.

Is this only limited to result types? How about type I interface { m(interface { I; I2 } )}?

@mdempsky
Copy link
Member Author

mdempsky commented Oct 9, 2022

type I interface { m() interface { I } } is equivalent to type I interface { m() I }.

No, they're not.

Is this only limited to result types?

No.

@go101
Copy link

go101 commented Oct 9, 2022

Could you elaborate more on why they are not equivalent?

@go101
Copy link

go101 commented Oct 9, 2022

OK, I got it. One is named, the other is not.

@go101
Copy link

go101 commented Oct 9, 2022

Does the proposed expand rule requires all named interfaces to be expanded to unnamed ones?

@mdempsky
Copy link
Member Author

mdempsky commented Oct 9, 2022

All interface type literals must expand to canonical, embedding-free representations.

@dr2chase
Copy link
Contributor

dr2chase commented Oct 9, 2022

The existence of bug reports suggests that people have at least tried to use them.

Any indication how unhappy the bug reporters were about this? Was it a curiosity, or a real impediment?

@dominikh
Copy link
Member

dominikh commented Oct 9, 2022

Any indication how unhappy the bug reporters were about this? Was it a curiosity, or a real impediment?

For Staticcheck, only one issue has been filed because of this problem (dominikh/go-tools#310). I believe the snippet was extracted from real code, but the issue has gone unfixed since 2018 and the user hasn't complained yet, so I can only assume that they worked around it.

@mdempsky
Copy link
Member Author

mdempsky commented Oct 9, 2022

People might have used cyclic interfaces more often if they didn't break tools.

Maybe, but I think there's a few points:

  1. There are features like plugins where we get users regularly asking us for updates. I think if this was a feature users really cared about, we'd have had at least one complaint since 2018.
  2. There are tons of issues with anonymous, cyclic interfaces, but they do work at a very basic level. Despite that, there still don't seem to be any uses of them in practice.
  3. Maybe in the future we discover both some compelling use cases for anonymous, cyclic interfaces and the algorithms/approach necessary to support them. In that case, I'd be okay with relaxing the spec to (re)allow them. But today, the cost of even trying to support them seems to far outweigh the benefit the ecosystem enjoys from that.

@adonovan
Copy link
Member

adonovan commented Oct 10, 2022

[domink] As a tool author I'm in favour of this change, but I am conveniently ignoring that this is a backwards incompatible change.

That's my feeling too. I wish this feature would go away, but generally it's the compiler-writer's job to absorb the complexity to make the (idealized) spec simpler, and if the spec doesn't have to disallow this special case, that's one less thing the user needs to understand. (Let's ignore for now that the spec doesn't say quite what it should about recursive types.) Matthew has shown empirically that a truly tiny number of Go programs use this feature, but even if all of them are excessively complex and could be easily rewritten, that figure is still not zero.

. . .

We tool writers had been operating under the assumption that cycles always arose via named types, and thus it would sufficient to break cycles at named types, such as I in the example:

type I interface { m() interface { I } }

But what we discovered is that in traversing the method m we end up back at m without ever seeing the name I. That's because types.Interface.Methods iterates over the method set of interface { I } without the client being aware that it was notated using an embedding of the named type I. This suggests that Interface.Methods is different from all the other accessor methods in the types API in that it returns the result of deep computation over the type graph rather than a shallow property of the interface { I } type itself.

Perhaps there are other ways to make clients more aware that Interface.Methods is qualitatively different, and to help provide them with the information they need to make their algorithms robust against cycles. For example, last week my coworkers and I spent between us probably a couple of hours on a cyclic-interface problem in the golang.org/x/tools/go/types/typeutil.hasher type. In that instance, if the type checker API had exposed the information that interface type was self-referential it would have made the solution trivial.

@rsc
Copy link
Contributor

rsc commented Oct 12, 2022

This proposal has been added to the active column of the proposals project
and will now be reviewed at the weekly proposal review meetings.
— rsc for the proposal review group

@DmitriyMV
Copy link

DmitriyMV commented Oct 13, 2022

Will this compile under the new rules?

type B interface { I }
type I interface { m() interface { B } }

What about this?

type B = interface{ I }
type I interface{ m() interface{ B } }

And this?

type B = interface{ I }
type I interface{ m() B }

@timothy-king
Copy link
Contributor

timothy-king commented Oct 13, 2022

@DmitriyMV All 3 of those examples are cyclic. These would be disallowed by the new rules.

@mdempsky
Copy link
Member Author

mdempsky commented Oct 13, 2022

@DmitriyMV I proposed "it should be possible to finitely expand all embedded interfaces."

type B interface { I }
type I interface { m() interface { B } }

has two embedded interfaces, I in interface { I } and B in interface { B }. Trying to expand I first, we get:

type B interface { m() interface { B } }
type I interface { m() interface { B } }

Expanding the first embedded interface again (now the first B) produces:

type B interface { m() interface { m() interface { B } } }
type I interface { m() interface { B } }

This process will clearly never terminate in a finite number of steps. A similar process unfolds given your other two examples. So no, none of those would compile.

--

Note: This rule naturally subsumes the spec's "An interface type T may not embed any type element that is, contains, or embeds T, recursively."

For example, the spec demonstrates with an example of:

type Bad interface { Bad }

Trying to expand Bad here again produces the same interface literal. So no finite number of expansions yields an embedding-free representation for Bad.

The same reasoning applies to type Bad1 interface { Bad2 }; type Bad2 interface { Bad1 }.

@deefdragon
Copy link

deefdragon commented Oct 20, 2022

I have three things.

1: This is the best simple, but non-trivial, example of something that uses this paradigm I could come up with

type Nexter interface { 
    Next(Input) (interface { Nexter }, error)
    Done() Output
}

If this is a valid example, I can see this as a rather useful chaining pattern.

2: I think that this would be a backwards incompatible change under the strict definitions of the compatibility promise. I am however, personally in favor of the "loosening" of the compatibility promise that has happened recently. I feel it has made the language better overall. Therefore I think it would be ok to disallow it in regards to the promise.

3: While I think it would be ok to be be disallowed, I don't think it should be disallowed. I dont feel that the reasoning of "They causes a lot of trouble for tools" is very strong. I think the tooling should strive to match the language, not the other way around. I also echo @dominikh in that I suspect if the tooling was better around them, they might be a more common pattern.

@mdempsky
Copy link
Member Author

mdempsky commented Oct 21, 2022

@deefdragon But why would you write the interface that way instead of:

type Nexter interface { 
    Next(Input) (Nexter, error)
    Done() Output
}

Did you happen to try writing a type that actually implements your Nexter interface? (I assume not, because of the syntax errors.)

I can see this as a rather useful chaining pattern.

By definition, if the pattern is "useful" then it must have some "uses." But as I noted in the initial comment, that does not seem to be the case in practice.

I think the tooling should strive to match the language, not the other way around.

Yes, it's easy to say tools should be better and do more, but as tools authors we have finite engineering resources to spread around different user demands. In practice, no one actually uses this feature, yet the complexity it imposes hinders us from implementing compiler optimizations that users would actually benefit from.

@deefdragon
Copy link

deefdragon commented Oct 21, 2022

Apologies on the syntax. That has been corrected.

Regarding interface{Nexter} vs just Nexter, as I had said, "If this is a valid example". Id obviously not (and maybe still don't) understand the problem perfectly. And anything that could require interface{Nexter, Fooer} would probably be better off embedded in the Nexter interface vs returned from Next. (maybe something that you only want an initially receiving step to do but that's more convoluted than the "simple" example I was after)

Regarding its uses, I can see uses for it in the back of my head, however, I cant come up with anything concrete or specific.

Regarding your third point, I obviously misinterpreted the "Broken" section of your proposal. The way I read it did not get through to me the lengths to which it could improve things for an average go user in the background.

Regarding my tooling comments. The point behind it was not to say that tooling has to be perfect. I fully recognize how difficult that would be, and how little time people have, (a large portion of which they are already donating to the development of these tools (Thank you for that btw)). My intended point was that tooling not being a perfect match to the spec is, to me, not a very good reason to change the language on its own.

I am more on the fence about this staying than I was before. I still lean towards keeping it, and still think this could be a useful pattern if explored, but I don't deal with chaining function calls enough to make a concrete argument or have a concrete example.

@folays
Copy link

folays commented Oct 22, 2022

What about the case of "node" / "graph" / "list" / "chaining" interfaces ?

// hypothetical developper A's Go module "go-node", package "gonode"
type Node interface { // public interface
	Parent() Node
	FirstChild() Node
	Children() []Node
}

// hypothetical developper B's Go module "my-implem", aiming to implement gonode.Node interface
// another package providing one implementation respecting the Node interface
type myNode struct {
	parent   *node
	children []*node
}

I did not provided the necessary methods of myNode to make it implement Node, but you can guess what they are doing.

I never wrote code like this, and never used code like this, but I would be very surprised that no-one is doing it, and I would find it sensical.

That would also be disallowed ?

Even reflect interface which somewhat genericalize all types and permit type traversal of all types involved for a given type, is doing that.

type Type interface {
	[...] // only including the relevant methods
	Elem() Type
	Key() Type
	In(i [int) Type
	Out(i int) Type
}

@mdempsky
Copy link
Member Author

mdempsky commented Oct 22, 2022

@folays I proposed "it should be possible to finitely expand all embedded interfaces."

None of your examples involve embedded interfaces, so they're already expanded. They would continue to be allowed.

@timothy-king
Copy link
Contributor

timothy-king commented Oct 24, 2022

One [maybe?] advantage of allowing cyclic interfaces to be anonymous is that they do not require a definition of the interface to be imported in order to be implemented. This is a hyped property of Go's interfaces in other contexts. A signature that takes or returns a named type does need to be imported even if the named type is an interface.

The simplest example I can think of to illustrate this point is weird: https://go.dev/play/p/_1JCj2upF-R

type I interface { Out() interface{ I } }

type Impl struct{} // Impl implements I without referring to I so it can be in another package.
type otherI interface { Out() interface{ otherI } }    // name+embedding is used to create the same structure as I.
func (Impl) Out() interface{ otherI } { return nil }    // the embedding erases the name otherI.
var _ I = Impl{}

I do not think anybody wants to write or read code like this [outside of intellectual exercises]. If the above is banned, folks would still have the option of creating a package just to export the interface definition. That does actually sound better than reading and maintaining the above example even if it requires importing the interface to implement it.

@rsc
Copy link
Contributor

rsc commented Oct 26, 2022

This boils down to whether we want to guarantee that every expression of the form 'interface { ... }' can have its embedded interfaces "compiled out" so that the interface expression can be written without any embedded interfaces.

So this is OK:

type B interface { x() I }
type I interface { m() interface { B } }

Because it compiles out to:

type B interface { x() I }
type I interface { m() interface { x() I } }

But this is not:

type B interface { I }
type I interface { m() interface { B } }

because B compiles out to:

type B interface { m() interface { B } }
type B interface { m() interface { m() interface { B } } }
etc.

Making this guarantee would simplify tools and analysis and the compiler, and it might even simplify teaching Go to people if you can say that interfaces can always be expanded out this way to remove embedding.

It's true that we've allowed this in some forms already, but @mdempsky says that complex cases are still buggy today.

It also appears to be true that almost no one uses these forms, so if we did make the restriction, it seems that hardly any code would be affected. As @mdempsky reported, it looks like https://github.com/gozelus/zelus_rest/blob/master/core/db/db.go#L20 is the only problematic use in the entire public ecosystem, and the package appears to be unused, even in the rest of that module.

If we do move forward with the restriction, we should have some way to disable the restriction for a release to let people adapt. Or maybe we should do one release where vet complains, followed by removing it entirely in the next release?

Does anyone object to adopting this change and this rollout plan?

@gopherbot
Copy link

gopherbot commented Oct 27, 2022

Change https://go.dev/cl/445598 mentions this issue: cmd/compile: reject anonymous interface cycles

@mdempsky
Copy link
Member Author

mdempsky commented Oct 27, 2022

If we do move forward with the restriction, we should have some way to disable the restriction for a release to let people adapt.

SGTM. go.dev/cl/445598 changes cmd/compile to reject anonymous interface cycles, and adds a flag (spelled -d=interfacecycles) to suppress the error diagnostic. In 1.21, we can make it an unconditional error in go/types / types2 instead.

@rsc
Copy link
Contributor

rsc commented Nov 2, 2022

Given how very few uses we think there are (approximately zero), a compiler flag to opt-out sounds fine. If we thought a significant number of users were affected, we'd want to do a more gradual rollout with a vet check first. But it seems safe to skip that for now.

So the plan sounds like:

For Go 1.20, the compiler will reject these interface cycles by default, but you can build with 'go build -gcflags=all=-d=interfacecycles' to keep old code building. If people report significant amounts of breakage to us during the release candidates, we would back out this change.

For Go 1.22, the -d=interfacecycles flag will be removed, and old code will simply not build anymore. Using Go 1.22 gives people a year to update their code and helps with people who may skip a version. Normally we would not ever want to break old code, but all the evidence we have suggests that this will break no real code at all. If people do report breakages to us, we can push the flag removal out to a later release, like Go 1.24.

Do I have that right?

@mdempsky
Copy link
Member Author

mdempsky commented Nov 2, 2022

SGTM. I'm happy to wait until 1.22 before removing the opt-out to give more time to smoke out potentially affected users.

@rsc
Copy link
Contributor

rsc commented Nov 9, 2022

Based on the discussion above, this proposal seems like a likely accept.
— rsc for the proposal review group

@rsc
Copy link
Contributor

rsc commented Nov 16, 2022

No change in consensus, so accepted. 🎉
This issue now tracks the work of implementing the proposal.
— rsc for the proposal review group

@rsc rsc changed the title proposal: spec: disallow anonymous interface cycles spec: disallow anonymous interface cycles Nov 16, 2022
@rsc rsc modified the milestones: Proposal, Backlog Nov 16, 2022
gopherbot pushed a commit that referenced this issue Nov 21, 2022
This CL changes cmd/compile to reject anonymous interface cycles like:

	type I interface { m() interface { I } }

We don't anticipate any users to be affected by this change in
practice. Nonetheless, this CL also adds a `-d=interfacecycles`
compiler flag to suppress the error. And assuming no issue reports
from users, we'll move the check into go/types and types2 instead.

Updates #56103.

Change-Id: I1f1dce2d7aa19fb388312cc020e99cc354afddcb
Reviewed-on: https://go-review.googlesource.com/c/go/+/445598
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
felixge pushed a commit to felixge/go that referenced this issue Nov 21, 2022
This CL changes cmd/compile to reject anonymous interface cycles like:

	type I interface { m() interface { I } }

We don't anticipate any users to be affected by this change in
practice. Nonetheless, this CL also adds a `-d=interfacecycles`
compiler flag to suppress the error. And assuming no issue reports
from users, we'll move the check into go/types and types2 instead.

Updates golang#56103.

Change-Id: I1f1dce2d7aa19fb388312cc020e99cc354afddcb
Reviewed-on: https://go-review.googlesource.com/c/go/+/445598
Run-TryBot: Matthew Dempsky <mdempsky@google.com>
Reviewed-by: Robert Griesemer <gri@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Matthew Dempsky <mdempsky@google.com>
@griesemer griesemer modified the milestones: Backlog, Go1.21 Nov 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Accepted
Development

No branches or pull requests