all: support gradual code repair while moving a type between packages #18130

Closed
rsc opened this Issue Dec 1, 2016 · 225 comments

Contributor

rsc commented Dec 1, 2016

Original Title: proposal: support gradual code repair while moving a type between packages

Go should add the ability to create alternate equivalent names for types, in order to enable gradual code repair during codebase refactoring. This was the target of the Go 1.8 alias feature, proposed in #16339 but held back from Go 1.8. Because we did not solve the problem for Go 1.8, it remains a problem, and I hope we can solve it for Go 1.9.

In the discussion of the alias proposal, there were many questions about why this ability to create alternate names for types in particular is important. As a fresh attempt to answer those questions, I wrote and posted an article, “Codebase Refactoring (with help from Go).” Please read that article if you have questions about the motivation. (For an alternate, shorter presentation, see Robert's Gophercon lightning talk. Unfortunately, that video wasn't available online until October 9. Update, Dec 16: here's my GothamGo talk, which was essentially the first draft of the article.)

This issue is not proposing a specific solution. Instead, I want to gather feedback from the Go community about the space of possible solutions. One possible avenue is to limit aliases to types, as mentioned at the end of the article. There may be others we should consider as well.

Please post thoughts about type aliases or other solutions as comments here.

Thank you.

Update, Dec 16: Design doc for type aliases posted.
Update, Jan 9: Proposal accepted, dev.typealias repository created, implementation due at the start of the Go 1.9 cycle for experimentation.


Discussion summary (last updated 2017-02-02)

Do we expect to need a general solution that works for all declarations?

If type aliases are 100% necessary, then var aliases are maybe 10% necessary, func aliases are 1% necessary, and const aliases are 0% necessary. Because const already has = and func could plausibly use = too, the key question is whether var aliases are important enough to plan for or implement.

As argued by @rogpeppe (#16339 (comment)) and @ianlancetaylor (#16339 (comment)) in the original alias proposal and as mentioned in the article, a mutating global var is usually a mistake. It probably doesn't make sense to complicate the solution to accommodate what is usually a bug. (In fact, if we can figure out how, it would not surprise me if in the long term Go moves toward requiring global vars to be immutable.)

Because richer var aliases are likely not important enough to plan for, it seems like the right choice here is to focus only on type aliases. Most of the comments here seem to agree. I won't list everyone.

Do we need a new syntax (= vs => vs export)?

The strongest argument for new syntax is the need to support var aliases, either now or in the future (#18130 (comment) by @Merovius). It seems okay to plan not to have var aliases (see previous section).

Without var aliases, reusing = is simpler than introducing new syntax, whether => like in the alias proposal, ~ (#18130 (comment) by @joegrasse), or export (#18130 (comment) by @cznic).

Using = would also exactly match the syntax of type aliases in Pascal and Rust. To the extent that other languages have the same concepts, it's nice to use the same syntax.

Looking ahead, there could be a future Go in which func aliases exist too (see #18130 (comment) by @nigeltao), and then all declarations would permit the same form:

const C2 = C1
func F2 = F1
type T2 = T1
var V2 = V1

The only one of these that wouldn't establish a true alias would be the var declaration, because V2 and V1 can be redefined independently as the program executes (unlike the const, func, and type declarations which are immutable). Since one main reason for variables is to allow them to vary, that exception would at least be easy to explain. If Go moves toward immutable global vars, then even that exception would disappear.
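
To make the var exception concrete, here is a minimal sketch. It assumes a toolchain that accepts the type T2 = T1 form under discussion (Go 1.9 or later); the names T1, T2, V1, V2, takesT1, and demo are purely illustrative.

```go
package main

import "fmt"

type T1 struct{ N int }

// T2 is an alias: T2 and T1 denote the same type.
type T2 = T1

var V1 = 1

// V2 is a separate variable initialized from V1, not a second name for V1.
var V2 = V1

// takesT1 accepts a T1; a T2 value can be passed without conversion.
func takesT1(t T1) int { return t.N }

// demo reassigns V1 and shows that V2 keeps its original value.
func demo() (int, int, int) {
	V1 = 10
	return takesT1(T2{N: 7}), V1, V2
}

func main() {
	n, v1, v2 := demo()
	fmt.Println(n, v1, v2) // prints: 7 10 1
}
```

The type declaration gives one type two names, while the var declaration gives two variables that merely start with the same value.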

To be clear, I am not suggesting func aliases or immutable global vars here, just working through the implications of such future additions.

@jimmyfrasche suggested (#18130 (comment)) aliases for everything except consts, so that const would be the exception instead of var:

const C2 = C1 // no => form
func F2 => F1
type T2 => T1
var V2 => V1
var V2 = V1 // different from => form

Having inconsistencies with both const and var seems more difficult to explain than having just an inconsistency for var.

Can this be a tooling- or compiler-only change instead of a language change?

It's certainly worth asking whether gradual code repair can be enabled purely by side information supplied to the compiler (for example, #18130 (comment) by @btracey).

Or maybe if the compiler can apply some kind of rule-based preprocessing to transform input files before compilation (for example, #18130 (comment) by @tux21b).

Unfortunately, no, the change really can't be confined that way. There are at least two compilers (gc and gccgo) that would need to coordinate, but so would any other tools that analyze programs, like go vet, guru, goimports, gocode (code completion), and others.

As @bcmills said (#18130 (comment)), “a ‘non-language-change’ mechanism which must be supported by all implementations is a de facto language change — it’s just one with poorer documentation.”

What other uses might aliases have?

We know of the following. Given that type aliases in particular were deemed important enough for inclusion in Pascal and Rust, there are likely others.

  1. Aliases (or just type aliases) would enable creating drop-in replacements that expand other packages. For example see https://go-review.googlesource.com/#/c/32145/, especially the explanation in the commit message.

  2. Aliases (or just type aliases) would enable structuring a package with a small API surface but a large implementation as a collection of packages for better internal structure but still present just one package to be imported and used by clients. There's a somewhat abstract example described at #16339 (comment).

  3. Protocol buffers have an "import public" feature whose semantics is trivial to implement in generated C++ code but impossible to implement in generated Go code. This causes frustration for authors of protocol buffer definitions shared between C++ and Go clients. Type aliases would provide a way for Go to implement this feature. In fact, the original use case for import public was gradual code repair. Similar issues may arise in other kinds of code generators.

  4. Abbreviating long names. Local (unexported or not-package-scoped) aliases might be handy to abbreviate a long type name without introducing the overhead of a whole new type. As with all these uses, the clarity of the final code would strongly influence whether this is a suggested use.
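
As a rough illustration of use 4, the sketch below abbreviates a long map type with an unexported alias. It assumes the type T = U form (Go 1.9 or later); table, newTable, and sameType are hypothetical names chosen for the example.

```go
package main

import "fmt"

// table is a short local name for a long type expression.
// Because it is an alias, it introduces no new type.
type table = map[string]map[int]interface{}

// newTable builds a small value using the short name.
func newTable() table {
	return table{"a": {1: "x", 2: "y"}}
}

// sameType reports whether the alias and the spelled-out type
// are described identically at run time.
func sameType() bool {
	short := fmt.Sprintf("%T", newTable())
	long := fmt.Sprintf("%T", map[string]map[int]interface{}{})
	return short == long
}

func main() {
	fmt.Println(len(newTable()["a"]), sameType())
}
```

Whether such an abbreviation helps or hurts readability depends on the surrounding code, as the item above says.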

What other issues does a proposal for type aliases need to address?

Listing these for reference. Not attempting to solve or discuss them in this section, although a few were discussed later and are summarized in separate sections below.

  1. Handling in godoc. (#18130 (comment) by @nigeltao and #18130 (comment) by @jimmyfrasche)

  2. Can methods be defined on types named by alias? (#18130 (comment) by @ulikunitz)

  3. If aliases to aliases are allowed, how do we handle alias cycles? (#18130 (comment) by @thwd)

  4. Should aliases be able to export unexported identifiers? (#18130 (comment) by @thwd)

  5. What happens when you embed an alias (how do you access the embedded field)? (#18130 (comment) by @thwd, also #17746)

  6. Are aliases available as symbols in the built program? (#18130 (comment) by @thwd)

  7. Ldflags string injection: what if we refer to an alias? (#18130 (comment) by @thwd; this only arises if there are var aliases.)

Is versioning a solution by itself?

"In that case maybe versioning is the whole answer, not type aliases."
(#18130 (comment) by @iainmerrick)

As noted in the article, I think versioning is a complementary concern. Support for gradual code repair, such as with type aliases, gives a versioning system more flexibility in how it builds a large program, which can be the difference between being able to build the program and not.

Can the larger refactoring problem be solved instead?

In #18130 (comment), @niemeyer points out that there were actually two changes for moving os.Error to error: the name changed but so did the definition (the current Error method used to be a String method).

@niemeyer suggests that perhaps we can find a solution to the broader refactoring problem that fixes types moving between packages as a special case but also handles things like method names changing, and he proposes a solution built around "adapters".

There is a fair amount of discussion in the comments that I can't easily summarize here. The discussion isn't over, but so far it is unclear whether "adapters" can fit into the language or be implemented in practice. It does seem clear that adapters are at least one order of magnitude more complex than type aliases.

Adapters need a coherent solution to the subtyping problems noted below as well.

Can methods be declared on alias types?

Certainly aliases do not allow bypassing the usual method definition restrictions: if a package defines type T1 = otherpkg.T2, it cannot define methods on T1, just as it cannot define methods directly on otherpkg.T2. That is, if type T1 = otherpkg.T2, then func (T1) M() is equivalent to func (otherpkg.T2) M(), which is invalid today and remains invalid. However, if a package defines type T1 = T2 (both in the same package), then the answer is less clear. In this case, func (T1) M() would be equivalent to func (T2) M(); since the latter is allowed, there is an argument to allow the former. The current design doc does not impose a restriction here (in keeping with the general avoidance of restrictions), so that func (T1) M() is valid in this situation.
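
A small sketch of the same-package case, assuming the unrestricted design described above (this is how Go 1.9 and later behave); T1, T2, M, and callM are illustrative names.

```go
package main

import "fmt"

// T2 is an ordinary defined type in this package.
type T2 struct{ name string }

// T1 is an alias for T2, declared in the same package.
type T1 = T2

// M is declared with the alias as receiver. It is equivalent to
// func (t T2) M() string, so it lands in T2's method set.
func (t T1) M() string { return "M on " + t.name }

// callM invokes M through a value declared with the original name.
func callM() string {
	v := T2{name: "T2"}
	return v.M()
}

func main() {
	fmt.Println(callM()) // prints: M on T2
}
```

If T1 instead named otherpkg.T2, the method declaration would be rejected, exactly as func (otherpkg.T2) M() is.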

In #18130 (comment), @jimmyfrasche suggests that instead defining "no use of aliases in method definitions" would be a clear rule and avoid needing to know what T is defined as to know if func (T) M() is valid. In #18130 (comment), @rsc points out that even today there are certain T for which func (T) M() is not valid: https://play.golang.org/p/bci2qnldej. In practice this doesn't come up because people write reasonable code.

We will keep this possible restriction in mind but wait until there is strong evidence that it is needed before introducing it.

Is there a cleaner way to handle embedding and, more generally, field renames?

In #18130 (comment), @Merovius points out that an embedded type that changes its name during a package move will cause problems when that new name must eventually be adopted at the use sites. For example if user type U has an embedded io.ByteBuffer that moves to bytes.Buffer, then while U embeds io.ByteBuffer the field name is U.ByteBuffer, but when U is updated to refer to bytes.Buffer, the field name necessarily changes to U.Buffer.

In #18130 (comment), @neild points out that there is at least a workaround if references to io.ByteBuffer must be excised: the package P that defines U can also define 'type ByteBuffer = bytes.Buffer' and embed that type into U. Then U still has a U.ByteBuffer, even after io.ByteBuffer is gone entirely.
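
The workaround could look roughly like this, assuming type aliases as proposed (Go 1.9 or later). The io.ByteBuffer name comes from the article's example; the fill helper is hypothetical.

```go
package main

import (
	"bytes"
	"fmt"
)

// ByteBuffer is a package-local alias that keeps the old name alive
// after io.ByteBuffer itself is gone.
type ByteBuffer = bytes.Buffer

// U embeds the alias, so the field is named ByteBuffer, not Buffer.
type U struct {
	ByteBuffer
}

// fill writes through both the explicit field and a promoted method.
func fill() string {
	var u U
	u.ByteBuffer.WriteString("hello")
	u.WriteString(", world") // promoted from *bytes.Buffer
	return u.ByteBuffer.String()
}

func main() {
	fmt.Println(fill()) // prints: hello, world
}
```

Existing references to U.ByteBuffer keep compiling, at the cost of an extra top-level declaration in package P.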

In #18130 (comment), @bcmills suggests the idea of field aliases, to allow a field to have multiple names during a gradual repair. Field aliases would allow defining something like type U struct { bytes.Buffer; ByteBuffer = Buffer } instead of having to create the top-level type alias.

In #18130 (comment), @rsc raises yet another possibility: some syntax for 'embed this type with this name', so that it is possible to embed a bytes.Buffer as the field name ByteBuffer, without needing a top-level type or an alternate name. If that existed, then the type name could be updated from io.ByteBuffer to bytes.Buffer while preserving the original name (and not introducing a second, nor a clumsy exported type).

These all seem worth exploring once we have more evidence of large-scale refactorings blocked by problems with fields changing names. As @rsc wrote, "If type aliases help us get to the point where lack of field aliases is the next big roadblock for large-scale refactorings, that will be progress!"

There was a suggestion of restricting the use of aliases in embedded fields or changing the embedded name to use the target type's name, but those make the alias introduction break existing definitions that must then be fixed atomically, essentially preventing any gradual repair. @rsc: "We discussed this at some length in #17746. I was originally on the side of the name of an embedded io.ByteBuffer alias being Buffer, but the above argument convinced me I was wrong. @jimmyfrasche in particular made some good arguments about the code not changing depending on the definition of the embedded thing. I don't think it's tenable to disallow embedded aliases completely."

What is the effect on programs using reflection?

Programs using reflection see through aliases. In #18130 (comment), @atdiar points out that if a program is using reflection to, for example, find the package in which a type is defined or even the name of a type, it will observe the change when the type is moved, even if a forwarding alias is left behind. In #18130 (comment), @rsc confirmed this and wrote "Like the situation with embedding, it's not perfect. Unlike the situation with embedding, I don't have any answers except maybe code shouldn't be written using reflect to be quite that sensitive to those details."

The use of vendored packages today also changes package import paths seen by reflect, and we have not been made aware of significant problems caused by that ambiguity. This suggests that programs are not commonly inspecting reflect.Type.PkgPath in ways that would be broken by use of aliases. Even so, it's a potential gap, just like embedding.
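
For illustration, the sketch below shows what reflection reports for a forwarding alias, assuming type aliases as proposed (Go 1.9 or later). OldBuffer and describe are hypothetical names.

```go
package main

import (
	"bytes"
	"fmt"
	"reflect"
)

// OldBuffer stands in for a forwarding alias left behind after a move.
type OldBuffer = bytes.Buffer

// describe returns the name and package path that reflect sees
// for a value declared through the alias.
func describe() (name, pkg string) {
	t := reflect.TypeOf(OldBuffer{})
	return t.Name(), t.PkgPath()
}

func main() {
	name, pkg := describe()
	fmt.Println(name, pkg) // prints: Buffer bytes
}
```

The alias name does not appear anywhere in the result: reflect reports the target type's own name and package.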

What is the effect on separate compilation of programs and plugins?

In #18130 (comment), @atdiar raises the question of the effect on object files and separate compilation. In #18130 (comment), @rsc replies that there should be no need to make changes here: if X imports Y and Y changes and is recompiled, then X needs to be recompiled too. That's true today without aliases, and it will remain true with aliases. Separate compilation means being able to compile X and Y in distinct steps (the compiler does not have to process them in the same invocation), not that it is possible to change Y without recompiling X.

Would sum types or some kind of subtyping be an alternative solution?

In #18130 (comment), @iand suggests "substitutable types", "a list of types that may be substituted for the named type in function arguments, return values etc.". In #18130 (comment), @j7b suggests using algebraic types "so we also get an empty interface equivalent with compile time type checking as a bonus". Other names for this concept are sum types and variant types.

In general this does not suffice to allow moving types with gradual code repair. There are two ways to think about this.

In #18130 (comment), @bcmills takes the concrete way, pointing out that algebraic types have a different representation than the original, which makes it not possible to treat the sum and the original as interchangeable: the latter has type tags.

In #18130 (comment), @rsc takes the theoretical way, expanding on #18130 (comment) by @gri pointing out that in a gradual code repair, sometimes you need T1 to be a subtype of T2 and sometimes vice versa. The only way for both to be subtypes of each other is for them to be the same type, which not coincidentally is what type aliases do.
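
A short sketch of why identity matters in both directions, assuming type aliases as proposed (Go 1.9 or later). T1, Alias, Defined, sum, double, and demo are illustrative names.

```go
package main

import "fmt"

type T1 struct{ N int }

// Alias denotes exactly the same type as T1.
type Alias = T1

// Defined is a new, distinct type that only shares T1's underlying struct.
type Defined T1

// sum consumes []T1; a []Alias is accepted because it is the same type.
func sum(xs []T1) int {
	total := 0
	for _, x := range xs {
		total += x.N
	}
	return total
}

// double is declared as func(Alias) int but initialized with a func(T1) int:
// the identity also holds in parameter position.
var double func(Alias) int = func(t T1) int { return 2 * t.N }

func demo() (int, int) {
	xs := []Alias{{N: 1}, {N: 2}}
	// Passing a []Defined to sum would not compile:
	// []Defined and []T1 are different types.
	return sum(xs), double(T1{N: 4})
}

func main() {
	s, d := demo()
	fmt.Println(s, d) // prints: 3 8
}
```

Slices and function types are where a one-directional subtype relation breaks down, which is why only type identity is enough here.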

As a side tangent, in addition to not solving the gradual code repair problem, algebraic types / sum types / union types / variant types are by themselves hard to add to Go. See the FAQ answer and the Go 1.6 AMA discussion for more.

In #18130 (comment), @thwd suggests that since Go has a subtyping relationship between concrete types and interfaces (bytes.Buffer can be seen as a subtype of io.Reader) and between interfaces (io.ReadWriter is a subtype of io.Reader in the same way), making interfaces "recursively covariant (according to the current variance rules) down to their method arguments" would solve the problem provided that all future packages only use interfaces, never concrete types like structs ("encourages good design, too").

There are three problems with that as a solution. First, it has the subtyping issues above, so it doesn't solve gradual code repair. Second, it doesn't apply to existing code, as @thwd noted in this suggestion. Third, forcing the use of interfaces everywhere may not actually be good design and introduces performance overheads (see for example #18130 (comment) by @Merovius and #18130 (comment) by @zombiezen).

Restrictions

This section collects proposed restrictions for reference, but keep in mind that restrictions add complexity. As I wrote in #18130 (comment), "we should probably only implement those restrictions after actual experience with the unrestricted, simpler design helps us understand whether the restriction would bring enough benefits to pay for its cost."

Put another way, any restriction would need to be justified by evidence that it would prevent some serious misuse or confusion. Since we haven't implemented a solution yet, there is no such evidence. If experience did provide that evidence, these will be worth returning to.

Restriction? Aliases of standard library types can only be declared in standard library.

(#18130 (comment) and #18130 (comment) by @iand)

The concern is "code that has renamed standard library concepts to fit a custom naming convention", or "long spaghetti chains of aliases across multiple packages that end up back at the standard library", or "aliasing things like interface{} and error".

As stated, the restriction would disallow the "extension package" case described above involving x/image/draw.

It's unclear why the standard library should be special: the problems would exist with any code. Also, neither interface{} nor error is a type from the standard library. Rephrasing the restriction as "aliasing predefined types" would disallow aliasing error, but the need to alias error was one of the motivating examples in the article.

Restriction? Alias target must be package-qualified identifier.

(#18130 (comment) by @jba)

This would make it impossible to make an alias when renaming a type within a package, which may be used widely enough to necessitate a gradual repair (#18130 (comment) by @bcmills).

It would also disallow aliasing error as in the article.

Restriction? Alias target must be package-qualified identifier with same name as alias.

(proposed during alias discussion in Go 1.8)

In addition to the problems of the previous section with limiting to package-qualified identifiers, forcing the name to stay the same would disallow the conversion from io.ByteBuffer to bytes.Buffer in the article.

Restriction? Aliases should be discouraged in some way.

"How about hiding aliases behind an import, just like for "C" and “unsafe”, to further discourage it's usage? In the same vein, I would like the aliases syntax to be verbose and stand out as a scaffold for on going refactoring." - #18130 (comment) by @xiegeo

"Should we also automatically infer that an aliased type is legacy and should be replaced by the new type? If we enforce golint, godoc and similar tools to visualize the old type as deprecated, it would limit the abuse of type aliasing very significantly. And the final concern of aliasing feature being abused would be resolved." - #18130 (comment) by @rakyll

Until we know that they will be used wrong, it seems premature to discourage usage. There may be good, non-temporary uses (see above).

Even in the case of code repair, either the old or new type may be the alias during the transition, depending on the constraints imposed by the import graph. Being an alias does not mean the name is deprecated.

There is already a mechanism for marking certain declarations as deprecated (see #18130 (comment) by @jimmyfrasche).

Restriction? Aliases must target named types.

"Aliases shouldn't not apply to unnamed type. Their is no "code repair" story in moving from one unnamed type to another. Allowing aliases on unnamed types means I can no longer teach Go as simply named and unnamed types." - #18130 (comment) by @davecheney

Until we know that they will be used wrong, it seems premature to discourage usage. There may be good uses with unnamed targets (see above).

As noted in the design doc, we do expect to change the terminology to make the situation clearer.

@rsc rsc added the Proposal label Dec 1, 2016

@rsc rsc added this to the Proposal milestone Dec 1, 2016

variadico Dec 1, 2016

I like how visually uniform this looks.

const OldAPI => NewPackage.API
func  OldAPI => NewPackage.API
var   OldAPI => NewPackage.API
type  OldAPI => NewPackage.API

But since we can almost gradually move most elements, maybe the simplest solution is just to allow an = for types.

const OldAPI = NewPackage.API
func  OldAPI() { NewPackage.API() }
var   OldAPI = NewPackage.API
type  OldAPI = NewPackage.API

zquestz Dec 1, 2016

So first, I just wanted to thank you for that excellent write-up. I think the best solution is to introduce type aliases with an assignment operator. This requires no new keywords/operators, uses a familiar syntax, and should solve the refactoring problem for large code bases.

iand Dec 1, 2016

As Russ's article points out, any alias-like solution needs to gracefully solve #17746 and #17784

travisjeffery Dec 1, 2016

Thank you for the write up of that article.

I find the type-only aliases using the assignment operator to be best:

type OldAPI = NewPackage.API

My reasons:

  • It's simpler.
    The alternative solution => having subtly different meaning based on its operand feels out of place for Go.
  • It's focused and conservative.
    The issue at hand with types is solved and you don't need to worry about imagining the complications of the generalized solution.
  • It's aesthetic.
    I think it looks more pleasing.

All of the above (the result being simple, focused, conservative, and aesthetic) make it easy for me to picture it being a part of Go.

cznic Dec 1, 2016

Contributor

If the solution would be limited to types only then the syntax

type NewFoo = old.Foo

already considered before, as discussed in @rsc's article, looks very good to me.

If we would like to be able to do the same for constants, variables and functions, my preferred syntax would be (as proposed before)

package newfmt

import (
	"fmt"
)

// No renaming.
export fmt.Printf // Note: Same as `export Printf fmt.Printf`.

export (
	fmt.Sprintf
	fmt.Formatter
)

// Renaming.
export Foo fmt.Errorf // Foo must be exported, ie. `export foo fmt.Errorf` would be invalid.

export (
	Bar fmt.Fprintf
	Qux fmt.State
)

As discussed before, the disadvantage is that a new, top-level-only keyword is introduced, which is admittedly awkward, even though technically feasible and fully backwards compatible. I like this syntax because it reflects the pattern of imports. It would seem natural to me that exports would be permitted only in the same section where imports are allowed, i.e. between the package clause and any var, type, constant or function TLD.

The renaming identifiers would be declared in the package scope; however, the new names are not visible in the package declaring them (newfmt in the example above), except with respect to redeclaration, which is disallowed as usual. Given the previous example, these TLDs would both be errors:

var v = Printf // undefined: Printf.
var Printf int // Printf redeclared, previous declaration at newfmt.go:8.

In the importing package the renaming identifiers are visible normally, as any other exported identifier of the (newfmt's) package block.

package foo

import "newfmt"

type bar interface {
	baz(qux newfmt.Qux) // qux type is identical to fmt.State.
}

In conclusion, this approach does not introduce any new local name binding in newfmt, which I believe avoids at least some of the problems discussed in #17746 and solves #17784 completely.

4ad Dec 1, 2016

Member

My first preference is for a type-only type NewFoo = old.Foo.

If a more general solution is desired, I agree with @cznic that a dedicated keyword is better than a new operator (especially an asymmetric operator with confusing directionality[1]). That being said, I don't think the export keyword conveys the right meaning. Neither the syntax nor the semantics mirrors import. What about alias?

I understand why @cznic doesn't want the new names to be accessible in the package declaring them, but, to me at least, that restriction feels unexpected and artificial (although I perfectly well understand the reason behind it).

[1] I have been using Unix for almost 20 years, and I still can't create a symlink on the first try. And I usually fail even on the second try, after I have read the manual.

iand Dec 1, 2016

I would like to propose an additional constraint: type aliases to standard library types may only be declared in the standard library.

My reasoning is that I don't want to work with code that has renamed standard library concepts to fit a custom naming convention. I also don't want to deal with long spaghetti chains of aliases across multiple packages that end up back at the standard library.

quentinmit Dec 1, 2016

Contributor

@iand: That constraint would block the use of this feature to migrate anything into the standard library. Case in point, the current migration of Context into the standard library. The old home of Context should become an alias for the Context in the standard library.

iand Dec 1, 2016

@quentinmit that is unfortunately true. It also limits the use case for golang.org/x/image/draw in this CL https://go-review.googlesource.com/#/c/32145/

My real concern is with people aliasing things like interface{} and error

joegrasse Dec 1, 2016

If it is decided to introduce a new operator, I would like to propose ~. In the English language, it is generally understood to mean "similar to", "approximately", "about", or "around". As @4ad above stated, the => is an asymmetric operator with confusing directionality.

For example:

const OldAPI ~ NewPackage.API
func  OldAPI ~ NewPackage.API
var   OldAPI ~ NewPackage.API
type  OldAPI ~ NewPackage.API

jba Dec 1, 2016

@iand if we limit the right-hand side to a package-qualified identifier, then that would eliminate your specific concern.

It would also mean you couldn't have aliases to any types in the current package, or to long type expressions like map[string]map[int]interface{}. But those uses have nothing to do with the main goal of gradual code repair, so maybe they are no great loss.

rsc Dec 1, 2016

Contributor

@cznic, @iand, others: Please note that restrictions add complexity. They complicate the explanation of the feature, and they add cognitive load for any user of the feature: if you forget about a restriction, you have to puzzle through why something you thought should work doesn't.

It's often a mistake to implement restrictions on a trial of a design solely due to hypothetical misuse. That happened in the alias proposal discussions, and it made the aliases in the trial unable to handle the io.ByteBuffer => bytes.Buffer conversion from the article. Part of the goal of writing the article is to define some cases we know we want to be able to handle, so that we don't inadvertently restrict them away.

As another example, it would be easy to make a misuse argument to disallow non-pointer receivers, or to disallow methods on non-struct types. If we'd done either of those, you couldn't create enums with String() methods for printing themselves, and you couldn't have http.Header both be a plain map and provide helper methods. It's often easy to imagine misuses; compelling positive uses can take longer to appear, and it's important to create space for experimentation.

As yet another example, the original design and implementation for pointer vs value methods did not distinguish between the method sets on T and *T: if you had a *T, you could call the value methods (receiver T), and if you had a T, you could call the pointer methods (receiver *T). This was simple, with no restrictions to explain. But then actual experience showed us that allowing pointer method calls on values led to a specific class of confusing, surprising bugs. For example, you could write:

var buf bytes.Buffer
io.Copy(buf, reader)

and io.Copy would succeed, but buf would have nothing in it. We had to choose between explaining why that program ran incorrectly or explaining why that program didn't compile. Either way there were going to be questions, but we came down on the side of avoiding incorrect execution. Even so, we still had to write a FAQ entry about why the design has a hole cut out of it.
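
For readers who want to see the current rule in action, here is a small sketch of how this plays out in today's Go. The helper names (`copyToBuffer`, `valueIsWriter`, `pointerIsWriter`) are made up for the illustration. Since `Write` has a pointer receiver, only `*bytes.Buffer` is in the method set that satisfies `io.Writer`, so the caller has to pass `&buf`.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"strings"
)

// copyToBuffer copies s into a bytes.Buffer through io.Copy.
func copyToBuffer(s string) (string, error) {
	var buf bytes.Buffer
	// io.Copy(buf, ...) is rejected by the compiler today: Write has a
	// pointer receiver, so only *bytes.Buffer satisfies io.Writer.
	if _, err := io.Copy(&buf, strings.NewReader(s)); err != nil {
		return "", err
	}
	return buf.String(), nil
}

// valueIsWriter reports whether a bytes.Buffer value satisfies io.Writer.
func valueIsWriter() bool {
	var v interface{} = bytes.Buffer{}
	_, ok := v.(io.Writer)
	return ok
}

// pointerIsWriter reports whether a *bytes.Buffer satisfies io.Writer.
func pointerIsWriter() bool {
	var v interface{} = &bytes.Buffer{}
	_, ok := v.(io.Writer)
	return ok
}

func main() {
	out, err := copyToBuffer("hello")
	fmt.Println(out, err)
	fmt.Println("value is io.Writer:", valueIsWriter())
	fmt.Println("pointer is io.Writer:", pointerIsWriter())
}
```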

Again, please remember that restrictions add complexity. Like all complexity, restrictions need significant justification. At this stage in the design process it is good to think about restrictions that might be appropriate for a particular design, but we should probably only implement those restrictions after actual experience with the unrestricted, simpler design helps us understand whether the restriction would bring enough benefits to pay for its cost.

rsc Dec 1, 2016

Contributor

Also, my hope is that we can reach a tentative decision about what to try and then have something ready for experimentation at the beginning of the Go 1.9 cycle (ideally the day the cycle opens). Having more time to experiment will have many benefits, among them an opportunity to learn whether a particular restriction is compelling. One mistake with alias was not committing a complete implementation until near the end of the Go 1.8 cycle.

btracey Dec 1, 2016

Contributor

One thing about the original alias proposal is that in the intended use case (enabling refactoring) the actual use of the alias type should only be temporary. In the protobuffer example, the io.ByteBuffer stub was deleted once the gradual repair had been concluded.

If the alias mechanism should only be seen temporarily, does it actually require a language change? Perhaps instead there could be a mechanism to supply gc with a list of "aliases". gc could temporarily make the substitutions, and the author of the downstream codebase could gradually remove items in this file as fixes are merged. I realize this suggestion also has tricky consequences, but it at least encourages a temporary mechanism.

Merovius Dec 1, 2016

I will not participate in the bikeshedding about syntax (I basically don't care), with one exception: If adding aliases is decided and if it's decided to restrict them to types, please use a syntax that is consistently extensible to at least var, if not also func and const (all proposed syntactical constructs allow for all, except type Foo = pkg.Bar). The reason is that, while I agree that cases where aliases for var make the difference might be rare, I don't think they are non-existent and as such believe that we might well at some point decide to add them too. At that point we definitely will want to have all alias declarations be consistent; it would be bad if it were type Foo = pkg.Bar and var Foo => pkg.Bar.

I'd also slightly argue for having all four. The reasons are

  1. there is a distinction for var and I do sometimes use it. For example I often expose a global var Debug *log.Logger, or reassign global singletons like http.DefaultServeMux to intercept/remove registrations of packages that add handlers to it.

  2. I also think that, while func Foo() { pkg.Bar() } does the same thing as func Foo => pkg.Bar, the intention of the latter is much clearer (especially if you already know about aliases). It clearly states "this isn't really meant to be here". So while technically identical, the alias syntax might serve as documentation.
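
The two points can be sketched in a single file, with the caveat that every name here (`newLimit`, `oldLimit`, `newDouble`, `oldDouble`, `reassign`) is hypothetical and the "packages" are only simulated. An ordinary `var` declaration creates a separate variable holding a copy, so a later reassignment of the original is not visible through the old name. A forwarding function, by contrast, behaves the same as the function it wraps.

```go
package main

import "fmt"

// newLimit stands in for an exported variable in the new package.
var newLimit = 10

// oldLimit is what an ordinary declaration gives today: a separate
// variable initialized with a copy of newLimit's value at startup.
var oldLimit = newLimit

// newDouble stands in for a function in the new package.
func newDouble(n int) int { return 2 * n }

// oldDouble forwards to newDouble, the way func Foo() { pkg.Bar() } would.
func oldDouble(n int) int { return newDouble(n) }

// reassign changes newLimit and returns the values seen through both names.
func reassign(v int) (old, cur int) {
	newLimit = v
	return oldLimit, newLimit
}

func main() {
	old, cur := reassign(99)
	fmt.Println(old, cur)      // the old name still holds the startup copy
	fmt.Println(oldDouble(21)) // forwarding gives the same result as newDouble
}
```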

It's not the hill I'd die on, though; type-aliases alone for now would be fine with me, as long as there is the option to extend them later.

I'm also super glad that this was written up like it was. It summarizes a bunch of opinions I had about API design and stability for a while and will, in the future, serve as a simple reference to link people to :)

However, I also want to emphasize that there were additional use cases covered by aliases that are different from the doc (and AIUI the more general intention of this issue, which is to find some solution to solve gradual repair). I am very glad if the community can agree on the concept of enabling gradual repair, but if a solution other than aliases is chosen to achieve it, I think there should simultaneously be discussion about if and how to support things like the protobuf public imports or the x/image/draw use case of drop-in replacement packages (both somewhat near to my heart too) with a different solution. @btracey's proposal of a go-tool/gc flag for aliases is an example where I believe that, while it covers gradual repair relatively well, it is not really acceptable for those other use cases. You can't really expect everyone who wants to compile something that uses x/image/draw to pass those flags; they should just be able to go get.

bcmills Dec 1, 2016

Member

@jba

@iand if we limit the right-hand side to a package-qualified identifier, then that would eliminate your specific concern.

It would also mean you couldn't have aliases to any types in the current package, […]. But those uses have nothing to do with the main goal of gradual code repair, so maybe they are no great loss.

Renaming within a package (e.g. to a more idiomatic or consistent name) is certainly a type of refactoring one might reasonably want to do, and if the package is used widely then that necessitates gradual repair.

I think a restriction to only package-qualified names would be a mistake. (A restriction to only exported names might be more tolerable.)

bcmills Dec 1, 2016

Member

@btracey

Perhaps instead there could be a mechanism to supply gc with a list of "aliases". gc could temporarily make the substitutions, and the author of the downstream codebase could gradually remove items in this file as fixes are merged.

A mechanism for gc would either mean that the code is only buildable with gc during the repair process, or that the mechanism would have to be supported by the other compilers (e.g. gccgo and llgo) too. A "non-language-change" mechanism which must be supported by all implementations is a de facto language change — it's just one with poorer documentation.

rsc Dec 1, 2016

Contributor

@btracey and @bcmills, and not just the compilers: any tool that analyzes source code, like guru or anything else people have built. It's certainly a language change no matter how you slice it.

btracey Dec 1, 2016

Contributor

Okay, thanks.

jimmyfrasche Dec 1, 2016

Contributor

Another possibility is aliases for everything except consts (and @rsc please forgive me for proposing a restriction!)

For consts, => is really just a longer way to write =. There are no new semantics, as with types and vars. There are no saved keystrokes, as with funcs.

That would resolve #17784 at least.

The counterargument would be that tooling could treat the cases differently and that it could be an indicator of intent. That's a good counterargument, but I don't think it outweighs the fact that it's basically two ways to do exactly the same thing.

That said, I'm fine with just type aliases for now; they are certainly the most important. I definitely agree with @Merovius that we should strongly consider retaining the option for adding var and func aliases in the future, even if those don't happen for some time.

xiegeo Dec 1, 2016

How about hiding aliases behind an import, just like for "C" and "unsafe", to further discourage their use? In the same vein, I would like the alias syntax to be verbose and stand out as scaffolding for ongoing refactoring.

josharian Dec 1, 2016

Contributor

As an attempt to open up the design space a little, here are some ideas. They're not fleshed out. They're probably bad and/or impossible; the hope is mainly to trigger new/better ideas in others. And if there's any interest, we can explore further.

The motivating idea for (1) and (2) is to somehow use conversion instead of aliases. In #17746, aliases ran into issues around having multiple names for the same type (or multiple ways to spell the same name, depending on whether you think of aliases as like #define or as like hard links). Using conversion sidesteps that by keeping the types distinct.

  1. Add more automatic conversion.

When you call fmt.Println("abc") or write var e interface{} = "abc", "abc" is automatically converted to an interface{}. We could change the language so that when you have declared type T struct { S }, and T has no non-promoted methods, the compiler will automatically convert between S and T as necessary, including recursively inside other structs. T could then serve as a de-facto alias of S (or vice versa) for gradual refactoring purposes.

  2. Add a new "looks like" kind of type.

Let type T ~S declare a new type T that is a type that "looks like S". More precisely, T is "any type convertible to and from type S". (As always, syntax could be discussed later.) Like interface types, T cannot have methods; to do basically anything at all with T, you need to convert it to S (or a type convertible to/from S). Unlike interface types, there is no "concrete type": conversion from S to T and from T to S involves no representation changes. For gradual refactoring, these "looks like" types would allow authors to write APIs accepting both old and new types. ("Looks like" types are basically a highly restricted, simplified union type.)

  3. Type tags

Bonus super-hideous idea. (Please don't bother telling me this is awful--I know it. I'm only trying to spur new ideas in others.) What if we introduced type tags (like struct tags), and used special type tags to set up and control aliases, like say type T S "alias:\"T\"". Type tags will have other uses as well and it provides scope for more specification of aliases by the package author than merely "this type is an alias"; for example, the author of the code could specify embedding behavior.

nigeltao Dec 1, 2016

Contributor

If we do try aliases again, it might be worth thinking about "what does godoc do", similar to the "what does iota do" and "what does embedding do" issues.

Specifically, if we have

type  OldAPI => NewPackage.API

and NewPackage.API has a doc comment, are we expected to copy/paste that comment next to "type OldAPI", are we expected to leave it un-commented (with godoc automatically providing a link or automatically copy/pasting), or will there be some other convention?

nigeltao Dec 1, 2016

Contributor

Somewhat tangential, while the primary motivation is and should be supporting gradual code repair, a minor use case (going back to the alias proposal, since that is a concrete proposal) could be to avoid a double function-call overhead when presenting a single function backed by multiple, build-tag-dependent implementations. I'm only hand-waving right now, but I feel like aliases could have been useful in the recent https://groups.google.com/d/topic/golang-nuts/wb5I2tjrwoc/discussion "Avoiding function call overhead in packages with go+asm implementations" discussion.

jimmyfrasche Dec 1, 2016

Contributor

@nigeltao re godoc, I think:

It should always link to the original, regardless.

If there are docs on the alias, those should be displayed, regardless.

If there are no docs on the alias, it's tempting to have godoc display the original docs, but the name of the type would be wrong if the alias also changed the name, the docs could refer to items not in the current package, and, if it's being used for gradual refactoring, there could be a message that says "Deprecated: use X" when you're looking at X.

However, maybe that wouldn't matter for the majority of use cases. Those are things that could go wrong, not things that will go wrong. And some of them could be detected by linting, like renamed aliases and accidentally copying deprecation warnings.

tux21b Dec 1, 2016

Contributor

I am not sure if the following idea has been posted before, but what about a mostly tool-based "gofix" / "gorename" like approach? To elaborate:

  • any package can contain a set of rewriting rules (e.g. mapping pkg.Ident => otherpkg.Ident)
  • those rewriting rules can be specified with //+rewrite ... tags inside arbitrary go files
  • those rewriting rules are not limited to ABI-compatible changes; it's also possible to do other things (e.g. pkg.MyFunc(a) => pkg.MyFunc(context.TODO(), a))
  • a gofix like tool can be used to apply all transformations to the current repository. This makes it easy for users of a package to update their code.
  • it's not necessary to call the gofix tool in order to compile successfully. A library that still wants to use the old API of a dependency X (to stay compatible with old and new versions of X) can still do so. The go build command should apply the transformations (specified in the rewrite tags of package X) on-the-fly without changing the files on disk.

The last step might complicate / slow down the compiler a bit, but it's basically just a pre-processor and the number of rewrite rules should be kept small anyway. So, enough brainstorming for today :)

uluyol Dec 1, 2016

Contributor

Using aliases to avoid function call overhead seems like a hack to work around the compiler's inability to inline non-leaf functions. I don't think implementation deficiencies should influence the language spec.

Merovius Dec 1, 2016

@josharian While you didn't intend them as full proposals, let me respond (if only so that whoever is inspired by you can take the immediate criticism into account):

  1. Doesn't really solve the problem, because conversions aren't really the issue. x/net/context.Context is assignable/convertible/whateverable to context.Context. The problem is higher-order types; namely, the types func (ctx x/net/context.Context) and func (ctx context.Context) are not the same, even though the arguments are assignable. So, for 1 to solve the problem, type T struct { S } would need to mean that T and S are identical types. Which means that you are simply using a different syntax for aliases after all (just that this syntax already has a different meaning).

  2. Again has a problem with higher-order types, because assignable/convertible types do not necessarily have the same memory representation (and if they do, the interpretation might change significantly). For example, a uint8 is convertible to a uint64 and vice versa. But that would mean that, e.g. with type T ~uint8, the compiler can't know how to call a func(T); does it need to push 1, 2, 4, or 8 bytes on the stack? There might be ways around this issue, but it sounds pretty complicated to me (and harder to understand than aliases).
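
The first point can be sketched in a few lines of Go. This is only an illustration: OldContext and NewContext are hypothetical stand-ins for the two Context types, defined locally so the example is self-contained, and the helper names are made up for the sketch.

```go
package main

import (
	"fmt"
	"reflect"
)

// OldContext and NewContext are hypothetical stand-ins for two separately
// defined interface types that share the same method set.
type OldContext interface{ Err() error }
type NewContext interface{ Err() error }

// handleOld and handleNew differ only in their parameter type.
func handleOld(ctx OldContext) {}
func handleNew(ctx NewContext) {}

// valuesAssignable reports whether a NewContext value may be assigned to an
// OldContext variable.
func valuesAssignable() bool {
	oldT := reflect.TypeOf((*OldContext)(nil)).Elem()
	newT := reflect.TypeOf((*NewContext)(nil)).Elem()
	return newT.AssignableTo(oldT)
}

// sameFuncType reports whether the two handler types are identical.
func sameFuncType() bool {
	return reflect.TypeOf(handleOld) == reflect.TypeOf(handleNew)
}

func main() {
	fmt.Println("values assignable:", valuesAssignable())
	fmt.Println("func types identical:", sameFuncType())
	// var f func(OldContext) = handleNew // would be rejected by the compiler
}
```

The values are interchangeable, but the function types built from them are not, which is why assignability alone does not help with callbacks, method signatures, or interface satisfaction.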

josharian Dec 1, 2016

Contributor

Thanks, @Merovius.

  1. Yes, I missed interface satisfaction here. You're right, this doesn't do the job.

  2. I had in mind "have the same memory representation". Convertible back-and-forth is clearly not the right elucidation of that--thanks.

nigeltao Dec 2, 2016

Contributor

@uluyol yes, it's largely about the compiler's inability to inline non-leaf functions, but explicit aliasing might be less surprising with respect to whether or not inlined calls to non-leafs should show up in stack traces, runtime.Callers, etc.

In any case, as I said, it's a minor tangent.

Merovius Dec 2, 2016

@josharian Similar problem: [2]uintptr and interface{} have the same memory representation, so relying only on memory representation would allow circumventing type safety. uint64 and float64 both have the same memory representation and are convertible back and forth, but would still lead to really weird results, at the least, if you don't know which is which.

You might get away with "same underlying type", though. Not sure what the implications would be for that. Off the top of my head, that might lead to wrongness if a type is used in fields, for example. If you have type S1 struct { T1 } and type S2 struct { T2 } (with T1 and T2 having the same underlying type), then under type L1 ~T1 both might work as type S struct { L1 }, but as S1 and S2 still have different (though similar-looking) underlying types, with type L2 ~S1 you won't have S2 looking like S1, and it won't be usable as an L2.

So you'd have to, in a bunch of places in the spec, replace or amend "identical types" with "same underlying type" to make this work, which seems unwieldy and will probably have unforeseen consequences for type safety. "look-alike" types also seem to have an even greater abuse and confusion potential than aliases, IMHO, which seem to be the main arguments against aliases.

If anyone can come up with a simple rule for it, though, that doesn't have these problems, it should definitely be considered as an alternative :)
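
As a rough illustration of the uint64/float64 point, the sketch below contrasts an ordinary conversion with a reinterpretation of the same 64 bits. The function names convert and reinterpret are made up for this example; math.Float64frombits is the standard library call that reads a bit pattern as a float64.

```go
package main

import (
	"fmt"
	"math"
)

// convert performs an ordinary Go conversion, which preserves the numeric value.
func convert(u uint64) float64 { return float64(u) }

// reinterpret keeps the 64 bits unchanged and only changes how they are read,
// which is what treating "same memory representation" as "same type" would do.
func reinterpret(u uint64) float64 { return math.Float64frombits(u) }

func main() {
	const u uint64 = 1
	fmt.Println(convert(u))     // the value 1
	fmt.Println(reinterpret(u)) // a tiny positive subnormal value, not 1
}
```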

iand Dec 2, 2016

Following on from @josharian's ideation, here's a variation of his number 2:

Allow the specification of "substitutable types". This is a list of types that may be substituted for the named type in function arguments, return values etc. The compiler would allow calling of a function with an argument of the named type or any of its substitutes. The substitute types must have a compatible definition with the named type. Compatible here means identical memory representations and identical declarations after allowing for other substitute types in the declaration.

One immediate problem is that the directionality of this relationship is opposite to the alias proposal which inverts the dependency graph. This alone might make it unworkable but I propose it here because others might think of a way around this. One way might be to declare substitutes as //go comments rather than via the import graph. In this way they perhaps become more like macros.

Conversely there are some advantages to this reversal of directionality:

  • the set of substitutable types is controlled by the author of the new package who is better placed to guarantee semantics
  • no code changes are required in the original package so clients don't have to update until they start using the new package

Applying this to the Context refactoring: the standard library context package would declare that context.Context may be substituted by golang.org/x/net/context.Context. This means any usage that accepts context.Context may also accept a golang.org/x/net/context.Context in its place. However functions in the context package that return a Context would always return a context.Context.

This proposal circumvents the embedding issue (#17746) because the name of the embedded type never changes. However, an embedded type could be initialised using a value of a substitute type.

4ad Dec 2, 2016

Member

@iand @josharian you are asking for a certain variant of covariant types.

rsc Dec 2, 2016

Contributor

@josharian, thanks for the suggestions.

Re type T struct { S }, that looks like a different syntax for alias, and not necessarily a clearer one.

Re type T ~S, I am either not sure how it differs from alias or not sure how it helps refactoring. I guess in a refactoring (say, io.ByteBuffer -> bytes.Buffer), you would write:

package io
type ByteBuffer ~bytes.Buffer

but then if, as you say, "to do basically anything at all with T, you need to convert it to S", then all the code doing anything with io.ByteBuffer still breaks.

Re type T S "alias": A key point @bcmills made above is that having multiple equivalent names for types is a language change, no matter how it is spelled. All compilers need to know that, say, io.ByteBuffer and bytes.Buffer are the same, as do any tools that analyze or even type-check code. The key part of your suggestion seems to me something like "maybe we should plan ahead for other additions". Maybe, but it's unclear that a string would be the best way to describe those, and it's also unclear we want to design syntax (like Java generalized annotations) without a clear need. Even if we did have a general form, we'd still need to consider carefully all the implications of any new semantics we introduced, and most would still be language changes that would require updating all tools (except gofmt, admittedly). On balance it seems simpler to continue to find the clearest way to write the forms we need one by one instead of creating a meta-language of one kind or another.

@Merovius FWIW, I would say that [2]uintptr and interface{} do not have the same memory representation. An interface{} is a [2]unsafe.Pointer not a [2]uintptr. A uintptr and a pointer are different representations. But I think your general point is right, that we do not want to necessarily allow direct conversion of that kind of thing. I mean, can you convert from interface{} to [2]*byte too? It's a lot more than is needed here.

rsc Dec 2, 2016

Contributor

@jimmyfrasche and @nigeltao, re godoc: I agree that we need that working early too. I agree that we should not hard-code the assumption "the new feature - whatever it is - will only be used for codebase refactoring". It may have other important uses, like Nigel found for helping to write a draw extension package with aliases. I expect that deprecated things will be marked deprecated in their doc comments explicitly, as Jimmy said. I did think about generating a doc comment automatically if one is not there, but there's nothing obvious to say that shouldn't already be clear from the syntax (speaking generally). To make a specific example, consider the old Go 1.8 aliases. Given

type ByteBuffer => bytes.Buffer

we could synthesize a doc comment saying "ByteBuffer is an alias for bytes.Buffer", but that seems redundant with displaying the definition. If someone writes "type X struct{}" today, we don't synthesize "X is a named type for a struct{}".

rsc Dec 2, 2016

Contributor

@iand, thanks. It sounds like your proposal requires the author of the new package to write the exact definition from the old package and then also a declaration linking the two, like (making up syntax):

package old
type T { x int }

package new
import "old"
type T1 { x int }
substitutable T1 <- old.T

I agree that the import reversal is problematic and may be a show-stopper by itself, but let's skip that. At this point the codebase seems like it is in a fragile state: now package new can be broken by a change to add a struct field in package old. Given the substitutable line, there is only one possible definition for T1: exactly the same as old.T. If the two types still have distinct definitions then you also have to worry about the methods: do the method implementations need to match too? If not, what happens when you put a T in an interface{} and then pull it out using a type assertion as a T1 and call M()? Do you get T1.M? What if you pull it out as an interface { M() }, without naming T1 directly, and call M()? Do you get T.M? There's a lot of complexity caused by the ambiguity of having both definitions in the source tree.

Of course, you could say that the substitutable line makes the rest redundant and not require a definition for type T1 or any methods. But then that's basically the same as writing (in the old alias syntax) type T1 => old.T.

Getting back to the import graph issue, although the examples in the article all made the old code defined in terms of the new code, if the package graph were such that new had to import old instead, it's equally effective to put the redirect in the new package during the transition.

I think this shows that in any transition like this, there's probably not a useful distinction between the author of the new package and the author of the old package. By the end, the goal is that code has been added to new and deleted from old, so both authors (if they're different) need to be involved then. And the two need some kind of coordinated compatibility during the middle too, whether explicit (some kind of redirect) or implicit (type definitions must match exactly, as in the substitutability requirement).
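
For contrast with the two-definition case, here is a minimal sketch of the redirect approach, written with the type X = Y spelling that appears later in this thread rather than the old => syntax. Both names are placed in one package only so the example runs on its own; T stands in for old.T and T1 for the new name, and the helper functions are hypothetical.

```go
package main

import "fmt"

// T stands in for old.T. It has exactly one definition and one method M.
type T struct{ x int }

// M is the only implementation of the method.
func (T) M() string { return "T.M" }

// T1 stands in for the new name. As an alias it is the same type as T, so
// there is no second definition and no second method set to keep in sync.
type T1 = T

// viaAssertion stores a T in an interface{} and pulls it out as a T1.
func viaAssertion() string {
	var v interface{} = T{x: 1}
	return v.(T1).M()
}

// viaInterface pulls the value out without naming either type.
func viaInterface() string {
	var v interface{} = T{x: 1}
	return v.(interface{ M() string }).M()
}

func main() {
	fmt.Println(viaAssertion())
	fmt.Println(viaInterface())
}
```

With a single definition, both paths necessarily reach the same method, so the questions about T1.M versus T.M do not come up.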

nathany Feb 2, 2017

Contributor

Perhaps type intSet = map[int]struct{} (not exported) would be a better way to use unnamed type aliases, but this sounds like the domain of CodeReviewComments and recommended programming practices, rather than limiting the feature.

nathany Feb 2, 2017

Contributor

That said, %T is a handy tool to see types when debugging or exploring the type system. I wonder if there should be a similar format verb that includes the alias? q.B = []int in @davecheney's example.

bcmills Feb 2, 2017

Member

@nathany How would you implement that verb? The alias information is not present at runtime. (As far as the reflect package is concerned, the alias is the same type as the thing it is aliased to.)
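
A small sketch of what that means in practice, assuming the type X = Y alias syntax discussed in this thread. The names intSet, namedSet, and sameRuntimeType are chosen for the example only.

```go
package main

import (
	"fmt"
	"reflect"
)

// intSet is an alias: a second name for map[int]struct{}.
type intSet = map[int]struct{}

// namedSet is a defined type with the same underlying type.
type namedSet map[int]struct{}

// sameRuntimeType reports whether reflect sees a and b as one and the same type.
func sameRuntimeType(a, b interface{}) bool {
	return reflect.TypeOf(a) == reflect.TypeOf(b)
}

func main() {
	fmt.Printf("%T\n", intSet{})   // shows the map type; the name intSet does not appear
	fmt.Printf("%T\n", namedSet{}) // shows the defined type's own name
	fmt.Println(sameRuntimeType(intSet{}, map[int]struct{}{}))
	fmt.Println(sameRuntimeType(namedSet{}, map[int]struct{}{}))
}
```

Because the alias and its target are one type, there is nothing left at runtime for a format verb to recover.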

nathany Feb 2, 2017

Contributor

@bcmills I thought that might be the case... 😞

I imagine static analysis tools and editor plugins are still in the picture to help work with aliases, so that's okay.

minux Feb 3, 2017

Member

rsc Feb 3, 2017

Contributor

@davecheney I added your suggestion to the "Restrictions" section of the discussion summary at the top. Like all restrictions, our general position is that restrictions add complexity (see notes above) and we would likely need to see actual evidence of widespread harm in order to introduce a restriction. Having to change the way you teach Go is not sufficient: any change we make to the language will require changing the way you teach Go.

As noted in the design doc and on the mailing list, we are working on better terminology to make explanations easier.

rsc Feb 3, 2017

Contributor

@minux, like @bcmills pointed out, alias information does not exist at runtime (completely fundamental to the design). There is no way to implement a "%T that includes the alias".

minux Feb 3, 2017

Member

rsc Feb 3, 2017

Contributor

@minux, oh I see, you are replying via email and GitHub makes the quoted text look like text you wrote yourself. I was replying to the text you quoted from Nathan Youngman, thinking it was yours. Sorry for the confusion.

nathany Feb 3, 2017

Contributor

Regarding terminology and teaching, I found the branded types background @griesemer posted quite informative. Thanks for that.

When explaining types and type conversions, baby gophers initially think I'm talking about a type alias, likely due to familiarity with other languages.

Whatever the final terminology, I could imagine introducing type aliases before named (branded) types, especially since declaring new named types is likely to come after introducing byte and rune in any book or curriculum. However, I do want to be mindful of @davecheney's concern to not encourage anti-patterns.

For type intSet map[int]struct{} we say map[int]struct{} is the underlying type. What do we call either side of type intSet = map[int]struct{}? Alias and aliased type?

As for %T, I already need to explain that a byte and rune result in a uint8 and int32, so this is no different.

If anything, I think type aliases will make byte and rune easier to explain. IMO, the challenge will be knowing when to use named types vs. type aliases, and then being able to communicate that.
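
The byte and rune behaviour mentioned above can be seen directly. This is a minimal sketch; typeName is a helper invented for the example.

```go
package main

import "fmt"

// typeName returns what the %T verb reports for v.
func typeName(v interface{}) string {
	return fmt.Sprintf("%T", v)
}

func main() {
	var b byte = 'a'
	var r rune = 'a'
	fmt.Println(typeName(b)) // uint8
	fmt.Println(typeName(r)) // int32
}
```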

griesemer Feb 3, 2017

Contributor

@nathany I think it makes a lot of sense to introduce "alias types" first - though I wouldn't use the term necessarily. The newly introduced "alias" declarations are simply regular declarations that don't do anything special. The identifier on the left and the type on the right are one and the same, they denote identical types. I'm not even sure we need the terms alias or aliased type (we don't call a constant name an alias, and the constant value the aliased constant).

The traditional (non-alias) type declaration does more work: It first creates a new type from the type on the right before binding the identifier on the left to it. Thus the identifier and the type on the right are not the same (they only share the same underlying type). This is clearly the more complicated concept.

We do need a new term for these newly created types because any type can now have a name. And we need to be able to refer to them since there are spec rules referring to them (type identity, assignability, receiver base types).

Here's another way to describe this, which may be useful in a teaching environment: A type may either be colored or uncolored. All predeclared types, and all type literals are uncolored. The only way to create a new colored type is via a traditional (non-alias) type declaration which first paints (a copy of) the type on the right with a brand-new, never-before used color (stripping the old color, if any, entirely in the process) before binding the identifier on the left to it. Again, the identifier and the (implicitly and invisibly created) colored type are identical, but they are different from the (differently colored or uncolored) type written down on the right.

Using this analogy, we can reformulate various other existing rules as well:

  • A colored type is always different from any other type (because each type declaration uses a brand-new, never-before used color).
  • Methods may only be associated with receiver base types that are colored.
  • The underlying type of a type is that type stripped of all its color.
    etc.
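As a rough illustration of the rules above, here is a minimal Go sketch. The type names Celsius, Temp, and Kelvin and the helper sameType are invented for this example and do not come from the spec or the proposal. It assumes a toolchain with type alias support (Go 1.9 or later).

```go
package main

import (
	"fmt"
	"reflect"
)

// Celsius is a type definition: it creates a new, distinct ("colored") type
// whose underlying type is float64.
type Celsius float64

// Temp is an alias declaration: it is just another name for Celsius.
type Temp = Celsius

// Kelvin is defined from Celsius, so it gets a fresh color of its own and
// shares only the underlying type float64.
type Kelvin Celsius

// String is attached to the defined type Celsius. Kelvin does not inherit it.
func (c Celsius) String() string { return fmt.Sprintf("%.1f°C", float64(c)) }

// sameType reports whether two values have identical dynamic types.
func sameType(a, b interface{}) bool {
	return reflect.TypeOf(a) == reflect.TypeOf(b)
}

func main() {
	var c Celsius = 21.5

	// No conversion is needed: Temp and Celsius denote the same type.
	var t Temp = c

	// A conversion is required: Kelvin is a different type.
	k := Kelvin(c)

	fmt.Println("Celsius vs Temp identical:", sameType(c, t))
	fmt.Println("Celsius vs Kelvin identical:", sameType(c, k))
	fmt.Println("kind of Kelvin:", reflect.TypeOf(k).Kind())
	fmt.Println("Temp uses Celsius's method:", t)
}
```

The alias adds a name but no new type, so assignment works without conversion and the method set is shared. The definition produces a new type that keeps only the underlying float64.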

nathany Feb 3, 2017

Contributor

we don't call a constant name an alias, and the constant value the aliased constant

good point 👍

I'm not sure if the coloured vs. uncoloured analogy is easier to understand, but it does demonstrate that there is more than one way to explain the concepts.

Traditional named/branded/coloured types certainly require more explanation. Especially when a named type can be declared using an existing named type. There are fairly subtle differences to keep in mind.

type intSet map[int]struct{} // a new type with an underlying type map[int]struct{}

type myIntSet intSet // a new type with an underlying type map[int]struct{}

type otherIntSet = intSet // just another name (alias) for intSet, add methods to intSet (only in the same package)

type literalIntSet = map[int]struct{} // just another name for map[int]struct{}, no adding methods

It's not insurmountable though. Assuming this lands in Go 1.9, I suspect we'll be seeing 2nd editions of several Go books. 😉

I regularly refer to Go spec for the accepted terminology, so I'm very curious what terms are chosen in the end.
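The four declarations above can be exercised in a small program. This is only a sketch: the Has method, the hasser interface, and the implementsHasser helper are hypothetical additions used to probe which types end up with methods.

```go
package main

import "fmt"

type intSet map[int]struct{}          // type definition
type myIntSet intSet                  // another definition; intSet's methods are not carried over
type otherIntSet = intSet             // alias for intSet
type literalIntSet = map[int]struct{} // alias for an unnamed map type

// Has is declared through the alias, so it becomes a method of intSet itself.
func (s otherIntSet) Has(n int) bool {
	_, ok := s[n]
	return ok
}

// hasser is a hypothetical interface used only to probe method sets.
type hasser interface{ Has(int) bool }

// implementsHasser reports whether the dynamic type of v has a Has method.
func implementsHasser(v interface{}) bool {
	_, ok := v.(hasser)
	return ok
}

func main() {
	s := intSet{1: {}}

	fmt.Println("intSet has Has:", implementsHasser(s))
	fmt.Println("myIntSet has Has:", implementsHasser(myIntSet(s)))
	fmt.Println("literalIntSet has Has:", implementsHasser(literalIntSet(s)))

	// Assignable without conversion, because literalIntSet is just
	// map[int]struct{}, which is not a defined type.
	var l literalIntSet = s
	fmt.Println("len:", len(l))
}
```

Declaring a method with literalIntSet as the receiver would not compile, since the aliased type is not a defined type.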

bcmills Feb 3, 2017

Member

We do need a new term for these newly created types because any type can now have a name.

Some ideas:

  • "distinguished" or "distinct" (as in, can be distinguished from other types)
  • "unique" (as in, it is a type different from all other types)
  • "concrete" (as in, it is an entity that exists in the runtime)
  • "identifiable" (as in, the type has an identity)

griesemer Feb 3, 2017

Contributor

@bcmills We've been thinking about distinguished, unique, distinct, branded, colored, defined, non-alias, etc. types. "Concrete" is misleading because an interface can be colored as well, and an interface is the incarnation of an abstract type. "Identifiable" also seems misleading because a "struct{int}" is just as identifiable as any explicitly (non-alias) named type.

bcmills Feb 3, 2017

Member

I would recommend against:

  • "colored" (in non-programming contexts the phrase "colored types" carries strong racial-bias connotations)
  • "non-alias" (it's confusing, since the target of the alias may or may not be what was formerly called a "named type")
  • "defined" (aliases are defined too, they're just defined to be aliases)

"branded" could work: it carries a "types as cattle" connotation but that doesn't strike me as intrinsically bad.

jimmyfrasche Feb 3, 2017

Contributor

Unique and distinct seem like the standout options so far.

They're simple and understandable without a lot of additional context or knowledge. If I didn't know the distinction, I think I'd at least have a general sense of what they imply. I can't say that about the other choices.

Once you learn the term it doesn't matter, but a connotative name avoids unnecessary barriers to internalizing the distinction.

rsc Feb 3, 2017

Contributor

This is the definition of a bikeshed argument. Robert has a pending CL at https://go-review.googlesource.com/#/c/36213/ that seems perfectly fine.

gopherbot Feb 3, 2017

CL https://golang.org/cl/36213 mentions this issue.

gopherbot pushed a commit that referenced this issue Feb 6, 2017

spec: introduce alias declarations and type definitions
To avoid confusion caused by the term "named type" (which now just
means a type with a name, but formerly meant a type declared with
a non-alias type declaration), a type declaration now comes in two
forms: alias declarations and type definitions. Both declare a type
name, but type definitions also define new types.

Replace the use of "named type" with "defined type" elsewhere in
the spec.

For #18130.

Change-Id: I49f5ddacefce90354eb65ee5fbf10ba737221995
Reviewed-on: https://go-review.googlesource.com/36213
Reviewed-by: Rob Pike <r@golang.org>

LionNatsu Feb 7, 2017

Contributor

I want to bring up the issue of go fix again.

To be clear, I am not suggesting that we 'take down' aliases. They may be useful and suitable for other jobs, but that is another story.

I think it is very important that the title is about moving a type. I have no wish to confuse the issue. Our aim is to deal with one kind of interface change in a project. When we change an interface, we do not want users to treat the old and new interfaces as the same forever, and that is why we say 'gradual code repair'. We hope that users will remove or change their usage of the old one.

I still consider a tool the best way to repair the code, something like the idea that @tux21b suggested. For example:

$ cat "$GOROOT"/RENAME
# This file could be used for `go fix`
[package]
x/net/context=context
[type]
io.ByteBuffer=bytes.Buffer

$ go fix -rename "$GOROOT"/RENAME [packages]
# -- or --
# use a standard libraries rename table as default
$ go fix -rename [packages]
# -- or --
# include this fix as default
$ go fix [packages]

The only reason @rsc says no here is that the changes will affect other tools. But I don't think that is true in this workflow: if an out-of-date package (e.g. a dependency) uses the deprecated name or path of a package, e.g. x/net/context, we can fix the code first, just as the docs describe how to migrate code to a new version, but driven by a configurable table in a text format rather than hard-coded. Then you may use any tools you like, the same as with the new version of Go. There is a side effect: it will modify code.
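To make the idea concrete, here is a minimal sketch of the [package] half of such a rename table, using only the standard go/parser and go/format packages. It is not an implementation of any existing go fix flag. The functions rewriteImports and importPaths are hypothetical, and the table key is written as the full import path golang.org/x/net/context, which is an assumption about how the table would be spelled. Renaming types (the [type] section) would need type information and is not attempted here.

```go
package main

import (
	"bytes"
	"fmt"
	"go/format"
	"go/parser"
	"go/token"
	"strconv"
)

// importPaths returns the unquoted import paths declared in src.
func importPaths(src string) ([]string, error) {
	f, err := parser.ParseFile(token.NewFileSet(), "input.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, spec := range f.Imports {
		p, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, p)
	}
	return paths, nil
}

// rewriteImports parses src, replaces every import path that appears as a key
// in table with the corresponding value, and returns the reformatted source.
func rewriteImports(src string, table map[string]string) (string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "input.go", src, parser.ParseComments)
	if err != nil {
		return "", err
	}
	for _, spec := range f.Imports {
		old, err := strconv.Unquote(spec.Path.Value)
		if err != nil {
			return "", err
		}
		if repl, ok := table[old]; ok {
			spec.Path.Value = strconv.Quote(repl)
		}
	}
	var buf bytes.Buffer
	if err := format.Node(&buf, fset, f); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Hypothetical rename table, mirroring the [package] section above.
	table := map[string]string{"golang.org/x/net/context": "context"}
	src := `package p

import "golang.org/x/net/context"

var _ context.Context
`
	out, err := rewriteImports(src, table)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```

This only works cleanly when the old and new packages share a package name, as context does. Otherwise every qualified identifier in the file would need rewriting too.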

rsc Feb 7, 2017

Contributor

@LionNatsu, I think you are right, but I think that's a separate issue: should we adopt conventions for packages to explain to potential clients how to update their code in response to API changes in a mechanical way? Perhaps, but we'd have to figure out what those conventions are. Can you open a separate issue for this topic, pointing back at this conversation? Thanks.

gopherbot Feb 9, 2017

CL https://golang.org/cl/36691 mentions this issue.

gopherbot pushed a commit that referenced this issue Feb 10, 2017

spec: refer to "not defined type" rather than "unnamed type" in conversions

We missed this in https://golang.org/cl/36213.
Thanks to Chris Hines for pointing it out.

For #18130.

Change-Id: I6279ab19966c4391c4b4458b21fd2527d3f949dd
Reviewed-on: https://go-review.googlesource.com/36691
Reviewed-by: Ian Lance Taylor <iant@golang.org>

crawshaw Feb 26, 2017

Contributor

With this proposal at tip, I can now create this package:

package safe

import "unsafe"

type Pointer = unsafe.Pointer

which allows programs to create unsafe.Pointer values without importing unsafe directly:

package main

import "safe"

func main() {
	x := []int{4, 9}
	y := *(*int)(safe.Pointer(uintptr(safe.Pointer(&x[0])) + 8))
	println(y)
}

The original alias declarations design doc explicitly calls this out as supported. It is not explicit in this newer type alias proposal, but it works.

On the alias declaration issue the rationale for this is: "The reason we allow aliasing for unsafe.Pointer is that it's already possible to define a type that has unsafe.Pointer as underlying type." #16339 (comment)

While that's true, I think allowing an alias of unsafe.Pointer introduces something new: programs can now create unsafe.Pointer values without explicitly importing unsafe.

To write the program above before this proposal, I would have to move the safe.Pointer cast into a package that imports unsafe. This may make it a bit harder to audit programs for their use of unsafe.

bcmills Feb 26, 2017

Member

@crawshaw, couldn't you have just done this before?

package safe

import (
  "reflect"
  "unsafe"
)

func Pointer(p interface{}) unsafe.Pointer {
  switch v := reflect.ValueOf(p); v.Kind() {
  case reflect.Uintptr:
    return unsafe.Pointer(uintptr(v.Uint()))
  default:
    return unsafe.Pointer(v.Pointer())
  }
}

I believe that would allow exactly the same program to compile, with the same lack of import in package main.

(It wouldn't necessarily be a valid program: the uintptr-to-Pointer conversion includes a function call, so it doesn't meet the unsafe package constraint that "both conversions must appear in the same expression, with only the intervening arithmetic between them". However, I suspect it would be possible to construct an equivalent, valid program without importing unsafe from main by making use of things like reflect.SliceHeader.)

dr2chase Feb 27, 2017

Contributor

Seems like exporting a hidden unsafe type is just another rule to add to the audit.
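One way such an audit rule might look, sketched with go/types: report every exported package-level type whose underlying type is unsafe.Pointer, whether it is an alias or a type definition. The names hiddenUnsafe and unsafeOnly are hypothetical, and the importer deliberately resolves only package unsafe, so this handles a single file with no other imports.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// unsafeOnly is a minimal importer that resolves only package unsafe, which
// is all this sketch needs.
type unsafeOnly struct{}

func (unsafeOnly) Import(path string) (*types.Package, error) {
	if path == "unsafe" {
		return types.Unsafe, nil
	}
	return nil, fmt.Errorf("import %q is not supported in this sketch", path)
}

// hiddenUnsafe type-checks a single source file and returns the names of
// exported package-level types whose underlying type is unsafe.Pointer,
// labeled as either an alias or a definition.
func hiddenUnsafe(src string) ([]string, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, "a.go", src, 0)
	if err != nil {
		return nil, err
	}
	conf := types.Config{Importer: unsafeOnly{}}
	pkg, err := conf.Check("a", fset, []*ast.File{f}, nil)
	if err != nil {
		return nil, err
	}
	var found []string
	scope := pkg.Scope()
	for _, name := range scope.Names() { // Names returns sorted names
		tn, ok := scope.Lookup(name).(*types.TypeName)
		if !ok || !tn.Exported() {
			continue
		}
		b, ok := tn.Type().Underlying().(*types.Basic)
		if !ok || b.Kind() != types.UnsafePointer {
			continue
		}
		kind := "definition"
		if tn.IsAlias() {
			kind = "alias"
		}
		found = append(found, name+" ("+kind+")")
	}
	return found, nil
}

func main() {
	src := `package a

import "unsafe"

type Pointer = unsafe.Pointer // alias
type P unsafe.Pointer         // definition
type hidden unsafe.Pointer    // unexported, so not reported
type N int                    // unrelated
`
	found, err := hiddenUnsafe(src)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, name := range found {
		fmt.Println(name)
	}
}
```

A real audit would also have to look through struct fields, function results, and other ways of exposing such a type. This sketch only checks top-level type names.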

crawshaw Feb 27, 2017

Contributor

Yes, I wanted to point out that directly aliasing unsafe.Pointer makes code harder to audit, enough so that I hope no one ends up doing so.

griesemer Feb 27, 2017

Contributor

@crawshaw Per my comment, this was also true before we had type aliasing. The following is valid:

package a

import "unsafe"

type P unsafe.Pointer

package main

import "./a"
import "fmt"

var x uint64 = 0xfedcba9876543210
var h = *(*uint32)(a.P(uintptr(a.P(&x)) + 4))

func main() {
	fmt.Printf("%x\n", h)
}

That is, in package main, I can do unsafe arithmetic using a.P even though main does not import unsafe and a.P is not an alias. This was always possible.

Is there something else you are referring to?

crawshaw Feb 27, 2017

Contributor

My mistake. I thought that didn't work. (I was under the impression that the special rules applied to unsafe.Pointer would not propagate to new types defined from it.)

griesemer Feb 27, 2017

Contributor

The spec is actually not clear on this. Looking at the implementation of go/types, it turns out that my initial implementation required unsafe.Pointer exactly, not just some type that happened to have an underlying type of unsafe.Pointer. I just found #6326 which is when I changed go/types to be gc compliant.

Perhaps we should disallow this for regular type definitions and also disallow aliases of unsafe.Pointer. I can't see any good reason for allowing it and it does compromise the explicitness of having to import unsafe for unsafe code.

griesemer Feb 27, 2017

Contributor

I created #19306.

bradfitz May 3, 2017

Owner

This happened. I don't think anything remains here.

@bradfitz bradfitz closed this May 3, 2017

gopherbot pushed a commit that referenced this issue Jun 7, 2017

builtin: use type aliases for rune and byte
As motivated by https://golang.org/design/18130-type-alias which says:

https://github.com/golang/proposal/blob/master/design/18130-type-alias.md#relationship-to-byte-and-rune

> The language specification already defines byte as an alias for
> uint8 and similarly rune as an alias for int32, using the word alias
> as an informal term. It is a goal that the new type declaration
> semantics not introduce a different meaning for alias. That is, it
> should be possible to describe the existing meanings of byte and
> uint8 by saying that they behave as if predefined by:
>
>     type byte = uint8
>     type rune = int32

So, do that. Seems to work.

Updates #18130

Change-Id: I0740bab3f8fb23e946f3542fdbe819007a99465a
Reviewed-on: https://go-review.googlesource.com/45017
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Robert Griesemer <gri@golang.org>
Run-TryBot: Brad Fitzpatrick <bradfitz@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>
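
The practical effect of byte and rune being aliases can be seen in a short sketch. The function sum is a hypothetical helper. The point is only that no conversions are needed between byte and uint8, or between rune and int32.

```go
package main

import "fmt"

// sum takes []uint8, but a []byte can be passed directly because byte and
// uint8 denote the same type.
func sum(xs []uint8) int {
	total := 0
	for _, x := range xs {
		total += int(x)
	}
	return total
}

func main() {
	b := []byte("Go")
	fmt.Println(sum(b))

	// rune and int32 are likewise the same type, so this assignment
	// compiles without a conversion.
	var r rune = 'x'
	var i int32 = r
	fmt.Println(i)
}
```

Had byte been a defined type with underlying type uint8, the call sum(b) would be rejected, since []byte and []uint8 would then be different slice types.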