Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generics external impl versus extends #995

Closed
josh11b opened this issue Dec 20, 2021 · 25 comments
Closed

Generics external impl versus extends #995

josh11b opened this issue Dec 20, 2021 · 25 comments
Labels
leads question A question for the leads team

Comments

@josh11b
Copy link
Contributor

josh11b commented Dec 20, 2021

Right now there is an inconsistency between how classes/adapters refer to interfaces they implement and interfaces/constraints to interfaces they require with respect to how they distinguish between whether they include the names from the implemented/required interface.

Include names: Yes No
class, adapter impl external impl
interface, constraint extends impl

Should we make these consistent, and if so how? There are three options:

  • No change: the difference between "implementing" and "requiring" justifies the difference in syntax, which reflects which option should be shorter to say.
    interface Hashable {
      extends Equatable;
      impl as Movable;
      fn Hash...;
    }
    class Song {
      impl as Media { ... }
      external impl as Equatable { ... }
    }
    
  • extends: Consistently use extends when including the names, drop the external keyword.
    interface Hashable {
      extends as Equatable;
      impl as Movable;
      fn Hash...;
    }
    class Song {
      extends as Media { ... }
      impl as Equatable { ... }
    }
    
  • external: Consistently use external impl when not including names, stop using the extends keyword for interfaces. extends only used for base/derived classes.
    interface Hashable {
      impl as Equatable;
      external impl as Movable;
      fn Hash...;
    }
    class Song {
      impl as Media { ... }
      external impl as Equatable { ... }
    }
    
@zygoloid
Copy link
Contributor

zygoloid commented Jan 5, 2022

The "consistently use extends" option seems to result in somewhat awkward syntax, especially for conditional conformance situations:

class Wrapper(T:! Type) {
  // This reads a bit strangely.
  extends as Copyable { ... }
  // This reads very strangely.
  extends Wrapper(E:! Hashable) as Hashable { ... }
}

... so I find I much prefer consistently using impl/external impl over than consistently using extends/impl from a readability standpoint.

I think I'm OK with having extends in interfaces and constraints rather than impl as, specifically because when implementing an interface or constraint that extends another interface or constraints, you can also implicitly implement the extended interface. That to me feels like a stronger connection than an internal impl. Perhaps, mostly from a consistency standpoint, we could allow all of extends, impl, and external impl in constraints, and treat all three differently:

interface A {
  fn FA();
}

interface B {
  extends A;
  fn FB();
}
// OK: B has both FB and FA member names.
fn CallB(T:! B) { T.FB(); T.FA(); }

interface C {
  impl as A;
  fn FC();
}
// OK: C has both FC and FA member names.
fn CallC(T:! C) { T.FC(); T.FA(); }

interface D {
  external impl as A;
  fn FD();
}
// Error: D has no member name FA.
fn CallD(T:! D) { T.FD(); T.FA(); }

class CB {
  // OK, this implements C as B and implicitly C as A.
  impl as B {
    fn FA() {}
    fn FB() {}
  }
}

class CC1 {
  // Error, impl CC as A is required by impl CC as C but not provided.
  impl as C {
    // Error, function FA is not part of interface C.
    fn FA() {}
    fn FC() {}
  }
}
class CC2 {
  impl as A { fn FA() {} }
  // OK, requirement that CC2 as A exists is satisfied.
  impl as C { fn FC() {} }
  fn G() {
    // OK, FC found in internal impl of C.
    CC2.FC();
    // OK, FA found via internal impl of A (and maybe also impl of C? see below)
    CC2.FA();
  }
}
class CC3 {
  external impl as A { fn FA() {} }
  impl as C { fn FC() {} }
  fn G() {
    // OK, FC found in internal impl of C.
    CC3.FC();
    // OK? FA found via internal impl of C, even though A implemented externally?
    // Maybe should reject 'external impl as A' above, because A is effectively internally implemented?
    // Or maybe `impl as` should only add an internal impl to the archetype, and not to implementers
    // of the type itself?
    CC3.FA();
  }
}

class CD {
  external impl as A { fn FA() {} }
  impl as D { fn FD() {} }
  fn G() {
    // OK, FD found in internal impl of D.
    CC.FD();
    // Error: no member name FA in class CD.
    // Use "CC.(A.FA)()" to call member of interface A.
    CC.FA();
  }
}

That said, I'm not sure whether it's useful from an interface design perspective to introduce this choice between whether an interface requires an internal impl of another interface and whether it extends that interface (where the choice only affects implementers and not users).

(One related thing I think we should keep in mind is that we might want to reserve syntactic space to allow interfaces to require an impl on a type other than Self. Eg:

interface CopyableContainer {
  let Iter:! Iterator;
  impl Iter.ValueType as Copyable;
}

... or maybe as a big stretch something like:

interface ContainerProvider {
  fn Container(T:! Type) -> Type;
  impl Container(i32) as Hashable;
}

But I think all the options in front of us are OK from this perspective: they all have an as in the implementation requirement syntax in an interface.)

[Edit: class CopyableContainer -> interface CopyableContainer, class ContainerProvider -> interface ContainerProvider.]

@josh11b
Copy link
Contributor Author

josh11b commented Jan 5, 2022

class CopyableContainer {
  let Iter:! Iterator;
  impl Iter.ValueType as Copyable;
}

can be expressed using a where clause:

class CopyableContainer {
  let Iter:! Iterator where .ValueType is Copyable;
}

But this example is weird since it uses let in a class without assigning it a value.

@github-actions
Copy link

We triage inactive PRs and issues in order to make it easier to find active work. If this issue should remain active or becomes active again, please comment or remove the inactive label. The long term label can also be added for issues which are expected to take time.
This issue is labeled inactive because the last activity was over 90 days ago.

@github-actions github-actions bot added the inactive Issues and PRs which have been inactive for at least 90 days. label Apr 22, 2022
@jonmeow jonmeow added the leads question A question for the leads team label Aug 10, 2022
@github-actions github-actions bot removed the inactive Issues and PRs which have been inactive for at least 90 days. label Aug 11, 2022
@josh11b
Copy link
Contributor Author

josh11b commented Nov 8, 2022

We've been discussing an alternative that is more consistent:

  • Always uses impl when implementing.
  • Always uses extends when adding another place to look up names.
  • Never uses external

The changes in this alternative are:

  • external impl T as C in a class/adapter becomes impl T as C, matching what is done now for an interface or constraint
  • impl as C in a class/adapter becomes impl extends C.
    • impl extends C is equivalent to impl as C plus interface extends C.
    • interface extends C by itself adds C as another place to look up names without declaring an implementation of C for the type.
    • This replaces extends C for interface or constraint.
    • This form does not allow specifying a type.

Using this alternative, the example would be written:

interface Hashable {
  impl extends Equatable;
  impl as Movable;
  fn Hash...;
}
class Song {
  impl extends Media { ... }
  impl as Equatable { ... }
}

Conditional impls would be accomplished by using interface extends C to include the names of C in the type, combined with separate impl declarations for each condition for which we have an implementation of that interface. For example:

class Vector(T:! Type) {
  interface extends Sortable;
  impl forall [U:! Comparable] Vector(U) as Sortable;
}

Discussions:

@josh11b
Copy link
Contributor Author

josh11b commented Nov 10, 2022

A question about this new alternative is how you would control name lookup into mixins.

@josh11b
Copy link
Contributor Author

josh11b commented Nov 15, 2022

Notes from a discussion with @zygoloid on this topic:

mixin Foo {
  // injected
  export fn F[me: Self]();
  // accessible, but not injected
  fn Make() -> Self;
  // inaccessible implementation detail of `Foo`
  private var x: i32;
}
class Bar {
  // names of `Foo` accessed only through `m`
  [private | protected] mix m: Foo;

  // --- A bunch of options ----
  // lookup into `Bar` also finds names of `Foo`
  extends [private | protected] mix m: Foo;  // clearer than next option
  [private | protected] extends mix m: Foo;
  [private | protected] mix extends m: Foo;  // a little bit weird, doesn't quite work
  [private | protected] mix m extends Foo;  // dislike

  [private | protected] mix m: Foo;
  [private | protected] extends m;

  [private | protected] mix m: Foo;
  [private | protected] mix extends m;
}

We considered a possible addition, if we go with the extends m route:

class Baz { fn G[me: Self](); }
class Bar {
  var b: Baz;
  extends b;
  // means: Bar.G == Bar.b.G
}

However, this had two problems: (1) there are no export keywords in the definition of Baz to control what is exported, just public versus private, and (2) (more serious) final classes reserve the right to add new names without breakage which would be incompatible with this.

Concern: the defaults are "public" and "not extends" consistent with other Carbon constructs, though we expect "private" & "extends" to be the common case.

private mix m: Foo;
mix extends m;

Still always having extends marking every other place to look for names is a nice consistency.

Question: would we want a short cut for the common case, like we do for impl extends == impl as + interface extends.

Observation: it is meaningful to write protected impl extends Media. The fact that Media is implemented can't be access controlled, but the names injected into the class can be.

Question: Would we want to move extends to the beginning consistently?

extends impl as C;
extends interface C;
extends mix m;

Alternative is: extends followed either by a declaration or an existing name which we are going to extract names from.

extends impl as C;
// avoids using the word `interface` when `C` might be a named constraint
extends C; 
// a shortcut for `mix m: Foo` + `extends m` 
extends mix m: Foo;

@josh11b
Copy link
Contributor Author

josh11b commented Nov 16, 2022

Now with @chandlerc , we considered this might lead to:

protected extends private mix m: Foo;

as a shortcut for:

private mix m: Foo;
protected extends m;

Which is reasonably intuitive / self-descriptive, even though it is a bit of keyword salad.

We also talked about moving extends for inheriting from a base class into the body of the class definition, from the declaration.

base class B { ... }
class D {
  // Must be before other fields, for data layout reasons
  // (though we could consider relaxing that restriction in the future)
  // Nice: it makes the forward declaration match what is before `{`
  extends base B;

  // can write just `private base B;` for private extension 
}

private/protected before extends controls visibility of the added names, private/protected before base would control who is allowed to perform the conversion-to-base type. However, we had concerns that private inheritance use cases would be better served by using mixins instead.

This idea would also allow deriving from member types.

@josh11b
Copy link
Contributor Author

josh11b commented Nov 18, 2022

Since there is some interest in this approach for this issue and for #1159 , I'd like to iterate on the details. We'd need a couple of keywords:

  • The first means "add this as another place to lookup names" which we are currently spelling extends but could also be include or something else
  • The second means "make this a base class of the current type" which was spelled base above, but I'm going to spell derived_from since I think using base here will cause conflicts or confusion.

Starting with inheritance, a class may have at most one derived_from declaration in the body of its definition, before any mix or var declarations. A derived_from-declaration can have one of these forms:

  • [ protected | private ] derived_from existing-base-class-name ;
  • [ protected | private ] derived_from base-class-declaration-or-definition
    where base-class-declaration-or-definition is:
    • [ protected | private ] ( abstract | base ) class class-name ( ; | { class-body } )

The optional protected or private keyword modifier on derived_from controls access to the derived-to-base conversion. The optional protected or private keyword modifier in base-class-declaration-or-definition controls access to class-name.

Examples:

// `C1` is derived from `ABC`
abstract class ABC { ... }
class C1 {
  derived_from ABC;
}

// `C2` is  derived from member class `C2.BC`
class C2 {
  base class BC { ... }
  derived_from BC;
  // Keeping these two cases separate is much easier when
  // using `derived_from` instead of `base`
}

// `C3` is like `C2`, but with the two declarations in its body combined
class C3 {
  derived_from base class BC { ... }
  // By using `derived_from` instead of `base`, we avoid
  // writing `base base` here
}

// `C4` is privately derived from private member class `C2.BC`
class C4 {
  // `BC` is a private member of `C4`
  private base class BC { ... }
  // Ability to convert from `C4` to `C4.BC` requires private access.
  private derived_from BC;
}

// `C5` is like `C4`, but the two declarations in its body are combined
class C5 {
  // `BC` is a private member of `C5` and the ability to convert
  // from `C5` to `C5.BC` requires private access.
  private derived_from private base class BC { ... }
}

An extends-declaration follows a similar pattern of supporting two forms:

  • [ protected | private ] extends existing-name ;
  • [ protected | private ] extends declaration-or-definition

The optional protected or private keyword modifier on extends controls access to the names added by this declaration, and is only allowed when the extends-declaration is inside a class definition.

In an interface definition, the name would be of an interface, and the "declaration or definition" must be the declaration of an interface implementation requirement. Examples:

interface A { ... }

interface B1 {
  // implementing interface `A` is a requirement to implement `B1`
  impl as A;
  // also look in `A` for names in `B1`
  extends A;
}

interface B2 {
  // Same as in `B1`, but the two declarations in the body are combined here
  extends impl as A;
}

In a class, the name, declaration, or definition needs to be one of:

  • an interface impl
  • a derived_from class
  • an adapts class
  • a mixin

Note that there is no way (at this time) to combine an adapts declaration (in the class declaration) with an extends declaration (in the class definition body).

Adapter example:

class CA adapts C1 {
  // Include the names from `C1` into `CA`
  extends C1;
}

Inheritance examples:

// `CB1` is derived from `ABC`, and includes its names
class CB1 {
  derived_from ABC;
  extends ABC;
}

// `CB2` is like `CB1`, but with the two declarations in the body combined
class CB2 {
  extends derived_from ABC;
}

// `CC1` is derived from a member class `CC1.BC`, and includes its names
class CC1 {
  base class BC { ... }
  derived_from BC;
  extends BC;
}

// `CC2` is like `CC1`, but with two of the declarations in the body combined
class CC2 {
  base class BC { ... }
  extends derived_from BC;
}

// `CC3` is like `CC1`, but with two different declarations in the body combined
class CC3 {
  derived_from base class BC { ... }
  extends BC;
}

// `CC4` is like `CC1`, but with all three of the declarations in the body combined
class CC4 {
  extends derived_from base class BC { ... }
  // A bit awkward that `extends`, `derived_from`, and `base` are all used here
  // to mean different aspects of inheritance.
}

// `CC5` is like `CC4`, but with protected access
class CC5 {
  protected extends protected derived_from protected base class BC { ... }
  // Can write `protected` in three places to mean three different things!
}

Interface implementation examples:

class CD1 {
  // class `CD1` implements interface `A`
  impl as A;
  // class `CD1` includes the names from interface `A` as members
  extends A;
}

class CD2 {
  // Equivalent to the two declarations in the body of `CD1`
  extends impl as A;
}

(Mixin examples covered earlier in this thread. I can go through them on request, but this comment has grown long enough.)

@chandlerc
Copy link
Contributor

Starting with inheritance, a class may have at most one derived_from declaration in the body of its definition, before any mix or var declarations. A derived_from-declaration can have one of these forms:

  • [ protected | private ] derived_from existing-base-class-name ;

  • [ protected | private ] derived_from base-class-declaration-or-definition
    where base-class-declaration-or-definition is:

    • [ protected | private ] ( abstract | base ) class class-name ( ; | { class-body } )

While I think it will be common to want to both extend and derive from a class, it doesn't seem like an important common use case to support both deriving from and declaring a (nested) base class.

Would it be ok to only have the first of these forms? Does that reduce any of the concerns? (It reduces one of mine which is having 3-levels of nested access specifiers.)

@josh11b
Copy link
Contributor Author

josh11b commented Nov 22, 2022

I agree that removing the ability to declare a base class in a derived_from declaration is an improvement.

@chandlerc and I talked in person about restricting access-control specifiers, with the understanding that finer control is always possible by using multiple declarations instead of combining them.

  • I wanted an access control specifier earlier in the declaration to act as the default for later omitted access control specifiers.
  • @chandlerc wanted to limit to a single access control specifier per declaration.
    • We considered only allowing it at the beginning, but that prevents the extends private mix... combination which is expected to be common

It was important to me that if a declaration started with private, no later part of the declaration could make something public. This way you could scan the first words of the declarations and find all the ones adding public members. We settled on only allowing a single access control specifier per declaration:

  • if it was at the beginning, it would apply to everything in that declaration
  • if it was in the middle, everything before would be public and everything after would have the access of the specifier

We also talked about this example from earlier in the thread:

class Baz { fn G[me: Self](); }
class Bar {
  var b: Baz;
  extends b;
  // means: Bar.G == Bar.b.G
}

We agreed we didn't want to allow extending a final class, but wanted to leave the door open in the future to possibly doing something with this in the case Baz is a base class, to get a form of multiple inheritance without subtyping. For now, we believe instead using mixins or thunks in this case would be fine for now.

@chandlerc was not excited about the spelling of derived_from and still wonders if we might spell it base instead, despite the overloading of that keyword. We could also spell it as.

Below the line are the updated versions of what I wrote earlier in this thread.


A derived_from-declaration can have only have this form:

  • [ protected | private ] derived_from existing-base-class-name ;

The optional protected or private keyword modifier on derived_from controls access to the derived-to-base conversion.

Examples:

// `C1` is derived from `ABC`
abstract class ABC { ... }
class C1 {
  derived_from ABC;
}

// `C2` is  derived from member class `C2.BC`
class C2 {
  base class BC { ... }
  derived_from BC;
  // Keeping these two cases separate is much easier when
  // using `derived_from` instead of `base`
}

// `C4` is privately derived from private member class `C2.BC`
class C4 {
  // `BC` is a private member of `C4`
  private base class BC { ... }
  // Ability to convert from `C4` to `C4.BC` requires private access.
  private derived_from BC;
}

Classes C3 and C5 above would no longer be allowed.

Interface examples are not changed:

interface A { ... }

interface B1 {
  // implementing interface `A` is a requirement to implement `B1`
  impl as A;
  // also look in `A` for names in `B1`
  extends A;
}

interface B2 {
  // Same as in `B1`, but the two declarations in the body are combined here
  extends impl as A;
}

Adapter example is unchanged:

class CA adapts C1 {
  // Include the names from `C1` into `CA`
  extends C1;
}

Inheritance examples:

// `CB1` is derived from `ABC`, and includes its names
class CB1 {
  derived_from ABC;
  extends ABC;
}

// `CB2` is like `CB1`, but with the two declarations in the body combined
class CB2 {
  extends derived_from ABC;
}

// `CC1` is derived from a member class `CC1.BC`, and includes its names
class CC1 {
  base class BC { ... }
  derived_from BC;
  extends BC;
}

// `CC2` is like `CC1`, but with two of the declarations in the body combined
class CC2 {
  base class BC { ... }
  extends derived_from BC;
}

// `CC7` is like `CC2`, but with protected access
class CC7 {
  protected base class BC { ... }
  protected extends derived_from BC;
  // last declaration equivalent to:
  protected derived_from BC;
  protected extends BC;
}

Classes CC3, CC4, and CC5 above would no longer be allowed.

Interface implementation examples are unchanged:

class CD1 {
  // class `CD1` implements interface `A`
  impl as A;
  // class `CD1` includes the names from interface `A` as members
  extends A;
}

class CD2 {
  // Equivalent to the two declarations in the body of `CD1`
  extends impl as A;
}

Mixin examples now no longer allow two access specifiers, but there are two possible places to put an access specifier when writing extends...mix:

class CMA {
  extends mix m: Foo;
  // Equivalent to:
  // `m` is a public name
  mix m: Foo;
  // this class publicly includes `Foo`'s names
  extends m;
}

// We expect this case to be the most common
class CMB {
  extends private mix m: Foo;
  // Equivalent to:
  // `m` is a private name
  private mix m: Foo;
  // this class publicly includes `Foo`'s names
  extends m;
}

class CMC {
  protected extends mix m: Foo;
  // Equivalent to:
  // `m` is a protected name
  protected mix m: Foo;
  // this class includes `Foo`'s names with protected access
  protected extends m;
}

class CMD {
  // No way to combine these two declarations:
  // `m` is a public name
  mix m: Foo;
  // this class includes `Foo`'s names with protected access
  protected extends m;
}

class CME {
  // No way to combine these two declarations:
  // `m` is a private name
  private mix m: Foo;
  // this class includes `Foo`'s names with protected access
  protected extends m;
}

@josh11b
Copy link
Contributor Author

josh11b commented Nov 23, 2022

I had a talk with @mconst last night about this and he had a different perspective:

  • We probably don't need a lot of the extra flexibility this proposal gives us, and it adds lots of complexity. As evidence: in C++ almost all inheritance is public. The few use cases of private inheritance are for things like the empty base optimization, which won't be relevant in Carbon. As further evidence: no languages since C++ have copied this feature.
  • It would make more sense to use extends in place of derived_from, and come up with another keyword (maybe includes?) for "adds to this class' API".
  • Instead of putting private or protected before an internal impl declaration, it might make more sense to add it before the individual member functions, like we are planning to do with virtual.

We talked about mixins in greater detail. @mconst thought:

  • A mixin should be in control of which members are added to the mixer, and their visibility. export in a mixin may be followed by any declaration, including export protected...
  • The keyword mix already pretty clearly indicates that you should look in the mixin for exported names.
  • There may need to be some mechanism for handling name conflicts from a mixin, but that is not the common case and so the defaults should favor the case of no conflicts.
  • We looked at the use cases for mixing in the same mixin more than once into a class. The only example we know of is intrusive lists, which need to be parameterized anyway, and will have distinct argument values (see example below).
  • If every mixin is going to have a unique type, we can avoid giving each mixin a name in the mixer. If it doesn't have a name and the mixin controls how it affects the mixer class, there is no need for access control specifiers on mix declarations.
interface IntrusiveListNodeIface(TagT:! type) {
  fn NextPtr[addr me: Self*]() -> Self**;
}

mixin IntrusiveListNode(TagT:! type) {
  impl as IntrusiveListNoteIface(TagT) {
    fn NextPtr[addr me: Self*]() -> Self** {
      return &me.next;
    }
  }
  var next: Self*;
}

class IntrusiveList(TagT:! type,
                    NodeT:! IntrusiveListNoteIface(TagT)) {
  fn Add[addr me: Self*](node_ptr: NodeT*);
}

class Tag1 {}
class Tag2 {}
abstract class Foo {
  mix IntrusiveListNode(Tag1);
  mix IntrusiveListNode(Tag2);
}
var foo1_list: IntrusiveList(Tag1, Foo);
var foo2_list: IntrusiveList(Tag2, Foo);
class Bar extends Foo {}
var bar: Bar;
// No need to specify which mixin within `Foo`. `foo1_list`
// can find the next pointer via the implementation of
// `IntrusiveListNoteIface(Tag1)`.
foo1_list.Add(&bar);
foo2_list.Add(&bar);
  • The only problem with this approach is we need to add some mechanism for specifying how to construct the mixin.
  • If the constructor parameters don't vary, this can be done as part of the mix declaration:
abstract class Foo {
  mix IntrusiveListNode(Tag1) = IntrusiveListNode(Tag1).Make();
  mix IntrusiveListNode(Tag2) = IntrusiveListNode(Tag2).Make();
}
  • The other case requires some new mechanism to make struct literals where the field names can be parameterized by types:
abstract class Foo {
  mix IntrusiveListNode(Tag1);
  mix IntrusiveListNode(Tag2);
  fn Make(init: i32) -> Self {
    return {
      .mix(IntrusiveListNode(Tag1)) = IntrusiveListNode(Tag1).Make(init),
      .mix(IntrusiveListNode(Tag2)) = IntrusiveListNode(Tag2).Make(init)
    };
    // or maybe `.mixin(...) = ...`?
  }
}

@mconst is on vacation at the moment, but is available starting Tuesday.

@josh11b
Copy link
Contributor Author

josh11b commented Feb 1, 2023

There have been some discussions recently about intentionally making the "requires interface/constraint to be implement" declaration in interfaces and constraints more distinct from the "implements interface/constraint" declaration in types, see for example https://discord.com/channels/655572317891461132/941071822756143115/1069691751968817283 . Some differences:

  • requiring vs. extending in the interface is decoupled from whether the type implements the interface internally or externally
  • typically the type won't implement a required interface nested in the body of the implementation of the requiring interface; for extending interfaces it could be inline, for required interfaces it will likely be a separate block
  • rules for rewrites are different, see Associated constant assignment versus equality #2173 .

Possibly we could mirror the where syntax, which is directly about expressing requirements, either with the same keyword or a new keyword?

interface Hashable {
  // was: impl as Movable;
  // now:
  where Self is Movable;
  // or where Self impls Movable; depending on how #2495 is resolved
  // or using a different keyword introducer:
  requires Self is Movable;
}

Possibly we could do something like make impls Movable a shortcut for where Self impls Movable, since that is the common case.

@chandlerc
Copy link
Contributor

chandlerc commented Feb 4, 2023

We had a long discussion about this and related topics today (lots but potentially incomplete notes).

I just wanted to try to capture the end result, as I think we got to some good consensus on the more major directional questions. However, there are a few details that might warrant further diving into and trying out some alternatives.

For classes, I think we increasingly have a story we like, which I think largely addresses the part of this issue that also came up in #2293:

class Foo {
  extends base: Base;

  extends m1: Mix1;

  // Maybe even:
  extends _: Mix2;

  // Internal impl contributes to the API.
  extends impl as Sortable;

  // External impl of an operator.
  impl as Add;

  // Still a mixin, but no API extension.
  var m3: Mix3;
}

// Also external impl of an operator
impl Foo as Mul { ... }

There is a related question in classes that we didn't get an answer we were clearly happy about: combining conditional conformance and API extension. But that is maybe worth factoring out into a separate issue as it seems fairly narrow and not something that should dramatically influence the rest of this picture.

We also got fairly close to syntax for how analogous things show up in interfaces:

interface I {
  fn F();
}

interface Requires {
  // This is just `requires <where-clause>`, with the same
  // syntax as in `Requires where <where-clause>`, but at
  // the top level, and can be repeated and interleaved
  // with other declarations as proves useful to organize
  // the interface.
  requires Self is I;
}

interface Extends {
  // Works just like today. Analogous to base classes and
  // mixins, but there is no meaningful name here because
  // for interfaces there is no meaningful state and so
  // no reason to need to refer to a name.
  extends I;
}

The idea is that extends I; implies both extending the interface and requires Self is I.

The keyword is is already being discussed in #2495 to find a better keyword. That should proceed independently (and I now have some more suggestions and thoughts there, but will add to that issue). WHatever the outcome is of that, it's expected to apply to the syntax here.

We might want something shorter than requires Self is I because this is expected to be the common case. I've suggested revisiting this with experience before introducing it, but there seem to be lots of easy options, including just requires I. We'll need to think about the tradeoff of adding any short form to both requires ... and to where ... vs. having the two deviate, but that seems like it can be considered if/when we decide to add the short form.

I think that captures the outcome of the discussion we had, but @josh11b, @mconst, and @zygoloid -- please chime in if I got anything wrong. I'll separately add the parts of our discussion that really applied to other issues to those issues (#2495 and a new issue focused on the conditional conformance question).

@chandlerc
Copy link
Contributor

Filed #2580 to document the challenges of conditional internal implementation, the directions we discussed that could address that, as well as the downsides. FWIW, I'm suggesting there to just remove conditional internal implementation entirely and revisit it later if it proves an important feature.

@chandlerc
Copy link
Contributor

@josh11b mentioned we should record more of the rationale for each of the decisions that led to my summary outcome in #995 (comment) -- I'm going to use one post per point I recall so folks can react to them individually.

First, we considered the case of a mixin that doesn't contribute to the API in any way. We considered a bunch of the syntaxes mentioned previously, but didn't really find any motivation for them being different from a field where the type was actually a mixin: var m3: Mix3; above.

One reason we increasingly liked this solution was the move in #2360 to have things convert to a type when used in type position. This makes the use of Mix3 in type position perfectly reasonable and gives us a clean way to explain and model this.

@chandlerc
Copy link
Contributor

Next, we looked at mixins themselves, largely considering the two leading approaches:

  1. mix m1: Mix1;
    
  2. extends m1: Mix1;
    

While (1) is very visually nice, we were a bit worried about users wanting to have mix available as an identifier.

Because the first decision handled non-API-contributing mixins completely and without any mix keyword at all, it was easy to keep (2) to the minimal syntax shown rather than the more verbose extends mix m1: Mix1 considered previously.

Some other reasons that (2) ended up being reasonably nice that I recall:

  • When a mixin is used and it contributes to the API it is filling an analogous role to multiple inheritance in C++, so the extends isn't likely to be surprising here for folks that associate it with inheritance-like constructs.
  • This gives us two orthogonal features that compose cleanly: a) using a mixin instead of a normal type enables the embedding and conversion (instead of a simple sub-object), b) using extends injects an API into the containing type.
  • Consist extends in the left-most column for the entities that are importing some chunk of API into a type.

My feeling of the discussion is that option (2) was most concerning when we hadn't solved the non-API-injecting use case as we did. Once that was handled, the syntax in (2) seemed reasonably broadly appealing. While I've tried to list as much of the rationale here as I can, I don't feel like this was a case of a particularly sharp and clear-cut point that justified one option over the other. Instead, it seemed much more a balance between a lot of fairly small factors (in isolation) and subtle tradeoffs. Hopefully the summary here is useful and not inaccurate or misrepresenting the discussion in any way.

@chandlerc
Copy link
Contributor

The decision around extern impl ... as ...; was possibly the most difficult. But not for syntactic reasons as became clear during the discussion.

A major part of the discussion here was about whether there was a way to eliminate the distinction between internal and external implementation of an interface. The existence of this distinction is the root cause of some of the complexity that bothered folks about all of the approaches here.

For example, other languages don't have a distinction here. All interfaces implemented are available as-if internally. The downside is that this makes member name lookup fundamentally more complex (for example, depending on which implementations are imported). Carbon's design is avoiding this by using a much more explicit model of internal/external. But that necessarily adds some complexity -- implementing an interface requires thinking about which way to do it.

We may find that even with the explicit difference, one or the other ends up being the dominant use case in Carbon. We might want to revisit our syntax here if that proves to be true, but I at least had a strong suspicion that we're going to see a mixture of trends here, where some code bases and some interfaces lean one direction or the other, and will end up needing to support both approaches as well as we can.

Once we focused on supporting both use cases, and minimizing the cost of that, things converged much more I think. Specifically, the best way to support both use cases but minimize the cost is to make one a fairly minimal "decoration" of the other so that folks can pattern recognize all implementations using common syntax, and just add an extra marker when "internal".

This rationale led pretty directly to settling on the extends impl T as C { ... } syntax, basically just adding an extends in front to make it "internal". Various advantages this has compared to other approaches:

  • External impls become syntactically minimal, which is important as they can happen in places far away from a class and without a useful reminder. There was an observation that the authors of the explorer have consistently failed to remember an explicit decorator for external impls.
  • The added decoration of extends aligns with mixins and base classes.
  • With the notable exception of conditional implementation, and extending implementation doesn't seem in tension with more inheritance views -- it is very analogous to extending an abstract base class.
  • Having internal impls get the decorator makes sense as they're notionally doing two things -- implementing an interface and extending the class API.
  • Retains an impl keyword to signify that it is an implementation, and to make it pattern match the same as an external impl but with the added prefix.

We considered a bunch of other permutations of words and syntaxes that have been mentioned in the past (impl T extends C, extends T as C, etc) but for the reasons above didn't find them better.

@chandlerc
Copy link
Contributor

When looking at interface syntax...

The extends I; syntax didn't really get many competing suggestions.

  • Seems obvious and unsurprising.
  • Interface extension really is extremely close to inheritance so it seemed to fit very well there.
  • Nothing to name and so no reason to add a name here.
  • Briefly considered extends as I but the as didn't seem to add anything or help. No expectation that we want any more complex formulations of the extends here and if this is the only thing no need to say more.

The requires ... version did have some discussion.

Previously these have been modeled by mirroring the type's syntax for satisfying the requirement: impl T as C;. The problems emerge when we start to see ways in which these are different, for example within a type definition the C may include where .ElementType = U which has very different semantics from the analogous syntax in an interface. There was a feeling that matching the where clause expression syntax of T is C would be better, and if we can improve on that syntax, we should do that independently and get the benefits in both places.

We considered just allowing a where directive:

interface WhereI {
  where T is I;
}

But this didn't read cleanly because it started with where and is a stand-alone directive (declaration? unclear) in the interface definition.

However, we immediately looked at the alternative of introducing the same grammar as where clauses have but using the requires introducer. This seemed obvious, unsurprising, and generally everyone was happy.

  • Good to reuse the exact where clause grammar within it.
    • If we need to shorten, should shorten this generically rather than as a special case
  • Simple, and expected to be completely unsurprising in what it means to readers.

We didn't come up with anything that was interestly better to consider so we went with requires.

@chandlerc
Copy link
Contributor

Since the discussion, I've also been thinking about one issue that was briefly discussed but didn't resolve one way or another, and I'd like to suggest we actually do this.

Specifically, I think we should remove the as from the shorthand for impl Self as C inside a class, and add an analogous shorthand to requires, but not to where:

interface BaseI {}
interface OtherI {}

interface I {
  extends BaseI;

  // same as `requires Self ?? BaseI` where `??` is `is` or `impls` or `as` or...
  requires OtherI;
}

class C {
  // same as `impl Self as I { ... }`
  impl I { ... }
}

// No shorthand to remove `.Self` within a `where`:
let T:! C where .Self is I;

While in the discussion there was a desire for the shorthand to be part of the normal grammar for where clauses, I find it much less obvious in an actual where clause attached to a constraint as opposed to in a requires. It also seems much less useful there as you can always use T:! C & I if desired but that isn't an option for requires. And removing the as from the shorthand form of impl gives us symmetry across impl I, requires I, and extends I.

@josh11b
Copy link
Contributor Author

josh11b commented Feb 8, 2023

@chandlerc and @zygoloid came to an agreement today in an in-person meeting on class syntax. We considered three broad approaches:

  1. (representing @mconst ) "interfaces should pick the default for internal versus external, and when implementing there is a single, convenient syntax to use that default"
  2. (representing @chandlerc ) "the important thing is that readers of a type understand its API, looking up only the definitions that contribute to its API", so internal/external is only determined by impl declaration in the type
  3. (considered by @zygoloid ) "types should say which interfaces contribute to their API, but interfaces may want to control which members are injected (when requested), like mixins", combining approaches 1 and 2

Details of the discussion have been documented in the open discussion notes for 2023-02-07. The conclusion was to go with approach 2, at least for now. Looking in fewer places to see the API of a type was deemed an important property. Approach 2 was also deemed the simplest approach. The door was left open to later adopt approach 3 using a keyword (possibly local) in an interface definition to opt-out a member from being injected.

Another question that was considered (though it was deemed "lower stakes" since the decision would be easier to change) was:

  • does "extends" mean "adds to the API of this type" and is used with base classes, mixins, and impls; or
  • does "extends" mean "inheriting from a base class" and we use "mix" for mixins, and some other keyword for impls

@chandlerc and @zygoloid agreed on the first choice.

There were a few more points that we settled on:

  • Contrary to what was considered in the previous comment, we are going to keep as after impl. Reasoning: going to make the parser a lot easier, and leaves more room for evolving the syntax without breaking existing code.
  • We don't seem to immediately need a var syntax with mixins, to perform mixin without extending the interface. It would be simpler and consistent with our interface strategy to just require qualification to access any member with a name conflict, unless shadowed by a declaration in the type itself. The property use case for mixins would just not have any injected members, its functionality would be accessed through public members using the name of the mixin. (This was discussed on 2023-02-06.) This is of course something that could be added later, if the need was demonstrated.

The resulting syntax is:

mixin M1;
mixin M2;
interface Sortable;
interface Add;
interface Sub;
base class B;
class C {
  // Inheriting from base class B
  extends base: B;

  // Mixing in mixin M1
  extends m: M1;
  // Mixing in mixin M2. This member is not named.
  // Initialized using `M2.MakeDefault()`.
  extends _: M2 = M2.MakeDefault();

  // Internal impl contributes to the API.
  extends impl as Sortable;
  // External impl of an operator.
  impl as Add;

  // No use of `var` with mixins.

  // API of C includes API of B, M1, M2, and Sortable
  // except any name that conflicts due to being in
  // more than one of those.
  // Names explicitly defined in C shadow names in
  // B, M1, M2, and Sortable, according to the rules
  // in #2355.
}
// External impl of an operator.
impl C as Sub;

Last question left for me: is there some sort of access control for the member name m when mixing in a mixin, and if so where does the access specifier keyword go? Concern with putting the access specifier first (where it typically goes) is that you would not expect something starting with private to add to the API, but a private extends... often would. So that leads to extends private m: M1;.

@chandlerc
Copy link
Contributor

I think both @zygoloid and I are happy with extends private m: M1;, and that it only makes the name m private.

I also think we can completely omit this for base classes for now. After talking it through we don't have any real use cases, and having the tokens private base seems a bit too likely to create confusion for folks coming from C++. If we want something here, it should be reasonably motivated at least, so let's start with the minimal thing of making the interesting names (mixins) private.

Rationale for the position: putting private at the start seems confusing as you say. Similarly, the most important thing is the extends so it seems good to lead with that.

And I think the answer for interface syntax haven't changed from my comment here: #995 (comment)
(in particular, no short-hand for requires ..., it is the same syntax that would be allowed with where ... on the interface in a constraint)

Are there any other open corners here?

@chandlerc
Copy link
Contributor

An orthogonal aspect of all of this is whether it is spelled extend ... or extends .... I think this has come up a few times in discussions but not sure it was captured.

I think the overall feeling was we should use extend ... everywhere. Rationale:

  • We use impl already.
    • We may be moving towards impls as a predicate and impl to satisfy the predicate. Even if we never want an extends predicate, it would seem important to not deviate in the formulation.
  • One character shorter? Really it's just for consistency with impl. =D

@chandlerc
Copy link
Contributor

chandlerc commented Apr 5, 2023

I think this has actually converged, but trying to update examples...

For the class syntax, slight updates to the earlier example:

mixin M1;
mixin M2;
interface Sortable;
interface Add;
interface Sub;
base class B;
class C {
  // Inheriting from base class B. `base` is a special name similar to `self` here.
  extend base: B;

  // Mixing in mixin M1
  extend m: M1;
  // Mixing in mixin M2. This member is not named.
  // Initialized using `M2.MakeDefault()`.
  extend _: M2 = M2.MakeDefault();
  // Alternative to the above `M2` that uses a private name instead of no name:
  extend private m2: M2;

  // Internal impl contributes to the API.
  extend impl as Sortable;

  // External impl of an operator.
  impl as Add;

  // No use of `var` with mixins.

  // API of C includes API of B, M1, M2, and Sortable.
  //
  // Names explicitly defined in C shadow names in
  // B according to the rules from #2355. How other conflicts
  // work is a separate question in #2745.
}
// External impl of an operator.
impl C as Sub;

And interfaces:

interface BaseI {
  let T:! Type;
}
interface OtherI {}

interface I {
  // `BaseI`'s interface is incorporated into `I`.
  extend BaseI;

  // Anything that could be in a `where` clause can
  // be in a `require` declaration:
  require BaseI.T == i32;

  // No impact on `I`s interface, but an implementation must exist.
  require Self impls OtherI;
}

Interesting details:

  • extend instead of extends to match impl.
  • require instead of requires to match extend and impl.
  • Allow private (or any visibility) only on the name inside the extend, and only when it is not extend base: ....

Have I mis-remembered anything? (Happy for folks to edit this to be a useful summary.)

The rationale here is pretty distributed as well but I do tihnk it is present and probably can just be consolidated in the proposal.

@chandlerc
Copy link
Contributor

FYI, I edited my summary example -- it no longer supposes an answer to shadowing outside of #2355 -- @josh11b filed the rest of that question to be tracked separately in #2745.

I think the name conflicts were the only concern I've heard from a lead here and they should be separated now. Are we good to decide the syntax?

@chandlerc
Copy link
Contributor

Marking as decided.

@josh11b josh11b changed the title Generics external impl vs. extends Generics external impl versus extends Apr 12, 2023
zygoloid added a commit that referenced this issue Jun 15, 2023
Update syntax of `class` and `interface` definitions to be more consistent. Constructs that add names to the class or interface from another definition are always prefixed by the `extend` keyword.

Implements the decisions in:

-   [#995: Generics external impl versus extends](#995),
-   [#1159: adaptor versus adapter may be harder to spell than we'd like](#1159),
-   [#2580: How should Carbon handle conditionally implemented internal interfaces](#2580), and
-   [#2770: Terminology for internal and external implementations](#2770).

Co-authored-by: Richard Smith <richard@metafoo.co.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
leads question A question for the leads team
Projects
None yet
Development

No branches or pull requests

4 participants