Skip to content

Latest commit

 

History

History
1361 lines (1137 loc) · 53.1 KB

p2173.md

File metadata and controls

1361 lines (1137 loc) · 53.1 KB

Associated constant assignment versus equality

Pull request

Table of contents

Abstract

Split the where A == B constraint in two: where .A = B produces a new constraint from an old one, where the value of .A in the new constraint is known and eagerly rewritten to B, and where A == B, which does not cause A and B to be considered as identical by language rules but does permit implicit (no-op) conversion between them.

This aims to provide an efficiently-computable and human-understandable type equality rule, with type canonicalization and therefore transitive type equality, without sacrificing too much in the way of ergonomics and without sacrificing determinism, while still providing the full power of a general type constraint system in a less ergonomic form.

Problem

Equal types with different interfaces

When an associated type is constrained to be a concrete type, it is desirable for the associated type to behave like that concrete type:

interface Hashable {
  fn Hash[me: Self]() -> u128;
}
interface X {
  let R:! Hashable;
  fn Make() -> R;
}
fn F[T:! X where .R == i32](x: T) -> i32 {
  return T.Make() + 1;
}

Here, we want to be able to use the result of F.Make() in all regards like an i32. Assuming an i32.Abs function exists, and i32 implements Hashable externally, it is more desirable for F.Make().Abs() to work than it is for F.Make().Hash() to work.

We are currently aiming to address this by saying that when two type expressions are constrained to be equal, a value whose type is either of those type expressions provides the (disjoint) union of their interfaces. This leads to a surprise:

fn G[T:! X where .R == i32](x: T) -> u128 {
  let n: i32 = 1;
  let m: i64 = 2;
  return n.Hash() ^ m.Hash();
}

Here, the call to n.Hash() is valid but the call to m.Hash() is invalid, because the type i32 of n is constrained to be equal to T.R which is of type Hashable, but the type i64 is not constrained in that way and does not implement Hashable internally.

This suggests that there might be two different behaviors that we want when adding a constraint: either we are changing the interface, or not. This is analogous to internal versus external implementation of an interface.

Type canonicalization

There are a variety of contexts in which types are compared for equality. In some of those contexts, it might be acceptable to perform a search and a computation, and it might be acceptable to have false negatives. For example, when analyzing an argument being passed to a function in a generic, it might be OK to say "we can't prove that the argument has a type that is equal to (or convertible to) the type of the parameter", so long as the developer has some way to demonstrate that the types are the same if that is indeed the case.

However, not all comparisons of generic types have that property. As an extreme and hopefully avoidable example, suppose a C++-like linkage model is used, with name mangling. We may need to determine whether two type expressions are equal at link time, by reducing them both to strings that are guaranteed to be identical if the types are equal and non-identical otherwise. This requires some form of type canonicalization: reducing all equal type expressions to some common representation where we can faithfully perform equality comparisons.

Even if the semantics of Carbon don't require canonicalization, it is still a very useful property for implementation purposes. For example, if type canonicalization is possible to perform efficiently, building a data structure mapping from a type to some value becomes much simpler, and if the representation of a type expression carries its canonical form then type equality comparisons can be done in constant time.

Transitivity of equality

For ergonomic purposes, it is valuable for the notion of type expression equality to be transitive: if type A is known to be the same as type B, and type B is known to be the same as type C, then shouldn't we be able to use an A where a C is required? See #1369 for an explanation of why this is important.

However, if we permit arbitrary constraints and have a transitive equality rule, then type equality is isomorphic to the word problem for finitely-presented monoids, which is undecidable, and certainly not computable, in general.

Therefore we must impose a restriction somewhere: either not all constraints are permissible, or we do not have transitivity, or we impose some other constraint on the language to avoid the hard cases. Some languages merely impose some kind of depth limit on their search for a proof of equality.

Note that if we can perform type canonicalization, then type equality is necessarily transitive: if all equal pairs of types canonicalize to the same representation, then all equal types canonicalize to the same representation transitively.

Coherence

In order to avoid surprises in the language behavior, we want Carbon to have the property that learning more about the type of a value should not cause the behavior of an expression to change from one valid behavior to another valid behavior.

It's important to distinguish here between learning more about a type, such as learning that it implements a particular interface – for example, from an added import – and changing the type, such as by writing a different expression after a : or :!. Changing the type of a value is expected to be able to change the behavior of an expression using that value.

Background

There has been a lot of work in this space. Some previous relevant proposals:

  • #731: generics details 2 introduced associated types, with the intent that a mechanism to constrain their value would be provided in a later proposal.
  • #818: generics details 3 introduced where constraints for generics, with an open question for whether and how we might restrict constraints in order to give a decidable type equality rule.
    • #818 represents the status quo of approved proposals. It has both = and == where constraints, which are similar to, but somewhat different from, those in this proposal.
    • = constraints require a concrete type on their right-hand side. The left-hand side is not restricted. The facet type of the left-hand side type is changed to match the right-hand side.
    • == constraints do not affect the facet type of either operand.
    • As an answer to the question of how to produce a decidable type equality rule, both kinds of constraints are applied automatically, but only at a depth of one constraint. Neither form is transitive but both can be transitively extended using observe declarations.
  • #2070: always == not = in where clauses describes an attempt to unify the syntax and semantics of = and == constraints, under which we would always merge the facet types of the operands. That proposal has not yet been accepted, and this proposal intends to supersede it.

Proposal

Replace the semantics of the existing where A == B and where A = B constraints and restrict the syntax of where A = B as follows:

  • where .A = T specifies a rewrite constraint. The left-hand side is required to name an associated constant. If that associated constant .A is named as a member of a type declared with this constraint, it is rewritten to T. This means that the interface of the type changes, and the behavior is in all respects as if T had been specified directly.
  • where X == Y specifies a same-type constraint. This is treated much like where X is SameAs(Y). Here, SameAs is a hypothetical constraint that is reflexive (T is SameAs(T) for all T), commutative (T is SameAs(U) if and only if U is SameAs(T)), and not implementable by the developer. Moreover, if F is any pure type function, T is SameAs(U) implies F(T) is SameAs(F(U)). observe statements can be used to form transitive SameAs relations. Same-type constraints are not used automatically by the language rules for any purpose, but a blanket ImplicitAs impl permits conversions between types that are known to be the same:
    impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U);
    

A type implements C where .A = T if and only if it implements C where .A == T, assuming both constraints are valid – the inhabitants of these two facet types are the same, but they are different facet types and provide different interfaces for corresponding type values.

Details

Rewrite constraints

In a rewrite constraint, the left-hand operand of = must be a . followed by the name of an associated constant. .Self is not permitted.

interface RewriteSelf {
  // ❌ Error: `.Self` is not the name of an associated constant.
  let Me:! type where .Self = Self;
}
interface HasAssoc {
  let Assoc:! type;
}
interface RewriteSingleLevel {
  // ✅ Uses of `A.Assoc` will be rewritten to `i32`.
  let A:! HasAssoc where .Assoc = i32;
}
interface RewriteMultiLevel {
  // ❌ Error: Only one level of associated constant is permitted.
  let B:! RewriteSingleLevel where .A.Assoc = i32;
}

This notation is permitted anywhere a constraint can be written, and results in a new constraint with a different interface: the named member effectively no longer names an associated constant of the constrained type, and is instead treated as a rewrite rule that expands to the right-hand side of the constraint, with any mentioned parameters substituted into that type.

interface Container {
  let Element:! type;
  let Slice:! Container where .Element = Element;
  fn Add[addr me: Self*](x: Element);
}
// `T.Slice.Element` rewritten to `T.Element`
//     because type of `T.Slice` says `.Element = Element`.
// `T.Element` rewritten to `i32`
//     because type of `T` says `.Element = i32`.
fn Add[T:! Container where .Element = i32](p: T*, y: T.Slice.Element) {
  // ✅ Argument `y` has the same type `i32` as parameter `x` of
  // `T.(Container.Add)`, which is also rewritten to `i32`.
  p->Add(y);
}

Rewrites aren't performed on the left-hand side of such an =, so where .A = .B and .A = C is not rewritten to where .A = .B and .B = C. Instead, such a where clause is invalid when the constraint is resolved unless each rule for .A specifies the same rewrite.

Note that T:! C where .R = i32 can result in a type T.R whose behavior is different from the behavior of T.R given T:! C. For example, member lookup into T.R can find different results and operations can therefore have different behavior. However, this does not violate coherence because the facet types C and C where .R = i32 don't differ by merely having more type information; rather, they are different facet types that have an isomorphic set of values, somewhat like i32 and u32. An = constraint is not merely learning a new fact about a type, it is requesting different behavior.

Combining constraints with &

Suppose we have X = C where .R = A and Y = C where .R = B. What should C & X produce? What should X & Y produce?

We could perhaps say that X & Y results in a facet type where the type of R has the union of the interface of A and the interface of B, and that C & X similarly results in a facet type where the type of R has the union of the interface of A and the interface originally specified by C. However, this proposal suggests a simpler rule:

  • Combining two rewrite rules with different rewrite targets results in a facet type where the associated constant is ambiguous. Given T:! X & Y, the type expression T.R is ambiguous between a rewrite to A and a rewrite to B. But given T:! X & X, T.R is unambiguously rewritten to A.
  • Combining a constraint with a rewrite rule with a constraint with no rewrite rule preserves the rewrite rule. For example, supposing that interface Container extends interface Iterable, and Iterable has an associated constant Element, the constraint Container & (Iterable where .Element = i32) is the same as the constraint (Container & Iterable) where .Element = i32 which is the same as the constraint Container where .Element = i32.

If the rewrite for an associated constant is ambiguous, the facet type is rejected during constraint resolution.

Combining constraints with and

It's possible for one = constraint in a where to refer to another. When this happens, the facet type C where A and B is interpreted as (C where A) where B, so rewrites in A are applied immediately to names in B, but rewrites in B are not applied to names in A until the facet type is resolved:

interface C {
  let T:! type;
  let U:! type;
  let V:! type;
}
class M {
  alias Me = Self;
}
// ✅ Same as `C where .T = M and .U = M.Me`, which is
// the same as `C where .T = M and .U = M`.
fn F[A:! C where .T = M and .U = .T.Me]() {}
// ❌ No member `Me` in `A.T:! type`.
fn F[A:! C where .U = .T.Me and .T = M]() {}

Combining constraints with extends

Within an interface or named constraint, extends can be used to extend a constraint that has rewrites.

interface A {
  let T:! type;
  let U:! type;
}
interface B {
  extends A where .T = .U and .U = i32;
}

var n: i32;

// ✅ Resolved constraint on `T` is
// `B where .(A.T) = i32 and .(A.U) = i32`.
// `T.(A.T)` is rewritten to `i32`.
fn F(T:! B) -> T.(A.T) { return n; }

Combining constraints with impl as and is

Within an interface or named constraint, the impl T as C syntax does not permit = constraints to be specified directly. However, such constraints can be specified indirectly, for example if C is a named constraint that contains rewrites. Because these constraints don't change the type of T, rewrite constraints in this context will never result in a rewrite being performed, and instead are equivalent to == constraints.

interface A {
  let T:! type;
  let U:! type;
}
constraint C {
  extends A where .T = .U and .U = i32;
}
constraint D {
  extends A where .T == .U and .U == i32;
}
interface B {
  // OK, equivalent to `impl as D;` or
  // `impl as A where .T == .U and .U == i32;`.
  impl as C;
}

var n: i32;

// ❌ No implicit conversion from `i32` to `T.(A.T)`.
// Resolved constraint on `T` is
// `B where T.(A.T) == T.(A.U) and T.(A.U) = i32`.
// `T.(A.T)` is single-step equal to `T.(A.U)`, and
// `T.(A.U)` is single-step equal to `i32`, but
// `T.(A.T)` is not single-step equal to `i32`.
fn F(T:! B) -> T.(A.T) { return n; }

A purely syntactic check is used to determine if an = is specified directly in an expression:

  • An = constraint is specified directly in its enclosing where expression.
  • If an = constraint is specified directly in an operand of an & or (...), then it is specified directly in that enclosing expression.

In an impl as C or impl T as C declaration in an interface or named constraint, C is not allowed to directly specify any = constraints:

// Compile-time identity function.
fn Identity[T:! type](x:! T) -> T { return x; }

interface E {
  // ❌ Rewrite constraint specified directly.
  impl as A where .T = .U and .U = i32;
  // ❌ Rewrite constraint specified directly.
  impl as type & (A where .T = .U and .U = i32);
  // ✅ Not specified directly, but does not result
  // in any rewrites being performed.
  impl as Identity(A where .T = .U and .U = i32);
}

The same rules apply to is constraints. Note that .T == U constraints are also not allowed in this context, because the reference to .T is rewritten to .Self.T, and .Self is ambiguous.

// ❌ Rewrite constraint specified directly in `is`.
fn F[T:! A where .U is (A where .T = i32)]();

// ❌ Reference to `.T` in same-type constraint is ambiguous:
// does this mean the outer or inner `.Self.T`?
fn G[T:! A where .U is (A where .T == i32)]();

// ✅ Not specified directly, but does not result
// in any rewrites being performed. Return type
// is not rewritten to `i32`.
fn H[T:! type where .Self is C]() -> T.(A.U);

// ✅ Return type is rewritten to `i32`.
fn I[T:! C]() -> T.(A.U);

Constraint resolution

When a facet type is used as the declared type of a type T, the constraints that were specified within that facet type are resolved to determine the constraints that apply to T. This happens:

  • When the constraint is used explicitly, when declaring a generic parameter or associated constant of the form T:! Constraint.
  • When declaring that a type implements a constraint with an impl declaration, such as impl T as Constraint. Note that this does not include impl ... as constraints appearing in interface or constraint declarations.

In each case, the following steps are performed to resolve the facet type's abstract constraints into a set of constraints on T:

  • If multiple rewrites are specified for the same associated constant, they are required to be identical, and duplicates are discarded.
  • Rewrites are performed on other rewrites in order to find a fixed point, where no rewrite applies within any other rewrite. If no fixed point exists, the generic parameter declaration or impl declaration is invalid.
  • Rewrites are performed throughout the other constraints in the facet type -- that is, in any == constraints and is constraints -- and the type .Self is replaced by T throughout the constraint.
// ✅ `.T` in `.U = .T` is rewritten to `i32` when initially
// forming the facet type.
// Nothing to do during constraint resolution.
fn InOrder[A:! C where .T = i32 and .U = .T]() {}
// ✅ Facet type has `.T = .U` before constraint resolution.
// That rewrite is resolved to `.T = i32`.
fn Reordered[A:! C where .T = .U and .U = i32]() {}
// ✅ Facet type has `.U = .T` before constraint resolution.
// That rewrite is resolved to `.U = i32`.
fn ReorderedIndirect[A:! (C where .T = i32) & (C where .U = .T)]() {}
// ❌ Constraint resolution fails because
// no fixed point of rewrites exists.
fn Cycle[A:! C where .T = .U and .U = .T]() {}

To find a fixed point, we can perform rewrites on other rewrites, cycling between them until they don't change or until a rewrite would apply to itself. In the latter case, we have found a cycle and can reject the constraint. Note that it's not sufficient to expand only a single rewrite until we hit this condition:

// ❌ Constraint resolution fails because
// no fixed point of rewrites exists.
// If we only expand the right-hand side of `.T`,
// we find `.U`, then `.U*`, then `.U**`, and so on,
// and never detect a cycle.
// If we alternate between them, we find
// `.T = .U*`, then `.U = .U**`, then `.V = .U***`,
// then `.T = .U**`, then detect that the `.U` rewrite
// would apply to itself.
fn IndirectCycle[A:! C where .T = .U and .U = .V* and .V = .U*]();

After constraint resolution, no references to rewritten associated constants remain in the constraints on T.

If a facet type is never used to constrain a type, it is never subject to constraint resolution, and it is possible for a facet type to be formed for which constraint resolution would always fail. For example:

package Broken api;

interface X {
  let T:! type;
  let U:! type;
}
let Bad:! auto = (X where .T = .U) & (X where .U = .T);
// Bad is not used here.

In such cases, the facet type Broken.Bad is not usable: any attempt to use that facet type to constrain a type would perform constraint resolution, which would always fail because it would discover a cycle between the rewrites for .T and for .U. In order to ensure that such cases are diagnosed, a trial constraint resolution is performed for all facet types. Note that this trial resolution can be skipped for facet types that are actually used, which is the common case.

Precise rules and termination

This section explains the detailed rules used to implement rewrites. There are two properties we aim to satisfy:

  1. After type-checking, no symbolic references to associated constants that have an associated rewrite rule remain.
  2. Type-checking always terminates in a reasonable amount of time, ideally linear in the size of the problem.

Rewrites are applied in two different phases of program analysis.

  • During qualified name lookup and type checking for qualified member access, if a rewritten member is looked up, the right-hand side's value and type are used for subsequent checks.
  • During substitution, if the symbolic name of an associated constant is substituted into, and substitution into the left-hand side results in a value with a rewrite for that constant, that rewrite is applied.

In each case, we always rewrite to a value that satisfies property 1 above, and these two steps are the only places where we might form a symbolic reference to an associated cosntant, so property 1 is recursively satisfied. Moreover, we apply only one rewrite in each of the above cases, satisfying property 2.

Qualified name lookup

Qualified name lookup into either a type parameter or into an expression whose type is a symbolic type T -- either a type parameter or an associated type -- considers names from the facet type C of T, that is, from T’s declared type.

interface C {
  let M:! i32;
  let U:! C;
}
fn F[T:! C](x: T) {
  // Value is C.M in all four of these
  let a: i32 = x.M;
  let b: i32 = T.M;
  let c: i32 = x.U.M;
  let d: i32 = T.U.M;
}

When looking up the name N, if C is an interface I and N is the name of an associated constant in that interface, the result is a symbolic value representing "the member N of I". If C is formed by combining interfaces with &, all such results are required to find the same associated constant, otherwise we reject for ambiguity.

If a member of a class or interface is named in a qualified name lookup, the type of the result is determined by performing a substitution. For an interface, Self is substituted for the self type, and any parameters for that class or interface (and enclosing classes or interfaces, if any) are substituted into the declared type.

interface SelfIface {
  fn Get[me: Self]() -> Self;
}
class UsesSelf(T:! type) {
  // Equivalent to `fn Make() -> UsesSelf(T)*;`
  fn Make() -> Self*;
  impl as SelfIface;
}

// ✅ `T = i32` is substituted into the type of `UsesSelf(T).Make`,
// so the type of `UsesSelf(i32).Make` is `fn () -> UsesSelf(i32)*`.
let x: UsesSelf(i32)* = UsesSelf(i32).Make();

// ✅ `Self = UsesSelf(i32)` is substituted into the type
// of `SelfIface.Get`, so the type of `UsesSelf(i32).(SelfIface.Get)`
// is `fn [me: UsesSelf(i32)]() -> UsesSelf(i32)`.
let y: UsesSelf(i32) = x->Get();

None of this is new in this proposal. This proposal adds the rule:

If a facet type C into which lookup is performed includes a where clause saying .N = U, and the result of qualified name lookup is the associated constant N, that result is replaced by U, and the type of the result is the type of U. No substitution is performed in this step, not even a Self substitution -- any necessary substitutions were already performed when forming the facet type C, and we don’t consider either the declared type or value of the associated constant at all for this kind of constraint. Going through an example in detail:

interface A {
  let T:! type;
}
interface B {
  let U:! type;
  // More explicitly, this is of type `A where .(A.T) = Self.(B.U)`
  let V:! A where .T = U;
}
// Type of T is B.
fn F[T:! B](x: T) {
  // The type of the expression `T` is `B`.
  // `T.V` finds `B.V` with type `A where .(A.T) = Self.(B.U)`.
  // We substitute `Self` = `T` giving the type of `u` as
  // `A where .(A.T) = T.(B.U)`.
  let u:! auto = T.V;
  // The type of `u` is `A where .(A.T) = T.(B.U)`.
  // Lookup for `u.T` resolves it to `u.(A.T)`.
  // So the result of the qualified member access is `T.(B.U)`,
  // and the type of `v` is the type of `T.(B.U)`, namely `type`.
  // No substitution is performed in this step.
  let v:! auto = u.T;
}

The more complex case of

fn F2[T:! B where .U = i32](x: T);

is discussed later.

Type substitution

At various points during the type-checking of a Carbon program, we need to substitute a set of (binding, value) pairs into a symbolic value. We saw an example above: substituting Self = T into the type A where .(A.T) = Self.U to produce the value A where .(A.T) = T.U. Another important case is the substitution of inferred parameter values into the type of a function when type-checking a function call:

fn F[T:! C](x: T) -> T;
fn G(n: i32) -> i32 {
  // Deduces T = i32, which is substituted
  // into the type `fn (x: T) -> T` to produce
  // `fn (x: i32) -> i32`, giving `i32` as the type
  // of the call expression.
  return F(n);
}

Qualified name lookup is not re-done as a result of type substitution. For a template, we imagine there’s a completely separate step that happens before type substitution, where qualified name lookup is redone based on the actual value of template arguments; this proceeds as described in the previous section. Otherwise, we performed the qualified name lookup when type-checking the generic, and do not do it again:

interface IfaceHasX {
  let X:! type;
}
class ClassHasX {
  class X {}
}
interface HasAssoc {
  let Assoc:! IfaceHasX;
}

// Qualified name lookup finds `T.(HasAssoc.Assoc).(IfaceHasX.X)`.
fn F(T:! HasAssoc) -> T.Assoc.X;

fn G(T:! HasAssoc where .Assoc = ClassHasX) {
  // `T.Assoc` rewritten to `ClassHasX` by qualified name lookup.
  // Names `ClassHasX.X`.
  var a: T.Assoc.X = {};
  // Substitution produces `ClassHasX.(IfaceHasX.X)`,
  // not `ClassHasX.X`.
  var b: auto = F(T);
}

During substitution, we might find a member access that named an opaque symbolic associated constant in the original value can now be resolved to some specific value. It’s important that we perform this resolution:

interface A {
  let T:! type;
}
class K { fn Member(); }
fn H[U:! A](x: U) -> U.T;
fn J[V:! A where .T = K](y: V) {
  // We need the interface of `H(y)` to include
  // `K.Member` in order for this lookup to succeed.
  H(y).Member();
}

The values being substituted into the symbolic value are themselves already fully substituted and resolved, and in particular, satisfy property 1 given above.

Substitution proceeds by recursively rebuilding each symbolic value, bottom-up, replacing each substituted binding with its value. Doing this naively will propagate values like i32 in the F/G case earlier in this section, but will not propagate rewrite constants like in the H/J case. The reason is that the .T = K constraint is in the type of the substituted value, rather than in the substituted value itself: deducing T = i32 and converting i32 to the type C of T preserves the value i32, but deducing U = V and converting V to the type A of U discards the rewrite constraint.

In order to apply rewrites during substitution, we associate a set of rewrites with each value produced by the recursive rebuilding procedure. This is somewhat like having substitution track a refined facet type for the type of each value, but we don’t need -- or want -- for the type to change during this process, only for the rewrites to be properly applied. For a substituted binding, this set of rewrites is the rewrites found on the type of the corresponding value prior to conversion to the type of the binding. When rebuilding a member access expression, if we have a rewrite corresponding to the accessed member, then the resulting value is the target of the rewrite, and its set of rewrites is that found in the type of the target of the rewrite, if any. Because the target of the rewrite is fully resolved already, we can ask for its type without triggering additional work. In other cases, the rewrite set is empty; all necessary rewrites were performed when type-checking the value we're substituting into.

Continuing an example from qualified name lookup:

interface A {
  let T:! type;
}
interface B {
  let U:! type;
  let V:! A where .T = U;
}

// Type of the expression `T` is `B where .(B.U) = i32`
fn F2[T:! B where .U = i32](x: T) {
  // The type of the expression `T` is `B where .U = i32`.
  // `T.V` is looked up and finds the associated type `(B.V)`.
  // The declared type is `A where .(A.T) = Self.U`.
  // We substitute `Self = T` with rewrite `.U = i32`.
  // The resulting type is `A where .(A.T) = i32`.
  // So `u` is `T.V` with type `A where .(A.T) = i32`.
  let u:! auto = T.V;
  // The type of `u` is `A where .(A.T) = i32`.
  // Lookup for `u.T` resolves it to `u.(A.T)`.
  // So the result of the qualified member access is `i32`,
  // and the type of `v` is the type of `i32`, namely `type`.
  // No substitution is performed in this step.
  let v:! auto = u.T;
}

Examples

interface Container {
  let Element:! type;
}
interface SliceableContainer {
  extends Container;
  let Slice:! Container where .Element = Self.(Container.Element);
}
// ❌ Qualified name lookup rewrites this facet type to
// `SliceableContainer where .(Container.Element) = .Self.(Container.Element)`.
// Constraint resolution rejects this because this rewrite forms a cycle.
fn Bad[T:! SliceableContainer where .Element = .Slice.Element](x: T.Element) {}
interface Helper {
  let D:! type;
}
interface Example {
  let B:! type;
  let C:! Helper where .D = B;
}
// ✅ `where .D = ...` by itself is fine.
// `T.C.D` is rewritten to `T.B`.
fn Allowed(T:! Example, x: T.C.D);
// ❌ But combined with another rewrite, creates an infinite loop.
// `.C.D` is rewritten to `.B`, resulting in `where .B = .B`,
// which causes an error during constraint resolution.
// Using `==` instead of `=` would make this constraint redundant,
// rather than it being an error.
fn Error(T:! Example where .B = .C.D, x: T.C.D);
interface Allowed;
interface AllowedBase {
  let A:! Allowed;
}
interface Allowed {
  extends AllowedBase where .A = .Self;
}
// ✅ The final type of `x` is `T`. It is computed as follows:
// In `((T.A).A).A`, the inner `T.A` is rewritten to `T`,
// resulting in `((T).A).A`, which is then rewritten to
// `(T).A`, which is then rewritten to `T`.
fn F(T:! Allowed, x: ((T.A).A).A);
interface MoveYsRight;
constraint ForwardDeclaredConstraint(X:! MoveYsRight);
interface MoveYsRight {
  let X:! MoveYsRight;
  // Means `Y:! MoveYsRight where .X = X.Y`
  let Y:! ForwardDeclaredConstraint(X);
}
constraint ForwardDeclaredConstraint(X:! MoveYsRight) {
  extends MoveYsRight where .X = X.Y;
}
// ✅ The final type of `x` is `T.X.Y.Y`. It is computed as follows:
// The type of `T` is `MoveYsRight`.
// The type of `T.Y` is determined as follows:
// -   Qualified name lookup finds `MoveYsRight.Y`.
// -   The declared type is `ForwardDeclaredConstraint(Self.X)`.
// -   That is a named constraint, for which we perform substitution.
//     Substituting `X = Self.X` gives the type
//     `MoveYsRight where .X = Self.X.Y`.
// -   Substituting `Self = T` gives the type
//     `MoveYsRight where .X = T.X.Y`.
// The type of `T.Y.Y` is determined as follows:
// -   Qualified name lookup finds `MoveYsRight.Y`.
// -   The declared type is `ForwardDeclaredConstraint(Self.X)`.
// -   That is a named constraint, for which we perform substitution.
//     Substituting `X = Self.X` gives the type
//     `MoveYsRight where .X = Self.X.Y`.
// -   Substituting `Self = T.Y` with
//     rewrite `.X = T.X.Y` gives the type
//     `MoveYsRight where .X = T.Y.X.Y`, but
//     `T.Y.X` is replaced by `T.X.Y`, giving
//     `MoveYsRight where .X = T.X.Y.Y`.
// The type of `T.Y.Y.X` is determined as follows:
// -   Qualified name lookup finds `MoveYsRight.X`.
// -   The type of `T.Y.Y` says to rewrite that to `T.X.Y.Y`.
// -   The result is `T.X.Y.Y`, of type `MoveYsRight`.
fn F4(T:! MoveYsRight, x: T.Y.Y.X);

Termination

Each of the above steps performs at most one rewrite, and doesn't introduce any new recursive type-checking steps, so should not introduce any new forms of non-termination. Rewrite constraints thereby give us a deterministic, terminating type canonicalization mechanism for associated constants: in A.B, if the type of A specifies that .B = C, then A.B is replaced by C. Equality of types constrained in this way is transitive.

However, some existing forms of non-termination may remain, such as template instantiation triggering another template instantiation. Such cases will need to be detected and handled in some way, such as by a depth limit, but doing so doesn't compromise the soundness of the type system.

Same type constraints

A same-type constraint describes that two type expressions are known to evaluate to the same value. Unlike a rewrite constraint, however, the two type expressions are treated as distinct types when type-checking a generic that refers to them.

Same-type constraints are brought into scope, looked up, and resolved exactly as if there were a SameAs(U:! type) interface and a T == U impl corresponded to T is SameAs(U), except that == is commutative. As such, it's not possible to ask for a list of types that are the same as a given type, nor to ask whether there exists a type that is the same as a given type and has some property. But it is possible to ask whether two types are (non-transitively) the same.

In order for same-type constraints to be useful, they must allow the two types to be treated as actually being the same in some context. This can be accomplished by the use of == constraints in an impl, such as in the built-in implementation of ImplicitAs:

final impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U) {
  fn Convert[me: Self](other: U) -> U { ... }
}

It superficially seems like it would be convenient if such implementations were made available implicitly – for example, by writing impl forall [T:! type] T as ImplicitAs(T) – but in more complex examples that turns out to be problematic. Consider:

interface CommonTypeWith(U:! type) {
  let Result:! type;
}
final impl forall [T:! type] T as CommonTypeWith(T) where .Result = T {}

fn F[T:! Potato, U:! Hashable where .Self == T](x: T, y: U) -> auto {
  // What is T.CommonTypeWith(U).Result? Is it T or U?
  return (if cond then x else y).Hash();
}

With this proposal, impl validation for T as CommonTypeWith(U) fails: we cannot pick a common type when given two distinct type expressions, even if we know they evaluate to the same type, because we would not know which API the result should have.

Implementation of same-type ImplicitAs

It is possible to implement the above impl of ImplicitAs directly in Carbon, without a compiler builtin, by taking advantage of the built-in conversion between C where .A = X and C where .A == X:

interface EqualConverter {
  let T:! type;
  fn Convert(t: T) -> Self;
}
fn EqualConvert[T:! type](t: T, U:! EqualConverter where .T = T) -> U {
  return U.Convert(t);
}
impl forall [U:! type] U as EqualConverter where .T = U {
  fn Convert(u: U) -> U { return u; }
}

impl forall [T:! type, U:! type where .Self == T] T as ImplicitAs(U) {
  fn Convert[self: Self]() -> U { return EqualConvert(self, U); }
}

The transition from (T as ImplicitAs(U)).Convert, where we know that U == T, to EqualConverter.Convert, where we know that .T = U, allows a same-type constraint to be used to perform a rewrite.

This implementation is in use in Carbon Explorer.

Observe declarations

Same-type constraints are non-transitive, just like other constraints such as ImplicitAs. The developer can use an observe declaration to bring a new same-type constraint into scope:

observe A == B == C;

notionally does much the same thing as

impl A as SameAs(C) { ... }

where the impl makes use of A is SameAs(B) and B is SameAs(C).

Implementing an interface

When implementing an interface, the = syntax must be used, and each associated constant without a default must be explicitly assigned-to:

impl IntVec as Container where .Element = i32 { ... }

not

impl IntVec as Container where .Element == i32 { ... }

The latter would be treated as

impl IntVec as Container where IntVec.Element is SameAs(i32) { ... }

... which only checks the value of Element rather than specifying it.

As in other contexts, Self.Element is rewritten to i32 within the context of the impl.

Ergonomics

This proposal can be viewed as dividing type constraints into two categories:

  • Ergonomic, but quite restricted, constraints using =.
  • Non-ergonomic, but fully general, constraints using ==.

In order for this to be an effective and ergonomic strategy, we require both that the difference between these two kinds of constraints are readily understood by developers, and that the more-ergonomic = constraints cover sufficiently many scenarios that the developer seldom needs to write code to request a conversion between types that are known to be the same.

Ideally, we hope that = constraints will cover all real situations, and == constraints will never need to be written, outside of the one ImplicitAs implementation described above. If this turns out to be true in practice, we should consider removing == syntax from the language. However, this will not remove == semantics, both because of the behavior of ImplicitAs and because impl T as C where .A = B constraints in an interface or named constraint give rise to == semantics.

Implementation experience

This proposal has been mostly implemented in Carbon Explorer, and appears to behave as expected. However, Explorer does not yet support forward declarations of interfaces and constraints, which means that some of the more interesting cases can't be tested.

Rationale

  • Language tools and ecosystem
    • This rule is easy to implement, and even a naive implementation should be efficient.
  • Software and language evolution
    • The status quo causes coincidentally equal types to have the same interface. Under this proposal, the interface of a type is not affected by coincidental equality, only by intentional assignment to an associated type.
  • Code that is easy to read, understand, and write
    • Using = rather than == in impls is easier to read and write.
    • The interface available on a type is more predictable in this proposal than in the status quo ante, making code easier to understand.

Alternatives considered

Status quo

The Problem section describes some concerns with the status quo approach. This proposal aims to address those concerns.

Restrict constraints to allow computable type equality

We could somehow restrict which constraints can be specified, in order to ensure that our type equality rule is efficiently computable, perhaps even in a way that permits a deterministic computation of a canonical type. The exact details of how we would do this have not been worked out, but this might lead to a coherent rule that has transitive type equality. However, this would not solve the "equal types with different interfaces" problem.

Find a fully transitive approach to type equality

This proposal makes type equality transitive, but still has non-transitive "same-type" constraints describing types that are known to always eventual be the same after substitution, but which are not treated as the same type value while type-checking a generic. This proposal also imposes a directionality on rewrites, a restriction to only rewrite at a depth of exactly one associated constant, and a restriction that only one value can be specified as equal to a given associated constant. We could seek a type equality rule that would avoid having two different kinds of equality and would avoid some or all of the restrictions placed on rewrite constraints.

One candidate approach is to accept that the full generality of the equality problem is hard, but attempt to solve it anyway. Because of the close correspondence between Turing machines and string rewriting systems, this results in a type system with no upper bound on its runtime. We consider this unacceptable for Carbon, but it is the approach taken by some other modern languages:

  • Swift has an undecidable type system due to this, and handles the problem by placing a limit on the complexity of cases they will accept, but that seems unsuited to Carbon's goals, as the rule cannot be readily understood by a developer nor effectively worked around.
  • Rust also has an undecidable type system (content warning: contains many instances of an obscene word as part of a programming language name) for the same reason, and similarly has a recursion limit.

See also Typing is Hard, which lists similar information for a variety of other languages. Note that most languages listed have an undecidable type system.

Another candidate approach is to find a way to reduce the inputs to the type-checking step as a set of non-quantified equalities. The resulting system of equalities can then be solved efficiently to determine whether two types are known to be the same. This approach appears sufficient to cope with locally declared constraints, but when constraints appear within interfaces, they can give rise to an infinite family of equalities, and picking any finite subset risks the result being non-transitive in general, if a different subset of equalities might be considered in a different type-checking context.

Different syntax for rewrite constraint

We could use a different syntax, such as ~>, for rewrite constraints. This would have the advantage of not using assignment for an operation that's not always doing something that developers will think of as assignment. It also avoids hard-to read code in cases where a binding has an initializer:

// This proposal.
let T:! Add where .Result = i32 = i32;
// Alternative syntax.
let T:! Add where .Result ~> i32 = i32;

However, it seems valuable to use only a single syntax for specifying these constraints, and equality is the appropriate mental model most of the time. Moreover, in an impl declaration, we really are assigning to the associated constant in the sense of setting a value, rather than merely constraining a value.

This decision should be revisited if a superior syntax that avoids the above visual ambiguity is suggested.

Different syntax for same-type constraint

We considered various options for the spelling of a same-type constraint:

  • where A == B
  • where A ~ B
  • some other operator symbol
  • where A is SameType(B), or some other constraint name

The advantage of == is that it has a lot of desirable implications: it's familiar, symmetric, and connotes the left and right operand being the same. However, it also might be taken as suggesting that the == overloaded operator is used to determine equality. A different spelling would also indicate that this is an unusual constraint, and would be easier to search for. Concerns were also raised that == might suggest transitivity if taken to mean "same value" rather than something like the Carbon == binary operator, and the transitivity of == constraints is not automatic.

The ~ operator is currently not in use as a binary operator. However, that makes it quite valuable syntactic real estate, and it may see little use in this context in practice. A longer operator could be used, but any choice other than == will present an unfamiliarity barrier, and longer operators will be harder to remember.

A named constraint is appealing, but this operator does not behave like a named constraint in that it is automatically symmetrized, not implementable, and especially in its interactions with observe declarations. It is unclear how observe declarations would be written with a named SameType constraint:

// Maybe something like this special-case and verbose syntax?
observe A is SameType(B) and B is SameType(C) => A is SameType(C);

The fundamental connection between same-type constraints and observe suggests that dedicated syntax is justified.

For now, we use ==, despite it having a different meaning in the context of a constraint than in expression contexts. This is analogous to how = and and have different meanings in this context from in expressions, so the behavior divergence is at least consistent. It's also not clear at this stage how much usage same-type constraints will have, compared to rewrite constraints, so it's hard to judge the other arguments in favor of and against other operators. It would be reasonable to revisit this decision when we have more usage information.

Required ordering for rewrites

We considered a rule where a rewrite would need to be specified in a constraint before the associated constant is used. For example:

interface C {
  let T:! type;
  let U:! type;
}
// Could reject, `.U` referenced before rewrite is defined.
fn G[A:! C where .T = .U and .U = i32]() {}

This would remove the need to perform constraint resolution. Instead, we could apply rewrites as we form the constraint, and constraints formed in this way would always satisfy the property that they don't refer to associated constants that have a rewrite.

It may also be less surprising to use this rule. Because Carbon uses top-down processing for type-checking in general, it may be unexpected that a rewrite rule would rewrite an earlier occurrence of its rewrite target, even though this happens in practice at a later point in time, during constraint resolution.

However, such an approach would not give a satisfactory outcome for cases such as:

constraint X {
  extends C where .T = .U;
}
constraint Y {
  extends C where .U = i32;
}
// Desired behavior is that `.T` and `.U` both rewrite to `i32`.
fn G[A:! X & Y]() -> A.T { return 0; }

Performing earlier-appearing rewrites in later constraints and additionally performing a constraint resolution step to apply rewrites throughout the constraint seems to give better outcomes for the examples we considered.

Multi-constraint where clauses

Instead of treating A where B and C as (A where B) where C, we could model it as (A where B) & (A where C).

Specifically, given

interface C {
  let T:! type;
  let U:! type;
}

this would treat C where .T = A and .U = B as (C where .T = A) & (C where .U = B), where the type of .Self would be C in both clauses, with both constraints applied "simultaneously" to C. This would mean that the order of clauses doesn’t matter.

However, if we do this, then neither right-hand side is rewritten by name lookup, and thus we would reject cases such as C where .T = i32 and .U = .T.(Add.Result). This case seems reasonable, and we would prefer to accept it.

Therefore, the rule we select is that, as soon as a rewrite is declared, it is in scope, and later references to that associated constant refer to the rewritten value with its rewritten type. We still reject cases like C where .U = .T.(Add.Result) and .T = i32, because we do not know that T is Add from the declared type of C.T.

The same rule applies to is constraints: we accept C where .T is Add and .U = .T.(Add.Result), but reject C where .U = .T.(Add.Result) and .T is Add.

Rewrite constraints in impl as constraints

A rewrite constraint can appear indirectly in an impl as constraint in an interface or named constraint:

interface A {
  let T:! type;
}
constraint AInt {
  extends A where .T = i32;
}
interface B {
  // Or, `impl as A where .T = i32;`
  impl as AInt;
}

When this happens, the rewrite constraint does not result in rewrites being performed when the associated constant is mentioned in a member access into that interface or named constraint:

// Return type is not rewritten to `i32`,
// but there is still a same-type constraint.
fn F(T:! B) -> T.(A.T) {
  // OK, `i32` implicitly converts to `T.(A.T)`.
  return 0 as i32;
}

This may be surprising: B says that its .(A.T) has a rewrite to i32, but that rewrite is not performed. In contrast, rewrites in an extends constraint do become part of the enclosing interface or named constraint.

interface C {
  extends AInt;
}
// Return type is rewritten to `i32`.
fn G(T:! C) -> T.T;

Something similar happens with impl T as constraints. The following two interfaces A1 and A2 are not equivalent:

interface A1 {
  let X:! Add where .Result = i32;
}
constraint AddToi32 {
  extends Add where .Result = i32;
}
interface A2 {
  let X:! Add;
  impl X as AddToi32;
}

In the former case, references to A.X.Result are rewritten to i32, and in the latter case, they are not, because the rewrite is not part of the type of X.

The general principle here is that rewrites are part of the declared type of a name, and can't be changed after the fact by another declaration. For the impl T as case, this behavior is also a necessary part of the termination argument. If rewrites from impl T as constraints were used, it would be possible to form a rewrite cycle:

interface C { let X:! type; }
interface A {
  let U:! C;
  let T:! C;
}
interface B1 {
  extends A;
  impl T as C where .X = U.X;
}
interface B2 {
  extends A;
  impl U as C where .X = T.X;
}
// `V.T.X` ~> `V.U.X` ~> `V.T.X` ~> ...
fn F(V:! B1 & B2) -> V.T.X;

We could allow the specific case of impl as (not impl T as) to introduce rewrites. The advantage of this change is that this behavior may be less surprising in some cases. However, this would mean that an impl as sometimes changes the interface of the enclosing constraint, and behaves inconsistently from an impl T as constraint, so this proposal does not adopt that behavior.

We could allow impl T as to introduce rewrites for T in general, but we would need to find some way to fix the resulting holes in the termination argument.

To minimize the scope of confusion, we currently disallow rewrites from appearing syntactically within an impl as or impl T as constraint. We could allow these constructs. However, a .T = V constraint would be equivalent to a .T == V constraint, and the latter more clearly expresses the resulting semantics, so disallowing the misleading form seems preferable.