Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature Request: "extends oneof" generic constraint; allows for narrowing type parameters #27808

Open
4 tasks done
Nathan-Fenner opened this issue Oct 11, 2018 · 84 comments
Open
4 tasks done
Labels
In Discussion Not yet reached consensus Suggestion An idea for TypeScript

Comments

@Nathan-Fenner
Copy link
Contributor

Nathan-Fenner commented Oct 11, 2018

Search Terms

  • generic bounds
  • narrow generics
  • extends oneof

Suggestion

Add a new kind of generic type bound, similar to T extends C but of the form T extends oneof(A, B, C).

(Please bikeshed the semantics, not the syntax. I know this version is not great to write, but it is backwards compatible.)

Similar to T extends C, when the type parameter is determined (either explicitly or through inference), the compiler would check that the constraint holds. T extends oneof(A, B, C) means that at least one of T extends A, T extends B, T extends C holds. So, for example, in a function

function smallest<T extends oneof(string, number)>(x: T[]): T {
    if (x.length == 0) {
        throw new Error('empty');
    }
    return x.slice(0).sort()[0];
}

Just like today, these would be legal:

smallest<number>([1, 2, 3);        // legal
smallest<string>(["a", "b", "c"]); // legal

smallest([1, 2, 3]);               // legal
smallest(["a", "b", "c"]);         // legal

But (unlike using extends) the following would be illegal:

smallest<string | number>(["a", "b", "c"]); // illegal
// string|number does not extend string
// string|number does not extend number
// Therefore, string|number is not "in" string|number, so the call fails (at compile time).

// Similarly, these are illegal:
smallest<string | number>([1, 2, 3]);       // illegal
smallest([1, "a", 3]);                      // illegal

Use Cases / Examples

What this would open up is the ability to narrow generic parameters by putting type guards on values inside functions:

function smallestString(xs: string[]): string {
    ... // e.g. a natural-sort smallest string function
}
function smallestNumber(x: number[]): number {
    ... // e.g. a sort that compares numbers correctly instead of lexicographically
}

function smallest<T extends oneof(string, number)>(x: T[]): T {
    if (x.length == 0) {
        throw new Error('empty');
    }
    const first = x[0]; // first has type "T"
    if (typeof first == "string") {
        // it is either the case that T extends string or that T extends number.
        // typeof (anything extending number) is not "string", so we know at this point that
        // T extends string only.
        return smallestString(x); // legal
    }
    // at this point, we know that if T extended string, it would have exited the first if.
    // therefore, we can safely call
    return smallestNumber(x);
}

This can't be safely done using extends, since looking at one item (even if there's only one item) can't tell you anything about T; only about that object's dynamic type.

Unresolved: Syntax

The actual syntax isn't really important to me; I just would like to be able to get narrowing of generic types in a principled way.

(EDIT:)
Note: despite the initial appearance, oneof(...) is not a type operator. The abstract syntax parse would be more like T extends_oneof(A, B, C); the oneof and the extends are not separate.

Checklist

My suggestion meets these guidelines:

  • This wouldn't be a breaking change in existing TypeScript / JavaScript code
  • This wouldn't change the runtime behavior of existing JavaScript code
  • This could be implemented without emitting different JS based on the types of the expressions
  • This isn't a runtime feature (e.g. new expression-level syntax)

(any solution will reserve new syntax, so it's not a breaking change, and it only affects flow / type narrowing so no runtime component is needed)

@mattmccutchen
Copy link
Contributor

The use case sounds like a duplicate of #24085, though it seems you've thought through more of the consequences. Let's close this in favor of #24085.

@DanielRosenwasser
Copy link
Member

Sounds like you want an "exclusive-or" type operator - similar to #14094.

@DanielRosenwasser DanielRosenwasser added Suggestion An idea for TypeScript In Discussion Not yet reached consensus labels Oct 11, 2018
@ghost
Copy link

ghost commented Oct 11, 2018

I don't think that's it since string and number are already exclusive so a string xor number type wouldn't be distinguishable from string | number. It's more like they want two overloads:

function smallest<T extends string>(x: T[]): T;
function smallest<T extends number>(x: T[]): T;

But not string | number because smallest([1, "2"]) is likely to be an error.

@Nathan-Fenner
Copy link
Contributor Author

It is close to a combination of #24085 and #25879 but in a relatively simple way that ensures soundness is preserved.

It only affects generic instantiation and narrowing inside generic functions. No new types or ways of types are being created; an xor operator doesn't do anything to achieve this.

@jack-williams
Copy link
Collaborator

jack-williams commented Oct 12, 2018

T extends oneof(A, B, C) means that at least one of T extends A, T extends B, T extends C holds.

That is what a union constraint does. Do you not mean exactly one of?

@mattmccutchen
Copy link
Contributor

That is what a union constraint does.

No because if T = A | B | C then none of T extends A, T extends B, T extends C holds. (Were you reading extends backwards?)

@jack-williams
Copy link
Collaborator

True! No I wasn't, but I had parsed it in my head like (T extends A) | (T extends B) | (T extends C).

@michaeljota
Copy link

What about a XOR type operator? This would be useful in other scenarios as well. As | is a valid bitwise operator in JS for OR, it would fit using ^ as XOR operator in types.

function smallest<T extends string ^ number>(x: T[]): T {
    if (x.length == 0) {
        throw new Error('empty');
    }
    return x.slice(0).sort()[0];
}

@Nathan-Fenner
Copy link
Contributor Author

@michaeljota See previous comments for why that doesn't work. There is no type today that could be put inside the extends to get the desired behavior. Therefore, new type operations cannot solve this problem, since they just allow you (potentially conveniently) write new types. The type number ^ string is exactly the same as string | number, since there is no overlap between string and number. It does nothing to solve this problem.

This has to be solved by a different kind of constraint rather than a different type in the extends clause.

@michaeljota
Copy link

You want one of a giving list of types, right? I really don't know much about types, sorry. Still, OR operator can be satisfied with n types of the interception, and a XOR operator should only allow one of the types in the interception.

I understand that if you create a interception, string | number, you can use only the functions of both types, and you should narrow down to one of those to use those functions. Also, if you declare a generic type as T extends string | number, then T would accept string, number or string | number. Then a XOR operator used in a generic type as T extends string ^ number should only accept strictly one of string or number, excluding the interception itself string | number.

That's not what you would like this to do?

@Nathan-Fenner
Copy link
Contributor Author

The problem with that approach is that the ^ is incoherent anywhere outside of an "extends" clause, so it becomes (pointless) syntactic sugar.

For example, what should

var x: string ^ number = 4;

do? Are you allowed to reassign it to "hello"? Or, in more-complicated examples, like:

var x: Array<string ^ number> = [2, 5, "hello"];

why should/shouldn't this be allowed?

There's no coherent way that this can be done. The only way to give it meaning is to only allow it in extends clauses, where it has the meaning of

type Foo<T extends A ^ B> = {}
// means the same thing as
type Foo<T extends_oneof(A, B)> = {}

It still represents a new type of constraint, not a type operator (because it doesn't produce coherent types).

@michaeljota
Copy link

For example, what should

var x: string ^ number = 4;

do? Are you allowed to reassign it to "hello"?

I think it should. In this case, I guess this would behave the same as string | number.

Or, in more-complicated examples, like:

var x: Array<string ^ number> = [2, 5, "hello"];

why should/shouldn't this be allowed?

Here it would be useful, as you either want an array of strings, or an array of numbers, but not both. I think, this would be something like:

var x: Array<string ^ number> = [2, 5, 'hello']
                                ~~~~~~~~~~~~~~~ // Error: Argument of type '(string | number)[]' is not assignable to parameter of type 'string[] | number[]'.

// So (string ^ number)[], would be string[] | number[] 

Playground link

@jcalz
Copy link
Contributor

jcalz commented Jun 6, 2019

This would make #30769 less of a bitter pill to swallow...

@jcalz
Copy link
Contributor

jcalz commented Jun 8, 2019

And would solve #13995, right?

@Nathan-Fenner
Copy link
Contributor Author

@jcalz It "solves" it in the sense that (with other compiler work) the following code would compile like you'd expect:

declare function takeA(val: 'A'): void;
export function bounceAndTakeIfA<AB extends_oneof('A', 'B')>(value: AB): AB {
    if (value === 'A') {
        // we now know statically that AB extends 'A'
        takeA(value);
        return value;
    }
    else {
        // we now know statically that AB does not extend 'A', so it must extend 'B'
        return value;
    }
}

It should be noted that besides the actual implementation of extends_oneof as a new form of constraint, the way that constraints are calculated/propagated for generic types needs to change. Currently, TS never adds/changes/refines generic type variable constraints (because as the conversation above shows, you can never really be sure about the value of the generic type, only of particular values of that type).

This feature makes it possible to soundly refine the constraints for type parameters, but actually doing that refinement would be a bit more work.

@emilioplatzer
Copy link

Hi @angryzor, I'm trying to follow you and I have a question about your third post. But I'm not sure that this issue (#27808) is the place. I think this issue is a feature request. And what you are doing is trying to collaborate with one solution for this.

@angryzor
Copy link
Contributor

angryzor commented May 1, 2023

Hi @angryzor, I'm trying to follow you and I have a question about your third post. But I'm not sure that this issue (#27808) is the place. I think this issue is a feature request. And what you are doing is trying to collaborate with one solution for this.

Hi @emilioplatzer , I don't entirely understand your message.

Should I not be posting these analyses here? I have never tried to contribute to TypeScript before, if I am misusing this issue please let me know! Just wanted to contribute by refining the semantics of the oneof proposal. Though indeed I am only looking at it from the perspective of 1 potential solution.

If instead what you are saying is that you would like to ask me a question not entirely related to this issue, you can find me on the TypeScript Community Discord under the same username and I would be glad to answer any questions you have. If that doesn't work for you, let me know, we can probably find an alternative.

@angryzor
Copy link
Contributor

angryzor commented May 2, 2023

oneof as a type operator

Sometimes you spend days on a complex solution for a problem, something external forces you to take a break and then you get a shower thought that makes you want to go back to the beginning and challenge your initial assumptions. For me this happened today, and the initial assumption that suddenly didn't seem so certain anymore is the assumption that oneof is a constraint and couldn't possibly be a type operator.

This moment came about as I was stuck in traffic and musing the potential implementation of a function like map in e.g. lodash and how it would be possible to implement this function in the presence of the equals oneof constraint. Of course you could implement it like this:

function map<T, U>(f: (t: T) => U, xs: readonly T[]) {
    for (const x of xs) {
        f(x)
    }
}

But what if instead you wanted to do this:

function map<T, U>(f: (t: T) => U, xs: readonly T[]) {
    for (let i = 0; i < xs.length; i++) {
        f(xs[i])
    }
}

At this point xs[i] is not a oneof type parameter, and this would not typecheck. So how could you make it typecheck. Currently the only 2 ways to call a function with a oneof type parameter is either calling it with a unary type (which you can't do here) or calling it with a oneof type parameter, which we don't have. You could do something pretty dirty, maybe this:

function map<T, U>(f: (t: T) => U, xs: readonly T[]) {
    for (let i = 0; i < xs.length; i++) {
        (f as any)(xs[i])
    }
}

But this is deeply unsatisfactory. My next thought was "if oneof was a type operator then you could just do this:

function map<T, U>(f: (t: T) => U, xs: readonly T[]) {
    for (let i = 0; i < xs.length; i++) {
        f(xs[i] as oneof T)
    }
}

I had already assumed oneof could only be a constraint modifier, but this assumption only came from preconceptions and a lot of naysaying in issue threads. However, I spent 3 days now doing research into the semantics of oneof as a constraint. It would only be fair to do the same for oneof as a type operator, instead of dismissing it purely out of preconceptions.

So with that, let's look at the semantics of oneof as a type operator. I will ignore the switch return problem, as to solve this issue we do still need the equals constraint modifier (so the work I have done up till now would in any case not be in vain).

TypeScript unions and their inverse operation

Are TypeScript unions truly unions? They're supposed to be unions, but not really... If the union operator produced a true union then the resulting type would be opaque. But it's not. We can for instance use distributive conditional types to iterate over the elements of a union. If that is the case then that means TS unions are more like "a set of types". That's also why the oneof construct even makes sense as a concept. If the union operator just produces a set of types, and this underlying structure is not destroyed, then it can also be undone. In other words there must be an inverse of the union operator. In fact this operator is already being used under the hood, as "distributive conditional types with a naked type parameter".

What if this inverse operator can be made explicit, and its name is oneof. Take a type oneof X to signify "any of the constituents of the union X". Let's have a look at what kind of laws and behavior would spring from this definition.

Of course we'd have a trivial subtyping relation (I'm reusing the syntax from my oneof-as-a-constraint comments here):

Ô <: oneof O

I.e. each of the constituents of the union O would be subtypes of oneof O. However notably any subsets that are themselves unions wouldn't be, as oneof would completely deconstruct the union. This immediately gives us rule 1 from our constraint-based analysis. All subtypes of oneof O are unary types.

The following should probably also hold:

oneof O <: O

If you have any constituent of O you can also assign it to a variable of type O.

Existential type?

From that first subtyping relationship I feel like maybe oneof O is an existential type: ∃ P ∈ O . P.
When you deal with existential types, your types can have multiple valid instantiations. What kind of instantiations could we have for these types? Given the oneof operator is a compiler intrinsic with a fixed dimension it should be able to figure out potential instantiations.

What would be potential instantiations of oneof (string | number)? Well, simply string and number. Simple enough. What about (oneof (string | number))[]? The only 2 options would be string[] and number[]. Huh. This sounds familiar. In fact this sounds very familiar! This is exactly the behavior that we expected from rule 2 of our oneof constraint in previous brainstorms. It might be that we were describing an existential type all along. Let's try to plug this into one of our experiments.

type A = { obj: { bar: number }, key: 'bar' }
type B = { obj: { baz: number }, key: 'baz' }

function fn<T extends oneof A | B>(o: T) {
    return [o.obj, o.key]
}

So now we return a type [T['obj'], T['key']] where T is a type parameter with base constraint oneof A | B. If we try to instantiate to the maximum constraint like we did in the constraint-based analysis we get multiple possible instantiations: [{ bar: number }, 'bar'] and [{ baz: number }, 'baz']. Just like in the constraint-based example.

What about the original mapping problem example?

const foo = [
    { obj: { bar: 5 }, key: 'bar' },
    { obj: { baz: 7 }, key: 'baz' },
] as const

declare function fn<K extends string, O extends Record<K, unknown>>(o: O, key: K): void

for (const f of foo) {
    fn(f.obj, f.key)
}

NOTE: The following paragraphs have been proven false, but I'm leaving them here for context.

At first we may think that f doesn't need to be a type parameter now. But if we try this out so that f just gets the type oneof (typeof foo) from the for ... of construct we bump into the problem from before: K gets instantiated with oneof (typeof foo)['obj'] and O gets instantiated with oneof (typeof foo)['key'], losing the relationship between both arguments. So it seems we still need f's type to be a type parameter to avoid this from happening.

Every time we try to use this type without the accompanying generic type parameter we lose what makes it useful for us:

type A = { obj: { bar: number }, key: 'bar' }
type B = { obj: { baz: number }, key: 'baz' }

function fn(o: oneof A | B) {
    return [o.obj, o.key] // Output type is now [(oneof A | B)['obj'], (oneof A | B)['key']], which will just end up as a cartesian product.
}

So it seems that to solve the mapping problem we really do need generic type parameters, no matter if we are using the constraint approach or introduce oneof as an existential type.

We can do the cast from that map implementation now though:

function map<T, U>(f: (t: oneof T) => U, xs: readonly T[]) {
    for (let i = 0; i < xs.length; i++) {
        f(xs[i] as oneof T)
    }
}

In fact, you may not need to. The index could also just return oneof T. oneof just becomes a type that says “one of the union’s constituents. you don’t know which, but the module passing you this value may. and in some cases that’s the compiler”.

And the idea that the type distribution comes from the compiler trying to solve an existential type instead of some special constraint is alluring.

@angryzor
Copy link
Contributor

angryzor commented May 4, 2023

Subtyping table with oneof as an existential type

Let's look at some subtyping rules next. I'll be ignoring the equals relation for now since I now consider this a separate issue. I'll use the quantification syntax this time as the existential type can literally be written using that same syntax and it will be very obvious if they are hiding in these typing rules.

Covariant subtyping

The typing table for the constraint approach:

// U and X are any type (unary or n-ary) and not `oneof`
type FooEA<T extends U> = // ...
type FooEO<T extends oneof U> = // ...

type BarEA<W extends X> = FooEA<W> // W <: T ⇔                     X <: U
type BarEA<W extends X> = FooEO<W> // W <: T ⇔ |X| = 1 ∧ ∃ V ∈ U . X <: V

type BarEO<W extends oneof X> = FooEA<W> // W <: T ⇔ ∀ Y ∈ X .           Y <: U
type BarEO<W extends oneof X> = FooEO<W> // W <: T ⇔ ∀ Y ∈ X . ∃ V ∈ U . Y <: V

The typing table for the existential type approach:

// U and X are any type (unary or n-ary) and not `oneof`
type FooEA<T extends U> = // ...
type FooEO<T extends oneof U> = // ...

type BarEA<W extends X> = FooEA<W> // W <: T ⇔ X <: U
type BarEA<W extends X> = FooEO<W> // W <: T ⇔ X <: U

type BarEO<W extends oneof X> = FooEA<W> // W <: T ⇔ X <: U
type BarEO<W extends oneof X> = FooEO<W> // W <: T ⇔ X <: U

Contravariant subtyping

The typing table for the constraint approach:

let fea: <T extends U>(t: T) => void
let feo: <T extends oneof U>(t: T) => void
let gea: <W extends X>(t: W) => void
let geo: <W extends oneof X>(t: W) => void

fea = gea // S <: R ⇔                     U <: X
feo = gea // S <: R ⇔ ∀ V ∈ U .           V <: X
fea = geo // S <: R ⇔ |U| = 1 ∧ ∃ Y ∈ X . U <: Y
feo = geo // S <: R ⇔ ∀ V ∈ U . ∃ Y ∈ X . V <: Y

The typing table for the existential type approach:

let fea: <T extends U>(t: T) => void
let feo: <T extends oneof U>(t: T) => void
let gea: <W extends X>(t: W) => void
let geo: <W extends oneof X>(t: W) => void

fea = gea // S <: R ⇔ U <: X
feo = gea // S <: R ⇔ U <: X
fea = geo // S <: R ⇔ U <: X
feo = geo // S <: R ⇔ U <: X

Conclusion

Well, what do you know. So much cleaner. The existential type perfectly captures the subtype relations. And you can even see the existential types sitting in those original subtype relations plain as day. (The universal quantifiers are just existential quantifiers on the left hand side of the subtyping relation.)

This seems like an elegant way of implementing the oneof proposal. The fact that the existential type naturally forms from the typechecking requirements and the desired semantics is very interesting.

@angryzor
Copy link
Contributor

angryzor commented May 5, 2023

Bringing it together

It seems like in both approaches I did the exact same thing. After all ∀ Y ∈ X . Y -> U is logically equivalent to (∃ Y ∈ X . Y) -> U.

The semantics of the oneof type operator

What is in layman's terms the core of the oneof type operator? It inverts the distributive action of the type checker over unions. If you have a union A | B, instead of proving that "an operation works for A and B" it tries to prove that "the operation works for A and the operation works for B". If you write

function foo(x: A | B) {
}

You are saying "this is a function that can process a value that can either be an A or a B.

If instead you write:

function foo(x: oneof (A | B)) {
}

You're instead saying "this is a function that can either process an A or process a B". The difference is subtle as long as you are just talking about a simple, unnested type. Because it is effectively nonexistent: oneof X = X. You can even see this when you do the earlier expansion on the function's type:

interface Foo {
    (x: oneof (A | B)): void
}

// becomes (intersection because x is in contravariant position)
interface Foo {
    (x: A): void
    (x: B): void
}

// this is equivalent to
interface Foo {
    (x: A | B): void
}

However the difference becomes stark when you use it inside more complex types:

interface Foo {
    (x: (oneof (A | B))[]): void
}

interface Foo {
    (x: A[]): void
    (x: B[]): void
}

interface Foo {
    (x: A[] | B[]): void
}

This is not a function that can process (A | B)[]. It is a function that can process either A[] or B[]. Or how about using it on variables:

const a: (oneof (A | B))[] = ['asdfv', 55] // error

The above is not allowed, because the typechecker's goal is "must either satisfy A[] or satisfy B[]". And when you use a type parameter with oneof in its constraint you are locking all of its uses together in this process, as you only do this once for its constraint. This is how this implementation solves the mapping problem (note we don't even need the generics anymore):

type A = { obj: { bar: number }, key: 'bar' }
type B = { obj: { baz: number }, key: 'baz' }

type Objects = A | B

declare const fn: <K extends string, O extends Record<K, unknown>>(o: O, key: K) => void

function fn2(f: oneof Objects) {
    fn(f.obj, f.key)
    // `typeof f.key` is not "the type of accessing `key` on either an A or a B.
    // it's "either the type of accessing `key` on an A or the type of accessing `key` on a B",
    // and more specifically "the type of accessing `key` on whatever type we have currently instantiated
    // for `f`, as the typechecking process becomes iterative.
    //
    // You'll notice that as a consequence the fate of each type becomes dependent on the oneof
    // type it originates from. We're not trying to prove that the function works for an input that
    // can be any constituent of `Objects`, we're trying to prove that for every constituent of `Objects`
    // the function works.
}

In other words, where (A | B)[] is equivalent to ∃ X ∈ { A, B } . X[] or (∀ X ∈ { A, B } . X)[], (oneof A | B)[] is equivalent to ∀ X ∈ { A, B } . X[] or (∃ X ∈ { A, B } . X)[]. You can think of it as a variable with a oneof type having "multiple possible types".

oneof is viral

The existence of a oneof type somewhere in a structural type "infects" the entire type, producing a cartesian product type:

type A = {
    b: oneof (A | B),
    c: oneof (B | C),
}

// is equivalent to
type A = oneof (
    | { b: A, c: B }
    | { b: B, c: B }
    | { b: A, c: C }
    | { b: B, c: C }
)

allof as a way to unpack a oneof type

I am implementing a complementary allof type operator that collects all the possible types produced by oneof and produces the results as a union:

type A = allof {
    b: oneof (A | B),
    c: oneof (B | C),
}

// is equivalent to
type A =
    | { b: A, c: B }
    | { b: B, c: B }
    | { b: A, c: C }
    | { b: B, c: C }

This could also be done like this:

type A = {
    b: oneof A | B,
    c: oneof B | C,
} extends oneof infer T ? T : never

But providing an operator is more ergonomic and allows us to implement the "explicit conditionals" syntax I mentioned in my first post.

oneof types cannot be eagerly resolved or dereferenced if they're assigned to a variable

The way oneof types work makes it impossible to eagerly resolve them if they are assigned to a variable. Consider the following code:

declare const a: oneof ({ foo: 'bar' } | { foo: 'baz' })
const b = a.foo

What is the type of b? It is impossible to give a definite answer to this as it is dependent on the choice we made while typechecking a. Its type is simply (typeof a)['foo']. You could generate a combinatorial maximum bound for this type for the sake of tooltips, but that's it. Its real type cannot be defined. Saying that it's 'bar' | 'baz' is simply not correct. Saying that it's oneof 'bar' | 'baz' is also incorrect, that would just start a second iteration and you would typecheck the cartesian product of both types. The fact that you can't disable eager type resolution and the lack of a relation between types of different variables is the core of the mapping problem, and through the mapping problem also hinders a lot of advanced type level programming. A solution for this is one of the core needs mentioned in my first post, and oneof as a type gives us this for free.

You can see it very loosely like "if a variable is oneof, then it's the code working with it itself that needs to be distributed".

As you may have noticed, there's a recursive element here that could result in combinatorial explosion. However, this is not necessarily prohibitive. Remember that only values that differ cause issues for typechecking. If you have this code for example:

declare const a: oneof ({ foo: 'bar', xyzzy: boolean } | { foo: 'baz', xyzzy: boolean })
const b = a.xyzzy

The compiler can statically verify that (typeof a)['xyzzy'] is boolean and can collapse the type of b to that value. So the type can actually be resolved if the result is the same type for every branch of the type checking process. It's literally the inverse of what the compiler does with property access on normal unions:

  • Normal union reference: maintain 1 typechecking context, merge the results of all accesses into 1 union
  • Oneof union reference: maintain N typechecking contexts, only merge the results if all accesses result in the same value.

The map example

This is completely solved. Note that oneof X = X. This means that either is assignable to the other, so the value can just be passed.

Conclusion

I think I've reached the point where I've done enough research for now. I'm going to try to implement this oneof type in the typechecker, but it will probably not be as easy as I'm hoping. What started as an attempt to implement a constraint to let people choose only 1 of a union of types turned into the discovery of an entire equally valid complement or "mirror world" to the typechecking algorithm TypeScript uses to handle unions. It would be amazing if this could be added to TypeScript as a language as it would solve a lot of other downstream issues that in essence reduce to the lack of this feature.

In another branch I'll continue work on the constraint modifiers to implement equals.

@angryzor
Copy link
Contributor

angryzor commented May 11, 2023

Small update

I have been writing code for a few evenings and have succeeded in fixing the base case of the mapping problem. Of course much work is still to be done, but here are some sneak peeks:

Mapping problem base case without errors:
correspondence

An example of an error message if the input is nonsensical:
wrongprop

An example of a homogeneous list defined using oneof:
homogeneous-list

From these you can also get an idea of how I went about implementing this. Turns out one of the easiest methods is to implement oneof as an operator that generates a unique type. So every invocation of oneof will produce a type that has a different identity. Different oneof types with the same underlying structure will succeed a subtype check, however in the core of the type checker it does differentiate between them when encountering them during type checks:

Every type remembers a cached set of all the "free" oneof types that are below it in the type tree (they can be "bound" by allof). When the type checker tries to test a subtype (or other) relation, it checks this cached set on both types, generates a cartesian product of all possible instantiations of all the oneofs and does the typecheck for each of them, trying to prove that each instantiation on the subtype side matches at least one instantiation on the supertype side.

It's a pretty simple algorithm currently, and it probably needs some heavy caching and other optimization, but for now I'm first going to make sure that all types are properly handled (e.g. still have to make it work through generic type parameters).

@angryzor
Copy link
Contributor

angryzor commented May 29, 2023

I am still working on this intensively. Everything is going well. I wasted some time trying to make a naive MVP implementation work that tried to crawl the types for oneofs first before typechecking. This didn't work out at all because this process forced resolution of all the types it encountered and made numerous tests in the TypeScript test suite fail, so instead I had to just do it the right way right away.

The core algorithm

I have now built a nice Iterable abstraction that allows me to iterate over possible instantiations of all oneofs in an expression using simple for loops that can be reused all over the code base. It implements a much cleaner algorithm that "discovers" new oneofs as the client code encounters them. It uses the first constituent as the value it represents during the first iteration, and then registers its other options to be tried in later iterations. This way only types that are actually visited by client code are actually resolved. Using this method I was able to add working typechecking for oneof types and the whole TypeScript test suite still succeeds! Woohoo!

The presence of allof is what makes this algorithm more complex. Take the following complex oneof/allof code which currently compiles correctly in my implementation (Note: This type definition may look frightening, but this type is intentionally contrived to test the edge cases of what can technically be done with these operators. In real world code you wouldn't encounter something like this often.):

type Bar<X extends oneof (string | number) = oneof (string | number)> = {
    readonly foo: X
    readonly zoo: readonly (allof {
        readonly goo: X
        readonly moo: oneof ('fer' | 'ber')
    })[]
}

declare function a(x: Bar)

// This is correct code. The `goo`s and `foo`s have the same type.
// The `moo` is closed over by allof, turning the A type into
// `{ foo: X, zoo: ({ goo: X, moo: 'fer' } | { goo: X, moo: 'ber' })[] }`,
// allowing the `moo` type to vary within the array.
// Note that if we didn't have that allof the whole `A` type would instead be equivalent to
// `oneof ({ foo: X, zoo: { goo: X, moo: 'fer' }[] } | { foo: X, zoo: { goo: X, moo: 'ber' }[] })`,
// which is completely different.
a({ foo: 3, zoo: [{ goo: 3, moo: 'fer' }, { goo: 5, moo: 'ber' }] } as const)

// This is invalid code. `number` is not a valid type for `moo`.
a({ foo: 3, zoo: [{ goo: 3, moo: 'fer' }, { goo: 5, moo: 3 }] } as const)

// This is invalid code. The first `goo` is not the same type as `foo`.
a({ foo: 3, zoo: [{ goo: 'wefvwe', moo: 'fer' }, { goo: 5, moo: 'ber' }] } as const)

// This is invalid code. Even though both `goo`s have the same type, it is not equal to `foo`.
a({ foo: 3, zoo: [{ goo: 'wefvwe', moo: 'fer' }, { goo: 'ewfwe', moo: 'ber' }] } as const)

What this code is built to test is the fact that with oneof the meaning of a type can change depending on the type it is used in. When looking at this subpart of the type separately (let's drop the readonly operators, they're just clutter I had to add for those tests to work with the const parameter):

allof {
    goo: X
    moo: oneof ('fer' | 'ber')
}

Let's give it a name:

type Zoo<X extends oneof (string | number)> = allof {
    goo: X
    moo: oneof ('fer' | 'ber')
}

What would this type simplify to if we instantiated it?

type Zoo1 = Zoo<oneof (string | number)>
// =>
type Zoo1 = allof {
    goo: oneof (string | number)
    moo: oneof ('fer' | 'ber')
}
// =>
type Zoo1 =
    | { goo: string, moo: 'fer' }
    | { goo: string, moo: 'ber' }
    | { goo: number, moo: 'fer' }
    | { goo: number, moo: 'ber' }

Ok, simple enough. What if we pass it a subtype?

type Zoo2 = Zoo<string>
// =>
type Zoo2 = allof {
    goo: string
    moo: oneof ('fer' | 'ber')
}
// =>
type Zoo2 =
    | { goo: string, moo: 'fer' }
    | { goo: string, moo: 'ber' }

All working as expected. But remember: with oneof types present the type can have different behavior depending on where it is used! Let's reintroduce it into that original type:

type Bar<X extends oneof (string | number) = oneof (string | number)> = {
    foo: X
    zoo: Zoo<X>[]
}

And let's look at the expansion of the default instantiation, the oneof itself. In these instantiations, remember that the 2 oneofs in Bar1 are the same oneof. Every oneof type has its own identity during iteration over the possible values over them, and will have the same value.

type Bar1 = Bar<oneof (string | number)>
// =>
type Bar1 = allof {
    foo: oneof (string | number)
    zoo: Zoo<oneof (string | number)>[]
}
// =>
type Bar1 = allof {
    foo: oneof (string | number)
    zoo: allof {
        goo: oneof (string | number)
        moo: oneof ('fer' | 'ber')
    }[]
}
// =>
type Bar1 =
    | { foo: string, zoo: ({ goo: string, moo: 'fer' } | { goo: string, moo: 'ber' })[] }
    | { foo: number, zoo: ({ goo: number, moo: 'fer' } | { goo: number, moo: 'ber' })[] }

See that? Even though Zoo was passed a oneof (string | number), while in Zoo1 we got a union with 4 constituents as a result of the allof, in this case we only have 2! It has to work like this to be sound. After all, you could deconstruct with more steps, like this:

type Bar2 = Bar<oneof (string | number)>
// =>
type Bar2 = allof {
    foo: oneof (string | number)
    zoo: Zoo<oneof (string | number)>[]
}
// =>
type Bar2 = 
    | { foo: string, zoo: Zoo<string>[] }
    | { foo: number, zoo: Zoo<number>[] }
}
// =>
type Bar1 =
    | { foo: string, zoo: ({ goo: string, moo: 'fer' } | { goo: string, moo: 'ber' })[] }
    | { foo: number, zoo: ({ goo: number, moo: 'fer' } | { goo: number, moo: 'ber' })[] }

And now you can see that Zoo is actually called 2 times with a single subtype.

In the discovery-on-the-go based algorithm I developed this behavior makes the algorithm more complex, because every allof generates a separate sub-iteration. The typechecker could choose to typecheck the zoo property first, encounter the oneof there, generate 4 options for the Zoo type and only later realize that the goo actually would have been bound by the upper scope.

To handle this, the algorithm uses an error correction technique. Whenever a oneof is encountered in a higher scope that was already found in a deeper iteration, the result of the current iteration is considered "tainted", discarded, and the current iteration is re-run with an explicit instantation for that oneof. This is all built into the Iterable abstraction. The client code can supply a rollback function to be run when a conflict was found.

What is currently working

Currently I have the following things working:

  • Typechecking: Fully as far as I am aware at the moment, still need to check advanced combinations with other types and build a test suite.
  • Verbalization: More or less. Needs some tweaks as oneof has some unique challenges that aren't well covered by the verbalization options we have now. It's hard to make it clear that 2 types may have the same declaration but different identity. In value code and classes there are symbol chains: typeof d.e.g or MyClass.MyType, and types can have aliases to which you can refer, but you cannot generate verbalizations that refer to deeply nested members of anonymous types with aliases attached based on the alias, like MyAlias['a']['b']. I plan on adding that, since any time I fall back to just the oneof (A | B) output it is inherently ambiguous for developers.

Currently doing:

  • Inferencing. This isn't super difficult now that I have the Iterable abstraction.

If you are interested you can follow my progress in this branch: https://github.com/angryzor/TypeScript/tree/feature/oneof

@angryzor
Copy link
Contributor

angryzor commented May 29, 2023

A first class existential type

I think I have been conflating my oneof operator and existentially quantified types over the constituents. I had already suspected something wasn't quite right because every oneof always had to have their children wrapped in unions.

Unions really are universal quantifiers over the set of their constituents, and oneofs are a separate but similar thing: existential quantifiers over the set of their constituents. With my current oneof operator it is possible to group oneofs inside unions, but not the other way around:

type Foo = oneof (A | B) | oneof (C | D)
type Bar = oneof ((A | B) | (C | D)) // won't work, will collapse

I based everything around this oneof operator because that's what worked for me for the mapping problem, but I don't think this is ideal or the most flexible. The type you actually want here for Bar but can't express is (A | B) ^ (C | D) (borrowing the caret operator from above), and it's a different type than A ^ B ^ C ^ D:

A ^ B ^ C ^ D extends (A | B) ^ (C | D)
// => true, all of the left side constituents can find one on the right that they are a subtype of

(A | B) ^ (C | D) extends A ^ B ^ C ^ D
// => false, none of these left side constituents are subtypes of the right side constituents

So basically @michaeljota was right with their suggestion all along, it's just that the semantics of this ^ operator are not that "a value is not allowed to match 2 of the constituents", it just signifies existential quantification over its constituents, where unions signify universal quantification over their constituents.

What are oneof and allof then? Well, they are still very useful, and we need them for the mapping problem. They just convert between ^ (existential) types and | (union) types. Simple.

So now I have a new task on my list, after I finish implementing inference:

  • Making a new first class type ExistentialType that takes over most of the oneof implementation.
  • Change oneof into just a type operator with little extra logic.

Currently the union/intersection types in the compiler look like this:

Type
└ UnionOrIntersectionType
  ├ IntersectionType
  └ UnionType

I plan to change this to this:

Type
└ SetOperationType
  ├ IntersectionType
  └ QuantificationType
    ├ UnionType
    └ ExistentialType

This way I can reuse all the collapsing and other "treat constituents as a set" logic from unions and implementation should be rather smooth.

I am unsure about the renaming part since it will cause a huge amount of changed LoC (even though it's just a find all and replace for me) and might make it harder to maybe have an eventual PR merged, but if I don't then the naming will be confusing. We'll have to see. I prefer having a clean code structure I think.

@angryzor
Copy link
Contributor

angryzor commented Jun 14, 2023

What to do here

A bit longer than a week ago I presented a (back then) strange and confusing observation I ran into in a channel in the TS community discord. One of the other users helpfully responded to my observation and since then I have been a bit stuck on this project. Not because I realized I was completely on the wrong path, but because their comment pointed out a situation where a choice needs to be made, and I'm stuck on this choice, stuck on which one is the "best" or "correct" one. Since then it's been 2 weeks of very little progress, so I thought it might be better to try writing down my thoughts. Maybe someone else could chime in, or maybe writing everything down could bring me to the correct conclusion through some form of rubber ducking. So here we go.

Getting up to speed

The core problem we're trying to solve with the oneof implementation is this: considering the following code, what is the type that should be inferred for T?

string[] | number[] extends (infer T)[]

Currently the typescript compiler will infer string | number. This is correct, but only because we are using the extends constraint. It is not the most specific type that can be inferred. In other words, were we to use the as-of-yet-nonexistent constraint type equals then this inference would not be correct:

string[] | number[] equals (infer T)[]

The reason why becomes clear when you look at this code:

type A = string[] | number[] extends (string | number)[] ? true : false
//   ^? true
type B = (string | number)[] extends string[] | number[] ? true : false
//   ^? false

There is a subtle difference between string[] | number[] and (string | number)[] here. string[] | number[] is either an array of strings or an array of numbers, (string | number)[] is an array where every item can be either a string or a number. The uneven subtyping relation between the two here is correct. If I have an array that only contains strings or an array that only contains numbers, then I can also advertise it as an array that can hold both strings and numbers. (At least, I could do that if it were immutable and that is where this becomes a bit unsound for mutable arrays but let's ignore that for the moment as it is out of scope of the oneof solution.) We can't do the inverse. Presenting an array that can hold both as one that can only hold either one or the other isn't safe.

The same difference exists when you use a simpler type composition like an object property, but it's even more subtle:

type A = { foo: string } | { foo: number } extends { foo: string | number } ? true : false
//   ^? true
type B = { foo: string | number } extends { foo: string } | { foo: number } ? true : false
//   ^? false

{ foo: string } | { foo: number } here is either an object that you can only get a string out of by accessing foo, or an object that you can only get a number out of by accessing foo. Which of the 2 you are holding you don't know, and you'll first have to narrow the type somehow. { foo: string | number } is the type of an object that you can access a property foo on, and that property can hold either a string or a number.

So when we are trying to solve the oneof problem, we are trying to define a type that directly embodies this peculiar relationship in the distributed versions of these types.

There is a contravariant complement of this pattern, with intersections:

type A = string[] & number[] extends (string & number)[] ? true : false
//   ^? false
type B = (string & number)[] extends string[] & number[] ? true : false
//   ^? true

type C = { foo: string } & { foo: number } extends { foo: string & number } ? true : false
//   ^? false
type D = { foo: string & number } extends { foo: string } & { foo: number } ? true : false
//   ^? true

In this case the difference is even more subtle because in practice I can't immediately think of a situation where this difference is significant except in situations where a contravariant interaction places us back into the union version, but it's there. { foo: string } & { foo: number } for example is the type of an object that you can access foo and get a string back and an object that you can access foo on and get a number back. { foo: string & number } is the type of an object that you can access foo on and get a value back that is both a string and a number. Note the subtyping relation is flipped in this case.

It may not immediately be apparent why all of this is related to the oneof type as we see it in function parameters. However it is directly related. The way we currently define functions where we would want to use the oneof type (removing the type parameters for a moment as they are not significant) is like this:

type MyFn1 = (x: A | B) => void

But what we actually want to define is this type:

interface MyFn2 {
    (x: A): void
    (x: B): void
}

// or:
type MyFn2 = ((x: A) => void) & ((x: B) => void)

Note that MyFn1 is actually a subtype of MyFn2:

type A = ((x: A | B) => void) extends ((x: A) => void) & ((x: B) => void) ? true : false
//   ^? true

And note how this pattern looks extremely similar to the patterns above. That is because it is the exact same pattern. x is just in contravariant position, so we're looking for the union version of this oneof type while dealing with an intersection, but the core question is exactly the same: in MyFn2, what would be the type of x? The answer is the oneof type we are trying to define.

What is the issue that has me blocked?

(TBC...)

@laughinghan
Copy link

@Nathan-Fenner Function overloads seem to perfectly solve the example given, and for generic type aliases which don't have overloads, with a little elbow grease you can use conditional types:

// spurious tuple required to workaround distributivity of conditional types
type OneofStrNum<T extends [string | number]> = T[0] extends string ? string : T[0] extends number ? number : never
type MonomorphicArray<T extends string | number> = OneofStrNum<[T]>[]

type M1 = MonomorphicArray<string>
  // ^? type M1 = string[]
type M2 = MonomorphicArray<number>
  // ^? type M2 = number[]
type M3 = MonomorphicArray<string | number>
  // ^? type M3 = never[]

Sure it sucks but the main use case is functions anyway right? Can you motivate some non-function use cases so important that they deserve special syntax sugar for what can already be done with conditional types?

Otherwise can we close this?

@craigphicks
Copy link

craigphicks commented Dec 8, 2023

@angryzor

This port concern TypeScript behavior and are not at all a criticism of you or your ideas.

1

As you said, the typescript compiler will infer string | number for this type declaration case:

type A = string[] | number[] extends (infer T)[] ? T[] : never;
//  ?^  (string | number)[]

But if we write it as a type function,

type Id<In> =In extends (infer Out)[] ? Out[] : never;
type X1 = Id<string[]|number[]>;
//  ?^ string[] | number[]

it does not lose precision. For the distribution over the union functionality to work, the function form is required.

2

Another issue: there is some strange TypeScript behavior with the & calculations:

type SN1 = string[] & number[];
//     ^?  string[] & number[]
type T1 = SN1[0];
//   ?^ never

type A = string[] & number[] extends (string & number)[] ? true : false
//   ^? false
type A1 = never[] extends (string & number)[] ? true : false
//   ^? true

type B = (string & number)[] extends string[] & number[] ? true : false
//   ^? true
type B1 = (string & number)[] extends never[] ? true : false
//   ^? true

TypeScript seems to delay the calculation of & at times - as it does here.
Personally I think it would be better to immediately see

type SN1 = string[] & number[];
//     ^?  never[] 
  • A and A1 should be the same, right?
  • B and B1 should be the same, right?
  • Also A1 and B1 together imply that never[] and (string & number)[] both extend each other, which I believe should mean they are equal. However they are not equal.
  • never[] extends (string & number)[] should be true (A1), according to set theory.
  • (string & number)[] extends never[] should be false (B1), according to set theory.

Conclusion

TypeScript handling of types at sets is not really mathematically precise.
At least, there are a few corner cases that I think TypeScript does not properly handle (as of V5.3.3).

@emilioplatzer
Copy link

I think that this cannot be closed yet. It is unsolved.

@laughinghan, your given example fixes the problem of using a function that allows T1 or T2 but not T1|T2 union.

In my opinion that is not the problem (because it has a solution). The problem is implementing that function. Inside that function you cannot have a generic type T that can be T1 or T2 but not T1|T2. That is specialy bad for numeric types.

See my example in my previous comment: #27808 (comment)

@laughinghan
Copy link

@emilioplatzer Thank you, that's a much better example. Can we update the main issue description then? Both to mention the better example, and to mention that function overloads are an (incomplete) workaround. The fact that the best workaround requires annoying casts is a pretty good argument for this feature:

function add(a: number, b: number): number
function add(a: bigint, b: bigint): bigint
function add<Num extends bigint | number>(a: Num, b: Num): bigint | number {
    return (a as number) + (b as number);
}

More generally TypeScript doesn't really track correlations between types much at all (with distributive conditional types being a very limited exception); this does seem like potentially a small, practical way to specify type correlations in some useful cases.

@craigphicks
Copy link

craigphicks commented Dec 11, 2023

@laughinghan @emilioplatzer @Nathan-Fenner

There are really 3 separate items that are related:

  1. Specifying the call parameter and return type correlations.
  2. Using those correlations for flow analysis inside the function implementation and to validate the return statements therein.
  3. Using those correlations to validate calls to the function.

Item 3 is already working, item 2 is not.

Notice that @emilioplatzer post says

Inside that function you cannot have a generic type T that can be T1 or T2 but not T1|T2.

That is talking about item 3.

The OP #27808 title implies that a new form for specifying correlations (extends oneof) is the main feature - that feature belongs to item 1. However, the OP long content also talks about item 3.

  • It is possible to fix item 3 without adding a new feature for item 1.
  • It is possible to add a new oneof feature for item 1 without fixing item 3.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
In Discussion Not yet reached consensus Suggestion An idea for TypeScript
Projects
None yet
Development

No branches or pull requests