Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve type inference for circular references in object initializers #45213

Closed
dhmk083 opened this issue Jul 28, 2021 · 12 comments
Closed

Improve type inference for circular references in object initializers #45213

dhmk083 opened this issue Jul 28, 2021 · 12 comments
Labels
Design Limitation Constraints of the existing architecture prevent this from being fixed

Comments

@dhmk083
Copy link

dhmk083 commented Jul 28, 2021

When you have something like this (also see a playground):

function makeFoo<T>(fn: () => T): () => T {
  // do some work here ...
  return function() {
    // ... or here
    return fn() 
  }
}

const obj = {
  name: 'test',

  foo: makeFoo(() => obj.name.length)
}

type of obj will be infered as any, because in order to determine type of obj typescript first looks at return type of makeFoo(...) which uses obj.

The actual error is this: 'obj' implicitly has type 'any' because it does not have a type annotation and is referenced directly or indirectly in its own initializer.(7022)

What if we could give some hint to a compiler to handle this case? Maybe like this:

const obj = {
  name: 'test',

  foo: makeFoo(() => obj.name.length) as prop  // or a better name
}

Right now I have a workaround for this:

function getter<T, R>(t: T, k: keyof T, value: R) {
  Object.defineProperty(t, k, {value, enumerable: true})  
  return value
}

const obj = {
  name: 'test',

  get foo() {
    return getter(obj, 'foo', makeFoo(() => obj.name.length))
  }
}

but it would be cool if it could be done more easily.

@andrewbranch
Copy link
Member

type of obj will be infered as any

This is not true 🤔

Please follow the issue template, and ensure you include all relevant pieces of code—the definition of makeFoo is important here and you haven’t said what it is.

@andrewbranch andrewbranch added the Needs More Info The issue still hasn't been fully clarified label Aug 5, 2021
@dhmk083
Copy link
Author

dhmk083 commented Aug 5, 2021

I've added more info, aaand 😄 found a similar issue #43683 in which you gave an advice about adding a return type annotation.

Of course this works in my case:

const obj = {
  name: 'test',

  foo: makeFoo((): number => obj.name.length), // infers type of `obj` correctly :)
}

But sadly it doesn't work in this:

const obj = {
  complex: {
    name: 'qwe',
    age: 123,
  },

  bar: makeFoo((): typeof obj.complex => obj.complex)
}

So if I have complex types, I need to declare them separately with no auto-infering help from typescript 😟
My dream is to have auto-infering for everything 😄

@andrewbranch
Copy link
Member

Pretty sure #30134 is your suggestion 😄

@andrewbranch andrewbranch added Design Limitation Constraints of the existing architecture prevent this from being fixed and removed Needs More Info The issue still hasn't been fully clarified labels Aug 5, 2021
@dhmk083
Copy link
Author

dhmk083 commented Aug 6, 2021

I don't know, it seems different. Although, example in the last comment looks sligthly similar.
I think my problem lies in circularity: type of obj.foo depends on makeFoo and type of makeFoo depends on type of obj.
I also don't understand what Full Unification for Generic Inference means. 😃
Do you think fixing it would resolve my problem also?

@andrewbranch
Copy link
Member

Yes, I do.

@theScottyJam
Copy link

I've ran into a similar issue, and was able to reproduce it without using generics (so I don't think that #30134 helps to address it?).

Here's my code sample:

const decorate = (fn: () => boolean): () => boolean => fn

const thing = {
    fn: decorate(() => {
        type Thing = typeof thing
        const thing_: Thing = thing
        return 'fn' in thing_
    })
}

I've tried poking at other threads, and I still fail to understand why this sort of thing is impractical for a type-checker to figure out. The return type of decorate() is known, it's () => boolean. So, the type of thing should be known also, it must be { fn: () => boolean }, because the fn property is assigned to the return value of decorate(). Then, within decorate, I should theoretically be able to freely reference typeof thing, as its type is known.

Where's the flaw in my thinking? Why is it not practical/slow for this sort of logic to be done by a type checker?

Thanks.

@RyanCavanaugh
Copy link
Member

@theScottyJam The process you're going through there mixes and matches depth-first and breadth-first processing of the program, but you're picking and choosing when you go depth-first vs breadth-first based on facts that you can only ascertain by first reading the entire program and establishing those causal links ahead of time. If you had to "check" a program but could only see process one node of code at a time and had to choose what to do next without knowing what the next node you intended to analyze said, you wouldn't be able to do this.

It helps if you consider the active call stack of typechecking a program through a depth-first recursive descent of its AST.

@theScottyJam
Copy link

Thanks for the response @RyanCavanaugh

However, I'm not sure I fully understand. If TypeScript were forced to parse in such a strict fashion, then code like this would never compile in TypeScript.

type A = { x: B }
type B = { y: A }

Or this (which is the same as my previous example, but without the decorate()).

// This works
const thing = {
    fn: () => {
        type Thing = typeof thing
        const thing_: Thing = thing
        return 'fn' in thing_
    }
}

I'm not sure how TypeScript handles the above scenarios, but I'm going to try and present a three-pass algorithm that should be able to handle the issue I had presented. (I'm not sure if three passes are strictly necessary to make this work, but it makes explaining it easier).

Let's assume that TypeScript analyzes my code in the following order.

A. it looks like the const decorate = (...) => ... line.
B. it moves at the inner-most part of the next statement, which is the function being decorated: () => { type Thing = typeof thing ... }
C. it takes a step out and sees that the arrow function is being passed into decorate: decorate(...)
D. finally, it takes another step out and finds the definition for thing: const thing = { fn: ... }

In pass 1, all it does is try and figure out what identifiers exist, along with some trivial-to-gather information about them.

  • In step A, it learns about the decorate function, along with its exact type definition (as it's explicitly stated by the user)
  • In step B and C, nothing interesting happens
  • In step D, it learns about this thing object, which has a fn property. It knows the type of the fn property is whatever the final type of step C's expression.

The whole purpose of pass 2 is to just try and auto-determin all of the different types. We'll follow a depth-first approach, but we'll go out-of-order if necessary.

  • In step A, nothing special happens. We already know the type of the decorate function, as the end-user told us what it is.
  • In step B, as we're trying to auto-determine the return type of this function being wrapped, we see that the function refers to the type of thing. The type of thing isn't immediately known (its type definition wasn't explicitly stated by the programmer), but perhaps it can be found out. So, let's do some out-of-order type-checking. Using what we learned from pass 1, we know that if we do step D, we'll learn the type of the thing object, so let's do that.
    • In step D, we already know almost everything we need to know about the type of thing, except for one thing - the type of the fn property depends on the result of step C, so we'll jump to that step next.
    • In step C, we are able to see that the decorate function is being called, and we know that decorate returns a value of type () => boolean. Now we got everything we needed to know.
  • Back to step B, we now know the type of thing, because we jumped around a bit to figure it out. We're able to get to the end of this function and see that it returns a value of type boolean. We can complete step B now.
  • Steps C and D have already been done, which means we've done all of the steps and have finished this pass.

Now we know all of the type information, so in phase 3, we can start making sure everything connects together properly, and we aren't doing something silly like Math.max('xyz').

@RyanCavanaugh
Copy link
Member

You're basically describing the process we already have, but one of the steps is in the wrong pass.

Specifically, step D. You have this program:

const decorate = (fn: () => boolean): () => boolean => fn

const thing = {
    fn: decorate(() => {
        type Thing = typeof thing
        const thing_: Thing = thing
        return 'fn' in thing_
    })
};

You could equivalently have written this program:

const decorate = (fn: () => boolean): () => boolean => fn
const x = decorate(() => {
    type Thing = typeof thing
    const thing_: Thing = thing
    return 'fn' in thing_
});
const thing = {
    fn: x
}

So applying the bind pass to object literals doesn't remove the circularity per se, it just makes it happen in some cases and not others. It's also not generalizable, because you could have written

const decorate = (fn: () => boolean): () => boolean => fn

const thing = {
    fn: decorate(() => {
        type Thing = typeof thing
        const thing_: Thing = thing
        return 'fn' in thing_
    }),
    ...someExpr
};
const someExpr = thing.fn();

and now the members of thing depend on someExpr, so we can't really make any determinations at all. If you consider a case like

const thing = {
    fn: {
      a: {
        return thing.fn.a;
      }
    }),
    ...someExpr
};

then the type of someExpr is extremely relevant since it changes whether or not thing.fn.a is known to exist.

Naturally anything is possible; we could rearrange a bunch of things and introduce many new special codepaths to make this work, but in general it doesn't come up that often to justify the additional complexity. Keeping a strict rule that the bind phase only uses declaration constructs and the check phase is the only one that's allowed to use inferred object shapes makes the whole system much easier to reason about, and creates invariants that can be used to improve performance in other cases.

@theScottyJam
Copy link

theScottyJam commented Jan 21, 2022

and now the members of thing depend on someExpr, so we can't really make any determinations at all. If you consider a case like

I don't think there's actually an issue there. The type of thing also depended on the type of the decorate(...) expression, and it was able to handle it. With the algorithm I proposed, what needs to happen is that in phase 2 (where it auto-determines types), if it finds it needs to execute some later step early in order to figure out the type information, it just does it.

So, in this example:

const decorate = (fn: () => boolean): () => boolean => fn
const x = decorate(() => {
    type Thing = typeof thing
    const thing_: Thing = thing
    return 'fn' in thing_
});
const thing = {
    fn: x
}

The following would happen during phase 2:

  • A. It evaluates the const decorate = ... line. Nothing special happens.
  • B. It attempts to evaluate the arrow function passed into decorate(), specifically, this chunk: () => { type Thing = typeof thing ... }, but finds that doing so requires knowing the type of thing.
    • D. It jumps ahead and evaluated const thing = { fn: x }, only to find that it needs to know the type of this x variable.
      • C. It jumps to the const x = decorate(...) part, which is one step outside of step B, and figures out that the type of x is the result of calling decorate(), which is () => boolean
    • Back to D: Now that we know the type of x, we can know the type of thing.
  • Back to A: Using the type of thing, we're able to figure out the type of this arrow function.
  • Done.

And here's how the algorithm would be followed with the later code example you shared:

const decorate = (fn: () => boolean): () => boolean => fn

const thing = {
    fn: decorate(() => {
        type Thing = typeof thing
        const thing_: Thing = thing
        return 'fn' in thing_
    }),
    ...someExpr
};
const someExpr = thing.fn();

(I also gave these steps letters, but in no particular order)

  • A. It evaluates the const decorate = ... line. Nothing special happens.
  • B. It attempts to evaluate the arrow function passed into decorate(), specifically, this chunk: () => { type Thing = typeof thing ... }, but finds that doing so requires knowing the type of thing.
    • D. It jumps ahead and evaluated const thing = { fn: x, ...someExpr }, only to find that it needs to know the types of this fn property value and the type of someExpr.
      • C. It jumps to the decorate(...) part, which is one step outside of step B, and figures out that this expression is the result of calling decorate(), which is () => boolean
      • E: It jumps to the const someExpr = thing.fn(); line to figure out what the type of someExpr is. If the type of thing.fn() is known, then our question is answer. If its not known, then it'll jump to whever thing.fn() is defined and do whatever phase-2 steps are needed over there to figure it out.
    • Back to D: Now that we know the type of someExpr, and the fn property, we can know the type of thing.
  • Back to A: Using the type of thing, we're able to figure out the type of this arrow function.
  • Done.

So, I don't think those examples actually break the deterministic logic of the original algorithm I presented, nor does it turn this into a thing that only sometimes works and sometimes doesn't, depending on where you put the code. If it's possible to figure out the type of the thing, it'll figure it out, by running the steps in whatever order necessary, and this should theoretically work as long as you don't have a true circular dependency on types.

but in general it doesn't come up that often to justify the additional complexity.

Yeah, and that's fair. I don't know anything about the TypeScript codebase, nor do I know how much complexity this would add, but I'm sure it would be a fairly large refactor to make anything like this work. So, I'm fine if such a feature doesn't actually get put in. I mostly wanted to know why this feature didn't exist - was it because (a) some technical limitation with making this algorithm for the general case that I wasn't seeing (which we're now discussing, trying to figure out if there really is such a technical limitation), or (b) it simply hasn't been added yet, and perhaps would never be added, because it's just not useful enough to warrant the amount of work it would take to add it in.

@RyanCavanaugh
Copy link
Member

The mechanism you're describing here where, upon detecting a possible circularity, we just sort of put the relevant question on pause and come back to it later - this isn't really compatible with the straightforward caching recursive descent that the checker is written in.

It's instructive to look at the call stack at the point that we log the circularity error and see how every caller in that path (of which the incoming call stack is could be practically any function in the checker) would have to be basically completely rewritten to handle the answer of "ask me again later".

We also need to be able to detect circularities instead of looping forever. For example, if you wrote

function f1() {
  return [0, f2()] as const;
}
function f2() {
  return [0, f1()] as const;
}

then any sort of work-deferring mechanism needs to be able to detect when it's looping forever so that it doesn't attempt to infer an infinite sequence of [0, [0, [0, [0, [0, [0, ....

The relevant codepaths in the checker here are not especially complicated (relatively speaking) and I'd encourage you to try a PR to understand the problem more deeply. Not to be glib, but if this were low-hanging fruit, we would have done it already 😉. Again, it's not impossible, but it's not at all a straightforward fix - we'd have to rewrite very large portions of the entire checker.

@theScottyJam
Copy link

this isn't really compatible with the straightforward caching recursive descent

Yeah, this makes sense. TypeScript uses a recursive descent parsing algorithm, and the sort of algorithm I was describing would require an entirely different type of parsing algorithm. Not only would it take a whole lot of work to switch, it'll also add a lot of complexity to the codebase, and this particular issue isn't an extremely common one.

I don't think I'll try to actually make a PR - I can tell that this sort of thing would be a giant refactor, and it's certainly not a task I would be capable of, especially considering that I currently know nothing about the TypeScript codebase.

Ok, thanks for helping me understand :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Design Limitation Constraints of the existing architecture prevent this from being fixed
Projects
None yet
Development

No branches or pull requests

4 participants