Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a crash when transforming functions in modules. #34513

Merged
merged 1 commit into from Oct 29, 2019

Conversation

mprobst
Copy link
Contributor

@mprobst mprobst commented Oct 16, 2019

When transforming a module declaration and block, a containing function
declarations parent pointer ends up pointing and the synthetic module
block and module declaration nodes.

If a function is contained in such a module declaration, and any of its
preceding statements are transformed, the getOrCreateEmitNode
triggered by visitParameter fails because getSourceFileOfNode fails
to find a SourceFile for the synthetic module declaration nodes.

This change avoids the crash by finding original nodes in
getSourceFileOfNode.

@mprobst
Copy link
Contributor Author

mprobst commented Oct 16, 2019

I am unsure what the intended invariant of the AST structure is here.

  • Is getSourceFileOfNode wrong to assume its input will never point into a synthesized node tree, and its parent pointer chain will always end up at a SourceFile? This is what this PR fixes.
  • Alternatively, is it a bug that a non-synthetic FunctionDeclaration has a parent pointer to a synthetic node? It looks as if getOrCreateEmitNode assumes this cannot happen in its call to getSourceFileOfNode
  • Lastly, is it a bug that the top level ModuleDeclaration ends up with an undefined parent pointer, and it should lead to some source file?

I presume @rbuckton might know?

@rbuckton
Copy link
Member

  • Is getSourceFileOfNode wrong to assume its input will never point into a synthesized node tree, and its parent pointer chain will always end up at a SourceFile? This is what this PR fixes.

Transformed nodes (i.e. nodes with the NodeFlags.Synthesized flag set) are not expected to have parent pointers, and the parent of a parse-tree node (i.e. a node without NodeFlags.Synthesized set) should never be modified.

Generally this means that getSourceFileOfNode does not work on transformed nodes. Perhaps we should instead add a Debug.assert to the top of getSourceFileOfNode to more formally prevent this case.

  • Alternatively, is it a bug that a non-synthetic FunctionDeclaration has a parent pointer to a synthetic node? It looks as if getOrCreateEmitNode assumes this cannot happen in its call to getSourceFileOfNode

Yes, this is a bug. There are only a few places where we explicitly set the parent pointer in order to synthesize an AST for the checker (namely for --emitDecoratorMetadata), otherwise modifying the parent should be avoided.

In the case of getOrCreateEmitNode, the only reason it walks up the spine for any node is so that we can clean up emitNodes on parse tree nodes that might end up being reused during incremental parsing scenarios. We don't bother doing that for synthesized nodes, since they only live until emit completes.

  • Lastly, is it a bug that the top level ModuleDeclaration ends up with an undefined parent pointer, and it should lead to some source file?

It is not a bug to have an undefined parent if the ModuleDeclaration was created during transformation (i.e. has the NodeFlags.Synthesized bit set).


I do not believe this is the correct fix for this issue. I would be more interested in finding the specific case where the parent pointer for a parse-tree node is being modified. Since this is a before transform, there may still be some incorrect assumptions in the ts.ts transform where it assumes it is the first transformation to run.

@mprobst
Copy link
Contributor Author

mprobst commented Oct 22, 2019

@rbuckton thanks for the thorough explanation! That's really helpful.

I've traced this a bit longer, and (by using Object.defineProperty with a debugger call) discovered this stack:

    at setParentPointers (src/compiler/binder.ts:4253:21)
    at src/compiler/binder.ts:4254:43
    at visitNodes (src/compiler/parser.ts:41:32)
    at Object.forEachChild (src/compiler/parser.ts:273:24)
    at setParentPointers (src/compiler/binder.ts:4254:9)
    at Object.getModuleInstanceState (src/compiler/binder.ts:19:13)
    at Object.isInstantiatedModule (src/compiler/checker.ts:254:29)
    at shouldEmitModuleDeclaration (src/compiler/transformers/ts.ts:2456:20)
    at visitModuleDeclaration (src/compiler/transformers/ts.ts:2574:18)
    at visitTypeScript (src/compiler/transformers/ts.ts:538:28)

This looks as if getModuleInstanceState is violating the invariants here, but it's unclear to me how to fix.

@mprobst
Copy link
Contributor Author

mprobst commented Oct 22, 2019

I filed #34644 to track this.

@andrewbranch
Copy link
Member

The issue is that getModuleInstanceState depends on parent pointers being set, so it walks the tree and sets them. When the module declaration is synthesized but contains original nodes nested in it, this is a problem, since the original nodes will have their parent pointers rebound.

I’ve just fixed this locally by cloning the tree in getModuleInstanceState if it’s synthesized, replacing any non-synthesized descendants with synthesized clones. It feels a little clunky to me though, so I’m curious if @rbuckton has a more clever idea.

@andrewbranch
Copy link
Member

@mprobst I talked to @rbuckton and he suggested that the caller of getModuleInstanceState just call it with the original parse tree node instead of the transformed/synthesized node. This means that custom transformers can’t actually affect whether or not a module declaration gets emitted to JS, but that seems fine for now.

ts.ts:

function shouldEmitModuleDeclaration(nodeIn: ModuleDeclaration) {
    const node = getParseTreeNode(nodeIn, isModuleDeclaration);
    // if we can't find a parse tree node, assume the node is instantiated.
    return node ? isInstantiatedModule(node, !!compilerOptions.preserveConstEnums || !!compilerOptions.isolatedModules) : true;
}

We need to get this in in the next couple days for 3.7, so let me know if you want to update it here. Otherwise, or if I don’t hear from you by Wednesday-ish, I’ll open a new PR. Thanks for the investigation!

When transforming a module declaration and block, parse tree nodes
contained in the module block have their parent pointers reset due to
`shouldEmitModuleDeclaration` calling into `isInstantiatedModule`, which
needs to set parent pointers to operate.

That causes a crash when later transforming any nodes within the module,
as retrieving their source file in `getSourceFileOfNode` (via
`getOrCreateEmitNode`) fails, due to their new synthesized parent nodes
not being in a source file.

This change avoids the issue by using the parse tree node in `ts.ts` to
decide whether a module declaration should be emitted (i.e. whether the
module contains values).

This means transformers cannot add values to modules that previously did
not contain any.

Fixes microsoft#34644.
@mprobst
Copy link
Contributor Author

mprobst commented Oct 29, 2019

@andrewbranch thanks for the help. Done, and rebased onto current master.

Copy link
Member

@andrewbranch andrewbranch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @mprobst! 🙌

@andrewbranch andrewbranch merged commit ff590b6 into microsoft:master Oct 29, 2019
@andrewbranch
Copy link
Member

@typescript-bot cherry-pick this to release-3.7

@typescript-bot
Copy link
Collaborator

Hey @andrewbranch, I've opened #34800 for you.

typescript-bot pushed a commit to typescript-bot/TypeScript that referenced this pull request Oct 29, 2019
Component commits:
62aad54 Fix a crash when transforming functions in modules.
When transforming a module declaration and block, parse tree nodes
contained in the module block have their parent pointers reset due to
`shouldEmitModuleDeclaration` calling into `isInstantiatedModule`, which
needs to set parent pointers to operate.

That causes a crash when later transforming any nodes within the module,
as retrieving their source file in `getSourceFileOfNode` (via
`getOrCreateEmitNode`) fails, due to their new synthesized parent nodes
not being in a source file.

This change avoids the issue by using the parse tree node in `ts.ts` to
decide whether a module declaration should be emitted (i.e. whether the
module contains values).

This means transformers cannot add values to modules that previously did
not contain any.

Fixes microsoft#34644.
@mprobst mprobst deleted the transform-fn-crash branch October 29, 2019 17:51
andrewbranch pushed a commit that referenced this pull request Nov 11, 2019
Component commits:
62aad54 Fix a crash when transforming functions in modules.
When transforming a module declaration and block, parse tree nodes
contained in the module block have their parent pointers reset due to
`shouldEmitModuleDeclaration` calling into `isInstantiatedModule`, which
needs to set parent pointers to operate.

That causes a crash when later transforming any nodes within the module,
as retrieving their source file in `getSourceFileOfNode` (via
`getOrCreateEmitNode`) fails, due to their new synthesized parent nodes
not being in a source file.

This change avoids the issue by using the parse tree node in `ts.ts` to
decide whether a module declaration should be emitted (i.e. whether the
module contains values).

This means transformers cannot add values to modules that previously did
not contain any.

Fixes #34644.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants