Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RuleWalker-based rules are slow and should be avoided #2522

Closed
RyanCavanaugh opened this issue Apr 7, 2017 · 9 comments

Comments

@RyanCavanaugh
Copy link

commented Apr 7, 2017

Bug Report

Rules using the built-in RuleWalker class are more than twice as slow as those using a more efficient pattern. Either some substantial refactoring should be done to make that class more efficient (not sure if that's possible), or it should be deprecated for rules which don't actually need its functionality and existing rules converted over.

  • TSLint version: master
  • TypeScript version: master
  • Running TSLint via: CLI

TypeScript code being linted

https://www.github.com/Microsoft/TypeScript (via jake lint)

Actual behavior

Adding some code to sum up the time needed by each rule, on my machine, the classNameRule (class-name) takes an average of ~900ms to run.

Converting this rule to a recur-style functional pattern drops that time to ~300ms

Let's compare the old code to the new code:
https://github.com/palantir/tslint/blob/master/src/rules/classNameRule.ts (trimmed to the relevant portion):

class NameWalker extends Lint.RuleWalker {
    public visitClassDeclaration(node: ts.ClassDeclaration) {
        // classes declared as default exports will be unnamed
        if (node.name != null) {
            const className = node.name.getText();
            if (!this.isPascalCased(className)) {
                this.addFailureAtNode(node.name, Rule.FAILURE_STRING);
            }
        }

        super.visitClassDeclaration(node);
    }

    public visitInterfaceDeclaration(node: ts.InterfaceDeclaration) {
        const interfaceName = node.name.getText();
        if (!this.isPascalCased(interfaceName)) {
            this.addFailureAtNode(node.name, Rule.FAILURE_STRING);
        }

        super.visitInterfaceDeclaration(node);
    }

    private isPascalCased(name: string) {
        if (name.length <= 0) {
            return true;
        }

        const firstCharacter = name.charAt(0);
        return ((firstCharacter === firstCharacter.toUpperCase()) && name.indexOf("_") === -1);
    }
}

Rule class-name took 930ms

New code:

function walk(ctx: Lint.WalkContext<void>) {
    ts.forEachChild(ctx.sourceFile, recur);

    function recur(node: ts.Node): void {
        if (isNamedClass(node)) {
            const className = (node.name as ts.Identifier).text;
            if (!isPascalCased(className)) {
                ctx.addFailureAtNode(node.name, Rule.FAILURE_STRING);
            }
        }

        if (isInterface(node)) {
            const interfaceName = (node.name as ts.Identifier).text;
            if (!isPascalCased(interfaceName)) {
                ctx.addFailureAtNode(node.name, Rule.FAILURE_STRING);
            }
        }

        ts.forEachChild(node, recur);
    }
}

function isNamedClass(node: ts.Node): node is (ts.ClassDeclaration & { name: ts.Identifier }) {
    return (node.kind === ts.SyntaxKind.ClassDeclaration) && ((node as ts.ClassDeclaration).name != null);
}

function isInterface(node: ts.Node): node is ts.InterfaceDeclaration {
    return node.kind === ts.SyntaxKind.InterfaceDeclaration;
}

function isPascalCased(name: string) {
    if (name.length <= 0) {
        return true;
    }

    const firstCharacter = name.charAt(0);
    return ((firstCharacter === firstCharacter.toUpperCase()) && name.indexOf("_") === -1);
}

Rule class-name took 297ms

Analysis

I think the root cause here is just the large number of indirections needed to complete a tree walk. The VM can't possibly inline or JIT all these virtual method calls, but the closed-over form can be more efficiently optimized.

An even better solution would be if all (or most) rules could share in a single pass of the tree. TypeScript itself budgets full-tree walks very carefully (the binder does ~three and the checker does ~two) because it's so slow for larger programs.

@adidahiya

This comment has been minimized.

Copy link
Member

commented Apr 7, 2017

@RyanCavanaugh thanks for the heads-up. We've been converting a lot of rules to use the recur-style functional pattern recently (props to @andy-hanson and @ajafff) and have written up performance tips and walker design docs that started from the investigation in #1131. I found that TSLint v5 was almost 2x as fast as v4 in a few projects I ran it on (using the recommended config preset), so I think we're well on our way to better linting performance. This is a good issue thread to collect general advice for developers. Once we migrate all the core rules away from RuleWalker, we can deprecate it and hopefully remove it completely a couple major versions from now.

@RyanCavanaugh

This comment has been minimized.

Copy link
Author

commented Apr 7, 2017

I was thinking about writing some single-pass rule setup. We could have something like this, where many rules could share a single tree walk

export class Rule extends Lint.Rules.AbstractRule {
    public static FAILURE_STRING = "Class name must be in pascal case";

    public applyFast(walkContext: WalkContext, sourceFile: ts.SourceFile): Lint.RuleFailure[] {
        // walkContext provides 'on', 'before', and 'after' so rules can do some
        // nested declaration investigation
        walkContext.on([ts.SyntaxKind.ClassDeclaration, ts.SyntaxKind.ClassExpression], checkClass);
        walkContext.on([ts.SyntaxKind.InterfaceDeclaration], checkInterface);
    }
}

function checkClass(context: WalkContext, node: ts.ClassDeclaration | ts.ClassExpression) {
    // check node.name
}

function checkInterface(context: WalkContext, node: ts.InterfaceDeclaration) {
    // check node.name
    context.addError(node, Rule.FAILURE_STRING);
}

Thoughts? For most things which only have to check some identifiers or specific keywords, this could be a substantial improvement

@adidahiya

This comment has been minimized.

Copy link
Member

commented Apr 7, 2017

Ah, sorry I didn't address the last part of your OP in my above comment. A single-pass rule setup for a subset of the core rules sounds great too 👍

@RyanCavanaugh

This comment has been minimized.

Copy link
Author

commented Apr 8, 2017

Prototype branch in progress master...RyanCavanaugh:fastLint

I'm converting over the rules the TS repo uses as a first stab. So far it's working quite nicely

@ajafff

This comment has been minimized.

Copy link
Collaborator

commented Apr 8, 2017

I was thinking about writing some single-pass rule setup. We could have something like this, where many rules could share a single tree walk

TL;DR: please don't, maybe I have an alternative

Although I like the idea, I don't think you will gain that much performance. I thought about this some time ago and figured out that it will eventually have the same performance problems that we see in RuleWalker today.
There are several things to consider:

  • It will get slower the more rules you add to it. As you noted above, VMs are not capable of optimizing and inlining virtual function calls. Your implementation will deopt very early because it's megamorphic
  • You cannot use functions such as isFunctionScopeBoundary as condition
  • If you run tslint --fix, rules should not run "in parallel" to avoid intersecting fixes. After one rule applied fixes, a new SourceFile is created which is then used for linting.
  • Technicaly you could convert every rule to use this implementation. It will get really hard to follow the control flow through all those closures -> debugging nightmare(?)

Instead of writing a more efficient algorithm to work around the performance problems of typescript's AST, why not simply use other data structures that enable better algorithms?

I wrote a small PoC that walks the whole AST once and converts it into two formats which are then cached in a WeakMap. https://github.com/ajafff/tsutils/blob/wrap-ast/src/wrap.ts

This could be used by all rules and will even get faster because of the cache.

Open questions related to caching:

  • does typescript's new transform feature modify the SourceFile or is a new one created?
  • when type checking, tslint uses the SourceFiles provided by Program which are not garbage collected because they are still refenced. that will keep the cached data structures alive.
    • one solution would be to clear the cache when the next SourceFile is converted
    • Maybe Linter can do the caching, because it knows when it's done with a SourceFile and can throw away the cache.
@RyanCavanaugh

This comment has been minimized.

Copy link
Author

commented Apr 13, 2017

@ajafff interesting idea! Some thoughts

Instead of writing a more efficient algorithm to work around the performance problems of typescript's AST, why not simply use other data structures that enable better algorithms?

We have to start by correctly identifying the problem - the AST is not slow! It's nearly as fast as an implementation could plausibly be. It's just the fact that for a large project, an AST walk involves hundreds of thousands of calls.

As a baseline, here are the times it took to run our rules on our codebase (I have manually sorted this list):

Rule whitespace took 1337
Rule typedef-whitespace took 1033
Rule one-line took 768
Rule no-var-keyword took 683
Rule prefer-const took 512
Rule semicolon took 379
Rule indent took 367
Rule class-name took 316
Rule no-inferrable-types took 296
Rule quotemark took 254
Rule no-null-keyword took 247
Rule comment-format took 242
Rule jsdoc-format took 207
Rule no-trailing-whitespace took 120
Rule no-internal-module took 11
Rule linebreak-style took 7
Rule next-line took 4
Rule boolean-trivia took 3
Rule type-operator-spacing took 1
Rule no-increment-decrement took 1
Rule object-literal-surrounding-space took 1
Rule no-in-operator took 1
Rule no-type-assertion-whitespace took 0

There's a clear delineation: Rules which walk everything incur a minimum ~250ms cost, which is just the baseline cost to do an entire walk. As I converted rules over, it was easy to see that cost vanish (because the non-walk cost of most rules is extremely small).

It will get slower the more rules you add to it.

I can't see how this is avoidable?

If you run tslint --fix, rules should not run "in parallel" to avoid intersecting fixes. After one rule applied fixes, a new SourceFile is created which is then used for linting.

Falling back to an iterative mode if more than one rule produces fixes is fine. The modal number of fixes is very small. At worst, it's equivalent to today's implementation.

As you noted above, VMs are not capable of optimizing and inlining virtual function calls.

These function calls aren't virtual (there's no prototype chain to walk). They're indirect, but have fixed lexical/this scope and thus should have no problems getting optimized. Of course, we're both just speculating, and measurements will be needed.

It will get really hard to follow the control flow through all those closures -> debugging nightmare(?)

I don't see how it's any different from today.


I like the approach I have here because we can easily convert all existing walk-based rules to it with no loss of generality and very little risk (see the diff I linked). But I also like yours because I agree it ought to be very easy to write a rule that only scans certain node kinds.

One thing I'm not clear on with your approach: When will walking a tree once in order to make a new data structure, then checking against that data structure, be cheaper than doing those same checks during the walk? It seems like the former will be faster for complex rules walking small trees, where the latter will be faster for simple rules walking large trees. What does the balance of rules actually look like? It'd be great to run an apples-to-apples comparison.

Since the future will certainly involve rules that need to participate in a full (hopefully single) walk, and rules which want to just query select nodes... por que no los dos? We just have no reason to have rules which run their own full walk.

@RyanCavanaugh

This comment has been minimized.

Copy link
Author

commented Jun 12, 2017

@ajafff What can be done to make something happen here? The shared walker approach can get us massive perf gains with only minimal code updates to existing rules. I don't want this to be a "licked cookie" blocked by an alternate approach that isn't making forward progress.

@karfau

This comment has been minimized.

Copy link

commented Jun 17, 2018

As a baseline, here are the times it took to run our rules on our codebase (I have manually sorted this list):

@RyanCavanaugh could you provide a hint how you created those numbers?
This is exactly what I'm looking for, but I have no clue how to go about it.

@JoshuaKGoldberg

This comment has been minimized.

Copy link
Collaborator

commented Aug 20, 2018

Hey @karfau I happened to also have this need - see...

  • async function doLinting(options: Options, files: string[], program: ts.Program | undefined, logger: Logger): Promise<LintResult> {
    (the method that does linting on all files)
  • public lint(fileName: string, source: string, configuration: IConfigurationFile = DEFAULT_CONFIG): void {
    (the function that runs linting on a file)
  • private applyRule(rule: IRule, sourceFile: ts.SourceFile): RuleFailure[] {
    (the function that runs a rule on a file)

You could create a Map<IRule, number> of times taken per rule at the top of the file, then in applyRule use process.hrtime to add to that map for each rule. Add a method to the Linter that logs the results of that, which would be called in the Runner.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
5 participants
You can’t perform that action at this time.