New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return AST from low-level programmatic API [$10 awarded] #948

Closed
RReverser opened this Issue Jun 2, 2014 · 29 comments

Comments

Projects
None yet
8 participants
@RReverser

RReverser commented Jun 2, 2014

Passing AST retrieved from different tools/preprocessors/bundlers would be really great as it would allow not to generate code and reparse output, but reuse already parsed tree which is interoperable by nature.

I believe everything that needs to be changed is here: https://github.com/eslint/eslint/blob/master/lib/eslint.js#L473-L482 (check if typeof text === 'string' and do everything as before and assign it directly to ast if it's object).

I'm aware of rules that require source code to be available, but those check formatting and code style only and so obviously do not apply for preprocessed AST (even in case when we auto-generate code from it before passing to eslint).

However, I'm not sure about rules that call getTokenBefore / getTokenAfter / etc., as most tools work on AST only and do not care about tokens array, so there may be mismatch between modified tree and original tokens. Should we (and can we) regenerate tokens array in that case from tree?

The $10 bounty on this issue has been claimed at Bountysource.

@jrajav

This comment has been minimized.

Show comment
Hide comment
@jrajav

jrajav Jun 2, 2014

Contributor

Wouldn't you get the same benefits from passing in both the source code and the AST? Then you could still apply style rules.

Contributor

jrajav commented Jun 2, 2014

Wouldn't you get the same benefits from passing in both the source code and the AST? Then you could still apply style rules.

@RReverser

This comment has been minimized.

Show comment
Hide comment
@RReverser

RReverser Jun 2, 2014

This is needed exactly for cases when you don't have a source code for modified AST and its generation takes time even when not really needed (like in ESLint which works with works with AST internally anyway); also style rules really do not make sense when applied to generated/transformed AST and code.

RReverser commented Jun 2, 2014

This is needed exactly for cases when you don't have a source code for modified AST and its generation takes time even when not really needed (like in ESLint which works with works with AST internally anyway); also style rules really do not make sense when applied to generated/transformed AST and code.

@RReverser

This comment has been minimized.

Show comment
Hide comment
@RReverser

RReverser Jun 2, 2014

Or do you mean passing original source code + modified AST? Wouldn't it break rules behavior?

RReverser commented Jun 2, 2014

Or do you mean passing original source code + modified AST? Wouldn't it break rules behavior?

@jrajav

This comment has been minimized.

Show comment
Hide comment
@jrajav

jrajav Jun 2, 2014

Contributor

What is an example use case where you would want eslint to be acting on a modified AST? I think most people see it as a tool for guaranteeing the style, correctness and maintainability of the original source, the one developers work with.

Contributor

jrajav commented Jun 2, 2014

What is an example use case where you would want eslint to be acting on a modified AST? I think most people see it as a tool for guaranteeing the style, correctness and maintainability of the original source, the one developers work with.

@RReverser

This comment has been minimized.

Show comment
Hide comment
@RReverser

RReverser Jun 2, 2014

Just one of examples - transpiled React JSX AST to JS AST. JSX AST produced by esprima-fb is not understandable by most of AST tools that rely on strict set of known node types (and esprima-fb introduces set of own ones), so it should be transpiled before passing to escodegen/estraverse/eslint/etc.

One way to do this is to use https://www.npmjs.org/package/react-tools and call React.transform(jsxCode) which parses code to JSX AST, transpiles own nodes to JS-compatible, generates JS code, then you can pass this code to eslint, which parses code back to JS AST and starts its work. Another way is to use mini-transpiler that changes only specific nodes returning modified AST.

And React JSX is not the only example of pre-transformations on an original code before it becomes valid, it's just probably more complex example that illustrates use-case.

So the question is - why not allow passing in JS AST instead of generating-and-reparsing it in cases when it's already available?

RReverser commented Jun 2, 2014

Just one of examples - transpiled React JSX AST to JS AST. JSX AST produced by esprima-fb is not understandable by most of AST tools that rely on strict set of known node types (and esprima-fb introduces set of own ones), so it should be transpiled before passing to escodegen/estraverse/eslint/etc.

One way to do this is to use https://www.npmjs.org/package/react-tools and call React.transform(jsxCode) which parses code to JSX AST, transpiles own nodes to JS-compatible, generates JS code, then you can pass this code to eslint, which parses code back to JS AST and starts its work. Another way is to use mini-transpiler that changes only specific nodes returning modified AST.

And React JSX is not the only example of pre-transformations on an original code before it becomes valid, it's just probably more complex example that illustrates use-case.

So the question is - why not allow passing in JS AST instead of generating-and-reparsing it in cases when it's already available?

@jrajav

This comment has been minimized.

Show comment
Hide comment
@jrajav

jrajav Jun 2, 2014

Contributor

So the question is - why not allow passing in JS AST instead of generating-and-reparsing it in cases when it's already available?

Maybe this isn't enough reason on it's own, but one reason why not is because it introduces extra complexity - All rules now have to worry about having access to an AST and be able to pass if there's no source code. That also creates an inconsistency: With the exact same config, you could get different results from running eslint on AST from a transpiler where source code rules would be ignored, versus running eslint on the generated JS from the same transpiler and source code. You'd get errors in the latter that weren't in the former.

Anyway, integrating eslint with any kind of transpiler (even a "mild" one like JSX) seems a little odd. Eslint is a tool meant to point out problems in Javascript source code, it just happens to use AST parsing as its vehicle. Transpiled languages are not Javascript source code, they are different languages that compile into Javascript source code. If we try to make exceptions for languages that are still mostly Javascript, we start having to deal with lots of edge cases and branched logic.

Contributor

jrajav commented Jun 2, 2014

So the question is - why not allow passing in JS AST instead of generating-and-reparsing it in cases when it's already available?

Maybe this isn't enough reason on it's own, but one reason why not is because it introduces extra complexity - All rules now have to worry about having access to an AST and be able to pass if there's no source code. That also creates an inconsistency: With the exact same config, you could get different results from running eslint on AST from a transpiler where source code rules would be ignored, versus running eslint on the generated JS from the same transpiler and source code. You'd get errors in the latter that weren't in the former.

Anyway, integrating eslint with any kind of transpiler (even a "mild" one like JSX) seems a little odd. Eslint is a tool meant to point out problems in Javascript source code, it just happens to use AST parsing as its vehicle. Transpiled languages are not Javascript source code, they are different languages that compile into Javascript source code. If we try to make exceptions for languages that are still mostly Javascript, we start having to deal with lots of edge cases and branched logic.

@nzakas nzakas added question labels Jun 2, 2014

@RReverser

This comment has been minimized.

Show comment
Hide comment
@RReverser

RReverser Jun 2, 2014

That also creates an inconsistency: With the exact same config, you could get different results

Not when you set all the rules explicitly (and set ignoring style rules).

All rules now have to worry about having access to an AST and be able to pass if there's no source code.

Most rules already use AST+tokens for checks. Even source code dependant rules mostly use range location information (like https://github.com/eslint/eslint/blob/master/lib/rules/space-after-keywords.js#L25) which is part of AST, too, so those rules won't fail as well.

it just happens to use AST parsing as its vehicle

And I believe that it is exactly what makes it great and what can make it great addition to family of AST tools that are interoperable and chainable by nature.

RReverser commented Jun 2, 2014

That also creates an inconsistency: With the exact same config, you could get different results

Not when you set all the rules explicitly (and set ignoring style rules).

All rules now have to worry about having access to an AST and be able to pass if there's no source code.

Most rules already use AST+tokens for checks. Even source code dependant rules mostly use range location information (like https://github.com/eslint/eslint/blob/master/lib/rules/space-after-keywords.js#L25) which is part of AST, too, so those rules won't fail as well.

it just happens to use AST parsing as its vehicle

And I believe that it is exactly what makes it great and what can make it great addition to family of AST tools that are interoperable and chainable by nature.

@nzakas

This comment has been minimized.

Show comment
Hide comment
@nzakas

nzakas Jun 2, 2014

Member

ESLint relies both on the AST, tokens, and the source code being used together, we can't use just an AST. Further, we rely on certain Esprima options so that AST nodes have certain properties available. Without all of this data, ESLint won't work.

So, I'm not sure we could do this in a way that would provide any real value to anyone. If you just have an AST, ESLint can't really help you.

Member

nzakas commented Jun 2, 2014

ESLint relies both on the AST, tokens, and the source code being used together, we can't use just an AST. Further, we rely on certain Esprima options so that AST nodes have certain properties available. Without all of this data, ESLint won't work.

So, I'm not sure we could do this in a way that would provide any real value to anyone. If you just have an AST, ESLint can't really help you.

@RReverser

This comment has been minimized.

Show comment
Hide comment
@RReverser

RReverser Jun 2, 2014

I see, that makes sense.

Ok, what about inverting logic then - could it be possible to return AST from eslint on a successful run (as part of programmatic API), so we wouldn't parse same code twice for linting and, then, for transformations?

RReverser commented Jun 2, 2014

I see, that makes sense.

Ok, what about inverting logic then - could it be possible to return AST from eslint on a successful run (as part of programmatic API), so we wouldn't parse same code twice for linting and, then, for transformations?

@nzakas

This comment has been minimized.

Show comment
Hide comment
@nzakas

nzakas Jun 2, 2014

Member

That's more feasible, certainly. We can take a look at that when the low-level utility is refactored.

Member

nzakas commented Jun 2, 2014

That's more feasible, certainly. We can take a look at that when the low-level utility is refactored.

@nzakas nzakas added enhancement and removed question labels Jun 2, 2014

@nzakas nzakas changed the title from Allow passing AST as input to Return AST from low-level programmatic API Jun 2, 2014

@RReverser

This comment has been minimized.

Show comment
Hide comment
@RReverser

RReverser Jun 2, 2014

Thanks, will be waiting for that and falling back to double-parsing for now. Let's leave this issue open then until it becomes possible to reuse AST.

RReverser commented Jun 2, 2014

Thanks, will be waiting for that and falling back to double-parsing for now. Let's leave this issue open then until it becomes possible to reuse AST.

@nzakas

This comment has been minimized.

Show comment
Hide comment
@nzakas

nzakas Jun 3, 2014

Member

Title has already been changed to reflect the action item.

Member

nzakas commented Jun 3, 2014

Title has already been changed to reflect the action item.

@RReverser

This comment has been minimized.

Show comment
Hide comment
@RReverser

RReverser Jul 12, 2014

Since #935 is closed - is it possible now to get AST as output?

RReverser commented Jul 12, 2014

Since #935 is closed - is it possible now to get AST as output?

@jrajav

This comment has been minimized.

Show comment
Hide comment
@jrajav

jrajav Jul 12, 2014

Contributor

No, that was a more generic issue to abstract the CLI interface to be more easily called programmatically. AST in/out is still undone.

#1025 seems related to this issue and has more recent discussion.

Contributor

jrajav commented Jul 12, 2014

No, that was a more generic issue to abstract the CLI interface to be more easily called programmatically. AST in/out is still undone.

#1025 seems related to this issue and has more recent discussion.

@nzakas nzakas added the accepted label Sep 20, 2014

@IanVS

This comment has been minimized.

Show comment
Hide comment
@IanVS

IanVS Jul 14, 2015

Member

I've posted a $10 bounty on Bountysource for this issue. Implementing this may simplify automatic configuration creation.

Member

IanVS commented Jul 14, 2015

I've posted a $10 bounty on Bountysource for this issue. Implementing this may simplify automatic configuration creation.

@willklein

This comment has been minimized.

Show comment
Hide comment
@willklein

willklein Jul 14, 2015

I don't think it would simplify the auto-configuration, but it would significantly improve its performance. Without this, we're repeatedly reading and parsing every file in a project, and doing that many times.

willklein commented Jul 14, 2015

I don't think it would simplify the auto-configuration, but it would significantly improve its performance. Without this, we're repeatedly reading and parsing every file in a project, and doing that many times.

@ilyavolodin ilyavolodin changed the title from Return AST from low-level programmatic API to Return AST from low-level programmatic API ($10) Jul 14, 2015

@ilyavolodin ilyavolodin added the bounty label Jul 14, 2015

@ilyavolodin

This comment has been minimized.

Show comment
Hide comment
@ilyavolodin

ilyavolodin Jul 14, 2015

Member

I don't think it will help with that either. Because while you might get AST from CLIEngine, you will not be able to pass it back to CLIEngine.

Member

ilyavolodin commented Jul 14, 2015

I don't think it will help with that either. Because while you might get AST from CLIEngine, you will not be able to pass it back to CLIEngine.

@ilyavolodin ilyavolodin changed the title from Return AST from low-level programmatic API ($10) to Return AST from low-level programmatic API [$10] Jul 14, 2015

@willklein

This comment has been minimized.

Show comment
Hide comment
@willklein

willklein Jul 14, 2015

@ilyavolodin, if this and #1025 were both done, would that give us the means to pass the returned AST (and source) back in for reuse?

willklein commented Jul 14, 2015

@ilyavolodin, if this and #1025 were both done, would that give us the means to pass the returned AST (and source) back in for reuse?

@ilyavolodin

This comment has been minimized.

Show comment
Hide comment
@ilyavolodin

ilyavolodin Jul 14, 2015

Member

Yes, if both would be done, it would probably help speed things up.

Member

ilyavolodin commented Jul 14, 2015

Yes, if both would be done, it would probably help speed things up.

@tqc

This comment has been minimized.

Show comment
Hide comment
@tqc

tqc Aug 7, 2015

I'm looking at implementing this based on changes made in #3217 to fix #1025. I can see a couple of possibilities for the API:

Make parse a public method

var code = new SourceCode(text, linter.parse(text, config));
doSomething(code.parseResult);
linter.verify(code, config);

Add a method similar to the existing getAllComments

linter.verify(text, config)
var code = linter.getSourceCode()
doSomething(code.parseResult);

Thoughts?

tqc commented Aug 7, 2015

I'm looking at implementing this based on changes made in #3217 to fix #1025. I can see a couple of possibilities for the API:

Make parse a public method

var code = new SourceCode(text, linter.parse(text, config));
doSomething(code.parseResult);
linter.verify(code, config);

Add a method similar to the existing getAllComments

linter.verify(text, config)
var code = linter.getSourceCode()
doSomething(code.parseResult);

Thoughts?

@nzakas

This comment has been minimized.

Show comment
Hide comment
@nzakas

nzakas Aug 11, 2015

Member

I'm not sure I understand what problem this is solving. Can you explain?

Member

nzakas commented Aug 11, 2015

I'm not sure I understand what problem this is solving. Can you explain?

@tqc

This comment has been minimized.

Show comment
Hide comment
@tqc

tqc Aug 12, 2015

@nzakas It's pretty much exactly what the title says - If you are using a tool alongside ESLint that requires an AST as input, this gives you access to the AST produced by ESLint rather than requiring you to generate a new one. Earlier comments suggest automatic configuration as a possible use case, and I can see some possibilities around automatically fixing errors where it would be quite useful.

Both of those could be accomplished with #1025 alone, but this makes it much more convenient by using the standard ESLint parse rather than requiring the caller to reimplement it.

Impact on the codebase should be fairly minimal - the first option at least can be done with only a single line change on top of #1025.

tqc commented Aug 12, 2015

@nzakas It's pretty much exactly what the title says - If you are using a tool alongside ESLint that requires an AST as input, this gives you access to the AST produced by ESLint rather than requiring you to generate a new one. Earlier comments suggest automatic configuration as a possible use case, and I can see some possibilities around automatically fixing errors where it would be quite useful.

Both of those could be accomplished with #1025 alone, but this makes it much more convenient by using the standard ESLint parse rather than requiring the caller to reimplement it.

Impact on the codebase should be fairly minimal - the first option at least can be done with only a single line change on top of #1025.

@BYK

This comment has been minimized.

Show comment
Hide comment
@BYK

BYK Aug 12, 2015

Member

@nzakas - It would allow ESLint to be part/entry point of projects like https://github.com/asterjs/aster which I think is the future.

Member

BYK commented Aug 12, 2015

@nzakas - It would allow ESLint to be part/entry point of projects like https://github.com/asterjs/aster which I think is the future.

@IanVS

This comment has been minimized.

Show comment
Hide comment
@IanVS

IanVS Aug 12, 2015

Member

I wonder, couldn't the public parse method of Espree be used to get AST?

As the README says:

var espree = require("espree");
var ast = espree.parse(code);

What is the value of returning the AST from ESLint rather than straight from whatever parser you have configured ESLint to use (Espree, babel-eslint, etc.)? This is essentially all that ESLint is doing.

Member

IanVS commented Aug 12, 2015

I wonder, couldn't the public parse method of Espree be used to get AST?

As the README says:

var espree = require("espree");
var ast = espree.parse(code);

What is the value of returning the AST from ESLint rather than straight from whatever parser you have configured ESLint to use (Espree, babel-eslint, etc.)? This is essentially all that ESLint is doing.

@tqc

This comment has been minimized.

Show comment
Hide comment
@tqc

tqc Aug 12, 2015

@IanVS True, #1025 allows that much. However, the ESLint parse isn't exactly the same as the espree parse method - It includes things like finding the right parser to use and making sure that the result will include all the components needed by verify. The call to espree will end up being a lot longer than just parse(code).

tqc commented Aug 12, 2015

@IanVS True, #1025 allows that much. However, the ESLint parse isn't exactly the same as the espree parse method - It includes things like finding the right parser to use and making sure that the result will include all the components needed by verify. The call to espree will end up being a lot longer than just parse(code).

@RReverser

This comment has been minimized.

Show comment
Hide comment
@RReverser

RReverser Aug 12, 2015

@IanVS The point is exactly not to parse code twice if ESLint does that anyway under the hood and you can just get and use it for further operations.

@BYK Thank you for mentioning aster! Didn't know someone still cares about that kind of projects ;)

RReverser commented Aug 12, 2015

@IanVS The point is exactly not to parse code twice if ESLint does that anyway under the hood and you can just get and use it for further operations.

@BYK Thank you for mentioning aster! Didn't know someone still cares about that kind of projects ;)

@nzakas

This comment has been minimized.

Show comment
Hide comment
@nzakas

nzakas Aug 12, 2015

Member

@BYK I'm not debating the usefulness, the issue has already been accepted. :)

@tqc I'm just trying to understand what you're suggesting because the two suggestions are very different in my mind. I think the second one is closer to what we want, but need to look more closely.

Member

nzakas commented Aug 12, 2015

@BYK I'm not debating the usefulness, the issue has already been accepted. :)

@tqc I'm just trying to understand what you're suggesting because the two suggestions are very different in my mind. I think the second one is closer to what we want, but need to look more closely.

@tqc

This comment has been minimized.

Show comment
Hide comment
@tqc

tqc Aug 12, 2015

@nzakas Fair enough. I saw them as quite similar - both are three lines that can replace the existing call to verify if you want more control over the AST.

The main difference would be when you get access to the AST.

Calling parse, you get the AST first and could potentially use it to update the rule settings before calling verify. It also fits in quite naturally with #1025 - I like the idea of providing a parse method as a reference - if your custom parse function doesn't return the same sort of thing, you are doing it wrong.

Calling getSourceCode will only work after calling verify. It looks a bit cleaner than the first option in some ways, but I'm not sure if that limitation is worth it.

tqc commented Aug 12, 2015

@nzakas Fair enough. I saw them as quite similar - both are three lines that can replace the existing call to verify if you want more control over the AST.

The main difference would be when you get access to the AST.

Calling parse, you get the AST first and could potentially use it to update the rule settings before calling verify. It also fits in quite naturally with #1025 - I like the idea of providing a parse method as a reference - if your custom parse function doesn't return the same sort of thing, you are doing it wrong.

Calling getSourceCode will only work after calling verify. It looks a bit cleaner than the first option in some ways, but I'm not sure if that limitation is worth it.

tqc added a commit to tqc/eslint that referenced this issue Aug 14, 2015

tqc added a commit to tqc/eslint that referenced this issue Aug 14, 2015

@nzakas nzakas closed this in fc3bc8a Aug 24, 2015

nzakas added a commit that referenced this issue Aug 24, 2015

Merge pull request #3448 from eslint/issue1025
Update: Allow pre-parsed code (fixes #1025, fixes #948)
@RReverser

This comment has been minimized.

Show comment
Hide comment
@RReverser

RReverser Aug 25, 2015

Awesome, thanks @nzakas!

RReverser commented Aug 25, 2015

Awesome, thanks @nzakas!

@nzakas nzakas changed the title from Return AST from low-level programmatic API [$10] to Return AST from low-level programmatic API [$10 awarded] Sep 17, 2015

@eslint eslint bot locked and limited conversation to collaborators Feb 7, 2018

@eslint eslint bot added the archived due to age label Feb 7, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.