Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sourceType 'function' #6671

Closed
babel-bot opened this issue Oct 9, 2017 · 11 comments
Closed

Add sourceType 'function' #6671

babel-bot opened this issue Oct 9, 2017 · 11 comments
Labels
imported outdated A closed issue/PR that is archived due to age. Recommended to make a new issue pkg: parser

Comments

@babel-bot
Copy link
Collaborator

Issue originally reported by @esprehn in babel/babylon#752

We have snippets of JS supplied by content creators that should run inside a function and which we pass into new Function(transform(code)), but babel will only let us parse starting at either a top level file or module instead of inside a function. This manifests in an error about return statements which we can suppress, but that's not really what we want since we want it to parse based on the rules of new Function not invalid top level file syntax.

@babel-bot
Copy link
Collaborator Author

Comment originally made by @loganfsmyth

There is currently a allowReturnOutsideFunction option that should satisfy this usecase already I think.

@babel-bot
Copy link
Collaborator Author

Comment originally made by @esprehn

I mention that in the bug report, but that's not what I want. I want the parser to be in the mode that the spec says new Function(value) is parsed in. Is that what allowReturnOutsideFunction enables?

@babel-bot
Copy link
Collaborator Author

Comment originally made by @esprehn

Specifically that algorithm is this:
https://www.ecma-international.org/ecma-262/8.0/#sec-createdynamicfunction

which can parse as either FunctionBody, GeneratorBody (allowing yield) or AsyncFunctionBody (allowing await).

So you probably need 3 source types, one for each type of function.

@babel-bot
Copy link
Collaborator Author

Comment originally made by @loganfsmyth

I mention that in the bug report

I don't see that in there?

I want the parser to be in the mode that the spec says new Function(value) is parsed in.

Fair enough. I don't think this would be sourceType-related though. We'd probably just want a general context: 'program' | 'function' | 'async' | 'generator' option or something if we wanted to do this, since those can all occur both in modules and in scripts.

That said, is there any reason you wouldn't just actually wrap your code in a function before parsing it, then just parse it as a normal file?

@babel-bot
Copy link
Collaborator Author

Comment originally made by @esprehn

I mention that in the bug report

I don't see that in there?

Sorry I should have been more clear, that's what I meant by "an error about return statements which we can suppress, but that's not really what we want".

I want the parser to be in the mode that the spec says new Function(value) is parsed in.

Fair enough. I don't think this would be sourceType-related though. We'd probably just want a general context: 'program' | 'function' | 'async' | 'generator' option or something if we wanted to do this, since those can all occur both in modules and in scripts.

Yeah that would be fine. In the same way there's parseExpression, parseFunction would be nice.

That said, is there any reason you wouldn't just actually wrap your code in a function before parsing it, then just parse it as a normal file?

For the same reasons we no longer do that in web browsers for inline event handlers, and the same reason the spec doesn't do that. It means invalid code will be parsed as valid and things can confusingly escape from the enclosed function. That's the reason to use new Function(...argNames, body) over:

  eval(`( ${argNames.join(',')} ) => { ${body} }`)

Since a body like '} codeOutsideFunction()<!--' or an invalid argument name, or more accidental things things like an extra brace on the end of an if block can early terminate the function and produce confusing, buggy, (or security problematic) behavior.

Trying prevent all of those situations leads you down a path of trying to parse the code with regexes and wrap it with progressively more complex wrappers (we tried all this in Chrome, it was a bad idea :)) The reason to use a parser like Babel is to avoid having to ever parse or interact with the code as a string and instead make sure to interact with it always as an object structure as defined by the spec.

I suppose an alternative would be to add analogs to allowReturnOutsideFunction for both yield and async but that feels a lot hackier than babylon.parseFunction(body, { kind: "async" }).

@babel-bot
Copy link
Collaborator Author

Comment originally made by @nicolo-ribaudo

I'm strongly in favor of what Logan proposed in babel/babylon#341 (comment).

I already have a working implementation of it since I used it a while ago in a personal project, so I can open a PR, if wanted.

@babel-bot
Copy link
Collaborator Author

Comment originally made by @bakkot

@esprehn

It means invalid code will be parsed as valid and things can confusingly escape from the enclosed function.

For this particular case, you can just assert on the shape of the parsed code, surely:

function parseFunctionBody(src, isGenerator, isAsync) {
  const fn = parseExpression('(' + (isAsync ? 'async ' : '') + 'function ' + (isGenerator ? '*' : '') + '(){' + src + '\n})');
  if (fn.type !== 'FunctionExpression') {
    throw new SyntaxError('error');
    // you can also walk the tree to find the location of the leftmost function expression and throw an error with location pointing at its closing brace
  }
  return fn.body;
}

I suppose this doesn't give you precise location information for syntax errors which occur after an extra } (i.e., it would point to those errors rather than the brace). But other than that I don't see any particular downsides; it doesn't have the "invalid code will be parsed as valid" problem, at least, I don't think. I don't know see why you'd have to get any more complicated with wrappers and regexes than this.

Which isn't to say I'm opposed to something more in the API, but I'd like to better understand why you don't want to go with this approach.

@babel-bot
Copy link
Collaborator Author

Comment originally made by @esprehn

We don't want to go that approach because argument names are still passed in as raw strings so we'd need to validate those, plus we'd need to validate that there's only one function in the list (since you could escape and end up with two). I suppose we could parse it with no function names, pull the function body out, serialize just that code, and then pass it down through new Function. It's getting uglier and uglier though.

Overall this means abandoning the .transform() abstraction and implementing a bunch of our own parsing logic which is unfortunate since these parse steps are both used by the browser for things like setTimeout(string) and in the spec for things like new Function().

I'm surprised you're so opposed to adding the same API the spec has. Being able to do new Function("a", "b", babel.transform(...)) is way better than eval(), it seems reasonable to me that babel should support that use case. :)

(Also fwiw having this API makes much more sense to me than something like allowReturnOutsideFunction which has nothing to do with the spec, and for which most use cases probably want the parsing rules for a function.)

@babel-bot
Copy link
Collaborator Author

Comment originally made by @bakkot

@esprehn

We don't want to go that approach because argument names are still passed in as raw strings so we'd need to validate those

That seems like it would still be true with or without the proposed API, no? An API for parsing a function body isn't going to help you with parameter names.

plus we'd need to validate that there's only one function in the list (since you could escape and end up with two). I suppose we could parse it with no function names, pull the function body out, serialize just that code, and then pass it down through new Function. It's getting uglier and uglier though.

That's pretty much what I've proposed above, yes. It doesn't actually seem all that ugly to me.

I'm surprised you're so opposed to adding the same API the spec has.

Like I say, I'm not necessarily opposed. My main concern is, well, there are a lot of different contexts which one could want something parsed in - function body, expression inside or outside an async function, loop body, case body, parameter, class body, etc. I don't think all of these should be added; I'd be a lot happier with an API that satisfied all of these cases, including possibly others we haven't thought of. But when I think about what that would look like, I end up thinking that the existing APIs + wrappers like the one I've proposed above seem to already satisfy it.

Wrapping your input in something which gives you the context you want and then pulling out the subtree you care about, while asserting that the full parse results in a node of the type you're expecting (so you know it didn't accidentally escape the context you wanted), seems like a fully general solution to the problem of "parse this thing in this context". With small helpers to handle serialization you'd still get all the power of .transform, I think.

(Also, despite my "member" flair, I'm largely just a bystander here, commenting only from my experience on other parsers and with the spec. Don't put too much wait on my opinions.)

@babel-bot
Copy link
Collaborator Author

Comment originally made by @esprehn

new Function deals with parameter names for you, we never parse them. :)

@xtuc
Copy link
Member

xtuc commented Nov 6, 2018

Closing, out of scope for now.

@xtuc xtuc closed this as completed Nov 6, 2018
@lock lock bot added the outdated A closed issue/PR that is archived due to age. Recommended to make a new issue label Feb 5, 2019
@lock lock bot locked as resolved and limited conversation to collaborators Feb 5, 2019
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
imported outdated A closed issue/PR that is archived due to age. Recommended to make a new issue pkg: parser
Projects
None yet
Development

No branches or pull requests

2 participants