New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support @babel/template Placeholders at the Syntax Level in @babel/parser #9112

Open
tolmasky opened this Issue Dec 2, 2018 · 2 comments

Comments

Projects
None yet
3 participants
@tolmasky
Contributor

tolmasky commented Dec 2, 2018

Feature Request

Problem

Currently @babel/template allows you to supply a placeholderPattern that is applied after the successful parse of the template string. What this means is that it is impossible to create a pattern to unambiguously separate placeholders from common identifiers and string literals. This is fine if you are creating a template literal yourself, but makes things difficult if the templating is happening on user input where you have to communicate these restrictions. If you try a pattern like /^@[a-z]+@$/ the parse will simply fail before it even gets to the replacement phase since @x@ is not valid JavaScript syntax.

To concisely state the problem and its implications:

  1. It is not currently possible to construct placeholder patterns that unambiguously separate placeholders from normal JavaScript identifiers.
  2. The default pattern is too loose and too often catches cases not intended to be Placeholders.
  3. The current state leads to a "forever maintenance" burden for either maintainers or users as new valid global identifiers are introduced to JavaScript runtimes that can be inadvertently caught by the matcher. The new URL constructor is a perfect example of this. While one could safely assume URL is a replacement in the past, it is now built-in on the web and on node. This means that something like const NAME = new URL() would surprisingly throw an error. The API then becomes immediately harder to use as you have to look through documentation to fix this. You have no option but to employ a very verbose solution for this very simple snippet because you need to use the full capital constructor. The options are: 1) provide a whitelist containing URL, 2) come up with a new pattern, or 3) pass in URL for URL. All three significantly increase the noise of the operation for a built-in type.
  4. Strings are particularly confusing to deal with in this implementation as seen here.
  5. This often leads to fracturing of templating patterns in the community as people try to solve the above problems by coming up with their own more restrictive placeholderPatterns.
  6. It makes it hard (impossible?) to use babel templates for arbitrary user input as opposed to a developer's static templates.

Placeholder AST Nodes Solution

It would be nice to add a new node type,Placeholder, to @babel/parser with the following shape:

interface Placeholder {
    type: "Placeholder";
    name: string;
}

This node can appear anywhere an Expression, Statement (and possibly Pattern?) can. A more rigorous version would have a specific Placeholder${Type} analog for each specific subtype of Expression and Statement (for example, PlaceholderBlockStatement to represent placeholders where BlockStatements are expected, etc.). However, this is largely an (IMO) unnecessary implementation detail as the given replacement can be validated against the parent's NODE_FIELDS. So for example:

function replace(nodePath, field, node)
{
    if (NODE_FIELDS[nodePath.node.type][field].validate(node)) {
        nodePath.node[field] = node;
    }
}

I imagine this is probably how the current templating system handles this anyways.

Similarly, we would extend the Statement and Expression, etc. parse rules to support a new, currently invalid, JavaScript syntax. For example:

Placeholder ::
    @ IdentifierName @

Expression ::
    ...
    Placeholder

We then would create an additional option in @babel/parse called allowsPlaceholders which would turn on support for Placeholder parsing. Placeholders would then of course be parsed into Placeholder AST Nodes, which can then be easily identified in the AST and replaced. I am not tied to the @@ syntax, just throwing it out as a quick example.

With a system like this, you get the following benefits

  1. A templating system that is not dependent on the input template like the current one. Instead the entire community can standardize against one placeholder pattern since it works with any JavaScript source.

  2. As in my case, it allows to safely rely on @babel/template as my templating system without worrying about user input or having to specify rules to the user.

  3. There is no longer any need to handle string contents, which currently surprises people, like in this bug. This should also simplify the implementation because strings don't need to be special cased. The reason for this is that Placeholders can naturally be used with template literals to achieve the same effect syntactically:

    const message = `Hi ${@name@}!`;

    As you can see, the feature naturally supports the string case, and is much clearer to boot.

  4. We can support more expressive templating. For example, the string class A DEFINITION with the expectation of a block would work fine, which currently does not. We may decide we don't want to support this of course, but that decision would be based entirely on our specific desires and not merely a side-effect of the fact that the string class A DEFINITION doesn't pass initial parse.

Teachability, Documentation, Adoption, Migration Strategy

I think the documentation would be greatly simplified with this solution. A lot of the documentation currently revolves around options like placeholderWhitelist and placeholderPattern that have little use outside of accounting for the short comings of the valid identifier-name string matching strategy currently employed.

@babel-bot

This comment has been minimized.

Collaborator

babel-bot commented Dec 2, 2018

Hey @tolmasky! We really appreciate you taking the time to report an issue. The collaborators
on this project attempt to help as many people as possible, but we're a limited number of volunteers,
so it's possible this won't be addressed swiftly.

If you need any help, or just have general Babel or JavaScript questions, we have a vibrant Slack
community that typically always has someone willing to help. You can sign-up here
for an invite.

@loganfsmyth

This comment has been minimized.

Member

loganfsmyth commented Dec 2, 2018

I'd definitely be in favor of supporting an official placeholder syntax as a plugin for the parser and having @babel/template use it. I don't personally prioritize user-provided template code via @babel/template so I'd be curious to hear more about your exact usecase, but it does seem like something that could be addressed as a side effect of this effort, if you wanted to set placeholderPattern: false yourself and rely on these new placeholder nodes instead.

This node can appear anywhere an Expression, Statement (and possibly Pattern?) can.

If we're going through the effort of supporting this, I'd probably want to make this as broad as possible. One case that motivated the addition of string patterns is import foo from 'FOO'; so I'd probably want to support import foo from @FOO@; for instance.

A more rigorous version would have a specific Placeholder${Type} analog for each specific subtype of Expression and Statement (for example, PlaceholderBlockStatement to represent placeholders where BlockStatements are expected, etc.).

It seems like as long as you have the context of parent nodes, you'd be able to know what type it should be. On the other hand, it could be nice to have these specific types so that tooling could have rough type information without needing intimate knowledge of ancestor structure.

Similarly, we would extend the Statement and Expression, etc. parse rules to support a new, currently invalid, JavaScript syntax. For example:

One difficulty that we'd have to decide here is how to handle @FOO@ ;. Is that an placeholder statement followed by an empty statement, or an ExpressionStatement with a .expression placeholder node. I'm not sure if we'd want to have slightly different grammars for expression-like and statement-like patterns.

We can support more expressive templating.

I find this example particularly compelling, especially taking into account using something like what you mentioned on Slack as(wrong person :P) pattern-matching an existing AST, e.g.

var { NAME: name, BODY: body } = match(ast, `class @NAME@ @BODY@;`);

as a way to pull the body metadata out of an AST without manually walking the AST.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment