Use babel-parse for import/export/jsx blocks #345

johno · 2018-12-06T14:31:55Z

This issue is intended to discuss bringing in @babel/parser to handle ES/JSX parsing.

I've discussed this a few times with @timneutkens and @ChristopherBiscardi in passing but it's something we haven't been prioritizing. Over the last week I've been thinking about it more heavily after @jescalan reported a bunch of new parse issues.

We've reached a point where most open issues with the bug label are related to parsing exports and JSX. So we ultimately need to be able to parse import/export/jsx nodes to handle the following scenarios:

* Parse out the default export since that maps to the layout
* Walk a JSX node to its completion (right now newlines aren't properly handled)
* Handle all use cases where import/export/JSX blocks aren't separated by empty lines
* Allow (eventually) for the interleaving of MDX in JSX blocks

There might be a small performance hit, but since it's build time it's not something I'm personally worried about. The big bummer will be the heavier bundle, but that really only comes into play for browser runtimes (which isn't really our targeted env anyway).

Anyone have any thoughts, insights, or FUD about this?
Is there anyone that wants to take this on?


silvenon · 2018-12-06T17:43:07Z

I really like this idea because that way we can finally gain the much needed stability. I brought it up before, but a concern for performance was raised, so I'm glad to see interest in it again.

Other than the scenarios in your list, I wonder if we could also solve parsing inline JSX with this approach. (#222)

My FUD is that we have to parse the content with two different parsers that are mutually incompatible, and I'm not sure how we can combine them in a stable way. E.g. if we're trying to solve the issue with newlines, how do we know how many lines to parse with Babel until we start parsing with remark again?

johno · 2018-12-06T17:58:27Z

> My FUD is that we have to parse the content with two different parsers that are mutually incompatible

Similarly to how we're extending the block parser currently, we'll be able to continue doing the same thing.

> if we're trying to solve the issue with newlines, how do we know how many lines to parse with Babel until we start parsing with remark again

My first naive approach might be to continue walking the document until the babel parser doesn't error. When that occurs we know we either have a completed JSX block or we've reached the end of the document and there's a syntax error.

This will become less of an issue after micromark, but that's in its very early stages so we can't really wait on it when we can get this stable now. Swapping out remark for micromark down the road won't require an API change for the end user.

silvenon · 2018-12-06T18:27:22Z

> My first naive approach might be to continue walking the document until the babel parser doesn't error. When that occurs we know we either have a completed JSX block or we've reached the end of the document and there's a syntax error.

That's a great idea, we can split the file content based on two or more consecutive newlines and keep adding those chunks until we don't get an error. Works for me!
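To make the idea concrete, here is a minimal sketch of that loop. It is not the MDX implementation: `consumeBlock` is a hypothetical name, and `toyParse` is only a stand-in for a real parser call (it merely checks that curly braces balance). The only assumption about the real parser is that it throws on incomplete input.

```javascript
// Stand-in for a real JS/JSX parser (assumption for this sketch):
// throws while curly braces are unbalanced, returns normally otherwise.
function toyParse(code) {
  let depth = 0
  for (const char of code) {
    if (char === '{') depth++
    if (char === '}') depth--
  }
  if (depth !== 0) throw new SyntaxError('Unbalanced braces')
}

// Keep appending chunks (split on two or more newlines) until `tryParse`
// stops throwing. Returns null if the end of the document is reached first.
function consumeBlock(source, tryParse) {
  // The capturing group keeps the separators at odd indexes,
  // so the exact input can be rebuilt.
  const parts = source.split(/(\n{2,})/)
  let candidate = ''

  for (let i = 0; i < parts.length; i += 2) {
    candidate += parts[i]
    try {
      tryParse(candidate)
      // Parsed cleanly: this is a complete block, the rest goes back to remark.
      return {block: candidate, rest: parts.slice(i + 1).join('')}
    } catch (error) {
      // Still incomplete: re-attach the empty line(s) and keep walking.
      candidate += parts[i + 1] || ''
    }
  }

  return null
}

const source = '<A>\n  {x => {\n\n    return x\n  }}\n</A>\n\n# Heading'
const found = consumeBlock(source, toyParse)
console.log(JSON.stringify(found.block))
```

Here the empty line inside the JSX block is kept in `found.block`, and `found.rest` still starts with the empty line that precedes the heading.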

jonsherrard · 2018-12-10T18:13:15Z

Hey @johno

I wrote an mdx-to-mdx-ast package here: https://github.com/devular/mdx-to-mdx-ast

It uses the Acorn JS parser (which Babel's parser is based on) to verify that nodes are import, export, or JSX elements.

I'm not sure if it's of any use to you, I mostly wrote it for fun and to learn a bit about the MDX spec and ASTs.

Thanks for all your work on MDX, there are so many opportunities around content authoring. An amazing library. 🙏

jonsherrard · 2018-12-10T18:15:51Z

Oh yes, I'd be happy to take this on, if people want to give me examples of:

> Handle all use cases where import/export/JSX blocks aren't separated by empty line

or any other case the mdx-ast generator should handle.

wooorm · 2018-12-10T18:20:12Z

@silvenon @johno I remember @ChristopherBiscardi tweeting somewhere that neither babel nor ~~remark~~rehype complains about invalid input, in his work to bridge Gatsby and MDX. Chris, any thoughts on this?

johno · 2018-12-10T18:48:39Z

This is amazing @jonsherrard!

We'd love for you to take this on if you're interested. Perhaps you could coordinate a bit with @wooorm on how to merge work into remark-mdx? Ideally it'd be great to get all the special MDX parsing to live together in there and consumed by MDX core itself.

> neither babel nor rehype complains about invalid input

😞. I imagine that means we need to walk the AST to ensure that the JSX block is closed then? Even if it doesn't complain we can ensure that the block is closed/self-closing.

jonsherrard · 2018-12-10T19:01:09Z

Hi @johno, yeah I'd love to. @wooorm where's the best place to start (if you're happy to!)

Also @johno - what's the issue with invalid input? I'm not sure I've grokked the invalid cases. Do you have an example?

johno · 2018-12-10T19:20:49Z

This evening I can start a branch with failing tests that we can use for you to dev against. But for now, here are a few off the top of my head:

1. parsing out exports

The following breaks because we don't find the export, since we expect an empty line before the export.

import Foo from './bar'
export default Foo

2. allowing other types of exports

We currently error when encountering exports like this, but we should really handle them properly:

export { default } from './foo'

3. empty lines in JSX blocks

The following won't properly parse because of the empty line before the return:

<Component>
  {renderProps => {
    const data = doStuffWith(renderProps)

    return (
      <Graph data={data} />
    )
  }}
</Component>

4. JSX blocks that contain numbers are ignored

<Component2 />

Related issues

jonsherrard · 2018-12-10T19:36:50Z

Cheers!

1. Acorn will parse it into two ES expressions, one import and one export, so I just need to find a unist utility that allows me to replace a single AST node with two nodes at the same depth ☑️
2. Works as expected if I add it to my fixture with acorn ✅
3. I'll take a look at prettier's recent additions to see how they group funky whitespace like that. 🤔
4. Works as expected if I add it to my fixture with acorn ✅

Bonus. Supports variable declaration named exports as well: #341

export const meta = {
  title: "Testing 123",
  description: "This is a default description"
};

On bundlesize: Bundlephobia reports 95kb for Acorn and Acorn-JSX 🤔

ChristopherBiscardi · 2018-12-10T22:46:19Z

@wooorm @johno Probably referring to these tests: ChristopherBiscardi/gatsby-mdx@97c4e36 where I was having trouble figuring out which combination of rehype, babel, etc would fail for different combinations of content with the intent of figuring out a way to differentiate jsx from html and convert html to jsx.

If you look at the PR, I switched to an approach that just touches attributes and handles capitalization/to-object on styles through babel after the entire mdx process runs, rather than trying to run on segments of code inside the AST. Not sure any of that will help here.

jonsherrard · 2018-12-11T01:00:13Z

Update so far, I'm still in the playground phase over at: https://github.com/devular/mdx-to-mdx-ast

> Handle all use cases where import/export/JSX blocks aren't separated by empty lines

Remark

remark-parse will return

import Foo from './foo'
export default Foo

as a single MDAST node.

Acorn

Acorn will parse the same example into two JavaScript expressions represented by the Acorn AST.
We now have two robust AST nodes (albeit Acorn AST nodes) that represent the two declarations.
Using Unist Flatmap, we can return multiple nodes to our new MDXAST tree from a single node of our MDAST tree.
I did think about constructing the new `import Foo from './foo'` by hand using the object, but thought better of it and installed escodegen to robustly create ES-compatible code.

So now we have two solid solutions:

-> Acorn parsing of strings to a JS AST
<- ESCodegen to create strings of ES2015 JavaScript for our MDXAST

You can see this handled here: https://github.com/devular/mdx-to-mdx-ast/blob/master/handle-multi-program-nodes.js
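For readers who don't want to open the repo, the flat-map step can be sketched roughly like this. This is not the code from the linked file: `flatMapTree` is a hypothetical helper, the nodes are simplified to carry a plain `value` string, and the naive line split stands in for the statement list a real parser such as Acorn would return.

```javascript
// Rebuild a unist-style tree, letting `fn` replace each node with zero or
// more nodes at the same depth. The root itself is kept.
function flatMapTree(tree, fn) {
  function visit(node) {
    const next = node.children
      ? {...node, children: node.children.flatMap(visit)}
      : node
    return fn(next)
  }

  return {...tree, children: (tree.children || []).flatMap(visit)}
}

// Simplified input: one node holding both statements, as remark would group them.
const tree = {
  type: 'root',
  children: [
    {type: 'paragraph', value: "import Foo from './foo'\nexport default Foo"},
    {type: 'heading', value: 'Hello'}
  ]
}

const result = flatMapTree(tree, node => {
  const isEsBlock = node.type === 'paragraph' && /^(import|export)\s/.test(node.value)
  if (!isEsBlock) return [node]

  // Assumption: one statement per line. A real parser would give us the
  // statements (and their offsets) instead of this split.
  return node.value.split('\n').map(line => ({
    type: line.startsWith('import') ? 'import' : 'export',
    value: line
  }))
})

console.log(result.children.map(child => child.type))
```

The single paragraph node becomes an `import` node and an `export` node that sit next to the untouched `heading` node.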

Reservations

Using https://github.com/zeit/ncc, the bundle is a whopping 287kb, now that we've added acorn, acorn-jsx, and escodegen. There may be optimisations here through tree-shaking, or potentially this is a point where I look at babel again. It might have codegen and JS parsing built in, and most projects will have it installed already.

I'd like to use this in the browser at some point... that's pretty huge - but I was thinking for a CMS - slightly more palatable.

There's now a lot going on in here: JS parsing, JS codegen, Markdown parsing, and remark/rehype plugin execution... It feels like a lot, but that's maybe how robust it needs to be.

* Improve import and export parsing with babel

  This introduces babel for better handling import and export parsing. Anything that is an import/export block will be parsed by babel and split up ensuring that edge cases will no longer occur based on how the JS is grouped. It also makes parsing the export default nodes more robust.

  - #345
  - #340
* Add multiline fixtures
* Preserve positional information for AST nodes

  In order to maintain positional metadata for AST nodes that are parsed, we need to ensure that each import/export is eaten in order as a correct subvalue. Since the remark eat function requires the string to be consumed in order, with no missing characters, the babel plugin now returns the start value which is used to sort the nodes and partition the input string. They're then combined again and returned as nodes that can be eaten in order by the tokenizer. This also comes with the benefit of no longer changing the input code which occurs with the prior babel generator.
* Remove unneeded babel generator dep
* Simplify partition function
* Make naming consistent
* Add snapshot test to verify positional info
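The sort-and-partition step described in that commit message could look roughly like the sketch below. It is an illustration under assumptions, not the code from the PR: `partition` and the `{type, start}` record shape are hypothetical, with `start` being the character offset reported for each statement.

```javascript
// Split `value` into consecutive slices, one per node, so that joining the
// slices gives back the input exactly (nothing skipped, nothing reordered).
function partition(value, nodes) {
  const sorted = [...nodes].sort((a, b) => a.start - b.start)

  return sorted.map((node, index) => {
    // The first slice starts at 0 so leading characters are not dropped.
    const from = index === 0 ? 0 : node.start
    const next = sorted[index + 1]
    const to = next ? next.start : value.length
    return {type: node.type, value: value.slice(from, to)}
  })
}

const value = "import Foo from './foo'\nexport default Foo\n"

// Deliberately out of order, as a plugin traversal might report them.
const pieces = partition(value, [
  {type: 'export', start: value.indexOf('export')},
  {type: 'import', start: 0}
])

console.log(pieces.map(piece => piece.type))
console.log(pieces.map(piece => piece.value).join('') === value)
```

Because each slice runs up to the next node's start, the trailing newline stays attached to the statement before it, which is what lets a tokenizer consume the pieces in order without gaps.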

johno · 2019-03-06T23:14:22Z

import/export parsing has landed in the alpha release, so we can track block parsing in #195

johno added the 💬 type/discussion This is a request for comments label Dec 6, 2018

giladv mentioned this issue Dec 30, 2018

contents of multi line strings get parsed as markdown #357

Closed

johno mentioned this issue Feb 12, 2019

Improve import and export parsing with babel #399

Merged

johno pinned this issue Feb 15, 2019

johno added the 💎 v1 Issues related to v1 label Feb 26, 2019

johno self-assigned this Feb 26, 2019

johno closed this as completed Mar 6, 2019

johno unpinned this issue Mar 6, 2019

johno mentioned this issue Jul 11, 2019

RFC: Interleaving Markdown in JSX #628

Closed
