Bug when using MDXRenderer + Large Markdown Files + PrismJS #411

Closed
rwieruch opened this issue Jul 2, 2019 · 9 comments

@rwieruch

commented Jul 2, 2019

For the last half year I have been tinkering with a new blog written in Gatsby to replace my Hugo website generator. Since it is a technical blog, I use PrismJS and MDX. I am finally bringing all my content over, but I hit a roadblock when I added my first very long blog post to Gatsby.

TLDR: Large markdown files with MDX (EDIT: and PrismJS) cause Gatsby to crash.


Problem

It all started with the following output on the command line during gatsby develop:

[BABEL] Note: The code generator has deoptimised the styling of undefined as it exceeds the max of 500KB.

It appears several times while the website starts.

When I visit the website, I see this output on the screen:

Screenshot 2019-07-02 at 10 27 35

If I open the developer tools console, I see this output multiple times:

Uncaught SyntaxError: Unexpected token export
    at new Function (<anonymous>)
    at mdx-renderer.js:31
    at mountMemo (react-dom.development.js:13460)
    at Object.useMemo (react-dom.development.js:13669)
    at useMemo (react.development.js:1492)
    at Object.wrappedHook [as useMemo] (react-hot-loader.development.js:2493)
    at MDXRenderer (mdx-renderer.js:15)
    at renderWithHooks (react-dom.development.js:12939)
    at mountIndeterminateComponent (react-dom.development.js:15021)
    at beginWork (react-dom.development.js:15626)
    at performUnitOfWork (react-dom.development.js:19313)
    at workLoop (react-dom.development.js:19353)
    at HTMLUnknownElement.callCallback (react-dom.development.js:150)
    at Object.invokeGuardedCallbackDev (react-dom.development.js:200)
    at invokeGuardedCallback (react-dom.development.js:257)
    at replayUnitOfWork (react-dom.development.js:18579)
    at renderRoot (react-dom.development.js:19469)
    at performWorkOnRoot (react-dom.development.js:20343)
    at performWork (react-dom.development.js:20255)
    at performSyncWork (react-dom.development.js:20229)
    at requestWork (react-dom.development.js:20098)
    at scheduleWork (react-dom.development.js:19912)
    at Object.enqueueSetState (react-dom.development.js:11170)
    at JSONStore../node_modules/react/cjs/react.development.js.Component.setState (react.development.js:335)
    at JSONStore._this.handleMittEvent (json-store.js:40)
    at mitt.es.js:58
    at Array.map (<anonymous>)
    at Object.emit (mitt.es.js:58)
    at r.<anonymous> (socketIo.js:56)
    at r.emit (index.js:83)
    at r.onevent (index.js:83)
    at r.onpacket (index.js:83)
    at r.<anonymous> (index.js:83)
    at r.emit (index.js:83)
    at r.ondecoded (index.js:83)
    at a.<anonymous> (index.js:83)
    at a.r.emit (index.js:83)
    at a.add (index.js:83)
    at r.ondata (index.js:83)
    at r.<anonymous> (index.js:83)
    at r.emit (index.js:83)
    at r.onPacket (index.js:83)
    at r.<anonymous> (index.js:83)
    at r.emit (index.js:83)
    at r.onPacket (index.js:83)
    at r.onData (index.js:83)
    at WebSocket.ws.onmessage (index.js:83)

Reproduction

I copied the blog post's content into different starter packs until I narrowed it down to MDX. For instance, it works in gatsby-starter-blog. However, when I use it in my gatsby-MDX-starter-blog, it crashes again, in the same way as in my new Gatsby blog.

  1. I started a branch for my gatsby-MDX-starter-blog project to have a minimal reproduction of the bug.

  2. In order to exclude styled-components as the troublemaker (see gatsbyjs/gatsby#15205 (comment)), I removed it in this commit on the branch.

  3. Then I added part of the long blog post (commit), stopping short of the point where the bug appears. It still works.

  4. I added the remaining parts of the blog post (commit), which triggers the bug. I am not sure whether there is a clear threshold at which it breaks for everyone, but for me it breaks beyond 1590 lines of markdown.


How to fix it?

  1. I tried to use https://www.gatsbyjs.org/packages/gatsby-plugin-no-sourcemaps/ out of desperation, but it didn't help.

  2. I set NODE_OPTIONS=--max_old_space_size=4096 but it didn't help.

  3. I removed styled-components (see Reproduction, step 2), but it didn't help.

  4. I removed MDX, which helped, but I want to keep it.

Any help is super much appreciated, because I have the feeling that 6 months of work for my new blog with Gatsby went down for nothing, since I struggle with the problem for the last 24 hours. Really appreciate all the things that are possible with MDX now. Hopefully we can find a fix for it. 👍


My Dependencies

  "dependencies": {
    "@mdx-js/mdx": "^1.0.21",
    "@mdx-js/react": "^1.0.21",
    "core-js": "^2.5.7",
    "gatsby": "^2.12.0",
    "gatsby-image": "^2.2.3",
    "gatsby-link": "^2.2.0",
    "gatsby-mdx": "^0.6.3",
    "gatsby-plugin-catch-links": "^2.1.0",
    "gatsby-plugin-manifest": "^2.2.0",
    "gatsby-plugin-offline": "^2.2.0",
    "gatsby-plugin-react-helmet": "^3.1.0",
    "gatsby-plugin-sharp": "^2.2.1",
    "gatsby-plugin-styled-components": "^3.1.0",
    "gatsby-remark-copy-linked-files": "^2.1.0",
    "gatsby-remark-images": "^3.1.2",
    "gatsby-remark-prismjs": "^3.3.0",
    "gatsby-source-filesystem": "^2.1.1",
    "gatsby-transformer-remark": "^2.5.0",
    "gatsby-transformer-sharp": "^2.2.0",
    "prismjs": "^1.16.0",
    "react": "^16.8.6",
    "react-dom": "^16.8.6",
    "react-helmet": "~5.2.1",
    "react-youtube": "^7.9.0"
  },
@rwieruch

Author

commented Jul 5, 2019

What I tried next:

Project still runs! So one would assume it's not related to MDX.

Project shows same Babel output as seen above.

So I thought PrismJS would be the problem. But then I tried my long-read blog post in https://github.com/gatsbyjs/gatsby-starter-blog and added PrismJS there again. No Babel output. I even made the blog post 4 times longer and it continued to work.

So the problem must be related to PrismJS which is used within MDX. If MDX is not there, PrismJS performs well.

@ChristopherBiscardi would it be possible to find the culprit within gatsby-mdx (see error output above) or is this related to MDX core? Any help would be super much appreciated!

@rwieruch

Author

commented Jul 5, 2019

Forget the last comment... It works in the MDX starter (except that the Babel 500KB output still shows up).

It only happens because I use MDXRenderer in my project (see Reproduction in the first comment), whereas the MDX starter only passes children to the Layout. If I exclude PrismJS from my project, it works as well. So PrismJS somehow alters the code passed to MDXRenderer in a way that MDXRenderer cannot handle.

@rwieruch changed the title from "[BABEL] Note: The code generator has deoptimised the styling of undefined as it exceeds the max of 500KB." to "Error in MDXRenderer + Large Markdown Files + PrismJS" on Jul 5, 2019

@rwieruch changed the title from "Error in MDXRenderer + Large Markdown Files + PrismJS" to "Bug when using MDXRenderer + Large Markdown Files + PrismJS" on Jul 5, 2019

@johno

Collaborator

commented Jul 5, 2019

Thank you for the detailed bug report and updates! I'm gonna dive into this a bit today and see what I can dig up.

My initial hunch for the Babel warning is that the HTML that gatsby-remark-prismjs injects ends up causing the transpiled JSX to be too large for Babel's readable output (which might be inevitable for any very large MDX file).

The crashing is much more concerning to me.

> Any help is super much appreciated, because I have the feeling that 6 months of work for my new blog with Gatsby went down for nothing, since I struggle with the problem for the last 24 hours. Really appreciate all the things that are possible with MDX now. Hopefully we can find a fix for it.

We'll find a fix! ❤️

@rwieruch

Author

commented Jul 5, 2019

> My initial hunch for the Babel warning is that the HTML that gatsby-remark-prismjs injects ends up causing the transpiled JSX to be too large for Babel's readable output (which might be inevitable for any very large MDX file).

Yes. I think PrismJS blows it up in the end. If I output everything that goes through MDXRenderer, I get large chunks like [0], and the whole thing comes out as an 802 KB string [1].

If I remove several PrismJS line highlights, it's possible to render it again.

> We'll find a fix! ❤️

❤️ I am there to help as well if I can do anything! Didn't dive too much into Babel's implementation details yet though 😅 Thank you so much for digging into this. Didn't expect this to be a blocker, but perhaps it's good to have an edge case like this to work on. This will fix any "large markdown file"-issue for future generations 😄


[1]

Screenshot 2019-07-05 at 19 01 58

[0]

    }), ";"), "\n", mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token keyword"
    }), "import"), " React ", mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token keyword"
    }), "from"), " ", mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token string"
    }), "'react'"), mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token punctuation"
    }), ";"), "\n", mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token keyword"
    }), "import"), " React ", mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token keyword"
    }), "from"), " ", mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token string"
    }), "'react'"), mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token punctuation"
    }), ";"), "\n", mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token keyword"
    }), "import"), " React ", mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token keyword"
    }), "from"), " ", mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token string"
    }), "'react'"), mdx("span", _extends({
        parentName: "code"
    }, {
        "className": "token punctuation"
    }), ";")))));
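To put rough numbers on the output in [0]: each highlighted token turns into one `mdx("span", ...)` call, so a six-character keyword costs many times its own length in compiled code. The sketch below models that amplification. The helper `emitTokenCall` is hypothetical and only imitates the shape of the output above, and the 500,000-character figure is taken from the "max of 500KB" in the Babel note, so treat the result as an estimate rather than a measurement.

```javascript
// Rough model of the output in [0]: every Prism token becomes one
// mdx("span", ...) call in the compiled MDX body.
// Hypothetical helper for illustration; not part of MDX or gatsby-plugin-mdx.
function emitTokenCall(tokenType, text) {
  const className = JSON.stringify(`token ${tokenType}`);
  return (
    `mdx("span", _extends({ parentName: "code" }, ` +
    `{ "className": ${className} }), ${JSON.stringify(text)})`
  );
}

// Assumption: the limit named in the Babel note ("max of 500KB"),
// counted here in characters.
const BABEL_NOTE_LIMIT = 500000;

const call = emitTokenCall("keyword", "import");
const amplification = call.length / "import".length;
const tokensToLimit = Math.ceil(BABEL_NOTE_LIMIT / call.length);

console.log(call);
console.log(`about ${amplification.toFixed(1)}x larger than the source token`);
console.log(`about ${tokensToLimit} such tokens reach the limit`);
```

A long post with many code blocks can plausibly contain that many tokens, which fits the observation that removing some highlighted lines makes the page render again.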
@johno

Collaborator

commented Jul 5, 2019

As a quick update I've been able to track down where things go wrong. It's indeed a bug in gatsby-plugin-mdx since we make some assumptions about the format of the transpiled JSX. When Babel deopts styling it turns out those assumptions no longer hold true 🤕.

So, I'm going to start work on a Babel plugin to address this issue. I should, hopefully, have something together soon!

> Didn't expect this to be a blocker, but perhaps it's good to have an edge case like this to work on. This will fix any "large markdown file"-issue for future generations 😄

Yep! There are a few things all coming together to cause this edge case to happen, but now we can fix it for good. Thanks for your patience and understanding.

johno added a commit to johno/gatsby that referenced this issue Jul 5, 2019

fix(gatsby-plugin-mdx): Use babel plugin to remove export keyword
For very large MDX documents babel will deopt styling. This
results in variations in whitespace that can't be handled
by the original regex for stripping the export keyword.

This replaces that functionality with a plugin.

- ChristopherBiscardi/gatsby-mdx#411
- https://github.com/mdx-js/mdx#618
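
The failure mode described in this commit message can be reproduced in isolation. In the sketch below, `STRIP_EXPORT` is a hypothetical whitespace-sensitive pattern standing in for the original regex (the real pattern in gatsby-plugin-mdx may differ), and the two input strings are illustrative stand-ins for pretty and compact Babel output. When the pattern misses, the `export` keyword survives and `new Function` rejects the code, which matches the `Unexpected token export` error reported at `mdx-renderer.js`.

```javascript
// Hypothetical whitespace-sensitive pattern, standing in for the original regex.
// It only matches when `export default` starts a line.
const STRIP_EXPORT = /^export default /m;

function stripExportWithRegex(code) {
  return code.replace(STRIP_EXPORT, "return ");
}

// Returns true if the string parses as a function body.
// `new Function` only compiles the code here; it is never called.
function isValidFunctionBody(code) {
  try {
    new Function(code);
    return true;
  } catch (error) {
    if (error instanceof SyntaxError) return false;
    throw error;
  }
}

// Illustrative stand-ins for pretty and compact generator output.
const pretty = "const a = 1;\nexport default function MDXContent() {}";
const compact = "const a=1;export default function MDXContent(){}";

console.log(isValidFunctionBody(stripExportWithRegex(pretty)));
console.log(isValidFunctionBody(stripExportWithRegex(compact)));
```

The pretty input loses its `export` and compiles, while the compact input is left unchanged and fails to compile.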
@johno

Collaborator

commented Jul 5, 2019

I've got a PR open in Gatsby to fix the error. Though I'm also noticing that the actual code blocks are also missing proper whitespace formatting:

image

I'm wondering if this also has to do with babel deopting?


Something to also consider @rwieruch is to use prism-react-renderer directly to avoid some of these issues. Since quite a few of these posts are so long and code-block heavy, you'd likely see a smaller bundle size, since MDXProvider composition can be code split to save on large Prism output in the document.

@rwieruch

Author

commented Jul 5, 2019

Again, wow! If there is anything I can do for you @johno just let me know. Your effort on this made my day and surely my next week, because I can migrate all the content to my Gatsby blog now 🎉

Thanks to @ChristopherBiscardi as well for this neat Gatsby to MDX bridge ❤️


Regarding your hint: I will give this example a shot in my code. Haven't seen this approach before! Super valuable. Do I understand correctly that I keep the gatsby-remark-prismjs + MDXRenderer component, but simply define my custom Highlight components for code?

@johno

Collaborator

commented Jul 5, 2019

> Again, wow! If there is anything I can do for you @johno just let me know.

Will do <3

> Your effort on this made my day and surely my next week, because I can migrate all the content to my Gatsby blog now 🎉

Radical! If you ever get a chance I'd love to read a post on the good and bad of your migration (when you complete it) so we can improve upon it. 😸

> Do I understand correctly that I keep the gatsby-remark-prismjs + MDXRenderer component, but simply define my custom Highlight components for code?

Using this approach you can remove the gatsby-remark-prismjs plugin entirely. Instead, the new syntax highlighting component (using prism-react-renderer) will take over the rendering of all code blocks using React Context via MDXProvider and MDX's custom pragma.

It's a bit of a bizarre departure from traditional Markdown-style plugins but is more idiomatic for React and composition as a whole.
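
As a minimal, dependency-free sketch of the first step such a component has to take: the custom `code` component receives props from MDX and must recover the language and the raw source before handing them to a highlighter. The helper `toHighlightProps` and the `"text"` fallback are hypothetical, and the sketch assumes that fenced blocks arrive with a `className` of the form `language-<lang>` and the source as a string child, as MDX v1 does.

```javascript
// Hypothetical helper: turns the props MDX hands to a `code` element into the
// `language` and `code` values a highlighter component needs.
// Assumption: fenced blocks arrive with className "language-<lang>" and the
// source text as a string child.
function toHighlightProps({ className = "", children = "" }) {
  const match = /language-([\w-]+)/.exec(className);
  return {
    // Fall back to plain text when no language was given (hypothetical default).
    language: match ? match[1] : "text",
    // Drop the trailing newline so no empty last line is rendered.
    code: String(children).replace(/\n$/, ""),
  };
}

const example = toHighlightProps({
  className: "language-jsx",
  children: "const a = 1;\n",
});

console.log(example.language, JSON.stringify(example.code));
```

In a component registered through the `components` prop of `MDXProvider`, these two values would be passed as the `language` and `code` props of prism-react-renderer's `Highlight` component, so highlighting happens at render time instead of being baked into the compiled MDX body.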


Best of luck, and please do reach out if you encounter any other questions/issues.

gatsbybot added a commit to gatsbyjs/gatsby that referenced this issue Jul 9, 2019

fix(gatsby-plugin-mdx): Use babel plugin to remove export keyword (#15452)

For very large MDX documents babel will deopt styling. This
results in variations in whitespace that can't be handled
by the original regex for stripping the export keyword.

This replaces that functionality with a plugin.

- ChristopherBiscardi/gatsby-mdx#411
- https://github.com/mdx-js/mdx#618
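
The idea behind replacing the regex with a plugin can be sketched as follows. This is an assumption-laden illustration of the approach, not the code that was merged: it unwraps `export` declarations on the AST, so the result no longer depends on how the generator formats whitespace. The name `removeExportKeywords` is hypothetical; `visitor`, `path.node.declaration` and `path.replaceWith` are standard parts of Babel's plugin API.

```javascript
// Sketch of a Babel plugin that strips the `export` keyword at the AST level.
// Assumption: this mirrors the idea of the fix, not its exact implementation.
function removeExportKeywords() {
  return {
    visitor: {
      ExportNamedDeclaration(path) {
        const { declaration } = path.node;
        // `export { a as b }` carries no declaration; leave it untouched here.
        if (declaration) {
          // Replace `export const x = 1` with `const x = 1`.
          path.replaceWith(declaration);
        }
      },
    },
  };
}
```

With `@babel/core` installed, the function could be listed in the `plugins` option of `transformSync`. Because it operates on nodes rather than text, compact and pretty output are handled the same way.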
@johno

Collaborator

commented Jul 9, 2019

After a patch in mdx-js/mdx#622 it looks like everything in your edge case is addressed @rwieruch! Thanks for your patience and the thorough report with reproduction. 🎉


Git diff of your reproduction repo

❯ gd
diff --git a/gatsby-config.js b/gatsby-config.js
index d17d788..bc662f0 100644
--- a/gatsby-config.js
+++ b/gatsby-config.js
@@ -21,7 +21,7 @@ module.exports = {
       },
     },
     {
-      resolve: `gatsby-mdx`,
+      resolve: `gatsby-plugin-mdx`,
       options: {
         extensions: ['.mdx', '.md'],
         gatsbyRemarkPlugins: [
diff --git a/gatsby-node.js b/gatsby-node.js
index d9908db..7582d8e 100644
--- a/gatsby-node.js
+++ b/gatsby-node.js
@@ -118,9 +118,6 @@ exports.createPages = ({ actions, graphql }) =>
               slug
               categories
             }
-            code {
-              scope
-            }
           }
         }
       }
diff --git a/package.json b/package.json
index 83a46f6..54f25d4 100644
--- a/package.json
+++ b/package.json
@@ -5,25 +5,25 @@
   "author": "Robin Wieruch <hello@rwieruch.com> (https://www.robinwieruch.de/)",
   "repository": "https://github.com/rwieruch/gatsby-mdx-starter-project",
   "dependencies": {
-    "@mdx-js/mdx": "^1.0.21",
-    "@mdx-js/react": "^1.0.21",
-    "core-js": "^2.5.7",
-    "gatsby": "^2.12.0",
-    "gatsby-image": "^2.2.3",
+    "@mdx-js/mdx": "^1.0.23",
+    "@mdx-js/react": "^1.0.23",
+    "core-js": "^3.1.4",
+    "gatsby": "^2.13.10",
+    "gatsby-image": "^2.2.4",
     "gatsby-link": "^2.2.0",
-    "gatsby-mdx": "^0.6.3",
     "gatsby-plugin-catch-links": "^2.1.0",
-    "gatsby-plugin-manifest": "^2.2.0",
-    "gatsby-plugin-offline": "^2.2.0",
+    "gatsby-plugin-manifest": "^2.2.1",
+    "gatsby-plugin-mdx": "^1.0.8",
+    "gatsby-plugin-offline": "^2.2.1",
     "gatsby-plugin-react-helmet": "^3.1.0",
-    "gatsby-plugin-sharp": "^2.2.1",
+    "gatsby-plugin-sharp": "^2.2.3",
     "gatsby-plugin-styled-components": "^3.1.0",
     "gatsby-remark-copy-linked-files": "^2.1.0",
-    "gatsby-remark-images": "^3.1.2",
-    "gatsby-remark-prismjs": "^3.3.0",
-    "gatsby-source-filesystem": "^2.1.1",
-    "gatsby-transformer-remark": "^2.5.0",
-    "gatsby-transformer-sharp": "^2.2.0",
+    "gatsby-remark-images": "^3.1.3",
+    "gatsby-remark-prismjs": "^3.3.1",
+    "gatsby-source-filesystem": "^2.1.2",
+    "gatsby-transformer-remark": "^2.6.1",
+    "gatsby-transformer-sharp": "^2.2.1",
     "prismjs": "^1.16.0",
     "react": "^16.8.6",
     "react-dom": "^16.8.6",
diff --git a/src/templates/post.js b/src/templates/post.js
index 435118d..8d664dc 100644
--- a/src/templates/post.js
+++ b/src/templates/post.js
@@ -1,7 +1,7 @@
 import React, { Fragment } from 'react';
 import { graphql } from 'gatsby';
 import Img from 'gatsby-image';
-import MDXRenderer from 'gatsby-mdx/mdx-renderer';
+import MDXRenderer from 'gatsby-plugin-mdx/mdx-renderer';
 
 import Layout from '../components/Layout';
 import Link from '../components/Link';
@@ -35,7 +35,7 @@ export default function Post({
         />
       )}
 
-      <MDXRenderer>{mdx.code.body}</MDXRenderer>
+      <MDXRenderer>{mdx.body}</MDXRenderer>
 
       <div>
         <CategoryList list={mdx.frontmatter.categories} />
@@ -79,9 +79,7 @@ export const pageQuery = graphql`
         categories
         keywords
       }
-      code {
-        body
-      }
+      body
     }
   }
 `;

@johno johno closed this Jul 9, 2019
