Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add hermes-parser as a supported parser for flow code #13818

Open
bradzacher opened this issue Nov 8, 2022 · 4 comments
Open

Add hermes-parser as a supported parser for flow code #13818

bradzacher opened this issue Nov 8, 2022 · 4 comments
Labels
lang:flow Issues affecting Flow-specific constructs (not general JS issues) status:needs discussion Issues needing discussion and a decision to be made before action can be taken type:enhancement A potential new feature to be added, or an improvement to how we print something

Comments

@bradzacher
Copy link

hermes is a JavaScript engine optimized originally designed for react native usecases. hermes-parser is wrapper around hermes' parser compiled to wasm and built to output EITHER a babel or estree compatible AST.

Over the past year we on the flow team have been pushing to make hermes-parser the gold-standard for parsing flow code outside of flow itself. We've done so by switching our parsing cases away from alternatives (like babel/babel-eslint or flow-parser) and onto hermes-parser.
At this point most of our web infra is powered by it - including prettier, eslint, and code transforms!

In July we decided to give hermes-parser a go in prettier. We did this via the below patch file used in conjunction with patch-package. For the past ~3 months internally at meta hermes-parser has been powering all prettier formatting on our largest codebase (>>>500k files). We have not (as of yet) found any issues - so I'm happy to call it working perfectly!

Patch File used to patch support
diff --git a/node_modules/prettier/index.js b/node_modules/prettier/index.js
index e0a851e..121f4a0 100644
--- a/node_modules/prettier/index.js
+++ b/node_modules/prettier/index.js
@@ -29467,6 +29467,10 @@ var require_parsers = __commonJS2({
       get flow() {
         return require("./parser-flow.js").parsers.flow;
       },
+      get hermes() {
+        const hermesParser = require('./parser-hermes.js');
+        return hermesParser.parsers.hermes;
+      },
       get typescript() {
         return require("./parser-typescript.js").parsers.typescript;
       },
@@ -29608,14 +29612,14 @@ var require_language_js = __commonJS2({
     var parsers = require_parsers();
     var languages = [createLanguage(require_JavaScript(), (data) => ({
       since: "0.0.0",
-      parsers: ["babel", "acorn", "espree", "meriyah", "babel-flow", "babel-ts", "flow", "typescript"],
+      parsers: ["babel", "acorn", "espree", "meriyah", "babel-flow", "babel-ts", "flow", "typescript", "hermes"],
       vscodeLanguageIds: ["javascript", "mongo"],
       interpreters: [...data.interpreters, "zx"],
       extensions: [...data.extensions.filter((extension) => extension !== ".jsx"), ".wxs"]
     })), createLanguage(require_JavaScript(), () => ({
       name: "Flow",
       since: "0.0.0",
-      parsers: ["flow", "babel-flow"],
+      parsers: ["flow", "babel-flow", "hermes"],
       vscodeLanguageIds: ["javascript"],
       aliases: [],
       filenames: [],
@@ -29623,7 +29627,7 @@ var require_language_js = __commonJS2({
     })), createLanguage(require_JavaScript(), () => ({
       name: "JSX",
       since: "0.0.0",
-      parsers: ["babel", "babel-flow", "babel-ts", "flow", "typescript", "espree", "meriyah"],
+      parsers: ["babel", "babel-flow", "babel-ts", "flow", "typescript", "espree", "meriyah", "hermes"],
       vscodeLanguageIds: ["javascriptreact"],
       aliases: void 0,
       filenames: void 0,
diff --git a/node_modules/prettier/parser-hermes.js b/node_modules/prettier/parser-hermes.js
new file mode 100644
index 0000000..218210d
--- /dev/null
+++ b/node_modules/prettier/parser-hermes.js
@@ -0,0 +1,208 @@
+/**
+ * (c) Meta Platforms, Inc. and affiliates. Confidential and proprietary.
+ *
+ * @format
+ * @oncall flow
+ */
+
+'use strict';
+
+// copied from https://github.com/prettier/prettier/blob/89f90fc01363c31e4c9b21f34e3f1942994b1565/src/language-js/parse/utils/replace-hashbang.js
+function replaceHashbang(text) {
+  if (text.charAt(0) === '#' && text.charAt(1) === '!') {
+    return '//' + text.slice(2);
+  }
+
+  return text;
+}
+
+// copied from https://github.com/prettier/prettier/blob/89f90fc01363c31e4c9b21f34e3f1942994b1565/src/utils/get-last.js
+const getLast = arr => arr[arr.length - 1];
+
+// copied from https://github.com/prettier/prettier/blob/89f90fc01363c31e4c9b21f34e3f1942994b1565/src/utils/is-non-empty-array.js
+function isNonEmptyArray(object) {
+  return Array.isArray(object) && object.length > 0;
+}
+
+// copied from https://github.com/prettier/prettier/blob/89f90fc01363c31e4c9b21f34e3f1942994b1565/src/language-js/loc.js#L9-L23
+function locStart(node, opts) {
+  const {ignoreDecorators} = opts || {};
+
+  // Handle nodes with decorators. They should start at the first decorator
+  if (!ignoreDecorators) {
+    const decorators =
+      (node.declaration && node.declaration.decorators) || node.decorators;
+
+    if (isNonEmptyArray(decorators)) {
+      return locStart(decorators[0]);
+    }
+  }
+
+  return node.range ? node.range[0] : node.start;
+}
+// copied from https://github.com/prettier/prettier/blob/89f90fc01363c31e4c9b21f34e3f1942994b1565/src/language-js/loc.js#L25-L27
+function locEnd(node) {
+  return node.range ? node.range[1] : node.end;
+}
+
+// copied from https://github.com/prettier/prettier/blob/89f90fc01363c31e4c9b21f34e3f1942994b1565/src/language-js/parse/postprocess/visit-node.js
+function visitNode(node, fn) {
+  if (Array.isArray(node)) {
+    // As of Node.js 16 using raw for loop over Array.entries provides a
+    // measurable difference in performance. Array.entries returns an iterator
+    // of arrays.
+    for (let i = 0; i < node.length; i++) {
+      node[i] = visitNode(node[i], fn);
+    }
+    return node;
+  }
+  if (node && typeof node === 'object' && typeof node.type === 'string') {
+    // As of Node.js 16 this is benchmarked to be faster over Object.entries.
+    // Object.entries returns an array of arrays. There are multiple ways to
+    // iterate over objects but the Object.keys combined with a for loop
+    // benchmarks well.
+    const keys = Object.keys(node);
+    for (let i = 0; i < keys.length; i++) {
+      node[keys[i]] = visitNode(node[keys[i]], fn);
+    }
+    return fn(node) || node;
+  }
+  return node;
+}
+
+// adapted from https://github.com/prettier/prettier/blob/89f90fc01363c31e4c9b21f34e3f1942994b1565/src/language-js/parse/postprocess/index.js
+function postprocess(ast, options) {
+  ast = visitNode(ast, node => {
+    switch (node.type) {
+      // Espree
+      case 'ChainExpression': {
+        return transformChainExpression(node.expression);
+      }
+      case 'LogicalExpression': {
+        // We remove unneeded parens around same-operator LogicalExpressions
+        if (isUnbalancedLogicalTree(node)) {
+          return rebalanceLogicalTree(node);
+        }
+        break;
+      }
+      // fix unexpected locEnd caused by --no-semi style
+      case 'VariableDeclaration': {
+        const lastDeclaration = getLast(node.declarations);
+        if (lastDeclaration && lastDeclaration.init) {
+          overrideLocEnd(node, lastDeclaration);
+        }
+        break;
+      }
+
+      case 'SequenceExpression': {
+        // Babel (unlike other parsers) includes spaces and comments in the range. Let's unify this.
+        const lastExpression = getLast(node.expressions);
+        node.range = [
+          locStart(node),
+          Math.min(locEnd(lastExpression), locEnd(node)),
+        ];
+        break;
+      }
+    }
+  });
+
+  return ast;
+
+  /**
+   * - `toOverrideNode` must be the last thing in `toBeOverriddenNode`
+   * - do nothing if there's a semicolon on `toOverrideNode.end` (no need to fix)
+   */
+  function overrideLocEnd(toBeOverriddenNode, toOverrideNode) {
+    if (options.originalText[locEnd(toOverrideNode)] === ';') {
+      return;
+    }
+    toBeOverriddenNode.range = [
+      locStart(toBeOverriddenNode),
+      locEnd(toOverrideNode),
+    ];
+  }
+}
+function transformChainExpression(node) {
+  switch (node.type) {
+    case 'CallExpression':
+      node.type = 'OptionalCallExpression';
+      node.callee = transformChainExpression(node.callee);
+      break;
+    case 'MemberExpression':
+      node.type = 'OptionalMemberExpression';
+      node.object = transformChainExpression(node.object);
+      break;
+    // typescript
+    case 'TSNonNullExpression':
+      node.expression = transformChainExpression(node.expression);
+      break;
+    // No default
+  }
+  return node;
+}
+function isUnbalancedLogicalTree(node) {
+  return (
+    node.type === 'LogicalExpression' &&
+    node.right.type === 'LogicalExpression' &&
+    node.operator === node.right.operator
+  );
+}
+function rebalanceLogicalTree(node) {
+  if (!isUnbalancedLogicalTree(node)) {
+    return node;
+  }
+
+  return rebalanceLogicalTree({
+    type: 'LogicalExpression',
+    operator: node.operator,
+    left: rebalanceLogicalTree({
+      type: 'LogicalExpression',
+      operator: node.operator,
+      left: node.left,
+      right: node.right.left,
+      range: [locStart(node.left), locEnd(node.right.left)],
+    }),
+    right: node.right.right,
+    range: [locStart(node), locEnd(node)],
+  });
+}
+
+// copied from https://github.com/prettier/prettier/blob/89f90fc01363c31e4c9b21f34e3f1942994b1565/src/language-js/pragma.js#L19-L22
+function hasPragma(text) {
+  const pragmas = Object.keys(parseDocBlock(text).pragmas);
+  return pragmas.includes('prettier') || pragmas.includes('format');
+}
+
+// copied from https://github.com/prettier/prettier/blob/89f90fc01363c31e4c9b21f34e3f1942994b1565/src/language-js/parse/utils/create-parser.js
+function createParser(options) {
+  options = typeof options === 'function' ? {parse: options} : options;
+
+  return {
+    astFormat: 'estree',
+    hasPragma,
+    locStart,
+    locEnd,
+    ...options,
+  };
+}
+
+function parse(originalText, parsers, options = {}) {
+  const textToParse = replaceHashbang(originalText);
+
+  const {parse} = require('hermes-parser');
+  const result = parse(textToParse, {
+    allowReturnOutsideFunction: true,
+    flow: 'all',
+    sourceType: 'module',
+    tokens: true,
+  });
+
+  options.originalText = originalText;
+  return postprocess(result, options);
+}
+
+module.exports = {
+  parsers: {
+    hermes: createParser(parse),
+  },
+};

cc @pieterv

@sosukesuzuki
Copy link
Member

sosukesuzuki commented Nov 9, 2022

Thanks for making this issue!

I think hermes parser is a good parser for Flow, however we are currently suffering from Prettier's npm package size being too large.

The main reason for there are too many JavaScript parsers.

For this reason, I would personally consider removing all but the truly necessary parsers from the core of Prettier and providing and maintaining them as external plug-ins(The team has not yet made any decisions).

So, I disagree adding hermes parser to the Prettier core.

In the future I think it would be nice to provide it as a separate package like @prettier/plugin-hermes-parser.

@sosukesuzuki sosukesuzuki added type:enhancement A potential new feature to be added, or an improvement to how we print something status:needs discussion Issues needing discussion and a decision to be made before action can be taken lang:flow Issues affecting Flow-specific constructs (not general JS issues) labels Nov 9, 2022
@bradzacher
Copy link
Author

I'd be more than happy if the future state was a module-based approach!
Could definitely trim a lot of fat by doing that.

There's a bit of internal logic that would need to be exposed to allow custom parsers, or else you'd need to copy+paste code everywhere (see how much code my patch had to copy in to work due to the necessary post-processing steps!)

@fisker
Copy link
Member

fisker commented Nov 29, 2022

Actually I tried it long time ago, but it's about 1m, and we can't do anything to reduce size because it's just a 1M string of wasm data url. Do you think we can reduce size to maybe < 500k? I think we can add it directly if the size not too big.

@bradzacher
Copy link
Author

It's a WASM blob so I'm not entirely sure how much it can be reduced. I'm sure that there's probably a bit of extra junk included in the compilation that could be trimmed out - @pieterv would have a better idea of how to action this (I no longer work at Meta)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lang:flow Issues affecting Flow-specific constructs (not general JS issues) status:needs discussion Issues needing discussion and a decision to be made before action can be taken type:enhancement A potential new feature to be added, or an improvement to how we print something
Projects
None yet
Development

No branches or pull requests

3 participants