Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Basic Arrow function support + array methods #123

Closed
wants to merge 9 commits into from

Conversation

6utt3rfly
Copy link
Collaborator

@6utt3rfly 6utt3rfly commented Dec 17, 2020

  • Add basic arrow function support (Arrow functions? #49 ): () => expr
  • Allow array methods: [...].method()

NOTE: Objects and properties aren't supported, so () => ({ a: 1 }) will fail

Examples:

  1. multi-param arrow function: a.find((val, key) => key === "abc")
{
  "type": "CallExpression",
  "arguments": [
    {
      "type": "ArrowFunctionExpression",
      "params": [
        {
          "type": "Identifier",
          "name": "val"
        },
        {
          "type": "Identifier",
          "name": "key"
        }
      ],
      "body": {
        "type": "BinaryExpression",
        "operator": "===",
        "left": {
          "type": "Identifier",
          "name": "key"
        },
        "right": {
          "type": "Literal",
          "value": "abc",
          "raw": "\"abc\""
        }
      }
    }
  ],
  "callee": {
    "type": "MemberExpression",
    "computed": false,
    "object": {
      "type": "Identifier",
      "name": "a"
    },
    "property": {
      "type": "Identifier",
      "name": "find"
    }
  }
}
  1. no parentheses around single arrow function argument: a.find(val => key === "abc")
{
  "type": "CallExpression",
  "arguments": [
    {
      "type": "ArrowFunctionExpression",
      "params": [
        {
          "type": "Identifier",
          "name": "val"
        }
      ],
      "body": {
        "type": "BinaryExpression",
        "operator": "===",
        "left": {
          "type": "Identifier",
          "name": "key"
        },
        "right": {
          "type": "Literal",
          "value": "abc",
          "raw": "\"abc\""
        }
      }
    }
  ],
  "callee": {
    "type": "MemberExpression",
    "computed": false,
    "object": {
      "type": "Identifier",
      "name": "a"
    },
    "property": {
      "type": "Identifier",
      "name": "find"
    }
  }
}
  1. no arg arrow function: a.find(() => true)
{
  "type": "CallExpression",
  "arguments": [
    {
      "type": "ArrowFunctionExpression",
      "params": null,
      "body": {
        "type": "Literal",
        "value": true,
        "raw": "true"
      }
    }
  ],
  "callee": {
    "type": "MemberExpression",
    "computed": false,
    "object": {
      "type": "Identifier",
      "name": "a"
    },
    "property": {
      "type": "Identifier",
      "name": "find"
    }
  }
}
  1. array function: [1, 2].find(v => v === 2)
{
  "type": "CallExpression",
  "arguments": [
    {
      "type": "ArrowFunctionExpression",
      "params": [
        {
          "type": "Identifier",
          "name": "v"
        }
      ],
      "body": {
        "type": "BinaryExpression",
        "operator": "===",
        "left": {
          "type": "Identifier",
          "name": "v"
        },
        "right": {
          "type": "Literal",
          "value": 2,
          "raw": "2"
        }
      }
    }
  ],
  "callee": {
    "type": "MemberExpression",
    "computed": false,
    "object": {
      "type": "ArrayExpression",
      "elements": [
        {
          "type": "Literal",
          "value": 1,
          "raw": "1"
        },
        {
          "type": "Literal",
          "value": 2,
          "raw": "2"
        }
      ]
    },
    "property": {
      "type": "Identifier",
      "name": "find"
    }
  }
}

- Allow array methods: [...].method()
@LeaVerou
Copy link
Collaborator

Hey @6utt3rfly, thank you so much for your efforts to extend jsep!

@EricSmekens feel free to correct me, but I believe these are out of scope?
However, we should eventually make jsep extensible, because "expression syntax" means different things to different use cases. Then we could add optional arrow function support.

… could be used within binary operations

['a', 'b'].includes(x).length > 2
@6utt3rfly
Copy link
Collaborator Author

Hey @LeaVerou ! Thank you for the speedy response! I wasn't sure if it was out of scope based on #49 ? I initially tried to treat the arrow function with addBinaryOp, but I came across several parsing errors, which lead me to this PR.

'(a, b) => 1' // Unclosed ( at character 11. ...but '() => 1' and 'v => v' work
'[1, 2].length === 2' // Variable names cannot start with a number (.l) at character 7

Making jsep extensible would be interesting. I reviewed other forks and some have added some object support, or back-ticks (I haven't seem template supported yet(`hi ${name}`), which would also be nice. Beyond that, I agree with keeping more complex items out (functions, classes, new, assignment, async/await, etc), although adding an ability to somehow support keywords like new, async, await by extension would be nice, too.

@EricSmekens
Copy link
Owner

We decided to keep it more simple, as the project is being maintained, but there is not a clear plan on what features we would like to introduce, or determine as out of scope for this library. (There are alternatives which solves more complex scenario's).

For me, this falls right into between those two.
I like the pull-request you've made, and I really agree with Lea about extensibility. But at this moment I don't have the energy to dive into making jsep extensible for those kind of features yet. Clear ideas are always welcome ofcourse!

Lea, how do you feel about accepting this PR and creating a new version soon? (I have the time to release a new version etc. It's just the question if we want to go that way). I've the feeling this doesn't break any other correct parsing at this moment, and I think it is anice addition for now, until we make this more extensible.

@amb26
Copy link

amb26 commented Feb 16, 2021

I'm interested in this functionality and would welcome seeing it merged. I'm not in any huge hurry, but just thought I would add my vote :)

@EricSmekens
Copy link
Owner

EricSmekens commented Mar 21, 2021

Another review on this, I would be happy to merge this functionality in jsep, if it would be in a extensible way.
I"m also happy to merge this in if lots of people will value this worthy, and we can just add a backwards-compatibilty warning that some behaviour is different now.

@EricSmekens EricSmekens added the waiting Waiting for feedback label Mar 21, 2021
# Conflicts:
#	src/jsep.js
@6utt3rfly
Copy link
Collaborator Author

@EricSmekens - I can resolve conflict with the latest updates. I've also been thinking of how to do this in an extensible way... For arrow support, there are 2 main changes needed:

  1. Allowing recognition of => to give a new ArrowFunctionExpression
  2. Allowing multiple expressions in an argument list without an identifier (the changes to gobbleGroup)

1 - could be made into an extensible param to jsep. I'm thinking of some way that could support both ternary and arrow functions (even if the current ternary code currently in there was left alone, it would prove the design of the extensible system. Moving into the system could allow it to be removed though). For this system, I'm thinking maybe it could just allow any arbitrary function pointer bound to this...? Seems better than trying to define an expression syntax by parameters and lets the caller use the built-in methods. What do you think?

The code blocks I'm talking about are in gobbleExpression, and show why a function pointer might be best?:

if(ch === QUMARK_CODE) {
  // Ternary expression: test ? consequent : alternate
  index++;
  consequent = gobbleExpression();
  if(!consequent) {
    throwError('Expected expression', index);
  }
  gobbleSpaces();
  if(exprICode(index) === COLON_CODE) {
    index++;
    alternate = gobbleExpression();
    if(!alternate) {
      throwError('Expected expression', index);
    }
    return {
      type: CONDITIONAL_EXP,
      test: test,
      consequent: consequent,
      alternate: alternate
    };
  } else {
    throwError('Expected :', index);
  }
} else if(ch === EQUAL_CODE) {
  // arrow expression: () => expr
  index++;
  if(exprICode(index) === GTHAN_CODE) {
    index++;
    return {
      type: ARROW_EXP,
      params: test ? [test] : null,
      body: gobbleExpression(),
    };
  }
} else {
  return test;
}

2 - feels difficult to make extensible except maybe by a boolean enable flag?

Let me know your thoughts and I can update the PR :-)

@6utt3rfly
Copy link
Collaborator Author

I added optional expression_parsers, if something along these lines works?

src/jsep.js Outdated
@@ -33,6 +34,8 @@
QUMARK_CODE = 63, // ?
SEMCOL_CODE = 59, // ;
COLON_CODE = 58, // :
EQUAL_CODE = 61, // =
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EQUAL_CODE is unused.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I debated using a Map for the expression_parsers versus an object which has to be keyed by strings? And wasn't sure if the arrow function should be included by default or described in the readme to be added if desired?

@EricSmekens
Copy link
Owner

Left 1 comment, looks quite promising.

I'll leave it open for a few more days, maybe @LeaVerou has an opinion about this as well.

src/jsep.js Outdated
@@ -33,6 +34,8 @@
QUMARK_CODE = 63, // ?
SEMCOL_CODE = 59, // ;
COLON_CODE = 58, // :
EQUAL_CODE = 61, // =
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I debated using a Map for the expression_parsers versus an object which has to be keyed by strings? And wasn't sure if the arrow function should be included by default or described in the readme to be added if desired?

src/jsep.js Outdated
expression_parser_scope = {
get index() { return index; },
set index(v) { index = v; },
exprI: exprI,
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be useful to include all methods here? Too bad it feels like code duplication though...

@LeaVerou
Copy link
Collaborator

This is a good implementation, but I still think arrow functions should be an optional add-on, and are not desirable in many, if not most, of expression use cases. We don't even support object literals, and we're going to support arrow functions by default?

@6utt3rfly If I prioritize the work to make jsep extensible and get it done this coming week, would you be ok to port your changes to an official plugin? Could we separate the changes to add arrow functions from the changes to move towards an expression_parsers data structure?

# Conflicts:
#	src/jsep.js
- added optional (plugin) support for arrow-functions, objects, assignments, and 'new'
- Fix omitting second binaryOp if precedence is zero (or not set)
@6utt3rfly
Copy link
Collaborator Author

6utt3rfly commented Apr 6, 2021

@LeaVerou - I took a look at PrismJS (mentioned in #137 ) and took a stab at adding some hooks. I was able to move the arrow-function support into a plugin, and also added plugins for object support, assignment, and new, (while also moving the ternary code into a default plugin), just to see if the hooks were sufficient. I'm happy to update however it's needed, and welcome any feedback! :-)

@LeaVerou
Copy link
Collaborator

LeaVerou commented Apr 7, 2021

Oh wow, lots of good work here! Thank you!!

I have a paper deadline tomorrow, but can review later in the week.

Btw, Prism's hooks implementation is a little old at this point, this is closer to the code I've been using for more recent projects: https://github.com/LeaVerou/bliss/blob/v2/src/Hooks.js
Same API with some minor conveniences.

Also if it's not too much trouble, please follow the existing coding style. E.g. we use tabs for indentation (and spaces for alignment). I can fix issues for you after merging (I wouldn't want to hold up such a good PR due to such minor issues), but I figured you may prefer to fix them yourself before merging. ESLint should help, we have an ESLint config now, so if you have an ESLint plugin in your editor it should pick it up automatically.

- renamed and updated hook types. Switched hook api system to support object argument and push vs unshift
- added ignoreComments plugin
- added templateLiterals plugin
-- this required moving the gobbleExpressions() code into a function so it could be reused within the template literal
@6utt3rfly
Copy link
Collaborator Author

@LeaVerou - I updated the tab/spacing and added the plugins folder to the lint script. Sorry about that! I switched the plugin API as requested, too. I also added an ignoreComments plugin (#121 ), and templateLiteral.

I didn't do anything to add the plugins to the build system, so that would still be needed. Or feel free to let me know if anything else needs to be done :-)

Copy link
Collaborator

@LeaVerou LeaVerou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic work overall!! Thank you so much for putting so much work into this!
Thank you for fixing the issues I pointed out previously and making more improvements.
I've left some comments, some minor, some less so.

I'm also missing tests for these plugins!

It may be more manageable to break this PR into two: one for adding extensibility and moving ternary (and other existing functionality that may not be always necessary) to a plugin, and one (or more!) for adding all these cool plugins, with tests.

That said, if you don't want to do that, that's cool too.

```javascript
const jsep = require('jsep');
const plugins = require('jsep/plugins');
plugins.forEach(p => p(jsep));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we switch to ESM-style imports here?

@@ -0,0 +1,40 @@
export default function (jsep) {
if (typeof jsep === 'undefined') {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is a good pattern, for the following reasons:

  1. It exports a function that should never be called twice, but it's very easy to accidentally call it multiple times. What if multiple libraries on the same page include jsep, each importing different but overlapping sets of plugins?
  2. It forces each plugin to use boilerplate to check that jsep actually exists
  3. It depends on third party code to call it properly.
  4. The entire plugin code needs to be inside a callback, which is awkward.

Ideally, third-party code should not need to run any code to activate plugins, including them should just work.

One way to do that is if each plugin simply imports jsep using classic ESM imports.
Then, third party code using jsep can just import both:

import jsep from "jsep.js";
import "jsep-plugins/jsep-arrow-function.js";

The main downside of this is that since plugins import jsep, they need to know its path (relative or not). import.meta.url could help here.

@@ -144,47 +145,52 @@ let jsep = function(expr) {
while (ch === SPACE_CODE || ch === TAB_CODE || ch === LF_CODE || ch === CR_CODE) {
ch = exprICode(++index);
}
hooks.run('gobble-spaces', hookScope);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we rename this to env to be consistent with usage and docs? It's also much shorter, and that's important when it's something that will be repeated over and over.

@@ -600,6 +618,18 @@ let jsep = function(expr) {
index++;
return node;
}
else if (exprICode(index) === COMMA_CODE) {
// arrow function arguments:
Copy link
Collaborator

@LeaVerou LeaVerou Apr 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

???

leftover?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better fix is now in #139

@@ -617,37 +647,46 @@ let jsep = function(expr) {
};
};

let nodes = [], ch_i, node;
const hookScope = {
Copy link
Collaborator

@LeaVerou LeaVerou Apr 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See env is supposed to be the local environment with properties only for the variables of the current scope, not an object to be reused like this. You've noticed that this approach is awkward, as you need to null out properties you don't currently need etc. This makes it very easy to introduce bugs by modifying this object, not realizing it's re-used across hooks.

I quite like this approach with the accessor on index, it's much more elegant than using env.index throughout, but in balance, I think the disadvantages of re-using the env object outweigh the advantages. You can of course reuse env when it's in the same scope.

For exposing the functions to plugins, we could just export them as named exports.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I'm curious if you have any suggestions? I was trying to avoid adding a property accessor for existing jsep code on all usages of the scoped variables (and the functions that rely on them). If it were all a class using accessors, then the entire class could be passed as the env. The node (and nodes) have to be on the stack, but the others are all more global. I think named exports would lose the scope of the current js expression parsing...

I was also trying to avoid regenerating that object with every call to the hooks which could slow it down more, something like this:

hooks.run('name', { node });
function run(name, env) {
  hooks[name].forEach(h => h({ ...baseEnv, ...env }));
}

Any suggestions on which way to go?

Copy link
Collaborator

@LeaVerou LeaVerou Apr 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we have very few hooks, it may be easier to just do index = env.index after the hook. Once we get more, we can start just using env.index instead of index.

I think at this point creating object literals has become pretty fast, and avoiding it for perf was a micro-optimization we did a long time ago? But I might always be wrong! Are you aware of any recent benchmarks?

You are right regarding named exports for some of these functions, and exports have to be top level anyway. I wonder if it would make sense to pass the index and expression to them as a parameter, so they could be top level. That would also make it clear what their input is.

Alternatively, we could go the direction you suggested and make them class methods, but jsep() would still need to work as a function and return the AST. Since ES6 constructors cannot be used as both (using a constructor as a function throws), it would need to be a separate Parser (name TBB) class that we can export as a named export, and jsep would still be the default export. It would create a Parser instance and return the AST, so that existing usage is the same. In this case, hooks would indeed just get this as their context.

I could do that if you want, though not sure how easy it would be to merge the changes into this PR.
In the end it may be better to break this PR into more manageable chunks anyway, if you're ok with that.

What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran a couple benchmarks on jsbench.me out of curiosity...

For the env variable, recreating the env per hook call is 50-280 times slower than sharing the env:
image

However, I ran another test comparing direct variable access versus object property access (i.e. adding this. in front of everything) and the property access was 5.78% faster (I expected it to be slower, tbh)!
image

So yeah, I think if we could encapsulate the parser and change all the code to use this. then the hooks could easily pass this or similar. There were 2 properties that would be at the stack level, rather than the global parsing level though, node, and nodes but maybe we could get clever with that.

I'm fine with using ES6 classes (especially if we're using rollup to build CJS and ESM). That could easily be its own PR, before any other work is done.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh wow, thank you so much for doing this, that's very illuminating!

Would you be ok with me doing the rewrite to use the ES6 class or do you want to do it?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totally feel free! (and if you feel like updating the build system/folder layout too, that would help ;-) )

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LeaVerou - I just created #140 - hopefully it's okay? Or if you still want to do it (or do it differently) I'm not offended :-)

@@ -658,6 +697,54 @@ jsep.toString = function() {
return 'JavaScript Expression Parser (JSEP) v' + jsep.version;
};

jsep.hooks = {
Copy link
Collaborator

@LeaVerou LeaVerou Apr 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a separate module for the hooks object that jsep imports, since we're using a build tool anyway.
You could even take the Bliss (v2) one wholesale.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not that familiar with rollup and not sure what you want the final output to look like. Do you want to build a separate set of output files for each plugin? Or do you want it all bundled together? Or maybe you could make an update or branch with a build system update with proper directory structure and I could split this PR from there?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bundles in dist/ would definitely include any JS in src, so if we have a hooks.js that will be included. But as to which plugins to include ...excellent question. I suppose we should have a default list of plugins that is included in the default bundles, e.g. ternary, and anyone that wishes to do something different can do manual imports or a custom build. Alternatively the default bundles could be without any plugins, or we could have multiple bundles with different sets of plugins (e.g. none, default, all). What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could potentially set up multiple npm packages (i.e. one per plugin). But these plugins are all so tiny, I think everything could be built together (and ESM will allow tree shaking anyway)?
So maybe:

jsep
├── dist
├── src
│   ├── jsep.js
│   ├── jsep.test.js
│   ├── hooks.js
│   ├── plugins
│   │   ├── ternary
│   │   │   ├── ternary.js
│   │   │   ├── ternary.test.js
│   │   ├── ...
├── test (maybe still a test folder, with esprima)?
├── typings
└── package.json, build files...

It would make it simpler for including ternary by default... thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But these plugins are all so tiny, I think everything could be built together (and ESM will allow tree shaking anyway)?

Yup.

I like the proposed dir structure, the only thing I'm not sure about is whether plugins/ should be in src/ or adjacent to it. Might be worth it to see what a sample of popular projects do, but I don't have a strong opinion either way.

@@ -783,4 +869,42 @@ jsep.removeAllLiterals = function() {
return this;
};

// ternary support as a plugin:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be in a separate file, just like the other plugins. We can include it by default (for compatibility, though I think eventually we should move away from that), but plugins should be separate.

@6utt3rfly
Copy link
Collaborator Author

I'm also missing tests for these plugins!

I added tests to the main tests.js file, but it felt wrong and I'd gladly move them! Would you like tests in the same folder as the plugin? Or sub-folders of the tests folder? I use jest in most of my other projects, but would you be interested in adding istanbul and nyc for code coverage?

@LeaVerou
Copy link
Collaborator

I added tests to the main tests.js file, but it felt wrong and I'd gladly move them! Would you like tests in the same folder as the plugin? Or sub-folders of the tests folder? I use jest in most of my other projects, but would you be interested in adding istanbul and nyc for code coverage?

Ideally each plugin should have its own separate tests, but I'm not familiar enough with QUnit to know if that's easily feasible. I haven't used jest, istanbul, or nyc before, but from a quick look they look cool. If you're willing to spend the time to switch from QUnit to them, I'm sure me and @EricSmekens would be happy to merge that PR (Eric correct me if not!)!

@6utt3rfly
Copy link
Collaborator Author

If you're willing to spend the time to switch from QUnit to them, I'm sure me and @EricSmekens would be happy to merge that PR (Eric correct me if not!)!

... maybe later. There are enough other things to do first ;-)

@6utt3rfly
Copy link
Collaborator Author

Closing for new plugin architecture and moving into individual PRs:
#147
#148
#149
#150
#151

@6utt3rfly 6utt3rfly closed this Jul 18, 2021
@6utt3rfly 6utt3rfly mentioned this pull request Aug 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
waiting Waiting for feedback
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants