Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert JSEP to ES6 class, while maintaining backward compatibility #140

Merged
merged 4 commits into from
Apr 28, 2021

Conversation

6utt3rfly
Copy link
Collaborator

Follow-up from #123, and related to #137 and #130

Converts JSEP to use ES6 classes (but roll-up builds ESM, IIFE, and CJS):

  • class Constants - just the string/numerical constants
  • class JsepConfig - all the current functions on jsep
  • class Jsep - the parser. Constructed with expr and parsed with parse()

Maintained backward compatibility by exporting a default function that uses those classes. Tests are unchanged to ensure consistency.

Performance comparison:

Before PR: 4303.82 op/s +/- 2.09% [5.74% slower]
This PR: 4566 ops/s +/- 2.99%
(from jsbench.me with both versions in the setup. Ran all of the current test cases in both versions)

Size comparison:

+----------------------+-----------+---------+
| FILE                 | Before PR | this PR |
+----------------------+-----------+---------+
| jsep.cjs.js          |   19972   |  23780  |
| jsep.cjs.js.map      |   41299   |  49294  |
| jsep.cjs.min.js      |   5102    |  9376   |
| jsep.cjs.min.js.map  |   31041   |  38040  |
| jsep.iife.js         |   20720   |  24672  |
| jsep.iife.js.map     |   41307   |  49302  |
| jsep.iife.min.js     |   5102    |  9382   |
| jsep.iife.min.js.map |   31107   |  38038  |
| jsep.js              |   19853   |  23616  |
| jsep.min.js          |   5029    |  9296   |
| jsep.min.js.map      |   31036   |  38035  |
+----------------------+-----------+---------+

Hopefully OK that I took a stab at this, @LeaVerou ? I'm happy to change it or whatever :-)

@EricSmekens
Copy link
Owner

Had a quick look through, thank you for this work on it!
I'll wait so LeaVerou can review this as well.

@LeaVerou
Copy link
Collaborator

Hey, so sorry for being absent lately, I was extremely busy with moving. Now settling in slowly, and should be able to take a look in the next couple of days.

@LeaVerou
Copy link
Collaborator

From a very cursory look, a lot of this looks like what I had in mind, but I'm not sure about the added complexity of having three classes, I was envisioning it as a single Jsep class. I can sort of see the point of JsepConfig (interesting idea to have multiple different parsers simultaneously, though not sure if there are enough use cases to justify the complexity), but not the one of Constants. However, I could very well be missing something, since I haven't personally done the work that this PR is introducing. @6utt3rfly could you elaborate a bit on this design decision?

@6utt3rfly
Copy link
Collaborator Author

When I was trying to write some plugins with the existing code, there were some constants that I had to duplicate because they were not accessible to the plugins in other files. I think it would be useful to have access to them from plugins, but I agree they could be an Object, or placed directly onto the class. I debated freezing it, but then thought it might be useful to allow to add to it (or possibly redefine, but I don't know a use-case for that).

For having separate config/parsing classes, I was mostly just thinking of separation of responsibilities. The parsing class makes sense to construct and destroy with each parsing request (similar to existing behavior). It would also allow passing that class instance into plugins. I don't know if there are cases where different configurations of parsing might be useful to run concurrently, but it also seemed like an easy thing to support with that separation. To me, it felt wrong to have methods to add/remove operators that would affect static variables on the class (or hidden variables in the file that plugins would then not have access to) -- and I didn't want to have to copy them or do extra work in the constructor of every parser.

Just my humble opinions though :-)

@LeaVerou
Copy link
Collaborator

Agreed that constants need to be available to plugins, and also agreed that the object shouldn't be frozen, so it can be extended.
It's just that if we take built-in JS APIs as examples, their constants tend to be static properties on the class itself, not separate classes (e.g. Node.ELEMENT_NODE), so a separate class for that seems unusual and a bit overkill.

Your reasoning about the config class makes sense. I think I like the idea.

@6utt3rfly
Copy link
Collaborator Author

Sounds good - do you prefer static getters, or assigning the constants as properties after declaring the class?

@LeaVerou
Copy link
Collaborator

I think simple properties are fine.

src/jsep.js Outdated Show resolved Hide resolved
src/jsep.js Outdated Show resolved Hide resolved
src/jsep.js Outdated
this.expr = expr;
this.config = config;
this.index = 0;
this.length = expr.length;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this should be a getter? Though that might hinder performance.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just ran a performance benchmark and there doesn't seem to be any performance difference between saving that local variable versus using this.expr.length. A getter is about 60% slower though. Mostly I just kept it to try not to change the existing code too much

@6utt3rfly
Copy link
Collaborator Author

@LeaVerou - I pushed my changes to switch to a single class with static methods and fields. I also added a getter for getCharAt and getCharCodeAt (there's also a getChar and getCharCode with default index = this.index which does the same thing ... could remove the default or just use that method)?

Revised sizes:

+----------------------+-----------+---------+
| FILE                 | Before PR | this PR |
+----------------------+-----------+---------+
| jsep.cjs.js          |   19972   |  23181  |
| jsep.cjs.js.map      |   41299   |  47737  |
| jsep.cjs.min.js      |   5102    |  9023   |
| jsep.cjs.min.js.map  |   31041   |  36733  |
| jsep.iife.js         |   20720   |  24059  |
| jsep.iife.js.map     |   41307   |  47745  |
| jsep.iife.min.js     |   5102    |  9041   |
| jsep.iife.min.js.map |   31107   |  36730  |
| jsep.js              |   19853   |  23058  |
| jsep.min.js          |   5029    |  8953   |
| jsep.min.js.map      |   31036   |  36728  |
+----------------------+-----------+---------+

@LeaVerou
Copy link
Collaborator

I thought we'd still use a separate class for the config? Anyway, I'm fine either way, probably a good idea to start simple and take it from there.
I do think that this.char and this.code (or this.charCode) might be more readable but if you have strong opinions on the current naming, I'm fine with that.

* @returns {Jsep}
*/
static addIdentifierChar(char) {
Jsep.additional_identifier_chars.add(char);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this function only exist for backwards compat? Because if additional_identifier_chars is exposed, people can just add characters to it directly.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, although it was only just added recently, so could possibly be removed with minimal impact?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed.

*/
static addBinaryOp(op_name, precedence) {
Jsep.max_binop_len = Math.max(op_name.length, Jsep.max_binop_len);
Jsep.binary_ops[op_name] = precedence;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not for this PR, but I think eventually we may want to move towards a data structure that just works, vs having to use special-purpose mutation methods. E.g. we could make binary_ops an instance of a BinaryOpMap which would inherit from Map but intercept its methods to update the max length accordingly. Then people can just modify it like a regular map. What do you think?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree. It makes sense to encapsulate and would allow reuse with the unaryOp.

src/jsep.js Outdated
* @param {string} message
*/
throwError(message, index = this.index) {
Jsep.throwError(message, index);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked through the code and did not find a single usage of throwError() with an index other than this.index. Should we have this parameter at all? Also, is there a reason to have both a static and an instance method to throw errors? The static method does not appear to be used anywhere except internally. Is it used in plugins perhaps?

In general, I think it's good practice to start with a smaller API and grow as needed, than a larger one because certain needs are anticipated that may never arise. But that might be a philosophical difference.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

src/jsep.js Show resolved Hide resolved
src/jsep.js Outdated
* @returns {boolean}
*/
isIdentifierPart(ch) {
return this.isIdentifierStart(ch) || Jsep.isDecimalDigit(ch);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a static method, it does not depend on this in any way.

src/jsep.js Outdated
* @returns {number}
*/
charCodeAt(index = this.index) {
return this.expr.charCodeAt(index);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Honestly, adding methods that just wrap the same function calls on this.expr seems like overabstraction. Especially since we already have convenience getters when what we want is this.expr.charCodeAt(this.index) so this is literally just a wrapper for the function call.
Same applies to the charAt() wrapper above.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree - changed

src/jsep.js Outdated
// top-level parser (but can be reused within as well)
let gobbleExpressions = function(untilICode) {
/**
* Top-level method to parse all expressions and returns compound or singl
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment seems truncated?

src/jsep.js Outdated
* @param {jsep.Expression} right
* @returns {jsep.BinaryExpression}
*/
static createBinaryExpression(operator, left, right) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would be fine inlined and would make the code more clear. It's only used twice and the function doesn't do much.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

- make methods static where possible
- throwError no longer static
- inline createBinaryExpression method
- rename currentChar/code to just char/code. Remove method, leave just getters for current index
*/
static addBinaryOp(op_name, precedence) {
Jsep.max_binop_len = Math.max(op_name.length, Jsep.max_binop_len);
Jsep.binary_ops[op_name] = precedence;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I agree. It makes sense to encapsulate and would allow reuse with the unaryOp.

* @returns {Jsep}
*/
static addIdentifierChar(char) {
Jsep.additional_identifier_chars.add(char);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, although it was only just added recently, so could possibly be removed with minimal impact?

src/jsep.js Outdated
* @param {jsep.Expression} right
* @returns {jsep.BinaryExpression}
*/
static createBinaryExpression(operator, left, right) {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree

src/jsep.js Outdated
* @returns {number}
*/
charCodeAt(index = this.index) {
return this.expr.charCodeAt(index);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree - changed

src/jsep.js Outdated
* @param {string} message
*/
throwError(message, index = this.index) {
Jsep.throwError(message, index);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

}

// Backward Compatibility (before adding the static fields):
const jsep = expr => (new Jsep(expr)).parse();
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figure this can be Jsep.parse or expr => (new Jsep(expr)).parse()?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's Jsep.parse then the functions we're adding to it afterwards would be added to Jsep.parse as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could however be expr => Jsep.parse(expr)

// Node Types
// ----------
// This is the full set of types that any JSEP node can be.
// Store them here to save space when minified
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just noticed this comment. If the only purpose of these constants is to "save space when minified", then we should get rid of them, per #131. What do you think?

I'm only referring to the expression type constants. The character code ones are obviously useful.
Not sure e.g. MEMBER_EXP is easier to remember than "MemberExpression". The only benefit I see is that perhaps we can change the type without having to change it everywhere in the code, but are we ever going to do that? 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm indifferent... there are some of the defines that are used multiple times, and others only once. For consistency it's kind of nice to have them here. And I think it's nicer to use a constant/enum rather than needing the exact string in several places.

const jsep = Jsep.parse;
const staticMethods = Object.getOwnPropertyNames(Jsep);
staticMethods
.slice(staticMethods.indexOf('version'), staticMethods.indexOf('removeAllLiterals') + 1)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which properties are we trying to avoid here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full list it returns:

0: "length"
1: "prototype"
2: "version"
3: "toString"
4: "addUnaryOp"
5: "addBinaryOp"
6: "addIdentifierChar"
7: "addLiteral"
8: "removeUnaryOp"
9: "removeAllUnaryOps"
10: "removeIdentifierChar"
11: "removeBinaryOp"
12: "removeAllBinaryOps"
13: "removeLiteral"
14: "removeAllLiterals"
15: "parse"
16: "getMaxKeyLen"
17: "isDecimalDigit"
18: "binaryPrecedence"
19: "isIdentifierStart"
20: "isIdentifierPart"
21: "name"

(and would add all the subsequent ones too if I moved this to the end of the file)

Copy link
Collaborator

@LeaVerou LeaVerou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! See 2 minor comments (more like questions).

@LeaVerou LeaVerou merged commit 25fcb53 into EricSmekens:master Apr 28, 2021
@LeaVerou
Copy link
Collaborator

Merged! Thanks again!

@EricSmekens can we add @6utt3rfly as a collaborator? I think it's very well deserved.

@6utt3rfly 6utt3rfly deleted the es6-class branch April 29, 2021 04:02
@EricSmekens
Copy link
Owner

Yes, I fully agree! Will do that in a few minutes!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants