
Performance improvement and potential tools to adopt #35

Open
StringKe opened this issue Nov 10, 2023 · 33 comments

Comments

@StringKe

I have a 5MB file that needs to be processed. I've tried running it directly in Node, and it takes up to 21GB of memory in some transform processing.

However, some of the processing may fail, so maybe it would be possible to save the data that was processed successfully, and reuse it next time?

@pionxzh
Owner

pionxzh commented Nov 10, 2023

Did you clone the project and manually run it directly in Node.js? I want to confirm how you ran the whole thing.

Can you share the source file here, or privately via email?
21GB is quite scary 😢

@StringKe
Author

Did you clone the project and manually run it directly in Node.js? I want to confirm how you ran the whole thing.

Can you share the source file here, or privately via email? 21GB is quite scary 😢

Yes, I copied the source code to run it.

What is your email address? You can base64-encode it and reply to me.

@pionxzh

This comment was marked as outdated.

@pionxzh pionxzh added the enhancement New feature or request label Nov 14, 2023
@pionxzh
Owner

pionxzh commented Nov 14, 2023

I can confirm that this file will OOM during the unpacking process.

@0xdevalias

The following issue is also probably tangentially relevant to this, as parsing a large file may result in files getting silently lost along the way:

@pionxzh
Owner

pionxzh commented Nov 16, 2023

Yes, they are relevant.

@0xdevalias

0xdevalias commented Nov 17, 2023

I've tried running it directly in node and it takes up to 21G of memory space in some transform processing.

I can confirm that this file will OOM during the unpacking process.

My suggestion here would be a rather major change, so it's not worth looking into too deeply unless it turns out there is no good/efficient way to manage this with the current AST parsers/etc. But one thing I stumbled across/was thinking about the other day is that the swc / tree-sitter / etc Rust parsers can apparently be used from JS apps, and that might allow us to run the unminify process much faster and/or in a more memory-efficient way:

These links/resources probably aren't exhaustive; but figured I would share them as a starting point in case this was a path that was worth looking into at some stage:

tree-sitter / web-tree-sitter

swc / @swc/wasm-web / swc_ecma_parser

ast-grep

Etc

  • https://github.com/rustwasm/wasm-pack
    • This tool seeks to be a one-stop shop for building and working with rust-generated WebAssembly that you would like to interop with JavaScript, in the browser or with Node.js. wasm-pack helps you build rust-generated WebAssembly packages that you could publish to the npm registry, or otherwise use alongside any javascript packages in workflows that you already use.

    • https://rustwasm.github.io/wasm-pack/book/

@pionxzh
Owner

pionxzh commented Nov 19, 2023

I decided to suspend this request for a while. This feature will be implemented in the new web IDE instead of the current playground. In the meantime, you can use the CLI to run it on your local machine.

@0xdevalias

@pionxzh Oo, new web IDE sounds interesting. Are you able to tell us more about your plan/roadmap with it, and what you're hoping it will allow/enable?

@pionxzh
Owner

pionxzh commented Nov 20, 2023

I tried the sample with the new CLI.

It took ~2GB and 60 seconds for the unpacker to pass (and it failed to unpack into multiple files 😞; I will check further!).

Generated 1 modules from main.ba3b216f.js to out (61,265.4ms)

For unminify, it took 3.4 hours... to finish the whole process. Memory usage is from 100MB ~ 1GB ~ 2GB depending on the rule.

I will improve the CLI to write files more frequently during the unminify process, and add a --perf flag for recording the time and memory usage.

@StringKe
Author

With the CLI, you may need to prompt the user to pass Node's V8 flag --max-old-space-size to extend the memory available to Node when it runs out of memory.

@pionxzh
Owner

pionxzh commented Nov 20, 2023

Are you able to tell us more about your plan/roadmap with it, and what you're hoping it will allow/enable?

@0xdevalias I have created a package /packages/ide which provides a vscode-like experience. I want to use it to replace the existing playground. I am still trying to figure out the idea haha.. Module mapping won't be exposed; instead, you can directly edit the filename in the file explorer. And two tabs will be provided for the compare view, which you can drag and organize in any form you like. The editor would be Monaco, which offers a better editing experience. Plus features like download, better unminify handling, error reporting...

But I might drop the idea and simply put Monaco into the existing playground.

@pionxzh
Owner

pionxzh commented Nov 20, 2023

@StringKe We can add some hints in the CLI usage description. But I didn't hit an OOM at any point with your sample.

@0xdevalias

I have created a package /packages/ide which provides a vscode-like experience.

@pionxzh Oo.. nice! I created a new issue for this, and will add my /2c over there so as not to clutter this thread too much:


For unminify, it took 3.4 hours... to finish the whole process. Memory usage is from 100MB ~ 1GB ~ 2GB depending on the rule.

I will improve the CLI to write files more frequently during the unminify process, and add a --perf flag for recording the time and memory usage.

@pionxzh With the --perf flag, it would be interesting to see how much time is spent in each different section of the CLI/etc. I'm not sure how hard it would be to record stats/data that could be loaded into something like the Chrome DevTools profiler to show a flamegraph of where the time was spent, but that could help in knowing where to optimise first.

@pionxzh
Owner

pionxzh commented Nov 20, 2023

I think you don't need additional work on the CLI to export the flamegraph? It should be the same as how people profile a normal Node.js project.

@0xdevalias

0xdevalias commented Nov 20, 2023

Should be same as how people measure normal nodejs project

@pionxzh True true.. I think I got confused when you were talking about adding a --perf flag; the flamegraph/profiling tooling tends to include timing/memory data already, doesn't it?

I asked ChatGPT, and it gave some good tool suggestions/how to, included in the expandable below:

Some notes from ChatGPT on profiling

tl;dr:

To get granular timing and performance data for a JavaScript library and a Node.js CLI application, you can use several tools and methods. Here's a step-by-step guide:

1. Node.js Built-in Profiler

Node.js has a built-in profiler that can be used to capture CPU profiles. You can run your Node.js application with the --prof flag. For example:

node --prof your-app.js

This will generate a .log file with profiling data.

2. Convert Profiling Data for Chrome DevTools

To analyze the profiling data in Chrome DevTools, you need to convert the .log file to a format that Chrome can understand. You can use the node --prof-process command for this:

node --prof-process isolate-0xnnnnnnnnnnnn-v8.log > processed-profile.txt

3. Visualizing with Chrome DevTools

Note that the --prof-process output is a plain-text report meant to be read directly. To get an interactive flame chart in Chrome DevTools, record a .cpuprofile instead (for example with node --cpu-prof) and load it via the DevTools "JavaScript Profiler" panel.

4. Using Third-party Tools

a) 0x

0x is a powerful tool for generating flame graphs for Node.js applications. Install it globally using npm:

npm install -g 0x

Then, run your Node.js application with 0x:

0x your-app.js

0x will automatically generate an HTML file with an interactive flame graph.

b) clinic.js

Clinic.js is another comprehensive profiling toolset. It can be installed via npm:

npm install -g clinic

To profile your application, use:

clinic doctor -- node your-app.js

or

clinic flame -- node your-app.js

This will generate detailed reports and flame graphs.

5. Analyzing the Data

Once you have the flame graphs and performance reports:

  • Look for "hot" functions that consume a lot of CPU time.
  • Identify patterns of high memory usage.
  • Examine async operations and their impact on performance.

6. Optimizing Based on Data

  • Refactor or optimize hot functions.
  • Reduce unnecessary computations or memory allocations.
  • Optimize async flows and database queries if they are the bottlenecks.

Additional Tips

  • Ensure that your Node.js version is up-to-date for the best profiling support.
  • Sometimes, running these tools in a production-like environment can give more accurate results.

Using these steps, you can capture detailed performance data for your JavaScript library and Node.js CLI application, analyze it using tools like Chrome DevTools, and identify areas for optimization.


Running the previous tools on my sample webpack'd code (Ref):

cd ./unpacked/_next/static/chunks

0x:

⇒ npx 0x --open -- node ../../../../node_modules/@wakaru/unpacker/dist/cli.cjs 496.js -o ./496-v2
🔥  ProfilingGenerated 51 modules from 496.js to 496-v2 (5,408.6ms)
🔥  Flamegraph generated in
file:///Users/REDACTED/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/62569.0x/flamegraph.html

image

 ⇒ npx 0x --open -- node ../../../../node_modules/@wakaru/unminify/dist/cli.cjs ./496-v2/*.js -o ./496-v2-unminified
🔥  Profiling• Transforming 496-v2-unminified/496-v2/module-10604.js (1,681.9ms)
• Transforming 496-v2-unminified/496-v2/module-13282.js (650.8ms)
• Transforming 496-v2-unminified/496-v2/module-17915.js (1,994.3ms)
• Transforming 496-v2-unminified/496-v2/module-180.js (183ms)
• Transforming 496-v2-unminified/496-v2/module-19051.js (158ms)
• Transforming 496-v2-unminified/496-v2/module-21437.js (911.3ms)
• Transforming 496-v2-unminified/496-v2/module-2368.js (551.6ms)
• Transforming 496-v2-unminified/496-v2/module-24148.js (226.5ms)
• Transforming 496-v2-unminified/496-v2/module-25094.js (1,217ms)
• Transforming 496-v2-unminified/496-v2/module-25345.js (674.5ms)
• Transforming 496-v2-unminified/496-v2/module-25422.js (451.2ms)
• Transforming 496-v2-unminified/496-v2/module-30931.js (975.5ms)
• Transforming 496-v2-unminified/496-v2/module-31721.js (551.3ms)
• Transforming 496-v2-unminified/496-v2/module-32165.js (625.8ms)
• Transforming 496-v2-unminified/496-v2/module-32689.js (424.6ms)
• Transforming 496-v2-unminified/496-v2/module-36112.js (486.7ms)
• Transforming 496-v2-unminified/496-v2/module-36716.js (1,069.3ms)
• Transforming 496-v2-unminified/496-v2/module-37541.js (113.7ms)
• Transforming 496-v2-unminified/496-v2/module-38631.js (151.8ms)
• Transforming 496-v2-unminified/496-v2/module-44925.js (64ms)
• Transforming 496-v2-unminified/496-v2/module-45036.js (27,773.9ms)
• Transforming 496-v2-unminified/496-v2/module-46110.js (839.3ms)
[un-jsx] children from attribute and arguments are both present: jsx(g.Z, {
# ..snip..
• Transforming 496-v2-unminified/496-v2/module-48101.js (1,249.6ms)
[un-jsx] children from attribute and arguments are both present: jsx("div", { className: "text-yellow-500", children: e }, t)
[un-jsx] children from attribute and arguments are both present: jsx("div", { className: "text-red-500", children: e }, t)
• Transforming 496-v2-unminified/496-v2/module-49910.js (876.8ms)
• Transforming 496-v2-unminified/496-v2/module-5046.js (283.5ms)
• Transforming 496-v2-unminified/496-v2/module-56244.js (872.3ms)
• Transforming 496-v2-unminified/496-v2/module-57311.js (3,180.6ms)
• Transforming 496-v2-unminified/496-v2/module-57924.js (428.4ms)
• Transforming 496-v2-unminified/496-v2/module-61119.js (917.9ms)
• Transforming 496-v2-unminified/496-v2/module-62440.js (160.7ms)
[un-jsx] children from attribute and arguments are both present: jsx(w.Z, {
# ..snip..
[un-jsx] children from attribute and arguments are both present: jsx("div", {
# ..snip..
[un-jsx] children from attribute and arguments are both present: jsx(Et, { x: i, children: e }, t)
• Transforming 496-v2-unminified/496-v2/module-63390.js (5,776.7ms)
[un-jsx] children from attribute and arguments are both present: jsx(m.E.div, {
# ..snip..
[un-jsx] children from attribute and arguments are both present: jsxs(N.Z, {
# ..snip..
[un-jsx] children from attribute and arguments are both present: jsx(N.Z, {
# ..snip..
• Transforming 496-v2-unminified/496-v2/module-63727.js (4,760.3ms)
• Transforming 496-v2-unminified/496-v2/module-66523.js (447.6ms)
• Transforming 496-v2-unminified/496-v2/module-69403.js (309.9ms)
• Transforming 496-v2-unminified/496-v2/module-697.js (266.3ms)
• Transforming 496-v2-unminified/496-v2/module-74437.js (263.9ms)
• Transforming 496-v2-unminified/496-v2/module-75179.js (274.9ms)
• Transforming 496-v2-unminified/496-v2/module-75515.js (181.2ms)
• Transforming 496-v2-unminified/496-v2/module-75527.js (4,662ms)
• Transforming 496-v2-unminified/496-v2/module-76559.js (382.7ms)
• Transforming 496-v2-unminified/496-v2/module-77442.js (367.2ms)
• Transforming 496-v2-unminified/496-v2/module-85449.js (322.9ms)
• Transforming 496-v2-unminified/496-v2/module-86433.js (529.2ms)
• Transforming 496-v2-unminified/496-v2/module-86526.js (97.1ms)
• Transforming 496-v2-unminified/496-v2/module-86573.js (1,461.6ms)
• Transforming 496-v2-unminified/496-v2/module-870.js (268ms)
• Transforming 496-v2-unminified/496-v2/module-87105.js (6,067.3ms)
• Transforming 496-v2-unminified/496-v2/module-87316.js (194.1ms)
• Transforming 496-v2-unminified/496-v2/module-90076.js (1,509.4ms)
• Transforming 496-v2-unminified/496-v2/module-94860.js (621ms)
• Transforming 496-v2-unminified/496-v2/module-97732.js (1,447.6ms)
🔥  Flamegraph generated in
file:///Users/REDACTED/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/62934.0x/flamegraph.html

image


clinic:

⇒ npx clinic doctor -- node ../../../../node_modules/@wakaru/unpacker/dist/cli.cjs 496.js -o ./496-v2
To generate the report press: Ctrl + C
Generated 51 modules from 496.js to 496-v2 (4,701.4ms)
Analysing data
Generated HTML file is file:///Users/REDACTED/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/.clinic/65647.clinic-doctor.html

image

⇒ npx clinic doctor -- node ../../../../node_modules/@wakaru/unminify/dist/cli.cjs ./496-v2/*.js -o ./496-v2-unminified
To generate the report press: Ctrl + C
• Transforming 496-v2-unminified/496-v2/module-10604.js (891.1ms)
# ..snip..
Analysing data
Generated HTML file is file:///Users/REDACTED/dev/0xdevalias/REDACTED/unpacked/_next/static/chunks/.clinic/65956.clinic-doctor.html

image

⇒ npx clinic flame -- node ../../../../node_modules/@wakaru/unminify/dist/cli.cjs ./496-v2/*.js -o ./496-v2-unminified
To generate the report press: Ctrl + C
• Transforming 496-v2-unminified/496-v2/module-10604.js (1,591.3ms)
# ..snip..
[==  ] Analysing dataRangeError: Invalid string length
    at JSON.stringify (<anonymous>)
    at zeroEks (/Users/devalias/dev/0xdevalias/chatgpt-source-watch/node_modules/0x/index.js:56:51)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (node:internal/process/task_queues:96:5)

@pionxzh
Owner

pionxzh commented Nov 28, 2023

I will try ast-grep for simple tasks. I have set up some benchmarks that can help us make the decision.

@pionxzh
Owner

pionxzh commented Nov 29, 2023

The benchmark shows that ast-grep can be 3-4 times faster and requires a smaller memory footprint. However, at present we only have the napi binding, which can only run in the CLI. A WASM build is required for the browser environment (playground). I'm not too sure about the future direction of this.

We have several options to consider:

  1. Keep migrating part of the rules to ast-grep, and only replace them on CLI.
  2. Pursue the development of a WASM build for ast-grep. This could involve either initiating our own build or waiting for an official build to become available. Once achieved, we could rewrite part of the core logic like import scanning and simple transformation to ast-grep.
  3. Forget about it and focus on optimizing the current implementation.

It's important to note that while ast-grep fulfills many of our requirements, we do not plan to rewrite complex rules using it. Complex rules often involve scope awareness, advanced structure matching, and AST node building. Therefore, it's challenging to determine whether the overall performance gains will justify the effort required for integration.

@0xdevalias

0xdevalias commented Nov 29, 2023

@pionxzh Was there a reason you chose ast-grep specifically out of the suggestions I made above? What about tree-sitter / web-tree-sitter and similar? They would be usable across both CLI and web, and should theoretically be faster and use less memory too?

I feel like it could be an ideal choice:


I feel like having rules that only work on the CLI and not the web UI will increase maintenance effort, and could easily lead to the two becoming out of sync, which sounds like a bad direction for the project.

Similarly, since you also say you wouldn't use it for more complex rules/rewrites: even if we did have a web-capable build of it, I wonder if it's worth fragmenting the rules and how they're implemented between two tools/methods/styles, as contributors/maintainers would then need to know how to write simpler rules in one tool and more complex rules in another.

If this was a path that was followed, here are some potentially relevant resources; that seem to suggest ast-grep already compiles to webassembly to support their own playground; and that it's seemingly based on tree-sitter / web-tree-sitter anyway, so tree-sitter is probably a more direct parser to look at:


Edit: Looking at the ast-grep GitHub org, there are a few more projects there that might be useful with regards to tree-sitter:


@pionxzh Curious, are your benchmarking tools/etc added to this repo? Or just a local thing?

@pionxzh
Owner

pionxzh commented Nov 29, 2023

Tree-sitter is an excellent component, but it doesn't provide the matcher required by the rules. ast-grep also uses tree-sitter underneath, while providing a robust matcher and replacer. I can also use magic-string to manipulate the source code in combination.

@0xdevalias

0xdevalias commented Nov 29, 2023

@pionxzh When you say "doesn't provide the matcher", what do you mean? tree-sitter has a built in query language; that from my brief time reading through it, sounds pretty powerful?

  • https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries
    • Pattern Matching with Queries
      Many code analysis tasks involve searching for patterns in syntax trees. Tree-sitter provides a small declarative language for expressing these patterns and searching for matches.

    • A query consists of one or more patterns, where each pattern is an S-expression that matches a certain set of nodes in a syntax tree. The expression to match a given node consists of a pair of parentheses containing two things: the node’s type, and optionally, a series of other S-expressions that match the node’s children.

    • You can also constrain a pattern so that it only matches nodes that lack a certain field.

    • When matching patterns, you may want to process specific nodes within the pattern. Captures allow you to associate names with specific nodes in a pattern, so that you can later refer to those nodes by those names.

    • You can also specify arbitrary metadata and conditions associated with a pattern by adding predicate S-expressions anywhere within your pattern.

      • Note — Predicates are not handled directly by the Tree-sitter C library. They are just exposed in a structured form so that higher-level code can perform the filtering. However, higher-level bindings to Tree-sitter like the Rust Crate or the WebAssembly binding do implement a few common predicates like the #eq?, #match?, and #any-of? predicates explained above.

      • To recap about the predicates Tree-Sitter’s bindings support:

        • #eq?: checks for a direct match against a capture or string
        • #match?: checks for a match against a regular expression
        • #any-of?: checks for a match against a list of strings
        • Adding not- to the beginning of any of these predicates will negate the match
        • By default, a quantified capture will only match if all of the nodes match the predicate
        • Adding any- before the eq or match predicates will instead match if any of the nodes match the predicate
    • etc
  • https://tree-sitter.github.io/tree-sitter/using-parsers#the-query-api
    • Create a query by specifying a string containing one or more patterns
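As a concrete illustration of the query syntax quoted above, here's what a pattern for `void 0` expressions (a transform discussed elsewhere in this thread) might look like, assuming the tree-sitter-javascript grammar's node and field names:

```scheme
; Match `void 0` (tree-sitter-javascript grammar), capturing the whole
; expression so a rewriter can later replace it with `undefined`.
((unary_expression
   operator: "void"
   argument: (number) @operand) @void-expr
 (#eq? @operand "0"))
```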

It's also seemingly commonly used as the basis for 'code navigation systems', which could be useful in the web-IDE side of things:

  • https://tree-sitter.github.io/tree-sitter/code-navigation-systems
    • Tree-sitter can be used in conjunction with its tree query language as a part of code navigation systems. An example of such a system can be seen in the tree-sitter tags command, which emits a textual dump of the interesting syntactic nodes in its file argument.

      A notable application of this is GitHub’s support for search-based code navigation.

      This document exists to describe how to integrate with such systems, and how to extend this functionality to any language with a Tree-sitter grammar.

    • Tagging and captures
      Tagging is the act of identifying the entities that can be named in a program. We use Tree-sitter queries to find those entities. Having found them, you use a syntax capture to label the entity and its name.
      The essence of a given tag lies in two pieces of data: the role of the entity that is matched (i.e. whether it is a definition or a reference) and the kind of that entity, which describes how the entity is used (i.e. whether it’s a class definition, function call, variable reference, and so on). Our convention is to use a syntax capture following the @role.kind capture name format, and another inner capture, always called @name, that pulls out the name of a given identifier.

      You may optionally include a capture named @doc to bind a docstring.


This 'magic string'?

  • https://github.com/Rich-Harris/magic-string
    • magic-string

    • Suppose you have some source code. You want to make some light modifications to it - replacing a few characters here and there, wrapping it with a header and footer, etc - and ideally you'd like to generate a source map at the end of it. You've thought about using something like recast (which allows you to generate an AST from some JavaScript, manipulate it, and reprint it with a sourcemap without losing your comments and formatting), but it seems like overkill for your needs (or maybe the source code isn't JavaScript).

    • Your requirements are, frankly, rather niche. But they're requirements that I also have, and for which I made magic-string. It's a small, fast utility for manipulating strings and generating sourcemaps.

  • https://github.com/sxzz/magic-string-ast
    • Extend Babel AST for magic-string

@pionxzh
Owner

pionxzh commented Nov 30, 2023

Sorry, I didn't check the whole document.

This is the test result on Node.js (wasm on Node.js is slower than the native binding, but ¯\_(ツ)_/¯)
image

Perf of the void 0 → undefined transformation. Tree-sitter's performance is good on large samples.

  • 10 LOC: 1100 | 1000
  • 100 LOC: 135 | 467
  • 1000 LOC: 7 | 81
  • 5000 LOC: 1? | 16

Its API is not so convenient but acceptable. I feel it's powerful, but there is a learning curve. Might need to take some time to really read through the documentation.


This 'magic string'?

Yes. The first link you provided.

@0xdevalias

0xdevalias commented Dec 1, 2023

Sorry, I didn't check the whole document.

@pionxzh No worries :)

This is the test result on Node.js (wasm on Node.js is slower than the native binding, but ¯\_(ツ)_/¯)

@pionxzh Is that graph showing tree-sitter wasm on the left and native on the right? Or is that jscodeshift/babel on the left, and tree-sitter on the right?

Wasm is known to have some overhead that makes it slower than native bindings, so that makes sense.

Its API is not so convenient but acceptable. I feel it's powerful, but there is a learning curve. Might need to take some time to really read through the document.

@pionxzh nods that makes sense. I haven't worked with it properly before, so can't give any real world experiences on it; but from reading through the documentation, it sounds like it's pretty powerful.

Curious, are your benchmarking tools/etc added to this repo? Or just a local thing?

@pionxzh You might have missed this question earlier.

@pionxzh
Owner

pionxzh commented Dec 1, 2023

Oh, you can check the benchmark at /benches. Run with npx tsx ./benches/un-undefined.ts. I have a branch for ast-grep, but I forgot to push it haha. And there is another branch for tree-sitter, still on my PC. Will push them when I get home.

@0xdevalias

0xdevalias commented Dec 1, 2023

Oh, you can check the benchmark at /benches.

@pionxzh Oh true! I totally missed that somehow (committed 3cd85f6). Thanks :)

I have a branch for ast-grep, but I forgot to push it haha. And there is another branch for tree-sitter, still on my PC. Will push them when I get home.

@pionxzh Haha. Sounds good, thanks! :)


There was another parser project I stumbled across while looking at ast-grep's own benchmarks (Ref), though I haven't looked into it much yet. From a quick skim it sounds interesting as well, and it claims that its parser is 2x faster than swc:

@pionxzh
Owner

pionxzh commented Dec 1, 2023

Oh I know this project, will check em. I think now the requirements would be:

  1. Having a wasm binding
  2. Provides AST query functionality (being able to go to parent and child nodes would be a plus)
  3. Query results provide range information

There is a wasm binding, but it's not so ready-to-use. It's currently built for their playground. Not sure, will check more on these options.

@0xdevalias

0xdevalias commented Dec 1, 2023

I think now the requirement would be...

@pionxzh nods, these make sense and sound reasonable.

Query results provide range information

@pionxzh Though curious, what is this aspect needed for?

There is a wasm binding, but it's not so ready-to-use. It's currently built for their playground.

@pionxzh Yeah, sounds like it's not very maintained currently; or at the very least, it's not currently published:

The playground's package.json loads it as such:

There doesn't seem to be a npm/wasm-web in the main repo:

But the website/package.json has scripts to build it:

Some other historical context:

From a brief skim of the issues, some of these might be relevant/of interest also:

The issue on benchmarking linked to swc's benchmark page (which seems to have a really cool 'chart over time' to see performance based on commits, which seems to use this tool:

Boa's benchmarks also look similar, and seem to use a tool called criterion for it:

@0xdevalias

0xdevalias commented Dec 1, 2023

ast grep also uses tree sitter underlying, providing a robust matcher and replacer.

Since I got curious about the specifics of ast-grep's matcher/replacer, a few resources related to the higher level docs for it:

And digging into the code (as it might be useful to understand aspects of it if going down the ast-grep and/or tree-sitter path):

This might also be an interesting/useful read:

@0xdevalias

0xdevalias commented Dec 4, 2023

There's a similar 'performance' related issue over on the j4k0xb/webcrack repo:

While the full thread may be of interest, here are some particularly relevant snippets based on some of the alternative parsers/etc they have explored:


that the swc / tree-sitter / etc Rust/etc parsers can apparently be used from JS apps

I experimented a while ago with using rust parsers and converting back to the babel AST format: https://github.com/coderaiser/swc-to-babel

The parsing is faster but unfortunately transferring all the objects to the JS context has so much overhead that its slower than the babel parser. So it should most likely be entirely JS or a native language, not mixed.

Another WIP JS-only attempt with https://github.com/meriyah/meriyah + https://github.com/j4k0xb/estree-to-babel/tree/perf could speed up parsing from 1.5s to 900ms for a large file

Not sure if it's worth it since parsing takes up like 2% and transforms 98% of the time

Originally posted by @j4k0xb in j4k0xb/webcrack#24 (comment)


How did you go about transfering the objects back to JS out of curiosity?

It's serializing it with serde_json::to_string and deserializing with JSON.parse.

Idk how to benchmark how long the native parsing vs serializing takes, here's the combined time for a 1.2MB bundle (380k nodes):

  • swc parse: 959 ms (bindings.parseSync 491 ms + JSON.parse 466 ms)
  • babel parse: 442 ms
  • meriyah parse: 171 ms

or alternatively, maybe a way to keep the actual AST objects within the rust side of things, but be able to manipulate them from JS functions/similar?

Definitely a possibility, but again the whole project would have to be rewritten in rust

sometimes it might take an unreasonable amount of time and/or memory to try and unbundle them

Do you have examples where this happens? The current time/memory usage is not that unreasonable imo:

  • 1.2MB webpack bundle: 1.1s unpack, 6s total, 450mb memory
  • 7.2MB obfuscated browserify bundle: 0.3s unpack, 14s total, 1.1GB memory
  • 8MB webpack bundle: 4s unpack, 17s total, 1.2GB memory

Originally posted by @j4k0xb in j4k0xb/webcrack#24 (comment)

@0xdevalias

0xdevalias commented Dec 4, 2023

And here are some ideas I've had/been exploring around how to 'ease the overhead' of crossing back and forth between the JS/Rust boundary, and avoid needing to rewrite the entire project in Rust to benefit from Rust-based parsers:

Definitely a possibility, but again the whole project would have to be rewritten in rust

Yeah.. that would be less than ideal.

I don't know how possible it is, or if there are existing libraries/etc that would make it easier, but my original thought in this area was if it was possible to basically 'describe the transformation to be made' on the JS side of things, and then pass that through to the rust side for it to do the 'heavy lifting'.

If that were possible, then it should remove the need to convert the whole project to rust, and also would remove the performance penalty of needing to pass the whole AST back and forth across the rust/JS boundary.

The 2 ways I was initially pondering that this might be possible were:

  • a rust library that exposes an 'AST transformation DSL', such that the JS side could use that DSL to describe the transformations, then pass those to the rust side of things to actually process/run
  • a rust library that can run/interpret some basic JS code within rust itself, and use that to apply the transformations (that are still written in JS)
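The first of those two ideas (an 'AST transformation DSL') could be sketched as plain data; the shape and field names here are invented for illustration:

```javascript
// Hypothetical sketch: describe a transformation as serializable data on the
// JS side, so only this small description (not the whole AST) crosses the
// JS/Rust boundary.
const transform = {
  match: {
    type: 'UnaryExpression',
    operator: 'void',
    argument: { type: 'NumericLiteral', value: 0 },
  },
  replaceWith: { type: 'Identifier', name: 'undefined' },
};

// The JS side would serialize the description once...
const payload = JSON.stringify(transform);
// ...and a Rust engine (e.g. behind a napi call) would walk the AST
// and apply every match, never shipping the tree back to JS.
```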

There was also one more idea I was pondering, though not sure if it would be impractical due to needing to cross back and forth across the rust/JS boundary (this idea is less thought through than the others above; and realistically could probably work with/alongside them, particularly the DSL idea):

  • somehow do the main 'AST walking'/modification on the rust side of things, but be able to specify when it should pass a smaller subset of the AST back to the JS side for more specific processing
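As a rough plain-JS sketch of that last idea, an ordinary recursive walker stands in for the Rust side here, and the callback marks the only point where the rust/JS boundary would actually be crossed (all names hypothetical):

```javascript
// Hypothetical sketch: the heavy tree walk stays "native" (a plain recursive
// walker stands in for Rust here), and only the nodes the predicate selects
// cross over to the JS callback for more specific processing.
function walk(node, shouldVisit, onMatch) {
  if (shouldVisit(node)) onMatch(node); // the only boundary crossing
  for (const child of node.children ?? []) {
    walk(child, shouldVisit, onMatch);
  }
}

const tree = {
  type: 'Program',
  children: [
    { type: 'FunctionDeclaration', name: 'a', children: [] },
    { type: 'VariableDeclaration', children: [] },
    { type: 'FunctionDeclaration', name: 'b', children: [] },
  ],
};

const found = [];
walk(tree, (n) => n.type === 'FunctionDeclaration', (n) => found.push(n.name));
console.log(found); // prints: [ 'a', 'b' ]
```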

Some related ideas/explorations based on the above potentials were:

Originally posted by @0xdevalias in j4k0xb/webcrack#24 (comment)

@pionxzh
Owner

pionxzh commented Dec 4, 2023

Thanks for the insight. https://github.com/meriyah/meriyah looks interesting. I will spend some time looking into it.

I wonder if there are any methods that could be used to optimize that in a way that would speed things up; or alternatively, maybe a way to keep the actual AST objects within the rust side of things, but be able to manipulate them from JS functions/similar?

Actually that's how ast-grep works. Here is a quote from Benchmark TypeScript Parsers: Demystify Rust Tooling Performance:

Tree-sitter and ast-grep avoid serde overhead by returning a tree object rather than a full AST structure. Accessing tree nodes requires invoking Rust methods from JavaScript, which distributes the cost over the reading process.
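A conceptual plain-JS sketch of that handle-based pattern: the parse result is an opaque handle, and node data is computed only when a method is called, rather than serializing the entire AST up front. Everything here is illustrative, not the real binding API; see @ast-grep/napi for the actual interface.

```javascript
// Conceptual sketch of handle-based AST access: the tree object is opaque,
// and the parse/serde cost is amortized over reads instead of paid up front.
class LazyTree {
  constructor(source) {
    this.source = source; // in the real binding this stays on the Rust side
    this.cached = null;
  }
  root() {
    // Node data is materialized only on first access.
    if (this.cached === null) {
      this.cached = { statements: this.source.split(';').filter(Boolean) };
    }
    return this.cached;
  }
}

const tree = new LazyTree('const a = 1;const b = 2;');
console.log(tree.root().statements.length); // prints: 2
```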

btw, we are kind of off the topic in this long thread lol. Let me update the title.

@pionxzh pionxzh changed the title Split code and save progress? Performance improvement and potential tools to adopt Dec 4, 2023
@pionxzh pionxzh added discussion and removed enhancement New feature or request labels Dec 4, 2023
@0xdevalias

0xdevalias commented Dec 4, 2023

Thanks for the insight

@pionxzh No worries :)


I will spend some time looking into it.

@pionxzh Sounds good. Would be interested to hear what you think of it once you do.


Actually that's how ast-grep works.

@pionxzh Oh true. I was thinking it probably did, particularly after skimming the source the other day (Ref), but I still wasn't 100% sure.


Here is a quote from Benchmark TypeScript Parsers: Demystify Rust Tooling Performance:

tree-sitter and ast-grep avoid serde overhead by returning a tree object rather than a full AST structure. Accessing tree nodes requires invoking Rust methods from JavaScript, which distributes the cost over the reading process.

@pionxzh That's super interesting and neat to know. For future reference, here's a link to the non-medium-paywalled version of that article (Ref). It was a really interesting read in general!

The results of the benchmarks for synchronous performance were pretty interesting. I definitely didn't expect swc to perform so poorly in general, nor babel to unexpectedly perform so much better on the medium-sized file than most other parsers (I wonder what made it perform so well there?). It was also interesting to see that ast-grep consistently beat tree-sitter on its own, despite using tree-sitter as its parser.

[chart: synchronous parsing benchmark results across parsers]

It was also really interesting in the async parsing to see just how much ast-grep seems to dominate everything; again seeming to perform way better than the tree-sitter parser it's built on top of:

[chart: async parsing benchmark results across parsers]

These notes towards the end were also interesting/worth paying attention to:

Native Parser Performance Tricks

tree-sitter & ast-grep's Edge

These parsers manage to bypass serde costs post-parsing by returning a Rust object wrapper to Node.js. This strategy, while efficient, can lead to slower AST access in JavaScript as the cost is amortized over the reading phase.

ast-grep's async advantage:

ast-grep's performance in concurrent parsing scenarios is largely due to its utilization of multiple libuv threads. By default, the libuv thread pool size is set to four, but there's potential to enhance performance further by expanding the thread pool size, thus fully leveraging the available CPU cores.

Cool to see this note at the very end aligns with one of the thoughts I had too!

Shifting Workloads to Rust: The creation of a domain-specific language (DSL) tailored for AST node querying could shift a greater portion of computational work to the Rust side, enhancing overall efficiency.


we are kind of off the topic in this long thread lol. Let me update the title.

@pionxzh Haha yeah, I was thinking that recently as well, it certainly seems to have evolved into more of a general 'performance optimisation' exploration.


Was skimming through the ast-grep repo/issues, and the following looked like potentially interesting ones to watch for future improvements (both features, and performance). The tl;dr is that they seem to have a lot of little performance improvement ideas in mind already:

@pionxzh
Owner

pionxzh commented Jan 21, 2024

@StringKe

  • Data is now stored in IndexedDB, no more size limitation. Playground refactor #106
  • A file size checker and the --max-old-space-size hint have been implemented.
