build-wasm: generated .wasm file excessively large and very slow #1

Closed
cancerberoSgx opened this issue Jun 15, 2019 · 5 comments

Comments

@cancerberoSgx

Similar to tree-sitter/tree-sitter-ocaml#30

build-wasm will generate a 7MB .wasm file.

Updating this project's dependencies brings the file down to 3MB, which is still very big compared to the others.

But the real problem is that running it with node.js is very slow: it takes 50s to parse a trivial statement like x = 10.

I tried to run tree-sitter build-wasm both using docker and local emcc with the same results.

I tried to dig in, and I can't see any differences between this project and others like tree-sitter-rust, bash, etc.

Basically the project is not usable while executing wasm. :(

@maxbrunsfeld
Contributor

After generating with the latest tree-sitter CLI, the WASM binary is now 3.8M. I'm not sure why it's so large - the native compiled library is only 1.8M. The wasm binary gzips down to 186K though, so the size isn't too ridiculous.
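The raw-versus-gzipped comparison above can be checked locally. This is a minimal sketch, assuming Node.js and its built-in `zlib` module; the helper name `gzipSizes` and the file path in the usage comment are hypothetical placeholders, not part of tree-sitter.

```javascript
const zlib = require('zlib');

// Return the raw and gzip-compressed byte sizes of a buffer.
function gzipSizes(bytes) {
  const gzipped = zlib.gzipSync(bytes);
  return { raw: bytes.length, gzipped: gzipped.length };
}

// Hypothetical usage; the path is a placeholder:
// const fs = require('fs');
// console.log(gzipSizes(fs.readFileSync('path/to/tree-sitter-julia.wasm')));
```

The gzipped figure is the more relevant one when the file is served over HTTP with compression enabled; the raw size still matters for how much the runtime has to compile.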

I can't reproduce the performance issue using the tree-sitter web-ui command. Large files seem to parse as quickly as with any tree-sitter parser. How are you measuring the performance?

@maxbrunsfeld
Contributor

Compiling this parser to wasm (and to native) does take longer than compiling some of the other parsers, because the lexing function is more complex due to all of the Unicode operators that Julia allows.

@cancerberoSgx
Author

I'm currently running it with node.js (tests), not in the browser. I can see that it works fine in web-ui. I assumed that if it was slow in node it would be slow in the browser too, but I hadn't tested it in the browser.

I will keep investigating and either close this or update it. If it is a problem only in node, then it's not a big deal, and I'm not worried if it takes longer to compile the wasm. The output size is also acceptable. The current problem is that, in node.js, it takes 50s to load the wasm.

The tests in node.js do something like the following, and the bottleneck is not in parse() but in Parser.Language.load():

const Parser = require('web-tree-sitter');
(async () => {
  await Parser.init();
  const parser = new Parser();
  const Lang = await Parser.Language.load('path/to/tree-sitter-julia.wasm');
  parser.setLanguage(Lang);
  const tree = parser.parse('x = 1');
  console.log(tree.rootNode.toString());
})();

Thanks again. I'm investigating, and if it only happens in node.js I will close this.
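One way to confirm that the time goes into Parser.Language.load() rather than parse() is to time each step separately. This is a rough sketch, assuming Node.js and its built-in `perf_hooks` module; `timeStep` and `measureLoadAndParse` are hypothetical helper names, and the `Parser` module is passed in as an argument rather than required directly.

```javascript
const { performance } = require('perf_hooks');

// Run a (possibly async) step and report how long it took in milliseconds.
async function timeStep(label, fn) {
  const start = performance.now();
  const result = await fn();
  const ms = performance.now() - start;
  return { label, ms, result };
}

// Time init, load, and parse separately. `Parser` is expected to be the
// module returned by require('web-tree-sitter'); `wasmPath` and `source`
// are supplied by the caller.
async function measureLoadAndParse(Parser, wasmPath, source) {
  const timings = [];
  timings.push(await timeStep('init', () => Parser.init()));
  const parser = new Parser();
  const load = await timeStep('load', () => Parser.Language.load(wasmPath));
  timings.push(load);
  parser.setLanguage(load.result);
  timings.push(await timeStep('parse', () => parser.parse(source)));
  // Drop the step results and keep only the labels and durations.
  return timings.map(({ label, ms }) => ({ label, ms }));
}

// Hypothetical usage; the path is a placeholder:
// measureLoadAndParse(require('web-tree-sitter'),
//   'path/to/tree-sitter-julia.wasm', 'x = 1').then(console.table);
```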

@cancerberoSgx
Author

cancerberoSgx commented Jun 18, 2019

I think it has to do with emscripten-core/emscripten#6633 (comment)

Changing https://github.com/tree-sitter/tree-sitter/blob/177ba49e57abaf2bfe247ea3becdb2e3a425de86/lib/binding_web/binding.js#L591 to (async false):

.then(bytes => Promise.resolve( loadWebAssemblyModule(bytes, {loadAsync: false})) )

makes the setLanguage and parse calls behave as expected. Still, there seem to be unterminated threads (or something similar), since the program won't terminate; even calling process.exit(0) doesn't terminate it:

...
const Lang = await Parser.Language.load('path/to/tree-sitter-julia.wasm');
parser.setLanguage(Lang);
const tree = parser.parse('x = 1');
process.exit(0)
...
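To check whether synchronous and asynchronous compilation really differ on a given Node.js version, the two standard WebAssembly entry points can be timed side by side. This is only a sketch: it assumes that `loadAsync: false` roughly corresponds to synchronous compilation, which has not been verified against binding.js here. The helper names are hypothetical, and the embedded bytes are just the 8-byte header of an empty module, so real timings need the actual .wasm file.

```javascript
const { performance } = require('perf_hooks');

// The smallest valid WebAssembly module: magic number plus version 1.
const EMPTY_MODULE = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);

// Compile synchronously; blocks until compilation finishes.
function compileSync(bytes) {
  return new WebAssembly.Module(bytes);
}

// Compile asynchronously; returns a Promise that resolves to a module.
function compileAsync(bytes) {
  return WebAssembly.compile(bytes);
}

// Measure both paths on the same bytes and return the durations in ms.
async function compareCompile(bytes) {
  let start = performance.now();
  compileSync(bytes);
  const syncMs = performance.now() - start;

  start = performance.now();
  await compileAsync(bytes);
  const asyncMs = performance.now() - start;

  return { syncMs, asyncMs };
}

// Hypothetical usage; the path is a placeholder:
// const fs = require('fs');
// compareCompile(fs.readFileSync('path/to/tree-sitter-julia.wasm')).then(console.log);
```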

@maxbrunsfeld
Contributor

The parser has recently gotten smaller and now compiles more quickly. The WASM module works well for me in the web UI, so I'm going to close this out.
