Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow web-tree-sitter to work with Emscripten 3 #1913

Merged
merged 5 commits into from Nov 16, 2022

Conversation

J3RN
Copy link
Contributor

@J3RN J3RN commented Oct 16, 2022

Changes are explained in-line.

Originally I tried implementing the "Basic Usage" from inside a React application, and encountered this error message:
Screenshot from 2022-10-16 17-51-22

Per its advice, I changed the emscripten flags in both script/build-wasm (for tree-sitter's WASM) and cli/src/wasm.rs (for the grammar's WASM) to include -s ASSERTIONS=1 and made the following changes based on the issues surfaced.

Fixes #1560

@@ -83,6 +83,8 @@ pub fn compile_language_to_wasm(language_dir: &Path, force_docker: bool) -> Resu
"-s",
"NODEJS_CATCH_EXIT=0",
"-s",
"NODEJS_CATCH_REJECTION=0",
Copy link
Contributor Author

@J3RN J3RN Oct 16, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After enabling assertions, my first error was the generated tree-sitter grammar's WASM catching a Promise rejection. Sadly, I have not been able to recreate this for a screenshot.

The NODEJS_CATCH_REJECTION flag (which enables/disables that behavior) is already used in script/build-wasm

@@ -77,6 +77,8 @@ fi

mkdir -p target/scratch

runtime_methods='stringToUTF16','AsciiToString'
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next, I encountered this error:
Screenshot from 2022-10-16 18-17-01

After some googling I discovered that stringToUTF16 is from emscripten's preamble and just needed exporting.

Next, the line console.log(tree.rootNode.toString()); threw this error:
Screenshot from 2022-10-16 18-25-15

I quickly determined that the cause is the same and added it as well.

@verhovsky
Copy link
Contributor

verhovsky commented Oct 17, 2022

I tried to check your changes. I took the repro that I posted in #1560, here:

https://gist.github.com/verhovsky/a04c7256da13797e312acd0a8afd9023

and changed its package.json to point to my local copy of tree-sitter, like this:

    "web-tree-sitter": "../tree-sitter/lib/binding_web/",
    "tree-sitter-cli": "../tree-sitter/cli"

I made the same changes as you made in this PR to my local copy of tree-sitter and then I ran build-wasm in my local copy of tree-sitter

cd ../tree-sitter
cargo build --release
./script/build-wasm

then in the repro directory I ran

rm -rf node_modules
npm install
npm run build
npm start

But I see this error in the browser:

Uncaught (in promise) TypeError: Cannot read properties of undefined (reading 'apply')

Screen Shot 2022-10-17 at 03 21 47

If I change npm run build in package.json to do

    "build": "tree-sitter build-wasm --docker node_modules/tree-sitter-python",

then everything works, because the Docker has Emscripten 2, instead of Emscripten 3 which made the breaking changes it seems like you're trying to fix here.

@J3RN
Copy link
Contributor Author

J3RN commented Oct 17, 2022

@verhovsky I was able to replicate this issue. Building web-tree-sitter's WASM in debug mode yields the following error instead:
image

This isn't something that can be exported like stringToUTF16, and so I'm digging a bit deeper now, starting with trying the suggestion.

@J3RN
Copy link
Contributor Author

J3RN commented Oct 17, 2022

Silly me! The solution was to add _memset to exports.json!
image

@J3RN J3RN changed the title Allow web-tree-sitter and generated WASM to run in the browser Allow web-tree-sitter to work with Emscripten 3 Oct 17, 2022
@@ -24,6 +24,7 @@
"_memcmp",
"_memcpy",
"_memmove",
"_memset",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the issue from running tree-sitter-python, see my comment

@verhovsky
Copy link
Contributor

I tried a few more languages (tree-sitter-bash, -c, -cpp, -java, -javascript, -php and -python) and I can confirm this works. Great work

@verhovsky
Copy link
Contributor

verhovsky commented Oct 25, 2022

I updated my repro gist so now it tries to Parser.Language.load all the official tree-sitter-* packages and they are all able to load except tree-sitter-html which fails with Assertion failed: undefined symbol '__cxa_atexit'. in the console when trying to load it (I did build-wasm --debug for tree-sitter.wasm this time, like you suggested):

Screen Shot 2022-10-23 at 00 25 20

@J3RN
Copy link
Contributor Author

J3RN commented Oct 31, 2022

@verhovsky I don't have a good setup for testing this right now, but I just pushed a change that I believe will fix the issue you encountered with tree-sitter-html. Would you mind testing it and reporting back?

@verhovsky
Copy link
Contributor

@J3RN that works. I did try this but I didn't realize I needed to keep adding underscores until it starts working lol

@verhovsky
Copy link
Contributor

verhovsky commented Oct 31, 2022

Please change cli/emscripten-version to "3.1.24" and after that I think this should finally be good to merge.

It's currently set to "2.0.24" so the CI fails with this error:

emcc: error: undefined exported symbol: "___cxa_atexit" [-Wundefined] [-Werror]

I did a (quasi-)binary search changing that file to various versions and running cargo build --release && ./script/build-wasm --debug --docker and found that 3.1.1 is the first version that doesn't fail with the above error, but we might as well set it to the latest Emscripten version.

@maxbrunsfeld
Copy link
Contributor

Thanks for the PR @J3RN, and for helping out with it @verhovsky. It looks like the issues that were blocking us from upgrading to Emscripten 3 are fixed now. It'll be great to be on the latest version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Grammars built with Emscripten 3 don't load in browser: RuntimeError: abort(Assertion failed: undefined)
3 participants