Vite and WASM files #661

Closed
wscourge opened this issue May 22, 2024 · 15 comments · Fixed by #662 or #664

Comments

@wscourge

I'd love an example on how to use the package with Vite - can't get it quite right as of yet. Here's what I've tried so far:

vite.config.ts

import { defineConfig } from "vite"
import { viteStaticCopy } from "vite-plugin-static-copy"

export default defineConfig(async () => ({
  plugins: [
    // ...other plugins
    viteStaticCopy({
      targets: [
        {
          src: 'node_modules/web-tree-sitter/tree-sitter.wasm',
          dest: 'public/tree-sitter.wasm',
        },
        {
          src: 'node_modules/curlconverter/dist/tree-sitter-bash.wasm',
          dest: 'public/tree-sitter-bash.wasm',
        },
      ],
    }),
  ],
  // ...other config
}));

But I'm not sure what else I need to do. In my React code, I do the following

import * as cURL from "curlconverter"

const Try = () => {
  return (<button onClick={() => cURL.toAnsible("curl example.com")}>curlconverter</button>)
}

The errors I get when trying it:

[Error] Unhandled Promise Rejection: TypeError: Load failed
[Warning] wasm streaming compile failed: TypeError: Unexpected response MIME type. Expected 'application/wasm' (curlconverter.js, line 271)
[Warning] falling back to ArrayBuffer instantiation (curlconverter.js, line 271)
[Warning] failed to asynchronously prepare wasm: CompileError: WebAssembly.Module doesn't parse at byte 0: module doesn't start with '\0asm' (curlconverter.js, line 260)
[Warning] Aborted(CompileError: WebAssembly.Module doesn't parse at byte 0: module doesn't start with '\0asm') (curlconverter.js, line 202)
[Error] Unhandled Promise Rejection: RuntimeError: Aborted(CompileError: WebAssembly.Module doesn't parse at byte 0: module doesn't start with '\0asm'). Build with -sASSERTIONS for more info.

any help/advice would be much appreciated - thanks!

@verhovsky
Member

verhovsky commented May 22, 2024

Check the Network tab of the devtools to see what file path is actually getting requested and the actual data being sent. I bet it's requesting localhost:8888/tree-sitter.wasm and getting an HTML 404 page, which is obviously not a valid WASM file. Check the layout of the build directory (does Vite have something like that?) to see where the actual file is placed, it looks like it'll be localhost:8888/public/tree-sitter.wasm. curlconverter needs both Wasm files to be served from the root of your website, like localhost:8888/tree-sitter.wasm and localhost:8888/tree-sitter-bash.wasm
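The "module doesn't start with '\0asm'" error above means the response body did not begin with the four-byte WebAssembly magic number, which is what happens when an HTML page is served in place of the file. A minimal sketch of that check follows; the function name `looksLikeWasm` and the sample byte arrays are illustrative assumptions, not part of curlconverter or tree-sitter.

```typescript
// First four bytes of every WebAssembly binary: "\0asm".
const WASM_MAGIC = [0x00, 0x61, 0x73, 0x6d];

// Returns true when the buffer starts with the WebAssembly magic number.
// (Hypothetical helper for debugging, not a library function.)
function looksLikeWasm(bytes: Uint8Array): boolean {
  return (
    bytes.length >= WASM_MAGIC.length &&
    WASM_MAGIC.every((b, i) => bytes[i] === b)
  );
}

// An 8-byte empty module (magic number + version 1).
const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
// The bytes of "<!doctype", as an HTML fallback page would start.
const htmlPage = new Uint8Array([0x3c, 0x21, 0x64, 0x6f, 0x63, 0x74, 0x79, 0x70, 0x65]);

console.log(looksLikeWasm(emptyModule)); // true
console.log(looksLikeWasm(htmlPage)); // false
```

In the browser console, the same helper could be fed `new Uint8Array(await (await fetch("/tree-sitter.wasm")).arrayBuffer())` to confirm what the server actually returns for that path.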

@verhovsky
Member

verhovsky commented May 22, 2024

The fact that wasm files aren't just

import foo from 'foo.wasm'

which could be manipulated by all the javascript tools that already work with importing javascript and instead are

const foo = WebAssembly.instantiateStreaming(fetch('/some/url/foo.wasm'))

(which also makes the object async, so technically all the javascript functions that ever use it have to be async, but that's "solved" with a top-level await, which also causes problems, because it's not supported by some tools)

really sucks. Like all the open issues on this repo are either this problem or requests for generating code in various other languages.

I would like to replace it with a JavaScript-based parser at some point.

@wscourge
Author

I confirmed that files placed in the public directory are successfully served at the project's root. However, in my network tab I can see a single tree-sitter.wasm request that misleadingly returns HTTP 200, because it is sent to http://localhost:1420/index.html/tree-sitter.wasm for some unknown reason:

image

and (correctly) responds with my index.html file:

image

Do you have any idea why it sends an /index.html-prefixed request?

@verhovsky
Member

It's 200 because Vite and other single-page app frameworks handle the URL in JavaScript, so it just always returns index.html which contains some js that then determines that the user should be shown a 404 page after all.

To make it load /tree-sitter.wasm instead of /index.html/tree-sitter.wasm, you could fork curlconverter and then change this

await Parser.init();

to

await Parser.init({
  locateFile(scriptName: string, scriptDirectory: string) {
    return scriptName;
  },
});

https://github.com/tree-sitter/tree-sitter/tree/master/lib/binding_web#running-wasm-in-browser
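One caveat with returning the bare `scriptName`: a relative file name is still resolved by the browser against the page URL, so it may keep landing under /index.html/. A hedged variant that always returns a root-absolute path is sketched below. The name `locateAtRoot` is hypothetical; it only follows the `locateFile(scriptName, scriptDirectory)` signature quoted above.

```typescript
// Same parameters as the locateFile callback quoted above, but the
// script directory is deliberately ignored (hypothetical helper).
function locateAtRoot(scriptName: string, _scriptDirectory: string): string {
  // A leading "/" makes the browser resolve the file against the site root.
  return "/" + scriptName;
}

console.log(locateAtRoot("tree-sitter.wasm", "http://localhost:1420/index.html/"));
// /tree-sitter.wasm
```

Assuming `Parser.init` accepts the options object shown in the snippet above, this would be passed as `await Parser.init({ locateFile: locateAtRoot })`.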

@wscourge
Author

Seems like a sane default for this package tho, doesn't it? You already require it to be at the server root, so it only makes sense to actually ignore the scriptDirectory altogether, right?

@verhovsky
Member

guess so.

@verhovsky
Member

verhovsky commented May 22, 2024

So it turns out that tree-sitter (Emscripten-compiled Wasm in general) loads tree-sitter.wasm relative to the directory of the JavaScript file that loads it, not relative to the HTML file that includes the JavaScript that loads it, like this

https://github.com/emscripten-core/emscripten/blame/e3c4213c90830fb7931a7f1e54f97fc8cf21c0aa/src/shell.js#L392

https://github.com/emscripten-core/emscripten/blame/e3c4213c90830fb7931a7f1e54f97fc8cf21c0aa/src/shell.js#L410

So right now curlconverter requests these two URLs

// actually possibly some older code, as well as more complex logic that checks if it's running in a worker or a blob
scriptDirectory = document.currentScript.src;
scriptDirectory = scriptDirectory.substr(0, scriptDirectory.replace(/[?#].*/, "").lastIndexOf("/") + 1);

scriptDirectory + 'tree-sitter.wasm';
'/tree-sitter-bash.wasm';

We could change it to (as you suggest and as our documentation claims it works)

'/tree-sitter.wasm';
'/tree-sitter-bash.wasm';

or we could do what I tried to do in #649 properly

scriptDirectory + 'tree-sitter.wasm';
scriptDirectory + 'tree-sitter-bash.wasm';
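To compare the three options side by side, here is a small sketch that derives the script directory the way the quoted snippet does and returns the two URLs each option would request. The names `scriptDirectoryOf`, `wasmUrls`, and `Strategy`, and the example script URL, are all assumptions made for illustration.

```typescript
// Mirrors the quoted logic: drop any query string or fragment, then keep
// everything up to and including the last "/".
function scriptDirectoryOf(scriptSrc: string): string {
  const end = scriptSrc.replace(/[?#].*/, "").lastIndexOf("/") + 1;
  return scriptSrc.slice(0, end);
}

type Strategy = "current" | "root" | "scriptRelative";

// Returns [tree-sitter.wasm URL, tree-sitter-bash.wasm URL] for a strategy.
function wasmUrls(strategy: Strategy, scriptSrc: string): [string, string] {
  const dir = scriptDirectoryOf(scriptSrc);
  switch (strategy) {
    case "current": // mixed behavior described above
      return [dir + "tree-sitter.wasm", "/tree-sitter-bash.wasm"];
    case "root": // both files at the server root
      return ["/tree-sitter.wasm", "/tree-sitter-bash.wasm"];
    case "scriptRelative": // both files next to the loading script
      return [dir + "tree-sitter.wasm", dir + "tree-sitter-bash.wasm"];
  }
}

// Hypothetical script URL, for illustration only.
const exampleScriptSrc = "http://localhost:1420/assets/main.js?v=1";
console.log(wasmUrls("current", exampleScriptSrc));
console.log(wasmUrls("root", exampleScriptSrc));
console.log(wasmUrls("scriptRelative", exampleScriptSrc));
```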

@verhovsky
Member

Where are your JavaScript files being loaded from: http://localhost:1420/index.html/main.js or http://localhost:1420/main.js? Or do you have multiple JavaScript files?

@wscourge
Author

wscourge commented May 23, 2024

I've got the following in my index.html:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <!-- <link rel="icon" type="image/svg+xml" href="/vite.svg" /> -->
    <!-- <meta name="viewport" content="width=device-width, initial-scale=1.0" /> -->
    <title>SafeUtils</title>
  </head>

  <body style="overflow:hidden">
    <div id="root" class="windowfocus"></div>
    <script type="module" src="/src/main.tsx"></script>
  </body>
</html>

Found in devtools:

image

and the src/main.tsx is a pretty standard React setup:

import ReactDOM from "react-dom/client";
import App from "./App";

const container = document.getElementById("root") as HTMLElement

ReactDOM.createRoot(container).render(<App />);

Here's the index.html in Devtools:
image

As you can see, there are two additional Vite <script> tags at the top.

This is the development environment; I haven't run it in production yet.

So to answer that:

do you have multiple JavaScript files?

I do have multiple TypeScript files, yet it looks like Vite builds them internally into a single one and then loads it. I'm not familiar with how that works.

@verhovsky
Member

Go to the Sources tab, find web-tree-sitter/tree-sitter.js, unminify it (the { } button at the bottom left), look for "scriptDirectory" and set a breakpoint here

                function locateFile(e) {
                    return Module.locateFile ? Module.locateFile(e, scriptDirectory) : scriptDirectory + e
                }

and check the value of scriptDirectory

Screenshot 2024-05-23 at 09 54 26

Then we need to understand why the value of scriptDirectory is http://localhost:1420/index.html/ when it should be http://localhost:1420/src/, since that's where your main.tsx is coming from. You can do that by running the code that sets it in the JS console

(ENVIRONMENT_IS_WEB || ENVIRONMENT_IS_WORKER) && (ENVIRONMENT_IS_WORKER ? scriptDirectory = self.location.href : void 0 !== document && document.currentScript && (scriptDirectory = document.currentScript.src));
scriptDirectory = 0 !== scriptDirectory.indexOf("blob:") ? scriptDirectory.substr(0, scriptDirectory.replace(/[?#].*/, "").lastIndexOf("/") + 1) : "";

and debugging it.

I'm guessing that it's running in a web worker, so it's doing self.location.href instead of document.currentScript.src.

@verhovsky
Member

Let me know if 4.10.0 doesn't fix your issue

@wscourge
Author

After the update, I get this:

TypeError: "" cannot be parsed as a URL.

image

image

@wscourge
Author

wscourge commented May 24, 2024

It makes sense that the code uses self.location.href - here's what I see in the console:

image

The /#/ comes from the react-router-dom package's hash router and is later treated as a URL fragment, I guess, leaving the /index.html/.
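The fragment-dropping behavior described here can be reproduced with the standard `URL` constructor. This is only a sketch: the page URL below is an assumption modeled on the thread (the screenshot itself is not available), and the variable names are illustrative.

```typescript
// Hypothetical page URL modeled on the thread: index.html followed by a
// slash, with the hash router appending "#/".
const pageUrl = "http://localhost:1420/index.html/#/";

// A bare file name resolves against the page's directory; the fragment
// is discarded during resolution.
const relativeWasm = new URL("tree-sitter.wasm", pageUrl).href;

// A leading "/" resolves against the origin instead.
const rootWasm = new URL("/tree-sitter.wasm", pageUrl).href;

console.log(relativeWasm); // http://localhost:1420/index.html/tree-sitter.wasm
console.log(rootWasm); // http://localhost:1420/tree-sitter.wasm
```

Under that assumption, any relative request for tree-sitter.wasm ends up under /index.html/, which matches the request seen in the network tab.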

@wscourge
Author

wscourge commented May 24, 2024

I tried inspecting the Sources tab as you described, but the Pretty print is disabled for some reason. I think that checking the self.location.href answers your question though.

image

@verhovsky
Member

verhovsky commented May 24, 2024

It's not using location.href, it's using document.currentScript.src and the issue is type="module" in the HTML because

The Document.currentScript property returns the <script> element whose script is currently being processed and isn't a JavaScript module.

https://developer.mozilla.org/en-US/docs/Web/API/Document/currentScript

So my idea of using new URL() doesn't work.
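The failure mode can be sketched as follows, assuming the attempted fix resolved the file name against scriptDirectory with `new URL()` (the thread implies this but does not show the code). In a module script `document.currentScript` is null, so scriptDirectory would stay an empty string, and an empty base cannot be parsed. The helper name `resolveWasm` and its fallback to a root-absolute path are hypothetical, not what curlconverter ships.

```typescript
// Resolves a wasm file name against the script directory, falling back to
// a root-absolute path when the directory is not a usable base URL
// (for example the empty string seen with <script type="module">).
function resolveWasm(fileName: string, scriptDirectory: string): string {
  try {
    return new URL(fileName, scriptDirectory).href;
  } catch {
    // new URL(fileName, "") throws a TypeError because "" is not a valid base.
    return "/" + fileName;
  }
}

console.log(resolveWasm("tree-sitter.wasm", "http://localhost:1420/src/"));
// http://localhost:1420/src/tree-sitter.wasm
console.log(resolveWasm("tree-sitter.wasm", ""));
// /tree-sitter.wasm
```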
