Add WASM support #14
Comments
Indeed, WebAssembly is definitely an interesting topic. I have not dealt with it so far but would like to at some point in the future. Do you know whether modern browsers are capable of storing all language models on the client side? Are there any memory limitations? Yes, please let me know about anything you find out. I'm curious about these things. Feel free to send me a pull request if you like. Adding WASM support to Lingua would surely be an exciting project. |
Cool to see there's interest in this, because I might have a solution. I'm looking to have lingua work on […] While looking into it I found two issues that I made changes for in #19.
bzip2
As Zachary pointed out, the […]
rayon
In most Wasm environments (except maybe when using WASI in the future) there's no access to threading. Unfortunately, unlike the […]
|
Thanks for the reminder. I had totally forgotten about this and had gone down the rabbit hole of learning about wasm. If you're going to be using C# have you looked into using Blazor? |
I haven't actually had a use case for Blazor yet. As I understand it (and I might be wrong), Blazor works by having the full runtime compiled as Wasm, which then allows for loading of normal DLLs containing your code and any of its dependencies. So in that scenario you don't actually compile your code to Wasm. What I'm trying to do is pretty different: I want to use Rust from C#, just not in the standard native FFI way that you'd achieve with P/Invoke in .NET. It's basically WebAssembly as the portable compilation target for running lightweight isolated modules on the server. So far, with the modifications from the linked PR, I was able to compile some code using lingua into a Wasm module that I can load and use from C# with no trouble. But I suspect language detectors such as lingua won't be the best candidates to use in this way, since they rely on lots of things being loaded into memory (language models and the like), which comes at a relatively heavy "startup" cost. So ideally you'd keep this state around and reuse it for further requests, but that goes against the typical execution model for WebAssembly, because a module instance is supposed to be short-lived and you'd want to recreate the environment for subsequent requests to maintain isolation. |
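To make the "keep this state around" idea concrete, here is a minimal Rust sketch of loading expensive state once and reusing it across calls. Everything in it is hypothetical: `Models`, `load_models`, `handle_request` and the tiny word table are stand-ins for illustration and are not lingua's API. It also shows the trade-off described above, since state shared this way is exactly what a fresh, isolated module instance per request would throw away.

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Counts how often the expensive load actually runs (illustration only).
static LOADS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the language models a detector keeps in memory.
struct Models {
    word_language: HashMap<&'static str, &'static str>,
}

/// Stand-in for the costly startup step.
fn load_models() -> Models {
    LOADS.fetch_add(1, Ordering::SeqCst);
    let mut word_language = HashMap::new();
    word_language.insert("hello", "English");
    word_language.insert("hallo", "German");
    Models { word_language }
}

/// Returns the shared models, loading them on first use only.
fn models() -> &'static Models {
    static MODELS: OnceLock<Models> = OnceLock::new();
    MODELS.get_or_init(load_models)
}

/// Handles one "request" by reusing the already loaded state.
pub fn handle_request(word: &str) -> Option<&'static str> {
    models().word_language.get(word).copied()
}

/// Number of times the models were loaded so far.
pub fn load_count() -> usize {
    LOADS.load(Ordering::SeqCst)
}
```

Calling `handle_request` repeatedly within one instance pays the load cost only once; whether a Wasm host keeps the instance alive between requests is a deployment decision outside this sketch.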
Hi @martindisch, I apologize for my late response. I was busy writing the Python implementation of the library. Thank you for your effort to make my library compatible with WASM environments. As far as Rayon is concerned, I actually favor the first option of making it an optional but default feature. For users who have disabled all default features, the library will not break but only run on a single CPU core. I think people are capable of reading updated documentation and adding the Rayon feature again, that is not much work to do. The alternative, namely adding a feature that disables Rayon, is ugly in my opinion. That's not the approach that I want for my library. I will make some updates to your pull request and then merge it nevertheless. Thanks again for your work. :) |
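As a rough illustration of the "optional but default feature" approach, the sketch below gates a parallel code path behind a Cargo feature and falls back to a sequential one when the feature is off. It assumes a feature named `rayon` that is listed in the crate's `default` features and backed by an optional `rayon` dependency; `score` and `score_all` are hypothetical helpers, not functions from lingua.

```rust
// Hypothetical helpers, not lingua's actual code.

#[cfg(feature = "rayon")]
use rayon::prelude::*;

/// Stand-in for per-item work such as scoring a text against a language model.
fn score(text: &str) -> usize {
    text.chars().filter(|c| c.is_alphabetic()).count()
}

/// Parallel path, compiled only when the (assumed) `rayon` feature is enabled.
#[cfg(feature = "rayon")]
pub fn score_all(texts: &[&str]) -> Vec<usize> {
    texts.par_iter().map(|t| score(t)).collect()
}

/// Sequential fallback for targets without threads, e.g. most Wasm environments.
#[cfg(not(feature = "rayon"))]
pub fn score_all(texts: &[&str]) -> Vec<usize> {
    texts.iter().map(|t| score(t)).collect()
}
```

With this layout, a user targeting Wasm would depend on the crate with `default-features = false` and get the single-core path, while everyone else keeps the parallel behaviour without changing anything.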
No worries, all in good time. I like your approach, have at it! Let me know if you want any help. And there's no rush, it's just something that came up in an experiment and there's no expectation this has to make it into the library. I'm just happy it exists! By the way, we evaluated a bunch of language detection libraries (mainly from the C# ecosystem) at work and yours was the uncontested winner. Great job and thanks for all the effort you're putting in! |
Wow, that's cool. Thank you for your kind words. :-) I'm still a bit surprised that nobody has come up with the algorithm that I use. I've always wanted to contribute something useful to the open source community and I'm very happy that I've found something. Best regards to Switzerland. (-: |
@martindisch @zacharywhitley I've finally found the time to make the library compile to WASM. Would you like to test it? The easiest way to compile is to use
In your HTML, you can then call it like this, for instance:
<script type="module">
    import init, { LanguageDetectorBuilder } from './pkg/lingua.js';
    init().then(_ => {
        const detector = LanguageDetectorBuilder.fromAllLanguages().build();
        console.log(detector.computeLanguageConfidenceValues("languages are awesome"));
    });
</script>
I will add unit tests for the WASM module later on. |
I hope that means you are feeling better. That's great. I'll take a look, thanks. |
Yes, I do. :) Fortunately, after three vaccinations I wasn't as sick as I had feared I would be. |
Wow, you even made a nice wrapper that lets people conveniently use it from JS, therefore opening up the library to a whole other ecosystem. That's going above and beyond! It's nice to see Rust really shine when it comes to compiling to or interoperating with other targets and languages. I can definitely tell you that since it now compiles to Wasm it works for me already, that's all that's needed for embedding it in different environments. I'll post some news here about my latest experiment this weekend, although don't expect too much, since as I was saying it comes with some considerable downsides. |
All done, you can check it out at #54. |
This isn't quite specific to lingua-rs, but I've been looking into WebAssembly lately and it would be great to be able to use lingua-rs in a Wasm project. I did an initial test, but it failed on a problem with bzip2-sys. I'll have to keep looking into it and let you know what I find, but I thought you might be interested.