Find a more compact compression algorithm for language model files #189

Closed
pemistahl opened this issue May 29, 2023 · 1 comment
Labels
enhancement New feature or request
Milestone

Comments

@pemistahl
Owner

The JSON language models are currently compressed as zip files. For WASM compilation and use of the library in the browser, the language model files should be as small as possible. Let's investigate whether another compression algorithm produces smaller files than zip does.

A promising candidate could be the Brotli algorithm.
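One way to check such a candidate is to compress the same payload with both algorithms and compare the byte counts. The following is only a minimal sketch, not the project's actual tooling: `sample_model` is a made-up stand-in for a real language model file, `compressed_sizes` is a hypothetical helper, and the Brotli measurement assumes the third-party `brotli` Python package is installed (it is skipped otherwise). DEFLATE via `zlib` stands in for zip here, since zip archives typically use DEFLATE.

```python
import json
import zlib

try:
    import brotli  # third-party package, optional for this sketch
except ImportError:
    brotli = None


def compressed_sizes(payload: bytes) -> dict:
    """Return the size in bytes of the payload, raw and compressed."""
    sizes = {
        "raw": len(payload),
        # DEFLATE at maximum level, the method zip archives typically use
        "deflate": len(zlib.compress(payload, 9)),
    }
    if brotli is not None:
        # Brotli at its highest quality setting
        sizes["brotli"] = len(brotli.compress(payload, quality=11))
    return sizes


# Hypothetical stand-in for a JSON language model file
sample_model = {
    "language": "ENGLISH",
    "ngrams": {f"ngram{n}": f"{n}/1000" for n in range(500)},
}
payload = json.dumps(sample_model).encode("utf-8")

sizes = compressed_sizes(payload)
for name, size in sizes.items():
    print(f"{name}: {size} bytes")
```

The ratio between the `deflate` and `brotli` entries depends heavily on the input, so the comparison is only meaningful when run on the real model files.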

@pemistahl pemistahl added the enhancement New feature or request label May 29, 2023
@pemistahl pemistahl added this to the Lingua 1.5.0 milestone May 29, 2023
@pemistahl pemistahl changed the title from "Find a more compact compression algorithm for json files" to "Find a more compact compression algorithm for language model files" May 29, 2023
@pemistahl
Owner Author

With Brotli compression, the language model files now take up 15% less storage space than with the former zip compression. Not bad.
