Collecting Requirements for Per-Language Splitting #88

Open
LorisSigrist opened this issue Apr 22, 2024 — with Linear · 4 comments

LorisSigrist commented Apr 22, 2024

Context

Paraglide currently splits messages by component / page. If you load a page with three client components (or your framework's equivalent), only the messages for those three components are sent to the client. However, those messages are currently sent in all languages. Ideally, we would only send messages in the language that is displayed.

This issue collects ideas on how that could be achieved.

Expected Impact - Case Study Inlang.com

The average translation (1 message in one language) on Inlang.com is about 50 - 60 bytes. Multiply that by the number of languages (7) and you get the average impact per message: about 400 bytes.

There are about 200 messages on the Website, but because of per-page splitting only an average of 20 are loaded when you go to a page. This leaves us with a bundle-size impact of 400 * 20 = 8kB per page on average.

If we got per-language splitting to work on top of that it could save 6 out of 7 bytes, leaving us at just over 1kB. This would be a huge win, but only if the language-splitting adds less than 7kB to the client bundle.

Inlang.com has 7 languages, which is more than most sites. Usually you would have between 2 and 4. So the actual size limit for the per-language splitting runtime would be about 2kB. For context: i18next is 40kB.
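The arithmetic above can be written out as a small sketch. The type and function names are made up for illustration, and 57 bytes is an assumed midpoint of the quoted 50 - 60 byte range; the point is only to show where the "about 8kB", "just over 1kB", and "about 2kB" figures come from.

```typescript
// Rough per-page cost model using the figures quoted in this issue.
// All names here are illustrative; none of them are part of Paraglide.
interface SiteProfile {
  bytesPerTranslation: number; // one message in one language (assumed: 57)
  languages: number;
  messagesPerPage: number; // what per-page splitting already leaves us with
}

// Current behaviour: every message on the page ships in every language.
function bytesAllLanguages(p: SiteProfile): number {
  return p.bytesPerTranslation * p.languages * p.messagesPerPage;
}

// Goal: ship only the displayed language.
function bytesSingleLanguage(p: SiteProfile): number {
  return p.bytesPerTranslation * p.messagesPerPage;
}

// Per-language splitting only pays off if its runtime is smaller than this.
function runtimeBudget(p: SiteProfile): number {
  return bytesAllLanguages(p) - bytesSingleLanguage(p);
}

const inlang: SiteProfile = { bytesPerTranslation: 57, languages: 7, messagesPerPage: 20 };
const typical: SiteProfile = { bytesPerTranslation: 57, languages: 3, messagesPerPage: 20 };

console.log(bytesAllLanguages(inlang)); // 7980 (about 8kB)
console.log(bytesSingleLanguage(inlang)); // 1140 (just over 1kB)
console.log(runtimeBudget(inlang)); // 6840 (less than 7kB)
console.log(runtimeBudget(typical)); // 2280 (about 2kB)
```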

Work done so far

We have already tried a few approaches & run into various challenges.

  • Copying the routes/ directory for each language & using middleware to multiplex between the different builds based on language.
    • Imports from in/out of the routes/ folder are incredibly fragile
    • Doesn't work for all routers
    • Only works if the framework has a rewrite mechanism
  • Post-processing the build output by copying each output file for each language and replacing messages with the language-specific version.
    • Doesn't work with compressed build outputs
    • Introduces various linking issues

Fundamentally this is a dynamic linking problem in a world of ESM and static linking, which is really hard.

Another promising idea that we haven't tried yet is to serialize the messages and pass them along with the page data. However, there are open questions about how we would know which messages need to be sent.
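As a rough sketch of what serializing messages alongside page data might look like: the catalog, `pickMessages`, `serializePayload`, and `createLookup` are all hypothetical names, not Paraglide APIs. The list of message ids is simply passed in as a parameter, because working out that list is exactly the open question mentioned above.

```typescript
type Locale = string;
type Messages = Record<string, string>;

// Hypothetical catalog; a real one would come from the compiled messages.
const catalog: Record<Locale, Messages> = {
  en: { greeting: "Hello", farewell: "Goodbye" },
  de: { greeting: "Hallo", farewell: "Tschüss" },
};

// Server side: keep only the requested ids, in the active locale only.
function pickMessages(locale: Locale, ids: string[]): Messages {
  const source: Messages = catalog[locale] ?? {};
  const picked: Messages = {};
  for (const id of ids) {
    if (Object.prototype.hasOwnProperty.call(source, id)) {
      const value = source[id];
      if (value !== undefined) picked[id] = value;
    }
  }
  return picked;
}

// Server side: embed the picked messages next to the page data.
function serializePayload(pageData: unknown, locale: Locale, ids: string[]): string {
  return JSON.stringify({ pageData, locale, messages: pickMessages(locale, ids) });
}

// Client side: build a lookup from the serialized payload.
// Unknown ids fall back to the id itself so a missing message stays visible.
function createLookup(payloadJson: string): (id: string) => string {
  const parsed = JSON.parse(payloadJson) as { messages: Messages };
  return (id: string) => parsed.messages[id] ?? id;
}

const payload = serializePayload({ title: "Home" }, "de", ["greeting"]);
const t = createLookup(payload);
console.log(t("greeting")); // "Hallo"
```

Only the German text for `greeting` ends up in the payload here. A message that was not listed, such as `farewell`, is not available on the client, which is why knowing the full set of ids up front matters.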

Note: Lazy Loading is not the Solution

Any solution using fetch or await import is bound to introduce a render-fetch waterfall which drastically increases Time-To-Interactive. Eagerly loading messages in all languages is preferable in the vast majority of cases.

Most projects have between 2 and 4 languages; lazy-loading only becomes justifiable at more than 10.

@LorisSigrist LorisSigrist added the Feature label Apr 22, 2024 — with Linear
@LorisSigrist LorisSigrist self-assigned this Apr 22, 2024

osdiab commented Apr 30, 2024

Keenly watching this. Seems like a core make or break feature that determines if this library can truly scale.

LorisSigrist (Member, Author) commented

Per-Language splitting is one of our big goals!

That being said, Paraglide already scales really well. Because of its small footprint (tiny runtime, minified message ids, per-client-component splitting), it stays small even when shipping extra languages.

We did some benchmarks on this:

  • As long as you stay under 5 Languages Paraglide already is the smallest choice.
  • If you're using a Framework with Server-Components / Islands / Some sort of partial hydration it stays the best choice for up to 10 languages.

Per-Language splitting will make it so that paraglide stays the best regardless of how many languages you have, but for a lot of projects it's already the best choice.


osdiab commented May 24, 2024

"Another promising idea that we haven't tried yet is to serialize the messages & pass them along with the page-data. However, there are open questions on how we would know which messages need to be sent"

Maybe leveraging AsyncLocalStorage (NextJS already seems to use this for headers()) to have a request context for this could help, having the translation functions add to a list at runtime?

LorisSigrist (Member, Author) commented

That's an interesting idea, however, that likely only catches the messages that are actually executed during server-rendering, not messages that are used conditionally. We would need those too.
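A minimal sketch of the AsyncLocalStorage idea and its limitation, assuming a Node.js server. `AsyncLocalStorage` is the real Node API from `node:async_hooks`; `defineMessage`, `renderPage`, and `renderWithTracking` are hypothetical stand-ins for compiled message functions and a framework's server render.

```typescript
import { AsyncLocalStorage } from "node:async_hooks";

// Per-request set of message ids that were called during server rendering.
const usedMessages = new AsyncLocalStorage<Set<string>>();

// Hypothetical stand-in for a compiled message function:
// it records its own id in the current request context, if there is one.
function defineMessage(id: string, text: string): () => string {
  return () => {
    usedMessages.getStore()?.add(id);
    return text;
  };
}

const greeting = defineMessage("greeting", "Hello");
const errorHint = defineMessage("error_hint", "Something went wrong");

// Hypothetical page: error_hint is only used conditionally.
function renderPage(hasError: boolean): string {
  return hasError ? `${greeting()} ${errorHint()}` : greeting();
}

// Run the render inside a fresh context and report which ids were touched.
function renderWithTracking(hasError: boolean): { html: string; ids: string[] } {
  const ids = new Set<string>();
  const html = usedMessages.run(ids, () => renderPage(hasError));
  return { html, ids: Array.from(ids) };
}

const result = renderWithTracking(false);
console.log(result.ids); // ["greeting"]
```

With `hasError` false on the server, `error_hint` is never recorded, so it would not be sent. If the client later hits the error branch, the message is missing, which is the gap described in the reply above.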
