Design for Import/Export APIs to replace storage plugins #1585

samuelstroschein · 2023-11-03T20:14:36Z

This issue has been raised by @martin-lysk. We agreed that importers/exporters are the right way to go but decided at the Berlin Offsite in Oct 23 to work around the issue as long as possible. The first users are now confused about why storage plugins are limiting them.

Problem

Inlang's "provide your storage plugin" setup leads to numerous issues:

inlang's features are limited by the provided storage plugin (a no-go)
users don't understand why certain features work or don't work with a provided storage plugin (e.g. [bug] plugin-inlang-json and plugin-inlang-i18next compile namespaces to invalid js #1577 (comment))

Proposal

Storing messages should be controlled by the SDK to ensure that inlang is not limited by external plugins.

loadMessages should be succeeded by importMessages
saveMessages should be succeeded by exportMessages

message-29sn82
-> json import: login.button
-> paraglide export: login_button
-> ios export: LOGIN_BUTTON
-> android export: login-button
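
The mapping above could be sketched roughly as follows. This is only an illustration, not the SDK's API: `Message`, `Exporter`, `exportKeys`, and the `importedKey` field are hypothetical names, and the key transformations are inferred from the four example keys listed above.

```typescript
// Hypothetical types: none of these names come from the inlang SDK.
type Message = { id: string; importedKey: string };

type Exporter = {
  target: string;
  // Derives the key a target platform expects from the imported key.
  toKey: (importedKey: string) => string;
};

// Assumed naming conventions, inferred from the example keys only.
const exporters: Exporter[] = [
  { target: "paraglide", toKey: (k) => k.replace(/[.-]/g, "_") },
  { target: "ios", toKey: (k) => k.replace(/[.-]/g, "_").toUpperCase() },
  { target: "android", toKey: (k) => k.replace(/[._]/g, "-") },
];

// Computes one exported key per target for a single message.
function exportKeys(message: Message, targets: Exporter[]): Record<string, string> {
  const result: Record<string, string> = {};
  for (const exporter of targets) {
    result[exporter.target] = exporter.toKey(message.importedKey);
  }
  return result;
}

const keys = exportKeys({ id: "message-29sn82", importedKey: "login.button" }, exporters);
console.log(keys);
```

The message id stays stable inside the SDK, while each exporter decides how the key is spelled on its platform.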

Pros

inlang is not limited by external plugins
import/export APIs can expose what features they support
multi-platform exports (export for iOS, Android, Paraglide) become possible
users are told what a target platform supports and what it does not.
we can optimize storage performance instead of naively calling saveMessages() and loadMessages()
importer/exporter plugins could store additional data, e.g. that message id ieb3s should be exported as login-button, or that ieb2s exists in the namespace file en/login.json. This requires .inlang folder to avoid multiple files and enable caching #1418
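
As a rough sketch of "import/export APIs can expose what features they support": an exporter could declare its capabilities so that apps can warn users up front. The `Feature` names, the `ExporterInfo` shape, and the exporter id are assumptions made for illustration, not part of any existing API.

```typescript
// Hypothetical feature names and shapes, chosen for illustration only.
type Feature = "namespaces" | "variants" | "pluralization";

type ExporterInfo = { id: string; supports: Feature[] };

// Returns the features a project uses that the exporter cannot represent.
function unsupportedFeatures(used: Feature[], exporter: ExporterInfo): Feature[] {
  return used.filter((feature) => exporter.supports.indexOf(feature) === -1);
}

// Hypothetical exporter that only understands namespaces.
const jsonExporter: ExporterInfo = { id: "exporter.example.json", supports: ["namespaces"] };

const missing = unsupportedFeatures(["namespaces", "variants"], jsonExporter);
console.log(missing); // the features an app would warn the user about
```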

Cons

effort.
the introduction of .inlang folder to avoid multiple files and enable caching #1418 should be completed with this change too, so that there is a project.inlang folder that avoids massive scatter across a repo

Requirements

allow multiple importers and exporters to be used in a project (load and save messages only support one "storage" plugin at a time)
must allow for "namespacing" logic see [bug] plugin-inlang-json and plugin-inlang-i18next compile namespaces to invalid js #1577 (comment). otherwise existing projects can't migrate to inlang
[bug] Watcher in inlang SDK breaks fink for large language configurations #1769 (comment)
when is export triggered? onSave [can be ignored for now]
how to deal with creation/deletion

samuelstroschein · 2023-11-03T20:24:17Z

@martin-lysk this seems to be a great issue for you. after all, you raised this issue and now it is hurting our growth because users don't understand why feature limitations exist for different storage formats

felixhaeberle · 2023-11-03T23:30:56Z

As discussed in Berlin, this would be an API that finally removes the limitations imposed by external plugins and is therefore a good thing.

What (breaking) code change does this mean?

martin-lysk · 2023-11-08T10:01:08Z

I'm trying to get the whole picture. Collecting the inputs from the referenced tickets, it seems you already have a concept of how this should be integrated. Before I make a proposal that might not match your thinking: shall we have a kickoff about this issue, @samuelstroschein?

Storing messages should be controlled by the SDK to ensure that inlang is not limited by external plugins.

So inlang should come "with batteries included" and define a default way to load and save messages? Shall plugins be able to override this behaviour at all?

loadMessages should be succeeded by importMessages
saveMessages should be succeeded by exportMessages

So instead of loading and saving messages (persistence), plugins would import/export messages from sources like stringsdict files or even APIs like poeditor/localize etc. The imported messages would then be managed by the inlang SDK and stored in the .inlang folder?

Storage Format: How would inlang store messages in the inlang folder?

Are the files supposed to be edited by users directly like it is the case at the moment? (compare paraglide-js preview feedback thread #1464 (reply in thread))

Yes. You can use any storage format you'd like. For example, the JSON storage plugin https://inlang.com/m/ig84ng0o/plugin-inlang-json, which also reduces the clutter of the inlang message format plugin and makes it easier to write translations manually. This question matters for deciding which format we use.

How do we store the data:

What format

using the JSON encoded AST like in (https://inlang.com/m/reootnfj/plugin-inlang-messageFormat)
using the message format schema from mf-wg https://github.com/unicode-org/message-format-wg/tree/main/spec/data-model
other format

How do we split data

1. Store everything in one big JSON file: all messages with their locales/variants

Pro

straightforward - just dump the whole AST JSON into a file like we do in https://inlang.com/m/reootnfj/plugin-inlang-messageFormat
loading messages means loading just one file - easy
...

Con

pulling a change to a single message (made by another editor or pushed to the repo) means fetching all messages with all variants and reloading the whole file, for now
merge conflicts - two edits of different messages might lead to merge conflicts until lix understands the format
loading messages means loading just one file - if the project contains thousands of messages with dozens of languages and variants, this might become a memory issue
...

2. Store Messages split by languages / split by namespaces
Pro

smaller files
if separated by namespace one could load only a subset of messages by a given namespace without loading the whole file
files could get handed over to translators by language/namespace
devs are used to this kind of separation

Con

motivated by the current status quo, not by the needs of a storage format
leads to manual edits of the files that might not be wanted at all
if a problem occurs in one message (for example, a user edited the file manually or a tool had an encoding error), it breaks the whole format - to fix this, the user has to look into huge files

My thoughts:
This is how those files are often stored in the old world, in order to deliver only the needed messages on websites or to load messages into memory only for the current language (iOS language bundles). Since inlang is usually interested in a message as a whole, with all its properties (languages/variants), splitting it this way doesn't make sense for the storage format.

The use case for translators should be managed by import/export plugins instead.

3. Store each message with its locales and variants in a separate file

Pro

each message is an atomic entity, already at the fs level. As long as you don't edit variants or languages of the same message at the same time, you don't have to deal with merge conflicts (even without lix - semantic meaning). Even with simultaneous edits, a last-write-wins approach wouldn't hurt too much
versioning comes out of the box through git's version history
updates to a message get propagated via the file watcher
the format could be checked against the message format schema (not really an argument - nothing prevents us from using the schema in our own schema that wraps the messages in a map)
loading a subset of keys in large projects could be done by filtering filenames
the API we design around this format is more likely to be similar to the one we have when lix can store it as a whole
if a problem occurs in one message (for example, a user edited the file manually or a tool had an encoding error), it affects only that message's file instead of breaking the whole format - the user does not have to look into huge files to fix it
...

Cons

initial load would need to load all files in a folder - thousands of fs.read calls instead of one if the project contains thousands of messages
git might become slow (compare https://www.monperrus.net/martin/one-million-files-on-git-and-github)
we add a lot of files to the inlang folder - people using inlang may not like that
...
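
To make option 3 more concrete, here is a minimal sketch of one file per message, including the "filter by filename" idea. Everything in it is an assumption for illustration: the `Message` shape, the `messages/<id>.json` layout, and the in-memory `Map` that stands in for a real file system.

```typescript
// Hypothetical message shape and file layout; the Map stands in for a real file system.
type Message = { id: string; translations: Record<string, string> };

const files = new Map<string, string>();

function saveMessage(message: Message): void {
  // One file per message: an edit touches exactly one file.
  files.set(`messages/${message.id}.json`, JSON.stringify(message));
}

function loadMessages(idPrefix: string = ""): Message[] {
  const loaded: Message[] = [];
  files.forEach((content, path) => {
    // Filtering on the filename avoids parsing files that are not needed.
    if (path.startsWith(`messages/${idPrefix}`)) {
      loaded.push(JSON.parse(content) as Message);
    }
  });
  return loaded;
}

saveMessage({ id: "login_button", translations: { en: "Log in", de: "Anmelden" } });
saveMessage({ id: "login_title", translations: { en: "Welcome back" } });
saveMessage({ id: "signup_button", translations: { en: "Sign up" } });

const loginMessages = loadMessages("login_");
console.log(loginMessages.map((m) => m.id));
```

The initial load still visits every path, which is the cost named in the cons above; the sketch only shows that a subset can be selected without parsing the other files.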

samuelstroschein · 2023-11-08T14:44:20Z

hey @martin-lysk

The "inlang directory" change (.inlang folder to avoid multiple files and enable caching #1418) should happen before the importer/exporter work.
I have yet to write a proposal for the directory change. I plan to do that early next week.
The directory proposal will make changes to the message format really easy, so we don't have to worry about one big JSON file or not.

When would you start with this, or with the to-be-proposed directory change, so that I know when to write my proposal?

martin-lysk · 2023-11-08T18:04:40Z

I guess the planned iterations in #1459 (comment) will keep me busy this week - I could have a look on Tuesday next week.

samuelstroschein · 2023-11-08T18:12:24Z

@martin-lysk okay, I intend to publish my "inlang directory" proposal Monday/Tuesday, which I would give you to implement. Afterwards, the import/export work can be handled.

martin-lysk · 2023-11-10T11:37:21Z

@janfjohannes

felixhaeberle · 2023-11-23T15:42:15Z

@samuelstroschein @martin-lysk @janfjohannes What's the status here? Who is leading the importer/exporter & should be assigned?

janfjohannes · 2023-11-23T16:17:06Z

@felixhaeberle see Samuel's comment: #1585 (comment)

samuelstroschein · 2023-11-23T16:19:33Z

@felixhaeberle my directory implementation will come first. Progress can be tracked in the https://github.com/inlang/monorepo/tree/1678-project-directory branch

NilsJacobsen · 2023-12-04T08:46:23Z

Another take on Martin's splitting proposals:

While working with @NiklasBuchfink on the editor/sdk watcher, we saw that not having per-message granularity is a nightmare. We watch files with thousands of messages, but we can only lint per message, so as you can imagine that leads to a lot of reactive work. If we had a solution like proposal 3, we could watch individual messages and then update and lint only one message. That would be a huge advantage.

Like Samuel said, I see that we could then run into problems with a large number of files. Building a granular watcher that works with one file but watches every message could solve the problem too, by shifting complexity to the watcher.

-> Being able to watch for only one message should be a requirement for the SDK improvements.

NiklasBuchfink · 2023-12-04T12:59:04Z

We need granular reactivity per pattern that changes. Updating a single pattern should only have the linting for that particular pattern as a side effect. A reactivity matrix is needed where we can observe changes in individual patterns and apply CRUD operations. A mappable watcher that acts like a proxy over the files would be the dream. Basically, we need the same approach for file watching as SolidJS has for its reactivity system. Avoid diffing by creating a pub/sub pattern for each small entity that is reactive.
That's why the normal watcher is the bottleneck; the alternative is to split everything into a thousand files, which has its own limitations as described in suggestion 3.
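
A minimal sketch of the pub/sub idea, assuming one subscriber list per message id so that an update notifies only the listeners of that message and no diffing is needed. `MessageStore` and its methods are hypothetical names, not an existing SDK API.

```typescript
type Listener = (pattern: string) => void;

// Hypothetical store: one subscriber list per message id, so no diffing is needed.
class MessageStore {
  private values = new Map<string, string>();
  private listeners = new Map<string, Listener[]>();

  get(id: string): string | undefined {
    return this.values.get(id);
  }

  subscribe(id: string, listener: Listener): void {
    const current = this.listeners.get(id) ?? [];
    this.listeners.set(id, [...current, listener]);
  }

  update(id: string, pattern: string): void {
    this.values.set(id, pattern);
    // Only listeners of this one message run, e.g. linting for this message.
    for (const listener of this.listeners.get(id) ?? []) {
      listener(pattern);
    }
  }
}

const store = new MessageStore();
const linted: string[] = [];

store.subscribe("login_button", (pattern) => { linted.push(`lint login_button: ${pattern}`); });
store.subscribe("signup_button", (pattern) => { linted.push(`lint signup_button: ${pattern}`); });

store.update("login_button", "Log in");
console.log(linted);
```

Updating `login_button` triggers only its own listener; the `signup_button` subscriber is never called.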

samuelstroschein · 2023-12-04T16:14:14Z

@NiklasBuchfink i created #1817. Let's keep this issue for importer/exporter only.

martin-lysk · 2023-12-05T14:12:58Z

@samuelstroschein what's your take on the scope here? Shall we just add importers/exporters to the inlang SDK and keep the storage topic completely out of the scope of this issue? If so, we would still need a plugin that provides load and save methods, right?

Open questions:

What will become the execution points for importers / exporters - when shall we trigger an import or an export?

Only If storage is part of the ticket (if not we can answer those later):

should the change be backward compatible / should the plugins' loadMessages and saveMessages be marked as deprecated as part of this?
since we plan to reimplement the save logic, and it will most likely need a migration for existing projects, I would iterate on the target persistence format first

Thoughts on import/export triggers:

Compared to the current setup, which accepts only one load and one save, importers and exporters will coexist. A project may have an iOS exporter, an Android exporter, and a JSON exporter all configured at once.
Save is triggered on each change, and load is executed initially and by the watcher. Exports and imports, by contrast, are usually triggered externally and not by events coming from the SDK:

Use cases for Export Plugins

triggers when configured within a ci pipeline as part of the cli
a button within the editor to trigger an export
a button in the editor that allows a developer to download the localization files
... any hooks in the sdk that an exporter should be triggered on?

Use cases for Import Plugins

a CLI command that imports all keys from an existing set of messages - like iOS/Android/....
an external webhook integration (like the one from lokalise https://developers.lokalise.com/docs/webhook-events#projectkeymodified that updates the keys) - this would most likely only be a trigger for an import
an upload of a file, like an iOS strings file, within the editor
... do you see any triggering of an import other than external ones?
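
The trigger model described above could look roughly like this: several exporters are configured side by side, and nothing runs until a caller (a CLI command in CI, an editor button) asks for it. `Exporter`, `runExports`, the exporter ids, and the output formats are assumptions for illustration only.

```typescript
type Message = { id: string; text: string };
type Exporter = { id: string; exportMessages: (messages: Message[]) => string };

// Hypothetical runner: several exporters coexist and run only when a caller asks.
function runExports(
  messages: Message[],
  exporters: Exporter[],
  only?: string[],
): Record<string, string> {
  const output: Record<string, string> = {};
  for (const exporter of exporters) {
    // An optional filter lets a caller trigger a single exporter.
    if (only && only.indexOf(exporter.id) === -1) continue;
    output[exporter.id] = exporter.exportMessages(messages);
  }
  return output;
}

const jsonExporter: Exporter = {
  id: "json",
  exportMessages: (messages) => {
    const record: Record<string, string> = {};
    for (const m of messages) record[m.id] = m.text;
    return JSON.stringify(record);
  },
};

const iosExporter: Exporter = {
  id: "ios",
  exportMessages: (messages) => messages.map((m) => `"${m.id}" = "${m.text}";`).join("\n"),
};

const messages: Message[] = [{ id: "login_button", text: "Log in" }];

// e.g. a CI pipeline exporting everything, and an editor button exporting iOS only
const ciOutput = runExports(messages, [jsonExporter, iosExporter]);
const buttonOutput = runExports(messages, [jsonExporter, iosExporter], ["ios"]);
console.log(ciOutput, buttonOutput);
```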

samuelstroschein · 2023-12-05T14:40:24Z

@martin-lysk do you think a sync call where we draft a spec in Google Docs is quicker than going back and forth on GitHub? If so, let's schedule a call

samuelstroschein · 2023-12-08T21:13:07Z

Issue #1844 will be completed before this one.

samuelstroschein · 2023-12-18T15:52:04Z

This proposal is a reaction to #1844 (comment)

Proposal - introduction of aliases via a map

Introduce an alias map that plugins can use to establish a relationship between a message id and the exported/imported key name.

Pros

plugins can make use of import/export keys
apps can display and edit "alias for i18next is login.button" because the plugin id is known

Cons

?

const message = {
  id: "human_airplane_globe",
+  alias: {
+    "plugin.inlang.i18next": "login.button",
+    "plugin.inlang.xml": "login-button"
+  }
}
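
A sketch of how a plugin might use such an alias map in both directions: resolving the key to write on export, and finding the message for an incoming key on import. `keyFor`, `findByAlias`, and the fallback to the message id are assumptions for illustration, not something the proposal specifies.

```typescript
// Hypothetical shape based on the snippet above; plugin ids are used as alias keys.
type Message = { id: string; alias: { [pluginId: string]: string | undefined } };

// Export direction: which key should this plugin write for the message?
function keyFor(message: Message, pluginId: string): string {
  // Assumed fallback: use the message id when the plugin has no alias.
  return message.alias[pluginId] ?? message.id;
}

// Import direction: which message does an incoming key belong to?
function findByAlias(messages: Message[], pluginId: string, key: string): Message | undefined {
  return messages.find((m) => m.alias[pluginId] === key);
}

const message: Message = {
  id: "human_airplane_globe",
  alias: {
    "plugin.inlang.i18next": "login.button",
    "plugin.inlang.xml": "login-button",
  },
};

const i18nextKey = keyFor(message, "plugin.inlang.i18next");
const fallbackKey = keyFor(message, "plugin.example.unknown");
const found = findByAlias([message], "plugin.inlang.xml", "login-button");
console.log(i18nextKey, fallbackKey, found?.id);
```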

maige-app bot added type: feature New feature or request scope: inlang/sdk Related to source-code/sdk. labels Nov 3, 2023

samuelstroschein mentioned this issue Nov 3, 2023

[bug] plugin-inlang-json and plugin-inlang-i18next compile namespaces to invalid js #1577

Closed

samuelstroschein mentioned this issue Nov 7, 2023

Watcher causes stuck build for paraglide-js compile #1590

Closed

samuelstroschein mentioned this issue Nov 13, 2023

[bug] i18next format cannot handle namespaces #1632

Closed

NilsJacobsen mentioned this issue Nov 28, 2023

[bug] Watcher in inlang SDK breaks fink for large language configurations #1769

Closed

samuelstroschein mentioned this issue Dec 4, 2023

Split languages on a per message basis #1817

Closed

martin-lysk self-assigned this Dec 5, 2023

martin-lysk added a commit that referenced this issue Dec 5, 2023

#1585 todos added

e8f7428

samuelstroschein mentioned this issue Dec 7, 2023

SDK persistence of messages in project directory #1844

Closed

4 tasks

samuelstroschein mentioned this issue Dec 12, 2023

vs code extension: inlang tab #1879

Closed

5 tasks

samuelstroschein mentioned this issue Dec 22, 2023

Problem with UI Display During Flutter Plugin Installation opral/inlang.com#32

Closed

jldec mentioned this issue Jan 25, 2024

WIP 1844 Part 1: auto-generated human-IDs and aliases #2108

Merged

22 tasks

This was referenced Feb 9, 2024

Communicate new ID and storage concept. #2043

Closed

Update ide docs #2226

Merged

samuelstroschein mentioned this issue Mar 8, 2024

createNewProject() v1 #2349

Closed

2 tasks

jldec changed the title ~~introduce importer/exporter APIs to replace "storage" plugins~~ Design for Import/Export APIs to replace storage plugins Apr 8, 2024

jldec closed this as completed Apr 8, 2024
