Design for Import/Export APIs to replace storage plugins #1585

Closed
5 tasks
samuelstroschein opened this issue Nov 3, 2023 · 17 comments
Assignees
Labels
scope: inlang/sdk Related to source-code/sdk. type: feature New feature or request

Comments

@samuelstroschein
Member

samuelstroschein commented Nov 3, 2023

This issue has been raised by @martin-lysk. We agreed that importers/exporters are the right way to go but decided at the Berlin Offsite in Oct 23 to work around the issue as long as possible. The first users are now confused about why storage plugins are limiting them.

Problem

Inlang's "provide your own storage plugin" setup leads to numerous issues:

Proposal

Storing messages should be controlled by the SDK to ensure that inlang is not limited by external plugins.

  • loadMessages should be succeeded by importMessages
  • saveMessages should be succeeded by exportMessages
message-29sn82
-> json import: login.button
-> paraglide export: login_button
-> ios export: LOGIN_BUTTON
-> android export: login-button
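
The mapping above could be sketched roughly as follows: one imported key, re-cased per export target. The names `ExportTarget`, `splitKey` and `exportKey` are hypothetical, and the casing rules are only inferred from the four example keys. This is not inlang SDK code.

```typescript
// Hypothetical sketch: one imported key, exported under a different naming
// convention per target. None of these names are part of the inlang SDK.
type ExportTarget = "paraglide" | "ios" | "android";

// Split an imported key such as "login.button" into its words.
function splitKey(importedKey: string): string[] {
  return importedKey.split(/[._-]/).filter((part) => part.length > 0);
}

// Assumed conventions: snake_case, UPPER_SNAKE_CASE and kebab-case.
function exportKey(importedKey: string, target: ExportTarget): string {
  const words = splitKey(importedKey).map((word) => word.toLowerCase());
  switch (target) {
    case "paraglide":
      return words.join("_");
    case "ios":
      return words.join("_").toUpperCase();
    case "android":
      return words.join("-");
  }
}

console.log(exportKey("login.button", "paraglide")); // login_button
console.log(exportKey("login.button", "ios")); // LOGIN_BUTTON
console.log(exportKey("login.button", "android")); // login-button
```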

Pros

  • inlang is not limited by external plugins
  • import/export APIs can expose what features they support
  • multi-platform exports (export for iOS, Android, Paraglide) become possible
  • users are told what a target platform supports and what it does not
  • we can optimize storage performance instead of naively calling saveMessages() and loadMessages()
  • importer/exporter plugins could store additional data, such as "message id ieb3s should be exported as login-button" or "ieb2s exists in the namespace file en/login.json". This requires ".inlang folder to avoid multiple files and enable caching" (#1418).
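
To illustrate the point about import/export APIs exposing which features they support, here is a minimal sketch. The feature names, the `ExporterDescription` shape and the exporter id are all assumptions made up for this example. The issue does not define such an API.

```typescript
// Hypothetical sketch: an exporter declares which message features it can
// represent, so the SDK or an app can warn the user before exporting.
type MessageFeature = "variables" | "plurals" | "gender" | "markup";

type ExporterDescription = {
  id: string;
  supports: ReadonlySet<MessageFeature>;
};

// Return the features a message uses that the exporter cannot represent.
function unsupportedFeatures(
  used: MessageFeature[],
  exporter: ExporterDescription,
): MessageFeature[] {
  return used.filter((feature) => !exporter.supports.has(feature));
}

// The supported set below is invented for illustration only.
const exampleExporter: ExporterDescription = {
  id: "example.exporter.ios",
  supports: new Set<MessageFeature>(["variables", "plurals"]),
};

console.log(unsupportedFeatures(["variables", "gender"], exampleExporter)); // [ 'gender' ]
```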

Cons

Requirements

@samuelstroschein
Member Author

@martin-lysk this seems to be a great issue for you. after all, you raised this issue and now it is hurting our growth because users don't understand why feature limitations exist for different storage formats

@felixhaeberle
Contributor

As discussed in Berlin, this would be an API that finally removes the limitations imposed by external plugins and is therefore a good thing.

What (breaking) code change does this mean?

@martin-lysk
Contributor

martin-lysk commented Nov 8, 2023

I am trying to get the whole picture. Collecting the inputs from the referenced tickets, it seems you already have a concept for how this should be integrated. Before I make a proposal that might not match your thinking: shall we have a kickoff about this issue, @samuelstroschein?

Storing messages should be controlled by the SDK to ensure that inlang is not limited by external plugins.

  • So inlang should come "with batteries included" and define a default way to load and save messages? Shall plugins be able to override this behaviour at all?

loadMessages should be succeeded by importMessages
saveMessages should be succeeded by exportMessages

So instead of loading and saving messages (persistence), plugins would import/export messages from sources like .stringsdict files or even APIs like poeditor/localize etc. The imported messages would then be managed by the inlang SDK and stored in the .inlang folder?

Storage Format: How would inlang store messages in the inlang folder?

Yes. You can use any storage format you'd like, for example the JSON storage plugin https://inlang.com/m/ig84ng0o/plugin-inlang-json, which also reduces the clutter of the inlang message format plugin and makes it easier to write translations manually. This matters for the question of which format we use.

How do we store the data:

What format

  1. using the JSON encoded AST like in (https://inlang.com/m/reootnfj/plugin-inlang-messageFormat)
  2. using the message format schema from mf-wg https://github.com/unicode-org/message-format-wg/tree/main/spec/data-model
  3. other format

How do we split data

1. Store everything in one big JSON file: all messages with their locales/variants

Pro

Con

  • pulling a change to a single message (made by another editor or pushed to the repo) means fetching all messages with all variants and loading the whole file, for now
  • merge conflicts - two edits of different messages might lead to merge conflicts until lix understands the format
  • loading messages means loading just one file - if the project contains thousands of messages with dozens of languages and variants, this might become a memory issue
  • ...

2. Store Messages split by languages / split by namespaces
Pro

  • smaller files
  • if separated by namespace one could load only a subset of messages by a given namespace without loading the whole file
  • files could get handed over to translators by language/namespace
  • devs are used to this kind of separation

Con

  • motivated by current status quo not by the needs of a storage format
  • leads to manual edits of the files that might not be wanted at all
  • if a problem occurs in one message (for example, a user edited the file manually or a tool had an encoding error), it breaks the whole format - to fix this, the user has to look into huge files

My thoughts:
This is how those files are often stored in the old world, to deliver only the needed messages on websites or to load only the current language's messages into memory (iOS language bundles). Since inlang is usually interested in a message as a whole, with all its properties (languages / variants), splitting it this way doesn't make sense for the storage format.

The use case for translators should be managed by import/export plugins instead.

3. Store each message with its locales and variants in a separate file

Pro

  • each message is an atomic entity, already at the fs level. As long as you don't edit variants or languages of the same message at the same time, you don't have to deal with merge conflicts (even without lix's semantic understanding). Even with simultaneous edits, a last-write-wins approach wouldn't hurt too much
  • versioning comes out of the box by git's version history
  • updates to a message get propagated via the file watcher
  • the format could be checked against the message format schema (not really an argument - nothing prevents us from using the schema in our own schema that wraps the messages with a map)
  • loading a subset of keys in large projects could be done by filtering filenames
  • the API we design around this format is more likely to be similar to the one we have when lix can store it as a whole
  • if a problem occurs in one message (for example, a user edited the file manually or a tool had an encoding error), it breaks the whole format - to fix this, the user has to look into huge files
  • ...
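
As a rough sketch of the "filter filenames" idea from option 3: if each message lives in its own file, a subset can be selected from a directory listing without reading any file. The `<message id>.json` naming scheme and the function name are assumptions for illustration. The thread does not specify a file layout.

```typescript
// Hypothetical sketch of option 3: one file per message, named after the
// message id. The "<id>.json" naming scheme is an assumption.
function selectMessageFiles(fileNames: string[], idPrefix: string): string[] {
  return fileNames.filter(
    (fileName) => fileName.endsWith(".json") && fileName.startsWith(idPrefix),
  );
}

// A directory listing stands in for the real file system here.
const listing = [
  "login_button.json",
  "login_title.json",
  "checkout_total.json",
  "README.md",
];

console.log(selectMessageFiles(listing, "login_"));
// [ 'login_button.json', 'login_title.json' ]
```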

Cons

@samuelstroschein
Member Author

samuelstroschein commented Nov 8, 2023

hey @martin-lysk

  • The "inlang directory" change (".inlang folder to avoid multiple files and enable caching", #1418) should happen before the importer/exporter stuff.
  • I have yet to write a proposal for the directory stuff. Planned to do that early next week.
  • The directory proposal will make changes to the message format really easy so we don't have to worry about one big json or not

When would you start with this and the to-be-proposed directory change, so that I know when to write my proposal?

@martin-lysk
Contributor

I guess the planned iterations in #1459 (comment) will keep me busy this week. I could have a look on Tuesday next week.

@samuelstroschein
Member Author

@martin-lysk okay, I intend to publish the "inlang directory" proposal Monday/Tuesday, which I would give you to implement. Afterwards, the import/export stuff can be handled.

@martin-lysk
Contributor

@janfjohannes

@felixhaeberle
Contributor

@samuelstroschein @martin-lysk @janfjohannes What's the status here? Who is leading the importer/exporter & should be assigned?

@janfjohannes
Contributor

janfjohannes commented Nov 23, 2023

@felixhaeberle see Samuel's comment: #1585 (comment)

@samuelstroschein
Member Author

@felixhaeberle my directory implementation will come first. Progress can be tracked in the https://github.com/inlang/monorepo/tree/1678-project-directory branch.

@NilsJacobsen
Collaborator

Another take on Martin's splitting proposals:

While working with @NiklasBuchfink on the editor/sdk watcher, we saw that not having per-message granularity is a nightmare. We watch files with thousands of messages, but we can only lint per message, so as you can imagine that leads to a lot of reactive work. If we had a solution like proposal 3, we could watch individual messages and then update and lint only one message. That would be a huge advantage.

Like Samuel said, I see that we could then have a problem with a large number of files. Building a granular watcher that works with one file but watches every message could solve the problem too, by shifting the complexity to the watcher.

-> Being able to watch for only one message should be a requirement for the SDK improvements.

@NiklasBuchfink
Member

We need granular reactivity per pattern that changes. Updating a single pattern should only have the linting for that particular pattern as a side effect. A reactivity matrix is needed where we can observe changes in individual patterns and apply CRUD operations. A mappable watcher that acts like a proxy over the files would be the dream. Basically, we need the same approach for file watching as SolidJS has for its reactivity system. Avoid diffing by creating a pub/sub pattern for each small entity that is reactive.
That's why the normal watcher is the bottleneck or we have to split everything into a thousand files, which has its own limitations as described in suggestion 3.
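
The pub/sub idea described above could look roughly like this: listeners are keyed by message id, so a change to one message only triggers the work registered for that message. The class and method names are hypothetical. This is a sketch of the pattern, not of the SDK or of SolidJS internals.

```typescript
// Hypothetical sketch of per-message pub/sub: subscribers are keyed by
// message id, so an update notifies only the listeners for that id.
type MessageListener = (messageId: string) => void;

class MessageSubscriptions {
  private listeners = new Map<string, Set<MessageListener>>();

  // Subscribe to one message id; returns an unsubscribe function.
  subscribe(messageId: string, listener: MessageListener): () => void {
    const set = this.listeners.get(messageId) ?? new Set<MessageListener>();
    set.add(listener);
    this.listeners.set(messageId, set);
    return () => {
      set.delete(listener);
    };
  }

  // Notify only the listeners registered for the changed message.
  publish(messageId: string): void {
    this.listeners.get(messageId)?.forEach((listener) => listener(messageId));
  }
}

const subscriptions = new MessageSubscriptions();
const linted: string[] = [];
subscriptions.subscribe("login_button", (id) => linted.push(id));
subscriptions.subscribe("checkout_total", (id) => linted.push(id));

// Only the "login_button" listener runs, no diffing of other messages needed.
subscriptions.publish("login_button");
console.log(linted); // [ 'login_button' ]
```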

@samuelstroschein
Member Author

@NiklasBuchfink i created #1817. Let's keep this issue for importer/exporter only.

@martin-lysk martin-lysk self-assigned this Dec 5, 2023
martin-lysk added a commit that referenced this issue Dec 5, 2023
@martin-lysk
Contributor

@samuelstroschein what's your take on the scope here? Shall we just add importers/exporters to the inlang SDK and keep the storage topic completely out of scope for this issue? If so, we would still need a plugin that provides load and save methods, right?

Open questions:

What will become the execution points for importers / exporters - when shall we trigger an import or an export?

Only If storage is part of the ticket (if not we can answer those later):

  • should the change be backward compatible / should the plugins' loadMessages and saveMessages be marked as deprecated as part of this
  • since we plan to reimplement the save logic and it will most likely need a migration for existing projects, I would iterate on the target persistence format first

Thoughts on import/export triggers:

Compared to the current setup, which only accepts one load and one save, importers and exporters will coexist. Projects may have an iOS exporter, an Android exporter and a JSON exporter all configured in one project.
Save is triggered on each change, and load is executed by the watcher and on initialization. By contrast, exports/imports are usually triggered externally and not by events coming from the SDK:

Use cases for Export Plugins

  • triggers when configured within a ci pipeline as part of the cli
  • a button within the editor to trigger an export
  • a button in the editor that allows a developer to download the localization files
  • ... any hooks in the sdk that an exporter should be triggered on?

Use cases for Import Plugins

  • a CLI command that imports all keys from an existing set of messages - like ios/android/....
  • an external webhook integration (like one from lokalise https://developers.lokalise.com/docs/webhook-events#projectkeymodified that updates the keys) - this would most likely only be a trigger for an import
  • an upload of a file, like an iOS strings file, within the editor
  • ... do you see any triggering of an import other than external ones?
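
To make the "several exporters coexist and run on an external trigger" idea concrete, here is a minimal sketch. The types, exporter ids and output formats are assumptions for illustration, and the strings output does no escaping. It is not a proposal for the actual plugin API.

```typescript
// Hypothetical sketch: several exporters are configured in one project and run
// only when something external (CLI command, CI job, editor button) asks.
type ExampleMessage = { id: string; text: string };
type ExampleExporter = {
  id: string;
  exportMessages: (messages: ExampleMessage[]) => string;
};

const jsonExporter: ExampleExporter = {
  id: "example.exporter.json",
  exportMessages: (messages) => {
    const output: Record<string, string> = {};
    for (const message of messages) output[message.id] = message.text;
    return JSON.stringify(output);
  },
};

// Simplified strings-style output; real formats need proper escaping.
const stringsExporter: ExampleExporter = {
  id: "example.exporter.strings",
  exportMessages: (messages) =>
    messages.map((m) => `"${m.id}" = "${m.text}";`).join("\n"),
};

// Called by an external trigger, not by SDK change events.
function runExporters(
  configured: ExampleExporter[],
  messages: ExampleMessage[],
): Record<string, string> {
  const outputs: Record<string, string> = {};
  for (const exporter of configured) {
    outputs[exporter.id] = exporter.exportMessages(messages);
  }
  return outputs;
}

const exportOutputs = runExporters(
  [jsonExporter, stringsExporter],
  [{ id: "login_button", text: "Login" }],
);
console.log(exportOutputs["example.exporter.json"]); // {"login_button":"Login"}
console.log(exportOutputs["example.exporter.strings"]); // "login_button" = "Login";
```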

@samuelstroschein
Member Author

@martin-lysk do you think a sync call where we draft the spec in Google Docs is quicker than going back and forth on GitHub? If so, let's schedule a call.

@samuelstroschein
Member Author

Issue #1844 will be completed before this one.

@samuelstroschein
Member Author

samuelstroschein commented Dec 18, 2023

This proposal is a reaction to #1844 (comment)

Proposal - introduction of aliases via a map

Introduce an alias map plugins can use to establish a relationship between message id and exported/imported key name.

Pros

  • plugins can make use of import/export keys
  • apps can display and edit "alias for i18next is login.button" because the plugin id is known

Cons

  • ?
const message = {
  id: "human_airplane_globe",
+  alias: {
+    "plugin.inlang.i18next": "login.button",
+    "plugin.inlang.xml": "login-button"
+  }
}
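
A plugin could read the alias map along these lines. The `AliasedMessage` type, the `keyForPlugin` helper and the fallback to the message id when no alias exists are assumptions for illustration. The proposal above only defines the `alias` shape.

```typescript
// Hypothetical sketch: resolve the key a plugin should use for a message.
// Falling back to the message id is an assumption, not part of the proposal.
type AliasedMessage = {
  id: string;
  alias?: Record<string, string>;
};

function keyForPlugin(message: AliasedMessage, pluginId: string): string {
  return message.alias?.[pluginId] ?? message.id;
}

const aliased: AliasedMessage = {
  id: "human_airplane_globe",
  alias: {
    "plugin.inlang.i18next": "login.button",
    "plugin.inlang.xml": "login-button",
  },
};

console.log(keyForPlugin(aliased, "plugin.inlang.i18next")); // login.button
console.log(keyForPlugin(aliased, "plugin.inlang.unknown")); // human_airplane_globe
```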

@jldec jldec changed the title from "introduce importer/exporter APIs to replace "storage" plugins" to "Design for Import/Export APIs to replace storage plugins" Apr 8, 2024
@jldec jldec closed this as completed Apr 8, 2024

7 participants