human readable and random ids for messages by default #1892

samuelstroschein · 2023-12-14T16:49:41Z

Problem

@inlang/editor needs a message id generation algorithm to add a "create a message" button
devs choosing message ids leads to multiple problems like namespacing, renaming IDs and thereby breaking the relation to translations, or Programmatic Linting #1889

Proposal

Apps choose message IDs for users that are human-readable and memorizable but have no "meaning" by default.

blue_dot_map
car_sky_keyboard
phone_table_chocolate
...

Pros

already best practice for large projects because ids should (must!) have no meaning
better UX/DX because apps don't need to prompt users for ids
a wide class of bugs is eliminated like choosing paraglide incompatible ids
users can still search and memorize messages
- "did you change blue_dot_map?"
- "we have a missing translation for car_sky_keyboard"

Cons

Possibly unexpected behavior for devs. I propose implementing this as a test and waiting for user reactions.

// standalone function so that it stays tree-shakable
import { generateMessageId } from "@inlang/sdk"

// checks the project for id conflicts
const id = generateMessageId({ project })

project.query.message.create({ id })


samuelstroschein · 2023-12-14T16:50:04Z

@inlang/editor @inlang/ide-extension @inlang/paraglide-js @martin-lysk good idea? The implementation cost is relatively low

samuelstroschein · 2023-12-14T16:53:56Z

The idea is inspired by what3words.com. @felixhaeberle in the ide extension you can auto fill the generated message id. i expect most devs will just hit enter

samuelstroschein · 2023-12-14T17:00:29Z

Lovely, a lib exists for this: https://www.npmjs.com/package/human-id. Three words give 15 million possibilities, which should be enough for even the largest enterprise use cases. Adding a fourth word multiplies the possibilities further.
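
For illustration, the generate-and-check-for-conflicts idea could look roughly like the sketch below. It does not use the human-id package's actual API. The word lists, the `generateHumanId` helper, and the retry limit are all hypothetical, and the tiny lists only give 4 × 4 × 4 = 64 combinations.

```typescript
// Hypothetical word lists for illustration; real lists would be far larger.
const FIRST_WORDS: readonly string[] = ["blue", "car", "phone", "calm"];
const SECOND_WORDS: readonly string[] = ["dot", "sky", "table", "lake"];
const THIRD_WORDS: readonly string[] = ["map", "keyboard", "chocolate", "river"];

// Number of distinct ids these lists can produce.
const ID_SPACE = FIRST_WORDS.length * SECOND_WORDS.length * THIRD_WORDS.length;

type RandomFn = () => number; // expected to return a value in [0, 1)

function pickWord(words: readonly string[], random: RandomFn): string {
  const word = words[Math.floor(random() * words.length)];
  if (word === undefined) {
    throw new Error("random() must return a value in [0, 1)");
  }
  return word;
}

// Hypothetical helper (not the SDK's generateMessageId): draws three-word ids
// until one is not already taken, giving up after `maxAttempts` draws.
function generateHumanId(
  existingIds: ReadonlySet<string>,
  random: RandomFn = Math.random,
  maxAttempts = 100
): string {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const id = [
      pickWord(FIRST_WORDS, random),
      pickWord(SECOND_WORDS, random),
      pickWord(THIRD_WORDS, random),
    ].join("_");
    if (!existingIds.has(id)) {
      return id;
    }
  }
  throw new Error("no free id found; consider larger lists or a fourth word");
}
```

The `random` parameter is injectable only so that the behavior can be tested deterministically; callers would normally rely on the `Math.random` default.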

NiklasBuchfink · 2023-12-14T17:09:20Z

It depends on the developer's workflow and whether they like it. I only mention this because these could be possible thoughts:

I'm building a modal for the user; why isn't everything prefixed with "modal_user_" since autocomplete can help me with that? (human readable ids are good for memorizing short-term and for autocompletion)
Clean code says good variable names don't need explanation. We link content with these ids and without context inside the id, I don't know what is behind it (... unless I use the vs code extension and as we know, the problems begin when we start renaming ids)

An alias/comment/description/context field may be necessary. Then again, that is one more thing to fill in and find names for.

samuelstroschein · 2023-12-14T17:11:31Z

@NiklasBuchfink Clean code says good variable names don't need explanation. We link content with these ids and without context inside the id,

This is exactly the problem. Every large enterprise project states: Do not link message ids/keys to content. It breaks everywhere. Hence, my proposal to choose a default for inlang users that is human readable but has no meaning.

martin-lysk · 2023-12-14T19:56:59Z

We could start with this approach as the message name and let people change it, so that the user doesn't get stuck in the "hmmm what would be a good name for that thing" loop - as an Id i am not convinced since it lacks some properties but i need to think on this a bit more

samuelstroschein · 2023-12-14T20:22:55Z

@martin-lysk as an Id i am not convinced since it lacks some properties but i need to think on this a bit more

What properties are missing?

unique ✅
immutable ✅ (if we disallow ID renames which we will do once "keys" are introduced as a concept)
human-readable ✅ (avoids the need for "keys" altogether for fresh projects)

openscript · 2023-12-14T20:52:42Z

I like the idea very much! As a dev, I sometimes recall the id to reference messages I repeatedly use. This would become harder if the ids don't reflect the message. If the IDE helps me select a message's id (maybe with fuzzy finding), that is even better than trying to recall ids and coming up with some structure for the id names.

LorisSigrist · 2023-12-15T08:25:40Z

I have to say I was initially very sceptical of using random IDs, however, the more I think about it the more I come around to it. Many developers will likely have the same experience.

I do have one concern.
Devs won't be ok with completely random IDs unless we provide an alternate way of finding messages. I really like @openscript's suggestion of fuzzy-finding messages by content.

(Perhaps the IDE extension could kick in after someone types m., treating any text afterwards as a search query and suggest messages)

Prefilling message-id fields with randomly generated ids would nudge developers towards the optimal workflow, but without forcing them. They can still use meaningful ids if they want. If we provide the appropriate IDE tooling, devs will come around to it. That's the Tailwind effect.
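
As a rough illustration of finding messages by content rather than by id, a minimal sketch follows. The `Message` shape and both functions are hypothetical and not part of the inlang SDK or the IDE extension. The matching is a simple in-order character (subsequence) match, one of many possible fuzzy strategies.

```typescript
// Hypothetical, simplified message shape for this sketch.
interface Message {
  id: string;
  text: string;
}

// Returns true if every character of `query` occurs in `text` in the same
// order (case-insensitive). An empty query matches everything.
function fuzzyMatch(query: string, text: string): boolean {
  const q = query.toLowerCase();
  const t = text.toLowerCase();
  let matched = 0;
  for (let i = 0; i < t.length && matched < q.length; i++) {
    if (t[i] === q[matched]) {
      matched++;
    }
  }
  return matched === q.length;
}

// Returns all messages whose text fuzzily matches the query.
function searchMessages(messages: readonly Message[], query: string): Message[] {
  return messages.filter((message) => fuzzyMatch(query, message.text));
}
```

An editor integration could call `searchMessages` with whatever the user types after a trigger such as `m.` and offer the matching ids as completions.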

martin-lysk · 2023-12-15T09:27:34Z

The concern I have here: One should not reuse messages in different contexts.
Think about a button in a delete modal that a user should confirm, labeled with the very generic message "Ok". One creates a message with the id "blue_dot_map". Cool, we now have one message with the label "Ok".

Now the next feature is developed: a screen with information about a new feature - again, the initial iteration just contains "Ok" as a dismiss button.

Fuzzy search will bring up the "blue_dot_map" message, and if we choose it, two buttons in completely different contexts reference the same "Ok" message. This is more likely to happen with this approach, since developers lose the information about where the message should be used. Such a case would also be hard to catch in a code review.

updateLoginScreen() {
   button.setText($blue_dot_map)
}

vs.

updateLoginScreen() {
   button.setText($new_feature_dismiss)
}

I do see the point that developers should not struggle with giving missing messages a meaningful name, though.
A good article about naming and the idea behind it:

https://lokalise.com/blog/translation-keys-naming-and-organizing/

I think messages should have an id (immutable / unique) and a name maybe even aliases

felixhaeberle · 2023-12-15T11:35:43Z

I think messages should have an id (immutable / unique) and a name maybe even aliases

Yes. This is the way to go.

Treat the "name" like any other (meta) information belonging to a message, such as a category (modal) or department (marketing).

What's really important for the dev is the ID, and we should simply design a great UX in the IDE extension to search by any of the meta information or unique id & provide great auto-filling / discovering.

Additionally: Very high incentive to then install the IDE extension because without, you are stuck with id gibberish.

IDE extension: It's the same with Git. Hardly anybody uses command-line-only Git anymore when there is built-in Git functionality with a nice GUI UX in the IDE. And Git extensions are skyrocketing in installs.

But this doesn't have to be the case ultimately, because paraglide could also offer resolving from key OR from id. Duplications in key names could be found by a lint rule. Tree-shaking could also be preserved.

I'm building a modal for the user; why isn't everything prefixed with "modal_user_" since autocomplete can help me with that? (human readable ids are good for memorizing short-term and for autocompletion)

Clean code says good variable names don't need explanation. We link content with these ids and without context inside the id, I don't know what is behind it (... unless I use the vs code extension and as we know, the problems begin when we start renaming ids)

Both can be solved through either resolve from key or from id, or with a great UX in conjunction with the IDE extension. Let's face it – the problem is complex & we need tooling to make it better.

In the end, looking at big enterprises, no other implementation besides the unique id will scale.
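
To make the "resolve from key OR from id" and "lint duplicate key names" ideas above concrete, here is a minimal sketch. The `AliasedMessage` shape, `resolveMessage`, and `findDuplicateAliases` are hypothetical names for illustration. They are not paraglide's or the inlang SDK's actual API.

```typescript
// Hypothetical shape: an immutable unique id plus optional human-chosen aliases.
interface AliasedMessage {
  id: string;
  aliases: string[];
  text: string;
}

// Looks a message up by its id first, then falls back to its aliases.
function resolveMessage(
  messages: readonly AliasedMessage[],
  idOrAlias: string
): AliasedMessage | undefined {
  const byId = messages.find((message) => message.id === idOrAlias);
  if (byId !== undefined) {
    return byId;
  }
  return messages.find((message) => message.aliases.includes(idOrAlias));
}

// Lint-style check: returns every alias used by more than one message, sorted.
function findDuplicateAliases(messages: readonly AliasedMessage[]): string[] {
  const usage = new Map<string, number>();
  for (const message of messages) {
    // De-duplicate within one message so that only cross-message reuse counts.
    for (const alias of Array.from(new Set(message.aliases))) {
      usage.set(alias, (usage.get(alias) ?? 0) + 1);
    }
  }
  return Array.from(usage.entries())
    .filter(([, count]) => count > 1)
    .map(([alias]) => alias)
    .sort();
}
```

Resolving by id first means an alias can never shadow an id, which keeps the id the single stable reference even if aliases are renamed.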

samuelstroschein · 2023-12-15T15:13:03Z

Let's conclude the discussion 📺 watch the LOOM

Proposal

Introduce random, human-readable IDs by default.

unique ✅ (three words have a minimum of 15 million unique ids which can be extended with more words)
immutable ✅ (has no meaning -> will not be renamed)
human-readable ✅ (eliminates the need to come up with names and naming conventions!)

blue_dot_map
car_sky_keyboard
phone_table_chocolate
...

Why

Random IDs are a necessity for any large project and any app that is non-dev facing.
The only question is whether we introduce human or non-human readable IDs. If we introduce random human-readable IDs, we eliminate the need to think about and implement name logic for most inlang projects.
Thinking about naming is just wrong. If inlang users, and everyone in an organization, need to agree on naming conventions and read overwhelming articles like this, we won't make internationalization simple (enough).
Inlang's ecosystem will provide context through pre-rendering UIs or similar mechanisms in the future; pushing meaning into a message ID/name is redundant.

Additional notes

SEARCH: this is a follow-up issue. Fuzzy stuff or not is not important atm. We will see what users request.
https://discord.com/channels/897438559458430986/1185239478172909568/1185267189624873190

martin-lysk · 2023-12-20T00:40:54Z

will be part of #1844

martin-lysk · 2023-12-21T00:36:50Z

Some inspirations for word dicts

https://blog.asana.com/2011/09/6-sad-squid-snuggle-softly/
https://github.com/moby/moby/blob/master/pkg/namesgenerator/names-generator.go#L131
https://github.com/PerWiklander/IdentifierSentence/blob/master/src/main/java/biz/wiklander/tools/IdentifierSentence.java
https://github.com/EmpowerCode/human-readable-ids.js

ferdnyc · 2024-01-07T02:27:42Z

A somewhat devil's-advocate reaction follows. (IOW, I'm not trying to dispute this proposal or argue against it. Consider this as coming from a place of neutrality -- neither for nor against the idea.)

@samuelstroschein

already best practice for large projects because ids should (must!) have no meaning

[citation needed]?

@martin-lysk shared the "overwhelming article" (...? it's a 5-minute, large-font read), which contains arguments/advice in direct opposition to what's proposed here. So it feels like there should at least be some sort of supporting evidence on the pro side, as well.

better UX/DX because apps don't need to prompt users for ids

That's fair, and a good argument for at least some sort of automatically-generated ID scheme.

a wide class of bugs is eliminated like choosing paraglide incompatible ids

Surely a sufficiently good IDE can prevent that even when IDs are user-chosen, though? Sort of conflating unrelated things, here -- again, devil's advocate.

"Ensure users cannot choose invalid IDs" is solvable in more ways than "choose IDs for the user", isn't it? Even if the latter does technically avoid the former problem, in a swatting-a-fly-with-a-sledgehammer sort of way.

users can still search and memorize messages

"did you change blue_dot_map?"

"we have a missing translation for car_sky_keyboard"

They can, but is there any empirical data indicating that they will? Or is that merely a hypothetical scenario?

If a piece of code has a message ID blue_dot_map that needs to be updated, what's the real-world data (or even anecdata) on how users will discuss that message?

Are they more likely to say:

Did you change blue_dot_map?

or will they ignore randomly-chosen IDs and resort to contextual descriptions, like:

Did you change the translation for the export format label in the render dialog?

samuelstroschein · 2024-01-07T15:00:39Z

Hey @ferdnyc,

I am replying to address your concern, but please do not reply. This discussion is closed. We formed a decision. Re-opening this discussion would take resources from other tasks.

Before I start, it is crucial to understand that anything we implement at inlang needs to work across an organization and, therefore, across different teams with different needs. I assume that you are coming from a dev (only) perspective, which fails inlang's mission to make globalization of software simple(r).

[...] shared the "overwhelming article" (...? it's a 5-minute, large-font read), which contains arguments/advice in direct opposition to what's proposed here.

The article is overwhelming because this 5-minute read is part of the hundreds, if not thousands, of hours that larger teams will spend discussing naming conventions. Wasted hours, because a consensus will not emerge. Rules like "describe in the ID where messages are used" will be ignored, will differ between teams, and sometimes can't even be established.

For example, we know that users want to create messages via Fink. They have no context to create a message according to a "provide context rule". And neither might a system that automatically creates messages (think of automatic extraction).

Surely a sufficiently good IDE can prevent that even when IDs are user-chosen, though?

Every app in the ecosystem (designers, translators, marketing, ...) would need this validation. Yes, we could add a mechanism to the linting system, but why lint something that we can (likely) avoid altogether by using human readable IDs instead of random hashes?

That's fair, and a good argument for at least some sort of automatically-generated ID scheme.

You came to the bottom of the proposal here. This discussion is not about preventing you from aliasing messages, merely that our ID system is human readable instead of random hashes. We believe human-readable ids will eliminate the need for naming discussions.

They can, but is there any empirical data indicating that they will? Or is that merely a hypothetical scenario?

Experience we have in i18n software. Naming conventions are rotten because they don't work for i18n, where different teams need to agree on a convention.

or will they ignore randomly-chosen IDs and resort to contextual descriptions, like:

Nothing prevents that. In that moment, we achieved our goal. The ID of a message became meaningless, and naming discussions are eliminated :)

martin-lysk · 2024-01-12T09:11:42Z

@opral/inlang-cli @opral/inlang-cli @opral/inlang-fink @opral/inlang-ide-extension

Please check the spreadsheet of terms we plan to use for human ids (I will share the link in Discord).

The table has a total of 4 tabs, one per word category: "adjectives", "nouns", "adverbs", "verbs".

Please take 30 minutes to check the current words for:

- Uniqueness. Bad example: live vs. life
- Pronounceability. Bad example: draught
- Politically incorrect, hurtful, or negatively connotated words. Bad example: fuck, master, bitch
- Spellings that differ between British and American English. Bad example: energize vs. energise

Just delete the ones where you see problems. If you are unsure about any of these properties for a word, that is reason enough to drop it - no discussion needed!
Add good new words - in the end we need 256 words per category to get enough ids out of the combinations.

Please only change column A. Columns C and D will show you example ids that include the term defined in A #excel_magic

Please react to this comment with a rocket when you are done 🚀
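
As a back-of-the-envelope check on the "256 words per category" target, the sketch below computes the size of the id space and a birthday-problem estimate of the chance that at least two of n randomly drawn ids coincide. It assumes one word is drawn independently and uniformly from each of the four categories. The function names are made up for this illustration.

```typescript
// Number of distinct ids when one word is drawn from each category.
function idSpaceSize(wordsPerCategory: number, categories: number): number {
  return Math.pow(wordsPerCategory, categories);
}

// Birthday-problem approximation: probability that at least two of
// `messageCount` uniformly random ids are equal, p ≈ 1 - exp(-n(n-1) / (2N)).
function collisionProbability(messageCount: number, spaceSize: number): number {
  const pairs = (messageCount * (messageCount - 1)) / 2;
  return 1 - Math.exp(-pairs / spaceSize);
}

// 256 words in each of 4 categories: 256^4 = 4,294,967,296 possible ids.
const fourWordSpace = idSpaceSize(256, 4);

// For comparison, 256 words in 3 categories: 256^3 = 16,777,216 possible ids.
const threeWordSpace = idSpaceSize(256, 3);
```

Under these assumptions, `collisionProbability(10000, fourWordSpace)` comes out at roughly one percent, so a conflict check against the ids that already exist, as in the `generateMessageId({ project })` proposal, still looks worthwhile.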

samuelstroschein · 2024-01-15T21:58:57Z

@martin-lysk i pressed 🚀 because I thought ppl were excited. i doubt that people went through the spreadsheet https://docs.google.com/spreadsheets/d/1AsAgZi9V8R_5xxSK8-spp0mkLojlT-0MFVozcF0MZ6I.

going through it now

NiklasBuchfink · 2024-01-17T10:53:03Z

My notes:

we should add Jurgen as an Easter egg too
we got fink and finch, not sure if this is confusing somehow. Finch is the English translation of the German Fink
I see the awful-niklas-arrogant-mix incoming, but I'm okay with that 😄

Is it correct that fink can be translated with:

a betrayer, traitor, snitch
an unpleasant or contemptible person
a person who informs on people to the authorities

samuelstroschein · 2024-01-17T14:34:52Z

we got fink and finch, not sure if this is confusing somehow. Finch is the English translation of the German Fink

just change it

jldec · 2024-01-30T17:29:54Z

I'll make a pass on this today since there's a cost to making changes to the word lists e.g. impacting mocks / tests.

additional scan to remove unwanted words
fix weird adverbs like "dai" (used to be daily)
remove duplicates (fly)
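
A check like "remove duplicates" could be automated along the lines of the sketch below. This is not the script actually used for the word lists. The function name and the normalization (trimming and lowercasing) are assumptions for illustration.

```typescript
// Returns every word that appears more than once across all categories
// (after trimming and lowercasing), sorted alphabetically.
function findDuplicateWords(
  wordLists: Record<string, readonly string[]>
): string[] {
  const counts = new Map<string, number>();
  for (const category of Object.keys(wordLists)) {
    for (const word of wordLists[category] ?? []) {
      const normalized = word.trim().toLowerCase();
      counts.set(normalized, (counts.get(normalized) ?? 0) + 1);
    }
  }
  return Array.from(counts.entries())
    .filter(([, count]) => count > 1)
    .map(([word]) => word)
    .sort();
}
```

Running this over the four tabs would, for example, flag a word such as "fly" if it were listed under both nouns and verbs.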

samuelstroschein · 2024-04-06T00:14:16Z

conversation continues in https://linear.app/opral/issue/MESDK-12/human-readable-and-random-ids-for-messages-by-default

maige-app bot added scope: inlang/sdk Related to source-code/sdk. type: feature New feature or request labels Dec 14, 2023

samuelstroschein changed the title ~~human readable but random ids for messages~~ human readable and random ids for messages by default Dec 14, 2023

samuelstroschein self-assigned this Dec 14, 2023

samuelstroschein mentioned this issue Dec 19, 2023

aliases in paraglide js #1920

Closed

martin-lysk assigned martin-lysk and unassigned samuelstroschein Dec 20, 2023

This was referenced Dec 22, 2023

Improve Paraglide Compiler output #1940

Merged

[bug] Message IDs clashing with internal variables #1938

Closed

jldec self-assigned this Jan 30, 2024

jldec unassigned martin-lysk Jan 30, 2024

jldec mentioned this issue Feb 9, 2024

WIP 1844 Part 1: auto-generated human-IDs and aliases #2108

Merged

22 tasks

samuelstroschein mentioned this issue Feb 28, 2024

[bug] plugin-inlang-json and plugin-inlang-i18next compile namespaces to invalid js #1577

Closed

samuelstroschein closed this as not planned Apr 6, 2024

jldec mentioned this issue Apr 8, 2024

Design for human IDs and aliases opral/inlang-message-sdk#23

Open
