Rethink language-translation support #24011

Open
mcclure opened this issue Mar 6, 2023 · 8 comments
Labels
i18n Internationalization and localization suggestion Feature suggestion

Comments

@mcclure

mcclure commented Mar 6, 2023

Pitch

Automatic post translation is a great feature in theory. Back when I used Twitter I got a lot of benefit from their "translate button"; it is great to be able to have full conversations with people I share no languages with. Mastodon has a similar translate feature, but I never get the chance to use it. It may work better for users whose language is not English, but as an English speaker I am blocked by a combination of UX problems. In fact, I do not think I have ever successfully used it to translate a post I casually saw in my feed or notifications.

Below I outline three problems I see with the language support, which combine to make the feature effectively useless. I make two suggestions which may not be a full solution but would help greatly with the problem.

Problem 1: Many posts are marked as being in English, but are not.

Many users have their languages set to English (because it is the default, and they never changed it?), but in fact they speak some other language.

Here is a random poster from the local feed. All their posts are in Portuguese, but I never get the "translate" button because every one of their statuses is marked as English. The translation feature does not help me here.

Problem 2: Many posts are in English, but marked as being some other language.

Many users properly set their default language, but happen to be multilingual and do not set the individual post languages. When they post in English, an English user viewing the post will see a useless "translate" button at the bottom.

This user is Danish, but this post is in English. They did not set the post language to English; in fact, looking at their history, they never set the post language to English. And I can't blame them at all: de vez en cuando yo escribo en español, and I never remember to set the box to "es" myself.

Because I am an English user, I am much more likely to see this Danish user's English posts (I follow English-speaking users who would probably not RT Danish posts), so nearly every post of theirs that I see will have a "translate" button at the bottom which can't do anything.

Problem 3: Language code is automatically copied into replies.

Say I reply to a post which is marked as being in a non-English language:

It has autofilled my reply's language as being Danish.

In some cases, such as when two multilingual people are having a conversation, this is probably good behavior. However, I personally will never make a post in Danish. Since (see problems 1 and 2) many statuses have an incorrect language set for one reason or another, this feature propagates language-setting errors. It also makes it nearly impossible for two people to hold a conversation in which each writes in their own language and relies on the site's translation to read the other.

Suggestions

All three of the problems I mention above are variations of one problem: It is easy to overlook the languages box and hard to remember to set it. The most complete fix for this would be a version of automatic language detection, but I assume this is difficult or at least expensive so I won't suggest it here. But there are two "easy" fixes that could be done right now and would mitigate the UX problem a little bit:

Suggestion 1: Unconditionally include "translate" in the … menu.

What if there were just an option to translate in the popup menu, and the reading user picked what language to translate from? This way if someone messed up setting the language in the status, you'd still be able to manually request a translation and it would be faster than copypasting into translate.google.com.

A variant of this idea is implied by issue #23882. That user apparently finds "Translate" so unhelpful that they want it removed altogether. An item in the … menu would not be as discoverable as a proactive "translate" link, so one option would be to keep the current behavior as the default but add a checkbox that removes the "translate" links from all posts. When that option is selected, the … menu gains a translate item (regardless of whether Mastodon guesses it's needed). People irritated by the "translate" links would be freed from them but could still translate languages they don't speak, and posts would be translatable regardless of whether the status language is set right.
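The checkbox variant described above can be sketched as a small decision function. This is only an illustration of the proposed behavior, not Mastodon's actual code; the function name, the `hide_inline_links` preference, and the `mastodon_guesses_needed` flag are all hypothetical.

```python
def translate_controls(hide_inline_links: bool, mastodon_guesses_needed: bool):
    """Decide which translate controls to render for one post.

    Returns a tuple (show_inline_link, show_menu_item).
    Both parameter names are hypothetical, chosen for this sketch.
    """
    if hide_inline_links:
        # Opted out of inline links: never show the link, but always
        # offer translation from the menu, whatever the status language says.
        return (False, True)
    # Default: current behavior, inline link only when the server thinks
    # the post is in a language the reader does not speak.
    return (mastodon_guesses_needed, False)
```

With the box ticked, a mislabeled post is still translatable from the menu; with it unticked, nothing changes from today.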

Suggestion 2: Allow users to set more than one language.

Right now, it appears that the default language for statuses is set to the UI language. That's basically reasonable, but it isn't totally reliable, especially on a server like mastodon.social that has many multilingual European users (for example, the Danish user I screenshotted above actually posts predominantly in English). If there were a way for users to click "customize languages" and switch from "posting language set by interface language" to having a list of languages they advertise themselves as speaking, this would come with several benefits:

This suggestion is limited, though, because it depends on people setting their preferences properly and, as discussed above, people tend not to do that.
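As a rough sketch of how a per-user list of spoken languages could drive both the "translate" offer and the reply language autofill from Problem 3: everything below is an assumption made for illustration (function names, parameters, and the idea of comparing only the part of the code before the first hyphen), not a description of how Mastodon works.

```python
def primary(tag: str) -> str:
    """Reduce a code like "en-GB" to its primary part, "en"."""
    return tag.split("-", 1)[0].lower()


def should_offer_translation(status_language, spoken_languages) -> bool:
    """Offer the inline translate link only for languages the reader
    has not listed as ones they speak."""
    if status_language is None:
        return False  # Unknown language: nothing to compare against
    spoken = {primary(lang) for lang in spoken_languages}
    return primary(status_language) not in spoken


def default_reply_language(parent_language, spoken_languages, default_language):
    """Inherit the parent's language only if the replier speaks it;
    otherwise fall back to the replier's own default."""
    if parent_language is not None:
        spoken = {primary(lang) for lang in spoken_languages}
        if primary(parent_language) in spoken:
            return parent_language
    return default_language
```

Under these assumptions, an English-only user replying to a post marked "da" would get "en" prefilled, while someone who lists both "en" and "es" would still inherit "es" when replying to a Spanish post.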

IDK

These suggestions aside, is there some "fundamental" UX change that would address the problem of posts' JSON-advertised languages often failing to match the actual language they are written in?

@mcclure mcclure added the suggestion Feature suggestion label Mar 6, 2023
@awebeer256

Another suggestion would be a user-level feature, disabled by default, that runs the text through language auto-detection software when you submit a post and, if the detected language differs from the one you specified, shows a popup asking whether you want to change it.

There could also be a tool to fix the historical instances of this problem: a button in your account settings which runs language detection on a large batch of your posts, and brings any mismatches to your attention. It should probably allow you to filter based on current language metadata (e.g. only examine posts that are supposedly in Danish) and start and end dates.
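Both ideas (the check at submit time and the batch audit of old posts) could share one mismatch check. The sketch below assumes a detector is supplied as a function returning a language code and a confidence between 0 and 1; that signature, the `Post` type, and the 0.9 threshold are hypothetical choices for illustration, and no particular detection library is implied.

```python
from datetime import date
from typing import Callable, Iterable, Iterator, NamedTuple, Optional, Tuple

# Hypothetical detector signature: text -> (language code, confidence 0..1).
Detector = Callable[[str], Tuple[str, float]]


class Post(NamedTuple):
    text: str
    language: str  # language the author declared
    created: date


def suggested_language(text: str, declared: str, detect: Detector,
                       min_confidence: float = 0.9) -> Optional[str]:
    """Return the detected language if it confidently disagrees with the
    declared one, else None (meaning: do not bother the user)."""
    detected, confidence = detect(text)
    if confidence < min_confidence:
        return None
    # Compare only the part before the first hyphen, so "en-GB" matches "en".
    if detected.split("-")[0].lower() == declared.split("-")[0].lower():
        return None
    return detected


def audit_posts(posts: Iterable[Post], detect: Detector,
                only_language: Optional[str] = None,
                start: Optional[date] = None,
                end: Optional[date] = None) -> Iterator[Tuple[Post, str]]:
    """Yield (post, suggestion) for past posts whose declared language looks
    wrong, optionally filtered by declared language and date range."""
    for post in posts:
        if only_language is not None and post.language != only_language:
            continue
        if start is not None and post.created < start:
            continue
        if end is not None and post.created > end:
            continue
        suggestion = suggested_language(post.text, post.language, detect)
        if suggestion is not None:
            yield post, suggestion
```

The confidence threshold is there because short posts are hard to classify; skipping low-confidence guesses keeps the popup from becoming its own source of noise.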

One more thing: I have my interface language set to "English (British)", and I get a translate link for all "English" posts. This would automatically be fixed by your suggestion to declare multiple spoken languages.

@mcclure
Author

mcclure commented Mar 31, 2023

One more thing: I have my interface language set to "English (British)", and I get a translate link for all "English" posts. This would automatically be fixed by your suggestion to declare multiple spoken languages.

Yeah, there's a bug for that at #19302. In general, Mastodon does not understand language codes with dialect/region specifiers, such as en-GB. In addition to the problem you mention, this causes visual glitches. DeepL also doesn't understand dialects (and Mastodon doesn't know to strip them), so I believe that, for example, a German speaker who clicks the "translate" link will not be able to translate any of your posts: Mastodon will ask for an "en-GB" translation, which DeepL doesn't recognize.
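For illustration, stripping the dialect/region part before comparing languages or calling a translation backend could look like the sketch below. It assumes the backend only accepts bare language codes, as described above; the function name is hypothetical and this is not how Mastodon currently handles these codes.

```python
def primary_language(tag: str) -> str:
    """Return the primary language part of a code such as "en-GB".

    Accepts "_" as a separator too, and lowercases the result.
    """
    return tag.replace("_", "-").split("-", 1)[0].lower()


print(primary_language("en-GB"))   # en
print(primary_language("es-419"))  # es
print(primary_language("de"))      # de
```

Note that this also discards script subtags (the "Hant" in "zh-Hant"), which may not always be what you want.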

@c960657
Contributor

c960657 commented Apr 1, 2023

One more thing: I have my interface language set to "English (British)", and I get a translate link for all "English" posts.

This issue has been fixed in #23879. The fix will be included in Mastodon 4.2.

@mcclure
Author

mcclure commented Apr 1, 2023

One more thing: I have my interface language set to "English (British)", and I get a translate link for all "English" posts.
This issue has been fixed in #23879. The fix will be included in Mastodon 4.2.

That's great. And with #23879/#24037, does it now do "something reasonable" when confronted with a region/dialect string (e.g. "es-419", which should be shortened to "es")?

(Of course this is a separate problem from the original problem described in this issue, which I would summarize as "language field is set incorrectly for some or most Mastodon posts".)

@c960657
Contributor

c960657 commented Apr 2, 2023

And with #23879/#24037 does it now do "something reasonable" when confronted with a region/dialect string

No, that is a separate issue (#19302).

@Moilleadoir

Detaching translate from the posting language is a great idea, but it would also be good to choose the target language. I often post in Irish and have the interface in Irish, but the translations to Irish are pretty horrible. I’d prefer translations to English.

@msikma

msikma commented Aug 9, 2023

I really don't want to post just a "+1" type reply, but this is actively confusing, to the point where I just made an issue (#26410) that I now realize is probably just me being hit by this UX decision.

What I don't understand is how someone even came up with this behavior to begin with. It seems to me like peak material for a "falsehoods programmers believe about language/i18n" blog post. Several things about language that should be kept in mind:

  • People can mix and match their language use a lot, instead of sticking mostly with a primary one
  • Interface language is not necessarily correlated with communication language
  • The target language for translation is not necessarily the interface language

While I get that, most likely, the current behavior was chosen because "it's not great but it's better than nothing", it's just confusing at this point. I feel that unless actual language detection is something that can be implemented in the relatively near future, I'd consider not even showing the translate link by default, perhaps making it an option that needs to be turned on.

@mcclure
Author

mcclure commented Aug 9, 2023

While I get that, most likely, the current behavior was chosen because "it's not great but it's better than nothing", it's just confusing at this point

Honestly, from my perspective the current behavior is slightly worse than nothing. The translate link provides no benefit, because it does not appear when needed (I have successfully used the "translate" button to translate a non-English post either once or zero times). Meanwhile, most English posts I read carry a spurious "Translate" button (often because the other user's language is "en-GB", which can't be translated), so the feature is a source of unnecessary UI noise.

@renchap renchap added the i18n Internationalization and localization label Apr 17, 2024

6 participants