Addon Language Filter - Option to completely hide a language that is not spoken #10136

Open
Singh-in opened this issue Apr 9, 2021 · 7 comments


@Singh-in

Singh-in commented Apr 9, 2021

I always get frustrated when I see a language in the stream that I don't speak. I would like to hide such posts completely.

Describe the function you want.

The "Language Filter" addon detects the language of a post once a certain probability is reached and offers the option to collapse posts in unwanted languages. However, these posts are only collapsed, not completely hidden, so they can still be viewed if necessary.

Unfortunately, there is no option to completely hide posts in languages you don't speak, so that only content in languages you actually speak is displayed.

As a result, it often happens that the community page shows almost nothing but collapsed posts. Posts in a language you speak are buried under tons of collapsed ones. This makes the community page less useful.

It would therefore be a great feature if you could optionally filter out (remove) all these posts completely and only see content in languages you speak.

@MrPetovan
Collaborator

Technical note: the langfilter addon piggy-backs off the content filter feature, which only collapses posts. This request would require altering the community timeline post database query, which is far more involved.
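To illustrate the distinction drawn above, here is a minimal sketch contrasting display-time collapsing with query-time exclusion. It uses Python with an in-memory SQLite database; the `post` table, its `language` column, and both function names are hypothetical and do not reflect Friendica's actual schema or code.

```python
import sqlite3

def make_db():
    # Hypothetical schema for illustration only; not Friendica's real tables.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE post (id INTEGER PRIMARY KEY, body TEXT, language TEXT)"
    )
    conn.executemany(
        "INSERT INTO post (body, language) VALUES (?, ?)",
        [("Hello world", "en"), ("Hallo Welt", "de"), ("Bonjour le monde", "fr")],
    )
    return conn

def timeline_collapsed(conn, spoken):
    """Display-time filtering: every post is fetched, unwanted ones are only flagged."""
    rows = conn.execute("SELECT id, body, language FROM post ORDER BY id").fetchall()
    return [
        {"id": pid, "body": body, "collapsed": lang not in spoken}
        for pid, body, lang in rows
    ]

def timeline_hidden(conn, spoken):
    """Query-time filtering: unwanted posts are excluded by the query itself."""
    if not spoken:
        return []
    placeholders = ",".join("?" for _ in spoken)
    sql = (
        "SELECT id, body, language FROM post "
        f"WHERE language IN ({placeholders}) ORDER BY id"
    )
    rows = conn.execute(sql, sorted(spoken)).fetchall()
    return [{"id": pid, "body": body, "collapsed": False} for pid, body, _ in rows]
```

In the first variant the timeline still contains one entry per post, which is why collapsed placeholders can crowd out readable ones. In the second, the language condition has to be part of the timeline query, which is the more invasive change.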

@Singh-in
Author

#10052

@bkil

bkil commented Apr 21, 2021

I'd like to support this feature request from a user perspective, as it would make life easier on multinational instances, and I did consider requesting it myself in the past.

However, note that language detection can make errors quite often, especially if the post has an insufficient amount of text. If no placeholder appears in your timeline, how can you possibly override the decision of the machine?
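One way to keep an override possible is to hide a post only when the detector is both confident and had enough text to work with, and to fall back to a collapsed placeholder otherwise. The sketch below is a hypothetical decision rule in Python; the function name and both thresholds are assumptions for illustration, not settings of the langfilter addon.

```python
def decide(detected_lang, confidence, text_length, spoken,
           min_confidence=0.9, min_length=60):
    """Return "show", "collapse", or "hide" for a single post.

    The thresholds are illustrative guesses, not tuned values.
    """
    if detected_lang in spoken:
        return "show"
    # Hide outright only when the detection is trustworthy.
    if confidence >= min_confidence and text_length >= min_length:
        return "hide"
    # Short or ambiguous posts keep a placeholder so the reader can override.
    return "collapse"
```

With these defaults, a long post confidently detected as an unspoken language is hidden, while a short post keeps its placeholder regardless of how confident the detector claims to be.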

The following edge cases also show how difficult this is:

  • An external link that may or may not point to content in a language that you speak (some sites also allow switching between languages manually or automatically)
  • An image that may or may not need any language comprehension at all (or where the few words burned into it aren't necessary for comprehension)
  • A video that may or may not contain audio tracks or subtitles in a language you understand

I don't have a good recommendation as of now, but I'm all ears. All workarounds I could think of are a bit of a kludge:

  • Better enforcement of specifying a language with new posts, perhaps even showing what the machine detects to motivate the user in correcting it
  • Collaboratively tagging the language of each other's posts (maybe introducing a new language tag for "culture and language independent")
  • Reduce detection errors by intersecting the language detector's output for a post with the languages its author is known to use (which may themselves be detected or tagged collaboratively)
  • Just join a local instance where others are speaking a language that you understand and ask others to post in a single language using any given single account (and perhaps create multiple accounts)
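The intersection idea from the list above could look roughly like this. It is a sketch in Python under the assumption that the detector returns a score per language code and that a set of languages is known for the author; `pick_language` and its arguments are hypothetical names.

```python
def pick_language(scores, author_languages):
    """Pick the best-scoring language, preferring ones the author is known to use.

    scores: dict mapping language code to detector score (assumed format).
    author_languages: set of language codes associated with the author.
    """
    candidates = {
        lang: score for lang, score in scores.items() if lang in author_languages
    }
    if not candidates:
        # Nothing overlaps, so fall back to the raw detector output.
        candidates = scores
    if not candidates:
        return None
    return max(candidates, key=candidates.get)
```

For closely related languages the detector's top two scores are often nearly tied, and this is where the author's known languages act as a tie-breaker.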

@schmaker

schmaker commented Feb 4, 2022

Well, even though the language detection addon is not perfect, I would seriously like to support this idea. Even hiding posts with Russian and Chinese / Japanese text would clean up the mess a lot.

@bkil

bkil commented Feb 4, 2022

> Well, even though the language detection addon is not perfect, I would seriously like to support this idea. Even hiding posts with Russian and Chinese / Japanese text would clean up the mess a lot.

  • ( ͡° ͜ʖ ͡°)
  • ¯\_(ツ)_/¯

@MrPetovan
Collaborator

MrPetovan commented Feb 4, 2022

Good idea, but the Language Detection library uses a weighted approach, so these wouldn't be classified as 100% non-Latin languages based on these smileys alone.

@bkil

bkil commented Feb 4, 2022

Yes, a single emoji would fall below the detector's threshold; it would need much more text to decide. I just wanted to warn you to exercise caution before blindly doing a grep based on Unicode code points.
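The pitfall can be shown with a small Python sketch. It labels each letter by the first word of its Unicode character name as a rough stand-in for its script, then compares a blind "any non-Latin letter" check against a ratio-based one. The function names and thresholds are hypothetical, and the ratio check is only a simplified stand-in for a weighted detector, not the library's actual algorithm.

```python
import unicodedata

def letter_scripts(text):
    """Rough script label (first word of the Unicode name) for each letter."""
    scripts = []
    for ch in text:
        if unicodedata.category(ch).startswith("L"):
            name = unicodedata.name(ch, "")
            scripts.append(name.split(" ")[0] if name else "UNKNOWN")
    return scripts

def naive_is_foreign(text, allowed=("LATIN",)):
    """Blind code point check: a single non-Latin letter flags the post."""
    return any(s not in allowed for s in letter_scripts(text))

def weighted_is_foreign(text, allowed=("LATIN",), min_letters=20, min_ratio=0.5):
    """Flag only if enough letters exist and most of them are non-Latin."""
    scripts = letter_scripts(text)
    if len(scripts) < min_letters:
        return False  # too little text to decide
    foreign = sum(1 for s in scripts if s not in allowed)
    return foreign / len(scripts) >= min_ratio

shrug = "¯\\_(ツ)_/¯"
print(letter_scripts(shrug))  # the only letter is the Katakana character
print(naive_is_foreign("Not sure what happened here " + shrug))
print(weighted_is_foreign("Not sure what happened here " + shrug))
```

The naive check flags the English sentence because of the one Katakana letter in the shrug emoticon, while the ratio-based check does not.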

4 participants