Add first class support for languages in posts #300

Open
ltogniolli opened this issue Sep 21, 2016 · 18 comments

Comments

@ltogniolli

There should be a way to specify which language a post is written in, and for users to specify which languages they would like to see and filter out the rest. Currently tags are used for this purpose (kr or cn, for example), but this loses the benefits of tag organization and doesn't allow users to filter out languages they don't want to see.

According to steemit/steem#430 this is already supported in the blockchain, so this would be a website-only change.

@NateBrune
Contributor

This is just a simple case of adding JSON metadata to the post. If language were a setting in user settings, where profile pictures and the like will be set, it could be assumed that the language specified there is the language used in the post. I propose that an additional metadata field be added to steemit posts, using the ISO 639-3 standard for now. I understand that first-class language support may not be part of the minimum viable product at the moment, but it would be foolish not to include it for future clients, especially when it is such a simple thing to implement now.
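
A minimal sketch of what such a metadata field could look like. The key name `language` and the helper `build_post_metadata` are assumptions made for illustration, not an existing Steemit convention. The check only verifies that the value is shaped like an ISO 639-3 code (three ASCII letters); it does not confirm that the code is actually assigned.

```python
import json


def build_post_metadata(tags, language):
    """Return a json_metadata string carrying a hypothetical `language` field."""
    # ISO 639-3 codes are three lowercase ASCII letters, e.g. "fin", "eng", "kor".
    code = language.strip().lower()
    if len(code) != 3 or not code.isascii() or not code.isalpha():
        raise ValueError(f"not shaped like an ISO 639-3 code: {language!r}")
    return json.dumps({"tags": list(tags), "language": code}, sort_keys=True)


# "fin" is the ISO 639-3 code for Finnish.
metadata = build_post_metadata(["politics", "freedom"], "fin")
```

Any client that parses the post's JSON metadata could then read the field back without guessing from tags.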

@samupaha

Is anybody working on this? It seems that i18n support is progressing, but it's kind of useless until we can actually define the language of posts and comments.

Benefit 1: My guess is that this feature would quite easily bring lots of new users to Steemit. There are people who would like to use Steemit but won't do so if it's English only. Using tags to define language is not a good long-term solution.

Benefit 2: When this feature is enabled, people can start to build communities based on the languages they speak. IMHO one of the current weaknesses of Steemit is that building subcommunities is not easy. It's too difficult to separate which posts belong to the community members and which don't. It's all the same. When people can define the language of a post and filter out all the languages they don't speak, they get a much tighter community, which will lead to more active usage and loyalty to the platform.

Benefit 3: When this feature is enabled, people can filter out all posts that they don't understand. There is too much noise currently. When people see only languages that they understand, their user experience is much better.

@samupaha

Any estimate when this could be implemented? I think this is essential if we want to have users who don't like to communicate in English.

In many countries freedom of speech is in danger. We could provide a platform to have open and uncensored discussion for many different groups and communities.

For example, here in Finland politicians have been attacking "politically incorrect" opinions quite aggressively. Last week this became even more terrifying when the police started a campaign focused on kids. Basically they want children to snitch on their parents for politically incorrect opinions.

https://heatst.com/culture-wars/creepy-finland-police-urge-kids-to-report-parents-who-say-mean-things-about-politicians-on-facebook/

Of course Finland is not alone. In many other countries the ruling class is actively repressing open discussion about politics and culture.

From this perspective, Steem is a really necessary platform because it gives people all over the world an opportunity to have open and uncensored discussions.

People whose freedom of speech is limited are a great market segment for Steem. Those people don't care much about monetary rewards; they just want a place where they can talk freely. Money is a secondary motivator. But if we want to have them here, we need to give them a way to discuss in their native languages.

I think this is low-hanging fruit. Add support for different languages (and also an option to filter out languages that the user doesn't understand) and we will have new and active users very quickly.

@sneak
Contributor

sneak commented Jan 20, 2017

I think that with the addition of communities, this will generally sort itself out without us having to implement specific functionality for languages.

See how already we have language-specific tags. Communities will take this a step further.

@sneak
Contributor

sneak commented Jan 20, 2017

I want to be 100% clear that we want non-English speakers to have a first-class experience on the site.

@samupaha

There are two things that language support will solve (I'm generally in favor of solutions that solve several problems at once).

  • There is too much noise. When users see languages they don't understand, it's pure noise and it makes their experience worse. This harms English speakers too, especially as more non-English writers arrive on Steem. Seeing languages that you don't speak is the ultimate noise. It might be very meaningful for somebody writing in that language, but for you it's just garbage because you have no idea what it is even about.

  • It's really hard to find other writers to talk with in languages other than English. If a user can specify that he wants to see only his native language, Steemit will filter out all other languages and users will see only what they want.

For example, if Finns migrated here successfully now, it would cause a lot of noise for English speakers. Maybe if a couple of well-known bloggers decided to move their blogs here, it could bring many of their followers, too. Then the followers would start to post as well, and after a while we'd have a nice community of a few thousand active people. Some of them would probably buy SP to have more voting power, which would cause Finnish posts to show up in trending and hot. That would be bad for English speakers because they don't understand any of that. It's pure noise for them.

And that's not even a very far-fetched scenario. Finns really like written communication; that's why we invented IRC and SMS.

I mentioned Finland as an example because I see daily discussions about the state of freedom of speech and I could go and ask people to join Steemit. But because the user experience would be bad, I'm pretty sure they would leave soon, if they came at all.

Yes, the addition of communities will help, but it's not a perfect solution. Using tags to indicate language is a very poor system.

Currently the user experience is something like this:

  • Some users start posting in their native language. They have to search for other users who write in the same language and follow them (but some of them might also write in English).
  • Because there is no standard way of tagging languages, there must be an active effort to make sure that all existing and new users know what to use. For example, for posts in Finnish there might be four different tags: #fi, #fin, #finnish, #suomi. It takes active and continuous work to make sure that everybody who writes in Finnish knows which one to use.
  • Using tags to indicate language takes one tag away from their real purpose: to indicate the topics that the post is about. If it's not about linguistics, there shouldn't be a need to put language in the tags.
  • Because the effort is quite considerable, it's really hard for communities to arise organically.
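
To make the tag-fragmentation problem concrete, here is a rough sketch of what a client has to do today to recover the language from free-form tags. The alias table and the function name `split_tags` are hypothetical; a real table would have to be agreed on and maintained by hand, which is exactly the ongoing effort described above.

```python
# Hypothetical alias table mapping free-form language tags to ISO 639-3 codes.
# It is deliberately incomplete; keeping such a table current is the hard part.
LANGUAGE_TAG_ALIASES = {
    "fi": "fin", "fin": "fin", "finnish": "fin", "suomi": "fin",
    "kr": "kor",
    "cn": "zho",
}


def split_tags(tags):
    """Separate topic tags from language tags, normalising the latter."""
    topics, languages = [], []
    for tag in tags:
        key = tag.lstrip("#").lower()
        code = LANGUAGE_TAG_ALIASES.get(key)
        if code is None:
            topics.append(key)          # not a known language alias: a topic
        elif code not in languages:
            languages.append(code)      # collapse duplicates such as #fi + #suomi
    return topics, languages


topics, languages = split_tags(["#suomi", "politics", "#FI", "photography"])
```

Any tag missing from the table silently falls through as a topic, so posts tagged with an unlisted variant would still be misclassified.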

Over the years I've tried several different communication technologies and software packages, but even when they are technologically great, it's really hard to get other users to join. People are reluctant to try anything new, especially if it's too hard to use. My intuition says that using languages other than English is currently so difficult in Steemit that potential users don't even bother to try. If they try, they will post a few times, get discouraged because nobody comments on or likes the posts, and then leave.

The ideal user experience would be something like this:

  • New user signs up. He is asked about language preferences: he will choose to see only posts written in Finnish and by default all his posts are recorded as "written in Finnish" to the blockchain (so that all other UIs will know it, too).
  • The user starts to browse around Steemit and will see only the language he wants to communicate in. Zero noise is caused by languages he doesn't understand.
  • If the user also wants to use English, he can either choose to see languages mixed (like they are now) or easily (with a few clicks) switch between seeing Finnish or English posts.
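
The reading side of this flow could be sketched as follows. Posts are modelled as plain dictionaries with an optional `languages` list; that shape and the function name `filter_feed` are assumptions for illustration, not an existing API.

```python
def filter_feed(posts, preferred_languages):
    """Keep posts that declare at least one language the reader wants to see.

    An empty preference means "show everything", which is how the site
    behaves today.
    """
    wanted = set(preferred_languages)
    if not wanted:
        return list(posts)
    return [p for p in posts if wanted & set(p.get("languages", []))]


feed = [
    {"title": "Sananvapaudesta", "languages": ["fin"]},
    {"title": "On free speech", "languages": ["eng"]},
    {"title": "Untagged post"},
]

finnish_only = filter_feed(feed, ["fin"])      # the Finnish-only view
mixed = filter_feed(feed, [])                  # the mixed view
```

In this sketch a post without any declared language is hidden as soon as a preference is set. Whether such posts should be hidden or shown is a design decision the UI would have to make.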

It could be possible to use special tags to indicate language. Filtering could be done by giving a possibility to hide certain tags (which should be applied to normal tags too, to reduce noise).
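
Hiding by tag could look roughly like this. The post shape (a dictionary with a `tags` list) and the name `visible_posts` are hypothetical.

```python
def visible_posts(posts, hidden_tags):
    """Drop every post that carries at least one tag the reader has hidden."""
    hidden = {t.lower() for t in hidden_tags}
    return [
        p for p in posts
        if hidden.isdisjoint(t.lower() for t in p.get("tags", []))
    ]


posts = [
    {"title": "A", "tags": ["kr", "food"]},
    {"title": "B", "tags": ["photography"]},
    {"title": "C", "tags": []},
]

remaining = visible_posts(posts, ["KR", "cn"])
```

The same function works for ordinary topic tags, which is the noise-reduction side effect mentioned above.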

This might be the easiest way to implement language support because it would need only a little tinkering with the UI. Users who want to indicate the language of a post could just use the tag form and write the language tag. No drop-down menus or anything like that are needed. The only change needed is to implement a new special tag for languages. So users could add five normal tags (like now) and, in addition, five language tags (there should be several language tags for users writing about, for example, differences between languages).

@TimCliff
Contributor

This is related to #233, but I think they are asking for different things. #233 is more intended to make the condenser code handle translations, while this issue is looking for a better handling of languages for user created content. Right now this is being handled primarily through language tags. Would more robust language handling need to be done at the blockchain level, or could it be handled 100% at the UI level?

@samupaha

Compared to the traditional WWW, this is like indicating the language of a webpage in the HTML code. #233 is like translating the UI of a web browser.

> Would more robust language handling need to be done at the blockchain level, or could it be handled 100% at the UI level?

It needs to be at the blockchain level so that all UIs will handle it in the same way.

It's also much more complicated if it's done 100% in the UI because it needs some kind of language recognition system. Those are unreliable in some cases, for example with posts with little text and lots of pictures.

@sneak
Contributor

sneak commented Jul 24, 2017

The plan will be to tag each post with the language it's written in, and then the UI will have various filtering options available to the user. This isn't super high priority until UIs have filtering support, though.

I can say with pretty much certainty that UIs won't all handle it in the same way.

@roadscape added the i18n label Jul 24, 2017
@samupaha

> I can say with pretty much certainty that UIs won't all handle it in the same way.

What I meant to say is that all UIs should use a standardized way of indicating the language.

If I write a post and mark it as "written in Finnish", all UIs will know that it's in Finnish. There shouldn't be several different ways to mark the language. If I want to filter and see only posts in Finnish, the result should be the same in all UIs.

It's not an optimal situation if one UI uses normal tags (like now), another JSON metadata, and yet another some kind of automatic detection system.

I'd love to see this feature prioritized because the signal-to-noise ratio is already really bad and it's getting worse every day with new users coming in.

@sneak
Contributor

sneak commented Jul 25, 2017

I don't agree that all UIs should work the same way.

@samupaha

> I don't agree that all UIs should work the same way.

Can you elaborate a little bit?

All I'm proposing is just a basic standard.

If I post an article that's written in Finnish with UI1, all other UIs will recognize it as Finnish. We need a standard to make sure that UI2 doesn't think it's just some unknown gibberish and UI3 doesn't recognize it as Estonian. Of course, it's up to the UI designer what to do with the information (to filter it or not to filter it in some way), but there shouldn't be any need for guessing or any risk of misrepresenting what the language is.
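
One way to picture such a basic standard is a single, shared way of reading the declared language, leaving each UI free to decide what to do with the result. The sketch below is only an illustration: the key names `languages` and `language` are assumptions, since no standard has been agreed on.

```python
import json


def declared_languages(json_metadata):
    """Return the languages a post declares, or [] if it declares none.

    Takes the raw json_metadata string stored with a post. Returning an
    empty list (rather than guessing) leaves the fallback to each UI.
    """
    try:
        meta = json.loads(json_metadata) if json_metadata else {}
    except ValueError:              # malformed JSON: treat as undeclared
        return []
    if not isinstance(meta, dict):
        return []
    value = meta.get("languages", meta.get("language", []))
    if isinstance(value, str):      # accept a single code as well as a list
        value = [value]
    if not isinstance(value, list):
        return []
    return [v.lower() for v in value if isinstance(v, str)]


finnish = declared_languages('{"languages": ["fin"]}')
unknown = declared_languages('{"tags": ["politics"]}')
```

If every UI read the field this way, a post marked as Finnish in one client would be reported as Finnish in all of them, while filtering behaviour could still differ per UI.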

@TimCliff
Contributor

TimCliff commented Aug 3, 2017

related issue: #1608

@plink01001
Contributor

Thank you for the suggestion. We have reviewed your request and will consider implementing it at a later point in time. For now, the issue has been closed and moved to the enhancement-review category. These issues are being reviewed and considered for future development, but are not actively being worked on at this time. We may re-open the issue at a later date, if/when it is ready to be assigned to a developer. We encourage users to open issues that pertain specifically to bugs or existing functionality that does not work correctly or as intended. If you have additional enhancement suggestions, please make them by following the steps in the Guidelines for Contributing.

@sneak reopened this Nov 14, 2017
@sneak
Contributor

sneak commented Nov 14, 2017

Sorry to override; this one is on our todo, just back burner.

@sneak
Contributor

sneak commented Nov 14, 2017

I would like this to be transparent to the user; it should autodetect the language that is being posted (or default to the browser language) and maybe give the user the option to override.
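
The precedence described here could be sketched as below. The function name and parameters are hypothetical, and the detector itself is out of scope: its guess is simply passed in by the caller.

```python
def choose_post_language(user_override=None, detected=None, browser_language=None):
    """Pick the language to record for a new post.

    Precedence: an explicit user choice, then the detector's guess, then
    the browser language. Returns None when nothing is known.
    """
    for candidate in (user_override, detected, browser_language):
        if candidate:
            return candidate
    return None


# The detector found nothing (for example, a post that is mostly pictures),
# so the browser language is used as the default.
fallback = choose_post_language(detected=None, browser_language="fi")

# The user corrects a wrong guess.
corrected = choose_post_language(user_override="fi", detected="et", browser_language="en")
```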

@TimCliff
Contributor

Something to consider along with this issue is support for "right to left" languages such as Hebrew and Arabic.

Suggestion from @aaroncox :
The post itself would need some sort of language identifier probably, then some sort of CSS class could be added onto the body of the page. From there, it’d just be CSS controlling the text alignment.
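
A rough sketch of the class selection, assuming the post carries a list of ISO 639-3 codes. The class names and the function `body_css_class` are made up for illustration, and the set of right-to-left languages is not exhaustive.

```python
# ISO 639-3 codes for some languages written right to left (not exhaustive):
# Arabic, Hebrew, Persian, Urdu.
RTL_LANGUAGES = {"ara", "heb", "fas", "urd"}


def body_css_class(languages):
    """Return a hypothetical CSS class name based on the post's first language."""
    primary = languages[0] if languages else None
    return "post-body--rtl" if primary in RTL_LANGUAGES else "post-body--ltr"


hebrew_class = body_css_class(["heb"])
finnish_class = body_css_class(["fin"])
```

The stylesheet would then set `direction: rtl` and the text alignment for the right-to-left class. Posts with no declared language fall back to left-to-right in this sketch.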

@shawnpringle

I have done this kind of thing for my bot. Here is an excerpt using steem-python:

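from steem import Steem  # steem-python
steem = Steem(keys=[posting_key], nodes=node_list)
# "languages" is a list because some posts may be multilingual; the other names are defined elsewhere in the bot.
steem.commit.post(title, message, site_user_name, permlink=None, tags=tags, json_metadata={"format": "markdown", "encoding": "utf-8", "languages": [language]}, self_vote=True)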

I use "languages" rather than "language" because some posts may be multilingual. The idea needs adoption. Once it is adopted by the popular platforms, you could filter by language. I would prefer to see posts in all the languages listed in the Accept-Language header that the user agent sends.

The UI would have to determine somehow which languages the user is using to compose a given post, through a combination of asking and detecting.
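
For illustration, here is a simplified reading of the browser's Accept-Language header. The function name is hypothetical and the parser handles only the common `tag;q=value` form. Note that this header usually carries two-letter codes such as `fi`, so a mapping to whatever codes the posts use would still be needed.

```python
def parse_accept_language(header):
    """Return primary language subtags ordered by descending q-value.

    A missing q-value counts as 1.0; entries with q=0 and the "*" wildcard
    are skipped. Ties keep the order in which they appear in the header.
    """
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag or tag == "*":
            continue
        quality = 1.0
        for param in parts[1:]:
            if param.startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0       # unreadable weight: ignore the entry
        if quality > 0:
            # "fi-FI" and "fi" both reduce to the primary subtag "fi".
            weighted.append((-quality, position, tag.split("-")[0]))
    ordered = []
    for _, _, language in sorted(weighted):
        if language not in ordered:
            ordered.append(language)
    return ordered


accepted = parse_accept_language("fi-FI,fi;q=0.9,en;q=0.8,sv;q=0.5")
```

The resulting list could serve as the reader's default language preference until the user sets one explicitly.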
