Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove language settings #12502

Merged
merged 4 commits into from Feb 4, 2021
Merged

Conversation

citizen428
Copy link
Contributor

@citizen428 citizen428 commented Feb 1, 2021

What type of PR is this? (check all applicable)

  • Refactor

Description

While doing research for an RFC, I found some unused/unfinished code and we decided to clean that up.

This PR does the following:

  1. Remove language_settings and all related code. Even though this was marked as "beta", it seems a bit too unfinished to provide any real benefit. The list of 7 languages seems very arbitrary: for example, we included only 4 out of the 10 most spoken languages, with Japanese being the only non-European one. If we revisit this future we probably also want to design the data model a bit better and have a separate table for this, not a JSONB column on users.
  2. Usage of the language column on Article. It seems it was primary based on scoring an article's language against the user's preferred language(s), so with this feature gone it doesn't make sense to keep around.

This whole topic can and probably should be revisited in the future, but with a proper RFC.

There will also be follow-up PRs to ignore and remove the two relevant columns.

Questions:

  • @mstruve I removed the relevant attributes from ES indexing, but IIRC we can't remove them from the mapping, is that correct?

Related Tickets & Documents

#12356

QA Instructions, Screenshots, Recordings

While this is a big change, it should primarily be covered by existing specs.

For manual QA you can visit settings/customization and browse some articles and the feed to ensure everything is still working.

UI accessibility concerns?

None

Added tests?

  • Yes, updated/removed existing ones

Added to documentation?

@pr-triage pr-triage bot added the PR: draft bot applied label for PR's that are a work in progress label Feb 1, 2021
@citizen428 citizen428 marked this pull request as ready for review February 1, 2021 04:39
@citizen428 citizen428 requested a review from a team February 1, 2021 04:39
@citizen428 citizen428 requested a review from a team as a code owner February 1, 2021 04:39
@pr-triage pr-triage bot removed the PR: draft bot applied label for PR's that are a work in progress label Feb 1, 2021
@citizen428
Copy link
Contributor Author

@vaidehijoshi I explicitly requested your review here to make 100% sure that removing this is ok from a product perspective.

Copy link
Contributor

@rhymes rhymes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it! Definitely in favor of removing unfinished features

Should we add language to Article.ignored_columns ?

@citizen428
Copy link
Contributor Author

citizen428 commented Feb 2, 2021

Should we add language to Article.ignored_columns ?

We can't really ignore it as long as code that potentially uses it is still in the app, that would lead to a NoMethodError.

Article.ignored_columns << "language"
Article.first.language
# NoMethodError: undefined method `language' for #<Article:0x0000560f946f6dd8>

So to play it safe in regards to AR column caching I wanted to first remove all the code interacting with this, before ignoring the column. See original PR description:

There will also be follow-up PRs to ignore and remove the two relevant columns.

Maybe I'm a bit too paranoid here, but we're dealing with our two core models, Article and User here.

@rhymes rhymes changed the title WIP: Remove language settings Remove language settings Feb 2, 2021
@pr-triage pr-triage bot added the PR: unreviewed bot applied label for PR's with no review label Feb 2, 2021
Copy link
Contributor

@rhymes rhymes left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's do this

@pr-triage pr-triage bot added PR: partially-approved bot applied label for PR's where a single reviewer approves changes and removed PR: unreviewed bot applied label for PR's with no review labels Feb 2, 2021
Copy link
Contributor

@jacobherrington jacobherrington left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me, and we can ignore the column as a next step if that feels safe.

I think we need @vaidehijoshi to 👍 this change before we merge, but FWIW I think we should remove it.

@mstruve
Copy link
Contributor

mstruve commented Feb 2, 2021

@mstruve I removed the relevant attributes from ES indexing, but IIRC we can't remove them from the mapping, is that correct?

That is correct. You can actually remove the field from our .json file if you want but it will still exist in the mapping on Elasticsearch. In this case since we dont have much data its not a huge deal to have an unused field. If we had a bunch of them and lots of data then we might consider reindexing to a new index.

I would suggest removing it from the mapping or placing a comment that it is unused so people who see it later on know what the deal is with it. It's always confusing when you find a field in ES and you can't find it in the codebase anywhere and then you wonder why and how it got there.

@vaidehijoshi
Copy link
Contributor

vaidehijoshi commented Feb 3, 2021

I let this marinate in my mind all day today, and went back and forth on whether removing this was the right call or not. While the language_settings concept is certainly half-baked, it wasn't immediately obvious to me from a product sense whether we needed to actually remove it or not right now.

As a thought exercise with @lisasy, I asked myself the following: "If we were starting with a brand new Forem tomorrow, would this feature be immediately useful?"*

If we were starting with a brand new Forem that was just spun up, I don't think that the "filter content by language" setting is one that would be a big sell or be that actively used. The more that I think about it, the more apparent it is that this is a feature that is useful to Forem instances with:

  1. a significant amount of content
  2. content in more than one language

I would say that this doesn't apply to the majority of Forems right now ("right now" being very key words here). An out-of-the-box Forem would likely not need a "filter content based on selected language" feature. However, I could definitely see this being something we'd need to provide as Forems become more robust, with more users from around the world, who speak + write content in different languages. But before that, a Forem Creator would likely be concerned with growing that community and getting that content to begin with.

So, my answer to that original thought exercise was "No". In fact, at the moment, this feels like a very DEV-specific feature, as DEV is an example of a Forem that fulfills requirements 1) and 2).

I would be okay with us coming back to a "language preference filtering" feature like this one when the time is right -- and with all Forems in mind. That moment would be the right one to re-evaluate the data model, and consider which languages are actually in demand across the Forem ecosystem. I lean towards us removing this feature for now, and circling back to it when the time is right (when we have a demand for it across our Forems, meaning that we have Forems that have grown enough to the point where we can see enough variety of content, in different languages).

Two relevant pieces of information I pulled while letting this marinate:

  1. Total number of articles on DEV with languages set.
  2. Total number of users on DEV with preferred languages set.

To this end, I think it's important that we think about how making this change will impact the product experience. I'd like us to write a forem.dev post if we deprecate this feature with some reasoning as to why we decided to remove this (admittedly, the current state of this feature isn't useful, and I think that's a good thing to note), and our hope to revisit it in the future. Given the number of users who have a language setting other than English, I think we owe it to them to explain why we're deprecating this feature after having it in beta for so long. I think if we can plan to do that, then I can get onboard with removing the language_settings concept until we come back it at the right time down the road. 👍🏽

*Obviously, we can't ask ourselves only this question when making all product decisions, but it seemed like the right question given our current goals.

@citizen428
Copy link
Contributor Author

citizen428 commented Feb 3, 2021

it wasn't immediately obvious to me from a product sense whether we needed to actually remove it or not right now.

I have a potential one: the list of available languages is extremely arbitrary and heavily skewed towards European languages. The only one that's not is Japanese, which is less spoken than e.g. Mandarin, Hindi, or Indonesian. So for every person we make happy with this feature, we potentially disgruntle many more.

Total number of articles on DEV with languages set.

The top 2 are "en" and "null". If we exclude these two there are 30,033 articles that have a language set, out of 475,073. That's about 6.3%, a very low number for a Forem instance that already fulfills both your criteria. Though I'd wager that this number could be significantly higher if we had a better implementation and supported more than 7 seemingly arbitrarily chosen languages. That's why I propose starting over from scratch with an actual RFC.

Total number of users on DEV with preferred languages set.

For non-English that's 12,675 out of 561,408 or 2.25% percent. Same comment as above, I believe this number could be much higher with a better implementation.

I'd like us to write a forem.dev post if we deprecate this feature with some reasoning as to why we decided to remove this

Who is "us" here, me or you? 😃

@citizen428 citizen428 marked this pull request as draft February 3, 2021 03:27
@citizen428
Copy link
Contributor Author

In the meantime, I'm setting this PR back to draft to prevent accidental merges until the above product discussion is resolved.

@citizen428
Copy link
Contributor Author

I would suggest removing it from the mapping

@mstruve

@pr-triage pr-triage bot removed the PR: partially-approved bot applied label for PR's where a single reviewer approves changes label Feb 3, 2021
@pr-triage pr-triage bot added the PR: draft bot applied label for PR's that are a work in progress label Feb 3, 2021
@rhymes
Copy link
Contributor

rhymes commented Feb 3, 2021

Two relevant pieces of information I pulled while letting this marinate:

Total number of articles on DEV with languages set.

I took the liberty to fork that query to only view published articles https://dev.to/admin/blazer/queries/318-published-article-language-totals

@vaidehijoshi
Copy link
Contributor

vaidehijoshi commented Feb 3, 2021

I have a potential one: the list of available languages is extremely arbitrary and heavily skewed towards European languages. The only one that's not is Japanese, which is less spoken than e.g. Mandarin, Hindi, or Indonesian. So for every person we make happy with this feature, we potentially disgruntle many more.

Yes, I totally see this point. I definitely am not asserting that the languages that we originally chose for this feature are fully inclusive, by any means. 😓

The top 2 are "en" and "null". If we exclude these two there are 30,033 articles that have a language set, out of 475,073. That's about 6.3%, a very low number for a Forem instance that already fulfills both your criteria. Though I'd wager that this number could be significantly higher if we had a better implementation and supported more than 7 seemingly arbitrarily chosen languages. That's why I propose starting over from scratch with an actual RFC.

For non-English that's 12,675 out of 561,408 or 2.25% percent. Same comment as above, I believe this number could be much higher with a better implementation.

Yep, I realize the overall percentages here are low. And I fully acknowledge that the feature is not widely used (probably because it is half-baked 🙃).

But I do think it means something that even 2 percent had a language set. Perhaps some of those users were power users who joined in the early days and were excited by us rolling out beta features for language support. I know the feature is half-baked, but we did release it at some point, and there were some folks who have a language set, so it is a feature that a small percentage of users expect to "do something", even if it is a small percentage.

To be clear, the reason I wrote up my previous comment with my thought process was not because I disagreed with what this PR does. 😃 From a technical perspective, it makes total sense. But, I also wanted to consider what it would feel like for those 2% of users who might notice that we suddenly dropped multiple-language settings without any explanation. As our team is predominantly English-speaking/anglocentric, I worried that dropping this feature without at least acknowledging the reasoning behind it (and the intent to return to it) would send a message of, "Oh, we don't care about multiple language support anymore!", which I know is not the case as many people on our team care deeply about it. Hence, my suggestion to write a forem.dev post on this change.

I'd like us to write a forem.dev post if we deprecate this feature with some reasoning as to why we decided to remove this

Who is "us" here, me or you? 😃

I'm happy to whip up a draft so as not to block you on this any longer :)

Copy link
Contributor

@vaidehijoshi vaidehijoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This beta feature deprecation has been shared with the greater community, so I think we're go to go! 🚀

@citizen428 citizen428 marked this pull request as ready for review February 4, 2021 01:27
@pr-triage pr-triage bot added PR: partially-approved bot applied label for PR's where a single reviewer approves changes and removed PR: draft bot applied label for PR's that are a work in progress labels Feb 4, 2021
@citizen428
Copy link
Contributor Author

citizen428 commented Feb 4, 2021

Thanks for the context @vaidehijoshi, makes perfect sense. And also thanks for writing this post!

@citizen428 citizen428 merged commit e258e36 into master Feb 4, 2021
@citizen428 citizen428 deleted the citizen428/remove-language-settings branch February 4, 2021 01:35
@pr-triage pr-triage bot added PR: merged bot applied label for PR's that are merged and removed PR: partially-approved bot applied label for PR's where a single reviewer approves changes labels Feb 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
PR: merged bot applied label for PR's that are merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants