
Improve moderation behaviors: show alert/inform sources and improve UX around threads #3677

Merged
gaearon merged 12 commits into main from paul/mod-behavior-updates
May 23, 2024

Conversation

pfrazee
Collaborator

@pfrazee pfrazee commented Apr 24, 2024

  • (Dropped) Don't show account-level severity=alert blurs=none labels on the posts of users unless they're from the app's baked-in moderation. After some discussion (mainly in the replies here) we decided not to move forward with this.
  • Tune the UI of labels on posts & profiles to show who gave the label
  • Bundle blurred replies into a single element at the bottom

The code for this isn't exactly incredible so I'm open to discussions.

Tune the rendering of blurs=none labels

They now show who applied the label:

[Screenshots: label shown on a post; label shown on a profile]

Group together blurred items in threads

Threads now move blurred items down to the bottom and put them behind a single blur.

[Screenshots: thread before and after the change]

If all of the items are blurred by account muting, the blur says "Show muted replies" and will show all of them with one click. If there are other reasons (eg labels) then the blur says "Show hidden replies" and will then show each item under their respective blurs.
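
The grouping rule above can be sketched roughly as follows. This is a minimal TypeScript illustration, not the code in this PR: the `Reply` and `GroupedThread` types, the `blurCause` field, and the `groupReplies` function are all hypothetical names invented for the example.

```typescript
// Hypothetical types for illustration only; not the app's real data model.
type BlurCause = 'muted' | 'label'

interface Reply {
  id: string
  // Undefined means the reply is not blurred at all.
  blurCause?: BlurCause
}

interface GroupedThread {
  // Replies rendered normally, in their original order.
  visible: Reply[]
  // Replies moved to the bottom of the thread behind a single blur.
  hidden: Reply[]
  // Text on the single blur, or undefined when nothing is hidden.
  prompt?: 'Show muted replies' | 'Show hidden replies'
  // True when one click reveals every hidden reply; false when each reply
  // keeps its own blur after the group is opened.
  revealAllOnOpen: boolean
}

function groupReplies(replies: Reply[]): GroupedThread {
  const visible = replies.filter(r => r.blurCause === undefined)
  const hidden = replies.filter(r => r.blurCause !== undefined)
  if (hidden.length === 0) {
    return {visible, hidden, revealAllOnOpen: false}
  }
  // Only when every hidden reply is hidden by account muting does the
  // group use the "muted" prompt and open everything in one click.
  const allMuted = hidden.every(r => r.blurCause === 'muted')
  return {
    visible,
    hidden,
    prompt: allMuted ? 'Show muted replies' : 'Show hidden replies',
    revealAllOnOpen: allMuted,
  }
}
```

With this shape, a thread containing one muted reply and one labeled reply gets the "Show hidden replies" prompt, and opening it leaves each reply behind its own blur.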

To test

Start the dev-env. Log in as bob.test.

  • To test the labeling behavior, subscribe to labeler.test and then view threads.
  • To test the muting behavior, mute various users and then view threads.


github-actions bot commented Apr 24, 2024

Old size: 7.3 MB
New size: 7.3 MB
Diff: 2.96 KB (0.04%)


EstrogenEmpress commented Apr 24, 2024

This is a great iteration on the system as-is, and I definitely think it'll cause a lot less confusion when it comes to the UX being misinterpreted. I assume this also applies to quote posts 'inheriting' labels if the quoted user carries them, like so?

[Screenshot]

The only sticking point with the patch is that sometimes, mid-thread, it's useful for labeler subscribers to still be aware, if they want, that someone has been tagged at the account level for a certain behavior (call it Intolerance or Trolling, to name a couple). Any thoughts on adding an optional toggle in the Moderation > Advanced section, defaulted to OFF, that reapplies the more aggressive in-thread label behavior as before? Or maybe making this configurable on the labeler's end? Just spitballing, nothing too concrete, just getting my thoughts out as a labeler operator. Overall it's a good fix; I'd just worry about folks thinking that the labeler system "broke" because it no longer shows mid-thread, if that makes any sense.

EDIT: I'd also worry that for more moderation-focused labelers, this could potentially spike duplicate reports for accounts that no longer have the obvious flag on them, which isn't necessarily a problem per se but does add additional inflow to labeler queues that could potentially be avoided depending on what way this gets implemented.

Collaborator Author

pfrazee commented Apr 24, 2024

> This is a great iteration on the system as-is, and I definitely think it'll cause a lot less confusion when it comes to the UX being misinterpreted. I assume this also applies to quote posts 'inheriting' labels if the quoted user carries them, like so?

This doesn't affect that situation. I believe the issue there is, the UI is applying a blur to the quoted content, but not indicating strongly enough that the blur is about the quote and not the containing post. I think perhaps some explanatory text might help ("The quoted post has been labeled") or perhaps a better visual indication.

> Any thoughts on adding an optional toggle

I'm not wholly against it, though I'm wary of preferences profusion. Let me sleep on it.

Since I haven't explained the reasoning for the change in this PR, let me do that now for everyone. The main issue at the moment is that showing the label on every post is acting like a kind of scarlet letter punishment, when the intent was to A) enable folks to filter posts from labeled users, via the hide setting, B) give a way to audit what the labeler is up to, via the warn setting, and C) enable neutral or positive labels, via the warn or show badge setting. Having used it for a bit in prod, I think C is going to likely need some additional thought (perhaps a different kind of custom label), and B is not really that meaningful except perhaps for people running the labeler, which is partly why you're asking for the toggle. Overwhelmingly the effect is to make me view the person in a negative light (scarlet letter effect) which I think may worsen rifts between people.

I'm also inclined to examine how we're showing the warnings on profiles. Rather than directly showing the label, I wonder if we ought to adopt a slightly more neutral phrasing (something like "1 label by X") which you then see in a modal when you tap to open. My general sense is that de-escalation is almost always better, and that often happens by reducing attention focused on things. It's why I think our approach to blocks has been the right call.

I'll hit the quote-post issue in a followup PR, and possibly also the account rendering of the label, and think about the toggle.


EstrogenEmpress commented Apr 24, 2024

> the intent was to A) enable folks to filter posts from labeled users, via the hide setting

Gotcha, this I did not know, it's useful info to have as far as the intended use case of the settings.

> enable neutral or positive labels, via the warn or show badge setting

This would be very useful as I know a couple labeler services are experimenting with value-neutral, informational labels (like "verified" tags) and it'd be good to be able to have one that isn't intended solely for content attrition

> Overwhelmingly the effect is to make me view the person in a negative light (scarlet letter effect) which I think may worsen rifts between people.

Very true, this has been somewhat upsetting to see as it was also an unintended use case as far as the labeler's end as well, even as it pertains to a moderation-focused service. Ideally these were purely 'caution' signs if someone was exhibiting a behavior pattern and are intended to be appealed when there's a cessation of said behavior but that doesn't seem to be how folks are using our systems currently, will advise our operator team on maybe putting out some documentation on this

> Rather than directly showing the label, I wonder if we ought to adopt a slightly more neutral phrasing (something like "1 label by X") which you then see in a modal when you tap to open.

On this one I would offer some pushback if only because when neutral or positive affect labels are assigned (eventually), it's an additional step to finding said info and, depending on the use case, people may see "1 label by [x]" and if it's primarily a moderation focused labeler, may misattribute that to effectively replicate the unintended Scarlet Letter effect you mentioned (i.e. "there's A Label on this account and therefore it must be a bad one")

> It's why I think our approach to blocks has been the right call.

Wholeheartedly agree, the complete cauterization of info flow is very useful at deterring that sort of unpleasant behavior and it's one of the best changes you've implemented over other iterations of the feature

Collaborator Author

pfrazee commented Apr 24, 2024

What have we wrought

[Screenshot]

We're going to need to tweak this PR to continue supporting that. Some scattered thoughts.

  • We could tweak it so severity=inform labels always show on posts but severity=alert labels don't. (To answer your question above about the "1 label by [x]" on the profile, I'd do the same there.)
  • We really ought to be showing who the badge is from. This is unfortunate because also...
  • The badge is taking up a fair amount of space right now.

I'm going to play with it a bit.

> Ideally these were purely 'caution' signs if someone was exhibiting a behavior pattern

Yeah I really don't know if we can make that usecase work because of the realities of how it comes off. I think labeling an account entirely is useful for the "hide" behavior but showing any kind of warning on the posts is just a lot. I was debating a small ⚠️ icon which expands to show the alert labels, but in some ways that's even worse.

> On this one I would offer some pushback if only because when neutral or positive affect labels are assigned (eventually), it's an additional step to finding said info and, depending on the use case, people may see "1 label by [x]" and if it's primarily a moderation focused labeler, may misattribute that to effectively replicate the unintended Scarlet Letter effect you mentioned (i.e. "there's A Label on this account and therefore it must be a bad one")

Yes I agree, and I'm inclined to make the neutral or positive ones show fully and only negative ones go behind a grouping. I might special case some of our official labels though -- impersonation for instance is a pretty important badge to be showing.

I think there are very useful and fun things that can be done with neutral & positive labels. The "verified" badge is a bit... hm. It might work if we indicate who did it, and I quite like delegating the power of verification. The question is whether it can go systemically awry, but for labeling I have a strong philosophy of allowing some risk of downside in order to give room to the potential for upside.

Collaborator Author

pfrazee commented Apr 25, 2024

Tuned label rendering to be a little smaller, a little rounder, and to show who applied the label.

On posts:

[Screenshots: feed view; expanded view]

On the account:

[Screenshot]

Collaborator Author

pfrazee commented Apr 25, 2024

Modified the logic of when a label shows. This is the comment in the code:

The issue we have with labels on accounts is that 'negative' labels are showing everywhere, acting as a kind of "scarlet letter" punishment, when their intent is to just enable users to hide other users that are causing issues. Labelers don't have a way to express that an account-level label shouldn't show on every post.

However, there are some cases where we really do want to show the labels:

  1. When the label is informational or positive (like "Verified")
  2. When the label is crucial (like "Impersonation")

The solution we're applying FOR NOW is to hide severity=alert labels on accounts when looking at posts unless they're from the app's baked in moderation.

The labeling system will need to be expanded to improve this situation. See bluesky-social/atproto#2444
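
As a rough illustration of the interim rule described in that comment, here is a minimal TypeScript sketch. It is not the code from this PR: `AccountLabel`, `BUILT_IN_LABELER`, and `showAccountLabelOnPost` are hypothetical names, and the identifier used for the app's baked-in moderation is a placeholder.

```typescript
// All names here are hypothetical; this is not the app's actual code.
type Severity = 'inform' | 'alert'

interface AccountLabel {
  value: string // e.g. "verified" or "impersonation"
  severity: Severity
  source: string // identifier of the labeler that applied the label
}

// Placeholder identifier standing in for the app's baked-in moderation.
const BUILT_IN_LABELER = 'did:example:built-in-moderation'

// Decide whether an account-level label is rendered on that account's posts.
function showAccountLabelOnPost(
  label: AccountLabel,
  builtInLabeler: string = BUILT_IN_LABELER,
): boolean {
  // Informational or positive labels (like "Verified") always show.
  if (label.severity === 'inform') return true
  // Alert labels show only when the baked-in moderation applied them,
  // which keeps crucial cases like "Impersonation" visible.
  return label.source === builtInLabeler
}
```

Under these assumptions, a third-party `alert` label would still be usable for hiding an account's posts but would no longer be drawn on each of them.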


aetaric commented Apr 25, 2024

Hey Paul,

I think there's actually some really good value to the "scarlet letter" effect.
example: at://did:plc:p3dhh5eapbqv6zugzvq7itkm/app.bsky.feed.post/3kqy2oppfts2p

This kinda behavior is exactly what negative labels are meant to prevent: someone jumping into a thread and being overtly problematic, when that is the entire point of the account. This change would also seemingly make lists more powerful than labels, since lists would still have the scarlet letter effect to them. Yes, some labels (and lists) are negative. And they probably should be. It's a warning sign to people that would be affected by the behavior in question.

Additionally, even with positive or neutral labels, labels still seem to have a scarlet letter effect: https://bsky.app/profile/erinbiba.bsky.social/post/3kqygolvjks27
maybe this is just because positive/neutral labels are "newer" than negative labels, but it's happening.

Collaborator Author

pfrazee commented Apr 26, 2024

Hm! Challenging issue.

I personally believe in de-escalation and separation in the context of social networks. The way I see it, social has a natural draw towards conflict. To fight that, I think we have to err toward de-escalation.

My concern is that the utility of the scarlet letter is less than its potential to induce conflict. It's extremely embarrassing to be tagged like that, and I don't want to create an atmosphere of aggregate grievance. My ideal would be: when somebody crosses the line with a community of users, they get removed from the space and that's the end of it. My fear is: we scarlet letter a bunch of people and their grievance grows into a backlash.

To some degree I think we may need to answer this by giving more control to the labelers, so they can make these calls, but I also don't want to pass the buck by just pushing it onto others. I figure I should at least weigh in as a participant before I default to "yall decide."

If the full answer really is "give this power to the labelers to decide," then my immediate question is -- should we leave it until the labelers have that power (which will take more dev work, so it'll be a bit) or should we enact this change in the interim?


EstrogenEmpress commented Apr 26, 2024

This is gonna be a doozy of a writeup but I have some thoughts on this mechanically.

I think there's a lot of good points here, but if I can be frank, I think a lot of this is through the lens of a broader top-down approach when I believe that looking at this from an end user's angle is possibly more beneficial to understanding how this dynamic evolved and how the divergence in use cases emerged.

Let's say, for the sake of argument, you are a Turbo Dude. Yes I'm using your own title, but bear with me. Let's say Turbo Dudes are often brigaded by extremely vitriolic and hateful people, solely because they are Dudes who are Turbo. This is an undesirable interaction for you as a user, so you seek out ways to mitigate your exposure or contact with, well, let's call them Dudephobes. A labeler is developed by another Turbo Dude that says it will place a Dudephobia warning on these sorts of people ahead of time so that you might see that much less, provided they do their job well. You subscribe to said labeler, on the default settings (Warn) and so now, whenever you wander into a thread, there's now visible warning signs on users who are Dudephobes. You have a couple of options here as a user. You can:

  1. Leave it set to Warn and just give these guys a wide berth when it comes to your interactions with or around them, if at all (anecdotal evidence shows most users will leave most everything on Warn, perhaps out of morbid curiosity or out of an interest of seeing if someone got falsely flagged, which does happen). From there they tend to apply individual blocks as-needed.
  2. Set it to Hide (anecdotal data has found that very few people are actively doing this) and never see a Dudephobe again, even though they definitely exist, can still see your profile, and still harass you in now-hidden comments but only for your app view. We'll note that in the above example, it can sometimes create this effect where folks, well intentioned or otherwise, engage the Dudephobe in a hostile manner, effectively escalating said situation and creating a firestorm in your replies that, because of your settings, you have no idea is happening.
  3. Use the labeler's unique hybrid moderation model and set the Dudephobes moderation list to Block, effectively cauterizing that interaction and creating the community expulsion feature you mentioned. Anecdotal data shows this is the least likely course of action, probably because, again, users are incentivized towards conflict because of both past and present experiences on social networks.

Let's sit with that last point because I think it also paints a salient picture of why, for instance, the most popular labeler services at present are designed to remove content from a user's timeline, rather than categorize it on a more neutral basis:

For a certain subset of users, typically marginalized users, they are not seeking de-escalation, per se. They're not looking for a fight, either, though they'll definitely put one up if backed into a corner. They want a walled garden. The Posting Garden of Eden. It's why browser extensions like Shinigami Eyes and Soupcan exist and are very popular with this subset - people do not want certain folks on their feeds, and will actively seek out methods to identify and excise them, and this is the exact demographic which moderation labelers like Aegis, Taurus, etc. serve and, in their hopes, protect.

It's also one of the things myself and other operators of the Aegis system have ruminated on in the past month - if the moderation provided at the site level was sufficient to meet this need for a safe space online, our service would not be one of the most popular labelers at this point in time, nor would any moderation labeler. In other words, users are demonstrating that they want MORE controls and measures to lock down their experience, rather than open it up. This isn't just in our cluster, either, the system is utilized by everyone from newskies to 100K+ follower accounts, as our reports and posts showing off the labels in action have shown.

This also explains why the Scarlet Letter effect kicks in, or rather why it seems to escalate tensions. If you consider the user profile of someone who, for instance, gets an account level label of Dudephobe applied, this typically happens not because they were sitting in their own cluster, minding their own business - labelers with human driven operation don't often "see" content like this outside of their scope. They tend to get these labels because they demonstrate a repeat pattern of engaging in unwanted ways with communities who do not want to deal with them - thus the labels, for some users, are this:

> ideal would be: when somebody crosses the line with a community of users, they get removed from the space and that's the end of it.

This is sort of the same dynamic that emerged with moderation lists, but with the ability to apply them with more precision (i.e. labels can go on the post level, lists can't): a few trusted services will emerge, reach wide adoption, at which point inclusion in that label set is, yes, an effective Excommunicado from a community. I also personally remember similar pushback with moderation lists when they first rolled out in May. But now I'm a member of over 250 moderation lists, and it doesn't quite faze me as much as it did way back when. So I guess my question, then, is this:

What, specifically, is the envisioned distinction between labels and moderation lists when it comes to their application to accounts?

Because if we think of them mechanically, they're identical right now at the account level - they apply to and mask everything a user posts, are indicated on the profile if someone is using the labeler or list, and are invisible to the user themselves otherwise, save for third party services or subscribing to the list/labeler yourself. They both lack the precision of a post level label, and both carry the weight, when combined with trust and adoption, of an effective scarlet letter, regardless of the labeler's intent (for the record, ours is one focused on disincentivizing that behavior by leveraging the label's perceived social power against the offending user - we often do appeal these when it's evident that the tagged user hasn't reoffended or expresses an intent to make amends in their appeal).

So what's causing the increase in perceived pushback? Perhaps looking at similar user sentiments in past PR's for mod lists back at their initial rollout could give us some perspective on the conundrum we're facing now. Maybe it's the misconception that labeler service labels are "official" or direct from Bluesky - in that circumstance, the suggestion to include the attribution of which labeler applied it in the UI itself might be the way to go. Or maybe it's just growing pains. Who knows.

Anyways, I rambled on quite a bit but I figured an extended peek under the hood and end user perspective might be beneficial at having a more holistic view of the situation while you weigh what path you wanna pursue.

@bnewbold
Contributor

My quick takes on the proposal, not replying to any of the other discussion:

  • big plus one to grouping the muted posts at the bottom
  • adding attribution to badge-labels is also good. we might need to use the handle of labeler account to avoid impersonation at some point? and/or have short vs. long names; want a very short name for this badge attribution
  • I don't love special-casing our mod service, and it smells like something that could stick around indefinitely
  • the current uses under mod policy that I can think of off the top of my head are impersonation, spam, inauthentic, and misleading/scam. that is more than one label, which makes it feel like more of a category/pattern than just a one-off need.
  • I do think that generally account-level severity=alert / blurs=none should show on every post. the use-cases for that feel legit to me, and if it is being weaponized we should try to mitigate that another way (I haven't read the threads above which I think touch on this)
  • this is unearthing old complexity skeletons, but isn't this just the profile-record vs account-level distinction?
  • my intuition/preference would be to find a way to have the account alert badge still apply on every post, but have it visually be more obvious that it is account-level not post-level. and make the "inform" vs "alert" distinction clearer. a warning on the avatar has been thrown around, I think we don't love that? or something more "above the fold" or upper-right-corner. this UI design is hard to get polished/a11y/etc; if we don't have time to design this right now, we could keep the existing behavior and recommend/guide/document that folks name the account-level labels "Account Blah" to help make it clearer it isn't a post-level label? Not great.

Big picture, i'm not a hard-block on the proposal. I know paul has wrangled with this deeply and trust direction on this. I do think that we are still in the "how will folks use labels" phase; I think we definitely get to iterate on the semantics/behaviors/capabilities at least once in response to real-world use, but don't want to churn on it too much? Maybe this is novel enough to warrant a bunch of iteration though. 🤔


aetaric commented Apr 26, 2024

> this is unearthing old complexity skeletons, but isn't this just the profile-record vs account-level distinction?

Yeah, as it currently works, profile-level labels only show on the profile and not on posts. Account-level labels show on all posts and the profile.
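
The distinction described above could be sketched like this. This is a hedged TypeScript illustration only: the function and type names are hypothetical, and it assumes that an account-level label targets a bare DID while a profile-level label targets the `app.bsky.actor.profile` record of that account.

```typescript
// Hypothetical names; the subject-to-scope mapping below is an assumption.
type LabelScope = 'account' | 'profile' | 'record'
type Surface = 'profile' | 'posts' | 'record'

// Classify a label by what its subject string points at.
function scopeOfSubject(subjectUri: string): LabelScope {
  // Assumed: a bare DID means the label is on the whole account.
  if (subjectUri.startsWith('did:')) return 'account'
  // Assumed: the profile record lives in the app.bsky.actor.profile collection.
  if (
    subjectUri.startsWith('at://') &&
    subjectUri.includes('/app.bsky.actor.profile/')
  ) {
    return 'profile'
  }
  // Anything else is treated as a label on one specific record, e.g. a post.
  return 'record'
}

// Where a label of each scope is rendered, per the description above.
function surfacesFor(scope: LabelScope): Surface[] {
  switch (scope) {
    case 'account':
      return ['profile', 'posts'] // the profile and every post
    case 'profile':
      return ['profile'] // the profile only, never posts
    case 'record':
      return ['record'] // only the labeled record itself
  }
}
```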

> my intuition/preference would be to find a way to have the account alert badge still apply on every post, but have it visually be more obvious that it is account-level not post-level. and make the "inform" vs "alert" distinction clearer. a warning on the avatar has been thrown around, I think we don't love that? or something more "above the fold" or upper-right-corner. this UI design is hard to get polished/a11y/etc; if we don't have time to design this right now, we could keep the existing behavior and recommend/guide/document that folks name the account-level labels "Account Blah" to help make it clearer it isn't a post-level label? Not great.

Ideally, playing off this last point, one could easily see a solution in noting that a label is done at the account level since that would be easy to see programmatically. Would be pretty messy to do something like that on the ozone-side where labeler operators would basically have to duplicate their entire label set... Could keep the same format too: Account: label-display-name
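
A client-side version of that idea might look something like the sketch below. It is only an illustration of the suggested "Account: label-display-name" format; `BadgeScope` and `badgeText` are hypothetical names, not anything from the app or from Ozone.

```typescript
// Hypothetical helper: the client adds the prefix itself, so labeler
// operators would not need to duplicate their label set.
type BadgeScope = 'account' | 'post'

function badgeText(displayName: string, scope: BadgeScope): string {
  // Account-level labels get the "Account: " prefix; post-level labels
  // are shown with their display name unchanged.
  return scope === 'account' ? `Account: ${displayName}` : displayName
}
```

For example, `badgeText('Intolerance', 'account')` produces `Account: Intolerance`, while the same label applied to a single post would render as just `Intolerance`.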

Collaborator Author

pfrazee commented Apr 26, 2024

@EstrogenEmpress and @aetaric I appreciate yall sharing your thoughts about this. My beliefs about de-escalation are real -- but I'm not at all certain that showing the alert badges on posts is a net negative. I'm not the one experiencing the outcomes or living the same stakes. If you tell me showing the alert badges is a net positive, that's the feedback I'm looking for and I'm going to follow it! And if you end up changing your mind, I want to hear about that too!

I think for sure it's worth seeing how the other changes in this PR affect things first.

pfrazee changed the title from "Improve moderation behaviors: reduce account-level alerts and improve UX around threads" to "Improve moderation behaviors: show alert/inform sources and improve UX around threads" on Apr 29, 2024

@agentjabsco

an idea I wanted to float here: what if moderation service's labels included a tiny icon of the moderation service's profile pic at the end of the bubble? you don't have to worry about impersonation since the user only sees labels for the services they've opted into anyway

@surfdude29
Contributor

> an idea I wanted to float here: what if moderation service's labels included a tiny icon of the moderation service's profile pic at the end of the bubble? you don't have to worry about impersonation since the user only sees labels for the services they've opted into anyway

I like this idea too and Paul seemed open to considering it:

https://bsky.app/profile/pfrazee.com/post/3kqxs3fubo22i

gaearon merged commit f7ee532 into main on May 23, 2024 (6 checks passed)
gaearon deleted the paul/mod-behavior-updates branch on May 23, 2024 23:39