Improve moderation behaviors: show alert/inform sources and improve UX around threads #3677
Your Render PR Server URL is https://social-app-pr-3677.onrender.com. Follow its progress at https://dashboard.render.com/web/srv-cok5nbed3nmc73ajcuug.
This is a great iteration on the system as-is, and I definitely think it'll cause a lot less confusion when it comes to the UX being misinterpreted. I assume this also applies to quote posts 'inheriting' labels if the quoted user carries them, like so? The only sticking point with the patch is that sometimes, mid-thread, it's useful for labeler subscribers to still be aware that someone has been tagged at the account level for a certain behavior (call it Intolerance or Trolling, to name a couple) if they want. Any thoughts on adding an optional toggle in the Moderation > Advanced section, defaulted to OFF, that reapplies the more aggressive in-thread label behavior as before? Or maybe making this configurable on the labeler's end? Just spitballing, nothing too concrete, just getting my thoughts out as a labeler operator. Overall it's a good fix, I'd just worry about folks thinking that the labeler system "broke" because it no longer shows mid-thread, if that makes any sense. EDIT: I'd also worry that for more moderation-focused labelers, this could spike duplicate reports for accounts that no longer have the obvious flag on them, which isn't necessarily a problem per se but does add inflow to labeler queues that could be avoided depending on how this gets implemented.
This doesn't affect that situation. I believe the issue there is, the UI is applying a blur to the quoted content, but not indicating strongly enough that the blur is about the quote and not the containing post. I think perhaps some explanatory text might help ("The quoted post has been labeled") or perhaps a better visual indication.
I'm not wholly against it, though I'm wary of preferences profusion. Let me sleep on it. Since I haven't explained the reasoning for the change in this PR, let me do that now for everyone. The main issue at the moment is that showing the label on every post is acting like a kind of scarlet letter punishment, when the intent was to A) enable folks to filter posts from labeled users, via the hide setting, B) give a way to audit what the labeler is up to, via the warn setting, and C) enable neutral or positive labels, via the warn or show badge setting. Having used it for a bit in prod, I think C is going to likely need some additional thought (perhaps a different kind of custom label), and B is not really that meaningful except perhaps for people running the labeler, which is partly why you're asking for the toggle. Overwhelmingly the effect is to make me view the person in a negative light (scarlet letter effect) which I think may worsen rifts between people. I'm also inclined to examine how we're showing the warnings on profiles. Rather than directly showing the label, I wonder if we ought to adopt a slightly more neutral phrasing (something like "1 label by X") which you then see in a modal when you tap to open. My general sense is that de-escalation is almost always better, and that often happens by reducing attention focused on things. It's why I think our approach to blocks has been the right call. I'll hit the quote-post issue in a followup PR, and possibly also the account rendering of the label, and think about the toggle. |
Gotcha, this I did not know, it's useful info to have as far as the intended use case of the settings.
This would be very useful as I know a couple labeler services are experimenting with value-neutral, informational labels (like "verified" tags) and it'd be good to be able to have one that isn't intended solely for content attrition
Very true, this has been somewhat upsetting to see, as it was an unintended use case on the labeler's end as well, even for a moderation-focused service. Ideally these were purely 'caution' signs for someone exhibiting a behavior pattern, intended to be appealed once that behavior stops, but that doesn't seem to be how folks are using our systems currently. I will advise our operator team on maybe putting out some documentation on this.
On this one I would offer some pushback if only because when neutral or positive affect labels are assigned (eventually), it's an additional step to finding said info and, depending on the use case, people may see "1 label by [x]" and if it's primarily a moderation focused labeler, may misattribute that to effectively replicate the unintended Scarlet Letter effect you mentioned (i.e. "there's A Label on this account and therefore it must be a bad one")
Wholeheartedly agree, the complete cauterization of info flow is very useful at deterring that sort of unpleasant behavior and it's one of the best changes you've implemented over other iterations of the feature |
What have we wrought. We're going to need to tweak this PR to continue supporting that. Some scattered thoughts.
I'm going to play with it a bit.
Yeah I really don't know if we can make that usecase work because of the realities of how it comes off. I think labeling an account entirely is useful for the "hide" behavior but showing any kind of warning on the posts is just a lot. I was debating a small
Yes I agree, and I'm inclined to make the neutral or positive ones show fully and only negative ones go behind a grouping. I might special case some of our official labels though -- impersonation for instance is a pretty important badge to be showing. I think there are very useful and fun things that can be done with neutral & positive labels. The "verified" badge is a bit... hm. It might work if we indicate who did it, and I quite like delegating the power of verification. The question is whether it can go systemically awry, but for labeling I have a strong philosophy of allowing some risk of downside in order to give room to the potential for upside. |
… the source of a label
Modified the logic of when a label shows. This is the comment in the code:
Hey Paul, I think there's actually some really good value to the "scarlet letter" effect. This kinda behavior is exactly what negative labels are meant to prevent, someone jumping in a thread and being overtly problematic and is the entire point of the account. This change would also seemingly make lists more powerful than labels since lists would still have the scarlet letter effect to them. Yes, some labels (and lists) are negative. And they probably should be. It's a warning sign to people that would be affected by the behavior in question. Additionally, even with positive or neutral labels, labels still seem to have a scarlet letter effect: https://bsky.app/profile/erinbiba.bsky.social/post/3kqygolvjks27 |
Hm! Challenging issue. I personally believe in de-escalation and separation in the context of social networks. The way I see it, social has a natural draw towards conflict. To fight that, I think we have to err toward de-escalation. My concern is that the utility of the scarlet letter is less than its potential to induce conflict. It's extremely embarrassing to be tagged like that, and I don't want to create an atmosphere of aggregate grievance. My ideal would be: when somebody crosses the line with a community of users, they get removed from the space and that's the end of it. My fear is: we scarlet letter a bunch of people and their grievance grows into a backlash. To some degree I think we may need to answer this by giving more control to the labelers, so they can make these calls, but I also don't want to pass the buck by just pushing it onto others. I figure I should at least weigh in as a participant before I default to "yall decide." If the full answer really is "give this power to the labelers to decide," then my immediate question is -- should we leave it until the labelers have that power (which will take more dev work, so it'll be a bit) or should we enact this change in the interim? |
This is gonna be a doozy of a writeup but I have some thoughts on this mechanically. I think there's a lot of good points here, but if I can be frank, I think a lot of this is through the lens of a broader top-down approach when I believe that looking at this from an end user's angle is possibly more beneficial to understanding how this dynamic evolved and how the divergence in use cases emerged. Let's say, for the sake of argument, you are a Turbo Dude. Yes I'm using your own title, but bear with me. Let's say Turbo Dudes are often brigaded by extremely vitriolic and hateful people, solely because they are Dudes who are Turbo. This is an undesirable interaction for you as a user, so you seek out ways to mitigate your exposure or contact with, well, let's call them Dudephobes. A labeler is developed by another Turbo Dude that says it will place a Dudephobia warning on these sorts of people ahead of time so that you might see that much less, provided they do their job well. You subscribe to said labeler, on the default settings (Warn) and so now, whenever you wander into a thread, there's now visible warning signs on users who are Dudephobes. You have a couple of options here as a user. You can:
Let's sit with that last point because I think it also paints a salient picture of why, for instance, the most popular labeler services at present are designed to remove content from a user's timeline, rather than categorize it on a more neutral basis: For a certain subset of users, typically marginalized users, they are not seeking de-escalation, per se. They're not looking for a fight, either, though they'll definitely put one up if backed into a corner. They want a walled garden. The Posting Garden of Eden. It's why browser extensions like Shinigami Eyes and Soupcan exist and are very popular with this subset - people do not want certain folks on their feeds, and will actively seek out methods to identify and excise them, and this is the exact demographic which moderation labelers like Aegis, Taurus, etc. serve and, in their hopes, protect. It's also one of the things myself and other operators of the Aegis system have ruminated on in the past month - if the moderation provided at the site level was sufficient to meet this need for a safe space online, our service would not be one of the most popular labelers at this point in time, nor would any moderation labeler. In other words, users are demonstrating that they want MORE controls and measures to lock down their experience, rather than open it up. This isn't just in our cluster, either, the system is utilized by everyone from newskies to 100K+ follower accounts, as our reports and posts showing off the labels in action have shown. This also explains why the Scarlet Letter effect kicks in, or rather why it seems to escalate tensions. If you consider the user profile of someone who, for instance, gets an account level label of Dudephobe applied, this typically happens not because they were sitting in their own cluster, minding their own business - labelers with human driven operation don't often "see" content like this outside of their scope. 
They tend to get these labels because they demonstrate a repeat pattern of engaging in unwanted ways with communities who do not want to deal with them - thus the labels, for some users, are this:
This is sort of the same dynamic that emerged with moderation lists, but with the ability to apply them with more precision (i.e. labels can go on the post level, lists can't): a few trusted services will emerge, reach wide adoption, at which point inclusion in that label set is, yes, an effective Excommunicado from a community. I also personally remember similar pushback with moderation lists when they first rolled out in May. But now I'm a member of over 250 moderation lists, and it doesn't quite faze me as much as it did way back when. So I guess my question, then, is this: What, specifically, is the envisioned distinction between labels and moderation lists when it comes to their application to accounts? Because if we think of them mechanically, they're identical right now at the account level - they apply to and mask everything a user posts, are indicated on the profile if someone is using the labeler or list, and are invisible to the user themselves otherwise, save for third party services or subscribing to the list/labeler yourself. They both lack the precision of a post level label, and both carry the weight, when combined with trust and adoption, of an effective scarlet letter, regardless of the labeler's intent (for the record, ours is one focused on disincentivizing that behavior by leveraging the label's perceived social power against the offending user - we often do appeal these when it's evident that the tagged user hasn't reoffended or expresses an intent to make amends in their appeal). So what's causing the increase in perceived pushback? Perhaps looking at similar user sentiments in past PRs for mod lists back at their initial rollout could give us some perspective on the conundrum we're facing now. Maybe it's the misconception that labeler service labels are "official" or direct from Bluesky - in that circumstance, the suggestion to include the attribution of which labeler applied it in the UI itself might be the way to go.
Or maybe it's just growing pains. Who knows. Anyways, I rambled on quite a bit but I figured an extended peek under the hood and end user perspective might be beneficial for having a more holistic view of the situation while you weigh what path you wanna pursue.
My quick takes on the proposal, not replying to any of the other discussion:
Big picture, I'm not a hard-block on the proposal. I know Paul has wrangled with this deeply and trust his direction on this. I do think that we are still in the "how will folks use labels" phase; I think we definitely get to iterate on the semantics/behaviors/capabilities at least once in response to real-world use, but don't want to churn on it too much? Maybe this is novel enough to warrant a bunch of iteration though. 🤔
Yeah, as it currently works, profile-level labels only show on the profile and not on posts. Account-level labels show on all posts and the profile.
Ideally, playing off this last point, one could easily see a solution in noting that a label is applied at the account level, since that would be easy to detect programmatically. It would be pretty messy to do something like that on the Ozone side, where labeler operators would basically have to duplicate their entire label set... Could keep the same format too
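To make the profile-level versus account-level distinction above concrete, here is a minimal sketch. The `LabelScope` and `Surface` types and the `labelShowsOn` function are hypothetical names made up for illustration, not the app's actual moderation code, and the sketch only encodes the two rules stated in this thread.

```typescript
// Hypothetical types for illustration only; not taken from the app's codebase.
type LabelScope = 'account' | 'profile'
type Surface = 'profile' | 'post'

// Encodes the behavior described in the thread:
// - profile-level labels show on the profile only
// - account-level labels show on the profile and on every post
function labelShowsOn(scope: LabelScope, surface: Surface): boolean {
  if (scope === 'account') {
    return true
  }
  // scope === 'profile'
  return surface === 'profile'
}
```

Because the scope is an explicit input here, a UI could in principle render account-level labels differently without labelers duplicating their label set, though that is only one possible reading of the suggestion.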
@EstrogenEmpress and @aetaric I appreciate yall sharing your thoughts about this. My beliefs about de-escalation are real -- but I'm not at all certain that showing the alert badges on posts is a net negative. I'm not the one experiencing the outcomes or living the same stakes. If you tell me showing the alert badges is a net positive, that's the feedback I'm looking for and I'm going to follow it! And if you end up changing your mind, I want to hear about that too! I think for sure it's worth seeing how the other changes in this PR affect things first. |
An idea I wanted to float here: what if moderation services' labels included a tiny icon of the moderation service's profile pic at the end of the bubble? You don't have to worry about impersonation since the user only sees labels for the services they've opted into anyway.
I like this idea too and Paul seemed open to considering it: |
* origin/main: (392 commits) Remove old onboarding (#4224) Replace getAgent() with reading agent (#4243) Bump 1.85.0 (#4237) bump iOS target to `14.0` (#4238) set `onEndReachedThreshold` to `2` for notifications (#4235) Run intl extract (#4217) Updated Japanese translation (#4144) Updated Chinese translation (#4147) Update Korean localization (#4148) Update catalan messages.po (#4149) Update Indonesian translation (#4165) [🐴] update convo list from message bus (#4189) Recover from initial failed firehose state (#4211) Move ALT indicator right and shrink it a bit (#4213) Make sure failed messages enter error state (#4210) [🐴] Don't submit the message on return press when on a phone (web input) (#4203) Include feedContext in DOM as data- (#4206) Improve moderation behaviors: show alert/inform sources and improve UX around threads (#3677) Privileged app passwords (#4200) [🐴] Overfetch follow for default new dialog state (#4205) ...
Don't show account-level severity=alert blurs=none labels on the posts of users unless it's from the app's baked in moderation
After some discussion (mainly in the replies here) we decided not to move forward with this.
The code for this isn't exactly incredible so I'm open to discussions.
Tune the rendering of blurs=none labels
They now show who applied the label:
Group together blurred items in threads
Threads now move blurred items down to the bottom and put them behind a single blur.
If all of the items are blurred by account muting, the blur says "Show muted replies" and will show all of them with one click. If there are other reasons (eg labels) then the blur says "Show hidden replies" and will then show each item under their respective blurs.
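The grouping rule described above can be sketched roughly as follows. This is an illustrative sketch only: `ThreadReply`, `BlurCause`, `GroupedThread`, and `groupBlurredReplies` are hypothetical names invented here, and the PR's real implementation may be structured quite differently.

```typescript
// Hypothetical data model for illustration; not the app's real types.
type BlurCause = 'muted' | 'label' | 'other'

interface ThreadReply {
  uri: string
  // An empty array means the reply is not blurred.
  blurCauses: BlurCause[]
}

interface GroupedThread {
  // Unblurred replies, kept in their original order.
  visible: ThreadReply[]
  // Blurred replies, moved to the bottom behind a single blur.
  hidden: ThreadReply[]
  // Text on the single blur, or null when nothing is hidden.
  blurText: 'Show muted replies' | 'Show hidden replies' | null
  // True when one click reveals every hidden reply; false when each
  // reply stays behind its own blur after the group is opened.
  revealAllAtOnce: boolean
}

function groupBlurredReplies(replies: ThreadReply[]): GroupedThread {
  const visible = replies.filter(r => r.blurCauses.length === 0)
  const hidden = replies.filter(r => r.blurCauses.length > 0)
  if (hidden.length === 0) {
    return {visible, hidden, blurText: null, revealAllAtOnce: false}
  }
  // "Blurred by account muting" is read here as: muting is the only cause.
  const allMuted = hidden.every(r => r.blurCauses.every(c => c === 'muted'))
  return {
    visible,
    hidden,
    blurText: allMuted ? 'Show muted replies' : 'Show hidden replies',
    revealAllAtOnce: allMuted,
  }
}
```

Treating a reply that is both muted and labeled as "other reasons" is an assumption of this sketch; the description above does not say how mixed causes are handled.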
To test
Start the dev-env. Log in as bob.test.