An arguably more objective example: I have this post saved because it was automatically labeled by AI and the label had to be manually removed. It scores 0.995 suggestive because of the resemblance of their tops to a bra. This is just one example of the many ways in which femme-presenting people especially are held to unduly strict limits.
Extending this: for 'female', adding a little bias away from labelling (i.e. artificially raising the threshold, or equivalently lowering the score) will probably go a long way. We have a year of data showing the bias, and even if it's just vibes, there's an argument that a tweak like that will better meet user expectations.
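To make the idea of a bias away from labelling concrete, here is a minimal sketch in which a score has to clear a higher bar for one category. The 0.9 base comes from the issue; the 0.05 offset, the `THRESHOLD_OFFSET` table, and both function names are hypothetical, and the thread does not say how an image would be assigned to the 'female' category.

```python
BASE_THRESHOLD = 0.90  # current threshold quoted in the issue

# Hypothetical per-category offsets; a positive value makes labelling less likely.
THRESHOLD_OFFSET = {"female": 0.05}


def threshold_for(category: str) -> float:
    """Return the base threshold plus any category offset, capped at 1.0."""
    return min(1.0, BASE_THRESHOLD + THRESHOLD_OFFSET.get(category, 0.0))


def should_label(score: float, category: str) -> bool:
    """Label only when the score reaches the category's effective threshold."""
    return score >= threshold_for(category)


# Made-up score for illustration: labelled by default, not labelled with the offset.
print(should_label(0.93, "other"))
print(should_label(0.93, "female"))
```

An offset table like this keeps the tweak explicit and easy to tune or remove, rather than hiding it in the score itself.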
Describe the bug
The auto-moderator is overly aggressive in classifying images as 'suggestive'. With the introduction of self-flagging, misclassifying images and taking the 'higher' rating is leading to more and more false positives, especially for users who generally try to do the right thing (i.e. they will be 'punished' on visibility both when they self-flag AND when they try to post an unflagged image).
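The "taking the 'higher' rating" behaviour described above can be sketched as follows. The label names, their ordering, and the function name are assumptions made for illustration, not the actual moderation code.

```python
# Hypothetical severity ordering; the real label names and ranks may differ.
SEVERITY = {"none": 0, "suggestive": 1, "nudity": 2, "porn": 3}


def effective_label(self_label: str, auto_label: str) -> str:
    """Return whichever of the two labels is more severe."""
    return max(self_label, auto_label, key=SEVERITY.__getitem__)


# The user posts without a flag, but the classifier says 'suggestive':
print(effective_label("none", "suggestive"))
# The user self-flags and the classifier sees nothing:
print(effective_label("suggestive", "none"))
```

Under this rule a classifier false positive always overrides a user's honest "none", which is why misclassification hurts careful users in both cases.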
To Reproduce
Steps to reproduce the behavior:
Compare:
https://bsky.app/profile/bossett.bsky.social/post/3kbjaidth322l (not flagged)
https://bsky.app/profile/bossett.bsky.social/post/3kbj5f3xngj2q (wasn't flagged for >45m - manual review?)
https://bsky.app/profile/frecksandframes.fans/post/3kbgr4r2iwm2q (flagged)
All these images are essentially of the same subject, with minor feature tweaks. The AI is not consistent or reasonable here. These images do not suggest anything.
Expected behavior
None of those images should be flagged.
Long term, please consider this proposal: bluesky-social/proposals#37
A quick fix would be to increase the threshold used for Hive matching from 0.9 to 0.95. Actual suggestive images tend to score 0.99+, so this would reduce the false positive rate without missing them.
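A rough sketch of what the threshold change would do, assuming the label is applied whenever the classifier score meets or exceeds the threshold. The sample scores are made up for illustration, and `is_flagged` is a hypothetical helper, not the real matching code.

```python
def is_flagged(score: float, threshold: float) -> bool:
    """Assumed rule: flag when the score reaches the threshold."""
    return score >= threshold


# Made-up scores: two borderline images and one clearly high-scoring image.
scores = [0.91, 0.93, 0.995]

for threshold in (0.90, 0.95):
    flagged = [s for s in scores if is_flagged(s, threshold)]
    print(f"threshold={threshold}: flagged {flagged}")
```

With these numbers the borderline scores stop being flagged at 0.95, while anything at 0.99+ is still caught. Images that the classifier scores very highly by mistake would need a different fix.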