
External API for sensitive content detection #422

Open · 2 tasks
obulat opened this issue Feb 18, 2023 · 0 comments
Labels
✨ goal: improvement Improvement to an existing user-facing feature 🧭 project: thread An issue used to track a project and its progress 🧱 stack: api Related to the Django API

Comments

obulat (Contributor) commented Feb 18, 2023

Summary

Several external APIs can apply semantic labels to images and detect whether an image contains sensitive content. Both capabilities matter for search relevancy and content safety.

This project will need to select the optimal service and determine, on performance grounds, whether to run detection "on the fly" or through a job queue.
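As a concrete example of the kind of output these services produce, the sketch below queries Amazon Rekognition's moderation-label endpoint via boto3. Rekognition is only one candidate (no service has been selected for this project), and the bucket and object names are placeholders.

```python
# Minimal sketch of querying one candidate service (Amazon Rekognition)
# for moderation labels. The bucket and key are placeholders; no service
# has been chosen for this project yet.
import boto3

rekognition = boto3.client("rekognition")

response = rekognition.detect_moderation_labels(
    Image={"S3Object": {"Bucket": "example-bucket", "Name": "image.jpg"}},
    MinConfidence=50,  # drop labels the service is less than 50% sure about
)

# Each label carries a name (e.g. "Suggestive") and a 0-100 confidence
# score, which is the signal any auto-moderation threshold would use.
for label in response["ModerationLabels"]:
    print(label["Name"], label["Confidence"])
```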

Description

The project would have three basic parts:

  1. Working through a queue of works that need scanning (a possible shape for this queue is sketched below)

  2. Auto-moderating works based on high-confidence scan results and integrating others into the overall moderation queue (whether as part of the same queue as user reports or a secondary one)

  3. Indicating on a work when its sensitivity designation is a result of auto or manual moderation based on machine labels

The last one also likely increases the need for a "moderation challenge" queue, so that auto-moderated works in particular have an easy avenue for users to challenge the moderation result.
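For the queue in part 1, one possible shape is a small table that scan workers drain in order. This is purely an illustrative sketch: the model, field, and function names are hypothetical and do not exist in the Openverse codebase.

```python
# Hypothetical sketch of the to-be-scanned queue as a Django model.
# All names here are invented for illustration.
from django.db import models


class SensitivityScanJob(models.Model):
    """Tracks a work that still needs to be sent to the labelling service."""

    class Status(models.TextChoices):
        PENDING = "pending"
        SCANNED = "scanned"
        FAILED = "failed"

    identifier = models.UUIDField(db_index=True)  # the work's identifier
    status = models.CharField(
        max_length=10, choices=Status.choices, default=Status.PENDING
    )
    attempts = models.PositiveSmallIntegerField(default=0)
    created_on = models.DateTimeField(auto_now_add=True)


def next_batch(size=100):
    """Claim the next batch of pending works, oldest first."""
    return SensitivityScanJob.objects.filter(
        status=SensitivityScanJob.Status.PENDING
    ).order_by("created_on")[:size]
```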

Best guess at the list of implementation plans:

  • Groundwork: Investigate the performance characteristics of various approaches to building and working through the queue; propose the most-likely-to-succeed version

  • Groundwork: Choose a tool to use for machine labelling and identify the confidence characteristics of its output; make a proposal for auto-moderation, if it can happen at all in both directions (i.e., how confident the output is that a work is not sensitive or that it is sensitive: both are essentially auto-moderation in one direction or the other; see the sketch after this list)

  • Groundwork: Work with moderators to decide whether to integrate machine labelled works into the moderation queue and how to prioritise review

  • Groundwork: Propose an approach for communicating machine-labelling based moderation decisions on works designated as sensitive

  • IP: The to-be-labelled queue and integration with the labelling tool

  • IP: Surface results on the API

  • IP: Integrate results into the moderation queue

  • IP: Frontend presentation of machine-labelling based auto-moderation
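To make the two-sided auto-moderation in the second groundwork item concrete, here is an illustrative triage function. The thresholds are invented placeholders; real cut-offs would come out of the confidence investigation, and anything between them falls through to human review.

```python
# Illustrative only: the two-sided decision described above, with
# made-up thresholds pending real data on the chosen labelling tool.
SENSITIVE_THRESHOLD = 0.95      # auto-mark as sensitive at or above this
NOT_SENSITIVE_THRESHOLD = 0.05  # auto-clear at or below this


def triage(sensitivity_score: float) -> str:
    """Decide what to do with one scan result (score in [0, 1]).

    Scores near either extreme are auto-moderated in the corresponding
    direction; everything in between goes to the human moderation queue.
    """
    if sensitivity_score >= SENSITIVE_THRESHOLD:
        return "auto_sensitive"
    if sensitivity_score <= NOT_SENSITIVE_THRESHOLD:
        return "auto_clear"
    return "human_review"
```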

Documents

  • Project Proposal
  • Implementation Plan

Issues

Prior Art
