Move to a more forgiving distance at safety checker #2

Open: shauray8 wants to merge 3 commits into base: main

shauray8 commented Apr 10, 2024

This moves the safety checker to Jaccard distance to make it more forgiving with SFW prompts/images.

Compared to cosine distance, Jaccard distance is considered more forgiving because it only considers the presence or absence of features. This is still not perfect for what we want, as the safety model itself is built on a lot of shaky structure: mainly a vector compared against a 17-element vector, which is basically:

nsfw:
  concepts:
    sexual: 0.2
    nude: 0.20
    sex: 0.206
    18+: 0.21
    naked: 0.195
    nsfw: 0.2
    porn: 0.2
    dick: 0.19
    vagina: 0.19
    naked child: 0.22
    explicit content: 0.19
    uncensored: 0.2
    fuck: 0.2
    nipples: 0.2
    visible nipples: 0.21
    naked breasts: 0.214
    areola: 0.2

This list is not very extensive in itself and does not include terms like killing or blood, and the checker is basically a CLIPVisionModel underneath.
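To make the comparison concrete, here is a minimal sketch of how per-concept thresholds like the ones listed above could be applied. It assumes each concept has an embedding vector and that the image embedding is scored against it with cosine similarity; the function names, the toy two-dimensional embeddings, and the exact comparison rule are hypothetical illustrations, not the checker's actual code.

```python
import math

# Hypothetical subset of the per-concept thresholds from the config above.
THRESHOLDS = {"sexual": 0.2, "nude": 0.20, "sex": 0.206}


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def flagged_concepts(image_embedding, concept_embeddings, thresholds):
    """Return the concept names whose similarity exceeds their threshold.

    Assumption: a concept is flagged when its score is strictly greater
    than the configured threshold.
    """
    flagged = []
    for name, concept_vec in concept_embeddings.items():
        score = cosine_similarity(image_embedding, concept_vec)
        if score > thresholds[name]:
            flagged.append(name)
    return flagged


# Toy embeddings, purely illustrative.
concepts = {"sexual": [1.0, 0.0], "nude": [0.0, 1.0]}
print(flagged_concepts([1.0, 0.0], concepts, THRESHOLDS))
```

With thresholds this low (around 0.2), even a weak similarity to any single concept is enough to flag an image, which is one way to read the false positives on SFW inputs.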

Here [experimental, might work, might not] the Jaccard distance is calculated with some rough estimations that just happened to work for the simple set of data I had.

I also think cosine distance is more influenced by differences in the magnitudes of individual components (such as term frequencies), so this patch makes a small change to the similarity measure. It is not meant for the long term, though!

For the wanderers

This is what the Jaccard distance logic looks like:
J(A, B) = 1 - (|A ∩ B|) / (|A ∪ B|)
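The formula can be sketched in a few lines of Python. The `active_features` helper and its `cutoff` parameter are hypothetical: they show one possible way to turn a real-valued vector into a set of "present" features, and are not necessarily the estimation used in this patch.

```python
def jaccard_distance(a, b):
    """Jaccard distance 1 - |A ∩ B| / |A ∪ B| between two collections."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0  # convention: two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def active_features(vector, cutoff):
    """Indices whose value exceeds `cutoff` (a hypothetical binarization)."""
    return {i for i, value in enumerate(vector) if value > cutoff}


print(jaccard_distance({1, 2, 3}, {2, 3, 4}))  # 2 shared of 4 total -> 0.5

# Binarize two score vectors, then compare which features are present.
left = active_features([0.3, 0.1, 0.25], cutoff=0.2)
right = active_features([0.3, 0.3, 0.0], cutoff=0.2)
print(jaccard_distance(left, right))
```

Because only membership matters after binarization, two vectors that activate the same features have distance 0 regardless of how strongly each feature fires, and the result depends heavily on the chosen cutoff.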

shauray8 marked this pull request as ready for review April 10, 2024 20:51

shauray8 (Author) commented

@adhikjoshi

shauray8 self-assigned this Apr 10, 2024

adhikjoshi commented

Understood, looks good
