What is difference? #4591

Closed
yeskent85 opened this issue Jul 24, 2024 · 1 comment
Labels
feature Work on a feature request

Comments

@yeskent85

What's the difference between fiftyone.brain.similarity.find_unique() and fiftyone.brain.compute_uniqueness()?
The outputs of the two functions are slightly different, even when both use the same embeddings.

yeskent85 added the "feature" label on Jul 24, 2024
@swheaton
Contributor

The difference is subtle.
compute_uniqueness() adds a uniqueness score to each sample, representing how unique it is with respect to the rest of the samples. Taking the top N scores would give you the N samples that are most unique compared to the dataset as a whole.

find_unique() selects count samples (count being the number you ask for) that are maximally unique with respect to each other.

To explain with the help of a degenerate case: let's say we have 1000 images of dogs and 2 duplicate images of a tree. Those 2 tree images will probably have very high uniqueness scores, because they're very different from the 1000 dog images. But if you call find_unique(), you would not expect to get both tree images back, because they are not unique with respect to each other.
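The dogs-and-trees case can be reproduced with a small toy sketch. This is not FiftyOne's implementation: `uniqueness_scores` and `pick_mutually_unique` are hypothetical stand-ins that assume a nearest-neighbor-distance score and a greedy farthest-point selection, and the 2D points are made-up "embeddings". It only illustrates why the two ideas can disagree.

```python
import math

def uniqueness_scores(points, k=3):
    """Toy per-sample score: mean distance to the k nearest neighbors,
    scaled so the highest score is 1.0 (an assumption for illustration)."""
    raw = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        raw.append(sum(dists[:k]) / k)
    top = max(raw)
    return [r / top for r in raw]

def pick_mutually_unique(points, count):
    """Toy selection: greedy farthest-point sampling, so every new pick
    is as far as possible from everything already picked."""
    chosen = [0]  # arbitrary starting sample
    while len(chosen) < count:
        candidates = [i for i in range(len(points)) if i not in chosen]
        best = max(
            candidates,
            key=lambda i: min(math.dist(points[i], points[c]) for c in chosen),
        )
        chosen.append(best)
    return chosen

# 20 tightly clustered "dog" embeddings and 2 identical "tree" embeddings
dogs = [(0.1 * i, 0.1 * j) for i in range(5) for j in range(4)]
trees = [(10.0, 10.0), (10.0, 10.0)]
points = dogs + trees
tree_ids = {len(dogs), len(dogs) + 1}

scores = uniqueness_scores(points)
top_two = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)[:2]
picked = pick_mutually_unique(points, count=3)

print("top-2 by score:", top_two)          # both tree duplicates
print("mutually unique picks:", picked)    # only one of the tree duplicates
```

Ranking by score returns both tree duplicates, since each is far from every dog. The mutual-uniqueness selection keeps at most one of them, because the second tree sits at distance zero from the first.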
