Could this be used to compare audio similarity? #12

Open
youssefabdelm opened this issue Nov 5, 2022 · 1 comment
Labels
question Further information is requested

Comments

@youssefabdelm

❓ Questions

I'm curious how to extract embeddings. Are they the output of the compress function / command-line tool, and could they be used to compare, via cosine similarity, how similar two audio files are?

@youssefabdelm youssefabdelm added the question Further information is requested label Nov 5, 2022
@adefossez
Contributor

Good question; we actually haven't tried. We definitely believe that the model "collapses" similar audio onto the same representation, which eliminates some of the variability that can occur between two similar audio signals (e.g. phase differences, white-noise components). Note, however, that we have good reasons to believe the representation is mostly at the acoustic level. Audio that is only semantically similar (e.g. two pieces of music in the same genre, or two people talking about the same topic) therefore wouldn't be close in the latent space.
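
For anyone who wants to try the comparison asked about above, here is a minimal sketch of the cosine-similarity step only. It assumes you have already obtained one latent vector per frame for each file; how to get those from the model is not covered in this thread, so the `clip_a` and `clip_b` values below are made-up placeholders, and `mean_pool` and `cosine_similarity` are hypothetical helper names rather than part of this project. Mean pooling over time is just one simple way to compare clips of different lengths, not something the maintainers have validated.

```python
import math
from typing import List, Sequence


def mean_pool(frames: Sequence[Sequence[float]]) -> List[float]:
    """Average a [num_frames][dim] latent sequence over time into one [dim] vector."""
    if not frames:
        raise ValueError("need at least one frame")
    dim = len(frames[0])
    return [sum(frame[d] for frame in frames) / len(frames) for d in range(dim)]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors, in [-1, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Placeholder latents (ASSUMPTION: stand-ins for per-frame model outputs).
# The two clips have different numbers of frames but the same latent dimension.
clip_a = [[1.0, 0.0, 2.0], [3.0, 0.0, 2.0]]
clip_b = [[2.0, 0.0, 2.0], [2.0, 0.0, 2.0], [2.0, 0.0, 2.0]]

score = cosine_similarity(mean_pool(clip_a), mean_pool(clip_b))
print(f"cosine similarity: {score:.4f}")
```

Given the comment above, a high score here would at most suggest acoustic closeness; it should not be read as the two files sharing a genre or a topic.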
