feat: Image search example #5
I like the idea of providing an image search example with a tiny vector search implementation directly in C++, but I highlighted some points.
Also, let's not use tab characters for indents please.
general code clean up. but the rest of the repo is still in a similar state 😝
Things may be messy at early stages to quickly introduce new features, but I'm trying to tidy up the code and README as we make progress in the project. You can always suggest better structures for the code and the overall project.
Ohhhhh yea. sorry. forgot to reverse this 😆
yea, I also intentionally marked this PR as draft for those reasons. :)
Great, thank you. Your work is really appreciated. And to be honest, I was not aware of USearch; I was considering the original hnswlib library for the search example instead. But USearch looks like a better candidate.
forcepushed with
Do you need an extra hand in this?
Was doing something completely different last week :)
(edit: why did you have to invert the cmake options <.<)
just rebased and changed some stuff you committed on main. issues that I found:
Yes, I'm ok with it as is, and then we can continue to improve it. When #30 is merged, we can also use batch inference when indexing images.
yea, there have also been changes upstream (especially llama), where you obviously
Yea. Let me run some small tests and then I will make it ready for review.
Exactly. ggml-based projects as a whole are moving very fast, so I believe we don't need to target 100% production-grade multi-platform support. Speed of iteration and innovation is the priority for now.
Great. Thank you.
USearch reports distance values instead of similarities, so I made changes to emphasize it. LGTM, merging. TODOs can be fixed later. Thanks for your contributions.
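To make the distance-versus-similarity point concrete, here is a minimal sketch in plain C++. It assumes a cosine metric where distance is defined as 1 minus similarity; the thread does not say which metric the example configures, and the function names are made up for illustration, not taken from USearch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Cosine similarity in [-1, 1]: HIGHER means the vectors are more alike.
float cosine_similarity(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float dot = 0.f, norm_a = 0.f, norm_b = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    // Zero vectors have no direction; returning 0 is a convention of this sketch.
    if (norm_a == 0.f || norm_b == 0.f) return 0.f;
    return dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
}

// Cosine distance in [0, 2] (assumed definition: 1 - similarity).
// LOWER means the vectors are more alike, so results sort ascending.
float cosine_distance(const std::vector<float>& a, const std::vector<float>& b) {
    return 1.f - cosine_similarity(a, b);
}
```

With this convention the best match has the smallest value, which is why printing the number as a "similarity" would be misleading.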
I used the usearch vector database library to build an image embedding database and then to search in it by similarity using a text string.
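The idea can be sketched without any library as a brute-force index: one embedding per image, a parallel list of paths, and a search that ranks by distance to the query embedding. `TinyImageIndex` and its members are hypothetical names for illustration only. The PR itself uses usearch, and the embeddings would come from the model rather than being written by hand.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for an index plus its list of image paths.
struct TinyImageIndex {
    std::vector<std::vector<float>> embeddings; // one vector per image
    std::vector<std::string> paths;             // paths[i] belongs to embeddings[i]

    // Store an embedding; the returned key is its position in both arrays.
    std::size_t add(std::vector<float> embedding, std::string path) {
        embeddings.push_back(std::move(embedding));
        paths.push_back(std::move(path));
        return embeddings.size() - 1;
    }

    // Cosine distance (1 - cosine similarity); lower means closer.
    static float distance(const std::vector<float>& a, const std::vector<float>& b) {
        assert(a.size() == b.size());
        float dot = 0.f, na = 0.f, nb = 0.f;
        for (std::size_t i = 0; i < a.size(); ++i) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        if (na == 0.f || nb == 0.f) return 1.f; // convention of this sketch
        return 1.f - dot / (std::sqrt(na) * std::sqrt(nb));
    }

    // Return up to k (distance, path) pairs, closest first.
    // `query` would be the text embedding in a text-to-image search.
    std::vector<std::pair<float, std::string>> search(const std::vector<float>& query,
                                                      std::size_t k) const {
        std::vector<std::pair<float, std::string>> hits;
        for (std::size_t i = 0; i < embeddings.size(); ++i) {
            hits.emplace_back(distance(query, embeddings[i]), paths[i]);
        }
        k = std::min(k, hits.size());
        std::partial_sort(hits.begin(), hits.begin() + k, hits.end());
        hits.resize(k);
        return hits;
    }
};
```

A real index avoids the linear scan, but the key-to-path mapping works the same way: the index only knows numeric keys, so a separate list is needed to get back to file names.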
build the database:
it generates an `images.usearch` and an `images.path` file.
search by similarity to string:
It is a very rough implementation. It requires C++17 and `std::filesystem` for iterating the filesystem.
A similarity search via an image would be nice too.
basically we could copy all the features from here: https://github.com/yurijmikhalevich/rclip
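For the `std::filesystem` part mentioned above, a minimal C++17 sketch of collecting image files recursively might look like this. The extension list, both function names, and the one-path-per-line output are assumptions for illustration; the actual layout of `images.path` is not shown in this thread.

```cpp
#include <algorithm>
#include <cctype>
#include <filesystem>
#include <fstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Recursively collect regular files under `root` with an image-like extension.
// The accepted extensions are an assumption of this sketch.
std::vector<fs::path> find_images(const fs::path& root) {
    std::vector<fs::path> result;
    if (!fs::is_directory(root)) return result;
    for (const auto& entry : fs::recursive_directory_iterator(
             root, fs::directory_options::skip_permission_denied)) {
        if (!entry.is_regular_file()) continue;
        std::string ext = entry.path().extension().string();
        // Lower-case the extension so ".JPG" and ".jpg" are treated alike.
        std::transform(ext.begin(), ext.end(), ext.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });
        if (ext == ".jpg" || ext == ".jpeg" || ext == ".png") {
            result.push_back(entry.path());
        }
    }
    // Directory iteration order is unspecified, so sort for stable index keys.
    std::sort(result.begin(), result.end());
    return result;
}

// Hypothetical helper: write one path per line so that line i matches key i.
bool write_path_list(const std::vector<fs::path>& paths, const fs::path& out_file) {
    std::ofstream out(out_file);
    if (!out) return false;
    for (const auto& p : paths) out << p.string() << '\n';
    return static_cast<bool>(out);
}
```

Note that directory iteration can still throw `fs::filesystem_error` on other I/O failures, so a less rough version would probably catch that or use the `std::error_code` overloads.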
TODO: