Support for vector stores? #6

Closed
behrica opened this issue Apr 30, 2023 · 6 comments

Comments

@behrica
Contributor

behrica commented Apr 30, 2023

I was reflecting on the minimal tooling we need to work on larger texts with LLMs.
In my view we need 3 things:

  1. an LLM able to do 2 operations: "completing" and "creating embeddings"
  2. splitting of texts
  3. a vector database with at least 2 operations (storing vectors, finding the closest vectors to a given vector)

We have 1) and 2) in bosquet, at least minimally.
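
A minimal sketch of the third item, assuming a plain in-memory store and cosine similarity as the closeness measure. It is written in Python purely for illustration; the names `InMemoryVectorStore`, `add`, and `nearest` are hypothetical and are not part of bosquet or of any vector database client.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class InMemoryVectorStore:
    """Hypothetical store exposing only the two operations listed above."""

    def __init__(self):
        self._items = []  # list of (item_id, vector) pairs

    def add(self, item_id, vector):
        """Operation 1: store a vector under an id."""
        self._items.append((item_id, list(vector)))

    def nearest(self, query, k=1):
        """Operation 2: return the k (id, score) pairs most similar to `query`."""
        scored = [(item_id, cosine_similarity(query, vec)) for item_id, vec in self._items]
        scored.sort(key=lambda pair: pair[1], reverse=True)
        return scored[:k]


store = InMemoryVectorStore()
store.add("a", [1.0, 0.0])
store.add("b", [0.0, 1.0])
store.add("c", [0.7, 0.7])
print(store.nearest([0.9, 0.1], k=2))
```

A real vector database would replace the linear scan in `nearest` with an index, but the two-operation surface could stay the same.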

I am not sure whether 3) exists in the Clojure world. I believe some vector databases have Java bindings; at least Milvus does:
https://github.com/milvus-io/milvus-sdk-java/blob/master/examples/main/java/io/milvus/GeneralExample.java

Ideally bosquet would support various vector databases, maybe via an abstraction.
Any thoughts?

@behrica
Contributor Author

behrica commented Apr 30, 2023

This one has a Clojure client:
https://github.com/vdaas/vald-client-clj

@zmedelis
Owner

zmedelis commented May 3, 2023

I would probably go with Pinecone as the first simple implementation. It has a REST API: https://docs.pinecone.io/reference/list_indexes/

@behrica
Contributor Author

behrica commented May 3, 2023

OK, it has a free hosted edition, which is good enough for testing.

I will probably try to implement a small use case I have, as described above.

I am not sure whether this needs any addition or change to bosquet.

@zmedelis
Owner

zmedelis commented May 4, 2023

Thanks for suggesting this and let's see what changes it will require.

But there is a more fundamental question. Does this project try to be Langchain for Clojure (replicating the whole plethora of functionality, vector stores and whatnot), or does it find some specific and at least slightly different take on LLM use? Hence the pause in Bosquet development.

@behrica
Contributor Author

behrica commented May 4, 2023

It is of course a good question, to which I have no answer.
I think, in general, the functional approach of Clojure calls for "combining tools" rather than reinventing or duplicating complete solutions.

So far I think "my use case" will not need changes in bosquet.

I personally think we should not replicate an existing Python library, but use it via libpython-clj.

@behrica
Contributor Author

behrica commented May 25, 2023

I think we can close this for now.
I tried some things combining text splitting, vector databases, and the LLMs, and it composed nicely, all just being data.
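
The kind of composition described here can be sketched as a few plain functions over strings, lists, and dicts. This is an illustrative Python sketch, not the code that was actually tried: `split_text`, `toy_embed`, `cosine`, and `search` are hypothetical names, and `toy_embed` is a stand-in word-count embedding used only so the example runs without calling an LLM.

```python
import math
from collections import Counter


def split_text(text, max_words=8):
    """Split text into chunks of at most `max_words` words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text, vocabulary):
    """Stand-in for an LLM embedding call: word counts over a fixed vocabulary."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocabulary]


def cosine(a, b):
    """Cosine similarity, returning 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def search(records, query_vector, k=1):
    """Return the k records whose vectors are most similar to `query_vector`."""
    return sorted(records, key=lambda r: cosine(r["vector"], query_vector), reverse=True)[:k]


document = (
    "clojure is a functional language on the jvm "
    "vector databases store embeddings for similarity search"
)
vocabulary = sorted(set(document.split()))

# Every stage consumes and produces plain data: strings, lists, dicts.
chunks = split_text(document, max_words=8)
records = [{"text": chunk, "vector": toy_embed(chunk, vocabulary)} for chunk in chunks]
best = search(records, toy_embed("similarity search over embeddings", vocabulary), k=1)
print(best[0]["text"])
```

Because each step only passes plain data along, the toy embedding or the linear search could be swapped for a real LLM call or a vector database client without touching the other steps.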

@behrica behrica closed this as completed May 25, 2023
@behrica behrica mentioned this issue Jul 3, 2023