Add RAG? #1
I just came across `llamazing` and it seems very nicely done. I have been working on adapting a different `ollama` front-end to support my concept for RAG and am wondering if I should switch to `llamazing`. I am more of a back-end developer, though I have done some work with React in the past. More recently, I've done some work with Svelte, which I think I like better, but I could consider switching back to React to use this code.

But before I do any of that, I am wondering if you have any thoughts on extending this project to support RAG? We could start by just defining an interface to hook into the chat request/response. Something like:
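A rough sketch; the names here (`ChatHooks`, `onRequest`, `onResponse`) are placeholders, not existing LLaMazing or Ollama APIs:

```ts
// Placeholder types mirroring the shape of Ollama's chat API.
interface Message {
  role: "system" | "user" | "assistant";
  content: string;
}

interface ChatRequest {
  model: string;
  messages: Message[];
}

// A RAG extension would implement this and be invoked around each chat call.
interface ChatHooks {
  // May rewrite the request (e.g. filter or augment messages) before it is sent.
  onRequest(request: ChatRequest): Promise<ChatRequest>;
  // Observes the assistant's reply after it arrives.
  onResponse(request: ChatRequest, response: Message): Promise<void>;
}
```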
The first RAG implementation would just use the hooks to write each new user message and assistant message to the vector store.
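To make that concrete: a minimal sketch, assuming the placeholder `ChatHooks` interface above, a `VectorStore` stand-in for whatever store is chosen (Weaviate, an in-memory index, etc.), and an `embed` function that could call something like Ollama's `/api/embeddings` endpoint:

```ts
interface StoredChunk {
  text: string;
  vector: number[];
}

// Minimal stand-in for whatever vector store is chosen.
interface VectorStore {
  add(chunk: StoredChunk): Promise<void>;
  search(query: number[], k: number): Promise<StoredChunk[]>;
}

// Step 1: hooks that only record the conversation; the request passes through.
class RecordingHooks implements ChatHooks {
  constructor(
    private store: VectorStore,
    private embed: (text: string) => Promise<number[]>,
  ) {}

  async onRequest(request: ChatRequest): Promise<ChatRequest> {
    const last = request.messages[request.messages.length - 1];
    if (last?.role === "user") {
      await this.store.add({ text: last.content, vector: await this.embed(last.content) });
    }
    return request; // unchanged at this stage
  }

  async onResponse(_request: ChatRequest, response: Message): Promise<void> {
    await this.store.add({ text: response.content, vector: await this.embed(response.content) });
  }
}
```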
Next, we modify the request by doing a semantic search of the vector store and filtering the `ChatRequest` `messages[]` array to include only the top 3 most semantically relevant request/response pairs. This would make it easy to play with and see that the filtering is working. The idea is that if you carry out a short conversation on one topic, then switch to a digression topic, then switch back to the original topic, the digression should be omitted.
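A sketch of that filtering step under the same assumptions; for simplicity it scores the pairs already in the request directly (rather than round-tripping through the store) and assumes the history alternates strictly user/assistant:

```ts
// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb) || 1);
}

// Keep the system prompt, the 3 most relevant prior (user, assistant) pairs
// in their original order, and the new user message.
async function filterHistory(
  req: ChatRequest,
  embed: (text: string) => Promise<number[]>,
): Promise<ChatRequest> {
  const current = req.messages[req.messages.length - 1]; // the new user message
  const system = req.messages.filter((m) => m.role === "system");
  const history = req.messages.slice(0, -1).filter((m) => m.role !== "system");

  // Group the history into request/response pairs.
  const pairs: Message[][] = [];
  for (let i = 0; i + 1 < history.length; i += 2) pairs.push([history[i], history[i + 1]]);

  const query = await embed(current.content);
  const scored = await Promise.all(
    pairs.map(async (pair, index) => ({
      pair,
      index,
      score: cosine(query, await embed(pair.map((m) => m.content).join("\n"))),
    })),
  );

  const top = scored
    .sort((a, b) => b.score - a.score)
    .slice(0, 3)
    .sort((a, b) => a.index - b.index); // restore chronological order

  return { ...req, messages: [...system, ...top.flatMap((s) => s.pair), current] };
}
```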
Then we would need a way to ingest documents, and change the filtering to include passages from documents.
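Ingestion could start as naively as fixed-size chunks written into the same `VectorStore` the hooks use (a sketch only; real chunking would want overlap and sentence or paragraph boundaries):

```ts
// Naive document ingestion: fixed-size character chunks, one vector each.
async function ingestDocument(
  text: string,
  store: VectorStore,
  embed: (t: string) => Promise<number[]>,
  chunkSize = 1000,
): Promise<void> {
  for (let i = 0; i < text.length; i += chunkSize) {
    const chunk = text.slice(i, i + chunkSize);
    await store.add({ text: chunk, vector: await embed(chunk) });
  }
}
```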
Does this interest you?
Comments

Hi Jim. Thank you. I tried to make a simple frontend that I would like to use (while LM Studio is more advanced, I don't much like its looks at the moment). I also prefer Svelte and had to learn React in order to do this app, because there are already some components for it I could use.

I have been thinking as well about adding retrieval-augmented generation, but I have not yet jumped into any implementation. I am inclined to think that this is more of a feature Ollama should support (similar to vision). Otherwise such an extension will require extra stuff to be installed, which is not something users may be accustomed to or willing to do. In my view, ideally all of this would be bundled with Ollama and easily accessible via their API (e.g. send a "files" array in addition to "images", and also have a dedicated ingestion/management API for the files).

For me, at the moment, it's mostly a problem of bundling/distributing/standardizing the app and its requirements rather than of coding. Frankly, I would extend the Ollama code directly, add something like Weaviate to it (which I think is also written in Go), and once I have the Ollama side working I would use it in a simple way in LLaMazing. If this works, other Ollama users can benefit, not just the LLaMazing users.

Anyway, since both LLaMazing and Ollama are open source, you are of course free to try and see where this leads :) I suggest you start by forking the project. In principle, the hook idea is good and flexible. I am definitely interested, but at the moment still ruminating :)

---

Cool. I hadn't considered contributing RAG support to ollama, but perhaps I should. Unfortunately, except for a brief flirtation with Go 11 years ago, I have no experience with it. I'm also motivated right now to do some relatively modest prototype before I take on something as ambitious as becoming an ollama contributor for a significant feature. I'll let you know if I fork this repo and add RAG via the hook interface.