
Add RAG? #1

Open
jimlloyd opened this issue Feb 7, 2024 · 2 comments

Comments

@jimlloyd

jimlloyd commented Feb 7, 2024

I just came across llamazing and it seems very nicely done. I have been working on adapting a different ollama front-end to support my concept for RAG and am wondering if I should switch to llamazing. I am more of a back-end developer, though I have done some work with React in the past. More recently I've done some work with Svelte, which I think I like better, but I could consider switching back to React to use this code.

But before I do any of that, I am wondering if you have any thoughts for extending this project to support RAG? We could start by just defining an interface to hook into the chat request/response. Something like:

import type {
    ChatRequest,
    ChatResponse,
} from "./interfaces.js";

// A hook gets a chance to transform each outgoing chat request
// and each incoming chat response.
export interface Hook {
    onRequest(request: ChatRequest): Promise<ChatRequest>;
    onResponse(response: ChatResponse): Promise<ChatResponse>;
}

// Identity implementation: passes requests and responses through
// untouched.
export class DefaultHook implements Hook {
    async onRequest(request: ChatRequest): Promise<ChatRequest> {
        return request;
    }

    async onResponse(response: ChatResponse): Promise<ChatResponse> {
        return response;
    }
}

The first RAG implementation would just use the hooks to write each new user message and assistant message to the vector store.
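A minimal sketch of that first step might look like the following, assuming a hypothetical VectorStore with an add() method and Ollama-style message shapes ({ role, content }); the Hook interface is the one proposed above, and the actual store and embedding model are left open:

import type { ChatRequest, ChatResponse } from "./interfaces.js";

// Hypothetical vector-store interface; the real backing store
// (and the embedding model behind it) is still an open question.
interface VectorStore {
    add(text: string): Promise<void>;
}

// Writes each new user message and each assistant reply to the store.
export class StoreHook implements Hook {
    constructor(private store: VectorStore) {}

    async onRequest(request: ChatRequest): Promise<ChatRequest> {
        const last = request.messages[request.messages.length - 1];
        if (last?.role === "user") {
            await this.store.add(last.content);
        }
        return request;
    }

    async onResponse(response: ChatResponse): Promise<ChatResponse> {
        await this.store.add(response.message.content);
        return response;
    }
}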

Next we would modify the request by doing a semantic search of the vector store and filtering the ChatRequest messages[] to include only the top 3 most semantically relevant request/response pairs. This would make it easy to play with and see that the filtering is working. The idea is that if we carry out a short conversation on one topic, then switch to a digression topic, then switch back to the original topic, the digression should be omitted.
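A rough sketch of that filtering step, assuming the hypothetical VectorStore above gains a search(query, k) method returning the k most similar stored texts (grouping hits back into request/response pairs is glossed over here, and the sketch filters individual messages instead):

// Extends the hypothetical VectorStore (above) with similarity search.
interface SearchableStore extends VectorStore {
    search(query: string, k: number): Promise<string[]>;
}

export class FilteringHook implements Hook {
    constructor(private store: SearchableStore) {}

    async onRequest(request: ChatRequest): Promise<ChatRequest> {
        const last = request.messages[request.messages.length - 1];
        if (!last || last.role !== "user") {
            return request;
        }
        // Keep only the 3 prior messages most similar to the new
        // question, plus the new question itself; an off-topic
        // digression should fail the similarity test and drop out.
        const hits = new Set(await this.store.search(last.content, 3));
        const messages = request.messages.filter(
            (m) => m === last || hits.has(m.content),
        );
        return { ...request, messages };
    }

    async onResponse(response: ChatResponse): Promise<ChatResponse> {
        return response;
    }
}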

Then we would need a way to ingest documents, and change the filtering to include passages from documents.
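Ingestion could start out very naive, e.g. fixed-size character chunks fed into the same hypothetical store (no overlap, no sentence awareness):

// Naive document ingestion: split the text into fixed-size chunks
// and index each one so the filtering step can surface passages.
export async function ingestDocument(
    store: VectorStore,
    text: string,
    chunkSize = 1000,
): Promise<void> {
    for (let i = 0; i < text.length; i += chunkSize) {
        await store.add(text.slice(i, i + chunkSize));
    }
}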

Does this interest you?

@da-z
Owner

da-z commented Feb 7, 2024

Hi Jim.

Thank you. I tried to make a simple frontend that I would actually like to use (LM Studio is more advanced, but I don't much like its looks at the moment). I also prefer Svelte, but I had to learn React to build this app because there were already some React components I could use.

I have also been thinking about adding retrieval-augmented generation, but I haven't jumped into any implementation yet. I'm inclined to think this is more of a feature Ollama itself should support (similar to vision).

Otherwise, such an extension would require extra software to be installed, which users may not be accustomed to or willing to do.

In my view, ideally all of this would be bundled with Ollama and easily accessible via their API (e.g. send a "files" array in addition to "images", and have a dedicated ingestion/management API for the files).
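Purely as speculation, such a request might look something like this (no "files" field exists in Ollama's chat API today; the field name and file reference are imagined):

// Speculative request shape: "images" exists on Ollama chat
// messages today; "files" is the imagined addition.
const speculativeRequest = {
    model: "llama2",
    messages: [
        {
            role: "user",
            content: "Summarize the attached report.",
            images: [],            // existing field (base64-encoded images)
            files: ["report.pdf"], // hypothetical: references ingested files
        },
    ],
};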

For me, at the moment, it's more a problem of bundling/distributing/standardizing the app and its requirements than of coding.

Frankly, I would extend the Ollama code directly: add something like Weaviate to it (which I believe is also written in Go), and once I have the Ollama side working, use it in a simple way in LLaMazing.

If this works, other Ollama users can benefit, not just the LLaMazing users.

Anyway, since both LLaMazing and Ollama are open source, you are of course free to try and see where this leads :) I suggest you start by forking the project. In principle, the hook idea is good and flexible.

I am definitely interested, but at the moment still ruminating :)

@jimlloyd
Author

jimlloyd commented Feb 7, 2024

Cool. I hadn't considered contributing RAG support to ollama, but perhaps I should. Unfortunately, except for a brief flirtation with Go 11 years ago, I have no experience with it. I'm also motivated right now to build a relatively modest prototype before I take on something as ambitious as becoming an ollama contributor for a significant feature. I'll let you know if I fork this repo and add RAG via the hook interface.
