Possibility of offloading to an Ollama-based endpoint instead of OpenAI? #154
Comments
You want to look in the tasks module, specifically …
Could it be possible to have an if statement in there to test whether settings.OLLAMA_MODEL etc. are set, and switch to Ollama if they exist? That way we could change the endpoint by providing config for the base_url, the model, and the request_timeout for Ollama. I don't know deeply enough yet how this would affect everything, or how you are pulling your settings in, but it would be super useful for those of us in a situation where we cannot offload to hardware owned by others.
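As a rough sketch of that idea, assuming Django-style settings and hypothetical OLLAMA_BASE_URL / OLLAMA_MODEL / OLLAMA_REQUEST_TIMEOUT / OPENAI_API_KEY setting names (the Ollama ones don't exist in the project today), the switch could look something like this:

```python
# Hypothetical sketch: use Ollama when OLLAMA_* settings are present,
# otherwise fall back to the existing OpenAI-backed behaviour.
from django.conf import settings
from llama_index.llms.ollama import Ollama
from llama_index.llms.openai import OpenAI


def get_llm():
    ollama_model = getattr(settings, "OLLAMA_MODEL", None)
    if ollama_model:
        # Keeps all processing on local hardware; no OpenAI key needed.
        return Ollama(
            model=ollama_model,
            base_url=getattr(settings, "OLLAMA_BASE_URL", "http://localhost:11434"),
            request_timeout=getattr(settings, "OLLAMA_REQUEST_TIMEOUT", 120.0),
        )
    # Default path: the current OpenAI-backed behaviour (setting name assumed).
    return OpenAI(api_key=settings.OPENAI_API_KEY)
```

The exact llama_index import paths depend on the installed version; this is only meant to illustrate the settings-driven switch, not the project's actual task code.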
Some quick and dirty code using the LlamaIndex library that does the job with my local Ollama endpoint:
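The snippet itself isn't reproduced above, but a minimal sketch of that kind of LlamaIndex call against a local Ollama endpoint might look like this (model name, URL, and prompt are placeholders, not necessarily what the commenter used):

```python
# Minimal sketch: point LlamaIndex's Ollama LLM at a local endpoint and
# run a single completion. Model name and URL are placeholders.
from llama_index.llms.ollama import Ollama

llm = Ollama(
    model="mistral",                    # any model already pulled into Ollama
    base_url="http://localhost:11434",  # default Ollama API address
    request_timeout=120.0,              # local inference can be slow
)

response = llm.complete("Summarise this document in one sentence: ...")
print(response.text)
```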
Awesome! If you wanted to open a PR to create a task that uses Ollama, that would be AWESOME :-). We'd need to run Ollama, probably in another container, so you'd need to update the compose stack too. Would definitely welcome the contribution (and would be happy to review / pair / consult). It's something I've wanted to do, I just don't have the time to do it all :-)
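A rough idea of what adding Ollama as a sibling service to the compose stack could look like (service name, image tag, and volume are illustrative; the real stack may differ):

```yaml
# Illustrative compose service for a local Ollama endpoint; the app
# container would reach it at http://ollama:11434.
services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama   # persist pulled models across restarts

volumes:
  ollama_data:
```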
OK, I'll see if I can put some time into it. I'm not all that familiar with Docker, so some of that aspect is a mystery to me. Personally, I'd probably leave the Ollama installation as a separate piece, as it changes often, and you'd want it to be a separate service rather than lumped in. Access is still via API, so it's nice and simple.
I'm still pretty new to your project, but from digging around I can see various calls to OpenAI's API, and in the config there's a section for the API key.
Is it possible to offload to an Ollama instance instead, so that all data endpoints are locally based?
Our big focus is that any processing like this of our sensitive documents needs to occur 100% locally.
I see that LlamaIndex supports Ollama as an endpoint; is that how OpenAI is being reached (through LlamaIndex)?