Possibility of offloading to Ollama based endpoint instead of OpenAI? #154

Closed
jessestevens5b opened this issue Jul 15, 2024 · 5 comments
Labels
question Further information is requested

Comments

@jessestevens5b

I'm still pretty new to your project, but through digging around I can see various calls to OpenAI's API and in the config there's a section for the API key.

Is it possible to offload to an Ollama instance instead so that all data endpoints are locally based?

Our big focus is that any kind of processing like this for our sensitive documents needs to occur 100% locally.

I see that LlamaIndex supports Ollama as an endpoint. Is that how OpenAI is being reached (through LlamaIndex)?

@JSv4
Owner

JSv4 commented Jul 15, 2024

You want to look in the tasks module, specifically opencontractserver/tasks/data_extract_tasks.py. We are using LlamaIndex there, but you can write your own extractors and use whatever Python code you want. For starters, it's probably easiest to take my code and replace the OpenAI LLM with Ollama, which LlamaIndex makes easy. LlamaIndex also supports HuggingFace inference endpoints, so you could host LLMs there too.

@JSv4 JSv4 added the question Further information is requested label Jul 15, 2024
@JSv4 JSv4 closed this as completed Jul 15, 2024
@jessestevens5b
Author

Would it be possible to have an if statement in there that tests whether settings.OLLAMA_MODEL etc. are set and switches to Ollama if they are? That way we could change the endpoint by providing config for the base_url, the model, and the request_timeout for Ollama.

I don't yet know deeply enough how this would affect everything, or how you are pulling your settings in, but this would be super useful for those of us who cannot offload processing to hardware owned by others.
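
A minimal sketch of the conditional switch suggested above, not the project's actual code. Only settings.OLLAMA_MODEL is named in this thread; OLLAMA_BASE_URL, OLLAMA_REQUEST_TIMEOUT, OPENAI_API_KEY, and the helper names select_llm_config and build_llm are hypothetical, and the default base URL assumes Ollama's standard port on localhost. The LlamaIndex imports are deferred so the selection logic can be exercised without either integration package installed.

```python
from types import SimpleNamespace


def select_llm_config(settings):
    """Return (backend, kwargs) depending on which settings are present.

    Assumption: `settings` is any object exposing config values as
    attributes (for example Django's settings). All setting names other
    than OLLAMA_MODEL are hypothetical.
    """
    model = getattr(settings, "OLLAMA_MODEL", None)
    if model:
        return "ollama", {
            "model": model,
            "base_url": getattr(settings, "OLLAMA_BASE_URL", "http://localhost:11434"),
            "request_timeout": float(getattr(settings, "OLLAMA_REQUEST_TIMEOUT", 120.0)),
        }
    # No Ollama model configured: fall back to OpenAI.
    return "openai", {"api_key": getattr(settings, "OPENAI_API_KEY", None)}


def build_llm(settings):
    """Instantiate the LlamaIndex LLM chosen by select_llm_config."""
    backend, kwargs = select_llm_config(settings)
    if backend == "ollama":
        # Requires the llama-index-llms-ollama package.
        from llama_index.llms.ollama import Ollama
        return Ollama(**kwargs)
    # Requires the llama-index-llms-openai package.
    from llama_index.llms.openai import OpenAI
    return OpenAI(**kwargs)


if __name__ == "__main__":
    demo = SimpleNamespace(OLLAMA_MODEL="llama3",
                           OLLAMA_BASE_URL="http://192.168.20.200:11434")
    print(select_llm_config(demo))
```

With something like this in place, a task could call build_llm(settings) once and assign the result to LlamaIndex's Settings.llm, so the rest of the extraction code would not need to know which backend is in use.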

@jessestevens5b
Author

jessestevens5b commented Jul 15, 2024

Some quick and dirty code using the LlamaIndex library that does the job with my local ollama endpoint:

from llama_index.core import Settings, SimpleDirectoryReader, VectorStoreIndex
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.ollama import Ollama

# Route all LLM calls to the local Ollama server instead of OpenAI.
Settings.llm = Ollama(model='llama3', base_url='http://192.168.20.200:11434', request_timeout=120.0)
# Compute embeddings locally with a HuggingFace model cached in ./models.
Settings.embed_model = HuggingFaceEmbedding(model_name='multi-qa-MiniLM-L6-cos-v1', cache_folder="./models")

print("Loading documents...")
documents = SimpleDirectoryReader('./data').load_data()
print("Loaded", len(documents), "documents")

print("Indexing documents...")
index = VectorStoreIndex.from_documents(documents)
print("done")

# The index lives in memory only; nothing is written to disk here.
print("Creating query engine...")
query_engine = index.as_query_engine()
print("Query engine ready")

print("Querying...")
response = query_engine.query(
    "Does this document mention article 690?"
)

print(response)

@JSv4
Owner

JSv4 commented Jul 15, 2024

Awesome! If you wanted to open a PR to create a task that uses Ollama, that would be AWESOME :-). We'd need to run Ollama, probably in another container, so you'd need to update the compose stack too. Would definitely welcome the contribution (and would be happy to review / pair / consult). It's something I've wanted to do, I just don't have the time to do it all :-)

@jessestevens5b
Author

OK, I'll see if I can put some time into it. I'm not all that familiar with Docker, so some of that aspect is a mystery to me.

Personally, I'd probably leave the Ollama installation as a separate service rather than lumping it into the stack, since it changes often. Access is still via its API, so it's nice and simple.
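
Since the Ollama server would run as a separate service reached over HTTP, a task could check that it is reachable before dispatching work. The sketch below is an illustration under assumptions, not project code: it uses Ollama's /api/tags endpoint, which lists locally available models, and the helper names tags_url and list_ollama_models are hypothetical.

```python
import http.client
import json
import urllib.request
from typing import List, Optional


def tags_url(base_url: str) -> str:
    """Build the URL of Ollama's model-listing endpoint."""
    return base_url.rstrip("/") + "/api/tags"


def list_ollama_models(base_url: str, timeout: float = 5.0) -> Optional[List[str]]:
    """Return the model names the server reports, or None if unreachable.

    Assumption: the server answers GET /api/tags with JSON shaped like
    {"models": [{"name": "..."}, ...]}.
    """
    try:
        with urllib.request.urlopen(tags_url(base_url), timeout=timeout) as resp:
            payload = json.load(resp)
    except (OSError, ValueError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, or a non-JSON body.
        return None
    return [m.get("name", "") for m in payload.get("models", [])]


if __name__ == "__main__":
    models = list_ollama_models("http://192.168.20.200:11434")
    if models is None:
        print("Ollama endpoint not reachable")
    else:
        print("Available models:", models)
```

Keeping the check to plain HTTP means it works the same whether Ollama runs on the host, on another machine, or in its own container.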
