We can run open-source LLMs locally on our machine by using a powerful tool named Ollama.
Although its name contains "llama", it supports other open-source LLMs too, like Mistral, Gemma, and more.
- First, we need to install Ollama (I am doing this on Linux).
- We can install it directly on Linux, or use Docker instead (a Docker sketch follows the install command below).
curl -fsSL https://ollama.com/install.sh | sh
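- If you go the Docker route, the officially published ollama/ollama image can be used. A minimal sketch follows; the volume name and port mapping are just common defaults, so adjust them to your setup.
docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
# run Ollama commands inside the container
docker exec -it ollama ollama run phi3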
- Install your LLM. (I am using phi3 here; pick yours.)
ollama pull phi3
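- Tags from the Ollama model library let you pull a specific size or quantization; for example (assuming the phi3:mini tag exists in the registry):
ollama pull phi3:mini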
- Run your installed LLM.
ollama run phi3
# this will pull the model if it is not already installed
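- To check what is installed locally at any point, Ollama also ships a list command. You can likewise pass a one-shot prompt to `ollama run` instead of opening the interactive chat:
ollama list
ollama run phi3 "Summarize what Ollama does in one sentence."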
- To remove your model.
ollama rm phi3
- Can the embedding model and the LLM be different? Answer: Yes, they can. When we use an embedding model, we are basically converting the source (context) text's chunks and our query text into embeddings, then fetching the N chunks of source text most similar to the query. The LLM only consumes the retrieved chunks as context, so there is no direct relation between the two models (see the sketch below).
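- As a rough sketch of this, Ollama exposes both capabilities over its local REST API on port 11434, and nothing stops you from using one model for embeddings and a different one for generation. This assumes the server is running and that an embedding model such as nomic-embed-text has been pulled; the example text is made up.
ollama pull nomic-embed-text
# embed a source-text chunk (and, separately, your query) with the embedding model
curl http://localhost:11434/api/embeddings -d '{"model": "nomic-embed-text", "prompt": "Ollama runs open source LLMs locally."}'
# answer with a different model, feeding it the retrieved chunk as context
curl http://localhost:11434/api/generate -d '{"model": "phi3", "prompt": "Context: Ollama runs open source LLMs locally. Question: what does Ollama do?"}'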