## Ollama Python Example

### Intro

Ollama is a tool for running LLMs locally and accessing them through an API.
This makes it simple and flexible to build AI/LLM/RAG-based applications.


### Setup
1. Download and install Ollama: https://ollama.com/download
2. Download Llama 3 8B

`ollama pull llama3:latest`

3. Download the embedding model

`ollama pull mxbai-embed-large`

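4. Make sure the models are downloaded and Ollama is running (`ollama list` shows the local models; `ollama serve` starts the server if it is not already running)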

```
ollama list
NAME                    	ID          	SIZE  	MODIFIED       
mxbai-embed-large:latest	468836162de7	669 MB	7 seconds ago 	
llama3:latest           	365c0bd3c000	4.7 GB	17 seconds ago	
```

`ollama serve`
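
The same check can be scripted. The sketch below is a minimal example, assuming the server listens on the default address `http://localhost:11434` and that `GET /api/tags` returns a JSON object with a `models` list whose entries carry a `name` field. The helper names (`installed_models`, `missing_models`, `fetch_tags`) are hypothetical and only used for illustration.

```python
import json
from urllib.request import urlopen

# Assumption: Ollama is listening on its default address.
OLLAMA_URL = "http://localhost:11434"


def installed_models(tags_payload):
    """Extract the model names from a parsed /api/tags response."""
    return {m["name"] for m in tags_payload.get("models", [])}


def missing_models(tags_payload, required):
    """Return the required models that are not installed.

    A name without a tag (e.g. "llama3") is treated as "llama3:latest".
    """
    installed = installed_models(tags_payload)
    missing = []
    for name in required:
        full_name = name if ":" in name else f"{name}:latest"
        if full_name not in installed:
            missing.append(name)
    return missing


def fetch_tags(base_url=OLLAMA_URL):
    """Call GET /api/tags; raises URLError if the server is not reachable."""
    with urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)
```

With the server running, `missing_models(fetch_tags(), ["llama3", "mxbai-embed-large"])` should return an empty list once both models from the setup steps have been pulled.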


### Testing API requests to Ollama

CURL:

```
curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt":"Why people don't absorb water directly from air?"
 }'
```
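
The curl request above can also be sent from Python with only the standard library. By default `/api/generate` streams its answer as one JSON object per line, each carrying a `response` fragment and a `done` flag, so the fragments have to be joined. This is a rough sketch under those assumptions; the function names (`build_request`, `join_stream`, `generate`) are hypothetical, not part of any Ollama library.

```python
import json
from urllib.request import Request, urlopen

# Assumption: Ollama is listening on its default address.
GENERATE_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt, url=GENERATE_URL):
    """Build the same POST request as the curl command."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def join_stream(lines):
    """Concatenate the "response" fragments of a streamed reply.

    Each non-empty line is expected to hold one JSON object;
    reading stops at the object that has "done": true.
    """
    parts = []
    for raw in lines:
        if isinstance(raw, bytes):
            raw = raw.decode("utf-8")
        raw = raw.strip()
        if not raw:
            continue
        chunk = json.loads(raw)
        parts.append(chunk.get("response", ""))
        if chunk.get("done"):
            break
    return "".join(parts)


def generate(model, prompt):
    """Send the request and return the full answer (needs a running server)."""
    with urlopen(build_request(model, prompt)) as resp:
        # Iterating over the HTTP response yields one line at a time.
        return join_stream(resp)
```

Calling `generate("llama3", "Why people don't absorb water directly from air?")` mirrors the curl example. Adding `"stream": false` to the request body makes the server return a single JSON object instead of a stream.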

CLI:

```
ollama run llama3:latest
Send a message (/? for help)
/bye
```

Install the Python libraries:

```
# notebook cell; drop the leading "!" when running in a regular shell
!pip install chromadb==0.5.3
!pip install ollama
```



### Demo

```
# import libraries
import os  # for reading source documents

import chromadb  # vector database
import ollama

# create an in-memory vector database client
vectorDb = chromadb.Client()

# create a collection to store the embedded documents
collection = vectorDb.create_collection(name="documents")

# embedding model
embeddingModel = 'mxbai-embed-large'
# LLM
llmModel = 'llama3'

# folder with the source documents
path = "data/cryptocurrency_wikipedia"

# iterate through all files in the folder
for i, file in enumerate(os.listdir(path)):
    # only process .txt files
    if file.endswith(".txt"):
        file_path = os.path.join(path, file)
        # read the file content (assuming the files are UTF-8 encoded)
        with open(file_path, 'r', encoding='utf-8') as f:
            document = f.read()
        # get the embedding for the document
        response = ollama.embeddings(model=embeddingModel, prompt=document)
        embedding = response["embedding"]
        # load the document into the vector database
        collection.add(
            ids=[str(i)],
            embeddings=[embedding],
            documents=[document]
        )

# define the prompt
prompt = 'is Coinye the name of an animal?'

# create an embedding for the prompt
response = ollama.embeddings(
    prompt=prompt,
    model=embeddingModel
)

# query the collection for the most similar document
results = collection.query(
    query_embeddings=[response["embedding"]],
    n_results=1
)

# get the best matching document
data = results['documents'][0][0]

# generate a response using the LLM
output = ollama.generate(
    model=llmModel,
    prompt=f"Using this data: {data}. Respond to this prompt: {prompt}"
)

# print the result
print(output['response'])
```

Output:

According to the Wikipedia article, Coinye is actually a cryptocurrency, but at one point, its mascot was a "half-man-half-fish hybrid" (likely a reference to a South Park episode where Kanye West is called a "gay fish"). So, while Coinye isn't directly an animal name, it has some aquatic creature connotations!
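
To make the retrieval step of the demo less of a black box, here is a toy, pure-Python sketch of what "query the collection for the most similar document" amounts to, followed by the prompt assembly. It is an illustration only: Chroma uses its own index and a configurable distance metric, whereas this version does a brute-force scan with squared Euclidean distance. The function names and the two-dimensional vectors are hypothetical.

```python
def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b))


def top_k(query_embedding, embeddings, documents, k=1):
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(
        zip(embeddings, documents),
        key=lambda pair: squared_l2(query_embedding, pair[0]),
    )
    return [doc for _, doc in ranked[:k]]


def build_prompt(contexts, question):
    """Join the retrieved documents and wrap them in the demo's prompt template."""
    data = "\n\n".join(contexts)
    return f"Using this data: {data}. Respond to this prompt: {question}"


# Toy 2-dimensional "embeddings" (hypothetical values, for illustration only).
docs = ["Coinye is a cryptocurrency", "Bitcoin was released in 2009"]
vecs = [[0.9, 0.1], [0.1, 0.9]]

# The query vector is closest to the first document's vector.
best = top_k([1.0, 0.0], vecs, docs, k=1)
print(build_prompt(best, "is Coinye the name of an animal?"))
```

Passing `k` greater than 1 corresponds to raising `n_results` in the demo: several documents are retrieved and joined into the context handed to the LLM.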
