### Trying out RAG with ollama and chromadb
ollama is installed in the Python environment venvcrc5, from which this notebook is started.

ollama is recommended over Hugging Face for local experimentation.

Its command-line interface uses a docker-like syntax (pull, run, list, ps, rm).

In [2]:
# ollama chat API, plus gradio for a simple web UI
from ollama import chat
from ollama import ChatResponse
import gradio as gr

In [1]:
!ollama --help

Large language model runner

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.


#### This model is required for embedding both the supplementary text and the supplied query

In [1]:
!ollama pull nomic-embed-text

pulling manifest 
pulling 970aa74c0a90... 100% ▕████████████████▏ 274 MB                         
pulling c71d239df917... 100% ▕████████████████▏  11 KB                         
pulling ce4a164fc046... 100% ▕████████████████▏   17 B                         
pulling 31df23ea7daa... 100% ▕████████████████▏  420 B                         
verifying sha256 digest 
writing manifest 
success


#### llama3.1 is one of the basic LLMs available through ollama

In [None]:
!ollama pull llama3.1

In [3]:
!ollama list

NAME                       ID              SIZE      MODIFIED   
nomic-embed-text:latest    0a109f422b47    274 MB    5 days ago    
llama3.1:latest            46e0c10c039e    4.9 GB    6 days ago    


#### Run a simple query with the llama3.1 model

In [8]:
def ask(question):
    response: ChatResponse = chat(model='llama3.1', messages=[
      {
        'role': 'user',
        'content': str(question),
      },
    ])
    return response.message.content 

gr.Interface(fn=ask, inputs="text", outputs="text").launch()

* Running on local URL:  http://127.0.0.1:7861

To create a public link, set `share=True` in `launch()`.




#### PdfReader (from pypdf) extracts the text of a PDF document so that it can be passed to the model

In [16]:
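from pypdf import PdfReader

# my textbook, 5th ed
reader = PdfReader("/home/mort/LaTeX/new projects/CRC5/main.pdf")

# extract the text page by page; a newline keeps words on adjacent pages apart,
# and pages without extractable text contribute an empty string
all_text = "\n".join(page.extract_text() or "" for page in reader.pages)

# write as UTF-8, the encoding readtextfiles() uses when reading the file back
with open("/home/mort/temp/main.txt", "w", encoding="utf-8") as f:
    f.write(all_text)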

#### This starts the chroma vector database in a docker container that listens on port 8000

In [3]:
!docker images

REPOSITORY        TAG       IMAGE ID       CREATED       SIZE
chromadb/chroma   latest    1295eb7aaaed   2 weeks ago   469MB
mort/crc5docker   latest    9f6ec881b8c5   6 weeks ago   5GB


In [9]:
!docker start chromadb

chromadb


In [2]:
!docker ps -a

CONTAINER ID   IMAGE             COMMAND                  CREATED       STATUS         PORTS                                       NAMES
47137d05119e   chromadb/chroma   "/docker_entrypoint.…"   2 weeks ago   Up 7 seconds   0.0.0.0:8000->8000/tcp, :::8000->8000/tcp   chromadb


#### Code for preprocessing the RAG supplementary text

In [10]:
import os
import re
import ollama

# read all .txt files in a directory into a dict {filename: content}
def readtextfiles(path):
  text_contents = {}

  for filename in os.listdir(path):
    if filename.endswith(".txt"):
      file_path = os.path.join(path, filename)

      with open(file_path, "r", encoding="utf-8") as file:
        text_contents[filename] = file.read()

  return text_contents

# split text into chunks of chunk_size words (the last chunk may be shorter)
def chunksplitter(text, chunk_size=100):
  words = re.findall(r'\S+', text)

  chunks = []
  current_chunk = []
  word_count = 0

  for word in words:
    current_chunk.append(word)
    word_count += 1

    if word_count >= chunk_size:
      chunks.append(' '.join(current_chunk))
      current_chunk = []
      word_count = 0

  if current_chunk:
    chunks.append(' '.join(current_chunk))

  return chunks

# use the nomic-embed-text model to calculate vector embeddings for all text chunks
def getembedding(chunks):
  embeds = ollama.embed(model="nomic-embed-text", input=chunks)
  return embeds.get('embeddings', [])
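
The word-based splitting can be checked on a short string. The following is a minimal sketch, not the notebook's own code: `split_words` is a hypothetical, compact equivalent of `chunksplitter` that slices the word list directly, and the sample sentence is made up for illustration.

```python
import re

def split_words(text, chunk_size=100):
    """Group the whitespace-separated words of text into chunks of chunk_size words."""
    words = re.findall(r'\S+', text)
    # step through the word list chunk_size words at a time;
    # the final slice is shorter when the word count is not a multiple of chunk_size
    return [' '.join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)]

sample = "one two three four five six seven"
chunks = split_words(sample, chunk_size=3)
for chunk in chunks:
    print(repr(chunk))
```

With seven words and a chunk size of three, the sketch yields two full chunks and one shorter remainder, which is the same behaviour as the loop-based version above.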

In [11]:
import chromadb

# use REST API for chroma vector database
chromaclient = chromadb.HttpClient(
    host="localhost",
    port=8000
)

#### Add the supplementary text to a new database collection

In [9]:
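# erase any existing collection and create a new empty one
try:
  chromaclient.delete_collection(name="ragwithpython")
except Exception:
  # delete_collection raises if the collection does not exist yet (e.g. on the first run)
  pass
collection = chromaclient.get_or_create_collection(name="ragwithpython", metadata={"hnsw:space": "cosine"})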

# the RAG supplementary data
textdocspath = "/home/mort/temp"
text_data = readtextfiles(textdocspath)

# read, break into chunks, embed and add to the chroma vector database
for filename, text in text_data.items():
  # chunks of 256 words
  chunks = chunksplitter(text, 256)
  embeds = getembedding(chunks)
  # one unique id per chunk: the filename followed by the chunk index
  ids = [filename + str(index) for index in range(len(chunks))]
  metadatas = [{"source": filename} for _ in chunks]
  collection.add(ids=ids, documents=chunks, embeddings=embeds, metadatas=metadatas)
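
The collection is created with `"hnsw:space": "cosine"`, so stored chunks are ranked by cosine distance (1 minus cosine similarity) to the query embedding. The sketch below illustrates that ranking with a brute-force search in plain Python. It is only a conceptual illustration: chroma uses an approximate HNSW index rather than an exhaustive scan, and the names `cosine_distance` and `nearest` as well as the three-dimensional toy vectors are assumptions made for the example.

```python
import math

def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def nearest(query_vec, stored, n_results=2):
    """Return the documents whose embeddings are closest to query_vec."""
    ranked = sorted(stored, key=lambda item: cosine_distance(query_vec, item[1]))
    return [doc for doc, _ in ranked[:n_results]]

# toy (document, embedding) pairs; real embeddings have hundreds of dimensions
stored = [
    ("chunk about clouds", [1.0, 0.0, 0.0]),
    ("chunk about radar", [0.0, 1.0, 0.0]),
    ("chunk about radar imagery", [0.1, 0.9, 0.0]),
]
top = nearest([0.0, 1.0, 0.1], stored, n_results=2)
print(top)
```

The query vector points almost along the second axis, so the two radar chunks rank ahead of the cloud chunk, whose embedding is orthogonal to the query.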


#### Execute a query with the supplementary text (RAG)

In [23]:
collection = chromaclient.get_or_create_collection(name="ragwithpython", metadata={"hnsw:space": "cosine"})

def ragask(query):
    # embed the current query
    queryembed = ollama.embed(model="nomic-embed-text", input=query)['embeddings']
    
    # use the embedded current query to retrieve the most relevant document chunks (as text NOT as embeddings)
    relateddocs = '\n\n'.join(collection.query(query_embeddings=queryembed, n_results=4)['documents'][0])
    
    # generate an answer
    prompt = f"Answer the question: {query} referring to the following text as a resource: {relateddocs}"
    ragoutput = ollama.generate(model="llama3.1", prompt=prompt, stream=False)

    return ragoutput['response']

# use the gradio interface
gr.Interface(fn=ragask, inputs="text", outputs="text").launch()   

* Running on local URL:  http://127.0.0.1:7860

To create a public link, set `share=True` in `launch()`.




Created dataset file at: .gradio/flagged/dataset1.csv
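
The prompt in `ragask` places the question and the retrieved chunks in a single sentence. One possible variant, shown here as a sketch only, separates the context from the question and numbers the chunks. The helper name `build_prompt`, the wording of the instruction, and the sample strings are all assumptions for illustration; whether this layout improves the answers of llama3.1 has not been tested here.

```python
def build_prompt(query, docs):
    """Assemble a RAG prompt from a question and a list of retrieved text chunks."""
    # number the chunks so that they stay visually separate in the prompt
    context = "\n\n".join(f"[{i}] {doc}" for i, doc in enumerate(docs, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

prompt = build_prompt("What is a kernel?", ["First chunk.", "Second chunk."])
print(prompt)
```

Inside `ragask`, the list passed as `docs` would be `collection.query(...)['documents'][0]`, and the returned string would replace the f-string assigned to `prompt`.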
