### Trying out RAG with ollama and chromadb
Ollama is installed in the Python environment venvcrc5, from which this notebook is started.

In [1]:
!ollama --help

Large language model runner

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.


#### This model is needed to embed both the supplementary text and the query

In [2]:
!ollama pull nomic-embed-text

pulling manifest
pulling 970aa74c0a90... 100% ▕████████████████▏ 274 MB
pulling c71d239df917... 100% ▕████████████████▏  11 KB
pulling ce4a164fc046... 100% ▕████████████████▏   17 B
pulling 31df23ea7daa... 100% ▕████████████████▏  420 B
verifying sha256 digest
writing manifest

#### Pull llama3.1, the LLM used for chat and generation below

In [3]:
!ollama pull llama3.1

pulling manifest
pulling 667b0c1932bc... 100% ▕████████████████▏ 4.9 GB
pulling 948af2743fc7... 100% ▕████████████████▏ 1.5 KB
pulling 0ba8f0e314b4... 100% ▕████████████████▏  12 KB
pulling 56bb8bd477a5... 100% ▕████████████████▏   96 B
pulling 455f34728c9b... 100% ▕████████████████▏  487 B
verifying sha256 digest
writing manifest
success


In [7]:
!ollama rm llama3.2

deleted 'llama3.2'


In [2]:
!ollama list

NAME                       ID              SIZE      MODIFIED     
llama3.1:latest            46e0c10c039e    4.9 GB    23 hours ago    
nomic-embed-text:latest    0a109f422b47    274 MB    23 hours ago    


In [3]:
from ollama import chat
from ollama import ChatResponse
from pypdf import PdfReader

In [5]:
def ask(question):
    # Send a single user message to the local llama3.1 model and print the reply
    response: ChatResponse = chat(model='llama3.1', messages=[
      {
        'role': 'user',
        'content': str(question),
      },
    ])
    print(response.message.content)

In [6]:
ask('What is RAG (retrieval augmented generation)?')

RAG stands for Retrieval Augmented Generation. It's a type of artificial intelligence model architecture that combines the strengths of retrieval-based models with those of generative models.

**Retrieval-based models** are designed to retrieve relevant information from a large database or knowledge graph based on user input. They're typically used for question answering, natural language processing, and other tasks where retrieving existing knowledge is sufficient.

**Generative models**, on the other hand, generate new content, such as text, images, or code, based on patterns learned from existing data. They can create novel outputs that don't exist in the training dataset.

RAG combines these two approaches by augmenting a generative model with retrieval capabilities. Here's how it works:

1. **Retrieval**: The RAG model first searches a large database (e.g., a knowledge graph) to retrieve relevant information related to the user input.
2. **Augmentation**: This retrieved informatio

#### PdfReader from pypdf extracts the text of a PDF document so that the model can read it

In [4]:
# my entire book
reader = PdfReader("/home/mort/LaTeX/new projects/remotesensing2019/draft.pdf")
# Concatenate the extracted text of every page into one string
all_text = "".join(page.extract_text() for page in reader.pages)
#print(all_text)
# Save as UTF-8, the encoding the preprocessing code below uses when reading
with open("/home/mort/temp/main.txt", "w", encoding="utf-8") as f:
    f.write(all_text)

#### Start the Chroma embedding database in a Docker container, listening on port 8000

In [5]:
!docker images

REPOSITORY        TAG       IMAGE ID       CREATED       SIZE
chromadb/chroma   latest    1295eb7aaaed   6 days ago    469MB
mort/crc5docker   latest    9f6ec881b8c5   4 weeks ago   5GB


In [6]:
!docker ps -a

CONTAINER ID   IMAGE             COMMAND                  CREATED      STATUS                    PORTS     NAMES
47137d05119e   chromadb/chroma   "/docker_entrypoint.…"   2 days ago   Exited (0) 13 hours ago             chromadb


In [7]:
!docker start chromadb

chromadb


#### Code for preprocessing the RAG supplementary text

In [8]:
import os
import re

import ollama  # needed by getembedding below

def readtextfiles(path):
  # Return {filename: content} for every .txt file directly inside path
  text_contents = {}

  for filename in os.listdir(path):
    if filename.endswith(".txt"):
      file_path = os.path.join(path, filename)

      with open(file_path, "r", encoding="utf-8") as file:
        content = file.read()

      text_contents[filename] = content

  return text_contents

def chunksplitter(text, chunk_size=100):
  # Split text into chunks of at most chunk_size whitespace-separated words
  words = re.findall(r'\S+', text)

  return [' '.join(words[i:i + chunk_size])
          for i in range(0, len(words), chunk_size)]

def getembedding(chunks):
  # Embed all chunks in a single call; the result holds one vector per chunk
  embeds = ollama.embed(model="nomic-embed-text", input=chunks)
  return embeds.get('embeddings', [])
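
Fixed-size chunks can cut a sentence in two, so a passage relevant to a query may end up split across neighbouring chunks. One common mitigation is to let consecutive chunks share some words. The following is a minimal sketch of that idea and is not part of this notebook's pipeline; the name chunksplitter_overlap and its overlap parameter are hypothetical.

```python
import re

def chunksplitter_overlap(text, chunk_size=100, overlap=20):
    """Split text into word chunks; each chunk repeats the last
    `overlap` words of the previous one (hypothetical helper)."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    words = re.findall(r'\S+', text)
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(' '.join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # this chunk already reaches the end of the text
    return chunks

demo = chunksplitter_overlap("a b c d e f g h", chunk_size=4, overlap=2)
print(demo)
```

With overlap=0 the result matches the non-overlapping splitter. Overlap increases the number of chunks to embed and store, so it trades storage and embedding time for context continuity.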

In [9]:
import chromadb

chromaclient = chromadb.HttpClient(
    host="localhost",
    port=8000
)

collection = chromaclient.get_or_create_collection(
    name="ragwithpython",
    metadata={"hnsw:space": "cosine"}  # rank neighbours by cosine distance
)
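
The hnsw:space setting selects the distance used to rank stored chunks against a query. With cosine, the distance is one minus the cosine similarity, so two vectors pointing in the same direction have distance 0 regardless of their lengths. The pure-Python sketch below illustrates the measure only; Chroma computes it internally, and the function name cosine_distance is hypothetical.

```python
import math

def cosine_distance(u, v):
    """Return 1 - cos(angle between u and v).

    0 means same direction, 1 means orthogonal, 2 means opposite.
    Raises ZeroDivisionError if either vector has zero length.
    """
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

same = cosine_distance([1.0, 0.0], [2.0, 0.0])        # same direction
orthogonal = cosine_distance([1.0, 0.0], [0.0, 3.0])  # right angle
print(same, orthogonal)
```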

#### Add the supplementary text to a database collection

In [11]:
import ollama

# Run the next two lines only to erase and recreate the existing collection 'ragwithpython'
chromaclient.delete_collection(name="ragwithpython")
collection = chromaclient.get_or_create_collection(
    name="ragwithpython",
    metadata={"hnsw:space": "cosine"}
)

textdocspath = "/home/mort/temp"
text_data = readtextfiles(textdocspath)

for filename, text in text_data.items():
  # Split into 100-word chunks and embed them in one call
  chunks = chunksplitter(text, 100)
  embeds = getembedding(chunks)
  # One id per chunk: the filename followed by the chunk index
  ids = [filename + str(index) for index in range(len(chunks))]
  metadatas = [{"source": filename} for _ in chunks]
  collection.add(ids=ids, documents=chunks, embeddings=embeds, metadatas=metadatas)


#### Execute a query with the supplementary text (RAG)

In [18]:
import ollama

query = "how does sequential sar change detection work?"

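# Embed the query with the same model that embedded the chunks
queryembed = ollama.embed(model="nomic-embed-text", input=query)['embeddings']

# Retrieve the 20 nearest chunks and join them into a single context string
relateddocs = '\n\n'.join(collection.query(query_embeddings=queryembed, n_results=20)['documents'][0])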

prompt = f"{query} - Answer that question using the following text as a resource: {relateddocs}"
ragoutput = ollama.generate(model="llama3.1", prompt=prompt, stream=False)

print(f"Answered with RAG: {ragoutput['response']}")

Answered with RAG: This text appears to be a research article related to remote sensing and Earth observation using the Google Earth Engine (GEE). Here's a summary of the main points:

**Introduction**

* The article discusses the use of Sentinel-1 data from the GEE for change detection tasks.
* The authors highlight the advantages of Sentinel-1, including its high spatial resolution (up to 20 meters), short revisit times (every 6 days), and independence from solar illumination and cloud cover.

**Background**

* The article mentions that the GEE provides a powerful platform for processing and analyzing large datasets.
* However, it notes that only dual polarization multi-look intensity format data are available in the GEE archive, which limits the use of interferometric coherence methods.

**Methodology**

* The authors describe a sequential omnibus algorithm for change detection, which is based on complex Wishart statistics.
* The algorithm is designed to identify significant changes
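
The prompt used for this query places the question first and appends the 20 retrieved chunks directly after it. An alternative layout, sketched here as an assumption rather than a tested improvement, puts the chunks in a numbered context block and states the question at the end. The helper build_prompt and its wording are hypothetical.

```python
def build_prompt(query, docs):
    """Assemble a RAG prompt from a question and retrieved chunks
    (hypothetical helper; the wording is only one possible choice)."""
    # Number the chunks so they stay visually separated in the prompt
    context = '\n\n'.join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs))
    return (
        "Use only the context below to answer the question.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}\nAnswer:"
    )

# Stand-in chunks; in the notebook these would come from collection.query
docs = ["Chunk about SAR imagery.", "Chunk about the omnibus test."]
prompt = build_prompt("how does sequential sar change detection work?", docs)
print(prompt)
```

In the notebook, docs would be the list returned in collection.query(...)['documents'][0], and the resulting string would be passed to ollama.generate as prompt.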