### Trying out RAG with Ollama
Ollama is installed in the Python environment `venvcrc5`, from which this notebook is started.

In [1]:
!ollama --help

Large language model runner

Usage:
  ollama [flags]
  ollama [command]

Available Commands:
  serve       Start ollama
  create      Create a model from a Modelfile
  show        Show information for a model
  run         Run a model
  stop        Stop a running model
  pull        Pull a model from a registry
  push        Push a model to a registry
  list        List models
  ps          List running models
  cp          Copy a model
  rm          Remove a model
  help        Help about any command

Flags:
  -h, --help      help for ollama
  -v, --version   Show version information

Use "ollama [command] --help" for more information about a command.


#### Pull the embedding model used to embed the supplementary text and the query

In [2]:
!ollama pull nomic-embed-text

pulling manifest
pulling 970aa74c0a90... 100% ▕████████████████▏ 274 MB
pulling c71d239df917... 100% ▕████████████████▏  11 KB
pulling ce4a164fc046... 100% ▕████████████████▏   17 B
pulling 31df23ea7daa... 100% ▕████████████████▏  420 B
verifying sha256 digest
writing manifest

#### Pull llama3.1, one of the standard Ollama LLMs

In [3]:
!ollama pull llama3.1

pulling manifest
pulling 667b0c1932bc... 100% ▕████████████████▏ 4.9 GB
pulling 948af2743fc7... 100% ▕████████████████▏ 1.5 KB
pulling 0ba8f0e314b4... 100% ▕████████████████▏  12 KB
pulling 56bb8bd477a5... 100% ▕████████████████▏   96 B
pulling 455f34728c9b... 100% ▕████████████████▏  487 B
verifying sha256 digest
writing manifest
success


In [7]:
!ollama rm llama3.2

deleted 'llama3.2'


In [8]:
!ollama list

NAME                       ID              SIZE      MODIFIED      
llama3.1:latest            46e0c10c039e    4.9 GB    4 minutes ago    
nomic-embed-text:latest    0a109f422b47    274 MB    4 minutes ago    


In [9]:
from ollama import chat
from ollama import ChatResponse
from pypdf import PdfReader

In [10]:
def ask(question):
    """Send a single user message to llama3.1 and print the reply."""
    response: ChatResponse = chat(model='llama3.1', messages=[
      {
        'role': 'user',
        'content': str(question),
      },
    ])
    print(response.message.content)

In [24]:
ask('What is RAG (retrieval augmented generation)?')

RAG, or Retrieval Augmented Generation, is a recent technique that has been gaining attention in the natural language processing (NLP) and artificial intelligence communities. It's an approach to building more effective and efficient language models by combining retrieval-based methods with generative models.

**Traditional vs. RAG:**

In traditional generative models, such as transformers or sequence-to-sequence models, the model generates a response from scratch based on its internal knowledge. However, this can lead to limitations like:

1. Lack of contextual understanding
2. Insufficient domain-specific knowledge

To address these issues, RAG combines a retrieval module with a generative module.

**How RAG works:**

The Retrieval Module:

1. Takes the input query as a prompt.
2. Uses an index (e.g., a database or an external memory) to retrieve relevant documents or passages that are most relevant to the query.
3. The retrieved documents are then used as contextual information for 

#### PdfReader extracts the text of a PDF document so that it can be passed to the model

In [49]:
reader = PdfReader("/home/mort/LaTeX/new projects/CRC5/main.pdf")
# Extract the text of every page; join with newlines so that words at page
# boundaries do not run together.
all_text = "\n".join(page.extract_text() for page in reader.pages)
# Write as UTF-8, matching the encoding readtextfiles() uses when reading.
with open("/home/mort/temp/main.txt", "w", encoding="utf-8") as f:
    f.write(all_text)

#### Start a Chroma embedding database as a Docker container listening on port 8000

In [13]:
!docker images

REPOSITORY        TAG       IMAGE ID       CREATED       SIZE
chromadb/chroma   latest    1295eb7aaaed   5 days ago    469MB
mort/crc5docker   latest    9f6ec881b8c5   4 weeks ago   5GB


In [13]:
!docker ps -a

CONTAINER ID   IMAGE             COMMAND                  CREATED        STATUS                    PORTS     NAMES
47137d05119e   chromadb/chroma   "/docker_entrypoint.…"   22 hours ago   Exited (0) 19 hours ago             chromadb


In [12]:
!docker start chromadb

chromadb


#### Code for preprocessing the RAG supplementary text

In [17]:
import os
import re

import ollama  # needed by getembedding() below

def readtextfiles(path):
  text_contents = {}
  directory = os.path.join(path)

  for filename in os.listdir(directory):
    if filename.endswith(".txt"):
      file_path = os.path.join(directory, filename)

      with open(file_path, "r", encoding="utf-8") as file:
        content = file.read()

      text_contents[filename] = content

  return text_contents

def chunksplitter(text, chunk_size=100):
  words = re.findall(r'\S+', text)

  chunks = []
  current_chunk = []
  word_count = 0

  for word in words:
    current_chunk.append(word)
    word_count += 1

    if word_count >= chunk_size:
      chunks.append(' '.join(current_chunk))
      current_chunk = []
      word_count = 0

  if current_chunk:
    chunks.append(' '.join(current_chunk))

  return chunks

def getembedding(chunks):
  embeds = ollama.embed(model="nomic-embed-text", input=chunks)
  return embeds.get('embeddings', [])
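
With fixed, non-overlapping chunks as in `chunksplitter`, a sentence that straddles a chunk boundary is split between two embeddings. One common variation is to let consecutive chunks share a few words. The following is a minimal sketch of that idea, not part of the original notebook; the function name `chunksplitter_overlap` and the `overlap` parameter are assumptions made for illustration.

```python
import re


def chunksplitter_overlap(text, chunk_size=100, overlap=20):
    """Split text into chunks of up to chunk_size words, where each chunk
    repeats the last `overlap` words of the previous one.

    Hypothetical variant of chunksplitter; not part of the original notebook.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = re.findall(r'\S+', text)
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(' '.join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks


# Small demonstration with ten words, windows of four, one shared word.
sample = ' '.join(f"w{i}" for i in range(10))
for chunk in chunksplitter_overlap(sample, chunk_size=4, overlap=1):
    print(chunk)
```

With `overlap=0` this produces the same chunks as `chunksplitter`. Larger overlaps store more redundant text in the database, so whether the trade-off pays off depends on the documents and would need to be checked against real queries.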

#### Add the supplementary text to a database collection

In [54]:
import chromadb
import ollama

chromaclient = chromadb.HttpClient(host="localhost", port=8000)
textdocspath = "/home/mort/temp"
text_data = readtextfiles(textdocspath)

collection = chromaclient.get_or_create_collection(name="ragwithpython", metadata={"hnsw:space": "cosine"})

for filename, text in text_data.items():
  chunks = chunksplitter(text)
  embeds = getembedding(chunks)
  chunknumber = list(range(len(chunks)))
  ids = [filename + str(index) for index in chunknumber]
  metadatas = [{"source": filename} for index in chunknumber]
  collection.add(ids=ids, documents=chunks, embeddings=embeds, metadatas=metadatas)
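
The collection above is created with `"hnsw:space": "cosine"`, which asks Chroma to rank chunks by cosine distance rather than its default metric. As a rough illustration of what that measure means, here is a plain-Python sketch that assumes the usual definition, distance = 1 - cosine similarity. The helper `cosine_distance` is hypothetical, and Chroma's internal implementation may differ in details such as precision or normalization.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors.

    Illustrative helper only; not a Chroma or Ollama function.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


print(cosine_distance([1.0, 0.0], [2.0, 0.0]))   # same direction
print(cosine_distance([1.0, 0.0], [0.0, 3.0]))   # orthogonal
print(cosine_distance([1.0, 0.0], [-1.0, 0.0]))  # opposite direction
```

The measure depends only on the angle between the vectors and not on their length, so a smaller distance means a chunk whose embedding points in nearly the same direction as the query embedding.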


#### Execute a query with the supplementary text

In [55]:
query = "describe the iMAD change detection algorithm"

queryembed = ollama.embed(model="nomic-embed-text", input=query)['embeddings']

relateddocs = '\n\n'.join(collection.query(query_embeddings=queryembed, n_results=10)['documents'][0])
prompt = f"{query} - Answer that question using the following text as a resource: {relateddocs}"
ragoutput = ollama.generate(model="llama3.1", prompt=prompt, stream=False)

print(f"Answered with RAG: {ragoutput['response']}")

Answered with RAG: The iMAD (Iteratively Reweighted Multivariate Alteration Detection) change detection algorithm is a technique used to identify changes between two images by iteratively re-weighting the data based on the significance of the observed changes.

Here's an overview of how the iMAD algorithm works, as described in the provided text:

1. The algorithm starts with an initial iteration where the canonical correlations for each band are calculated.
2. In each subsequent iteration, the P-values from the previous iteration are used to weight each pixel before re-sampling to determine the means and covariance matrices for the next iteration.
3. This process continues until a stopping criterion is met, such as lack of significant change in the canonical correlations.

The algorithm uses the following steps:

1. Initialize the input data, which includes the two images to be compared and the number of iterations (maxitr).
2. Iterate over the list of integers from 1 to maxitr using 