<a href="https://colab.research.google.com/github/RDGopal/IB9CW0-Text-Analytics/blob/main/day_eight_rag_query_modes.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# RAG with LlamaIndex: Query modes
In this tutorial we compare the response modes of a LlamaIndex query engine (`compact`, `refine` and `tree_summarize`) to see how they affect the generated answers. As before, we start with some installs:

In [None]:
%%capture
# Install Ollama v0.1.30
!curl https://ollama.ai/install.sh | sed 's#https://ollama.ai/download#https://github.com/jmorganca/ollama/releases/download/v0.1.30#' | sh

In [None]:
%%capture
# Setup the model as a global variable
OLLAMA_MODEL='phi:latest'

# Add the model to the environment of the operating system
import os
os.environ['OLLAMA_MODEL'] = OLLAMA_MODEL
!echo $OLLAMA_MODEL # print the global variable to check it saved

import subprocess
import time

# Start ollama on the server ("serve")
command = "nohup ollama serve&" # "nohup" and "&" means run in the background

# Use subprocess.Popen to run the command
process = subprocess.Popen(command,
                            shell=True,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)

time.sleep(5)  # Makes Python wait for 5 seconds

# Install prerequisites
!pip install llama-index-embeddings-huggingface
!pip install llama-index-llms-ollama
!pip install llama-index ipywidgets
!pip install llama-index-llms-huggingface
!pip install llama-index-readers-web
!pip install llama-index-vector-stores-chroma
!pip install chromadb

# Import required modules from the llama_index library
from llama_index.core import VectorStoreIndex, SummaryIndex, SimpleDirectoryReader
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings
from llama_index.llms.ollama import Ollama
from llama_index.core import StorageContext

# Import ChromaVectorStore and chromadb module
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

# Use the global variable (OLLAMA_MODEL) as our LLM (Ollama was imported above)
# Allow up to 8 minutes (480 seconds) per request, since CPU-only runtimes are slow
llm = Ollama(model=OLLAMA_MODEL, request_timeout=480.0)
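
The fixed five-second pause in the cell above is a guess at how long the server needs to start. A minimal sketch of a polling alternative follows. It assumes the server listens on Ollama's default address, `http://localhost:11434`; the helper names `wait_until_ready` and `ollama_is_up` are hypothetical and not part of any library.

```python
import time
import urllib.error
import urllib.request


def wait_until_ready(probe, timeout=30.0, interval=0.5):
    """Call probe() repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # wait a little before trying again


def ollama_is_up(url="http://localhost:11434"):
    """Return True if an HTTP server answers at `url` (assumed Ollama default)."""
    try:
        with urllib.request.urlopen(url, timeout=2) as reply:
            return reply.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, etc.: the server is not ready yet
        return False
```

In the notebook, `wait_until_ready(ollama_is_up)` could stand in for `time.sleep(5)`: it returns as soon as the server answers and gives up with `False` after the timeout.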

In [None]:
# Query the model via the command line
# The first run will "pull" (download) the model
!ollama run $OLLAMA_MODEL "Tell me a joke"

We now have a generator, but we still need a retriever to augment its results. First we set up the embedding model, then we extract some text from a web page:

In [None]:
# Initialize a HuggingFace Embedding model
embed_model = HuggingFaceEmbedding(model_name="BAAI/bge-small-en-v1.5")

# Specify the LLM and embedding model into LlamaIndex's settings
Settings.llm = llm
Settings.embed_model = embed_model

In [None]:
from llama_index.readers.web import BeautifulSoupWebReader

# web page
url = "https://www.assetfinanceinternational.com/index.php/technology/technology-archive/technology-articles/22703-warwick-business-school-launches-3m-gillmore-centre-for-financial-technology"

# use BeautifulSoup to parse the HTML data
documents = BeautifulSoupWebReader().load_data([url])
documents = documents[0].text

# remove "\n" (newline) and "\t" (tab) characters
documents = documents.replace("\n", "")
documents = documents.replace("\t", "")

# print to screen
print("Cleaned text")
print(documents)

Cleaned text
Warwick Business School launches £3m Gillmore Centre for Financial TechnologyContactSign upPrivacySearch ...   Toggle NavigationHomeAuto financeFleet financeEquipment financeMarket DataRegulationTechnologyPeopleJobsWhite Papers InnovationDigitalisationDigital events  Warwick Business School launches £3m Gillmore Centre for Financial Technology{JLinkedShare} Written by Lisa Laverick Published: 28 July 2023Created: 28 July 2023Warwick Business School has launched the Gillmore Centre for Financial Technology, backed by £3 million funding, to spearhead cutting-edge research and innovation for the UK’s financial and technology sectors.The new Centre will deliver GillmoreGPT, a unique index of FinTech research, a Crypto Index that tracks all crypto prices charted against inflation, mobile and platform based fintech solutions, immersive technologies for financial literacy, as well as leading research on AI development, machine learning and fraud.Minister for tech and the digital 

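Next we will use a basic form of chunking, without LlamaIndex, that splits the text wherever a full stop is followed by a space. Because the newlines were removed rather than replaced with spaces, many sentence boundaries in the cleaned text have no space after the full stop, so this produces only a handful of fairly large chunks. After splitting, we write each chunk to its own file on disk: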

In [None]:
split_docs = documents.split(". ")

In [None]:
!mkdir -p '/content/data/' # create the "data" directory if it does not already exist

count = 0

for doc in split_docs: # iterate through the results
  fname = "/content/data/Output" + str(count) + ".txt"
  with open(fname, "w") as text_file:
    text_file.write(doc) # save the file
  count += 1 # increment the count
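
One side effect of `str.split(". ")` is that the separator is discarded, so every chunk except the last loses its closing full stop. The sketch below shows this on a few made-up sample sentences (they are illustrative, not taken from the scraped page) and, as one possible alternative, a regular-expression split that keeps the punctuation.

```python
import re

# Made-up sample text for illustration only
sample = "The Centre launched in 2023. It has £3 million in funding. More fellows will join."

# Plain str.split drops the ". " separator from every chunk except the last
naive_chunks = sample.split(". ")

# A lookbehind split breaks on the whitespace that follows a full stop,
# so each chunk keeps its own closing punctuation
kept_chunks = re.split(r"(?<=\.)\s+", sample)

for naive, kept in zip(naive_chunks, kept_chunks):
    print(repr(naive), "vs", repr(kept))
```

Neither version copes with abbreviations such as "Dr. " or with full stops that have no space after them, which is one reason to use a dedicated text splitter for real documents.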

As before, we can save this in a vector database:

In [None]:
# Import ChromaVectorStore and chromadb module
from llama_index.vector_stores.chroma import ChromaVectorStore
import chromadb

# Load documents
reader = SimpleDirectoryReader("/content/data") # load documents from the /content/data folder
docs = reader.load_data()
print(f"Loaded {len(docs)} docs")

# Create client ("db") and a database ("chroma_db")
db = chromadb.PersistentClient(path="./chroma_db")

# Create a collection/table ("another-day-another-demo") in the db
chroma_collection = db.create_collection("another-day-another-demo")

# Set up ChromaVectorStore and load in data
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
# Specify Chroma as our vector db
storage_context = StorageContext.from_defaults(vector_store=vector_store)

# Create the vector index
vector_index = VectorStoreIndex.from_documents(
    docs, # the documents loaded from disk above
    storage_context = storage_context,
    embed_model = embed_model
)

# Print the metadata
print(chroma_collection)

# Print the name of the collection (table)
print(f'Collection name is: {chroma_collection.name}')

Loaded 5 docs
name='another-day-another-demo' id=UUID('986ece56-109d-4175-a623-2a899495c32a') metadata=None tenant='default_tenant' database='default_database'
Collection name is: another-day-another-demo


## Compact Queries

In [None]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("How much was invested in the Gillmore Centre?")

response.response

" The Gillmore Centre has received a funding of $3 million to spearhead cutting-edge research and innovation for the UK's financial and technology sectors.\n"

## Refine Queries

In [None]:
query_engine = vector_index.as_query_engine(response_mode="refine")

response = query_engine.query("How much was invested in the Gillmore Centre?")

response.response

" In the context of funding allocation for the Gillmore Centre's research on AI (A), machine learning (M), and fraud detection (F), we can say that $8 million will be allocated to AI development and  $4 million each for machine learning and fraud detection, given that the total budget is set at $12 million. This solution adheres to the conditions provided in the puzzle, demonstrating a direct proof. It also shows how inductive logic is used when forming hypotheses based on existing data (AI's funding being twice of Fraud Detection) to reach conclusions about new situations (allocation of funds). The tree of thought reasoning is also evident in this problem as each decision leads to subsequent decisions until the final allocation is achieved.\n"

## Tree Summarise Queries

In [None]:
query_engine = vector_index.as_query_engine(response_mode="tree_summarize")

response = query_engine.query("How much was invested in the Gillmore Centre?")

response.response

' The Gillmore Centre was launched with a funding of £3 million.\n'
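
The three response modes used above differ in how the retrieved chunks are fed to the LLM. As LlamaIndex's documentation describes them, `refine` makes one LLM call per retrieved chunk and passes the previous answer along each time, `compact` first packs as many chunks as fit into each prompt and then refines over those packs, and `tree_summarize` answers over groups of chunks and then recursively over the resulting answers until one remains. The toy below sketches only these call patterns. `fake_llm`, `max_chars` and `fan_in` are hypothetical stand-ins; the real library uses its own prompts and packs by token count.

```python
def fake_llm(prompt, log):
    """Stand-in for a real LLM call: records the prompt and returns a stub answer."""
    log.append(prompt)
    return f"answer#{len(log)}"


def refine(chunks, question, log):
    """One call per chunk; each call also sees the previous answer."""
    answer = None
    for chunk in chunks:
        answer = fake_llm(f"Q: {question}\nContext: {chunk}\nPrevious: {answer}", log)
    return answer


def compact(chunks, question, log, max_chars=80):
    """Pack chunks into as few prompts as fit, then refine over the packs."""
    packs, current = [], ""
    for chunk in chunks:
        if current and len(current) + len(chunk) + 1 > max_chars:
            packs.append(current)  # current pack is full, start a new one
            current = chunk
        else:
            current = f"{current} {chunk}".strip()
    if current:
        packs.append(current)
    return refine(packs, question, log)


def tree_summarize(chunks, question, log, fan_in=2):
    """Answer over groups of chunks, then over those answers, until one is left."""
    level = list(chunks)
    while True:
        level = [
            fake_llm(f"Q: {question}\nContext: {' '.join(level[i:i + fan_in])}", log)
            for i in range(0, len(level), fan_in)
        ]
        if len(level) == 1:
            return level[0]


# Compare how many LLM calls each pattern makes for four 30-character chunks
toy_chunks = ["a" * 30, "b" * 30, "c" * 30, "d" * 30]
for mode in (refine, compact, tree_summarize):
    call_log = []
    mode(toy_chunks, "How much was invested?", call_log)
    print(f"{mode.__name__}: {len(call_log)} LLM call(s)")
```

The call count matters on a CPU-only runtime, where each call to the model can take a long time.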

## Comparison with a Different Type of Query

In [None]:
query_engine = vector_index.as_query_engine(response_mode="compact")

response = query_engine.query("Who would know most about cloud-based analytics?")

response.response

' It is difficult to determine who would know the most about cloud-based analytics without additional context or information. However, based on the given context information, it seems that the Centre has plans to hire a number of top research fellows by the end of the year to expand its expertise in financial technology, which includes cloud computing and data analytics. The Centre also features state-of-the-art computing facilities with advanced wall screens and virtual reality capabilities, as well as access to AI and data analytics software. \n'

In [None]:
query_engine = vector_index.as_query_engine(response_mode="refine")

response = query_engine.query("Who would know most about cloud-based analytics?")

response.response

' The answer can vary depending on the context and additional information provided. If Michael Mortenson has published papers or presented at conferences on cloud computing and data analytics, then he would have more expertise in this area compared to Amit Choudhary who specializes in machine learning but not necessarily cloud-based analytics. However, if both of them have similar areas of research and experience, then they could be considered as equally knowledgeable in cloud-based analytics. Without any specific information about their respective fields of research and experience, it is difficult to determine who would know the most about cloud-based analytics.\n'

In [None]:
query_engine = vector_index.as_query_engine(response_mode="tree_summarize")

response = query_engine.query("Who would know most about cloud-based analytics?")

response.response

' Michael Mortenson, an expert in cloud computing and data analytics who has joined the Gillmore Centre as a research fellow, would be the best resource for information on cloud-based analytics.\n'

Finally, let's check which documents were used in the last generation:

In [None]:
response.metadata

{'5ae52443-43b4-4f45-bfb2-7e4379f53241': {'file_path': '/content/data/Output2.txt',
  'file_name': 'Output2.txt',
  'file_type': 'text/plain',
  'file_size': 1199,
  'creation_date': '2024-05-06',
  'last_modified_date': '2024-05-06'},
 '24974fff-1b82-427c-a5a2-985b9240cac5': {'file_path': '/content/data/Output3.txt',
  'file_name': 'Output3.txt',
  'file_type': 'text/plain',
  'file_size': 911,
  'creation_date': '2024-05-06',
  'last_modified_date': '2024-05-06'}}
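
The metadata dictionary is keyed by node ID, which is hard to read at a glance. A small sketch of how it could be reduced to a list of source files follows. The helper name `source_files` is hypothetical, and the dictionary is copied from the output above and trimmed to two fields so that the example runs on its own.

```python
# Copied from the output above, trimmed to the fields used here
metadata = {
    "5ae52443-43b4-4f45-bfb2-7e4379f53241": {"file_name": "Output2.txt", "file_size": 1199},
    "24974fff-1b82-427c-a5a2-985b9240cac5": {"file_name": "Output3.txt", "file_size": 911},
}


def source_files(metadata):
    """Return the sorted, de-duplicated file names behind a response."""
    return sorted({info["file_name"] for info in metadata.values()})


print(source_files(metadata))
```

In the notebook, the same helper could be called as `source_files(response.metadata)`.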

And that's it! Well done 👍