<a href="https://colab.research.google.com/github/Zenith1618/LLM/blob/main/Intro_Notebook_to_LlamaIndex_and_RAG.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Installing Libraries and Setting up Env

In [11]:
%pip install llama-index-llms-azure-openai
%pip install llama-index-embeddings-azure-openai
%pip install llama-index

Collecting llama-index-embeddings-azure-openai
  Downloading llama_index_embeddings_azure_openai-0.2.3-py3-none-any.whl.metadata (746 bytes)
Downloading llama_index_embeddings_azure_openai-0.2.3-py3-none-any.whl (3.3 kB)
Installing collected packages: llama-index-embeddings-azure-openai
Successfully installed llama-index-embeddings-azure-openai-0.2.3


In [2]:
import os

os.environ["OPENAI_API_KEY"] = "<your-azure-openai-api-key>"  # placeholder: never commit a real key to a notebook
os.environ["AZURE_OPENAI_ENDPOINT"] = "https://openaillm69.openai.azure.com/"
os.environ["OPENAI_API_VERSION"] = "2024-05-01-preview"

In [9]:
api_key = os.environ.get("OPENAI_API_KEY")
azure_endpoint = os.environ.get("AZURE_OPENAI_ENDPOINT")
api_version = os.environ.get("OPENAI_API_VERSION")
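Because `os.environ.get` silently returns `None` for a missing variable, a misconfigured environment only surfaces later as an authentication error. The following is a minimal sketch of an early check; the helper name `load_azure_settings` is hypothetical, and only the three variable names come from the cells above.

```python
import os

# Variable names taken from the cells above; the helper itself is hypothetical.
REQUIRED_VARS = ("OPENAI_API_KEY", "AZURE_OPENAI_ENDPOINT", "OPENAI_API_VERSION")


def load_azure_settings(env=None):
    """Return the required settings, or raise if any are missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {name: env[name] for name in REQUIRED_VARS}
```

Calling `load_azure_settings()` once at the top of the notebook fails fast with the names of the missing variables instead of failing inside a later API call.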

In [12]:
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding

embed_model = AzureOpenAIEmbedding(
    model="text-embedding-ada-002",
    deployment_name="embedModel",
    api_key=api_key,
    azure_endpoint=azure_endpoint,
    api_version=api_version,
)

In [13]:
from llama_index.core import Settings

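# `llm` is created in the "Experiments with LLM" cell (In [3]), which was executed before this one.
Settings.llm = llm
Settings.embed_model = embed_model  # used to build the index and embed queries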

# Experiments with LLM

In [3]:
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.core.llms import ChatMessage

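# "ragModel" is the Azure deployment name; credentials come from the environment variables set earlier.
llm = AzureOpenAI(engine="ragModel")

message = [
    ChatMessage(role="system", content="You are an AI assistant to user."),
    ChatMessage(role="user", content="What is the revenue of uber in 2023?"),
]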

response = llm.chat(message)
print(response)

assistant: I'm sorry, but as an AI assistant, I cannot predict the future with certainty. However, according to Uber's financial reports, their revenue has been steadily increasing over the years. It is possible that their revenue will continue to grow in 2023, but it is impossible to provide an exact figure.


# Data Connector

A data connector in LlamaIndex is a component that converts data from various sources (PDFs, YouTube videos, audio files, web pages, SQL databases, docx files, etc.) into the Document format, making it ready for ingestion by LlamaIndex.

In [4]:
from pathlib import Path

# download_loader is deprecated; PDFReader is imported directly from llama-index-readers-file,
# which is installed with the llama-index package.
from llama_index.readers.file import PDFReader

loader = PDFReader()
documents = loader.load_data(file=Path("uber_2023.pdf"))


In [5]:
documents[8]

Document(id_='c69daed1-bf60-4614-aab8-b8e247e9c769', embedding=None, metadata={'page_label': 'i', 'file_name': 'uber_2023.pdf'}, excluded_embed_metadata_keys=[], excluded_llm_metadata_keys=[], relationships={}, text='Together, these elements power movement from point A to point B.\nMassive Network Our massive, efficient, and intelligent network consists of tens of millions of Drivers , consumers, \nMerchants , Shippers  and Carriers , as well as underlying data, technology, and shared infrastructure. \nOur network becomes smarter with every trip. In more than 10,000  cities around the world (as of \nDecember 31, 2023), our network powers movement at the touch of a button for millions, and we hope \neventually billions, of people.\nLeading Technology We have built proprietary marketplace, routing, and payments technologies. Marketplace technologies \nare the core of our deep technology advantage and include demand prediction, matching and \ndispatching, and pricing technologies. Our tec

In [6]:
len(documents)

148
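To make the idea of a data connector concrete, here is a minimal sketch of a loader for plain-text files. It is not how `PDFReader` is implemented: `SimpleDocument` and `load_text_pages` are hypothetical names, and the form-feed page separator is an assumption. The metadata keys mirror the `page_label` and `file_name` keys visible in the output above.

```python
import uuid
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class SimpleDocument:
    """Hypothetical stand-in for a LlamaIndex Document: text plus metadata."""
    text: str
    metadata: dict = field(default_factory=dict)
    id_: str = field(default_factory=lambda: str(uuid.uuid4()))


def load_text_pages(file: Path, page_sep: str = "\f") -> list[SimpleDocument]:
    """Split a plain-text file into one document per non-empty page."""
    raw = file.read_text(encoding="utf-8")
    pages = [page.strip() for page in raw.split(page_sep)]
    return [
        # Page labels count every page, so skipped empty pages leave gaps.
        SimpleDocument(text=page, metadata={"page_label": str(i), "file_name": file.name})
        for i, page in enumerate(pages, start=1)
        if page
    ]
```

Whatever the source, the result has the same shape: a list of documents, each carrying its text and enough metadata to trace an answer back to a page.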

# Core Components of LlamaIndex

1. Index: the "library" that stores your data.
2. Retriever: the "librarian" that finds the data relevant to a query.
3. Response Synthesizer: the "storyteller" that turns the retrieved data into a response.
4. QueryEngine: the "director" that coordinates the other components.
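The four roles above can be sketched in plain Python. This is a toy illustration, not LlamaIndex's implementation: every class name is hypothetical, the "embedding" is a bag-of-words count instead of a model, and the synthesizer only builds the prompt an LLM would receive.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (a real index uses an embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyIndex:
    """Index: stores chunks alongside their vectors."""
    def __init__(self, chunks):
        self.entries = [(chunk, embed(chunk)) for chunk in chunks]


class ToyRetriever:
    """Retriever: returns the top_k chunks most similar to the query."""
    def __init__(self, index, top_k=2):
        self.index, self.top_k = index, top_k

    def retrieve(self, query):
        q = embed(query)
        scored = [(cosine(q, vec), chunk) for chunk, vec in self.index.entries]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[: self.top_k]


class ToySynthesizer:
    """Response synthesizer: here it only assembles the prompt for an LLM."""
    def synthesize(self, query, scored_chunks):
        context = "\n".join(chunk for _, chunk in scored_chunks)
        return f"Context:\n{context}\n\nQuestion: {query}"


class ToyQueryEngine:
    """Query engine: wires the retriever and the synthesizer together."""
    def __init__(self, retriever, synthesizer):
        self.retriever, self.synthesizer = retriever, synthesizer

    def query(self, text):
        return self.synthesizer.synthesize(text, self.retriever.retrieve(text))


# Made-up sample chunks, used only to exercise the pipeline.
chunks = [
    "Uber connects drivers and riders in many cities.",
    "Paul Graham wrote essays about startups.",
    "Lyft offers rides to passengers.",
]
engine = ToyQueryEngine(ToyRetriever(ToyIndex(chunks), top_k=1), ToySynthesizer())
prompt = engine.query("Who wrote essays about startups?")
```

The division of labour is the point here: the index and retriever can be swapped without touching the synthesizer, which is what the later customization cells rely on.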

In [15]:
from llama_index.core.node_parser import SimpleNodeParser
from llama_index.core import VectorStoreIndex, Settings
from IPython.display import display, HTML

#create parser and parse the document
parser = SimpleNodeParser()
nodes = parser.get_nodes_from_documents(documents)

#build index
index = VectorStoreIndex(nodes)

#Construct Query Engine
query_engine = index.as_query_engine()

#Query the engine
response = query_engine.query("What is the revenue of uber in 2023?")

#print the response
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

## Adding one more document

In [16]:
from llama_index.core import SimpleDirectoryReader

reader = SimpleDirectoryReader(input_files=["paul_graham_essay.txt"])
documents = reader.load_data()

In [18]:
# Create new nodes
new_nodes = parser.get_nodes_from_documents(documents)

# Add nodes to existing index
index.insert_nodes(new_nodes)

#Construct Query Engine
query_engine = index.as_query_engine()

#Query the engine
response = query_engine.query("Why did paul graham start YC?")

#print the response
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

In [19]:
#Query the engine
response = query_engine.query("What did the author do growing up?")

#print the response
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

# Defining Retriever and Response Synthesizer (Customizing)

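We can also customize the nodes, the LLM parameters, the embedding model parameters, and more. Here the retriever and the response synthesizer are configured explicitly.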

In [25]:
from llama_index.core.retrievers import VectorIndexRetriever
from llama_index.core import get_response_synthesizer
from llama_index.core.query_engine import RetrieverQueryEngine

# define retriever
retriever = VectorIndexRetriever(
    index=index,
    similarity_top_k=3
)

# configure response synthesizer
response_synthesizer = get_response_synthesizer(
    response_mode="accumulate"
)

# assemble query engine
query_engine = RetrieverQueryEngine(
    retriever=retriever,
    response_synthesizer=response_synthesizer
)

#Query the engine
response = query_engine.query("What information do you have about zomato investment?")

#print the response
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
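The two settings used above can be illustrated with a small sketch. `similarity_top_k=3` keeps the three best-scoring nodes, and the `accumulate` response mode asks the question against each retrieved chunk separately and joins the answers. The function names, the `fake_llm` stand-in, and the output formatting are assumptions for illustration, not the library's code.

```python
def top_k(scored_nodes, k=3):
    """Keep the k highest-scoring (score, text) pairs."""
    return sorted(scored_nodes, key=lambda pair: pair[0], reverse=True)[:k]


def accumulate(query, nodes, answer_fn, separator="\n---\n"):
    """Ask the question against each chunk on its own, then join the answers."""
    answers = [answer_fn(query, text) for _, text in nodes]
    return separator.join(f"Response {i}: {a}" for i, a in enumerate(answers, start=1))


def fake_llm(query, context):
    # Hypothetical stand-in for an LLM call: it just echoes the context it was given.
    return f"Based on '{context}'"


# Made-up scores and chunk names.
scored = [(0.91, "chunk A"), (0.42, "chunk D"), (0.88, "chunk B"), (0.75, "chunk C")]
result = accumulate("example question", top_k(scored, k=3), fake_llm)
```

With this mode the number of LLM calls grows with `similarity_top_k`, which is worth keeping in mind when raising it.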

# Node Post Processor

After the retriever returns nodes, we may want to filter them by some criterion, such as certain keywords or a minimum similarity score. Node postprocessors do this.

In [33]:
from llama_index.core.postprocessor import SimilarityPostprocessor

# Filtering nodes with similarity score
node_post_processor = SimilarityPostprocessor(similarity_cutoff=0.7)

query_engine = index.as_query_engine(node_postprocessors = [node_post_processor])

#Query the engine
response = query_engine.query("What information do you have about zomato investment?")

#print the response
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
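As a rough sketch of what such filtering amounts to, the functions below drop nodes under a score cutoff and nodes that miss required keywords. The function names and the sample data are hypothetical; the library's postprocessors operate on node objects rather than plain tuples.

```python
def filter_by_similarity(scored_nodes, cutoff=0.7):
    """Keep only (score, text) pairs whose score reaches the cutoff."""
    return [(score, text) for score, text in scored_nodes if score >= cutoff]


def filter_by_keywords(scored_nodes, required=(), excluded=()):
    """Keep nodes containing every required keyword and none of the excluded ones."""
    kept = []
    for score, text in scored_nodes:
        lowered = text.lower()
        if all(word.lower() in lowered for word in required) and not any(
            word.lower() in lowered for word in excluded
        ):
            kept.append((score, text))
    return kept


# Made-up retrieval results.
retrieved = [
    (0.82, "Revenue grew year over year."),
    (0.65, "Revenue from delivery increased."),
    (0.74, "The board met in March."),
]
confident = filter_by_similarity(retrieved, cutoff=0.7)
on_topic = filter_by_keywords(confident, required=["revenue"])
```

A strict cutoff can remove every node, in which case the synthesizer has no context to answer from, so the threshold is worth tuning against real scores.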

# Querying over Multiple indices



## Router Query Engine

Example: we have two indices, a vector store index (for queries about specific context) and a summary index (formerly called a list index, for summarization). The router query engine decides which one to use for each query.

In [34]:
from llama_index.core import SummaryIndex

summary_index = SummaryIndex(new_nodes)
vector_index = VectorStoreIndex(new_nodes)


In [43]:
summary_query_engine = summary_index.as_query_engine(
    response_mode="tree_summarize",
    # use_async = True
)
vector_query_engine = vector_index.as_query_engine()

In [44]:
from llama_index.core.tools import QueryEngineTool

summary_tool = QueryEngineTool.from_defaults(
    query_engine = summary_query_engine,
    description = "Useful for summarization questions related to the Paul Graham essay"
)

vector_tool = QueryEngineTool.from_defaults(
    query_engine = vector_query_engine,
    description = "Useful for retrieving specific context from Paul Graham Essay"
)

In [45]:
from llama_index.core.query_engine import RouterQueryEngine
from llama_index.core.selectors import LLMSingleSelector

query_engine = RouterQueryEngine(
    selector = LLMSingleSelector.from_defaults(),
    query_engine_tools = [summary_tool, vector_tool]
)

In [46]:
#Query the engine
response = query_engine.query("What is the summary of the document?")

#print the response
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

In [47]:
#Query the engine
response = query_engine.query("Why did paul graham start YC.")

#print the response
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))
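Routing can be pictured as "pick one tool from its description, then forward the query". The sketch below replaces the LLM-based choice with a crude word-overlap score, so it is a stand-in for illustration only. `Tool`, `select_tool`, and `route` are hypothetical names, and the descriptions are adapted from the ones above to suit the toy scorer.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class Tool:
    name: str
    description: str
    run: Callable[[str], str]


def _words(text):
    return set(re.findall(r"[a-z]+", text.lower()))


def select_tool(query, tools):
    """Stand-in selector: pick the tool whose description shares the most words
    with the query. LLMSingleSelector asks an LLM to make this choice instead."""
    return max(tools, key=lambda tool: len(_words(query) & _words(tool.description)))


tools = [
    Tool("summary", "Useful for summarization questions, summary of the whole document",
         lambda q: "summary engine"),
    Tool("vector", "Useful for retrieving specific context, why or what someone did",
         lambda q: "vector engine"),
]


def route(query):
    """Forward the query to whichever tool the selector picked."""
    return select_tool(query, tools).run(query)
```

Either way, the quality of the routing depends on how well the tool descriptions separate the two kinds of question.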

## Subquestion Query Engine

Example: compare the revenue growth of Uber and Lyft in 2023.<br>
Here we have two different documents, one per company, so we need to get the revenue from each and combine the answers.

In [65]:
from llama_index.core.query_engine import SubQuestionQueryEngine

lyft_docs = SimpleDirectoryReader(input_files=["lyft_2023.pdf"]).load_data()
uber_docs = SimpleDirectoryReader(input_files=["uber_2023.pdf"]).load_data()

In [66]:
# Create indices: if you don't need to work with nodes, you can build an index directly from documents

lyft_index = VectorStoreIndex.from_documents(lyft_docs)
uber_index = VectorStoreIndex.from_documents(uber_docs)

### Basic QA

In [67]:
lyft_engine = lyft_index.as_query_engine(similarity_top_k = 3)
uber_engine = uber_index.as_query_engine(similarity_top_k = 3)

In [68]:
response = await uber_engine.aquery("What is the revenue of uber in 2023?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

In [69]:
response = lyft_engine.query("What is the revenue of lyft in 2023?")
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

In [70]:
from llama_index.core.tools import QueryEngineTool, ToolMetadata

query_engine_tools = [
    QueryEngineTool(
        query_engine = lyft_engine,
        metadata = ToolMetadata(name = "lyft_10k", description = "Provides information about Lyft")
    ),
    QueryEngineTool(
        query_engine = uber_engine,
        metadata = ToolMetadata(name = "uber_10k", description = "Provides information about Uber")
    )
]

s_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    use_async=True,
)

In [83]:
response = await s_engine.aquery('Compare the growth of revenue of uber and lyft from 2021 to 2023')
display(HTML(f'<p style="font-size:20px">{response.response}</p>'))

Generated 4 sub questions.
[uber_10k] Q: What is the revenue of Uber in 2021?
[uber_10k] Q: What is the revenue of Uber in 2023?
[lyft_10k] Q: What is the revenue of Lyft in 2021?
[lyft_10k] Q: What is the revenue of Lyft in 2023?
[lyft_10k] A: Lyft's revenue in 2021 was $3,208,323 thousand.
[uber_10k] A: The revenue of Uber in 2023 was $37.3 billion, which represents a 17% increase compared to the previous year.
[uber_10k] A: The revenue of Uber in 2021 was $17,455 million.
[lyft_10k] A: Lyft's revenue in 2023 was $4,403,589 thousand.

The comparison across the two documents is done for us, but internally a semantic search still runs in the backend: each sub-question is posed to its respective query engine, and the answers are then combined into the final response.
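The flow described above (split into sub-questions, answer each with its own engine, combine) can be sketched without an LLM. In this toy version the "engines" are lookups into the four revenue figures printed in the log above, converted to millions of USD; `answer` and `compare_growth` are hypothetical names, and `asyncio.gather` stands in for what `use_async=True` suggests, namely that the sub-questions run concurrently.

```python
import asyncio

# Revenue in millions of USD, copied from the sub-question answers in the log above.
REVENUE = {
    "uber_10k": {2021: 17_455.0, 2023: 37_300.0},
    "lyft_10k": {2021: 3_208.323, 2023: 4_403.589},
}


async def answer(tool_name, year):
    """Hypothetical stand-in for one query engine answering one sub-question."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return tool_name, year, REVENUE[tool_name][year]


async def compare_growth(start=2021, end=2023):
    """Answer all sub-questions concurrently, then combine them into growth rates (%)."""
    sub_questions = [(tool, year) for tool in REVENUE for year in (start, end)]
    results = await asyncio.gather(*(answer(tool, year) for tool, year in sub_questions))
    by_tool = {}
    for tool, year, value in results:
        by_tool.setdefault(tool, {})[year] = value
    return {
        tool: (values[end] - values[start]) / values[start] * 100
        for tool, values in by_tool.items()
    }


# In a script; inside a notebook cell use `growth = await compare_growth()` instead.
growth = asyncio.run(compare_growth())
```

From the logged figures this works out to roughly 113.7% revenue growth for Uber and 37.3% for Lyft between 2021 and 2023, which is the kind of combined answer the final synthesis step has to produce from the four partial ones.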