# **Build and Evaluate LLM Applications Using Advanced RAG (with the LlamaIndex and TruLens Frameworks)**

In this notebook, we show how to use the Gemini text models from Google in LlamaIndex. Check out the Gemini site or the announcement.

If you're opening this notebook on Colab, you will need to install LlamaIndex 🦙 and the Gemini Python SDK.

### [Optional] Restart the Colab Enterprise notebook runtime on Google Cloud

In [None]:
import os
os.kill(os.getpid(), 9)

### Install the required pip packages for this Notebook

In [None]:
!pip install pypdf cohere llama-index==0.9.48 google-generativeai trulens_eval==0.22.1 litellm==1.23.10 torch sentence-transformers

### Set the Environment variables for API Keys

You will need to get an API key from [Google AI Studio](https://makersuite.google.com/app/apikey). Once you have one, you can either pass it explicitly to the model or set the GOOGLE_API_KEY environment variable.


In [1]:
%env GOOGLE_API_KEY=...
%env GEMINI_API_KEY=...

env: GOOGLE_API_KEY=...
env: GEMINI_API_KEY=...


In [2]:
import os

GOOGLE_API_KEY = ""  # add your GOOGLE API key here
GEMINI_API_KEY = ""  # add your GEMINI API key here
OPEN_API_KEY = "" # add your OPEN API key here
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY
os.environ["GEMINI_API_KEY"] = GEMINI_API_KEY
os.environ["OPEN_API_KEY"] = OPEN_API_KEY
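Keys that are left empty usually only surface later as authentication errors. As a minimal sketch (the helper name `missing_keys` is hypothetical and not part of any library used here), we can check up front which keys are still unset:

```python
import os


def missing_keys(names, env=None):
    """Return the names whose value is unset or blank in the given mapping.

    `env` defaults to os.environ; passing a plain dict makes the check testable.
    """
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


# Keys this notebook sets above; adjust the list to the providers you use.
required = ["GOOGLE_API_KEY", "GEMINI_API_KEY"]
print("Missing keys:", missing_keys(required))
```

An empty list means every required key has a non-blank value.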

### Work around a typing_extensions incompatibility with the library versions used in this notebook

In [3]:
%%writefile /usr/local/lib/python3.10/dist-packages/openai/_utils/_streams.py
from typing import Any
from typing_extensions import AsyncIterator
from typing import Iterator # import Iterator from the correct library

def consume_sync_iterator(iterator: Iterator[Any]) -> None:
    for _ in iterator:
        ...

async def consume_async_iterator(iterator: AsyncIterator[Any]) -> None:
    async for _ in iterator:
        ...

Overwriting /usr/local/lib/python3.10/dist-packages/openai/_utils/_streams.py


### An example to use the Gemini text models from Google in LlamaIndex

In [4]:
from llama_index.llms import Gemini


resp = Gemini().complete("Write a poem about a magic backpack")
print(resp)

**The Magic Backpack**

In a realm where wonders reside,
A backpack with secrets it hides.
Its fabric gleams with an ethereal hue,
A portal to realms both old and new.

With a whisper, it opens wide,
Revealing treasures it cannot hide.
Books that whisper ancient lore,
Maps that guide to distant shores.

A compass points to hidden trails,
Where adventure's call never fails.
A flashlight illuminates the dark,
Guiding through shadows, leaving its mark.

A first aid kit, for wounds it heals,
A blanket that comforts, a warmth it feels.
A water bottle, quenching thirst,
A snack to replenish, at its best.

But its greatest power lies within,
A space where dreams can begin.
A place to store hopes and fears,
A sanctuary where laughter cheers.

With every step, it carries more,
A weightless burden, a precious store.
A symbol of wonder, a friend so true,
The magic backpack, forever new.

So let us wander, hand in hand,
With this backpack, a treasure planned.
For in its depths, we'll find our way,

### Authenticate to Google Cloud Storage, where the artifacts and documents for this lab are stored

In [4]:
from google.colab import auth

auth.authenticate_user()

### Download the document used for this exercise from the Google Cloud Storage

In [5]:
!gsutil cp gs://machine-learning-gemini/eBook-How-to-Build-a-Career-in-AI.pdf .

Copying gs://machine-learning-gemini/eBook-How-to-Build-a-Career-in-AI.pdf...
/ [1 files][  3.6 MiB/  3.6 MiB]
Operation completed over 1 objects/3.6 MiB.                                      


# Basic RAG Pipeline
In this example, we ingest the PDF, split it into smaller chunks, create embeddings with our preferred model, and index them using VectorStoreIndex.
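To make the chunk, embed, and retrieve steps concrete before using LlamaIndex, here is a toy sketch that relies only on the standard library. It is an illustration under stated assumptions, not how VectorStoreIndex works internally: the word-count "embedding" stands in for a real embedding model, and the function names (`chunk_text`, `embed`, `cosine`, `retrieve`) are hypothetical.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=8):
    """Split text into chunks of at most `chunk_size` words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Toy embedding (assumption): lowercase word counts instead of a neural model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, top_k=2):
    """Return the `top_k` chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, embed(c)), reverse=True)
    return ranked[:top_k]


text = (
    "Networking helps you find mentors and jobs. "
    "Good habits in sleep and exercise support long careers. "
    "Projects build experience and a portfolio."
)
chunks = chunk_text(text, chunk_size=8)
print(retrieve("How do projects build experience?", chunks, top_k=1))
```

In the real pipeline below, the retrieved chunks are passed to the LLM as context for answering the query.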

In [6]:
from llama_index import SimpleDirectoryReader

documents = SimpleDirectoryReader(
    input_files=["eBook-How-to-Build-a-Career-in-AI.pdf"]
).load_data()

In [7]:
print(type(documents), "\n")
print(len(documents), "\n")
print(type(documents[0]))
print(documents[0])
print(documents[1])

<class 'list'> 

41 

<class 'llama_index.schema.Document'>
Doc ID: b92643de-e93d-4aef-bd28-def6edd2f1e6
Text: PAGE 1Founder, DeepLearning.AICollected Insights from Andrew Ng
How to  Build Your Career in AIA Simple Guide
Doc ID: ef7f9dbf-f981-4c6f-87fa-f88ff4b1abc8
Text: PAGE 2"AI is the new  electricity. It will  transform and
improve  all areas of human life." Andrew Ng


In [8]:
from llama_index import Document

document = Document(text="\n\n".join([doc.text for doc in documents]))

In [10]:
from llama_index import VectorStoreIndex
from llama_index import ServiceContext
from llama_index.llms import Gemini


llm = Gemini(model="models/gemini-pro", temperature=0.1)

# Create a service context using the Gemini LLM and a local embedding model
service_context = ServiceContext.from_defaults(
    llm=llm, embed_model="local:BAAI/bge-small-en-v1.5"
)
index = VectorStoreIndex.from_documents([document],
                                        service_context=service_context)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [11]:
query_engine = index.as_query_engine()

### Run a sample query against the index to retrieve the relevant documents

In [13]:
response = query_engine.query(
    "What are steps to take when finding projects to build your experience?"
)
print(str(response))

1. Join existing projects.
2. Keep reading and talking to people.
3. Focus on an application area.
4. Develop a side hustle.


In [12]:
!gsutil cp gs://machine-learning-gemini/eval_questions.txt .

Copying gs://machine-learning-gemini/eval_questions.txt...
/ [1 files][  517.0 B/  517.0 B]                                                
Operation completed over 1 objects/517.0 B.                                      


## Here are some sample questions to help us with the evaluation and benchmarking of the RAG pipelines

In [9]:
eval_questions = []
with open('eval_questions.txt', 'r') as file:
    for line in file:
        # Strip surrounding whitespace, including the trailing newline
        item = line.strip()
        print(item)
        eval_questions.append(item)

What are the keys to building a career in AI?
How can teamwork contribute to success in AI?
What is the importance of networking in AI?
What are some good habits to develop for a successful career?
How can altruism be beneficial in building a career?
What is imposter syndrome and how does it relate to AI?
Who are some accomplished individuals who have experienced imposter syndrome?
What is the first step to becoming good at AI?
What are some common challenges in AI?
Is it normal to find parts of AI challenging?


In [10]:
# You can try your own question:
new_question = "What is the right AI job for me?"
eval_questions.append(new_question)

In [11]:
print(eval_questions)

['What are the keys to building a career in AI?', 'How can teamwork contribute to success in AI?', 'What is the importance of networking in AI?', 'What are some good habits to develop for a successful career?', 'How can altruism be beneficial in building a career?', 'What is imposter syndrome and how does it relate to AI?', 'Who are some accomplished individuals who have experienced imposter syndrome?', 'What is the first step to becoming good at AI?', 'What are some common challenges in AI?', 'Is it normal to find parts of AI challenging?', 'What is the right AI job for me?']


Set up TruLens Feedback functions and dashboard for Evaluation and Benchmarking query results.

TruLens helps you objectively measure the quality and effectiveness of your LLM-based applications using feedback functions. Feedback functions programmatically evaluate the quality of inputs, outputs, and intermediate results, so that you can expedite and scale up experiment evaluation. Use them for a wide variety of use cases, including question answering, summarization, retrieval-augmented generation, and agent-based applications.
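Conceptually, a feedback function takes parts of a record (for example the answer and the retrieved context) and returns a score between 0 and 1. The sketch below shows only that shape. It is not TruLens code: the token-overlap scoring is a deliberately crude stand-in for the model-based scoring a real feedback provider would use, and the names `overlap_groundedness` and `mean_score` are hypothetical.

```python
import re


def _tokens(text):
    """Lowercase alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_groundedness(answer, context):
    """Toy groundedness: fraction of answer tokens that also occur in the context."""
    answer_tokens = _tokens(answer)
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & _tokens(context)) / len(answer_tokens)


def mean_score(records, feedback_fn):
    """Average a feedback function over records with 'output' and 'context' keys."""
    scores = [feedback_fn(r["output"], r["context"]) for r in records]
    return sum(scores) / len(scores) if scores else 0.0


records = [
    {"output": "Networking helps", "context": "Networking helps you find jobs."},
    {"output": "Cats purr", "context": "Networking helps you find jobs."},
]
print(mean_score(records, overlap_groundedness))
```

Averaging per-record scores like this is what lets us compare different pipelines on the same set of evaluation questions.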

In [12]:
from IPython.display import JSON
from trulens_eval import Tru, Feedback

tru = Tru()

tru.reset_database()

🦑 Tru initialized with db url sqlite:///default.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of `Tru` to prevent this.


Import the custom utils.py module. It contains helper functions that set up the retrievers and feedback functions so that this tutorial runs smoothly.

In [13]:
!rm -f utils.py
!ls -ltr utils.py
!gsutil cp gs://machine-learning-gemini/utils.py .
!pwd


ls: cannot access 'utils.py': No such file or directory
Copying gs://machine-learning-gemini/utils.py...
/ [1 files][  5.5 KiB/  5.5 KiB]                                                
Operation completed over 1 objects/5.5 KiB.                                      
/content


In [14]:
import importlib
importlib.import_module('utils')



✅ In Answer Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Answer Relevance, input response will be set to __record__.main_output or `Select.RecordOutput` .
✅ In Context Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Context Relevance, input response will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input source will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input statement will be set to __record__.main_output or `Select.RecordOutput` .


<module 'utils' from '/content/utils.py'>

In [19]:
from utils import get_prebuilt_trulens_recorder

tru_recorder = get_prebuilt_trulens_recorder(query_engine,
                                             app_id="Direct Query Engine")

In [20]:
with tru_recorder as recording:
    for question in eval_questions:
        response = query_engine.query(question)

In [21]:
records, feedback = tru.get_records_and_feedback(app_ids=[])

In [22]:
records.head()

Unnamed: 0,app_id,app_json,type,record_id,input,output,tags,record_json,cost_json,perf_json,ts,Answer Relevance,Groundedness,Context Relevance,Answer Relevance_calls,Groundedness_calls,Context Relevance_calls,latency,total_tokens,total_cost
0,Direct Query Engine,"{""tru_class_info"": {""name"": ""TruLlama"", ""modul...",RetrieverQueryEngine(llama_index.query_engine....,record_hash_96ccbb406a23c9e6455f0f847d9cc09a,"""What are the keys to building a career in AI?""","""The keys to building a career in AI are learn...",-,"{""record_id"": ""record_hash_96ccbb406a23c9e6455...","{""n_requests"": 0, ""n_successful_requests"": 0, ...","{""start_time"": ""2024-02-14T23:10:56.134866"", ""...",2024-02-14T23:11:00.444236,0.5,1.0,0.95,[{'args': {'prompt': 'What are the keys to bui...,"[{'args': {'source': 'PAGE 1Founder, DeepLearn...",[{'args': {'prompt': 'What are the keys to bui...,4,0,0.0
1,Direct Query Engine,"{""tru_class_info"": {""name"": ""TruLlama"", ""modul...",RetrieverQueryEngine(llama_index.query_engine....,record_hash_6b95110b3efae8afb0258bddda2e44f1,"""How can teamwork contribute to success in AI?""","""Teamwork can contribute to success in AI by a...",-,"{""record_id"": ""record_hash_6b95110b3efae8afb02...","{""n_requests"": 0, ""n_successful_requests"": 0, ...","{""start_time"": ""2024-02-14T23:11:02.566360"", ""...",2024-02-14T23:11:09.651930,0.9,0.7,0.4,[{'args': {'prompt': 'How can teamwork contrib...,[{'args': {'source': 'Hopefully the previous c...,[{'args': {'prompt': 'How can teamwork contrib...,7,0,0.0
2,Direct Query Engine,"{""tru_class_info"": {""name"": ""TruLlama"", ""modul...",RetrieverQueryEngine(llama_index.query_engine....,record_hash_3d13e9b50c819471931b5f46d2f90d83,"""What is the importance of networking in AI?""","""Networking is important in AI because it can ...",-,"{""record_id"": ""record_hash_3d13e9b50c819471931...","{""n_requests"": 0, ""n_successful_requests"": 0, ...","{""start_time"": ""2024-02-14T23:11:11.212540"", ""...",2024-02-14T23:11:14.749454,0.3,1.0,0.45,[{'args': {'prompt': 'What is the importance o...,[{'args': {'source': 'Hopefully the previous c...,[{'args': {'prompt': 'What is the importance o...,3,0,0.0
3,Direct Query Engine,"{""tru_class_info"": {""name"": ""TruLlama"", ""modul...",RetrieverQueryEngine(llama_index.query_engine....,record_hash_9e715dde3dc6135b47e0577b487b5219,"""What are some good habits to develop for a su...","""Develop good habits in eating, exercise, slee...",-,"{""record_id"": ""record_hash_9e715dde3dc6135b47e...","{""n_requests"": 0, ""n_successful_requests"": 0, ...","{""start_time"": ""2024-02-14T23:11:15.826011"", ""...",2024-02-14T23:11:19.000269,1.0,1.0,0.25,[{'args': {'prompt': 'What are some good habit...,[{'args': {'source': 'Hopefully the previous c...,[{'args': {'prompt': 'What are some good habit...,3,0,0.0
4,Direct Query Engine,"{""tru_class_info"": {""name"": ""TruLlama"", ""modul...",RetrieverQueryEngine(llama_index.query_engine....,record_hash_dfb8af48b4f50907e5289c41da405fe9,"""How can altruism be beneficial in building a ...","""Altruism can be beneficial in building a care...",-,"{""record_id"": ""record_hash_dfb8af48b4f50907e52...","{""n_requests"": 0, ""n_successful_requests"": 0, ...","{""start_time"": ""2024-02-14T23:11:20.054960"", ""...",2024-02-14T23:11:24.001644,0.8,1.0,0.35,[{'args': {'prompt': 'How can altruism be bene...,[{'args': {'source': 'Hopefully the previous c...,[{'args': {'prompt': 'How can altruism be bene...,3,0,0.0


In [23]:
tru.run_dashboard()

Starting dashboard ...
npx: installed 22 in 5.446s

Go to this url and submit the ip given here. your url is: https://clever-cups-wash.loca.lt

  Submit this IP Address: 35.238.6.193



<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

## **Advanced RAG Pipeline**

## 1. Sentence Window Retrieval
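The idea behind sentence window retrieval is to match the query against individual sentences, but to hand the LLM each matched sentence together with a few neighbouring sentences. The helpers in utils.py are not shown in this notebook, so the following is only a standard-library sketch of that idea; `split_sentences` and `build_windows` are hypothetical names, and the regex-based sentence splitter is an assumption.

```python
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_windows(sentences, window_size=1):
    """Pair each sentence with a window of `window_size` neighbours on each side."""
    nodes = []
    for i, sentence in enumerate(sentences):
        start = max(0, i - window_size)
        end = min(len(sentences), i + window_size + 1)
        nodes.append({"sentence": sentence, "window": " ".join(sentences[start:end])})
    return nodes


sentences = split_sentences("Pick a project. Start small. Ship it early.")
for node in build_windows(sentences, window_size=1):
    print(node["sentence"], "->", node["window"])
```

Only the `sentence` field would be embedded and searched; the `window` field is what replaces it in the prompt once a match is found.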

In [15]:
from llama_index.llms import Gemini

llm = Gemini(model="models/gemini-pro", temperature=0.1)

In [16]:
from utils import build_sentence_window_index

sentence_index = build_sentence_window_index(
    document,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="sentence_index"
)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [17]:
from utils import get_sentence_window_query_engine

sentence_window_engine = get_sentence_window_query_engine(sentence_index)


In [18]:
window_response = sentence_window_engine.query(
    "how do I get started on a personal project in AI?"
)
print(str(window_response))

This context does not mention how to get started on a personal project in AI, so I cannot answer this question from the provided context.


In [19]:
from utils import get_prebuilt_trulens_recorder
tru.reset_database()

tru_recorder_sentence_window = get_prebuilt_trulens_recorder(
    sentence_window_engine,
    app_id = "Sentence Window Query Engine"
)

In [None]:
with tru_recorder_sentence_window as recording:
    for question in eval_questions:
        response = sentence_window_engine.query(question)
        print(str(response))


In [39]:
tru.get_leaderboard(app_ids=[])

| app_id | Answer Relevance | Groundedness | Context Relevance | latency | total_cost |
|---|---|---|---|---|---|
| Sentence Window Query Engine | 0.6125 | 0.607143 | 0.584375 | 5.3125 | 0.0 |


In [40]:
# launches on http://localhost:8501/
tru.run_dashboard()

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.
npx: installed 22 in 2.505s

Go to this url and submit the ip given here. your url is: https://brown-rules-smile.loca.lt

Dashboard closed.
Dashboard closed.


RuntimeError: Dashboard failed to start in time. Please inspect dashboard logs for additional information.

## 2. Auto Merging Retrieval

In [22]:
import warnings
warnings.filterwarnings('ignore')

In [23]:
import utils
import os
from llama_index.llms import Gemini

llm = Gemini(model="models/gemini-pro")



In [24]:
from llama_index import SimpleDirectoryReader

documents = SimpleDirectoryReader(
    input_files=["./eBook-How-to-Build-a-Career-in-AI.pdf"]
).load_data()

In [25]:
print(type(documents), "\n")
print(len(documents), "\n")
print(type(documents[0]))
print(documents[0])

<class 'list'> 

41 

<class 'llama_index.schema.Document'>
Doc ID: 6e839c67-8a12-4ef9-97f9-dc976b747c18
Text: PAGE 1Founder, DeepLearning.AICollected Insights from Andrew Ng
How to  Build Your Career in AIA Simple Guide


In [26]:
from llama_index import Document

document = Document(text="\n\n".join([doc.text for doc in documents]))

In [27]:
from llama_index.node_parser import HierarchicalNodeParser

# Create the hierarchical node parser with three chunk-size levels (largest to smallest)
node_parser = HierarchicalNodeParser.from_defaults(
    chunk_sizes=[2048, 512, 128]
)

In [28]:
nodes = node_parser.get_nodes_from_documents([document])

In [29]:
from llama_index.node_parser import get_leaf_nodes

leaf_nodes = get_leaf_nodes(nodes)
print(leaf_nodes[30].text)

But this became less important as numerical linear algebra libraries matured.
Deep learning is still an emerging technology, so when you train a neural network and the 
optimization algorithm struggles to converge, understanding the math behind gradient 
descent, momentum, and the Adam  optimization algorithm will help you make better decisions. 
Similarly, if your neural network does something funny — say, it makes bad predictions on 
images of a certain resolution, but not others — understanding the math behind neural network 
architectures puts you in a better position to figure out what to do.
Of course, I also encourage learning driven by curiosity.


In [None]:
nodes_by_id = {node.node_id: node for node in nodes}

parent_node = nodes_by_id[leaf_nodes[30].parent_node.node_id]
print(parent_node.text)
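The parent lookup above is what auto-merging builds on: when enough leaf nodes under the same parent are retrieved, they are replaced by the parent node so that the LLM sees one larger, coherent chunk. The following is a simplified sketch of that rule using plain dictionaries. The function name `auto_merge`, the toy node IDs, and the 0.5 threshold are assumptions for illustration, not the library's implementation.

```python
from collections import defaultdict


def auto_merge(retrieved_ids, parent_of, children_of, ratio_threshold=0.5):
    """Replace retrieved leaves with their parent when enough siblings were retrieved.

    A parent is substituted when the fraction of its children present in
    `retrieved_ids` is strictly greater than `ratio_threshold`.
    """
    hits = defaultdict(list)
    for leaf_id in retrieved_ids:
        hits[parent_of[leaf_id]].append(leaf_id)

    merged, seen_parents = [], set()
    for leaf_id in retrieved_ids:
        parent = parent_of[leaf_id]
        if len(hits[parent]) / len(children_of[parent]) > ratio_threshold:
            if parent not in seen_parents:  # emit each merged parent only once
                merged.append(parent)
                seen_parents.add(parent)
        else:
            merged.append(leaf_id)
    return merged


# Toy hierarchy: two parents with four leaf chunks each.
children_of = {"P1": ["a", "b", "c", "d"], "P2": ["e", "f", "g", "h"]}
parent_of = {child: parent for parent, kids in children_of.items() for child in kids}

print(auto_merge(["a", "b", "c", "e"], parent_of, children_of))
```

Here three of the four children of `P1` are retrieved, so they collapse into `P1`, while the single hit under `P2` stays a leaf.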

In [31]:
from llama_index import ServiceContext
from llama_index.llms import Gemini


llm = Gemini(model="models/gemini-pro", temperature=0.1)

auto_merging_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    node_parser=node_parser,
)

In [32]:
!rm -rf merging_index/
!ls -ltr merging_index

ls: cannot access 'merging_index': No such file or directory


In [34]:
from llama_index import VectorStoreIndex, StorageContext
from llama_index import set_global_service_context

set_global_service_context(auto_merging_context)

storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

automerging_index = VectorStoreIndex(
    leaf_nodes, storage_context=storage_context, service_context=auto_merging_context
)

automerging_index.storage_context.persist(persist_dir="./merging_index")

In [35]:
# This block of code is optional to check
# if an index file exist, then it will load it
# if not, it will rebuild it

import os
from llama_index import VectorStoreIndex, StorageContext, load_index_from_storage

if not os.path.exists("./merging_index"):
    storage_context = StorageContext.from_defaults()
    storage_context.docstore.add_documents(nodes)

    automerging_index = VectorStoreIndex(
            leaf_nodes,
            storage_context=storage_context,
            service_context=auto_merging_context
        )

    automerging_index.storage_context.persist(persist_dir="./merging_index")
else:
    automerging_index = load_index_from_storage(
        StorageContext.from_defaults(persist_dir="./merging_index"),
        service_context=auto_merging_context
    )

In [36]:
from llama_index.indices.postprocessor import SentenceTransformerRerank
from llama_index.retrievers import AutoMergingRetriever
from llama_index.query_engine import RetrieverQueryEngine


# Base retriever: fetch the 12 most similar leaf nodes from the vector index
automerging_retriever = automerging_index.as_retriever(
    similarity_top_k=12
)

# Wrap the base retriever so that retrieved leaf nodes can be merged into their parents
retriever = AutoMergingRetriever(
    automerging_retriever,
    automerging_index.storage_context,
    verbose=True
)

# Rerank the retrieved nodes and keep the top 6
rerank = SentenceTransformerRerank(top_n=6, model="BAAI/bge-reranker-base")

# Use the auto-merging retriever (not the base retriever) in the query engine
auto_merging_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[rerank], verbose=True
)
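The cell above follows a two-stage pattern: a fast first stage retrieves a generous shortlist (12 nodes), and a more careful second stage reranks it and keeps fewer (6 nodes). Below is a minimal standard-library sketch of that pattern. The scoring functions `word_overlap` and `phrase_bonus` are hypothetical stand-ins for embedding similarity and a cross-encoder reranker, respectively.

```python
def word_overlap(query, text):
    """Cheap first-stage score: number of shared lowercase words."""
    return len(set(query.lower().split()) & set(text.lower().split()))


def phrase_bonus(query, text):
    """Stricter second-stage score: exact phrase match first, then word overlap."""
    return (query.lower() in text.lower(), word_overlap(query, text))


def retrieve_then_rerank(query, candidates, first_stage, second_stage, top_k=12, top_n=6):
    """Shortlist `top_k` candidates with `first_stage`, then keep `top_n` by `second_stage`."""
    shortlist = sorted(candidates, key=lambda c: first_stage(query, c), reverse=True)[:top_k]
    return sorted(shortlist, key=lambda c: second_stage(query, c), reverse=True)[:top_n]


candidates = ["ai networking events", "networking in ai matters", "cooking pasta"]
best = retrieve_then_rerank(
    "networking in ai", candidates, word_overlap, phrase_bonus, top_k=2, top_n=1
)
print(best)
```

Keeping `top_k` larger than `top_n` gives the second stage room to promote a candidate that the first stage ranked lower.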

In [37]:
auto_merging_response = auto_merging_engine.query(
    "What is the importance of networking in AI?"
)

In [38]:
from llama_index.response.notebook_utils import display_response

display_response(auto_merging_response)

**`Final Response:`** Networking is important in AI because it can help you find potential employers, build up your community, and meet more people and make friends.