# Lesson 4: Auto-merging Retrieval


In [12]:
import warnings
warnings.filterwarnings('ignore')

In [24]:
from scripts import utils

import os
import openai
openai.api_key = utils.get_openai_api_key()

In [18]:
from llama_index import SimpleDirectoryReader

documents = SimpleDirectoryReader(
    input_files=["pdfs/eBook-How-to-Build-a-Career-in-AI.pdf"]
).load_data()

In [4]:
print(type(documents), "\n")
print(len(documents), "\n")
print(type(documents[0]))
print(documents[0])

<class 'list'> 

41 

<class 'llama_index.schema.Document'>
Doc ID: a5e2a1fb-7cf6-4619-9609-329a826c85b8
Text: PAGE 1Founder, DeepLearning.AICollected Insights from Andrew Ng
How to  Build Your Career in AIA Simple Guide


In [15]:
from llama_index import Document

document = Document(text="\n\n".join([doc.text for doc in documents]))

## Auto-merging retrieval setup


In [6]:
from llama_index.node_parser import HierarchicalNodeParser

# Create the hierarchical node parser with three chunk sizes, largest to smallest
node_parser = HierarchicalNodeParser.from_defaults(
    chunk_sizes=[2048, 512, 128]
)

In [7]:
nodes = node_parser.get_nodes_from_documents([document])

In [8]:
from llama_index.node_parser import get_leaf_nodes

leaf_nodes = get_leaf_nodes(nodes)
print(leaf_nodes[3].text)

It took centuries for literacy to spread, and now society is far richer for it.
Words enable deep human-to-human communication. Code is the deepest form of human-to-
machine communication. As machines become more central to daily life, that communication 
becomes ever more important.
Traditional software engineering — writing programs that explicitly tell a computer sequences 
of steps to execute — has been the main path to code literacy. Many introductory programming 
classes use creating a video game or building a website as examples. But AI, machine learning, 
and data science offer a new paradigm in which computers extract knowledge from data.


In [9]:
import tiktoken

def num_tokens_from_string(string: str, encoding_name: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens


In [10]:
num_tokens_from_string(leaf_nodes[3].text, "cl100k_base")


121

In [11]:
nodes_by_id = {node.node_id: node for node in nodes}

parent_node = nodes_by_id[leaf_nodes[30].parent_node.node_id]
print(parent_node.text)

PAGE 12Should You 
Learn Math to 
Get a Job in AI? CHAPTER 3
LEARNING

PAGE 13Should you Learn Math to Get a Job in AI? CHAPTER 3
Is math a foundational skill for AI? It’s always nice to know more math! But there’s so much to 
learn that, realistically, it’s necessary to prioritize. Here’s how you might go about strengthening 
your math background.
To figure out what’s important to know, I find it useful to ask what you need to know to make 
the decisions required for the work you want to do. At DeepLearning.AI, we frequently ask, 
“What does someone need to know to accomplish their goals?” The goal might be building a 
machine learning model, architecting a system, or passing a job interview.
Understanding the math behind algorithms you use is often helpful, since it enables you to 
debug them. But the depth of knowledge that’s useful changes over time. As machine learning 
techniques mature and become more reliable and turnkey, they require less debugging, and a 
shallower understand

In [12]:
num_tokens_from_string(parent_node.text, "cl100k_base")

465
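
The two token counts above (121 for a leaf, 465 for its parent) reflect the nesting that `chunk_sizes=[2048, 512, 128]` produces: each leaf is cut from a chunk of at most 512 tokens, which is itself cut from a chunk of at most 2048 tokens. Below is a minimal, library-free sketch of that idea. It is an illustration under stated assumptions, not how `HierarchicalNodeParser` is implemented: it splits on whitespace-separated words rather than tokens, ignores sentence boundaries, and `Node`, `split_words`, `build_hierarchy`, and `leaves` are hypothetical names.

```python
import itertools
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    # Hypothetical stand-in for a LlamaIndex node: text plus parent/child links.
    node_id: int
    text: str
    parent_id: Optional[int] = None
    child_ids: List[int] = field(default_factory=list)


def split_words(text: str, size: int) -> List[str]:
    """Split text into consecutive chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def build_hierarchy(text: str, chunk_sizes: List[int]) -> List[Node]:
    """Split text level by level, linking every chunk to the chunk it came from."""
    counter = itertools.count()
    nodes: List[Node] = []

    def recurse(chunk: str, level: int, parent: Optional[Node]) -> None:
        for piece in split_words(chunk, chunk_sizes[level]):
            node = Node(next(counter), piece, parent.node_id if parent else None)
            nodes.append(node)
            if parent is not None:
                parent.child_ids.append(node.node_id)
            # Split this piece again if a smaller chunk size remains.
            if level + 1 < len(chunk_sizes):
                recurse(piece, level + 1, node)

    recurse(text, 0, None)
    return nodes


def leaves(nodes: List[Node]) -> List[Node]:
    """Leaf nodes are the ones that were never split further."""
    return [n for n in nodes if not n.child_ids]
```

With toy sizes such as `build_hierarchy(text, [8, 4, 2])`, every 2-word leaf points to a 4-word parent, mirroring how `leaf_nodes[30].parent_node` is looked up in `nodes_by_id` above.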



### Building the index


In [28]:
from llama_index.llms import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

In [14]:
from llama_index import ServiceContext

auto_merging_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    node_parser=node_parser,
)

In [15]:
from llama_index import VectorStoreIndex, StorageContext

storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

automerging_index = VectorStoreIndex(
    leaf_nodes, storage_context=storage_context, service_context=auto_merging_context
)

automerging_index.storage_context.persist(persist_dir="./merging_index")

In [16]:
# Optional: if a persisted index already exists on disk, load it;
# otherwise build the index and persist it.

import os
from llama_index import VectorStoreIndex, StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

if not os.path.exists("./merging_index"):
    automerging_index = VectorStoreIndex(
        leaf_nodes,
        storage_context=storage_context,
        service_context=auto_merging_context,
    )

    automerging_index.storage_context.persist(persist_dir="./merging_index")
else:
    automerging_index = load_index_from_storage(
        StorageContext.from_defaults(persist_dir="./merging_index"),
        service_context=auto_merging_context,
    )

### Defining the retriever and running the query engine

In [17]:
from llama_index.indices.postprocessor import SentenceTransformerRerank
from llama_index.retrievers import AutoMergingRetriever
from llama_index.query_engine import RetrieverQueryEngine

automerging_retriever = automerging_index.as_retriever(
    similarity_top_k=12
)

retriever = AutoMergingRetriever(
    automerging_retriever, 
    automerging_index.storage_context, 
    verbose=True
)

rerank = SentenceTransformerRerank(top_n=6, model="BAAI/bge-reranker-base")

# Pass the AutoMergingRetriever (not the base retriever) so that merging is applied
auto_merging_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[rerank]
)

In [18]:
auto_merging_response = auto_merging_engine.query(
    "What is the importance of networking in AI?"
)

In [19]:
from llama_index.response.notebook_utils import display_response

display_response(auto_merging_response)

**`Final Response:`** Networking is important in AI because it allows individuals to build a strong professional network and community. This network can provide valuable information, help with career advancement, and offer support and advice when needed. By connecting with others in the AI community, individuals can also increase their visibility and recognition for their expertise. Additionally, networking can lead to referrals for potential job opportunities. Building a community and network in AI is seen as more beneficial than simply focusing on personal connections, as it allows for the exchange of ideas and the opportunity to make friends.
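
The merging step itself can be pictured as follows: when enough of a parent's children appear among the retrieved leaves, those children are swapped for the parent, and the check repeats one level up. The sketch below is a simplified, library-free illustration of that rule, not the `AutoMergingRetriever` source. The function name `auto_merge`, the `parent_of` / `children_of` dictionaries, and the strict "more than half" threshold are assumptions made for the example.

```python
from collections import defaultdict
from typing import Dict, List


def auto_merge(
    retrieved: List[str],
    parent_of: Dict[str, str],
    children_of: Dict[str, List[str]],
    ratio_thresh: float = 0.5,  # assumed threshold, for illustration only
) -> List[str]:
    """Replace groups of retrieved siblings with their parent, repeatedly."""
    current = list(dict.fromkeys(retrieved))  # de-duplicate, keep order
    changed = True
    while changed:
        changed = False
        # Group the currently selected nodes by their parent.
        hits = defaultdict(list)
        for node_id in current:
            parent = parent_of.get(node_id)
            if parent is not None:
                hits[parent].append(node_id)
        for parent, kids in hits.items():
            if len(kids) / len(children_of[parent]) > ratio_thresh:
                merged = set(kids)
                # Put the parent where the first merged child used to be.
                pos = next(i for i, n in enumerate(current) if n in merged)
                current = [n for n in current if n not in merged]
                if parent not in current:
                    current.insert(pos, parent)
                changed = True
                break  # regroup after every merge so parents can merge too
    return current
```

For a parent `A` with four children, retrieving three of them yields `A` in place of the three leaves, while retrieving only two leaves the result unchanged under this threshold.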

### Putting it together

In [46]:
import os

from llama_index import (
    ServiceContext,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.node_parser import HierarchicalNodeParser, get_leaf_nodes
from llama_index.retrievers import AutoMergingRetriever
from llama_index.indices.postprocessor import SentenceTransformerRerank
from llama_index.query_engine import RetrieverQueryEngine


def build_automerging_index(
    documents,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index",
    chunk_sizes=None,
):
    chunk_sizes = chunk_sizes or [2048, 512, 128]
    node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=chunk_sizes)
    nodes = node_parser.get_nodes_from_documents(documents)
    leaf_nodes = get_leaf_nodes(nodes)
    merging_context = ServiceContext.from_defaults(
        llm=llm,
        embed_model=embed_model,
        node_parser=node_parser,
    )
    storage_context = StorageContext.from_defaults()
    storage_context.docstore.add_documents(nodes)

    if not os.path.exists(save_dir):
        automerging_index = VectorStoreIndex(
            leaf_nodes, storage_context=storage_context, service_context=merging_context
        )
        automerging_index.storage_context.persist(persist_dir=save_dir)
    else:
        automerging_index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=save_dir),
            service_context=merging_context,
        )
    return automerging_index


def get_automerging_query_engine(
    automerging_index,
    similarity_top_k=12,
    rerank_top_n=6,
):
    base_retriever = automerging_index.as_retriever(similarity_top_k=similarity_top_k)
    retriever = AutoMergingRetriever(
        base_retriever, automerging_index.storage_context, verbose=True
    )
    rerank = SentenceTransformerRerank(
        top_n=rerank_top_n, model="BAAI/bge-reranker-base"
    )
    auto_merging_engine = RetrieverQueryEngine.from_args(
        retriever, node_postprocessors=[rerank]
    )
    return auto_merging_engine
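
The `similarity_top_k` and `rerank_top_n` arguments describe a two-stage selection: a wide first pass keeps the top-k candidates by a cheap score, then a more precise scorer reorders them and only the top-n survive. The sketch below illustrates that pattern in plain Python. The scoring functions `word_overlap` and `overlap_ratio` are toy stand-ins invented for this example (the notebook itself uses embedding similarity and the `BAAI/bge-reranker-base` model), and `retrieve_then_rerank` is a hypothetical helper, not a LlamaIndex function.

```python
from typing import Callable, List


def retrieve_then_rerank(
    query: str,
    docs: List[str],
    cheap_score: Callable[[str, str], float],
    precise_score: Callable[[str, str], float],
    similarity_top_k: int = 12,
    rerank_top_n: int = 6,
) -> List[str]:
    """Keep the top-k docs by a cheap score, then the top-n by a precise score."""
    candidates = sorted(
        docs, key=lambda d: cheap_score(query, d), reverse=True
    )[:similarity_top_k]
    reranked = sorted(
        candidates, key=lambda d: precise_score(query, d), reverse=True
    )
    return reranked[:rerank_top_n]


def word_overlap(query: str, doc: str) -> float:
    """Toy first-stage score: number of shared lowercase words."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def overlap_ratio(query: str, doc: str) -> float:
    """Toy second-stage score: shared words as a fraction of the doc's words."""
    doc_words = set(doc.lower().split())
    if not doc_words:
        return 0.0
    return len(set(query.lower().split()) & doc_words) / len(doc_words)
```

Keeping `similarity_top_k` larger than `rerank_top_n` gives the second stage room to promote candidates that the first stage ranked low.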

In [47]:
from llama_index.llms import OpenAI

index = build_automerging_index(
    [document],
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    save_dir="./merging_index",
)

In [48]:
query_engine = get_automerging_query_engine(index, similarity_top_k=6)

In [24]:
display_response(query_engine.query("What is the importance of networking in AI?"))

**`Final Response:`** Networking is important in AI because it allows individuals to build a strong professional network that can help propel them forward in their careers. By connecting with others in the AI community, individuals can receive help and advice when needed, as well as recognition for their expertise. Additionally, networking can lead to opportunities for collaboration and the sharing of knowledge and resources. Overall, having a strong network in AI can contribute to personal and professional growth in the field.

### TruLens Evaluation

In [19]:
from trulens_eval import Tru

tru = Tru(database_url="sqlite:///db/auto_merging.sqlite")

Tru was already initialized. Cannot change database_url=sqlite:///db/auto_merging.sqlite or database_file=None .


### Two Layers

In [2]:
auto_merging_index_0 = build_automerging_index(
    documents,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index_0",
    chunk_sizes=[2048, 512],
)





In [27]:
auto_merging_engine_0 = get_automerging_query_engine(
    auto_merging_index_0,
    similarity_top_k=12,
    rerank_top_n=6,
)

In [29]:
from scripts.utils import get_prebuilt_trulens_recorder

tru_recorder = get_prebuilt_trulens_recorder(
    auto_merging_engine_0,
    app_id='app_0'
)

In [30]:
eval_questions = []
with open('text/generated_questions.text', 'r') as file:
    for line in file:
        # Strip the trailing newline and any surrounding whitespace
        item = line.strip()
        eval_questions.append(item)

In [32]:
def run_evals(eval_questions, tru_recorder, query_engine):
    for question in eval_questions:
        with tru_recorder as recording:
            response = query_engine.query(question)

In [33]:
run_evals(eval_questions, tru_recorder, auto_merging_engine_0)

> Merging 2 nodes into parent node.
> Parent node id: 8496a448-3ef7-4726-8a36-64b4cd561687.
> Parent node text: PAGE 20Working on projects requires making tough choices about what to build and how to go 
about...

> Merging 1 nodes into parent node.
> Parent node id: 979502cc-42c0-4243-87ab-14e87403922a.
> Parent node text: PAGE 7These phases apply in a wide 
range of professions, but AI 
involves unique elements.
For e...

> Merging 1 nodes into parent node.
> Parent node id: 5d557aa8-e56d-4fc3-91af-afa161406a86.
> Parent node text: PAGE 18It goes without saying that we should only work on projects that are responsible, ethical,...

> Merging 1 nodes into parent node.
> Parent node id: c157ff26-12c6-431a-9c0b-7a1fa0458b7f.
> Parent node text: PAGE 22Over the course of a career, you’re likely to work on projects in succession, each growing...

> Merging 1 nodes into parent node.
> Parent node id: 96e9b122-70d6-4d90-80f5-a58e0dc5ee18.
> Parent node text: PAGE 15One of the most important

In [20]:
from trulens_eval import Tru

tru.get_leaderboard(app_ids=[])

| app_id | latency | total_cost |
|--------|---------|------------|


In [4]:
tru.run_dashboard()

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.



Dashboard started at http://192.168.1.74:8501 .


<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

In [1]:
# tru.stop_dashboard()


### All Layers

In [25]:
tru.reset_database()

In [26]:
from scripts.utils import get_prebuilt_trulens_recorder
from typing import List

def build_eval_layer(chunk_sizes: List[int], num_layers: int):

    auto_merging_index = build_automerging_index(
        documents,
        llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
        embed_model="local:BAAI/bge-small-en-v1.5",
        save_dir=f"merging_index_{num_layers}",
        chunk_sizes=chunk_sizes,
    )

    auto_merging_engine = get_automerging_query_engine(
        auto_merging_index,
        similarity_top_k=12,
        rerank_top_n=6,
    )

    tru_recorder = get_prebuilt_trulens_recorder(
        auto_merging_engine,
        app_id=f'num_Layers-{num_layers}'
    )

    return tru_recorder






In [37]:
tru_recorder_2 = build_eval_layer(chunk_sizes=[2048, 512], num_layers=2)

In [38]:
eval_questions = []
with open('text/generated_questions.text', 'r') as file:
    for line in file:
        # Strip the trailing newline and any surrounding whitespace
        item = line.strip()
        eval_questions.append(item)

In [39]:
def run_evals(eval_questions, tru_recorder, query_engine):
    # Run every question through the engine while the recorder is active
    for question in eval_questions:
        with tru_recorder as _:
            response = query_engine.query(question)

In [49]:
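# Note: `query_engine` is the engine built in "Putting it together", not the one
# wrapped by `tru_recorder_2`; TruLens warns about this in the output below.
run_evals(eval_questions, tru_recorder_2, query_engine)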

A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x205ce82db10 is calling an instrumented method <function BaseQueryEngine.query at 0x00000205C63537E0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x205ce82f890) using this function.
A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x205ce82db10 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x00000205CE5322A0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x205ce82f890) using this function.
A new object of type <class 'llama_index.retrievers.auto_merging_retriever.AutoMergingRetriever'> at 0x205dfde6ed0 is calling an instrumented method <function BaseRetriever.retrieve at 0x00000205C8DA3C40>. The path of this call may be incorrect.
Guessing path of new object is app.retriever based on other obje

> Merging 3 nodes into parent node.
> Parent node id: 6078d7c0-502d-41c7-8b9e-36d40ee29a41.
> Parent node text: When taking a shot is inexpensive, it also makes sense to take many shots. In this 
case, the pro...



A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x205e02d5910 is calling an instrumented method <function CompactAndRefine.get_response at 0x00000205C8DA0540>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x205ce82e890) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x205e02d5910 is calling an instrumented method <function Refine.get_response at 0x00000205C8DA19E0>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x205ce82e890) using this function.


In [50]:
from trulens_eval import Tru

Tru().get_leaderboard(app_ids=[])

| app_id | Context Relevance | Answer Relevance | latency | total_cost |
|--------|-------------------|------------------|---------|------------|
| num_Layers-2 | 0.575 | 0.9 | 26.0 | 0.001729 |


In [51]:
Tru().run_dashboard()

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.
Dashboard already running at path:   Network URL: http://192.168.1.74:8501



<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

In [52]:
tru_recorder_3 = build_eval_layer(chunk_sizes=[2048, 512, 128], num_layers=3)

In [53]:
run_evals(eval_questions, tru_recorder_3, query_engine)

A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x205ce82db10 is calling an instrumented method <function BaseQueryEngine.query at 0x00000205C63537E0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x205e2643890) using this function.
A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x205ce82db10 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x00000205CE5322A0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x205e2643890) using this function.
A new object of type <class 'llama_index.retrievers.auto_merging_retriever.AutoMergingRetriever'> at 0x205dfde6ed0 is calling an instrumented method <function BaseRetriever.retrieve at 0x00000205C8DA3C40>. The path of this call may be incorrect.
Guessing path of new object is app.retriever based on other obje

> Merging 3 nodes into parent node.
> Parent node id: 6078d7c0-502d-41c7-8b9e-36d40ee29a41.
> Parent node text: When taking a shot is inexpensive, it also makes sense to take many shots. In this 
case, the pro...



A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x205e02d5910 is calling an instrumented method <function CompactAndRefine.get_response at 0x00000205C8DA0540>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x205e2642d10) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x205e02d5910 is calling an instrumented method <function Refine.get_response at 0x00000205C8DA19E0>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x205e2642d10) using this function.


In [55]:
tru.get_leaderboard(app_ids=[])

| app_id | Context Relevance | Answer Relevance | Groundedness | latency | total_cost |
|--------|-------------------|------------------|--------------|---------|------------|
| num_Layers-2 | 0.575 | 0.9 | 0.666667 | 27.0 | 0.001729 |
| num_Layers-3 | NaN | 1.0 | NaN | 27.0 | 0.001809 |


In [57]:
# launches on http://localhost:8501/
tru.run_dashboard()

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.
Dashboard already running at path:   Network URL: http://192.168.1.74:8501



<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>