In [1]:
import warnings
warnings.filterwarnings('ignore')

from scripts import utils

import os
import openai
openai.api_key = utils.get_openai_api_key()

✅ In Answer Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Answer Relevance, input response will be set to __record__.main_output or `Select.RecordOutput` .
✅ In Context Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Context Relevance, input response will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input source will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input statement will be set to __record__.main_output or `Select.RecordOutput` .


In [2]:
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader(
    input_files=["pdfs/eBook-How-to-Build-a-Career-in-AI.pdf"]
).load_data()

In [3]:
from llama_index.core import Document

document = Document(text="\n\n".join([doc.text for doc in documents]))

### Auto Merging Retrieval Setup

In [4]:
from llama_index.core.node_parser import HierarchicalNodeParser

# create the hierarchical node parser with three chunk sizes, largest to smallest
node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=[2048, 512, 128])

In [5]:
nodes = node_parser.get_nodes_from_documents([document])

In [6]:
# nodes

In [7]:
from llama_index.core.node_parser import get_leaf_nodes

leaf_nodes = get_leaf_nodes(nodes)
print(leaf_nodes[3].text)

It took centuries for literacy to spread, and now society is far richer for it.
Words enable deep human-to-human communication. Code is the deepest form of human-to-
machine communication. As machines become more central to daily life, that communication 
becomes ever more important.
Traditional software engineering — writing programs that explicitly tell a computer sequences 
of steps to execute — has been the main path to code literacy. Many introductory programming 
classes use creating a video game or building a website as examples. But AI, machine learning, 
and data science offer a new paradigm in which computers extract knowledge from data.


In [8]:
import tiktoken 

def num_tokens_from_string(string: str, encoding_name: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens

num_tokens_from_string(leaf_nodes[3].text, "cl100k_base")

121

In [9]:
nodes_by_id = {node.node_id: node for node in nodes}

parent_node = nodes_by_id[leaf_nodes[30].parent_node.node_id]
print(parent_node.text)

PAGE 12Should You 
Learn Math to 
Get a Job in AI? CHAPTER 3
LEARNING

PAGE 13Should you Learn Math to Get a Job in AI? CHAPTER 3
Is math a foundational skill for AI? It’s always nice to know more math! But there’s so much to 
learn that, realistically, it’s necessary to prioritize. Here’s how you might go about strengthening 
your math background.
To figure out what’s important to know, I find it useful to ask what you need to know to make 
the decisions required for the work you want to do. At DeepLearning.AI, we frequently ask, 
“What does someone need to know to accomplish their goals?” The goal might be building a 
machine learning model, architecting a system, or passing a job interview.
Understanding the math behind algorithms you use is often helpful, since it enables you to 
debug them. But the depth of knowledge that’s useful changes over time. As machine learning 
techniques mature and become more reliable and turnkey, they require less debugging, and a 
shal

In [10]:
num_tokens_from_string(parent_node.text, "cl100k_base")

465
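The two counts above (121 tokens for a leaf, 465 for its parent) sit just under the 128 and 512 limits passed as `chunk_sizes`. The sketch below illustrates the general idea of nested chunking with parent links. It is a toy under stated assumptions, not `HierarchicalNodeParser`'s implementation: whitespace-separated words stand in for tokens, the sizes `[16, 8, 4]` are scaled-down illustrative values, and `ToyNode` and `build_hierarchy` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ToyNode:
    """Hypothetical stand-in for a parsed node with parent/child links."""
    node_id: int
    words: List[str]
    parent_id: Optional[int] = None
    child_ids: List[int] = field(default_factory=list)


def build_hierarchy(words: List[str], chunk_sizes: List[int]) -> List[ToyNode]:
    """Split words into nested chunks, one level per entry in chunk_sizes."""
    nodes: List[ToyNode] = []

    def split(span: List[str], level: int, parent_id: Optional[int]) -> None:
        size = chunk_sizes[level]
        for start in range(0, len(span), size):
            chunk = span[start:start + size]
            node = ToyNode(node_id=len(nodes), words=chunk, parent_id=parent_id)
            nodes.append(node)
            if parent_id is not None:
                nodes[parent_id].child_ids.append(node.node_id)
            # Recurse until the smallest chunk size has been applied.
            if level + 1 < len(chunk_sizes):
                split(chunk, level + 1, node.node_id)

    split(words, 0, None)
    return nodes


# Assumption: 40 dummy words and toy chunk sizes, chosen only for illustration.
toy_words = [f"w{i}" for i in range(40)]
toy_nodes = build_hierarchy(toy_words, chunk_sizes=[16, 8, 4])

# Leaves are the nodes without children, analogous to get_leaf_nodes(nodes).
toy_leaves = [node for node in toy_nodes if not node.child_ids]

# Look up a leaf's parent by id, as the notebook does with nodes_by_id.
toy_parent = toy_nodes[toy_leaves[0].parent_id]
print(len(toy_leaves[0].words), len(toy_parent.words))
```

Each leaf here holds at most 4 words and its parent at most 8, mirroring how the notebook's 128-token leaves nest inside 512-token parents.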

#### Building the index

In [11]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

In [12]:
from llama_index.core import ServiceContext

auto_merging_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    node_parser=node_parser,
)

In [13]:
from llama_index.core import VectorStoreIndex, StorageContext

storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

automerging_index = VectorStoreIndex(
    leaf_nodes, storage_context=storage_context, service_context=auto_merging_context
)

automerging_index.storage_context.persist(persist_dir="./merging_index")

In [14]:
# Optional: reuse a persisted index if one already exists on disk;
# otherwise build the index and persist it.

import os
from llama_index.core import VectorStoreIndex, StorageContext, load_index_from_storage

storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

if not os.path.exists("./merging_index"):
    automerging_index = VectorStoreIndex(
        leaf_nodes,
        storage_context=storage_context,
        service_context=auto_merging_context,
    )

    automerging_index.storage_context.persist(persist_dir="./merging_index")
else:
    automerging_index = load_index_from_storage(
        StorageContext.from_defaults(persist_dir="./merging_index"),
        service_context=auto_merging_context,
    )

#### Defining the retriever and running the query engine

In [15]:
from llama_index.core.indices.postprocessor import SentenceTransformerRerank
from llama_index.core.retrievers import AutoMergingRetriever
from llama_index.core.query_engine import RetrieverQueryEngine

automerging_retriever = automerging_index.as_retriever(
    similarity_top_k=12
)

retriever = AutoMergingRetriever(
    automerging_retriever, 
    automerging_index.storage_context, 
    verbose=True
)

rerank = SentenceTransformerRerank(top_n=6, model="cross-encoder/ms-marco-TinyBERT-L-2-v2")

# Pass the AutoMergingRetriever (not the base retriever) so that merging is applied
auto_merging_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[rerank]
)

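The retriever above wraps a base retriever and swaps groups of retrieved leaf nodes for their parent when enough siblings are retrieved together. The following is a minimal sketch of that merging rule, not the library's code: `merge_retrieved` is a hypothetical helper, the node ids are made up, and the `0.5` threshold is an assumption about a typical default. The real retriever also handles scores and repeats the step up the hierarchy.

```python
from collections import defaultdict
from typing import Dict, List


def merge_retrieved(
    retrieved_ids: List[str],
    parent_of: Dict[str, str],
    children_of: Dict[str, List[str]],
    ratio_thresh: float = 0.5,  # assumed threshold, for illustration only
) -> List[str]:
    """Replace retrieved siblings with their parent when the retrieved
    fraction of that parent's children exceeds ratio_thresh."""
    by_parent: Dict[str, List[str]] = defaultdict(list)
    for node_id in retrieved_ids:
        parent = parent_of.get(node_id)
        if parent is not None:
            by_parent[parent].append(node_id)

    merged_children = set()
    parents_to_add: List[str] = []
    for parent, kids in by_parent.items():
        ratio = len(kids) / len(children_of[parent])
        if ratio > ratio_thresh:
            merged_children.update(kids)
            parents_to_add.append(parent)

    # Keep unmerged nodes in their original order, then append merged parents.
    kept = [n for n in retrieved_ids if n not in merged_children]
    return kept + parents_to_add


# Hypothetical two-parent hierarchy with four leaves each.
children_of = {
    "A": ["a1", "a2", "a3", "a4"],
    "B": ["b1", "b2", "b3", "b4"],
}
parent_of = {child: parent for parent, kids in children_of.items() for child in kids}

# Three of A's four children were retrieved (0.75 > 0.5), so they merge into A;
# only one of B's children was retrieved (0.25), so b1 stays as a leaf.
merged = merge_retrieved(["a1", "a2", "a3", "b1"], parent_of, children_of)
print(merged)
```

With `verbose=True`, the retriever reports merges of this kind as it runs.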

In [16]:
auto_merging_response = auto_merging_engine.query(
    "What is the importance of networking in AI?"
)

In [17]:
from llama_index.core.response.notebook_utils import display_response

display_response(auto_merging_response) 

**`Final Response:`** Networking in AI is crucial as it helps individuals build a strong professional community that can provide support, guidance, and opportunities. By connecting with others in the field, individuals can gain valuable insights, receive help when needed, and stay updated on the latest trends and developments. Additionally, networking can lead to potential collaborations, mentorship opportunities, and even referrals to potential employers. In AI, having a robust network can be instrumental in advancing one's career and staying motivated through challenges.

### Putting it all together

In [4]:
import os

from llama_index.core import (
    ServiceContext,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.core.node_parser import HierarchicalNodeParser
from llama_index.core.node_parser import get_leaf_nodes
from llama_index.core.retrievers import AutoMergingRetriever
from llama_index.core.indices.postprocessor import SentenceTransformerRerank
from llama_index.core.query_engine import RetrieverQueryEngine


def build_automerging_index(
    documents,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index",
    chunk_sizes=None,
):
    chunk_sizes = chunk_sizes or [2048, 512, 128]
    node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=chunk_sizes)
    nodes = node_parser.get_nodes_from_documents(documents)
    leaf_nodes = get_leaf_nodes(nodes)
    merging_context = ServiceContext.from_defaults(
        llm=llm,
        embed_model=embed_model,
        node_parser=node_parser,
    )
    storage_context = StorageContext.from_defaults()
    storage_context.docstore.add_documents(nodes)

    if not os.path.exists(save_dir):
        automerging_index = VectorStoreIndex(
            leaf_nodes, storage_context=storage_context, service_context=merging_context
        )
        automerging_index.storage_context.persist(persist_dir=save_dir)
    else:
        automerging_index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=save_dir),
            service_context=merging_context,
        )
    return automerging_index


def get_automerging_query_engine(
    automerging_index,
    similarity_top_k=12,
    rerank_top_n=6,
):
    base_retriever = automerging_index.as_retriever(similarity_top_k=similarity_top_k)
    retriever = AutoMergingRetriever(
        base_retriever, automerging_index.storage_context, verbose=True
    )
    rerank = SentenceTransformerRerank(
        top_n=rerank_top_n, model="cross-encoder/ms-marco-TinyBERT-L-2-v2"
    )
    auto_merging_engine = RetrieverQueryEngine.from_args(
        retriever, node_postprocessors=[rerank]
    )
    return auto_merging_engine
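`get_automerging_query_engine` retrieves `similarity_top_k` candidates and then keeps only the `rerank_top_n` best after reranking. The sketch below shows that retrieve-then-rerank pattern in isolation. It assumes a simple word-overlap score as a stand-in for the cross-encoder model, and `rerank_candidates` and `word_overlap` are hypothetical names, so it illustrates the control flow rather than the quality of `SentenceTransformerRerank`.

```python
from typing import Callable, List


def word_overlap(query: str, text: str) -> float:
    """Toy relevance score: fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & text_words) / len(query_words)


def rerank_candidates(
    query: str,
    candidates: List[str],
    score_fn: Callable[[str, str], float],
    top_n: int,
) -> List[str]:
    """Score every candidate against the query and keep the top_n best."""
    scored = [(score_fn(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_n]]


# Made-up candidates standing in for the nodes returned by the retriever.
candidates = [
    "Networking builds a community in AI",
    "Math is useful for debugging",
    "The importance of networking in AI careers",
]
top_passages = rerank_candidates(
    "importance of networking in AI", candidates, word_overlap, top_n=2
)
print(top_passages)
```

Retrieving more candidates than are finally kept gives the reranker room to promote passages that the embedding similarity ranked lower.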

In [5]:
from llama_index.llms.openai import OpenAI

index = build_automerging_index(
    [document],
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    save_dir="./merging_index",
)

In [6]:
query_engine = get_automerging_query_engine(index, similarity_top_k=6)

In [11]:
from llama_index.core.response.notebook_utils import display_response

display_response(query_engine.query("What is the importance of networking in AI?"))

**`Final Response:`** Networking is important in AI because it allows individuals to build a strong professional network that can help propel them forward in their careers. By connecting with others in the AI community, individuals can receive help and advice when needed, as well as recognition for their expertise. Additionally, networking can lead to opportunities for collaboration and the sharing of knowledge and resources. Overall, having a strong professional network in AI can contribute to personal and professional growth in the field.

### TruLens Evaluation

In [7]:
from trulens_eval import Tru

tru = Tru(database_url="sqlite:///db/auto_merging.sqlite")

🦑 Tru initialized with db url sqlite:///db/auto_merging.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of `Tru` to prevent this.


In [8]:
from scripts.utils import get_prebuilt_trulens_recorder
from typing import List


def build_eval_layers(chunk_sizes: List[int], num_layers: int):
    auto_merging_index = build_automerging_index(
        documents,
        llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
        embed_model="local:BAAI/bge-small-en-v1.5",
        save_dir=f"merging_index_{num_layers}",
        chunk_sizes=chunk_sizes,
    )

    auto_merging_engine = get_automerging_query_engine(
        auto_merging_index,
        similarity_top_k=12,
        rerank_top_n=6,
    )

    tru_recorder = get_prebuilt_trulens_recorder(
        auto_merging_engine, app_id=f"num_layers-{num_layers}"
    )

    return tru_recorder

In [12]:
def run_evals(eval_questions, tru_recorder, query_engine):
    for question in eval_questions:
        with tru_recorder as _:
            query_engine.query(question)

In [None]:
eval_questions = []
with open("texts/generated_questions.text", "r") as file:
    for line in file:
        # Strip surrounding whitespace, including the trailing newline
        item = line.strip()
        eval_questions.append(item)
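The loop above appends every line, so a blank line in the file would become an empty question. A small variant that skips blank lines might look like the sketch below. `load_questions` is a hypothetical helper, and the demonstration writes to a throwaway temporary file rather than the notebook's question file.

```python
import os
import tempfile
from typing import List


def load_questions(path: str) -> List[str]:
    """Read one question per line, skipping lines that are empty after stripping."""
    questions: List[str] = []
    with open(path, "r", encoding="utf-8") as file:
        for line in file:
            item = line.strip()
            if item:
                questions.append(item)
    return questions


# Demonstration on a temporary file with a blank line and stray whitespace.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "questions.txt")
    with open(demo_path, "w", encoding="utf-8") as demo_file:
        demo_file.write("What is networking?\n\n  How do I find a mentor?  \n")
    demo_questions = load_questions(demo_path)

print(demo_questions)
```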

#### Two Layers 

In [9]:
tru_recorder_2 = build_eval_layers(chunk_sizes=[2048, 512], num_layers=2)

In [13]:
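# Note: query_engine is the engine built earlier with similarity_top_k=6, not the
# engine wrapped by tru_recorder_2. That mismatch is the likely cause of the
# "path of this call may be incorrect" warnings below; passing tru_recorder_2.app
# instead should query the engine the recorder was built around.
run_evals(eval_questions, tru_recorder_2, query_engine)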

A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x7f258bd3a2d0 is calling an instrumented method <function BaseQueryEngine.query at 0x7f259f47d3a0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x7f255ff00ed0) using this function.
A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x7f258bd3a2d0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x7f258c663420>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x7f255ff00ed0) using this function.
A new object of type <class 'llama_index.retrievers.auto_merging_retriever.AutoMergingRetriever'> at 0x7f256651b1d0 is calling an instrumented method <function BaseRetriever.retrieve at 0x7f258f550f40>. The path of this call may be incorrect.
Guessing path of new object is app.retriever based on other object (0x7

> Merging 3 nodes into parent node.
> Parent node id: cd3eebae-55ab-458e-a3a6-c9052b5406a4.
> Parent node text: When taking a shot is inexpensive, it also makes sense to take many shots. In this 
case, the pro...



A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x7f2607bc5bd0 is calling an instrumented method <function CompactAndRefine.get_response at 0x7f258f521800>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x7f255ff015d0) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x7f2607bc5bd0 is calling an instrumented method <function Refine.get_response at 0x7f258f522ac0>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x7f255ff015d0) using this function.


In [17]:
tru.get_leaderboard(app_ids=[])

| app_id | Answer Relevance | Context Relevance | Groundedness | latency | total_cost |
|---|---|---|---|---|---|
| num_layers-2 | 1.0 | 0.575 | 0.814286 | 21.0 | 0.001847 |


In [16]:
tru.run_dashboard()

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.



Dashboard started at http://192.168.1.161:8501 .


<Popen: returncode: None args: ['streamlit', 'run', '--server.headless=True'...>

#### Three Layers

In [18]:
tru_recorder_3 = build_eval_layers(chunk_sizes=[2048, 512, 128], num_layers=3)

In [19]:
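# Note: query_engine is the engine built earlier with similarity_top_k=6, not the
# engine wrapped by tru_recorder_3. That mismatch is the likely cause of the
# "path of this call may be incorrect" warnings below; passing tru_recorder_3.app
# instead should query the engine the recorder was built around.
run_evals(eval_questions, tru_recorder_3, query_engine)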

A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x7f258bd3a2d0 is calling an instrumented method <function BaseQueryEngine.query at 0x7f259f47d3a0>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x7f255ff00890) using this function.
A new object of type <class 'llama_index.query_engine.retriever_query_engine.RetrieverQueryEngine'> at 0x7f258bd3a2d0 is calling an instrumented method <function RetrieverQueryEngine.retrieve at 0x7f258c663420>. The path of this call may be incorrect.
Guessing path of new object is app based on other object (0x7f255ff00890) using this function.
A new object of type <class 'llama_index.retrievers.auto_merging_retriever.AutoMergingRetriever'> at 0x7f256651b1d0 is calling an instrumented method <function BaseRetriever.retrieve at 0x7f258f550f40>. The path of this call may be incorrect.
Guessing path of new object is app.retriever based on other object (0x7

> Merging 3 nodes into parent node.
> Parent node id: cd3eebae-55ab-458e-a3a6-c9052b5406a4.
> Parent node text: When taking a shot is inexpensive, it also makes sense to take many shots. In this 
case, the pro...



A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x7f2607bc5bd0 is calling an instrumented method <function CompactAndRefine.get_response at 0x7f258f521800>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x7f255c534710) using this function.
A new object of type <class 'llama_index.response_synthesizers.compact_and_refine.CompactAndRefine'> at 0x7f2607bc5bd0 is calling an instrumented method <function Refine.get_response at 0x7f258f522ac0>. The path of this call may be incorrect.
Guessing path of new object is app._response_synthesizer based on other object (0x7f255c534710) using this function.


In [27]:
tru.get_leaderboard(app_ids=[])

| app_id | Answer Relevance | Context Relevance | Groundedness | latency | total_cost |
|---|---|---|---|---|---|
| num_layers-2 | 1.0 | 0.575 | 0.814286 | 21.0 | 0.001847 |
| num_layers-3 | 1.0 | 0.575 | 0.714286 | 21.0 | 0.001839 |
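To see at a glance which metrics differ between the two runs, the leaderboard rows can be compared directly. The sketch below copies the values shown above into a plain dictionary. `metric_deltas` is a hypothetical helper; in a live notebook the DataFrame returned by `tru.get_leaderboard` could presumably be used instead of hand-copied numbers.

```python
from typing import Dict

# Values copied from the leaderboard output above.
leaderboard: Dict[str, Dict[str, float]] = {
    "num_layers-2": {
        "Answer Relevance": 1.0,
        "Context Relevance": 0.575,
        "Groundedness": 0.814286,
        "latency": 21.0,
        "total_cost": 0.001847,
    },
    "num_layers-3": {
        "Answer Relevance": 1.0,
        "Context Relevance": 0.575,
        "Groundedness": 0.714286,
        "latency": 21.0,
        "total_cost": 0.001839,
    },
}


def metric_deltas(
    board: Dict[str, Dict[str, float]], baseline: str, candidate: str
) -> Dict[str, float]:
    """Return candidate minus baseline for every metric, rounded to 6 decimals."""
    return {
        metric: round(board[candidate][metric] - board[baseline][metric], 6)
        for metric in board[baseline]
    }


deltas = metric_deltas(leaderboard, "num_layers-2", "num_layers-3")

# Keep only the metrics that actually changed between the two runs.
changed = {metric: delta for metric, delta in deltas.items() if delta != 0}
print(changed)
```

Only Groundedness and total_cost differ between the two rows; Answer Relevance, Context Relevance, and latency are identical.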
