# Auto-merging Retrieval

In [1]:
import warnings
warnings.filterwarnings('ignore')

In [2]:
import os
import openai

os.environ["OPENAI_API_KEY"] = "sk-proj-[.....OpenAI API KEY]"

openai.api_key = ('sk-proj-[.....OpenAI API KEY]')

In [3]:
import utils

import os
import openai
openai.api_key = utils.get_openai_api_key()

In [4]:
from llama_index.core import SimpleDirectoryReader

documents = SimpleDirectoryReader(
    "data"
).load_data()

In [5]:
print(type(documents), "\n")
print(len(documents), "\n")
print(type(documents[0]))
print(documents[0])

<class 'list'> 

1 

<class 'llama_index.core.schema.Document'>
Doc ID: eeedb840-ab51-4409-a5d9-32af5c2e0679
Text: My guest today is Sam Altman. He, of course, is the CEO of
OpenAI. He’s been an entrepreneur and a leader in the tech industry
for a long time, including running Y Combinator, that did amazing
things like funding Reddit, Dropbox, Airbnb. A little while after I
recorded this episode, I was completely taken by surprise when, at
least briefly, he w...


## Auto-merging retrieval setup

In [6]:
from llama_index.core import Document

document = Document(text="\n\n".join([doc.text for doc in documents]))

In [7]:
from llama_index.core.node_parser import HierarchicalNodeParser

# create the hierarchical node parser w/ default settings
node_parser = HierarchicalNodeParser.from_defaults(
    chunk_sizes=[2048, 512, 128]
)

In [8]:
nodes = node_parser.get_nodes_from_documents([document])

In [9]:
from llama_index.core.node_parser import get_leaf_nodes

leaf_nodes = get_leaf_nodes(nodes)
print(leaf_nodes[30].text)

I guess you and I do have some concern, along with this good thing, that it’ll force us to adapt faster than we’ve had to ever before. That’s the scary part. It’s not that we have to adapt. It’s not that humanity is not super-adaptable.


In [10]:
nodes_by_id = {node.node_id: node for node in nodes}

parent_node = nodes_by_id[leaf_nodes[30].parent_node.node_id]
print(parent_node.text)

It’s working super-well. But if you make a programmer three times more effective, it’s not just that they can do three times more stuff, it’s that they can – at that higher level of abstraction, using more of their brainpower – they can now think of totally different things. It’s like going from punch cards to higher level languages didn’t just let us program a little faster, it let us do these qualitatively new things. We’re really seeing that. As we look at these next steps of things that can do a more complete task, you can imagine a little agent that you can say, "Go write this whole program for me, I’ll ask you a few questions along the way, but it won’t just be writing a few functions at a time.” That’ll enable a bunch of new stuff. And then again, it’ll do even more complex stuff. Someday, maybe there’s an AI where you can say, "Go start and run this company for me." And then someday, there’s maybe an AI where you can say, "Go discover new physics." The stuff that we’re seeing n

### Building the index

In [11]:
from llama_index.llms.openai import OpenAI

llm = OpenAI(model="gpt-3.5-turbo", temperature=0.1)

In [12]:
from llama_index.core import ServiceContext

auto_merging_context = ServiceContext.from_defaults(
    llm=llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    node_parser=node_parser,
)

In [13]:
from llama_index.core import VectorStoreIndex, StorageContext

storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

automerging_index = VectorStoreIndex(
    leaf_nodes, storage_context=storage_context, service_context=auto_merging_context
)

automerging_index.storage_context.persist(persist_dir="./merging_index")

In [14]:
# This block of code is optional to check
# if an index file exist, then it will load it
# if not, it will rebuild it

import os
from llama_index.core import VectorStoreIndex, StorageContext, load_index_from_storage
from llama_index.core import load_index_from_storage

if not os.path.exists("./merging_index"):
    storage_context = StorageContext.from_defaults()
    storage_context.docstore.add_documents(nodes)

    automerging_index = VectorStoreIndex(
            leaf_nodes,
            storage_context=storage_context,
            service_context=auto_merging_context
        )

    automerging_index.storage_context.persist(persist_dir="./merging_index")
else:
    automerging_index = load_index_from_storage(
        StorageContext.from_defaults(persist_dir="./merging_index"),
        service_context=auto_merging_context
    )


### Defining the retriever and running the query engine

In [15]:
from llama_index.core.indices.postprocessor import SentenceTransformerRerank
from llama_index.core.retrievers import AutoMergingRetriever
from llama_index.core.query_engine import RetrieverQueryEngine

automerging_retriever = automerging_index.as_retriever(
    similarity_top_k=12
)

retriever = AutoMergingRetriever(
    automerging_retriever, 
    automerging_index.storage_context, 
    verbose=True
)

rerank = SentenceTransformerRerank(top_n=6, model="BAAI/bge-reranker-base")

auto_merging_engine = RetrieverQueryEngine.from_args(
    automerging_retriever, node_postprocessors=[rerank]
)

## Putting it all Together

In [16]:
import os

from llama_index.core import (
    ServiceContext,
    StorageContext,
    VectorStoreIndex,
    load_index_from_storage,
)
from llama_index.core.node_parser import HierarchicalNodeParser
from llama_index.core.node_parser import get_leaf_nodes
from llama_index.core import StorageContext, load_index_from_storage
from llama_index.core.retrievers import AutoMergingRetriever
from llama_index.core.indices.postprocessor import SentenceTransformerRerank
from llama_index.core.query_engine import RetrieverQueryEngine


def build_automerging_index(
    documents,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index",
    chunk_sizes=None,
):
    chunk_sizes = chunk_sizes or [2048, 512, 128]
    node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=chunk_sizes)
    nodes = node_parser.get_nodes_from_documents(documents)
    leaf_nodes = get_leaf_nodes(nodes)
    merging_context = ServiceContext.from_defaults(
        llm=llm,
        embed_model=embed_model,
    )
    storage_context = StorageContext.from_defaults()
    storage_context.docstore.add_documents(nodes)

    if not os.path.exists(save_dir):
        automerging_index = VectorStoreIndex(
            leaf_nodes, storage_context=storage_context, service_context=merging_context
        )
        automerging_index.storage_context.persist(persist_dir=save_dir)
    else:
        automerging_index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=save_dir),
            service_context=merging_context,
        )
    return automerging_index


def get_automerging_query_engine(
    automerging_index,
    similarity_top_k=12,
    rerank_top_n=6,
):
    base_retriever = automerging_index.as_retriever(similarity_top_k=similarity_top_k)
    retriever = AutoMergingRetriever(
        base_retriever, automerging_index.storage_context, verbose=True
    )
    rerank = SentenceTransformerRerank(
        top_n=rerank_top_n, model="BAAI/bge-reranker-base"
    )
    auto_merging_engine = RetrieverQueryEngine.from_args(
        retriever, node_postprocessors=[rerank]
    )
    return auto_merging_engine

In [17]:
from llama_index.llms.openai import OpenAI

index = build_automerging_index(
    [document],
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    save_dir="./merging_index",
)


In [18]:
query_engine = get_automerging_query_engine(index, similarity_top_k=6)

## TruLens Evaluation

In [19]:
from trulens_eval import Tru

Tru().reset_database()

🦑 Tru initialized with db url sqlite:///default.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of Tru` to prevent this.


### Two layers

In [20]:
auto_merging_index_0 = build_automerging_index(
    documents,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index_0",
    chunk_sizes=[2048,512],
)

In [21]:
auto_merging_engine_0 = get_automerging_query_engine(
    auto_merging_index_0,
    similarity_top_k=12,
    rerank_top_n=6,
)

In [22]:
from utils import get_prebuilt_trulens_recorder

tru_recorder = get_prebuilt_trulens_recorder(
    auto_merging_engine_0,
    app_id ='app_0'
)

✅ In Answer Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Answer Relevance, input response will be set to __record__.main_output or `Select.RecordOutput` .
✅ In Context Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Context Relevance, input response will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input source will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input statement will be set to __record__.main_output or `Select.RecordOutput` .


[nltk_data] Downloading package punkt to
[nltk_data]     /home/dcsmahasiswa1/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [23]:
eval_questions = []
with open('prompt/questions.txt', 'r') as file:
    for line in file:
        # Remove newline character and convert to integer
        item = line.strip()
        eval_questions.append(item)

In [24]:
def run_evals(eval_questions, tru_recorder, query_engine):
    for question in eval_questions:
        with tru_recorder as recording:
            response = query_engine.query(question)

In [25]:
run_evals(eval_questions, tru_recorder, auto_merging_engine_0)

> Merging 1 nodes into parent node.
> Parent node id: 9866e9a3-a43b-43ba-92c4-ed071069faf4.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...

> Merging 1 nodes into parent node.
> Parent node id: bfff5276-6df4-47e7-8d44-92bffa34735d.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 2 nodes into parent node.
> Parent node id: 9f3128f5-5b7e-489d-8bea-b3939985a6d1.
> Parent node text: PAGE 20Working on projects requires making tough choices about what to build and how to go 
about...

> Merging 1 nodes into parent node.
> Parent node id: 7004df90-801e-4899-89b2-b569e891e368.
> Parent node text: PAGE 28Using Informational 
Interviews to Find 
the Right JobCHAPTER 8
JOBS

> Merging 1 nodes into parent node.
> Parent node id: 29470da0-749d-4b2b-8f16-95e2da57c46a.
> Parent node text: PAGE 30Finding someone to interview isn’t always easy, bu

huggingface/tokenizers: The current process just got forked, after parallelism has already been used. Disabling parallelism to avoid deadlocks...
	- Avoid using `tokenizers` before the fork if possible
	- Explicitly set the environment variable TOKENIZERS_PARALLELISM=(true | false)


Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: 214e7c0d-b34a-4ca5-b9cc-64e9843432bd.
> Parent node text: PAGE 16Determine milestones. Once you’ve deemed a project sufficiently 
valuable, the next step i...

> Merging 1 nodes into parent node.
> Parent node id: 9866e9a3-a43b-43ba-92c4-ed071069faf4.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...

> Merging 2 nodes into parent node.
> Parent node id: 821fae8a-1a93-4cab-a00f-729251efd0a9.
> Parent node text: PAGE 9In the previous chapter, I introduced three key steps for building a career in AI: learning...

> Merging 1 nodes into parent node.
> Parent node id: 35b57086-26c6-4e74-ac74-4a32b9221783.
> Parent node text: PAGE 3Table of 
ContentsIntroduction: Coding AI is the New Literacy.
Chapter 1: Three Steps to Ca...

> Merging 1 nodes into parent node.
> Parent node id: a921d792-d8d4-46cc-92c3-7924adb2114e.
> Parent node text: PAGE 18It goes without saying th

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: 9866e9a3-a43b-43ba-92c4-ed071069faf4.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...

> Merging 1 nodes into parent node.
> Parent node id: bfff5276-6df4-47e7-8d44-92bffa34735d.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 5ae708b4-5a7b-47fe-86f4-95cce3d58c44.
> Parent node text: PAGE 38Before we dive into the final chapter of this book, I’d like to address the serious matter...

> Merging 1 nodes into parent node.
> Parent node id: f7c2b113-024f-435d-9135-1028ecf53e83.
> Parent node text: PAGE 26If you’re considering a role switch, a startup can be an easier place to do it than a big ...

> Merging 1 nodes into parent node.
> Parent node id: a921d792-d8d4-46cc-92c3-7924adb2114e.
> Parent node text: PAGE 18It goes without saying th

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: 214e7c0d-b34a-4ca5-b9cc-64e9843432bd.
> Parent node text: PAGE 16Determine milestones. Once you’ve deemed a project sufficiently 
valuable, the next step i...

> Merging 1 nodes into parent node.
> Parent node id: b0a5cc01-6652-4c95-aaab-d9d334c9c2f9.
> Parent node text: PAGE 13Should you Learn Math to Get a Job in AI? CHAPTER 3
Is math a foundational skill for AI? I...

> Merging 1 nodes into parent node.
> Parent node id: 9866e9a3-a43b-43ba-92c4-ed071069faf4.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...

> Merging 1 nodes into parent node.
> Parent node id: a921d792-d8d4-46cc-92c3-7924adb2114e.
> Parent node text: PAGE 18It goes without saying that we should only work on projects that are responsible, ethical,...

> Merging 1 nodes into parent node.
> Parent node id: 8e83bbe2-a29f-4555-8082-a34d1e26a65f.
> Parent node text: PAGE 10This is a lot to learn!
E

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: 9866e9a3-a43b-43ba-92c4-ed071069faf4.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...

> Merging 1 nodes into parent node.
> Parent node id: 0dd7e278-3070-4641-a671-7b0f4c20e389.
> Parent node text: PAGE 39My three-year-old daughter (who can barely count to 12) regularly tries to teach things to...

> Merging 1 nodes into parent node.
> Parent node id: bfff5276-6df4-47e7-8d44-92bffa34735d.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 214e7c0d-b34a-4ca5-b9cc-64e9843432bd.
> Parent node text: PAGE 16Determine milestones. Once you’ve deemed a project sufficiently 
valuable, the next step i...

> Merging 1 nodes into parent node.
> Parent node id: a921d792-d8d4-46cc-92c3-7924adb2114e.
> Parent node text: PAGE 18It goes without saying th

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: b0a5cc01-6652-4c95-aaab-d9d334c9c2f9.
> Parent node text: PAGE 13Should you Learn Math to Get a Job in AI? CHAPTER 3
Is math a foundational skill for AI? I...

> Merging 1 nodes into parent node.
> Parent node id: 35b57086-26c6-4e74-ac74-4a32b9221783.
> Parent node text: PAGE 3Table of 
ContentsIntroduction: Coding AI is the New Literacy.
Chapter 1: Three Steps to Ca...

> Merging 1 nodes into parent node.
> Parent node id: 8e83bbe2-a29f-4555-8082-a34d1e26a65f.
> Parent node text: PAGE 10This is a lot to learn!
Even after you master everything on this list, I hope you’ll keep ...

> Merging 2 nodes into parent node.
> Parent node id: 821fae8a-1a93-4cab-a00f-729251efd0a9.
> Parent node text: PAGE 9In the previous chapter, I introduced three key steps for building a career in AI: learning...

> Merging 1 nodes into parent node.
> Parent node id: 9866e9a3-a43b-43ba-92c4-ed071069faf4.
> Parent node text: PAGE 4Coding AI Is the New Liter

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: 9866e9a3-a43b-43ba-92c4-ed071069faf4.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...

> Merging 1 nodes into parent node.
> Parent node id: bfff5276-6df4-47e7-8d44-92bffa34735d.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 214e7c0d-b34a-4ca5-b9cc-64e9843432bd.
> Parent node text: PAGE 16Determine milestones. Once you’ve deemed a project sufficiently 
valuable, the next step i...

> Merging 1 nodes into parent node.
> Parent node id: a921d792-d8d4-46cc-92c3-7924adb2114e.
> Parent node text: PAGE 18It goes without saying that we should only work on projects that are responsible, ethical,...

> Merging 1 nodes into parent node.
> Parent node id: a16cc012-5bc6-4b76-a154-8127ef388c78.
> Parent node text: PAGE 23Each project is only one 

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: 9866e9a3-a43b-43ba-92c4-ed071069faf4.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...

> Merging 1 nodes into parent node.
> Parent node id: 5ae708b4-5a7b-47fe-86f4-95cce3d58c44.
> Parent node text: PAGE 38Before we dive into the final chapter of this book, I’d like to address the serious matter...

> Merging 1 nodes into parent node.
> Parent node id: 1f97200e-31dd-41e0-9897-08008421ca8a.
> Parent node text: PAGE 15One of the most important skills of an AI architect is the ability to identify ideas that ...

> Merging 1 nodes into parent node.
> Parent node id: 35b57086-26c6-4e74-ac74-4a32b9221783.
> Parent node text: PAGE 3Table of 
ContentsIntroduction: Coding AI is the New Literacy.
Chapter 1: Three Steps to Ca...

> Merging 1 nodes into parent node.
> Parent node id: a16cc012-5bc6-4b76-a154-8127ef388c78.
> Parent node text: PAGE 23Each project is only one 

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: bfff5276-6df4-47e7-8d44-92bffa34735d.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: a921d792-d8d4-46cc-92c3-7924adb2114e.
> Parent node text: PAGE 18It goes without saying that we should only work on projects that are responsible, ethical,...

> Merging 1 nodes into parent node.
> Parent node id: b0a5cc01-6652-4c95-aaab-d9d334c9c2f9.
> Parent node text: PAGE 13Should you Learn Math to Get a Job in AI? CHAPTER 3
Is math a foundational skill for AI? I...

> Merging 1 nodes into parent node.
> Parent node id: 35b57086-26c6-4e74-ac74-4a32b9221783.
> Parent node text: PAGE 3Table of 
ContentsIntroduction: Coding AI is the New Literacy.
Chapter 1: Three Steps to Ca...

> Merging 1 nodes into parent node.
> Parent node id: 9866e9a3-a43b-43ba-92c4-ed071069faf4.
> Parent node text: PAGE 4Coding AI Is the New Liter

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: 8e83bbe2-a29f-4555-8082-a34d1e26a65f.
> Parent node text: PAGE 10This is a lot to learn!
Even after you master everything on this list, I hope you’ll keep ...

> Merging 1 nodes into parent node.
> Parent node id: a921d792-d8d4-46cc-92c3-7924adb2114e.
> Parent node text: PAGE 18It goes without saying that we should only work on projects that are responsible, ethical,...

> Merging 1 nodes into parent node.
> Parent node id: 9866e9a3-a43b-43ba-92c4-ed071069faf4.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...

> Merging 1 nodes into parent node.
> Parent node id: 5ae708b4-5a7b-47fe-86f4-95cce3d58c44.
> Parent node text: PAGE 38Before we dive into the final chapter of this book, I’d like to address the serious matter...

> Merging 1 nodes into parent node.
> Parent node id: 214e7c0d-b34a-4ca5-b9cc-64e9843432bd.
> Parent node text: PAGE 16Determine milestones. Onc

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

In [30]:
from trulens_eval import Tru

Tru().get_leaderboard(app_ids=[])

Unnamed: 0_level_0,Groundedness,Context Relevance,Answer Relevance,latency,total_cost
app_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
app_0,0.8125,0.33,0.82,6.4,0.003854


### Three layers

In [32]:
auto_merging_index_1 = build_automerging_index(
    documents,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index_1",
    chunk_sizes=[2048,512,128],
)

In [33]:
auto_merging_engine_1 = get_automerging_query_engine(
    auto_merging_index_1,
    similarity_top_k=12,
    rerank_top_n=6,
)


In [34]:
tru_recorder = get_prebuilt_trulens_recorder(
    auto_merging_engine_1,
    app_id ='app_1'
)

✅ In Answer Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Answer Relevance, input response will be set to __record__.main_output or `Select.RecordOutput` .
✅ In Context Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Context Relevance, input response will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input source will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input statement will be set to __record__.main_output or `Select.RecordOutput` .


[nltk_data] Downloading package punkt to
[nltk_data]     /home/dcsmahasiswa1/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [35]:
run_evals(eval_questions, tru_recorder, auto_merging_engine_1)

> Merging 2 nodes into parent node.
> Parent node id: bc13074d-b75a-4e29-a20b-b60238bf2198.
> Parent node text: PAGE 30Finding someone to interview isn’t always easy, but many people who are in senior position...

> Merging 1 nodes into parent node.
> Parent node id: b8999b4d-203e-40ca-adcc-05d434785cae.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: d26b9927-6a01-4241-b692-1020d6654606.
> Parent node text: PAGE 30Finding someone to interview isn’t always easy, but many people who are in senior position...

> Merging 1 nodes into parent node.
> Parent node id: 7aae1fd5-0883-42b2-90d4-6cc03f6dc0a0.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...



Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: f3832304-359b-4003-bf93-0c7380bb26c9.
> Parent node text: I’ve found EDA particularly 
useful in data-centric AI development, where analyzing errors and ga...



Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: b8999b4d-203e-40ca-adcc-05d434785cae.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 7aae1fd5-0883-42b2-90d4-6cc03f6dc0a0.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...



Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: b8999b4d-203e-40ca-adcc-05d434785cae.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 7aae1fd5-0883-42b2-90d4-6cc03f6dc0a0.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...



Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: f3832304-359b-4003-bf93-0c7380bb26c9.
> Parent node text: I’ve found EDA particularly 
useful in data-centric AI development, where analyzing errors and ga...



Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: b8999b4d-203e-40ca-adcc-05d434785cae.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 7aae1fd5-0883-42b2-90d4-6cc03f6dc0a0.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...



Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 4 nodes into parent node.
> Parent node id: f9dc6b70-41a0-4907-9130-6703e9488631.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...

> Merging 1 nodes into parent node.
> Parent node id: 3ebaee5a-f148-4b35-ae77-211bd204defc.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...



Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: b8999b4d-203e-40ca-adcc-05d434785cae.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 24901ded-2b7a-4c88-b4c6-2b2d29dcc723.
> Parent node text: If you’re leaving 
a job, exit gracefully. Give your employer ample notice, give your full effort...

> Merging 1 nodes into parent node.
> Parent node id: 7aae1fd5-0883-42b2-90d4-6cc03f6dc0a0.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...



Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

> Merging 3 nodes into parent node.
> Parent node id: 27339558-e431-40a4-b78d-50788291f048.
> Parent node text: PAGE 10This is a lot to learn!
Even after you master everything on this list, I hope you’ll keep ...

> Merging 1 nodes into parent node.
> Parent node id: 24e91f48-f5f6-4c51-9513-f2a35196917a.
> Parent node text: PAGE 10This is a lot to learn!
Even after you master everything on this list, I hope you’ll keep ...



In [38]:
from trulens_eval import Tru

Tru().get_leaderboard(app_ids=[])

Unnamed: 0_level_0,Groundedness,Context Relevance,Answer Relevance,latency,total_cost
app_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
app_1,0.676667,0.31,0.82,6.4,0.001394
app_0,0.52,0.33,0.82,6.4,0.003854


In [37]:
# Tru().run_dashboard()

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/4 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

### Five layers

In [39]:
auto_merging_index_3 = build_automerging_index(
    documents,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index_1",
    chunk_sizes=[2048,1024,512,256,128],
)

In [40]:
auto_merging_engine_3 = get_automerging_query_engine(
    auto_merging_index_3,
    similarity_top_k=12,
    rerank_top_n=6,
)


In [41]:
tru_recorder = get_prebuilt_trulens_recorder(
    auto_merging_engine_3,
    app_id ='app_3'
)

✅ In Answer Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Answer Relevance, input response will be set to __record__.main_output or `Select.RecordOutput` .
✅ In Context Relevance, input prompt will be set to __record__.main_input or `Select.RecordInput` .
✅ In Context Relevance, input response will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input source will be set to __record__.app.query.rets.source_nodes[:].node.text .
✅ In Groundedness, input statement will be set to __record__.main_output or `Select.RecordOutput` .


[nltk_data] Downloading package punkt to
[nltk_data]     /home/dcsmahasiswa1/nltk_data...
[nltk_data]   Package punkt is already up-to-date!


In [42]:
run_evals(eval_questions, tru_recorder, auto_merging_engine_3)

> Merging 2 nodes into parent node.
> Parent node id: bc13074d-b75a-4e29-a20b-b60238bf2198.
> Parent node text: PAGE 30Finding someone to interview isn’t always easy, but many people who are in senior position...

> Merging 1 nodes into parent node.
> Parent node id: b8999b4d-203e-40ca-adcc-05d434785cae.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: d26b9927-6a01-4241-b692-1020d6654606.
> Parent node text: PAGE 30Finding someone to interview isn’t always easy, but many people who are in senior position...

> Merging 1 nodes into parent node.
> Parent node id: 7aae1fd5-0883-42b2-90d4-6cc03f6dc0a0.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...



Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: f3832304-359b-4003-bf93-0c7380bb26c9.
> Parent node text: I’ve found EDA particularly 
useful in data-centric AI development, where analyzing errors and ga...



Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: b8999b4d-203e-40ca-adcc-05d434785cae.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 7aae1fd5-0883-42b2-90d4-6cc03f6dc0a0.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...



Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: b8999b4d-203e-40ca-adcc-05d434785cae.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 7aae1fd5-0883-42b2-90d4-6cc03f6dc0a0.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...



Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: f3832304-359b-4003-bf93-0c7380bb26c9.
> Parent node text: I’ve found EDA particularly 
useful in data-centric AI development, where analyzing errors and ga...



Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: b8999b4d-203e-40ca-adcc-05d434785cae.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 7aae1fd5-0883-42b2-90d4-6cc03f6dc0a0.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...



Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 4 nodes into parent node.
> Parent node id: f9dc6b70-41a0-4907-9130-6703e9488631.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...

> Merging 1 nodes into parent node.
> Parent node id: 3ebaee5a-f148-4b35-ae77-211bd204defc.
> Parent node text: PAGE 4Coding AI Is the New Literacy
Today we take it for granted that many people know how to rea...



Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

> Merging 1 nodes into parent node.
> Parent node id: b8999b4d-203e-40ca-adcc-05d434785cae.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...

> Merging 1 nodes into parent node.
> Parent node id: 24901ded-2b7a-4c88-b4c6-2b2d29dcc723.
> Parent node text: If you’re leaving 
a job, exit gracefully. Give your employer ample notice, give your full effort...

> Merging 1 nodes into parent node.
> Parent node id: 7aae1fd5-0883-42b2-90d4-6cc03f6dc0a0.
> Parent node text: PAGE 2"AI is the new 
electricity. It will 
transform and improve 
all areas of human life."
Andr...



Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

> Merging 3 nodes into parent node.
> Parent node id: 27339558-e431-40a4-b78d-50788291f048.
> Parent node text: PAGE 10This is a lot to learn!
Even after you master everything on this list, I hope you’ll keep ...

> Merging 1 nodes into parent node.
> Parent node id: 24e91f48-f5f6-4c51-9513-f2a35196917a.
> Parent node text: PAGE 10This is a lot to learn!
Even after you master everything on this list, I hope you’ll keep ...



Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/1 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/2 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

Groundedness per statement in source:   0%|          | 0/3 [00:00<?, ?it/s]

In [2]:
from trulens_eval import Tru

Tru().get_leaderboard(app_ids=[])

🦑 Tru initialized with db url sqlite:///default.sqlite .
🛑 Secret keys may be written to the database. See the `database_redact_keys` option of Tru` to prevent this.


Unnamed: 0_level_0,Context Relevance,Answer Relevance,Groundedness,latency,total_cost
app_id,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1
app_0,0.33,0.82,0.52,6.4,0.003854
app_1,0.31,0.82,0.676667,6.4,0.001394
app_3,0.303333,0.75,0.62,6.4,0.001378


In [3]:
Tru().run_dashboard()

Starting dashboard ...
Config file already exists. Skipping writing process.
Credentials file already exists. Skipping writing process.


Accordion(children=(VBox(children=(VBox(children=(Label(value='STDOUT'), Output())), VBox(children=(Label(valu…

RuntimeError: Dashboard failed to start in time. Please inspect dashboard logs for additional information.