### Auto merging Retrieval

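Auto-merging retrieval organises a document as a recursive tree of parent and child nodes, retrieves the small leaf nodes that match a query, and merges them into their parent when enough of that parent's children are retrieved.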
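
The merging rule can be illustrated without any library. The sketch below is a simplified, assumed version of the idea rather than the library's implementation: when more than a chosen fraction of a parent's children are among the retrieved nodes, those children are swapped for the parent. The names `merge_retrieved`, `parent_of`, `children_of` and the `ratio_threshold` value of 0.5 are illustrative and do not come from the notebook.

```python
from collections import defaultdict


def merge_retrieved(retrieved, parent_of, children_of, ratio_threshold=0.5):
    """Replace retrieved sibling nodes with their parent when enough siblings were hit.

    retrieved:   set of node ids returned by the base retriever
    parent_of:   dict mapping child id -> parent id (root nodes are absent)
    children_of: dict mapping parent id -> list of child ids
    """
    current = set(retrieved)
    changed = True
    while changed:  # repeat so that merges can propagate up the tree
        changed = False
        hits = defaultdict(set)
        for node_id in current:
            parent = parent_of.get(node_id)
            if parent is not None:
                hits[parent].add(node_id)
        for parent, hit_children in hits.items():
            ratio = len(hit_children) / len(children_of[parent])
            if ratio > ratio_threshold:
                current -= hit_children
                current.add(parent)
                changed = True
    return current


# Toy tree (hypothetical ids): one root, two mid-level nodes, four leaves each.
children_of = {
    "root": ["mid_a", "mid_b"],
    "mid_a": ["a1", "a2", "a3", "a4"],
    "mid_b": ["b1", "b2", "b3", "b4"],
}
parent_of = {child: parent for parent, kids in children_of.items() for child in kids}

merged = merge_retrieved({"a1", "a2", "a3", "b1"}, parent_of, children_of)
print(sorted(merged))  # ['b1', 'mid_a']
```

Three of the four children of `mid_a` were retrieved, so they collapse into `mid_a`; only one child of `mid_b` was retrieved, so `b1` stays as it is.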


In [9]:
import os
import openai
from dotenv import load_dotenv
from llama_index.llms.openai import OpenAI
from llama_index.readers.file.base import SimpleDirectoryReader
from llama_index.retrievers.auto_merging_retriever import AutoMergingRetriever
from llama_index.indices.vector_store import VectorStoreIndex
from llama_index.node_parser import HierarchicalNodeParser

In [10]:
load_dotenv()

True

In [11]:
documents = SimpleDirectoryReader(
    input_files=["./MIV2 - LLM paper.pdf"]
).load_data()

In [12]:
print(type(documents), "\n")
print(len(documents), "\n")
print(type(documents[0]))
print(documents[0])

<class 'list'> 

17 

<class 'llama_index.schema.Document'>
Doc ID: 623338b3-1043-4fbd-b40a-eb09c9fbeeed
Text: Human-Robot interaction through joint robot planning with Large
Language Models Kosi Asuzu1* 1*Birmingham City University. Abstract
Large Language Models (LLMs) have demonstrated remarkable zero-shot
generalisation capa- bilities, expanding their utility beyond natural
language processing into various applications. Leveraging extensive
web knowl...


#### Auto merging retrieval setup

In [13]:
from llama_index.schema import Document

In [14]:
document = Document(text="\n\n".join([doc.text for doc in documents]))

In [15]:
document.text[:10]

'Human-Robo'

In [16]:
node_parser = HierarchicalNodeParser.from_defaults(
    chunk_sizes=[2048, 512, 128]
)

In [17]:
nodes = node_parser.get_nodes_from_documents(documents=[document])

In [18]:
len(nodes)

170

In [30]:
# get_leaf_nodes returns only the leaf nodes (nodes without children) of the hierarchy
from llama_index.node_parser import get_leaf_nodes

leaf_nodes = get_leaf_nodes(nodes)

In [20]:
len(leaf_nodes)

132
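
Of the 170 nodes, 132 are leaves, so the remaining 38 are parents at the two coarser levels. The sketch below shows how nested chunk sizes produce such a tree and why the leaf count is smaller than the total. It is a simplification under stated assumptions: it splits by plain character counts, and `build_hierarchy` and `leaves` are hypothetical helpers, not part of `llama_index`.

```python
def build_hierarchy(text, chunk_sizes):
    """Split text into nested chunks and return a flat list of node dicts."""
    nodes = []

    def split(segment, level, parent_id):
        size = chunk_sizes[level]
        for start in range(0, len(segment), size):
            node_id = len(nodes)
            piece = segment[start:start + size]
            nodes.append(
                {"id": node_id, "parent": parent_id, "level": level, "text": piece}
            )
            # Recurse into the next, smaller chunk size if there is one.
            if level + 1 < len(chunk_sizes):
                split(piece, level + 1, node_id)

    split(text, 0, None)
    return nodes


def leaves(nodes):
    """Leaf nodes are those that no other node names as its parent."""
    parent_ids = {n["parent"] for n in nodes if n["parent"] is not None}
    return [n for n in nodes if n["id"] not in parent_ids]


toy_nodes = build_hierarchy("x" * 4096, chunk_sizes=[2048, 512, 128])
toy_leaves = leaves(toy_nodes)
print(len(toy_nodes), len(toy_leaves))  # 42 32
```

With 4096 characters the toy tree has 2 top-level chunks, 8 mid-level chunks and 32 leaves, 42 nodes in total. Only the 32 leaves would be embedded; the other 10 exist so that leaves can be merged upwards.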

In [21]:
print(leaf_nodes[30].text)

Instead, reliance is placed
on the LLM-induced policies in their original form.
Recent research endeavors have prominently concentrated on formulating task plans grounded
in high-level natural language descriptions, thereby endowing robots with the capability to execute
intricate tasks with minimal human intervention [34] [21] [18]. Yu et al. advanced the field by utilizing
a language model to generate rewards applicable to robots for skill synthesis [19].


In [22]:
type(nodes[0])

llama_index.schema.TextNode

In [23]:
from llama_index.schema import TextNode

In [24]:
nodes_by_id = {node.node_id: node for node in nodes}

parent_node: TextNode = nodes_by_id[leaf_nodes[30].parent_node.node_id]
print(parent_node.text)

Seminal works by Brohan et al. and Jiang et al. have
demonstrated the feasibility of converting human instructions into robot actions. LLMs have been
employed to improve robotic task planning and execution [37] [20]. TidyBot show how LLMs can be
used in the personalisation of robot policy [2].
Prior studies have delved into the application of Large Language Models (LLMs) in planning
within a communicative environment [47]. In their work, a specialized framework was formulated for
cooperative agents operating within a multi-agent embodied environment. Showcasing the capabilities
of the GPT-4 model, the investigation illustrated its capacity to outperform robust planning-based
methods, exemplifying emergent effective communication devoid of the necessity for fine-tuning.
The LLM-MCTS (Monte Carlo Tree Search) investigation contributed valuable insights by reveal-
ing that LLMs not only provide a policy for action but also offer a commonsense model of the world
[45]. Monte Carlo Tree Sear
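
The same id lookup can be repeated to climb from a leaf to the top of the tree. A minimal sketch, assuming the parent links are held in a plain dict; the ids `leaf_30`, `mid_7` and `top_1` and the helper `ancestor_chain` are made up for illustration.

```python
def ancestor_chain(node_id, parent_of):
    """Return [node_id, parent, grandparent, ...] ending at the root node."""
    chain = [node_id]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return chain


# Hypothetical ids standing in for a 128-, 512- and 2048-sized chunk.
parent_of = {"leaf_30": "mid_7", "mid_7": "top_1"}
print(ancestor_chain("leaf_30", parent_of))  # ['leaf_30', 'mid_7', 'top_1']
```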

#### Now it is time to build the index
Building the index takes two steps:
- Build the Service Context
- Build the Vector Index

In [25]:
mistral7b = OpenAI(model="mistralai/Mistral-7B-Instruct-v0.2")

In [26]:
from llama_index.service_context import ServiceContext

automerge_context = ServiceContext.from_defaults(
    llm=mistral7b,
    node_parser=node_parser,
    embed_model="local:BAAI/bge-small-en-v1.5"
)


In [27]:
from llama_index.storage import StorageContext

storage_context = StorageContext.from_defaults()
storage_context.docstore.add_documents(nodes)

In [29]:
# Build the vector index over the leaf nodes only. The storage context holds every
# node (parents included); the service context supplies the LLM and embedding model.
automerging_index = VectorStoreIndex(
    leaf_nodes, storage_context=storage_context, service_context=automerge_context
)

automerging_index.storage_context.persist(persist_dir="./cache")

In [31]:
from llama_index.indices import load_index_from_storage

# load_index_from_storage returns a single index;
# load_indices_from_storage would return a list of indices instead.
if os.path.exists("./cache"):
    query_index = load_index_from_storage(
        storage_context=StorageContext.from_defaults(persist_dir="./cache"),
        service_context=automerge_context,
    )
else:
    query_index = VectorStoreIndex(
        leaf_nodes, storage_context=storage_context, service_context=automerge_context
    )
    query_index.storage_context.persist(persist_dir="./cache")
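
The load-or-build pattern in this cell can be shown in isolation. The sketch below is an assumption-laden stand-in: it caches a plain dict as JSON rather than a vector index, and `build_or_load`, `expensive_build` and the file name `index.json` are hypothetical.

```python
import json
import os
import tempfile


def build_or_load(cache_dir, build_fn):
    """Return cached data from cache_dir if present, otherwise build and persist it."""
    cache_file = os.path.join(cache_dir, "index.json")
    if os.path.exists(cache_file):
        with open(cache_file, "r", encoding="utf-8") as f:
            return json.load(f), "loaded"
    data = build_fn()
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_file, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return data, "built"


calls = []


def expensive_build():
    """Stand-in for embedding all leaf nodes; records how often it runs."""
    calls.append(1)
    return {"vectors": [[0.1, 0.2], [0.3, 0.4]]}


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = os.path.join(tmp, "cache")
    first, first_source = build_or_load(cache_dir, expensive_build)
    second, second_source = build_or_load(cache_dir, expensive_build)

print(first_source, second_source, len(calls))  # built loaded 1
```

The sketch checks for a specific file rather than for the directory alone, since an empty or partially written directory would otherwise be mistaken for a valid cache.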

In [32]:
automerge_retriever = automerging_index.as_retriever(similarity_top_k=12)


In [33]:
retriever = AutoMergingRetriever(
    automerge_retriever,
    automerging_index.storage_context,
    verbose=True
)

In [34]:
from llama_index.postprocessor import SentenceTransformerRerank

rerank = SentenceTransformerRerank(top_n=6, model="BAAI/bge-reranker-base")



KeyboardInterrupt: 

In [35]:
from llama_index.query_engine import RetrieverQueryEngine

In [36]:
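# Pass the AutoMergingRetriever (`retriever`), not the base retriever,
# so that retrieved leaf nodes can be merged into their parents.
auto_merging_engine = RetrieverQueryEngine.from_args(
    retriever, node_postprocessors=[rerank]
)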

NameError: name 'rerank' is not defined

In [None]:
auto_merging_response = auto_merging_engine.query(
    "What is the importance of networking in AI?"
)


In [None]:
from llama_index.response.notebook_utils import display_response

display_response(auto_merging_response)

In [37]:
from llama_index.indices import load_index_from_storage

In [39]:
def build_automerging_index(
    documents,
    llm,
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index",
    chunk_sizes=None,
):
    chunk_sizes = chunk_sizes or [2048, 512, 128]
    node_parser = HierarchicalNodeParser.from_defaults(chunk_sizes=chunk_sizes)
    nodes = node_parser.get_nodes_from_documents(documents)
    leaf_nodes = get_leaf_nodes(nodes)
    merging_context = ServiceContext.from_defaults(
        llm=llm,
        embed_model=embed_model,
    )
    storage_context = StorageContext.from_defaults()
    storage_context.docstore.add_documents(nodes)

    if not os.path.exists(save_dir):
        automerging_index = VectorStoreIndex(
            leaf_nodes, storage_context=storage_context, service_context=merging_context
        )
        automerging_index.storage_context.persist(persist_dir=save_dir)
    else:
        automerging_index = load_index_from_storage(
            StorageContext.from_defaults(persist_dir=save_dir),
            service_context=merging_context,
        )
    return automerging_index


def get_automerging_query_engine(
    automerging_index,
    similarity_top_k=12,
    rerank_top_n=6,
):
    base_retriever = automerging_index.as_retriever(similarity_top_k=similarity_top_k)
    retriever = AutoMergingRetriever(
        base_retriever, automerging_index.storage_context, verbose=True
    )
    rerank = SentenceTransformerRerank(
        top_n=rerank_top_n, model="BAAI/bge-reranker-base"
    )
    auto_merging_engine = RetrieverQueryEngine.from_args(
        retriever, node_postprocessors=[rerank]
    )
    return auto_merging_engine
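
`get_automerging_query_engine` chains three stages: the base retriever fetches `similarity_top_k` leaf nodes, the `AutoMergingRetriever` merges them where possible, and the reranker keeps the best `rerank_top_n`. The sketch below illustrates only the first and last stages (merging is left out). The scoring functions `word_overlap` and `jaccard` are toy stand-ins, assumed here in place of embedding similarity and a cross-encoder reranker.

```python
def word_overlap(query, chunk):
    """Cheap score: number of shared lowercase words (stand-in for embedding similarity)."""
    return len(set(query.lower().split()) & set(chunk.lower().split()))


def jaccard(query, chunk):
    """Finer score: shared words over all words (stand-in for a reranking model)."""
    q, c = set(query.lower().split()), set(chunk.lower().split())
    return len(q & c) / len(q | c)


def retrieve_then_rerank(query, chunks, coarse_score, fine_score,
                         similarity_top_k=12, rerank_top_n=6):
    """Keep the top_k chunks by the coarse score, then the top_n of those by the fine score."""
    candidates = sorted(
        chunks, key=lambda c: coarse_score(query, c), reverse=True
    )[:similarity_top_k]
    reranked = sorted(candidates, key=lambda c: fine_score(query, c), reverse=True)
    return reranked[:rerank_top_n]


chunks = [
    "robots plan tasks with language models",
    "language models generate rewards for robots and many other unrelated things besides",
    "monte carlo tree search",
    "robots",
]
top = retrieve_then_rerank(
    "robots language models", chunks, word_overlap, jaccard,
    similarity_top_k=3, rerank_top_n=2,
)
print(top)  # ['robots plan tasks with language models', 'robots']
```

The long second chunk passes the coarse stage because it shares three words with the query, but the finer score penalises its unrelated words and it is dropped after reranking. Retrieving more candidates than are finally kept gives the reranker room to make this kind of correction.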

In [40]:
from llama_index.llms import OpenAI

index = build_automerging_index(
    [document],
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    save_dir="./merging_index",
)


KeyboardInterrupt: 

In [None]:
query_engine = get_automerging_query_engine(index, similarity_top_k=6)

In [None]:
from trulens_eval import Tru

Tru().reset_database()

#### Two layers

In [None]:
auto_merging_index_0 = build_automerging_index(
    documents,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index_0",
    chunk_sizes=[2048,512],
)

In [None]:
auto_merging_engine_0 = get_automerging_query_engine(
    auto_merging_index_0,
    similarity_top_k=12,
    rerank_top_n=6,
)

In [None]:
from utils import get_prebuilt_trulens_recorder

tru_recorder = get_prebuilt_trulens_recorder(
    auto_merging_engine_0,
    app_id ='app_0'
)

In [None]:
eval_questions = []
with open('generated_questions.text', 'r') as file:
    for line in file:
        # Strip surrounding whitespace, including the trailing newline; each line is one question
        item = line.strip()
        eval_questions.append(item)
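
A slightly more defensive variant of this loop skips blank lines, which would otherwise be sent to the query engine as empty questions. This is a sketch: `load_questions` and the sample file are hypothetical, and the file is written to a temporary directory so the example runs on its own.

```python
import os
import tempfile


def load_questions(path):
    """Read one question per line, stripping whitespace and skipping blank lines."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample_questions.txt")
    with open(sample_path, "w", encoding="utf-8") as f:
        f.write("What is an LLM?\n\n  How do robots plan tasks?  \n")
    questions = load_questions(sample_path)

print(questions)  # ['What is an LLM?', 'How do robots plan tasks?']
```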

In [None]:
def run_evals(eval_questions, tru_recorder, query_engine):
    for question in eval_questions:
        with tru_recorder as recording:
            response = query_engine.query(question)
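
The loop above relies on the recorder capturing each call as a side effect and discards `response`. A variant that also returns the responses makes them easy to inspect afterwards. In this sketch `ListRecorder` and `EchoEngine` are hypothetical stand-ins for the TruLens recorder and the query engine, used only so the example runs without external services.

```python
class ListRecorder:
    """Hypothetical recorder: a context manager that stores (question, answer) pairs."""

    def __init__(self):
        self.records = []

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        return False  # do not swallow exceptions raised inside the block


class EchoEngine:
    """Hypothetical query engine that returns a canned answer."""

    def query(self, question):
        return f"answer to: {question}"


def run_evals_collect(eval_questions, recorder, query_engine):
    """Query the engine once per question inside the recorder and return all responses."""
    responses = []
    for question in eval_questions:
        with recorder as recording:
            response = query_engine.query(question)
            recording.records.append((question, response))
        responses.append(response)
    return responses


recorder = ListRecorder()
responses = run_evals_collect(["q1", "q2"], recorder, EchoEngine())
print(responses)  # ['answer to: q1', 'answer to: q2']
```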

In [None]:
run_evals(eval_questions, tru_recorder, auto_merging_engine_0)

In [None]:
from trulens_eval import Tru

Tru().get_leaderboard(app_ids=[])

In [None]:
Tru().run_dashboard()

#### Three layers

In [None]:
auto_merging_index_1 = build_automerging_index(
    documents,
    llm=OpenAI(model="gpt-3.5-turbo", temperature=0.1),
    embed_model="local:BAAI/bge-small-en-v1.5",
    save_dir="merging_index_1",
    chunk_sizes=[2048,512,128],
)

In [None]:
auto_merging_engine_1 = get_automerging_query_engine(
    auto_merging_index_1,
    similarity_top_k=12,
    rerank_top_n=6,
)

In [None]:
tru_recorder = get_prebuilt_trulens_recorder(
    auto_merging_engine_1,
    app_id ='app_1'
)

In [None]:
run_evals(eval_questions, tru_recorder, auto_merging_engine_1)

In [None]:
from trulens_eval import Tru

Tru().get_leaderboard(app_ids=[])

In [None]:
Tru().run_dashboard()