# Multi-Query Exploration

Query expansion and multi-query engines:

- Multi-Step Query Engine: [doc](https://docs.llamaindex.ai/en/stable/examples/query_transformations/SimpleIndexDemo-multistep/)
- Sub-Question Query Engine: [doc](https://docs.llamaindex.ai/en/stable/examples/query_engine/sub_question_query_engine/)

In [None]:
%pip install llama-index-postprocessor-cohere-rerank

In [22]:
import os
from IPython.display import Markdown, display
from dotenv import find_dotenv, load_dotenv

load_dotenv(find_dotenv())

True

In [43]:
from llama_index.core import (
    Settings, VectorStoreIndex
)
from llama_index.core.callbacks import (
    TokenCountingHandler, 
    CallbackManager,
    LlamaDebugHandler,
)
from llama_index.llms.openai import OpenAI
from langchain_community.embeddings import VoyageEmbeddings
from llama_index.postprocessor.cohere_rerank import CohereRerank
from functions.embeddings.sfr_embeddings_llamaindex import SFREmbeddingsForLlamaIndex
from functions.rerank.sfr_reranker import SFRRerank

selected_embedding = "SFR"
if selected_embedding == "SFR":

    EMBEDDING_MODEL = "sfr_embedding_mistral"
    embedding_model = SFREmbeddingsForLlamaIndex()
    embedding_dimension = 4096
    
elif selected_embedding == "OPENAI":
    from llama_index.embeddings.openai import OpenAIEmbedding
    EMBEDDING_MODEL = "text-embedding-3-large"
    embedding_model = OpenAIEmbedding(model=EMBEDDING_MODEL)
    embedding_dimension = 3072

elif selected_embedding == "VOYAGE":
    EMBEDDING_MODEL = "voyage-2"  # Alternative: "voyage-lite-02-instruct"
    embedding_model = VoyageEmbeddings(model=EMBEDDING_MODEL, batch_size=12)
    embedding_dimension = 1024
    

GPT3_MODEL_NAME = "gpt-3.5-turbo-0125"
GPT4_MODEL_NAME = "gpt-4-0125-preview"
GPT4_0125_MODEL = "gpt-4-0125-preview"

# LlamaIndex
gpt3_model = OpenAI(model=GPT3_MODEL_NAME)
gpt4_model = OpenAI(model=GPT4_MODEL_NAME)

llama_debug = LlamaDebugHandler(print_trace_on_end=True)
token_counter = TokenCountingHandler()
Settings.llm = gpt4_model
Settings.embed_model = embedding_model
Settings.callback_manager = CallbackManager([token_counter, llama_debug])

In [3]:
import nest_asyncio
nest_asyncio.apply()

from llama_parse import LlamaParse

In [4]:
from pathlib import Path
from typing import Dict
from llama_index.core.readers.base import BaseReader
from llama_index.core import SimpleDirectoryReader

LLAMA_CLOUD_API_KEY = os.getenv("LLAMA_CLOUD_API_KEY")
llama_reader = LlamaParse(
    api_key=LLAMA_CLOUD_API_KEY,
    result_type="markdown",
    verbose=True
)
# Maps file extensions to reader instances. Despite the name, the values
# are readers (passed as `file_extractor`), not node parsers.
FILE_NODE_PARSERS: Dict[str, BaseReader] = {
    ".pdf": llama_reader
}

In [5]:
input_dir = "../../data/10k"
if Path(input_dir).exists():
    documents = SimpleDirectoryReader(
        input_dir=input_dir,
        file_extractor=FILE_NODE_PARSERS,
        recursive=True,
    ).load_data(num_workers=10)
    print(len(documents))

Started parsing the file under job_id d6dd55d4-e8f7-4a06-9713-600603e4aefc
1


In [None]:
from llama_index.core.node_parser import MarkdownElementNodeParser

node_parser = MarkdownElementNodeParser(
    llm=gpt3_model, 
    num_workers=8,
    verbose=False
)
nodes = node_parser.get_nodes_from_documents(documents)
base_nodes, objects = node_parser.get_nodes_and_objects(nodes)

In [7]:
recursive_index = VectorStoreIndex(nodes=base_nodes + objects)
raw_index = VectorStoreIndex.from_documents(documents)

In [10]:
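# reranker = SFRRerank(top_n=10)
# Note: top_n only caps the reranker's output. Each engine below retrieves
# similarity_top_k=3 nodes, so at most 3 nodes are ever returned.
reranker = CohereRerank(top_n=10)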

recursive_query_engine = recursive_index.as_query_engine(
    similarity_top_k=3, 
    node_postprocessors=[reranker], 
    verbose=False
)

raw_query_engine = raw_index.as_query_engine(
    similarity_top_k=3, 
    node_postprocessors=[reranker], 
    verbose=False
)

In [None]:
query = "What is Salesforce's strategy for Slack?  Evaluate how critical does Salesforce consider Slack to its business."

response_1 = raw_query_engine.query(query)
response_2 = recursive_query_engine.query(query)

In [16]:
print("\n***********Basic Query Engine***********")
print(response_1)


***********Basic Query Engine***********
Salesforce's strategy for Slack involves enhancing and improving its features, integrations, and capabilities, as well as introducing compelling new features, integrations, and capabilities that reflect or anticipate the changing nature of the market. This approach is aimed at attracting new users and organizations and increasing revenue from existing paid customers. Salesforce considers Slack to be critically important to its business, as evidenced by the fact that it was Salesforce's largest acquisition to date as of July 2021. Slack represents a relatively new category of business technology in a rapidly evolving market for software, programs, and tools used by knowledge workers. Salesforce's focus on Slack underscores the company's commitment to expanding its service offerings and adapting to the rapidly changing technological landscape to maintain and grow its customer base and revenue streams.


In [17]:
print("\n***********Recursive Retriever Query Engine***********")
print(response_2)


***********Recursive Retriever Query Engine***********
Salesforce's strategy for Slack involves enhancing and improving its features, integrations, and capabilities to reflect or anticipate the changing nature of the market for software, programs, and tools used by knowledge workers. This strategy indicates Salesforce's intention to make Slack a key component of its broader service offerings, aiming to attract new users and organizations while increasing revenue from existing paid customers. Salesforce considers Slack critically important to its business, as evidenced by its designation of the acquisition as its largest to date and the emphasis on the need to succeed in the rapidly evolving market. The focus on continuously enhancing Slack's AI offerings and integrating it effectively within Salesforce's ecosystem underscores the importance of Slack in maintaining Salesforce's competitive edge and fulfilling its commitment to innovation and customer success. Failure to effectively int

## Multi-Step Query Engine

In [33]:
from llama_index.core.query_engine import MultiStepQueryEngine
from llama_index.core.indices.query.query_transform.base import (
    StepDecomposeQueryTransform,
)

index_summary = "Salesforce financial and business 10K report for 2022-2023"
step_decompose_transform = StepDecomposeQueryTransform(llm=gpt4_model, verbose=False)
multistep_raw_query_engine = MultiStepQueryEngine(
    query_engine=raw_query_engine,
    query_transform=step_decompose_transform,
    index_summary=index_summary,
)
multistep_recur_query_engine = MultiStepQueryEngine(
    query_engine=recursive_query_engine,
    query_transform=step_decompose_transform,
    index_summary=index_summary,
)

In [None]:
raw_response_mstep = multistep_raw_query_engine.query(query)
recursive_response_mstep = multistep_recur_query_engine.query(query)

In [35]:
print(recursive_response_mstep)

Salesforce's strategy for Slack involves enhancing and improving its features, integrations, and capabilities to reflect or anticipate the changing nature of the market. This approach is aimed at attracting new users and organizations while increasing revenue from existing paid customers. Salesforce considers Slack to be a critical component of its business, as evidenced by its designation as the company's largest acquisition to date. The emphasis on Slack underscores Salesforce's commitment to expanding its product offerings and staying at the forefront of technological developments in the rapidly evolving market for software, programs, and tools used by knowledge workers. The integration and development of Slack are pivotal to Salesforce's broader strategy of providing comprehensive and innovative solutions that cater to the digital-first customer experience, thereby reinforcing its position as a leader in customer relationship management technology.


In [36]:
print(raw_response_mstep)

Salesforce's strategy for Slack involves enhancing and improving its features, integrations, and capabilities, as well as introducing compelling new features, integrations, and capabilities that reflect or anticipate the changing nature of the market. This approach is aimed at attracting new users and organizations and increasing revenue from existing paid customers. Salesforce considers Slack to be critically important to its business, as evidenced by its designation as the company's largest acquisition to date. The acquisition reflects Salesforce's commitment to expanding its service offerings and adapting to the rapidly evolving market for software, programs, and tools used by knowledge workers. The emphasis on successfully integrating and developing Slack indicates that Salesforce views it as a significant component of its strategy to remain competitive and grow its business in the face of rapid technological developments and changing customer needs.


## Sub-Question Query Engine

In [44]:
from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool, ToolMetadata

query_engine_tools = [
    QueryEngineTool(
        query_engine=recursive_query_engine,
        metadata=ToolMetadata(
            name="Salesforce10K",
            description="Salesforce financial and business 10K report for 2022-2023",
        ),
    ),
]

subquestion_query_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    use_async=True,
)
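
In contrast to the sequential multi-step loop, a sub-question engine generates its sub-questions up front, routes each one to a named tool, and can answer them concurrently before combining the results. The block below is a rough sketch of that pattern under stated assumptions, not the library's code. `sub_question_query`, `answer_sub_question`, the hard-coded `sub_questions`, and the echo tool are all hypothetical; only the tool name `Salesforce10K` comes from the cell above.

```python
import asyncio
from typing import Callable, Dict, List, Tuple

SubQuestion = Tuple[str, str]  # (tool name, sub-question text)
Tools = Dict[str, Callable[[str], str]]


async def answer_sub_question(tools: Tools, sub_q: SubQuestion) -> str:
    """Route one sub-question to the tool it names and return that tool's answer."""
    tool_name, text = sub_q
    return tools[tool_name](text)


async def sub_question_query(
    sub_questions: List[SubQuestion], tools: Tools
) -> List[Tuple[str, str]]:
    """Answer all sub-questions concurrently; gather keeps the input order."""
    answers = await asyncio.gather(
        *(answer_sub_question(tools, sq) for sq in sub_questions)
    )
    return [(text, ans) for (_, text), ans in zip(sub_questions, answers)]


# Hypothetical stand-ins: an echo tool and a fixed list of sub-questions.
tools: Tools = {"Salesforce10K": lambda q: f"<answer to: {q}>"}
sub_questions: List[SubQuestion] = [
    ("Salesforce10K", "What is Salesforce's strategy for Slack?"),
    ("Salesforce10K", "How critical does Salesforce consider Slack?"),
]

pairs = asyncio.run(sub_question_query(sub_questions, tools))
for question, answer in pairs:
    print(question, "->", answer)
```

Inside a notebook, `asyncio.run` works here only because `nest_asyncio.apply()` was called earlier; in a plain script it runs as written.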

In [None]:
subquestion_response_mstep = subquestion_query_engine.query(query)

In [47]:
print(subquestion_response_mstep)

Salesforce's strategy for Slack is centered around leveraging it as a digital headquarters to facilitate collaboration and increase productivity within companies, employees, governments, and stakeholders. This approach is integral to Salesforce's broader business strategy, aiming to transform businesses around the customer in a digital-first world. Slack is positioned as a key component in enhancing Salesforce's Customer 360 platform, contributing significantly to the delivery of intelligent, personalized experiences across every channel. This indicates that Salesforce considers Slack to be critically important to its overall business strategy, as it plays a crucial role in enabling Salesforce to achieve its goal of connecting companies of every size and industry with their customers in new ways and transforming their businesses in the context of a digital-first approach.
