# Sub Question Query Engine

- In this tutorial, we show how to use a sub question query engine to answer a complex query using multiple data sources.

- It first breaks the complex query down into sub questions, one for each relevant data source, then gathers the intermediate responses and synthesizes a final response.
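
The decompose, answer, and synthesize flow described above can be sketched without any LLM. Everything here is a hypothetical stand-in: `decompose`, `answer_sub_question`, `synthesize`, and `TOY_SOURCE` are toy names that are not part of llama_index, and the real engine relies on an LLM for the first and last steps.

```python
import asyncio

# Hypothetical stand-in for a data source: canned answers keyed by time period.
TOY_SOURCE = {
    "before": "toy answer about the period before YC",
    "during": "toy answer about the period during YC",
    "after": "toy answer about the period after YC",
}

def decompose(query, periods):
    """Toy decomposition: one sub question per period.

    The query text is ignored by this toy rule; the real engine asks an LLM.
    """
    return [f"What did the author work on {period} YC?" for period in periods]

async def answer_sub_question(sub_question):
    """Toy lookup standing in for a query engine call on one data source."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    for period, answer in TOY_SOURCE.items():
        if f" {period} " in sub_question:
            return sub_question, answer
    return sub_question, "not found in the source"

def synthesize(qa_pairs):
    """Toy synthesis: join the intermediate answers (the real engine asks an LLM)."""
    return "\n".join(f"{q} -> {a}" for q, a in qa_pairs)

async def answer_complex_query(query):
    sub_questions = decompose(query, ["before", "during", "after"])
    # Sub questions are independent, so they can be answered concurrently.
    qa_pairs = await asyncio.gather(*(answer_sub_question(q) for q in sub_questions))
    return sub_questions, synthesize(qa_pairs)

sub_questions, final_response = asyncio.run(
    answer_complex_query("How was the author's life different before, during, and after YC?")
)
print(final_response)
```

Inside a notebook, `asyncio.run` only works once `nest_asyncio.apply()` has been called, because the notebook already runs its own event loop.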

In [1]:
import nest_asyncio
nest_asyncio.apply()

In [11]:
import yaml, logging
from llama_index.llms import AzureOpenAI, OpenAI
from llama_index.llm_predictor import LLMPredictor
from llama_index.text_splitter import TokenTextSplitter
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index.tools import QueryEngineTool, ToolMetadata
from llama_index.query_engine import SubQuestionQueryEngine
from llama_index.callbacks.schema import CBEventType, EventPayload
from llama_index.callbacks import CallbackManager, LlamaDebugHandler
from llama_index import (
                        VectorStoreIndex, 
                        SimpleDirectoryReader, 
                        set_global_service_context,
                        ServiceContext
                        )

logging.getLogger("transformers.tokenization_utils_base").setLevel(logging.ERROR)

### Configure LLMs

In [3]:
with open('cadentials.yaml') as f:
    credentials = yaml.safe_load(f)  # safe_load avoids constructing arbitrary Python objects

In [4]:
llm_flag = 'DIRECT'

embedding_llm = HuggingFaceEmbedding(
                                    model_name="BAAI/bge-small-en-v1.5",
                                    device='mps'  # Apple Silicon GPU backend; use 'cuda' or 'cpu' on other hardware
                                    )

if llm_flag == 'AZURE':
    llm=AzureOpenAI(
                    model=credentials['AZURE_ENGINE'],
                    api_key=credentials['AZURE_OPENAI_API_KEY'],
                    deployment_name=credentials['AZURE_DEPLOYMENT_ID'],
                    api_version=credentials['AZURE_OPENAI_API_VERSION'],
                    azure_endpoint=credentials['AZURE_OPENAI_API_BASE'],
                    temperature=0.3
                    )
    
    chat_llm = LLMPredictor(llm)
else:
    chat_llm = OpenAI(
                    api_key=credentials['OPENAI_API_KEY'],
                    temperature=0.3
                    )

text_splitter = TokenTextSplitter(
                                separator=" ",
                                chunk_size=1024,
                                chunk_overlap=20,
                                backup_separators=["\n"]
                                )

# Using the LlamaDebugHandler to print the trace of the sub questions captured by the SUB_QUESTION callback event type
llama_debug = LlamaDebugHandler(print_trace_on_end=True)
callback_manager = CallbackManager([llama_debug])

if llm_flag == 'AZURE':
    service_context = ServiceContext.from_defaults(
                                                    text_splitter=text_splitter,
                                                    callback_manager=callback_manager,
                                                    embed_model=embedding_llm,
                                                    llm_predictor=chat_llm
                                                    )
else:
    service_context = ServiceContext.from_defaults(
                                                    text_splitter=text_splitter,
                                                    callback_manager=callback_manager,
                                                    embed_model=embedding_llm,
                                                    llm=chat_llm
                                                    )

set_global_service_context(service_context)
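
The `chunk_size` and `chunk_overlap` settings passed to `TokenTextSplitter` above control how much text each chunk holds and how much neighbouring chunks share. The sketch below illustrates the idea only: `split_with_overlap` is a hypothetical helper that treats whitespace-separated words as tokens, whereas the real splitter counts tokens with a tokenizer and differs in detail.

```python
def split_with_overlap(tokens, chunk_size, chunk_overlap):
    """Split a token list into chunks where consecutive chunks share `chunk_overlap` tokens."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reached the end of the input
    return chunks

# Small numbers make the overlap visible; the notebook uses 1024 and 20.
words = [f"w{i}" for i in range(10)]
chunks = split_with_overlap(words, chunk_size=4, chunk_overlap=1)
for chunk in chunks:
    print(chunk)
```

Each chunk starts with the last token of the previous one, so a sentence that straddles a chunk boundary is less likely to lose its context.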

### Load Data

In [5]:
!mkdir -p 'data/paul_graham/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt' -O 'data/paul_graham/paul_graham_essay.txt'

--2024-01-26 08:59:16--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/paul_graham/paul_graham_essay.txt
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.108.133, 185.199.111.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 75042 (73K) [text/plain]
Saving to: ‘data/paul_graham/paul_graham_essay.txt’


2024-01-26 08:59:17 (400 KB/s) - ‘data/paul_graham/paul_graham_essay.txt’ saved [75042/75042]



In [6]:
pg_essay = SimpleDirectoryReader(input_dir="./data/paul_graham/").load_data()

# build index and query engine
vector_index = VectorStoreIndex.from_documents(
                                            pg_essay, 
                                            use_async=True, 
                                            service_context=service_context
                                            )
vector_query_engine = vector_index.as_query_engine()

**********
Trace: index_construction
    |_embedding ->  0.610594 seconds
    |_embedding ->  0.610578 seconds
**********


In [7]:
query_engine_tools = [
                        QueryEngineTool(
                            query_engine=vector_query_engine,
                            metadata=ToolMetadata(
                                                name="pg_essay",
                                                description="Paul Graham essay on What I Worked On",
                                                ),
                                        ),
                    ]

query_engine = SubQuestionQueryEngine.from_defaults(
                                                    query_engine_tools=query_engine_tools,
                                                    service_context=service_context,
                                                    use_async=True,
                                                    )

In [10]:
response = query_engine.query(
                            "How was Paul Graham's life different before, during, and after YC?"
                            )
print(response)

Generated 3 sub questions.
[pg_essay] Q: What did Paul Graham work on before YC?
[pg_essay] Q: What did Paul Graham work on during YC?
[pg_essay] Q: What did Paul Graham work on after YC?
[pg_essay] A: After Y Combinator, Paul Graham worked on painting and writing essays. He spent most of 2014 painting and then started writing essays again, including some that were not about startups.
[pg_essay] A: During his time at Y Combinator (YC), Paul Graham worked on various tasks and responsibilities. However, the specific details of what he worked on during YC are not mentioned in the given context.
[pg_essay] A: Before Y Combinator (YC), Paul Graham worked on writing and programming. He wrote short stories and also wrote programs on the IBM 1401 computer in his school's basement. He later got a microcomputer, a TRS-80, and started programming 

In [12]:
for i, (start_event, end_event) in enumerate(
                                            llama_debug.get_event_pairs(CBEventType.SUB_QUESTION)
                                            ):
    qa_pair = end_event.payload[EventPayload.SUB_QUESTION]
    print(f"Sub Question {i}: {qa_pair.sub_q.sub_question.strip()}")
    print(f"Answer: {qa_pair.answer.strip()}")
    print("====================================")

Sub Question 0: What did Paul Graham work on before YC?
Answer: Before Y Combinator (YC), Paul Graham worked on writing and programming. He wrote short stories as a beginning writer and also tried programming on the IBM 1401 computer in his school district. He later got a microcomputer, a TRS-80, and started programming more extensively, creating simple games, a rocket prediction program, and a word processor.
Sub Question 1: What did Paul Graham work on during YC?
Answer: During his time at Y Combinator (YC), Paul Graham worked on various projects and responsibilities. However, the specific details of what he worked on during YC are not mentioned in the given context.
Sub Question 2: What did Paul Graham work on after YC?
Answer: After Y Combinator, Paul Graham worked on painting and writing essays.
Sub Question 3: What did Paul Graham work on before YC?
Answer: Before Y Combinator (YC), Paul Graham worked on writing and programming. He wrote short stories and also wrote programs on t
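
Sub Question 3 repeats Sub Question 0, which suggests the debug handler was still holding events from an earlier run of the query cell. That is an inference from the output, not something the notebook states. The sketch below uses a hypothetical `ToyEventLog` class, not the llama_index implementation, to show how start and end events paired by id pile up across runs until the log is cleared. `LlamaDebugHandler` provides a `flush_event_logs()` method for this purpose; calling it before each query should limit the listing to a single run.

```python
from collections import defaultdict

class ToyEventLog:
    """Hypothetical stand-in for a debug handler that records start/end events."""

    def __init__(self):
        self._events = []  # list of (event_id, phase, payload)

    def on_start(self, event_id, payload):
        self._events.append((event_id, "start", payload))

    def on_end(self, event_id, payload):
        self._events.append((event_id, "end", payload))

    def get_event_pairs(self):
        """Group events by id and return (start, end) payloads for completed events."""
        by_id = defaultdict(dict)
        for event_id, phase, payload in self._events:
            by_id[event_id][phase] = payload
        return [(p["start"], p["end"]) for p in by_id.values() if "start" in p and "end" in p]

    def flush(self):
        """Forget everything recorded so far."""
        self._events.clear()

def run_query(log, run_number):
    """Simulate one query that produces three sub questions."""
    for period in ("before", "during", "after"):
        event_id = f"run{run_number}-{period}"
        log.on_start(event_id, f"Q: {period}")
        log.on_end(event_id, f"A: {period}")

log = ToyEventLog()
run_query(log, 1)
run_query(log, 2)
pairs_without_flush = len(log.get_event_pairs())  # both runs are still in the log

log.flush()
run_query(log, 3)
pairs_after_flush = len(log.get_event_pairs())  # only the latest run remains

print(pairs_without_flush, pairs_after_flush)
```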