<a href="https://colab.research.google.com/github/GenAIHub/genai-workshop/blob/main/03_Agents/01_10k_sub_question.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 10K Analysis
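In this demo, we answer complex questions about the 2021 10-K filings of Uber and Lyft by decomposing each question into simpler sub-questions, each of which is routed to the query engine for one filing.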
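
To make the idea concrete before involving any LLM, here is a minimal pure-Python sketch of the decompose, route, and combine flow. It is only an illustration: `SubQuestion`, `decompose`, `answer`, and `toy_tools` are hypothetical names, and both the decomposition and the final synthesis are simple stand-ins for steps that an LLM performs in the real engine used later in this notebook.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubQuestion:
    tool_name: str  # which data source should answer this sub-question
    question: str   # the simpler question sent to that source


def decompose(query: str, tool_names: List[str]) -> List[SubQuestion]:
    # Hypothetical stand-in for the LLM step: ask every tool the same
    # question, scoped to that tool. A real engine lets the LLM pick
    # the tools and phrase the sub-questions itself.
    return [
        SubQuestion(name, f"{query} (answer for {name} only)")
        for name in tool_names
    ]


def answer(query: str, tools: Dict[str, Callable[[str], str]]) -> str:
    sub_questions = decompose(query, list(tools))
    partials = [
        f"[{sq.tool_name}] {tools[sq.tool_name](sq.question)}"
        for sq in sub_questions
    ]
    # Hypothetical stand-in for synthesis: a real engine passes the
    # partial answers back to the LLM to write one final response.
    return "\n".join(partials)


# Toy tools that return fixed placeholder strings instead of querying an index.
toy_tools = {
    "lyft_10k": lambda q: "placeholder answer from the Lyft filing",
    "uber_10k": lambda q: "placeholder answer from the Uber filing",
}

result = answer("How did revenue change?", toy_tools)
print(result)
```

The final answer can only be as good as the partial answers: if one tool returns nothing useful, the combined response inherits that gap.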

In [None]:
!pip install boto3
!pip install llama-index
!pip install llama-index-llms-bedrock
!pip install llama-index-embeddings-bedrock

In [2]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine

## Configure LLM service

In [3]:
import os
import boto3

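# Read AWS settings from the environment. The region falls back to
# "us-east-1"; the credentials fall back to empty strings when unset.
region_name = os.getenv("AWS_REGION", "us-east-1")
aws_access_key_id = os.getenv("AWS_ACCESS_KEY_ID", "")
aws_secret_access_key = os.getenv("AWS_SECRET_ACCESS_KEY", "")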
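
Because the credentials default to empty strings, a missing environment variable only surfaces later, when the first Bedrock call fails. A small check like the one below can catch that earlier. This is a sketch, not part of the original notebook: `REQUIRED_VARS` and `missing_aws_vars` are hypothetical names, and it assumes credentials are supplied through these two environment variables rather than through an AWS profile or an instance role.

```python
import os
from typing import List, Mapping

# Hypothetical helper: the two variables this notebook reads its credentials from.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY")


def missing_aws_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_aws_vars()
if missing:
    print("Missing or empty:", ", ".join(missing))
```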

In [4]:
from llama_index.llms.bedrock import Bedrock
from llama_index.embeddings.bedrock import BedrockEmbedding
from llama_index.core import Settings

llm_model_id = "anthropic.claude-3-haiku-20240307-v1:0"
embed_model_id = "amazon.titan-embed-text-v1"

llm = Bedrock(
    model=llm_model_id,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region_name,
    temperature=0.1,
    max_tokens=512,
)

embed_model = BedrockEmbedding(
    model=embed_model_id,
    aws_access_key_id=aws_access_key_id,
    aws_secret_access_key=aws_secret_access_key,
    region_name=region_name,
)

Settings.llm = llm
Settings.embed_model = embed_model

## Download Data

In [None]:
!mkdir -p 'data/10k/'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/uber_2021.pdf' -O 'data/10k/uber_2021.pdf'
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/10k/lyft_2021.pdf' -O 'data/10k/lyft_2021.pdf'

## Load data

In [6]:
lyft_docs = SimpleDirectoryReader(
    input_files=["./data/10k/lyft_2021.pdf"]
).load_data()
uber_docs = SimpleDirectoryReader(
    input_files=["./data/10k/uber_2021.pdf"]
).load_data()

## Build indices

In [7]:
lyft_index = VectorStoreIndex.from_documents(lyft_docs)

In [8]:
uber_index = VectorStoreIndex.from_documents(uber_docs)

## Build query engines

In [9]:
lyft_engine = lyft_index.as_query_engine(similarity_top_k=3)

In [10]:
uber_engine = uber_index.as_query_engine(similarity_top_k=3)
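
With `similarity_top_k=3`, each query engine passes only the three best-matching chunks of a filing to the LLM, so anything outside those chunks is invisible when the answer is written. The sketch below illustrates the idea with hand-made two-dimensional vectors. It assumes cosine similarity as the ranking measure; `cosine`, `top_k`, and `toy_chunks` are hypothetical names, not LlamaIndex APIs, and the vectors are invented for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(
    query_vec: Sequence[float],
    chunks: Sequence[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[str]:
    """Return the text of the k chunks most similar to the query vector."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Invented toy data: (chunk label, embedding) pairs.
toy_chunks = [
    ("revenue table", [0.9, 0.1]),
    ("risk factors", [0.1, 0.9]),
    ("revenue per rider", [0.8, 0.3]),
    ("seasonality", [0.3, 0.7]),
]

print(top_k([1.0, 0.0], toy_chunks, k=2))
```

A small `k` keeps prompts short but raises the chance that the chunk holding the needed figure is left out, which is worth keeping in mind when a sub-question comes back with "the context does not provide".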

In [11]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=lyft_engine,
        metadata=ToolMetadata(
            name="lyft_10k",
            description=(
                "Provides information about Lyft financials for year 2021"
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=uber_engine,
        metadata=ToolMetadata(
            name="uber_10k",
            description=(
                "Provides information about Uber financials for year 2021"
            ),
        ),
    ),
]

s_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools,
    use_async=False
)

## Run queries

In [12]:
response = s_engine.query(
    "Compare and contrast the customer segments and geographies that grew the"
    " fastest"
)

Generated 4 sub questions.
[uber_10k] Q: What were Uber's fastest growing customer segments in 2021?
[uber_10k] A: Based on the information provided, the context does not explicitly mention Uber's fastest growing customer segments in 2021. The context focuses on providing an overview of Uber's business segments, competitive environment, intellectual property, research and development, and seasonality. It does not provide details on Uber's specific customer growth trends in 2021.
[uber_10k] Q: What were Uber's fastest growing geographies in 2021?
[uber_10k] A: The context information does not provide specific details about Uber's fastest growing geographies in 2021. The information focuses on Uber's business segments, competitive environment, and the impact of the COVID-19 pandemic on the company's operations. There are no details mentioned about Uber's geographic growth trends in 2021.


In [13]:
print(response)

I do not have enough information in the provided context to compare and contrast the fastest growing customer segments and geographies for Uber and Lyft in 2021. The context does not contain any details about the specific customer segments or geographic regions that experienced the fastest growth for either company during that time period. Without access to that level of detail in the given information, I cannot provide a meaningful comparison between the two companies. The context focuses more on providing an overview of the businesses, competitive environments, and operational impacts, rather than delving into the specifics of their customer and geographic growth trends in 2021.


In [14]:
response = s_engine.query(
    "Compare revenue growth of Uber and Lyft from 2020 to 2021"
)

Generated 2 sub questions.
[uber_10k] Q: What was Uber's revenue growth from 2020 to 2021?
[uber_10k] A: Uber's revenue grew by 57% from 2020 to 2021. The context states that "Revenue increased $6.3 billion, or 57%, primarily attributable to an increase in Gross Bookings of 56%, or 53% on a constant currency basis."
[lyft_10k] Q: What was Lyft's revenue growth from 2020 to 2021?
[lyft_10k] A: Based on the information provided in the context, Lyft's revenue growth from 2020 to 2021 cannot be determined. The context does not contain any information about Lyft's total revenue figures for 2020 and 2021. The context focuses on discussing Lyft's revenue per active rider metric, but does not provide the overall revenue numbers needed to calculate the revenue growth rate.

In [15]:
print(response)

According to the provided context, Uber's revenue grew by 57% from 2020 to 2021. However, the context does not contain information about Lyft's total revenue figures for 2020 and 2021, so the revenue growth rate for Lyft cannot be determined based on the given information. Therefore, a direct comparison of the revenue growth between Uber and Lyft from 2020 to 2021 is not possible with the information available in the context.
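
As a sanity check on the Uber figure, the two numbers quoted in the sub-answer above (an increase of $6.3 billion, or 57%) are enough to back out approximate revenue levels. The sketch below does that arithmetic. `growth_pct` is a hypothetical helper, and the implied revenue values are derived from the rounded figures in the output, so they are approximations rather than numbers read from the filing.

```python
def growth_pct(previous: float, current: float) -> float:
    """Year-over-year growth as a percentage of the previous value."""
    if previous == 0:
        raise ValueError("previous value must be non-zero")
    return (current - previous) / previous * 100.0


# Figures quoted in the sub-answer above, in billions of USD (rounded).
increase = 6.3
pct = 57.0

# Implied revenue levels, approximate because the inputs are rounded.
implied_2020 = increase / (pct / 100.0)
implied_2021 = implied_2020 + increase

print(round(implied_2020, 1), round(implied_2021, 1))
print(round(growth_pct(implied_2020, implied_2021)))
```

The same function would give Lyft's growth rate once its total revenue for 2020 and 2021 is retrieved, which is exactly the input the `lyft_10k` sub-question failed to find.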
