# Llama Parser <> LlamaIndex

In [None]:
!pip install llama-index llama-parser sentence-transformers

In [None]:
!wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10q/uber_10q_march_2022.pdf' -O './uber_10q_march_2022.pdf'

In [9]:
# llama-parser is async-first; running the sync code in a notebook requires nest_asyncio
import nest_asyncio

nest_asyncio.apply()

import os
os.environ["LLAMA_CLOUD_API_KEY"] = "..."
os.environ["OPENAI_API_KEY"] = "sk-..."
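
A notebook kernel already runs an asyncio event loop, so synchronous wrappers around async code cannot start a second one; that is the problem `nest_asyncio.apply()` works around. The sketch below reproduces the failure with the standard library only. The names `parse` and `notebook_cell` are hypothetical stand-ins, not part of llama-parser.

```python
import asyncio


async def parse():
    # Hypothetical stand-in for an async parsing call.
    return "parsed"


async def notebook_cell():
    # Simulates code running inside a notebook cell, where an event loop is already active.
    coro = parse()
    try:
        # A sync wrapper typically tries to start its own loop, like this.
        return asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return str(exc)


message = asyncio.run(notebook_cell())
print(message)
```

With `nest_asyncio.apply()` in effect, the nested call is allowed to run instead of raising.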

In [6]:
from llama_parser import LlamaParser

documents = LlamaParser(result_type="markdown").load_data('./uber_10q_march_2022.pdf')

In [7]:
print(documents[0].text[:1000] + '...')

# Form 10-Q

## UNITED STATES SECURITIES AND EXCHANGE COMMISSION

Washington, D.C. 20549

### FORM 10-Q

(Mark One)

☒ QUARTERLY REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934 For the quarterly period ended March 31, 2022 OR ☐ TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934 For the transition period from_____ to _____ Commission File Number: 001-38902

### UBER TECHNOLOGIES, INC.

(Exact name of registrant as specified in its charter)

Not Applicable (Former name, former address and former fiscal year, if changed since last report)

Delaware 45-2647441 (State or other jurisdiction of incorporation or organization) (I.R.S. Employer Identification No.)

1515 3rd Street San Francisco, California 94158 (Address of principal executive offices, including zip code)

(415) 612-8582 (Registrant’s telephone number, including area code)

### Securities registered pursuant to Section 12(b) of the Act:

|Title of each class|Tra

In [10]:
from llama_index.node_parser import MarkdownElementNodeParser
from llama_index.llms import OpenAI

node_parser = MarkdownElementNodeParser(llm=OpenAI(model="gpt-3.5-turbo"))

In [14]:
nodes = node_parser.get_nodes_from_documents(documents)
nodes, objs = node_parser.get_nodes_and_objects(nodes)

Embeddings have been explicitly disabled. Using MockEmbedding.


100%|██████████| 59/59 [03:22<00:00,  3.44s/it]
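
To give a rough intuition for what an element-aware parser does with markdown output, the sketch below separates table blocks from ordinary text blocks. It is a simplified illustration under stated assumptions, not the implementation of `MarkdownElementNodeParser`: the function `split_markdown_elements` and the sample markdown are made up for this example, and a table is assumed to be any run of lines starting with `|`.

```python
def split_markdown_elements(markdown: str) -> list[tuple[str, str]]:
    """Group consecutive lines into ("table", block) or ("text", block) pairs."""
    elements = []
    current_kind = None
    buffer = []
    for line in markdown.splitlines():
        if not line.strip():
            kind = None  # a blank line ends the current block
        elif line.lstrip().startswith("|"):
            kind = "table"
        else:
            kind = "text"
        # Flush the buffered block whenever the kind of line changes.
        if kind != current_kind and buffer:
            elements.append((current_kind, "\n".join(buffer)))
            buffer = []
        current_kind = kind
        if kind is not None:
            buffer.append(line)
    if buffer:
        elements.append((current_kind, "\n".join(buffer)))
    return elements


# Made-up sample, not taken from the parsed 10-Q.
sample = """## Summary

Some narrative text.

|Item|Value|
|---|---|
|A|1|

Closing text."""

elements = split_markdown_elements(sample)
for kind, block in elements:
    print(kind, "->", block.splitlines()[0])
```

Keeping tables as separate objects is what lets the composable index retrieve a whole table for a query, rather than an arbitrary text chunk that happens to cut through it.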


In [19]:
from llama_index import VectorStoreIndex, ServiceContext
from llama_index.embeddings import OpenAIEmbedding

ctx = ServiceContext.from_defaults(llm=OpenAI(model="gpt-3.5-turbo"), embed_model=OpenAIEmbedding(model="text-embedding-3-small"))

composable_index = VectorStoreIndex(nodes=nodes, objects=objs, service_context=ctx)
base_index = VectorStoreIndex.from_documents(documents, service_context=ctx)

In [53]:
from llama_index.postprocessor import SentenceTransformerRerank

reranker = SentenceTransformerRerank(top_n=2, model="BAAI/bge-reranker-large")

composable_query_engine = composable_index.as_query_engine(similarity_top_k=15, node_postprocessors=[reranker], verbose=True)
base_query_engine = base_index.as_query_engine(similarity_top_k=15, node_postprocessors=[reranker])
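
Both query engines follow a retrieve-then-rerank pattern: fetch a wide candidate set (`similarity_top_k=15`), then keep only the best few after a more careful scoring pass (`top_n=2`). The sketch below shows that control flow in plain Python. The scoring functions are toy stand-ins chosen for illustration; the real pipeline uses embedding similarity for retrieval and a cross-encoder model for reranking, and `retrieve_then_rerank` is a hypothetical helper, not a LlamaIndex function.

```python
def word_overlap(query: str, passage: str) -> int:
    # Toy first-stage score: number of shared words.
    return len(set(query.lower().split()) & set(passage.lower().split()))


def overlap_density(query: str, passage: str) -> float:
    # Toy second-stage score: shared words relative to passage length.
    return word_overlap(query, passage) / len(passage.split())


def retrieve_then_rerank(query, passages, similarity_top_k=15, top_n=2):
    # Stage 1: cheap scoring over everything, keep a wide candidate set.
    candidates = sorted(
        passages, key=lambda p: word_overlap(query, p), reverse=True
    )[:similarity_top_k]
    # Stage 2: more selective scoring over the candidates only.
    return sorted(
        candidates, key=lambda p: overlap_density(query, p), reverse=True
    )[:top_n]


# Made-up passages for illustration.
passages = [
    "monthly active platform consumers grew",
    "revenue grew in the quarter",
    "platform consumers and drivers and merchants and many other unrelated words",
    "legal proceedings",
]
top = retrieve_then_rerank(
    "monthly active platform consumers", passages, similarity_top_k=3, top_n=2
)
print(top)
```

The wide first stage keeps recall high, while the narrow second stage limits how much context is passed to the LLM.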

### Table Query

In [54]:
response = base_query_engine.query("What was the change in monthly active platform consumers?")
print(str(response))

The context information does not provide specific information about the change in monthly active platform consumers.


That was not helpful: the baseline index fails to surface the table that contains the answer.

In [55]:
response = composable_query_engine.query("What was the change in monthly active platform consumers?")
print(str(response))

Retrieval entering id_310_table: TextNode
Retrieving from object TextNode with query What was the change in monthly active platform consumers?
Retrieval entering id_292_table: TextNode
Retrieving from object TextNode with query What was the change in monthly active platform consumers?
Retrieval entering id_296_table: TextNode
Retrieving from object TextNode with query What was the change in monthly active platform consumers?
Retrieval entering id_320_table: TextNode
Retrieving from object TextNode with query What was the change in monthly active platform consumers?
The change in monthly active platform consumers was a decline of 3 million, or 3%, quarter-over-quarter, but a growth of 17% compared to the same period in 2021.


Correct!

In [61]:
response = base_query_engine.query("Which market was the primary driver of revenue growth?")
print(str(response))

The primary driver of revenue growth was the Mobility market.


Mobility is not actually a market; it is one of Uber's business segments.

In [60]:
response = composable_query_engine.query("Which market was the primary driver of revenue growth?")
print(str(response))

Retrieval entering id_318_table: TextNode
Retrieving from object TextNode with query Which market was the primary driver of revenue growth?
Retrieval entering id_80_table: TextNode
Retrieving from object TextNode with query Which market was the primary driver of revenue growth?
Retrieval entering id_292_table: TextNode
Retrieving from object TextNode with query Which market was the primary driver of revenue growth?
The primary driver of revenue growth was the United States and Canada ("US&CAN") market.


Correct!

### General Query

In [42]:
response = base_query_engine.query("What is the impact of the COVID-19 pandemic on business?")
print(str(response))

The COVID-19 pandemic has had an adverse impact on the business and operations of the company. It has resulted in travel restrictions, business restrictions, school closures, limitations on social or public gatherings, and other measures that have reduced the demand for the company's Mobility offerings globally. The pandemic has also affected travel behavior and demand, and there have been driver supply constraints. The company has temporarily suspended its shared rides offering globally, and in many regions, due to the need to support social distancing. The pandemic has adversely affected the company's near-term financial results and may impact its long-term financial results. The extent of the impact on the business and financial results will depend on future developments, including the duration of the spread of the outbreak, the administration and efficacy of vaccines, and other factors that are highly uncertain.


In [43]:
response = composable_query_engine.query("What is the impact of the COVID-19 pandemic on business?")
print(str(response))

The COVID-19 pandemic has had an adverse impact on the business operations of the company. It has resulted in a reduction in global demand for their Mobility offerings, while accelerating the growth of their Delivery offerings. The extent of the impact on the company's business and financial results is highly uncertain and cannot be predicted. It depends on various factors such as the duration of the outbreak, resurgences of the virus, the administration and efficacy of vaccines, government regulations, and potential permanent changes in user behavior. The pandemic has also affected the company's near-term financial results and may continue to impact their long-term financial results, leading to significant actions such as workforce reductions and changes to pricing models. Additionally, concerns over the economic impact of the pandemic have caused volatility in financial markets, which can negatively impact the company's stock price and access to capital markets.
