-------------------------------------
#### Financial document analysis with LlamaIndex
------------------------------------

In [1]:
from llama_parse import LlamaParse

In [2]:
from llama_index.core import SimpleDirectoryReader

In [3]:
import os

# Confirm the key is set without echoing the secret into the notebook output.
print(os.getenv('LLAMA_CLOUD_API_KEY') is not None)

True


In [4]:
parser = LlamaParse(result_type="text")

In [5]:
file_extractor = {".pdf": parser}

In [6]:
# Load the Lyft 10-K PDF.
# SimpleDirectoryReader picks a reader by file extension; here .pdf is routed to LlamaParse.

reader = SimpleDirectoryReader(
    input_files    = ['./fin-10K-data/lyft_2021.pdf'],
    file_extractor = file_extractor
)

In [7]:
import nest_asyncio
nest_asyncio.apply()

In [10]:
docs = reader.load_data(show_progress=True)

Loading files:   0%|                                                                  | 0/1 [00:00<?, ?file/s]

Started parsing the file under job_id bd0ace5a-a283-4472-a659-d3353692058f


Loading files: 100%|██████████████████████████████████████████████████████████| 1/1 [00:38<00:00, 38.85s/file]


In [10]:
len(docs)

238

In [12]:
from llama_index.core import VectorStoreIndex

In [13]:
index = VectorStoreIndex.from_documents(docs)

In [15]:
# Access chunks from the index
index_chunks = index.docstore.docs
len(index_chunks)

305
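
The index holds more chunks (305) than there are documents (238) because longer documents are split into several nodes before they are embedded. The sketch below illustrates the idea with overlapping windows of whitespace-separated words. `split_with_overlap`, `chunk_size` and `overlap` are illustrative names and values, not LlamaIndex's actual splitter or its defaults, which work on tokens and sentence boundaries.

```python
def split_with_overlap(text, chunk_size=8, overlap=2):
    """Split text into windows of chunk_size words that share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current window has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks

# Two toy "documents": one short, one long enough to need several chunks.
pages = [
    "short page that fits in one chunk",
    " ".join(f"w{i}" for i in range(20)),
]
chunks = [c for page in pages for c in split_with_overlap(page)]
print(len(pages), "documents ->", len(chunks), "chunks")
```

Short pages stay as one chunk while long pages contribute several, so the chunk count is always at least the document count.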

In [23]:
# Display a few chunks
for i, chunk in enumerate(list(index_chunks.values())[:5]):  # Show first 5 chunks
    print(f"Chunk {i+1}")
    print("-"*80)    
    print(f"{chunk.text}")
    

Chunk 1
--------------------------------------------------------------------------------
UNITED STATES
                                                            SECURITIES AND EXCHANGE COMMISSIONWashington, D.C. 20549

                                                                                                        FORM 10-K
(Mark One)
☒        ANNUAL REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934
                                                                                        For the fiscal year ended December 31, 2021
                                                                                                                      OR
☐       TRANSITION REPORT PURSUANT TO SECTION 13 OR 15(d) OF THE SECURITIES EXCHANGE ACT OF 1934 FOR THE TRANSITION PERIOD
        FROM                           TO
                                                                                              Commission File Number 001-38846

              

In [24]:
# Convert the index to a query engine.
qe = index.as_query_engine(similarity_top_k=3)
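
`similarity_top_k=3` asks the retriever to pass the three chunks most similar to the query on to the LLM. A minimal sketch of that ranking step, assuming cosine similarity over embedding vectors; the three-dimensional vectors and chunk names are made up for illustration and real embeddings have hundreds of dimensions or more.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, chunk_vecs, k=3):
    """Return the k (chunk_id, score) pairs with the highest similarity to the query."""
    scored = [(cosine(query_vec, vec), chunk_id) for chunk_id, vec in chunk_vecs.items()]
    scored.sort(reverse=True)
    return [(chunk_id, score) for score, chunk_id in scored[:k]]

# Hypothetical embeddings for four chunks.
chunk_vecs = {
    "revenue": [0.9, 0.1, 0.0],
    "risk": [0.1, 0.9, 0.1],
    "bikes": [0.0, 0.2, 0.9],
    "results": [0.8, 0.3, 0.1],
}
hits = top_k([1.0, 0.0, 0.0], chunk_vecs, k=3)
for chunk_id, score in hits:
    print(f"{chunk_id}: {score:.3f}")
```

A larger `k` gives the LLM more context at the cost of a longer prompt; a smaller one risks missing the chunk that holds the answer.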

In [25]:
response = await qe.aquery('What is the revenue of Lyft in 2021? Answer in millions with page reference')
print(response)

The revenue of Lyft in 2021 was $3,208.3 million. (Page reference: fin-10K-data\lyft_2021.pdf, Results of Operations)
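
The "page reference" above is a file path and a section name rather than a page number, so it is worth checking what the model was actually given. The sketch below assumes the response exposes `source_nodes`, each with `.score`, `.node.metadata` and `.node.get_content()`, as LlamaIndex response objects generally do. `summarize_sources` is a hypothetical helper, and the `page_label` key is an assumption that may be absent depending on the parser. A stand-in object with placeholder text is used here so the block runs without an index.

```python
from types import SimpleNamespace

def summarize_sources(response, preview_chars=80):
    """Collect score, metadata and a text preview for each retrieved node."""
    rows = []
    for hit in getattr(response, "source_nodes", None) or []:
        metadata = hit.node.metadata or {}
        rows.append({
            "score": hit.score,
            "file_name": metadata.get("file_name"),
            "page_label": metadata.get("page_label"),  # None if the parser did not set it
            "preview": hit.node.get_content()[:preview_chars],
        })
    return rows

# Stand-in for a real response; the values are placeholders, not notebook output.
fake_response = SimpleNamespace(
    source_nodes=[
        SimpleNamespace(
            score=0.87,
            node=SimpleNamespace(
                metadata={"file_name": "lyft_2021.pdf"},
                get_content=lambda: "placeholder chunk text",
            ),
        )
    ]
)
for row in summarize_sources(fake_response):
    print(row)
```

In the notebook, `summarize_sources(response)` would be called on the object returned by `qe.aquery(...)`. If no page metadata is present, page numbers in the generated answers can only come from text inside the chunks and should be treated with caution.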


In [26]:
response = await qe.aquery('What are the various transportation networks offered by Lyft? Provide answers, with page reference')
print(response)

Lyft offers a variety of transportation networks including a ridesharing marketplace, Express Drive for flexible car rentals, Lyft Rentals for long-distance trips, Light Vehicles such as shared bikes and scooters, Public Transit integration in select cities, and Lyft Autonomous partnerships for autonomous vehicles. These details can be found on page 1 of the provided document.


In [27]:
response = await qe.aquery('What were the Operations and support expenses in 2020 and 2021? State whether the expenses improved in 2021, with page reference')
print(response)

Operations and support expenses in 2020 were $453,963 thousand and in 2021 were $402,233 thousand. The expenses decreased by $51.7 million, or 11%, in 2021 compared to 2020. This information can be found on page 64 of the provided document.
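
The answer mixes units (thousands for the yearly figures, millions for the change), so a quick arithmetic check is useful. The snippet below uses only the two figures quoted in the answer; the variable names are illustrative.

```python
# Figures quoted in the answer above, in thousands of dollars.
ops_2020_thousands = 453_963
ops_2021_thousands = 402_233

decrease_thousands = ops_2020_thousands - ops_2021_thousands
decrease_millions = decrease_thousands / 1_000
pct_decrease = decrease_thousands / ops_2020_thousands * 100

print(f"Decrease: ${decrease_millions:.1f} million ({pct_decrease:.0f}%)")
# prints "Decrease: $51.7 million (11%)"
```

The quoted change of $51.7 million, or 11%, is consistent with the two yearly figures.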


#### 1. Income Statement/Profitability Questions
- What were the company's total revenues and net income for the fiscal year?
- How did the company’s gross profit margin change compared to the previous year?
- What are the key factors driving the company's increase or decrease in operating income?
- How did the company’s cost of goods sold (COGS) fluctuate during the year?

#### 2. Balance Sheet Questions
- What were the company’s total assets, liabilities, and shareholders' equity at the end of the fiscal year?
- How has the company’s debt-to-equity ratio changed compared to previous years?
- What percentage of the company's assets are held in cash or cash equivalents?

#### 3. Cash Flow Statement Questions
- What were the company’s net cash flows from operating, investing, and financing activities?
- Did the company have any significant capital expenditures during the year? If so, what were they?
- How does the company’s free cash flow compare to the previous year?

#### 4. Risk Factors and Legal Proceedings
- What are the primary risks outlined by the company that could impact its financial performance?
- Are there any ongoing legal proceedings that may have a material adverse effect on the company’s financial position?
- How does the company assess its exposure to foreign exchange risk, interest rate risk, or commodity price risk?

#### 5. Management’s Discussion and Analysis (MD&A)
- What are the key growth strategies discussed by management for the upcoming fiscal year?
- How does the company explain any significant trends in its revenue, expenses, or profitability?
- What steps is the company taking to address any challenges or adverse market conditions?

#### 6. Debt and Financing
- What is the total amount of the company’s long-term debt, and what are the terms of its major borrowings?
- Has the company issued any new debt or repurchased any outstanding debt during the year?
- How does the company plan to manage its debt obligations over the next few years?

#### 7. Stockholder Information
- How many shares of common stock are outstanding, and how has this number changed during the year?
- Did the company issue any dividends to shareholders, and what was the dividend payout ratio?
- Are there any stock repurchase programs in place, and how much stock did the company buy back during the year?

#### 8. Segment Information
- How are the company’s operating segments performing, and which segment is the most profitable?
- What geographic regions contribute the most to the company’s revenue and income?
- How do the performance metrics differ between the company’s business segments?

#### 9. Liquidity and Capital Resources
- What does the company say about its liquidity position, and how does it plan to meet its short-term and long-term capital needs?
- What are the major sources of funding for the company’s operations?
- Are there any liquidity risks that the company has highlighted for the upcoming year?

#### 10. Acquisitions and Mergers
- Has the company made any significant acquisitions or mergers during the year?
- What were the financial impacts of any mergers or acquisitions on the company’s balance sheet and income statement?
- How does the company expect these acquisitions to contribute to future growth?


In [28]:
qs = '''
What are the key factors driving LYFT's increase or decrease in operating income?
provide relevant page numbers for reference
'''

response = await qe.aquery(qs)
print(response)

The key factors that drive LYFT's increase or decrease in operating income include the ability to forecast revenue and manage expenses effectively, attract and retain drivers and riders cost-effectively, comply with laws and regulations, manage capital expenditures for current and future offerings, develop and maintain business assets, respond to macroeconomic changes, manage growth and operations efficiently, expand geographically, hire and retain talented personnel, develop new platform features, and optimize real estate portfolio. These factors are detailed on pages 19 and 58 of the provided document.


In [29]:
qs = '''
Has the company made any significant acquisitions or mergers during the year?
provide relevant page numbers for reference
'''

response = await qe.aquery(qs)
print(response)

The company has made significant acquisitions, including the acquisition of Flexdrive in February 2020. Additionally, in July 2021, the company closed the sale of its Level 5 self-driving vehicle division. Relevant page numbers for reference are page 48 and page 90 of the provided document.


In [30]:
qs = '''
Compare and contrast the customer segments and geographies that grew the fastest
'''

response = await qe.aquery(qs)
print(response)

The customer segments that grew the fastest were likely the ones related to ridesharing, bike and scooter sharing, and consumer vehicle rentals. Geographically, the growth was likely prominent in the United States and Canada, where Lyft's main competitors are located.


In [31]:
qs = '''
Compare revenue growth of Uber and Lyft from 2020 to 2021
'''

response = await qe.aquery(qs)
print(response)

Lyft's revenue grew by 36% from 2020 to 2021.


In [32]:
financial_questions = {
    "Income Statement/Profitability Questions": [
        "What were the company's total revenues and net income for the fiscal year?",
        "How did the company’s gross profit margin change compared to the previous year?",
        "What are the key factors driving the company's increase or decrease in operating income?",
        "How did the company’s cost of goods sold (COGS) fluctuate during the year?"
    ],
    "Balance Sheet Questions": [
        "What were the company’s total assets, liabilities, and shareholders' equity at the end of the fiscal year?",
        "How has the company’s debt-to-equity ratio changed compared to previous years?",
        "What percentage of the company's assets are held in cash or cash equivalents?"
    ],
    "Cash Flow Statement Questions": [
        "What were the company’s net cash flows from operating, investing, and financing activities?",
        "Did the company have any significant capital expenditures during the year? If so, what were they?",
        "How does the company’s free cash flow compare to the previous year?"
    ],
    "Risk Factors and Legal Proceedings": [
        "What are the primary risks outlined by the company that could impact its financial performance?",
        "Are there any ongoing legal proceedings that may have a material adverse effect on the company’s financial position?",
        "How does the company assess its exposure to foreign exchange risk, interest rate risk, or commodity price risk?"
    ],
    "Management’s Discussion and Analysis (MD&A)": [
        "What are the key growth strategies discussed by management for the upcoming fiscal year?",
        "How does the company explain any significant trends in its revenue, expenses, or profitability?",
        "What steps is the company taking to address any challenges or adverse market conditions?"
    ],
    "Debt and Financing": [
        "What is the total amount of the company’s long-term debt, and what are the terms of its major borrowings?",
        "Has the company issued any new debt or repurchased any outstanding debt during the year?",
        "How does the company plan to manage its debt obligations over the next few years?"
    ],
    "Stockholder Information": [
        "How many shares of common stock are outstanding, and how has this number changed during the year?",
        "Did the company issue any dividends to shareholders, and what was the dividend payout ratio?",
        "Are there any stock repurchase programs in place, and how much stock did the company buy back during the year?"
    ],
    "Segment Information": [
        "How are the company’s operating segments performing, and which segment is the most profitable?",
        "What geographic regions contribute the most to the company’s revenue and income?",
        "How do the performance metrics differ between the company’s business segments?"
    ],
    "Liquidity and Capital Resources": [
        "What does the company say about its liquidity position, and how does it plan to meet its short-term and long-term capital needs?",
        "What are the major sources of funding for the company’s operations?",
        "Are there any liquidity risks that the company has highlighted for the upcoming year?"
    ],
    "Acquisitions and Mergers": [
        "Has the company made any significant acquisitions or mergers during the year?",
        "What were the financial impacts of any mergers or acquisitions on the company’s balance sheet and income statement?",
        "How does the company expect these acquisitions to contribute to future growth?"
    ]
}


In [34]:
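import json

# Main function to process the questions
def process_questions(financial_questions):
    responses = []

    for header, questions in financial_questions.items():
        for question in questions:
            qs = f"<question> {question} Obtain response, relevant page numbers, chunk text span +-50 chars in a dict format"
            response = qe.query(qs)

            # qe.query returns a Response object, not a dict; the LLM's text is in
            # response.response. The model is asked for a dict but may not comply,
            # so fall back to the raw text when it is not a JSON object.
            try:
                parsed = json.loads(response.response)
            except (json.JSONDecodeError, TypeError):
                parsed = None
            if not isinstance(parsed, dict):
                parsed = {"response": response.response}

            # Key names follow the output seen in this notebook; the LLM chooses
            # them, so they can differ between runs.
            response_dict = {
                "header": header,
                "question": question,
                "answer": parsed.get("response"),
                "relevant_pages": parsed.get("relevant_page_numbers"),
                "chunk_text_span": parsed.get("relevant_text_chunks")
            }
            responses.append(response_dict)

    return responses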
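
Asking for "a dict format" in the prompt does not guarantee clean JSON: models often wrap the object in a Markdown code fence or add a sentence before or after it. A minimal sketch of a more forgiving parser follows; `extract_json` is a hypothetical helper, not part of LlamaIndex, and it only handles a single top-level object.

```python
import json
import re

FENCE = "`" * 3  # Markdown code-fence marker, built this way to keep the source readable
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)

def extract_json(text):
    """Return the first JSON object found in text, or None if nothing parses."""
    fenced = FENCED_BLOCK.search(text)
    if fenced:
        text = fenced.group(1)
    # Keep only the span from the first "{" to the last "}".
    start = text.find("{")
    end = text.rfind("}")
    if start == -1 or end <= start:
        return None
    try:
        parsed = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None

print(extract_json('Sure. {"response": "Yes", "relevant_page_numbers": [48]} Done.'))
print(extract_json("no structured answer here"))
```

In `process_questions`, `extract_json(response.response)` could stand in for a bare `json.loads` call, with the `None` case handled by keeping the raw text.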

In [39]:
question = "Has the company made any significant acquisitions or mergers during the year?"

In [40]:
qs = f"<question> {question} Obtain response, relevant page numbers, chunk text span +-50 chars in a dict format"

In [42]:
response = qe.query(qs)

In [60]:
import json
import pandas as pd

In [61]:
pd.DataFrame(json.loads(response.response))

                                            response  relevant_page_numbers                               relevant_text_chunks
0  Yes, the company has made significant acquisit...                     48  As part of our business strategy, we will cont...
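
To tabulate answers for many questions at once, it helps to turn each parsed answer into one flat record first. A minimal sketch, assuming each parsed answer is a dict whose values are scalars or lists; `to_record` is a hypothetical helper, and the sample values are the truncated strings from the output above.

```python
def to_record(header, question, parsed):
    """Flatten one parsed answer into a single row; list values are joined into a string."""
    record = {"header": header, "question": question}
    for key, value in parsed.items():
        if isinstance(value, (list, tuple)):
            value = "; ".join(str(v) for v in value)
        record[key] = value
    return record

# Sample parsed answer, using the truncated text shown in the table above.
parsed = {
    "response": ["Yes, the company has made significant acquisit..."],
    "relevant_page_numbers": [48],
    "relevant_text_chunks": ["As part of our business strategy, we will cont..."],
}
records = [
    to_record(
        "Acquisitions and Mergers",
        "Has the company made any significant acquisitions or mergers during the year?",
        parsed,
    )
]
print(records[0]["relevant_page_numbers"])
```

A list of such records can be passed straight to `pd.DataFrame(records)`, which yields one row per question regardless of whether the model returned scalars or lists.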
