# Financial Report Generation

<a href="https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/report_generation/report_generation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook we show you how to use LlamaCloud to generate financial reports made up of text and tables, given an existing bank of reports.

LlamaCloud provides advanced retrieval endpoints allowing you to fetch context from complex financial reports consisting of text, tables, and sometimes images/diagrams.

We build an agentic workflow on top of LlamaCloud consisting of researcher and writer steps in order to generate the final response.

## Setup

Install the core packages and download the 10-K filings from Apple and Tesla.

You will need to upload these documents to LlamaCloud. For best results, we recommend: 
- Setting Parse settings to "Accurate" mode, "Premium" mode, or "3rd Party multimodal" 
- Setting the "Segmentation Configuration" to "Page" and the "Chunking Configuration" to None. This will give you page-level chunks.

In [None]:
!pip install llama-index
!pip install llama-index-core
!pip install llama-index-embeddings-openai
!pip install llama-index-question-gen-openai
!pip install llama-index-postprocessor-flag-embedding-reranker
!pip install git+https://github.com/FlagOpen/FlagEmbedding.git
!pip install llama-parse

In [None]:
!mkdir data
# download Apple 
!wget "https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf" -O data/apple_2023.pdf
!wget "https://s2.q4cdn.com/470004039/files/doc_financials/2022/q4/_10-K-2022-(As-Filed).pdf" -O data/apple_2022.pdf
!wget "https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf" -O data/apple_2021.pdf
!wget "https://s2.q4cdn.com/470004039/files/doc_financials/2020/ar/_10-K-2020-(As-Filed).pdf" -O data/apple_2020.pdf
!wget "https://www.dropbox.com/scl/fi/i6vk884ggtq382mu3whfz/apple_2019_10k.pdf?rlkey=eudxh3muxh7kop43ov4bgaj5i&dl=1" -O data/apple_2019.pdf

# download Tesla
!wget "https://ir.tesla.com/_flysystem/s3/sec/000162828024002390/tsla-20231231-gen.pdf" -O data/tesla_2023.pdf
!wget "https://ir.tesla.com/_flysystem/s3/sec/000095017023001409/tsla-20221231-gen.pdf" -O data/tesla_2022.pdf
!wget "https://www.dropbox.com/scl/fi/ptk83fmye7lqr7pz9r6dm/tesla_2021_10k.pdf?rlkey=24kxixeajbw9nru1sd6tg3bye&dl=1" -O data/tesla_2021.pdf
!wget "https://ir.tesla.com/_flysystem/s3/sec/000156459021004599/tsla-10k_20201231-gen.pdf" -O data/tesla_2020.pdf
!wget "https://ir.tesla.com/_flysystem/s3/sec/000156459020004475/tsla-10k_20191231-gen_0.pdf" -O data/tesla_2019.pdf

Next, set the LlamaCloud and OpenAI API keys and configure the models. The OpenAI LLM is used for response synthesis.

In [1]:
# llama-parse is async-first, running the async code in a notebook requires the use of nest_asyncio
import nest_asyncio
nest_asyncio.apply()

In [2]:
import os
# API access to llama-cloud
os.environ["LLAMA_CLOUD_API_KEY"] = "llx-"

In [3]:
# Using OpenAI API for embeddings/llms
os.environ["OPENAI_API_KEY"] = "sk-"

In [2]:
# setup embedding/LLM model
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

embed_model = OpenAIEmbedding(model="text-embedding-3-large")
llm = OpenAI(model="gpt-4o-mini")

Settings.embed_model = embed_model
Settings.llm = llm

## Load Documents into LlamaCloud

The first order of business is to upload the ten 10-K filings downloaded above (five each for Apple and Tesla) into LlamaCloud.

You can easily do this by creating a pipeline and uploading docs via the "Files" mode.

After this is done, proceed to the next section.

## Define LlamaCloud Retriever over Documents

In this section we define a chunk-level LlamaCloud Retriever over these documents. The chunk-level LlamaCloud retriever is our default retriever that returns chunks via hybrid search + reranking.

In [3]:
from llama_index.indices.managed.llama_cloud import LlamaCloudIndex
import os

index = LlamaCloudIndex(
  name="apple_tesla_demo_2",
  project_name="llamacloud_demo",
  api_key=os.environ["LLAMA_CLOUD_API_KEY"]
)

In [4]:
chunk_retriever = index.as_retriever(
    retrieval_mode="chunks",
    rerank_top_n=5
)

In [5]:
from llama_index.core.tools import FunctionTool
from llama_index.core.schema import NodeWithScore
from typing import List

# function tools
def chunk_retriever_fn(query: str) -> List[NodeWithScore]:
    """Retrieves a small set of relevant document chunks from the corpus.

    ONLY use for research questions that want to look up specific facts from the knowledge corpus,
    and don't need entire documents.

    """
    return chunk_retriever.retrieve(query)

chunk_retriever_tool = FunctionTool.from_defaults(fn=chunk_retriever_fn)

## Build a Report Generation Workflow

Now that we've defined the retriever tool, we're ready to build the report generation workflow.

The workflow contains roughly the following steps:

1. **Research Gathering**: Run a function-calling loop in which the agent reasons about whether to call the retrieval tool again to gather more information (only chunk-level retrieval is defined in this notebook). All retrieved chunks are stored in a shared context dictionary that is propagated through each step, and a synthesized summary of each retrieval is returned to the agent. Once the agent decides it has gathered enough information, it moves on to the next phase.
2. **Report Generation**: Generate a research report from the pooled research. For now, we stuff as much of the retrieved information as possible into the context window.

This implementation is inspired by our [Function Calling Agent](https://docs.llamaindex.ai/en/stable/examples/workflow/function_calling_agent/) workflow implementation.
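
The two phases above can be sketched without any LlamaIndex dependency. This is only an illustration of the control flow, not the workflow used in this notebook: `fake_planner`, `fake_retriever`, `run_research_loop`, and the toy `CORPUS` are hypothetical stand-ins for the LLM, the LlamaCloud retriever, and the indexed filings.

```python
from typing import Callable, Dict, List, Optional

# Toy corpus standing in for the retrieval backend (hypothetical data).
CORPUS: Dict[str, str] = {
    "tesla": "Tesla 2021 balance sheet chunk",
    "apple": "Apple 2021 balance sheet chunk",
}


def fake_retriever(query: str) -> List[str]:
    """Return every toy chunk whose key appears in the query."""
    return [text for key, text in CORPUS.items() if key in query.lower()]


def fake_planner(task: str, history: List[str]) -> Optional[str]:
    """Stand-in for the LLM: return the next query, or None when done."""
    for company in ("tesla", "apple"):
        if company in task.lower() and company not in history:
            return company
    return None


def run_research_loop(
    task: str,
    planner: Callable[[str, List[str]], Optional[str]],
    retriever: Callable[[str], List[str]],
    max_steps: int = 5,
) -> List[str]:
    """Phase 1: call the retriever until the planner stops asking for more."""
    stored_chunks: List[str] = []
    history: List[str] = []
    for _ in range(max_steps):
        query = planner(task, history)
        if query is None:
            break
        history.append(query)
        stored_chunks.extend(retriever(query))
    return stored_chunks


def generate_report(task: str, stored_chunks: List[str]) -> str:
    """Phase 2: stuff all pooled chunks into a single prompt-like string."""
    context = "\n\n".join(stored_chunks)
    return f"Task: {task}\n\nContext:\n{context}"


chunks = run_research_loop("Compare Tesla and Apple", fake_planner, fake_retriever)
report = generate_report("Compare Tesla and Apple", chunks)
print(report)
```

The `max_steps` cap is an assumption added here so the toy loop always terminates; the workflow in this notebook relies on its `timeout` argument instead.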

In [16]:
from llama_index.llms.openai import OpenAI
from pydantic import BaseModel, Field
from typing import List, Tuple
import pandas as pd
from IPython.display import display, Markdown


class TextBlock(BaseModel):
    """Text block."""

    text: str = Field(..., description="The text for this block.")


class TableBlock(BaseModel):
    """Table block."""

    caption: str = Field(..., description="Caption of the table.")
    col_names: List[str] = Field(..., description="Names of the columns.")
    rows: List[List] = Field(
        ...,
        description=(
            "List of rows. Each row is a data entry tuple, "
            "where each element of the tuple corresponds positionally to the column name."
        )
    )

    def to_df(self) -> pd.DataFrame:
        """To dataframe."""
        # Note: a plain DataFrame cannot carry the caption; apply
        # `df.style.set_caption(self.caption)` at display time if needed.
        return pd.DataFrame(self.rows, columns=self.col_names)


class ReportOutput(BaseModel):
    """Data model for a report.

    Can contain a mix of text and table blocks. Use table blocks to present any quantitative metrics and comparisons.

    """

    blocks: List[TextBlock | TableBlock] = Field(
        ..., description="A list of text and table blocks."
    )

    def render(self) -> None:
        """Render as formatted text within a jupyter notebook."""
        for b in self.blocks:
            if isinstance(b, TextBlock):
                display(Markdown(b.text))
            else:
                display(b.to_df())


report_gen_llm = OpenAI(
    model="gpt-4o", 
    # system_prompt=report_gen_system_prompt, 
    max_tokens=2048,
)
report_gen_sllm = report_gen_llm.as_structured_llm(output_cls=ReportOutput)
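
To make the block structure concrete, here is a minimal sketch that renders a `blocks` payload as Markdown text using only the standard library, which can be handy outside a notebook. It is not part of the workflow: `table_to_markdown` and `blocks_to_markdown` are hypothetical helpers, and the rule "a block with a `text` key is a text block" is an assumption of this sketch rather than how the structured LLM validates its output. The payload mirrors the toy city example used in the system prompt of this notebook.

```python
import json
from typing import Any, Dict, List


def table_to_markdown(block: Dict[str, Any]) -> str:
    """Render one table block (caption, col_names, rows) as a Markdown table."""
    header = "| " + " | ".join(block["col_names"]) + " |"
    divider = "|" + "|".join(" --- " for _ in block["col_names"]) + "|"
    rows = [
        "| " + " | ".join(str(cell) for cell in row) + " |"
        for row in block["rows"]
    ]
    return "\n".join([f"**{block['caption']}**", "", header, divider, *rows])


def blocks_to_markdown(payload: Dict[str, Any]) -> str:
    """Join text and table blocks in order, separated by blank lines."""
    parts: List[str] = []
    for block in payload["blocks"]:
        # Assumption: a "text" key marks a text block; anything else is a table.
        if "text" in block:
            parts.append(block["text"])
        else:
            parts.append(table_to_markdown(block))
    return "\n\n".join(parts)


raw = """
{
  "blocks": [
    {"text": "A report on cities"},
    {
      "caption": "Comparison of CityA vs. CityB",
      "col_names": ["", "Population", "Country"],
      "rows": [["CityA", "1,000,000", "USA"], ["CityB", "2,000,000", "Mexico"]]
    }
  ]
}
"""
markdown = blocks_to_markdown(json.loads(raw))
print(markdown)
```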

In [18]:
report_gen_sllm.metadata.context_window

128000
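
Because the report step stuffs every stored chunk into one prompt, it can help to estimate whether the pooled chunks fit in that window. The sketch below is a rough budget check and is not part of the notebook's workflow: `estimate_tokens` and `select_chunks_within_budget` are hypothetical helpers, and the four-characters-per-token ratio is a crude assumption rather than a real tokenizer.

```python
from typing import List

CONTEXT_WINDOW = 128_000   # value reported by report_gen_sllm.metadata.context_window
MAX_OUTPUT_TOKENS = 2_048  # matches max_tokens passed to the report LLM
CHARS_PER_TOKEN = 4        # ASSUMPTION: rough heuristic, not a real tokenizer


def estimate_tokens(text: str) -> int:
    """Very rough token estimate based on character count."""
    return len(text) // CHARS_PER_TOKEN + 1


def select_chunks_within_budget(chunks: List[str], budget: int) -> List[str]:
    """Keep chunks in order until the estimated token budget is exhausted."""
    kept: List[str] = []
    used = 0
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if used + cost > budget:
            break
        kept.append(chunk)
        used += cost
    return kept


budget = CONTEXT_WINDOW - MAX_OUTPUT_TOKENS
demo_chunks = ["a" * 400_000, "b" * 100_000, "c" * 100_000]
kept = select_chunks_within_budget(demo_chunks, budget)
print(f"kept {len(kept)} of {len(demo_chunks)} chunks")
```

In practice the prompt templates also consume part of the window, so a real budget would need some extra headroom.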

In [28]:
from llama_index.core.workflow import Workflow

from typing import Any, List
from operator import itemgetter

from llama_index.core.llms.function_calling import FunctionCallingLLM
from llama_index.core.llms.structured_llm import StructuredLLM
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.llms import ChatMessage
from llama_index.core.tools.types import BaseTool
from llama_index.core.tools import ToolSelection
from llama_index.core.workflow import Workflow, StartEvent, StopEvent, Context, step
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response_synthesizers import TreeSummarize, CompactAndRefine
from llama_index.core.workflow import Event
from llama_index.core.prompts import ChatPromptTemplate


class InputEvent(Event):
    input: List[ChatMessage]


class ChunkRetrievalEvent(Event):
    tool_call: ToolSelection
    

class ReportGenerationEvent(Event):
    pass



report_gen_system_prompt = """\
You are a report generation assistant tasked with producing a well-formatted report given parsed context.
You will be given context from one or more reports that take the form of parsed text + tables.
You are responsible for producing a report with interleaving text and tables - in the format of interleaving text and "table" blocks.
You MUST output your response as a tool call in order to adhere to the required output format. Do NOT give back normal text.

Here is an example of a toy valid tool call - note the text and table block:
```
{
    "blocks": [
        {
            "text": "A report on cities"
        },
        {
            "caption": "Comparison of CityA vs. CityB",
            "col_names": [
              "",
              "Population",
              "Country"
            ],
            "rows": [
              [
                "CityA",
                "1,000,000",
                "USA"
              ],
              [
                "CityB",
                "2,000,000",
                "Mexico"
              ]
            ]
        }
    ]
}
```
"""

report_gen_user_prompt = """\
Here is a list of stored context that you can use to generate the report.

-----------------------------
{context_str}
-----------------------------

Here is the user task: {query_str}

Generate the report below.
"""

DEFAULT_REPORT_GEN_PROMPT = ChatPromptTemplate.from_messages([
    ("system", report_gen_system_prompt),
    ("user", report_gen_user_prompt),
])

class ReportGenerationAgent(Workflow):
    """Report generation agent."""

    def __init__(
        self,
        chunk_retriever_tool: BaseTool,
        llm: FunctionCallingLLM | None = None,
        report_gen_sllm: StructuredLLM | None = None,
        **kwargs: Any,
    ) -> None:
        super().__init__(**kwargs)
        self.chunk_retriever_tool = chunk_retriever_tool

        self.llm = llm or OpenAI()
        self.summarizer = CompactAndRefine(llm=self.llm)
        assert self.llm.metadata.is_function_calling_model

        self.report_gen_sllm = report_gen_sllm or self.llm.as_structured_llm(
            ReportOutput, system_prompt=report_gen_system_prompt
        )
        self.report_gen_prompt = DEFAULT_REPORT_GEN_PROMPT

        self.memory = ChatMemoryBuffer.from_defaults(llm=self.llm)
        self.sources = []

    @step(pass_context=True)
    async def prepare_chat_history(self, ctx: Context, ev: StartEvent) -> InputEvent:
        # clear sources
        self.sources = []

        ctx.data["stored_chunks"] = []
        ctx.data["query"] = ev.input

        # get user input
        user_input = ev.input
        user_msg = ChatMessage(role="user", content=user_input)
        self.memory.put(user_msg)

        # get chat history
        chat_history = self.memory.get()
        return InputEvent(input=chat_history)

    @step(pass_context=True)
    async def handle_llm_input(
        self, ctx: Context, ev: InputEvent
    ) -> ChunkRetrievalEvent | ReportGenerationEvent | StopEvent:
        chat_history = ev.input

        response = await self.llm.achat_with_tools(
            [self.chunk_retriever_tool],
            chat_history=chat_history,
        )
        self.memory.put(response.message)

        tool_calls = self.llm.get_tool_calls_from_response(
            response, error_on_no_tool_call=False
        )
        for tool_call in tool_calls:
            print(f"Tool call: {tool_call}")
        if not tool_calls:
            # all the content should be stored in the context, so just pass along input
            return ReportGenerationEvent(input=ev.input)

        for tool_call in tool_calls:
            if tool_call.tool_name == self.chunk_retriever_tool.metadata.name:
                return ChunkRetrievalEvent(tool_call=tool_call)
            else:
                return StopEvent(result={"response": "Invalid tool."})

    @step(pass_context=True)
    async def handle_retrieval(
        self, ctx: Context, ev: ChunkRetrievalEvent
    ) -> InputEvent:
        """Handle retrieval.

        Store retrieved chunks, and go back to agent reasoning loop.

        """
        query = ev.tool_call.tool_kwargs["query"]
        # only chunk-level retrieval is defined in this workflow
        retrieved_chunks = self.chunk_retriever_tool(query).raw_output
        ctx.data["stored_chunks"].extend(retrieved_chunks)

        # synthesize an answer given the query to return to the LLM.
        response = self.summarizer.synthesize(query, nodes=retrieved_chunks)
        self.memory.put(
            ChatMessage(
                role="tool",
                content=str(response),
                additional_kwargs={
                    "tool_call_id": ev.tool_call.tool_id,
                    "name": ev.tool_call.tool_name,
                },
            )
        )

        # send input event back with updated chat history
        return InputEvent(input=self.memory.get())

    @step(pass_context=True)
    async def generate_report(
        self, ctx: Context, ev: ReportGenerationEvent
    ) -> StopEvent:
        """Generate report."""

        messages = self.report_gen_prompt.format_messages(
            query_str=ctx.data["query"], 
            context_str="\n\n".join([n.get_content(metadata_mode="all") for n in ctx.data["stored_chunks"]])
        )
        response = self.report_gen_sllm.chat(messages).raw
        return StopEvent(result={"response": response})

In [38]:
agent = ReportGenerationAgent(
    chunk_retriever_tool,
    llm=llm,
    report_gen_sllm=report_gen_sllm,
    verbose=True,
    timeout=120.0,
)

In [39]:
ret = await agent.run(
    input="Tell me about the top-level assets and liabilities for Tesla in 2021, and compare it against those of Apple in 2021. Which company is doing better?"
)

Running step prepare_chat_history
Step prepare_chat_history produced event InputEvent
Running step handle_llm_input
Tool call: tool_id='call_MWY7STXZYCAVUr4XhlutAAdd' tool_name='chunk_retriever_fn' tool_kwargs={'query': 'Tesla 2021 financial statements assets liabilities'}
Step handle_llm_input produced event ChunkRetrievalEvent
Running step handle_retrieval
Step handle_retrieval produced event InputEvent
Running step handle_llm_input
Tool call: tool_id='call_Fx7TOUiP9kyFCU4kEmkPyl4F' tool_name='chunk_retriever_fn' tool_kwargs={'query': 'Apple 2021 financial statements assets liabilities'}
Step handle_llm_input produced event ChunkRetrievalEvent
Running step handle_retrieval
Step handle_retrieval produced event InputEvent
Running step handle_llm_input
Step handle_llm_input produced event ReportGenerationEvent
Running step generate_report
Step generate_report produced event StopEvent


In [40]:
ret["response"].render()

This report provides a comparative analysis of the top-level assets and liabilities for Tesla and Apple in the year 2021.

First, we present the consolidated balance sheets for Tesla and Apple for the year 2021.

|   |   | December 31, 2021 |
| --- | --- | --- |
| 0 | Total current assets | $ 27,100 |
| 1 | Total non-current assets | $ 35,031 |
| 2 | Total assets | $ 62,131 |
| 3 | Total current liabilities | $ 19,705 |
| 4 | Total non-current liabilities | $ 10,843 |
| 5 | Total liabilities | $ 30,548 |
| 6 | Total equity | $ 31,583 |


|   |   | September 25, 2021 |
| --- | --- | --- |
| 0 | Total current assets | $ 134,836 |
| 1 | Total non-current assets | $ 216,166 |
| 2 | Total assets | $ 351,002 |
| 3 | Total current liabilities | $ 125,481 |
| 4 | Total non-current liabilities | $ 162,431 |
| 5 | Total liabilities | $ 287,912 |
| 6 | Total equity | $ 63,090 |


From the balance sheets, we can observe the following key points:

1. **Total Assets**: Apple has significantly higher total assets ($351,002 million) compared to Tesla ($62,131 million).

2. **Total Liabilities**: Apple's total liabilities ($287,912 million) are also much higher than Tesla's ($30,548 million).

3. **Total Equity**: Apple has a higher total equity ($63,090 million) compared to Tesla ($31,583 million).

In conclusion, Apple has a stronger financial position in terms of total assets, liabilities, and equity compared to Tesla in the year 2021.
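
The generated conclusion compares absolute totals only. As a hedged sanity check, the sketch below derives two common ratios from the figures in the rendered tables (in millions of dollars). It is not part of the workflow: `parse_millions` and `ratios` are hypothetical helpers, and the choice of ratios is an assumption about what a reader might want to inspect, not a judgment on which company is doing better.

```python
from typing import Dict


def parse_millions(cell: str) -> int:
    """Convert a cell such as "$ 27,100" into the integer 27100."""
    return int(cell.replace("$", "").replace(",", "").strip())


# Values copied from the two rendered tables above (in millions of dollars).
balance_sheets: Dict[str, Dict[str, str]] = {
    "Tesla": {
        "Total current assets": "$ 27,100",
        "Total assets": "$ 62,131",
        "Total current liabilities": "$ 19,705",
        "Total liabilities": "$ 30,548",
    },
    "Apple": {
        "Total current assets": "$ 134,836",
        "Total assets": "$ 351,002",
        "Total current liabilities": "$ 125,481",
        "Total liabilities": "$ 287,912",
    },
}


def ratios(sheet: Dict[str, str]) -> Dict[str, float]:
    """Compute the current ratio and the liabilities-to-assets ratio."""
    parsed = {name: parse_millions(value) for name, value in sheet.items()}
    return {
        "current_ratio": parsed["Total current assets"]
        / parsed["Total current liabilities"],
        "liabilities_to_assets": parsed["Total liabilities"]
        / parsed["Total assets"],
    }


summary = {company: ratios(sheet) for company, sheet in balance_sheets.items()}
for company, values in summary.items():
    print(
        f"{company}: current ratio {values['current_ratio']:.2f}, "
        f"liabilities/assets {values['liabilities_to_assets']:.2f}"
    )
```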

In [25]:
ret = await agent.run(
    input="Tell me about the gross margin breakdown of Apple 2020-2022."
)

Running step prepare_chat_history
Step prepare_chat_history produced event InputEvent
Running step handle_llm_input
Step handle_llm_input produced event ChunkRetrievalEvent
Running step handle_retrieval
Step handle_retrieval produced event InputEvent
Running step handle_llm_input
Step handle_llm_input produced event ChunkRetrievalEvent
Running step handle_retrieval
Step handle_retrieval produced event InputEvent
Running step handle_llm_input
Step handle_llm_input produced event ChunkRetrievalEvent
Running step handle_retrieval
Step handle_retrieval produced event InputEvent
Running step handle_llm_input
Step handle_llm_input produced event ReportGenerationEvent
Running step generate_report
Step generate_report produced event StopEvent


In [27]:
ret["response"].render()

# Gross Margin Breakdown of Apple (2020-2022)

This report provides a detailed breakdown of Apple's gross margin for the years 2020, 2021, and 2022. The gross margin is divided into Products and Services categories, with both the gross margin in dollars and the gross margin percentage provided for each category.

|   |   | 2022 | 2021 | 2020 |
| --- | --- | --- | --- | --- |
| 0 | Products | $ 114,728 | $ 105,126 | $ 69,461 |
| 1 | Services | $ 56,054 | $ 47,710 | $ 35,495 |
| 2 | Total gross margin | $ 170,782 | $ 152,836 | $ 104,956 |


|   |   | 2022 | 2021 | 2020 |
| --- | --- | --- | --- | --- |
| 0 | Products | 36.3% | 35.3% | 31.5% |
| 1 | Services | 71.7% | 69.7% | 66.0% |
| 2 | Total gross margin percentage | 43.3% | 41.8% | 38.2% |
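
Since the tables are produced by an LLM, a quick internal consistency check can catch transcription errors. The sketch below verifies that Products plus Services equals the reported total for each year in the dollar table above. `parse_millions` and `check_totals` are hypothetical helpers written for this illustration, not part of the workflow.

```python
from typing import Dict


def parse_millions(cell: str) -> int:
    """Convert a cell such as "$ 114,728" into the integer 114728."""
    return int(cell.replace("$", "").replace(",", "").strip())


# Values copied from the rendered gross margin table above.
gross_margin: Dict[str, Dict[str, str]] = {
    "2022": {"Products": "$ 114,728", "Services": "$ 56,054", "Total": "$ 170,782"},
    "2021": {"Products": "$ 105,126", "Services": "$ 47,710", "Total": "$ 152,836"},
    "2020": {"Products": "$ 69,461", "Services": "$ 35,495", "Total": "$ 104,956"},
}


def check_totals(table: Dict[str, Dict[str, str]]) -> Dict[str, bool]:
    """Return, per year, whether Products + Services matches the Total row."""
    results: Dict[str, bool] = {}
    for year, row in table.items():
        parts = parse_millions(row["Products"]) + parse_millions(row["Services"])
        results[year] = parts == parse_millions(row["Total"])
    return results


consistency = check_totals(gross_margin)
print(consistency)
```

A check like this only confirms that the table is internally consistent; it does not confirm that the figures match the source filings.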
