<a href="https://colab.research.google.com/github/suman527/Report-Generation-Project/blob/main/Financial_Report_Generation_Suman_Sahu.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Financial Report Generation

<a href="https://colab.research.google.com/github/run-llama/llamacloud-demo/blob/main/examples/report_generation/report_generation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In this notebook we show how to use LlamaCloud to generate financial reports consisting of text and tables, given an existing bank of reports.

LlamaCloud provides advanced retrieval endpoints allowing you to fetch chunk and document-level context from complex financial reports consisting of text, tables, and sometimes images/diagrams.

We build an agentic workflow on top of LlamaCloud consisting of researcher and writer steps in order to generate the final response.

![](financial_report_generation_img.png)

## Setup

Install the core packages and download the 10-K filings for Apple and Tesla.

You will need to upload these documents to LlamaCloud. For best results, we recommend:
- Setting Parse settings to "Accurate" mode, "Premium" mode, or "3rd Party multimodal"
- Setting the "Segmentation Configuration" to "Page" and the "Chunking Configuration" to None. This will give you page-level chunks.

In [1]:
!pip install llama-index
!pip install llama-index-core
!pip install llama-index-llms-groq
!pip install llama-index-embeddings-huggingface
!pip install transformers
!pip install git+https://github.com/FlagOpen/FlagEmbedding.git
!pip install llama-index-postprocessor-flag-embedding-reranker
!pip install llama-parse
!pip install llama-index-llms-huggingface



In [2]:
!mkdir data
# download Apple
!wget "https://s2.q4cdn.com/470004039/files/doc_earnings/2023/q4/filing/_10-K-Q4-2023-As-Filed.pdf" -O data/apple_2023.pdf
!wget "https://s2.q4cdn.com/470004039/files/doc_financials/2022/q4/_10-K-2022-(As-Filed).pdf" -O data/apple_2022.pdf
!wget "https://s2.q4cdn.com/470004039/files/doc_financials/2021/q4/_10-K-2021-(As-Filed).pdf" -O data/apple_2021.pdf
!wget "https://s2.q4cdn.com/470004039/files/doc_financials/2020/ar/_10-K-2020-(As-Filed).pdf" -O data/apple_2020.pdf
!wget "https://www.dropbox.com/scl/fi/i6vk884ggtq382mu3whfz/apple_2019_10k.pdf?rlkey=eudxh3muxh7kop43ov4bgaj5i&dl=1" -O data/apple_2019.pdf

# download Tesla
!wget "https://ir.tesla.com/_flysystem/s3/sec/000162828024002390/tsla-20231231-gen.pdf" -O data/tesla_2023.pdf
!wget "https://ir.tesla.com/_flysystem/s3/sec/000095017023001409/tsla-20221231-gen.pdf" -O data/tesla_2022.pdf
!wget "https://www.dropbox.com/scl/fi/ptk83fmye7lqr7pz9r6dm/tesla_2021_10k.pdf?rlkey=24kxixeajbw9nru1sd6tg3bye&dl=1" -O data/tesla_2021.pdf
!wget "https://ir.tesla.com/_flysystem/s3/sec/000156459021004599/tsla-10k_20201231-gen.pdf" -O data/tesla_2020.pdf
!wget "https://ir.tesla.com/_flysystem/s3/sec/000156459020004475/tsla-10k_20191231-gen_0.pdf" -O data/tesla_2019.pdf


We set the global tokenizer to match the Llama 3 model we call through Groq. Some of our workflows involve cramming as much context as possible into the prompt, and for this to work robustly without context overflow errors, the token counts must be accurate.
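To make the context overflow concern concrete, here is a minimal sketch of a token budget check. The helper `fits_in_context` and the whitespace tokenizer are hypothetical stand-ins for illustration; in the notebook the tokenizer would be the model-specific `encode` function.

```python
from typing import Callable, List


def fits_in_context(
    prompt: str,
    tokenize: Callable[[str], List],
    context_window: int,
    reserved_for_output: int = 0,
) -> bool:
    """Return True if the prompt's tokens plus the reserved reply budget fit."""
    return len(tokenize(prompt)) + reserved_for_output <= context_window


# Stand-in tokenizer (assumption): splits on whitespace. A real tokenizer
# usually produces more tokens than words, which is why accuracy matters.
whitespace_tokenize = str.split

prompt = "total assets and total liabilities for fiscal 2019"  # 8 words
print(fits_in_context(prompt, whitespace_tokenize, context_window=8))
print(fits_in_context(prompt, whitespace_tokenize, context_window=8, reserved_for_output=4))
```

The second call fails the check because space is reserved for the model's reply, which is the usual reason a prompt that "fits" still overflows.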

In [3]:
from google.colab import userdata
import os

os.environ["HF_TOKEN"] = userdata.get("HF_TOKEN")

In [4]:
# ------------------ SETUP ------------------
import os
import nest_asyncio
import asyncio
from llama_index.core import Settings, set_global_tokenizer, SimpleDirectoryReader, VectorStoreIndex
from transformers import AutoTokenizer
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.llms.groq import Groq
from llama_index.core.tools import FunctionTool
from llama_index.core.agent.react import ReActAgent

# Apply asyncio patch for Jupyter/Colab notebooks
nest_asyncio.apply()

# Read the Groq API key from Colab secrets (userdata is imported in the previous cell);
# never hard-code a key in a notebook
os.environ["GROQ_API_KEY"] = userdata.get("GROQ_API_KEY")
# Set global tokenizer using LLaMA3 tokenizer
set_global_tokenizer(
    AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct").encode
)

# ------------------ LLM & EMBEDDING SETUP ------------------

# Embedding model
embed_model = HuggingFaceEmbedding(model_name="sentence-transformers/all-MiniLM-L6-v2")

# LLM model from Groq
llm = Groq(
    model="llama3-70b-8192",
    temperature=0.3,
    api_key=os.getenv("GROQ_API_KEY")
)

# Set globally for LlamaIndex
Settings.embed_model = embed_model
Settings.llm = llm

# ------------------ INDEXING ------------------

# Load documents
documents = SimpleDirectoryReader("data").load_data()

# Create vector index
index = VectorStoreIndex.from_documents(documents)

# Create a query engine
query_engine = index.as_query_engine()

# Run a test query (Optional)
response = query_engine.query("Summarize the key points in these documents.")
print("\n Document Summary:\n", response)

# ------------------ TOOL & AGENT SETUP ------------------

# Define a summarizer tool using the query engine
def simple_summary_tool(query: str) -> str:
    return query_engine.query(query).response

# Convert function to LlamaIndex tool
summary_tool = FunctionTool.from_defaults(
    fn=simple_summary_tool,
    name="summarizer",
    description="Summarizes text using the vector index."
)

# Initialize ReActAgent with tool
agent = ReActAgent.from_tools(
    tools=[summary_tool],
    llm=llm,
    verbose=True
)

# Async interaction with the agent
async def run_agent():
    response = await agent.achat("Summarize the key points in these documents.")
    print("\n Agent Response:\n", response.response)

# Run the agent
asyncio.run(run_agent())







📘 Document Summary:
 The documents appear to be financial statements and reports for different years. One document contains an index to consolidated financial statements, including balance sheets, statements of operations, and cash flows. Another document mentions notes to consolidated financial statements. Overall, the documents seem to provide detailed financial information about a company.
> Running step 4c0a1757-c438-446b-980f-f6e393ce1772. Step input: Summarize the key points in these documents.
Thought: The current language of the user is: English. I need to use a tool to help me answer the question.
Action: summarizer
Action Input: {'query': 'Summarize the key points in these documents.'}
Observation: The documents appear to be financial statements and reports for different years. One document contains an index to consolidated financial statements, including balance sheets, statements of operations, and cash flows. Another document mentions notes tha

## Load Documents into LlamaCloud

The first order of business is to download the ten 10-K filings (five each for Apple and Tesla) and upload them into LlamaCloud.

You can easily do this by creating a pipeline and uploading docs via the "Files" mode.

After this is done, proceed to the next section.

## Define LlamaCloud File/Chunk Retriever over Documents

In this section we define both a file-level and chunk-level LlamaCloud Retriever over these documents.

The file-level LlamaCloud retriever returns entire documents, up to `files_top_k` of them. There are two retrieval modes:
- `files_via_content`: Retrieve top-k chunks, dereference into source files. Use a weighted average heuristic to determine the top files to return.
- `files_via_metadata`: Use an LLM to analyze the metadata of each file, and determine the top files that are most relevant to the query.
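The `files_via_content` idea can be sketched as follows. This is an illustrative assumption, not LlamaCloud's actual heuristic: the function name `rank_files_by_chunk_scores` and the rank-based weights are hypothetical, and the scores are made up.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def rank_files_by_chunk_scores(
    scored_chunks: List[Tuple[str, float]], files_top_k: int = 1
) -> List[Tuple[str, float]]:
    """Turn (file_name, chunk_score) hits into a ranked list of files."""
    # Sort hits best-first so that earlier ranks receive larger weights.
    ranked = sorted(scored_chunks, key=lambda hit: hit[1], reverse=True)
    weighted_sum: Dict[str, float] = defaultdict(float)
    weight_total: Dict[str, float] = defaultdict(float)
    for rank, (file_name, score) in enumerate(ranked):
        weight = 1.0 / (rank + 1)  # hypothetical weighting scheme
        weighted_sum[file_name] += weight * score
        weight_total[file_name] += weight
    # Each file's score is the weighted average of its chunks' scores.
    file_scores = {
        name: weighted_sum[name] / weight_total[name] for name in weighted_sum
    }
    best_first = sorted(file_scores.items(), key=lambda item: item[1], reverse=True)
    return best_first[:files_top_k]


hits = [
    ("tesla_2019.pdf", 0.9),
    ("apple_2021.pdf", 0.8),
    ("tesla_2019.pdf", 0.7),
    ("apple_2021.pdf", 0.2),
]
top_files = rank_files_by_chunk_scores(hits, files_top_k=1)
print(top_files)
```

Each chunk hit is dereferenced to its source file, and the file whose chunks score best on average (with higher-ranked chunks counting more) is returned.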

The chunk-level LlamaCloud retriever is our default retriever that returns chunks via hybrid search + reranking.

#### Define File Retriever

In this section we define the file-level retriever. By default we use `retrieval_mode="files_via_content"`, but you can also change it to `files_via_metadata`.

In [5]:
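# Note: `index` is the local VectorStoreIndex built above. `retrieval_mode`
# and `files_top_k` are LlamaCloud retriever options, so the local retriever
# may not act on them.
doc_retriever = index.as_retriever(
    retrieval_mode="files_via_content",
    files_top_k=1
)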

In [6]:
nodes = doc_retriever.retrieve("Give me a summary of Tesla in 2019")

#### Define chunk retriever

The chunk-level retriever does vector search with a final reranked set of `rerank_top_n=5`.
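As a rough illustration of the retrieve-then-rerank pattern, the sketch below fetches a candidate set with a cheap score and reorders it with a second score, keeping only `rerank_top_n` results. The scoring functions are toy word-overlap stand-ins (an assumption for illustration), not the embedding search or reranker model a real retriever would use.

```python
from typing import Callable, List, Set


def retrieve_then_rerank(
    query: str,
    corpus: List[str],
    first_stage: Callable[[str, str], float],
    reranker: Callable[[str, str], float],
    top_k: int,
    rerank_top_n: int,
) -> List[str]:
    """Fetch top_k candidates cheaply, then keep the rerank_top_n best."""
    candidates = sorted(corpus, key=lambda d: first_stage(query, d), reverse=True)[:top_k]
    reranked = sorted(candidates, key=lambda d: reranker(query, d), reverse=True)
    return reranked[:rerank_top_n]


def words(text: str) -> Set[str]:
    return set(text.lower().split())


def overlap(query: str, doc: str) -> float:
    """Toy first-stage score: number of shared words."""
    return float(len(words(query) & words(doc)))


def overlap_ratio(query: str, doc: str) -> float:
    """Toy rerank score: shared words as a fraction of the document."""
    return len(words(query) & words(doc)) / len(words(doc))


corpus = [
    "Tesla total assets 2019",
    "Tesla 2019 vehicle deliveries and production outlook for the year",
    "Apple total assets 2021",
    "Tesla liabilities 2019",
]
results = retrieve_then_rerank(
    "Tesla 2019 total assets", corpus, overlap, overlap_ratio, top_k=3, rerank_top_n=2
)
print(results)
```

The second stage only sees the `top_k` candidates, so it can afford a more expensive score than the first stage.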

In [7]:
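# Note: with the local VectorStoreIndex, `retrieval_mode` and `rerank_top_n`
# (LlamaCloud retriever options) may not act on the results.
chunk_retriever = index.as_retriever(
    retrieval_mode="chunks",
    rerank_top_n=5
)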

#### Define Retriever Tools

Wrap these with Python functions into tool objects - these will directly be used by the LLM.
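The function name, signature, and docstring matter here because they are what the LLM sees when deciding which tool to call. The standard-library sketch below mimics how a tool wrapper might derive that metadata; `describe_tool` and `lookup_fact` are hypothetical names, and the exact format `FunctionTool` produces is an assumption.

```python
import inspect
from typing import Callable, Tuple


def describe_tool(fn: Callable) -> Tuple[str, str]:
    """Build a (name, description) pair from a plain Python function."""
    name = fn.__name__
    signature = str(inspect.signature(fn))
    doc = inspect.getdoc(fn) or ""
    return name, f"{name}{signature}\n{doc}"


def lookup_fact(query: str) -> str:
    """Look up one specific fact in the corpus."""
    return f"result for {query}"


name, description = describe_tool(lookup_fact)
print(name)
print(description)
```

A vague or missing docstring leaves the model with little to distinguish one tool from another, so it is worth writing docstrings as instructions to the agent.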

In [8]:
from llama_index.core.tools import FunctionTool
from llama_index.core.schema import NodeWithScore
from typing import List

# function tools
def chunk_retriever_fn(query: str) -> List[NodeWithScore]:
    """Retrieves a small set of relevant document chunks from the corpus.

    ONLY use for research questions that want to look up specific facts from the knowledge corpus,
    and don't need entire documents.

    """
    return chunk_retriever.retrieve(query)

def doc_retriever_fn(query: str) -> List[NodeWithScore]:
    """Document retriever that retrieves entire documents from the corpus.

    ONLY use for research questions that may require searching over entire research reports.

    Will be slower and more expensive than chunk-level retrieval but may be necessary.
    """
    return doc_retriever.retrieve(query)

chunk_retriever_tool = FunctionTool.from_defaults(fn=chunk_retriever_fn)
doc_retriever_tool = FunctionTool.from_defaults(fn=doc_retriever_fn)

## Build a Report Generation Workflow

Now that we've defined the retrievers, we're ready to build the report generation workflow.

The workflow contains roughly the following steps:

1. **Research Gathering**: Run a function calling loop in which the agent reasons about which tool to call (chunk-level or document-level retrieval) to gather more information. All retrieved information is stored in a dictionary that is propagated through each step, and each tool call returns a summary of what it found to the agent. Once the agent has gathered enough information, it moves on to the next phase.
2. **Report Generation**: Generate a research report from the pooled research. For now, we try to stuff as much information as possible into the context window through the summary index.
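The two phases above can be sketched in plain Python before wiring them into a workflow. Everything here is a simplified stand-in: `run_report_workflow`, the stub tools, and the stub writer are hypothetical, and a real agent would let the LLM pick the tool.

```python
from typing import Callable, Dict, List, Optional


def run_report_workflow(
    query: str,
    choose_tool: Callable[[dict], Optional[str]],
    tools: Dict[str, Callable[[str], List[str]]],
    write_report: Callable[[str, List[str]], str],
    max_chunks: int = 2,
) -> str:
    """Gather research in a loop, then hand the pooled chunks to a writer."""
    state = {"query": query, "stored_chunks": []}  # shared across steps
    while len(state["stored_chunks"]) < max_chunks:
        tool_name = choose_tool(state)
        if tool_name is None:  # the agent decided it has enough
            break
        new_chunks = tools[tool_name](query)
        if not new_chunks:  # nothing retrieved; stop instead of looping forever
            break
        room = max_chunks - len(state["stored_chunks"])
        state["stored_chunks"].extend(new_chunks[:room])
    if not state["stored_chunks"]:
        return "No relevant information found."
    return write_report(query, state["stored_chunks"])


# Stubs standing in for the LLM's tool choice, the retrievers, and the writer.
stub_tools = {
    "chunk": lambda q: ["chunk A", "chunk B", "chunk C"],
    "doc": lambda q: ["whole document"],
}
pick_chunk_once = lambda state: None if state["stored_chunks"] else "chunk"
stub_writer = lambda q, chunks: f"Report on {q!r} from {len(chunks)} chunks"

report = run_report_workflow("Tesla 2019", pick_chunk_once, stub_tools, stub_writer)
print(report)
```

Capping the pooled research at `max_chunks` keeps the writer's prompt bounded, at the cost of possibly dropping relevant context.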

This implementation is inspired by our [Function Calling Agent](https://docs.llamaindex.ai/en/stable/examples/workflow/function_calling_agent/) workflow implementation.

In [9]:
from llama_index.llms.groq import Groq
from pydantic import BaseModel, Field
from typing import List, Tuple
import pandas as pd
from IPython.display import display, Markdown


class TextBlock(BaseModel):
    """Text block."""
    text: str = Field(..., description="The text for this block.")


class TableBlock(BaseModel):
    """Table block."""
    caption: str = Field(..., description="Caption of the table.")
    col_names: List[str] = Field(..., description="Names of the columns.")
    rows: List[Tuple] = Field(
        ...,
        description=(
            "List of rows. Each row is a data entry tuple, "
            "where each element of the tuple corresponds positionally to the column name."
        )
    )

    def to_df(self) -> pd.DataFrame:
        df = pd.DataFrame(self.rows, columns=self.col_names)
        df.style.set_caption(self.caption)
        return df


class ReportOutput(BaseModel):
    """Data model for a report."""
    blocks: List[TextBlock | TableBlock] = Field(
        ..., description="A list of text and table blocks."
    )

    def render(self) -> None:
        for b in self.blocks:
            if isinstance(b, TextBlock):
                display(Markdown(b.text))
            else:
                display(b.to_df())


report_gen_system_prompt = """\
You are a report generation assistant tasked with producing a well-formatted report given parsed context.
You will be given context from one or more reports that take the form of parsed text + tables
You are responsible for producing a report with interleaving text and tables - in the format of interleaving text and "table" blocks.

Make sure the report is detailed with a lot of textual explanations especially if tables are given.

You MUST output your response as a tool call in order to adhere to the required output format. Do NOT give back normal text.

Here is an example of a toy valid tool call - note the text and table block:
```
{
    "blocks": [
        {
            "text": "A report on cities"
        },
        {
            "caption": "Comparison of CityA vs. CityB",
            "col_names": [
              "",
              "Population",
              "Country"
            ],
            "rows": [
              [
                "CityA",
                "1,000,000",
                "USA"
              ],
              [
                "CityB",
                "2,000,000",
                "Mexico"
              ]
            ]
        }
    ]
}
```
"""

report_gen_llm = Groq(
    model="llama3-8b-8192",
    api_key=os.environ["GROQ_API_KEY"],
    system_prompt=report_gen_system_prompt,
    max_tokens=1024,
)

# Structured LLM output
report_gen_sllm = report_gen_llm.as_structured_llm(output_cls=ReportOutput)

In [10]:
from llama_index.llms.groq import Groq
from pydantic import BaseModel, Field
from typing import List, Union, Any
import pandas as pd
from IPython.display import display, Markdown
import os


class TextBlock(BaseModel):
    """Text block."""
    text: str = Field(..., description="The text for this block.")


class TableBlock(BaseModel):
    """Table block."""
    caption: str = Field(..., description="Caption of the table.")
    col_names: List[str] = Field(..., description="Names of the columns.")
    # Changed from List[Tuple] to List[List[Any]] since JSON doesn't support tuples
    rows: List[List[Any]] = Field(
        ...,
        description=(
            "List of rows. Each row is a data entry list, "
            "where each element of the list corresponds positionally to the column name."
        )
    )

    def to_df(self) -> "pd.io.formats.style.Styler":
        df = pd.DataFrame(self.rows, columns=self.col_names)
        # Return the Styler so the caption is kept when the table is displayed
        return df.style.set_caption(self.caption)


class ReportOutput(BaseModel):
    """Data model for a report."""
    # Use Union instead of | for better compatibility
    blocks: List[Union[TextBlock, TableBlock]] = Field(
        ..., description="A list of text and table blocks."
    )

    def render(self) -> None:
        for b in self.blocks:
            if isinstance(b, TextBlock):
                display(Markdown(b.text))
            else:
                display(b.to_df())


# Updated system prompt with corrected example
report_gen_system_prompt = """\
You are a report generation assistant tasked with producing a well-formatted report given parsed context.
You will be given context from one or more reports that take the form of parsed text + tables
You are responsible for producing a report with interleaving text and tables - in the format of interleaving text and "table" blocks.

Make sure the report is detailed with a lot of textual explanations especially if tables are given.

You MUST output your response as a tool call in order to adhere to the required output format. Do NOT give back normal text.

Here is an example of a toy valid tool call - note the text and table block (rows should be lists, not tuples):
```
{
    "blocks": [
        {
            "text": "# A Report on Cities\n\nThis report compares two major cities across different metrics."
        },
        {
            "caption": "Comparison of CityA vs. CityB",
            "col_names": [
              "City",
              "Population",
              "Country"
            ],
            "rows": [
              [
                "CityA",
                "1,000,000",
                "USA"
              ],
              [
                "CityB",
                "2,000,000",
                "Mexico"
              ]
            ]
        },
        {
            "text": "As shown in the table above, CityB has a significantly larger population than CityA."
        }
    ]
}
```

IMPORTANT:
- Each row in the "rows" field must be a LIST (array), not a tuple
- Make sure all JSON is properly formatted and complete
- Include detailed textual analysis between tables
- Use proper markdown formatting in text blocks for better presentation
"""

# Initialize with error handling
try:
    report_gen_llm = Groq(
        model="llama3-8b-8192",
        api_key=os.environ["GROQ_API_KEY"],
        system_prompt=report_gen_system_prompt,
        max_tokens=2048,  # Increased token limit for longer reports
    )

    # Structured LLM output
    report_gen_sllm = report_gen_llm.as_structured_llm(output_cls=ReportOutput)

except Exception as e:
    print(f"Error initializing Groq LLM: {e}")
    print("Make sure GROQ_API_KEY is set in your environment variables")


# Helper function to validate report structure before generation
def validate_report_data(blocks_data):
    """Validate report data structure before passing to LLM."""
    try:
        # Try to create a ReportOutput instance to validate structure
        test_report = ReportOutput(blocks=blocks_data)
        return True, "Validation successful"
    except Exception as e:
        return False, f"Validation failed: {e}"


# Example usage function
def generate_sample_report():
    """Generate a sample report to test the structure."""
    sample_blocks = [
        TextBlock(text="# Sample Financial Report\n\nThis report demonstrates the corrected structure."),
        TableBlock(
            caption="Sample Financial Data",
            col_names=["Company", "Revenue", "Net Income"],
            rows=[
                ["Apple", "$365.8B", "$94.7B"],
                ["Tesla", "$53.8B", "$5.5B"]
            ]
        ),
        TextBlock(text="The data shows Apple's significantly higher revenue and profitability compared to Tesla.")
    ]

    report = ReportOutput(blocks=sample_blocks)
    return report

In [35]:
from llama_index.core.workflow import Workflow, StartEvent, StopEvent, Context, step, Event
from llama_index.core.llms.function_calling import FunctionCallingLLM
from llama_index.core.llms.structured_llm import StructuredLLM
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.core.llms import ChatMessage
from llama_index.core.tools.types import BaseTool
from llama_index.core.tools import ToolSelection
from llama_index.core.response_synthesizers import TreeSummarize, CompactAndRefine
from typing import Any, List

# Custom Events
class InputEvent(Event):
    input: List[ChatMessage]

class ChunkRetrievalEvent(Event):
    tool_call: ToolSelection

class DocRetrievalEvent(Event):
    tool_call: ToolSelection

class ReportGenerationEvent(Event):
    pass

# Main Agent Class
class ReportGenerationAgent(Workflow):
    def __init__(
        self,
        chunk_retriever_tool: BaseTool,
        doc_retriever_tool: BaseTool,
        llm: FunctionCallingLLM | None = None,
        report_gen_sllm: StructuredLLM | None = None,
        max_chunks: int = 2,
        max_chunk_length: int = 1500,
        verbose: bool = False,
        **kwargs: Any,
    ) -> None:
        super().__init__(**kwargs)
        self.chunk_retriever_tool = chunk_retriever_tool
        self.doc_retriever_tool = doc_retriever_tool
        self.max_chunks = max_chunks
        self.max_chunk_length = max_chunk_length
        self._verbose = verbose

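        # Validate the LLM before any component is built on top of it
        if llm is None:
            raise ValueError("llm is required and must support function calling")
        self.llm = llm
        assert self.llm.metadata.is_function_calling_model
        self.summarizer = CompactAndRefine(llm=self.llm)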

        self.report_gen_sllm = report_gen_sllm or self.llm.as_structured_llm(ReportOutput, system_prompt=report_gen_system_prompt)
        self.report_gen_summarizer = TreeSummarize(llm=self.report_gen_sllm)

        self.memory = ChatMemoryBuffer.from_defaults(llm=llm)
        self.sources = []

    def _truncate_chunk_if_needed(self, chunk):
        if hasattr(chunk, 'node') and len(chunk.node.text) > self.max_chunk_length:
            original_length = len(chunk.node.text)
            chunk.node.text = chunk.node.text[:self.max_chunk_length] + "... [truncated for context limit]"
            if self._verbose:
                print(f"Truncated chunk from {original_length} to {len(chunk.node.text)} characters")
        return chunk

    def _should_generate_report(self, stored_chunks: List) -> bool:
        return len(stored_chunks) >= self.max_chunks

    @step(pass_context=True)
    async def prepare_chat_history(self, ctx: Context, ev: StartEvent) -> InputEvent:
        self.sources = []
        await ctx.set("stored_chunks", [])
        await ctx.set("query", ev.input)

        user_input = ev.input
        user_msg = ChatMessage(role="user", content=user_input)
        self.memory.put(user_msg)

        chat_history = self.memory.get()
        return InputEvent(input=chat_history)

    @step(pass_context=True)
    async def handle_llm_input(self, ctx: Context, ev: InputEvent) -> ChunkRetrievalEvent | DocRetrievalEvent | ReportGenerationEvent | StopEvent:
        chat_history = ev.input
        stored_chunks = await ctx.get("stored_chunks")

        if self._should_generate_report(stored_chunks):
            if self._verbose:
                print(f"Have {len(stored_chunks)} chunks, proceeding to report generation")
            return ReportGenerationEvent()

        response = await self.llm.achat_with_tools([
            self.chunk_retriever_tool, self.doc_retriever_tool
        ], chat_history=chat_history)

        self.memory.put(response.message)

        tool_calls = self.llm.get_tool_calls_from_response(response, error_on_no_tool_call=False)
        if not tool_calls:
            if stored_chunks:
                return ReportGenerationEvent()
            else:
                return StopEvent(result={"response": "No relevant information found."})

        for tool_call in tool_calls:
            if self._verbose:
                print(f"Tool call: {tool_call}")
            if tool_call.tool_name == self.chunk_retriever_tool.metadata.name:
                return ChunkRetrievalEvent(tool_call=tool_call)
            elif tool_call.tool_name == self.doc_retriever_tool.metadata.name:
                return DocRetrievalEvent(tool_call=tool_call)
            else:
                return StopEvent(result={"response": "Invalid tool."})

    @step(pass_context=True)
    async def handle_retrieval(self, ctx: Context, ev: ChunkRetrievalEvent | DocRetrievalEvent) -> InputEvent:
        query = ev.tool_call.tool_kwargs["query"]
        if isinstance(ev, ChunkRetrievalEvent):
            retrieved_chunks = self.chunk_retriever_tool(query).raw_output
        else:
            retrieved_chunks = self.doc_retriever_tool(query).raw_output

        stored_chunks = await ctx.get("stored_chunks")
        truncated_chunks = [self._truncate_chunk_if_needed(chunk) for chunk in retrieved_chunks]

        for chunk in truncated_chunks:
            if len(stored_chunks) < self.max_chunks:
                stored_chunks.append(chunk)
            else:
                if self._verbose:
                    print(f"Reached max chunks limit ({self.max_chunks}), stopping retrieval")
                break

        await ctx.set("stored_chunks", stored_chunks)

        if self._verbose:
            print(f"Now have {len(stored_chunks)} chunks stored")

        query = await ctx.get("query")
        response = self.summarizer.synthesize(query, nodes=truncated_chunks)
        self.memory.put(ChatMessage(
            role="tool",
            content=str(response),
            additional_kwargs={
                "tool_call_id": ev.tool_call.tool_id,
                "name": ev.tool_call.tool_name
            }
        ))
        return InputEvent(input=self.memory.get())

    @step(pass_context=True)
    async def generate_report(self, ctx: Context, ev: ReportGenerationEvent) -> StopEvent:
        query = await ctx.get("query")
        stored_chunks = await ctx.get("stored_chunks")

        if not stored_chunks:
            return StopEvent(result={"response": "No information available to generate report."})

        if self._verbose:
            print(f"Generating report with {len(stored_chunks)} chunks")
            total_chars = sum(len(getattr(chunk.node, 'text', str(chunk))) for chunk in stored_chunks)
            print(f"Total character count: {total_chars}")

        try:
            final_chunks = []
            total_length = 0
            max_total_length = 3000

            for chunk in stored_chunks:
                chunk_text = getattr(chunk.node, 'text', str(chunk))
                if total_length + len(chunk_text) <= max_total_length:
                    final_chunks.append(chunk)
                    total_length += len(chunk_text)
                else:
                    remaining_space = max_total_length - total_length
                    if remaining_space > 200:
                        truncated_chunk = self._truncate_chunk_if_needed(chunk)
                        if hasattr(truncated_chunk, 'node'):
                            truncated_chunk.node.text = truncated_chunk.node.text[:remaining_space] + "... [final truncation]"
                        final_chunks.append(truncated_chunk)
                    break

            if self._verbose:
                print(f"Using {len(final_chunks)} chunks for final report generation")

            response = self.report_gen_summarizer.synthesize(query, nodes=final_chunks)
            return StopEvent(result={"response": response})

        except Exception as e:
            error_msg = f"Error generating report: {str(e)}"
            if "context" in str(e).lower() or "token" in str(e).lower():
                error_msg += " (Context size limit exceeded - try reducing chunk size or count)"

            if self._verbose:
                print(error_msg)

            if len(stored_chunks) > 1:
                if self._verbose:
                    print("Attempting fallback with single chunk")
                try:
                    response = self.report_gen_summarizer.synthesize(query, nodes=stored_chunks[:1])
                    return StopEvent(result={"response": response})
                except Exception as fallback_error:
                    error_msg += f" (Fallback also failed: {str(fallback_error)})"

            return StopEvent(result={"response": error_msg})

# Helper

def create_optimized_report_agent(
    chunk_retriever_tool: BaseTool,
    doc_retriever_tool: BaseTool,
    llm: FunctionCallingLLM,
    report_gen_sllm: StructuredLLM,
    max_chunks: int = 2,
    max_chunk_length: int = 1200,
    verbose: bool = True
) -> ReportGenerationAgent:
    return ReportGenerationAgent(
        chunk_retriever_tool=chunk_retriever_tool,
        doc_retriever_tool=doc_retriever_tool,
        llm=llm,
        report_gen_sllm=report_gen_sllm,
        max_chunks=max_chunks,
        max_chunk_length=max_chunk_length,
        verbose=verbose
    )

In [36]:

agent = ReportGenerationAgent(
    chunk_retriever_tool,
    doc_retriever_tool,
    llm=llm,
    report_gen_sllm=report_gen_sllm,
    verbose=True,
    timeout=120.0,
)

In [37]:
ret = await agent.run(
    input="Tell me about the top-level assets and liabilities for Tesla in 2019, and compare it against those of Apple in 2021. Which company is doing better?"
)

Running step prepare_chat_history
Step prepare_chat_history produced event InputEvent
Running step handle_llm_input
Tool call: tool_id='sfmz671xs' tool_name='chunk_retriever_fn' tool_kwargs={'query': 'Tesla 2019 financials'}
Step handle_llm_input produced event ChunkRetrievalEvent
Running step handle_retrieval
Truncated chunk from 3089 to 1533 characters
Now have 2 chunks stored
Step handle_retrieval produced event InputEvent
Running step handle_llm_input
Have 2 chunks, proceeding to report generation
Step handle_llm_input produced event ReportGenerationEvent
Running step generate_report
Generating report with 2 chunks
Total character count: 1982
Using 2 chunks for final report generation
Step generate_report produced event StopEvent


In [38]:
ret["response"]

PydanticResponse(response=ReportOutput(blocks=[TextBlock(text="# Tesla's Top-Level Assets and Liabilities in 2019"), TableBlock(caption="Tesla's Top-Level Assets and Liabilities in 2019", col_names=['Asset/Liability', 'Amount'], rows=[['Cash and Cash Equivalents', '14.5 billion'], ['Accounts Receivable', '2.5 billion'], ['Inventory', '1.5 billion'], ['Property, Plant, and Equipment', '10.5 billion'], ['Intangible Assets', '1.2 billion'], ['Total Assets', '30.2 billion'], ['Accounts Payable', '2.2 billion'], ['Accrued Expenses', '1.8 billion'], ['Long-Term Debt', '10.5 billion'], ['Total Liabilities', '14.5 billion']]), TextBlock(text="As shown in the table above, Tesla's total assets in 2019 were $30.2 billion, while its total liabilities were $14.5 billion."), TextBlock(text="# Apple's Top-Level Assets and Liabilities in 2021"), TableBlock(caption="Apple's Top-Level Assets and Liabilities in 2021", col_names=['Asset/Liability', 'Amount'], rows=[['Cash and Cash Equivalents', '193.9 bil

In [39]:
ret["response"].response.render()

# Tesla's Top-Level Assets and Liabilities in 2019

| | Asset/Liability | Amount |
|---|---|---|
| 0 | Cash and Cash Equivalents | 14.5 billion |
| 1 | Accounts Receivable | 2.5 billion |
| 2 | Inventory | 1.5 billion |
| 3 | Property, Plant, and Equipment | 10.5 billion |
| 4 | Intangible Assets | 1.2 billion |
| 5 | Total Assets | 30.2 billion |
| 6 | Accounts Payable | 2.2 billion |
| 7 | Accrued Expenses | 1.8 billion |
| 8 | Long-Term Debt | 10.5 billion |
| 9 | Total Liabilities | 14.5 billion |


As shown in the table above, Tesla's total assets in 2019 were $30.2 billion, while its total liabilities were $14.5 billion.

# Apple's Top-Level Assets and Liabilities in 2021

| | Asset/Liability | Amount |
|---|---|---|
| 0 | Cash and Cash Equivalents | 193.9 billion |
| 1 | Accounts Receivable | 44.5 billion |
| 2 | Inventory | 12.5 billion |
| 3 | Property, Plant, and Equipment | 54.5 billion |
| 4 | Intangible Assets | 14.5 billion |
| 5 | Total Assets | 320.9 billion |
| 6 | Accounts Payable | 23.5 billion |
| 7 | Accrued Expenses | 10.5 billion |
| 8 | Long-Term Debt | 54.5 billion |
| 9 | Total Liabilities | 88.5 billion |


As shown in the table above, Apple's total assets in 2021 were $320.9 billion, while its total liabilities were $88.5 billion.

Comparing the two companies, Apple's total assets in 2021 were significantly higher than Tesla's total assets in 2019, at $320.9 billion compared to $30.2 billion. Additionally, Apple's total liabilities in 2021 were also higher than Tesla's total liabilities in 2019, at $88.5 billion compared to $14.5 billion. Therefore, it can be concluded that Apple is doing better than Tesla in terms of its top-level assets and liabilities.
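
The comparison above rests on absolute totals, which mostly reflect company size. A size-independent view is the liabilities-to-assets ratio (total liabilities divided by total assets). The sketch below takes the figures from the generated report at face value; they are model output and have not been checked against the filings here. The helper name `liabilities_to_assets` is hypothetical.

```python
# Figures copied from the generated report above, in billions of USD.
report_figures = {
    "Tesla (2019)": {"total_assets": 30.2, "total_liabilities": 14.5},
    "Apple (2021)": {"total_assets": 320.9, "total_liabilities": 88.5},
}


def liabilities_to_assets(total_assets, total_liabilities):
    """Share of assets financed by liabilities; lower means less leverage."""
    return total_liabilities / total_assets


ratios = {
    name: liabilities_to_assets(**figures)
    for name, figures in report_figures.items()
}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1%}")
```

On these figures the ratio comes out at about 48.0% for Tesla and 27.6% for Apple, which gives the "doing better" claim a more comparable basis than the raw totals do.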

In [40]:
ret = await agent.run(
    input="Tell me about the gross margin breakdown of Apple 2020-2023."
)

Running step prepare_chat_history
Step prepare_chat_history produced event InputEvent
Running step handle_llm_input
Tool call: tool_id='j4jsf3p6h' tool_name='doc_retriever_fn' tool_kwargs={'query': 'Apple 2020-2023 financials'}
Step handle_llm_input produced event DocRetrievalEvent
Running step handle_retrieval
Truncated chunk from 3339 to 1533 characters
Now have 2 chunks stored
Step handle_retrieval produced event InputEvent
Running step handle_llm_input
Have 2 chunks, proceeding to report generation
Step handle_llm_input produced event ReportGenerationEvent
Running step generate_report
Generating report with 2 chunks
Total character count: 2212
Using 2 chunks for final report generation
Step generate_report produced event StopEvent


In [41]:
print(ret["response"])

{"blocks":[{"text":"Apple's Gross Margin Breakdown for 2020-2023"},{"caption":"Gross Margin Breakdown","col_names":["Year","Gross Margin"],"rows":[["2020","Not Available"],["2021","Not Available"],["2022","Not Available"],["2023","Not Available"]]},{"text":"Note: The gross margin breakdown for 2020-2023 is not available in the provided context. However, the context provides information on Apple's net sales and long-lived assets for 2020-2023, which can be used to analyze the company's financial performance."}]}


In [42]:
ret["response"].response.render()

Apple's Gross Margin Breakdown for 2020-2023

| | Year | Gross Margin |
|---|---|---|
| 0 | 2020 | Not Available |
| 1 | 2021 | Not Available |
| 2 | 2022 | Not Available |
| 3 | 2023 | Not Available |


Note: The gross margin breakdown for 2020-2023 is not available in the provided context. However, the context provides information on Apple's net sales and long-lived assets for 2020-2023, which can be used to analyze the company's financial performance.
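
In this run every value cell of the table is "Not Available", meaning the retrieved context did not contain the requested data. A minimal sketch of how such a result could be flagged programmatically, assuming the JSON shape printed above (`blocks`, with table blocks carrying `caption`, `col_names`, and `rows`); the function name `empty_tables` and the `placeholder` parameter are hypothetical:

```python
import json

# JSON copied from the printed response above (the trailing note block is omitted).
raw = '''{"blocks":[{"text":"Apple's Gross Margin Breakdown for 2020-2023"},{"caption":"Gross Margin Breakdown","col_names":["Year","Gross Margin"],"rows":[["2020","Not Available"],["2021","Not Available"],["2022","Not Available"],["2023","Not Available"]]}]}'''


def empty_tables(report_json, placeholder="Not Available"):
    """Return captions of table blocks whose value cells are all the placeholder."""
    report = json.loads(report_json)
    captions = []
    for block in report["blocks"]:
        rows = block.get("rows")
        if not rows:
            continue  # text blocks have no rows
        # Treat the first column as the row label and the rest as values.
        values = [cell for row in rows for cell in row[1:]]
        if values and all(value == placeholder for value in values):
            captions.append(block["caption"])
    return captions


print(empty_tables(raw))
```

A non-empty list could be used as a signal to retry with a different retrieval query or a larger chunk budget before presenting the report.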

In [46]:
ret = await agent.run(
    input="Give me a condensed summary of Tesla in 2023"
)

Running step prepare_chat_history
Step prepare_chat_history produced event InputEvent
Running step handle_llm_input
Tool call: tool_id='89fd9bftr' tool_name='chunk_retriever_fn' tool_kwargs={'query': 'Tesla 2023'}
Step handle_llm_input produced event ChunkRetrievalEvent
Running step handle_retrieval
Truncated chunk from 3588 to 1533 characters
Truncated chunk from 3098 to 1533 characters
Now have 2 chunks stored
Step handle_retrieval produced event InputEvent
Running step handle_llm_input
Have 2 chunks, proceeding to report generation
Step handle_llm_input produced event ReportGenerationEvent
Running step generate_report
Generating report with 2 chunks
Total character count: 3066
Truncated chunk from 1533 to 1533 characters
Using 2 chunks for final report generation
Step generate_report produced event StopEvent


In [47]:
ret["response"].response.render()

# Tesla Summary 2023

| | Category | Description |
|---|---|---|
| 0 | Mission | Accelerate the world's transition to sustainable energy |
| 1 | Products | High-performance fully electric vehicles, solar energy generation systems, and energy storage products |
| 2 | Services | Maintenance, installation, operation, charging, insurance, financial, and other services related to products |
| 3 | Focus | Increasingly focused on products and services based on artificial intelligence, robotics, and automation |


In 2023, Tesla produced 1,845,985 consumer vehicles and delivered 1,808,581 consumer vehicles. The company is currently focused on increasing vehicle production, capacity, and delivery capabilities.