<div style="background-color:#000;"><img src="pqn.png"></img></div>

We import libraries for data modeling, document extraction, and workflow management.

In [None]:
from pydantic import BaseModel, Field
from typing import Optional, List, Dict

from llama_cloud_services import LlamaExtract
from llama_cloud.core.api_error import ApiError
from llama_cloud import ExtractConfig

from llama_index.core.workflow import (
    Event,
    StartEvent,
    StopEvent,
    Context,
    Workflow,
    step,
)
from llama_index.utils.workflow import draw_all_possible_flows
from llama_index.llms.openai import OpenAI
from llama_index.core.llms.llm import LLM
from llama_index.core.prompts import ChatPromptTemplate

import nest_asyncio
nest_asyncio.apply()

from dotenv import load_dotenv
load_dotenv()

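Pydantic handles data modeling, LlamaExtract handles structured extraction from documents, and LlamaIndex workflows with an OpenAI LLM handle orchestration and text generation. Calling nest_asyncio.apply() lets us run async code inside the notebook's already-running event loop, and load_dotenv() reads API keys from a local .env file. We use these pieces to build a structured workflow for financial analysis and report generation.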

## Define data models

We create data models to structure our financial information and analysis outputs.

In [None]:
class RawFinancials(BaseModel):
    revenue: Optional[float] = Field(
        None, description="Extracted revenue (in million USD)"
    )
    operating_income: Optional[float] = Field(
        None, description="Extracted operating income (in million USD)"
    )
    eps: Optional[float] = Field(None, description="Extracted earnings per share")
    # Add more metrics as needed


class InitialFinancialDataOutput(BaseModel):
    company_name: str = Field(
        ..., description="Company name as extracted from the earnings deck"
    )
    ticker: str = Field(..., description="Stock ticker symbol")
    report_date: str = Field(..., description="Date of the earnings deck/report")
    raw_financials: RawFinancials = Field(
        ..., description="Structured raw financial metrics"
    )
    narrative: Optional[str] = Field(
        None, description="Additional narrative content (if any)"
    )


# Define the structured output schema for each company's financial model
class FinancialModelOutput(BaseModel):
    revenue_projection: float = Field(
        ..., description="Projected revenue for next year (in million USD)"
    )
    operating_income_projection: float = Field(
        ..., description="Projected operating income for next year (in million USD)"
    )
    growth_rate: float = Field(..., description="Expected revenue growth rate (%)")
    discount_rate: float = Field(
        ..., description="Discount rate (%) used for valuation"
    )
    terminal_growth_rate: float = Field(
        ..., description="Terminal growth rate (%) used in the model"
    )
    valuation_estimate: float = Field(
        ..., description="Estimated enterprise value (in million USD)"
    )
    key_assumptions: str = Field(
        ..., description="Key assumptions such as tax rate, CAPEX ratio, etc."
    )
    summary: str = Field(
        ..., description="A brief summary of the preliminary financial model analysis."
    )


class ComparativeAnalysisOutput(BaseModel):
    comparative_analysis: str = Field(
        ..., description="Comparative analysis between Company A and Company B"
    )
    overall_recommendation: str = Field(
        ..., description="Overall investment recommendation with rationale"
    )


# Define the final equity research memo schema, which aggregates the outputs for Company A and B
class FinalEquityResearchMemoOutput(BaseModel):
    company_a_model: FinancialModelOutput = Field(
        ..., description="Financial model summary for Company A"
    )
    company_b_model: FinancialModelOutput = Field(
        ..., description="Financial model summary for Company B"
    )
    comparative_analysis: ComparativeAnalysisOutput = Field(
        ..., description="Comparative analysis between Company A and Company B"
    )


We define five Pydantic models. RawFinancials and InitialFinancialDataOutput describe what we extract from each earnings deck, FinancialModelOutput describes the projected financial model for one company, ComparativeAnalysisOutput holds the comparison and recommendation, and FinalEquityResearchMemoOutput aggregates everything into the final memo. Each field carries a description, which also tells the extraction agent and the LLM what to put there.
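To see what these schemas do at runtime, here is a minimal sketch that validates a hand-written dictionary against a copy of RawFinancials. The sample values are made up for illustration and do not come from any real filing.

```python
from typing import Optional

from pydantic import BaseModel, Field, ValidationError


class RawFinancials(BaseModel):
    revenue: Optional[float] = Field(
        None, description="Extracted revenue (in million USD)"
    )
    operating_income: Optional[float] = Field(
        None, description="Extracted operating income (in million USD)"
    )
    eps: Optional[float] = Field(None, description="Extracted earnings per share")


# Illustrative input only; a numeric string is coerced to float
sample = {"revenue": "1250.5", "eps": 0.42}
financials = RawFinancials.model_validate(sample)
print(financials.revenue)           # parsed as a float
print(financials.operating_income)  # missing optional field falls back to None

# Invalid input raises a ValidationError instead of passing through silently
try:
    RawFinancials.model_validate({"revenue": "not a number"})
    error_count = 0
except ValidationError as exc:
    error_count = len(exc.errors())
print(error_count)
```

Because every field in RawFinancials is optional, a deck that omits a metric still validates, and the missing value shows up as None rather than as an error.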

## Set up LlamaExtract agent

In [None]:
llama_extract = LlamaExtract(
    project_id="e8aabe96-8170-4987-a058-168961a97375",
    organization_id="14b36159-7d91-4d2d-8048-d8ce28654ef3",
)

try:
    existing_agent = llama_extract.get_agent(name="automotive-sector-analysis")
    if existing_agent:
        llama_extract.delete_agent(existing_agent.id)
except ApiError as e:
    if e.status_code == 404:
        pass
    else:
        raise

extract_config = ExtractConfig(
    extraction_mode="BALANCED"
    # extraction_mode="MULTIMODAL"
)

agent = llama_extract.create_agent(
    name="automotive-sector-analysis",
    data_schema=InitialFinancialDataOutput,
    config=extract_config,
)

We set up the LlamaExtract agent for our automotive sector analysis. We initialize the LlamaExtract client with our project and organization IDs, delete any existing agent named automotive-sector-analysis (ignoring the 404 error raised when none exists), and create a fresh agent in balanced extraction mode. The agent uses our InitialFinancialDataOutput schema for data extraction.
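If you would rather not hard-code the project and organization IDs, one option is to read them from the environment, which load_dotenv() has already populated from the .env file. This is a minimal sketch: the helper name and both environment variable names are hypothetical choices for this example, not names defined by LlamaExtract.

```python
import os
from typing import Dict, Mapping


def llama_extract_kwargs(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect optional LlamaExtract keyword arguments from environment variables."""
    # Hypothetical variable names chosen for this sketch
    mapping = {
        "project_id": "LLAMA_CLOUD_PROJECT_ID",
        "organization_id": "LLAMA_CLOUD_ORGANIZATION_ID",
    }
    # Skip variables that are unset or empty so they are simply not passed
    return {kwarg: env[var] for kwarg, var in mapping.items() if env.get(var)}


# Example with an explicit mapping instead of the real environment
print(llama_extract_kwargs({"LLAMA_CLOUD_PROJECT_ID": "my-project-id"}))
```

You could then write llama_extract = LlamaExtract(**llama_extract_kwargs()) in place of the literal IDs.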

## Define workflow events and classes

In [None]:
# Define custom events for each step
class DeckAParseEvent(Event):
    deck_content: InitialFinancialDataOutput


class DeckBParseEvent(Event):
    deck_content: InitialFinancialDataOutput


class CompanyModelEvent(Event):
    model_output: FinancialModelOutput


class ComparableDataLoadEvent(Event):
    company_a_output: FinancialModelOutput
    company_b_output: FinancialModelOutput


class LogEvent(Event):
    msg: str
    delta: bool = False


class AutomotiveSectorAnalysisWorkflow(Workflow):
    """
    Workflow to generate an equity research memo for automotive sector analysis.
    """

    def __init__(
        self,
        agent: LlamaExtract,
        modeling_path: str,
        llm: Optional[LLM] = None,
        **kwargs
    ):
        super().__init__(**kwargs)
        self.agent = agent
        self.llm = llm or OpenAI(model="gpt-4o")
        # Load financial modeling assumptions from file
        with open(modeling_path, "r") as f:
            self.modeling_data = f.read()

    async def _parse_deck(self, ctx: Context, deck_path) -> InitialFinancialDataOutput:
        extraction_result = await self.agent.aextract(deck_path)
        # Structured fields following the agent's InitialFinancialDataOutput schema
        initial_output = extraction_result.data
        ctx.write_event_to_stream(LogEvent(msg="Earnings deck parsed successfully."))
        return initial_output

    @step
    async def parse_deck_a(self, ctx: Context, ev: StartEvent) -> DeckAParseEvent:
        initial_output = await self._parse_deck(ctx, ev.deck_path_a)
        await ctx.set("initial_output_a", initial_output)
        return DeckAParseEvent(deck_content=initial_output)

    @step
    async def parse_deck_b(self, ctx: Context, ev: StartEvent) -> DeckBParseEvent:
        initial_output = await self._parse_deck(ctx, ev.deck_path_b)
        await ctx.set("initial_output_b", initial_output)
        return DeckBParseEvent(deck_content=initial_output)

    async def _generate_financial_model(
        self, ctx: Context, financial_data: InitialFinancialDataOutput
    ) -> FinancialModelOutput:
        prompt_str = """
    You are an expert financial analyst.
    Using the following raw financial data from an earnings deck and financial modeling assumptions,
    refine the data to produce a financial model summary. Adjust the assumptions based on the company-specific context.
    Please use the most recent quarter's financial data from the earnings deck.

    Raw Financial Data:
    {raw_data}
    Financial Modeling Assumptions:
    {assumptions}

    Return your output as JSON conforming to the FinancialModelOutput schema.
    You MUST make sure all fields are filled in the output JSON.

    """
        prompt = ChatPromptTemplate.from_messages([("user", prompt_str)])
        refined_model = await self.llm.astructured_predict(
            FinancialModelOutput,
            prompt,
            raw_data=financial_data.model_dump_json(),
            assumptions=self.modeling_data,
        )
        return refined_model

    @step
    async def refine_financial_model_company_a(
        self, ctx: Context, ev: DeckAParseEvent
    ) -> CompanyModelEvent:
        print("deck content A", ev.deck_content)
        refined_model = await self._generate_financial_model(ctx, ev.deck_content)
        print("refined_model A", refined_model)
        print(type(refined_model))
        await ctx.set("CompanyAModelEvent", refined_model)
        return CompanyModelEvent(model_output=refined_model)

    @step
    async def refine_financial_model_company_b(
        self, ctx: Context, ev: DeckBParseEvent
    ) -> CompanyModelEvent:
        print("deck content B", ev.deck_content)
        refined_model = await self._generate_financial_model(ctx, ev.deck_content)
        print("refined_model B", refined_model)
        print(type(refined_model))
        await ctx.set("CompanyBModelEvent", refined_model)
        return CompanyModelEvent(model_output=refined_model)

    @step
    async def cross_reference_models(
        self, ctx: Context, ev: CompanyModelEvent
    ) -> Optional[StopEvent]:
        # Each branch stores its model in the context; proceed only once both are present
        company_a_model = await ctx.get("CompanyAModelEvent", default=None)
        company_b_model = await ctx.get("CompanyBModelEvent", default=None)
        if company_a_model is None or company_b_model is None:
            # The other company's model is not ready yet; wait for its event
            return None

        prompt_str = """
    You are an expert investment analyst.
    Compare the following refined financial models for Company A and Company B.
    Based on this comparison, provide a specific investment recommendation for Tesla (Company A).
    Focus your analysis on:
    1. Key differences in revenue projections, operating income, and growth rates
    2. Valuation estimates and their implications
    3. Clear recommendation for Tesla with supporting rationale

    Return your output as JSON conforming to the ComparativeAnalysisOutput schema.
    You MUST make sure all fields are filled in the output JSON.

    Company A Model:
    {company_a_model}

    Company B Model:
    {company_b_model}
        """
        prompt = ChatPromptTemplate.from_messages([("user", prompt_str)])
        comp_analysis = await self.llm.astructured_predict(
            ComparativeAnalysisOutput,
            prompt,
            company_a_model=company_a_model.model_dump_json(),
            company_b_model=company_b_model.model_dump_json(),
        )
        final_memo = FinalEquityResearchMemoOutput(
            company_a_model=company_a_model,
            company_b_model=company_b_model,
            comparative_analysis=comp_analysis,
        )
        return StopEvent(result={"memo": final_memo})

We define custom events and a workflow class for our automotive sector analysis. The workflow includes steps for parsing financial decks, generating financial models, and performing comparative analysis. Each step is defined as an asynchronous method within the AutomotiveSectorAnalysisWorkflow class.
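The workflow has a fan-out and fan-in shape: the two decks are parsed and modeled independently, and the comparison runs only once both models exist. The sketch below mimics that shape with plain asyncio rather than the LlamaIndex workflow API. All function names and return values are stand-ins for the extraction and LLM calls.

```python
import asyncio


async def parse_deck(path: str) -> dict:
    # Stand-in for the extraction call on one earnings deck
    await asyncio.sleep(0)
    return {"source": path}


async def build_model(deck: dict) -> dict:
    # Stand-in for the LLM call that produces a financial model
    await asyncio.sleep(0)
    return {"model_for": deck["source"]}


async def run_branch(path: str) -> dict:
    # One branch: parse a deck, then build its model
    return await build_model(await parse_deck(path))


async def run_analysis(path_a: str, path_b: str) -> dict:
    # Fan out: both branches run concurrently. Fan in: gather waits for both.
    model_a, model_b = await asyncio.gather(run_branch(path_a), run_branch(path_b))
    return {"company_a_model": model_a, "company_b_model": model_b}


memo = asyncio.run(run_analysis("deck_a.pdf", "deck_b.pdf"))
print(memo)
```

In the real workflow the join is done differently: each refine step stores its model in the context, and cross_reference_models returns early until both models are present. Inside a notebook you would write await run_analysis(...) instead of asyncio.run(...).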

## Generate workflow diagram and run analysis

In [None]:
draw_all_possible_flows(
    AutomotiveSectorAnalysisWorkflow,
    filename="data/automotive_sector_analysis_workflow.html",
)

In [None]:
modeling_path = "data/modeling_assumptions.txt"
workflow = AutomotiveSectorAnalysisWorkflow(
    agent=agent, modeling_path=modeling_path, verbose=True, timeout=240
)

In [None]:
result = await workflow.run(
    deck_path_a="data/tesla_q2_earnings.pdf",
    deck_path_b="data/ford_q2_earnings_press_release.pdf",
)
final_memo = result["memo"]
print("\n********Final Equity Research Memo:********\n", final_memo)

We generate a visual representation of our workflow and then run the analysis. The workflow processes earnings reports for Tesla and Ford, generates financial models, and produces a comparative analysis. The final result is an equity research memo that provides insights and recommendations based on the analysis of both companies.
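Printing the memo object directly gives one long line. As a minimal sketch, you could lay the two models side by side instead. Here plain dictionaries stand in for what final_memo.company_a_model.model_dump() and final_memo.company_b_model.model_dump() would return, the helper name is hypothetical, and the numbers are placeholders rather than real results.

```python
from typing import Dict, List


def comparison_rows(
    model_a: Dict[str, float], model_b: Dict[str, float], fields: List[str]
) -> List[str]:
    """Format selected numeric metrics from two model dicts as aligned text rows."""
    rows = [f"{'metric':<28}{'Company A':>12}{'Company B':>12}"]
    for name in fields:
        rows.append(f"{name:<28}{model_a[name]:>12.1f}{model_b[name]:>12.1f}")
    return rows


# Placeholder numbers for illustration only
model_a = {"revenue_projection": 100.0, "growth_rate": 5.0}
model_b = {"revenue_projection": 80.0, "growth_rate": 2.5}

rows = comparison_rows(model_a, model_b, ["revenue_projection", "growth_rate"])
print("\n".join(rows))
```

The helper formats numeric fields only; text fields such as key_assumptions and summary are better printed separately.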

<a href="https://pyquantnews.com/">PyQuant News</a> is where finance practitioners level up with Python for quant finance, algorithmic trading, and market data analysis. Looking to get started? Check out the fastest growing, top-selling course to <a href="https://gettingstartedwithpythonforquantfinance.com/">get started with Python for quant finance</a>. For educational purposes. Not investment advice. Use at your own risk.