<a href="https://colab.research.google.com/github/jupiturliu/financialagent/blob/main/warren_buffet_agent_v1_2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Disclaimer**: This agent is not intended as financial advice.  It is for informational and entertainment purposes only.  Do your own due diligence.

In [23]:
!pip install -U --quiet langgraph langchain_community langchain_anthropic langsmith langchain_google_genai

In [24]:
import getpass
import os

# Set your Anthropic API key
os.environ["ANTHROPIC_API_KEY"] = getpass.getpass()



In [25]:
# Set your OpenAI API key
#os.environ["OPENAI_API_KEY"] = getpass.getpass()

In [26]:
# Set your Google API key
os.environ["GOOGLE_API_KEY"] = getpass.getpass()



In [27]:
# You can get an API key here https://financialdatasets.ai/
os.environ["FINANCIAL_DATASETS_API_KEY"] = getpass.getpass()



In [28]:
# You can create an API key here https://smith.langchain.com/settings
os.environ["LANGCHAIN_TRACING_V2"] = "True"
os.environ["LANGCHAIN_API_KEY"] = getpass.getpass()



# Define the tools our agent can use

In [29]:
from langchain_core.tools import tool


@tool
def roe(
    net_income: float,
    equity: float,
) -> float:
    """
    Computes the return on equity (ROE) for a given company.
    Use this function to evaluate the profitability of a company.
    """
    return net_income / equity


@tool
def roic(
    operating_income: float,
    total_debt: float,
    equity: float,
    cash_and_equivalents: float,
    tax_rate: float = 0.35,
) -> float:
    """
    Computes the return on invested capital (ROIC) for a given company.
    Use this function to evaluate the efficiency of a company in generating returns from its capital.
    """
    net_operating_profit_after_tax = operating_income * (1 - tax_rate)
    invested_capital = total_debt + equity - cash_and_equivalents
    return net_operating_profit_after_tax / invested_capital


@tool
def owner_earnings(
    net_income: float,
    depreciation_amortization: float = 0.0,
    capital_expenditures: float = 0.0,
) -> float:
    """
    Calculates the owner earnings for a company based on the net income, depreciation/amortization, and capital expenditures.
    """
    return net_income + depreciation_amortization - capital_expenditures


@tool
def intrinsic_value(
    free_cash_flow: float,
    growth_rate: float = 0.05,
    discount_rate: float = 0.10,
    terminal_growth_rate: float = 0.02,
    num_years: int = 5,
) -> float:
    """
    Computes the discounted cash flow (DCF) for a given company based on the current free cash flow.
    Use this function to calculate the intrinsic value of a stock.
    """
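    # Estimate the future cash flows based on the growth rate.
    # Note: the first projected year uses the current free cash flow unchanged;
    # growth is applied from the second projected year onward.
    cash_flows = [free_cash_flow * (1 + growth_rate) ** i for i in range(num_years)]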

    # Calculate the present value of projected cash flows
    present_values = []
    for i in range(num_years):
        present_value = cash_flows[i] / (1 + discount_rate) ** (i + 1)
        present_values.append(present_value)

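    # Calculate the terminal value (perpetuity growth formula).
    # The formula is only defined when the discount rate exceeds the terminal growth rate.
    if discount_rate <= terminal_growth_rate:
        raise ValueError("discount_rate must be greater than terminal_growth_rate")
    terminal_value = cash_flows[-1] * (1 + terminal_growth_rate) / (discount_rate - terminal_growth_rate)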
    terminal_present_value = terminal_value / (1 + discount_rate) ** num_years

    # Sum up the present values and terminal value
    dcf_value = sum(present_values) + terminal_present_value

    return dcf_value
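
The formulas behind these tools are easy to check by hand. The sketch below repeats the same arithmetic as plain functions (the `_plain` names and every input figure are made up for illustration and are not part of the notebook), so the numbers can be verified without calling a model.

```python
def roe_plain(net_income: float, equity: float) -> float:
    """Return on equity: net income divided by shareholders' equity."""
    return net_income / equity


def roic_plain(operating_income, total_debt, equity, cash_and_equivalents, tax_rate=0.35):
    """Return on invested capital: after-tax operating profit over invested capital."""
    nopat = operating_income * (1 - tax_rate)
    invested_capital = total_debt + equity - cash_and_equivalents
    return nopat / invested_capital


def owner_earnings_plain(net_income, depreciation_amortization=0.0, capital_expenditures=0.0):
    """Owner earnings: net income plus D&A minus capital expenditures."""
    return net_income + depreciation_amortization - capital_expenditures


def dcf_plain(free_cash_flow, growth_rate=0.05, discount_rate=0.10,
              terminal_growth_rate=0.02, num_years=5):
    """Same DCF convention as the notebook: year 1 uses the current cash flow."""
    cash_flows = [free_cash_flow * (1 + growth_rate) ** i for i in range(num_years)]
    present_values = [cf / (1 + discount_rate) ** (i + 1) for i, cf in enumerate(cash_flows)]
    terminal_value = cash_flows[-1] * (1 + terminal_growth_rate) / (discount_rate - terminal_growth_rate)
    return sum(present_values) + terminal_value / (1 + discount_rate) ** num_years


# Hypothetical company, all figures in millions (invented for this example)
print("ROE:", roe_plain(net_income=20.0, equity=100.0))
print("ROIC:", roic_plain(operating_income=100.0, total_debt=50.0, equity=100.0, cash_and_equivalents=20.0))
print("Owner earnings:", owner_earnings_plain(20.0, depreciation_amortization=5.0, capital_expenditures=10.0))

# One projected year, no growth: easy to verify by hand
simple_dcf = dcf_plain(110.0, growth_rate=0.0, discount_rate=0.10, terminal_growth_rate=0.0, num_years=1)
print("DCF:", round(simple_dcf, 2))
```

For the last case the hand calculation is 110 / 1.1 = 100 for the single projected year, plus (110 / 0.10) / 1.1 = 1000 for the terminal value. In the notebook itself, the `@tool`-decorated versions are LangChain tool objects, so they would be exercised with a dictionary of arguments, for example `roe.invoke({"net_income": 20.0, "equity": 100.0})`.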

In [30]:
from langgraph.prebuilt import ToolNode

from langchain_community.tools import IncomeStatements, BalanceSheets, CashFlowStatements
from langchain_community.utilities.financial_datasets import FinancialDatasetsAPIWrapper

# Create the tools
api_wrapper = FinancialDatasetsAPIWrapper()
integration_tools = [
    IncomeStatements(api_wrapper=api_wrapper),
    BalanceSheets(api_wrapper=api_wrapper),
    CashFlowStatements(api_wrapper=api_wrapper),
]

local_tools = [intrinsic_value, roe, roic, owner_earnings]
tools = integration_tools + local_tools

tool_node = ToolNode(tools)

# Set up the LLM

In [31]:
from langchain_anthropic.chat_models import ChatAnthropic
from langchain_google_genai import ChatGoogleGenerativeAI
# Only needed for the OpenAI option below (requires `pip install langchain_openai`)
#from langchain_openai import ChatOpenAI

# Optional alternative model; defined here but not used by the agent below
gemini = ChatGoogleGenerativeAI(model="gemini-1.5-flash-latest")
# Choose the LLM that will drive the agent
model = ChatAnthropic(model="claude-3-5-sonnet-20240620", temperature=0).bind_tools(tools)
#model = ChatOpenAI(model="gpt-4-1106-preview", temperature=0).bind_tools(tools)

In [32]:
from langchain_core.messages import SystemMessage

system_prompt = """
You are an AI financial analyst with expertise in analyzing businesses using methods similar to those of Warren Buffett. Your task is to provide short, accurate, and concise answers to questions about company financials and performance.

Here are a few example questions and answers:
Example 1: {
  "question": "What was NVDA's net income for the fiscal year 2023?",
  "answer": "The net income for NVDA in 2023 was $2.8 billion.",
}

Example 2: {
  "question": "How did NVDA's gross profit in 2023 compare to its gross profit in 2022?",
  "answer": "In 2023, NVDA's gross profit increased by 12% compared to 2022.",
}

Example 3: {
  "question": "What was NVDA's revenue for the first quarter of 2024?",
  "answer": "NVDA's revenue for the first quarter of 2024 was $5.6 billion.",
}

Analyze these examples carefully. Notice how the answers are concise, specific, and directly address the questions asked. They provide precise financial figures and, when applicable, comparative analysis.

When answering questions:
1. Focus on providing accurate financial data and insights.
2. Use specific numbers and percentages when available.
3. Make comparisons between different time periods if relevant.
4. Keep your answers short, concise, and to the point.

Important: You must be short and concise with your answers.
"""

# Define the agent state

In [33]:
from typing import TypedDict, Annotated, Sequence
import operator
from langchain_core.messages import BaseMessage

class AgentState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], operator.add]
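
The `operator.add` annotation on `messages` acts as a reducer: when a node returns `{"messages": [...]}`, the returned list is appended to the existing one instead of replacing it. The sketch below imitates that merge step in plain Python, with strings standing in for message objects. `SketchState` and `apply_update` are hypothetical names used only for illustration; they are not LangGraph APIs.

```python
import operator
from typing import Annotated, Sequence, TypedDict, get_type_hints


class SketchState(TypedDict):
    # Same shape as AgentState, with plain strings instead of BaseMessage
    messages: Annotated[Sequence[str], operator.add]


def apply_update(state: dict, update: dict) -> dict:
    """Merge a node's partial update into the state using each key's reducer."""
    hints = get_type_hints(SketchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        reducer = hints[key].__metadata__[0]  # operator.add for "messages"
        merged[key] = reducer(list(state.get(key, [])), list(value))
    return merged


state = {"messages": ["human: What was AAPL's revenue in FY 2023?"]}
state = apply_update(state, {"messages": ["ai: (requests the income statement tool)"]})
print(state["messages"])
```

This is why `call_model` returns only the new response in a one-element list: the reducer takes care of appending it to the history.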

# Define the nodes

In [34]:
from typing import Literal
from langgraph.graph import END, StateGraph, MessagesState


# Define the function that determines whether to continue or not
def should_continue(state: MessagesState) -> Literal["tools", END]:
    messages = state['messages']
    last_message = messages[-1]
    # If the LLM makes a tool call, then we route to the "tools" node
    if last_message.tool_calls:
        return "tools"
    # Otherwise, we stop (reply to the user)
    return END

# Define the function that calls the model
def call_model(state: MessagesState):
    prompt = SystemMessage(
        content=system_prompt
    )
    # Get the messages
    messages = state['messages']

    # Prepend the system prompt unless it is already the first message.
    # Build a new list instead of mutating the state's list in place.
    if not messages or messages[0].content != system_prompt:
        messages = [prompt] + list(messages)

    # Call the model
    response = model.invoke(messages)
    # We return a list, because this will get added to the existing list
    return {"messages": [response]}

# Define the graph

In [35]:
from langgraph.checkpoint.memory import MemorySaver

# Define a new graph
workflow = StateGraph(MessagesState)

# Define the two nodes we will cycle between
workflow.add_node("agent", call_model)
workflow.add_node("tools", tool_node)

# Set the entrypoint as `agent`
# This means that this node is the first one called
workflow.set_entry_point("agent")

# We now add a conditional edge
workflow.add_conditional_edges(
    # First, we define the start node. We use `agent`.
    # This means these are the edges taken after the `agent` node is called.
    "agent",
    # Next, we pass in the function that will determine which node is called next.
    should_continue,
)

# We now add a normal edge from `tools` to `agent`.
# This means that after `tools` is called, `agent` node is called next.
workflow.add_edge("tools", 'agent')

# Initialize memory to persist state between graph runs
checkpointer = MemorySaver()

# Finally, we compile it!
# This compiles it into a LangChain Runnable,
# meaning you can use it as you would any other runnable.
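# Note that the checkpointer is NOT passed here, so every invoke starts from a
# fresh state. Use `workflow.compile(checkpointer=checkpointer)` instead to
# persist messages per `thread_id`.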
app = workflow.compile()
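
The compiled graph alternates between the `agent` and `tools` nodes until the model replies without requesting a tool. The loop below imitates that control flow in plain Python. `Msg`, `fake_agent`, `fake_tools`, and `run_loop` are stand-ins invented for this sketch (no model or API is called), and the `max_steps` guard is only an assumption about how one might bound such a loop.

```python
from dataclasses import dataclass, field


@dataclass
class Msg:
    role: str
    content: str
    tool_calls: list = field(default_factory=list)


def fake_agent(messages):
    """Stand-in for call_model: ask for a tool first, answer once a result exists."""
    if any(m.role == "tool" for m in messages):
        return Msg("ai", f"Answer based on: {messages[-1].content}")
    return Msg("ai", "", tool_calls=[{"name": "lookup", "args": {"ticker": "AAPL"}}])


def fake_tools(last_message):
    """Stand-in for the tool node: return one result message per requested call."""
    return [
        Msg("tool", f"{call['name']}({call['args']['ticker']}) -> stub result")
        for call in last_message.tool_calls
    ]


def run_loop(question, max_steps=10):
    messages = [Msg("human", question)]
    for _ in range(max_steps):
        reply = fake_agent(messages)
        messages.append(reply)
        if not reply.tool_calls:  # same test as should_continue
            return messages
        messages.extend(fake_tools(reply))
    raise RuntimeError("step limit reached without a final answer")


history = run_loop("What was AAPL's revenue in FY 2023?")
print([m.role for m in history])
```

The printed roles show one full cycle: the question, a tool request, the tool result, and the final answer.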

# Run the financial agent

In [36]:
from langchain_core.messages import HumanMessage

# Use the Runnable
final_state = app.invoke(
    # {"messages": [HumanMessage(content="What is NVDA's intrinsic value given a discount rate of 5%, growth rate of 10%, terminal growth rate of 3%?")]},
    {"messages": [HumanMessage(content="What was AAPL's revenue in FY 2023?")]},
    config={"configurable": {"thread_id": 42}}
)
final_state["messages"][-1].content

"Based on the income statement data provided, AAPL's revenue for the fiscal year 2023 was $383.285 billion."

# Create dataset of questions

In [37]:
questions = [
    {
        "question": "What was AAPL's revenue in the fiscal year ending September 30, 2023?",
        "answer": "$383,285,000,000",
        "context": "The revenue for AAPL in the fiscal year ending September 30, 2023 was 383285000000.0."
    },
    {
        "question": "What was AAPL's net income in the fiscal year 2023?",
        "answer": "$96,995,000,000",
        "context": "The net income for AAPL in the fiscal year 2023 was 96995000000.0."
    },
    {
        "question": "What was AAPL's earnings per share (diluted) in 2023?",
        "answer": "$6.13",
        "context": "The earnings per share diluted for AAPL in 2023 was 6.13."
    },
    {
        "question": "How much did AAPL pay in dividends per common share in 2023?",
        "answer": "$0.94",
        "context": "The dividends per common share for AAPL in 2023 was 0.94."
    },
    {
        "question": "What was AAPL's gross profit in the fiscal year 2022?",
        "answer": "$170,782,000,000",
        "context": "The gross profit for AAPL in the fiscal year 2022 was 170782000000.0."
    },
    {
        "question": "How much was AAPL's operating income in 2021?",
        "answer": "$108,949,000,000",
        "context": "The operating income for AAPL in 2021 was 108949000000.0."
    },
    {
        "question": "What was AAPL's revenue in the fiscal year 2020?",
        "answer": "$274,515,000,000",
        "context": "The revenue for AAPL in the fiscal year 2020 was 274515000000.0."
    },
    {
        "question": "How much did AAPL spend on selling, general, and administrative expenses in 2019?",
        "answer": "$18,245,000,000",
        "context": "The selling, general, and administrative expenses for AAPL in 2019 was 18245000000.0."
    },
    {
        "question": "What was AAPL's interest expense in the fiscal year 2018?",
        "answer": "$3,240,000,000",
        "context": "The interest expense for AAPL in the fiscal year 2018 was 3240000000.0."
    },
    {
        "question": "What was TSLA's total assets value as of December 31, 2023?",
        "answer": "$106,618,000,000",
        "context": "The total assets for TSLA as of December 31, 2023 was 106618000000.0."
    },
    {
        "question": "Calculate TSLA's current ratio for the year 2023.",
        "answer": "1.73",
        "context": "Current ratio = Current assets / Current liabilities. For 2023, Current assets = 49616000000.0, Current liabilities = 28748000000.0. 49616000000.0 / 28748000000.0 ≈ 1.73"
    },
    {
        "question": "What was the year-over-year growth rate of TSLA's property, plant, and equipment from 2022 to 2023?",
        "answer": "23.17%",
        "context": "2022 PP&E: 36635000000.0, 2023 PP&E: 45123000000.0. Growth rate = (45123000000.0 - 36635000000.0) / 36635000000.0 * 100 ≈ 23.17%"
    },
    {
        "question": "Calculate TSLA's debt-to-equity ratio for 2023.",
        "answer": "0.08",
        "context": "Debt-to-equity ratio = Total debt / Shareholders' equity. For 2023, Total debt = 5230000000.0, Shareholders' equity = 62634000000.0. 5230000000.0 / 62634000000.0 ≈ 0.08"
    },
    {
        "question": "What was the percentage increase in TSLA's inventory from 2022 to 2023?",
        "answer": "6.13%",
        "context": "2022 inventory: 12839000000.0, 2023 inventory: 13626000000.0. Percentage increase = (13626000000.0 - 12839000000.0) / 12839000000.0 * 100 ≈ 6.13%"
    },
    {
        "question": "Calculate the working capital for TSLA in 2023.",
        "answer": "$20,868,000,000",
        "context": "Working capital = Current assets - Current liabilities. For 2023, Current assets = 49616000000.0, Current liabilities = 28748000000.0. 49616000000.0 - 28748000000.0 = 20868000000.0"
    },
    {
        "question": "What was TSLA's quick ratio (acid-test ratio) for 2023?",
        "answer": "1.25",
        "context": "Quick ratio = (Current assets - Inventory) / Current liabilities. For 2023, (49616000000.0 - 13626000000.0) / 28748000000.0 ≈ 1.25"
    },
    {
        "question": "Calculate the year-over-year growth rate of TSLA's total liabilities from 2022 to 2023.",
        "answer": "18.03%",
        "context": "2022 total liabilities: 36440000000.0, 2023 total liabilities: 43009000000.0. Growth rate = (43009000000.0 - 36440000000.0) / 36440000000.0 * 100 ≈ 18.03%"
    },
    {
        "question": "What was the book value per share for TSLA at the end of 2023?",
        "answer": "$19.70",
        "context": "Book value per share = Shareholders' equity / Outstanding shares. For 2023, 62634000000.0 / 3178921391.0 ≈ 19.70"
    },
    {
        "question": "Calculate the percentage change in TSLA's cash and equivalents from 2022 to 2023.",
        "answer": "0.89%",
        "context": "2022 cash and equivalents: 16253000000.0, 2023 cash and equivalents: 16398000000.0. Percentage change = (16398000000.0 - 16253000000.0) / 16253000000.0 * 100 ≈ 0.89%"
    },
    {
        "question": "Calculate the year-over-year growth rate of MSFT's net cash flow from operations from FY 2023 to FY 2024.",
        "answer": "35.36%",
        "context": "FY 2023 net cash flow from operations: 87582000000.0, FY 2024: 118548000000.0. Growth rate = (118548000000.0 - 87582000000.0) / 87582000000.0 * 100 ≈ 35.36%"
    },
    {
        "question": "What was MSFT's free cash flow for the fiscal year 2024?",
        "answer": "$74,071,000,000",
        "context": "Free cash flow = Net cash flow from operations - Capital expenditure. For FY 2024: 118548000000.0 - 44477000000.0 = 74071000000.0"
    },
    {
        "question": "Calculate the cash flow to capital expenditure ratio for MSFT in FY 2024.",
        "answer": "2.67",
        "context": "Cash flow to capital expenditure ratio = Net cash flow from operations / Capital expenditure. For FY 2024: 118548000000.0 / 44477000000.0 ≈ 2.67"
    },
    {
        "question": "What percentage of MSFT's net cash flow from operations was spent on dividends in FY 2024?",
        "answer": "18.36%",
        "context": "Dividends: 21771000000.0, Net cash flow from operations: 118548000000.0. Percentage = (21771000000.0 / 118548000000.0) * 100 ≈ 18.36%"
    },
    {
        "question": "Calculate the year-over-year growth rate of MSFT's capital expenditure from FY 2023 to FY 2024.",
        "answer": "58.24%",
        "context": "FY 2023 capital expenditure: 28107000000.0, FY 2024: 44477000000.0. Growth rate = (44477000000.0 - 28107000000.0) / 28107000000.0 * 100 ≈ 58.24%"
    },
    {
        "question": "What was the cash flow coverage ratio for MSFT in FY 2024?",
        "answer": "1.79",
        "context": "Cash flow coverage ratio = Net cash flow from operations / (Dividends + Capital expenditure). For FY 2024: 118548000000.0 / (21771000000.0 + 44477000000.0) ≈ 1.79"
    },
    {
        "question": "Calculate the cash flow return on assets for MSFT in FY 2024, assuming total assets of $450 billion at the end of the fiscal year.",
        "answer": "26.34%",
        "context": "Cash flow return on assets = Net cash flow from operations / Total assets. For FY 2024: (118548000000.0 / 450000000000) * 100 ≈ 26.34%"
    },
    {
        "question": "What was the cash flow to debt ratio for MSFT in FY 2024, assuming total debt of $80 billion at the end of the fiscal year?",
        "answer": "1.48",
        "context": "Cash flow to debt ratio = Net cash flow from operations / Total debt. For FY 2024: 118548000000.0 / 80000000000 ≈ 1.48"
    },
    {
        "question": "Calculate the compound annual growth rate (CAGR) of MSFT's net cash flow from operations from FY 2020 to FY 2024.",
        "answer": "18.23%",
        "context": "FY 2020: 60675000000.0, FY 2024: 118548000000.0. CAGR = (118548000000.0 / 60675000000.0)^(1/4) - 1 ≈ 18.23%"
    },
    {
        "question": "What percentage of MSFT's net cash flow from operations was used for share repurchases in FY 2024?",
        "answer": "12.87%",
        "context": "Share repurchases (issuance or purchase of equity shares): 15252000000.0, Net cash flow from operations: 118548000000.0. Percentage = (15252000000.0 / 118548000000.0) * 100 ≈ 12.87%"
    }
]
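
Several reference answers are derived from figures quoted in the `context` field, so they can be recomputed before the dataset is uploaded. The sketch below re-derives four of them from the numbers given above. The variable names and the `checks` dictionary are illustrative choices, not part of the notebook.

```python
# Figures copied from the `context` strings in the dataset above
current_assets = 49_616_000_000
current_liabilities = 28_748_000_000
inventory = 13_626_000_000
ppe_2022, ppe_2023 = 36_635_000_000, 45_123_000_000
cfo_2024, capex_2024 = 118_548_000_000, 44_477_000_000

# name -> (recomputed value, value stated in the dataset)
checks = {
    "TSLA current ratio": (round(current_assets / current_liabilities, 2), 1.73),
    "TSLA quick ratio": (round((current_assets - inventory) / current_liabilities, 2), 1.25),
    "TSLA PP&E growth %": (round((ppe_2023 - ppe_2022) / ppe_2022 * 100, 2), 23.17),
    "MSFT free cash flow": (cfo_2024 - capex_2024, 74_071_000_000),
}

for name, (computed, stated) in checks.items():
    status = "ok" if computed == stated else "MISMATCH"
    print(f"{name}: computed={computed} stated={stated} [{status}]")
```

Extending `checks` to the remaining derived questions is a cheap way to catch arithmetic slips in hand-written reference answers.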

# Visualize dataset

In [38]:
import pandas as pd

# Convert to DataFrame and display
df = pd.DataFrame(questions)
print("\nDataFrame:")
display(df)


DataFrame:


| | question | answer | context |
|---|---|---|---|
| 0 | What was AAPL's revenue in the fiscal year end... | $383,285,000,000 | The revenue for AAPL in the fiscal year ending... |
| 1 | How much did AAPL spend on research and develo... | $29,915,000,000 | The research and development expense for AAPL ... |
| 2 | What was AAPL's net income in the fiscal year ... | $96,995,000,000 | The net income for AAPL in the fiscal year 202... |
| 3 | What was AAPL's earnings per share (diluted) i... | $6.13 | The earnings per share diluted for AAPL in 202... |
| 4 | How much did AAPL pay in dividends per common ... | $0.94 | The dividends per common share for AAPL in 202... |
| 5 | What was AAPL's gross profit in the fiscal yea... | $170,782,000,000 | The gross profit for AAPL in the fiscal year 2... |
| 6 | How much was AAPL's operating income in 2021? | $108,949,000,000 | The operating income for AAPL in 2021 was 1089... |
| 7 | What was AAPL's revenue in the fiscal year 2020? | $274,515,000,000 | The revenue for AAPL in the fiscal year 2020 w... |
| 8 | How much did AAPL spend on selling, general, a... | $18,245,000,000 | The selling, general, and administrative expen... |
| 9 | What was AAPL's interest expense in the fiscal... | $3,240,000,000 | The interest expense for AAPL in the fiscal ye... |


# Create dataset in LangSmith

In [39]:
# Split the dataset into parallel lists of questions and reference answers
inputs = [row["question"] for row in questions]
outputs = [row["answer"] for row in questions]

In [45]:
from langsmith import Client

# Create dataset
client = Client()
dataset_name = "warren-buffett-agent-test-0.0.6"
dataset = client.create_dataset(
    dataset_name=dataset_name,
    description="QA pairs for financial evaluation",
)
client.create_examples(
    inputs=[{"question": q} for q in inputs],
    outputs=[{"answer": a} for a in outputs],
    dataset_id=dataset.id,
)

# Evaluate

In [46]:
def predict_answer(example: dict):
    """Use this for answer evaluation"""
    question = example.get("question")

    final_state = app.invoke(
      {"messages": [HumanMessage(content=question)]},
      config={"configurable": {"thread_id": 42}}
    )
    answer = final_state["messages"][-1].content
    return {"answer": answer}
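
Every evaluated example triggers at least one request to a hosted model, and such APIs enforce per-minute rate limits. One common mitigation is to retry with exponential backoff. The sketch below is a generic helper under stated assumptions: `TransientError` is a hypothetical stand-in for whatever rate-limit exception the provider raises, and `call_with_backoff` is not part of LangSmith or LangChain.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a provider's rate-limit (HTTP 429) exception."""


def call_with_backoff(fn, *, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on TransientError wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except TransientError:
            if attempt == max_attempts - 1:
                raise  # give up after the final attempt
            delay = base_delay * 2 ** attempt + random.uniform(0, base_delay)
            sleep(delay)


# Demo with a function that fails twice before succeeding.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("429: rate limit exceeded")
    return "ok"


waits = []  # record the delays instead of actually sleeping
result = call_with_backoff(flaky, base_delay=0.01, sleep=waits.append)
print(result, attempts["count"], len(waits))
```

A wrapper like this could surround the `app.invoke` call inside `predict_answer`, with `TransientError` swapped for the exception type the chosen provider actually raises.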

In [44]:
from langsmith.evaluation import LangChainStringEvaluator, evaluate

eval_llm = ChatAnthropic(temperature=0.0, model="claude-3-5-sonnet-20240620")

# Evaluator: an LLM grades each prediction against the reference answer
qa_evaluator = [
    LangChainStringEvaluator(
        "qa",
        prepare_data=lambda run, example: {
            "prediction": run.outputs["answer"],
            "reference": example.outputs["answer"],
            "input": example.inputs["question"],
        },
        config={"llm": eval_llm},
    ),
]
experiment_results = evaluate(
    predict_answer,
    data=dataset_name,
    evaluators=qa_evaluator,
    experiment_prefix="financial-rag-qa",
    metadata={
        "version": "1.0.0",
        "revision_id": "beta",
    },
)

View the evaluation results for experiment: 'financial-rag-qa-a0635999' at:
https://smith.langchain.com/o/ef4c0279-1de6-5abe-aee3-f48ba8ccc824/datasets/125a05b5-d9e6-4fe1-9bdf-ed4542d97058/compare?selectedSessions=752522de-a3d0-4a76-b262-b2184654a93d




0it [00:00, ?it/s]

ERROR:langsmith.evaluation._runner:Error running evaluator <DynamicRunEvaluator evaluate> on run 1568848d-d8d5-4ffa-81ff-53fb6aac769f: RateLimitError("Error code: 429 - {'type': 'error', 'error': {'type': 'rate_limit_error', 'message': 'Number of request tokens has exceeded your per-minute rate limit (https://docs.anthropic.com/en/api/rate-limits); see the response headers for current usage. Please reduce the prompt length or the maximum tokens requested, or try again later. You may also contact sales at https://www.anthropic.com/contact-sales to discuss your options for a rate limit increase.'}}")
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/langsmith/evaluation/_runner.py", line 1313, in _run_evaluators
    evaluator_response = evaluator.evaluate_run(
  File "/usr/local/lib/python3.10/dist-packages/langsmith/evaluation/evaluator.py", line 278, in evaluate_run
    result = self.func(
  File "/usr/local/lib/python3.10/dist-packages/langsmith/run_he