<a href="https://colab.research.google.com/github/ashater/creditreviews/blob/main/CreditAnnualReview_tool_use_added_at_front.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



##Install and Import

In [None]:
# ! pip install "unstructured[pdf]"

# ! pip install langchain
# ! pip install langchain-anthropic
# ! pip install -U langchain-community

# ! pip install docarray
# ! pip install gpt4all > /dev/null

# ! apt-get install poppler-utils
# ! pip install pymupdf

# ! apt install tesseract-ocr
# ! apt install libtesseract-dev
# ! pip install tesseract


In [None]:
import anthropic
from langchain_anthropic import ChatAnthropic

from langchain.prompts import ChatPromptTemplate
from langchain.chains import RetrievalQA
from langchain_community.document_loaders import UnstructuredPDFLoader
from langchain_community.vectorstores import DocArrayInMemorySearch
from IPython.display import display, Markdown

from langchain.indexes import VectorstoreIndexCreator
from langchain_community.embeddings import GPT4AllEmbeddings

import fitz
from google.colab import userdata

##Set up tools

In [None]:
tool_definition_financial_data_lookup = {
    "name": "get_financial_data",
    "description": "Retrieves the financial metric of a given company, at a given date.",
    "input_schema": {
        "type": "object",
        "properties": {
            "ticker": {
                "type": "string",
                "description": "The company's stock ticker to fetch financial data for. For example, JP Morgan's stock ticker is JPM, and JPM is the expected input to the function."
            },
            "metric": {
                "type": "string",
                "enum": ["EBIDA", "EPS", "stock price"],
                "description": "The financial metric to fetch"
            },
            "date": {
                "type": "string",
                "description": "The date on which the metric was calculated, as a string in 'YYYY-MM-DD' format."
            }
        },
        "required": ["ticker", "metric", "date"]
    }
}
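
The schema above tells the model what to send, but nothing on the Python side enforces it. The following is a minimal sketch of checking a tool input before dispatching it. `validate_tool_input` and the local `SCHEMA` copy are hypothetical helpers, not part of the Anthropic SDK, and the date check is an added assumption because the schema describes the format only in prose.

```python
from datetime import datetime

# Local copy of the relevant schema parts so the sketch runs on its own
SCHEMA = {
    "properties": {
        "ticker": {"type": "string"},
        "metric": {"type": "string", "enum": ["EBIDA", "EPS", "stock price"]},
        "date": {"type": "string"},
    },
    "required": ["ticker", "metric", "date"],
}

def validate_tool_input(tool_input: dict, schema: dict = SCHEMA) -> list[str]:
    """Return a list of problems found; an empty list means the input looks valid."""
    problems = []
    for field in schema["required"]:
        if field not in tool_input:
            problems.append(f"missing required field: {field}")
    for field, value in tool_input.items():
        spec = schema["properties"].get(field)
        if spec is None:
            problems.append(f"unexpected field: {field}")
            continue
        if spec.get("type") == "string" and not isinstance(value, str):
            problems.append(f"{field} must be a string")
        elif "enum" in spec and value not in spec["enum"]:
            problems.append(f"{field} must be one of {spec['enum']}")
    # The schema only describes the date format in prose, so check it explicitly
    date = tool_input.get("date")
    if isinstance(date, str):
        try:
            datetime.strptime(date, "%Y-%m-%d")
        except ValueError:
            problems.append("date must follow 'YYYY-MM-DD' format")
    return problems

print(validate_tool_input({"ticker": "JPM", "metric": "EBIDA", "date": "2023-12-31"}))
print(validate_tool_input({"ticker": "JPM", "metric": "revenue", "date": "2023 YE"}))
```

Returning a list of problems rather than raising on the first one makes it possible to report every issue back to the model in a single tool result.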

In [None]:
def get_financial_data(ticker: str, metric: str, date: str) -> float:
    """Return the financial metric of a given company at a given date.

    Use this function for any question about the reading of a specific financial metric.

    ticker: ticker of the company.
    metric: one of "EBIDA", "EPS", or "stock price".
    date: the date on which the metric was calculated, as a 'YYYY-MM-DD' string.

    Returns the financial data as a number. The values below are placeholders.
    """
    if metric == "EBIDA":
        return 1

    if metric == "EPS":
        return 2

    if metric == "stock price":
        return 3

    # Fail loudly on an unsupported metric instead of silently returning None
    raise ValueError(f"Unsupported metric: {metric}")

##Set up LLM

In [None]:
# via Langchain
# llm = ChatAnthropic(model='claude-3-sonnet-20240229'
#                     , api_key = userdata.get('ANTHROPIC_API_KEY')
#                     , tools=[tool_definition_financial_data_lookup])

# Native API: LangChain does not seem to support multi-variable tools very well
client = anthropic.Anthropic(api_key = userdata.get('ANTHROPIC_API_KEY'))

def get_response(prompt):

  message = client.messages.create(
      model = "claude-3-sonnet-20240229",
      max_tokens = 1000,
      temperature = 0.0,
      tools=[tool_definition_financial_data_lookup],
      # Adjacent string literals avoid the stray indentation that
      # backslash continuations would embed in the prompt
      system = ("You are a credit risk officer in an international investment bank. "
                "When asked, you respond concisely. "
                "You have access to tools, but only use them when necessary. "
                "If a tool is not required, respond as normal."),
      messages = [
          {"role": "user", "content": prompt}
      ]
  )

  if message.stop_reason == "tool_use":
    # Select the tool_use block explicitly; a text block may precede it
    tool_use = next(block for block in message.content if block.type == "tool_use")
    tool_name = tool_use.name

    if tool_name == "get_financial_data":
      try:
        tool_return = get_financial_data(
                          ticker = tool_use.input['ticker'],
                          metric = tool_use.input['metric'],
                          date = tool_use.input['date'])
      except ValueError as e:
        return f"Error: {str(e)}"

      # Message that would carry the result back to Claude in a follow-up call.
      # It is built here for reference only and is not sent yet.
      tool_response = {
          "role": "user",
          "content": [
            {
              "type": "tool_result",
              "tool_use_id": tool_use.id,
              "content": str(tool_return)
            }
          ]
      }

      return f"The tool return is: {tool_return}"

  elif message.stop_reason == "end_turn":
      # Claude answered directly without requesting a tool
      return "Claude responded with: " + message.content[0].text

  return message
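
`get_response` stops once the tool has run locally. In the Messages API tool-use flow, the result is normally sent back in a second request so that the model can phrase the final answer. Below is a minimal sketch of assembling that message list. `build_follow_up_messages` and the `toolu_example` id are hypothetical, the assistant content is a hand-written stand-in for a real response, and the second `client.messages.create` call is deliberately left out.

```python
def build_follow_up_messages(prompt, assistant_content, tool_use_id, tool_return):
    """Assemble the message list for a second call that returns the tool result."""
    return [
        {"role": "user", "content": prompt},
        # Echo back the assistant turn that requested the tool
        {"role": "assistant", "content": assistant_content},
        {
            "role": "user",
            "content": [
                {
                    "type": "tool_result",
                    "tool_use_id": tool_use_id,
                    # Convert to str so the result is sent as text
                    "content": str(tool_return),
                }
            ],
        },
    ]

# Hypothetical values standing in for a real API response
assistant_content = [
    {
        "type": "tool_use",
        "id": "toolu_example",
        "name": "get_financial_data",
        "input": {"ticker": "JPM", "metric": "EBIDA", "date": "2023-12-31"},
    }
]

messages = build_follow_up_messages(
    "what's JP Morgan's EBIDA at 2023 YE?", assistant_content, "toolu_example", 1
)
print(messages[-1])
```

In the notebook, `assistant_content` would be `message.content` from the first call and `tool_use_id` would be `tool_use.id`.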

###Test out tools

In [None]:
test_query = "what's JP Morgan's EBIDA at 2023 YE?"

In [None]:
response = get_response(test_query)

In [None]:
response

'The tool return is: 1'

In [None]:
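# tool_return is local to get_response, so call the stub directly to inspect its value
get_financial_data(ticker = "JPM", metric = "EBIDA", date = "2023-12-31")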

1

##Specify LLM and set up query

In [None]:
llm = ChatAnthropic(model='claude-3-sonnet-20240229'
                    , api_key = userdata.get('ANTHROPIC_API_KEY'))

In [None]:
# set up query

system_prompt = (
    "You are a credit risk officer in an international investment bank. "
    "You are to provide a quarterly update on the fundamentals and credit quality of the company specified. "
    "Specifically, you are to summarize based on sections "
    "of the company's financial statements provided. "
    "The financial statements shall include current and historical 10-K, 10-Q and earnings call transcripts. "
    "Please keep the answer concise. "
)

user_prompt = (
    "Can you summarize company {company}'s {query} and provide a view on projected performance and forward-looking sentiment? "
    "The requirement is: {query_description} "
    "The financial statements provided will follow a Python dictionary format, "
    "where the keys are the file names, and the values are the relevant extracts from each file. "
    "The file names indicate the type of financial statement (e.g. 10-K) and the reporting period (e.g. 2023Q4). "
    "{docs}"
)

prompt_template = ChatPromptTemplate.from_messages(
    [
        ("system", system_prompt),
        ("human", user_prompt),
    ]
)

In [None]:
query = 'financial updates'
query_description = """
  Include brief commentary about performance on the quarter or YTD period.
  Touch on factors impacting revenue, cost structure, and cash flow.
  Keep in mind a perceived weakness vs. a clearly defined weakness.

  This shall include a high-level summary of management's discussion detailing the fiscal year's year-over-year comparison.
  This shall include a high-level view of the trends noted in the financial tables within the section,
  such as revenue growth, expense growth, segment growth, etc.
  """

##Construct relevant sections from documents

In [None]:
# load files and chunk to elements

pdf_names = ['JPM-10k-2022.pdf'
          , 'JPM-earning call transcript 2022Q4.pdf'
          , 'JPM-10K-2021.pdf'] # the 10-K files have already been trimmed down

file_to_elements = {}

for pdf_name in pdf_names:
  loader = UnstructuredPDFLoader(pdf_name, strategy = 'hi_res', infer_table_structure = True, model_name = 'yolox')
  elements = loader.load_and_split()
  file_to_elements[pdf_name] = elements
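
`load_and_split` turns each PDF into a list of smaller documents, which is why the counts printed further down differ per file. The sketch below illustrates the general idea of splitting text into overlapping chunks. It is a simplified fixed-size character splitter written for illustration, not the splitter LangChain applies, and `chunk_text` with its default sizes is hypothetical.

```python
def chunk_text(text: str, chunk_size: int = 40, overlap: int = 10) -> list[str]:
    """Split text into fixed-size chunks that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks

# Placeholder text: the digits 0-9 repeated to 100 characters
sample = "".join(str(i % 10) for i in range(100))
chunks = chunk_text(sample)
print(len(chunks), [len(c) for c in chunks])
```

The overlap keeps a sentence that straddles a boundary visible in both neighboring chunks, at the cost of some duplicated text.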

In [None]:
# sanity check, to delete later
for pdf_name, elements in file_to_elements.items():
  print(pdf_name, len(elements))

JPM-10k-2022.pdf 3
JPM-earning call transcript 2022Q4.pdf 24
JPM-10K-2021.pdf 3


In [None]:
# set up embeddings
# picked a random free one

model_name = "all-MiniLM-L6-v2.gguf2.f16.gguf"
gpt4all_kwargs = {'allow_download': True}
embeddings = GPT4AllEmbeddings(
    model_name=model_name,
    gpt4all_kwargs=gpt4all_kwargs
)

In [None]:
file_to_docs = {}

for pdf_name, elements in file_to_elements.items():
    db = DocArrayInMemorySearch.from_documents(
      elements,
      embeddings
      )
    docs = db.similarity_search(query + '. '+ query_description)
    file_to_docs[pdf_name] = docs
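
`similarity_search` embeds the query and returns the stored chunks whose embeddings are closest to it. The following sketch shows that ranking step in plain Python, assuming cosine similarity as the metric. The three-dimensional vectors and chunk labels are made-up toy values, and `top_k` is a hypothetical helper rather than anything from LangChain or DocArray.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=4):
    """Return the names of the k vectors most similar to the query."""
    scored = [(cosine_similarity(query_vec, vec), name) for name, vec in doc_vecs.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]

# Toy embeddings; real ones come from the embedding model and have far more dimensions
doc_vecs = {
    "revenue discussion": [0.9, 0.1, 0.0],
    "legal proceedings": [0.0, 0.2, 0.9],
    "expense trends": [0.7, 0.6, 0.1],
}
query_vec = [1.0, 0.3, 0.0]

print(top_k(query_vec, doc_vecs, k=2))
```

Because the query here concatenates `query` and the long `query_description`, its embedding blends several topics, which may dilute the ranking compared with a shorter, focused query.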

In [None]:
# sanity check, to delete later
for pdf_name, elements in file_to_docs.items():
  print(pdf_name, len(elements))

JPM-10k-2022.pdf 3
JPM-earning call transcript 2022Q4.pdf 4
JPM-10K-2021.pdf 3


In [None]:
file_to_docs_for_inputs = {}

for pdf_name, docs in file_to_docs.items():
  file_to_docs_for_inputs[pdf_name] = '\n\n'.join([doc.page_content for doc in docs])

In [None]:
prompt = prompt_template.format_messages(
                    company = 'JP Morgan',
                    query= query,
                    query_description = query_description,
                    docs = file_to_docs_for_inputs)
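
Passing the dictionary as `docs` means its string representation is what lands in the prompt. The sketch below uses plain `str.format` as a stand-in for the template (an assumption made to keep the example dependency-free) to show how that representation escapes newlines, alongside one possible alternative that renders a labeled section per file. The excerpt text is placeholder content.

```python
# Hypothetical stand-in for file_to_docs_for_inputs with placeholder text
file_to_docs_for_inputs = {
    "JPM-10k-2022.pdf": "first excerpt\n\nsecond excerpt",
}

template = "Statements:\n{docs}"

# Passing the dict directly inserts its repr, so newlines appear as literal "\n" escapes
as_dict = template.format(docs=file_to_docs_for_inputs)

# Alternative: one labeled section per file, so the excerpts keep their line breaks
sections = "\n\n".join(
    f"[{name}]\n{text}" for name, text in file_to_docs_for_inputs.items()
)
as_sections = template.format(docs=sections)

print(as_dict)
print(as_sections)
```

If the sectioned form were adopted, the sentence in `user_prompt` that describes a Python dictionary format would need to change to match.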

In [None]:
response = llm.invoke(prompt)

In [None]:
display(Markdown(response.content))
# cost is 3 cents

Based on the financial statements and management discussion, here are the key updates on JPMorgan Chase's performance and outlook:

Q4 2022 and Full Year 2022 Performance:

- Net income was $11.0 billion in Q4 2022 and $37.7 billion for full year 2022, down 22% from 2021 driven by higher provision for credit losses and lower noninterest revenue.

- Total net revenue was $35.6 billion in Q4, up 17% year-over-year, and $128.7 billion for the full year, up 6%.
  - Net interest income (ex-Markets) was up significantly, driven by higher rates and loan growth
  - Noninterest revenue was down, impacted by lower investment banking fees, securities losses, and lower mortgage/auto revenues

- Provision for credit losses was $2.3 billion in Q4 and $6.4 billion for the year, reflecting reserve builds due to loan growth and a deteriorating economic outlook.

- Firm continues to see solid consumer spending trends, though cash buffers are normalizing. Wholesale loan growth remains strong.

Outlook for 2023:

- JPMorgan expects full year 2023 net interest income of ~$73 billion, driven by higher rates, partially offset by expected deposit repricing.

- Modest overall loan growth projected, with Card revolving balances expected to be a tailwind.

- Expense outlook incorporates investments in business and technology, while managing compensation/volume-related expenses prudently.

- Credit costs expected to remain elevated as economic conditions weaken.

- Firm reached 13% CET1 ratio target ahead of schedule and plans to resume share repurchases in Q1 2023.

So while JPMorgan faced headwinds in 2022 from the capital markets environment, it benefited from higher rates and continues to see relatively solid client activity. However, the economic outlook has weighed on credit reserves and profitability outlook.

In [None]:
# this is the prior response for comparison
display(Markdown(response))

Based on the provided excerpts from JPMorgan Chase's 2022 Form 10-K filing, here is a summary update on the company's financials and fundamentals as a credit risk officer:

JPMorgan Chase reported net income of $37.7 billion for full year 2022, down 22% from the prior year. Return on equity was 14% and return on tangible common equity was 18% for the year.

On the revenue side:
- Total net revenue was $128.7 billion, up 6% year-over-year
- Net interest income was $66.7 billion, up 28% driven by higher rates and loan growth, offset partially by lower Markets net interest income
- Net interest income excluding Markets was up 40% to $62.4 billion

The significant increase in net interest income was a positive factor aided by the higher rate environment. However, the decline in Markets' net interest income impacted overall growth.

Non-interest revenue details were not provided, so assessing fee income sources is difficult. Cost structure and operating leverage trends are also unclear from the given information.

Overall, JPMorgan showed revenue growth in 2022 but profitability was pressured. The higher rate environment supported net interest income, but other business weaknesses like lower Markets revenues impacted earnings growth. A more comprehensive review of non-interest revenues, expense management and capital positioning would provide a fuller perspective.