<a href="https://colab.research.google.com/github/run-llama/llama_index/blob/main/docs/docs/examples/usecases/10k_sub_question.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Solariix demo for 10-K Analysis of the 4 largest US banks
Answer complex queries by decomposing them into simpler sub-queries.
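
The decomposition idea can be illustrated without any LLM. The sketch below is a toy, not LlamaIndex's implementation: the `Engine` type, the stub engines, and `answer_with_sub_questions` are hypothetical names, and the final synthesis step is replaced by simple concatenation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a per-document query engine: question in, answer out.
Engine = Callable[[str], str]


def answer_with_sub_questions(
    sub_questions: List[Tuple[str, str]],
    engines: Dict[str, Engine],
) -> str:
    """Route each (tool_name, sub_question) pair to its engine and join the answers."""
    answers = []
    for tool_name, question in sub_questions:
        answers.append(f"[{tool_name}] {engines[tool_name](question)}")
    # A real engine would ask an LLM to synthesize; here we only concatenate.
    return "\n".join(answers)


# Stub engines that echo the question (assumption: real engines query an index).
engines = {
    "bank_a": lambda q: f"stub answer to: {q}",
    "bank_b": lambda q: f"stub answer to: {q}",
}
sub_questions = [
    ("bank_a", "What was revenue in 2023?"),
    ("bank_b", "What was revenue in 2023?"),
]
print(answer_with_sub_questions(sub_questions, engines))
```

In the notebook itself, an LLM generates the sub-questions and a second LLM call combines the sub-answers into one response.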

On Colab, install LlamaIndex 🦙.

In [1]:
%pip install llama-index-llms-openai


In [2]:
!pip install llama-index


In [3]:
# patch asyncio to allow nested event loops.
import nest_asyncio

nest_asyncio.apply()

In [4]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.llms.openai import OpenAI

from llama_index.core.tools import QueryEngineTool, ToolMetadata
from llama_index.core.query_engine import SubQuestionQueryEngine

## Configure LLM service

In [5]:
# ensure a valid OpenAI API key is defined
import openai
from google.colab import userdata

openai.api_key = userdata.get('OPENAI_API_KEY')
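
Outside Colab, `google.colab.userdata` is not available, so a fallback to an environment variable may be useful. The helper below is a minimal sketch; `resolve_api_key` is a hypothetical name, and it only assumes the key lives either in a value you pass in or in `OPENAI_API_KEY`.

```python
import os


def resolve_api_key(colab_secret=None, env_var="OPENAI_API_KEY"):
    """Return the passed-in secret if set, else the environment variable.

    Raises RuntimeError when neither source provides a key, so a missing
    key fails here rather than at the first LLM call.
    """
    key = colab_secret or os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set {env_var} or add it to Colab secrets.")
    return key
```

In Colab the value returned by `userdata.get('OPENAI_API_KEY')` can be passed as `colab_secret`; elsewhere, call `resolve_api_key()` with no arguments.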

In [6]:
from llama_index.core import Settings

Settings.llm = OpenAI(temperature=0.2, model="gpt-3.5-turbo")

## Connect Drive

In [7]:
# ensure the 10-K PDFs are in the directory specified below
from google.colab import drive
import os
drive.mount('/content/drive')

drive_path = '/content/drive/My Drive/UCF/SP24/FIN6777'
files = os.listdir(drive_path)
for file in files:
    print(file)





Mounted at /content/drive
jpmc-10k-2023.pdf
citi-10k-2023.pdf
boa-10k-2023.pdf
wf-10k-2023.pdf
ipynb
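
Rather than checking the listing by eye, the four expected file names can be verified programmatically before loading. This is a sketch using only the standard library; `missing_files` and `EXPECTED_PDFS` are hypothetical names, and the file names are the ones shown in the listing above.

```python
from pathlib import Path

# File names taken from the directory listing above.
EXPECTED_PDFS = [
    "jpmc-10k-2023.pdf",
    "citi-10k-2023.pdf",
    "boa-10k-2023.pdf",
    "wf-10k-2023.pdf",
]


def missing_files(directory, expected=EXPECTED_PDFS):
    """Return the expected file names that are not present in `directory`."""
    present = {p.name for p in Path(directory).iterdir()}
    return [name for name in expected if name not in present]


# Usage (in the notebook): missing_files(drive_path) should return [].
```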


## Load 10-K for top 4 US banks

In [8]:
boa_docs = SimpleDirectoryReader(
    input_files=["/content/drive/My Drive/UCF/SP24/FIN6777/boa-10k-2023.pdf"]
).load_data()

citi_docs = SimpleDirectoryReader(
    input_files=["/content/drive/My Drive/UCF/SP24/FIN6777/citi-10k-2023.pdf"]
).load_data()
wf_docs = SimpleDirectoryReader(
    input_files=["/content/drive/My Drive/UCF/SP24/FIN6777/wf-10k-2023.pdf"]
).load_data()
jpmc_docs = SimpleDirectoryReader(
    input_files=["/content/drive/My Drive/UCF/SP24/FIN6777/jpmc-10k-2023.pdf"]
).load_data()



## Build indices

In [9]:
citi_index = VectorStoreIndex.from_documents(citi_docs)

In [10]:
wf_index = VectorStoreIndex.from_documents(wf_docs)

In [11]:
boa_index = VectorStoreIndex.from_documents(boa_docs)

In [12]:
jpmc_index = VectorStoreIndex.from_documents(jpmc_docs)

## Build query engines

In [13]:
jpmc_engine = jpmc_index.as_query_engine(similarity_top_k=3)

In [14]:
boa_engine = boa_index.as_query_engine(similarity_top_k=3)

In [15]:
citi_engine = citi_index.as_query_engine(similarity_top_k=3)

In [16]:
wf_engine = wf_index.as_query_engine(similarity_top_k=3)

In [17]:
query_engine_tools = [
    QueryEngineTool(
        query_engine=jpmc_engine,
        metadata=ToolMetadata(
            name="jpmc_10k",
            description=(
                "Provides information about JPMC financials for year 2023"
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=boa_engine,
        metadata=ToolMetadata(
            name="boa_10k",
            description=(
                "Provides information about Bank of America financials for year 2023"
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=wf_engine,
        metadata=ToolMetadata(
            name="wf_10k",
            description=(
                "Provides information about Wells Fargo financials for year 2023"
            ),
        ),
    ),
    QueryEngineTool(
        query_engine=citi_engine,
        metadata=ToolMetadata(
            name="citi_10k",
            description=(
                "Provides information about Citigroup financials for year 2023"
            ),
        ),
    ),
]

s_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=query_engine_tools
)
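
The tool descriptions are the text the sub-question generator sees when it decides which tool to route a question to, so their wording matters. One way to keep the four descriptions consistent is to generate them from a single template. The sketch below is an assumption-labeled alternative, not part of the original notebook; `BANKS` and `tool_specs` are hypothetical names.

```python
# Map each tool name to the company it covers (names from the cell above).
BANKS = {
    "jpmc_10k": "JPMC",
    "boa_10k": "Bank of America",
    "wf_10k": "Wells Fargo",
    "citi_10k": "Citigroup",
}


def tool_specs(banks, year=2023):
    """Build (name, description) pairs from one template string."""
    template = "Provides information about {company} financials for year {year}"
    return [
        (name, template.format(company=company, year=year))
        for name, company in banks.items()
    ]


for name, description in tool_specs(BANKS):
    print(f"{name}: {description}")
```

Each pair can then be passed to `ToolMetadata(name=..., description=...)` alongside the matching query engine.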

## Run queries

In [18]:
response = s_engine.query(
    "How did the strategic priorities outlined in the 2023 10-K reports of the top 4 US banks align with their financial performance outcomes, and what does this reveal about the evolving landscape of the financial industry"
)

Generated 8 sub questions.
[jpmc_10k] Q: What were the strategic priorities outlined in the 2023 10-K report of JPMC?
[jpmc_10k] Q: How did the financial performance outcomes of JPMC in 2023 compare to the strategic priorities outlined in their 10-K report?
[boa_10k] Q: What were the strategic priorities outlined in the 2023 10-K report of Bank of America?
[boa_10k] Q: How did the financial performance outcomes of Bank of America in 2023 compare to the strategic priorities outlined in their 10-K report?
[wf_10k] Q: What were the strategic priorities outlined in the 2023 10-K report of Well Fargo?
[wf_10k] Q: How did the financial performance outcomes of Well Fargo in 2023 compare to the strategic priorities outlined in their 10-K report?
[citi_10k] Q: What were the strategic priorities outlined in the 2023 10-

In [19]:
import textwrap

max_line_length = 80
processed_response = textwrap.wrap(response.response, width=max_line_length, break_long_words=False, break_on_hyphens=False)
print('\n'.join(processed_response))


The strategic priorities outlined in the 2023 10-K reports of the top 4 US banks
aligned with their financial performance outcomes to varying degrees. While some
banks showed alignment between their strategic priorities and financial results,
others faced challenges in fully realizing their outlined priorities. This
reveals that the evolving landscape of the financial industry demands a careful
balance between strategic planning and operational execution. The ability to
effectively manage risks, control expenses, innovate, attract talent, and adapt
to changing market conditions is crucial for banks to navigate the complexities
of the financial industry and achieve sustainable growth and profitability.
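
The same wrapping code recurs after every query in this notebook, so it can be factored into a small helper. This is a sketch with a hypothetical name, `wrap_text`, that uses the same `textwrap.wrap` options as the cell above.

```python
import textwrap


def wrap_text(text, width=80):
    """Wrap `text` to `width` columns without splitting long words or hyphenated terms."""
    lines = textwrap.wrap(
        text,
        width=width,
        break_long_words=False,
        break_on_hyphens=False,
    )
    return "\n".join(lines)


# Usage (in the notebook): print(wrap_text(response.response))
print(wrap_text("Answer complex queries by decomposing them. " * 4, width=40))
```

Note that `textwrap.wrap` replaces newlines with spaces by default, so paragraph breaks in a response are flattened into one block.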


In [19]:
response = s_engine.query(
    "Compare revenue growth of the two largest US banks from 2022 to 2023"
)

Generated 4 sub questions.
[jpmc_10k] Q: What was the revenue of JPMC in 2022?
[jpmc_10k] Q: What was the revenue of JPMC in 2023?
[boa_10k] Q: What was the revenue of Bank of America in 2022?
[boa_10k] Q: What was the revenue of Bank of America in 2023?
[boa_10k] A: Bank of America's revenue in 2022 was $21,748 million.
[jpmc_10k] A: The revenue of JPMC in 2023 was $70.1 billion.
[boa_10k] A: The revenue of Bank of America in 2023 was $21,105 million.
[jpmc_10k] A: The revenue of JPMC in 2022 was $48.8 billion.

In [20]:
import textwrap
max_line_length = 80
processed_response = textwrap.wrap(response.response, width=max_line_length, break_long_words=False, break_on_hyphens=False)
print('\n'.join(processed_response))

The revenue growth of JPMC from 2022 to 2023 was $21.3 billion, while the
revenue growth of Bank of America during the same period was a decrease of $643
million.
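
The response reports absolute changes only, and the sub-answers mix units (billions for JPMC, millions for Bank of America). The arithmetic can be checked, and percentage growth added, with a few lines of Python. The inputs below are copied from the sub-answers above; they are LLM output and have not been verified against the filings. `growth` is a hypothetical helper name.

```python
def growth(prev, curr):
    """Return (absolute change, percentage change) from prev to curr."""
    change = curr - prev
    return change, 100.0 * change / prev


# Figures as reported in the sub-answers (not independently verified).
jpmc_abs, jpmc_pct = growth(48.8, 70.1)      # USD billions
boa_abs, boa_pct = growth(21_748, 21_105)    # USD millions

print(f"JPMC: {jpmc_abs:+.1f} billion ({jpmc_pct:+.1f}%)")
print(f"BoA:  {boa_abs:+,} million ({boa_pct:+.1f}%)")
```

The absolute changes match the synthesized answer: an increase of 21.3 billion for JPMC and a decrease of 643 million for Bank of America.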


In [22]:
response = s_engine.query(
    "Can you list the greatest risks the four largest banks face collectively and independently?"
)

Generated 5 sub questions.
[jpmc_10k] Q: What are the greatest risks faced collectively by JPMC, Bank of America, Wells Fargo, and Citigroup?
[jpmc_10k] Q: What are the greatest risks faced independently by JPMC?
[boa_10k] Q: What are the greatest risks faced independently by Bank of America?
[wf_10k] Q: What are the greatest risks faced independently by Wells Fargo?
[citi_10k] Q: What are the greatest risks faced independently by Citigroup?
[citi_10k] A: The greatest risks faced independently by Citigroup include credit risk, operational risk (specifically cybersecurity risk), and regulatory risk.
[wf_10k] A: The greatest risks faced independently by Wells Fargo include operational risks related to reliance on third-party service providers, potential disruptions in service, failure to meet regulatory require

In [23]:
max_line_length = 80
processed_response = textwrap.wrap(response.response, width=max_line_length, break_long_words=False, break_on_hyphens=False)
print('\n'.join(processed_response))

Collectively, the greatest risks faced by JPMC, Bank of America, Wells Fargo,
and Citigroup include concentration or contagion risks, consumer credit
expansion risks, market and credit risks during dislocations, liquidity
constraints, capital risks, operational risks, strategic risks, conduct risks,
reputation risks, intra-day credit risks, risks associated with transactions
with government entities, and disputes with counterparties to derivatives
contracts. Independently, JPMC faces regulatory risks, political risks, market
risks, credit risks, liquidity risks, capital risks, operational risks,
strategic risks, conduct risks, and reputation risks. Bank of America faces
risks related to economic disruptions, credit losses, regulatory responses,
changes in accounting standards, cybersecurity incidents, geopolitical events,
climate change impacts, health emergencies, and natural disasters. Wells Fargo's
greatest risks include operational risks related to reliance on third-party
service p

In [24]:
response = s_engine.query(
    "Based on your response to the list of the greatest risks the four largest banks face, can you list the top 3 risks that these banks face moving forward?"
)

Generated 4 sub questions.
[jpmc_10k] Q: What are the top 3 risks that JPMC faces moving forward?
[boa_10k] Q: What are the top 3 risks that Bank of America faces moving forward?
[wf_10k] Q: What are the top 3 risks that Wells Fargo faces moving forward?
[citi_10k] Q: What are the top 3 risks that Citigroup faces moving forward?
[jpmc_10k] A: The top 3 risks that JPMorgan Chase (JPMC) faces moving forward are regulatory risks, market risks, and credit risks.
[wf_10k] A: Wells Fargo faces the top 3 risks moving forward related to its mortgage servicing obligations, competitive landscape in the financial services industry, and potential cyber attacks or information security incidents.
[citi_10k] A: The top 3 risks that Citigroup faces moving forward are regulatory scrutiny and legal proceedings, risks associate

In [25]:
max_line_length = 80
processed_response = textwrap.wrap(response.response, width=max_line_length, break_long_words=False, break_on_hyphens=False)
print('\n'.join(processed_response))

The top 3 risks that the four largest banks (JPMorgan Chase, Bank of America,
Wells Fargo, and Citigroup) face moving forward are regulatory risks, market
risks, and credit risks.
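
The generated sub-questions for this follow-up make no reference to the earlier answer, which suggests each `query` call is handled on its own, without memory of previous responses. If a follow-up should build on an earlier answer, one option is to include that answer in the query text. The helper below is a sketch under that assumption; `with_context` and its `max_chars` limit are hypothetical, not a LlamaIndex feature.

```python
def with_context(previous_answer, follow_up, max_chars=1500):
    """Prefix a follow-up question with a (truncated) earlier answer."""
    # Truncate to keep the combined prompt short; the limit is arbitrary.
    context = previous_answer.strip()[:max_chars]
    return f"Earlier answer:\n{context}\n\nFollow-up question: {follow_up}"


# Usage (in the notebook), assuming `earlier` holds the previous response text:
# response = s_engine.query(with_context(earlier, "List the top 3 of these risks."))
print(with_context("Risk A, Risk B, Risk C.", "Which of these is most severe?"))
```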
