Tools and Agent

# Build Tools and Agent

* LangChain: https://python.langchain.com/docs/modules/agents/


## Configure OpenAI Settings

In [1]:
import os
import openai
from dotenv import load_dotenv
# Set up Azure OpenAI
load_dotenv()

openai.api_type = "azure"

AZURE_OPENAI_API_VERSION = os.getenv("AAG_AZURE_OPENAI_API_VERSION")
openai.api_version = AZURE_OPENAI_API_VERSION

AZURE_OPENAI_API_KEY = os.getenv("AAG_AZURE_OPENAI_API_KEY", "").strip()
assert AZURE_OPENAI_API_KEY, "ERROR: Azure OpenAI Key is missing"
openai.api_key = AZURE_OPENAI_API_KEY

AZURE_OPENAI_ENDPOINT = os.getenv("AAG_AZURE_OPENAI_ENDPOINT", "").strip()
assert AZURE_OPENAI_ENDPOINT, "ERROR: Azure OpenAI Endpoint is missing"
openai.api_base = AZURE_OPENAI_ENDPOINT

# Deployment for Chat
# DEPLOYMENT_NAME_CHAT = os.getenv('DEPLOYMENT_NAME_CHAT')
DEPLOYMENT_NAME_CHAT = os.getenv('AAG_DEPLOYMENT_NAME_CHAT_16K')

# Deployment for embedding
DEPLOYMENT_NAME_EMBEDDING = os.getenv("AAG_DEPLOYMENT_NAME_EMBEDDING")
model: str = DEPLOYMENT_NAME_EMBEDDING

# Azure AI Search (Cognitive vector store)
vector_store_address: str = os.getenv("AAG_AZURE_SEARCH_SERVICE_ENDPOINT")
vector_store_password: str = os.getenv("AAG_AZURE_SEARCH_ADMIN_KEY")
# index_name: str = "langchain-vector-arxiv-physics"

# Bing Search subscription key
BING_SUBSCRIPTION_KEY = os.getenv("BING_SUBSCRIPTION_KEY")
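
The settings above all follow the same read, strip, and check pattern. A minimal sketch of a helper that does this in one place is shown below. `require_env` and the `EXAMPLE_API_KEY` variable are hypothetical names used only for illustration; they are not part of the original notebook.

```python
import os


def require_env(name: str, default: str = "") -> str:
    """Return the stripped value of an environment variable, or raise if it is empty."""
    value = os.getenv(name, default).strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demo with a made-up variable name so the sketch runs anywhere.
os.environ["EXAMPLE_API_KEY"] = "  secret-value  "
print(require_env("EXAMPLE_API_KEY"))
```

Passing a default of `""` to `os.getenv` means a missing variable produces the explicit error message instead of an `AttributeError` from calling `.strip()` on `None`.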

## LangChain Setup

In [2]:
from langchain.chat_models import AzureChatOpenAI
# Azure OpenAI model
llm = AzureChatOpenAI(
    azure_deployment="gpt-35-turbo-16k",
    temperature=0,
    azure_endpoint=AZURE_OPENAI_ENDPOINT,
    openai_api_type="azure",
    api_key=AZURE_OPENAI_API_KEY,
)

## Prepare Tools for Agent

### Tool: Bing Search

In [3]:
from langchain.utilities import BingSearchAPIWrapper
import os

os.environ["BING_SUBSCRIPTION_KEY"] = BING_SUBSCRIPTION_KEY
os.environ["BING_SEARCH_URL"] = "https://api.bing.microsoft.com/v7.0/search"

azure_bing_search = BingSearchAPIWrapper()

# Unit test
# azure_bing_search.run("When did Sam Altman get fired?")

In [4]:
from langchain.agents import initialize_agent, AgentType
from langchain.agents import Tool

# Define two tools that wrap the same Bing search function but carry different descriptions
azure_bing_search_01 = Tool(
    name="bing search for all",
    func=azure_bing_search.run, 
    description="search for any information that is available"
)

azure_bing_search_02 = Tool(
    name="bing search on current events",
    func=azure_bing_search.run,
    description="useful for when you need to answer questions about current events"
)

# unit_test = False
unit_test = True
if unit_test:
    # Unit test
    # Tools available to the test agent
    bing_search_tools = [
        azure_bing_search_01,
        azure_bing_search_02,
        ]

    # tools = load_tools(["bing-search"], llm=llm)
    test_agent = initialize_agent(
        bing_search_tools,
        llm,
        agent="zero-shot-react-description",
        verbose=True,
        handle_parsing_errors=True,
    )

    # Now let's test it out!
    result = test_agent.run("Who was Zebra CEO in 2022?")



> Entering new AgentExecutor chain...
I need to find information about the CEO of Zebra in 2022.
Action: bing search for all
Action Input: "Zebra CEO 2022"
Observation: LINCOLNSHIRE, Ill. – Dec. 8, <b>2022</b> – <b>Zebra</b> Technologies Corporation (NASDAQ: ZBRA), an innovator at the front line of business with solutions and partners that deliver a performance edge, today announced <b>Zebra</b>’s Chief Product &amp; Solutions Officer Bill Burns will become the company’s next <b>CEO</b> and a member of the <b>Zebra</b> Board of Directors ... Matt Ausman was named Chief Information Officer, <b>Zebra</b> Technologies in August <b>2022</b> following his role as acting CIO, which he assumed in March <b>2022</b>. He has nearly 20 years of experience in the IT sector, and he joined <b>Zebra</b> Technologies in 2021. Prior to his role as <b>Zebra</b>’s CIO, Matt served as the company’s Vice President of Enterprise ... <b>Zebra</b> Technologies Names Bil

In [5]:
if unit_test:
    result = test_agent.run("Who was the Zebra CEO right before Bill Burns?")



> Entering new AgentExecutor chain...
I need to find out who the CEO of Zebra was before Bill Burns.
Action: bing search for all
Action Input: "Zebra CEO before Bill Burns"
Observation: LINCOLNSHIRE, Ill. – Dec. 8, 2022 – <b>Zebra</b> Technologies Corporation (NASDAQ: ZBRA), an innovator at the front line of business with solutions and partners that deliver a performance edge, today announced <b>Zebra</b>’s Chief Product &amp; Solutions Officer <b>Bill</b> <b>Burns</b> will become the company’s next <b>CEO</b> and a member of the <b>Zebra</b> Board of Directors ... LINCOLNSHIRE, Ill.--(BUSINESS WIRE)-- <b>Zebra</b> Technologies Corporation (NASDAQ: ZBRA), an innovator at the front line of business with solutions and partners that deliver a performance edge, today announced <b>Zebra</b>’s Chief Product &amp; Solutions Officer <b>Bill</b> <b>Burns</b> will become the company’s next <b>CEO</b> and a member of the <b>Zebra</b> Board of Directors; And
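
The Bing observations above contain `<b>` highlight tags and HTML entities such as `&amp;`, which the agent receives verbatim. The sketch below shows one way the snippets could be cleaned before they reach the LLM. `clean_snippet` is a hypothetical helper, not something the notebook or LangChain provides, and the sample string is copied from the observation text above.

```python
import html
import re

# Matches any HTML tag such as <b> or </b>.
_TAG_RE = re.compile(r"<[^>]+>")


def clean_snippet(text: str) -> str:
    """Remove HTML tags, decode entities, and collapse runs of whitespace."""
    no_tags = _TAG_RE.sub("", text)
    unescaped = html.unescape(no_tags)
    return re.sub(r"\s+", " ", unescaped).strip()


raw = "<b>Zebra</b>’s Chief Product &amp; Solutions Officer <b>Bill</b> <b>Burns</b>"
print(clean_snippet(raw))
```

If this cleanup turned out to help, one option would be to pass a wrapper such as `lambda q: clean_snippet(azure_bing_search.run(q))` as the tool's `func`; whether that improves the agent's answers here is untested.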

### Tool: FAISS DB retriever

In [6]:
from langchain.embeddings.openai import OpenAIEmbeddings
from langchain.vectorstores import FAISS

# Get Azure OpenAI embedding
embeddings: OpenAIEmbeddings = OpenAIEmbeddings(
    deployment=model,
    model=model,
    chunk_size=1,
    openai_api_base=AZURE_OPENAI_ENDPOINT,
    openai_api_type="azure",
    api_key=AZURE_OPENAI_API_KEY,
)

# Load the local FAISS index for 8-K filings, created earlier with the same embeddings
faiss_db_8K = FAISS.load_local("faiss_index_8-K", embeddings)
# Optional: expose this index as its own retriever
# faiss_retriever_8K = faiss_db_8K.as_retriever()

# Load the local FAISS index for 10-Q filings
faiss_db_10Q = FAISS.load_local("faiss_index_10-Q", embeddings)
# faiss_retriever_10Q = faiss_db_10Q.as_retriever()

# Load the local FAISS index for 10-K filings
faiss_db_10K = FAISS.load_local("faiss_index_10-K", embeddings)
# faiss_retriever_10K = faiss_db_10K.as_retriever()

In [7]:
# Merge the FAISS indexes into one (note: faiss_db_10Q is loaded above but is not merged here)
faiss_db = FAISS.load_local("faiss_index", embeddings)
faiss_db.merge_from(faiss_db_10K)
faiss_db.merge_from(faiss_db_8K)

# Set up the retriever we want to use, and then turn it into a retriever tool
faiss_retriever = faiss_db.as_retriever()

In [None]:
# List FAISS index
# faiss_db.docstore._dict

In [8]:
from langchain.agents.agent_toolkits import create_retriever_tool

# Define the retriever tool
faiss_retriever_all = create_retriever_tool(
    faiss_retriever,
    "SEC data search",
    "Use this tool to look up information about a company in its U.S. Securities and Exchange Commission (SEC) filings, such as 10-K, 10-Q, and 8-K reports.",
)

# faiss_retriever_8K = create_retriever_tool(
#     faiss_retriever_8K,
#     "search_company_sec_filings_8K",
#     "This search is using SEC filing 8-K information.",
# )

# faiss_retriever_10Q = create_retriever_tool(
#     faiss_retriever_10Q,
#     "search_company_sec_filings_10Q",
#     "This search is using SEC filing 10-Q information.",
# )

# faiss_retriever_10K = create_retriever_tool(
#     faiss_retriever_10K,
#     "search_company_sec_filings_10K",
#     "This search is using SEC filing 10-K information.",
# )

In [9]:
# unit_test = False
unit_test = True
if unit_test:
    # Unit test
    # Tools available to the test agent
    faiss_retriever_tools = [
        # faiss_retriever_01,
        # faiss_retriever_02,
        faiss_retriever_all,
        ]

    # tools = load_tools(["bing-search"], llm=llm)
    test_agent = initialize_agent(
        faiss_retriever_tools,
        llm,
        agent="zero-shot-react-description",
        verbose=True,
        handle_parsing_errors=True,
        max_iterations=5,
    )

# Now let's test it out!
test_agent.run("What are General Business and Industry Risks for Zebra Technologies?")



> Entering new AgentExecutor chain...
I should use the SEC data search tool to find information about Zebra Technologies and its general business and industry risks.
Action: SEC data search
Action Input: Zebra Technologies
Observation: [Document(page_content='11/25/23, 7:00 PM zbra-20221231\nhttps://www.sec.gov/Archives/edgar/data/877212/000087721223000025/zbra-20221231.htm 9/132Table of Contents\nOperations and Technologies\nOur operations consist of two reportable segments that provide complementary offerings to our customers: Asset Intelligence &\nTracking (“AIT”), which includes barcode and card printing, supplies and services; and Enterprise Visibility & Mobility (“EVM”),\nwhich includes mobile computing, data capture, RFID, fixed industrial scanning and machine vision, services and workflow\noptimization solutions including location solutions.\nAsset Intelligence & Tracking\nBarcode and Card Printing: We design, manufacture, and sell printe

'Zebra Technologies operates in the Asset Intelligence & Tracking (AIT) and Enterprise Visibility & Mobility (EVM) segments. They provide a range of products and services including printers, barcode scanners, RFID readers, and industrial scanners, as well as maintenance and technical support services.'

### Tool: Yahoo Finance News

In [None]:
# ! pip install yfinance

In [10]:
from langchain.agents import AgentType, initialize_agent
from langchain.tools.yahoo_finance_news import YahooFinanceNewsTool

# This is a built-in tool provided by LangChain
yahoo_finance_news = YahooFinanceNewsTool()

print(yahoo_finance_news.description)
print(yahoo_finance_news.name)

yahoo_tools = [yahoo_finance_news]

# tools = load_tools(["bing-search"], llm=llm)
test_agent = initialize_agent(
    yahoo_tools,
    llm,
    # agent="zero-shot-react-description",
    agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION,
    verbose=True,
    handle_parsing_errors=True,
)

Useful for when you need to find financial news about a public company. Input should be a company ticker. For example, AAPL for Apple, MSFT for Microsoft.
yahoo_finance_news


In [11]:
# Unit test
test_agent.run(
    "Any news on Zebra Technologies?"
)



> Entering new AgentExecutor chain...
I should use the yahoo_finance_news tool to find news about Zebra Technologies.
Action: yahoo_finance_news
Action Input: ZBRA
Observation: Zebra (ZBRA) Up 19.5% Since Last Earnings Report: Can It Continue?
Zebra (ZBRA) reported earnings 30 days ago. What's next for the stock? We take a look at earnings estimates for some clues.

Zebra Technologies' (ZBRA) Solution Chosen by Rockland Trust
Zebra Technologies (ZBRA) offers its Zebra Workcloud software solutions to boost Rockland Trust's customer experience by improving branch operations, and increasing visibility and communication.

Zebra Technologies (ZBRA) Unveils Android-Based Mobile Computers
Zebra Technologies (ZBRA) launches HC20 and HC50 Android-based handheld mobile computers, which broaden the company's healthcare solutions portfolio.

Here's Why Hold Strategy is Apt for Zebra Technologies (ZBRA)
Strength in the Asset Intelligence and Tracking segment 

"Recent news articles about Zebra Technologies include its earnings report, a new solution chosen by Rockland Trust, the launch of Android-based mobile computers, and the company's hold strategy."

In [12]:
# Unit test
test_agent.run(
    "How is Microsoft doing today compared with Zebra Technologies (ZBRA)?",
)



> Entering new AgentExecutor chain...
I should check the financial news for both Microsoft and Zebra Technologies to see how they are performing today.
Action: yahoo_finance_news
Action Input: MSFT
Observation: Increases to Microsoft Corporation's (NASDAQ:MSFT) CEO Compensation Might Cool off for now
Key Insights Microsoft will host its Annual General Meeting on 7th of December Salary of US$2.50m is part of CEO Satya...
Thought: I need to check the financial news for Zebra Technologies as well.
Action: yahoo_finance_news
Action Input: ZBRA
Observation: Zebra (ZBRA) Up 19.5% Since Last Earnings Report: Can It Continue?
Zebra (ZBRA) reported earnings 30 days ago. What's next for the stock? We take a look at earnings estimates for some clues.

Zebra Technologies' (ZBRA) Solution Chosen by Rockland Trust
Zebra Technologies (ZBRA) offers its Zebra Workcloud software solutions to boost Rockland Trust's customer experien

"Microsoft and Zebra Technologies have different performance indicators today, with Microsoft's CEO compensation potentially cooling off and Zebra Technologies seeing positive growth and new developments."

In [None]:
# yahoo_finance_news.run("zbra")

### Tools: the same tool with different descriptions seems to lead to different results

In [13]:
# Tools
tools_01 = [
    faiss_retriever_all,
    azure_bing_search_01,
    yahoo_finance_news,
]

tools_02 = [
    faiss_retriever_all,
    azure_bing_search_02,
    yahoo_finance_news,
    azure_bing_search_01
]

tools_03 = [
    faiss_retriever_all,
    azure_bing_search_02,
    azure_bing_search_01
]

tools_04 = [
    faiss_retriever_all,
    azure_bing_search_01
]
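
One plausible reason the description matters is that a ReAct-style agent typically shows the LLM each tool as a `name: description` line and asks it to pick one by name. The sketch below illustrates that rendering only. `ToolSpec` and `render_tool_list` are hypothetical stand-ins, and the exact prompt template LangChain builds may differ from this simplified form.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ToolSpec:
    """Hypothetical stand-in for a tool: only the fields the prompt would show."""
    name: str
    description: str


def render_tool_list(tools: List[ToolSpec]) -> str:
    """Render tools as 'name: description' lines, one per tool."""
    return "\n".join(f"{t.name}: {t.description}" for t in tools)


# Names and descriptions copied from the two Bing tools defined earlier.
search_all = ToolSpec(
    "bing search for all",
    "search for any information that is available",
)
search_news = ToolSpec(
    "bing search on current events",
    "useful for when you need to answer questions about current events",
)

print(render_tool_list([search_all, search_news]))
```

Both entries wrap the same search function, so the only signal the model has for choosing between them is the text on each line.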

## Build agents with different tool sets and compare how they perform

In [14]:
agent001 = initialize_agent(
    tools_01,
    llm,
    agent="zero-shot-react-description",
    verbose=True,
    handle_parsing_errors=True,
)


In [None]:

result = agent001.run("Who was Zebra Technologies CEO right before Bill Burns?")

In [None]:
agent001.run("What is the level of Zebra Technologies' indebtedness, and how manageable are its debt obligations?")


In [15]:
agent002 = initialize_agent(
    tools_02,
    llm,
    agent="zero-shot-react-description",
    verbose=True,
    handle_parsing_errors=True,
)

In [None]:
agent002.run("What is the level of Zebra Technologies' indebtedness, and how manageable are its debt obligations?")

## Q&A: Risk Assessment

## Agent equipped with tools to answer all the questions

#### Load list of questions

In [16]:
import json

# Read JSON data (list of questions) from the file
with open('./data_source/list_of_questions.json', 'r') as file:
    data = json.load(file)

#### List of questions

In [17]:
data = [
    {
        "category": "Financial Profile:",
        "inquiries": [
            "What is the trend in revenue growth over the past few years?",
            "How are the company's gross, operating, and net profit margins trending?",
            "What is the company's operating cash flow and free cash flow?",
            "What is the company's debt structure? (Long-term, short-term, interest rates)",
            "What are the major risks disclosed by the company?",
            "Who are the company's competitors, and how does the company perform compared to its competitors in the industry?",
            "How does the company allocate profits to dividends or share buybacks?",
        ],
    },
    {
        "category": "Operational Performance:",
        "inquiries": [
            "Are there any noticeable fluctuations in sales figures?",
            "Have there been any significant changes in expenses or cost of goods sold?",
            "How is the cash conversion cycle evolving?",
            "How does the debt level compare to equity and assets?",
            "Are there any pending litigations, investigations, or regulatory issues disclosed?",
            "What investments is the company making in its business?",
        ],
    },
    {
        "category": "Market Position and Risk Factors:",
        "inquiries": [
            "Which metrics or KPIs are highlighted by the company as key measures of success?",
            "What are the major risks disclosed by the company?",
            "Are there any market trends or industry-specific challenges affecting the company?",
            "How might these impact the company's future operations and financial health?",
        ],
    },
]
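
The list above asks "What are the major risks disclosed by the company?" under two categories, so a batch run over all categories would send that question to the agent twice. A minimal sketch for spotting such repeats follows. `find_duplicate_inquiries` is a hypothetical helper, and `sample` is a trimmed copy of the list above so the example stays short.

```python
from collections import defaultdict


def find_duplicate_inquiries(categories):
    """Map each inquiry that appears more than once to the categories containing it."""
    seen = defaultdict(list)
    for item in categories:
        for inquiry in item["inquiries"]:
            seen[inquiry].append(item["category"])
    return {q: cats for q, cats in seen.items() if len(cats) > 1}


# Trimmed copy of the question list, keeping the repeated inquiry.
sample = [
    {
        "category": "Financial Profile:",
        "inquiries": [
            "What is the company's operating cash flow and free cash flow?",
            "What are the major risks disclosed by the company?",
        ],
    },
    {
        "category": "Market Position and Risk Factors:",
        "inquiries": [
            "What are the major risks disclosed by the company?",
        ],
    },
]

duplicates = find_duplicate_inquiries(sample)
print(duplicates)
```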

In [18]:
agent004 = initialize_agent(
    tools_04,
    llm,
    agent="zero-shot-react-description",
    verbose=True,
    handle_parsing_errors=True,
)

In [None]:
agent004.run("Who are Zebra's competitors, and how does the company perform compared to its competitors in the industry?")

#### Batch process

In [19]:
# Define an empty list to store the output
output_data = []

# Prompt to make sure the question refers to the right company
# TODO: the prompt text might need adjusting, e.g. "if you can find the answer, please provide the source;
# if the answer cannot be found, try the other tools given to you and make sure you exhaust all of them."
def build_question_prompt(company_name, question):
    prompt_text = f"Please answer the following question. Note that inside the question, 'the company' refers to {company_name}. Here is the question: {question}. \
        NOTE: if you can find the answer, please provide the source."
    return prompt_text

# def build_question_prompt(company_name, question):
#     prompt_text = f"Please answer the following question. Note that inside the question, 'the company' refers to {company_name}. Here is the question: {question}"
#     return prompt_text

# Run the chosen agent on a single question prompt and return its answer
def agent_in_action(agent_on_job, question_prompt):
    result = agent_on_job.run(question_prompt)
    return result

# Run through all the questions in the list. 
# Here, we need to see if the token limit is a factor, or we might need to put a time delay in between questions
def batch_analysis(company, questions, agent):
    # Loop through the JSON data and print questions and inquiries
    for item in questions:
        print(f"Category: {item['category']}")
        output_data.append({"Category": item['category']})
        if risk_analysis in (item['category'], 'ALL'):
            print("Inquiries:")
            output_data[-1]["Inquiries"] = []
            for inquiry in item['inquiries']:
                print(f"Question: {inquiry}")
                output_data[-1]["Inquiries"].append({"Question": inquiry})
                question_prompt = build_question_prompt(company, inquiry)
                print(f"=> {question_prompt}")
                answer = agent_in_action(agent, question_prompt)
                print(f"Answer: {answer}")
                output_data[-1]["Inquiries"][-1]["Answer"] = answer
            print()  # Add a newline for readability

# Batch run:
# risk_analysis = 'Test:'
risk_analysis = 'ALL'
# risk_analysis = 'Financial Profile:'
# risk_analysis = 'Market Position and Risk Factors:'
company_name = 'Zebra Technologies'
agent_on_duty = agent004
# agent_on_duty = agent003 - NG
list_of_questions = data
batch_analysis(company_name, list_of_questions, agent_on_duty)

Category: Financial Profile:
Inquiries:
Question: What is the trend in revenue growth over the past few years?
=> Please answer the following question. Note that inside the question, 'the company' refers to Zebra Technologies. Here is the question: What is the trend in revenue growth over the past few years?.         NOTE: if you can find the answer, please provide the source.


> Entering new AgentExecutor chain...
I should search for the company's financial reports to find information about its revenue growth over the past few years.
Action: SEC data search
Action Input: Zebra Technologies
Observation: [Document(page_content='11/25/23, 7:00 PM zbra-20221231\nhttps://www.sec.gov/Archives/edgar/data/877212/000087721223000025/zbra-20221231.htm 9/132Table of Contents\nOperations and Technologies\nOur operations consist of two reportable segments that provide complementary offerings to our customers: Asset Intelligence &\nTracking (“AIT”), which inclu
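
The batch code's comments raise the possibility that token limits or a missing delay between questions could become a factor. One way to handle transient failures is to retry each question with a growing pause, as in the sketch below. `run_with_retry` and `flaky_ask` are hypothetical names; in the notebook, `ask` would be something like `agent_on_duty.run`, and the retry count and delay are placeholder values rather than tuned settings.

```python
import time


def run_with_retry(ask, prompt, retries=3, delay_seconds=2.0, sleep=time.sleep):
    """Call ask(prompt); on failure, wait a little longer each time and try again."""
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return ask(prompt)
        except Exception as exc:  # broad on purpose: this is only a sketch
            last_error = exc
            if attempt < retries:
                sleep(delay_seconds * attempt)
    raise RuntimeError(f"All {retries} attempts failed") from last_error


# Stand-in for an agent call that fails once (simulating a rate limit) and then succeeds.
calls = {"count": 0}


def flaky_ask(prompt):
    calls["count"] += 1
    if calls["count"] == 1:
        raise TimeoutError("simulated rate limit")
    return f"answer to: {prompt}"


# The sleep function is replaced with a no-op so the demo finishes immediately.
print(run_with_retry(flaky_ask, "test question", sleep=lambda seconds: None))
```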

In [20]:
output_data


[{'Category': 'Financial Profile:',
  'Inquiries': [{'Question': 'What is the trend in revenue growth over the past few years?',
    'Answer': "I do not have access to the specific information about Zebra Technologies' revenue growth over the past few years."},
   {'Question': "How are the company's gross, operating, and net profit margins trending?",
    'Answer': "I was not able to find specific information about Zebra Technologies' gross, operating, and net profit margins."},
   {'Question': "What is the company's operating cash flow and free cash flow?",
    'Answer': "The company's operating cash flow for the year ended December 31, 2022, was $488 million. The company's free cash flow for the year ended December 31, 2022, is not provided in the available information."},
   {'Question': "What is the company's debt structure? (Long-term, short-term, interest rates)",
    'Answer': "Based on the SEC filing and the additional information from the bing search, Zebra Technologies' debt 

#### Write Q&A into JSON file

In [None]:
import json

# Write the output data to a JSON file
with open('./reports/qna_zbra_003.json', 'w') as json_file:
    json.dump(output_data, json_file, indent=4)
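
Several answers in `output_data` above are effectively non-answers ("I do not have access to...", "I was not able to find..."). A rough sketch for listing those questions, so they can be retried with a different agent or tool set, is shown below. `unanswered` and `NO_ANSWER_MARKERS` are hypothetical names, the marker phrases are taken from the two answers shown above and are an assumption about how non-answers are worded, and `sample_output` reuses entries from that output.

```python
# Phrases that suggest the agent did not find an answer (assumed, based on the output above).
NO_ANSWER_MARKERS = ("do not have access", "not able to find")


def unanswered(output_data):
    """Return the questions whose answers contain one of the no-answer phrases."""
    flagged = []
    for item in output_data:
        # Categories skipped by the batch run have no "Inquiries" key.
        for qa in item.get("Inquiries", []):
            answer = qa.get("Answer", "").lower()
            if any(marker in answer for marker in NO_ANSWER_MARKERS):
                flagged.append(qa["Question"])
    return flagged


sample_output = [
    {
        "Category": "Financial Profile:",
        "Inquiries": [
            {
                "Question": "What is the trend in revenue growth over the past few years?",
                "Answer": "I do not have access to the specific information about Zebra Technologies' revenue growth over the past few years.",
            },
            {
                "Question": "How are the company's gross, operating, and net profit margins trending?",
                "Answer": "I was not able to find specific information about Zebra Technologies' gross, operating, and net profit margins.",
            },
            {
                "Question": "What is the company's operating cash flow and free cash flow?",
                "Answer": "The company's operating cash flow for the year ended December 31, 2022, was $488 million.",
            },
        ],
    },
    {"Category": "Operational Performance:"},
]

print(unanswered(sample_output))
```

Simple substring matching will miss differently worded non-answers, so the result is a starting list for manual review and not a complete audit.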
