# Elasticsearch Retrieval with OpenAI

### Overview

In this notebook we launch Elasticsearch locally and then ask questions about the 10-K filings of the top 10 S&P 500 companies.

1. Set up the environment
2. Build search tool to query our ES instance
3. Test the search tool
4. Create OpenAI client with access to the tool
5. Compare OpenAI's response with and without access to the tool

# Imports and starting Elasticsearch

In [1]:
import os
import sys
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), os.pardir)))

from retriever.util import get_logger
from retriever.searcher.searchtools.elasticsearch import ElasticsearchSearchTool

logger = get_logger()

In [2]:
from retriever.searcher.constants import SEC_FILINGS_SEARCH_TOOL_DESCRIPTION
sec_search_tool = ElasticsearchSearchTool(SEC_FILINGS_SEARCH_TOOL_DESCRIPTION)

Starting Elasticsearch...
Elasticsearch is starting. Please wait a few moments before it becomes available.
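
Because the tool only reports that Elasticsearch is starting, it can help to block until the node actually answers before running the first search. The following is a minimal sketch, not part of the `retriever` package: `is_up` and `wait_until_ready` are hypothetical helper names, and the URL is assumed to be the unsecured local endpoint `http://localhost:9200` that appears in the request logs.

```python
import time
import urllib.error
import urllib.request


def is_up(url: str) -> bool:
    """Return True if an HTTP GET to `url` answers with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=2) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status: treat as "not ready".
        return False


def wait_until_ready(probe, timeout_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Call `probe()` repeatedly until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Assumed usage in the notebook (hypothetical):
# wait_until_ready(lambda: is_up("http://localhost:9200"))
```

Passing the probe as a callable keeps the polling loop independent of how readiness is checked, so a stricter check (for example, one that inspects cluster health) could be swapped in later.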


In [14]:
tesla_question = "How does Tesla optimize supply chain efficiency?"

In [3]:
tesla_supply_chain = sec_search_tool.search(tesla_question, n_search_results_to_use=3)

2024-02-23 08:44:19,487 - elastic_transport.transport - INFO - POST http://localhost:9200/sec_filings/_search [status:200 duration:0.472s]


In [4]:
print(tesla_supply_chain)


<search_results>
<item index="1">
<page_content>
For
example, a global shortage of semiconductors has been reported since early
2021 and has caused challenges in the manufacturing industry and impacted our
<em>supply</em> <em>chain</em> and production as well.

We are highly dependent on the
services of Elon Musk, Technoking of <em>Tesla</em> and our Chief Executive Officer. We
are highly dependent on the services of Elon Musk, Technoking of <em>Tesla</em> and our
Chief Executive Officer. Although Mr. Musk spends significant time with <em>Tesla</em>
and is highly active in our management, he <em>does</em> not devote his full time and
attention to <em>Tesla</em>. Mr.

There have been and may continue to be
significant <em>supply</em> <em>chain</em> attacks.

We are highly dependent on the
services of Elon Musk, Technoking of <em>Tesla</em> and our Chief Executive Officer. We
are highly dependent on the services of Elon Musk, Technoking of <em>Tesla</em> and our
Chief Executive Officer.
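
The printed output above wraps each hit in `<item index="…">` and `<page_content>` tags inside a `<search_results>` element. As a rough illustration of that layout, here is a sketch of how a list of passages could be rendered into the same shape. `format_results` is a hypothetical name, not the function `ElasticsearchSearchTool` uses, and the closing tags are an assumption because the output shown here is cut off before they appear.

```python
from typing import List


def format_results(passages: List[str]) -> str:
    """Wrap passages in <search_results>/<item>/<page_content> tags (assumed layout)."""
    items = []
    for i, text in enumerate(passages, start=1):
        items.append(
            f'<item index="{i}">\n<page_content>\n{text}\n</page_content>\n</item>'
        )
    return "<search_results>\n" + "\n".join(items) + "\n</search_results>"
```

A tagged layout like this makes it easy for the model to tell retrieved passages apart from the user's question when both are placed in the same prompt.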

# Compare LLM responses to two questions, with and without the tool

In [16]:
from retriever.client import ClientWithRetrieval
# gpt-3.5-turbo's training data runs up to Sep 2021
OPENAI_MODEL = "gpt-3.5-turbo"

client = ClientWithRetrieval(api_key=os.environ['OPENAI_API_KEY'], search_tool=sec_search_tool)

## Apple 2022 revenue

Basic response to the query (no access to the tool).

In [None]:
appl_question = "What was Apple's revenue in 2022?"

In [None]:
basic_response = client.chat.completions.create(
  model=OPENAI_MODEL,
  messages=[
    {"role": "user", "content": appl_question}
  ]
)

In [None]:
print('-'*50)
print('Basic response:')
print(appl_question + basic_response.choices[0].message.content)
print('-'*50)

The same completion, but now GPT can use the search tool when composing its response.

In [None]:
augmented_response = client.completion_with_retrieval(
    query=appl_question,
    model=OPENAI_MODEL,
    n_search_results_to_use=3)

In [None]:
print('-'*50)
print('Augmented response:')
print(appl_question + augmented_response)
print('-'*50)
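
To make the difference between the two calls concrete, one common way to combine retrieval with a chat completion is to place the search results in the prompt ahead of the question. The sketch below illustrates that pattern only; it is an assumption, and `completion_with_retrieval` may work differently (for example, by letting the model decide when to call the tool). `build_augmented_messages` is a hypothetical helper.

```python
from typing import Dict, List


def build_augmented_messages(query: str, search_results: str) -> List[Dict[str, str]]:
    """Build chat messages that put retrieved passages ahead of the question."""
    system = (
        "Answer the user's question using the search results provided. "
        "If they do not contain the answer, say so."
    )
    user = f"{search_results}\n\nQuestion: {query}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]


# Assumed usage with the objects defined earlier in this notebook:
# results = sec_search_tool.search(appl_question, n_search_results_to_use=3)
# messages = build_augmented_messages(appl_question, results)
# client.chat.completions.create(model=OPENAI_MODEL, messages=messages)
```

Grounding the answer in retrieved filings matters here because the question concerns 2022, which falls after the model's training data ends.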

## Tesla question

Basic response to the query (no access to the tool).

In [31]:
basic_response = client.chat.completions.create(
  model=OPENAI_MODEL,
  messages=[
    {"role": "user", "content": tesla_question}
  ]
)

/Users/rushilsheth/Documents/portfolio/busco-fin/examples
2024-02-23 09:10:39,506 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

In [None]:
print('-'*50)
print('Basic response:')
print(tesla_question + basic_response.choices[0].message.content)
print('-'*50)

In [44]:
augmented_response = client.completion_with_retrieval(
    query=tesla_question,
    model=OPENAI_MODEL,
    n_search_results_to_use=3)

/Users/rushilsheth/Documents/portfolio/busco-fin/examples
2024-02-23 09:26:49,442 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"

In [50]:
print('-'*50)
print('Augmented response:')
print(tesla_question + augmented_response)
print('-'*50)

--------------------------------------------------
Augmented response:
How does Tesla optimize supply chain efficiency?Tesla optimizes supply chain efficiency by integrating the trade-in of a customer's existing Tesla or non-Tesla vehicle with the sale of a new or used Tesla vehicle. They acquire vehicles as trade-ins and subsequently remarket them either directly or through third parties. Additionally, Tesla faces and addresses challenges such as supply chain attacks and global semiconductor shortages to ensure smooth production and delivery of consumer vehicles. Tesla also emphasizes operations management and supply chain optimization expertise within its board members to further enhance efficiency in these areas.
--------------------------------------------------
