# Elasticsearch Retrieval with OpenAI

### Overview

In this notebook we launch Elasticsearch locally and then ask questions about the 10-K filings of the top 10 S&P 500 companies.

1. Set up the environment
2. Build search tool to query our ES instance
3. Test the search tool
4. Create OpenAI client with access to the tool
5. Compare OpenAI's response with and without access to the tool

# Imports and starting Elasticsearch

In [1]:
import logging

import os
import sys
sys.path.append(os.path.abspath(os.path.join(os.getcwd(), os.pardir)))

from retriever.util import get_logger
from retriever.searcher.searchtools.elasticsearch import ElasticsearchSearchTool


logger = get_logger()

In [2]:
from retriever.searcher.constants import SEC_FILINGS_SEARCH_TOOL_DESCRIPTION
sec_search_tool = ElasticsearchSearchTool(SEC_FILINGS_SEARCH_TOOL_DESCRIPTION)

Starting Elasticsearch...
Elasticsearch is starting. Please wait a few moments before it becomes available.
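Because the tool only reports that Elasticsearch is starting, it can help to block until the node answers before running the first search. The following is a minimal sketch, not part of the `retriever` package: `wait_for_elasticsearch` and its `probe` parameter are hypothetical names, and it assumes an unsecured local node at `http://localhost:9200` (the address that appears in the request log) that returns HTTP 200 on its root endpoint.

```python
import time
import urllib.error
import urllib.request


def wait_for_elasticsearch(url="http://localhost:9200", timeout_s=60.0,
                           interval_s=2.0, probe=None):
    """Poll until the node answers or `timeout_s` elapses.

    `probe` is an optional zero-argument callable returning True when the
    node is ready; it is only here so the loop can be exercised without a
    running server (hypothetical helper, not part of `retriever`).
    """

    def http_probe():
        # Assumption: a ready, unsecured node answers GET / with status 200.
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                return response.status == 200
        except (urllib.error.URLError, OSError):
            return False

    check = probe or http_probe
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Calling `wait_for_elasticsearch()` right after constructing `sec_search_tool` would return `True` once the node responds and `False` if it does not come up within the timeout.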


In [3]:
tesla_question = "How does Tesla optimize supply chain efficiency?"

In [4]:
tesla_supply_chain = sec_search_tool.search(tesla_question, n_search_results_to_use=3)

2024-02-23 12:11:03,935 - elastic_transport.transport - INFO - POST http://localhost:9200/sec_filings/_search [status:200 duration:0.452s]
2024-02-23 12:11:03,935 - elastic_transport.transport - INFO - POST http://localhost:9200/sec_filings/_search [status:200 duration:0.452s]
2024-02-23 12:11:03,935 - elastic_transport.transport - INFO - POST http://localhost:9200/sec_filings/_search [status:200 duration:0.452s]
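The log shows a `POST` to `sec_filings/_search`, but not the request body. As a rough illustration of what such a request might contain, here is a sketch of a full-text `match` query with highlighting. The field name `text` and the helper `build_search_body` are assumptions; the actual query built by `ElasticsearchSearchTool` is not shown in this notebook.

```python
def build_search_body(question, n_results=3, text_field="text"):
    """Build an Elasticsearch query-DSL body for a highlighted full-text search.

    `text_field` is an assumed field name; the real index mapping may differ.
    """
    return {
        # Return at most n_results hits.
        "size": n_results,
        # Standard full-text match on the (assumed) document text field.
        "query": {"match": {text_field: question}},
        # Ask for highlighted fragments; by default Elasticsearch wraps
        # matched terms in <em>...</em> tags.
        "highlight": {"fields": {text_field: {}}},
    }
```

A body like this could be sent as the JSON payload of the `POST http://localhost:9200/sec_filings/_search` request seen above.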


In [5]:
print(tesla_supply_chain)


<search_results>
<item index="1">
<page_content>
For
example, a global shortage of semiconductors has been reported since early
2021 and has caused challenges in the manufacturing industry and impacted our
<em>supply</em> <em>chain</em> and production as well.

We are highly dependent on the
services of Elon Musk, Technoking of <em>Tesla</em> and our Chief Executive Officer. We
are highly dependent on the services of Elon Musk, Technoking of <em>Tesla</em> and our
Chief Executive Officer. Although Mr. Musk spends significant time with <em>Tesla</em>
and is highly active in our management, he <em>does</em> not devote his full time and
attention to <em>Tesla</em>. Mr.

There have been and may continue to be
significant <em>supply</em> <em>chain</em> attacks.

We are highly dependent on the
services of Elon Musk, Technoking of <em>Tesla</em> and our Chief Executive Officer. We
are highly dependent on the services of Elon Musk, Technoking of <em>Tesla</em> and our
Chief Executive Officer.
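The results above repeat the same passage several times, which wastes context when they are handed to the model. One possible mitigation is to drop repeated passages before building the prompt. This is a sketch under the assumption that the passages are available as a list of strings; `normalize` and `dedupe_passages` are hypothetical helpers, not functions from the `retriever` package.

```python
import re


def normalize(passage):
    """Strip <em> highlight tags, collapse whitespace, and lowercase."""
    without_tags = re.sub(r"</?em>", "", passage)
    return " ".join(without_tags.split()).lower()


def dedupe_passages(passages):
    """Keep the first occurrence of each passage, preserving order."""
    seen = set()
    unique = []
    for passage in passages:
        key = normalize(passage)
        # Skip empty passages and ones already seen in normalized form.
        if key and key not in seen:
            seen.add(key)
            unique.append(passage)
    return unique
```

This only catches passages that are identical after normalization; near-duplicates with different wording would need a fuzzier comparison.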

# Comparing LLM responses for two different questions

In [6]:
from retriever.client import ClientWithRetrieval
# Training data goes up to September 2021
OPENAI_MODEL = "gpt-3.5-turbo"

client = ClientWithRetrieval(api_key=os.environ['OPENAI_API_KEY'], search_tool=sec_search_tool)
logger.setLevel(logging.CRITICAL)

## Apple 2022 revenue

Basic response to the query (no access to the tool).

In [7]:
appl_question = "What was Apple's revenue in 2022?"

In [8]:
basic_response = client.chat.completions.create(
  model=OPENAI_MODEL,
  messages=[
    {"role": "user", "content": appl_question}
  ]
)
print('-'*50)
print('Basic response:')
print(appl_question + basic_response.choices[0].message.content)
print('-'*50)

--------------------------------------------------
Basic response:
What was Apple's revenue in 2022?I am unable to provide information on Apple's revenue in 2022 as it has not yet been reported. The company's fiscal year typically ends in September, so the financial results for 2022 would not be available until sometime in 2023.
--------------------------------------------------


Same completion, but now GPT can use the search tool when composing its response.

In [9]:
augmented_response = client.completion_with_retrieval(
    query=appl_question,
    model=OPENAI_MODEL,
    n_search_results_to_use=3)

print('-'*50)
print('Augmented response:')
print(appl_question + augmented_response)
print('-'*50)

--------------------------------------------------
Augmented response:
What was Apple's revenue in 2022?- Total net sales in 2022 included $7.5 billion of revenue that was recognized and included in deferred revenue as of September 25, 2021.
- Additionally, total net sales in 2022 included $7.5 billion of revenue that was recognized and included in deferred revenue as of September 25, 2021.
--------------------------------------------------
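The internals of `completion_with_retrieval` are not shown here. One common way to give a chat model access to retrieved text is to place the search results in the prompt ahead of the question, and the sketch below illustrates that pattern. The helper `build_augmented_messages` and the wording of the system prompt are assumptions, not the implementation used by `ClientWithRetrieval`.

```python
def build_augmented_messages(question, search_results):
    """Build chat messages that ground the answer in retrieved passages.

    `search_results` is the formatted string returned by the search tool,
    for example the <search_results>...</search_results> block printed earlier.
    """
    system = (
        "Answer the question using only the information inside "
        "<search_results>. If the results do not contain the answer, say so."
    )
    # Put the retrieved passages first, then the question.
    user = f"{search_results}\n\nQuestion: {question}"
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user},
    ]
```

The returned list could be passed as `messages` to `client.chat.completions.create(model=OPENAI_MODEL, messages=...)`, in the same way as the basic completion.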


## Tesla question

Basic response to the query (no access to the tool).

In [10]:
basic_response = client.chat.completions.create(
  model=OPENAI_MODEL,
  messages=[
    {"role": "user", "content": tesla_question}
  ]
)
print('-'*50)
print('Basic response:')
print(tesla_question + basic_response.choices[0].message.content)
print('-'*50)

--------------------------------------------------
Basic response:
How does Tesla optimize supply chain efficiency?1. Vertical integration: Tesla has vertically integrated many aspects of its supply chain, including manufacturing components in-house, to improve efficiency and control over production processes.

2. Supplier partnerships: Tesla works closely with its suppliers to improve collaboration and optimize supply chain processes. The company often works with suppliers to co-locate near its production facilities for faster delivery times and better communication.

3. Just-in-time production: Tesla utilizes a just-in-time production model to minimize inventory costs and waste. This involves ordering and receiving materials just before they are needed in the production process.

4. Advanced technology: Tesla uses advanced technology, such as robotics and automation, to streamline production processes and increase efficiency. This helps reduce production times and costs.

5. Data ana

In [11]:
augmented_response = client.completion_with_retrieval(
    query=tesla_question,
    model=OPENAI_MODEL,
    n_search_results_to_use=3)
print('-'*50)
print('Augmented response:')
print(tesla_question + augmented_response)
print('-'*50)

--------------------------------------------------
Augmented response:
How does Tesla optimize supply chain efficiency?- Tesla optimizes supply chain efficiency by integrating the trade-in of a customer's existing Tesla or non-Tesla vehicle with the sale of a new or used Tesla vehicle.
- They acquire Tesla and non-Tesla vehicles as trade-ins and subsequently remarket them, either directly by Tesla or through third parties.
- Despite ongoing supply chain and logistics challenges, Tesla produced and delivered a significant number of consumer vehicles in 2022.
- Tesla has been impacted by global shortages of semiconductors, which have caused challenges in the manufacturing industry and affected their supply chain and production.
- The company also focuses on operations management and supply chain optimization expertise to enhance efficiency.
--------------------------------------------------


## Reflections

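- The basic model could not answer the Apple question because of a knowledge gap: gpt-3.5 was only trained on data until September 2021. With retrieval, the model can draw on the 2022 filing.
- The augmented Tesla answer cites specifics from the filings, but the basic answer is arguably better. A possible next step is to improve retrieval, for example by using a vector store.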
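To make the vector store idea concrete, here is a toy sketch of similarity-based retrieval. It uses bag-of-words counts as a stand-in for embeddings so that it runs without extra dependencies. A real vector store would use learned embeddings and an approximate nearest-neighbour index, and `embed`, `cosine`, and `top_k` are hypothetical names.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase token counts (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_k(query, documents, k=3):
    """Return the k documents most similar to the query."""
    query_vector = embed(query)
    ranked = sorted(documents,
                    key=lambda doc: cosine(query_vector, embed(doc)),
                    reverse=True)
    return ranked[:k]
```

With real embeddings, the same ranking step can match passages that are semantically related to the question even when they share few exact terms, which is where keyword search tends to struggle.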