# Milvus + Llama-Index: Enhancing OpenAI Assistant Agent with a Custom Retriever

This notebook shows how to enhance a [Llama-Index](https://www.llamaindex.ai/) agent built on top of the [OpenAI Assistant API](https://platform.openai.com/docs/assistants/overview) with a custom retriever tool backed by [Milvus](https://zilliz.com).

## Preparation

### 1. Install dependencies

In [1]:
! pip install -q llama-index 'milvus[client]' 'openai>=1.2.0' transformers

### 2. Start Milvus Service

There are 2 options to start a Milvus service:

- [Zilliz Cloud](https://zilliz.com/cloud): Zilliz provides a cloud-native service for Milvus. It simplifies deploying and scaling vector search applications by eliminating the need to create and maintain complex data infrastructure. [Get Started Free!](https://cloud.zilliz.com/signup)
- [Open Source Milvus](https://milvus.io): You can install the open source Milvus with either Docker Compose or Kubernetes.

Here, we use [Milvus Lite](https://milvus.io/docs/milvus_lite.md) to start with a lightweight version of Milvus, which works seamlessly with Google Colab and Jupyter Notebook.

In [2]:
from milvus import default_server


# default_server.cleanup()  # Optional: run this line to clean up data from previous runs
default_server.start()

### 3. Download example data

You can use any file(s) to build the knowledge base.
We will use an SEC filing, [uber_2021.pdf](https://github.com/run-llama/llama_index/blob/main/docs/examples/data/10k/uber_2021.pdf), as an example.

In [3]:
! wget 'https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/uber_2021.pdf' -O 'uber_2021.pdf'

--2023-11-16 14:40:44--  https://raw.githubusercontent.com/run-llama/llama_index/main/docs/examples/data/10k/uber_2021.pdf
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.110.133, 185.199.109.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.110.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1880483 (1.8M) [application/octet-stream]
Saving to: ‘uber_2021.pdf’


2023-11-16 14:41:08 (79.9 KB/s) - ‘uber_2021.pdf’ saved [1880483/1880483]



## Getting Started

### 1. Set up Environment

You need to set some environment variables, for example your [OpenAI API Key](https://beta.openai.com/account/api-keys).
Please note that your OpenAI account needs access to, and enough available quota for, the model [GPT-4 Turbo](https://platform.openai.com/docs/models/gpt-4).

In [4]:
import os
from getpass import getpass

os.environ['TOKENIZERS_PARALLELISM'] = 'false'
os.environ['OPENAI_API_KEY'] = getpass('Enter OpenAI API Key:')
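
When the cell above is re-run, it prompts for the key every time. A minimal sketch of one way to avoid that, assuming you are happy to reuse a value already present in the environment; `ensure_env` is a hypothetical helper, not part of any library used here:

```python
import os
from getpass import getpass
from typing import Callable


def ensure_env(name: str, prompt: Callable[[str], str] = getpass) -> str:
    """Return os.environ[name], prompting only when it is unset or empty.

    `ensure_env` is a hypothetical helper for this notebook. `prompt` is
    injectable so the function can be exercised without blocking on input.
    """
    value = os.environ.get(name, "").strip()
    if not value:
        value = prompt(f"Enter {name}: ").strip()
        if not value:
            raise ValueError(f"{name} must not be empty")
        os.environ[name] = value
    return value


# Usage in the notebook (prompts only if the variable is not set yet):
# ensure_env("OPENAI_API_KEY")
```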

### 2. Customize Strategies

In this step, we will define some strategies to be used:

- Chunking: configure the text splitter (e.g. `chunk_size`)
- Embedding: choose embedding model (e.g. [`BAAI/bge-small-en`](https://huggingface.co/BAAI/bge-small-en)) and its provider (e.g. [HuggingFace](https://huggingface.co/models), [OpenAI](https://platform.openai.com/docs/guides/embeddings)).
- LLM: select LLM model (e.g. `gpt-4-1106-preview`) and set up model parameters (e.g. `temperature`).
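
To give a feel for what `chunk_size` controls, here is a deliberately simplified sketch of fixed-size chunking. It counts whitespace-separated words, whereas Llama-Index's splitter works on tokens and tries to respect sentence boundaries, so treat `chunk_words` and its `overlap` parameter as illustrative assumptions rather than the library's behavior:

```python
from typing import List


def chunk_words(text: str, chunk_size: int, overlap: int = 0) -> List[str]:
    """Split text into chunks of at most `chunk_size` words.

    Neighboring chunks share `overlap` words. This is a toy stand-in for a
    real text splitter, used only to illustrate the effect of chunk size.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "a b c d e f g"
print(chunk_words(sample, chunk_size=3))
print(chunk_words(sample, chunk_size=3, overlap=1))
```

Smaller chunks tend to give more precise matches but less context per retrieved passage; larger chunks do the opposite.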

In [5]:
from llama_index.embeddings import HuggingFaceEmbedding
from llama_index.llms import OpenAI
from llama_index.vector_stores import MilvusVectorStore
from llama_index import StorageContext, ServiceContext


llm = OpenAI(model='gpt-4-1106-preview')
embed_model = HuggingFaceEmbedding(model_name='BAAI/bge-small-en', cache_folder='./tmp/models', device='cpu')
service_context = ServiceContext.from_defaults(llm=llm, embed_model=embed_model, chunk_size=512)

vector_store = MilvusVectorStore(
    uri=f'http://localhost:{default_server.listen_port}',
    # token='',  # required for Zilliz Cloud
    dim=384,  # the value changes with embedding model
    overwrite=True  # drop the collection if it exists, then recreate it
    )
storage_context = StorageContext.from_defaults(vector_store=vector_store)



### 3. Ingest Document(s)

In [6]:
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Load document(s)
docs = SimpleDirectoryReader(input_files=['./uber_2021.pdf']).load_data()

# Build index
vector_index = VectorStoreIndex.from_documents(docs, storage_context=storage_context, service_context=service_context)
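
Conceptually, the index built above stores one embedding per chunk, and a query is answered by ranking those embeddings by similarity to the query embedding. The sketch below illustrates that ranking in plain Python with toy 3-dimensional vectors standing in for the 384-dimensional ones. Cosine similarity is an assumption made for illustration; the metric Milvus actually uses depends on how the collection is configured, and Milvus uses an index rather than a full scan.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query: Sequence[float], chunks: Dict[str, Sequence[float]], k: int
) -> List[Tuple[str, float]]:
    """Return the k chunk ids most similar to the query, best first."""
    scored = [(chunk_id, cosine(query, vec)) for chunk_id, vec in chunks.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings; the ids and values are made up for this example.
toy_chunks = {
    "revenue": [0.9, 0.1, 0.0],
    "risk": [0.1, 0.9, 0.0],
    "legal": [0.0, 0.2, 0.9],
}
results = top_k([1.0, 0.0, 0.0], toy_chunks, k=2)
for chunk_id, score in results:
    print(f"{chunk_id}: {score:.3f}")
```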

### 4. Define Agent & Tool(s)

To integrate the vector store index with the agent, we need to wrap the index as a Retriever Tool.
The agent recognizes the retriever through the name and description in the tool's metadata.

In [7]:
from llama_index.tools import RetrieverTool, ToolMetadata

milvus_tool = RetrieverTool(
    retriever=vector_index.as_retriever(similarity_top_k=3),  # retrieve top_k results
    metadata=ToolMetadata(
        name="CustomRetriever",
        description='Retrieve relevant information from provided documents.'
    ),
)
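
The name and description matter because they are what the model sees when deciding whether to call the tool. As a rough sketch, a tool like this could be exposed to the OpenAI API as a function specification along the following lines. The single string parameter named `input` is inferred from the call arguments shown in the verbose output later in this notebook; the exact schema Llama-Index generates is an assumption here, and `to_function_tool` is a hypothetical helper.

```python
import json
import re
from typing import Any, Dict


def to_function_tool(name: str, description: str) -> Dict[str, Any]:
    """Build an OpenAI-style function tool spec with one string argument.

    Hypothetical helper for illustration; the real spec is produced by
    Llama-Index from the tool's metadata and may differ in detail.
    """
    # Conservative check: letters, digits, underscores and dashes, max 64 chars.
    if not re.fullmatch(r"[A-Za-z0-9_-]{1,64}", name):
        raise ValueError(f"unsuitable tool name: {name!r}")
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {
                "type": "object",
                "properties": {"input": {"type": "string"}},
                "required": ["input"],
            },
        },
    }


spec = to_function_tool(
    "CustomRetriever",
    "Retrieve relevant information from provided documents.",
)
print(json.dumps(spec, indent=2))
```

A specific description (for example, naming the filing and year the documents cover) generally helps the model decide when the tool is relevant.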

Next, let's define the agent powered by OpenAI's Assistants API.
To create an agent, we define its role, give it instructions, and provide tools.
Here we have the LLM act as an SEC analyst, with the Milvus retriever as an available tool.

In [8]:
from llama_index.agent import OpenAIAssistantAgent

agent = OpenAIAssistantAgent.from_new(
    name='SEC Analyst',
    instructions='You are a QA assistant designed to analyze sec filings.',
    tools=[milvus_tool],
    verbose=True,
    run_retrieve_sleep_time=1.0
)
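
Runs in the Assistants API are asynchronous, so a client has to poll until a run leaves its pending states; `run_retrieve_sleep_time` sets the pause between polls. The loop below is a simplified sketch of that idea, not the agent's actual implementation: `wait_for_run`, `max_polls`, and the fake status source are assumptions made for illustration.

```python
import time
from typing import Callable

# Statuses treated as "still running" in this sketch.
PENDING = ("queued", "in_progress")


def wait_for_run(
    get_status: Callable[[], str],
    sleep_time: float = 1.0,
    max_polls: int = 60,
) -> str:
    """Poll `get_status` until it reports a non-pending status.

    `sleep_time` plays the role of `run_retrieve_sleep_time`; `max_polls`
    is a hypothetical safety limit so the loop cannot spin forever.
    """
    for _ in range(max_polls):
        status = get_status()
        if status not in PENDING:
            return status
        time.sleep(sleep_time)
    raise TimeoutError(f"run still pending after {max_polls} polls")


# Fake status source standing in for a call that retrieves the run.
statuses = iter(["queued", "in_progress", "completed"])
final_status = wait_for_run(lambda: next(statuses), sleep_time=0.0)
print(final_status)
```

A shorter sleep returns answers slightly sooner at the cost of more API requests; a longer one does the reverse.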

## Try it out!

Now the agent is ready to act as an SEC analyst. It can answer user questions based on the documents loaded into Milvus.

With `verbose=True`, you can see what information is retrieved while the agent answers your question.

In [9]:
# print('Thread id:', agent.thread_id)
response = agent.chat('''What was Uber's revenue growth in 2021?''')

=== Calling Function ===
Calling function: CustomRetriever with args: {"input":"Uber's revenue growth in 2021"}
Got output: page_label = 57
file_name = uber_2021.pdf
file_path = uber_2021.pdf
The following table sets forth the components of our consolidated statements of operations for each of the periods presented as a percentage of revenue : Year Ended December 31,
2020
2021 Revenue
100 %100 %Costs and expenses
Cost of revenue, exclusive of dep
reciation and amortization shown separately below46 %54 %Operations and support
16 %11 %Sales and marketing
32 %27 %Research and development
20 %12 %General and administrative
24 %13 %Depreciation and amortization
5 %5 %Total costs and expenses
144 %122 %Loss from operations
(44) %(22) %Interest expense
(4) %(3) %Other income (expense), net
(15) %19 %Loss before income taxes and loss from equity me
thod investments(62) %(6) %Provision for (benefit fro
m) income taxes(2) %(3) %Loss from equity method invest
ments— %— %Net loss including non-con

Check the agent's answer:

In [10]:
print(str(response))

Uber's revenue grew by 57% in 2021. The total revenue for the year was $17.455 billion, up from $11.139 billion in 2020.
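
As a quick sanity check, the quoted growth rate can be recomputed from the two revenue figures in the answer. This only verifies that the numbers in the response are consistent with each other, not that they match the filing:

```python
# Figures in USD billions, as quoted in the agent's answer above.
revenue_2020 = 11.139
revenue_2021 = 17.455

growth = (revenue_2021 - revenue_2020) / revenue_2020
print(f"Year-over-year growth: {growth:.1%}")  # rounds to 57% at whole percent
```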


Furthermore, take a look at the references of its answer:

In [11]:
response.sources

[ToolOutput(content='page_label = 57\nfile_name = uber_2021.pdf\nfile_path = uber_2021.pdf\nThe following table sets forth the components of our consolidated statements of operations for each of the periods presented as a percentage of revenue : Year Ended December 31,\n2020\n2021 Revenue\n100 %100 %Costs and expenses\nCost of revenue, exclusive of dep\nreciation and amortization shown separately below46 %54 %Operations and support\n16 %11 %Sales and marketing\n32 %27 %Research and development\n20 %12 %General and administrative\n24 %13 %Depreciation and amortization\n5 %5 %Total costs and expenses\n144 %122 %Loss from operations\n(44) %(22) %Interest expense\n(4) %(3) %Other income (expense), net\n(15) %19 %Loss before income taxes and loss from equity me\nthod investments(62) %(6) %Provision for (benefit fro\nm) income taxes(2) %(3) %Loss from equity method invest\nments— %— %Net loss including non-controlling interests\n(61) %(3) %Less: net loss attributab\nle to non-controlling int
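
Each `content` string above starts with `key = value` metadata lines (`page_label`, `file_name`, `file_path`) followed by the chunk text. A small sketch of how those could be separated for citation purposes, assuming the layout stays as shown in this output; `split_metadata` is a hypothetical helper and the sample string is a shortened copy of the output above:

```python
from typing import Dict, Tuple


def split_metadata(content: str) -> Tuple[Dict[str, str], str]:
    """Split leading `key = value` lines from the remaining chunk text.

    Parsing stops at the first line that does not look like `key = value`
    with an identifier-like key. Hypothetical helper for illustration.
    """
    lines = content.split("\n")
    metadata: Dict[str, str] = {}
    body_start = len(lines)  # used when every line is a metadata line
    for index, line in enumerate(lines):
        key, sep, value = line.partition(" = ")
        if not sep or not key.isidentifier():
            body_start = index
            break
        metadata[key] = value
    return metadata, "\n".join(lines[body_start:])


sample = (
    "page_label = 57\n"
    "file_name = uber_2021.pdf\n"
    "file_path = uber_2021.pdf\n"
    "The following table sets forth the components of our consolidated "
    "statements of operations"
)
meta, body = split_metadata(sample)
print(f"Source: {meta['file_name']}, page {meta['page_label']}")
```

In the notebook, the same function could be applied to `source.content` for each item in `response.sources`.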

> Optional: for Milvus-Lite, stop the service at the end.

In [12]:
default_server.stop()