# How to connect to Milvus from a notebook

There are several different ways to start up a Milvus server.

1. [Milvus Lite](#milvus_lite) is a local Python server that can run in Jupyter notebooks or Google Colab, requires pymilvus>=2.4.3.  
   ⛔️ Only meant for demos and local testing.
2. [Zilliz cloud free tier](#zilliz_free)
3. [Milvus standalone docker](#milvus_docker) requires [local docker](https://milvus.io/docs/install_standalone-docker.md) installed and running.
4. [LangChain](#langchain) - all [3rd party adapters](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.milvus.Milvus.html) use Milvus Lite.
5. [LlamaIndex](#llama_index) - all [3rd party adapters](https://api.python.langchain.com/en/latest/vectorstores/langchain.vectorstores.milvus.Milvus.html) use Milvus Lite.
6. Milvus kubernetes cluster requires a [K8s cluster](https://milvus.io/docs/install_cluster-milvusoperator.md) up and running.

💡 **For production workloads**, it is recommended to use Milvus local docker, kubernetes clusters, or fully-managed Milvus on Zilliz Cloud. <br>

I'll demonstrate how to connect using the [Python SDK](https://github.com/milvus-io/pymilvus/blob/master/pymilvus/milvus_client/milvus_client.py). For more details, see this [Python example](https://github.com/milvus-io/pymilvus/blob/bac31951d5c5a9dacb6632e535e3c4d284726390/examples/hello_milvus_simple.py).  

## 1. Milvus Lite  <a class="anchor" id="milvus_lite"></a>

Milvus Lite is a light Python server that can run locally.  It's ideal for getting started with Milvus, running on a laptop, in a Jupyter notebook, or on Colab. 

⛔️ Please note Milvus Lite is only meant for demos, not for production workloads.

- [github](https://github.com/milvus-io/milvus-lite)
- [documentation](https://milvus.io/docs/quickstart.md)

In [1]:
# !python -m pip install -U pymilvus
import pymilvus
print(f"pymilvus:{pymilvus.__version__}")

pymilvus:2.4.3


In [2]:
# Connect a client to the Milvus Lite server.
from pymilvus import MilvusClient
mc = MilvusClient("milvus_demo.db")

In [3]:
# Create a collection.
COLLECTION_NAME = "MilvusDocs"
EMBEDDING_DIM = 256

# Milvus Lite uses the MilvusClient object.
if mc.has_collection(COLLECTION_NAME):
    mc.drop_collection(COLLECTION_NAME)
    print(f"Successfully dropped collection: `{COLLECTION_NAME}`")

# Create a collection with flexible schema and AUTOINDEX.
mc.create_collection(COLLECTION_NAME, 
        EMBEDDING_DIM,
        consistency_level="Eventually", 
        auto_id=True,  
        overwrite=True,
    )
print(f"Successfully created collection: `{COLLECTION_NAME}`")

Successfully created collection: `MilvusDocs`


In [4]:
# Drop the collection.
mc.drop_collection(COLLECTION_NAME)
print(f"Successfully dropped collection: `{COLLECTION_NAME}`")

Successfully dropped collection: `MilvusDocs`


## 2. Zilliz free tier  <a class="anchor" id="zilliz_free"></a>

This section uses [Zilliz](https://zilliz.com), free tier.  If you have not already, sign up for a [free trial](https://cloud.zilliz.com/signup).  

If you already have a Zilliz account and want to use free tier, just be sure to select "Starter" option when you [create your cluster](https://docs.zilliz.com/docs/create-cluster).  ❤️‍🔥 **In other words, everybody gets free tier!!**  
- One free tier cluster per account.
- Per free tier cluster, up to two collections at a time. (Think of a collection like a database table. Each collection has an index, schema, and consistency-level).
- Each free tier collection can support up to 1 Million vectors (Think of this like rows in a database table).

If you have larger data, we recommend our Pay-as-you-go Serverless or Enterprise plan.  Free tier and Pay-as-you-go are Zilliz-managed AWS, Google, or Azure services.  BYOC is possible in the Enterprise plan.

### 👩 Set up instructions for Zilliz 

1. From [cloud.zilliz.com](cloud.zilliz.com), click **"+ Create Cluster"**
2. Select <i>**Starter**</i> option for the cluster and click **"Next: Create Collection"**
   <div>
   <img src="../images/zilliz_cluster_choose.png" width="60%"/>
   </div>

1. Name your collection with a <i>**Collection Name**</i> and click **"Create Collection and Cluster"**.
2. From the Clusters page, 
   - copy the cluster uri and save somewhere locally.
   - copy your cluster API KEY.  Keep this private! 
     <div>
     <img src="../images/zilliz_cluster_uri_token.png" width="80%"/>
     </div>

3. Add the API KEY to your environment variables.  See this [article for instructions](https://help.openai.com/en/articles/5112595-best-practices-for-api-key-safety) how in either Windows or Mac/Linux environment.
4. In Jupyter, you'll also need .env file (in same dir as notebooks) containing lines like this:
   - ZILLIZ_API_KEY=value
5. In your code, connect to your Zilliz cluster, see code example below.

In [5]:
import os
from pymilvus import (connections, MilvusClient, utility)
TOKEN = os.getenv("ZILLIZ_API_KEY")

# Connect to Zilliz cloud using endpoint URI and API key TOKEN.
CLUSTER_ENDPOINT="https://in03-xxxx.api.gcp-us-west1.zillizcloud.com:443"
CLUSTER_ENDPOINT="https://in03-8bc9fd463236b1a.api.gcp-us-west1.zillizcloud.com:443"

connections.connect(
  alias='default',
  uri=CLUSTER_ENDPOINT,
  token=TOKEN,
)

# Check if the server is ready and get collection name.
print(f"Type of server: {utility.get_server_version()}")

Type of server: Zilliz Cloud Vector Database(Compatible with Milvus 2.4)


In [6]:
COLLECTION_NAME = "movies"
EMBEDDING_DIM = 256

# Use no-schema Milvus client uses flexible json key:value format.
# https://milvus.io/docs/using_milvusclient.md
mc = MilvusClient(
    uri=CLUSTER_ENDPOINT,
    token=TOKEN)

# Check if collection already exists, if so drop it.
has = utility.has_collection(COLLECTION_NAME)
if has:
    drop_result = utility.drop_collection(COLLECTION_NAME)
    print(f"Successfully dropped collection: `{COLLECTION_NAME}`")

# Create a collection with flexible schema and AUTOINDEX.
mc.create_collection(COLLECTION_NAME, 
                     EMBEDDING_DIM,
                     consistency_level="Eventually", 
                     auto_id=True,  
                     overwrite=True,
                    )
print(f"Successfully created collection: `{COLLECTION_NAME}`")

Successfully created collection: `movies`


In [7]:
# Drop collection
utility.drop_collection(COLLECTION_NAME)

# Disconnect from the server.
try:
  connections.disconnect(alias="default")
  print("Successfully disconnected from the server.")
except:
  pass

Successfully disconnected from the server.


## 3. Milvus standalone Docker <a class="anchor" id="milvus_docker"></a>

This section uses [Milvus standalone](https://milvus.io/docs/configure-docker.md) on Docker. <br>
>⛔️ Make sure you pip install the correct version of pymilvus and server yml file.  **Versions (major and minor) should all match**.

1. [Install Docker](https://docs.docker.com/get-docker/)
2. Start your Docker Desktop
3. Download the latest [docker-compose.yml](https://milvus.io/docs/install_standalone-docker.md#Download-the-YAML-file) (or run the wget command, replacing version to what you are using)
> wget https://github.com/milvus-io/milvus/releases/download/v2.4.0-rc.1/milvus-standalone-docker-compose.yml -O docker-compose.yml
4. From your terminal:  
   - cd into directory where you saved the .yml file (usualy same dir as this notebook)
   - docker compose up -d
   - verify (either in terminal or on Docker Desktop) the containers are running
5. From your code (see notebook code below):
   - Import milvus
   - Connect to the local milvus server

In [8]:
# !pip install -U pymilvus

In [9]:
import pymilvus, time
from pymilvus import (connections, MilvusClient, utility)
print(f"Pymilvus: {pymilvus.__version__}")

Pymilvus: 2.4.3


In [10]:
####################################################################################################
# Connect to local server running in Docker container.
# Download the latest .yaml file: https://milvus.io/docs/install_standalone-docker.md
# Or, download directly from milvus github (replace with desired version):
# !wget https://github.com/milvus-io/milvus/releases/download/v2.4.0-rc.1/milvus-standalone-docker-compose.yml -O docker-compose.yml
####################################################################################################

# Start Milvus standalone on docker, running quietly in the background.
# !docker compose up -d

# # Verify which local port the Milvus server is listening on
# !docker ps -a #19530/tcp

# Connect to the local server.
connection = connections.connect(
  alias="default", 
  host='localhost', # or '0.0.0.0' or 'localhost'
  port='19530'
)

# Get server version.
print(utility.get_server_version())

# Use no-schema Milvus client uses flexible json key:value format.
mc = MilvusClient(connections=connection)

v2.4.1


In [11]:
COLLECTION_NAME = "movies"
EMBEDDING_DIM = 256

# Check if collection already exists, if so drop it.
has = utility.has_collection(COLLECTION_NAME)
if has:
    drop_result = utility.drop_collection(COLLECTION_NAME)
    print(f"Successfully dropped collection: `{COLLECTION_NAME}`")

# Create a collection with flexible schema and AUTOINDEX.
mc.create_collection(
        COLLECTION_NAME, 
        EMBEDDING_DIM, 
        consistency_level="Eventually", 
        auto_id=True,  
        overwrite=True,
        )
print(f"Created collection: {COLLECTION_NAME}")

Successfully dropped collection: `movies`
Created collection: movies


In [12]:
# Stop local milvus.
!docker compose down

# Disconnect from the server.
try:
  connections.disconnect(alias="default")
  print("Successfully disconnected from the server.")
except:
  pass

[33mWARN[0m[0000] /Users/christy/Documents/bootcamp_scratch/bootcamp/docker-compose.yml: `version` is obsolete 
Successfully disconnected from the server.


## LangChain <a class="anchor" id="langchain"></a>

All 3rd party adapters use [Milvus Lite](https://milvus.io/docs/quickstart.md).  

LangChain APIs hide a lot of the steps to convert raw unstructured data into vectors and store the vectors in Milvus.
- [LangChain docs](https://python.langchain.com/v0.2/docs/integrations/vectorstores/milvus/)
- [Milvus docs](https://milvus.io/docs/integrate_with_langchain.md)

LangChain default values:
- collection_name: LangChainCollection
- schema: ['pk', 'source', 'text', 'vector']
- auto_id: True
- {'index_type': 'HNSW',
 'metric_type': 'L2',
 'params': {'M': 8, 'efConstruction': 64}}
- consistency_level: 'Session'
- overwrite: False

In [13]:
# !python -m pip install -U langchain_community unstructured langchain-milvus langchain-huggingface

In [14]:
# UNCOMMENT TO READ WEB DOCS FROM A LOCAL DIRECTORY.

# Read docs into LangChain
from langchain.document_loaders import DirectoryLoader

# Load HTML files from a local directory
path = "RAG/rtdocs_new/"
loader = DirectoryLoader(path, glob='*.html')
docs = loader.load()

num_documents = len(docs)
print(f"loaded {num_documents} documents")

loaded 22 documents


In [16]:
# Inspect the first document.
import pprint
print(f"length doc: {len(docs[0].page_content)}")
pprint.pprint(docs[0].page_content.replace('\n', '')[:100])

length doc: 11016
('Why MilvusDocsTutorialsToolsBlogCommunityStars0Try Managed Milvus '
 'FREESearchHomev2.4.xAbout MilvusGe')


In [17]:
from langchain_milvus import Milvus
from langchain_huggingface import HuggingFaceEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
import time, pprint

# Define the embedding model.
model_name = "BAAI/bge-large-en-v1.5"
model_kwargs = {'device': 'cpu'}
encode_kwargs = {'normalize_embeddings': True}
embed_model = HuggingFaceEmbeddings(
    model_name=model_name,
    model_kwargs=model_kwargs,
    encode_kwargs=encode_kwargs
)
EMBEDDING_DIM = embed_model.dict()['client'].get_sentence_embedding_dimension()
print(f"EMBEDDING_DIM: {EMBEDDING_DIM}")

# Chunking
text_splitter = RecursiveCharacterTextSplitter(chunk_size=512, chunk_overlap=51)

# Create a Milvus collection from the documents and embeddings.
start_time = time.time()
docs = text_splitter.split_documents(docs)
vectorstore = Milvus.from_documents(
    documents=docs,
    embedding=embed_model,
    connection_args={
        "uri": "./milvus_demo.db"},
    # Override LangChain default values for Milvus.
    consistency_level="Eventually",
    drop_old=True,
    index_params = {
        "metric_type": "COSINE",
        "index_type": "AUTOINDEX",
        "params": {},}
)
end_time = time.time()
print(f"Created Milvus collection from {len(docs)} docs in {end_time - start_time:.2f} seconds")



EMBEDDING_DIM: 1024
Created Milvus collection from 427 docs in 33.14 seconds


In [18]:
# Describe the collection.
print(f"collection_name: {vectorstore.collection_name}")
print(f"schema: {vectorstore.fields}")
print(f"auto_id: {vectorstore.auto_id}")
pprint.pprint(vectorstore.index_params)
pprint.pprint(f"consistency: {vectorstore.consistency_level}")
vectorstore.drop_old = True
pprint.pprint(f"drop_old: {vectorstore.drop_old}")

collection_name: LangChainCollection
schema: ['source', 'text', 'pk', 'vector']
auto_id: True
{'index_type': 'AUTOINDEX', 'metric_type': 'COSINE', 'params': {}}
'consistency: Eventually'
'drop_old: True'


In [19]:
# Delete the Milvus collection.
del vectorstore

## LlamaIndex <a class="anchor" id="llama_index"></a>

All 3rd party adapters use [Milvus Lite](https://milvus.io/docs/quickstart.md).  

LlamaIndex APIs hide a lot of the steps to convert raw unstructured data into vectors and store the vectors in Milvus.
- [LlamaIndex docs](https://docs.llamaindex.ai/en/latest/examples/vector_stores/MilvusIndexDemo/)
- [Milvus docs](https://milvus.io/docs/integrate_with_llamaindex.md)

LlamaIndex default values:
- collection_name: llamacollection
- schema: ['doc_id', 'embedding']
- auto_id: True
- {'index_type': 'None',
 'metric_type': 'IP',
- consistency_level: 'Strong'
- overwrite: False

In [20]:
# !python -m pip install -U --no-cache-dir llama-index llama-index-embeddings-huggingface llama-index-vector-stores-milvus

In [33]:
# UNCOMMENT TO READ WEB DOCS FROM A LOCAL DIRECTORY.

# Read docs into LlamaIndex
from llama_index.core import SimpleDirectoryReader

# Load HTML files from a local directory
# https://docs.llamaindex.ai/en/stable/api_reference/readers/simple_directory_reader
# Supposed to automatically parse files based on their extension.
path = "RAG/rtdocs_new/"
loader = SimpleDirectoryReader(
        input_dir=path, 
        required_exts=[".html"],
        recursive=True # Recursively search subdirectories
    )
lli_docs = loader.load_data()

num_documents = len(lli_docs)
print(f"loaded {num_documents} documents")

loaded 22 documents


In [34]:
# Inspect the first document.
import pprint

# html docs were not parsed by SimpleDirectoryReader.
print(f"length doc: {len(lli_docs[0].text)}")
pprint.pprint(lli_docs[0].text[:100])

length doc: 663373
('<!DOCTYPE html><html lang="en"><head><meta charSet="utf-8"/><meta '
 'http-equiv="x-ua-compatible" conte')


In [35]:
from llama_index.core import (
    Settings,
    ServiceContext,
    StorageContext,
    VectorStoreIndex,
)
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.vector_stores.milvus import MilvusVectorStore
import time, pprint

# Define the embedding model.
service_context = ServiceContext.from_defaults(
    # LlamaIndex local: translates to the same location as default HF cache.
    embed_model="local:BAAI/bge-large-en-v1.5",
)
# Display what LlamaIndex exposes.
print("Embedding model:")
temp = service_context.to_dict()
pprint.pprint(temp['embed_model'])
print()
# LlamaIndex hides this but we need it to create the vector store!
EMBEDDING_DIM = 1024

# Create a Milvus collection from the documents and embeddings.
vectorstore = MilvusVectorStore(
    uri="./milvus_llamaindex.db",
    dim=EMBEDDING_DIM,
    # Override LlamaIndex default values for Milvus.
    consistency_level="Eventually",
    drop_old=True,
    index_params = {
        "metric_type": "COSINE",
        "index_type": "AUTOINDEX",
        "params": {},}
)
storage_context = StorageContext.from_defaults(
    vector_store=vectorstore
)

print(f"Start chunking, embedding, inserting...")
start_time = time.time()
llamaindex = VectorStoreIndex.from_documents(
    # Too slow!  Just use one document.
    lli_docs[:1], 
    storage_context=storage_context, 
    service_context=service_context
)
end_time = time.time()
print(f"Created LlamaIndex collection from {len(lli_docs[:1])} docs in {end_time - start_time:.2f} seconds")
# Created LlamaIndex collection from 1 docs in 106.32 seconds

  service_context = ServiceContext.from_defaults(


Embedding model:
{'cache_folder': None,
 'class_name': 'HuggingFaceEmbedding',
 'embed_batch_size': 10,
 'max_length': 512,
 'model_name': 'BAAI/bge-large-en-v1.5',
 'normalize': True,
 'num_workers': None,
 'query_instruction': None,
 'text_instruction': None}

Start chunking, embedding, inserting...
Created LlamaIndex collection from 1 docs in 101.56 seconds


In [36]:
# Describe the collection, it looks like the Milvus overrides did not all work.
temp = llamaindex.storage_context.vector_store.to_dict()
first_15_keys = list(temp.keys())[:15]
for key in first_15_keys:
    print(f"{key}: {temp[key]}")

stores_text: True
is_embedding_query: True
stores_node: True
uri: ./milvus_llamaindex.db
token: 
collection_name: llamacollection
dim: 1024
embedding_field: embedding
doc_id_field: doc_id
similarity_metric: IP
consistency_level: Eventually
overwrite: False
text_key: None
output_fields: []
index_config: {}


In [37]:
# Delete the Milvus collection.
del llamaindex

In [38]:
# Props to Sebastian Raschka for this handy watermark.
# !pip install watermark

%load_ext watermark
%watermark -a 'Christy Bergman' -v -p pymilvus,llama_index,langchain,unstructured --conda

The watermark extension is already loaded. To reload it, use:
  %reload_ext watermark
Author: Christy Bergman

Python implementation: CPython
Python version       : 3.11.8
IPython version      : 8.22.2

pymilvus    : 2.4.3
llama_index : 0.10.44
langchain   : 0.2.2
unstructured: 0.14.4

conda environment: py311-unum



In [39]:
# Check all llamaindex packages info, make sure they latest.
!pip list | grep llama-index

llama-index                             0.10.44
llama-index-agent-openai                0.2.7
llama-index-cli                         0.1.12
llama-index-core                        0.10.44
llama-index-embeddings-huggingface      0.2.1
llama-index-embeddings-openai           0.1.10
llama-index-indices-managed-llama-cloud 0.1.6
llama-index-legacy                      0.9.48
llama-index-llms-ollama                 0.1.5
llama-index-llms-openai                 0.1.22
llama-index-multi-modal-llms-openai     0.1.6
llama-index-program-openai              0.1.6
llama-index-question-gen-openai         0.1.3
llama-index-readers-file                0.1.23
llama-index-readers-llama-parse         0.1.4
llama-index-vector-stores-milvus        0.1.17
