## Querying a Milvus index - Nomic AI Embeddings

This is a simple example of how to query content from a Milvus vector store. The embeddings used here are the fully open-source ones released by Nomic AI, [nomic-embed-text-v1](https://huggingface.co/nomic-ai/nomic-embed-text-v1).

As described in [this blog post](https://blog.nomic.ai/posts/nomic-embed-text-v1), these embeddings feature an "8192 context-length that outperforms OpenAI Ada-002 and text-embedding-3-small on both short and long context tasks". In addition, they are:

- Open source
- Open data
- Open training code
- Fully reproducible and auditable

Requirements:
- A Milvus instance, either standalone or cluster.

### Needed packages and imports

In [1]:
!pip install -q einops==0.7.0 langchain==0.1.9 pymilvus==2.3.6 sentence-transformers==2.4.0

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
langchain-text-splitters 0.3.0 requires langchain-core<0.4.0,>=0.3.0, but you have langchain-core 0.1.52 which is incompatible.
langchain-milvus 0.1.5 requires langchain-core<0.4,>=0.2.38; python_version >= "3.9", but you have langchain-core 0.1.52 which is incompatible.
langchain-milvus 0.1.5 requires pymilvus<3.0.0,>=2.4.3, but you have pymilvus 2.3.6 which is incompatible.
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip available: [0m[31;49m22.2.2[0m[39;49m -> [0m[32;49m24.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In [2]:
import os
from langchain_community.embeddings import HuggingFaceEmbeddings
from langchain_community.vectorstores import Milvus

### Base parameters, the Milvus connection info

In [6]:
# Milvus configuration
MILVUS_HOST = "vectordb-milvus.milvus.svc.cluster.local"
MILVUS_PORT = 19530
MILVUS_USERNAME = ""
MILVUS_PASSWORD = ""
MILVUS_COLLECTION = "estatutos_galicia"
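
Hard-coding the connection details is fine for a quick test. As a minimal sketch of an alternative, the same settings could be read from environment variables, falling back to the values above. The variable names (`MILVUS_HOST`, `MILVUS_PORT`, and so on) and the `load_milvus_config` helper are hypothetical and are not defined anywhere else in this notebook.

```python
import os


def load_milvus_config(env=None):
    """Build Milvus settings, letting environment variables override the defaults.

    The environment variable names are assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    return {
        "host": env.get("MILVUS_HOST", "vectordb-milvus.milvus.svc.cluster.local"),
        "port": int(env.get("MILVUS_PORT", "19530")),  # env values are strings
        "user": env.get("MILVUS_USERNAME", ""),
        "password": env.get("MILVUS_PASSWORD", ""),
        "collection": env.get("MILVUS_COLLECTION", "estatutos_galicia"),
    }


config = load_milvus_config()
print(config["host"], config["port"], config["collection"])
```

This keeps credentials out of the notebook itself, which matters once `MILVUS_USERNAME` and `MILVUS_PASSWORD` are no longer empty.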

### Initialize the connection

In [7]:
# The model runs on the default device; to use a GPU, add 'device': 'cuda' to model_kwargs
model_kwargs = {'trust_remote_code': True}
embeddings = HuggingFaceEmbeddings(
    model_name="nomic-ai/nomic-embed-text-v1",
    model_kwargs=model_kwargs,
    show_progress=True
)

store = Milvus(
    embedding_function=embeddings,
    connection_args={
        "host": MILVUS_HOST,
        "port": MILVUS_PORT,
        "user": MILVUS_USERNAME,
        "password": MILVUS_PASSWORD,
    },
    collection_name=MILVUS_COLLECTION,
    metadata_field="metadata",
    text_field="page_content",
    drop_old=False,
)

You try to use a model that was created with version 2.4.0.dev0, however, your version is 2.4.0. This might cause unexpected behavior or errors. In that case, try to update to the latest version.



<All keys matched successfully>
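
With the store initialized, a similarity search embeds the query with the same model and returns the stored chunks whose vectors are closest to it. The sketch below illustrates that idea with a brute-force cosine-similarity ranking over toy three-dimensional vectors. It is not how Milvus works internally: Milvus relies on approximate nearest-neighbour indexes, and the metric actually used depends on how the collection's index was created, which this notebook does not show. The names `cosine_similarity`, `top_k`, and `toy_docs` are made up for the illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Rank documents by similarity to the query and keep the best k."""
    scored = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)  # most similar first
    return scored[:k]


# Toy vectors standing in for real embeddings (which have hundreds of dimensions)
toy_docs = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.8, 0.6, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}

for name, score in top_k([1.0, 0.1, 0.0], toy_docs, k=2):
    print(name, round(score, 3))
```

The `k` argument plays the same role as the `k` passed to the store's search methods: it caps how many of the closest matches are returned.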


### Make a query to the index to verify sources

In [15]:
query = "By whom will the Ordinary and/or Extraordinary Assemblies be convened?"
# Each result is a (Document, score) tuple; print the source file of each match
results = store.similarity_search_with_score(query, k=2, return_metadata=True)
for doc, _score in results:
    print(doc.metadata['source'])


estatutos/Estatuto Molinos Rio de la Plata.pdf
estatutos/Estatuto TGS.pdf
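
With a larger `k`, several chunks from the same file can appear in the results, so the printed list of sources may contain repeats. A minimal sketch of collecting each source once, in first-seen order, is shown below. `FakeDocument` is a hypothetical stand-in for the LangChain `Document` class so the example runs without a Milvus instance, and the file names and scores in `sample` are made-up placeholders, not real query results.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a LangChain Document (illustration only)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def unique_sources(results):
    """Return the 'source' of each (document, score) pair once, in first-seen order."""
    seen = []
    for doc, _score in results:
        source = doc.metadata.get("source")
        if source is not None and source not in seen:
            seen.append(source)
    return seen


# Placeholder results: same shape as the (Document, score) tuples printed above
sample = [
    (FakeDocument("chunk 1", {"source": "a.pdf"}), 0.1),
    (FakeDocument("chunk 2", {"source": "b.pdf"}), 0.2),
    (FakeDocument("chunk 3", {"source": "a.pdf"}), 0.3),
]

print(unique_sources(sample))
```

The same function should work unchanged on the `results` list from the query above, since it only relies on each document exposing a `metadata` dictionary.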


### Work with a retriever

In [16]:
retriever = store.as_retriever(search_type="similarity", search_kwargs={"k": 4})

In [17]:
docs = retriever.get_relevant_documents(query)
docs


[Document(page_content='MOLINOS RIO DE LA PLATA SOCIEDAD ANONIMA  \nESTATUTO SOCIAL  \nINSCRIPCIONES EN EL REGISTRO PUBLICO DE COMERCIO  \n \nFecha:  10 de Julio de 1931   No. 146  Fo.510   Lo. 43   To.A  \nFecha:  17 de Setiembre de 1934  No. 156  Fo.270   Lo. 44   To.A  \nFecha:  21 de Octubre de 1936  No. 214  Fo.534   Lo. 44   To.A  \nFecha:  11 de Abril de 1938   No. 75   Fo.151   Lo. 45   To.A  \nFecha:  07 de Junio de 1948   No. 357  Fo.145   Lo. 48   To.A  \nFecha:  12 de Febrero de 1952   No. 83   Fo.96   Lo. 49  \n To.A \nFecha:  07 de Mayo de 1957   No. 535  Fo.365   Lo. 50   To.A  \nFecha:  24 de Octubre de 1960  No. 3.463  Fo.140   Lo. 53   To.A  \nFecha:  16 de Abril de 1971   No. 1.128  Fo.161   Lo. 74   To.A  \nFecha:  04 de Marzo de 1976   No. 354  Fo.10   Lo. 85   To.A  \nFecha:  08 de Julio de 1977   No. 2.248  Fo. --  Lo. 87   To.A  \nFecha:  23 de Diciembre de 1977  No. 4.540  Fo. --  Lo. 86   To.A  \nFecha:  07 de Setiembre de 1979  No. 2.831  Fo. --  Lo. 93   To.