# Querying & Searching

Milvus supports three kinds of retrieval: vector similarity search, queries on scalar fields, and hybrid search, which combines a vector search with a filter on scalar fields.

Searches and queries are usually performed on indexed collections, so we need to build an index before we search.
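As a rough sketch of that index-building step, the helper below calls `create_index` on a collection. The helper name `build_vector_index`, the ANNOY index type, and the `n_trees` value are assumptions made for illustration: ANNOY is picked only because the searches in this notebook pass the ANNOY-style `search_k` parameter, and the index types available depend on the Milvus version in use.

```python
# Assumed index settings, not taken from the original notebook.
# ANNOY matches the "search_k" search parameter used later; n_trees=8 is illustrative.
INDEX_PARAMS = {
    "index_type": "ANNOY",
    "metric_type": "L2",
    "params": {"n_trees": 8},
}


def build_vector_index(collection, field_name="song_vec", index_params=None):
    """Build an index on a vector field and return the parameters that were used.

    `collection` is expected to be a pymilvus Collection, or any object that
    exposes a compatible create_index(field_name=..., index_params=...) method.
    """
    params = index_params or INDEX_PARAMS
    collection.create_index(field_name=field_name, index_params=params)
    return params
```

With a live connection this would be called as `build_vector_index(Collection("Album1"))` before loading the collection.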

In [1]:
from pymilvus import Collection, FieldSchema, CollectionSchema, DataType, connections, utility

connections.connect(
  alias="default",
  host='localhost',
  port='19530'
)

collection = Collection("Album1")
collection.compact()
utility.list_collections()

['Album1']

Before we run a search or a query, the collection has to be loaded into memory.

In [2]:
# Load the collection into memory
collection.load(replica_number=1)

## Vector Search

In [6]:
# Vector similarity search

results = collection.search(
    data=[[0.1, 0.2]],  # query vector(s); similar vectors are looked up for each one
    anns_field="song_vec",  # the vector field to run the approximate nearest neighbor search on
    param={"metric_type": "L2", "params": {"search_k": 64}},  # search parameters depend on the collection's index type (search_k applies to ANNOY)
    limit=5,  # maximum number of hits to return per query vector
    expr=None,  # scalar filter expression; only needed for hybrid search
    output_fields=['song_name']  # collection fields to include in each hit
)

Each hit in a vector search result has three parts: the `id`, which is the entity's primary key; the `distance` between the query vector and the hit, computed with the metric passed to the search; and the `entity`, which holds the values of the requested output fields.

In [7]:
for result in results[0]:
    print(result)

id: 0, distance: 0.0369926393032074, entity: {'song_name': 'IGKXDNX'}
id: 4, distance: 0.04556524008512497, entity: {'song_name': 'OPPCDQT'}
id: 2, distance: 0.06613810360431671, entity: {'song_name': 'WNVBFQR'}
id: 3, distance: 0.11746208369731903, entity: {'song_name': 'JARQJZL'}
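To make the `distance` values more concrete, here is a minimal brute-force sketch of an L2 search in plain Python. It assumes that the L2 metric is reported as the squared Euclidean distance (to our understanding Milvus skips the square root, but treat that as an assumption). The function names and the toy vectors are invented for illustration and are not the data stored in `Album1`; a real index finds approximate neighbors far more efficiently.

```python
def squared_l2(a, b):
    """Sum of squared component differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_search(query, vectors, limit):
    """Return (id, distance) pairs for the `limit` closest vectors, nearest first."""
    scored = [(vec_id, squared_l2(query, vec)) for vec_id, vec in vectors.items()]
    scored.sort(key=lambda pair: pair[1])
    return scored[:limit]


# Toy data invented for illustration; these are not the vectors stored in Album1.
toy_vectors = {0: [0.0, 0.0], 1: [1.0, 1.0], 2: [0.0, 1.0]}
hits = brute_force_search([0.0, 0.25], toy_vectors, limit=2)
print(hits)  # [(0, 0.0625), (2, 0.5625)]
```

As in the output above, smaller distances come first, because a smaller L2 distance means a closer match.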


## Querying Scalar Fields

In [8]:
# Query entities by a scalar field

query_res = collection.query(
    expr="song_id in [1,2]",  # boolean expression used to filter entities on scalar fields
    limit=10,  # maximum number of entities in the result set
    output_fields=["song_name", "listen_count"]  # fields to include in the result set
)

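We queried for the entities whose song ID is 1 or 2, and the result set contains exactly the two entities that match this expression. Note that the primary key, `song_id`, appears in each result even though it was not listed in `output_fields`.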

In [9]:
for result in query_res:
    print(result)

{'song_name': 'LYKKPXH', 'listen_count': 97430, 'song_id': 1}
{'song_name': 'MCNMGLZ', 'listen_count': 23383, 'song_id': 2}
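Filter expressions are plain strings, so they can be assembled from Python values. The helpers below are a small sketch of that idea; `in_expr` and `and_expr` are hypothetical names that are not part of pymilvus, and the sketch only covers integer values such as the `song_id` field used above.

```python
def in_expr(field, values):
    """Build an expression such as 'song_id in [1, 2]' from integer values."""
    items = ", ".join(str(int(v)) for v in values)
    return f"{field} in [{items}]"


def and_expr(*clauses):
    """Join clauses with 'and', wrapping each one in parentheses."""
    return " and ".join(f"({clause})" for clause in clauses)


expr = and_expr(in_expr("song_id", [1, 2]), "listen_count <= 100000")
print(expr)  # (song_id in [1, 2]) and (listen_count <= 100000)
```

The resulting string can be passed as the `expr` argument of `collection.query` or `collection.search`.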


## Hybrid Search

In [10]:
# Hybrid search: a vector search restricted by a scalar filter

hybrid_res = collection.search(
    data=[[0.1, 0.2]],
    anns_field="song_vec",
    param={"metric_type": "L2", "params": {"search_k": 64}},
    limit=5,
    expr="listen_count <= 100000",  # only entities matching this filter can be returned
    output_fields=['song_name']
)

for result in hybrid_res[0]:
    print(result)

id: 0, distance: 0.0369926393032074, entity: {'song_name': 'IGKXDNX'}
id: 4, distance: 0.04556524008512497, entity: {'song_name': 'OPPCDQT'}
id: 2, distance: 0.06613810360431671, entity: {'song_name': 'WNVBFQR'}
id: 3, distance: 0.11746208369731903, entity: {'song_name': 'JARQJZL'}
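The output above is identical to the unfiltered vector search, so the effect of the filter is not visible in this run. The sketch below illustrates the intended semantics on toy data: only entities that pass the scalar filter are ranked by distance. It is a conceptual model in plain Python, not how Milvus implements filtering internally, and the `filtered_search` helper and the toy entities are invented for illustration.

```python
def filtered_search(query, entities, predicate, limit):
    """Rank only the entities that pass `predicate`, nearest first (squared L2)."""
    candidates = [e for e in entities if predicate(e)]
    candidates.sort(
        key=lambda e: sum((q - v) ** 2 for q, v in zip(query, e["song_vec"]))
    )
    return [e["song_id"] for e in candidates[:limit]]


# Toy entities invented for illustration; not the contents of Album1.
toy_entities = [
    {"song_id": 0, "listen_count": 500, "song_vec": [0.1, 0.2]},
    {"song_id": 1, "listen_count": 250000, "song_vec": [0.1, 0.25]},
    {"song_id": 2, "listen_count": 90000, "song_vec": [0.5, 0.5]},
]

unfiltered_ids = filtered_search([0.1, 0.2], toy_entities, lambda e: True, limit=2)
ids = filtered_search(
    [0.1, 0.2], toy_entities, lambda e: e["listen_count"] <= 100000, limit=2
)
print(unfiltered_ids)  # [0, 1]
print(ids)  # [0, 2]
```

Entity 1 is the second-closest vector, but it drops out once the `listen_count` filter is applied, and the next closest matching entity takes its place.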


Finally, once we are done searching, we can release the collection from memory.

In [11]:
# Release the collection from memory
collection.release()
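Since every `load` should eventually be paired with a `release`, one option is to wrap the pair in a context manager so the release also happens when a search raises an exception. This is a sketch only; the `loaded` helper is a hypothetical name and not part of pymilvus.

```python
from contextlib import contextmanager


@contextmanager
def loaded(collection, replica_number=1):
    """Load a collection for the duration of a with-block, then release it."""
    collection.load(replica_number=replica_number)
    try:
        yield collection
    finally:
        # Runs even if a search or query inside the block raises.
        collection.release()
```

With a live connection, a search could then be written as `with loaded(Collection("Album1")) as collection:` followed by the `collection.search(...)` call inside the block.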