# Using LlamaIndex with Milvus

Reference: https://docs.llamaindex.ai/en/stable/examples/vector_stores/MilvusIndexDemo/

In [1]:
from llama_index.core import SimpleDirectoryReader
import pprint

# load documents
documents = SimpleDirectoryReader(
    # input_dir = './data/10k/input/'
    input_dir = 'data/granite-docs/input'
).load_data()

print(f"Loaded {len(documents)} chunks")

print("Document [0].doc_id:", documents[0].doc_id)
# pprint.pprint (documents[0], indent=4)

Loaded 20 chunks
Document [0].doc_id: 82edc031-a8eb-4808-9058-cf84e4b3d4be


In [2]:
from llama_index.embeddings.huggingface import HuggingFaceEmbedding
from llama_index.core import Settings

Settings.embed_model = HuggingFaceEmbedding(
    model_name = "BAAI/bge-small-en-v1.5"
)



In [3]:
# connect to vector db
from llama_index.core import VectorStoreIndex, StorageContext
from llama_index.vector_stores.milvus import MilvusVectorStore


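# A local .db path makes Milvus Lite store the collection in that file.
# dim must match the embedding size (384 for BAAI/bge-small-en-v1.5);
# overwrite=True drops any existing collection with the same name.
vector_store = MilvusVectorStore(
    uri="./rag_llamaindex_milvus_demo.db", dim=384, overwrite=True
)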
storage_context = StorageContext.from_defaults(vector_store=vector_store)
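
The `dim=384` argument above has to agree with the length of the vectors the embedding model produces. A minimal sketch of a sanity check follows; `check_embedding_dim` and `fake_embed` are hypothetical names introduced here for illustration, and the stand-in embedder exists only so the snippet runs without downloading a model.

```python
from typing import Callable, List, Sequence


def check_embedding_dim(
    embed: Callable[[str], Sequence[float]], expected_dim: int
) -> int:
    """Embed a short probe string and verify the length of the result."""
    actual = len(embed("dimension probe"))
    if actual != expected_dim:
        raise ValueError(
            f"embedding model returns {actual}-dim vectors, "
            f"but the vector store was created with dim={expected_dim}"
        )
    return actual


# Hypothetical stand-in for a real embedding model (assumption: 384 dims).
def fake_embed(text: str) -> List[float]:
    return [0.0] * 384


print(check_embedding_dim(fake_embed, expected_dim=384))  # prints 384
```

In this notebook the same check could presumably be run against the real model by passing `Settings.embed_model.get_text_embedding` instead of `fake_embed`.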


In [4]:
%%time

# create an index: embed the documents and insert them into the Milvus collection

from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

CPU times: user 972 ms, sys: 114 ms, total: 1.09 s
Wall time: 1.56 s


In [5]:
import os
# Load settings from the .env file
from dotenv import find_dotenv, dotenv_values

config = dotenv_values(find_dotenv())

# os.environ values must be strings, so fail early if the token is missing
token = config.get("REPLICATE_API_TOKEN")
if not token:
    raise ValueError("REPLICATE_API_TOKEN is not set in the .env file")
os.environ["REPLICATE_API_TOKEN"] = token

In [6]:
%%time

# NOTE: same code as cell [4]; running it again likely inserts the
# documents into the collection a second time

from llama_index.core import VectorStoreIndex

index = VectorStoreIndex.from_documents(
    documents, storage_context=storage_context
)

CPU times: user 452 ms, sys: 11.7 ms, total: 464 ms
Wall time: 926 ms


In [7]:
# See data in vector db

from pymilvus import MilvusClient
import pprint 

client = MilvusClient('./rag_llamaindex_milvus_demo.db')
collections = client.list_collections()

print(collections)
print('---------')

# describe the first collection returned
schema = client.describe_collection(collection_name=collections[0])

pprint.pprint(schema)

['llamacollection']
---------
{'aliases': [],
 'auto_id': False,
 'collection_id': 0,
 'collection_name': 'llamacollection',
 'consistency_level': 0,
 'description': '',
 'enable_dynamic_field': True,
 'fields': [{'description': '',
             'field_id': 100,
             'is_primary': True,
             'name': 'id',
             'params': {'max_length': 65535},
             'type': <DataType.VARCHAR: 21>},
            {'description': '',
             'field_id': 101,
             'name': 'embedding',
             'params': {'dim': 384},
             'type': <DataType.FLOAT_VECTOR: 101>}],
 'num_partitions': 0,
 'num_shards': 0,
 'properties': {}}
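
The schema printed above declares only two fields, `id` and `embedding`; since `enable_dynamic_field` is `True`, the chunk text and metadata are presumably stored as dynamic fields. As a small sketch, the helper below pulls the vector dimension out of a dictionary shaped like that output. `vector_fields` is a hypothetical helper, and the `schema` literal is a hand-trimmed copy of the output above, not a live call to Milvus.

```python
def vector_fields(schema: dict) -> dict:
    """Map each field that declares a 'dim' parameter to its dimension."""
    return {
        field["name"]: field["params"]["dim"]
        for field in schema.get("fields", [])
        if "dim" in field.get("params", {})
    }


# Trimmed copy of the describe_collection output shown above.
schema = {
    "collection_name": "llamacollection",
    "enable_dynamic_field": True,
    "fields": [
        {"name": "id", "is_primary": True, "params": {"max_length": 65535}},
        {"name": "embedding", "params": {"dim": 384}},
    ],
}

print(vector_fields(schema))  # prints {'embedding': 384}
```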


In [9]:
from llama_index.llms.replicate import Replicate
from llama_index.core import Settings

llm = Replicate(
    model="meta/meta-llama-3-8b-instruct",
    temperature=0.1
)

Settings.llm = llm

In [15]:
query_engine = index.as_query_engine()
res = query_engine.query("Summarize this document for me")
print(res)



Based on the provided context information, it appears that the document is a report or paper related to the Granite Foundation Models, which are a set of pre-trained language models. The report discusses the evaluation and testing of these models, including their performance on various tasks and benchmarks.

Here is a summary of the document:

The report begins by introducing the Granite Foundation Models, which are a set of pre-trained language models designed for multilingual tasks. The models are evaluated on various benchmarks, including XL-sum, ML-sum, and XGLUE.

The report then provides an overview of the evaluation process, including the metrics used to assess the models' performance. The metrics include FiQA-Opinion and Insurance QA metrics, which are used to evaluate the models' ability to generate accurate and informative summaries.

The report also discusses the results of the evaluation, including the performance of the Granite Foundation Models on various tasks and benc

In [10]:
query_engine = index.as_query_engine()
res = query_engine.query("What was the training dataset?")
print(res)



Based on the provided context, the training dataset was IBM's curated pre-training dataset at the time of granite.13b.v2's training.


In [19]:
query_engine = index.as_query_engine()
res = query_engine.query("When was the moon landing?")
print(res)



I'm happy to help! However, I don't see any information about the moon landing in the provided context. The text appears to be about a language model called Granite and its training process, architecture, and infrastructure. There is no mention of the moon landing. If you could provide more context or clarify the query, I'd be happy to try and assist you further!
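
The last answer shows the LLM declining because the retrieved context says nothing about the moon landing. A nearest-neighbour retriever returns its top-k chunks regardless of how relevant they are, so it is the LLM that has to notice the mismatch. The sketch below illustrates this with toy 3-dimensional vectors and plain cosine similarity; the vectors, the `retrieve` function, and the `cutoff` parameter are all illustrative assumptions, not the code LlamaIndex or Milvus actually runs.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(
    query_vec: Sequence[float],
    chunks: List[Tuple[str, Sequence[float]]],
    top_k: int = 2,
    cutoff: float = 0.0,
) -> List[Tuple[float, str]]:
    """Return the top_k most similar chunks whose score is at least cutoff."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(score, text) for score, text in scored[:top_k] if score >= cutoff]


# Toy "embeddings" (made up for illustration).
chunks = [
    ("Granite training data", [1.0, 0.0, 0.0]),
    ("Granite architecture", [0.8, 0.6, 0.0]),
    ("Granite infrastructure", [0.0, 1.0, 0.0]),
]
on_topic = [0.9, 0.1, 0.0]
off_topic = [0.0, 0.0, 1.0]  # orthogonal to every chunk

print(retrieve(on_topic, chunks))               # two best-matching chunks
print(retrieve(off_topic, chunks))              # still two chunks, score 0.0
print(retrieve(off_topic, chunks, cutoff=0.5))  # prints []
```

With a cutoff, an unrelated query yields no context at all rather than weakly related chunks. LlamaIndex provides a `SimilarityPostprocessor` with a `similarity_cutoff` argument for this purpose; whether it helps here depends on the score range of the embedding model and metric in use.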
