# Build a RAG application with Milvus Lite, Mistral and Llama-index

This notebook shows how to build a Retrieval Augmented Generation (RAG) application that answers questions about data from the French Parliament. It uses Ollama with Mistral for LLM operations, Llama-index for orchestration, and [Milvus](https://milvus.io/) for vector storage.


## Install Ollama

Make sure Ollama is installed and running on your laptop (see https://ollama.com/), and that the `mistral` model used later in this notebook has been pulled with `ollama pull mistral`.
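Before going further, it can help to confirm that the Ollama server actually answers. The snippet below is a minimal sketch using only the standard library. It assumes Ollama listens on its default address `http://localhost:11434`; the helper name `ollama_is_running` is ours, not part of any library.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at base_url with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, HTTP error, ...
        return False


print("Ollama reachable:", ollama_is_running())
```

If this prints `False`, start Ollama before running the cells that use the local model.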

### Install the dependencies

In [10]:
!pip install pymilvus==2.5.0 ollama==0.4.4 llama-index-llms-ollama==0.5.0 llama-index-vector-stores-milvus==0.3.0 llama-index-readers-file==0.4.1 llama-index-embeddings-mistralai==0.3.0 llama-index-llms-mistralai==0.3.0

Collecting llama-index-llms-mistralai==0.3.0
  Downloading llama_index_llms_mistralai-0.3.0-py3-none-any.whl.metadata (3.5 kB)
Downloading llama_index_llms_mistralai-0.3.0-py3-none-any.whl (6.8 kB)
Installing collected packages: llama-index-llms-mistralai
Successfully installed llama-index-llms-mistralai-0.3.0


### Use Mistral Embedding

Make sure to create an [API Key](https://console.mistral.ai/api-keys/) on Mistral's platform and load it as an environment variable.

In this tutorial, we load the environment variable from a `.env` file using `python-dotenv`.

In [1]:
from dotenv import load_dotenv
import os
load_dotenv()

MISTRAL_API_KEY = os.environ.get("MISTRAL_API_KEY")
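`os.environ.get` returns `None` when the variable is missing, which only surfaces later as an authentication error. As a hedged alternative, the sketch below fails early with a readable message. The helper `require_env` is a hypothetical name introduced here for illustration.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is missing or empty."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(
            f"{name} is not set; add it to your .env file or export it in your shell."
        )
    return value
```

After `load_dotenv()`, you could then write `MISTRAL_API_KEY = require_env("MISTRAL_API_KEY")`.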

In [2]:
from llama_index.embeddings.mistralai import MistralAIEmbedding

model_name = "mistral-embed"
embed_model = MistralAIEmbedding(model_name=model_name, api_key=MISTRAL_API_KEY)

### Prepare our data to be stored in Milvus

The following code sets up Mistral Embed to compute text embeddings, Mistral-7B (served by Ollama) to generate answers, and Milvus to store the embeddings.

**Make sure Ollama is running on your laptop!**

* LLM: Initialises the Mistral-7B model using Ollama
* Settings: Configures the global Llama-index `Settings` with Mistral, the embedding model defined above, and the chunking parameters
* Vector Store: Sets up a collection in Milvus to store text embeddings, specifying the database file, collection name, and vector dimensions
* Storage Context: Configures a storage context with the Milvus vector store

This setup allows efficient storage and retrieval of vector embeddings for text data.
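To give an intuition for what retrieving embeddings means, here is a brute-force sketch in plain Python that ranks stored vectors by cosine similarity to a query vector. It is an illustration only: the toy 3-dimensional vectors and the names `cosine_similarity` and `top_k` are made up for this example, and Milvus relies on its own indexes and on whichever similarity metric the collection is configured with, not on this loop.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if one is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, stored, k=3):
    """stored maps an id to a vector. Return the k best (id, score) pairs, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in stored.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy data: real mistral-embed vectors have 1024 dimensions, not 3.
toy_store = {
    "health_pass": [0.9, 0.1, 0.0],
    "budget": [0.0, 1.0, 0.0],
    "agriculture": [0.0, 0.2, 0.9],
}
print(top_k([1.0, 0.2, 0.0], toy_store, k=2))
```

The chunk whose vector points in nearly the same direction as the query comes first, which is the same idea behind `similarity_top_k` in a retriever.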

In [4]:
from llama_index.llms.ollama import Ollama
from llama_index.vector_stores.milvus import MilvusVectorStore

from llama_index.core import StorageContext, Settings

# Mistral-7B served locally by Ollama
llm = Ollama(model="mistral", request_timeout=120.0)

Settings.llm = llm
Settings.embed_model = embed_model
Settings.chunk_size = 350
Settings.chunk_overlap = 20

vector_store = MilvusVectorStore(
    uri="milvus_mistral_rag.db",  # local Milvus Lite database file
    collection_name="mistral_french_parliament",
    dim=1024,  # must match the embedding size of mistral-embed
    overwrite=True,  # drop the collection if it exists, then recreate it
)
storage_context = StorageContext.from_defaults(vector_store=vector_store)
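The `chunk_size` and `chunk_overlap` settings control how documents are cut before embedding. The sketch below shows the basic idea with a plain sliding window over a list of tokens. It is a simplification under stated assumptions: Llama-index's own splitter also takes sentence boundaries and a real tokenizer into account, and `chunk_with_overlap` is a hypothetical helper, not a library function.

```python
def chunk_with_overlap(tokens, chunk_size=350, chunk_overlap=20):
    """Split tokens into windows of chunk_size that share chunk_overlap items with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


# Whitespace-separated words stand in for tokens here.
words = ("mot " * 1000).split()
chunks = chunk_with_overlap(words)
print(len(chunks), [len(c) for c in chunks])
```

A small overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing a few tokens twice.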

### Using Mistral AI API

If you prefer not to run models locally or need more powerful models, you can use Mistral's API instead of Ollama. The API offers:
- Access to more powerful models like `mistral-large` and `mistral-small`
- No local GPU/CPU requirements
- Consistent performance and reliability
- Production-ready deployment

Make sure to create an [API Key](https://console.mistral.ai/api-keys/) on Mistral's platform first.
```python
from llama_index.llms.mistralai import MistralAI

# Initialize the Mistral LLM through the hosted API
mistral_llm = MistralAI(api_key=MISTRAL_API_KEY, model="mistral-small-latest")

# Use it as the default LLM instead of Ollama
Settings.llm = mistral_llm
```

The rest of the setup, including Milvus, stays the same.

### Process and load the data

In [6]:
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

docs = SimpleDirectoryReader(input_files=['data/french_parliament_discussion.xml']).load_data()
vector_index = VectorStoreIndex.from_documents(docs, storage_context=storage_context)

In [7]:
from llama_index.core.tools import RetrieverTool, ToolMetadata

# Wrap the Milvus-backed retriever as a tool (no OpenAI involved)
milvus_retriever_tool = RetrieverTool(
    retriever=vector_index.as_retriever(similarity_top_k=3),  # retrieve the 3 most similar chunks
    metadata=ToolMetadata(
        name="CustomRetriever",
        description="Retrieve relevant information from provided documents.",
    ),
)

### Finally, ask questions to our RAG system

In [8]:
query_engine = vector_index.as_query_engine()
response = query_engine.query("What did the French parliament talk about the last time?")
print(response)

 In the most recent session discussed in the provided context, the French Parliament talked about the use of the health pass (passe sanitaire). Specifically, there was a debate about the method chosen by the government to implement this pass and concerns were raised about its effectiveness, democracy, and proportionality.
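To check which chunks an answer was based on, you can look at the retrieved nodes attached to the response. The helper below is a minimal sketch. It assumes each item exposes a `score` attribute and a `get_content()` method, as the items in `response.source_nodes` should in Llama-index. The name `summarize_sources` is ours.

```python
def summarize_sources(source_nodes, max_chars=80):
    """Return one short line per retrieved chunk: rank, similarity score, start of the text."""
    lines = []
    for rank, hit in enumerate(source_nodes, start=1):
        # Collapse newlines and repeated spaces so each chunk fits on one line.
        text = " ".join(hit.get_content().split())
        score = hit.score
        score_str = f"{score:.3f}" if score is not None else "n/a"
        lines.append(f"{rank}. score={score_str} {text[:max_chars]}")
    return lines
```

With the query above, `for line in summarize_sources(response.source_nodes): print(line)` would list the chunks that were passed to the LLM as context.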


---

#### If you like this tutorial, feel free to reach out on [LinkedIn](https://www.linkedin.com/in/stephen-batifol/), check out [Milvus](https://github.com/milvus-io/milvus) and join our [Discord](https://discord.gg/FG6hMJStWu).