# Build RAG with Hugging Face, Milvus and Mistral

## Preparation
### Dependencies and Environment

In [26]:
! pip install --upgrade pymilvus sentence-transformers huggingface-hub langchain_community langchain-text-splitters pypdf tqdm



In [46]:
!pip install flask pyngrok
!pip install slack_sdk



In [27]:
import os

os.environ["HF_TOKEN"] = "hf_..."  # Replace with your own Hugging Face access token; do not commit real tokens

We use the [`PyPDFLoader`](https://python.langchain.com/v0.1/docs/modules/data_connection/document_loaders/pdf/) from LangChain to extract the text from the PDF, and then split the text into smaller chunks. We set the chunk size to 1000 and the overlap to 200, so each chunk holds at most about 1000 characters and consecutive chunks share up to 200 characters.

In [28]:
from langchain_community.document_loaders import PyPDFLoader

loader = PyPDFLoader("GDPR.pdf")
docs = loader.load()
print(len(docs))

88


In [29]:
from langchain_text_splitters import RecursiveCharacterTextSplitter

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
chunks = text_splitter.split_documents(docs)

In [30]:
text_lines = [chunk.page_content for chunk in chunks]
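
To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-window chunking in plain Python. It is a simplification: `RecursiveCharacterTextSplitter` also tries to break on separators such as paragraphs and sentences, so its chunks are usually shorter than the limit. The helper name `sliding_window_chunks` is hypothetical and not part of LangChain.

```python
def sliding_window_chunks(text, chunk_size=1000, chunk_overlap=200):
    """Split text into fixed-size windows; consecutive windows share chunk_overlap characters.

    Simplified illustration only: it ignores separators, unlike LangChain's splitter.
    """
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
    return chunks


# A 2500-character sample string made of repeating digits
sample = "".join(str(i % 10) for i in range(2500))
chunks = sliding_window_chunks(sample)
print([len(c) for c in chunks])
# The tail of one chunk is repeated at the head of the next
print(chunks[0][-200:] == chunks[1][:200])
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some text twice.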

### Prepare the Embedding Model

In [31]:
from sentence_transformers import SentenceTransformer

embedding_model = SentenceTransformer("BAAI/bge-small-en-v1.5")

def emb_text(text):
    return embedding_model.encode([text], normalize_embeddings=True).tolist()[0]

Generate a test embedding and print its dimension and first few elements.

In [32]:
test_embedding = emb_text("This is a test")
embedding_dim = len(test_embedding)
print(embedding_dim)
print(test_embedding[:10])

384
[-0.07660680264234543, 0.025316733866930008, 0.012505539692938328, 0.004595177713781595, 0.025780005380511284, 0.038167089223861694, 0.08050810545682907, 0.00303537561558187, 0.02439219132065773, 0.004880349617451429]


## Load data into Milvus

### Create the Collection

In [33]:
from pymilvus import MilvusClient

milvus_client = MilvusClient(uri="./hf_milvus_demo.db")

collection_name = "rag_collection"

In [34]:
if milvus_client.has_collection(collection_name):
    milvus_client.drop_collection(collection_name)

In [35]:
milvus_client.create_collection(
    collection_name=collection_name,
    dimension=embedding_dim,
    metric_type="IP",  # Inner product distance
    consistency_level="Strong",  # Strong consistency level
)
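
The collection uses the inner product (`IP`) metric, while `emb_text` requests normalized embeddings. For unit-length vectors the inner product equals cosine similarity, which is why the two settings fit together. The sketch below checks this on toy two-dimensional vectors in plain Python; the helper names are illustrative and not part of pymilvus or sentence-transformers.

```python
import math


def inner_product(a, b):
    """Sum of element-wise products of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(inner_product(vec, vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors of any length."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


a = [3.0, 4.0]
b = [4.0, 3.0]

ip_of_normalized = inner_product(normalize(a), normalize(b))
cosine = cosine_similarity(a, b)

# Both values agree up to floating-point rounding
print(ip_of_normalized, cosine)
```

Without normalization, the inner product would also reward long vectors, so the ranking could differ from a cosine ranking.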

### Insert data

In [36]:
from tqdm import tqdm

data = []

for i, line in enumerate(tqdm(text_lines, desc="Creating embeddings")):
    data.append({"id": i, "vector": emb_text(line), "text": line})

insert_res = milvus_client.insert(collection_name=collection_name, data=data)
insert_res["insert_count"]

Creating embeddings: 100%|██████████| 486/486 [02:18<00:00,  3.51it/s]


486

## Build RAG

In [37]:
question = "A data breach occurred 4 days ago. Can I inform the customers about it in a couple more days?"

In [38]:
search_res = milvus_client.search(
    collection_name=collection_name,
    data=[
        emb_text(question)
    ],
    limit=3,
    search_params={"metric_type": "IP", "params": {}},
    output_fields=["text"],
)

In [39]:
import json

retrieved_lines_with_distances = [
    (res["entity"]["text"], res["distance"]) for res in search_res[0]
]
print(json.dumps(retrieved_lines_with_distances, indent=4))

[
    [
        "awar e that a personal data breach has occur red, the controller should notify the personal data breac h to the \nsuper visor y author ity without undue dela y and, where feasible, not later than 72 hours af ter hav ing become \nawar e of it, unless the controller is able to demonstrate, in accordance with the accountability pr inciple, that the \npersonal data breac h is unlikely to result in a r isk to the r ights and freedoms of natural persons. Where such \nnotification cannot be achi eved within 72 hours, the reasons f or the dela y should accompan y the notification \nand inf or mation ma y be pro vided in phases without undue fur ther dela y . \n(86)  The controller should communicate to the data subject a personal data breac h, without undue dela y , where that \npersonal data breach is likely to result in a high r isk to the r ights and freedoms of the natural person in order to",
        0.7567300796508789
    ],
    [
        "f or promp t communication with

### Use an LLM to get a RAG response

Before composing the prompt for the LLM, let's first flatten the list of retrieved passages into a single plain string.

In [40]:
context = "\n".join(
    [line_with_distance[0] for line_with_distance in retrieved_lines_with_distances]
)

In [41]:
PROMPT = """
Use the following pieces of information enclosed in <context> tags to provide an answer to the question enclosed in <question> tags.
<context>
{context}
</context>
<question>
{question}
</question>
"""

We use the [Mixtral-8x7B-Instruct-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-Instruct-v0.1) model, hosted on Hugging Face's inference service, to generate a response based on the prompt.

In [42]:
from huggingface_hub import InferenceClient

repo_id = "mistralai/Mixtral-8x7B-Instruct-v0.1"

llm_client = InferenceClient(model=repo_id, timeout=120)

In [43]:
prompt = PROMPT.format(context=context, question=question)
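
The flattening and formatting steps can be wrapped into one small helper, sketched below with toy passages so it runs without Milvus or the PDF. The names `PROMPT_TEMPLATE`, `build_prompt`, and the sample passages are hypothetical; the template text mirrors the one used in this notebook.

```python
PROMPT_TEMPLATE = """
Use the following pieces of information enclosed in <context> tags to provide an answer to the question enclosed in <question> tags.
<context>
{context}
</context>
<question>
{question}
</question>
"""


def build_prompt(passages, question):
    """Join retrieved passages with newlines and fill the prompt template."""
    context = "\n".join(passages)
    return PROMPT_TEMPLATE.format(context=context, question=question)


# Toy passages standing in for the text returned by the vector search
toy_passages = [
    "Notify the supervisory authority within 72 hours.",
    "Communicate high-risk breaches to data subjects without undue delay.",
]
toy_prompt = build_prompt(toy_passages, "When must a breach be reported?")
print(toy_prompt)
```

Keeping the context and the question in separate tags makes it easier for the model to tell retrieved evidence apart from the user's request.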

In [44]:
answer = llm_client.text_generation(
    prompt,
    max_new_tokens=1000,
).strip()
print(answer)

No, you should inform the customers about the data breach without undue delay, where that personal data breach is likely to result in a high risk to the rights and freedoms of the natural person. The fact that the notification was made without undue delay should be established taking into account in particular the nature and gravity of the personal data breach and its consequences and adverse effects for the data subject.


## Slack Integration

In [52]:
from flask import Flask, request, jsonify
from slack_sdk import WebClient
from slack_sdk.errors import SlackApiError

import requests

app = Flask(__name__)

# Slack credentials: replace the placeholders with your own values
# and keep real tokens and webhook URLs out of shared notebooks
SLACK_BOT_TOKEN = "xoxb-..."
SLACK_WEBHOOK_URL = "https://hooks.slack.com/services/..."
CHANNEL_ID = "C08AAGE1V5G"  # Replace with the ID of the channel to read from

client = WebClient(token=SLACK_BOT_TOKEN)


# Fetch and print the most recent messages from a channel
def fetch_slack_messages(channel_id):
    try:
        response = client.conversations_history(channel=channel_id, limit=10)
        messages = response["messages"]
        for msg in messages:
            print(f"User: {msg.get('user', 'Bot')} - Message: {msg.get('text', '')}")
        return messages
    except SlackApiError as e:
        print(f"Error fetching messages: {e.response['error']}")
        return []  # Return an empty list so callers can always iterate


messages = fetch_slack_messages(CHANNEL_ID)

# Post the RAG answer generated above to Slack through the incoming webhook
message = {
    "text": answer,
}
response = requests.post(SLACK_WEBHOOK_URL, json=message, timeout=30)

# Check the response
if response.status_code == 200:
    print("Message sent successfully!")
else:
    print(f"Failed to send message. Status code: {response.status_code}, Response: {response.text}")

User: U08AZ3HHDT2 - Message: A data breach occured 4 days ago. Can I inform the customers about it in a couple more days?
User: U08ACRKLZRA - Message: <@U08ACRKLZRA> has joined the channel
User: U08AZ3HHDT2 - Message: A data breach occured 4 days ago. Can I inform the customers about it in a couple more days?
User: U08AD1QV94L - Message: <@U08AD1QV94L> has joined the channel
User: U08AAFKS7MG - Message: <@U08AAFKS7MG> has joined the channel
User: U08A7LMNCH3 - Message: <@U08A7LMNCH3> has joined the channel
User: U08AZ3HHDT2 - Message: <@U08AZ3HHDT2> has joined the channel
Message sent successfully!
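
In the cell above the answer posted to Slack comes from the fixed `question` variable, while the fetched channel history is only printed. One possible way to connect the two is to pick the newest genuine user message as the question. The sketch below assumes that the history is ordered newest first and that system notices such as channel joins carry a `subtype` field; the helper `latest_user_question` and the sample data are hypothetical.

```python
def latest_user_question(messages):
    """Return the text of the newest plain user message, or None if there is none.

    Assumes `messages` is ordered newest first and that join notices and
    similar system events carry a "subtype" key.
    """
    for msg in messages:
        if msg.get("subtype"):  # skip system notices such as channel joins
            continue
        if "user" not in msg:  # skip entries without a user ID
            continue
        text = msg.get("text", "").strip()
        if text:
            return text
    return None


# Sample data shaped like the printed history above (IDs are made up)
sample_messages = [
    {"user": "U000000001", "text": "<@U000000001> has joined the channel", "subtype": "channel_join"},
    {"user": "U000000002", "text": "Can I delay the breach notification?"},
    {"user": "U000000002", "text": "An older question"},
]
picked = latest_user_question(sample_messages)
print(picked)
```

The returned text could then be passed through the same embed, search, and prompt steps shown earlier before the answer is posted back to the channel.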
