# Streaming RAG Demo with LangChain, Milvus, Quix and Mistral

![Streaming RAG Demo](Streaming_RAG_Demo_LangChain.png)

**Everything is running on Docker with Docker Compose.**


This notebook demonstrates how to build a Retrieval-Augmented Generation (RAG) system that can:
1. Answer questions using a vector database (Milvus)
2. Stream new data from Kafka using Quix
3. Update its knowledge base in real-time

We'll use:
- **LangChain**: For orchestrating the RAG pipeline
- **Milvus**: As our vector database
- **Ollama**: For running the LLM locally (`mistral` model)
- **Quix**: For Kafka streaming integration

## Setup and Imports

First, let's import all necessary libraries:

In [7]:
from langchain_huggingface import HuggingFaceEmbeddings
from langchain_milvus import Milvus
from langchain_ollama.llms import OllamaLLM
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
import json
import sys
import time

## Initialize RAG Components

Now we'll set up our RAG system with:
1. Embeddings model for converting text to vectors
2. LLM for generating responses
3. Vector store for storing and retrieving documents
4. RAG prompt template
5. The complete RAG chain

In [8]:
def setup_rag_components():
    """Initialize RAG components"""
    embeddings = HuggingFaceEmbeddings(model_name="BAAI/bge-small-en-v1.5")
    llm = OllamaLLM(model="mistral")
    
    vector_store = Milvus.from_texts(
        texts=["Initial empty document"],
        embedding=embeddings,
        connection_args={"host": "localhost", "port": "19530"},
        collection_name="streaming_rag_demo",
        drop_old=True
    )
    
    # Create RAG prompt
    template = """Answer the question based only on the following context:

{context}

Question: {question}
Answer:"""
    
    prompt = ChatPromptTemplate.from_template(template)
    
    rag_chain = (
        {"context": vector_store.as_retriever(), "question": RunnablePassthrough()}
        | prompt
        | llm
        | StrOutputParser()
    )
    
    return vector_store, rag_chain

# Initialize our components
vector_store, rag_chain = setup_rag_components()

## Test Initial RAG System

Let's test our RAG system before adding any real data. It should respond that it doesn't have relevant information since our vector store is empty.

In [9]:
print("Initial Query (before streaming):")
question = "What do you know about artificial intelligence developments?"
print(f"Question: {question}")
print(f"Answer: {rag_chain.invoke(question)}\n")

Initial Query (before streaming):
Question: What do you know about artificial intelligence developments?
Answer:  Based on the provided context, no information regarding AI developments is available as the document is initially empty.



## Set Up Kafka Producer

Now let's create a producer that will send some sample messages to Kafka. These messages will contain information that our RAG system can learn from.

## Clean up the Kakfa Topic 

To make sure we have a clean state, we'll delete and recreate the topic before adding some sample messages.

In [10]:
from quixstreams.kafka import Producer, Consumer

def cleanup_topic():
    """Delete and recreate the topic to ensure clean state"""
    print("\nCleaning up Kafka topic...")
    
    consumer = Consumer(
        broker_address="localhost:9092",
        consumer_group="rag-consumer",
        auto_offset_reset="earliest"
    )
    
    try:
        # Try to subscribe - this will fail if topic doesn't exist
        consumer.subscribe(["messages"])
        msg = consumer.poll(timeout=1.0)
        if msg:
            print("Found existing messages, recreating topic...")
            consumer.close()
            
            # Create producer with admin rights to delete topic
            with Producer(
                broker_address="localhost:29092",
                extra_config={
                    "allow.auto.create.topics": "true",
                },
            ) as producer:
                producer.delete_topics(["messages"])
                time.sleep(2)  # Wait for deletion
                
    except Exception as e:
        print(f"Topic doesn't exist yet: {e}")
    finally:
        consumer.close()

In [11]:
cleanup_topic()


Cleaning up Kafka topic...
Found existing messages, recreating topic...
Topic doesn't exist yet: 'Producer' object has no attribute 'delete_topics'


In [12]:
from quixstreams import Application

def get_sample_messages():
    return [
        {"chat_id": "id1", "text": "The latest developments in artificial intelligence have revolutionized how we approach problem solving"},
        {"chat_id": "id2", "text": "Climate change poses significant challenges to global ecosystems and human societies"},
        {"chat_id": "id3", "text": "Quantum computing promises to transform cryptography and drug discovery"},
        {"chat_id": "id4", "text": "Sustainable energy solutions are crucial for addressing environmental concerns"}
    ]
    
app = Application(
    broker_address="localhost:9092",
    auto_create_topics=True
)

with app.get_producer() as producer:
    messages = get_sample_messages()
    print("\nSending messages to Kafka...")
    for message in messages:
        print(f'Sending: "{message["text"]}"')
        producer.produce(
            topic="messages",
            key=message["chat_id"].encode(),
            value=json.dumps(message).encode(),
        )
    print("\nAll messages sent!")

[2025-02-25 11:44:45,992] [INFO] [quixstreams] : Topics required for this application: 
[2025-02-25 11:44:46,003] [INFO] [quixstreams] : Validating Kafka topics exist and are configured correctly...
[2025-02-25 11:44:46,024] [INFO] [quixstreams] : Kafka topics validation complete



Sending messages to Kafka...
Sending: "The latest developments in artificial intelligence have revolutionized how we approach problem solving"
Sending: "Climate change poses significant challenges to global ecosystems and human societies"
Sending: "Quantum computing promises to transform cryptography and drug discovery"
Sending: "Sustainable energy solutions are crucial for addressing environmental concerns"

All messages sent!


## Process Streaming Data

Now we'll consume the messages from Kafka and add them to our vector store. This simulates how our RAG system can learn from streaming data.

In [13]:
from quixstreams import Application

def process_value(row):
    text = row["text"]
    print(f"\nReceived message: {text}")
    
    # Add text directly to vector store
    vector_store.add_texts([text])
    return row

app = Application(
    broker_address="localhost:9092",
    consumer_group="rag-consumer",
    auto_offset_reset="earliest"
)

input_topic = app.topic(name="messages")

sdf = app.dataframe(topic=input_topic)

sdf = (
    sdf.apply(process_value)
)

app.run()

# Runs in a continuous loop, so interrupt the kernel after a short while

[2025-02-25 11:45:25,663] [INFO] [quixstreams] : Starting the Application with the config: broker_address="{'bootstrap.servers': 'localhost:9092'}" consumer_group="rag-consumer" auto_offset_reset="earliest" commit_interval=5.0s commit_every=0 processing_guarantee="at-least-once"
[2025-02-25 11:45:25,669] [INFO] [quixstreams] : Topics required for this application: "messages"
[2025-02-25 11:45:25,681] [INFO] [quixstreams] : Validating Kafka topics exist and are configured correctly...
[2025-02-25 11:45:25,732] [INFO] [quixstreams] : Kafka topics validation complete
[2025-02-25 11:45:25,733] [INFO] [quixstreams] : Initializing state directory at "/Users/stephen/Documents/Zilliz/talks/quix_milvus/state/rag-consumer"
[2025-02-25 11:45:25,736] [INFO] [quixstreams] : Waiting for incoming messages



Received message: The latest developments in artificial intelligence have revolutionized how we approach problem solving

Received message: Climate change poses significant challenges to global ecosystems and human societies

Received message: Quantum computing promises to transform cryptography and drug discovery

Received message: Sustainable energy solutions are crucial for addressing environmental concerns

Received message: The latest developments in artificial intelligence have revolutionized how we approach problem solving

Received message: Climate change poses significant challenges to global ecosystems and human societies

Received message: Quantum computing promises to transform cryptography and drug discovery

Received message: Sustainable energy solutions are crucial for addressing environmental concerns

Received message: The latest developments in artificial intelligence have revolutionized how we approach problem solving

Received message: Climate change poses signific

[2025-02-25 11:45:38,520] [INFO] [quixstreams] : Stop processing of StreamingDataFrame


## Test Updated RAG System

Now let's test our RAG system again. This time it should have knowledge from the streamed messages.

In [14]:
# Query about AI
print("Query about AI developments:")
question = "What do you know about artificial intelligence developments?"
print(f"Question: {question}")
print(f"Answer: {rag_chain.invoke(question)}\n")

# Query about climate change
print("Query about climate change:")
question = "What information do you have about climate change?"
print(f"Question: {question}")
print(f"Answer: {rag_chain.invoke(question)}\n")

Query about AI developments:
Question: What do you know about artificial intelligence developments?
Answer:  The latest developments in artificial intelligence (AI) have revolutionized how we approach problem solving.

Query about climate change:
Question: What information do you have about climate change?
Answer:  The context provides information that climate change poses significant challenges to global ecosystems and human societies. There is no specific mention of the nature or extent of these challenges, but it implies that they are substantial enough to warrant concern. The text also mentions that sustainable energy solutions are crucial for addressing environmental concerns, which might be relevant in discussing potential solutions to climate change.

