## 8. Next Steps

✅ **You've built a complete RAG chatbot!**

### To run locally on your machine:
```bash
cd rag-chatbot
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
export OPENAI_API_KEY="sk-..."
python app.py
# Open http://localhost:7860
```

### To share your Gradio app:
- The public URL generated by `demo.launch(share=True)` expires after 72 hours
- For permanent hosting, deploy the app to Hugging Face Spaces

### To improve the chatbot:
- Add more documents to the vector store
- Tune `chunk_size`, `chunk_overlap`, and the retriever's `k` to improve retrieval
- Use `gpt-4` instead of `gpt-3.5-turbo` for stronger answers, at higher cost and latency
- Add a custom system prompt to guide the LLM's behavior
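
One way to judge whether a change to `chunk_size` or `k` actually helps is to measure how often the expected source shows up in the top `k` results for a few labeled questions. The sketch below assumes a hypothetical `retrieve(question, k)` callable that returns source file names. In this notebook it could wrap the vector store's similarity search, but the stub retriever and its rankings here are invented purely for illustration.

```python
from typing import Callable, Dict, List


def hit_rate_at_k(
    retrieve: Callable[[str, int], List[str]],
    labeled_queries: Dict[str, str],
    k: int,
) -> float:
    """Fraction of queries whose expected source is among the top-k results."""
    if not labeled_queries:
        raise ValueError("labeled_queries must not be empty")
    hits = 0
    for question, expected_source in labeled_queries.items():
        if expected_source in retrieve(question, k):
            hits += 1
    return hits / len(labeled_queries)


# Hypothetical stub retriever with fixed, made-up rankings.
_FAKE_RANKINGS = {
    "How does order-entry consume messages?": [
        "order-router-doc.md", "order-entry-doc.md", "fix-service-doc.md",
    ],
    "How are services discovered?": [
        "service-registry-doc.md", "trading-platform-doc.md", "order-entry-doc.md",
    ],
}


def fake_retrieve(question: str, k: int) -> List[str]:
    """Return the first k entries of the canned ranking for a question."""
    return _FAKE_RANKINGS[question][:k]


labels = {
    "How does order-entry consume messages?": "order-entry-doc.md",
    "How are services discovered?": "service-registry-doc.md",
}

for k in (1, 2, 3):
    print(f"hit rate @ k={k}: {hit_rate_at_k(fake_retrieve, labels, k):.2f}")
```

Re-running the same labeled set after each parameter change gives a rough before-and-after comparison instead of relying on impressions from a few chats.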

**Happy chatting! 🚀**

In [None]:
# Launch Gradio in Colab
demo.launch(share=True)

In [None]:
# Create Gradio chat interface
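def chat_function(message, history):
    """Answer one chat message with the RAG chain (Gradio callback)."""
    # Gradio's `history` is unused: conversation state lives in the chain's
    # ConversationBufferMemory, which is shared by everyone using the app.
    response = rag_chain.invoke({"question": message})
    return response["answer"]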

demo = gr.ChatInterface(
    chat_function,
    examples=[
        "What are the main services in the trading platform?",
        "How does order-entry service consume messages?",
        "What failure scenarios are documented?",
        "What are the runbook quick checks?",
        "Explain Safeguard usage for SQL credentials."
    ],
    title="💬 Trading Platform Documentation Chatbot",
    description="Ask questions about the trading platform architecture, services, and operations. Answers are based on the official documentation.",
    theme=gr.themes.Soft(),
)

print("🎉 Gradio interface ready!")
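
Because `memory` is attached to the chain in this notebook, every browser session shares one conversation buffer. A possible alternative, sketched here under assumptions, is to build the chain without `memory=` and pass each session's history explicitly. The helper below assumes Gradio 4.x hands `history` to the callback as a list of `[user, bot]` pairs, and that the chain accepts a `chat_history` list of `(human, ai)` tuples. The function names and the example conversation are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple


def history_to_pairs(
    history: Sequence[Sequence[Optional[str]]],
) -> List[Tuple[str, str]]:
    """Convert pair-style chat history into (human, ai) tuples.

    Turns that are malformed or still missing a reply are skipped.
    """
    pairs = []
    for turn in history:
        if len(turn) != 2:
            continue
        user_msg, bot_msg = turn
        if user_msg is None or bot_msg is None:
            continue
        pairs.append((str(user_msg), str(bot_msg)))
    return pairs


def build_chain_input(message: str, history) -> dict:
    """Build the input dict for a chain that has no attached memory."""
    return {"question": message, "chat_history": history_to_pairs(history)}


# Made-up conversation, for illustration only.
example_history = [
    ["What services exist?", "Order entry, validation, routing, and FIX."],
    ["Which one talks to the exchange?", None],  # incomplete turn, skipped
]
print(build_chain_input("How does routing work?", example_history))
```

With a memory-free chain, the callback body would then reduce to something like `rag_chain.invoke(build_chain_input(message, history))["answer"]`.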

## 7. Build Gradio Chat Interface

In [None]:
# Test RAG with sample queries
test_queries = [
    "What are the main services in the trading platform?",
    "How does the order-entry service work?",
    "What are the common failure scenarios?",
]

print("🧪 Testing RAG system with sample queries...\n")
for i, query in enumerate(test_queries, 1):
    print(f"Q{i}: {query}")
    response = rag_chain.invoke({"question": query})
    print(f"A{i}: {response['answer']}\n")
    print("-" * 80 + "\n")
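
Reading the printed answers works for a first look, but the same queries can also serve as a repeatable smoke test. The sketch below assumes a hypothetical `ask(query)` callable that returns the answer text. In this notebook that could be a thin wrapper around `rag_chain.invoke`, but a stub is used here so the example runs on its own.

```python
import time
from typing import Callable, Dict, List


def smoke_test(ask: Callable[[str], str], queries: List[str]) -> List[Dict]:
    """Run each query once and record whether a non-empty answer came back."""
    results = []
    for query in queries:
        start = time.perf_counter()
        answer = ask(query)
        elapsed = time.perf_counter() - start
        results.append({
            "query": query,
            "ok": bool(answer and answer.strip()),
            "seconds": elapsed,
        })
    return results


def stub_ask(query: str) -> str:
    """Stand-in for: rag_chain.invoke({"question": query})["answer"]."""
    return "" if "unknown" in query else f"stub answer for: {query}"


report = smoke_test(stub_ask, ["What are the main services?", "unknown topic"])
for row in report:
    status = "PASS" if row["ok"] else "FAIL"
    print(f"{status}  {row['seconds']:.4f}s  {row['query']}")
```

A non-empty answer is a weak check. It catches broken wiring such as an empty vector store or a failed API call, not wrong answers.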

## 6. Test RAG System with Sample Queries

In [None]:
# Set up LLM
print("🤖 Setting up ChatOpenAI LLM (gpt-3.5-turbo)...")
llm = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    openai_api_key=api_key,
    temperature=0.7,
    max_tokens=500
)

# Set up memory for conversation
memory = ConversationBufferMemory(
    memory_key="chat_history",
    return_messages=True
)

# Set up retriever
retriever = vector_store.as_retriever(search_kwargs={"k": 3})

# Create conversational RAG chain
print("⛓️ Building conversational RAG chain...")
rag_chain = ConversationalRetrievalChain.from_llm(
    llm=llm,
    retriever=retriever,
    memory=memory,
    verbose=False
)

print("✅ RAG chain ready!")
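
Conceptually, a conversational retrieval chain does three things per message: it rewrites a follow-up into a standalone question when there is prior history, retrieves the most relevant chunks, and places them in a single prompt for the LLM. The sketch below is a simplified picture of that flow with plain callables standing in for the model and retriever. The prompt wording and all function names are assumptions for illustration, not LangChain's actual templates or API.

```python
from typing import Callable, List, Tuple

History = List[Tuple[str, str]]


def conversational_rag(
    question: str,
    chat_history: History,
    condense: Callable[[str, History], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str], str],
) -> str:
    """Simplified condense -> retrieve -> generate flow."""
    # 1. Rewrite a follow-up into a standalone question when history exists.
    standalone = condense(question, chat_history) if chat_history else question
    # 2. Fetch the chunks most relevant to the standalone question.
    context = "\n\n".join(retrieve(standalone))
    # 3. Put all chunks into one prompt and ask the model.
    prompt = (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {standalone}"
    )
    return generate(prompt)


def stub_condense(question: str, history: History) -> str:
    """Stand-in for the LLM call that rewrites a follow-up question."""
    return f"{question} (follow-up to: {history[-1][0]})"


def stub_retrieve(question: str) -> List[str]:
    """Stand-in for the vector store retriever."""
    return ["chunk one", "chunk two"]


def echo_generate(prompt: str) -> str:
    """Stand-in for the answering LLM: returns the prompt so the flow is visible."""
    return prompt


first = conversational_rag(
    "What is order-entry?", [], stub_condense, stub_retrieve, echo_generate
)
follow_up = conversational_rag(
    "How does it fail?",
    [("What is order-entry?", "stub answer")],
    stub_condense,
    stub_retrieve,
    echo_generate,
)
print(first)
print("-" * 40)
print(follow_up)
```

The condense step is why a follow-up such as "How does it fail?" can still retrieve the right chunks: the retriever sees a question that names its subject.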

## 5. Set Up Conversational RAG Chain

In [None]:
# Create embeddings with OpenAI
print("🔤 Creating embeddings with OpenAI (text-embedding-ada-002)...")
embeddings = OpenAIEmbeddings(openai_api_key=api_key)

# Create vector store with Chroma
persist_dir = "/tmp/trading_platform_chroma"  # Use /tmp for Colab compatibility
print(f"💾 Building Chroma vector store at {persist_dir}...")

vector_store = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory=persist_dir,
    collection_name="trading_platform"
)

print(f"✅ Vector store ready with {len(chunks)} embeddings")
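
To make the role of the vector store concrete, here is a toy version of what a similarity lookup does: score every stored vector against the query vector and keep the best `k`. Cosine similarity and the three-dimensional vectors are assumptions chosen for readability. Real embeddings have far more dimensions, and the distance metric Chroma applies depends on how the collection is configured.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(
    query_vec: Sequence[float],
    store: Dict[str, Sequence[float]],
    k: int,
) -> List[Tuple[str, float]]:
    """Return the k stored items most similar to the query, best first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in store.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Made-up vectors standing in for chunk embeddings.
toy_store = {
    "order-entry chunk": [0.9, 0.1, 0.0],
    "order-router chunk": [0.7, 0.7, 0.0],
    "fix-service chunk": [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.0]

for name, score in top_k(query, toy_store, k=2):
    print(f"{name}: {score:.3f}")
```

A production store avoids this exhaustive scan with an approximate nearest-neighbor index, but the ranking idea is the same.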

## 4. Generate Embeddings and Build Vector Store

In [None]:
# Chunk all documents for better retrieval
chunk_size = 1000
chunk_overlap = 200

print(f"📊 Chunking {len(all_documents)} documents (size={chunk_size}, overlap={chunk_overlap})...")
splitter = RecursiveCharacterTextSplitter(
    chunk_size=chunk_size,
    chunk_overlap=chunk_overlap,
    separators=["\n\n", "\n", " ", ""]
)
chunks = splitter.split_documents(all_documents)
print(f"✅ Created {len(chunks)} total chunks")
print(f"📋 Sample chunk from first document:\n{chunks[0].page_content[:300]}...")
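
The effect of `chunk_overlap` is easiest to see on a tiny input. The sketch below uses a fixed-size sliding window, which is a deliberate simplification: `RecursiveCharacterTextSplitter` prefers to break at the listed separators, so its chunk boundaries will differ. The function name is hypothetical.

```python
from typing import List


def sliding_window_chunks(text: str, chunk_size: int, chunk_overlap: int) -> List[str]:
    """Split text into fixed-size windows that share `chunk_overlap` characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


sample = "abcdefghij"
print(sliding_window_chunks(sample, chunk_size=4, chunk_overlap=2))
print(sliding_window_chunks(sample, chunk_size=4, chunk_overlap=0))
```

With an overlap of 2, each chunk repeats the last two characters of the previous one, so a sentence cut at a boundary still appears whole in at least one chunk. The cost is more chunks to embed and store.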

In [None]:
# Load documents from all service documentation URLs
docs_urls = [
    "https://raw.githubusercontent.com/somakalla1-droid/RAG/main/docs/trading-platform-doc.md",
    "https://raw.githubusercontent.com/somakalla1-droid/RAG/main/docs/order-validate-doc.md",
    "https://raw.githubusercontent.com/somakalla1-droid/RAG/main/docs/order-entry-doc.md",
    "https://raw.githubusercontent.com/somakalla1-droid/RAG/main/docs/order-router-doc.md",
    "https://raw.githubusercontent.com/somakalla1-droid/RAG/main/docs/fix-service-doc.md",
    "https://raw.githubusercontent.com/somakalla1-droid/RAG/main/docs/service-registry-doc.md",
]

all_documents = []
for url in docs_urls:
    print(f"📥 Loading {url.split('/')[-1]}...")
    try:
        loader = WebBaseLoader(url)
        documents = loader.load()
        all_documents.extend(documents)
        print(f"   ✅ Loaded {len(documents)} document(s)")
    except Exception as e:
        print(f"   ⚠️ Error loading: {e}")
        continue

print(f"\n✅ Total documents loaded: {len(all_documents)}")
print(f"📊 Total content length: {sum(len(doc.page_content) for doc in all_documents):,} characters")
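
The loading loop skips a URL after a single failure, so a brief network hiccup silently drops a document from the index. One possible hardening, sketched here with hypothetical names, is to retry with exponential backoff before giving up. The helper takes any zero-argument callable, so in this notebook it could wrap a loader call such as `lambda: WebBaseLoader(url).load()`.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def load_with_retry(
    load: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `load`, retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return load()
        except Exception:
            if attempt == attempts:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** (attempt - 1))  # 1x, 2x, 4x, ...
    raise AssertionError("unreachable")


# Simulated loader that fails twice before succeeding.
calls = {"count": 0}


def flaky_load():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary network failure")
    return ["document"]


delays = []  # record requested delays instead of actually sleeping
result = load_with_retry(flaky_load, attempts=3, base_delay=1.0, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the example fast and makes the backoff schedule easy to inspect.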

## 3. Load and Process Documents

In [None]:
# Set OpenAI API key
api_key = os.getenv("OPENAI_API_KEY")
if not api_key:
    print("🔑 Enter your OpenAI API key:")
    api_key = getpass("OpenAI API Key: ")
    os.environ["OPENAI_API_KEY"] = api_key

print(f"✅ OpenAI API key configured (starts with: {api_key[:6]}...)")
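
Notebook output is often saved and shared along with the code, so it can be worth limiting how much of the key is echoed. The helper below is a minimal sketch of one masking approach. Its name and defaults are hypothetical, and the key shown is a dummy value.

```python
def mask_secret(secret: str, visible_prefix: int = 3, visible_suffix: int = 4) -> str:
    """Hide the middle of a secret, keeping a short prefix and suffix visible."""
    if len(secret) <= visible_prefix + visible_suffix:
        return "*" * len(secret)  # too short to reveal any part safely
    hidden = len(secret) - visible_prefix - visible_suffix
    suffix = secret[len(secret) - visible_suffix:]
    return secret[:visible_prefix] + "*" * hidden + suffix


print(mask_secret("sk-abcdefghijklmnop"))  # dummy key, not a real credential
print(mask_secret("short"))
```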

## 2. Configure OpenAI API Client

Set your OpenAI API key in the `OPENAI_API_KEY` environment variable, or enter it when prompted. You can create a key at https://platform.openai.com/api-keys

In [None]:
# Import required libraries
import os
import requests
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain.chains import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain.schema import Document
import gradio as gr
from getpass import getpass

print("✅ All imports successful!")

In [None]:
# Install dependencies
import subprocess
import sys

packages = [
    "langchain==0.1.0",
    "langchain-community==0.0.10",
    "langchain-openai==0.0.5",
    "chromadb==0.3.21",
    "sentence-transformers==2.2.2",
    "requests==2.31.0",
    "gradio==4.0.0",
    "python-dotenv==1.0.0"
]

print("📦 Installing dependencies...")
for package in packages:
    subprocess.check_call([sys.executable, "-m", "pip", "-q", "install", package])
print("✅ All packages installed!")
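
Since the notebook pins exact versions, and a hosted runtime may already ship different ones, it can help to confirm what ended up installed. The sketch below uses the standard library's `importlib.metadata` for that check. The helper names are hypothetical, and it only handles simple `name==version` pins like those in the list above.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional, Tuple


def parse_pin(requirement: str) -> Tuple[str, str]:
    """Split a 'name==version' pin into (name, version)."""
    name, _, version = requirement.partition("==")
    return name.strip(), version.strip()


def check_pins(requirements: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each missing or mismatched package to its installed version.

    The value is None when the package is not installed at all.
    """
    problems = {}
    for requirement in requirements:
        name, wanted = parse_pin(requirement)
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            problems[name] = installed
    return problems


pins = ["gradio==4.0.0", "chromadb==0.3.21"]
for name, installed in check_pins(pins).items():
    print(f"{name}: expected pin not satisfied (installed: {installed})")
```

An empty result means every pin matched. Anything printed points to a package worth reinstalling, followed by a runtime restart.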

## 1. Install and Import Required Libraries

# RAG Trading Platform Documentation Chatbot

**Build a Retrieval-Augmented Generation (RAG) chatbot** that answers questions about the trading platform using:
- **LangChain** for orchestration
- **OpenAI API** for embeddings and the LLM
- **Chroma DB** for vector storage
- **Gradio** for UI

This notebook is designed for **Google Colab** and can also be run locally.