# Nexora RAG Customer Service Chatbot – Demo & Implementation Guide
**Author:** Sujan Adhikari

This interactive notebook walks through building a Retrieval‑Augmented Generation (RAG) customer‑service chatbot for **Nexora Pty Ltd** using **LangChain**, **Ollama** (embeddings plus the **Qwen3:1.7b** chat model), and **ChromaDB**. It covers:
1. Data loading & preprocessing  
2. Vector‑store creation  
3. Simple intent recognition & entity extraction  
4. RAG pipeline assembly  
5. Lightweight Gradio web‑chat demo  
6. (Optional) pointers for deployment & MLOps alignment.

---

## 0  Environment & Requirements

In [None]:
#!pip install langchain langchain-chroma langchain-ollama chromadb ollama gradio spacy pandas 
#!python -m spacy download en_core_web_md

## 1  Imports & Global Config

In [11]:
import json, re, pandas as pd, csv
from pathlib import Path

from langchain_chroma import Chroma
from langchain_ollama import OllamaEmbeddings, ChatOllama
from langchain.prompts import ChatPromptTemplate
from langchain.chains import RetrievalQA
from langchain.schema import Document
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler

import spacy, gradio as gr
from spacy.matcher import Matcher, PhraseMatcher

In [12]:
nlp = spacy.load('en_core_web_md')

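# Configure the Ollama chat model and the embedding model
MODEL = 'qwen3:1.7b'
EMBEDDINGS_MODEL = "mxbai-embed-large"
# Embed with the dedicated embedding model, not the chat model.
# A store persisted with different embeddings must be rebuilt.
embeddings = OllamaEmbeddings(model=EMBEDDINGS_MODEL)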

## 1a  Repair `faqs.csv`: Re‑join Questions Split by Unquoted Commas

In [None]:
input_path = Path("data/faqs.csv")
output_path = Path("data/faqs_cleaned.csv")

with open(input_path, "r", encoding="utf-8") as infile, open(output_path, "w", newline="", encoding="utf-8") as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile, quoting=csv.QUOTE_MINIMAL)

    header = next(reader)
    writer.writerow(header)

    for row in reader:
        if len(row) > 3:
            # Fix rows with a comma in the question (merge columns until we have 3)
            question_parts = row[:-2]  # Everything except last 2 fields
            question = ",".join(question_parts).strip()
            answer = row[-2].strip()
            category = row[-1].strip()
            writer.writerow([question, answer, category])
        else:
            writer.writerow(row)
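
The merge rule above can be checked on an in-memory sample before touching the real file. This is a minimal sketch that assumes, as the cell does, that only the question column contains stray commas and that the last two fields are always answer and category. `repair_row` and the sample rows are hypothetical and not part of the notebook.

```python
import csv
import io

def repair_row(row, n_cols=3):
    """Re-join a question that was split by unquoted commas.

    Assumes the last two fields are always answer and category.
    """
    if len(row) <= n_cols:
        return row
    question = ",".join(row[:-2]).strip()
    return [question, row[-2].strip(), row[-1].strip()]

# Hypothetical sample: the second data row has an unquoted comma in the question.
raw = (
    "question,answer,category\n"
    "How do I pay?,Online via the portal.,billing\n"
    "If I move, do I need to tell you?,Yes.,account\n"
)

rows = [repair_row(r) for r in csv.reader(io.StringIO(raw))]
for r in rows:
    print(r)
```

Note that the rule cannot recover a row whose stray comma sits in the answer rather than the question; such rows would need manual review.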

## 2  Data Ingestion & Preprocessing: The Knowledge Foundation

In [17]:
DATA_DIR = Path('data')  # adjust if different
products_path = DATA_DIR / 'products_occupation.json'
faqs_path = DATA_DIR / 'faqs_cleaned.csv'

# Load products
with open(products_path, 'r', encoding='utf-8') as f:
    products_data = json.load(f)['products']

# Flatten products into plain‑text docs
product_docs = []
for p in products_data:
    text = f"""
    Product Name: {p['name']}
    Description: {p['description']}
    Target Industries: {', '.join(p['target_industries'])}
    Coverage Options: {', '.join(p['coverage_options'])}
    Premium Range: {p['premium_range']['min']}-{p['premium_range']['max']} {p['premium_range']['currency']}
    Excess Range: {p['excess_range']['min']}-{p['excess_range']['max']} {p['excess_range']['currency']}
    Key Features: {', '.join(p['key_features'])}
    Exclusions: {', '.join(p['exclusions'])}
    Unique Selling Points: {', '.join(p['unique_selling_points'])}
    Required Documents: {', '.join(p['required_documents'])}
    """.strip()
    product_docs.append(Document(page_content=text, metadata={'type': 'product', 'name': p['name']}))


# Load FAQs
faqs_df = pd.read_csv(faqs_path, quotechar='"')

# ✅ Validate the structure of the CSV
assert set(faqs_df.columns) == {'question', 'answer', 'category'}, "Unexpected CSV format"

faq_docs = [
    Document(
        page_content=f"Question: {row.question}\nAnswer: {row.answer}",
        metadata={'type': 'faq', 'category': row.category}
    )
    for _, row in faqs_df.iterrows()
]

# Combine all documents
documents = product_docs + faq_docs
print(f'Total documents: {len(documents)}')

Total documents: 136
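
The flattening loop indexes every field directly, so a single product record with a missing key would abort the whole cell with a `KeyError`. One possible pre-check is sketched below. The key list is copied from the f-string above, while `missing_keys` and the sample record are hypothetical.

```python
# Keys the flattening f-string reads from each product record.
REQUIRED_KEYS = {
    "name", "description", "target_industries", "coverage_options",
    "premium_range", "excess_range", "key_features", "exclusions",
    "unique_selling_points", "required_documents",
}

def missing_keys(product):
    """Return the sorted list of required keys absent from one product record."""
    return sorted(REQUIRED_KEYS - set(product))

# Hypothetical record with two fields left out on purpose.
sample = {
    "name": "Example Cover",
    "description": "Illustrative product only.",
    "target_industries": ["Retail"],
    "coverage_options": ["Basic"],
    "premium_range": {"min": 100, "max": 200, "currency": "AUD"},
    "key_features": [],
    "exclusions": [],
    "required_documents": [],
}

print(missing_keys(sample))
```

Running this over `products_data` before the loop would report incomplete records by name instead of failing on the first one.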


## 3  Create / Reload Chroma Vector Store

In [19]:
# Set up vector store
VECTOR_DIR = 'nexora_chroma'

if Path(VECTOR_DIR).exists():
    vector_store = Chroma(persist_directory=VECTOR_DIR, embedding_function=embeddings)
    print('Loaded existing vector store.')
else:
    vector_store = Chroma.from_documents(
        documents=documents,
        embedding=embeddings,
        persist_directory=VECTOR_DIR
    )
    print('Built new vector store →', VECTOR_DIR)

Loaded existing vector store.
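
The reload branch reuses whatever is on disk, so a changed corpus or a different embedding model would go unnoticed. One way to detect this, sketched here as an assumption rather than anything the notebook does, is to store a fingerprint of the inputs next to the vector store and rebuild when it changes. `corpus_fingerprint` and the sample texts are hypothetical.

```python
import hashlib

def corpus_fingerprint(texts, embedding_model):
    """Hash the embedding model name and every document text, order-sensitively."""
    h = hashlib.sha256()
    h.update(embedding_model.encode("utf-8"))
    for text in texts:
        h.update(b"\x00")  # separator so ["ab", "c"] differs from ["a", "bc"]
        h.update(text.encode("utf-8"))
    return h.hexdigest()

# Hypothetical texts standing in for [d.page_content for d in documents].
texts = ["Question: A?\nAnswer: B.", "Product Name: Example"]
fp_old = corpus_fingerprint(texts, "mxbai-embed-large")
fp_same = corpus_fingerprint(list(texts), "mxbai-embed-large")
fp_new_model = corpus_fingerprint(texts, "qwen3:1.7b")

print(fp_old == fp_same)       # identical inputs give an identical fingerprint
print(fp_old == fp_new_model)  # switching the embedding model changes it
```

Comparing the saved fingerprint with a freshly computed one before the `Path(VECTOR_DIR).exists()` check would decide whether loading the persisted store is still safe.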


## 4  Lightweight Intent Recognition & Entity Extraction

In [20]:
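INTENT_PATTERNS = {
    # Optional plural so "claims" matches as well as "claim"
    'claims': r'\b(claims?|lodg(e|ing)|damage)\b',
    # Optional suffix so the bare verb "cover" matches too
    'coverage': r'\b(cover(ed|age)?|policy limit|exclusion|excess)\b',
    # Group the alternation so both \b anchors apply to each branch
    'products': r'\b(what is [A-Za-z ]+ insurance|types? of insurance)\b',
    'pricing': r'\b(cost|price|premium|fee)\b',
    'account': r'\b(login|account|certificate of currency|policy documents|amend)\b',
}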

PRODUCT_NAMES = [p['name'].lower() for p in products_data]

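def classify_intent(query: str) -> str:
    # First matching pattern wins, so dict order in INTENT_PATTERNS sets priority
    for intent, pattern in INTENT_PATTERNS.items():
        if re.search(pattern, query, re.IGNORECASE):
            return intent
    return 'general'

def extract_entities(query: str) -> dict:
    doc = nlp(query)
    products = [p for p in PRODUCT_NAMES if p in query.lower()]
    # ORG/NORP entities are only a rough proxy for industries; spaCy can also
    # tag product names as ORG, so treat this list as a hint, not ground truth
    industries = [ent.text for ent in doc.ents if ent.label_ in {'ORG', 'NORP'}]
    return {'products': products, 'industries': industries}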

print(classify_intent('How do I lodge a claim?'))
print(extract_entities('Do you cover Architecture firms for Professional Indemnity?'))

claims
{'products': ['professional indemnity'], 'industries': ['Professional Indemnity']}
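
Because `classify_intent` returns on the first match, a query that touches two topics is labelled only with whichever pattern comes first in the dict. The sketch below lists every matching intent so that overlap becomes visible. `PATTERNS` is an illustrative two-entry subset rather than the full table, and `matching_intents` and `first_intent` are hypothetical helpers.

```python
import re

# Illustrative subset of the intent patterns; not the full notebook table.
PATTERNS = {
    "claims": r"\b(claim|lodg(e|ing)|damage)\b",
    "pricing": r"\b(cost|price|premium|fee)\b",
}

def matching_intents(query):
    """Return every intent whose pattern matches, in dict order."""
    return [
        intent
        for intent, pattern in PATTERNS.items()
        if re.search(pattern, query, re.IGNORECASE)
    ]

def first_intent(query):
    """First-match-wins behaviour, falling back to 'general'."""
    matches = matching_intents(query)
    return matches[0] if matches else "general"

query = "Will lodging a claim raise my premium?"
print(matching_intents(query))
print(first_intent(query))
print(first_intent("Where is your office?"))
```

Logging the full list during testing is one way to decide whether the dict order reflects the intended priority.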


## 5  Build Retrieval‑Augmented QA Chain

In [21]:
llm = ChatOllama(
    model=MODEL,
    temperature=0,
    num_ctx=10000,
    callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]),
)

prompt_template = (
    "You are NexoraBot, a helpful and knowledgeable customer-service assistant for an Australian SME insurance broker. "
    "Use ONLY the provided context to answer. If unsure, say you don't know but can escalate to a human agent. "
    "Answer in a friendly, concise manner and, where relevant, suggest next steps inside Nexora's platform.\n\n"
    "### Context\n{context}\n\n### Question\n{question}\n\n### Answer\n"
)

prompt = ChatPromptTemplate.from_template(prompt_template)

rag_chain = RetrievalQA.from_chain_type(
    llm=llm,
    chain_type='stuff',
    retriever=vector_store.as_retriever(search_kwargs={'k':4}),
    chain_type_kwargs={'prompt': prompt}
)
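
Conceptually, the `'stuff'` chain type concatenates the retrieved chunks into a single context string and substitutes it into the prompt. The sketch below mimics that step in plain Python to show what the model receives. It is not LangChain's implementation; the blank-line separator is an assumption, and `stuff_prompt` and the sample chunks are hypothetical.

```python
# Tail of the notebook's prompt template, reduced to its placeholders.
PROMPT = "### Context\n{context}\n\n### Question\n{question}\n\n### Answer\n"

def stuff_prompt(chunks, question, separator="\n\n"):
    """Join retrieved chunks into one context string and fill the template."""
    context = separator.join(chunks)
    return PROMPT.format(context=context, question=question)

# Hypothetical retrieved chunks standing in for the retriever's k=4 results.
chunks = [
    "Question: What is an excess?\nAnswer: The amount you pay on a claim.",
    "Product Name: Example Cover",
]
filled = stuff_prompt(chunks, "What is an excess?")
print(filled)
```

Printing the filled prompt for a failing question is a quick way to tell whether the problem lies in retrieval (wrong chunks) or in generation (right chunks, wrong answer).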

### Quick Test

In [22]:
question = 'What does Professional Indemnity cover?'
print(rag_chain.invoke({'query': question})['result'])  # .run() is deprecated in recent LangChain releases


<think>
Okay, the user is asking about what Professional Indemnity cover entails. Let me check the provided context first.

Looking at the context, the answers given are about Public Liability, Personal Accident Insurance, and whether a policy needs to start immediately. There's no direct mention of Professional Indemnity. The user's question is specifically about Professional Indemnity, which is a type of insurance coverage.

Since the context doesn't include information about Professional Indemnity, I need to determine if I can answer based on the given data. The answer should be concise and friendly. The user might be expecting a standard explanation, but since the context doesn't cover it, I should mention that the information isn't available in the provided context and suggest escalating to a human agent.

I should also make sure to follow the instructions: if unsure, say I don't know and escalate. So, the answer would state that the context doesn't cover Professional Indemnity an

In [23]:
print(rag_chain.invoke({'query': question}))

{'query': 'What does Professional Indemnity cover?', 'result': "<think>\nOkay, the user is asking about what Professional Indemnity cover entails. Let me check the provided context first.\n\nLooking at the context, the answers given are about Public Liability, Personal Accident Insurance, and whether a policy needs to start immediately. There's no direct mention of Professional Indemnity. The user's question is specifically about Professional Indemnity, which is a type of insurance coverage.\n\nSince the context doesn't include information about Professional Indemnity, I need to determine if I can answer based on the given data. The answer should be concise and friendly. The user might be expecting a standard explanation, but since the context doesn't cover it, I should mention that the information isn't available in the provided context and suggest escalating to a human agent.\n\nI should also make sure to follow the instructions: if unsure, say I don't know and escalate. So, the answ
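
The outputs above show that Qwen3 places its reasoning inside a `<think>` block at the start of the result, which a customer should not see. A minimal post-processing sketch follows. It assumes the reasoning is always wrapped in `<think>` tags, and `strip_reasoning` is a hypothetical helper, not part of LangChain or Ollama.

```python
import re

THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", re.DOTALL)

def strip_reasoning(text):
    """Drop closed <think>...</think> blocks; also drop an unclosed trailing one."""
    text = THINK_BLOCK.sub("", text)
    # If the model was cut off mid-reasoning, nothing after <think> is an answer.
    unclosed = text.find("<think>")
    if unclosed != -1:
        text = text[:unclosed]
    return text.strip()

sample = "<think>\nCheck the context first.\n</think>\n\nSorry, I don't know."
print(strip_reasoning(sample))
```

Applying this to `result["result"]` before display would keep the reasoning out of the chat window while leaving the raw output available for debugging.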

## 6  Gradio Web Chat Demo

In [41]:
def chat_function(message, history):
    intent = classify_intent(message)
    entities = extract_entities(message)

    # RetrievalQA reads its input from the 'query' key
    result = rag_chain.invoke({'query': message})
    answer = result["result"]

    meta_note = f"\n\n_Bot note: intent={intent}, entities={entities}_"
    return answer + meta_note

with gr.Blocks(title='Nexora Chatbot Demo') as demo:
    gr.Markdown('# Nexora Insurance Chatbot')
    chatbot = gr.Chatbot(type='messages')
    msg = gr.Textbox(placeholder='Ask a question about your policy…', label='Your message')
    send = gr.Button('Send')
    state = gr.State([])  # list of {'role': ..., 'content': ...} dicts

    def respond(user_message, chat_history):
        bot_reply = chat_function(user_message, chat_history)
        # type='messages' expects role/content dicts rather than (user, bot) tuples
        chat_history.append({'role': 'user', 'content': user_message})
        chat_history.append({'role': 'assistant', 'content': bot_reply})
        return '', chat_history, chat_history

    send.click(respond, [msg, state], [msg, chatbot, state])

# To launch the interface inside the notebook, use demo.launch(debug=True).
# Or from the CLI, if saved separately with a `demo.launch()` call: `python app.py` (or `gradio app.py` for auto-reload).


In [42]:
demo.launch(debug=True)

* Running on local URL:  http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.


Keyboard interruption in main thread... closing server.




## 7  Deployment & Next Steps (Conceptual)
* **FastAPI + Uvicorn** wrapper to expose `chat_function` as a `/chat` endpoint.  
* **Docker** containerization with Ollama serving model weights locally.  
* **CI/CD** via GitHub Actions → build, push image to GHCR, deploy to Azure Container Apps.  
* **Monitoring**: Prometheus + Grafana for latency and Chroma vector‑store health; OpenTelemetry traces for LLM calls.  
* **Fine‑Tuning**: see `finetune_plan.md` (not included) for using annotated conversations for supervised fine‑tuning of Llama‑3 with QLoRA.
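
As a starting point for the latency monitoring mentioned above, per-call timings can be captured with a small wrapper before any metrics backend is wired in. This sketch uses a plain Python list as a stand-in for a Prometheus histogram. `timed`, `fake_chat`, and `latencies` are hypothetical names, and `fake_chat` only imitates the signature of `chat_function`.

```python
import time
from functools import wraps

def timed(fn, sink):
    """Wrap fn so each call appends its wall-clock latency (seconds) to sink."""
    @wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            # Record the latency even when fn raises.
            sink.append(time.perf_counter() - start)
    return wrapper

# Hypothetical stand-in for chat_function(message, history).
def fake_chat(message, history):
    return f"echo: {message}"

latencies = []
timed_chat = timed(fake_chat, latencies)
reply = timed_chat("hello", [])
print(reply, len(latencies))
```

In a deployed service the `sink.append` call would be replaced by an observation on whatever metrics client is chosen; the wrapper itself stays the same.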

---
**Enjoy experimenting with your Nexora chatbot!**