# RAG-KG: Implementation Walkthrough (SIGIR '24 Aligned)

This notebook walks through the RAG-KG Customer Service QA System step by step. It follows the LinkedIn SIGIR '24 research paper: **"Retrieval-Augmented Generation with Knowledge Graphs for Customer Service Question Answering"**.

## Objectives
1. Understand the Dual-Level Knowledge Graph architecture.
2. Implement Entity-Section mapping for precise extraction.
3. Apply $S_{T_i}$ scoring for ticket-level retrieval.
4. Use LLM-driven subgraph extraction for context generation.

## 1. Environment Setup
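Make sure Neo4j, Qdrant, and Ollama are running before you execute the cells below. `load_dotenv()` loads variables from a local `.env` file into the process environment.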

In [None]:
import os
import json
from dotenv import load_dotenv
from app.query_processor import QueryProcessor
from app.retrieval_system import RetrievalSystem
from app.answer_generator import AnswerGenerator

load_dotenv()
print("Environment loaded.")

## 2. Query Processing (SIGIR '24 Entity-Section Mapping)
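The paper represents the entities extracted from a query as a `Map(Section -> Value)`: each key is a ticket section and each value is the text extracted for that section. The query intent is detected alongside the entities.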
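As a rough picture of the `Map(Section -> Value)` shape, the sketch below uses a plain dictionary. The section names, values, and intent label are hypothetical; they are not the actual output of `QueryProcessor`, which depends on the ticket template and the extraction prompt.

```python
# Hypothetical section names and values, chosen only to illustrate the shape.
entity_map = {
    "issue summary": "csv upload error",
    "environment": "production dashboard",
}
intent = "fix issue"  # hypothetical intent label

# Each entry pairs a ticket section with the value extracted from the query.
for section, value in entity_map.items():
    print(f"{section} -> {value}")
```

Keying entities by section means each value can later be compared only against ticket nodes of the same section, rather than against the whole ticket text.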

In [None]:
processor = QueryProcessor()
query = "How to fix the csv upload error in the production dashboard?"
processed = processor.process(query)

print(f"Extracted Entities: {json.dumps(processed['entities'], indent=2)}")
print(f"Detected Intent: {processed['intent']}")

## 3. Retrieval with $S_{T_i}$ Scoring
For each ticket $T_i$, the system calculates a ticket-level score $S_{T_i}$ by summing the similarity contributions from the ticket's nodes whose section matches an entity extracted from the query.
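One way to formalize the summation described above, assuming cosine similarity over embeddings, is $S_{T_i} = \sum_{k} \sum_{n \in T_i} \mathbb{1}[n.\mathrm{sec} = k] \cdot \cos(\mathrm{embed}(v_k), \mathrm{embed}(n.\mathrm{text}))$, where $v_k$ is the query value for section $k$. The sketch below implements that sum with toy two-dimensional vectors. The function names and node fields (`ticket_id`, `section`, `vector`) are assumptions for illustration and may not match what `RetrievalSystem` does internally.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def ticket_scores(query_vectors, nodes):
    """Sum, per ticket, the similarity of each node to the query value of the same section.

    query_vectors: {section: vector} for the entities extracted from the query.
    nodes: dicts with hypothetical keys 'ticket_id', 'section', 'vector'.
    """
    scores = defaultdict(float)
    for node in nodes:
        query_vec = query_vectors.get(node["section"])
        if query_vec is not None:  # only sections present in the query contribute
            scores[node["ticket_id"]] += cosine(query_vec, node["vector"])
    return dict(scores)

# Toy embeddings standing in for real model output.
nodes = [
    {"ticket_id": "T1", "section": "summary", "vector": [1.0, 0.0]},
    {"ticket_id": "T1", "section": "environment", "vector": [0.0, 1.0]},
    {"ticket_id": "T2", "section": "summary", "vector": [0.0, 1.0]},
]
query_vectors = {"summary": [1.0, 0.0], "environment": [0.0, 1.0]}
scores = ticket_scores(query_vectors, nodes)
print(scores)
```

Because the score is a sum over matching nodes, a ticket that matches the query in several sections outranks one that matches in only a single section.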

In [None]:
retriever = RetrievalSystem()
retriever.initialize()

sources = retriever.retrieve(processed)
for source in sources[:3]:
    print(f"Ticket: {source['ticket_id']} | Score: {source['score']:.2f} | Type: {source['node_type']}")

## 4. LLM-driven Subgraph Extraction
For the top-k tickets, the LLM generates ticket-specific Cypher queries that extract the most relevant context from each ticket's subgraph.
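To give a sense of what such a query might look like, the sketch below builds a parameterized Cypher statement from a fixed template. In the actual system the query text comes from the LLM, so this template is only a stand-in. The graph schema (`Ticket` and `Section` labels, the `HAS_SECTION` relationship, and the `id`, `name`, and `text` properties) and the function name are hypothetical.

```python
def build_subgraph_query(ticket_id, section):
    """Return a Cypher query and its parameters for one section of one ticket.

    Hypothetical schema: (:Ticket {id})-[:HAS_SECTION]->(:Section {name, text}).
    """
    query = (
        "MATCH (t:Ticket {id: $ticket_id})-[:HAS_SECTION]->(s:Section {name: $section}) "
        "RETURN s.text AS text"
    )
    # Values are passed as parameters rather than formatted into the query string.
    params = {"ticket_id": ticket_id, "section": section}
    return query, params

cypher, params = build_subgraph_query("T1", "solution")
print(cypher)
print(params)
```

Passing the ticket ID and section as parameters keeps user-derived text out of the query string itself.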

In [None]:
# Subgraph extraction now runs inside retriever.retrieve(), so no separate call is needed here.
print(f"Retrieved {len(sources)} context nodes across relevant tickets.")

## 5. Answer Generation
Finally, the LLM constructs an answer using the extracted subgraphs.
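The sketch below shows one plausible way to assemble retrieved context into a prompt before it is sent to the LLM. It is not the implementation of `AnswerGenerator`: the `build_prompt` function, the prompt layout, and the `text` field on each source are assumptions, while `ticket_id`, `score`, and `node_type` mirror the fields printed in the retrieval step.

```python
def build_prompt(query, sources, max_sources=3):
    """Format the highest-scoring sources as context, followed by the question."""
    ranked = sorted(sources, key=lambda s: s["score"], reverse=True)[:max_sources]
    context = "\n".join(
        # 'text' is a hypothetical field holding the node content.
        f"[{s['ticket_id']} | {s['node_type']}] {s.get('text', '')}"
        for s in ranked
    )
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

example_sources = [
    {"ticket_id": "T2", "score": 0.41, "node_type": "summary", "text": "Upload fails for large files."},
    {"ticket_id": "T1", "score": 0.87, "node_type": "solution", "text": "Re-save the CSV as UTF-8."},
]
prompt = build_prompt("How to fix the csv upload error?", example_sources)
print(prompt)
```

Ordering the context by score puts the strongest evidence first, which matters if the prompt is later truncated to fit a context window.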

In [None]:
generator = AnswerGenerator()
answer, confidence = generator.generate(query, sources, processed)

print(f"Answer: {answer}")
print(f"Confidence: {confidence:.2f}")