# RAG Failure #12: The Synonymy & Jargon Disconnect

## The Problem
Organizations often use internal **Code Names**, **Acronyms**, or **Legacy Jargon** that standard embedding models, trained largely on public internet data, cannot relate to the business terms users actually type.

Even if you retrieve **100% of the documents**, if the user asks about "The Billing System" and the logs only mention "Project Ledger-X", the LLM is likely to answer: *"I see no mention of a Billing System."*
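
The mismatch can be made concrete with a purely lexical stand-in for similarity. The sketch below is not how an embedding model scores text; it uses Jaccard token overlap, chosen here only for illustration, to show that the query and the log share no vocabulary at all. The log sentence and the helper names `tokens` and `jaccard` are hypothetical.

```python
import re

def tokens(text: str) -> set[str]:
    """Lowercase a string and split it into a set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def jaccard(a: str, b: str) -> float:
    """Jaccard overlap between the token sets of two strings (0.0 to 1.0)."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

query = "What is the status of the Billing System?"
# Hypothetical log line that uses only the internal code name.
log_line = "Project Ledger-X reported three failed batch jobs overnight."

print(jaccard(query, log_line))  # 0.0: no shared tokens
```

With zero shared vocabulary, any link between the two strings has to come from knowledge outside the text itself, which is the gap a service registry is meant to fill.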

## The Scenario: IT Incident Response (CMDB Mapping)
**Query:** "What is the current error rate of the **Checkout Service**?"

**The Jargon-Heavy Logs:**
1.  **Log A (The Culprit):** "**Cart-Flow-V2** is throwing 500 errors. Failure rate is **15%**."
2.  **Log B (Dependency):** "**Stripe-Adaptor** latency is normal."
3.  **Log C (Distractor):** "The **Checkout** UI team is updating the CSS styles."

**Naive RAG Failure (Even with Full Context):** 
The LLM reads all logs. 
-   It sees "Cart-Flow-V2". It doesn't know this IS the Checkout Service.
-   It sees "Checkout UI". It thinks this is the relevant doc.
-   **Result:** *"The Checkout UI is updating styles. No error rate is mentioned."* (Misses the critical incident).

**KG Solution:** We load a **Service Registry** into the Graph (`Cart-Flow-V2` --[IMPLEMENTS]--> `Checkout Service`). We expand the user's query to include the technical IDs.
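
Before bringing in a graph library, the idea can be sketched with plain Python. The snippet below is a minimal illustration, assuming the registry is available as a list of `(technical ID, relation, business service)` triples; the helper names `build_reverse_index` and `expand_terms` are hypothetical, and the `Stripe-Adaptor` edge is an assumed mapping added to show a second entry.

```python
from collections import defaultdict

# Edges as (technical ID, relation, business service).
EDGES = [
    ("Cart-Flow-V2", "IMPLEMENTS", "Checkout Service"),
    ("Stripe-Adaptor", "IMPLEMENTS", "Payment Gateway"),  # assumed mapping
]

def build_reverse_index(edges):
    """Map each business service to the technical IDs that implement it."""
    index = defaultdict(list)
    for tech_id, relation, business in edges:
        if relation == "IMPLEMENTS":
            index[business].append(tech_id)
    return index

def expand_terms(concept, index):
    """Return the concept followed by every technical ID registered for it."""
    return [concept] + index.get(concept, [])

index = build_reverse_index(EDGES)
print(expand_terms("Checkout Service", index))  # ['Checkout Service', 'Cart-Flow-V2']
print(expand_terms("Search Service", index))    # ['Search Service'] (unknown concept)
```

The reverse index plays the role of following `IMPLEMENTS` edges backwards: the user supplies the business name, and the lookup returns the identifiers that actually appear in the logs.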

In [None]:
# --- Step 1: Environment Setup ---
!pip install -q langchain langchain-community langchain-huggingface faiss-cpu networkx transformers sentence-transformers accelerate bitsandbytes

In [None]:
import torch
from transformers import pipeline, AutoTokenizer, AutoModelForCausalLM
from langchain_huggingface import HuggingFacePipeline, HuggingFaceEmbeddings
import networkx as nx

# --- Step 2: Load Model ---
model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

print(f"Loading {model_id}...")
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

pipe = pipeline(
    "text-generation", 
    model=model, 
    tokenizer=tokenizer, 
    max_new_tokens=256, 
    temperature=0.1, 
    do_sample=True
)

llm = HuggingFacePipeline(pipeline=pipe)
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
print("Model loaded. Pipeline ready.")

Loading TinyLlama/TinyLlama-1.1B-Chat-v1.0...
Model loaded. Pipeline ready.


In [None]:
# Document lives in langchain_core; the older langchain.docstore path is deprecated.
from langchain_core.documents import Document

# --- Step 3: Simulate Jargon-Heavy Logs ---
# Note the disconnect: User knows "Checkout Service". System knows "Cart-Flow-V2".
raw_texts = [
    "[System Log] Service: Cart-Flow-V2 | Status: CRITICAL | Metric: 500 Error Rate is at 15% due to DB timeout.",
    "[System Log] Service: Stripe-Adaptor | Status: HEALTHY | Metric: Latency < 20ms.",
    "[Dev Team Chat] The Checkout UI team is deploying a new CSS fix for the button color. No functional changes.",
    "[Inventory Log] Stock-Keeper-Daemon is processing 50 items per second."
]

docs = [Document(page_content=t) for t in raw_texts]
print(f"Created {len(docs)} Log Documents.")
for i, d in enumerate(docs):
    print(f"Doc {i+1}: {d.page_content}")

Created 4 Log Documents.
Doc 1: [System Log] Service: Cart-Flow-V2 | Status: CRITICAL | Metric: 500 Error Rate is at 15% due to DB timeout.
Doc 2: [System Log] Service: Stripe-Adaptor | Status: HEALTHY | Metric: Latency < 20ms.
Doc 3: [Dev Team Chat] The Checkout UI team is deploying a new CSS fix for the button color. No functional changes.
Doc 4: [Inventory Log] Stock-Keeper-Daemon is processing 50 items per second.


In [None]:
# --- Step 4: Naive RAG (Full Context) ---
from langchain_community.vectorstores import FAISS

print("\n--- NAIVE RAG (Full Retrieval Failure) ---")
query = "What is the current error rate of the Checkout Service?"
print(f"Query: {query}")

# 1. Indexing
vectorstore = FAISS.from_documents(docs, embeddings)

# 2. Retrieval
# We retrieve ALL documents. The problem isn't missing data, it's missing UNDERSTANDING.
retriever = vectorstore.as_retriever(search_kwargs={"k": 4})
retrieved_docs = retriever.invoke(query)

print("\nRetrieved Context (k=4 - ALL DOCS):")
context_str = ""
for i, d in enumerate(retrieved_docs):
    print(f"{i+1}. {d.page_content}")
    context_str += d.page_content + "\n"

# 3. Generation
prompt = f"<|system|>\nAnswer the question based on the context.\n<|user|>\nContext:\n{context_str}\nQuestion:\n{query}\n<|assistant|>"
response = llm.invoke(prompt)
cleaned_response = response.split("<|assistant|>")[-1].strip()

print("\nLLM Answer:")
print(cleaned_response)


--- NAIVE RAG (Full Retrieval Failure) ---
Query: What is the current error rate of the Checkout Service?

Retrieved Context (k=4 - ALL DOCS):
1. [System Log] Service: Cart-Flow-V2 | Status: CRITICAL | Metric: 500 Error Rate is at 15% due to DB timeout.
2. [System Log] Service: Stripe-Adaptor | Status: HEALTHY | Metric: Latency < 20ms.
3. [Dev Team Chat] The Checkout UI team is deploying a new CSS fix for the button color. No functional changes.
4. [Inventory Log] Stock-Keeper-Daemon is processing 50 items per second.

LLM Answer:
Based on the context, there is no error rate mentioned for a service explicitly named "Checkout Service". The "Checkout UI team" is deploying a CSS fix, but no errors are reported for them. There is a 15% error rate for "Cart-Flow-V2", but it is not specified if this is the Checkout Service.

ANALYSIS:
This is a "Semantic Disconnect".
The LLM has the answer (Doc 1) right in front of it.
But because it lacks the domain knowledge that Cart-Flow-V2 == Checkout Service, it cannot attribute the 15% error rate to the service the user asked about.

In [None]:
# --- Step 5: Service Registry (CMDB) Construction ---
# We simulate loading a "Golden Record" or "Service Catalog" that maps jargon to business terms.

kg = nx.DiGraph()

cmdb_data = [
    "Cart-Flow-V2 is the backend microservice that handles the Checkout Service logic.",
    "Stripe-Adaptor manages the Payment Gateway integration.",
    "Stock-Keeper-Daemon is responsible for Inventory Management."
]

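def parse_cmdb(text):
    """
    Ask the LLM to map a Technical ID to the Business Capability it implements.
    Returns [tech_id, relation, business_service], or [] if the output is unusable.
    """
    prompt = f"""<|system|>
    You are an IT Architect. Map the Technical Component to the Business Service.
    Format: Tech_ID | IMPLEMENTS | Business_Service
    <|user|>
    Text: {text}
    <|assistant|>"""
    
    raw = llm.invoke(prompt)
    out = raw.split("<|assistant|>")[-1].strip()
    # Use only the first line that contains the delimiter, so any extra text
    # the model generates afterwards does not leak into the last field.
    for line in out.splitlines():
        if "|" in line:
            return [p.strip() for p in line.split("|")]
    return []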

print("\n--- CMDB INGESTION (Service Registry) ---")
print("Simulating loading internal documentation/CMDB...\n")

for entry in cmdb_data:
    print(f"Processing: {entry}")
    parts = parse_cmdb(entry)
    
    if len(parts) >= 3:
        tech, rel, bus = parts[0], parts[1], parts[2]
        print(f"   [Mapped]: '{tech}' is the technical name for '{bus}'")
        
        # Add to Graph
        # We treat Business Service as the 'Concept' and Tech ID as the 'Instance'
        kg.add_edge(tech, bus, relation="IMPLEMENTS")
        print(f"   [Graph]: ({tech}) -[{rel}]-> ({bus})")


--- CMDB INGESTION (Service Registry) ---
Simulating loading internal documentation/CMDB...

Processing: Cart-Flow-V2 is the backend microservice that handles the Checkout Service logic.
   [Mapped]: 'Cart-Flow-V2' is the technical name for 'Checkout Service'
   [Graph]: (Cart-Flow-V2) -[IMPLEMENTS]-> (Checkout Service)

Processing: Stripe-Adaptor manages the Payment Gateway integration.
   [Mapped]: 'Stripe-Adaptor' is the technical name for 'Payment Gateway'
   [Graph]: (Stripe-Adaptor) -[IMPLEMENTS]-> (Payment Gateway)
...
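
Because the triples come from a small LLM, it may help to validate the output before it reaches the graph. The following is a minimal sketch, assuming the model is asked for the `Tech_ID | IMPLEMENTS | Business_Service` format used above; `TRIPLE_PATTERN` and `parse_triple` are hypothetical names, not part of the notebook.

```python
import re

# Accept exactly: <tech id> | IMPLEMENTS | <business service> on a single line.
TRIPLE_PATTERN = re.compile(
    r"^\s*([^|\n]+?)\s*\|\s*IMPLEMENTS\s*\|\s*([^|\n]+?)\s*$"
)

def parse_triple(llm_output: str):
    """Return (tech_id, business_service) from the first well-formed line, else None."""
    for line in llm_output.splitlines():
        match = TRIPLE_PATTERN.match(line)
        if match:
            return match.group(1), match.group(2)
    return None

# Hypothetical model outputs, including one with surrounding chatter.
print(parse_triple("Cart-Flow-V2 | IMPLEMENTS | Checkout Service"))
print(parse_triple("Sure!\nStripe-Adaptor | IMPLEMENTS | Payment Gateway\nHope this helps."))
print(parse_triple("Stripe-Adaptor manages payments."))  # None: no valid triple
```

Rejecting malformed lines keeps a single bad generation from adding a wrong edge, which would otherwise silently redirect every later query expansion.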


In [None]:
# --- Step 6: The Solution (Query Expansion) ---

print("\n--- GRAPH-AUGMENTED QUERY EXPANSION ---")
print(f"User Query: \"{query}\"")

def smart_retrieve(user_query):
    # 1. Extract potential entities
    # Hard-coded for this demo; in production, derive it from user_query
    # (for example with NER or a lookup against the graph's nodes).
    print("\n1. Entity Extraction & Lookup:")
    target_concept = "Checkout Service"
    print(f"   Looking for '{target_concept}' in Graph...")
    
    expanded_terms = [target_concept]
    
    if target_concept in kg:
        print(f"   -> Found Node: '{target_concept}'")
        
        # 2. Expand Query
        print("\n2. Query Expansion (Reverse Lookup):")
        print(f"   Finding technical components that IMPLEMENT '{target_concept}'...")
        
        # Find predecessors: (Tech) -> (Business)
        tech_components = list(kg.predecessors(target_concept))
        for t in tech_components:
            print(f"   -> Found: '{t}'")
            expanded_terms.append(t)
            
    # 3. Filter/Prioritize Documents
    print("\n3. Augmented Search:")
    print(f"   New Query Terms: {expanded_terms}")
    print("   Retrieving docs containing ANY of these terms...")
    
    relevant_docs = []
    for doc in docs:
        # Simple string check (simulating vector retrieval of synonyms)
        for term in expanded_terms:
            if term in doc.page_content:
                relevant_docs.append(doc.page_content)
                print(f"   - Matched Doc: {doc.page_content[:45]}...")
                break
                
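    # 4. Generate Answer
    print("\n4. Final Answer Generation:")
    final_context = "\n".join(relevant_docs)
    # Only state the alias when the graph returned one; indexing expanded_terms[1]
    # unconditionally raises IndexError when the concept has no registered component.
    alias_note = ""
    if len(expanded_terms) > 1:
        alias_note = f" Note that {' and '.join(expanded_terms[1:])} is the {expanded_terms[0]}."
    final_prompt = f"<|system|>Answer the query.{alias_note}<|user|>Context:\n{final_context}\nQuestion: {user_query}\n<|assistant|>"
    
    res = llm.invoke(final_prompt)
    return res.split("<|assistant|>")[-1].strip()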

final_answer = smart_retrieve(query)
print(final_answer)


--- GRAPH-AUGMENTED QUERY EXPANSION ---
User Query: "What is the current error rate of the Checkout Service?"

1. Entity Extraction & Lookup:
   Looking for 'Checkout Service' in Graph...
   -> Found Node: 'Checkout Service'

2. Query Expansion (Reverse Lookup):
   Finding technical components that IMPLEMENT 'Checkout Service'...
   -> Found: 'Cart-Flow-V2'

3. Augmented Search:
   New Query Terms: ['Checkout Service', 'Cart-Flow-V2']
   Retrieving docs containing ANY of these terms...
   - Matched Doc 1: ...Cart-Flow-V2... 15% error rate...

4. Final Answer Generation:
The Checkout Service (technically 'Cart-Flow-V2') is currently reporting a CRITICAL 15% error rate due to a DB timeout.
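
The demo fixes the target concept by hand. One lightweight way to derive it from the query instead is to match the query against the business names already known to the registry. The sketch below assumes those names are available as a list (in the notebook they could be read from the graph's business-side nodes); `find_concepts` is a hypothetical helper, and a real deployment might prefer a proper NER model.

```python
import re

def find_concepts(query: str, known_concepts: list[str]) -> list[str]:
    """Return known concepts mentioned in the query, longest names first."""
    found = []
    for concept in sorted(known_concepts, key=len, reverse=True):
        # Word boundaries prevent partial hits inside longer words.
        pattern = r"\b" + re.escape(concept) + r"\b"
        if re.search(pattern, query, flags=re.IGNORECASE):
            found.append(concept)
    return found

# Business names taken from the CMDB entries used earlier.
known = ["Checkout Service", "Payment Gateway", "Inventory Management"]

print(find_concepts("what is the error rate of the checkout service?", known))
# ['Checkout Service']
print(find_concepts("The Checkout UI team is busy", known))
# []
```

Matching the full phrase rather than the single word "Checkout" is what keeps the "Checkout UI" distractor from being treated as the target concept.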
