# Deming's "Out of the Crisis" RAG with RAGAS Evaluation

This notebook builds a retrieval-augmented generation (RAG) pipeline over Deming's *Out of the Crisis*, converted to JSON, and collects RAGAS-style evaluation data for it.

The format of the JSON file is shown in `out-of-crisis-sample.json`. If you own the book, you can convert it to JSON using the [js epub parser](https://github.com/gaoxiaoliangz/epub-parser) and the logic in the `out-of-crisis-converter.js` script.
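
For orientation, the loader further down reads each entry as an object with `url`, `title`, and `sections` (a mapping from section title to section text). The sketch below checks a file's shape against that expectation. The helper name `validate_entries` and the sample values are illustrative assumptions, not taken from the real data or the converter script.

```python
import json
from typing import List


def validate_entries(entries: List[dict]) -> List[str]:
    """Return a list of problems found; an empty list means the shape looks right."""
    problems = []
    for i, entry in enumerate(entries):
        # Every entry is expected to carry these three keys.
        for key in ("url", "title", "sections"):
            if key not in entry:
                problems.append(f"entry {i}: missing key '{key}'")
        sections = entry.get("sections", {})
        if not isinstance(sections, dict):
            problems.append(f"entry {i}: 'sections' must be an object")
            continue
        # Each section maps a title to plain text.
        for name, text in sections.items():
            if not isinstance(text, str):
                problems.append(f"entry {i}: section '{name}' is not a string")
    return problems


# Placeholder data for illustration only; not copied from the real file.
sample_json = """
[
  {
    "url": "chapter-01.html",
    "title": "Example chapter title",
    "sections": {
      "Example section": "Section text goes here."
    }
  }
]
"""
sample_entries = json.loads(sample_json)
print(validate_entries(sample_entries))
```

Running a check like this before indexing makes a malformed conversion fail early instead of surfacing as a `KeyError` during document creation.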


In [1]:
## Load environment variables

import os
from dotenv import load_dotenv, find_dotenv, dotenv_values

# Load with explicit path and allow override
dotenv_path = find_dotenv(usecwd=True)
print("dotenv_path:", dotenv_path or "NOT FOUND")
load_dotenv(dotenv_path=dotenv_path, override=True)

# Show what was parsed from the file (safe preview)
parsed = dotenv_values(dotenv_path) if dotenv_path else {}
print("Keys in .env:", sorted(parsed.keys()))
print("Has OPENAI_API_KEY in .env?:", "OPENAI_API_KEY" in parsed)

val = os.getenv("OPENAI_API_KEY")
print("Env OPENAI_API_KEY present?:", val is not None)
print("Value prefix (masked):", (val[:6] + "…") if val else None)

# Current working directory (to catch path mistakes)
print("cwd:", os.getcwd())


dotenv_path: /workspace/.env
Keys in .env: ['HANDBOOK_SOURCE', 'LANGSMITH_API_KEY', 'LANGSMITH_ENDPOINT', 'LANGSMITH_PROJECT', 'LANGSMITH_TRACING', 'OPENAI_API_KEY', 'POSTS_SOURCE']
Has OPENAI_API_KEY in .env?: True
Env OPENAI_API_KEY present?: True
Value prefix (masked): sk-pro…
cwd: /workspace


In [2]:
# Define LLM model

import getpass, os
from langchain.chat_models import init_chat_model

if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

llm = init_chat_model("gpt-4o-mini", model_provider="openai", verbose=True)


In [3]:
# Choose embeddings

import getpass
import os

from langchain_openai import OpenAIEmbeddings

# Prompt for the key only if the earlier cells did not already set it
if not os.environ.get("OPENAI_API_KEY"):
    os.environ["OPENAI_API_KEY"] = getpass.getpass("Enter API key for OpenAI: ")

embeddings = OpenAIEmbeddings(model="text-embedding-3-large")


In [4]:
# Load Deming's book data and create vector store
from typing import TypedDict, List, Dict
import json, os
from langchain_core.documents import Document
from langchain_core.vectorstores import InMemoryVectorStore

class HandbookEntry(TypedDict):
    url: str
    title: str
    sections: Dict[str, str]

def load_handbook(json_path: str) -> List[HandbookEntry]:
    with open(json_path, 'r', encoding='utf-8') as f:
        return json.load(f)

def create_documents_fixed(entries: List[HandbookEntry]) -> List[Document]:
    documents = []
    # Create one Document per section (not per chapter) to keep each embedding input small
    for entry in entries:
        for section_title, section_text in entry['sections'].items():
            # Skip empty or very short sections (under 50 characters)
            if not section_text or len(section_text.strip()) < 50:
                continue
            metadata = {
                'url': entry['url'],
                'title': entry['title'],
                'section': section_title,
            }
            documents.append(Document(page_content=section_text, metadata=metadata))
    return documents

print("Loading Deming's Out of the Crisis...")
handbook_entries = load_handbook("out-of-crisis.json")
print(f"Loaded {len(handbook_entries)} book entries")

# Convert to Langchain documents (one per section)
print("Converting book sections to Langchain documents...")
documents = create_documents_fixed(handbook_entries)
print(f"Created {len(documents)} documents")

# Create the vector store
vector_store = InMemoryVectorStore(embeddings)

# Add documents in small batches so each embedding request stays under the token limit
batch_size = 50  # Process 50 documents at a time
all_document_ids = []

print(f"Adding {len(documents)} documents in batches of {batch_size}...")
for i in range(0, len(documents), batch_size):
    batch = documents[i:i + batch_size]
    print(f"Processing batch {i//batch_size + 1}/{(len(documents) + batch_size - 1)//batch_size} ({len(batch)} documents)")
    batch_ids = vector_store.add_documents(documents=batch)
    all_document_ids.extend(batch_ids)

print(f"Successfully added {len(all_document_ids)} documents to vector store")
print("Sample Document Ids:", all_document_ids[:5])

# Test the vector store with a simple search
print("\nTesting vector store with a sample search...")
test_results = vector_store.similarity_search("Deming's 14 points", k=3)
print(f"Found {len(test_results)} relevant documents for test query")
if test_results:
    print(f"First result title: {test_results[0].metadata.get('title', 'Unknown')}")
    print(f"First result section: {test_results[0].metadata.get('section', 'Unknown')}")
    print(f"First result preview: {test_results[0].page_content[:200]}...")


Loading Deming's Out of the Crisis...
Loaded 17 book entries
Converting book sections to Langchain documents...
Created 354 documents
Adding 354 documents in batches of 50...
Processing batch 1/8 (50 documents)
Processing batch 2/8 (50 documents)
Processing batch 3/8 (50 documents)
Processing batch 4/8 (50 documents)
Processing batch 5/8 (50 documents)
Processing batch 6/8 (50 documents)
Processing batch 7/8 (50 documents)
Processing batch 8/8 (4 documents)
Successfully added 354 documents to vector store
Sample Document Ids: ['04b429e1-f76b-4685-8f3b-2b72d66c6ba6', '61300ec7-c1ac-4c1c-af63-8f763e8ec8f4', 'fbf2811d-a5f6-4243-9910-f31764b7453b', '38135e2a-c489-4cac-870b-05fe7e451147', '7dad50e0-4691-41cb-80a6-f7bb389659d7']

Testing vector store with a sample search...
Found 3 relevant documents for test query
First result title: 2 Principles for Transformation of Western Management
First result section: Condensation of the 14 Points for Management
First result preview: Condensation of 
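
The loader above creates one document per section, so an unusually long section could still exceed the embedding model's input limit. A minimal sketch of splitting such a section into overlapping pieces follows. The function `split_text` and its parameters `max_chars` and `overlap` are hypothetical, and character counts are only a rough stand-in for tokens; LangChain's own text splitters are likely a better fit for real use.

```python
from typing import List


def split_text(text: str, max_chars: int = 2000, overlap: int = 200) -> List[str]:
    """Split text into chunks of at most max_chars, each sharing `overlap` chars with the previous one."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must satisfy 0 <= overlap < max_chars")
    if len(text) <= max_chars:
        return [text]

    chunks = []
    step = max_chars - overlap  # how far the window advances each time
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this chunk already reached the end of the text
        start += step
    return chunks


# Synthetic 4500-character "section" for illustration.
long_section = "x" * 4500
pieces = split_text(long_section, max_chars=2000, overlap=200)
print([len(p) for p in pieces])  # → [2000, 2000, 900]
```

Each piece could then become its own `Document` with the same `url`, `title`, and `section` metadata, so search results still point back to the right place in the book.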

In [5]:
# Test the RAG system
from langchain_core.prompts import PromptTemplate

# Define prompt for question-answering
prompt = PromptTemplate(
    input_variables=["question", "context"],
    template="""
        Act as a conversational interface for answering questions based on the content of Deming's "Out of the Crisis" in your knowledge base.

        When information related to a specific topic does not exist, return no results.
                
        Question: {question} 
        Context: {context} 
        Answer:
        """
)

# Simple test function
def test_rag_system(question: str):
    print(f"Testing question: {question}")
    
    # Retrieve relevant documents
    retrieved_docs = vector_store.similarity_search(question, k=5)
    print(f"Retrieved {len(retrieved_docs)} documents")
    
    if not retrieved_docs:
        return "No relevant documents found."
    
    # Generate answer
    docs_content = "\n\n".join(doc.page_content for doc in retrieved_docs)
    messages = prompt.invoke({"question": question, "context": docs_content})
    response = llm.invoke(messages)
    
    return response.content

# Test with sample questions
test_questions = [
    "What are Deming's 14 points for management?",
    "How does Deming define quality?",
    "What is the chain reaction of quality improvement?"
]

print("=" * 80)
print("TESTING RAG SYSTEM")
print("=" * 80)

for i, question in enumerate(test_questions, 1):
    print(f"\n--- TEST {i}/{len(test_questions)} ---")
    answer = test_rag_system(question)
    print(f"\nAnswer:\n{answer}")
    print("=" * 60)
    print()


TESTING RAG SYSTEM

--- TEST 1/3 ---
Testing question: What are Deming's 14 points for management?
Retrieved 5 documents

Answer:
Deming's 14 Points for Management are as follows:

1. **Create constancy of purpose toward improvement of product and service**: Aim to be competitive and provide jobs through a long-term focus on quality improvement.

2. **Adopt the new philosophy**: Embrace a shift in mindset for the new economic era and take responsibility for change.

3. **Cease dependence on inspection to achieve quality**: Focus on building quality into the product from the start instead of relying on inspection.

4. **End the practice of awarding business on the basis of price tag**: Minimize total costs by fostering long-term relationships with a single supplier based on loyalty and trust.

5. **Improve constantly and forever the system of production and service**: This should aim to enhance quality and productivity while reducing costs.

6. **Institute training on the job**: Emphasi
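
Evaluation needs the retrieved contexts as well as the answer, and `test_rag_system` returns only the answer. One option is a variant that returns both, so retrieval runs once per question. The sketch below is an assumption, not part of the notebook: `answer_with_contexts`, `retrieve`, and `generate` are hypothetical names, and stand-in callables replace the vector store and the LLM so the example runs offline.

```python
from typing import Callable, Dict, List


def answer_with_contexts(
    question: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, str], str],
) -> Dict[str, object]:
    """Run retrieval once and return question, contexts, and answer in one record."""
    contexts = retrieve(question)
    if not contexts:
        response = "No relevant documents found."
    else:
        # Join the passages the same way the test function does.
        response = generate(question, "\n\n".join(contexts))
    return {
        "user_input": question,
        "retrieved_contexts": contexts,
        "response": response,
    }


# Stand-ins for the vector store and the LLM (illustrative only).
def fake_retrieve(question: str) -> List[str]:
    return ["Context passage A.", "Context passage B."]


def fake_generate(question: str, context: str) -> str:
    return f"Answer built from {context.count('Context passage')} passages."


record = answer_with_contexts("What is quality?", fake_retrieve, fake_generate)
print(record["response"])  # → Answer built from 2 passages.
```

In this notebook, `retrieve` could wrap `vector_store.similarity_search(question, k=5)` and return each document's `page_content`, while `generate` could wrap `llm.invoke(prompt.invoke(...)).content`.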

In [6]:
# RAGAS-style evaluation
def simple_evaluation(question: str):
    """Collect one RAGAS-style record (user_input, retrieved_contexts, response); no metrics are scored."""
    print(f"Evaluating question: {question}")
    
    # Get RAG response using our existing function
    answer = test_rag_system(question)
    
    # Repeat the same retrieval to capture the contexts, since test_rag_system returns only the answer
    retrieved_docs = vector_store.similarity_search(question, k=5)
    retrieved_contexts = [doc.page_content for doc in retrieved_docs]
    
    # Create evaluation data structure
    evaluation_data = {
        "user_input": question,
        "retrieved_contexts": retrieved_contexts,
        "response": answer
    }
    
    print("=" * 60)
    print("EVALUATION RESULTS")
    print("=" * 60)
    print(f"Question: {question}")
    print(f"Retrieved {len(retrieved_contexts)} context documents")
    print(f"Answer: {answer}")
    print("=" * 60)
    
    return evaluation_data

# Run the evaluation over all test questions
print("RUNNING RAGAS-STYLE EVALUATION")
print("=" * 80)

evaluation_results = []
for i, question in enumerate(test_questions, 1):
    print(f"\n--- EVALUATION {i}/{len(test_questions)} ---")
    result = simple_evaluation(question)
    evaluation_results.append(result)
    print()

print("\n" + "=" * 80)
print("EVALUATION COMPLETE")
print(f"Successfully evaluated {len(evaluation_results)} questions")
print("=" * 80)


RUNNING RAGAS-STYLE EVALUATION

--- EVALUATION 1/3 ---
Evaluating question: What are Deming's 14 points for management?
Testing question: What are Deming's 14 points for management?
Retrieved 5 documents
EVALUATION RESULTS
Question: What are Deming's 14 points for management?
Retrieved 5 context documents
Answer: Deming's 14 Points for Management are as follows:

1. **Create constancy of purpose** toward improvement of product and service, aiming to become competitive and to stay in business, providing jobs.
2. **Adopt the new philosophy.** Western management must awaken to the challenges of a new economic age and take on leadership for change.
3. **Cease dependence on inspection** to achieve quality. Build quality into the product from the start to eliminate the need for mass inspection.
4. **End the practice of awarding business** on the basis of price. Instead, focus on minimizing total costs and develop long-term relationships with a single supplier based on loyalty and trust.
5. *