<a href="https://colab.research.google.com/github/micah-shull/RAG-LangChain/blob/main/LC_009_RAG_CustServiceBot_Testing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Pip Install Packages

In [None]:
!pip install --upgrade --quiet \
    langchain \
    langchain-huggingface \
    langchain-openai \
    langchain-community \
    chromadb \
    python-dotenv \
    transformers \
    accelerate \
    sentencepiece

## Load Libaries

In [None]:
# üåø Environment setup
import os                                 # File paths and OS interaction
from dotenv import load_dotenv            # Load environment variables from .env file
import langchain; print(langchain.__version__)  # Check LangChain version

# üìÑ Document loading and preprocessing
from langchain_core.documents import Document                   # Base document type
from langchain_community.document_loaders import TextLoader     # Loads plain text files
from langchain.text_splitter import RecursiveCharacterTextSplitter  # Splits long docs into smaller chunks

# üî¢ Embeddings + vector storage
from langchain_huggingface import HuggingFaceEmbeddings         # HuggingFace embedding model
from langchain.vectorstores import Chroma                       # Persistent vector DB (Chroma)

# üí¨ Prompting + output
from langchain_core.prompts import ChatPromptTemplate           # Chat-style prompt templates
from langchain_core.output_parsers import StrOutputParser       # Converts model output to string

# üîó Chains / pipelines
from langchain_core.runnables import Runnable, RunnableLambda   # Compose custom pipelines

# üß† (Optional) Hugging Face LLM client setup
# from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace  # For HF inference API

# üßæ Pretty printing
import textwrap                         # Format long strings for printing
from pprint import pprint               # Nicely format nested data structures


0.3.25


## SET PARAMS

In [None]:
from langchain_openai.chat_models.base import ChatOpenAI

# SET MODEL PARAMS
EMBED_MODEL = "all-MiniLM-L6-v2"
CHUNK_SIZE = 200
CHUNK_OVERLAP = 50
K = 2

# Load token from .env.
load_dotenv("/content/API_KEYS.env", override=True)

LLM_MODEL = ChatOpenAI(
    model_name="gpt-3.5-turbo",
    temperature=0.4  # Moderate creativity; adjust as needed
)



## üßæ Document Cleaning

### üßæ 1. **Load the `.txt` files**

We‚Äôll loop through all files in the folder using `TextLoader`.

### üßπ 2. **Cleaning**

Basic cleaning (e.g. stripping newlines, extra whitespace) is often helpful **before splitting**, especially if the files came from exports or copy-paste.

### ‚úÇÔ∏è 3. **Split into chunks**

We‚Äôll use `RecursiveCharacterTextSplitter` to chunk documents (typically 500‚Äì1000 characters with slight overlap for context continuity).

---

### üßº Why Basic Cleaning Helps

* Removes linebreaks and blank lines that confuse LLMs
* Avoids splitting chunks in weird places
* Standardizes format before embedding

Later you can add more advanced cleaning (e.g., remove boilerplate, normalize headers), but this is a solid default.





In [None]:
# Path to your documents
docs_path = "/content/CFFC_docs"

# Step 1: Load all .txt files in the folder
raw_documents = []
for filename in os.listdir(docs_path):
    if filename.endswith(".txt"):
        file_path = os.path.join(docs_path, filename)
        loader = TextLoader(file_path, encoding="utf-8")
        docs = loader.load()
        raw_documents.extend(docs)

print(f"Loaded {len(raw_documents)} documents.")

# Step 2 (optional): Clean up newlines and extra whitespace
def clean_doc(doc: Document) -> Document:
    cleaned = " ".join(doc.page_content.split())  # Removes newlines & extra spaces
    return Document(page_content=cleaned, metadata=doc.metadata)

cleaned_documents = [clean_doc(doc) for doc in raw_documents]

# Step 3: Split documents into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=CHUNK_SIZE,
    chunk_overlap=CHUNK_OVERLAP
)

chunked_documents = splitter.split_documents(cleaned_documents)

print(f"Split into {len(chunked_documents)} total chunks.")

# Preview the first 5 chunks
print(f"Showing first 5 of {len(chunked_documents)} chunks:\n")

for i, doc in enumerate(chunked_documents[:5]):
    print(f"--- Chunk {i+1} ---")
    print(f"Source: {doc.metadata.get('source', 'N/A')}\n")
    print(textwrap.fill(doc.page_content[:500], width=100))  # limit preview to 500 characters
    print("\n")

Loaded 7 documents.
Split into 174 total chunks.
Showing first 5 of 174 chunks:

--- Chunk 1 ---
Source: /content/CFFC_docs/CFFC_Gainesville Economic Indicators That Matter to Local Businesses.txt

Cashflow 4Cast Gainesville Economic Indicators That Matter to Local Businesses on April 02, 2025 üìó
Gainesville Economic Indicators That Matter to Local Businesses 1. Average Weekly Earnings


--- Chunk 2 ---
Source: /content/CFFC_docs/CFFC_Gainesville Economic Indicators That Matter to Local Businesses.txt

to Local Businesses 1. Average Weekly Earnings (Gainesville) What It Is: This tracks the average
amount workers in Gainesville earn per week ‚Äî across all private sector jobs. It‚Äôs one of the
clearest


--- Chunk 3 ---
Source: /content/CFFC_docs/CFFC_Gainesville Economic Indicators That Matter to Local Businesses.txt

all private sector jobs. It‚Äôs one of the clearest measures of take-home pay and gives insight into
what people can realistically afford. Why It Matters for Gainesvil

## ‚úÖ Embed + Persist in Chroma




In [None]:
# Step 1: Set up Hugging Face embedding model
embedding_model = HuggingFaceEmbeddings(model_name=EMBED_MODEL)

# Step 2: Set up Chroma with persistence
persist_dir = "chroma_db"

vectorstore = Chroma.from_documents(
    documents=chunked_documents,
    embedding=embedding_model,
    persist_directory=persist_dir
)

print(f"‚úÖ Stored {len(chunked_documents)} chunks in Chroma at '{persist_dir}'")

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


modules.json:   0%|          | 0.00/349 [00:00<?, ?B/s]

config_sentence_transformers.json:   0%|          | 0.00/116 [00:00<?, ?B/s]

README.md:   0%|          | 0.00/10.5k [00:00<?, ?B/s]

sentence_bert_config.json:   0%|          | 0.00/53.0 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/612 [00:00<?, ?B/s]

model.safetensors:   0%|          | 0.00/90.9M [00:00<?, ?B/s]

tokenizer_config.json:   0%|          | 0.00/350 [00:00<?, ?B/s]

vocab.txt:   0%|          | 0.00/232k [00:00<?, ?B/s]

tokenizer.json:   0%|          | 0.00/466k [00:00<?, ?B/s]

special_tokens_map.json:   0%|          | 0.00/112 [00:00<?, ?B/s]

config.json:   0%|          | 0.00/190 [00:00<?, ?B/s]

‚úÖ Stored 174 chunks in Chroma at 'chroma_db'


## ‚úÖ Create the Retriever & Prompt Template

In [None]:
retriever = vectorstore.as_retriever(search_kwargs={"k": K})

# prompt template
prompt_template = ChatPromptTemplate.from_template("""
You are a helpful assistant that uses business documents to answer questions.
Use the following context to answer the question as accurately as possible.

Context:
{context}

Question:
{question}

Answer:
""")


## ‚úÖ Step 3: Create the RAG Chain & Run a Query!

In [None]:
# Define RAG chain
rag_chain = (
    RunnableLambda(lambda d: {
        "question": d["question"],
        "docs": retriever.invoke(d["question"])
    })
    | RunnableLambda(lambda d: {
        "context": "\n\n".join([doc.page_content for doc in d["docs"]]),
        "question": d["question"]
    })
    | prompt_template
    | LLM_MODEL
    | StrOutputParser()
)

# Invoke RAG
response = rag_chain.invoke({
    "question": "What are the recent economic indicators in Gainesville that affect local businesses?"
})

# Print response nicely
import textwrap
print("\n" + textwrap.fill(response, width=100))



The recent economic indicators in Gainesville that affect local businesses are the local job market,
Gainesville's unemployment rate, and the overall pressure on the community. These indicators show a
shift in how people feel about the economy, which can have a ripple effect on local businesses even
if nothing internal has changed.


Before writing a refined prompt, let‚Äôs first define **guardrails** ‚Äî these are behavioral, stylistic, and content-specific constraints that keep your RAG chatbot **on-brand, factual, and helpful**.

Because the goal is to deliver **accurate customer service responses** (about pricing, business benefits, product offering), here‚Äôs what we should lock down.

---

## ‚úÖ Guardrails for RAG Assistant

### 1. üß† **Truthfulness / Grounding**

* ‚úÖ "Only answer using the provided context."
* ‚úÖ "If the answer is not found, say so clearly."
* ‚ùå No hallucinating product features, prices, or claims

### 2. üí¨ **Style / Tone**

* ‚úÖ "Keep a professional, informative tone."
* ‚úÖ "Write in full sentences, avoid excessive marketing fluff."
* ‚ùå Don‚Äôt use overly casual or overly technical language unless requested

### 3. üì¶ **Structure**

* ‚úÖ ‚ÄúUse concise bullet points when listing benefits or pricing tiers.‚Äù
* ‚úÖ ‚ÄúHighlight product value clearly before mentioning features.‚Äù

### 4. üìç **Relevance / Locality**

* ‚úÖ ‚ÄúRelate responses to business owners, startups, or small teams when appropriate.‚Äù
* ‚úÖ ‚ÄúAvoid generic economic commentary unless specifically relevant to the product.‚Äù

### 5. üö´ **Factual Safety Net**

* ‚úÖ ‚ÄúDo not speculate or infer anything not stated in the documentation.‚Äù
* ‚úÖ ‚ÄúAvoid offering recommendations that go beyond the scope of what the product or service supports.‚Äù





#TESTING

## üßæ Prompt Template: Initial Version

This prompt defines the behavior and constraints for the Cashflow4cast RAG assistant, ensuring that responses are helpful, accurate, and grounded in business documentation.

### üéØ Objective

Instruct the model to act as a **customer support assistant** for **Cashflow4cast**, a platform that helps businesses forecast cash flow using economic indicators.

### üß† Prompt Behavior

The assistant is expected to:

* **Answer based only on retrieved context documents**
* **Maintain a concise and professional tone**
* **Avoid making assumptions or hallucinations**

### üß© Key Instructions in the Prompt

* If the answer isn't found in the context, return:
  `"I'm sorry, I don't have that information based on the current documentation."`
* Structure responses clearly (e.g., use bullet points for lists like pricing or features)
* Focus only on what's **relevant to the business context**
* Exclude any **unsupported claims, personal opinions, or unrelated features**
* Do not fabricate pricing or services ‚Äî only include what's explicitly in the documents




In [None]:
test_questions = [
    "What does Cashflow4cast actually do?",
    "How does the service work?",
    "What does it cost?",
    "Can I use it for my small business?",
    "What makes Cashflow4cast different from other tools?"
]

prompt_template = ChatPromptTemplate.from_template("""
You are a customer support assistant for Cashflow4cast, a platform that helps business owners forecast cash flow using economic indicators.

Answer the user's question using only the information provided in the business documents below.

Instructions:
- If the answer is not in the provided context, say: "I'm sorry, I don't have that information based on the current documentation."
- Be concise, informative, and professional in tone.
- Structure answers clearly: use bullet points if listing pricing or benefits.
- Focus responses on business relevance. Do not include personal opinions or make assumptions.
- Do not mention features, services, or pricing unless explicitly included in the context.


Context:
{context}

Question:
{question}

Answer:
""")

# Define test questions
test_questions = [
    "What does Cashflow4cast actually do?",
    "How does the service work?",
    "What does it cost?",
    "Can I use it for my small business?",
    "What makes Cashflow4cast different from other tools?"
]

# Define test runner
def run_prompt_tests(questions, rag_chain, wrap_width=100):
    for i, question in enumerate(questions, 1):
        print(f"\nüîπ Test {i}: {question}\n{'-'*wrap_width}")
        response = rag_chain.invoke({"question": question})
        print(textwrap.fill(response, width=wrap_width))
        print("\n" + "="*wrap_width)

# Run test questions
run_prompt_tests(test_questions, rag_chain)




üîπ Test 1: What does Cashflow4cast actually do?
----------------------------------------------------------------------------------------------------
CashFlow4Cast uses advanced machine learning to help businesses cut forecasting errors in half,


üîπ Test 2: How does the service work?
----------------------------------------------------------------------------------------------------
The service works by providing businesses with a strong proxy for consumer demand and small business
health through data collected from retail employment numbers. This data can help businesses manage
their cash flow by giving them reliable numbers to make informed decisions regarding payroll,
inventory restocking, and preparing for seasonal fluctuations.


üîπ Test 3: What does it cost?
----------------------------------------------------------------------------------------------------
The cost for the Local Professional Services business type ranges from $15,000 to $40,000.


üîπ Test 4: Can I use i



### üîé Review & Observations

#### **Test 1: What does Cashflow4cast actually do?**

‚úÖ **Strengths**: Focuses on machine learning and forecasting improvements
‚ö†Ô∏è **Improvement**: Could include that the forecasts are powered by both *company data and economic indicators* for a more complete picture

---

#### **Test 2: How does the service work?**

‚ö†Ô∏è **Issue**: The mention of *retail employment numbers* feels too specific and might come from a misaligned chunk of context
‚úÖ **Strengths**: It does stick to the broader idea of helping manage payroll, inventory, etc.

üí° **Suggestion**: Consider verifying that your context chunks don‚Äôt intermix examples from broader economic reports if those aren‚Äôt meant to represent the core product functionality.

---

#### **Test 3: What does it cost?**

‚ùå **Issue**: The answer about ‚ÄúLocal Professional Services business type ranges from \$15,000 to \$40,000‚Äù is outdated or from a different document.

‚úÖ **Fix already in progress**: You already addressed this by moving clean pricing info to the top of the pricing doc and adjusting chunk size. If you're still getting this, double-check if old chunks are being re-used from a previous embedding run (you may need to clear `persist_directory` before re-ingesting).

---

#### **Test 4: Can I use it for my small business?**

‚úÖ **Strengths**: The answer is clear and affirmative
‚ö†Ô∏è **Tone**: Slightly generic (‚Äúuse the information in the context‚Äù) ‚Äî could sound more supportive and personalized

---

#### **Test 5: What makes Cashflow4cast different?**

‚ö†Ô∏è **Issue**: Reference to MAE and MAPE is too technical and doesn‚Äôt match your intended audience
‚úÖ **Action already taken**: You‚Äôve edited those out of the docs ‚Äî once re-ingested, that language should drop off

---

### ‚úÖ Recommendations Summary

| Area                 | Recommendation                                                                                                                               |
| -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------- |
| **Context Content**  | Re-ingest documents with improved, simplified wording ‚Äî especially pricing and metrics                                                       |
| **Chunk Validation** | Confirm you're loading only the updated `.txt` files and not leftover data                                                                   |
| **Prompt Tuning**    | Consider replacing strict tone constraints with: ‚Äúfriendly, helpful, and concise‚Äù                                                            |
| **Reasoning Chain**  | Consider adding a reasoning cue to the prompt, e.g., *‚ÄúTake a moment to think through the question and look for the answer in the context.‚Äù* |



### üîß **Prompt Template Summary (Second Iteration)**

**Purpose**:
This prompt is designed to power a **customer support assistant** for Cashflow4cast, a financial forecasting platform for business owners.

**Tone & Behavior Goals**:

* **Professional** and **informative**
* **Concise** in response length
* **Context-grounded only** ‚Äî no hallucinations or assumptions
* Avoids unnecessary speculation or off-topic commentary

**Instructions to the Assistant**:

* Only use information found in the provided business documents
* If the answer is missing, clearly state:
  *"I'm sorry, I don't have that information based on the current documentation."*
* Use **bullet points** for listing pricing, benefits, or features
* Do **not** fabricate or infer additional services, prices, or claims



In [None]:
# SET MODEL PARAMS
EMBED_MODEL = "all-MiniLM-L6-v2"
CHUNK_SIZE = 500
CHUNK_OVERLAP = 100
K = 4

# Path to your documents
docs_path = "/content/CFFC_docs"

# Step 1: Load all .txt files in the folder
raw_documents = []
for filename in os.listdir(docs_path):
    if filename.endswith(".txt"):
        file_path = os.path.join(docs_path, filename)
        loader = TextLoader(file_path, encoding="utf-8")
        docs = loader.load()
        raw_documents.extend(docs)

print(f"Loaded {len(raw_documents)} documents.")

# Step 2 (optional): Clean up newlines and extra whitespace
def clean_doc(doc: Document) -> Document:
    cleaned = " ".join(doc.page_content.split())  # Removes newlines & extra spaces
    return Document(page_content=cleaned, metadata=doc.metadata)

cleaned_documents = [clean_doc(doc) for doc in raw_documents]

# Step 3: Split documents into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=CHUNK_SIZE,
    chunk_overlap=CHUNK_OVERLAP
)

chunked_documents = splitter.split_documents(cleaned_documents)

# Step 1: Set up Hugging Face embedding model
embedding_model = HuggingFaceEmbeddings(model_name=EMBED_MODEL)

# Step 2: Set up Chroma with persistence
persist_dir = "/content/chroma_db"  # In Colab, always use full paths

vectorstore = Chroma.from_documents(
    documents=chunked_documents,
    embedding=embedding_model,
    persist_directory=persist_dir
)

# define the retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": K})

# create prompt template
prompt_template = ChatPromptTemplate.from_template("""
You are a customer support assistant for Cashflow4cast, a platform that helps business owners forecast cash flow using economic indicators.

Answer the user's question using only the information provided in the business documents below.

Instructions:
- If the answer is not in the provided context, say: "I'm sorry, I don't have that information based on the current documentation."
- Be concise, informative, and professional in tone.
- Structure answers clearly: use bullet points if listing pricing or benefits.
- Focus responses on business relevance. Do not include personal opinions or make assumptions.
- Do not mention features, services, or pricing unless explicitly included in the context.


Context:
{context}

Question:
{question}

Answer:
""")


print(f"‚úÖ Stored {len(chunked_documents)} chunks in Chroma at '{persist_dir}'")

# Define RAG chain
rag_chain = (
    RunnableLambda(lambda d: {
        "question": d["question"],
        "docs": retriever.invoke(d["question"])
    })
    | RunnableLambda(lambda d: {
        "context": "\n\n".join([doc.page_content for doc in d["docs"]]),
        "question": d["question"]
    })
    | prompt_template
    | LLM_MODEL
    | StrOutputParser()
)

# Invoke RAG
response = rag_chain.invoke({
    "question": "What are the recent economic indicators in Gainesville that affect local businesses?"
})

# Print response nicely
import textwrap
print("\n" + textwrap.fill(response, width=100))



# Define test questions
test_questions = [
    "What does Cashflow4cast actually do?",
    "How does the service work?",
    "What does it cost?",
    "Can I use it for my small business?",
    "What makes Cashflow4cast different from other tools?"
]

# Define test runner
def run_prompt_tests(questions, rag_chain, wrap_width=100):
    for i, question in enumerate(questions, 1):
        print(f"\nüîπ Test {i}: {question}\n{'-'*wrap_width}")
        response = rag_chain.invoke({"question": question})
        print(textwrap.fill(response, width=wrap_width))
        print("\n" + "="*wrap_width)

# Run test questions
run_prompt_tests(test_questions, rag_chain)

Loaded 7 documents.
‚úÖ Stored 68 chunks in Chroma at '/content/chroma_db'

- Rising unemployment - Fewer retail jobs - Shrinking weekly paychecks

üîπ Test 1: What does Cashflow4cast actually do?
----------------------------------------------------------------------------------------------------
- Cashflow4cast evaluates forecast performance using three key metrics: MAE, MAPE, and RMSE -
Cashflow4cast helps business owners manage cash flow by providing reliable forecasting tools


üîπ Test 2: How does the service work?
----------------------------------------------------------------------------------------------------
- Cashflow4cast uses a local machine learning model to forecast cash flow for different business
types - The model calculates revenue ranges and error percentages for each business type - Small
gains in forecasting accuracy can lead to real savings by enabling better ordering and smarter
staffing decisions


üîπ Test 3: What does it cost?
-----------------------------



### üîç **Evaluation of Prompt Output (Second Iteration)**

#### üîπ **Test 1: What does Cashflow4cast actually do?**

**‚úÖ Strengths**:

* Points out use of forecasting metrics and focus on helping business owners.

**‚ö†Ô∏è Opportunities**:

* **Too technical** ‚Äî MAE, MAPE, RMSE are not appropriate for a general customer support audience.
* The second bullet is generic; it could say more about *how* it helps.

**Suggested Fix**: Remove metric jargon from your source documents if the audience is non-technical. Emphasize outcomes (e.g., reduce forecasting error, improve cash flow clarity).

---

#### üîπ **Test 2: How does the service work?**

**‚úÖ Strengths**:

* Clear and process-oriented explanation.
* Mentions use of ML and practical benefits like staffing/order optimization.

**‚ö†Ô∏è Opportunities**:

* Could use slightly plainer language ("local machine learning model" might confuse some users).

**Suggestion**: Consider phrasing like *‚Äúa custom forecasting tool that uses economic data and your business history‚Äù*.

---

#### üîπ **Test 3: What does it cost?**

**‚úÖ Strengths**:

* Clear, concise, structured using bullet points.
* Includes all three plans and pricing.

**‚ö†Ô∏è Opportunities**:

* Minor: could briefly note differences between plans (e.g., who they‚Äôre best for).

**Optional**: Add ‚Äústarting at‚Äù language to improve flexibility.

---

#### üîπ **Test 4: Can I use it for my small business?**

**‚úÖ Strengths**:

* Direct ‚ÄúYes‚Äù answer, followed by examples of supported business types.

**‚ö†Ô∏è Opportunities**:

* Slightly robotic tone (‚Äúour Local Machine Learning Model‚Äù).

**Suggestion**: Rephrase to sound more conversational. E.g.,

> "Absolutely! Cashflow4cast supports small businesses like retail shops, caf√©s, and professional services."

---

#### üîπ **Test 5: What makes Cashflow4cast different from other tools?**

**‚úÖ Strengths**:

* Focuses on error reduction and advanced capabilities.

**‚ö†Ô∏è Opportunities**:

* Again, reliance on metrics (MAE, MAPE, RMSE) may confuse users.

**Suggestion**: Emphasize outcomes, not technical benchmarks.

---

### üß† Overall Assessment

| Aspect        | Result                                                                        |
| ------------- | ----------------------------------------------------------------------------- |
| **Clarity**   | ‚úÖ Mostly clear, but some technical terms should be removed for this audience. |
| **Structure** | ‚úÖ Bullet points are working well.                                             |
| **Tone**      | ‚ö†Ô∏è Slightly too formal/robotic in some responses. Can be friendlier.          |
| **Grounding** | ‚úÖ On-topic and faithful to context.                                           |

---

### ‚úÖ Actionable Fixes

1. **Remove or downplay technical jargon** (e.g., MAE, RMSE).
2. **Soften tone slightly** in prompt (e.g., ‚Äúfriendly and clear‚Äù).
3. **Optional**: Enhance pricing and feature responses with micro-descriptions for each tier.
4. **If needed**: Add example-based answers ("For example, a caf√© might use this to...").




### üîç Response Review

## Modified Pricing Doc to Push Prices to the Top of Doc

#### üîπ **Test 1: What does Cashflow4cast actually do?**

**‚úÖ Pros**:

* Clear goal: reducing forecasting error.
* Emphasizes reliability in managing cash flow operations.

**‚ö†Ô∏è Issues**:

* **Still mentions MAE, MAPE, RMSE** ‚Äî which conflicts with your earlier concern that the average user doesn't know these metrics.
* Slightly robotic bullet format for what could be a more fluid answer.

**üí° Suggestion**: Remove metric jargon from source doc and prompt the model to speak in outcomes:

> ‚ÄúCashflow4cast helps you forecast cash flow more accurately‚Äîgiving you clearer insight to manage payroll, restock inventory, and prep for slow seasons.‚Äù

---

#### üîπ **Test 2: How does the service work?**

**‚úÖ Pros**:

* Covers the mechanics well: ML + economic indicators + business data.
* Explains business impact: ordering, staffing, savings.

**‚ö†Ô∏è Issues**:

* The phrase "Local Machine Learning Model" is not user-friendly and may be unclear.
* Slightly repetitive of Test 1's answer in tone and structure.

**üí° Suggestion**: Encourage the model to phrase it more naturally:

> ‚ÄúCashflow4cast analyzes both your business data and local economic signals to forecast revenue. This helps you make smarter decisions on staffing and inventory each week.‚Äù

---

#### üîπ **Test 3: What does it cost?**

**‚úÖ Pros**:

* Accurate.
* Clearly structured.

**‚ö†Ô∏è Issues**:

* Could include who each plan is best for (if that's in the doc).

**üí° Optional Enhancement**:

> ‚ÄúChoose the plan that fits your needs: Basic (\$199/mo), Advanced (\$499/mo), or Premium (\$999/mo) for real-time insights and expert consulting.‚Äù

---

#### üîπ **Test 4: Can I use it for my small business?**

**‚úÖ Pros**:

* Direct "Yes" answer.
* Provides relevant business types as examples.

**‚ö†Ô∏è Issues**:

* The phrase ‚ÄúLocal Machine Learning Model‚Äù again feels opaque.
* Ends with a slightly technical tone (‚Äúlow and high scenarios with corresponding error percentages‚Äù).

**üí° Suggestion**:

> ‚ÄúYes! Cashflow4cast works well for small businesses like caf√©s, shops, and service providers‚Äîhelping you forecast sales more confidently.‚Äù

---

#### üîπ **Test 5: What makes Cashflow4cast different from other tools?**

**‚úÖ Pros**:

* Emphasizes differentiator (50% error reduction).
* Speaks to accuracy and reliability.

**‚ö†Ô∏è Issues**:

* Repeats MAE/MAPE/RMSE ‚Äî again too technical.
* Repeats earlier language too closely.

**üí° Suggestion**:

> ‚ÄúUnlike Excel or QuickBooks, Cashflow4cast blends your data with live economic trends, helping you predict cash flow more accurately and avoid surprises.‚Äù

---

### ‚úÖ Summary Assessment

| Area               | Score                             | Notes                                             |
| ------------------ | --------------------------------- | ------------------------------------------------- |
| **Accuracy**       | ‚úÖ Very good                       | Pulls relevant content cleanly.                   |
| **Tone**           | ‚ö†Ô∏è Neutral, slightly robotic      | Could benefit from softer, more natural phrasing. |
| **User Clarity**   | ‚ö†Ô∏è Technical terms still present  | MAE/MAPE/Local ML model could confuse users.      |
| **Consistency**    | ‚ö†Ô∏è Some repetition across answers | Vary structure/language slightly.                 |
| **Pricing Answer** | ‚úÖ Accurate and clear              | Could optionally be enriched.                     |

---

### üìå Final Tips

1. **Simplify source text**: You‚Äôve already started doing this with pricing‚Äînow do the same for ML and metrics.
2. **Use friendly phrasing in your prompt**:

   * Replace "refer to the company documentation" with ‚Äúuse the info below to help the customer.‚Äù
   * Add tone instructions like: ‚ÄúSound natural and helpful, like a real support rep.‚Äù



In [None]:
# create prompt template
prompt_template = ChatPromptTemplate.from_template("""
You are a helpful and friendly customer support assistant for Cashflow4cast.

Cashflow4cast: Forecasting You Can Trust in Uncertain Times

We leverage advanced machine learning and a combination of your business data and economic indicators \
including federal, state, and local metrics ‚Äî to forecast future sales and cash flow more accurately than Excel, \
QuickBooks, or other off-the-shelf forecasting tools. Our ML models help mid-sized businesses stay ahead \
of sales volatility and protect their bottom line by cutting forecast errors in half.

Answer the user's question using only the information provided in the business documents below.

Instructions:
- Read the question, refer to the company documentation, and provide a thoughtful, helpful, and informative response.
- Keep your answer under five sentences.
- Avoid repeating the same phrase across multiple answers unless it's directly relevant to the question.
- If the answer is not found in the provided context, say: "I'm sorry, I don't have that information based on the current documentation."
- Do not include personal opinions or make assumptions.

Context:
{context}

Question:
{question}

Answer:
""")


# Define test questions
test_questions = [
    "What does Cashflow4cast actually do?",
    "How does the service work?",
    "What does it cost?",
    "Can I use it for my small business?",
    "What makes Cashflow4cast different from other tools?"
]

# Manual eval stub
def score_response(response, expected_keywords, verbose=True):
    score = sum(1 for kw in expected_keywords if kw.lower() in response.lower())
    if verbose:
        print(f"‚úÖ Matched {score}/{len(expected_keywords)} keywords: {expected_keywords}")
    return score


# Define test runner
def run_prompt_tests(questions, rag_chain, wrap_width=100):
    for i, question in enumerate(questions, 1):
        print(f"\nüîπ Test {i}: {question}\n{'-'*wrap_width}")
        response = rag_chain.invoke({"question": question})
        print(textwrap.fill(response, width=wrap_width))
        print("\n" + "="*wrap_width)

# Run test questions
run_prompt_tests(test_questions, rag_chain)



üîπ Test 1: What does Cashflow4cast actually do?
----------------------------------------------------------------------------------------------------
- Cashflow4cast evaluates forecast performance using three key metrics: MAE, MAPE, and RMSE -
Cashflow4cast helps business owners cut cash flow forecasting errors by 50% - Cashflow4cast provides
reliable numbers for managing cash flow, covering payroll, restocking inventory, and preparing for
seasonal dips


üîπ Test 2: How does the service work?
----------------------------------------------------------------------------------------------------
- Cashflow4cast uses a Local Machine Learning Model to forecast cash flow based on economic
indicators - The model provides revenue ranges and error percentages for different business types -
Small gains in forecasting accuracy can lead to real savings every month - Benefits include better
ordering and smarter staffing decisions


üîπ Test 3: What does it cost?
--------------------------------



### üîß **Technical Adjustments**

* **Chunk size reduced to 200**
  ‚Üí Likely made responses more specific by narrowing each retrieval unit.
* **Overlap set to 50**
  ‚Üí Maintains cohesion between chunks, useful for longer thoughts that span breaks.
* **K set to 2 (top 2 chunks)**
  ‚Üí Forces the model to be more selective, reducing clutter or contradictions in context.

#### üìù **Prompt Changes**

* **Added a warmer, branded introduction**
  ‚Üí Personalizes the tone (‚ÄúForecasting You Can Trust in Uncertain Times‚Äù).
* **Softened tone instructions**
  ‚Üí ‚ÄúThoughtful, helpful, informative‚Äù now guides tone, vs. rigid formality.
* **Added constraints on verbosity and phrase reuse**
  ‚Üí Keeps responses short and less robotic.
* **Maintained fallback behavior**
  ‚Üí Still says ‚ÄúI don‚Äôt know‚Äù when context is missing (good guardrail).

---

### üìä Review of the Outputs

| Test                              | Strengths                                                       | Suggested Improvements                                                     |
| --------------------------------- | --------------------------------------------------------------- | -------------------------------------------------------------------------- |
| **1**<br>What does it do?         | ‚úîÔ∏è On-message<br>‚úîÔ∏è Clear benefits                              | Could vary language to reduce repeated phrasing with Test 2.               |
| **2**<br>How does it work?        | ‚úîÔ∏è Mentions key inputs and mechanics<br>‚úîÔ∏è Describes outcomes   | First sentence is almost identical to Test 1 ‚Äî maybe add a little variety. |
| **3**<br>What does it cost?       | ‚úîÔ∏è Correct info<br>‚úîÔ∏è Nicely condensed                          | None ‚Äî very solid.                                                         |
| **4**<br>Small business?          | ‚úîÔ∏è Answers directly<br>‚úîÔ∏è Offers specific examples              | Could emphasize *why* it‚Äôs good for small businesses beyond type matching. |
| **5**<br>What makes it different? | ‚úîÔ∏è Strong differentiator focus<br>‚úîÔ∏è Less technical than before | Consider using a metaphor or casual phrasing to feel more personal.        |

---

### üéØ Overall Feedback

* ‚úÖ **Much improved tone**: Clear, warm, and human without being verbose.
* ‚úÖ **Pricing now accurate and easy to read**.
* ‚úÖ **No more confusing jargon (MAE/MAPE/etc.)** ‚Äî huge win.
* ‚ö†Ô∏è **Repetition** between Test 1 and 2 is your biggest lingering issue.
* üß† **Persona alignment is excellent**: The assistant now genuinely feels like a branded support rep.

---

### ‚úÖ Suggested Small Next Steps

1. **Diversify phrasing**: Add to your prompt something like:

   > ‚ÄúTry to phrase each answer naturally and avoid repeating entire sentences unless necessary.‚Äù

2. **Highlight ‚Äúwhy it matters‚Äù** in small biz or competitive advantage answers ‚Äî make it feel more ‚Äúhuman to human.‚Äù

3. **Optional**: Tag each test question with expected intent (e.g., *feature discovery*, *sales objection*, *trust validation*) for future prompt tuning or A/B testing.



In [None]:
# SET MODEL PARAMS
EMBED_MODEL = "all-MiniLM-L6-v2"
CHUNK_SIZE = 200
CHUNK_OVERLAP = 50
K = 2

# Path to your documents
docs_path = "/content/CFFC_docs"

# Step 1: Load all .txt files in the folder
raw_documents = []
for filename in os.listdir(docs_path):
    if filename.endswith(".txt"):
        file_path = os.path.join(docs_path, filename)
        loader = TextLoader(file_path, encoding="utf-8")
        docs = loader.load()
        raw_documents.extend(docs)

print(f"Loaded {len(raw_documents)} documents.")

# Step 2 (optional): Clean up newlines and extra whitespace
def clean_doc(doc: Document) -> Document:
    cleaned = " ".join(doc.page_content.split())  # Removes newlines & extra spaces
    return Document(page_content=cleaned, metadata=doc.metadata)

cleaned_documents = [clean_doc(doc) for doc in raw_documents]

# Step 3: Split documents into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=CHUNK_SIZE,
    chunk_overlap=CHUNK_OVERLAP
)

chunked_documents = splitter.split_documents(cleaned_documents)

# print(f"Split into {len(chunked_documents)} total chunks.")

# # Preview the first 5 chunks
# print(f"Showing first 5 of {len(chunked_documents)} chunks:\n")

# for i, doc in enumerate(chunked_documents[:5]):
#     print(f"--- Chunk {i+1} ---")
#     print(f"Source: {doc.metadata.get('source', 'N/A')}\n")
#     print(textwrap.fill(doc.page_content[:500], width=100))  # limit preview to 500 characters
#     print("\n")

# Step 1: Set up Hugging Face embedding model
embedding_model = HuggingFaceEmbeddings(model_name=EMBED_MODEL)

# Step 2: Set up Chroma with persistence
persist_dir = "/content/chroma_db"  # In Colab, always use full paths

vectorstore = Chroma.from_documents(
    documents=chunked_documents,
    embedding=embedding_model,
    persist_directory=persist_dir
)

# define the retriever
retriever = vectorstore.as_retriever(search_kwargs={"k": K})

# create prompt template
prompt_template = ChatPromptTemplate.from_template("""
You are a helpful and friendly customer support assistant for Cashflow4cast.

Cashflow4cast: Forecasting You Can Trust in Uncertain Times

We leverage advanced machine learning and a combination of your business data and economic indicators ‚Äî including federal, state, and local metrics ‚Äî to forecast future sales and cash flow more accurately than Excel, QuickBooks, or other off-the-shelf forecasting tools.

Our platform helps mid-sized businesses stay ahead of sales volatility and protect their bottom line by cutting forecast errors in half.

Answer the user's question using only the information provided in the business documents below.

Instructions:
- Read the question, refer to the company documentation, and provide a thoughtful, helpful, and informative response.
- Keep your answer under five sentences.
- Avoid repeating the same phrase across multiple answers unless it's directly relevant to the question.
- If the answer is not found in the provided context, say: "I'm sorry, I don't have that information based on the current documentation."
- Do not include personal opinions or make assumptions.

Context:
{context}

Question:
{question}

Answer:
""")

print(f"‚úÖ Stored {len(chunked_documents)} chunks in Chroma at '{persist_dir}'")

# Define RAG chain
rag_chain = (
    RunnableLambda(lambda d: {
        "question": d["question"],
        "docs": retriever.invoke(d["question"])
    })
    | RunnableLambda(lambda d: {
        "context": "\n\n".join([doc.page_content for doc in d["docs"]]),
        "question": d["question"]
    })
    | prompt_template
    | LLM_MODEL
    | StrOutputParser()
)

# Define test questions
test_questions = [
    "What does Cashflow4cast actually do?",
    "How does the service work?",
    "What does it cost?",
    "Can I use it for my small business?",
    "What makes Cashflow4cast different from other tools?"
]

# Define test runner
def run_prompt_tests(questions, rag_chain, wrap_width=100):
    for i, question in enumerate(questions, 1):
        print(f"\nüîπ Test {i}: {question}\n{'-'*wrap_width}")
        response = rag_chain.invoke({"question": question})
        print(textwrap.fill(response, width=wrap_width))
        print("\n" + "="*wrap_width)

# Run test questions
run_prompt_tests(test_questions, rag_chain)


üîπ Test 1: What does Cashflow4cast actually do?
----------------------------------------------------------------------------------------------------
Cashflow4cast leverages advanced machine learning and a combination of business data and economic
indicators to forecast future sales and cash flow more accurately than traditional tools like Excel
or QuickBooks. This helps mid-sized businesses stay ahead of sales volatility and reduce forecast
errors by half, ultimately protecting their bottom line.


üîπ Test 2: How does the service work?
----------------------------------------------------------------------------------------------------
Cashflow4cast leverages advanced machine learning and a combination of your business data and
economic indicators to forecast future sales and cash flow more accurately than traditional tools
like Excel or QuickBooks. By analyzing federal, state, and local metrics, our platform helps mid-
sized businesses anticipate sales volatility and reduce foreca



## üìù Summary of Prompt Changes

**Changed:**

* Shifted tone to *‚Äúfriendly and knowledgeable‚Äù* rather than just ‚Äúcustomer support assistant‚Äù
* Added a short branded summary:

  > ‚ÄúCashflow4cast helps business owners forecast sales and cash flow more accurately‚Ä¶‚Äù
* Replaced rigid instructions with natural, human ones:

  * ‚ÄúConsider the question carefully‚Äù
  * ‚ÄúReview the documentation‚Äù
  * ‚ÄúCompose a clear and helpful response‚Äù
* Emphasized:

  * Tailoring to the question
  * Avoiding unnecessary repetition
  * Using short paragraphs or bullet points
  * Clear fallback when context is missing

**Effect:**
‚úÖ More natural and approachable tone
‚úÖ Gentler scaffolding that still provides structure
‚úÖ Encourages thoughtful, context-aware answers

---

## üîç Output Review

| Test                     | ‚úÖ Strengths                                                | ‚ö†Ô∏è Suggestions                                  |
| ------------------------ | ---------------------------------------------------------- | ----------------------------------------------- |
| **1** What does it do?   | ‚úÖ Clear and accurate<br>‚úÖ Avoids jargon                    | üîÅ Still repeats language from other answers    |
| **2** How does it work?  | ‚úÖ Explains with layered info<br>‚úÖ Covers process           | üîÅ Opening is a near match to Test 1            |
| **3** Pricing            | ‚úÖ Clear breakdown by tier                                  | üí¨ Consider using bullet points here too        |
| **4** Small business use | ‚úÖ Direct, friendly, business-relevant<br>‚úÖ Great tone      | üîÅ Same ‚ÄúExcel or QuickBooks‚Äù phrase as earlier |
| **5** Differentiator     | ‚úÖ Well-framed value prop<br>‚úÖ Simple, grounded explanation | ‚úÖ Strongest answer ‚Äî no changes needed          |

---

## üß† Overall Impression

You‚Äôre now in the **‚Äúpolishing‚Äù phase** ‚Äî the structure and information delivery are excellent. The tone has improved, the fallback logic is intact, and the assistant sounds **clear, competent, and friendly**.

---

## ‚úÖ Suggested Minor Enhancements

1. **Add this to your prompt to reduce repetition**:

   > Avoid repeating full phrases from previous answers unless the exact phrasing is essential for clarity.

2. **Swap "Excel or QuickBooks" references** in a few responses:

   * Try phrases like ‚Äúmanual spreadsheets,‚Äù ‚Äúaccounting software,‚Äù or ‚Äústandard forecasting tools.‚Äù

3. **Make pricing answer more skimmable**:

   * Use bullet points:

     ```
     - Basic: $199/mo
     - Advanced: $499/mo
     - Premium: Custom pricing for real-time needs
     ```



In [None]:
# create prompt template
prompt_template = ChatPromptTemplate.from_template("""
You are a friendly and knowledgeable customer support assistant for Cashflow4cast.

Cashflow4cast helps business owners forecast sales and cash flow more accurately using machine learning and real-world economic indicators. \
Our goal is to reduce forecast errors by half, helping companies stay ahead of volatility and protect their bottom line.

Answer the user's question using the information provided in the business documents below.

Instructions:
- Consider the question carefully.
- Review the provided documentation for the most relevant information.
- Then compose a clear and helpful response based only on the context.

- Be clear, kind, and helpful.
- Tailor your answer to the specific question ‚Äî don‚Äôt repeat generic company information unless it's directly relevant.
- Use short paragraphs and bullet points when helpful.
- If the answer isn't in the provided context, say: "I'm sorry, I don't have that information based on the current documentation."

Context:
{context}

Question:
{question}

Answer:
""")


# Define test questions
test_questions = [
    "What does Cashflow4cast actually do?",
    "How does the service work?",
    "What does it cost?",
    "Can I use it for my small business?",
    "What makes Cashflow4cast different from other tools?"
]

# Manual eval stub
def score_response(response, expected_keywords, verbose=True):
    score = sum(1 for kw in expected_keywords if kw.lower() in response.lower())
    if verbose:
        print(f"‚úÖ Matched {score}/{len(expected_keywords)} keywords: {expected_keywords}")
    return score


# Define test runner
def run_prompt_tests(questions, rag_chain, wrap_width=100):
    for i, question in enumerate(questions, 1):
        print(f"\nüîπ Test {i}: {question}\n{'-'*wrap_width}")
        response = rag_chain.invoke({"question": question})
        print(textwrap.fill(response, width=wrap_width))
        print("\n" + "="*wrap_width)

# Run test questions
run_prompt_tests(test_questions, rag_chain)



üîπ Test 1: What does Cashflow4cast actually do?
----------------------------------------------------------------------------------------------------
Cashflow4cast leverages advanced machine learning and a combination of business data and economic
indicators to forecast future sales and cash flow more accurately than traditional tools like Excel
or QuickBooks. Their platform helps mid-sized businesses stay ahead of sales volatility and reduce
forecast errors by half.


üîπ Test 2: How does the service work?
----------------------------------------------------------------------------------------------------
Cashflow4cast leverages advanced machine learning and a combination of your business data and
economic indicators to forecast future sales and cash flow more accurately than traditional tools
like Excel or QuickBooks. By analyzing federal, state, and local metrics, our platform helps mid-
sized businesses predict sales volatility and protect their bottom line by reducing forecast 



### üîß What Changed in the Prompt

**Updated Intro:**

* Replaced branding block with a clear mission-focused paragraph:

  > ‚ÄúCashflow4cast leverages advanced machine learning algorithms that utilize your business data‚Ä¶‚Äù

**Refined Instructions:**

* Simplified the assistant‚Äôs role:
  *‚ÄúYou are a helpful and friendly customer support assistant...‚Äù*
* Emphasized a three-step reasoning approach:

  * Consider ‚Üí Review ‚Üí Compose
* Introduced new constraint:
  *‚ÄúKeep your answer under four sentences.‚Äù*

**Tone Goals:**

* More approachable and conversational
* Less instructional overhead
* Still rooted in clarity and professionalism

---

## ‚úÖ Output Review

| Test                   | ‚úÖ Strengths                                                              | ‚ö†Ô∏è Suggestions                                                  |
| ---------------------- | ------------------------------------------------------------------------ | --------------------------------------------------------------- |
| **1** What does it do? | ‚úÖ Tight summary<br>‚úÖ Clear differentiation from generic tools            | üîÅ Repeats same ‚Äúadvanced machine learning‚Ä¶‚Äù phrasing as others |
| **2** How it works     | ‚úÖ Logical flow from tech ‚Üí benefit<br>‚úÖ Strong value claim               | üîÅ Shares \~80% word-for-word with Test 1                       |
| **3** Pricing          | ‚úÖ Accurate pricing info<br>‚úÖ Clear, brief, no fluff                      | üìå Could use bullet points for even faster scanning             |
| **4** Small business   | ‚úÖ Excellent relevance & tone<br>‚úÖ Solid example types listed             | üîÅ ‚ÄúLeverages machine learning‚Ä¶‚Äù repeated again                 |
| **5** Differentiation  | ‚úÖ Strong contrast with ‚Äúoff-the-shelf tools‚Äù<br>‚úÖ Hits unique value prop | üîÅ Ends with a generic comparison (can soften repetition)       |

---

## üîç Patterns + Recommendations

### üü° Minor Repetition

* **The phrase** ‚Äúleverages advanced machine learning and a combination of business data and economic indicators‚Ä¶‚Äù appears in almost every answer.
* üõ† *Fix:* Add to the prompt:

  > ‚ÄúAvoid repeating the same phrase across multiple answers unless it's directly relevant.‚Äù

### üìå Structure Opportunities

* Use **bullet points** for pricing (Test 3) and differentiation (Test 5) to improve skimmability.

### üí° Enhancement Suggestion for Prompt

Here‚Äôs an updated version of your prompt to incorporate these learnings:

```python
prompt_template = ChatPromptTemplate.from_template("""
You are a helpful and friendly customer support assistant for Cashflow4cast.

Cashflow4cast helps business owners forecast sales and cash flow using machine learning and real-world economic indicators.
We help mid-sized businesses reduce forecast errors, stay ahead of volatility, and protect their bottom line.

Answer the user's question using only the information provided in the business documents below.

Instructions:
- Consider the question carefully.
- Review the documentation to find the most relevant answer.
- Then compose a clear and helpful response ‚Äî under four sentences.
- Use bullet points when listing options like pricing or features.
- Avoid repeating identical phrasing across multiple answers.
- If the answer is not found in the provided context, say: "I'm sorry, I don't have that information based on the current documentation."
""")
```



In [None]:
# create prompt template
prompt_template = ChatPromptTemplate.from_template("""
You are a helpful and friendly customer support assistant for Cashflow4cast.

Cashflow4cast leverages advanced machine learning algorithms that utlize your business data and combine that \
with federal, state, and local economic indicators ‚Äî to forecast future sales and cash flow more accurately than Excel, \
QuickBooks, or any other off-the-shelf forecasting tools. We help mid-sized businesses stay ahead of sales volatility \
increae profit and reduce inventory by cutting forecast errors in half.

Your task it to nswer the user's question below:

- Consider the question carefully.
- Review the provided documentation for the most relevant information.
- Then compose a clear and helpful response.
- Keep your answer under four sentences.

Context:
{context}

Question:
{question}

Answer:
""")


# Define test questions
test_questions = [
    "What does Cashflow4cast actually do?",
    "How does the service work?",
    "What does it cost?",
    "Can I use it for my small business?",
    "What makes Cashflow4cast different from other tools?"
]


# Define test runner
def run_prompt_tests(questions, rag_chain, wrap_width=100):
    for i, question in enumerate(questions, 1):
        print(f"\nüîπ Test {i}: {question}\n{'-'*wrap_width}")
        response = rag_chain.invoke({"question": question})
        print(textwrap.fill(response, width=wrap_width))
        print("\n" + "="*wrap_width)

# Run test questions
run_prompt_tests(test_questions, rag_chain)


üîπ Test 1: What does Cashflow4cast actually do?
----------------------------------------------------------------------------------------------------
Cashflow4cast leverages advanced machine learning and a combination of business data and economic
indicators to forecast future sales and cash flow more accurately than Excel, QuickBooks, or other
off-the-shelf tools. This helps mid-sized businesses stay ahead of sales volatility and reduce
forecast errors by half.


üîπ Test 2: How does the service work?
----------------------------------------------------------------------------------------------------
Cashflow4cast leverages advanced machine learning and a combination of your business data and
economic indicators to forecast future sales and cash flow more accurately than traditional tools
like Excel or QuickBooks. By analyzing federal, state, and local metrics, our platform helps mid-
sized businesses stay ahead of sales volatility and protect their bottom line by cutting forecast




## ‚úÖ **What Worked Best**

### 1. **Simplified, Friendly Prompt Design**

* **Best-performing prompt** avoided excessive structure and let the model rely on its helpful, conversational training.
* Clearly stated *role* (customer support assistant) and *task* (answer based on docs).
* Friendly and approachable tone resonated more naturally than rigidly professional language.

### 2. **Shorter Chunks + Lower `k`**

* `CHUNK_SIZE = 200` with `K = 2` struck a great balance.
* Delivered *high precision* answers grounded in relevant context, minimizing hallucination.
* Moving pricing info to the **top of the doc** improved its recall.

### 3. **Four-Sentence Limit**

* Helped keep answers snappy and digestible.
* Reduced the model‚Äôs tendency to over-explain or repeat itself.

### 4. **Instruction to Avoid Repeating Phrases**

* This lowered redundancy and made answers feel more tailored and less templated.

---

## ‚ö†Ô∏è **What Didn't Work Well**

### 1. **Overly Long or Formal Prompts**

* Prompts that tried to ‚Äúover-steer‚Äù the model with excessive behavioral rules or detailed instructions led to:

  * Stiffer, less natural tone
  * Redundant answers
  * Unneeded technical language (e.g., MAE, MAPE, RMSE)

### 2. **Large Chunks + High `k`**

* Caused diffusion of relevant info
* Introduced outdated pricing and overly technical language even after doc updates

---

## üí° **Recommendations Going Forward**

### üîß Prompt

Use a prompt that is:

* Clear in role and purpose
* Friendly and natural
* Lightly instructive (not overbearing)

**Example Baseline Prompt:**

```python
"You are a helpful and friendly customer support assistant for Cashflow4cast.
Use only the context below to answer the question. Keep your response under four sentences.
Avoid repeating the same phrase across answers. If the info isn't in the context, say:
'I'm sorry, I don't have that information based on the current documentation.'"
```

### üìê Parameters

* `CHUNK_SIZE = 200`
* `CHUNK_OVERLAP = 50`
* `K = 2`

This setup optimizes *focus, relevance,* and *efficiency*.

### üß™ Testing

* Keep using your test suite of 5‚Äì7 customer questions.
* Add `score_response()` to track keyword or intent coverage.
* Track repeat phrases and measure tone drift across runs.

### üìò Documentation

Consider adding a markdown section to your notebook or repo summarizing:

* Prompt iterations
* Chunking/embedding settings
* Sample outputs + ratings

---

## ‚úÖ Final Takeaway

You've done the hard work of fine-tuning a prompt for **business relevance**, **user experience**, and **model strengths**. The winning formula? Trust the model‚Äôs defaults, provide clear and gentle guidance, and ensure the RAG context is crisp and well-prioritized.



In [None]:
# create prompt template
prompt_template = ChatPromptTemplate.from_template("""
"You are a helpful and friendly customer support assistant for Cashflow4cast.
Use only the context below to answer the question. Keep your response under four sentences.
Avoid repeating the same phrase across answers. If the info isn't in the context, say:
'I'm sorry, I don't have that information based on the current documentation.'"

Context:
{context}

Question:
{question}

Answer:
""")


# Define test questions
test_questions = [
    "What does Cashflow4cast actually do?",
    "How does the service work?",
    "What does it cost?",
    "Can I use it for my small business?",
    "What makes Cashflow4cast different from other tools?"
]


# Define test runner
def run_prompt_tests(questions, rag_chain, wrap_width=100):
    for i, question in enumerate(questions, 1):
        print(f"\nüîπ Test {i}: {question}\n{'-'*wrap_width}")
        response = rag_chain.invoke({"question": question})
        print(textwrap.fill(response, width=wrap_width))
        print("\n" + "="*wrap_width)

# Run test questions
run_prompt_tests(test_questions, rag_chain)


üîπ Test 1: What does Cashflow4cast actually do?
----------------------------------------------------------------------------------------------------
Cashflow4cast uses advanced machine learning and a combination of business data and economic
indicators to forecast future sales and cash flow more accurately than Excel, QuickBooks, or other
off-the-shelf tools. Their platform helps mid-sized businesses stay ahead of sales volatility and
reduce forecast errors by half.


üîπ Test 2: How does the service work?
----------------------------------------------------------------------------------------------------
Cashflow4cast leverages advanced machine learning and a combination of your business data and
economic indicators to forecast future sales and cash flow more accurately than traditional tools
like Excel or QuickBooks. By analyzing federal, state, and local metrics, our platform helps mid-
sized businesses stay ahead of sales volatility and protect their bottom line by reducing for

These results are excellent ‚Äî you're in a really solid place. Here's a
### ‚úÖ **What‚Äôs Working Well**

#### 1. **Clarity + Accuracy**

* Each answer directly addresses the question using *precise* and *business-relevant* language.
* There's no hallucination or fluff, and pricing is **correct** and consistent.

#### 2. **Tone**

* Answers feel **clear, confident, and professional**, yet friendly ‚Äî exactly the tone you were aiming for.
* No excessive formalism or robotic phrasing.

#### 3. **Length & Structure**

* All responses are comfortably under 4 sentences, with **tight, informative phrasing**.
* Test 4 even varies structure slightly (second sentence adds user-oriented detail), showing good balance.

---

### ‚ö†Ô∏è **Tiny Refinements (Optional)**

#### üîÅ **Slight Redundancy in Phrasing**

Many responses start with:

> "Cashflow4cast leverages advanced machine learning and a combination of your business data and economic indicators..."

This repetition across Test 1, 2, and 5 could be slightly fatiguing over time. A minor adjustment like:

* ‚ÄúOur platform combines your business data with real-world economic signals‚Ä¶‚Äù
* ‚ÄúWe use intelligent forecasting models built on local and national data‚Ä¶‚Äù

‚Ä¶could add variety without changing substance.

#### üß† **User Empathy in Small Business Q (Test 4)**

It‚Äôs great as-is, but optionally you could soften the phrasing for small businesses even more:

> ‚ÄúAbsolutely. Cashflow4cast is designed to support small businesses like yours...‚Äù

‚Äî

### ‚úÖ **Verdict**

You're getting *clean, business-relevant, tone-appropriate answers* ‚Äî the goal of any customer support RAG system. This setup is production-ready for most applications.

---

### üí° Final Tip

You can preserve this success by:

* Freezing this prompt and chunk config in version control
* Logging responses and reviewing periodically for drift
* Collecting real customer questions over time to expand test coverage


In [2]:
import json
from google.colab import drive
drive.mount('/content/drive', force_remount=True)

notebook_path ="/content/drive/My Drive/LANGCHAIN/LC_009_RAG_CustServiceBot_Testing.ipynb"

# Load the notebook JSON
with open(notebook_path, 'r', encoding='utf-8') as f:
    nb = json.load(f)

# 1. Remove widgets from notebook-level metadata
if "widgets" in nb.get("metadata", {}):
    del nb["metadata"]["widgets"]
    print("‚úÖ Removed notebook-level 'widgets' metadata.")

# 2. Remove widgets from each cell's metadata
for i, cell in enumerate(nb.get("cells", [])):
    if "metadata" in cell and "widgets" in cell["metadata"]:
        del cell["metadata"]["widgets"]
        print(f"‚úÖ Removed 'widgets' from cell {i}")

# Save the cleaned notebook
with open(notebook_path, 'w', encoding='utf-8') as f:
    json.dump(nb, f, indent=2)

print("‚úÖ Notebook deeply cleaned. Try uploading to GitHub again.")

Mounted at /content/drive
‚úÖ Notebook deeply cleaned. Try uploading to GitHub again.
