<a href="https://colab.research.google.com/github/micah-shull/LangChain/blob/main/LC_007_RAG_PromptTesting.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



## 🔍 What Is Prompt Engineering?

**Prompt engineering** is the art (and science) of crafting the *input* you give a language model to:

* Guide its tone, structure, or style
* Influence the kind of output you get
* Align the model’s response with your business or task goals

At its core, it’s about **asking the right question, the right way.**

---

## 🧠 Why It Matters

Language models don’t “know” what you want unless you tell them. The same question can produce very different results depending on:

* The **structure** of the prompt
* The **persona** or role you assign the model
* The **instructions** you embed
* The **formatting or examples** you give

Even with the same documents in RAG, the **prompt** is what determines how that information is **interpreted, framed, and communicated**.

---

## ✍️ Core Principles of Prompt Engineering

### 1. **Clear Role Assignment**

Assign a persona that sets tone, audience, and expertise.

> *You are a financial advisor helping small Gainesville businesses interpret macroeconomic trends.*

### 2. **Explicit Instructions**

Tell the model **what to do**, not just what question to answer.

> *Identify 3 key trends. Use document metadata. End with a 2-sentence summary.*

### 3. **Context Awareness**

If you have documents, say **how** to use them.

> *Use the following context, citing titles and dates where relevant.*

### 4. **Structured Output**

Tell the model how to structure the answer.

> *Answer in bullet points, each with a heading, explanation, and implication.*

### 5. **Few-Shot Examples** (optional)

Show 1–2 examples of what a good output looks like. This is powerful but increases token usage.
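The five principles above can be combined into one template string. The following is a minimal sketch in plain Python, not part of the original notebook: `build_prompt_template` and its parameter names are hypothetical, and the result keeps `{context}` and `{question}` as placeholders so it could be handed to `ChatPromptTemplate.from_template` later on.

```python
def build_prompt_template(role, instructions, context_rule, output_format, examples=()):
    """Assemble a prompt template from the five principles (hypothetical helper).

    The returned string keeps {context} and {question} as placeholders.
    Note: literal braces inside any argument would be read as placeholders too.
    """
    parts = [role, instructions, context_rule, output_format]
    if examples:
        # Few-shot examples are optional because they increase token usage.
        shots = "\n\n".join(f"Example {i}:\n{ex}" for i, ex in enumerate(examples, 1))
        parts.append(shots)
    parts.append("Context:\n{context}")
    parts.append("Question:\n{question}")
    parts.append("Answer:")
    return "\n\n".join(parts)


template = build_prompt_template(
    role="You are a financial advisor helping small Gainesville businesses "
         "interpret macroeconomic trends.",
    instructions="Identify 3 key trends. End with a 2-sentence summary.",
    context_rule="Use the following context, citing titles and dates where relevant.",
    output_format="Answer in bullet points, each with a heading, explanation, and implication.",
)

# Fill the placeholders by hand to preview what the model would receive.
prompt = template.format(
    context="<retrieved chunks go here>",
    question="What should I watch this quarter?",
)
print(prompt)
```

Keeping each principle as a separate argument makes it easy to change one of them (for example the persona) while holding the rest constant.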




## Pip Install Packages

In [3]:
!pip install --upgrade --quiet \
    langchain \
    langchain-huggingface \
    langchain-openai \
    langchain-community \
    chromadb \
    python-dotenv \
    transformers \
    accelerate \
    sentencepiece

## Load Libraries

In [10]:
# 🌿 Environment setup
import os                                 # File paths and OS interaction
from dotenv import load_dotenv            # Load environment variables from .env file
import langchain; print(langchain.__version__)  # Check LangChain version

# 📄 Document loading and preprocessing
from langchain_core.documents import Document                   # Base document type
from langchain_community.document_loaders import TextLoader     # Loads plain text files
from langchain.text_splitter import RecursiveCharacterTextSplitter  # Splits long docs into smaller chunks

# 🔢 Embeddings + vector storage
from langchain_huggingface import HuggingFaceEmbeddings         # HuggingFace embedding model
from langchain_community.vectorstores import Chroma             # Persistent vector DB (Chroma)

# 💬 Prompting + output
from langchain_core.prompts import ChatPromptTemplate           # Chat-style prompt templates
from langchain_core.output_parsers import StrOutputParser       # Converts model output to string

# 🔗 Chains / pipelines
from langchain_core.runnables import Runnable, RunnableLambda   # Compose custom pipelines

# 🧠 (Optional) Hugging Face LLM client setup
# from langchain_huggingface import HuggingFaceEndpoint, ChatHuggingFace  # For HF inference API

# 🧾 Pretty printing
import textwrap                         # Format long strings for printing
from pprint import pprint               # Nicely format nested data structures


0.3.25


## SET PARAMS

In [6]:
# SET MODEL PARAMS
EMBED_MODEL = "all-MiniLM-L6-v2"
# LLM_MODEL = "gpt-3.5-turbo"
CHUNK_SIZE = 200
CHUNK_OVERLAP = 50
K = 2

In [20]:
# Requires OPENAI_API_KEY in the environment; run the load_dotenv cell below first.
from langchain_openai import ChatOpenAI

LLM_MODEL = ChatOpenAI(
    model="gpt-3.5-turbo",
    temperature=0.4  # Moderate creativity; adjust as needed
)



## 🧾 Document Cleaning

### 🧾 1. **Load the `.txt` files**

We’ll loop through all files in the folder using `TextLoader`.

### 🧹 2. **Cleaning**

Basic cleaning (e.g. stripping newlines, extra whitespace) is often helpful **before splitting**, especially if the files came from exports or copy-paste.

### ✂️ 3. **Split into chunks**

We’ll use `RecursiveCharacterTextSplitter` to chunk documents. This notebook sets `CHUNK_SIZE = 200` and `CHUNK_OVERLAP = 50`; 500–1000 characters with a slight overlap for context continuity is a more typical starting range.
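To see what chunk size and overlap mean in practice, here is a simplified sliding-window sketch. It is an illustration only: `RecursiveCharacterTextSplitter` splits on separators such as paragraph breaks and spaces rather than at fixed character offsets, and `sliding_chunks` is a hypothetical helper.

```python
def sliding_chunks(text, chunk_size, chunk_overlap):
    """Cut text into fixed-size windows that share chunk_overlap characters."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
chunks = sliding_chunks(sample, chunk_size=10, chunk_overlap=4)
for chunk in chunks:
    print(chunk)
```

Each window repeats the last four characters of the previous one, which is the continuity that overlap is meant to provide.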

---

### 🧼 Why Basic Cleaning Helps

* Removes hard line breaks and blank lines left over from exports or copy-paste, which add noise to chunks
* Keeps the splitter from breaking chunks at stray line breaks
* Standardizes the text format before embedding

Later you can add more advanced cleaning (e.g., remove boilerplate, normalize headers), but this is a solid default.
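As a sketch of one such refinement, the helper below offers two modes: collapsing all whitespace (which is what `" ".join(text.split())` does) or collapsing whitespace inside paragraphs while keeping blank-line paragraph breaks. The second mode may be worth trying because `RecursiveCharacterTextSplitter` tries `"\n\n"` as its first default separator. `normalize_whitespace` and `keep_paragraphs` are hypothetical names.

```python
import re


def normalize_whitespace(text, keep_paragraphs=False):
    """Collapse runs of whitespace; optionally keep blank-line paragraph breaks."""
    if not keep_paragraphs:
        # Every newline, tab, and repeated space becomes a single space.
        return " ".join(text.split())
    # Split on blank lines, clean each paragraph, then rejoin with one blank line.
    paragraphs = re.split(r"\n\s*\n", text)
    cleaned = [" ".join(p.split()) for p in paragraphs]
    return "\n\n".join(p for p in cleaned if p)


raw = "Cash  flow\nforecasting\n\n\n  matters.\t Plan ahead.\n"
print(repr(normalize_whitespace(raw)))
print(repr(normalize_whitespace(raw, keep_paragraphs=True)))
```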





In [7]:
# Load token from .env.
load_dotenv("/content/API_KEYS.env", override=True)

# Path to your documents
docs_path = "/content/CFFC_docs"

# Step 1: Load all .txt files in the folder
raw_documents = []
for filename in os.listdir(docs_path):
    if filename.endswith(".txt"):
        file_path = os.path.join(docs_path, filename)
        loader = TextLoader(file_path, encoding="utf-8")
        docs = loader.load()
        raw_documents.extend(docs)

print(f"Loaded {len(raw_documents)} documents.")

# Step 2 (optional): Clean up newlines and extra whitespace
def clean_doc(doc: Document) -> Document:
    cleaned = " ".join(doc.page_content.split())  # Removes newlines & extra spaces
    return Document(page_content=cleaned, metadata=doc.metadata)

cleaned_documents = [clean_doc(doc) for doc in raw_documents]

# Step 3: Split documents into chunks
splitter = RecursiveCharacterTextSplitter(
    chunk_size=CHUNK_SIZE,
    chunk_overlap=CHUNK_OVERLAP
)

chunked_documents = splitter.split_documents(cleaned_documents)

print(f"Split into {len(chunked_documents)} total chunks.")

# Preview the first 5 chunks
print(f"Showing first 5 of {len(chunked_documents)} chunks:\n")

for i, doc in enumerate(chunked_documents[:5]):
    print(f"--- Chunk {i+1} ---")
    print(f"Source: {doc.metadata.get('source', 'N/A')}\n")
    print(textwrap.fill(doc.page_content[:500], width=100))  # limit preview to 500 characters
    print("\n")

Loaded 7 documents.
Split into 174 total chunks.
Showing first 5 of 174 chunks:

--- Chunk 1 ---
Source: /content/CFFC_docs/CFFC_What If You Could Cut Cash Flow Forecasting Errors by 50%?.txt

Cashflow 4Cast What If You Could Cut Cash Flow Forecasting Errors by 50%? on March 28, 2025 What If
You Could Cut Cash Flow Forecasting Errors by 50%? Every business lives or dies by its ability to


--- Chunk 2 ---
Source: /content/CFFC_docs/CFFC_What If You Could Cut Cash Flow Forecasting Errors by 50%?.txt

Every business lives or dies by its ability to manage cash flow. Whether it’s covering payroll,
restocking inventory, or preparing for a seasonal dip — having reliable numbers makes all the


--- Chunk 3 ---
Source: /content/CFFC_docs/CFFC_What If You Could Cut Cash Flow Forecasting Errors by 50%?.txt

dip — having reliable numbers makes all the difference. And yet, most small business owners are
flying blind with clunky spreadsheets or outdated tools that leave them guessing. That’s where


## ✅ Embed + Persist in Chroma




In [11]:
# Step 1: Set up Hugging Face embedding model
embedding_model = HuggingFaceEmbeddings(model_name=EMBED_MODEL)

# Step 2: Set up Chroma with persistence
persist_dir = "chroma_db"

vectorstore = Chroma.from_documents(
    documents=chunked_documents,
    embedding=embedding_model,
    persist_directory=persist_dir
)

print(f"✅ Stored {len(chunked_documents)} chunks in Chroma at '{persist_dir}'")

✅ Stored 174 chunks in Chroma at 'chroma_db'
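One thing to watch with a persistent directory: re-running the cell above may add the same chunks again, because new IDs are generated when none are supplied. A possible guard, sketched here under assumptions, is to derive stable IDs from each chunk's source and content and pass them through the `ids` argument of `Chroma.from_documents`. `stable_chunk_ids` is a hypothetical helper, and whether matching IDs are overwritten rather than duplicated should be verified on your installed version.

```python
import hashlib
from collections import Counter
from types import SimpleNamespace


def stable_chunk_ids(chunks):
    """Build repeatable IDs from each chunk's source path and text.

    Works with any object exposing .page_content and .metadata,
    such as a LangChain Document.
    """
    seen = Counter()
    ids = []
    for chunk in chunks:
        source = str(chunk.metadata.get("source", ""))
        payload = f"{source}\n{chunk.page_content}".encode("utf-8")
        digest = hashlib.sha256(payload).hexdigest()[:16]
        seen[digest] += 1  # distinguishes identical chunks from the same file
        ids.append(f"{digest}-{seen[digest]}")
    return ids


# Stand-in chunks so the sketch runs without LangChain installed.
demo_chunks = [
    SimpleNamespace(page_content="Same text", metadata={"source": "a.txt"}),
    SimpleNamespace(page_content="Same text", metadata={"source": "a.txt"}),
    SimpleNamespace(page_content="Other text", metadata={"source": "b.txt"}),
]
demo_ids = stable_chunk_ids(demo_chunks)
print(demo_ids)

# Assumed usage with the notebook's variables:
# vectorstore = Chroma.from_documents(
#     documents=chunked_documents,
#     embedding=embedding_model,
#     persist_directory=persist_dir,
#     ids=stable_chunk_ids(chunked_documents),
# )
```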


## ✅ Create the Retriever & Prompt Template

In [12]:
retriever = vectorstore.as_retriever(search_kwargs={"k": K})

# prompt template
prompt_template = ChatPromptTemplate.from_template("""
You are a helpful assistant that uses business documents to answer questions.
Use the following context to answer the question as accurately as possible.

Context:
{context}

Question:
{question}

Answer:
""")


## ✅ Create the RAG Chain & Run a Query

In [15]:
# Define RAG chain
rag_chain = (
    RunnableLambda(lambda d: {
        "question": d["question"],
        "docs": retriever.invoke(d["question"])
    })
    | RunnableLambda(lambda d: {
        "context": "\n\n".join([doc.page_content for doc in d["docs"]]),
        "question": d["question"]
    })
    | prompt_template
    | LLM_MODEL
    | StrOutputParser()
)

# Invoke RAG
response = rag_chain.invoke({
    "question": "What are the recent economic indicators in Gainesville that affect local businesses?"
})

# Print response nicely
import textwrap
print("\n" + textwrap.fill(response, width=100))



Some recent economic indicators in Gainesville that can affect local businesses include consumer
confidence levels, unemployment rates, housing market trends, and overall economic growth in the
region. These factors can impact consumer spending, business investment, and overall business
performance in the local economy.



### 🧩 What is `RunnableLambda`?

In LangChain, `RunnableLambda` is a utility that lets you wrap **any arbitrary Python function** and plug it into a chain.

It's like saying:

> “I want to do a little custom logic or data transformation here before moving on to the next step.”

---

### 🔧 Why use `RunnableLambda` in this RAG chain?

Let’s break it down:

#### ✅ 1. **Retrieve docs**

```python
RunnableLambda(lambda d: {
    "question": d["question"],
    "docs": retriever.invoke(d["question"])
})
```

This part takes the user’s question (`d["question"]`), uses the retriever to fetch relevant documents, and packages both together to move to the next step.

#### ✅ 2. **Format context**

```python
RunnableLambda(lambda d: {
    "context": "\n\n".join([doc.page_content for doc in d["docs"]]),
    "question": d["question"]
})
```

This part takes the retrieved documents (`d["docs"]`) and **builds a string** from them to pass as the `{context}` variable in the prompt. It also forwards the question.

So, this is doing your custom formatting for the prompt.

---

### 📦 Summary of the full chain

Your RAG chain is basically doing:

1. Take the user question.
2. Use the retriever to get relevant documents.
3. Format those docs into a context string.
4. Pass the formatted `context` and `question` into a prompt.
5. Send that to the model.
6. Parse the model output into plain text.

Each `RunnableLambda` gives you the freedom to inject these logic steps **without rewriting the whole chain**.
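The six steps above can be sketched as plain function calls, which makes the data passed between stages easy to inspect. This is an illustration with stand-ins, not LangChain code: `fake_retriever` and `fake_llm` are hypothetical replacements for the real retriever and model.

```python
from types import SimpleNamespace

TEMPLATE = "Context:\n{context}\n\nQuestion:\n{question}\n\nAnswer:"


def retrieve(question, retriever):
    # Steps 1-2: take the question and fetch relevant documents.
    return {"question": question, "docs": retriever(question)}


def format_context(state):
    # Step 3: join the document texts into one context string.
    context = "\n\n".join(doc.page_content for doc in state["docs"])
    return {"context": context, "question": state["question"]}


def run_rag(question, retriever, template, llm):
    state = format_context(retrieve(question, retriever))
    prompt = template.format(**state)  # Step 4: fill the prompt
    return str(llm(prompt))            # Steps 5-6: call the model, return text


def fake_retriever(question):
    """Stand-in retriever returning two fixed chunks."""
    return [SimpleNamespace(page_content="Chunk A"),
            SimpleNamespace(page_content="Chunk B")]


def fake_llm(prompt):
    """Stand-in model that echoes the prompt it received."""
    return "ECHO: " + prompt


answer = run_rag("What drives cash flow?", fake_retriever, TEMPLATE, fake_llm)
print(answer)
```

Each function here corresponds to one link in the chain, so swapping a step means replacing one function rather than the whole pipeline.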




# TESTING
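The experiments in this section rebuild the same chain for each prompt template. A small helper can run one question through several templates and collect the responses side by side. The sketch below assumes a hypothetical `build_chain` callable that turns a template string into a function of the question; with the real pipeline it could wrap `ChatPromptTemplate.from_template` and `rag_chain.invoke`.

```python
def compare_prompts(templates, question, build_chain):
    """Run one question through a chain per template; return {name: response}."""
    results = {}
    for name, template in templates.items():
        chain = build_chain(template)
        results[name] = chain(question)
    return results


def fake_build_chain(template):
    """Stand-in chain that only fills the template, so the sketch runs offline."""
    return lambda question: template.format(
        context="(retrieved chunks)", question=question
    )


templates = {
    "baseline": "Use the context.\n{context}\nQ: {question}",
    "persona": "You are an economist.\n{context}\nQ: {question}",
}
results = compare_prompts(templates, "What changed this quarter?", fake_build_chain)
for name, text in results.items():
    print(f"=== {name} ===\n{text}\n")
```

Holding the question and retriever fixed while varying only the template keeps the comparison about the prompt itself.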

### Persona: Goldman Sachs economist

In [17]:
# Strategy: Persona-driven
prompt_template = ChatPromptTemplate.from_template("""
You are a Goldman Sachs economist tasked with briefing Gainesville business owners.
Use the following economic context to analyze key indicators and explain their impact clearly and concisely.

Context:
{context}

Question:
{question}

Answer:
""")

# Define RAG chain
rag_chain = (
    RunnableLambda(lambda d: {
        "question": d["question"],
        "docs": retriever.invoke(d["question"])
    })
    | RunnableLambda(lambda d: {
        "context": "\n\n".join([doc.page_content for doc in d["docs"]]),
        "question": d["question"]
    })
    | prompt_template
    | LLM_MODEL
    | StrOutputParser()
)


# Invoke RAG
response = rag_chain.invoke({
    "question": "What are the recent economic indicators in Gainesville that affect local businesses?"
})

# Print response nicely
import textwrap
print("\n" + textwrap.fill(response, width=100))


1. Consumer Confidence Index: A positive shift in how people feel about the economy can lead to
increased consumer confidence in Gainesville. This can result in higher consumer spending, which
benefits local businesses as customers are more willing to make purchases.  2. Unemployment Rate: A
decreasing unemployment rate in Gainesville indicates a stronger economy and potentially more
disposable income for residents. This can lead to increased demand for goods and services from local
businesses.  3. Housing Market Trends: The housing market in Gainesville can also impact local
businesses. A booming real estate market can lead to increased construction activity, home sales,
and renovations, benefiting industries such as construction, home improvement, and real estate.  4.
Business Sentiment: The overall sentiment of businesses in Gainesville can also impact local
businesses. Positive sentiment can lead to increased investment, hiring, and expansion, while
negative sentiment can result i

## Combining Prompt Features

---

### 🔹 What Is Context Awareness?

Context awareness means telling the model:

* **What kind of context** it’s being given (e.g., business reports, meeting notes, technical documents).
* **How to use it**, such as analyzing, summarizing, or citing it.
* That the context has **metadata** like dates, titles, or types, which can help inform the response.

Without context cues, the model may:

* Treat the documents like background noise.
* Hallucinate or generalize without anchoring in the source.

---

### ✅ Why It Matters:

* Improves **accuracy** and **relevance**.
* Encourages use of **document metadata** to guide or justify responses.
* Makes the LLM’s output feel grounded and **evidence-based**.
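For the model to use metadata, the metadata has to be in the context string. The chain in this notebook joins only `page_content`, so titles never reach the model. Below is a minimal sketch of a formatter that prefixes each chunk with a title derived from its `source` path, which is the metadata key shown in the chunk previews. `format_docs_with_metadata` is a hypothetical helper; dates would need to be added to the metadata separately.

```python
import os
from types import SimpleNamespace


def format_docs_with_metadata(docs):
    """Join chunks into a context string, labelling each with its source title."""
    blocks = []
    for i, doc in enumerate(docs, 1):
        source = doc.metadata.get("source", "unknown source")
        # Use the file name without its extension as a readable title.
        title = os.path.splitext(os.path.basename(source))[0]
        blocks.append(f"[Document {i}] Title: {title}\n{doc.page_content}")
    return "\n\n".join(blocks)


# Stand-in documents; LangChain Document objects expose the same attributes.
docs = [
    SimpleNamespace(page_content="Cash flow matters.",
                    metadata={"source": "/content/CFFC_docs/report.txt"}),
    SimpleNamespace(page_content="No source here.", metadata={}),
]
context = format_docs_with_metadata(docs)
print(context)
```

In the chain, this function could replace the `"\n\n".join(...)` expression inside the second `RunnableLambda`.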

---

### 🔹 What Is Structured Output?

Structured output tells the model:

* **How to format** its answer (bullets, tables, numbered points, sections, etc.).
* What **parts** or **components** the answer should include.

It is especially useful when your goal is to **compare, summarize, brief, or analyze**.

---

### ✅ Why It Matters:

* Makes long responses easier to read and extract key info from.
* Improves reliability — helps avoid vague or meandering outputs.
* Useful for **post-processing**, e.g., extracting structured data for UIs, dashboards, or reports.
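As an illustration of the post-processing point, the sketch below splits a response into a dictionary keyed by section title. It assumes the prompt asked for level-2 Markdown headings (`## `) for each section; `split_markdown_sections` is a hypothetical helper and the sample text is invented for the demo.

```python
import re


def split_markdown_sections(markdown_text):
    """Map each level-2 heading to the text that follows it."""
    sections = {}
    current = None
    for line in markdown_text.splitlines():
        match = re.match(r"^##\s+(.*)$", line)  # matches "## ", not "### "
        if match:
            current = match.group(1).strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines).strip() for name, lines in sections.items()}


sample = """## Overview
Stable growth.

## Key Indicators
### 1. **Consumer Confidence**
- **Impact:** More spending.

## Recommendations
- Monitor trends."""

sections = split_markdown_sections(sample)
for name, body in sections.items():
    print(f"{name!r}: {body!r}")
```

Level-3 headings stay inside their parent section, so each indicator remains attached to the "Key Indicators" body.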

---





## PROMPT 2

In [21]:
prompt_template = ChatPromptTemplate.from_template("""
You are an economic analyst preparing a briefing for Gainesville business owners.

Analyze the following documents to identify the most relevant local economic indicators.

Format your response in three sections:

**Overview:** A brief summary of the current economic climate in Gainesville.

**Key Indicators:** List 2–3 indicators. For each, include:
- **Indicator Name**
- **Explanation** (what’s happening and what it measures)
- **Impact** (how it may affect local businesses)

**Recommendations:** Offer 1–2 pieces of practical advice for how businesses should respond.

Keep the tone clear, helpful, and forward-looking. Reference document titles and dates when appropriate.

Context:
{context}

Question:
{question}

Answer:
""")


# Define RAG chain
rag_chain = (
    RunnableLambda(lambda d: {
        "question": d["question"],
        "docs": retriever.invoke(d["question"])
    })
    | RunnableLambda(lambda d: {
        "context": "\n\n".join([doc.page_content for doc in d["docs"]]),
        "question": d["question"]
    })
    | prompt_template
    | LLM_MODEL
    | StrOutputParser()
)


# Invoke RAG
response = rag_chain.invoke({
    "question": "What are the recent economic indicators in Gainesville that affect local businesses?"
})

# Print response nicely
import textwrap
print("\n" + textwrap.fill(response, width=100))


**Overview:** The current economic climate in Gainesville is experiencing a shift in consumer
sentiment towards the economy. This change in perception can have a ripple effect on local
businesses, impacting consumer spending and overall business performance.  **Key Indicators:**  1.
**Consumer Confidence Index**    - **Explanation:** The Consumer Confidence Index measures how
optimistic or pessimistic consumers are about the economy. It is based on surveys that ask consumers
about their current and future economic outlook.    - **Impact:** A decrease in consumer confidence
can lead to reduced spending, lower demand for goods and services, and decreased revenue for local
businesses. On the other hand, an increase in consumer confidence can boost spending and support
business growth.  2. **Unemployment Rate**    - **Explanation:** The unemployment rate indicates the
percentage of the labor force that is unemployed and actively seeking employment. A high
unemployment rate may indicate a 

## PROMPT 3

In [23]:
prompt_template = ChatPromptTemplate.from_template("""
You are a local economic analyst preparing a practical briefing for Gainesville business owners.
Keep your tone clear, confident, and accessible — like you're talking to a room of experienced entrepreneurs.

Format your response in **Markdown** using the following structure:

## 📊 Overview
_A brief summary of the current economic climate._

## 🔍 Key Indicators
### 1. **[Indicator Name]**
- **Explanation:** What the indicator is and what it measures.
- **Impact:** How this affects local businesses.

### 2. **[Second Indicator]**
- ...

## 💡 Recommendations
- Use concise bullet points.
- Give actionable advice based on the indicators.

---

Context:
{context}

Question:
{question}

Answer:
""")


# Define RAG chain
rag_chain = (
    RunnableLambda(lambda d: {
        "question": d["question"],
        "docs": retriever.invoke(d["question"])
    })
    | RunnableLambda(lambda d: {
        "context": "\n\n".join([doc.page_content for doc in d["docs"]]),
        "question": d["question"]
    })
    | prompt_template
    | LLM_MODEL
    | StrOutputParser()
)


# Invoke RAG
response = rag_chain.invoke({
    "question": "What are the recent economic indicators in Gainesville that affect local businesses?"
})

# Print response nicely
# import textwrap
# print("\n" + textwrap.fill(response, width=100))
print(response)

## 📊 Overview
The current economic climate in Gainesville is showing signs of stability and growth, with consumer confidence on the rise and unemployment rates decreasing.

## 🔍 Key Indicators
### 1. **Consumer Confidence**
- **Explanation:** Consumer confidence measures how optimistic or pessimistic consumers are about the state of the economy and their personal financial situation.
- **Impact:** A high level of consumer confidence typically leads to increased spending, which can benefit local businesses by boosting sales.

### 2. **Unemployment Rate**
- **Explanation:** The unemployment rate indicates the percentage of the labor force that is unemployed and actively seeking employment.
- **Impact:** A decreasing unemployment rate suggests a stronger job market, which can result in higher consumer spending and a larger customer base for local businesses.

## 💡 Recommendations
- Monitor consumer confidence trends and adjust marketing strategies accordingly to capitalize on increased sp



### 🔍 Anticipated Visitor Questions

1. **Understanding the Service:**

   * *What is Cashflow 4Cast, and how does it differ from traditional forecasting methods?*
   * *How does AI enhance cash flow forecasting accuracy?*

2. **Implementation & Integration:**

   * *How can I integrate Cashflow 4Cast with my existing financial systems?*
   * *What is the setup process, and how long does it take?*

3. **Benefits & Outcomes:**

   * *What tangible benefits can I expect from using Cashflow 4Cast?*
   * *Are there case studies or testimonials from similar businesses?*

4. **Customization & Flexibility:**

   * *Can the forecasting models be tailored to my specific industry or business model?*
   * *How does the system handle unique financial events or anomalies?*

5. **Support & Resources:**

   * *What kind of customer support is available?*
   * *Are there tutorials or guides to help me get started?*

---

### 🧠 Enhancing Your RAG Pipeline

To effectively address these queries:

* **Document Selection:** Ensure that your RAG system prioritizes documents from the "Welcome" section, as they directly address the foundational aspects of your service.

* **Metadata Utilization:** Incorporate metadata such as publication dates and titles to provide context and credibility to the retrieved information.

* **Prompt Engineering:** Craft prompts that guide the AI to extract and present information in a user-friendly manner. For example:

  ```python
  prompt_template = ChatPromptTemplate.from_template("""
  You are an AI assistant for Cashflow 4Cast, helping users understand and utilize AI-powered cash flow forecasting.

  Context:
  {context}

  Question:
  {question}

  Answer:
  """)
  ```

* **Structured Responses:** Encourage the AI to present answers with clear headings and bullet points, making it easier for users to digest information.
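The document-selection idea could be approximated by re-ordering retrieved chunks so that preferred sources come first. The sketch below assumes, hypothetically, that "Welcome" documents can be recognized by a keyword in their `source` path; `prioritize_by_source` is not part of LangChain.

```python
from types import SimpleNamespace


def prioritize_by_source(docs, keyword):
    """Move docs whose source path contains keyword to the front.

    sorted() is stable, so the retriever's order is kept within each group.
    """
    keyword = keyword.lower()
    return sorted(
        docs,
        key=lambda d: keyword not in str(d.metadata.get("source", "")).lower(),
    )


retrieved = [
    SimpleNamespace(page_content="Blog chunk", metadata={"source": "blog_post.txt"}),
    SimpleNamespace(page_content="Welcome chunk", metadata={"source": "Welcome_intro.txt"}),
    SimpleNamespace(page_content="Other chunk", metadata={"source": "faq.txt"}),
]
ordered = prioritize_by_source(retrieved, "welcome")
print([d.page_content for d in ordered])
```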

---

### 📈 Future Enhancements

As you expand your RAG system to include broader economic data:

* **Segmented Pipelines:** Maintain separate pipelines for company-specific content and general economic indicators to ensure relevance and accuracy.

* **User Intent Detection:** Implement mechanisms to discern user intent, directing queries to the appropriate pipeline based on whether they're seeking information about your services or broader economic insights.

* **Continuous Learning:** Regularly update your document corpus with new blog posts, articles, and user feedback to keep the RAG system current and responsive.
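A first version of intent detection can be as simple as keyword matching before reaching for a classifier. The sketch below routes a question to a "company" or "economy" pipeline; the keyword sets and pipeline names are illustrative assumptions, not taken from the notebook.

```python
import re

# Hypothetical keyword lists; tune these to your own content.
COMPANY_TERMS = {"cashflow", "4cast", "pricing", "setup", "integrate",
                 "integration", "support"}
ECONOMY_TERMS = {"economy", "economic", "unemployment", "inflation",
                 "indicators", "gainesville"}


def route_question(question):
    """Return 'economy' or 'company' based on which keyword set matches more."""
    words = set(re.findall(r"[a-z0-9]+", question.lower()))
    company_hits = len(words & COMPANY_TERMS)
    economy_hits = len(words & ECONOMY_TERMS)
    # Ties and no-match cases fall back to the company pipeline.
    return "economy" if economy_hits > company_hits else "company"


print(route_question(
    "What are the recent economic indicators in Gainesville that affect local businesses?"
))
print(route_question(
    "How can I integrate Cashflow 4Cast with my existing financial systems?"
))
```

The returned label could then select which retriever and prompt template to use for the query.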

---

By aligning your RAG pipeline with the specific needs and questions of your website visitors, you can enhance user engagement, provide valuable insights, and position Cashflow 4Cast as a trusted resource in AI-driven financial forecasting.


