## LangChain RAG - Beginner's Guide

### Overview
This notebook demonstrates building a Retrieval-Augmented Generation (RAG) system using:
- LangChain framework for orchestration
- Local Ollama models for embeddings and chat
- Chroma vector database for storage


The RAG pipeline is the same across frameworks. Technical details may change, but the high-level structure stays the same. It rests on two main steps: first, create a vector store from the raw data; second, retrieve the data in the vector store that is most relevant to a user query.

### Goal 1: Create vector store from the documents.
Document's raw text --> raw text chunks (using your preferred splitting method) --> create an embedding model --> create a vector store from the chunks and the embedding model.

The final output is the same raw data, represented as numeric vectors that a computer can compare.
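
To make "text represented as numbers" concrete, here is a minimal sketch using only the standard library. It is not what Ollama or Chroma do internally: the word-count "embedding" and the names `tokenize`, `build_vocabulary`, `embed`, and `toy_store` are hypothetical stand-ins chosen for illustration.

```python
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and keep only alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(chunks: list[str]) -> list[str]:
    """Collect every distinct word across all chunks, sorted for a stable order."""
    return sorted({word for chunk in chunks for word in tokenize(chunk)})


def embed(text: str, vocabulary: list[str]) -> list[int]:
    """Toy embedding (an assumption for illustration): one word count per vocabulary entry."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]


raw_text = "Swedish massage costs 80 dollars. Hot stone massage costs 110 dollars."

# Step 1: raw text -> chunks (here, one chunk per sentence).
chunks = [part.strip() for part in raw_text.split(".") if part.strip()]

# Step 2: chunks -> vectors, stored next to the text they came from.
vocabulary = build_vocabulary(chunks)
toy_store = [(chunk, embed(chunk, vocabulary)) for chunk in chunks]

for chunk, vector in toy_store:
    print(vector, "<-", chunk)
```

A real embedding model produces dense vectors that capture meaning rather than raw word counts, but the shape of the result is the same: each chunk is paired with a list of numbers.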


In [1]:
### Importing the necessary libraries

from langchain_ollama import ChatOllama
# Message classes live in langchain_core; the older langchain.schema path is deprecated.
from langchain_core.messages import SystemMessage, HumanMessage


In [2]:
import os
# Define the persistent directory where the vector store will be saved.
current_dir = os.getcwd()
persistent_dir = os.path.join(current_dir, "db", "first_vector_store")
print(persistent_dir)  # path of the vector store; we will use it later

d:\Langchain\langchain-crash-course\Langchain-getting-started\db\first_vector_store


Now let's load the data from our document, `spa.pdf` (a plain-text version, `spa.txt`, is left commented out in the loading cell). This is dummy data generated with GPT.

### 1. Document Processing
```python
from langchain_community.document_loaders import TextLoader
from langchain.text_splitter import CharacterTextSplitter
```
- **TextLoader**: Loads plain-text documents from the filesystem (the loading cell uses `PyPDFLoader` for the PDF version)
- **CharacterTextSplitter**: Splits documents into chunks with:
  - `chunk_size`: Maximum characters per chunk
  - `chunk_overlap`: Overlap between chunks for context preservation.
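
The effect of `chunk_size` and `chunk_overlap` can be sketched with a plain sliding window. This is a simplified illustration, not LangChain's implementation: `CharacterTextSplitter` first splits on a separator and then merges the pieces up to `chunk_size`, whereas the hypothetical `split_by_characters` below cuts at fixed character positions.

```python
def split_by_characters(text: str, chunk_size: int, chunk_overlap: int = 0) -> list[str]:
    """Cut text into windows of chunk_size characters, each sharing
    chunk_overlap characters with the previous window."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the window already reached the end of the text
    return chunks


sample = "abcdefghijklmnopqrstuvwxyz"
for chunk in split_by_characters(sample, chunk_size=10, chunk_overlap=3):
    print(chunk)
```

With an overlap of 3, the last three characters of each chunk reappear at the start of the next one, which is how context is carried across chunk boundaries.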

In [3]:
from langchain_community.document_loaders import PyPDFLoader, TextLoader  # LangChain loaders

# loader = TextLoader("./data/spa.txt", encoding="utf-8")  # alternative: load the plain-text version
loader = PyPDFLoader("./data/spa.pdf")  # initialize the PDF loader with the path of the data
documents = loader.load()  # load the data; PyPDFLoader returns one Document per page
print(documents)



After loading the data, we split the text into chunks so that the vector store is built from the chunks instead of the whole document as a single entry. This has several advantages. The main one is that a similarity search returns only the relevant chunks rather than the whole document, which saves computation and keeps the context passed to the model small.

Splitting can be done using different techniques:

1. **Character-based splitting**: divides text into fixed-size chunks based on a character count.
   - Pros: fast and simple.
   - Cons: can break sentences and disrupt the natural flow of information.

2. **Sentence splitting**: segments text into individual sentences.
   - Pros: maintains sentence integrity, leading to more coherent chunks.
   - Cons: sentences vary in length, leading to inconsistent chunk sizes.

3. **Semantic splitting**: uses embeddings to understand the meaning of the text and clusters semantically related sentences or passages together.

Many other techniques exist.
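
As a rough illustration of sentence splitting, the sketch below breaks text at sentence-ending punctuation and then packs whole sentences into chunks up to a size limit. The regex is deliberately naive (it would mis-split abbreviations such as "e.g."), and `split_into_sentences` and `pack_sentences` are hypothetical helper names, not LangChain functions.

```python
import re


def split_into_sentences(text: str) -> list[str]:
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def pack_sentences(sentences: list[str], max_chars: int) -> list[str]:
    """Greedily join whole sentences until adding one more would exceed max_chars.
    A single sentence longer than max_chars is kept whole as its own chunk."""
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


text = "We open at 9 AM. We close at 9 PM. Holidays are closed! Is parking free? Yes."
sentences = split_into_sentences(text)
chunks = pack_sentences(sentences, max_chars=40)

for chunk in chunks:
    print(repr(chunk))
```

Unlike the fixed character window, no sentence is cut in half here, at the cost of chunks with uneven lengths.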

In [4]:
from langchain.text_splitter import CharacterTextSplitter

# Split the documents into chunks of roughly 1000 characters with no overlap.
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
print(docs[1])  # inspect the second chunk

page_content='Experience: 7 years 
   Contact: michael.chen@serenityblissspa.com 
 
               Services & Prices 
Massage Therapy: 
- Swedish Massage (60 min): $80 
- Deep Tissue Massage (60 min): $95 
- Hot Stone Massage (75 min): $110 
- Aromatherapy Massage (60 min): $90 
 
Skincare Treatments: 
- Classic Facial (50 min): $75 
- Anti-Aging Facial (70 min): $120 
- Hydrating Facial (60 min): $95 
- Acne Treatment Facial (60 min): $100 
 
Wellness Services: 
- Guided Meditation Session (45 min): $50 
- Stress Management Coaching (60 min): $85 
- Yoga Class (Group, 60 min): $20 per person 
 
Packages: 
- Bliss Package: Swedish Massage + Classic Facial – $140 
- Renewal Package: Hot Stone Massage + Hydrating Facial – 
$185 
- Serenity Full-Day Retreat: All-day access to treatments & 
wellness coaching – $350 
 
       Contact Information 
Phone: +1 (555) 234-7890 
Email: bookings@serenityblissspa.com 
Address: 145 Lotus Avenue, Tranquil City, Harmony State, 
12345 
 
   Frequently A

So far, we have done the following:
1. Loaded the document data using PyPDFLoader
2. Split the data into chunks using the character text splitter


### Next goal: create the embedding model
Then embed the split chunks with that model so the raw text can be stored as vectors in a vector store.
### 2. Embeddings
```python
from langchain_ollama import OllamaEmbeddings
```
- Uses local Ollama embedding model (`nomic-embed-text`)
- Converts text to numerical vectors capturing semantic meaning
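
Once text is a vector, "semantic similarity" becomes a numeric comparison. Cosine similarity is one common measure, sketched below with the standard library. The three-dimensional vectors are hand-made for illustration and are not real `nomic-embed-text` outputs, which have many more dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, invented for illustration only.
massage = [0.9, 0.1, 0.0]
spa_treatment = [0.8, 0.3, 0.1]
invoice = [0.0, 0.2, 0.9]

print("massage vs spa_treatment:", round(cosine_similarity(massage, spa_treatment), 2))
print("massage vs invoice:      ", round(cosine_similarity(massage, invoice), 2))
```

Texts with related meanings should end up with vectors that point in similar directions, so their cosine similarity is higher than that of unrelated texts.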

In [5]:
from langchain_ollama import OllamaEmbeddings # OllamaEmbeddings
# create the embedding model
embeddings = OllamaEmbeddings(model="nomic-embed-text:v1.5")

### 3. Vector Store
```python
from langchain_community.vectorstores import Chroma
```
- **Chroma**: Local vector database that:
  - Stores embeddings efficiently
  - Enables fast similarity search
  - Persists data to disk automatically when `persist_directory` is set
- Created with:
  ```python
  Chroma.from_documents(
      docs, 
      embeddings,
      persist_directory=persistent_dir
  )
  ```
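
To show what "store, search, persist" means in the simplest possible terms, here is a toy in-memory store that saves itself to a JSON file and reloads it. `TinyVectorStore` is a hypothetical class written for this sketch. It does not reflect how Chroma is implemented, and the two-dimensional vectors are invented.

```python
import json
import math
import os
import tempfile


class TinyVectorStore:
    """Toy vector store for illustration only (not Chroma's implementation)."""

    def __init__(self):
        self.records = []  # each record: {"text": str, "vector": list[float]}

    def add(self, text, vector):
        self.records.append({"text": text, "vector": list(vector)})

    def similarity_search(self, query_vector, k=2):
        """Return the k texts whose vectors are closest by Euclidean distance."""
        ranked = sorted(self.records, key=lambda r: math.dist(r["vector"], query_vector))
        return [r["text"] for r in ranked[:k]]

    def save(self, path):
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.records, f)

    @classmethod
    def load(cls, path):
        store = cls()
        with open(path, encoding="utf-8") as f:
            store.records = json.load(f)
        return store


store = TinyVectorStore()
store.add("Swedish Massage (60 min): $80", [0.9, 0.1])
store.add("Classic Facial (50 min): $75", [0.2, 0.8])
store.add("Yoga Class (Group, 60 min): $20 per person", [0.5, 0.5])

# Persist to disk, then reload into a fresh object, as a later session would.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "tiny_store.json")
    store.save(path)
    reloaded = TinyVectorStore.load(path)

print(reloaded.similarity_search([1.0, 0.0], k=2))
```

A real vector database adds indexing so that search stays fast with many vectors, but the contract is the same: add vectors once, reload them later, and query by similarity.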


In [6]:
from langchain_community.vectorstores import Chroma # Chroma db

# Create the vector store with Chroma.from_documents.
# This embeds the chunks and saves the store in the persistent directory.
db = Chroma.from_documents(
    documents=docs,
    embedding=embeddings,
    persist_directory=persistent_dir,
)

Failed to send telemetry event ClientStartEvent: capture() takes 1 positional argument but 3 were given
Failed to send telemetry event ClientCreateCollectionEvent: capture() takes 1 positional argument but 3 were given


``` Congrats, you have now successfully created a vector store from your raw text and embedding model. ```

### How can we use this vector store?
Using the store is simple. All you need to do is:
1. Load the vector store if it is not already loaded.
2. Apply a similarity search technique to retrieve the relevant chunks.

In [7]:
# Load the vector store from disk so we can use it for retrieval.
database = Chroma(persist_directory=persistent_dir, embedding_function=embeddings)



In [8]:
# Now we can use the vector store for retrieval via the as_retriever method.
retriever = database.as_retriever(
    search_type="similarity_score_threshold",
    search_kwargs={"k": 3, "score_threshold": 0.2}  # up to 3 chunks scoring at least 0.2
)
query = "What are the services offered?"
# Use the retriever to search for relevant chunks.
results = retriever.invoke(query)

Failed to send telemetry event CollectionQueryEvent: capture() takes 1 positional argument but 3 were given
  self.vectorstore.similarity_search_with_relevance_scores(
No relevant docs were retrieved using the relevance score threshold 0.2
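
The warning above means that no chunk reached a relevance score of 0.2, so `results` is empty and the print loop in the next cell produces no output. Lowering `score_threshold`, or using `search_type="similarity"` (which returns the top `k` chunks regardless of score), would return chunks. The sketch below illustrates the filtering idea under the assumption that the retriever takes the top `k` chunks and then drops those below the threshold. The scores are invented, and `retrieve_with_threshold` is a hypothetical helper, not a LangChain function.

```python
def retrieve_with_threshold(scored_chunks, k, score_threshold):
    """Keep the k highest-scoring chunks, then drop any below the threshold."""
    top_k = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)[:k]
    return [chunk for chunk, score in top_k if score >= score_threshold]


# Hypothetical relevance scores, invented for illustration only.
scored = [
    ("Services & Prices", 0.15),
    ("Opening Hours", 0.05),
    ("Employees Data", -0.10),
]

strict = retrieve_with_threshold(scored, k=3, score_threshold=0.2)
relaxed = retrieve_with_threshold(scored, k=3, score_threshold=0.1)

print("threshold 0.2:", strict)   # nothing clears the bar
print("threshold 0.1:", relaxed)  # the best-scoring chunk is returned
```

A threshold is useful for refusing to answer from weak matches, but it has to be tuned to the score range that the embedding model and distance metric actually produce.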


In [9]:
for i, doc in enumerate(results, 1):
    print(f"Document {i}:\n{doc.page_content}\n")
    if doc.metadata:
        print(f"Source: {doc.metadata.get('source', 'Unknown')}\n")

### What did we cover?

#### Step 1: Setup
1. Import required libraries
2. Verify file paths and data availability

#### Step 2: Document Loading
- Loads the document using the appropriate loader from `langchain_community.document_loaders`
- Verifies document content and size

#### Step 3: Text Splitting
- Splits the document into chunks, with overlap if the use case requires it.
- Preserves document structure by splitting on newlines or a specific separator.

#### Step 4: Embedding Creation
- Initializes Ollama embeddings
- Tests embedding generation with sample text
- Shows embedding dimensionality (vector size)

#### Step 5: Vector Store Creation
- Creates new Chroma database using Chroma.from_documents method
- Stores document chunks as vectors
- Retrieves relevant documents

#### Step 6: Testing
- Performs similarity searches with test queries:
  ```python
  db.similarity_search("What services does the spa offer?", k=2)
  ```
- Displays top matching chunks for each query


--- 


In [10]:
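# NOTE: this joins every chunk in `docs`, not the retrieved `results`, because the
# retriever above returned no chunks at score_threshold=0.2. Once retrieval returns
# chunks, join `results` instead so only the relevant text reaches the model.
combined_user_query = (
    "Here are some documents that might help answer the question: "
    + query
    + "\n\nRelevant Documents:\n"
    + "\n\n".join([doc.page_content for doc in docs])
    + "\n\nPlease provide an answer based only on the provided documents. Never mention the documents in your response. If the answer is not found in the documents, respond with 'I'm not sure'."
)
print(combined_user_query)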

Here are some documents that might help answer the question: What are the services offered?

Relevant Documents:
Serenity Bliss Spa – Internal Data File        
 
       General Information 
Name: Serenity Bliss Spa 
Location: 145 Lotus Avenue, Tranquil City, Harmony State, 
12345 
Phone: +1 (555) 234-7890 
Email: contact@serenityblissspa.com 
Website: www.serenityblissspa.com 
 
      Opening Hours 
Monday – Friday: 9:00 AM – 9:00 PM 
Saturday: 10:00 AM – 8:00 PM 
Sunday: 11:00 AM – 6:00 PM 
Holidays: Closed on major public holidays 
 
                          Employees Data 
1. Employee ID: SB001 
   Name: Emma Johnson 
   Role: Spa Manager 
   Experience: 10 years in spa management 
   Contact: emma.johnson@serenityblissspa.com 
 
2. Employee ID: SB002 
   Name: Daniel Lee 
   Role: Senior Massage Therapist 
   Specialty: Deep Tissue & Sports Massage 
   Experience: 8 years 
   Contact: daniel.lee@serenityblissspa.com 
 
3. Employee ID: SB003 
   Name: Sophia Martinez 
   Role: Ski

In [11]:
# Create the ollama chat model
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "llama3.2:3b")

model = ChatOllama(model=OLLAMA_MODEL)

In [12]:
## create the messages
messages = [
    SystemMessage(content="You are a helpful assistant for the spa company."),
    HumanMessage(content=combined_user_query),
]
result = model.invoke(messages)
print(result.content)


Based on the information provided, Serenity Bliss Spa offers a range of services, including:

* Massage Therapy: Swedish, Deep Tissue, Hot Stone, and Aromatherapy
* Skincare Treatments: Classic Facial, Anti-Aging Facial, Hydrating Facial, and Acne Treatment Facial
* Wellness Services: Guided Meditation Session, Stress Management Coaching, and Yoga Class (Group)
* Packages: Bliss Package, Renewal Package, and Serenity Full-Day Retreat

They also offer various treatment options for couples.


``` Congratulations, you have successfully created a vector store from your raw data, retrieved relevant data from that vector store based on a specific query, and passed the relevant information to an LLM to generate a response. ```
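
In this notebook the prompt cell joins all of `docs` rather than the retrieved `results`. The sketch below shows one way to assemble the prompt from retrieved chunks only, with a fallback when retrieval comes back empty. `build_rag_prompt` and the placeholder text are hypothetical, and the chunks are passed as plain strings to keep the example free of LangChain imports.

```python
def build_rag_prompt(query: str, retrieved_chunks: list[str]) -> str:
    """Assemble a prompt from the question and only the retrieved chunks."""
    if retrieved_chunks:
        context = "\n\n".join(retrieved_chunks)
    else:
        context = "(no documents retrieved)"  # placeholder so the model can answer "I'm not sure"
    return (
        "Here are some documents that might help answer the question: "
        + query
        + "\n\nRelevant Documents:\n"
        + context
        + "\n\nPlease provide an answer based only on the provided documents. "
        + "If the answer is not found in the documents, respond with 'I'm not sure'."
    )


# Stand-ins for [doc.page_content for doc in results] in the notebook.
retrieved = [
    "Massage Therapy:\n- Swedish Massage (60 min): $80",
    "Wellness Services:\n- Yoga Class (Group, 60 min): $20 per person",
]

prompt = build_rag_prompt("What are the services offered?", retrieved)
print(prompt)
```

Passing only retrieved chunks keeps the prompt short and is the point of building the vector store in the first place. With a larger document, sending every chunk would not fit in the model's context window.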