### Vector stores and retrievers
This video tutorial will familiarize you with LangChain's vector store and retriever abstractions. These abstractions support retrieving data from vector databases and other sources for integration with LLM workflows. They matter for applications that fetch data to be reasoned over as part of model inference, as in retrieval-augmented generation (RAG).

We will cover:
- Documents
- Vector stores
- Retrievers


### Documents
LangChain implements a Document abstraction, which is intended to represent a unit of text and associated metadata. It has two attributes:

- `page_content`: a string representing the content;
- `metadata`: a dict containing arbitrary metadata.

The `metadata` attribute can capture information about the source of the document, its relationship to other documents, and other information. Note that an individual `Document` object often represents a chunk of a larger document.
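
To make the "chunk of a larger document" idea concrete, here is a minimal sketch that splits a longer text into pieces and tags each piece with its origin. The `chunk_text` helper, the `chunk_index` key, and the 40-character limit are assumptions made for illustration, not part of LangChain; the records are plain dicts so the snippet runs without any dependencies.

```python
def chunk_text(text: str, source: str, chunk_size: int = 40) -> list[dict]:
    """Split text on word boundaries into chunks of roughly chunk_size characters."""
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}".strip()
        if current and len(candidate) > chunk_size:
            # Adding this word would overflow the chunk, so start a new one.
            chunks.append(current)
            current = word
        else:
            current = candidate
    if current:
        chunks.append(current)
    # Each record mirrors the two Document attributes: page_content and metadata.
    return [
        {"page_content": chunk, "metadata": {"source": source, "chunk_index": i}}
        for i, chunk in enumerate(chunks)
    ]


long_text = (
    "Dogs are great companions. Cats are independent pets. "
    "Rabbits need plenty of space to hop around."
)
records = chunk_text(long_text, source="mammal-pets-doc")
for record in records:
    print(record)
```

Because each record has the keys `page_content` and `metadata`, it could be turned into a real object with `Document(**record)`.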

Let's generate some sample documents:

In [2]:
from langchain_core.documents import Document

documents = [
    Document(
        page_content="Dogs are great companions, known for their loyalty and friendliness.",
        metadata={"source": "mammal-pets-doc"},
    ),
    Document(
        page_content="Cats are independent pets that often enjoy their own space.",
        metadata={"source": "mammal-pets-doc"},
    ),
    Document(
        page_content="Goldfish are popular pets for beginners, requiring relatively simple care.",
        metadata={"source": "fish-pets-doc"},
    ),
    Document(
        page_content="Parrots are intelligent birds capable of mimicking human speech.",
        metadata={"source": "bird-pets-doc"},
    ),
    Document(
        page_content="Rabbits are social animals that need plenty of space to hop around.",
        metadata={"source": "mammal-pets-doc"},
    ),
]

In [3]:
documents

[Document(metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyalty and friendliness.'),
 Document(metadata={'source': 'mammal-pets-doc'}, page_content='Cats are independent pets that often enjoy their own space.'),
 Document(metadata={'source': 'fish-pets-doc'}, page_content='Goldfish are popular pets for beginners, requiring relatively simple care.'),
 Document(metadata={'source': 'bird-pets-doc'}, page_content='Parrots are intelligent birds capable of mimicking human speech.'),
 Document(metadata={'source': 'mammal-pets-doc'}, page_content='Rabbits are social animals that need plenty of space to hop around.')]

In [4]:
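import os

from dotenv import load_dotenv
from langchain_groq import ChatGroq

# Load API keys from a local .env file into the process environment.
load_dotenv()
groq_api_key = os.getenv("GROQ_API_KEY")

# Re-export the keys only when they are set: assigning None to os.environ
# raises a TypeError if a key is missing from the environment.
for key in ("HF_TOKEN", "OPENAI_API_KEY"):
    value = os.getenv(key)
    if value is not None:
        os.environ[key] = value

# Chat model served by Groq; it is used later in the RAG chain.
llm = ChatGroq(groq_api_key=groq_api_key, model="Llama3-8b-8192")
llm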

ChatGroq(client=<groq.resources.chat.completions.Completions object at 0x00000204E87B96D0>, async_client=<groq.resources.chat.completions.AsyncCompletions object at 0x00000204E87BA7B0>, model_name='Llama3-8b-8192', model_kwargs={}, groq_api_key=SecretStr('**********'))

In [5]:
from langchain_openai import OpenAIEmbeddings
embeddings = OpenAIEmbeddings(model="text-embedding-3-small")
embeddings

OpenAIEmbeddings(client=<openai.resources.embeddings.Embeddings object at 0x00000204E86B9790>, async_client=<openai.resources.embeddings.AsyncEmbeddings object at 0x00000204E87BD6D0>, model='text-embedding-3-small', dimensions=None, deployment='text-embedding-ada-002', openai_api_version=None, openai_api_base=None, openai_api_type=None, openai_proxy=None, embedding_ctx_length=8191, openai_api_key=SecretStr('**********'), openai_organization=None, allowed_special=None, disallowed_special=None, chunk_size=1000, max_retries=2, request_timeout=None, headers=None, tiktoken_enabled=True, tiktoken_model_name=None, show_progress_bar=False, model_kwargs={}, skip_empty=False, default_headers=None, default_query=None, retry_min_seconds=4, retry_max_seconds=20, http_client=None, http_async_client=None, check_embedding_ctx_length=True)

In [6]:
## VectorStores
from langchain_chroma import Chroma

vectorstore=Chroma.from_documents(documents,embedding=embeddings)
vectorstore


<langchain_chroma.vectorstores.Chroma at 0x204e87a5580>

In [7]:
vectorstore.similarity_search("cat")

[Document(id='460e038d-e1d1-4d99-8736-1088899ad0c2', metadata={'source': 'mammal-pets-doc'}, page_content='Cats are independent pets that often enjoy their own space.'),
 Document(id='8deebe60-37ab-4f5c-958d-3a40ab7fb08d', metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyalty and friendliness.'),
 Document(id='cf15ba07-1ab5-4f8e-90f7-ac44092d1d63', metadata={'source': 'mammal-pets-doc'}, page_content='Rabbits are social animals that need plenty of space to hop around.'),
 Document(id='669e3b6f-9b61-46c8-b128-d33fef78a287', metadata={'source': 'fish-pets-doc'}, page_content='Goldfish are popular pets for beginners, requiring relatively simple care.')]
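
Conceptually, a similarity search embeds the query, compares that vector with the stored document vectors, and returns the closest matches. The sketch below shows that idea with hand-written 3-dimensional vectors and cosine similarity. The vectors, `toy_index`, and `toy_similarity_search` are hypothetical; a real store such as Chroma uses model-generated embeddings, an indexed search, and its own distance metric.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical, hand-made "embeddings"; real ones come from an embedding model.
toy_index = {
    "Cats are independent pets.": [0.9, 0.1, 0.0],
    "Dogs are great companions.": [0.7, 0.3, 0.1],
    "Goldfish are popular pets for beginners.": [0.1, 0.9, 0.2],
}


def toy_similarity_search(query_vector: list[float], k: int = 2) -> list[str]:
    """Return the k texts whose vectors are most similar to the query vector."""
    ranked = sorted(
        toy_index,
        key=lambda text: cosine_similarity(query_vector, toy_index[text]),
        reverse=True,
    )
    return ranked[:k]


# Pretend this vector is the embedding of the query "cat".
top_matches = toy_similarity_search([1.0, 0.0, 0.0], k=2)
print(top_matches)
```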

#### `similarity_search("cat")`
This is a **synchronous** (regular) method call. It blocks the current execution until the search is complete.

- Used in traditional Python code (non-async functions)
- Blocking: the code **waits** here until results are returned

---

#### `await asimilarity_search("cat")`
This is the **asynchronous** version.

- Used inside `async def` functions, or at the top level of a notebook cell, since Jupyter runs cells inside an event loop
- Non-blocking: while the search is running, other operations (like I/O or API calls) can continue
- You must **`await`** it to get the result

---

### 💡 When to Use What?

| Use Case                             | Use Sync (`similarity_search`) | Use Async (`await asimilarity_search`) |
|-------------------------------------|------------------------------|---------------------------------------|
| Simple scripts or sync apps         | ✅                            | ❌                                     |
| Async web servers (FastAPI, etc.)   | ❌                            | ✅                                     |
| When running multiple I/O tasks     | ❌                            | ✅                                     |
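
The benefit of the async form shows up when several I/O-bound calls run at once. The sketch below uses only `asyncio` with a hypothetical `fake_search` coroutine standing in for `asimilarity_search`; the delay simulates waiting on a database or network call.

```python
import asyncio


async def fake_search(query: str, delay: float) -> str:
    """Hypothetical stand-in for an async search: waits, then returns a result."""
    await asyncio.sleep(delay)  # yields control so other tasks can run meanwhile
    return f"results for {query!r}"


async def run_searches() -> list[str]:
    # Both searches wait concurrently, so the total time is about one delay, not two.
    return await asyncio.gather(
        fake_search("cat", 0.05),
        fake_search("dog", 0.05),
    )


results = asyncio.run(run_searches())
print(results)
```

In a notebook, an event loop is already running, so you would write `await run_searches()` in a cell instead of calling `asyncio.run(...)`.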

In [8]:
## Async query
await vectorstore.asimilarity_search("cat")

[Document(id='460e038d-e1d1-4d99-8736-1088899ad0c2', metadata={'source': 'mammal-pets-doc'}, page_content='Cats are independent pets that often enjoy their own space.'),
 Document(id='8deebe60-37ab-4f5c-958d-3a40ab7fb08d', metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyalty and friendliness.'),
 Document(id='cf15ba07-1ab5-4f8e-90f7-ac44092d1d63', metadata={'source': 'mammal-pets-doc'}, page_content='Rabbits are social animals that need plenty of space to hop around.'),
 Document(id='669e3b6f-9b61-46c8-b128-d33fef78a287', metadata={'source': 'fish-pets-doc'}, page_content='Goldfish are popular pets for beginners, requiring relatively simple care.')]

In [9]:
vectorstore.similarity_search_with_score("cat")

[(Document(id='460e038d-e1d1-4d99-8736-1088899ad0c2', metadata={'source': 'mammal-pets-doc'}, page_content='Cats are independent pets that often enjoy their own space.'),
  1.240580439567566),
 (Document(id='8deebe60-37ab-4f5c-958d-3a40ab7fb08d', metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyalty and friendliness.'),
  1.550004005432129),
 (Document(id='cf15ba07-1ab5-4f8e-90f7-ac44092d1d63', metadata={'source': 'mammal-pets-doc'}, page_content='Rabbits are social animals that need plenty of space to hop around.'),
  1.6296751499176025),
 (Document(id='669e3b6f-9b61-46c8-b128-d33fef78a287', metadata={'source': 'fish-pets-doc'}, page_content='Goldfish are popular pets for beginners, requiring relatively simple care.'),
  1.7069001197814941)]
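
Score semantics differ between providers. The scores returned by Chroma above are distances, so a smaller number means a closer match, which is why the cat document comes first. A minimal sketch of filtering on such scores follows; the texts and rounded scores are copied from the output above, while `filter_by_distance` and the 1.6 cutoff are assumptions chosen for illustration.

```python
# (text, distance) pairs, rounded from the similarity_search_with_score output.
scored_results = [
    ("Cats are independent pets that often enjoy their own space.", 1.2406),
    ("Dogs are great companions, known for their loyalty and friendliness.", 1.5500),
    ("Rabbits are social animals that need plenty of space to hop around.", 1.6297),
    ("Goldfish are popular pets for beginners, requiring relatively simple care.", 1.7069),
]


def filter_by_distance(pairs, max_distance):
    """Keep pairs at or below max_distance, closest (smallest distance) first."""
    ordered = sorted(pairs, key=lambda pair: pair[1])
    return [(text, score) for text, score in ordered if score <= max_distance]


close_matches = filter_by_distance(scored_results, max_distance=1.6)
for text, score in close_matches:
    print(f"{score:.4f}  {text}")
```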

#### What is a Retriever?

In LangChain and LLM-based apps:

> **A Retriever is a component that fetches relevant documents (chunks of text) from a data source based on a user query.**

---

#### Why Use a Retriever?

LLMs don't have access to your data by default. If you want your chatbot to answer based on:
- Your PDFs
- Internal docs
- Product FAQs
- Knowledge base

You need to **retrieve** relevant chunks and feed them to the LLM, which is what **retrievers** do.

---

#### Retrievers vs. VectorStore

| Feature                     | VectorStore                          | Retriever                            |
|----------------------------|---------------------------------------|--------------------------------------|
| Purpose                    | Stores & indexes text embeddings     | Retrieves relevant text chunks       |
| LCEL Compatibility         | ❌ Not a Runnable                     | ✅ Is a Runnable                      |
| Usage in Chains            | Needs manual wrapping                | Plug-and-play in chains              |
| Common Method              | `similarity_search()`                | `invoke()` (`get_relevant_documents()` is deprecated) |
| Supports Async             | ✅ Yes (via `a`-prefixed methods such as `asimilarity_search()`) | ✅ Yes (via `.ainvoke()`) |

---

#### Wrapping VectorStore with Retriever (Example):

LangChain allows you to wrap `similarity_search` in a `RunnableLambda`, which gives you a Runnable that behaves like a simple retriever:

```python
from langchain_core.runnables import RunnableLambda

retriever = RunnableLambda(lambda x: vectorstore.similarity_search(x["query"]))
```

Now you can use it in a chain like:

```python
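# `some_prompt_template` and `model` are placeholders for a prompt and a chat
# model defined elsewhere; the chain input is a dict such as {"query": "cat"}.
chain = (
    {"docs": retriever, "question": lambda x: x["query"]}
    | some_prompt_template
    | model
)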
```

---

#### Why LangChain Retrievers are Powerful

- They are **Runnables**, so they can be:
  - Piped into chains
  - Composed functionally with other steps
  - Used with `|` (LCEL syntax)
- Support both **sync** and **async**
- Unified interface for **any retrieval logic** (not just vector-based!)
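
As a sketch of retrieval logic that is not vector-based, the function below ranks texts by how many words they share with the query. `keyword_retriever`, `tokenize`, and the three-item `corpus` are hypothetical names used for illustration; the code is plain Python so it runs without LangChain.

```python
import re

corpus = [
    "Dogs are great companions, known for their loyalty and friendliness.",
    "Cats are independent pets that often enjoy their own space.",
    "Parrots are intelligent birds capable of mimicking human speech.",
]


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_retriever(query: str, k: int = 1) -> list[str]:
    """Return up to k texts sharing at least one word with the query, best first."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(text)), text) for text in corpus]
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [text for score, text in ranked[:k] if score > 0]


matches = keyword_retriever("intelligent birds")
print(matches)
```

A function like this can be wrapped with `RunnableLambda(keyword_retriever)` in the same way as `similarity_search`, so the rest of a chain does not need to know how the documents were found.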

---

### 🔍 Summary

- **Retrievers** fetch relevant chunks from your data for the LLM to use.
- LangChain Retrievers are **Runnable-compatible**, perfect for LCEL chains.
- You can wrap your own method (`similarity_search`) into a retriever using `RunnableLambda`.

In [10]:
from langchain_core.runnables import RunnableLambda

# Wrap similarity_search as a Runnable and bind k=1 so each query returns only its top match.
retriever = RunnableLambda(vectorstore.similarity_search).bind(k=1)
retriever.batch(["cat", "dog"])

[[Document(id='460e038d-e1d1-4d99-8736-1088899ad0c2', metadata={'source': 'mammal-pets-doc'}, page_content='Cats are independent pets that often enjoy their own space.')],
 [Document(id='8deebe60-37ab-4f5c-958d-3a40ab7fb08d', metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyalty and friendliness.')]]

Vector stores implement an `as_retriever` method that generates a retriever, specifically a `VectorStoreRetriever`. These retrievers include `search_type` and `search_kwargs` attributes that identify which methods of the underlying vector store to call and how to parameterize them. For instance, we can replicate the above with the following:

In [11]:
retriever=vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k":1}
)
retriever.batch(["cat","dog"])


[[Document(id='460e038d-e1d1-4d99-8736-1088899ad0c2', metadata={'source': 'mammal-pets-doc'}, page_content='Cats are independent pets that often enjoy their own space.')],
 [Document(id='8deebe60-37ab-4f5c-958d-3a40ab7fb08d', metadata={'source': 'mammal-pets-doc'}, page_content='Dogs are great companions, known for their loyalty and friendliness.')]]
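
The role of `search_type` and `search_kwargs` can be pictured as a simple dispatch: the type picks a vector store method and the kwargs are forwarded to it. The sketch below is a simplified illustration of that idea, not LangChain's actual implementation. `FakeVectorStore` and `dispatch` are hypothetical, and only the `"similarity"` and `"mmr"` types are covered.

```python
class FakeVectorStore:
    """Hypothetical stand-in that reports which method was called and how."""

    def similarity_search(self, query, **kwargs):
        return ("similarity_search", query, kwargs)

    def max_marginal_relevance_search(self, query, **kwargs):
        return ("max_marginal_relevance_search", query, kwargs)


def dispatch(store, search_type, search_kwargs, query):
    """Pick the store method from search_type and forward search_kwargs to it."""
    if search_type == "similarity":
        return store.similarity_search(query, **search_kwargs)
    if search_type == "mmr":
        return store.max_marginal_relevance_search(query, **search_kwargs)
    raise ValueError(f"unsupported search_type: {search_type!r}")


store = FakeVectorStore()
similarity_call = dispatch(store, "similarity", {"k": 1}, "cat")
mmr_call = dispatch(store, "mmr", {"k": 1}, "cat")
print(similarity_call)
print(mmr_call)
```

With a real store, the equivalent switch is a one-line change, for example `vectorstore.as_retriever(search_type="mmr", search_kwargs={"k": 1})`.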

In [None]:
## RAG
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough

# Define a prompt template that formats a user's question with retrieved context.
# It uses a "human" message to simulate a human chat input to the model.
message = """
Answer this question using the provided context only.

{question}

Context:
{context}
"""
# Convert the message string into a ChatPromptTemplate for use with an LLM.
prompt = ChatPromptTemplate.from_messages([("human", message)])

# Create a RAG (Retrieval-Augmented Generation) chain:
# - "context" is retrieved using a retriever (typically a vectorstore retriever)
# - "question" is passed as-is using RunnablePassthrough()
# - The prompt then combines both into a final query
# - The output goes to the LLM
rag_chain = {"context": retriever, "question": RunnablePassthrough()} | prompt | llm

# Invoke the RAG chain with a user question.
# The retriever will fetch relevant documents, which will be passed to the prompt along with the question.
response = rag_chain.invoke("tell me about dogs")

# Print the model's response content.
print(response.content)



According to the context, dogs are great companions, known for their loyalty and friendliness.
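
In the chain above, the retriever returns a list of `Document` objects, and the prompt template inserts that list into `{context}` as its string representation, metadata included. A common refinement is to join only the `page_content` of each document before it reaches the prompt. The sketch below shows such a helper; `format_docs` is a hypothetical name, and `SimpleNamespace` objects stand in for real `Document` instances so the snippet runs on its own.

```python
from types import SimpleNamespace


def format_docs(docs) -> str:
    """Join the page_content of each document, separated by blank lines."""
    return "\n\n".join(doc.page_content for doc in docs)


# Stand-ins for retrieved Document objects (only page_content and metadata are mimicked).
retrieved = [
    SimpleNamespace(
        page_content="Dogs are great companions, known for their loyalty and friendliness.",
        metadata={"source": "mammal-pets-doc"},
    ),
    SimpleNamespace(
        page_content="Cats are independent pets that often enjoy their own space.",
        metadata={"source": "mammal-pets-doc"},
    ),
]

context_text = format_docs(retrieved)
print(context_text)
```

In an LCEL chain, the helper can be piped after the retriever, for example `{"context": retriever | format_docs, "question": RunnablePassthrough()}`, because a plain function is coerced to a Runnable when piped with one.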
