## Retrieval

Retrievers fetch data from different sources, such as vector databases and Elasticsearch. Examples include the BM25 retriever, `SVMRetriever`, `TFIDFRetriever`, and `MultiQueryRetriever`.


### 1. VectorStoreRetriever
- **Description**: This retriever fetches data from a vector database by finding the nearest neighbours to a query using vector similarity.
- **Use case**: Works well with vector databases such as FAISS, Pinecone, Weaviate, and Chroma.

- **e.g.**: `from langchain.vectorstores import FAISS`

### 2. ElasticSearchBM25Retriever
- **Description**: A retriever that uses Elasticsearch's BM25 ranking algorithm to retrieve documents based on TF (term frequency) and IDF (inverse document frequency) weighting.

- **Use case**: Works best for keyword-based retrieval tasks.
- **e.g.**: `from langchain.retrievers import ElasticSearchBM25Retriever`


Here are the various retrievers available in LangChain along with their details:

### 1. **`VectorStoreRetriever`**
   - **Description**: The most common retriever in LangChain. It retrieves documents from a vector store by finding the nearest neighbors to a query using vector similarity.
   - **Use Case**: Works well with vector databases like FAISS, Pinecone, Weaviate, and Chroma, typically used for document retrieval based on semantic similarity.
   - **Example**:
     ```python
     from langchain.vectorstores import FAISS
     # `embeddings` must be the same embedding model that was used to build the index
     retriever = FAISS.load_local("faiss_index", embeddings).as_retriever()
     results = retriever.get_relevant_documents(query)
     ```
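
To make the nearest-neighbor idea concrete, here is a minimal, library-free sketch of what a vector store does conceptually at query time. The toy three-dimensional vectors and the `top_k` helper are illustrative assumptions, not part of LangChain; real stores use learned embeddings and usually approximate indexes.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents most similar to the query."""
    ranked = sorted(
        doc_vecs,
        key=lambda doc_id: cosine_similarity(query_vec, doc_vecs[doc_id]),
        reverse=True,
    )
    return ranked[:k]

# Hypothetical toy embeddings (hand-written, not produced by a model).
doc_vecs = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "stocks": [0.0, 0.2, 0.9],
}
nearest = top_k([1.0, 0.0, 0.0], doc_vecs, k=2)
```

The query vector points in almost the same direction as the `cats` and `dogs` vectors, so those two ids are returned ahead of `stocks`.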

### 2. **`ElasticSearchBM25Retriever`**
   - **Description**: A retriever that uses Elasticsearch's BM25 ranking function, which scores documents by term frequency and inverse document frequency, with term-frequency saturation and document-length normalization.
   - **Use Case**: Best for keyword-based retrieval tasks, especially where Elasticsearch is the underlying search engine.
   - **Example**:
     ```python
     from langchain.retrievers import ElasticSearchBM25Retriever
     # `create` sets up a new BM25-configured index; add documents with `add_texts` before querying
     retriever = ElasticSearchBM25Retriever.create("http://localhost:9200", "documents")
     results = retriever.get_relevant_documents(query)
     ```
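
For intuition about what BM25 computes, the following is a rough pure-Python sketch of the Okapi BM25 formula. It is not how Elasticsearch is implemented; the whitespace tokenizer, the function name `bm25_scores`, and the defaults `k1=1.2`, `b=0.75` are assumptions chosen for illustration.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document in `docs` against `query` with Okapi BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Rare terms get a higher IDF weight.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            # Term frequency saturates via k1; b controls length normalization.
            length_norm = 1 - b + b * len(toks) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores

docs = [
    "the cat sat on the mat",
    "dogs chase the cat",
    "stock markets fell sharply today",
]
scores = bm25_scores("cat mat", docs)
best = max(range(len(docs)), key=lambda i: scores[i])
```

The first document matches both query terms, including the rarer term `mat`, so it receives the highest score; the third document matches nothing and scores zero.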

### 3. **`SVMRetriever`**
   - **Description**: A support vector machine (SVM)-based retriever. It embeds the documents and, at query time, fits an SVM with the query as the only positive example and the documents as negatives, then ranks documents by their score under that model.
   - **Use Case**: An alternative to plain nearest-neighbor search over embeddings for small, in-memory collections. No labeled relevance data is required.
   - **Example**:
     ```python
     from langchain.retrievers import SVMRetriever
     # `texts` is a list of strings; `embeddings` is any LangChain embedding model
     retriever = SVMRetriever.from_texts(texts, embeddings)
     results = retriever.get_relevant_documents(query)
     ```

### 4. **`TFIDFRetriever`**
   - **Description**: A classic information retrieval retriever based on Term Frequency-Inverse Document Frequency (TF-IDF) weighting. It ranks documents based on their relevance to the query terms.
   - **Use Case**: Suitable for small- to medium-sized document collections or when you don’t have access to vector stores.
   - **Example**:
     ```python
     from langchain.retrievers import TFIDFRetriever
     retriever = TFIDFRetriever.from_documents(documents)
     results = retriever.get_relevant_documents(query)
     ```

### 5. **`MultiQueryRetriever`**
   - **Description**: Generates multiple variations of a query to improve the chances of retrieving relevant documents by diversifying the search space.
   - **Use Case**: Works well when a single query might not retrieve diverse enough results, helping improve recall.
   - **Example**:
     ```python
     from langchain.retrievers import MultiQueryRetriever
     # `llm` writes the query variations; `base_retriever` runs each of them
     retriever = MultiQueryRetriever.from_llm(retriever=base_retriever, llm=llm)
     results = retriever.get_relevant_documents(query)
     ```
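
The core pattern behind multi-query retrieval can be sketched without any LLM: run each query variant through a base retriever and keep the de-duplicated union. In this sketch the variants are hard-coded and `keyword_retrieve` is a hypothetical stand-in for a real retriever; both are assumptions for illustration only.

```python
def multi_query_retrieve(query_variants, retrieve):
    """Run every query variant and return the de-duplicated union of results."""
    seen = set()
    merged = []
    for variant in query_variants:
        for doc in retrieve(variant):
            if doc not in seen:  # keep the first occurrence, preserve order
                seen.add(doc)
                merged.append(doc)
    return merged

# Tiny in-memory corpus used by the stand-in retriever below.
corpus = [
    "How to fine-tune a language model",
    "Training tips for neural networks",
    "Fine-tune hyperparameters for training",
]

def keyword_retrieve(query):
    """Hypothetical base retriever: plain substring match on query terms."""
    terms = query.lower().split()
    return [doc for doc in corpus if any(t in doc.lower() for t in terms)]

# An LLM would normally write these rephrasings; here they are hard-coded.
variants = ["fine-tune", "training"]
docs = multi_query_retrieve(variants, keyword_retrieve)
```

Neither variant alone finds all three documents, but their union does, which is the recall benefit described above.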

### 6. **`TimeWeightedVectorStoreRetriever`**
   - **Description**: Retrieves documents by combining semantic similarity with a time-decay term, so that recently accessed documents score higher than stale ones. The `decay_rate` parameter controls how quickly that recency bonus fades.
   - **Use Case**: Ideal for use cases where recency matters, such as news or time-sensitive data retrieval.
   - **Example**:
     ```python
     from langchain.retrievers import TimeWeightedVectorStoreRetriever
     retriever = TimeWeightedVectorStoreRetriever(vectorstore=vectorstore, decay_rate=0.5)
     results = retriever.get_relevant_documents(query)
     ```
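
As a sketch of how time weighting can work, the LangChain documentation describes the score as `semantic_similarity + (1.0 - decay_rate) ** hours_passed`, where the hours are counted since the document was last accessed. The helper name and the sample numbers below are assumptions for illustration.

```python
def time_weighted_score(semantic_similarity, hours_since_last_access, decay_rate):
    """Combine similarity with an exponentially decaying recency bonus."""
    recency = (1.0 - decay_rate) ** hours_since_last_access
    return semantic_similarity + recency

# A slightly less similar but recently accessed document...
fresh = time_weighted_score(0.70, hours_since_last_access=1, decay_rate=0.5)
# ...versus a more similar document last accessed a day ago.
stale = time_weighted_score(0.80, hours_since_last_access=24, decay_rate=0.5)
```

With a high `decay_rate` such as 0.5, the recency bonus is almost gone after a day, so the fresh document outranks the stale one despite its lower similarity. A `decay_rate` near 0 makes the bonus nearly constant, which reduces the retriever to ordinary similarity ranking.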

### 7. **`SelfQueryRetriever`**
   - **Description**: Uses an LLM to turn a natural-language query into a structured query: a search string plus filters on document metadata, which improves search precision.
   - **Use Case**: Useful when working with structured data, such as filtering documents by date or type while maintaining flexibility.
   - **Example**:
     ```python
     from langchain.retrievers import SelfQueryRetriever
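     # `metadata_field_info` describes the filterable metadata fields; `document_content_description` is a short string
     retriever = SelfQueryRetriever.from_llm(llm, vectorstore, document_content_description, metadata_field_info)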
     results = retriever.get_relevant_documents(query)
     ```
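
The idea of a structured query can be illustrated with a small stand-alone sketch. The `ToyStructuredQuery` class and `apply_filters` helper are hypothetical and support only equality filters; in a real self-query setup the LLM produces the structured query and the vector store applies richer comparisons.

```python
from dataclasses import dataclass, field

@dataclass
class ToyStructuredQuery:
    """Stand-in for LLM output: search text plus metadata filters."""
    query: str
    filters: dict = field(default_factory=dict)

def apply_filters(docs, structured):
    """Keep documents whose metadata matches every filter exactly."""
    return [
        doc for doc in docs
        if all(doc["metadata"].get(key) == value
               for key, value in structured.filters.items())
    ]

docs = [
    {"text": "Q3 earnings report", "metadata": {"type": "report", "year": 2023}},
    {"text": "Q3 planning memo", "metadata": {"type": "memo", "year": 2023}},
    {"text": "Q3 earnings report", "metadata": {"type": "report", "year": 2021}},
]

# Hard-coded here; an LLM might derive this from "2023 reports about Q3 earnings".
structured = ToyStructuredQuery(
    query="Q3 earnings",
    filters={"type": "report", "year": 2023},
)
matches = apply_filters(docs, structured)
```

Only the 2023 report survives the filter; the semantic search for `structured.query` would then run over that reduced set.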

### 8. **`MergerRetriever`**
   - **Description**: Combines results from multiple retrievers into one unified result. Each retriever contributes to the final result set.
   - **Use Case**: Effective when working with multiple retrieval strategies and sources, merging the results for the best overall relevance.
   - **Example**:
     ```python
     from langchain.retrievers import MergerRetriever
     retriever = MergerRetriever(retrievers=[retriever1, retriever2])
     results = retriever.get_relevant_documents(query)
     ```
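
One simple way to merge ranked result lists is round-robin interleaving, so that each retriever's top hits appear early in the combined list. The sketch below shows that pattern under the assumption that results are plain strings; the `interleave` helper is hypothetical and performs no de-duplication.

```python
from itertools import zip_longest

def interleave(*result_lists):
    """Merge ranked lists round-robin: all first hits, then all second hits, and so on."""
    missing = object()  # sentinel marking lists that have run out of results
    merged = []
    for tier in zip_longest(*result_lists, fillvalue=missing):
        merged.extend(doc for doc in tier if doc is not missing)
    return merged

keyword_hits = ["doc_a", "doc_b", "doc_c"]
semantic_hits = ["doc_x", "doc_y"]
merged = interleave(keyword_hits, semantic_hits)
```

Here `merged` alternates between the two sources until the shorter list is exhausted, then continues with the remaining hits from the longer one.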

### 9. **`ContextualCompressionRetriever`**
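   - **Description**: Passes the documents returned by a base retriever through a compressor that shortens or filters them using the query as context (for example, by extracting only the relevant passages), so the results fit within a specific token or content length.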
   - **Use Case**: Works well when dealing with long documents that need to be shortened for model input.
   - **Example**:
     ```python
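     from langchain.retrievers import ContextualCompressionRetriever
     from langchain.retrievers.document_compressors import LLMChainExtractor
     compressor = LLMChainExtractor.from_llm(llm)  # uses `llm` to extract the relevant passages
     retriever = ContextualCompressionRetriever(base_compressor=compressor, base_retriever=base_retriever)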
     results = retriever.get_relevant_documents(query)
     ```
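
To show what query-aware compression means in the simplest terms, here is a toy compressor that keeps only the sentences sharing a word with the query. This keyword filter is an assumed stand-in for illustration; real compressors typically rely on an LLM or on embeddings rather than exact word overlap.

```python
def compress(document, query):
    """Keep only the sentences that share at least one word with the query."""
    query_terms = set(query.lower().split())
    sentences = [s.strip() for s in document.split(".") if s.strip()]
    kept = [s for s in sentences if query_terms & set(s.lower().split())]
    return (". ".join(kept) + ".") if kept else ""

document = (
    "Paris is the capital of France. "
    "The weather is mild in spring. "
    "France borders Spain and Italy."
)
compressed = compress(document, "france capital")
```

The sentence about the weather is dropped because it has no overlap with the query, leaving a shorter text to pass to the model.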

### 10. **`KNNRetriever`**
   - **Description**: A basic nearest-neighbors retriever that retrieves documents based on similarity in embedding space, typically using cosine similarity.
   - **Use Case**: Simple and efficient, used when working with vector-based document retrieval tasks.
   - **Example**:
     ```python
     from langchain.retrievers import KNNRetriever
     # Builds an in-memory index from `texts` using the given `embeddings` model
     retriever = KNNRetriever.from_texts(texts, embeddings)
     results = retriever.get_relevant_documents(query)
     ```

---

### Summary:
- **Vector-based Retrievers** (`VectorStoreRetriever`, `KNNRetriever`) are commonly used for semantic search based on document embeddings.
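- **TF-IDF and BM25-based Retrievers** (`TFIDFRetriever`, `ElasticSearchBM25Retriever`) focus on keyword-based retrieval. The in-memory `TFIDFRetriever` suits smaller collections, while the Elasticsearch-backed retriever can scale with the cluster.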
- **Adaptive and Contextual Retrievers** (`MultiQueryRetriever`, `ContextualCompressionRetriever`, `SelfQueryRetriever`) enhance retrieval by refining or compressing queries and results.
- **Merging and Learning-based Retrievers** (`MergerRetriever`, `SVMRetriever`) offer more complex retrieval strategies, combining methods or using learning models.

Each retriever is designed to fit specific use cases, such as semantic search, keyword search, or structured data retrieval, allowing you to choose the most suitable one for your application.