# 🤗 HuggingFace Embeddings with LangChain

## 📚 What is HuggingFace?

**HuggingFace** is the leading platform for open-source AI models. It hosts thousands of pre-trained models, including excellent embedding models!

### 🌟 Why HuggingFace Embeddings?

| Advantage | Description |
|-----------|-------------|
| 🆓 **Free & Open** | Most models are free to use |
| 🏠 **Local Execution** | Run on your own hardware |
| 🎯 **Variety** | Thousands of specialized models |
| 🔬 **Research-Grade** | State-of-the-art models |
| 🌍 **Community** | Active development & support |

### 🧠 Sentence Transformers

HuggingFace's **sentence-transformers** is a Python framework for producing high-quality sentence embeddings. These models are specifically trained to capture semantic meaning.

---

## 📦 In This Notebook:
1. Setting up HuggingFace token
2. Initializing Sentence Transformer embeddings
3. Embedding single queries
4. Embedding multiple documents
5. Comparing different models

Let's explore open-source embeddings! 🚀

## 🔧 Step 1: Environment Setup

Load environment variables to access your HuggingFace token (needed for some gated models).

> 💡 **Note**: Many embedding models don't require a token, but it's good practice to set it up!

In [1]:
import os
from dotenv import load_dotenv
load_dotenv()  #load all the environment variables

True

In [2]:
if os.getenv("HF_TOKEN"):  # only set when present; assigning None to os.environ raises TypeError
    os.environ['HF_TOKEN']=os.getenv("HF_TOKEN")
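
Because `os.environ` only accepts string values, it helps to treat a missing token as a normal case rather than an error. The following is a minimal sketch of a defensive lookup; the helper name `get_hf_token` is hypothetical and not part of any library.

```python
import os
from typing import Optional


def get_hf_token(var_name: str = "HF_TOKEN") -> Optional[str]:
    """Return the token from the environment, or None if it is missing or blank."""
    token = os.getenv(var_name, "").strip()
    return token or None


token = get_hf_token()
if token is None:
    # Public embedding models can still be downloaded without a token.
    print("No HF_TOKEN found; continuing without authentication.")
else:
    print("HF_TOKEN loaded.")
```

Returning `None` instead of raising keeps the notebook usable for the many public models that need no authentication.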


## 🎨 Step 2: Understanding Sentence Transformers

**Sentence Transformers** are neural network models trained specifically for creating sentence embeddings. They're based on transformer architectures (like BERT) but fine-tuned for semantic similarity.

### 🏆 Popular Sentence Transformer Models:

| Model | Dimensions | Speed | Quality | Best For |
|-------|------------|-------|---------|----------|
| `all-MiniLM-L6-v2` | 384 | ⚡⚡⚡ Fast | Good | General purpose, quick prototyping |
| `all-mpnet-base-v2` | 768 | ⚡⚡ Medium | Better | Production applications |
| `multi-qa-mpnet-base-dot-v1` | 768 | ⚡⚡ Medium | Best for QA | Question-answering systems |
| `paraphrase-multilingual-MiniLM-L12-v2` | 384 | ⚡⚡ Medium | Good | Multilingual applications |

> 💡 **Recommendation**: Start with `all-MiniLM-L6-v2` - it's fast and good enough for most use cases!
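
The speed and dimension columns above are easiest to trust when measured on your own machine. Here is a hedged sketch of a small comparison loop; `compare_models` and `make_hf_embedder` are hypothetical helper names, and the only library call assumed is `HuggingFaceEmbeddings(model_name=...)` with its `embed_query()` method.

```python
import time
from typing import Callable, Dict, List


def compare_models(
    model_names: List[str],
    make_embedder: Callable[[str], object],
    sample_text: str,
) -> Dict[str, Dict[str, float]]:
    """Embed one sample with each model and record vector size and wall-clock time."""
    results = {}
    for name in model_names:
        embedder = make_embedder(name)  # model load time is not included in the timing
        start = time.perf_counter()
        vector = embedder.embed_query(sample_text)
        elapsed = time.perf_counter() - start
        results[name] = {"dimensions": len(vector), "seconds": elapsed}
    return results


def make_hf_embedder(model_name: str):
    """Factory for real models; imported lazily so compare_models has no heavy dependency."""
    from langchain_huggingface import HuggingFaceEmbeddings

    return HuggingFaceEmbeddings(model_name=model_name)
```

Calling `compare_models(["all-MiniLM-L6-v2", "all-mpnet-base-v2"], make_hf_embedder, "a sample sentence")` downloads each model on first use, so expect the first run to be slow. A single timed call is only a rough indicator; averaging several calls gives a steadier number.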

## 🚀 Step 3: Initialize HuggingFace Embeddings

Let's create embeddings using the popular `all-MiniLM-L6-v2` model. The first time you run this, it will download the model (~80MB).

### ⚙️ What Happens Under the Hood:
```
1. Model downloads from HuggingFace Hub
2. Loads into memory (GPU if available, else CPU)
3. Ready to embed text!
```

In [3]:
from langchain_huggingface import HuggingFaceEmbeddings
embeddings=HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")

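
If you want to control where the model runs rather than rely on automatic device selection, the constructor also accepts `model_kwargs` and `encode_kwargs`. The sketch below is one possible configuration, not the notebook's own setup; the helper names `build_embedding_config` and `make_embeddings` are hypothetical, and the chosen values (`"cpu"`, normalized vectors) are assumptions for illustration.

```python
from typing import Any, Dict


def build_embedding_config(device: str = "cpu", normalize: bool = True) -> Dict[str, Any]:
    """Collect constructor arguments for HuggingFaceEmbeddings in one place."""
    return {
        # Full Hub repository id of the model used in this notebook.
        "model_name": "sentence-transformers/all-MiniLM-L6-v2",
        # Forwarded to the underlying SentenceTransformer constructor.
        "model_kwargs": {"device": device},
        # Forwarded to encode(); unit-length vectors make dot product equal cosine similarity.
        "encode_kwargs": {"normalize_embeddings": normalize},
    }


def make_embeddings(device: str = "cpu", normalize: bool = True):
    """Build the embeddings object; the import is lazy so the config helper stays lightweight."""
    from langchain_huggingface import HuggingFaceEmbeddings

    return HuggingFaceEmbeddings(**build_embedding_config(device, normalize))
```

Pass `device="cuda"` to `make_embeddings` if a GPU is available and you want to request it explicitly.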


## 🔍 Step 4: Embed a Single Query

Use `embed_query()` to convert a single piece of text into a vector. This is typically used for search queries.

> 📝 **Fun Fact**: The `all-MiniLM-L6-v2` model produces 384-dimensional vectors - much smaller than the 3072 dimensions of OpenAI's `text-embedding-3-large`, but still very effective!

In [4]:
text="this is a test documents for embeddings"
query_result=embeddings.embed_query(text)
query_result


[-0.06053099408745766,
 -0.010631285607814789,
 0.025930525735020638,
 0.006360912695527077,
 0.05023776367306709,
 0.03835088014602661,
 -0.026163214817643166,
 -0.019225243479013443,
 0.0012484948383644223,
 -0.017975492402911186,
 0.029211804270744324,
 0.014246052131056786,
 0.04009963944554329,
 0.03416895121335983,
 -0.07864794135093689,
 0.027394406497478485,
 0.06308652460575104,
 0.04012323170900345,
 -0.007963345386087894,
 0.03336441516876221,
 -0.016045507043600082,
 0.03942833095788956,
 0.06054949015378952,
 -0.08063256740570068,
 0.014625649899244308,
 -0.023371852934360504,
 -0.09043547511100769,
 0.02662445232272148,
 0.07902538031339645,
 -0.05685008689761162,
 0.09639785438776016,
 0.010565375909209251,
 0.023504991084337234,
 0.10497066378593445,
 0.04507555440068245,
 0.0556306391954422,
 0.043947722762823105,
 0.003767396556213498,
 -0.021466944366693497,
 0.05180206149816513,
 -0.0078067355789244175,
 -0.020863603800535202,
 -0.012691565789282322,
 0.052762858569...
 (output truncated; the full vector has 384 values)

In [5]:
len(query_result)

384

### 📏 Vector Dimension Analysis

With 384 dimensions, this model is:
- **2x smaller** than `all-mpnet-base-v2` (768)
- **8x smaller** than OpenAI's `text-embedding-3-large` (3072)

This makes it ideal for applications where speed and storage matter!
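
To make the storage impact concrete, here is a back-of-the-envelope calculation. It assumes each value is stored as a 4-byte float32 and ignores any index or metadata overhead; the function name `index_size_mb` is hypothetical.

```python
BYTES_PER_FLOAT32 = 4  # assumption: vectors are stored as float32


def index_size_mb(num_vectors: int, dimensions: int) -> float:
    """Raw storage for the vectors alone, in megabytes (1 MB = 1,000,000 bytes)."""
    return num_vectors * dimensions * BYTES_PER_FLOAT32 / 1_000_000


models = [
    ("all-MiniLM-L6-v2", 384),
    ("all-mpnet-base-v2", 768),
    ("text-embedding-3-large", 3072),
]
for name, dims in models:
    print(f"{name}: {index_size_mb(1_000_000, dims):,.0f} MB per million vectors")
```

Under these assumptions, one million vectors take 1,536 MB at 384 dimensions, 3,072 MB at 768, and 12,288 MB at 3072.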

## 📄 Step 5: Embed Multiple Documents

Use `embed_documents()` to embed multiple texts at once. This is more efficient than calling `embed_query()` multiple times.

### 🔄 Batch Processing Benefits:
- ⚡ **Parallelization** - Process multiple texts simultaneously
- 🧠 **Memory Efficient** - Optimized batch operations
- ⏱️ **Time Savings** - Reduced overhead per embedding
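
For a large corpus you may not want to pass every text in a single call. The sketch below feeds texts to `embed_documents()` in fixed-size chunks; `batched` and `embed_in_batches` are hypothetical helpers, `embedder` is assumed to be any object exposing `embed_documents()`, and the batch size of 64 is an arbitrary starting point.

```python
from typing import Iterator, List


def batched(texts: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of at most batch_size texts."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(texts), batch_size):
        yield texts[start:start + batch_size]


def embed_in_batches(embedder, texts: List[str], batch_size: int = 64) -> List[List[float]]:
    """Embed texts chunk by chunk, preserving the input order."""
    vectors: List[List[float]] = []
    for batch in batched(texts, batch_size):
        vectors.extend(embedder.embed_documents(batch))
    return vectors
```

With the object from Step 3, this would be called as `embed_in_batches(embeddings, my_texts)`. The underlying library is understood to batch internally as well, so an outer loop like this is mainly useful for reporting progress or saving results incrementally.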

In [6]:
doc_result = embeddings.embed_documents([text, "This is not a test document."])
doc_result[0]

[-0.06053099408745766,
 -0.010631285607814789,
 0.025930525735020638,
 0.006360912695527077,
 0.05023776367306709,
 0.03835088014602661,
 -0.026163214817643166,
 -0.019225243479013443,
 0.0012484948383644223,
 -0.017975492402911186,
 0.029211804270744324,
 0.014246052131056786,
 0.04009963944554329,
 0.03416895121335983,
 -0.07864794135093689,
 0.027394406497478485,
 0.06308652460575104,
 0.04012323170900345,
 -0.007963345386087894,
 0.03336441516876221,
 -0.016045507043600082,
 0.03942833095788956,
 0.06054949015378952,
 -0.08063256740570068,
 0.014625649899244308,
 -0.023371852934360504,
 -0.09043547511100769,
 0.02662445232272148,
 0.07902538031339645,
 -0.05685008689761162,
 0.09639785438776016,
 0.010565375909209251,
 0.023504991084337234,
 0.10497066378593445,
 0.04507555440068245,
 0.0556306391954422,
 0.043947722762823105,
 0.003767396556213498,
 -0.021466944366693497,
 0.05180206149816513,
 -0.0078067355789244175,
 -0.020863603800535202,
 -0.012691565789282322,
 0.052762858569...
 (output truncated; the full vector has 384 values)
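
Raw vectors become useful once you compare them. A common choice is cosine similarity, sketched here in plain Python so it runs without any model; `cosine_similarity` and `rank_documents` are hypothetical helper names, and the toy 3-dimensional vectors stand in for real 384-dimensional embeddings.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_vector: Sequence[float], doc_vectors: Sequence[Sequence[float]]
) -> List[Tuple[int, float]]:
    """Return (document index, score) pairs sorted from most to least similar."""
    scores = [(i, cosine_similarity(query_vector, v)) for i, v in enumerate(doc_vectors)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Toy vectors: the second document points almost the same way as the query.
query = [1.0, 0.0, 0.0]
docs = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.0]]
print(rank_documents(query, docs))
```

In this notebook, `rank_documents(query_result, doc_result)` should place index 0 first, since that document is the same text as the query.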

## 📝 Summary

### 🎓 What We Learned:

1. **HuggingFace** provides thousands of free, open-source embedding models
2. **Sentence Transformers** are specialized models for semantic embeddings
3. **all-MiniLM-L6-v2** is a fast, lightweight model (384 dimensions)
4. **embed_query()** - For single texts (search queries)
5. **embed_documents()** - For batch processing multiple texts
6. **Local Execution** - No API costs, full privacy

### 📊 Comparing All Three Approaches:

| Feature | OpenAI | Ollama | HuggingFace |
|---------|--------|--------|-------------|
| 💰 **Cost** | Pay per use | Free | Free |
| 🔒 **Privacy** | Cloud | Local | Local |
| 📊 **Quality** | Excellent | Good | Very Good |
| ⚡ **Speed** | Fast (API) | Hardware dependent | Hardware dependent |
| 🎯 **Ease of Use** | Very Easy | Easy | Easy |
| 🔧 **Customization** | Limited | Many models | Thousands of models |

### 🎯 When to Use Each:

- **OpenAI**: Production apps, best quality, don't mind costs
- **Ollama**: Privacy-focused, offline apps, development
- **HuggingFace**: Balance of quality & cost, specific use cases

### 🔗 Next Steps:
- Try different HuggingFace models for your use case
- Benchmark models on your specific data
- Build a complete RAG pipeline with vector stores

---
📚 **Related Notebooks**:
- [4.1 - OpenAI Embeddings](4.1-embedding.ipynb) - Cloud-based, highest quality
- [4.2 - Ollama Embeddings](4.2-ollamaemnedding.ipynb) - Local LLM embeddings

### 📚 Resources:
- [HuggingFace Model Hub](https://huggingface.co/models?library=sentence-transformers)
- [Sentence Transformers Documentation](https://www.sbert.net/)
- [MTEB Leaderboard](https://huggingface.co/spaces/mteb/leaderboard) - Embedding model benchmarks