# Qdrant FastEmbed

[FastEmbed](https://qdrant.github.io/fastembed/) is a lightweight, fast Python library for generating embeddings. Its main features are:

- Quantized model weights
- ONNX Runtime, no PyTorch dependency
- CPU-first design
- Data parallelism for encoding large datasets

## Dependencies

To use FastEmbed with LangChain, install the `fastembed` Python package.

In [None]:
%pip install fastembed

## Imports

In [2]:
from langchain.embeddings.fastembed import FastEmbedEmbeddings

## Instantiating FastEmbed
   
### Parameters
- `model_name: str` (default: "BAAI/bge-small-en-v1.5")
    > Name of the FastEmbed model to use. You can find the list of supported models [here](https://qdrant.github.io/fastembed/examples/Supported_Models/).

- `max_length: int` (default: 512)
    > The maximum number of tokens per input text. Behavior is undefined for values above 512.

- `cache_dir: Optional[str]`
    > The path to the cache directory. Defaults to `local_cache` in the parent directory.

- `threads: Optional[int]`
    > The number of threads a single onnxruntime session can use. Defaults to `None`.

- `doc_embed_type: Literal["default", "passage"]` (default: "default")
    > "default": Uses FastEmbed's default embedding method.
    
    > "passage": Prefixes the text with "passage" before embedding.

In [None]:
embeddings = FastEmbedEmbeddings()
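
The cell above relies on the defaults. As a minimal sketch, the parameters listed above can also be passed explicitly. The `cache_dir` path and the `threads` value below are illustrative assumptions, and `make_embeddings` is a hypothetical helper, not part of LangChain. The import sits inside the helper so that defining it does not load a model.

```python
# Keyword arguments mirroring the parameters documented above.
fastembed_kwargs = {
    "model_name": "BAAI/bge-small-en-v1.5",  # documented default
    "max_length": 512,                       # documented default
    "cache_dir": "./fastembed_cache",        # hypothetical path, chosen for illustration
    "threads": 4,                            # illustrative value; the default is None
    "doc_embed_type": "passage",             # "default" or "passage"
}


def make_embeddings(**overrides):
    """Hypothetical helper: build FastEmbedEmbeddings from the settings above."""
    # Imported lazily so this cell runs even before the model is needed.
    from langchain.embeddings.fastembed import FastEmbedEmbeddings

    return FastEmbedEmbeddings(**{**fastembed_kwargs, **overrides})
```

Calling `make_embeddings()` should behave like the cell above with these settings applied, and `make_embeddings(threads=1)` overrides a single value.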

## Usage

### Generating document embeddings

In [None]:
document_embeddings = embeddings.embed_documents(["This is a document", "This is some other document"])
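
`embed_documents` returns one vector (a list of floats) per input text. A quick sanity check is to confirm how many vectors came back and that they all share one dimension. The sketch below uses a hypothetical helper, `embedding_shape`, and toy vectors in place of real model output so that it runs without downloading a model.

```python
from typing import List, Tuple


def embedding_shape(vectors: List[List[float]]) -> Tuple[int, int]:
    """Return (number_of_vectors, dimension); raise if the dimensions differ."""
    if not vectors:
        return (0, 0)
    dim = len(vectors[0])
    for i, vec in enumerate(vectors):
        if len(vec) != dim:
            raise ValueError(f"vector {i} has length {len(vec)}, expected {dim}")
    return (len(vectors), dim)


# Toy stand-in for `document_embeddings`; real vectors are much longer.
toy_document_embeddings = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
shape = embedding_shape(toy_document_embeddings)
print(shape)
```

With real output, `embedding_shape(document_embeddings)` should report two vectors for the two documents above.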

### Generating query embeddings

In [None]:
query_embeddings = embeddings.embed_query("This is a query")
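
`embed_query` returns a single vector, which is typically compared against document vectors to find the closest match. The sketch below ranks documents by cosine similarity using only the standard library. The helper names `cosine_similarity` and `rank_documents` are hypothetical, and the toy vectors stand in for real embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as having no similarity
    return dot / (norm_a * norm_b)


def rank_documents(query_vec: Sequence[float],
                   doc_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return document indices ordered from most to least similar."""
    scores = [cosine_similarity(query_vec, d) for d in doc_vecs]
    return sorted(range(len(doc_vecs)), key=lambda i: scores[i], reverse=True)


# Toy vectors standing in for `query_embeddings` and `document_embeddings`.
toy_query = [1.0, 0.0]
toy_docs = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
ranking = rank_documents(toy_query, toy_docs)
print(ranking)  # [1, 2, 0]
```

With real output, `rank_documents(query_embeddings, document_embeddings)` gives the documents ordered by similarity to the query.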