### Fastembed Multi-GPU Tutorial
This tutorial shows how to use multi-GPU support in Fastembed. Fastembed can embed both text and images on GPUs, and the sections below walk through running each kind of model across several GPUs.

#### Prerequisites
To get started, make sure you have the following:
- Python 3.8 or later
- Fastembed with GPU support (`pip install fastembed-gpu`)
  - See [this tutorial](https://github.com/qdrant/fastembed/blob/main/docs/examples/FastEmbed_GPU.ipynb) if you run into issues with GPU dependencies
- Access to a multi-GPU server
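
Before running the examples, it can help to confirm that the GPU dependencies are actually visible to Python. The sketch below assumes that Fastembed runs its models through ONNX Runtime and that the `fastembed-gpu` package brings in a CUDA-enabled build of it. The helper name `cuda_provider_available` is hypothetical and not part of Fastembed.

```python
def cuda_provider_available() -> bool:
    """Return True if ONNX Runtime reports the CUDA execution provider.

    Hypothetical helper for illustration; it is not part of Fastembed.
    """
    try:
        import onnxruntime as ort
    except ImportError:
        # onnxruntime is not installed in this environment
        return False
    # get_available_providers() lists the execution providers this build supports
    return "CUDAExecutionProvider" in ort.get_available_providers()


if __name__ == "__main__":
    print("CUDA provider available:", cuda_provider_available())
```

If this prints `False` on a machine that has GPUs, the linked GPU tutorial is the place to look for dependency fixes.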

### Multi-GPU using cuda argument with TextEmbedding Model

In [None]:
from fastembed import TextEmbedding

# define the documents to embed
docs = ["hello world", "flag embedding"] * 100

# define gpu ids
device_ids = [0, 1]

if __name__ == "__main__":
    # initialize a TextEmbedding model using CUDA
    text_model = TextEmbedding(
        model_name="sentence-transformers/all-MiniLM-L6-v2",
        cuda=True,
        device_ids=device_ids,
        cache_dir="models",
        lazy_load=True,
    )

    # generate embeddings
    text_embeddings = list(text_model.embed(docs, batch_size=1, parallel=len(device_ids)))
    print(text_embeddings)

In this snippet:
- `cuda=True` enables GPU acceleration.
- `device_ids=[0, 1]` specifies GPUs to use. Replace `[0, 1]` with your available GPU IDs.
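- `lazy_load=True` defers loading the model until it is first needed, so each worker process loads it on its own GPU (see the note below).
- `parallel=len(device_ids)` in the `embed` call starts one worker process per listed GPU.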

**NOTE**: Enabling `lazy_load` is recommended in multi-GPU setups. Without it, the main process first loads the model into the memory of the first GPU. A child process is then spawned for each GPU in `device_ids`, and the child assigned to the first GPU loads the model there a second time. The duplicate copy wastes memory on the first GPU.
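
The effect described in the note can be illustrated with a toy simulation. This is only a sketch of the bookkeeping, not Fastembed's actual implementation: no GPUs or processes are involved, and the function name `simulate_model_loads` is hypothetical.

```python
from collections import Counter


def simulate_model_loads(device_ids, lazy_load):
    """Count how many model copies end up on each GPU (toy model, no real GPUs)."""
    loads = Counter()
    if not lazy_load:
        # eager loading: the main process loads the model on the first GPU
        loads[device_ids[0]] += 1
    # one worker per GPU, each loading its own copy of the model
    for device_id in device_ids:
        loads[device_id] += 1
    return dict(loads)


if __name__ == "__main__":
    print("eager:", simulate_model_loads([0, 1], lazy_load=False))  # {0: 2, 1: 1}
    print("lazy: ", simulate_model_loads([0, 1], lazy_load=True))   # {0: 1, 1: 1}
```

With eager loading, GPU 0 ends up holding two copies of the model. With lazy loading, each GPU holds exactly one.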

### Multi-GPU using cuda argument with ImageEmbedding

In [1]:
from io import BytesIO

import requests
from PIL import Image
from fastembed import ImageEmbedding


# define gpu ids
device_ids = [0, 1]

if __name__ == "__main__":
    # load the sample image inside the main guard, so that worker processes
    # which re-import this module do not download it again
    response = requests.get("https://qdrant.tech/img/logo.png", timeout=30)
    response.raise_for_status()
    images = [Image.open(BytesIO(response.content))] * 10

    # initialize ImageEmbedding model
    image_model = ImageEmbedding(
        model_name="Qdrant/clip-ViT-B-32-vision",
        cuda=True,
        device_ids=device_ids,
        cache_dir="models",
        lazy_load=True,
    )

    # generate image embeddings, one worker process per GPU
    image_embeddings = list(image_model.embed(images, batch_size=1, parallel=len(device_ids)))
    print(image_embeddings)