<a href="https://www.kaggle.com/code/aisuko/computing-embeddings-with-multi-gpus?scriptVersionId=162449513" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Overview

This notebook computes sentence embeddings on multiple GPUs. It starts one process per GPU, and the processes encode sentences in parallel, which gives a near-linear speed-up when encoding large text collections.
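To make "near-linear" concrete, here is a rough sketch based on Amdahl's law, which bounds the speed-up by the share of work that actually runs in parallel. The function name `amdahl_speedup` and the 0.95 parallel fraction in the usage line are illustrative assumptions, not measurements from this notebook.

```python
def amdahl_speedup(parallel_fraction: float, num_workers: int) -> float:
    """Ideal speed-up under Amdahl's law.

    parallel_fraction: share of the total work that can run in parallel (0..1).
    num_workers: number of parallel workers, e.g. one per GPU.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / num_workers)


# Hypothetical input: 95% of the encoding work parallelizes across 2 GPUs.
estimated = amdahl_speedup(0.95, 2)
```

The closer the parallel fraction is to 1, the closer the result gets to the number of GPUs. Fixed costs such as loading the model in each process and moving data between processes keep it slightly below that.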

In [1]:
!pip install sentence-transformers==2.3.1

Collecting sentence-transformers==2.3.1
  Downloading sentence_transformers-2.3.1-py3-none-any.whl.metadata (11 kB)
Downloading sentence_transformers-2.3.1-py3-none-any.whl (132 kB)
Installing collected packages: sentence-transformers
Successfully installed sentence-transformers-2.3.1


The relevant method is `start_multi_process_pool()`, which starts multiple processes that are used for encoding.
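When called without arguments, the pool is started on all available CUDA devices. As a hedged sketch, the helpers below build an explicit device list so that only some GPUs are used. `target_devices` is a parameter of `start_multi_process_pool()` in sentence-transformers 2.x; `cuda_device_names` and `start_pool_on` are hypothetical names introduced here for illustration.

```python
from typing import List


def cuda_device_names(num_gpus: int) -> List[str]:
    """Return PyTorch-style device strings: 'cuda:0', 'cuda:1', ..."""
    if num_gpus < 0:
        raise ValueError("num_gpus must be non-negative")
    return [f"cuda:{i}" for i in range(num_gpus)]


def start_pool_on(model, num_gpus: int):
    """Start the pool on the first `num_gpus` CUDA devices only.

    Hypothetical helper: `model` is assumed to be a SentenceTransformer
    instance that exposes start_multi_process_pool(target_devices=...).
    """
    return model.start_multi_process_pool(
        target_devices=cuda_device_names(num_gpus)
    )
```

For example, `start_pool_on(model, 2)` would request workers on `cuda:0` and `cuda:1`, leaving any other GPUs free for different jobs.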

In [2]:
from sentence_transformers import SentenceTransformer, LoggingHandler
import logging

logging.basicConfig(
    format='%(asctime)s - %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
    level=logging.INFO,
    handlers=[LoggingHandler()]
)

# Note: guard the entry point with `if __name__ == '__main__'`.
# Otherwise, CUDA runs into issues when spawning new processes.
if __name__ == '__main__':
    sentences = [f'This is sentence {i}' for i in range(100000)]

    # Define the model
    model = SentenceTransformer('all-MiniLM-L6-v2')

    # Start the multi-process pool on all available CUDA devices
    pool = model.start_multi_process_pool()

    # Compute the embeddings using the multi-process pool
    emb = model.encode_multi_process(sentences, pool)
    print('Embeddings computed. Shape:', emb.shape)

    # Optional: stop the processes in the pool
    model.stop_multi_process_pool(pool)

Embeddings computed. Shape: (100000, 384)
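Conceptually, multi-process encoding splits the input into chunks, hands each chunk to a worker, and concatenates the results in the original order, which is why the output has one row per input sentence. The sketch below illustrates that pattern only; it is not the library's implementation. It uses threads and a toy encoder as stand-ins for the GPU processes and the model, and every name in it (`split_into_chunks`, `encode_in_parallel`, `toy_encoder`) is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

Vector = List[float]


def split_into_chunks(items: Sequence[str], chunk_size: int) -> List[List[str]]:
    """Split `items` into consecutive chunks of at most `chunk_size` elements."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [list(items[i:i + chunk_size]) for i in range(0, len(items), chunk_size)]


def encode_in_parallel(
    sentences: Sequence[str],
    encode_chunk: Callable[[List[str]], List[Vector]],
    num_workers: int,
    chunk_size: int,
) -> List[Vector]:
    """Encode chunks concurrently and return vectors in input order."""
    chunks = split_into_chunks(sentences, chunk_size)
    with ThreadPoolExecutor(max_workers=num_workers) as executor:
        # executor.map yields results in the order of `chunks`,
        # regardless of which worker finishes first.
        encoded_chunks = list(executor.map(encode_chunk, chunks))
    # Flatten the per-chunk results into one list, one vector per sentence.
    return [vector for chunk in encoded_chunks for vector in chunk]


def toy_encoder(chunk: List[str]) -> List[Vector]:
    """Stand-in for a model: a 1-dimensional 'embedding' holding the text length."""
    return [[float(len(sentence))] for sentence in chunk]


sentences = [f"This is sentence {i}" for i in range(10)]
vectors = encode_in_parallel(sentences, toy_encoder, num_workers=2, chunk_size=3)
```

Because the chunks are reassembled in order, `vectors[i]` always corresponds to `sentences[i]`, just as row `i` of the embedding matrix above corresponds to sentence `i`.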
