<a href="https://colab.research.google.com/github/write-with-neurl/modelbit-notebooks/blob/main/Deploy_SeamlessM4T_With_Modelbit.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ⚡ Deploying SeamlessM4T to a REST API Endpoint for Text Translation

## 🧑‍💻 Installations and Set Up

Let's start by installing 🤗 Transformers (from source) and the `modelbit` package.

In [None]:
!pip install --upgrade git+https://github.com/huggingface/transformers.git modelbit

Collecting git+https://github.com/huggingface/transformers.git
  Cloning https://github.com/huggingface/transformers.git to /tmp/pip-req-build-6mbtuq4m
  Running command git clone --filter=blob:none --quiet https://github.com/huggingface/transformers.git /tmp/pip-req-build-6mbtuq4m
  Resolved https://github.com/huggingface/transformers.git to commit c5f0288bc7d76f65996586f79f69fba8867a0e67
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting modelbit
  Downloading modelbit-0.34.12-py3-none-any.whl (124 kB)
Collecting pycryptodomex (from modelbit)
  Downloading pycryptodomex-3.20.0-cp35-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (2.1 MB)

## Define your input text and the source and target languages

We'll define a text input along with the source and target languages. In this example, the source language is English (`"eng"`) and the target language is Russian (`"rus"`). You can check the list of supported language codes [here](https://huggingface.co/ylacombe/hf-seamless-m4t-medium/blob/main/tokenizer_config.json#L1887-L2089).

In [None]:
input_text = "Hello, my dog is cute"
src_lang = "eng"  # Source language (English)
tgt_lang = "rus"  # Target language (Russian)

## Load model and processor

Next, we load the pretrained `seamless-m4t-medium` checkpoint from the [hub](https://huggingface.co/facebook/hf-seamless-m4t-medium). See the [model hub](https://huggingface.co/models?search=seamless-m4t) for versions fine-tuned on a task that interests you.

In [None]:
from transformers import AutoProcessor, SeamlessM4TModel

processor = AutoProcessor.from_pretrained("facebook/hf-seamless-m4t-medium")
model = SeamlessM4TModel.from_pretrained("facebook/hf-seamless-m4t-medium")

Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.


## Generate Translated Text

Next, we generate the translation by passing the processed input to the model's `generate` method, specifying the target language and setting `generate_speech=False` so that only text tokens are produced.

In [None]:
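# Process the input text
text_inputs = processor(text=input_text, src_lang=src_lang, return_tensors="pt")

# Generate the translated text (token IDs only, no speech)
output_tokens = model.generate(**text_inputs, tgt_lang=tgt_lang, generate_speech=False)

# Inspect the generated token IDs: a tensor of shape (batch, sequence)
print(type(output_tokens[0]))
print(output_tokens[0])

# Flatten the batch of one and decode the token IDs back into text
flattened_tokens = [token_id for sublist in output_tokens[0] for token_id in sublist]
translated_text_from_text = processor.decode(flattened_tokens, skip_special_tokens=True)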

<class 'torch.Tensor'>
tensor([[     3, 256147, 169771, 248128,  18570,  17568, 248079,  42812,  95775,
            286,   1345, 164100, 248075,      3]])


In [None]:
translated_text_from_text

'Здравствуйте, мой собака милая.'
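
The list comprehension in the generation cell flattens a `(batch, sequence)` structure into a single list of token IDs. As a minimal sketch of what that does, the snippet below uses a plain nested list as a stand-in for `output_tokens[0]` (the IDs are copied from the tensor printed above; `batch_of_token_ids` is an illustrative name, not part of the notebook):

```python
# Plain nested list standing in for output_tokens[0], which has shape (batch, sequence).
# The IDs are copied from the tensor printed in the notebook output.
batch_of_token_ids = [
    [3, 256147, 169771, 248128, 18570, 17568, 248079,
     42812, 95775, 286, 1345, 164100, 248075, 3]
]

# Same nested comprehension as in the generation cell.
flattened = [token_id for sequence in batch_of_token_ids for token_id in sequence]

# With a batch of one, flattening is the same as taking the first row.
print(flattened == batch_of_token_ids[0])  # True
print(len(flattened))                      # 14
```

With a batch of one, the result is simply the first row. With a larger batch, the same comprehension would concatenate the token IDs of every sentence into one list, so decoding each row separately is likely the safer choice there.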

## Inference Function for Text Translation

The `seamless_m4t` function, decorated with `@cache`, loads everything the endpoint needs. It uses `snapshot_download` to fetch the `facebook/hf-seamless-m4t-medium` checkpoint from the Hugging Face Hub, then builds the processor and model from the downloaded files.

`@cache` (from `functools`) memoizes the function's return value, so the model and processor are loaded once per process and reused on every later call. This avoids reloading them from scratch for each request, which matters in a deployment where the same function serves many inferences.
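
To see the effect of `@cache` in isolation, here is a small sketch that swaps the real model loading for a hypothetical stand-in (`load_resources` and `load_count` are illustrative names, not part of the notebook). It only demonstrates that the decorated function body runs once:

```python
from functools import cache  # available in Python 3.9+

load_count = 0  # counts how often the loader body actually runs


@cache
def load_resources():
    """Hypothetical stand-in for seamless_m4t(): pretends to load a model and processor."""
    global load_count
    load_count += 1
    return {"name": "model"}, {"name": "processor"}


first = load_resources()
second = load_resources()  # served from the cache, the body does not run again

print(load_count)       # 1
print(first is second)  # True
```

Both calls return the very same objects, which is why later calls to the real loader skip the download and initialization steps. The cache lives in the memory of the running process, so a fresh process loads the model once more.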

We'll deploy with English as our source language and French as our target language.

In [None]:
from functools import cache
from huggingface_hub import snapshot_download

@cache
def seamless_m4t():
    model_path = snapshot_download(repo_id="facebook/hf-seamless-m4t-medium")
    processor = AutoProcessor.from_pretrained(model_path)
    model = SeamlessM4TModel.from_pretrained(model_path)
    return model, processor

In [None]:
def seamless_m4t_text_translation(text, src_lang="eng", tgt_lang="fra"):
    model, processor = seamless_m4t()
    # Process the text passed by the caller (not the notebook-level `input_text`)
    text_inputs = processor(text=text, src_lang=src_lang, return_tensors="pt")
    # Generate token IDs only; no speech output is needed for text translation
    output_tokens = model.generate(**text_inputs, tgt_lang=tgt_lang, generate_speech=False)
    flattened_tokens = [token_id for sublist in output_tokens[0] for token_id in sublist]
    translated_text_from_text = processor.decode(flattened_tokens, skip_special_tokens=True)
    return translated_text_from_text

In [None]:
seamless_m4t_text_translation(text="Hello, my dog is cute")

'Bonjour, mon chien est mignon'

## 🚢 Deploy SeamlessM4T to a REST API Endpoint

### 🔐 Log into `modelbit`

In [None]:
import modelbit as mb

mb.login()

In [None]:
# Deploy the SeamlessM4T text translation function to Modelbit
mb.deploy(
    seamless_m4t_text_translation,
    python_packages=[
        "git+https://github.com/huggingface/transformers.git",
        "sentencepiece==0.1.99", "torch==2.1.0", "einops==0.7.0", "accelerate==0.25.0",
    ],
)

## 📩 Test the REST Endpoint with Text Input

You can test your REST endpoint by sending a text input, optionally changing the source or target language for inference.

Use the `requests` package to POST a request to the API, with `json` serializing the request body; calling `.json()` on the response parses the result:


> ⚠️ Replace the `ENTER_WORKSPACE_NAME` placeholder with your workspace name.

In [None]:
import json
import requests

requests.post("https://ENTER_WORKSPACE_NAME.us-east-1.modelbit.com/v1/seamless_m4t_text_translation/latest",
              headers={"Content-Type":"application/json"},
              data=json.dumps({"data": ["Hello, my dog is cute"]})).json()

{'data': 'Bonjour, mon chien est mignon'}
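
To change the source or target language per request, one option is to send all three arguments of `seamless_m4t_text_translation` in the `data` list. The sketch below assumes the deployment maps the items of that list to the function's positional arguments, which is how the single-item request above appears to work; check the Modelbit REST API documentation before relying on it. `build_payload` and `translate_remote` are hypothetical helper names, and `ENTER_WORKSPACE_NAME` is still a placeholder.

```python
import json

# Placeholder, as in the request above: replace with your own workspace name.
WORKSPACE = "ENTER_WORKSPACE_NAME"
ENDPOINT = (
    f"https://{WORKSPACE}.us-east-1.modelbit.com"
    "/v1/seamless_m4t_text_translation/latest"
)


def build_payload(text, src_lang="eng", tgt_lang="fra"):
    """Serialize the arguments in the order the deployed function declares them.

    Assumption: the endpoint treats the `data` list as positional arguments.
    """
    return json.dumps({"data": [text, src_lang, tgt_lang]})


def translate_remote(text, src_lang="eng", tgt_lang="fra"):
    """POST one translation request and return the `data` field of the response."""
    import requests  # imported here so building a payload works without it

    response = requests.post(
        ENDPOINT,
        headers={"Content-Type": "application/json"},
        data=build_payload(text, src_lang, tgt_lang),
        timeout=60,
    )
    response.raise_for_status()  # surface HTTP errors instead of parsing them
    return response.json()["data"]


# Build (but do not send) a request for an English-to-Spanish translation.
payload = build_payload("Hello, my dog is cute", tgt_lang="spa")
print(payload)
```

Calling `translate_remote("Hello, my dog is cute", tgt_lang="spa")` would then send the request once the placeholder is replaced with a real workspace name.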

You can also test your endpoint from the command line using:


> `curl -s -XPOST "https://ENTER_WORKSPACE_NAME.us-east-1.modelbit.com/v1/seamless_m4t_text_translation/latest" -d '{"data": ["Hello, my dog is cute"]}' | json_pp`

---
> ⚠️ Replace the `ENTER_WORKSPACE_NAME` placeholder with your workspace name.