### Text Translation using M2M-100

import torch
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer
from typing import List
from langdetect import detect

class M2M100Translator:
    def __init__(self, model_name: str = "facebook/m2m100_418M"):
        self.device = torch.device("cpu")
        self.tokenizer = M2M100Tokenizer.from_pretrained(model_name)
        self.model = M2M100ForConditionalGeneration.from_pretrained(model_name).to(self.device)
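        # Maps langdetect output codes to M2M-100 language codes
        self.lang_code_map = {
            'en': 'en', 'fr': 'fr', 'es': 'es', 'de': 'de', 'it': 'it', 'pt': 'pt',
            'nl': 'nl', 'ru': 'ru', 'ja': 'ja', 'ko': 'ko', 'ar': 'ar',
            # langdetect reports Chinese as 'zh-cn' / 'zh-tw'; M2M-100 uses 'zh'
            'zh-cn': 'zh', 'zh-tw': 'zh',
            # Add more mappings as needed
        }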

    def detect_language(self, text: str) -> str:
        try:
            lang_code = detect(text)
            return self.lang_code_map.get(lang_code, 'en')  # Default to English if not found
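        except Exception:
            # langdetect raises on empty or feature-less input; avoid a bare
            # except so KeyboardInterrupt and SystemExit still propagate
            return 'en'  # Default to English if detection fails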

    def translate(self, text: str, tgt_lang: str, src_lang: str = None) -> str:
        if src_lang is None:
            src_lang = self.detect_language(text)
        
        self.tokenizer.src_lang = src_lang
        encoded = self.tokenizer(text, return_tensors="pt").to(self.device)
        
        generated_tokens = self.model.generate(
            **encoded,
            forced_bos_token_id=self.tokenizer.get_lang_id(tgt_lang),
            max_length=128
        )
        
        return self.tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)[0]

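    def translate_batch(self, texts: List[str], tgt_lang: str, src_lang: str = None) -> List[str]:
        # Resolve one source language per text (auto-detect when not given)
        if src_lang is None:
            src_langs = [self.detect_language(text) for text in texts]
        else:
            src_langs = [src_lang] * len(texts)

        # The tokenizer applies a single src_lang per call, so group the
        # inputs by source language and encode each group as one padded batch
        groups = {}
        for i, lang in enumerate(src_langs):
            groups.setdefault(lang, []).append(i)

        translations = [""] * len(texts)
        for lang, indices in groups.items():
            self.tokenizer.src_lang = lang
            encoded = self.tokenizer(
                [texts[i] for i in indices], return_tensors="pt", padding=True
            ).to(self.device)

            generated_tokens = self.model.generate(
                **encoded,
                forced_bos_token_id=self.tokenizer.get_lang_id(tgt_lang),
                max_length=128
            )

            # Write results back in the original input order
            decoded = self.tokenizer.batch_decode(generated_tokens, skip_special_tokens=True)
            for i, translation in zip(indices, decoded):
                translations[i] = translation

        return translations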

# # Example usage
# if __name__ == "__main__":
#     translator = M2M100Translator()

#     # Single translation with auto-detection
#     text = "Hello, how are you?"
#     tgt_lang = "fr"  # French
#     translation = translator.translate(text, tgt_lang)
#     print(f"Original: {text}")
#     print(f"Translation: {translation}")

#     # Batch translation with auto-detection
#     texts = ["Hello, how are you?", "Bonjour, comment allez-vous?", "Hola, ¿cómo estás?"]
#     translations = translator.translate_batch(texts, tgt_lang)
#     for original, translation in zip(texts, translations):
#         print(f"Original: {original}")
#         print(f"Translation: {translation}")
#         print()

M2M-100 (Many-to-Many 100) comes in three main sizes, each offering a different balance between translation quality and computational requirements. Here are the available variants:

1. M2M-100 12B: The largest and most capable model
   - Parameters: 12 billion
   - Model name: published on the Hugging Face Hub as separate checkpoints, for example "facebook/m2m100-12B-last-ckpt"

2. M2M-100 1.2B: A medium-sized model
   - Parameters: 1.2 billion
   - Model name: "facebook/m2m100_1.2B"

3. M2M-100 418M: The smallest model
   - Parameters: 418 million
   - Model name: "facebook/m2m100_418M"

Here's a quick guide on choosing:

- For highest quality and if you have powerful hardware: Use the 12B model
- For a good balance of quality and speed: Consider the 1.2B model
- For faster inference with still reasonable quality: Use the 418M model

To use a different model size in the code I provided earlier, you would simply change the model name when initializing the `M2M100Translator` class. For example:

```python
# For the largest model (one of the published 12B checkpoints)
translator = M2M100Translator("facebook/m2m100-12B-last-ckpt")

# For the medium-sized model
translator = M2M100Translator("facebook/m2m100_1.2B")
```

It's important to note:

1. All M2M-100 models support the same 100 languages, regardless of size.

2. Larger models generally provide better translation quality, especially for more complex texts and low-resource language pairs.

3. The choice of model size significantly impacts memory usage and inference speed, especially on CPU.

4. The 12B model, in particular, may be impractical for CPU-only inference due to its size and computational requirements.
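
To put rough numbers on the memory point, here is a minimal sketch that estimates the memory needed for the weights alone. The parameter counts are the rounded figures listed above, and the helper name `weight_memory_gib` is made up for this illustration. It assumes float32 weights (4 bytes per parameter) and ignores activations, generation buffers, and framework overhead, so real usage will be higher.

```python
# Rounded parameter counts from the model list above (approximate).
PARAM_COUNTS = {
    "418M": 418_000_000,
    "1.2B": 1_200_000_000,
    "12B": 12_000_000_000,
}


def weight_memory_gib(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate memory for the weights alone, in GiB.

    bytes_per_param is 4 for float32 and 2 for float16/bfloat16.
    """
    return num_params * bytes_per_param / 1024**3


for name, count in PARAM_COUNTS.items():
    print(f"{name}: ~{weight_memory_gib(count):.1f} GiB of weights in float32")
```

By this estimate the weights take roughly 1.6 GiB for the 418M model, 4.5 GiB for the 1.2B model, and 44.7 GiB for the 12B model, which is why the largest variant is hard to run on a typical CPU-only machine.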

For your production use case with CPU inference, the 418M model (as used in the example) might be the most practical choice. If you can afford more memory and longer inference times, the 1.2B model could provide a good balance between quality and performance.
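
Since inference time depends heavily on your hardware, it may be worth measuring it before committing to a size. The sketch below is one way to do that; `time_translations` is a hypothetical helper, not part of any library, and it accepts any callable that maps a string to a string. A stand-in callable is used here so the snippet runs without downloading a model.

```python
import statistics
import time
from typing import Callable, List


def time_translations(
    translate: Callable[[str], str], texts: List[str], repeats: int = 3
) -> float:
    """Return the median wall-clock seconds per text over `repeats` passes."""
    per_text_seconds = []
    for _ in range(repeats):
        start = time.perf_counter()
        for text in texts:
            translate(text)
        elapsed = time.perf_counter() - start
        per_text_seconds.append(elapsed / len(texts))
    return statistics.median(per_text_seconds)


# Stand-in for a real translator so the sketch is runnable on its own.
sample_texts = ["Hello, how are you?", "The weather is nice today."]
latency = time_translations(str.upper, sample_texts)
print(f"Median latency: {latency:.6f} s per text")
```

With the class from earlier, you could pass something like `lambda t: translator.translate(t, "fr", src_lang="en")` as the callable, once for each model size, and compare the results.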

Would you like more information on how these different model sizes might affect your implementation or performance? Or would you like to see a comparison of their performance on specific language pairs?