<a href="https://colab.research.google.com/github/Saadellll/Object-Detection-and-Tracking/blob/main/Language_Translation_Tool.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Install necessary packages


In [5]:
%pip install transformers



In [6]:
%pip install sentencepiece



In [7]:
%pip install langid

Collecting langid
  Downloading langid-1.1.6.tar.gz (1.9 MB)
  Preparing metadata (setup.py) ... done
Building wheels for collected packages: langid
  Building wheel for langid (setup.py) ... done
  Created wheel for langid: filename=langid-1.1.6-py3-none-any.whl size=1941172 sha256=b5fecfe87f8c866f62acec19d716b2c4ec85a5e026565b7c2f7245463bfafe3c
  Stored in directory: /root/.cache/pip/wheels/23/c8/c6/eed80894918490a175677414d40bd7c851413bbe03d4856c3c
Successfully built langid
Installing collected packages: langid
Successfully installed langid-1.1.6
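
Once the installs finish, it can be worth confirming that all three packages are importable before running the script. The following is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper name, not part of the notebook.

```python
import importlib.util

def missing_packages(names):
    """Return the subset of top-level package names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Packages installed by the cells above
required = ["transformers", "sentencepiece", "langid"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

If anything is reported missing, re-running the matching install cell (and restarting the runtime if needed) is usually enough.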


#Code

##Overview of the Script
###How the Code Works

This notebook contains a script that uses the Hugging Face `transformers` library and pretrained Helsinki-NLP OPUS-MT models to translate text between English and German, French, or Arabic, in either direction. Here’s a quick overview of what the code does:

####Imports Libraries:

**logging**: Manages messages and errors.

**transformers**: Loads and uses translation models.

**langid**: Detects the language of the text.

####Defines Functions:

**load_model(source_lang, target_lang)**: Fetches the model and tokenizer for translation.

**translate_text(text, source_lang, target_lang)**: Translates the text.

**detect_language(text)**: Identifies the text's language.

**batch_translate(texts, source_lang, target_lang)**: Translates a list of texts.

####Main Function:

**main()**: Collects user input for languages and text, then performs the translation and shows the result.
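
The `load_model` step relies on knowing which pretrained checkpoint serves a given language pair. The script below lists those checkpoints explicitly in its `model_names` mapping. The sketch here shows one possible alternative that falls back to the `Helsinki-NLP/opus-mt-{source}-{target}` naming pattern that the listed checkpoints share. `resolve_model_name` and `KNOWN_MODELS` are hypothetical names, and the fallback is an assumption: a checkpoint with the generated name is not guaranteed to exist on the Hugging Face Hub for every pair.

```python
# Explicit entries; mirrors two entries of the script's model_names mapping
KNOWN_MODELS = {
    ("en", "de"): "Helsinki-NLP/opus-mt-en-de",
    ("de", "en"): "Helsinki-NLP/opus-mt-de-en",
}

def resolve_model_name(source_lang, target_lang, known=KNOWN_MODELS):
    """Return a checkpoint name for the language pair.

    Explicit entries win; otherwise the name is built from the naming
    pattern (assumption: such a checkpoint may not actually exist).
    """
    source_lang = source_lang.strip().lower()
    target_lang = target_lang.strip().lower()
    if not source_lang or not target_lang:
        raise ValueError("Both source and target language codes are required")
    explicit = known.get((source_lang, target_lang))
    if explicit is not None:
        return explicit
    return f"Helsinki-NLP/opus-mt-{source_lang}-{target_lang}"

print(resolve_model_name("en", "de"))   # explicit entry
print(resolve_model_name(" EN", "es"))  # built from the pattern
```

Passing a generated name to `from_pretrained` raises an error when no such checkpoint exists, so the explicit mapping remains the safer default.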
###How to Use the Notebook
####Run the Notebook:

Execute the cells in order to run the script. In Google Colab or Jupyter Notebook, click 'Run' or press Shift + Enter.

####Provide Information:

Enter the source language code (like 'en' for English) and the target language code (like 'de' for German). Leave the source language blank to have it detected automatically.

Type the text you want to translate.

####See the Result:

The translated text will be displayed below the code cell.
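
When the source language code is left blank, `main()` falls back to `detect_language`, which can return any language `langid` knows, including ones the script has no model for. One way to avoid that is to restrict detection to the supported source languages. This is a minimal sketch: `SUPPORTED_PAIRS`, `supported_sources`, and `detect_supported_language` are hypothetical names, and the pair list is copied from the script's `model_names` mapping.

```python
# Language pairs the script supports (copied from its model_names mapping)
SUPPORTED_PAIRS = [
    ("en", "de"), ("de", "en"),
    ("en", "fr"), ("fr", "en"),
    ("en", "ar"), ("ar", "en"),
]

def supported_sources(pairs):
    """Return the sorted, de-duplicated source language codes."""
    return sorted({source for source, _ in pairs})

def detect_supported_language(text, pairs=SUPPORTED_PAIRS):
    """Detect the language of `text`, considering only supported sources."""
    import langid  # imported lazily so supported_sources works without langid
    langid.set_languages(supported_sources(pairs))
    lang, _score = langid.classify(text)
    return lang

print(supported_sources(SUPPORTED_PAIRS))  # ['ar', 'de', 'en', 'fr']
```

Note that `langid.set_languages` changes the module-wide identifier, so the restriction stays in effect for later `langid.classify` calls as well.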

In [8]:
# Import necessary libraries
import logging
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import langid

# Initialize logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')

# Define a dictionary to map language pairs to their model names
model_names = {
    ('en', 'de'): 'Helsinki-NLP/opus-mt-en-de',
    ('de', 'en'): 'Helsinki-NLP/opus-mt-de-en',
    ('en', 'fr'): 'Helsinki-NLP/opus-mt-en-fr',
    ('fr', 'en'): 'Helsinki-NLP/opus-mt-fr-en',
    ('en', 'ar'): 'Helsinki-NLP/opus-mt-en-ar',
    ('ar', 'en'): 'Helsinki-NLP/opus-mt-ar-en',
    # Add more language pairs and their corresponding models here
}

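# Cache of already-loaded (model, tokenizer) pairs, keyed by model name,
# so repeated calls (e.g. from batch_translate) do not reload the weights
_loaded_models = {}

# Function to load the appropriate model and tokenizer based on the source and target languages
def load_model(source_lang, target_lang):
    model_name = model_names.get((source_lang, target_lang))
    if model_name is None:
        raise ValueError(f"No model found for translating from {source_lang} to {target_lang}")
    if model_name not in _loaded_models:
        model = AutoModelForSeq2SeqLM.from_pretrained(model_name)
        tokenizer = AutoTokenizer.from_pretrained(model_name)
        _loaded_models[model_name] = (model, tokenizer)
    return _loaded_models[model_name]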

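# Function to translate text from the source language to the target language
def translate_text(text, source_lang, target_lang):
    model, tokenizer = load_model(source_lang, target_lang)
    # Tokenize the input, truncating anything beyond the model's maximum input length
    inputs = tokenizer.encode(text, return_tensors="pt", truncation=True)
    # Beam search with 4 beams; max_length=40 caps the output at 40 tokens,
    # so raise it when translating longer passages
    outputs = model.generate(inputs, max_length=40, num_beams=4, early_stopping=True)
    # Decode the best beam back to text, dropping special tokens such as padding
    translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return translated_text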

# Function to detect the language of a given text
def detect_language(text):
    lang, _ = langid.classify(text)
    return lang

# Function to translate a batch of texts from the source language to the target language
def batch_translate(texts, source_lang, target_lang):
    translations = []
    for text in texts:
        translated_text = translate_text(text, source_lang, target_lang)
        translations.append(translated_text)
    return translations

# Main function to interactively get user input and perform translation
def main():
    # Display a message for the user
    print("Language Translation Tool")

    # Get source and target language codes from the user
    # (leave the source code blank to have it detected automatically)
    source_lang = input("Enter source language code (e.g., 'en' for English): ").strip().lower()
    target_lang = input("Enter target language code (e.g., 'de' for German): ").strip().lower()

    # Get text to translate from the user, ignoring surrounding whitespace
    text = input("Enter text to translate: ").strip()

    # If the user provided text, proceed with translation
    if text:
        texts = [text]

        # If the source language is not provided, detect it
        if not source_lang:
            logging.info("Detecting source language...")
            source_lang = detect_language(texts[0])
            logging.info(f"Detected source language: {source_lang}")

        # Try to translate the text and handle any errors
        try:
            translations = batch_translate(texts, source_lang, target_lang)
            for original, translated in zip(texts, translations):
                print(f"Original: {original.strip()}\nTranslated: {translated}\n")
        except ValueError as e:
            logging.error(e)
    else:
        logging.error("No text provided for translation.")

# Execute the main function
if __name__ == "__main__":
    main()


Language Translation Tool
Enter source language code (e.g., 'en' for English): en
Enter target language code (e.g., 'de' for German): fr
Enter text to translate: Hi, how are you


The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


Original: Hi, how are you
Translated: Bonjour, comment allez-vous?
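
`batch_translate` calls `translate_text` once per item. For longer lists, tokenizing several texts together and calling `generate` once per chunk is usually faster. The following is a minimal sketch under the assumption that `model` and `tokenizer` are the objects returned by `load_model`; `chunked` and `translate_many` are hypothetical helpers, and the chunk size of 8 and output cap of 128 tokens are arbitrary choices.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

def translate_many(texts, model, tokenizer, batch_size=8, max_length=128):
    """Translate a list of texts with one generate() call per chunk."""
    translations = []
    for batch in chunked(texts, batch_size):
        # padding=True pads to the longest text in the chunk and also returns
        # the attention mask that lets generate() ignore the padding
        inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)
        outputs = model.generate(**inputs, max_length=max_length,
                                 num_beams=4, early_stopping=True)
        translations.extend(tokenizer.batch_decode(outputs, skip_special_tokens=True))
    return translations

# Example usage (downloads the model on first use):
# model, tokenizer = load_model("en", "fr")
# translate_many(["Hi, how are you", "Good morning"], model, tokenizer)
```

Larger chunks reduce the number of `generate` calls but use more memory, so the chunk size is worth tuning to the available hardware.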

