# Task
Develop a dual-language translator GUI application that uses a trained machine learning model to translate English words of 10 or more letters into French and Hindi simultaneously. For English words with fewer than 10 letters, the application should prompt the user to "upload again".

## Install necessary libraries

### Subtask:
Install libraries for machine learning, translation, and GUI development.


**Reasoning**:
Install the necessary libraries using pip.



In [1]:
# sentencepiece is required by MarianTokenizer, which the later cells use
!pip install transformers torch gradio sentencepiece



## Prepare data

### Subtask:
Create or obtain a dataset of English words and their French and Hindi translations.


**Reasoning**:
Create a pandas DataFrame with English words and their French and Hindi translations, ensuring a mix of word lengths and combining translations into a single column.



In [2]:
import pandas as pd

data = {
    'English': [
        'beautiful', 'hello', 'world', 'programming', 'language',
        'translation', 'machine', 'learning', 'intelligent', 'communication',
        'short', 'long', 'example', 'data', 'science'
    ],
    'Translations': [
        'belle | सुंदर', 'bonjour | नमस्ते', 'monde | दुनिया', 'programmation | प्रोग्रामिंग', 'langue | भाषा',
        'traduction | अनुवाद', 'machine | मशीन', 'apprentissage | सीख', 'intelligent | बुद्धिमान', 'communication | संचार',
        'court | छोटा', 'long | लंबा', 'exemple | उदाहरण', 'données | डेटा', 'science | विज्ञान'
    ]
}

df = pd.DataFrame(data)
display(df.head())

| | English | Translations |
|---|---|---|
| 0 | beautiful | belle \| सुंदर |
| 1 | hello | bonjour \| नमस्ते |
| 2 | world | monde \| दुनिया |
| 3 | programming | programmation \| प्रोग्रामिंग |
| 4 | language | langue \| भाषा |
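
Since the application only translates words of 10 or more letters, it can help to check which dataset entries actually reach that threshold. The following is a minimal sketch in plain Python using the same word list; the names `MIN_LENGTH` and `long_enough` are introduced here for illustration and do not appear in the notebook.

```python
# Word list copied from the 'English' column defined above
english_words = [
    'beautiful', 'hello', 'world', 'programming', 'language',
    'translation', 'machine', 'learning', 'intelligent', 'communication',
    'short', 'long', 'example', 'data', 'science'
]

MIN_LENGTH = 10  # hypothetical constant mirroring the task's 10-letter rule

# Keep only the words the GUI would accept for translation
long_enough = [word for word in english_words if len(word) >= MIN_LENGTH]

print(long_enough)
# ['programming', 'translation', 'intelligent', 'communication']
```

Only four of the fifteen words pass the check; 'beautiful' has nine letters and would trigger the "upload again" prompt.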


## Train translation model

### Subtask:
Train a machine learning model to translate English words to French and Hindi.


**Reasoning**:
Split the 'Translations' column into 'French' and 'Hindi' columns. No model is trained in this step: pre-trained MarianMT models from the `transformers` library translate each English word into French and Hindi, and the predicted translations are then compared with the reference translations in the dataset.



In [3]:
import pandas as pd
from transformers import MarianMTModel, MarianTokenizer

# 1. Split the 'Translations' column into separate French and Hindi columns
df[['French', 'Hindi']] = df['Translations'].str.split('|', expand=True)
df['French'] = df['French'].str.strip()
df['Hindi'] = df['Hindi'].str.strip()

# 2. Load pre-trained models and tokenizers for English to French and English to Hindi.
# Separate models are used for each direction because a single multi-target model might not be readily available.
model_name_en_fr = 'Helsinki-NLP/opus-mt-en-fr'
tokenizer_en_fr = MarianTokenizer.from_pretrained(model_name_en_fr)
model_en_fr = MarianMTModel.from_pretrained(model_name_en_fr)

model_name_en_hi = 'Helsinki-NLP/opus-mt-en-hi'
tokenizer_en_hi = MarianTokenizer.from_pretrained(model_name_en_hi)
model_en_hi = MarianMTModel.from_pretrained(model_name_en_hi)

# 3. Translate every English word with both models
def translate_words(words, tokenizer, model):
    """Translate each word on its own and return the decoded strings."""
    translations = []
    for word in words:
        input_ids = tokenizer(word, return_tensors="pt").input_ids
        outputs = model.generate(input_ids)
        translations.append(tokenizer.decode(outputs[0], skip_special_tokens=True))
    return translations

english_words = df['English'].tolist()
translated_french = translate_words(english_words, tokenizer_en_fr, model_en_fr)
translated_hindi = translate_words(english_words, tokenizer_en_hi, model_en_hi)

# 4. Store the generated translations in new columns
df['Predicted_French'] = translated_french
df['Predicted_Hindi'] = translated_hindi

# 5. Compare generated translations with the reference translations
print("Comparison of Original and Predicted Translations:")
display(df[['English', 'French', 'Predicted_French', 'Hindi', 'Predicted_Hindi']])

Comparison of Original and Predicted Translations:


| | English | French | Predicted_French | Hindi | Predicted_Hindi |
|---|---|---|---|---|---|
| 0 | beautiful | belle | belle | सुंदर | सुंदर |
| 1 | hello | bonjour | Bonjour. | नमस्ते | सलाम |
| 2 | world | monde | monde entier | दुनिया | संसार |
| 3 | programming | programmation | programmation | प्रोग्रामिंग | प्रोग्रामिंग |
| 4 | language | langue | langue | भाषा | भाषा |
| 5 | translation | traduction | traduction | अनुवाद | अनुवाद |
| 6 | machine | machine | machine | मशीन | मशीन |
| 7 | learning | apprentissage | l'apprentissage | सीख | सबक |
| 8 | intelligent | intelligent | intelligent | बुद्धिमान | बुद्धिमान |
| 9 | communication | communication | et de la communication | संचार | संचार |
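
Some predictions in the comparison differ from the reference only in casing or trailing punctuation ("Bonjour." versus "bonjour"), so a plain string comparison can understate how often the model agrees with the dataset. The sketch below computes an exact-match rate for the French rows shown above, with and without a light normalization step. The helpers `normalize` and `exact_match_rate` are illustrative names, not part of the notebook, and exact match is only one rough way to score single-word translations.

```python
import string

# (reference, predicted) French pairs copied from the comparison output above
french_pairs = [
    ("belle", "belle"),
    ("bonjour", "Bonjour."),
    ("monde", "monde entier"),
    ("programmation", "programmation"),
    ("langue", "langue"),
    ("traduction", "traduction"),
    ("machine", "machine"),
    ("apprentissage", "l'apprentissage"),
    ("intelligent", "intelligent"),
    ("communication", "et de la communication"),
]

def normalize(text):
    """Lowercase the text and trim surrounding whitespace and ASCII punctuation."""
    return text.strip().lower().strip(string.punctuation + string.whitespace)

def exact_match_rate(pairs, normalizer=lambda s: s):
    """Fraction of pairs whose reference and prediction are equal after normalizing."""
    matches = sum(normalizer(ref) == normalizer(pred) for ref, pred in pairs)
    return matches / len(pairs)

raw_rate = exact_match_rate(french_pairs)
normalized_rate = exact_match_rate(french_pairs, normalize)
print(raw_rate, normalized_rate)  # 0.6 0.7
```

Normalization recovers the "Bonjour." row, while rows where the model adds an article or extra words ("l'apprentissage", "monde entier") still count as mismatches. The same helper could be applied to the Hindi columns.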


## Develop GUI

### Subtask:
Create a GUI with input and output sections for translation.


**Reasoning**:
Import the `gradio` library, load the two pre-trained models, and define a function for the Gradio interface that takes an English word and returns its French and Hindi translations, or "Upload again" in both outputs when the word has fewer than 10 characters. Then create and launch the Gradio interface with that function, specifying the input and output components and the title.



In [None]:
import gradio as gr
from transformers import MarianMTModel, MarianTokenizer

# Load pre-trained models and tokenizers for English to French and English to Hindi
model_name_en_fr = 'Helsinki-NLP/opus-mt-en-fr'
tokenizer_en_fr = MarianTokenizer.from_pretrained(model_name_en_fr)
model_en_fr = MarianMTModel.from_pretrained(model_name_en_fr)

model_name_en_hi = 'Helsinki-NLP/opus-mt-en-hi'
tokenizer_en_hi = MarianTokenizer.from_pretrained(model_name_en_hi)
model_en_hi = MarianMTModel.from_pretrained(model_name_en_hi)

def translate_word(english_word):
    # Normalize the input once so the length check and the models see the same text
    word = str(english_word).strip()

    # Reject inputs with fewer than 10 characters, as the task requires
    if len(word) < 10:
        return "Upload again", "Upload again"

    # Translate to French
    input_ids_fr = tokenizer_en_fr(word, return_tensors="pt").input_ids
    outputs_fr = model_en_fr.generate(input_ids_fr)
    translated_french = tokenizer_en_fr.decode(outputs_fr[0], skip_special_tokens=True)

    # Translate to Hindi
    input_ids_hi = tokenizer_en_hi(word, return_tensors="pt").input_ids
    outputs_hi = model_en_hi.generate(input_ids_hi)
    translated_hindi = tokenizer_en_hi.decode(outputs_hi[0], skip_special_tokens=True)

    return translated_french, translated_hindi


iface = gr.Interface(
    fn=translate_word,
    inputs=gr.Textbox(label="Enter English Word"),
    outputs=[gr.Textbox(label="French Translation"), gr.Textbox(label="Hindi Translation")],
    title="Dual Language Translator"
)

iface.launch(debug=True)



It looks like you are running Gradio on a hosted Jupyter notebook, which requires `share=True`. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. This cell will run indefinitely so that you can see errors and logs. To turn off, set debug=False in launch().
* Running on public URL: https://fc6f947dad1609451f.gradio.live

This share link expires in 1 week. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
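
The length rule in `translate_word` counts characters, so an input such as "1234567890" or "hello world" would also pass. One possible way to tighten this, and to test the rule without loading the models, is to separate validation from translation and pass the translators in as arguments. The sketch below is an assumption-laden illustration: `is_long_enough`, `translate_if_valid`, and the stub translators are hypothetical names, and the stricter "letters only, single word" check is a design choice that goes beyond what the notebook implements.

```python
MIN_LETTERS = 10
REJECTION_MESSAGE = "Upload again"

def is_long_enough(text, min_letters=MIN_LETTERS):
    """Return True if text is a single alphabetic word with at least min_letters letters."""
    word = text.strip()
    return word.isalpha() and len(word) >= min_letters

def translate_if_valid(text, to_french, to_hindi):
    """Apply the length rule, then call the two translator callables on the cleaned word."""
    if not is_long_enough(text):
        return REJECTION_MESSAGE, REJECTION_MESSAGE
    word = text.strip()
    return to_french(word), to_hindi(word)

# Stub translators standing in for the MarianMT models (illustration only)
def stub_french(word):
    return f"fr:{word}"

def stub_hindi(word):
    return f"hi:{word}"

print(translate_if_valid("communication", stub_french, stub_hindi))
print(translate_if_valid("hello", stub_french, stub_hindi))
print(translate_if_valid("1234567890", stub_french, stub_hindi))
```

In the real application, the two stubs would be replaced by functions that wrap the French and Hindi models, and `translate_if_valid` could be bound to them before being handed to `gr.Interface`.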
