
Loading T5 Model and Tokenizer

Explanation: The import statement in the cell below brings in two classes from the Hugging Face transformers library:

T5Tokenizer: Converts text into the token IDs that the T5 model takes as input, and converts generated token IDs back into text.

T5ForConditionalGeneration: The T5 model with a language-modeling head, used for text-to-text tasks such as translation and summarization.


In [None]:
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load the model and tokenizer
model_name = "t5-small"  # You can replace this with another T5 checkpoint, e.g. "t5-base"
model = T5ForConditionalGeneration.from_pretrained(model_name)
tokenizer = T5Tokenizer.from_pretrained(model_name)
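
Example: The cell above only loads T5 and never calls it. The sketch below shows one way the loaded model could be used. T5 expects a task prefix such as "translate English to German" or "summarize" in front of the input text. The helper names build_t5_prompt and t5_generate, and the max_new_tokens default of 40, are illustrative assumptions and not part of the original notebook.

```python
def build_t5_prompt(task_prefix: str, text: str) -> str:
    """Join a T5 task prefix and the input text as 'prefix: text'."""
    # Normalize the prefix so that both "summarize" and "summarize:" work.
    prefix = task_prefix.strip().rstrip(":")
    return f"{prefix}: {text.strip()}"


def t5_generate(model, tokenizer, task_prefix, text, max_new_tokens=40):
    """Run one text-to-text generation with an already loaded T5 model.

    `model` and `tokenizer` are assumed to be the objects created by
    T5ForConditionalGeneration.from_pretrained and T5Tokenizer.from_pretrained.
    """
    prompt = build_t5_prompt(task_prefix, text)
    # The tokenizer returns input_ids and attention_mask as PyTorch tensors.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop special tokens such as the padding and end-of-sequence markers.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Usage, run right after the loading cell (later cells reuse the names model and tokenizer for MarianMT): t5_generate(model, tokenizer, "translate English to German", "The house is wonderful.")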

Installing Gradio:
Explanation: This command installs the Gradio library, which allows you to create simple, interactive interfaces for machine learning models.
The -qqq option is used to suppress the usual installation output, keeping the notebook output cleaner.

In [None]:
!pip install gradio -qqq
# https://www.gradio.app/

In [None]:
# sacremoses is an optional dependency that MarianTokenizer uses for punctuation normalization
!pip install sacremoses

Collecting sacremoses
  Downloading sacremoses-0.1.1-py3-none-any.whl.metadata (8.3 kB)
Downloading sacremoses-0.1.1-py3-none-any.whl (897 kB)
Installing collected packages: sacremoses
Successfully installed sacremoses-0.1.1


Setting up the MarianMT Model for Translation with Gradio Interface

Gradio: A library that simplifies the process of creating web interfaces for machine learning models.

MarianMTModel and MarianTokenizer: Classes from Hugging Face's transformers library specifically for the MarianMT translation model, which is designed for machine translation tasks.

Explanation of the Gradio UI Components:

fn: This specifies the function to be called when the user submits input. In our case, it's the translation function defined in the code cell below.

inputs="text": This creates an input box where users can type their English sentences.

outputs="text": This creates an output box where the translated text will be displayed.

title and description: These are optional arguments that give your app a nice title and a description so users know what it does.


In [None]:
import gradio as gr
from transformers import MarianMTModel, MarianTokenizer

# Load the MarianMT model and tokenizer
model_name = "Helsinki-NLP/opus-mt-en-ar"  # A model specifically trained for English to Arabic translation
model = MarianMTModel.from_pretrained(model_name)
tokenizer = MarianTokenizer.from_pretrained(model_name)

# Define the translation function (English in, Arabic out)
def translate_to_arabic(input_text):
    """Translate an English string to Arabic with the MarianMT model loaded above."""
    # Prepare the input text for the model
    input_ids = tokenizer.encode(input_text, return_tensors="pt")

    # Generate translation
    output = model.generate(input_ids)

    # Decode the output
    output_text = tokenizer.decode(output[0], skip_special_tokens=True)

    return output_text

# Create a Gradio interface
iface = gr.Interface(
    fn=translate_to_arabic,  # The function for translation
    inputs="text",  # Input component (text box for user input)
    outputs="text",  # Output component (text box for the result)
    title="English to Arabic Translator",  # Title of the interface
    description="Enter an English sentence, and this tool will translate it to Arabic using the specified model."
)

# Launch the interface
iface.launch()

# For other languages, refer to the MarianMT models available at:
# https://huggingface.co/models?search=Helsinki-NLP
# Example for English to French:
# model_name = "Helsinki-NLP/opus-mt-en-fr"  # Replace with the appropriate model for desired language translation

Running Gradio in a Colab notebook requires sharing enabled. Automatically setting `share=True` (you can turn this off by setting `share=False` in `launch()` explicitly).

Colab notebook detected. To show errors in colab notebook, set debug=True in launch()
* Running on public URL: https://05b8469bab2cd67e93.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)
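
Example: The comments at the end of the translation cell suggest swapping in another MarianMT checkpoint for a different language pair. The two checkpoints named in this notebook follow the pattern Helsinki-NLP/opus-mt-<source>-<target>, and the sketch below builds a model name from that pattern. The helper names marian_model_name and load_marian are illustrative assumptions. Not every language pair has a checkpoint, and some checkpoints use different codes, so the name should be checked against the model listing linked above before loading.

```python
def marian_model_name(src: str, tgt: str) -> str:
    """Build a checkpoint name of the form Helsinki-NLP/opus-mt-<src>-<tgt>.

    Assumes the pair follows the same naming pattern as the two
    checkpoints mentioned in the notebook (en-ar and en-fr).
    """
    return f"Helsinki-NLP/opus-mt-{src.strip().lower()}-{tgt.strip().lower()}"


def load_marian(src: str, tgt: str):
    """Load the MarianMT model and tokenizer for one language pair."""
    # Imported here so the name helper can be used without transformers installed.
    from transformers import MarianMTModel, MarianTokenizer

    name = marian_model_name(src, tgt)
    # from_pretrained raises an error if no checkpoint with this name exists.
    model = MarianMTModel.from_pretrained(name)
    tokenizer = MarianTokenizer.from_pretrained(name)
    return model, tokenizer
```

Usage: model, tokenizer = load_marian("en", "fr") replaces the English to Arabic model with the English to French one. The title and description passed to gr.Interface should be updated to match.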


