To create a Python script that translates text from one language to another with a MarianMT model, follow these steps. The script reads source-language text from a file, translates it, and saves the result to another text file.

1. Install the necessary packages:

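Make sure the transformers library is installed, along with sentencepiece (required by MarianTokenizer) and PyTorch (required for return_tensors="pt"). You can install them with pip if you haven't done so already:

In [None]:
!pip install transformers sentencepiece torch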
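
Before loading a model, it can help to confirm that the packages are importable. The following is a minimal sketch using only the standard library. The helper name missing_packages is hypothetical, and the package list assumes the PyTorch backend plus the sentencepiece dependency of MarianTokenizer.

```python
import importlib.util

def missing_packages(names):
    """Return the names from `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Packages the translation script relies on (assumes the PyTorch backend).
required = ["transformers", "sentencepiece", "torch"]
missing = missing_packages(required)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```

If anything is reported as missing, install it with pip before moving on to the next step.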

2. Create the translation script:

Below is the Python script that reads text from a file, translates it using the MarianMT model, and then writes the translated text to a new file.

In [None]:
from transformers import MarianMTModel, MarianTokenizer

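def translate_text(input_file, output_file, model_name="Helsinki-NLP/opus-mt-en-roa", target_lang="fra"):
    # Load the model and tokenizer
    tokenizer = MarianTokenizer.from_pretrained(model_name)
    model = MarianMTModel.from_pretrained(model_name)

    # Read the input text, one segment per non-empty line
    with open(input_file, "r", encoding="utf-8") as file:
        lines = [line.strip() for line in file if line.strip()]
    if not lines:
        raise ValueError(f"No text found in {input_file}")

    # Multi-target models such as opus-mt-en-roa need a target-language
    # token (e.g. ">>fra<<") at the start of every source segment.
    # Pass target_lang=None for single-pair models.
    prefix = f">>{target_lang}<< " if target_lang else ""
    batch = [prefix + line for line in lines]

    # Tokenize all segments as one batch; truncation=True cuts off any
    # segment that exceeds the model's maximum input length
    inputs = tokenizer(batch, return_tensors="pt", padding=True, truncation=True)

    # Generate translations using the model
    translated = model.generate(**inputs)

    # Decode every translated sequence, not only the first one
    translated_lines = tokenizer.batch_decode(translated, skip_special_tokens=True)

    # Write the translated text to the output file, one segment per line
    with open(output_file, "w", encoding="utf-8") as file:
        file.write("\n".join(translated_lines) + "\n")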

    print(f"Translation completed. Translated text saved to {output_file}")

# Specify the input and output files
input_file = "input_text.txt"
output_file = "translated_text.txt"

# Call the translate_text function
translate_text(input_file, output_file)


3. Explanation of the code:

* translate_text function: This function takes the path of the input file, output file, and the model name as arguments. It loads the MarianMT model and tokenizer, reads the text from the input file, translates it, and saves the translated text to the output file.

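* Model and tokenizer: The MarianMT model and tokenizer are loaded with the from_pretrained method. The default model, Helsinki-NLP/opus-mt-en-roa, translates from English into several Romance languages. Because it has more than one target language, every source segment must start with a target-language token such as >>fra<< (French), which selects the output language.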

* Reading and writing files: The script reads the text from input_text.txt and writes the translated text to translated_text.txt.
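
Two details matter when feeding text to this model: multi-target models expect a target-language token at the start of each segment, and truncation=True silently cuts off anything beyond the model's maximum input length. The sketch below shows one way to prepare the input, assuming the file holds one sentence or short paragraph per line. The helper names prepare_segments and batched are hypothetical and are not part of transformers.

```python
def prepare_segments(text, target_lang="fra"):
    """Split text into non-empty lines and prefix each with a >>lang<< token."""
    prefix = f">>{target_lang}<< "
    return [prefix + line.strip() for line in text.splitlines() if line.strip()]

def batched(items, batch_size=8):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

sample = "Hello, world!\n\nHow are you today?\n"
segments = prepare_segments(sample, target_lang="fra")
for batch in batched(segments, batch_size=1):
    # Each batch is a list of strings ready to hand to the tokenizer.
    print(batch)
```

Each batch can then be passed to tokenizer(batch, return_tensors="pt", padding=True, truncation=True), and the generated output decoded with tokenizer.batch_decode. Keeping segments short and batches small reduces the chance of truncation and keeps memory use predictable.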

4. Usage:

* Input File: Create a text file named input_text.txt with the text you want to translate.
* Run the Script: Run the script, and it will generate a translated_text.txt file containing the translated text.
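
For a quick trial, the input file can also be generated from Python. This is a small sketch: the sample sentences and the helper name write_sample_input are made up for illustration, and only the file name input_text.txt comes from the script.

```python
from pathlib import Path

def write_sample_input(path="input_text.txt"):
    """Write a few English sentences, one per line, and return the path."""
    sentences = [
        "Hello, how are you?",
        "The weather is nice today.",
        "I would like a cup of coffee.",
    ]
    target = Path(path)
    target.write_text("\n".join(sentences) + "\n", encoding="utf-8")
    return target

sample_path = write_sample_input()
print(sample_path.read_text(encoding="utf-8"))
```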

5. Run the script:

Save the script as translate.py and run it from your terminal or command prompt (inside a notebook cell, prefix the command with !):

In [None]:
python translate.py
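
If hard-coded file names become inconvenient, the paths can be read from the command line instead. The following is a sketch using argparse from the standard library. The argument names and the --model option are illustrative choices, not something the script above defines.

```python
import argparse

def build_parser():
    """Build a parser whose defaults match the file names used in the script."""
    parser = argparse.ArgumentParser(
        description="Translate a text file with a MarianMT model."
    )
    parser.add_argument("input_file", nargs="?", default="input_text.txt",
                        help="file containing the text to translate")
    parser.add_argument("output_file", nargs="?", default="translated_text.txt",
                        help="file that receives the translation")
    parser.add_argument("--model", default="Helsinki-NLP/opus-mt-en-roa",
                        help="Hugging Face model name")
    return parser

# Demonstration with an explicit argument list; inside translate.py you would
# call build_parser().parse_args() with no arguments to read sys.argv.
args = build_parser().parse_args(["notes.txt", "notes_translated.txt"])
print(args.input_file, args.output_file, args.model)
# In translate.py the parsed values would then be forwarded, for example:
# translate_text(args.input_file, args.output_file, model_name=args.model)
```

With this in place, a call such as python translate.py notes.txt notes_translated.txt would translate a different file without editing the script.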