# Import Libraries

In [1]:
from datasets import load_dataset, DatasetDict, Dataset, load_from_disk
import evaluate
import numpy as np
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from transformers import DataCollatorForSeq2Seq
from transformers import pipeline
from googletrans import Translator



# Loading Dataset

Load the dataset saved in the `dataset` folder.

In [2]:
dataset = load_from_disk('./dataset/aslg_pc12')

# Predict

The fine-tuned model has been pushed to the Hugging Face Hub, so I load it with `AutoModelForSeq2SeqLM` and load its tokenizer with `AutoTokenizer`. With both in place, I can run predictions on examples from the dataset.

In [3]:
tokenizer = AutoTokenizer.from_pretrained("junowhite/transformer_model")

model = AutoModelForSeq2SeqLM.from_pretrained("junowhite/transformer_model")

The pretrained T5 model was originally trained on translation tasks such as English to German (plus a few other common language pairs), and our dataset is quite small compared to the data used to train T5. As a result, some outputs drift into other languages such as German. The semantic content of those outputs is still reasonably good, so we decided to use a Translator API (`googletrans`) to translate them back to English.
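
Since the translate-back step depends on an external service, a small wrapper can keep a run going when that call fails. The sketch below is only an illustration, not part of the original notebook: `to_english` and its `translate_fn` parameter are hypothetical names, and the choice to fall back to the raw model output is an assumption.

```python
from typing import Callable


def to_english(text: str, translate_fn: Callable[[str], str]) -> str:
    """Return translate_fn(text), or the raw text if translation fails."""
    try:
        translated = translate_fn(text)
    except Exception:
        # Assumption: a network or rate-limit error should not stop the run,
        # so the untranslated model output is kept instead.
        return text
    # Guard against empty or non-string results from the translation call
    if not isinstance(translated, str) or not translated.strip():
        return text
    return translated.strip()


# Offline stand-in for the real translation call (hypothetical)
print(to_english("hallo welt", lambda s: "hello world"))
```

With the `translators` object created in the next cell, the callable could be something like `lambda s: translators.translate(s, dest="en").text`.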

In [4]:
translators = Translator()

Finally, to use the model, I instantiate a translation pipeline and pass text to it.

In [5]:
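# Wrap the fine-tuned model and tokenizer in a translation pipeline
translator = pipeline("translation", model=model, tokenizer=tokenizer)

prediction = []  # model outputs after the translate-back step
answer = []      # reference English sentences

def testing_data(num_of_examples=5):
  """Translate the first `num_of_examples` glosses and record the references."""
  for i in range(num_of_examples):
    example = dataset["train"][i]
    answer.append(example["text"])
    # The pipeline returns a list with one dict per input
    raw = translator(example["gloss"])[0]["translation_text"]
    # Pass the raw output through googletrans in case it is not English
    prediction.append(translators.translate(raw).text)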
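
The loop above calls the pipeline one gloss at a time. A translation pipeline also accepts a list of strings and returns one dict per input, so the same step can be sketched as a batch call. This is a minimal sketch: `translate_glosses` and `fake_pipe` are hypothetical names, the gloss strings are made up for illustration, and `fake_pipe` only imitates the output shape so the example runs without downloading the model.

```python
from typing import Callable, Dict, List


def translate_glosses(
    pipe: Callable[[List[str]], List[Dict[str, str]]],
    glosses: List[str],
) -> List[str]:
    """Run a translation pipeline over a batch and return plain strings."""
    outputs = pipe(glosses)
    return [out["translation_text"].strip() for out in outputs]


# Stand-in for the real pipeline (hypothetical): it mimics the
# list-of-dicts output shape by lowercasing its input.
def fake_pipe(texts: List[str]) -> List[Dict[str, str]]:
    return [{"translation_text": t.lower()} for t in texts]


print(translate_glosses(fake_pipe, ["X-I BE IRISH", "THANK X-YOU"]))
```

In the notebook, the `translator` pipeline object would take the place of `fake_pipe`.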



In [6]:
testing_data()
print(f'Our predictions: {prediction}')
print(f'The answer: {answer}')

Our predictions: ['cooperation at political , social and technical levels is essential in order to achieve objectives of these .', 'mr president , this brisk report is a positive outcome of the single european union superstatate .', 'england is my mother tongue , but I am not england I am irish .', 'therefore , madam president , I am also applauding to the european institutions to intervene .', 'the expressions of solidarity have come from all four cranes .']
The answer: ['cooperation at political , social and technical levels is essential in order to achieve the objective of the ses .\n', 'mr president , this brok report is positive proof of the emergence of a single european union superstate .\n', 'english is my mother tongue , but I am not english I am irish .\n', 'therefore , madam president , I am also appealing to the european institutions to intervene .\n', 'expressions of solidarity have come from all four corners of the globe .\n']
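
Comparing the two lists by eye works for five examples, but a number is easier to track. The notebook imports `evaluate` without using it, so as a dependency-free sketch (my own addition, not the metric used for the results above) the word error rate between a reference and a prediction can be computed with a word-level edit distance. The function names are hypothetical.

```python
from typing import List


def word_edit_distance(ref: List[str], hyp: List[str]) -> int:
    """Levenshtein distance over word tokens."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # drop a reference word
                            curr[j - 1] + 1,      # extra predicted word
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference: str, hypothesis: str) -> float:
    """Edit distance divided by the number of reference words."""
    ref = reference.strip().split()  # strip() removes the trailing newline
    hyp = hypothesis.strip().split()
    if not ref:
        return 0.0 if not hyp else 1.0
    return word_edit_distance(ref, hyp) / len(ref)


# Third example from the output above
ref = 'english is my mother tongue , but I am not english I am irish .\n'
hyp = 'england is my mother tongue , but I am not england I am irish .'
print(f"WER: {word_error_rate(ref, hyp):.3f}")
```

Averaging `word_error_rate(a, p)` over `zip(answer, prediction)` would give one score for the whole sample.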
