<a href="https://colab.research.google.com/github/Doug-Vo/Translation-Project/blob/main/Translation_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# This project compares models that can reliably translate Finnish to English. We chose:


*   [Helsinki-NLP/opus-mt-fi-en](https://huggingface.co/Helsinki-NLP/opus-mt-fi-en) (Hugging Face)
*   [googletrans](https://pypi.org/project/googletrans/) (PyPI; the install cell in this notebook pins `4.0.0-rc1`)


Both produce good translations. [googletrans](https://pypi.org/project/googletrans/) tends to be more grammatically accurate, but it also fails intermittently with unexplained errors.


In [28]:
!pip install transformers torch sentencepiece
!pip install googletrans==4.0.0-rc1



In [29]:
from googletrans import Translator
from transformers import MarianMTModel, MarianTokenizer

#Loading googletrans
translator = Translator()
print("GGtranslate Loaded!")

#Loading Helsinki-NLP Model
fi2en_model_name = 'Helsinki-NLP/opus-mt-fi-en'
fi2en_tokenizer = MarianTokenizer.from_pretrained(fi2en_model_name)
fi2en_model = MarianMTModel.from_pretrained(fi2en_model_name)
print("Helsinki Model Loaded!")



GGtranslate Loaded!
Helsinki Model Loaded!


In [30]:


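def hel_translate(text):
  """Translate a Finnish string (or list of strings) to English; returns a list of translations."""
  # Tokenize into padded PyTorch tensors so several sentences can be processed together
  batch = fi2en_tokenizer(text, return_tensors="pt", padding=True)
  gen = fi2en_model.generate(**batch)
  return fi2en_tokenizer.batch_decode(gen, skip_special_tokens=True)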
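
Because `hel_translate` asks the tokenizer for padding and decodes with `batch_decode`, it should accept a list of sentences as well as a single string. The sketch below shows one way to feed a longer list through such a function in fixed-size chunks so that each `generate` call stays small. The names `translate_in_chunks`, `chunk_size`, and the stand-in `fake_translate` are hypothetical and introduced only for illustration; they are not part of the original notebook.

```python
from typing import Callable, List


def translate_in_chunks(
    sentences: List[str],
    translate_batch: Callable[[List[str]], List[str]],
    chunk_size: int = 4,
) -> List[str]:
    """Translate sentences chunk by chunk, preserving the input order."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    translations: List[str] = []
    for start in range(0, len(sentences), chunk_size):
        chunk = sentences[start:start + chunk_size]
        # translate_batch is expected to return one translation per input sentence
        translations.extend(translate_batch(chunk))
    return translations


# Stand-in for hel_translate so the sketch runs without downloading the model.
def fake_translate(batch: List[str]) -> List[str]:
    return [f"EN({s})" for s in batch]


results = translate_in_chunks(["yksi", "kaksi", "kolme"], fake_translate, chunk_size=2)
print(results)
```

In the notebook itself, the equivalent call would presumably be `translate_in_chunks(finnish_test_sentences, hel_translate)`.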


# Wrap the call in try/except because googletrans sometimes raises unexplained errors
def gg_translate(text):
  """Translate Finnish text to English with googletrans; return a failure message on error."""
  try:
    result = translator.translate(text, src='fi', dest='en')
    if result and result.text:
      return result.text
    print(f"Google Translate returned an invalid result for: {text}")
    return f"Translation failed: {text}"
  except Exception as e:
    print(f"Error translating with Google Translate: {e}")
    return f"Translation failed: {text}"
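
Since `gg_translate` reports failures by returning a string that starts with `Translation failed:`, a caller can retry before giving up. The following is a minimal sketch of that idea and not part of the original notebook: `translate_with_retry`, `max_attempts`, `delay_seconds`, and the `flaky_translate` stand-in are hypothetical names, and whether retrying actually helps with the errors googletrans raises is an assumption.

```python
import time
from typing import Callable

FAILURE_PREFIX = "Translation failed:"  # matches the message gg_translate returns


def translate_with_retry(
    translate_fn: Callable[[str], str],
    text: str,
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> str:
    """Call translate_fn until it returns a non-failure result or attempts run out."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    result = f"{FAILURE_PREFIX} {text}"
    for attempt in range(1, max_attempts + 1):
        result = translate_fn(text)
        if not result.startswith(FAILURE_PREFIX):
            return result
        if attempt < max_attempts:
            # Wait a little longer after each failed attempt
            time.sleep(delay_seconds * attempt)
    return result


# Stand-in for gg_translate: fails twice, then succeeds (hypothetical behaviour).
calls = {"count": 0}


def flaky_translate(text: str) -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        return f"{FAILURE_PREFIX} {text}"
    return "Can you help me with this bag?"


print(translate_with_retry(flaky_translate, "Voitko auttaa minua tämän laukun kanssa?"))
```

In the notebook, passing `gg_translate` as `translate_fn` with a small non-zero `delay_seconds` would be the natural way to try this out.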

In [31]:
# Sample Finnish sentences to translate
finnish_test_sentences = [
    # 1. Simple, common phrase
    "Voitko auttaa minua tämän laukun kanssa?",
    # 2. News/Formal tone
    "Hallitus neuvottelee uusista rajoituksista huomenna.",
    # 3. Technical/Specific
    "Tietokoneen suorituskyky riippuu suuresti keskusmuistin määrästä.",
    # 4. Idiomatic expression
    "Älä nuolaise ennen kuin tipahtaa.",
    # 5. Question
    "Kuinka kauan matka Oulusta Helsinkiin kestää junalla?",
    # 6. Food/Restaurant context
    "Haluaisin tilata poronkäristystä perunamuusilla, kiitos.",
    # 7. Weather description
    "Tänään on kaunis syyspäivä, mutta illalla saattaa sataa räntää.",
    # 8. Opinion/Subjective
    "Mielestäni tämä elokuva oli hieman pettymys.",
    # 9. Complex sentence with a sub-clause
    "Vaikka opiskelu oli haastavaa, hän valmistui erinomaisin arvosanoin.",
    # 10. A more poetic, descriptive sentence
    "Revontulet tanssivat hiljaa pimeällä taivaalla pakkasyönä."
]


for sentence in finnish_test_sentences:

  hel_trans = hel_translate(sentence)[0]
  gg_trans = gg_translate(sentence)
  print("\n---------------------------------------------------------------------")
  print("Original Sentence:\t| ", sentence)
  print("Helsinki model:\t\t| ", hel_trans)
  print("GG Translate:\t\t| ", gg_trans)


---------------------------------------------------------------------
Original Sentence:	|  Voitko auttaa minua tämän laukun kanssa?
Helsinki model:		|  Can you help me with this bag?
GG Translate:		|  Can you help me with this bag?

---------------------------------------------------------------------
Original Sentence:	|  Hallitus neuvottelee uusista rajoituksista huomenna.
Helsinki model:		|  The government is negotiating new restrictions tomorrow.
GG Translate:		|  The government will negotiate new restrictions tomorrow.

---------------------------------------------------------------------
Original Sentence:	|  Tietokoneen suorituskyky riippuu suuresti keskusmuistin määrästä.
Helsinki model:		|  Computer performance depends greatly on the amount of central memory.
GG Translate:		|  The performance of the computer depends greatly on the amount of the main memory.

---------------------------------------------------------------------
Original Sentence:	|  Älä nuolaise ennen kuin ti