<a href="https://colab.research.google.com/github/AISA-DucHaba/AI-Solution-Architect/blob/main/Fun_Language_Translation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🌻 Fun Language Translation, NLP

---

- Welcome to a game of translation.

- We use open-source NLP models from HuggingFace.

- This notebook shows you how to write the core translation code.

- The input is **English**

- The output is the translated **text** AND a spoken **voice**.

- The default target language is **Hindi**, but you can change it to **other languages**.

- One idea for the **'group writing story'** with AI/LLMs is at the bottom.

- Lastly: Code written by AI begins with a **'# prompt: '** comment.

---

# 🙈 Legal (because we must)

---

- This code is for Duc's friends in the AI Solution Architect course.

- My friends are free to use, hack, and learn from the code.

- To all others, it is Copyright 2025 by Duc Haba AND protected by the GNU General Public License. https://www.gnu.org/licenses/gpl-3.0.en.html

---

# Step 1: Basic check

---

- Are we OK with the notebook?
  - I am using Google Colab, but any other Jupyter notebook environment is fine, e.g. AWS SageMaker, Microsoft Azure AI, Kaggle, etc.

- Do we have sufficient CPU and GPU?
  - I am using a CPU (8 cores) with 51 GB RAM and 230 GB of disk space.
  - GPU: NVIDIA, 15 GB RAM.
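
The same basic check can be sketched in plain Python, which helps when the notebook runs somewhere that shell commands are not available. This is a minimal sketch using only the standard library; the function name `basic_system_check` and the dictionary keys are my own choices for illustration, and finding `nvidia-smi` on the PATH only suggests (it does not prove) that a usable GPU is present.

```python
import os
import shutil


def basic_system_check(path="/"):
    """Collect a few machine facts using only the standard library."""
    total, used, free = shutil.disk_usage(path)
    return {
        "cpu_cores": os.cpu_count(),
        "disk_total_gb": round(total / 1e9, 1),
        "disk_free_gb": round(free / 1e9, 1),
        # True when the NVIDIA command-line tool is on the PATH,
        # which usually (not always) means a GPU driver is installed.
        "has_nvidia_smi": shutil.which("nvidia-smi") is not None,
    }


for key, value in basic_system_check().items():
    print(f"{key}: {value}")
```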

In [None]:
# prompt: print date and time in friendly format for california timezone

from datetime import datetime
from pytz import timezone

# Define the California timezone (Pacific Time)
california_tz = timezone('America/Los_Angeles')

# Get the current time directly in the California timezone
# (datetime.utcnow() is deprecated since Python 3.12)
california_time = datetime.now(california_tz)

# Print the date and time in a friendly format
print(california_time.strftime('%Y-%m-%d %H:%M:%S %Z%z'))

In [None]:
# prompt: print out system and gpu info

!lscpu
!nvidia-smi

In [None]:
# prompt: login to huggingface
# OPTIONAL: You need to do this if you plan to deploy the code on HuggingFace

from huggingface_hub import notebook_login
notebook_login()

# Step 2: Load NLP base model

---

- You can choose any 'text translation' model on HuggingFace, but some of the syntax may change based on the model you select.

- The default model is: "Helsinki-NLP/opus-mt-en-hi"

- Link to the model info: https://huggingface.co/Helsinki-NLP/opus-mt-en-hi

- There are **1,527 text translation models from Helsinki-NLP**.

- Additional NLP translation models from Helsinki-NLP: https://huggingface.co/Helsinki-NLP/models
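
The Helsinki-NLP models used in this notebook follow a naming pattern of source and target language codes, e.g. "opus-mt-en-hi" and "opus-mt-en-fr". Here is a small sketch that builds a model name from two codes. The helper `helsinki_model_name` is hypothetical (it is not part of any library), and it only builds a string: not every language pair has a published model, so check the model list on HuggingFace before loading one.

```python
def helsinki_model_name(src_lang, tgt_lang):
    """Build a model id such as 'Helsinki-NLP/opus-mt-en-hi'.

    Assumption: the pair follows the 'opus-mt-<source>-<target>' pattern.
    The function does not verify that the model exists on HuggingFace.
    """
    src = src_lang.strip()
    tgt = tgt_lang.strip()
    if not src or not tgt:
        raise ValueError("both language codes are required")
    return f"Helsinki-NLP/opus-mt-{src}-{tgt}"


print(helsinki_model_name("en", "hi"))
print(helsinki_model_name("en", "fr"))
```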

In [None]:
# required lib
!pip install transformers
!pip install sacremoses

In [None]:
# prompt: Write python function name "translate_me" using the model "Helsinki-NLP/opus-mt-en-hi"
from transformers import MarianMTModel, MarianTokenizer
#
# Define new model name here:
#
_MODEL_NAME = "Helsinki-NLP/opus-mt-en-hi"
_readable_name = "English to Hindi" # OPTIONAL for human to read
#
tokenizer = MarianTokenizer.from_pretrained(_MODEL_NAME)
model = MarianMTModel.from_pretrained(_MODEL_NAME)

def translate_me(text):
    inputs = tokenizer(text, return_tensors="pt", padding=True)
    translated = model.generate(**inputs)
    return tokenizer.decode(translated[0], skip_special_tokens=True)
#
# test it.
english_input = "I love programming."
print(f'\n-----\nModel name: {_MODEL_NAME}\n')
print(f'We are translating from: {_readable_name}')
translation_output = translate_me(english_input)
print(f'English: {english_input}\nTranslation: {translation_output}')

In [None]:
%%time
# test it again.
english_input = "To be, or not to be, that is the question: Whether 'tis nobler in the mind to suffer. The slings and arrows of outrageous fortune, Or to take arms against a sea of troubles."
translation_output = translate_me(english_input)
#
print(f'We are translating from: {_readable_name}')
print(f'English: {english_input}\nTranslation: {translation_output}\n')
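
Translation models of this kind generally handle a sentence or two at a time better than a long passage, and `translate_me` decodes only one output sequence per call. One possible approach, sketched below, is to split the text into sentences and translate them one by one. The helpers `split_sentences` and `translate_long_text` are hypothetical names, and the regular expression is a simple assumption that will mis-split abbreviations such as "e.g.". The translator is passed in as a function, so the sketch runs here with `str.upper` as a stand-in.

```python
import re


def split_sentences(text):
    """Split text after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def translate_long_text(text, translate_fn):
    """Translate each sentence separately and join the results with spaces."""
    return " ".join(translate_fn(sentence) for sentence in split_sentences(text))


# Stand-in translator so the sketch runs without loading a model.
print(translate_long_text("Hello there. How are you?", str.upper))
```

In the notebook, the call would look like `translate_long_text(english_input, translate_me)`.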

# Step 3: Speak it

---

- That was easy, so let's push forward and ask the notebook to say it.

In [None]:
# install required lib
!pip install gTTS

In [None]:
# prompt: write python function "say_it" to use Notebook to speak Hindi
from IPython.display import Audio
from gtts import gTTS
#
#
# _out_lang: I manually made this a parameter; it defaults to Hindi
#
def say_it(text, _out_lang = 'hi'):
    tts = gTTS(text=text, lang=_out_lang)
    filename = "/tmp/hindi_audio.mp3"    # you can choose any path and filename
    tts.save(filename)
    return Audio(filename, autoplay=True)

# Example: Test it.
say_it(translation_output)

- WOW, we just spoke Shakespeare in Hindi.
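
The `say_it` function writes every clip to the same file, so each call overwrites the previous audio. If you want to keep the clips, one option is to reserve a unique file path per call. This is a sketch using the standard library `tempfile` module; the helper name `new_audio_path` and the "speech_" prefix are assumptions for illustration.

```python
import os
import tempfile


def new_audio_path(lang, directory=None):
    """Reserve a unique, empty .mp3 file and return its path."""
    fd, path = tempfile.mkstemp(prefix=f"speech_{lang}_", suffix=".mp3", dir=directory)
    os.close(fd)  # the writer reopens the path itself, so release our handle
    return path


first = new_audio_path("hi")
second = new_audio_path("hi")
print(first != second)
```

Inside `say_it`, the fixed filename could then be replaced with `filename = new_audio_path(_out_lang)`.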

# Step 4: Put it all together

---

- Combine the text translation with the narration into one function.

- Yes, folks. In other words, it is an Agent AI.

- I challenge you to upgrade it to an Agentic AI.

- Note: If you need to speak multiple languages in the same app, you can create multiple copies of the 'agent_translate_and_speak' function, each with a different name and its own translation model.
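
As an alternative to copying the function by hand for each language, here is a sketch that keeps the language pairs in a table and builds one agent function per pair. Everything named here (`LanguagePair`, `LANGUAGE_PAIRS`, `make_agent`, and the stand-in `fake_translate` and `fake_speak` callables) is hypothetical and only for illustration. Note the translate callable in this sketch takes the model name as its first argument, which differs from the notebook's `translate_me(text)`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LanguagePair:
    readable_name: str  # for humans to read
    model_name: str     # HuggingFace model id
    out_lang: str       # language code for the spoken output


LANGUAGE_PAIRS = {
    "en-hi": LanguagePair("English to Hindi", "Helsinki-NLP/opus-mt-en-hi", "hi"),
    "hi-en": LanguagePair("Hindi to English", "Helsinki-NLP/opus-mt-hi-en", "en"),
    "en-fr": LanguagePair("English to French", "Helsinki-NLP/opus-mt-en-fr", "fr"),
}


def make_agent(pair, translate_fn, speak_fn):
    """Return a function that translates and then speaks for one language pair."""
    def agent(text):
        translated = translate_fn(pair.model_name, text)
        print(f"{pair.readable_name}: {translated}")
        return speak_fn(translated, pair.out_lang)
    return agent


# Stand-in callables so the sketch runs without downloading any model.
def fake_translate(model_name, text):
    return f"[{model_name}] {text}"


def fake_speak(text, lang):
    return (lang, text)


agents = {
    key: make_agent(pair, fake_translate, fake_speak)
    for key, pair in LANGUAGE_PAIRS.items()
}
print(agents["en-fr"]("Hello"))
```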

In [None]:
# prompt: write a Python function with documentation to call the translate_me function then call the say_it function
# I manually added the `_out_lang = 'hi'` parameter, with Hindi as the default.

def agent_translate_and_speak(text, _out_lang = 'hi' ):
  """
  Translates the given text with the currently loaded model and speaks the translation.

  Args:
    text: The input text to translate and speak.
    _out_lang: The language code for the spoken output (default 'hi', Hindi).
      It should match the target language of the loaded model.

  Returns:
    The IPython Audio object produced by say_it.
  """
  translated_text = translate_me(text)
  print(f'From: {_readable_name}\n')
  print(f'Input: {text}\nTranslation: {translated_text}\n')
  return say_it(translated_text, _out_lang)

# Example usage:
english_input = "Hello, how are you?"
agent_translate_and_speak(english_input)


# 💃 🕺 Step 5: Let's Disco Dance: New Translation

---

- When you stop disco dancing, we can try a different language translation.

- It is really easy. We just need to redefine the NLP model.

- Remember the 1,527 NLP models in Step 2?
  - Here they are again. Additional NLP translation models from Helsinki-NLP: https://huggingface.co/Helsinki-NLP/models
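
Redefining the model works, but switching back and forth between languages loads the same model again each time. One possible refinement is to keep loaded models in a small cache keyed by model name. The class `ModelCache` below is a hypothetical sketch, and it takes the loader as a parameter so the example runs here with a stand-in `fake_loader` instead of a real download.

```python
class ModelCache:
    """Load each model name once and reuse the result afterwards."""

    def __init__(self, loader):
        self._loader = loader   # callable: model_name -> loaded object(s)
        self._loaded = {}

    def get(self, model_name):
        if model_name not in self._loaded:
            self._loaded[model_name] = self._loader(model_name)
        return self._loaded[model_name]


load_calls = []


def fake_loader(model_name):
    """Stand-in loader that records each call instead of downloading."""
    load_calls.append(model_name)
    return ("tokenizer for " + model_name, "model for " + model_name)


cache = ModelCache(fake_loader)
cache.get("Helsinki-NLP/opus-mt-hi-en")
cache.get("Helsinki-NLP/opus-mt-hi-en")  # served from the cache
print(load_calls)  # ['Helsinki-NLP/opus-mt-hi-en']
```

In the notebook, the loader could be a function that returns `(MarianTokenizer.from_pretrained(name), MarianMTModel.from_pretrained(name))`.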

## Hindi to English

---

In [None]:
# Define new language (NLP) model
#
# YOU choose.
# I chose Hindi to English.
#
_MODEL_NAME = "Helsinki-NLP/opus-mt-hi-en"
_out_lang = 'en'                    # speak output in this language
_readable_name = "Hindi to English" # OPTIONAL for human to read
#
tokenizer = MarianTokenizer.from_pretrained(_MODEL_NAME)
model = MarianMTModel.from_pretrained(_MODEL_NAME)

In [None]:
# test it
_input_text = 'हैलो, तुम कैसे हो?'
agent_translate_and_speak(_input_text, _out_lang = _out_lang)

## English to French

---

In [None]:
# Define new language (NLP) model
#
# YOU choose.
# I chose English to French.
#
_MODEL_NAME = "Helsinki-NLP/opus-mt-en-fr"
_out_lang = 'fr'                     # speak output in this language
_readable_name = "English to French" # OPTIONAL for human to read
#
tokenizer = MarianTokenizer.from_pretrained(_MODEL_NAME)
model = MarianMTModel.from_pretrained(_MODEL_NAME)

In [None]:
%%time
# test it
_input_text = 'The quick brown fox jumps over the lazy dog.'
agent_translate_and_speak(_input_text, _out_lang = _out_lang)

## English to Bengali

---

In [None]:
# Define new language (NLP) model
#
# YOU choose.
# I chose English to Bengali.
#
# NOTICE: I am NOT using Helsinki-NLP, but a model from 'shhossain'.
#
_MODEL_NAME = "shhossain/opus-mt-en-to-bn"
_out_lang = 'bn'                          # speak output in this language
_readable_name = "English to Bengali"     # OPTIONAL for human to read
#
tokenizer = MarianTokenizer.from_pretrained(_MODEL_NAME)
model = MarianMTModel.from_pretrained(_MODEL_NAME)

In [None]:
# test it
_input_text = 'Good evening. I would like to order two pizzas. One with chicken curry and the other with lamb masala.'
agent_translate_and_speak(_input_text, _out_lang = _out_lang)

- The End.