# Modularize EXAONE 3.0 to Test it Quickly
- Created: 2024-12-02 (Mon)
- Updated: 2024-12-02 (Mon)

Relevant link: https://huggingface.co/LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct

The purpose of this notebook is to make the code more compact and readable than [Getting Started with EXAONE 3.0.ipynb](Getting%20Started%20with%20EXAONE%203.0.ipynb).

## Configure

In [1]:
# Fixes "ModuleNotFoundError: No module named 'torch'"
# The version specifier is quoted so the shell does not treat ">" as an output redirect.
!pip install --quiet torch torchvision torchaudio transformers huggingface_hub "accelerate>=0.26.0"

In [2]:
# Restart runtime
#   Otherwise, the ImportError won't go away.
import sys

if "google.colab" in sys.modules:
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
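
Before restarting, it can help to confirm that the packages installed above are importable. The following is a minimal sketch using only the standard library. The helper name `missing_packages` is hypothetical and not part of the original notebook.

```python
import importlib.util

def missing_packages(names):
    """Return the subset of top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]

# Import names taken from the pip command above.
required = ["torch", "transformers", "huggingface_hub", "accelerate"]
print("Missing packages:", missing_packages(required))
```

An empty list means every package was found. Any name that is listed still needs to be installed before the cells below will run.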

## Log into Hugging Face Hub
The Hugging Face token was created and saved in `token-to-access-exaone-since-2024-q4.txt`.

Open a new terminal (File > New > Terminal) and run:
```bash
$ huggingface-cli login
# Paste your access token when prompted.
```
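
As an alternative to the terminal, the login can be done from a notebook cell. This is a minimal sketch, assuming the token file named above sits in the working directory and contains only the token. `read_token` and `login_from_file` are hypothetical helper names, while `login` is the function provided by `huggingface_hub`.

```python
from pathlib import Path

def read_token(path):
    """Read an access token from a text file, dropping surrounding whitespace."""
    return Path(path).read_text(encoding="utf-8").strip()

def login_from_file(path="token-to-access-exaone-since-2024-q4.txt"):
    """Log into the Hugging Face Hub with the token stored in `path`."""
    # Imported lazily so read_token() works even without huggingface_hub installed.
    from huggingface_hub import login
    login(token=read_token(path))

# login_from_file()  # uncomment to log in; requires network access
```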


## Modularize the code

In [3]:
import datetime

def print_now():
  """
  Prints the current date and time in the format YYYY-MM-DD HH:MM:SS.
    Args: None
    Returns: None (prints the formatted timestamp)
  # Example usage
    print_now()
  # 2024-12-02 06:59:49
  """
  now = datetime.datetime.now()
  formatted_now = now.strftime("%Y-%m-%d %H:%M:%S")
  print(formatted_now)

In [4]:
print_now()

2024-12-02 06:59:49


In [5]:
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def load_exaone():
  """
  Loads and returns EXAONE's model and tokenizer.
    Args: None
    Returns: model, tokenizer
  # Example usage
    model, tokenizer = load_exaone() 
  """

  model = AutoModelForCausalLM.from_pretrained(
      "LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct",
      torch_dtype=torch.bfloat16,
      trust_remote_code=True,
      device_map="auto"
  )
  tokenizer = AutoTokenizer.from_pretrained("LGAI-EXAONE/EXAONE-3.0-7.8B-Instruct")

  return model, tokenizer

In [6]:
model, tokenizer = load_exaone() 

Loading checkpoint shards:   0%|          | 0/7 [00:00<?, ?it/s]

Some parameters are on the meta device because they were offloaded to the cpu.
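
The warning above suggests that `device_map="auto"` placed some layers off the GPU. The following is a minimal sketch for summarizing where the modules ended up. It assumes the loaded model exposes the `hf_device_map` dictionary that transformers sets when a device map is used. `summarize_device_map` is a hypothetical helper, and the example dictionary is made up for illustration.

```python
from collections import Counter

def summarize_device_map(device_map):
    """Count how many modules were placed on each device."""
    return dict(Counter(str(device) for device in device_map.values()))

# Illustrative map only; real keys and devices depend on the model and hardware.
example_map = {"embed": 0, "block.0": 0, "block.1": "cpu"}
print(summarize_device_map(example_map))

# With the real model (assumes `model` was loaded with device_map="auto"):
# print(summarize_device_map(model.hf_device_map))
```

A large share of modules on `cpu` or `disk` usually means slower generation, which is worth knowing before timing the cells below.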


In [12]:
def generate_exaone_response(prompt, model, tokenizer):
  """
  Generates a response from the EXAONE model based on the given prompt.
    Args: prompt       The input prompt for the model.
          model        The model returned by load_exaone().
          tokenizer    The tokenizer returned by load_exaone().
    Returns: response  The decoded output, including the chat template (also printed).
  # Example usage
    prompt = "Explain who you are"
    generate_exaone_response(prompt, model, tokenizer)
  # [|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|] ...
  """

  assert isinstance(prompt, str), "Input prompt must be a text string."
  # Note: The EXAONE 3.0 instruction-tuned language model was trained to utilize the system prompt,
  #       so using the system prompt below is highly recommended.
  messages = [
      {"role": "system", 
       "content": "You are EXAONE model from LG AI Research, a helpful assistant."},
      {"role": "user", "content": prompt}
  ]

  print(f"messages={messages}")

  input_ids = tokenizer.apply_chat_template(
      messages,
      tokenize=True,
      add_generation_prompt=True,
      return_tensors="pt"
  )

  print(f"input_ids={input_ids}")

  output = model.generate(
      input_ids.to("cuda"),
      eos_token_id=tokenizer.eos_token_id,
      max_new_tokens=128
  )
    
  response = tokenizer.decode(output[0])

  print(response)
  return response

## Test EXAONE with the sample prompts in English and Korean

In [13]:
prompt = "Explain who you are"
generate_exaone_response(prompt, model, tokenizer)

messages=[{'role': 'system', 'content': 'You are EXAONE model from LG AI Research, a helpful assistant.'}, {'role': 'user', 'content': 'Explain who you are'}]
input_ids=tensor([[  420,   453, 47982,   453,   422,  5094,   937, 11522,   394,  5746,
          1932,  1005,  7401, 10680,  8385,   373,   619, 12913, 19415,   375,
           361,   560,   420,   453, 14719,   453,   422, 42090,   921,  1497,
           904,   937,   560,   420,   453,  1167,  8659,   453,   422]])
[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]
[|user|]Explain who you are
[|assistant|]Hello! I'm EXAONE 3.0, an advanced language model developed by LG AI Research. My primary function is to assist users by providing information, answering questions, and helping with various tasks using natural language. I'm designed to understand and generate human-like text based on the data I've been trained on. My goal is to be a helpful and informative assistant for your needs. How can I assist you today?[|endofturn|]

"[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]Explain who you are\n[|assistant|]Hello! I'm EXAONE 3.0, an advanced language model developed by LG AI Research. My primary function is to assist users by providing information, answering questions, and helping with various tasks using natural language. I'm designed to understand and generate human-like text based on the data I've been trained on. My goal is to be a helpful and informative assistant for your needs. How can I assist you today?[|endofturn|]"
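
The returned string contains the system and user turns as well as the reply. The following is a minimal sketch for keeping only the assistant's text, assuming the `[|assistant|]` and `[|endofturn|]` markers appear exactly as in the output above. `extract_assistant_reply` is a hypothetical helper, not part of the original notebook, and the sample string is a shortened version of the real output.

```python
ASSISTANT_TAG = "[|assistant|]"
END_OF_TURN_TAG = "[|endofturn|]"

def extract_assistant_reply(response):
    """Return the text after the last assistant tag, without end-of-turn tags."""
    # rpartition keeps everything after the final assistant tag;
    # if the tag is missing, the whole string is used.
    _, _, reply = response.rpartition(ASSISTANT_TAG)
    return reply.replace(END_OF_TURN_TAG, "").strip()

# Shortened sample in the same layout as the decoded output above.
sample = (
    "[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n"
    "[|user|]Explain who you are\n"
    "[|assistant|]Hello! I'm EXAONE 3.0.[|endofturn|]"
)
print(extract_assistant_reply(sample))
```

Working on the decoded string keeps `generate_exaone_response` unchanged. Slicing the generated token IDs before decoding would be another option.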

In [14]:
# Choose your prompt
prompt = "너의 소원을 말해봐"       # Korean example ("Tell me your wish")
generate_exaone_response(prompt, model, tokenizer)

messages=[{'role': 'system', 'content': 'You are EXAONE model from LG AI Research, a helpful assistant.'}, {'role': 'user', 'content': '너의 소원을 말해봐'}]
input_ids=tensor([[  420,   453, 47982,   453,   422,  5094,   937, 11522,   394,  5746,
          1932,  1005,  7401, 10680,  8385,   373,   619, 12913, 19415,   375,
           361,   560,   420,   453, 14719,   453,   422,  2088,   730, 19658,
           696,  1216,   999,  7000,   560,   420,   453,  1167,  8659,   453,
           422]])
[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]
[|user|]너의 소원을 말해봐
[|assistant|]EXAONE 3.0 모델로서 저의 주된 목적은 사용자에게 정확하고 유용한 정보를 제공하는 것입니다. 저는 다양한 질문에 답변하고, 문제를 해결하며, 학습과 연구를 돕는 역할을 합니다. 또한, 사용자의 프라이버시와 데이터 보안을 최우선으로 생각합니다. 이를 통해 사람들의 삶의 질을 향상시키고, 더 나은 세상을 만드는 데 기여하고자 합니다.[|endofturn|]


'[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]너의 소원을 말해봐\n[|assistant|]EXAONE 3.0 모델로서 저의 주된 목적은 사용자에게 정확하고 유용한 정보를 제공하는 것입니다. 저는 다양한 질문에 답변하고, 문제를 해결하며, 학습과 연구를 돕는 역할을 합니다. 또한, 사용자의 프라이버시와 데이터 보안을 최우선으로 생각합니다. 이를 통해 사람들의 삶의 질을 향상시키고, 더 나은 세상을 만드는 데 기여하고자 합니다.[|endofturn|]'

## Translation Task
### Korean to English
Source: [Works with famous first sentences/Novels/Korea (첫 문장이 유명한 작품/소설/한국)](https://namu.wiki/w/%EC%B2%AB%20%EB%AC%B8%EC%9E%A5%EC%9D%B4%20%EC%9C%A0%EB%AA%85%ED%95%9C%20%EC%9E%91%ED%92%88/%EC%86%8C%EC%84%A4/%ED%95%9C%EA%B5%AD)
- Novel "한강" (Han River) by 조정래 (Jo Jung-rae), 2001

"새벽 어스름이 스러져 가고 있는 한겨울 들판을 기차가 달리고 있었다. 밤새 무성하게 돋아난 서릿발로 세상은 싸늘하게 얼어붙어 있었다."

"As dawn's twilight was fading away, a train was running across a winter field. The world was frozen hard with frost that had grown lush all night."

In [15]:
generate_exaone_response("Translate 새벽 어스름이 스러져 가고 있는 한겨울 들판을 기차가 달리고 있었다. 밤새 무성하게 돋아난 서릿발로 세상은 싸늘하게 얼어붙어 있었다. to English", model, tokenizer)

messages=[{'role': 'system', 'content': 'You are EXAONE model from LG AI Research, a helpful assistant.'}, {'role': 'user', 'content': 'Translate 새벽 어스름이 스러져 가고 있는 한겨울 들판을 기차가 달리고 있었다. 밤새 무성하게 돋아난 서릿발로 세상은 싸늘하게 얼어붙어 있었다. to English'}]
input_ids=tensor([[  420,   453, 47982,   453,   422,  5094,   937, 11522,   394,  5746,
          1932,  1005,  7401, 10680,  8385,   373,   619, 12913, 19415,   375,
           361,   560,   420,   453, 14719,   453,   422, 12018, 23409,  8000,
         32671,  1646,   634,  3784,  1928,   713,   853,   773,   657,   764,
         15263, 56036,   696, 15965,   905,  1578,  1766,   773,  2957,   643,
           375, 31131, 48494,  1130,  1060, 11467,  1023,  1526,  1124,  5509,
          1728,   715,  4557,   732, 67033,  1130,  1060, 53489,   721,   773,
          2957,   643,   375,   681,  6273,   560,   420,   453,  1167,  8659,
           453,   422]])
[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]
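[|user|]Translate 새벽 어스름이 스러져 가고 있는 한겨울 들판을 기차가 달리고 있었다. 밤새 무성하게 돋아난 서릿발로 세상은 싸늘하게 얼어붙어 있었다. to English
[|assistant|]Here's the translation of your sentence into English:

"As dawn's twilight was fading away, a train was running across a winter field. The world was frozen hard with frost that had grown lush all night."

This translation maintains the poetic and descriptive nature of the original Korean text while ensuring it makes sense in English.[|endofturn|]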

'[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]Translate 새벽 어스름이 스러져 가고 있는 한겨울 들판을 기차가 달리고 있었다. 밤새 무성하게 돋아난 서릿발로 세상은 싸늘하게 얼어붙어 있었다. to English\n[|assistant|]Here\'s the translation of your sentence into English:\n\n"As dawn\'s twilight was fading away, a train was running across a winter field. The world was frozen hard with frost that had grown lush all night."\n\nThis translation maintains the poetic and descriptive nature of the original Korean text while ensuring it makes sense in English.[|endofturn|]'
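
The translation prompts in this section are written by hand. The following is a minimal sketch of a helper that builds them consistently, following the "Translate ... to English" pattern used above. `build_translation_prompt` is a hypothetical name, not part of the original notebook.

```python
def build_translation_prompt(text, target_language="English"):
    """Build a 'Translate <text> to <language>' prompt with single spaces."""
    cleaned = " ".join(text.split())  # collapse newlines and repeated spaces
    return f"Translate {cleaned} to {target_language}"

prompt = build_translation_prompt("새벽 어스름이 스러져 가고 있는 한겨울 들판을 기차가 달리고 있었다.")
print(prompt)
# generate_exaone_response(prompt, model, tokenizer)  # assumes the cells above were run
```

Building the prompt in one place keeps the spacing around the source text and the target language uniform across examples.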

### Japanese to English

In [16]:
generate_exaone_response("Translate 吾輩わがはいは猫である。名前はまだ無い。どこで生れたかとんと見当けんとうがつかぬ。to English", model, tokenizer)

messages=[{'role': 'system', 'content': 'You are EXAONE model from LG AI Research, a helpful assistant.'}, {'role': 'user', 'content': 'Translate 吾輩わがはいは猫である。名前はまだ無い。どこで生れたかとんと見当けんとうがつかぬ。to English'}]
input_ids=tensor([[   420,    453,  47982,    453,    422,   5094,    937,  11522,    394,
           5746,   1932,   1005,   7401,  10680,   8385,    373,    619,  12913,
          19415,    375,    361,    560,    420,    453,  14719,    453,    422,
          12018,  23409, 100537,  39592,    464,  31572,  10106,   7406,   4844,
           7406,    525,    596,    466,  47671,  71260,  24913,  34708,   7406,
          11768,  18427,  83417,   4844,  71260,  15891,   9756,   9636,  22861,
          25624,  10037,   6790,  14541,   6790,  70549,  25403,    603,  19832,
          14541,   6790,   9345,  10106, 101202,   1037,    467,  71260,    665,
           6273,    560,    420,    453,   1167,   8659,    453,    422]])
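[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]
[|user|]Translate 吾輩わがはいは猫である。名前はまだ無い。どこで生れたかとんと見当けんとうがつかぬ。to English
[|assistant|]Here's the translation of the given Japanese text to English:

"I am a cat. My name is still unknown. I don't know where I was born; it's impossible to tell."

This is a famous passage from the novel "The Cat of Sakaimachi" by Natsume Soseki. The text describes a cat's perspective, emphasizing its own identity and lack of knowledge about its origins.[|endofturn|]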

'[|system|]You are EXAONE model from LG AI Research, a helpful assistant.[|endofturn|]\n[|user|]Translate 吾輩わがはいは猫である。名前はまだ無い。どこで生れたかとんと見当けんとうがつかぬ。to English\n[|assistant|]Here\'s the translation of the given Japanese text to English:\n\n"I am a cat. My name is still unknown. I don\'t know where I was born; it\'s impossible to tell."\n\nThis is a famous passage from the novel "The Cat of Sakaimachi" by Natsume Soseki. The text describes a cat\'s perspective, emphasizing its own identity and lack of knowledge about its origins.[|endofturn|]'

It's great to know that EXAONE inherently supports Japanese translation.
- Impressive! EXAONE knows this passage was written by Natsume Soseki.
- Hallucination! The title of this novel is "I Am a Cat", not "The Cat of Sakaimachi".

Q: Is there a possibility that the English title is "The Cat of Sakaimachi"?

A: No, according to my Google search results.

<img src="images/exaone3_0-hallunication-I-am-a-cat.png">

## Next step
Let's compare EXAONE with other LLMs. Only the first three models in the comparison table below are considered.

<img src="images/exaone-evaluation-comparison-table.png">