## Import libraries and load the model

In [1]:
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Use a pre-trained model from Hugging Face
# After training finishes, switch to: model_path = "../models/final_model"
model_path = "Helsinki-NLP/opus-mt-en-vi"

print(f"⏳ Loading model from: {model_path}...")

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)
model.to(device)


⏳ Loading model from: Helsinki-NLP/opus-mt-en-vi...




MarianMTModel(
  (model): MarianModel(
    (shared): Embedding(53685, 512, padding_idx=53684)
    (encoder): MarianEncoder(
      (embed_tokens): Embedding(53685, 512, padding_idx=53684)
      (embed_positions): MarianSinusoidalPositionalEmbedding(512, 512)
      (layers): ModuleList(
        (0-5): 6 x MarianEncoderLayer(
          (self_attn): MarianAttention(
            (k_proj): Linear(in_features=512, out_features=512, bias=True)
            (v_proj): Linear(in_features=512, out_features=512, bias=True)
            (q_proj): Linear(in_features=512, out_features=512, bias=True)
            (out_proj): Linear(in_features=512, out_features=512, bias=True)
          )
          (self_attn_layer_norm): LayerNorm((512,), eps=1e-05, elementwise_affine=True)
          (activation_fn): SiLU()
          (fc1): Linear(in_features=512, out_features=2048, bias=True)
          (fc2): Linear(in_features=2048, out_features=512, bias=True)
          (final_layer_norm): LayerNorm((512,), eps=1e-05

In [2]:
def translate_ielts(text, model, tokenizer, num_beams=4):
    """
    Translate text from English to Vietnamese using the trained model.
    """
    model.eval()  # Switch to inference mode (disables dropout)

    # 1. Tokenize (convert text to token IDs)
    inputs = tokenizer(
        text,
        return_tensors="pt",
        max_length=256,  # IELTS sentences are long, so use 256 or 512; longer input is truncated
        truncation=True
    ).to(model.device)  # Keep the inputs on the same device as the model

    # 2. Generate the translation
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            max_length=256,
            num_beams=num_beams,  # Beam search, as in Lab 6.1
            early_stopping=True,
            no_repeat_ngram_size=2,  # Avoid repetition (blocks any repeated 2-gram)
            length_penalty=1.0  # 1.0 is the default; values > 1.0 favor longer outputs
        )

    # 3. Decode (convert token IDs back to text)
    translated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return translated_text
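
Because `translate_ielts` tokenizes with `max_length=256` and `truncation=True`, any input longer than that limit is cut off before translation, so a multi-paragraph passage would only be partly translated. One possible workaround is to split the text into smaller chunks and translate them one by one. The sketch below is an illustration only: `split_into_chunks`, `translate_long`, and the `max_words=60` budget are hypothetical names and values, not part of the original notebook, and the sentence splitter is deliberately naive (it also breaks after abbreviations such as "p.m.").

```python
import re


def split_into_chunks(text, max_words=60):
    """Group sentences into chunks of at most roughly max_words words.

    Hypothetical helper: the word budget is an assumed stand-in for the
    tokenizer's real token limit, not a value taken from the notebook.
    """
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the budget.
        if current and count + n_words > max_words:
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


def translate_long(text, translate_fn, max_words=60):
    """Translate each chunk with translate_fn and join the results."""
    return " ".join(translate_fn(chunk) for chunk in split_into_chunks(text, max_words))


# Possible usage with the function defined above (assumes model and tokenizer are loaded):
# translate_long(ielts_text, lambda chunk: translate_ielts(chunk, model, tokenizer))
```

Chunking on sentence boundaries keeps each input within the model's limit, at the cost of losing context between chunks. A single sentence longer than the budget still becomes its own chunk and may be truncated.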

## Try a short passage

In [3]:
# Sample passage from an IELTS Reading test
ielts_text = """
At 11.39 p.m. on the evening of Sunday 14 April 1912, lookouts Frederick Fleet and Reginald Lee on the forward mast of the Titanic sighted an eerie, black mass coming into view directly in front of the ship. Fleet picked up the phone to the helm, waited for Sixth Officer Moody to answer, and yelled 'Iceberg, right ahead!' The greatest disaster in maritime history was about to be set in motion.
"""

print("--- INPUT English ---")
print(ielts_text.strip())

print("\n---  OUTPUT Vietnamese ---")
translation = translate_ielts(ielts_text, model, tokenizer)
print(translation)

--- INPUT English ---
At 11.39 p.m. on the evening of Sunday 14 April 1912, lookouts Frederick Fleet and Reginald Lee on the forward mast of the Titanic sighted an eerie, black mass coming into view directly in front of the ship. Fleet picked up the phone to the helm, waited for Sixth Officer Moody to answer, and yelled 'Iceberg, right ahead!' The greatest disaster in maritime history was about to be set in motion.

---  OUTPUT Vietnamese ---
Tại 11.39 p.m. vào tối Chủ nhật 14 tháng Tư năm Perday 14 April, nhìn thấy Patt và Regina Lee về phía trước của các phát hiện một eer, khối u đen đến nhìn trực tiếp vào trước mặt của con tàu.on đã lấy điện thoại để gọi, chờ đợi cho viên sĩ quan để trả lời, và 'Iberg, phải, phía trên!' Trong ý tưởng lớn nhất về lịch sử là về được thiết lập trong chuyển động.


## Try a longer passage

In [4]:
# Sample passage from an IELTS Reading test
ielts_text = """
At 11.39 p.m. on the evening of Sunday 14 April 1912, lookouts Frederick Fleet and Reginald Lee on the forward mast of the Titanic sighted an eerie, black mass coming into view directly in front of the ship. Fleet picked up the phone to the helm, waited for Sixth Officer Moody to answer, and yelled 'Iceberg, right ahead!' The greatest disaster in maritime history was about to be set in motion.

Thirty-seven seconds later, despite the efforts of officers in the bridge and engine room to steer around the iceberg, the Titanic struck a piece of submerged ice, bursting rivets in the ship's hull and flooding the first five watertight compartments. The ship's designer, Thomas Andrews, carried out a visual inspection of the ship's damage and informed Captain Smith at midnight that the ship would sink in less than two hours. By 12.30 a.m., the lifeboats were being filled with women and children, after Smith had given the command for them to be uncovered and swung out 15 minutes earlier. The first lifeboat was successfully lowered 15 minutes later, with only 28 of its 65 seats occupied. By 1.15 a.m., the waterline was beginning to reach the Titanic's name on the ship's bow, and over the next hour every lifeboat would be released as officers struggled to maintain order amongst the growing panic on board.

The closing moments of the Titanic's sinking began shortly after 2 a.m., as the last lifeboat was lowered and the ship's propellers lifted out of the water, leaving the 1,500 passengers still on board to surge towards the stern. At 2.17 a.m., Harold Bride and Jack Philips tapped out their last wireless message after being relieved of duty as the ship's wireless operators, and the ship's band stopped playing. Less than a minute later, occupants of the lifeboats witnessed the ship's lights flash once, then go black, and a huge roar signalled the Titanic's contents plunging towards the bow, causing the front half of the ship to break off and go under. The Titanic's stern bobbed up momentarily, and at 2.20 a.m., the ship finally disappeared beneath the frigid waters.

What or who was responsible for the scale of this catastrophe? Explanations abound, some that focus on very small details. Due to a last minute change in the ship's officer line-up, iceberg lookouts Frederick Fleet and Reginald Lee were making do without a pair of binoculars that an officer transferred off the ship in Southampton had left in a cupboard onboard, unbeknownst to any of the ship's crew. Fleet, who survived the sinking, insisted at a subsequent inquiry that he could have identified the iceberg in time to avert disaster if he had been in possession of the binoculars.

Less than an hour before the Titanic struck the iceberg, wireless operator Cyril Evans on the Californian, located just 20 miles to the north, tried to contact operator Jack Philips on the Titanic to warn him of pack ice in the area. 'Shut up, shut up, you're jamming my signal', Philips replied. 'I'm busy.' The Titanic's wireless system had broken down for several hours earlier that day, and Philips was clearing a backlog of personal messages that passengers had requested to be sent to family and friends in the USA. Nevertheless, Captain Smith had maintained the ship's speed of 22 knots despite multiple earlier warnings of ice ahead. It has been suggested that Smith was under pressure to make headlines by arriving early in New York, but maritime historians such as Richard Howell have countered this perception, noting that Smith was simply following common procedure at the time, and not behaving recklessly.

One of the strongest explanations for the severe loss of life has been the fact that the Titanic did not carry enough lifeboats for everyone on board. Maritime regulations at the time tied lifeboat capacity to ship size, not to the number of passengers on board. This meant that the Titanic , with room for 1,178 of its 2,222 passengers, actually surpassed the Board of Trade's requirement that it carry lifeboats for 1,060 of its passengers. Nevertheless, with lifeboats being lowered less than half full in many cases, and only 712 passengers surviving despite a two and a half hour window of opportunity, more lifeboats would not have guaranteed more survivors in the absence of better training and preparation. Many passengers were confused about where to go after the order to launch lifeboats was given; a lifeboat drill scheduled for earlier on the same day that the Titanic struck the iceberg was cancelled by Captain Smith, in order to allow passengers to attend church.
"""

print("--- INPUT English ---")
print(ielts_text.strip())

print("\n---  OUTPUT Vietnamese ---")
translation = translate_ielts(ielts_text, model, tokenizer)
print(translation)

--- INPUT English ---
At 11.39 p.m. on the evening of Sunday 14 April 1912, lookouts Frederick Fleet and Reginald Lee on the forward mast of the Titanic sighted an eerie, black mass coming into view directly in front of the ship. Fleet picked up the phone to the helm, waited for Sixth Officer Moody to answer, and yelled 'Iceberg, right ahead!' The greatest disaster in maritime history was about to be set in motion.

Thirty-seven seconds later, despite the efforts of officers in the bridge and engine room to steer around the iceberg, the Titanic struck a piece of submerged ice, bursting rivets in the ship's hull and flooding the first five watertight compartments. The ship's designer, Thomas Andrews, carried out a visual inspection of the ship's damage and informed Captain Smith at midnight that the ship would sink in less than two hours. By 12.30 a.m., the lifeboats were being filled with women and children, after Smith had given the command for them to be uncovered and swung out 15 mi