
# Transformer-based English–French Translation

This notebook is an exploratory implementation of a **Transformer** model for machine translation from English to French.

Main goals:
- Train a Transformer model for **1 epoch** with a **batch size of at most 100**
- Demonstrate **Text Preprocessing**, the **Transformer Architecture Definition**, **Training**, and **Inference**
- Report the metrics `TrainLoss`, `ValLoss`, and `ValAcc` at the end of every batch.

Grading weights:

| Aspect | Weight |
|--------|--------|
| Data Preparation (Text Preprocessing) | 20% |
| Transformer Class Definition | 25% |
| Training Process (TrainLoss, ValLoss, ValAcc) | 35% |
| Inference Translation | 20% |


## Environment Setup

In [1]:

!pip install torch pandas numpy
import torch, pandas as pd, numpy as np, random, math, re, os
from collections import Counter
from torch import nn
from torch.utils.data import Dataset, DataLoader

DEVICE = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Running on', DEVICE)


Running on cuda
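
Section 1 below shuffles the sentence pairs with `random.shuffle` before the 90/10 split, so the exact split changes from run to run. As a minimal sketch of one way to make it repeatable, a dedicated `random.Random` instance can be seeded. The helper name `split_pairs` and the seed value `42` are assumptions for illustration and are not part of the notebook.

```python
import random

def split_pairs(pairs, train_ratio=0.9, seed=42):
    """Shuffle a copy of `pairs` with a fixed seed and split it into train/val.

    Hypothetical helper: the notebook itself shuffles in place without a seed.
    """
    rng = random.Random(seed)      # private RNG, leaves the global state alone
    shuffled = list(pairs)         # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    cut = int(train_ratio * len(shuffled))
    return shuffled[:cut], shuffled[cut:]

# Toy data standing in for (source_tokens, target_tokens) pairs
toy_pairs = [(f"en{i}", f"fr{i}") for i in range(10)]
train_a, val_a = split_pairs(toy_pairs)
train_b, val_b = split_pairs(toy_pairs)
print(len(train_a), len(val_a))    # 9 1
print(train_a == train_b)          # True: same seed, same split
```

If weight initialisation and dropout should be repeatable as well, `torch.manual_seed` would typically be called too.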


## 1. Data Preparation & Text Preprocessing (20%)

In [5]:

# The dataset is read from small_vocab_en.csv and small_vocab_fr.csv
en_path = '/content/small_vocab_en.csv'
fr_path = '/content/small_vocab_fr.csv'

# Read each non-empty line as one complete sentence
with open(en_path, 'r', encoding='utf-8') as f:
    src_texts = [line.strip() for line in f if line.strip()]

with open(fr_path, 'r', encoding='utf-8') as f:
    tgt_texts = [line.strip() for line in f if line.strip()]

# zip() further down would silently drop unmatched lines, so check alignment first
assert len(src_texts) == len(tgt_texts), "source/target line counts differ"

print(f"English sample: {src_texts[0]}")
print(f"French sample: {tgt_texts[0]}")

def clean_text(text):
    # Lowercase, replace every character outside the whitelist with a space,
    # then collapse whitespace. Accented letters that are not listed
    # (e.g. î, û) are replaced as well.
    text = text.lower()
    text = re.sub(r"[^a-zâêôàèçùé'\-\.\,\?\!\s]", ' ', text)
    return re.sub(r'\s+', ' ', text).strip()

def tokenize(text):
    # Whitespace tokenization; punctuation is already space-separated in the data
    return text.split()

src_tokens = [tokenize(clean_text(s)) for s in src_texts]
tgt_tokens = [tokenize(clean_text(t)) for t in tgt_texts]

# Split train/val (90% / 10%)
data = list(zip(src_tokens, tgt_tokens))
random.shuffle(data)
split = int(0.9 * len(data))
train, val = data[:split], data[split:]

PAD, BOS, EOS, UNK = '<pad>', '<s>', '</s>', '<unk>'

def build_vocab(sentences):
    # Special tokens take ids 0-3; remaining tokens are ordered by frequency
    counter = Counter(t for s in sentences for t in s)
    vocab = [PAD, BOS, EOS, UNK] + [t for t, _ in counter.most_common()]
    stoi = {t: i for i, t in enumerate(vocab)}
    itos = {i: t for t, i in stoi.items()}
    return stoi, itos

# Vocabularies are built from the training split only
src_stoi, src_itos = build_vocab([s for s, _ in train])
tgt_stoi, tgt_itos = build_vocab([t for _, t in train])

print('Vocab sizes -> src:', len(src_stoi), '| tgt:', len(tgt_stoi))

English sample: new jersey is sometimes quiet during autumn , and it is snowy in april .
French sample: new jersey est parfois calme pendant l' automne , et il est neigeux en avril .
Vocab sizes -> src: 231 | tgt: 356
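
To make the preprocessing steps concrete, the sketch below reruns the same lowercase / filter / split logic and the frequency-ordered vocabulary on two toy sentences. The sentences are made up for illustration; the function bodies mirror the cell above and are restated only so that the snippet is self-contained.

```python
import re
from collections import Counter

PAD, BOS, EOS, UNK = '<pad>', '<s>', '</s>', '<unk>'

def clean_text(text):
    text = text.lower()
    text = re.sub(r"[^a-zâêôàèçùé'\-\.\,\?\!\s]", ' ', text)
    return re.sub(r'\s+', ' ', text).strip()

def build_vocab(sentences):
    counter = Counter(t for s in sentences for t in s)
    vocab = [PAD, BOS, EOS, UNK] + [t for t, _ in counter.most_common()]
    return {t: i for i, t in enumerate(vocab)}

toy = ["The cat is quiet .", "The dog is NOT quiet !"]   # made-up examples
tokens = [clean_text(s).split() for s in toy]
stoi = build_vocab(tokens)

print(tokens[1])                      # lowercased, punctuation kept as tokens
print(stoi[PAD], stoi[UNK])           # special tokens occupy the first ids

# Words never seen while building the vocabulary fall back to the <unk> id
unseen = clean_text("The bird is quiet .").split()
ids = [stoi.get(t, stoi[UNK]) for t in unseen]
print(ids)

# 'û' is not in the whitelist, so the word is split in two
print(clean_text("Août"))
```

The last line shows a side effect worth keeping in mind: any accented letter missing from the character class is turned into a space, which breaks the word into fragments.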


## 2. Transformer Architecture Definition (25%)

In [8]:
import torch
import torch.nn as nn
import math

# ==========================================
# Positional Encoding
# ==========================================
class PositionalEncoding(nn.Module):
    def __init__(self, d_model, max_len=5000):
        super().__init__()
        pe = torch.zeros(max_len, d_model)
        position = torch.arange(0, max_len, dtype=torch.float).unsqueeze(1)
        div_term = torch.exp(torch.arange(0, d_model, 2).float() * (-math.log(10000.0) / d_model))
        pe[:, 0::2] = torch.sin(position * div_term)
        pe[:, 1::2] = torch.cos(position * div_term)
        pe = pe.unsqueeze(0)
        self.register_buffer('pe', pe)

    def forward(self, x):
        x = x + self.pe[:, :x.size(1)]
        return x


# ==========================================
# Transformer Model for Translation
# ==========================================
class TransformerMT(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, d_model=128, nhead=4, num_layers=2, dim_ff=512, dropout=0.1):
        super().__init__()
        self.src_embed = nn.Embedding(src_vocab, d_model)
        self.tgt_embed = nn.Embedding(tgt_vocab, d_model)
        self.pos_enc = PositionalEncoding(d_model)
        self.transformer = nn.Transformer(
            d_model=d_model,
            nhead=nhead,
            num_encoder_layers=num_layers,
            num_decoder_layers=num_layers,
            dim_feedforward=dim_ff,
            dropout=dropout,
            batch_first=True
        )
        self.fc_out = nn.Linear(d_model, tgt_vocab)

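    def forward(self, src, tgt):
        # Causal mask: decoder position i may only attend to positions <= i,
        # otherwise the model can read the very token it is asked to predict.
        tgt_mask = self.transformer.generate_square_subsequent_mask(tgt.size(1)).to(tgt.device)
        src = self.pos_enc(self.src_embed(src))
        tgt = self.pos_enc(self.tgt_embed(tgt))
        out = self.transformer(src, tgt, tgt_mask=tgt_mask)
        out = self.fc_out(out)
        return out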
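
The `PositionalEncoding` module adds fixed sinusoids to the embeddings: for each position `pos`, dimension `2i` holds the sine and dimension `2i+1` the cosine of `pos / 10000^(2i/d_model)`. The sketch below recomputes the same table in plain Python (no PyTorch) for a tiny `d_model`, purely as an illustration. The function name `sinusoid_table` is an assumption and does not appear in the notebook.

```python
import math

def sinusoid_table(max_len, d_model):
    """Return a max_len x d_model nested list of sinusoidal position encodings."""
    table = [[0.0] * d_model for _ in range(max_len)]
    for pos in range(max_len):
        for k in range(0, d_model, 2):
            # Same frequency term as exp(k * -ln(10000) / d_model) in the module
            freq = math.exp(k * (-math.log(10000.0) / d_model))
            table[pos][k] = math.sin(pos * freq)          # even dimension
            if k + 1 < d_model:
                table[pos][k + 1] = math.cos(pos * freq)  # odd dimension
    return table

pe = sinusoid_table(max_len=4, d_model=4)
for row in pe:
    print([round(v, 4) for v in row])
```

Position 0 always encodes to alternating 0 and 1, and the higher dimensions change much more slowly with `pos` than the lower ones, which is what lets the model tell nearby and distant positions apart.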


In [9]:
from torch.utils.data import Dataset, DataLoader
import numpy as np

# ==========================================
# Dataset & DataLoader
# ==========================================
class TranslationDataset(Dataset):
    def __init__(self, pairs, src_stoi, tgt_stoi, max_len=20):
        self.pairs = pairs
        self.src_stoi = src_stoi
        self.tgt_stoi = tgt_stoi
        self.max_len = max_len

    def encode(self, tokens, stoi, bos=False, eos=False):
        ids = [stoi.get(t, stoi['<unk>']) for t in tokens]
        if bos: ids = [stoi['<s>']] + ids
        if eos: ids = ids + [stoi['</s>']]
        ids = ids[:self.max_len]
        ids += [stoi['<pad>']] * (self.max_len - len(ids))
        return ids

    def __getitem__(self, idx):
        src, tgt = self.pairs[idx]
        src_ids = self.encode(src, self.src_stoi)
        tgt_in = self.encode(tgt, self.tgt_stoi, bos=True)
        tgt_out = self.encode(tgt, self.tgt_stoi, eos=True)
        return torch.tensor(src_ids), torch.tensor(tgt_in), torch.tensor(tgt_out)

    def __len__(self):
        return len(self.pairs)


BATCH_SIZE = 100
MAX_LEN = 20
train_ds = TranslationDataset(train, src_stoi, tgt_stoi, MAX_LEN)
val_ds = TranslationDataset(val, src_stoi, tgt_stoi, MAX_LEN)
train_dl = DataLoader(train_ds, batch_size=BATCH_SIZE, shuffle=True)
val_dl = DataLoader(val_ds, batch_size=BATCH_SIZE)

# ==========================================
# Model & Optimizer Initialization
# ==========================================
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = TransformerMT(len(src_stoi), len(tgt_stoi)).to(device)
# The labels are target-side ids, so the ignored index must come from the target vocabulary
criterion = nn.CrossEntropyLoss(ignore_index=tgt_stoi['<pad>'])
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
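
`TranslationDataset` returns two views of each target sentence: `tgt_in` starts with `<s>` and is fed to the decoder, while `tgt_out` ends with `</s>` and serves as the label, so position `i` of the input is trained to predict token `i` of the output. The following sketch replays that encoding on a toy vocabulary. The vocabulary and the `max_len` of 6 are invented for illustration.

```python
def encode(tokens, stoi, max_len, bos=False, eos=False):
    """Map tokens to ids, optionally add <s>/</s>, then truncate and pad."""
    ids = [stoi.get(t, stoi['<unk>']) for t in tokens]
    if bos:
        ids = [stoi['<s>']] + ids
    if eos:
        ids = ids + [stoi['</s>']]
    ids = ids[:max_len]
    return ids + [stoi['<pad>']] * (max_len - len(ids))

# Toy target vocabulary, made up for this example
stoi = {'<pad>': 0, '<s>': 1, '</s>': 2, '<unk>': 3, 'il': 4, 'est': 5, 'calme': 6}
tgt = ['il', 'est', 'calme']

tgt_in = encode(tgt, stoi, max_len=6, bos=True)
tgt_out = encode(tgt, stoi, max_len=6, eos=True)
print(tgt_in)    # [1, 4, 5, 6, 0, 0]
print(tgt_out)   # [4, 5, 6, 2, 0, 0]

# A sentence with max_len tokens or more loses its </s> label after truncation
long_out = encode(['il'] * 6, stoi, max_len=6, eos=True)
print(long_out)  # [4, 4, 4, 4, 4, 4]
```

Because truncation happens after `</s>` is appended, sentences at or above `max_len` tokens never teach the model to stop. Whether that matters depends on how many sentences in the data reach that length.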


## 3. Training Process (35%)
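
The loop in this section scores predictions only on non-padding positions. As a rough illustration of that masking, the sketch below computes token accuracy in plain Python on hand-written ids. The ids are invented, and `masked_accuracy` is a hypothetical stand-in for the tensor-based `accuracy_fn`. It also prints `ln(356)`, the cross-entropy of a uniform guess over the 356-entry target vocabulary reported in Section 1, as a reference point for the first logged `TrainLoss`.

```python
import math

def masked_accuracy(pred_ids, true_ids, pad_idx=0):
    """Fraction of non-pad positions where the predicted id equals the label."""
    kept = [(p, t) for p, t in zip(pred_ids, true_ids) if t != pad_idx]
    if not kept:
        return 0.0          # no real tokens to score (guard added in this sketch)
    return sum(p == t for p, t in kept) / len(kept)

true_ids = [4, 5, 6, 2, 0, 0]   # label sequence with two <pad> positions
pred_ids = [4, 5, 9, 2, 7, 7]   # predictions on pad positions are ignored

acc = masked_accuracy(pred_ids, true_ids)
print(f"masked accuracy: {acc:.2f}")   # 3 of the 4 real tokens are correct

# Cross-entropy of a uniform distribution over V classes equals ln(V)
uniform_loss = math.log(356)
print(f"uniform-guess loss: {uniform_loss:.4f}")
```

The uniform-guess value is roughly 5.87, and the first logged batch loss of 6.0181 is in the same range, which is what one would expect from a freshly initialised model.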

In [10]:
from tqdm import tqdm

def accuracy_fn(y_pred, y_true, pad_idx):
    pred_tokens = y_pred.argmax(dim=-1)
    mask = y_true != pad_idx
    correct = (pred_tokens == y_true) & mask
    return correct.sum().float() / mask.sum().float()

EPOCHS = 1
for epoch in range(EPOCHS):
    model.train()
    total_loss = 0
    print(f"\nEpoch {epoch+1}/{EPOCHS}")
    for i, (src, tgt_in, tgt_out) in enumerate(tqdm(train_dl)):
        src, tgt_in, tgt_out = src.to(device), tgt_in.to(device), tgt_out.to(device)
        optimizer.zero_grad()
        output = model(src, tgt_in)
        loss = criterion(output.view(-1, output.size(-1)), tgt_out.view(-1))
        loss.backward()
        optimizer.step()
        total_loss += loss.item()
        print(f"Batch {i+1}/{len(train_dl)} - TrainLoss: {loss.item():.4f}")

    # Validation
    model.eval()
    val_loss, val_acc = 0, 0
    with torch.no_grad():
        for src, tgt_in, tgt_out in val_dl:
            src, tgt_in, tgt_out = src.to(device), tgt_in.to(device), tgt_out.to(device)
            output = model(src, tgt_in)
            loss = criterion(output.view(-1, output.size(-1)), tgt_out.view(-1))
            val_loss += loss.item()
            val_acc += accuracy_fn(output, tgt_out, tgt_stoi['<pad>']).item()
    val_loss /= len(val_dl)
    val_acc /= len(val_dl)
    print(f"ValLoss: {val_loss:.4f}, ValAcc: {val_acc*100:.2f}%")



Epoch 1/1


  0%|          | 4/1241 [00:00<00:32, 37.76it/s]

Batch 1/1241 - TrainLoss: 6.0181
Batch 2/1241 - TrainLoss: 5.9023
Batch 3/1241 - TrainLoss: 5.7987
Batch 4/1241 - TrainLoss: 5.7185
Batch 5/1241 - TrainLoss: 5.5926
Batch 6/1241 - TrainLoss: 5.5444
Batch 7/1241 - TrainLoss: 5.4586


  1%|          | 9/1241 [00:00<00:30, 40.73it/s]

Batch 8/1241 - TrainLoss: 5.4258
Batch 9/1241 - TrainLoss: 5.3469


  1%|          | 15/1241 [00:00<00:25, 48.08it/s]

Batch 10/1241 - TrainLoss: 5.3034
Batch 11/1241 - TrainLoss: 5.2748
Batch 12/1241 - TrainLoss: 5.2339
Batch 13/1241 - TrainLoss: 5.1890
Batch 14/1241 - TrainLoss: 5.1505
Batch 15/1241 - TrainLoss: 5.1031
Batch 16/1241 - TrainLoss: 5.0949
Batch 17/1241 - TrainLoss: 5.0461
Batch 18/1241 - TrainLoss: 5.0115
Batch 19/1241 - TrainLoss: 5.0110


  2%|▏         | 21/1241 [00:00<00:24, 50.73it/s]

Batch 20/1241 - TrainLoss: 5.0191
Batch 21/1241 - TrainLoss: 4.9313


  2%|▏         | 27/1241 [00:00<00:23, 50.78it/s]

Batch 22/1241 - TrainLoss: 4.9156
Batch 23/1241 - TrainLoss: 4.8570
Batch 24/1241 - TrainLoss: 4.8925
Batch 25/1241 - TrainLoss: 4.8507
Batch 26/1241 - TrainLoss: 4.8152
Batch 27/1241 - TrainLoss: 4.8065
Batch 28/1241 - TrainLoss: 4.8036
Batch 29/1241 - TrainLoss: 4.7303
Batch 30/1241 - TrainLoss: 4.7376
Batch 31/1241 - TrainLoss: 4.7538


  3%|▎         | 33/1241 [00:00<00:22, 53.60it/s]

Batch 32/1241 - TrainLoss: 4.7237
Batch 33/1241 - TrainLoss: 4.6819


  3%|▎         | 39/1241 [00:00<00:21, 55.49it/s]

Batch 34/1241 - TrainLoss: 4.6226
Batch 35/1241 - TrainLoss: 4.6622
Batch 36/1241 - TrainLoss: 4.6076
Batch 37/1241 - TrainLoss: 4.5502
Batch 38/1241 - TrainLoss: 4.5918
Batch 39/1241 - TrainLoss: 4.5657
Batch 40/1241 - TrainLoss: 4.4554
Batch 41/1241 - TrainLoss: 4.4778
Batch 42/1241 - TrainLoss: 4.4394
Batch 43/1241 - TrainLoss: 4.5099
Batch 44/1241 - TrainLoss: 4.4310


  4%|▎         | 46/1241 [00:00<00:20, 57.35it/s]

Batch 45/1241 - TrainLoss: 4.3891
Batch 46/1241 - TrainLoss: 4.3335


  4%|▍         | 52/1241 [00:01<00:22, 53.07it/s]

Batch 47/1241 - TrainLoss: 4.3646
Batch 48/1241 - TrainLoss: 4.3102
Batch 49/1241 - TrainLoss: 4.2576
Batch 50/1241 - TrainLoss: 4.2509
Batch 51/1241 - TrainLoss: 4.1824
Batch 52/1241 - TrainLoss: 4.2016
Batch 53/1241 - TrainLoss: 4.1922
Batch 54/1241 - TrainLoss: 4.1043


  5%|▍         | 58/1241 [00:01<00:30, 39.35it/s]

Batch 55/1241 - TrainLoss: 4.1198
Batch 56/1241 - TrainLoss: 4.1286
Batch 57/1241 - TrainLoss: 4.0790
Batch 58/1241 - TrainLoss: 4.0586
Batch 59/1241 - TrainLoss: 4.0859


  6%|▌         | 70/1241 [00:01<00:25, 45.15it/s]

Batch 60/1241 - TrainLoss: 4.0886
Batch 61/1241 - TrainLoss: 3.9693
Batch 62/1241 - TrainLoss: 3.9417
Batch 63/1241 - TrainLoss: 3.9375
Batch 64/1241 - TrainLoss: 3.8843
Batch 65/1241 - TrainLoss: 3.9400
Batch 66/1241 - TrainLoss: 3.8913
Batch 67/1241 - TrainLoss: 3.8451
Batch 68/1241 - TrainLoss: 3.8058
Batch 69/1241 - TrainLoss: 3.8366
Batch 70/1241 - TrainLoss: 3.8415
Batch 71/1241 - TrainLoss: 3.7324


  7%|▋         | 82/1241 [00:01<00:22, 50.65it/s]

Batch 72/1241 - TrainLoss: 3.7127
Batch 73/1241 - TrainLoss: 3.7107
Batch 74/1241 - TrainLoss: 3.6943
Batch 75/1241 - TrainLoss: 3.6577
Batch 76/1241 - TrainLoss: 3.6463
Batch 77/1241 - TrainLoss: 3.6931
Batch 78/1241 - TrainLoss: 3.5525
Batch 79/1241 - TrainLoss: 3.5735
Batch 80/1241 - TrainLoss: 3.5139
Batch 81/1241 - TrainLoss: 3.5402
Batch 82/1241 - TrainLoss: 3.4938
Batch 83/1241 - TrainLoss: 3.5278


  8%|▊         | 94/1241 [00:01<00:21, 53.87it/s]

Batch 84/1241 - TrainLoss: 3.5260
Batch 85/1241 - TrainLoss: 3.4109
Batch 86/1241 - TrainLoss: 3.4286
Batch 87/1241 - TrainLoss: 3.4380
Batch 88/1241 - TrainLoss: 3.4223
Batch 89/1241 - TrainLoss: 3.3718
Batch 90/1241 - TrainLoss: 3.3472
Batch 91/1241 - TrainLoss: 3.3468
Batch 92/1241 - TrainLoss: 3.3357
Batch 93/1241 - TrainLoss: 3.3655
Batch 94/1241 - TrainLoss: 3.2931
Batch 95/1241 - TrainLoss: 3.2919
Batch 96/1241 - TrainLoss: 3.1896


  9%|▊         | 107/1241 [00:02<00:20, 56.64it/s]

Batch 97/1241 - TrainLoss: 3.1885
Batch 98/1241 - TrainLoss: 3.2204
Batch 99/1241 - TrainLoss: 3.2296
Batch 100/1241 - TrainLoss: 3.1963
Batch 101/1241 - TrainLoss: 3.1968
Batch 102/1241 - TrainLoss: 3.2115
Batch 103/1241 - TrainLoss: 3.1916
Batch 104/1241 - TrainLoss: 3.1215
Batch 105/1241 - TrainLoss: 3.0301
Batch 106/1241 - TrainLoss: 3.1490
Batch 107/1241 - TrainLoss: 3.0790
Batch 108/1241 - TrainLoss: 3.0120


 10%|▉         | 120/1241 [00:02<00:19, 57.96it/s]

Batch 109/1241 - TrainLoss: 3.0069
Batch 110/1241 - TrainLoss: 3.0341
Batch 111/1241 - TrainLoss: 3.0311
Batch 112/1241 - TrainLoss: 3.0027
Batch 113/1241 - TrainLoss: 2.9597
Batch 114/1241 - TrainLoss: 2.9293
Batch 115/1241 - TrainLoss: 2.9494
Batch 116/1241 - TrainLoss: 2.8968
Batch 117/1241 - TrainLoss: 2.9277
Batch 118/1241 - TrainLoss: 2.8886
Batch 119/1241 - TrainLoss: 2.9239
Batch 120/1241 - TrainLoss: 2.8044
Batch 121/1241 - TrainLoss: 2.8304


 11%|█         | 132/1241 [00:02<00:18, 58.52it/s]

Batch 122/1241 - TrainLoss: 2.8624
Batch 123/1241 - TrainLoss: 2.7834
Batch 124/1241 - TrainLoss: 2.8673
Batch 125/1241 - TrainLoss: 2.8309
Batch 126/1241 - TrainLoss: 2.8145
Batch 127/1241 - TrainLoss: 2.8232
Batch 128/1241 - TrainLoss: 2.7080
Batch 129/1241 - TrainLoss: 2.7038
Batch 130/1241 - TrainLoss: 2.7361
Batch 131/1241 - TrainLoss: 2.7721
Batch 132/1241 - TrainLoss: 2.7585
Batch 133/1241 - TrainLoss: 2.7486


 12%|█▏        | 145/1241 [00:02<00:18, 58.63it/s]

Batch 134/1241 - TrainLoss: 2.7527
Batch 135/1241 - TrainLoss: 2.7189
Batch 136/1241 - TrainLoss: 2.6577
Batch 137/1241 - TrainLoss: 2.7144
Batch 138/1241 - TrainLoss: 2.6066
Batch 139/1241 - TrainLoss: 2.6197
Batch 140/1241 - TrainLoss: 2.5580
Batch 141/1241 - TrainLoss: 2.5130
Batch 142/1241 - TrainLoss: 2.6144
Batch 143/1241 - TrainLoss: 2.5884
Batch 144/1241 - TrainLoss: 2.5326
Batch 145/1241 - TrainLoss: 2.5975
Batch 146/1241 - TrainLoss: 2.5737


 13%|█▎        | 157/1241 [00:02<00:18, 58.49it/s]

Batch 147/1241 - TrainLoss: 2.6186
Batch 148/1241 - TrainLoss: 2.4789
Batch 149/1241 - TrainLoss: 2.5697
Batch 150/1241 - TrainLoss: 2.6176
Batch 151/1241 - TrainLoss: 2.3706
Batch 152/1241 - TrainLoss: 2.4833
Batch 153/1241 - TrainLoss: 2.4740
Batch 154/1241 - TrainLoss: 2.5355
Batch 155/1241 - TrainLoss: 2.3993
Batch 156/1241 - TrainLoss: 2.4355
Batch 157/1241 - TrainLoss: 2.4074
Batch 158/1241 - TrainLoss: 2.4175
Batch 159/1241 - TrainLoss: 2.4089


 14%|█▎        | 169/1241 [00:03<00:18, 56.57it/s]

Batch 160/1241 - TrainLoss: 2.3509
Batch 161/1241 - TrainLoss: 2.2575
Batch 162/1241 - TrainLoss: 2.3560
Batch 163/1241 - TrainLoss: 2.3857
Batch 164/1241 - TrainLoss: 2.3821
Batch 165/1241 - TrainLoss: 2.2654
Batch 166/1241 - TrainLoss: 2.2741
Batch 167/1241 - TrainLoss: 2.3112
Batch 168/1241 - TrainLoss: 2.3497
Batch 169/1241 - TrainLoss: 2.3432
Batch 170/1241 - TrainLoss: 2.3098
Batch 171/1241 - TrainLoss: 2.3939


 15%|█▍        | 181/1241 [00:03<00:19, 55.07it/s]

Batch 172/1241 - TrainLoss: 2.2450
Batch 173/1241 - TrainLoss: 2.2652
Batch 174/1241 - TrainLoss: 2.2096
Batch 175/1241 - TrainLoss: 2.1180
Batch 176/1241 - TrainLoss: 2.2683
Batch 177/1241 - TrainLoss: 2.2783
Batch 178/1241 - TrainLoss: 2.2002
Batch 179/1241 - TrainLoss: 2.2724
Batch 180/1241 - TrainLoss: 2.2235
Batch 181/1241 - TrainLoss: 2.1596
Batch 182/1241 - TrainLoss: 2.2396


 15%|█▌        | 187/1241 [00:03<00:18, 55.97it/s]

Batch 183/1241 - TrainLoss: 2.0810
Batch 184/1241 - TrainLoss: 2.1645
Batch 185/1241 - TrainLoss: 2.1369
Batch 186/1241 - TrainLoss: 2.1370
Batch 187/1241 - TrainLoss: 2.1683
Batch 188/1241 - TrainLoss: 2.2274
Batch 189/1241 - TrainLoss: 2.0798
Batch 190/1241 - TrainLoss: 2.1055
Batch 191/1241 - TrainLoss: 2.1441
Batch 192/1241 - TrainLoss: 2.0148


 16%|█▌        | 198/1241 [00:03<00:22, 46.55it/s]

Batch 193/1241 - TrainLoss: 2.0283
Batch 194/1241 - TrainLoss: 2.1210
Batch 195/1241 - TrainLoss: 2.0082
Batch 196/1241 - TrainLoss: 2.0675
Batch 197/1241 - TrainLoss: 1.9995
Batch 198/1241 - TrainLoss: 2.0196
Batch 199/1241 - TrainLoss: 2.0205
Batch 200/1241 - TrainLoss: 2.1069
Batch 201/1241 - TrainLoss: 2.0763


 17%|█▋        | 208/1241 [00:04<00:22, 45.33it/s]

Batch 202/1241 - TrainLoss: 2.0391
Batch 203/1241 - TrainLoss: 2.0353
Batch 204/1241 - TrainLoss: 2.0366
Batch 205/1241 - TrainLoss: 1.8664
Batch 206/1241 - TrainLoss: 1.9227
Batch 207/1241 - TrainLoss: 1.9676
Batch 208/1241 - TrainLoss: 1.9090
Batch 209/1241 - TrainLoss: 1.9164
Batch 210/1241 - TrainLoss: 2.0144
Batch 211/1241 - TrainLoss: 1.9424


 18%|█▊        | 218/1241 [00:04<00:22, 44.58it/s]

Batch 212/1241 - TrainLoss: 1.9913
Batch 213/1241 - TrainLoss: 1.8742
Batch 214/1241 - TrainLoss: 1.8885
Batch 215/1241 - TrainLoss: 1.9295
Batch 216/1241 - TrainLoss: 1.9504
Batch 217/1241 - TrainLoss: 1.8799
Batch 218/1241 - TrainLoss: 1.9016
Batch 219/1241 - TrainLoss: 1.8735
Batch 220/1241 - TrainLoss: 1.8806


 18%|█▊        | 223/1241 [00:04<00:23, 44.00it/s]

Batch 221/1241 - TrainLoss: 1.8274
Batch 222/1241 - TrainLoss: 1.8820
Batch 223/1241 - TrainLoss: 1.8094
Batch 224/1241 - TrainLoss: 1.7609
Batch 225/1241 - TrainLoss: 1.8827
Batch 226/1241 - TrainLoss: 1.7625
Batch 227/1241 - TrainLoss: 1.8291


 18%|█▊        | 228/1241 [00:04<00:23, 43.91it/s]

Batch 228/1241 - TrainLoss: 1.8724
Batch 229/1241 - TrainLoss: 1.9006


 19%|█▉        | 233/1241 [00:04<00:22, 44.84it/s]

Batch 230/1241 - TrainLoss: 1.8082
Batch 231/1241 - TrainLoss: 1.8426
Batch 232/1241 - TrainLoss: 1.8751
Batch 233/1241 - TrainLoss: 1.8021
Batch 234/1241 - TrainLoss: 1.8946
Batch 235/1241 - TrainLoss: 1.7018
Batch 236/1241 - TrainLoss: 1.7254


 19%|█▉        | 238/1241 [00:04<00:22, 43.72it/s]

Batch 237/1241 - TrainLoss: 1.7125
Batch 238/1241 - TrainLoss: 1.7317


 20%|█▉        | 243/1241 [00:04<00:22, 44.18it/s]

Batch 239/1241 - TrainLoss: 1.8554
Batch 240/1241 - TrainLoss: 1.7444
Batch 241/1241 - TrainLoss: 1.7370
Batch 242/1241 - TrainLoss: 1.7624
Batch 243/1241 - TrainLoss: 1.7118
Batch 244/1241 - TrainLoss: 1.7662
Batch 245/1241 - TrainLoss: 1.7309
Batch 246/1241 - TrainLoss: 1.6722
Batch 247/1241 - TrainLoss: 1.7725
Batch 248/1241 - TrainLoss: 1.7113


 20%|██        | 253/1241 [00:05<00:23, 42.89it/s]

Batch 249/1241 - TrainLoss: 1.7140
Batch 250/1241 - TrainLoss: 1.6352
Batch 251/1241 - TrainLoss: 1.6743
Batch 252/1241 - TrainLoss: 1.6395
Batch 253/1241 - TrainLoss: 1.7464
Batch 254/1241 - TrainLoss: 1.6686
Batch 255/1241 - TrainLoss: 1.7305
Batch 256/1241 - TrainLoss: 1.6427
Batch 257/1241 - TrainLoss: 1.6511


 21%|██        | 263/1241 [00:05<00:23, 42.06it/s]

Batch 258/1241 - TrainLoss: 1.6239
Batch 259/1241 - TrainLoss: 1.6579
Batch 260/1241 - TrainLoss: 1.6500
Batch 261/1241 - TrainLoss: 1.7313
Batch 262/1241 - TrainLoss: 1.5889
Batch 263/1241 - TrainLoss: 1.6265
Batch 264/1241 - TrainLoss: 1.5578
Batch 265/1241 - TrainLoss: 1.6255


 22%|██▏       | 272/1241 [00:05<00:24, 39.44it/s]

Batch 266/1241 - TrainLoss: 1.5806
Batch 267/1241 - TrainLoss: 1.6815
Batch 268/1241 - TrainLoss: 1.5980
Batch 269/1241 - TrainLoss: 1.6412
Batch 270/1241 - TrainLoss: 1.5319
Batch 271/1241 - TrainLoss: 1.5892
Batch 272/1241 - TrainLoss: 1.6152
Batch 273/1241 - TrainLoss: 1.6010


 23%|██▎       | 283/1241 [00:05<00:21, 45.08it/s]

Batch 274/1241 - TrainLoss: 1.6080
Batch 275/1241 - TrainLoss: 1.5612
Batch 276/1241 - TrainLoss: 1.5571
Batch 277/1241 - TrainLoss: 1.6162
Batch 278/1241 - TrainLoss: 1.5750
Batch 279/1241 - TrainLoss: 1.5992
Batch 280/1241 - TrainLoss: 1.5418
Batch 281/1241 - TrainLoss: 1.5543
Batch 282/1241 - TrainLoss: 1.5066
Batch 283/1241 - TrainLoss: 1.5462
Batch 284/1241 - TrainLoss: 1.5845


 24%|██▍       | 295/1241 [00:05<00:18, 51.08it/s]

Batch 285/1241 - TrainLoss: 1.6905
Batch 286/1241 - TrainLoss: 1.4904
Batch 287/1241 - TrainLoss: 1.4915
Batch 288/1241 - TrainLoss: 1.5025
Batch 289/1241 - TrainLoss: 1.5098
Batch 290/1241 - TrainLoss: 1.5881
Batch 291/1241 - TrainLoss: 1.5125
Batch 292/1241 - TrainLoss: 1.4561
Batch 293/1241 - TrainLoss: 1.4870
Batch 294/1241 - TrainLoss: 1.5050
Batch 295/1241 - TrainLoss: 1.5122
Batch 296/1241 - TrainLoss: 1.5898


 25%|██▍       | 307/1241 [00:06<00:16, 55.07it/s]

Batch 297/1241 - TrainLoss: 1.5155
Batch 298/1241 - TrainLoss: 1.4704
Batch 299/1241 - TrainLoss: 1.5178
Batch 300/1241 - TrainLoss: 1.5469
Batch 301/1241 - TrainLoss: 1.5320
Batch 302/1241 - TrainLoss: 1.4813
Batch 303/1241 - TrainLoss: 1.4714
Batch 304/1241 - TrainLoss: 1.4329
Batch 305/1241 - TrainLoss: 1.4661
Batch 306/1241 - TrainLoss: 1.5134
Batch 307/1241 - TrainLoss: 1.3663
Batch 308/1241 - TrainLoss: 1.4857


 26%|██▌       | 320/1241 [00:06<00:16, 57.45it/s]

Batch 309/1241 - TrainLoss: 1.5143
Batch 310/1241 - TrainLoss: 1.4370
Batch 311/1241 - TrainLoss: 1.5931
Batch 312/1241 - TrainLoss: 1.4368
Batch 313/1241 - TrainLoss: 1.4018
Batch 314/1241 - TrainLoss: 1.4640
Batch 315/1241 - TrainLoss: 1.4591
Batch 316/1241 - TrainLoss: 1.4485
Batch 317/1241 - TrainLoss: 1.4514
Batch 318/1241 - TrainLoss: 1.4541
Batch 319/1241 - TrainLoss: 1.4209
Batch 320/1241 - TrainLoss: 1.5118
Batch 321/1241 - TrainLoss: 1.3633


 27%|██▋       | 333/1241 [00:06<00:15, 58.96it/s]

Batch 322/1241 - TrainLoss: 1.3513
Batch 323/1241 - TrainLoss: 1.4732
Batch 324/1241 - TrainLoss: 1.3516
Batch 325/1241 - TrainLoss: 1.4749
Batch 326/1241 - TrainLoss: 1.3980
Batch 327/1241 - TrainLoss: 1.3797
Batch 328/1241 - TrainLoss: 1.3119
Batch 329/1241 - TrainLoss: 1.4498
Batch 330/1241 - TrainLoss: 1.3776
Batch 331/1241 - TrainLoss: 1.3829
Batch 332/1241 - TrainLoss: 1.3450
Batch 333/1241 - TrainLoss: 1.3989
Batch 334/1241 - TrainLoss: 1.3405


 27%|██▋       | 339/1241 [00:06<00:15, 57.91it/s]

Batch 335/1241 - TrainLoss: 1.2922
Batch 336/1241 - TrainLoss: 1.3805
Batch 337/1241 - TrainLoss: 1.3415
Batch 338/1241 - TrainLoss: 1.3813
Batch 339/1241 - TrainLoss: 1.2963
Batch 340/1241 - TrainLoss: 1.2998
Batch 341/1241 - TrainLoss: 1.3538
Batch 342/1241 - TrainLoss: 1.3470
Batch 343/1241 - TrainLoss: 1.3313
Batch 344/1241 - TrainLoss: 1.4266


 28%|██▊       | 345/1241 [00:06<00:16, 55.74it/s]

Batch 345/1241 - TrainLoss: 1.2965


 28%|██▊       | 351/1241 [00:06<00:16, 54.67it/s]

Batch 346/1241 - TrainLoss: 1.2423
Batch 347/1241 - TrainLoss: 1.4059
Batch 348/1241 - TrainLoss: 1.2131
Batch 349/1241 - TrainLoss: 1.3344
Batch 350/1241 - TrainLoss: 1.3412
Batch 351/1241 - TrainLoss: 1.2584
Batch 352/1241 - TrainLoss: 1.3261
Batch 353/1241 - TrainLoss: 1.3542
Batch 354/1241 - TrainLoss: 1.3115
Batch 355/1241 - TrainLoss: 1.3448
Batch 356/1241 - TrainLoss: 1.3448


 29%|██▉       | 357/1241 [00:07<00:15, 56.03it/s]

Batch 357/1241 - TrainLoss: 1.2810


 29%|██▉       | 363/1241 [00:07<00:15, 56.58it/s]

Batch 358/1241 - TrainLoss: 1.4036
Batch 359/1241 - TrainLoss: 1.3123
Batch 360/1241 - TrainLoss: 1.1564
Batch 361/1241 - TrainLoss: 1.2988
Batch 362/1241 - TrainLoss: 1.2383
Batch 363/1241 - TrainLoss: 1.2588
Batch 364/1241 - TrainLoss: 1.2177
Batch 365/1241 - TrainLoss: 1.2465
Batch 366/1241 - TrainLoss: 1.1908
Batch 367/1241 - TrainLoss: 1.2888
Batch 368/1241 - TrainLoss: 1.1985
Batch 369/1241 - TrainLoss: 1.2828


 30%|██▉       | 370/1241 [00:07<00:14, 58.29it/s]

Batch 370/1241 - TrainLoss: 1.2003


 30%|███       | 376/1241 [00:07<00:14, 58.71it/s]

Batch 371/1241 - TrainLoss: 1.3070
Batch 372/1241 - TrainLoss: 1.2462
Batch 373/1241 - TrainLoss: 1.2388
Batch 374/1241 - TrainLoss: 1.2925
Batch 375/1241 - TrainLoss: 1.2733
Batch 376/1241 - TrainLoss: 1.2561
Batch 377/1241 - TrainLoss: 1.1984
Batch 378/1241 - TrainLoss: 1.2933
Batch 379/1241 - TrainLoss: 1.2159
Batch 380/1241 - TrainLoss: 1.2161
Batch 381/1241 - TrainLoss: 1.2244


 31%|███       | 382/1241 [00:07<00:14, 58.07it/s]

Batch 382/1241 - TrainLoss: 1.2058


 31%|███▏      | 389/1241 [00:07<00:14, 59.20it/s]

Batch 383/1241 - TrainLoss: 1.1451
Batch 384/1241 - TrainLoss: 1.1950
Batch 385/1241 - TrainLoss: 1.1574
Batch 386/1241 - TrainLoss: 1.2170
Batch 387/1241 - TrainLoss: 1.0465
Batch 388/1241 - TrainLoss: 1.1229
Batch 389/1241 - TrainLoss: 1.2984
Batch 390/1241 - TrainLoss: 1.1943
Batch 391/1241 - TrainLoss: 1.1445
Batch 392/1241 - TrainLoss: 1.1914
Batch 393/1241 - TrainLoss: 1.1751
Batch 394/1241 - TrainLoss: 1.1738
Batch 395/1241 - TrainLoss: 1.2910


 32%|███▏      | 402/1241 [00:07<00:14, 57.66it/s]

Batch 396/1241 - TrainLoss: 1.1714
Batch 397/1241 - TrainLoss: 1.1890
Batch 398/1241 - TrainLoss: 1.2944
Batch 399/1241 - TrainLoss: 1.0994
Batch 400/1241 - TrainLoss: 1.1286
Batch 401/1241 - TrainLoss: 1.1817
Batch 402/1241 - TrainLoss: 1.2482
Batch 403/1241 - TrainLoss: 1.1764
Batch 404/1241 - TrainLoss: 1.0654
Batch 405/1241 - TrainLoss: 1.1616
Batch 406/1241 - TrainLoss: 1.2336


 33%|███▎      | 414/1241 [00:08<00:14, 56.21it/s]

Batch 407/1241 - TrainLoss: 1.0952
Batch 408/1241 - TrainLoss: 1.1340
Batch 409/1241 - TrainLoss: 1.2500
Batch 410/1241 - TrainLoss: 1.0758
Batch 411/1241 - TrainLoss: 1.1219
Batch 412/1241 - TrainLoss: 1.1273
Batch 413/1241 - TrainLoss: 1.0375
Batch 414/1241 - TrainLoss: 1.1247
Batch 415/1241 - TrainLoss: 1.1327
Batch 416/1241 - TrainLoss: 1.1132
Batch 417/1241 - TrainLoss: 1.1700
Batch 418/1241 - TrainLoss: 1.1228


 34%|███▍      | 426/1241 [00:08<00:14, 57.25it/s]

Batch 419/1241 - TrainLoss: 1.0731
Batch 420/1241 - TrainLoss: 1.1896
Batch 421/1241 - TrainLoss: 1.0659
Batch 422/1241 - TrainLoss: 1.1906
Batch 423/1241 - TrainLoss: 1.0381
Batch 424/1241 - TrainLoss: 1.0868
Batch 425/1241 - TrainLoss: 1.0639
Batch 426/1241 - TrainLoss: 1.1279
Batch 427/1241 - TrainLoss: 1.0421
Batch 428/1241 - TrainLoss: 1.0137
Batch 429/1241 - TrainLoss: 1.0872
Batch 430/1241 - TrainLoss: 1.1585
Batch 431/1241 - TrainLoss: 1.0320


 35%|███▌      | 438/1241 [00:08<00:13, 58.35it/s]

Batch 432/1241 - TrainLoss: 1.0579
Batch 433/1241 - TrainLoss: 1.0436
Batch 434/1241 - TrainLoss: 1.0296
Batch 435/1241 - TrainLoss: 1.1163
Batch 436/1241 - TrainLoss: 1.0206
Batch 437/1241 - TrainLoss: 1.0386
Batch 438/1241 - TrainLoss: 1.1026
Batch 439/1241 - TrainLoss: 1.1289
Batch 440/1241 - TrainLoss: 1.1211
Batch 441/1241 - TrainLoss: 1.0851
Batch 442/1241 - TrainLoss: 0.9512
Batch 443/1241 - TrainLoss: 1.0322
Batch 444/1241 - TrainLoss: 1.0494


 36%|███▋      | 452/1241 [00:08<00:13, 59.58it/s]

Batch 445/1241 - TrainLoss: 1.0553
Batch 446/1241 - TrainLoss: 1.0393
Batch 447/1241 - TrainLoss: 1.0511
Batch 448/1241 - TrainLoss: 1.0894
Batch 449/1241 - TrainLoss: 1.0604
Batch 450/1241 - TrainLoss: 1.0104
Batch 451/1241 - TrainLoss: 0.9523
Batch 452/1241 - TrainLoss: 1.1324
Batch 453/1241 - TrainLoss: 1.0199
Batch 454/1241 - TrainLoss: 1.0172
Batch 455/1241 - TrainLoss: 1.0288
Batch 456/1241 - TrainLoss: 1.0085
Batch 457/1241 - TrainLoss: 1.0799


 37%|███▋      | 464/1241 [00:08<00:13, 57.02it/s]

Batch 458/1241 - TrainLoss: 1.0116
Batch 459/1241 - TrainLoss: 1.0803
Batch 460/1241 - TrainLoss: 0.9243
Batch 461/1241 - TrainLoss: 1.0105
Batch 462/1241 - TrainLoss: 0.9417
Batch 463/1241 - TrainLoss: 1.0263
Batch 464/1241 - TrainLoss: 0.9110
Batch 465/1241 - TrainLoss: 1.0095
Batch 466/1241 - TrainLoss: 1.0338
Batch 467/1241 - TrainLoss: 0.9993
Batch 468/1241 - TrainLoss: 1.0509


 38%|███▊      | 477/1241 [00:09<00:13, 58.29it/s]

Batch 469/1241 - TrainLoss: 0.9583
Batch 470/1241 - TrainLoss: 0.9547
Batch 471/1241 - TrainLoss: 1.0162
Batch 472/1241 - TrainLoss: 0.9772
Batch 473/1241 - TrainLoss: 1.0122
Batch 474/1241 - TrainLoss: 0.9620
Batch 475/1241 - TrainLoss: 0.8951
Batch 476/1241 - TrainLoss: 1.0554
Batch 477/1241 - TrainLoss: 0.9962
Batch 478/1241 - TrainLoss: 0.8628
Batch 479/1241 - TrainLoss: 0.9644
Batch 480/1241 - TrainLoss: 0.9783
Batch 481/1241 - TrainLoss: 0.9840


 39%|███▉      | 490/1241 [00:09<00:12, 59.73it/s]

Batch 482/1241 - TrainLoss: 0.9534
Batch 483/1241 - TrainLoss: 0.9409
Batch 484/1241 - TrainLoss: 0.9019
Batch 485/1241 - TrainLoss: 0.9637
Batch 486/1241 - TrainLoss: 0.8889
Batch 487/1241 - TrainLoss: 0.9292
Batch 488/1241 - TrainLoss: 1.0126
Batch 489/1241 - TrainLoss: 0.9651
Batch 490/1241 - TrainLoss: 0.9273
Batch 491/1241 - TrainLoss: 0.9786
Batch 492/1241 - TrainLoss: 0.8343
Batch 493/1241 - TrainLoss: 0.9686
Batch 494/1241 - TrainLoss: 0.8872


 41%|████      | 503/1241 [00:09<00:12, 60.67it/s]

Batch 495/1241 - TrainLoss: 0.9480
Batch 496/1241 - TrainLoss: 0.9837
Batch 497/1241 - TrainLoss: 0.9883
Batch 498/1241 - TrainLoss: 0.9224
Batch 499/1241 - TrainLoss: 0.9298
Batch 500/1241 - TrainLoss: 0.9485
Batch 501/1241 - TrainLoss: 0.9302
Batch 502/1241 - TrainLoss: 0.8990
Batch 503/1241 - TrainLoss: 0.9542
Batch 504/1241 - TrainLoss: 0.8843
Batch 505/1241 - TrainLoss: 0.9721
Batch 506/1241 - TrainLoss: 0.9403
Batch 507/1241 - TrainLoss: 0.8696


 42%|████▏     | 517/1241 [00:09<00:11, 60.71it/s]

Batch 508/1241 - TrainLoss: 0.9442
Batch 509/1241 - TrainLoss: 0.9143
Batch 510/1241 - TrainLoss: 0.8731
Batch 511/1241 - TrainLoss: 0.8293
Batch 512/1241 - TrainLoss: 0.8739
Batch 513/1241 - TrainLoss: 0.7799
Batch 514/1241 - TrainLoss: 0.8913
Batch 515/1241 - TrainLoss: 0.8736
Batch 516/1241 - TrainLoss: 0.7901
Batch 517/1241 - TrainLoss: 0.8416
Batch 518/1241 - TrainLoss: 0.8774
Batch 519/1241 - TrainLoss: 0.9440
Batch 520/1241 - TrainLoss: 0.8778


 43%|████▎     | 531/1241 [00:10<00:11, 59.46it/s]

Batch 521/1241 - TrainLoss: 0.8165
Batch 522/1241 - TrainLoss: 0.8648
Batch 523/1241 - TrainLoss: 0.8182
Batch 524/1241 - TrainLoss: 0.8029
Batch 525/1241 - TrainLoss: 0.8206
Batch 526/1241 - TrainLoss: 1.0072
Batch 527/1241 - TrainLoss: 0.8913
Batch 528/1241 - TrainLoss: 0.7659
Batch 529/1241 - TrainLoss: 0.8505
Batch 530/1241 - TrainLoss: 0.8932
Batch 531/1241 - TrainLoss: 0.8008
Batch 532/1241 - TrainLoss: 0.7963


 44%|████▍     | 544/1241 [00:10<00:11, 59.68it/s]

Batch 533/1241 - TrainLoss: 0.7936
Batch 534/1241 - TrainLoss: 0.8287
Batch 535/1241 - TrainLoss: 0.8567
Batch 536/1241 - TrainLoss: 0.8490
Batch 537/1241 - TrainLoss: 0.8264
Batch 538/1241 - TrainLoss: 0.8577
Batch 539/1241 - TrainLoss: 0.8834
Batch 540/1241 - TrainLoss: 0.7425
Batch 541/1241 - TrainLoss: 0.7989
Batch 542/1241 - TrainLoss: 0.7869
Batch 543/1241 - TrainLoss: 0.8651
Batch 544/1241 - TrainLoss: 0.7662
Batch 545/1241 - TrainLoss: 0.7588


 45%|████▍     | 558/1241 [00:10<00:11, 59.67it/s]

Batch 546/1241 - TrainLoss: 0.7776
Batch 547/1241 - TrainLoss: 0.7725
Batch 548/1241 - TrainLoss: 0.7900
Batch 549/1241 - TrainLoss: 0.8408
Batch 550/1241 - TrainLoss: 0.7332
Batch 551/1241 - TrainLoss: 0.7316
Batch 552/1241 - TrainLoss: 0.7900
Batch 553/1241 - TrainLoss: 0.7324
Batch 554/1241 - TrainLoss: 0.8161
Batch 555/1241 - TrainLoss: 0.7794
Batch 556/1241 - TrainLoss: 0.7899
Batch 557/1241 - TrainLoss: 0.7392
Batch 558/1241 - TrainLoss: 0.7686


 46%|████▌     | 565/1241 [00:10<00:11, 60.08it/s]

Batch 559/1241 - TrainLoss: 0.7055
Batch 560/1241 - TrainLoss: 0.7331
Batch 561/1241 - TrainLoss: 0.7361
Batch 562/1241 - TrainLoss: 0.7412
Batch 563/1241 - TrainLoss: 0.7176
Batch 564/1241 - TrainLoss: 0.7664
Batch 565/1241 - TrainLoss: 0.7566
Batch 566/1241 - TrainLoss: 0.7682
Batch 567/1241 - TrainLoss: 0.7575
Batch 568/1241 - TrainLoss: 0.7650
Batch 569/1241 - TrainLoss: 0.8064
Batch 570/1241 - TrainLoss: 0.7021
Batch 571/1241 - TrainLoss: 0.6684


 47%|████▋     | 579/1241 [00:10<00:10, 61.62it/s]

Batch 572/1241 - TrainLoss: 0.6509
Batch 573/1241 - TrainLoss: 0.6864
Batch 574/1241 - TrainLoss: 0.8184
Batch 575/1241 - TrainLoss: 0.7662
Batch 576/1241 - TrainLoss: 0.7472
Batch 577/1241 - TrainLoss: 0.6594
Batch 578/1241 - TrainLoss: 0.7381
Batch 579/1241 - TrainLoss: 0.6511
Batch 580/1241 - TrainLoss: 0.6447
Batch 581/1241 - TrainLoss: 0.6904
Batch 582/1241 - TrainLoss: 0.6454
Batch 583/1241 - TrainLoss: 0.6329


 48%|████▊     | 593/1241 [00:11<00:10, 60.17it/s]

Batch 584/1241 - TrainLoss: 0.6541
Batch 585/1241 - TrainLoss: 0.6974
Batch 586/1241 - TrainLoss: 0.6345
Batch 587/1241 - TrainLoss: 0.7273
Batch 588/1241 - TrainLoss: 0.8247
Batch 589/1241 - TrainLoss: 0.6843
Batch 590/1241 - TrainLoss: 0.7020
Batch 591/1241 - TrainLoss: 0.6460
Batch 592/1241 - TrainLoss: 0.6976
Batch 593/1241 - TrainLoss: 0.6492
Batch 594/1241 - TrainLoss: 0.6515
Batch 595/1241 - TrainLoss: 0.7507
Batch 596/1241 - TrainLoss: 0.6409


 49%|████▉     | 607/1241 [00:11<00:10, 60.46it/s]

Batch 597/1241 - TrainLoss: 0.6301
Batch 598/1241 - TrainLoss: 0.5822
Batch 599/1241 - TrainLoss: 0.6438
Batch 600/1241 - TrainLoss: 0.6581
Batch 601/1241 - TrainLoss: 0.6429
Batch 602/1241 - TrainLoss: 0.6291
Batch 603/1241 - TrainLoss: 0.6929
Batch 604/1241 - TrainLoss: 0.5743
Batch 605/1241 - TrainLoss: 0.6707
Batch 606/1241 - TrainLoss: 0.6591
Batch 607/1241 - TrainLoss: 0.5585
Batch 608/1241 - TrainLoss: 0.7181
Batch 609/1241 - TrainLoss: 0.6491


 50%|█████     | 621/1241 [00:11<00:10, 59.97it/s]

Batch 610/1241 - TrainLoss: 0.6535
Batch 611/1241 - TrainLoss: 0.6771
Batch 612/1241 - TrainLoss: 0.6417
Batch 613/1241 - TrainLoss: 0.6042
Batch 614/1241 - TrainLoss: 0.7178
Batch 615/1241 - TrainLoss: 0.5935
Batch 616/1241 - TrainLoss: 0.6434
Batch 617/1241 - TrainLoss: 0.5960
Batch 618/1241 - TrainLoss: 0.5697
Batch 619/1241 - TrainLoss: 0.6745
Batch 620/1241 - TrainLoss: 0.5735
Batch 621/1241 - TrainLoss: 0.6760


 51%|█████     | 634/1241 [00:11<00:10, 59.80it/s]

Batch 622/1241 - TrainLoss: 0.5565
Batch 623/1241 - TrainLoss: 0.5897
Batch 624/1241 - TrainLoss: 0.6058
Batch 625/1241 - TrainLoss: 0.5894
Batch 626/1241 - TrainLoss: 0.5923
Batch 627/1241 - TrainLoss: 0.4837
Batch 628/1241 - TrainLoss: 0.6803
Batch 629/1241 - TrainLoss: 0.6962
Batch 630/1241 - TrainLoss: 0.5743
Batch 631/1241 - TrainLoss: 0.5527
Batch 632/1241 - TrainLoss: 0.5429
Batch 633/1241 - TrainLoss: 0.5831
Batch 634/1241 - TrainLoss: 0.5021


 52%|█████▏    | 646/1241 [00:11<00:10, 57.32it/s]

Batch 635/1241 - TrainLoss: 0.5936
Batch 636/1241 - TrainLoss: 0.6359
Batch 637/1241 - TrainLoss: 0.6001
Batch 638/1241 - TrainLoss: 0.5707
Batch 639/1241 - TrainLoss: 0.5515
Batch 640/1241 - TrainLoss: 0.5913
Batch 641/1241 - TrainLoss: 0.5939
Batch 642/1241 - TrainLoss: 0.5402
Batch 643/1241 - TrainLoss: 0.6374
Batch 644/1241 - TrainLoss: 0.5610
Batch 645/1241 - TrainLoss: 0.5742
Batch 646/1241 - TrainLoss: 0.5844


 53%|█████▎    | 659/1241 [00:12<00:09, 58.73it/s]

Batch 647/1241 - TrainLoss: 0.5898
Batch 648/1241 - TrainLoss: 0.5555
Batch 649/1241 - TrainLoss: 0.5374
Batch 650/1241 - TrainLoss: 0.5793
Batch 651/1241 - TrainLoss: 0.6079
Batch 652/1241 - TrainLoss: 0.5335
Batch 653/1241 - TrainLoss: 0.5086
Batch 654/1241 - TrainLoss: 0.5581
Batch 655/1241 - TrainLoss: 0.5369
Batch 656/1241 - TrainLoss: 0.5193
Batch 657/1241 - TrainLoss: 0.6199
Batch 658/1241 - TrainLoss: 0.5314
Batch 659/1241 - TrainLoss: 0.4942


 54%|█████▍    | 671/1241 [00:12<00:09, 58.36it/s]

Batch 660/1241 - TrainLoss: 0.5202
Batch 661/1241 - TrainLoss: 0.6306
Batch 662/1241 - TrainLoss: 0.5053
Batch 663/1241 - TrainLoss: 0.4455
Batch 664/1241 - TrainLoss: 0.5484
Batch 665/1241 - TrainLoss: 0.5663
Batch 666/1241 - TrainLoss: 0.5349
Batch 667/1241 - TrainLoss: 0.5244
Batch 668/1241 - TrainLoss: 0.4788
Batch 669/1241 - TrainLoss: 0.4818
Batch 670/1241 - TrainLoss: 0.5114
Batch 671/1241 - TrainLoss: 0.5677
Batch 672/1241 - TrainLoss: 0.5626


 55%|█████▌    | 685/1241 [00:12<00:09, 59.85it/s]

Batch 673/1241 - TrainLoss: 0.5597
Batch 674/1241 - TrainLoss: 0.5881
Batch 675/1241 - TrainLoss: 0.5125
Batch 676/1241 - TrainLoss: 0.4343
Batch 677/1241 - TrainLoss: 0.5470
Batch 678/1241 - TrainLoss: 0.5036
Batch 679/1241 - TrainLoss: 0.5310
Batch 680/1241 - TrainLoss: 0.5452
Batch 681/1241 - TrainLoss: 0.6350
Batch 682/1241 - TrainLoss: 0.5630
Batch 683/1241 - TrainLoss: 0.5363
Batch 684/1241 - TrainLoss: 0.4613
Batch 685/1241 - TrainLoss: 0.5600


 56%|█████▌    | 692/1241 [00:12<00:09, 60.15it/s]

Batch 686/1241 - TrainLoss: 0.5656
Batch 687/1241 - TrainLoss: 0.4666
Batch 688/1241 - TrainLoss: 0.5932
Batch 689/1241 - TrainLoss: 0.5629
Batch 690/1241 - TrainLoss: 0.4179
Batch 691/1241 - TrainLoss: 0.4170
Batch 692/1241 - TrainLoss: 0.6173
Batch 693/1241 - TrainLoss: 0.5037
Batch 694/1241 - TrainLoss: 0.4620
Batch 695/1241 - TrainLoss: 0.5409
Batch 696/1241 - TrainLoss: 0.5178
Batch 697/1241 - TrainLoss: 0.4379
Batch 698/1241 - TrainLoss: 0.5548


 57%|█████▋    | 706/1241 [00:12<00:09, 58.47it/s]

Batch 699/1241 - TrainLoss: 0.4405
Batch 700/1241 - TrainLoss: 0.5576
Batch 701/1241 - TrainLoss: 0.4563
Batch 702/1241 - TrainLoss: 0.5492
Batch 703/1241 - TrainLoss: 0.5112
Batch 704/1241 - TrainLoss: 0.4073
Batch 705/1241 - TrainLoss: 0.3823
Batch 706/1241 - TrainLoss: 0.4243
Batch 707/1241 - TrainLoss: 0.5014
Batch 708/1241 - TrainLoss: 0.3998
Batch 709/1241 - TrainLoss: 0.4348
Batch 710/1241 - TrainLoss: 0.5110


 58%|█████▊    | 719/1241 [00:13<00:08, 59.28it/s]

Batch 711/1241 - TrainLoss: 0.5131
Batch 712/1241 - TrainLoss: 0.4897
Batch 713/1241 - TrainLoss: 0.4158
Batch 714/1241 - TrainLoss: 0.5110
Batch 715/1241 - TrainLoss: 0.4303
Batch 716/1241 - TrainLoss: 0.4718
Batch 717/1241 - TrainLoss: 0.4452
Batch 718/1241 - TrainLoss: 0.4117
Batch 719/1241 - TrainLoss: 0.4082
Batch 720/1241 - TrainLoss: 0.4458
Batch 721/1241 - TrainLoss: 0.4314
Batch 722/1241 - TrainLoss: 0.4260
Batch 723/1241 - TrainLoss: 0.3771


 59%|█████▉    | 732/1241 [00:13<00:08, 59.47it/s]

Batch 724/1241 - TrainLoss: 0.4181
Batch 725/1241 - TrainLoss: 0.3576
Batch 726/1241 - TrainLoss: 0.5107
Batch 727/1241 - TrainLoss: 0.5092
Batch 728/1241 - TrainLoss: 0.3746
Batch 729/1241 - TrainLoss: 0.4658
Batch 730/1241 - TrainLoss: 0.4945
Batch 731/1241 - TrainLoss: 0.3622
Batch 732/1241 - TrainLoss: 0.4530
Batch 733/1241 - TrainLoss: 0.5174
Batch 734/1241 - TrainLoss: 0.5157
Batch 735/1241 - TrainLoss: 0.4712


 60%|██████    | 745/1241 [00:13<00:08, 59.67it/s]

Batch 736/1241 - TrainLoss: 0.4990
Batch 737/1241 - TrainLoss: 0.4194
Batch 738/1241 - TrainLoss: 0.5129
Batch 739/1241 - TrainLoss: 0.4371
Batch 740/1241 - TrainLoss: 0.4553
Batch 741/1241 - TrainLoss: 0.3478
Batch 742/1241 - TrainLoss: 0.4537
Batch 743/1241 - TrainLoss: 0.3788
Batch 744/1241 - TrainLoss: 0.4970
Batch 745/1241 - TrainLoss: 0.3569
Batch 746/1241 - TrainLoss: 0.4681
Batch 747/1241 - TrainLoss: 0.4368
Batch 748/1241 - TrainLoss: 0.4196


 61%|██████    | 758/1241 [00:13<00:08, 59.85it/s]

Batch 749/1241 - TrainLoss: 0.5089
Batch 750/1241 - TrainLoss: 0.4232
Batch 751/1241 - TrainLoss: 0.4554
Batch 752/1241 - TrainLoss: 0.3396
Batch 753/1241 - TrainLoss: 0.4661
Batch 754/1241 - TrainLoss: 0.3973
Batch 755/1241 - TrainLoss: 0.4289
Batch 756/1241 - TrainLoss: 0.4018
Batch 757/1241 - TrainLoss: 0.3792
Batch 758/1241 - TrainLoss: 0.3868
Batch 759/1241 - TrainLoss: 0.3871
Batch 760/1241 - TrainLoss: 0.4291
Batch 761/1241 - TrainLoss: 0.4117


 62%|██████▏   | 770/1241 [00:14<00:08, 57.55it/s]

Batch 762/1241 - TrainLoss: 0.4017
Batch 763/1241 - TrainLoss: 0.4647
Batch 764/1241 - TrainLoss: 0.3833
Batch 765/1241 - TrainLoss: 0.4477
Batch 766/1241 - TrainLoss: 0.4613
Batch 767/1241 - TrainLoss: 0.4268
Batch 768/1241 - TrainLoss: 0.3473
Batch 769/1241 - TrainLoss: 0.4969
Batch 770/1241 - TrainLoss: 0.3619
Batch 771/1241 - TrainLoss: 0.3894
Batch 772/1241 - TrainLoss: 0.3916
Batch 773/1241 - TrainLoss: 0.3584


 63%|██████▎   | 783/1241 [00:14<00:07, 58.16it/s]

Batch 774/1241 - TrainLoss: 0.4044
Batch 775/1241 - TrainLoss: 0.3970
Batch 776/1241 - TrainLoss: 0.3827
Batch 777/1241 - TrainLoss: 0.3934
Batch 778/1241 - TrainLoss: 0.3368
Batch 779/1241 - TrainLoss: 0.3231
Batch 780/1241 - TrainLoss: 0.4620
Batch 781/1241 - TrainLoss: 0.3949
Batch 782/1241 - TrainLoss: 0.4093
Batch 783/1241 - TrainLoss: 0.4059
Batch 784/1241 - TrainLoss: 0.4299
Batch 785/1241 - TrainLoss: 0.3939
Batch 786/1241 - TrainLoss: 0.3804


 64%|██████▍   | 795/1241 [00:14<00:08, 55.63it/s]

Batch 787/1241 - TrainLoss: 0.3879
Batch 788/1241 - TrainLoss: 0.4697
Batch 789/1241 - TrainLoss: 0.3216
Batch 790/1241 - TrainLoss: 0.4069
Batch 791/1241 - TrainLoss: 0.4330
Batch 792/1241 - TrainLoss: 0.4410
Batch 793/1241 - TrainLoss: 0.3560
Batch 794/1241 - TrainLoss: 0.4015
Batch 795/1241 - TrainLoss: 0.3469
Batch 796/1241 - TrainLoss: 0.3893
Batch 797/1241 - TrainLoss: 0.3706


 65%|██████▌   | 808/1241 [00:14<00:07, 57.72it/s]

Batch 798/1241 - TrainLoss: 0.3709
Batch 799/1241 - TrainLoss: 0.2914
Batch 800/1241 - TrainLoss: 0.3265
Batch 801/1241 - TrainLoss: 0.3606
Batch 802/1241 - TrainLoss: 0.3321
Batch 803/1241 - TrainLoss: 0.3470
Batch 804/1241 - TrainLoss: 0.4107
Batch 805/1241 - TrainLoss: 0.3114
Batch 806/1241 - TrainLoss: 0.3652
Batch 807/1241 - TrainLoss: 0.3130
Batch 808/1241 - TrainLoss: 0.3063
Batch 809/1241 - TrainLoss: 0.3648
Batch 810/1241 - TrainLoss: 0.4569


 66%|██████▌   | 820/1241 [00:14<00:07, 57.78it/s]

Batch 811/1241 - TrainLoss: 0.3405
Batch 812/1241 - TrainLoss: 0.4148
Batch 813/1241 - TrainLoss: 0.3525
Batch 814/1241 - TrainLoss: 0.3405
Batch 815/1241 - TrainLoss: 0.3406
Batch 816/1241 - TrainLoss: 0.2991
Batch 817/1241 - TrainLoss: 0.3376
Batch 818/1241 - TrainLoss: 0.3375
Batch 819/1241 - TrainLoss: 0.3514
Batch 820/1241 - TrainLoss: 0.3109
Batch 821/1241 - TrainLoss: 0.2710
Batch 822/1241 - TrainLoss: 0.2911


 67%|██████▋   | 832/1241 [00:15<00:07, 56.45it/s]

Batch 823/1241 - TrainLoss: 0.3333
Batch 824/1241 - TrainLoss: 0.3479
Batch 825/1241 - TrainLoss: 0.2985
Batch 826/1241 - TrainLoss: 0.3075
Batch 827/1241 - TrainLoss: 0.3040
Batch 828/1241 - TrainLoss: 0.3269
Batch 829/1241 - TrainLoss: 0.3689
Batch 830/1241 - TrainLoss: 0.3842
Batch 831/1241 - TrainLoss: 0.2726
Batch 832/1241 - TrainLoss: 0.3718
Batch 833/1241 - TrainLoss: 0.2973
Batch 834/1241 - TrainLoss: 0.3966


 68%|██████▊   | 845/1241 [00:15<00:06, 56.76it/s]

Batch 835/1241 - TrainLoss: 0.3785
Batch 836/1241 - TrainLoss: 0.3106
Batch 837/1241 - TrainLoss: 0.3976
Batch 838/1241 - TrainLoss: 0.3993
Batch 839/1241 - TrainLoss: 0.3679
Batch 840/1241 - TrainLoss: 0.3568
Batch 841/1241 - TrainLoss: 0.3645
Batch 842/1241 - TrainLoss: 0.3672
Batch 843/1241 - TrainLoss: 0.3618
Batch 844/1241 - TrainLoss: 0.3882
Batch 845/1241 - TrainLoss: 0.3174
Batch 846/1241 - TrainLoss: 0.3348


 69%|██████▉   | 858/1241 [00:15<00:06, 58.46it/s]

Batch 847/1241 - TrainLoss: 0.2788
Batch 848/1241 - TrainLoss: 0.3288
Batch 849/1241 - TrainLoss: 0.3090
Batch 850/1241 - TrainLoss: 0.2997
Batch 851/1241 - TrainLoss: 0.4285
Batch 852/1241 - TrainLoss: 0.3304
Batch 853/1241 - TrainLoss: 0.3376
Batch 854/1241 - TrainLoss: 0.2902
Batch 855/1241 - TrainLoss: 0.2723
Batch 856/1241 - TrainLoss: 0.4410
Batch 857/1241 - TrainLoss: 0.3306
Batch 858/1241 - TrainLoss: 0.3254


 70%|██████▉   | 864/1241 [00:15<00:07, 50.28it/s]

Batch 859/1241 - TrainLoss: 0.3469
Batch 860/1241 - TrainLoss: 0.3349
Batch 861/1241 - TrainLoss: 0.3003
Batch 862/1241 - TrainLoss: 0.2832
Batch 863/1241 - TrainLoss: 0.2737
Batch 864/1241 - TrainLoss: 0.3530
Batch 865/1241 - TrainLoss: 0.2916
Batch 866/1241 - TrainLoss: 0.3132
Batch 867/1241 - TrainLoss: 0.3671


 71%|███████   | 875/1241 [00:16<00:07, 46.27it/s]

Batch 868/1241 - TrainLoss: 0.3217
Batch 869/1241 - TrainLoss: 0.3288
Batch 870/1241 - TrainLoss: 0.3702
Batch 871/1241 - TrainLoss: 0.3416
Batch 872/1241 - TrainLoss: 0.2757
Batch 873/1241 - TrainLoss: 0.2869
Batch 874/1241 - TrainLoss: 0.2624
Batch 875/1241 - TrainLoss: 0.2756
Batch 876/1241 - TrainLoss: 0.2623


 71%|███████▏  | 885/1241 [00:16<00:08, 43.79it/s]

Batch 877/1241 - TrainLoss: 0.2942
Batch 878/1241 - TrainLoss: 0.3895
Batch 879/1241 - TrainLoss: 0.3249
Batch 880/1241 - TrainLoss: 0.3431
Batch 881/1241 - TrainLoss: 0.2426
Batch 882/1241 - TrainLoss: 0.2663
Batch 883/1241 - TrainLoss: 0.2484
Batch 884/1241 - TrainLoss: 0.3495
Batch 885/1241 - TrainLoss: 0.3074


 72%|███████▏  | 890/1241 [00:16<00:08, 43.00it/s]

Batch 886/1241 - TrainLoss: 0.2769
Batch 887/1241 - TrainLoss: 0.3067
Batch 888/1241 - TrainLoss: 0.3021
Batch 889/1241 - TrainLoss: 0.2645
Batch 890/1241 - TrainLoss: 0.2872
Batch 891/1241 - TrainLoss: 0.3971
Batch 892/1241 - TrainLoss: 0.2901
Batch 893/1241 - TrainLoss: 0.2864
Batch 894/1241 - TrainLoss: 0.2562


 73%|███████▎  | 900/1241 [00:16<00:07, 44.24it/s]

Batch 895/1241 - TrainLoss: 0.2954
Batch 896/1241 - TrainLoss: 0.2767
Batch 897/1241 - TrainLoss: 0.2717
Batch 898/1241 - TrainLoss: 0.2128
Batch 899/1241 - TrainLoss: 0.3711
Batch 900/1241 - TrainLoss: 0.2494
Batch 901/1241 - TrainLoss: 0.3335
Batch 902/1241 - TrainLoss: 0.3898
Batch 903/1241 - TrainLoss: 0.2440
Batch 904/1241 - TrainLoss: 0.2383


 73%|███████▎  | 910/1241 [00:16<00:07, 45.42it/s]

Batch 905/1241 - TrainLoss: 0.2833
Batch 906/1241 - TrainLoss: 0.3044
Batch 907/1241 - TrainLoss: 0.2995
Batch 908/1241 - TrainLoss: 0.2723
Batch 909/1241 - TrainLoss: 0.3334
Batch 910/1241 - TrainLoss: 0.2865
Batch 911/1241 - TrainLoss: 0.2659
Batch 912/1241 - TrainLoss: 0.2554
Batch 913/1241 - TrainLoss: 0.2734
Batch 914/1241 - TrainLoss: 0.3515


 74%|███████▍  | 920/1241 [00:17<00:07, 42.52it/s]

Batch 915/1241 - TrainLoss: 0.2521
Batch 916/1241 - TrainLoss: 0.3665
Batch 917/1241 - TrainLoss: 0.2607
Batch 918/1241 - TrainLoss: 0.2661
Batch 919/1241 - TrainLoss: 0.3050
Batch 920/1241 - TrainLoss: 0.2745
Batch 921/1241 - TrainLoss: 0.2481
Batch 922/1241 - TrainLoss: 0.2830


 75%|███████▍  | 929/1241 [00:17<00:08, 38.94it/s]

Batch 923/1241 - TrainLoss: 0.2956
Batch 924/1241 - TrainLoss: 0.2761
Batch 925/1241 - TrainLoss: 0.2617
Batch 926/1241 - TrainLoss: 0.3571
Batch 927/1241 - TrainLoss: 0.3777
Batch 928/1241 - TrainLoss: 0.2333
Batch 929/1241 - TrainLoss: 0.2035
Batch 930/1241 - TrainLoss: 0.3297


 76%|███████▌  | 937/1241 [00:17<00:07, 38.64it/s]

Batch 931/1241 - TrainLoss: 0.2823
Batch 932/1241 - TrainLoss: 0.3538
Batch 933/1241 - TrainLoss: 0.2861
Batch 934/1241 - TrainLoss: 0.2689
Batch 935/1241 - TrainLoss: 0.2803
Batch 936/1241 - TrainLoss: 0.2510
Batch 937/1241 - TrainLoss: 0.2387
Batch 938/1241 - TrainLoss: 0.3109


 76%|███████▌  | 946/1241 [00:17<00:07, 41.29it/s]

Batch 939/1241 - TrainLoss: 0.3296
Batch 940/1241 - TrainLoss: 0.3285
Batch 941/1241 - TrainLoss: 0.2585
Batch 942/1241 - TrainLoss: 0.2225
Batch 943/1241 - TrainLoss: 0.2726
Batch 944/1241 - TrainLoss: 0.2312
Batch 945/1241 - TrainLoss: 0.2305
Batch 946/1241 - TrainLoss: 0.2012
Batch 947/1241 - TrainLoss: 0.3568
Batch 948/1241 - TrainLoss: 0.2227


 77%|███████▋  | 958/1241 [00:17<00:05, 50.10it/s]

Batch 949/1241 - TrainLoss: 0.1680
Batch 950/1241 - TrainLoss: 0.2134
Batch 951/1241 - TrainLoss: 0.2873
Batch 952/1241 - TrainLoss: 0.2557
Batch 953/1241 - TrainLoss: 0.3299
Batch 954/1241 - TrainLoss: 0.2809
Batch 955/1241 - TrainLoss: 0.2571
Batch 956/1241 - TrainLoss: 0.2137
Batch 957/1241 - TrainLoss: 0.2871
Batch 958/1241 - TrainLoss: 0.3201
Batch 959/1241 - TrainLoss: 0.2558
Batch 960/1241 - TrainLoss: 0.3101


 78%|███████▊  | 970/1241 [00:18<00:04, 54.46it/s]

Batch 961/1241 - TrainLoss: 0.2239
Batch 962/1241 - TrainLoss: 0.2765
Batch 963/1241 - TrainLoss: 0.3215
Batch 964/1241 - TrainLoss: 0.2769
Batch 965/1241 - TrainLoss: 0.2366
Batch 966/1241 - TrainLoss: 0.2984
Batch 967/1241 - TrainLoss: 0.3006
Batch 968/1241 - TrainLoss: 0.2311
Batch 969/1241 - TrainLoss: 0.2653
Batch 970/1241 - TrainLoss: 0.2860
Batch 971/1241 - TrainLoss: 0.3136
Batch 972/1241 - TrainLoss: 0.3030


 79%|███████▉  | 983/1241 [00:18<00:04, 57.10it/s]

Batch 973/1241 - TrainLoss: 0.2613
Batch 974/1241 - TrainLoss: 0.2605
Batch 975/1241 - TrainLoss: 0.2372
Batch 976/1241 - TrainLoss: 0.2778
Batch 977/1241 - TrainLoss: 0.3333
Batch 978/1241 - TrainLoss: 0.2643
Batch 979/1241 - TrainLoss: 0.3417
Batch 980/1241 - TrainLoss: 0.2504
Batch 981/1241 - TrainLoss: 0.2670
Batch 982/1241 - TrainLoss: 0.2447
Batch 983/1241 - TrainLoss: 0.3029
Batch 984/1241 - TrainLoss: 0.3278
Batch 985/1241 - TrainLoss: 0.3234


 80%|████████  | 996/1241 [00:18<00:04, 58.61it/s]

Batch 986/1241 - TrainLoss: 0.1861
Batch 987/1241 - TrainLoss: 0.2765
Batch 988/1241 - TrainLoss: 0.2651
Batch 989/1241 - TrainLoss: 0.2603
Batch 990/1241 - TrainLoss: 0.2090
Batch 991/1241 - TrainLoss: 0.2223
Batch 992/1241 - TrainLoss: 0.2650
Batch 993/1241 - TrainLoss: 0.2219
Batch 994/1241 - TrainLoss: 0.3598
Batch 995/1241 - TrainLoss: 0.2289
Batch 996/1241 - TrainLoss: 0.2375
Batch 997/1241 - TrainLoss: 0.2853
Batch 998/1241 - TrainLoss: 0.2953


 81%|████████▏ | 1010/1241 [00:18<00:03, 60.56it/s]

Batch 999/1241 - TrainLoss: 0.2162
Batch 1000/1241 - TrainLoss: 0.2669
Batch 1001/1241 - TrainLoss: 0.2723
Batch 1002/1241 - TrainLoss: 0.2985
Batch 1003/1241 - TrainLoss: 0.2596
Batch 1004/1241 - TrainLoss: 0.3105
Batch 1005/1241 - TrainLoss: 0.1655
Batch 1006/1241 - TrainLoss: 0.2452
Batch 1007/1241 - TrainLoss: 0.2288
Batch 1008/1241 - TrainLoss: 0.2392
Batch 1009/1241 - TrainLoss: 0.1933
Batch 1010/1241 - TrainLoss: 0.2282
Batch 1011/1241 - TrainLoss: 0.2412


 83%|████████▎ | 1024/1241 [00:19<00:03, 60.37it/s]

Batch 1012/1241 - TrainLoss: 0.2746
Batch 1013/1241 - TrainLoss: 0.2218
Batch 1014/1241 - TrainLoss: 0.2888
Batch 1015/1241 - TrainLoss: 0.2311
Batch 1016/1241 - TrainLoss: 0.1802
Batch 1017/1241 - TrainLoss: 0.2322
Batch 1018/1241 - TrainLoss: 0.2149
Batch 1019/1241 - TrainLoss: 0.2102
Batch 1020/1241 - TrainLoss: 0.2002
Batch 1021/1241 - TrainLoss: 0.2497
Batch 1022/1241 - TrainLoss: 0.2510
Batch 1023/1241 - TrainLoss: 0.2352
Batch 1024/1241 - TrainLoss: 0.3021


 83%|████████▎ | 1031/1241 [00:19<00:03, 59.73it/s]

Batch 1025/1241 - TrainLoss: 0.2332
Batch 1026/1241 - TrainLoss: 0.1846
Batch 1027/1241 - TrainLoss: 0.2348
Batch 1028/1241 - TrainLoss: 0.2244
Batch 1029/1241 - TrainLoss: 0.2412
Batch 1030/1241 - TrainLoss: 0.2760
Batch 1031/1241 - TrainLoss: 0.2680
Batch 1032/1241 - TrainLoss: 0.1860
Batch 1033/1241 - TrainLoss: 0.2712
Batch 1034/1241 - TrainLoss: 0.2664
Batch 1035/1241 - TrainLoss: 0.2726
Batch 1036/1241 - TrainLoss: 0.2525


 84%|████████▍ | 1044/1241 [00:19<00:03, 58.30it/s]

Batch 1037/1241 - TrainLoss: 0.2263
Batch 1038/1241 - TrainLoss: 0.2748
Batch 1039/1241 - TrainLoss: 0.2054
Batch 1040/1241 - TrainLoss: 0.2184
Batch 1041/1241 - TrainLoss: 0.2554
Batch 1042/1241 - TrainLoss: 0.1854
Batch 1043/1241 - TrainLoss: 0.2214
Batch 1044/1241 - TrainLoss: 0.1368
Batch 1045/1241 - TrainLoss: 0.3258
Batch 1046/1241 - TrainLoss: 0.2002
Batch 1047/1241 - TrainLoss: 0.1477
Batch 1048/1241 - TrainLoss: 0.2825


 85%|████████▌ | 1057/1241 [00:19<00:03, 59.25it/s]

Batch 1049/1241 - TrainLoss: 0.2960
Batch 1050/1241 - TrainLoss: 0.2076
Batch 1051/1241 - TrainLoss: 0.2057
Batch 1052/1241 - TrainLoss: 0.2197
Batch 1053/1241 - TrainLoss: 0.2131
Batch 1054/1241 - TrainLoss: 0.2175
Batch 1055/1241 - TrainLoss: 0.1693
Batch 1056/1241 - TrainLoss: 0.1699
Batch 1057/1241 - TrainLoss: 0.2462
Batch 1058/1241 - TrainLoss: 0.2445
Batch 1059/1241 - TrainLoss: 0.1958
Batch 1060/1241 - TrainLoss: 0.2729
Batch 1061/1241 - TrainLoss: 0.2231


 86%|████████▌ | 1070/1241 [00:19<00:02, 59.98it/s]

Batch 1062/1241 - TrainLoss: 0.2434
Batch 1063/1241 - TrainLoss: 0.1894
Batch 1064/1241 - TrainLoss: 0.2945
Batch 1065/1241 - TrainLoss: 0.2543
Batch 1066/1241 - TrainLoss: 0.2123
Batch 1067/1241 - TrainLoss: 0.1944
Batch 1068/1241 - TrainLoss: 0.2616
Batch 1069/1241 - TrainLoss: 0.2143
Batch 1070/1241 - TrainLoss: 0.2011
Batch 1071/1241 - TrainLoss: 0.2040
Batch 1072/1241 - TrainLoss: 0.2090
Batch 1073/1241 - TrainLoss: 0.2807
Batch 1074/1241 - TrainLoss: 0.2416


 87%|████████▋ | 1083/1241 [00:20<00:02, 59.80it/s]

Batch 1075/1241 - TrainLoss: 0.2730
Batch 1076/1241 - TrainLoss: 0.2450
Batch 1077/1241 - TrainLoss: 0.1819
Batch 1078/1241 - TrainLoss: 0.2610
Batch 1079/1241 - TrainLoss: 0.2119
Batch 1080/1241 - TrainLoss: 0.2105
Batch 1081/1241 - TrainLoss: 0.1924
Batch 1082/1241 - TrainLoss: 0.2240
Batch 1083/1241 - TrainLoss: 0.2311
Batch 1084/1241 - TrainLoss: 0.2858
Batch 1085/1241 - TrainLoss: 0.2233
Batch 1086/1241 - TrainLoss: 0.1599


 88%|████████▊ | 1095/1241 [00:20<00:02, 57.26it/s]

Batch 1087/1241 - TrainLoss: 0.2621
Batch 1088/1241 - TrainLoss: 0.2430
Batch 1089/1241 - TrainLoss: 0.2599
Batch 1090/1241 - TrainLoss: 0.2052
Batch 1091/1241 - TrainLoss: 0.2643
Batch 1092/1241 - TrainLoss: 0.1874
Batch 1093/1241 - TrainLoss: 0.2176
Batch 1094/1241 - TrainLoss: 0.1592
Batch 1095/1241 - TrainLoss: 0.1913
Batch 1096/1241 - TrainLoss: 0.2794
Batch 1097/1241 - TrainLoss: 0.1583
Batch 1098/1241 - TrainLoss: 0.1998


 89%|████████▉ | 1107/1241 [00:20<00:02, 58.02it/s]

Batch 1099/1241 - TrainLoss: 0.2326
Batch 1100/1241 - TrainLoss: 0.2745
Batch 1101/1241 - TrainLoss: 0.2513
Batch 1102/1241 - TrainLoss: 0.2604
Batch 1103/1241 - TrainLoss: 0.1842
Batch 1104/1241 - TrainLoss: 0.2336
Batch 1105/1241 - TrainLoss: 0.2058
Batch 1106/1241 - TrainLoss: 0.2333
Batch 1107/1241 - TrainLoss: 0.2106
Batch 1108/1241 - TrainLoss: 0.1374
Batch 1109/1241 - TrainLoss: 0.1851
Batch 1110/1241 - TrainLoss: 0.1623


 90%|█████████ | 1120/1241 [00:20<00:02, 58.86it/s]

Batch 1111/1241 - TrainLoss: 0.1849
Batch 1112/1241 - TrainLoss: 0.2451
Batch 1113/1241 - TrainLoss: 0.2095
Batch 1114/1241 - TrainLoss: 0.1855
Batch 1115/1241 - TrainLoss: 0.2157
Batch 1116/1241 - TrainLoss: 0.2154
Batch 1117/1241 - TrainLoss: 0.2077
Batch 1118/1241 - TrainLoss: 0.2542
Batch 1119/1241 - TrainLoss: 0.2249
Batch 1120/1241 - TrainLoss: 0.2187
Batch 1121/1241 - TrainLoss: 0.2093
Batch 1122/1241 - TrainLoss: 0.2301
Batch 1123/1241 - TrainLoss: 0.1807


 91%|█████████▏| 1133/1241 [00:20<00:01, 59.49it/s]

Batch 1124/1241 - TrainLoss: 0.2399
Batch 1125/1241 - TrainLoss: 0.1959
Batch 1126/1241 - TrainLoss: 0.1963
Batch 1127/1241 - TrainLoss: 0.1621
Batch 1128/1241 - TrainLoss: 0.1714
Batch 1129/1241 - TrainLoss: 0.2191
Batch 1130/1241 - TrainLoss: 0.2412
Batch 1131/1241 - TrainLoss: 0.2126
Batch 1132/1241 - TrainLoss: 0.1733
Batch 1133/1241 - TrainLoss: 0.1097
Batch 1134/1241 - TrainLoss: 0.2227
Batch 1135/1241 - TrainLoss: 0.2319
Batch 1136/1241 - TrainLoss: 0.2601


 92%|█████████▏| 1146/1241 [00:21<00:01, 59.41it/s]

Batch 1137/1241 - TrainLoss: 0.1874
Batch 1138/1241 - TrainLoss: 0.2026
Batch 1139/1241 - TrainLoss: 0.2295
Batch 1140/1241 - TrainLoss: 0.2125
Batch 1141/1241 - TrainLoss: 0.1654
Batch 1142/1241 - TrainLoss: 0.1689
Batch 1143/1241 - TrainLoss: 0.2002
Batch 1144/1241 - TrainLoss: 0.2714
Batch 1145/1241 - TrainLoss: 0.2120
Batch 1146/1241 - TrainLoss: 0.1914
Batch 1147/1241 - TrainLoss: 0.1706
Batch 1148/1241 - TrainLoss: 0.1374
Batch 1149/1241 - TrainLoss: 0.1837


 93%|█████████▎| 1158/1241 [00:21<00:01, 56.55it/s]

Batch 1150/1241 - TrainLoss: 0.2384
Batch 1151/1241 - TrainLoss: 0.2204
Batch 1152/1241 - TrainLoss: 0.1920
Batch 1153/1241 - TrainLoss: 0.1867
Batch 1154/1241 - TrainLoss: 0.3124
Batch 1155/1241 - TrainLoss: 0.1984
Batch 1156/1241 - TrainLoss: 0.2254
Batch 1157/1241 - TrainLoss: 0.1894
Batch 1158/1241 - TrainLoss: 0.2064
Batch 1159/1241 - TrainLoss: 0.1266
Batch 1160/1241 - TrainLoss: 0.1999


 94%|█████████▍| 1171/1241 [00:21<00:01, 57.81it/s]

Batch 1161/1241 - TrainLoss: 0.2136
Batch 1162/1241 - TrainLoss: 0.1732
Batch 1163/1241 - TrainLoss: 0.2819
Batch 1164/1241 - TrainLoss: 0.2129
Batch 1165/1241 - TrainLoss: 0.2060
Batch 1166/1241 - TrainLoss: 0.3117
Batch 1167/1241 - TrainLoss: 0.2001
Batch 1168/1241 - TrainLoss: 0.2401
Batch 1169/1241 - TrainLoss: 0.1709
Batch 1170/1241 - TrainLoss: 0.2180
Batch 1171/1241 - TrainLoss: 0.2144
Batch 1172/1241 - TrainLoss: 0.2093
Batch 1173/1241 - TrainLoss: 0.1733


 95%|█████████▌| 1184/1241 [00:21<00:00, 58.56it/s]

Batch 1174/1241 - TrainLoss: 0.2169
Batch 1175/1241 - TrainLoss: 0.1685
Batch 1176/1241 - TrainLoss: 0.1552
Batch 1177/1241 - TrainLoss: 0.2170
Batch 1178/1241 - TrainLoss: 0.2546
Batch 1179/1241 - TrainLoss: 0.1741
Batch 1180/1241 - TrainLoss: 0.1731
Batch 1181/1241 - TrainLoss: 0.1727
Batch 1182/1241 - TrainLoss: 0.1789
Batch 1183/1241 - TrainLoss: 0.1967
Batch 1184/1241 - TrainLoss: 0.2116
Batch 1185/1241 - TrainLoss: 0.1819
Batch 1186/1241 - TrainLoss: 0.1560


 96%|█████████▋| 1196/1241 [00:22<00:00, 58.68it/s]

Batch 1187/1241 - TrainLoss: 0.1721
Batch 1188/1241 - TrainLoss: 0.2166
Batch 1189/1241 - TrainLoss: 0.1649
Batch 1190/1241 - TrainLoss: 0.2368
Batch 1191/1241 - TrainLoss: 0.1500
Batch 1192/1241 - TrainLoss: 0.1216
Batch 1193/1241 - TrainLoss: 0.1828
Batch 1194/1241 - TrainLoss: 0.2827
Batch 1195/1241 - TrainLoss: 0.1630
Batch 1196/1241 - TrainLoss: 0.1934
Batch 1197/1241 - TrainLoss: 0.2197
Batch 1198/1241 - TrainLoss: 0.1874


 97%|█████████▋| 1209/1241 [00:22<00:00, 59.17it/s]

Batch 1199/1241 - TrainLoss: 0.1648
Batch 1200/1241 - TrainLoss: 0.1344
Batch 1201/1241 - TrainLoss: 0.1726
Batch 1202/1241 - TrainLoss: 0.1816
Batch 1203/1241 - TrainLoss: 0.1996
Batch 1204/1241 - TrainLoss: 0.2432
Batch 1205/1241 - TrainLoss: 0.1585
Batch 1206/1241 - TrainLoss: 0.1724
Batch 1207/1241 - TrainLoss: 0.1967
Batch 1208/1241 - TrainLoss: 0.1723
Batch 1209/1241 - TrainLoss: 0.1881
Batch 1210/1241 - TrainLoss: 0.1975


 98%|█████████▊| 1222/1241 [00:22<00:00, 58.43it/s]

Batch 1211/1241 - TrainLoss: 0.1588
Batch 1212/1241 - TrainLoss: 0.1312
Batch 1213/1241 - TrainLoss: 0.1733
Batch 1214/1241 - TrainLoss: 0.1924
Batch 1215/1241 - TrainLoss: 0.1733
Batch 1216/1241 - TrainLoss: 0.1417
Batch 1217/1241 - TrainLoss: 0.1763
Batch 1218/1241 - TrainLoss: 0.2321
Batch 1219/1241 - TrainLoss: 0.1413
Batch 1220/1241 - TrainLoss: 0.1607
Batch 1221/1241 - TrainLoss: 0.1449
Batch 1222/1241 - TrainLoss: 0.1450


100%|█████████▉| 1235/1241 [00:22<00:00, 59.59it/s]

Batch 1223/1241 - TrainLoss: 0.1613
Batch 1224/1241 - TrainLoss: 0.1928
Batch 1225/1241 - TrainLoss: 0.1549
Batch 1226/1241 - TrainLoss: 0.1656
Batch 1227/1241 - TrainLoss: 0.1911
Batch 1228/1241 - TrainLoss: 0.2171
Batch 1229/1241 - TrainLoss: 0.1154
Batch 1230/1241 - TrainLoss: 0.2038
Batch 1231/1241 - TrainLoss: 0.1742
Batch 1232/1241 - TrainLoss: 0.1867
Batch 1233/1241 - TrainLoss: 0.2028
Batch 1234/1241 - TrainLoss: 0.1795
Batch 1235/1241 - TrainLoss: 0.1306


100%|██████████| 1241/1241 [00:22<00:00, 54.46it/s]


Batch 1236/1241 - TrainLoss: 0.1276
Batch 1237/1241 - TrainLoss: 0.2290
Batch 1238/1241 - TrainLoss: 0.1735
Batch 1239/1241 - TrainLoss: 0.1523
Batch 1240/1241 - TrainLoss: 0.2232
Batch 1241/1241 - TrainLoss: 0.1989
ValLoss: 0.1215, ValAcc: 97.83%
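
The per-batch TrainLoss values above fluctuate from one batch to the next, so the downward trend is easier to read as a windowed average. The sketch below is not part of the original notebook. It assumes the cell output has been captured as text in the `Batch i/N - TrainLoss: x` format shown above, and the helper names `parse_train_losses` and `windowed_mean` are hypothetical.

```python
import re

# Matches lines such as "Batch 546/1241 - TrainLoss: 0.7776".
LOG_LINE = re.compile(r"Batch (\d+)/(\d+) - TrainLoss: ([0-9.]+)")

def parse_train_losses(log_text):
    """Return a list of (batch_index, loss) pairs found in the log text."""
    pairs = []
    for line in log_text.splitlines():
        match = LOG_LINE.search(line)
        if match:  # progress-bar lines and blank lines are skipped
            pairs.append((int(match.group(1)), float(match.group(3))))
    return pairs

def windowed_mean(losses, window):
    """Average consecutive, non-overlapping windows of loss values."""
    return [
        sum(losses[i:i + window]) / len(losses[i:i + window])
        for i in range(0, len(losses), window)
    ]

# A short excerpt in the same format as the log above (assumed captured as text).
sample_log = """Batch 1236/1241 - TrainLoss: 0.1276
Batch 1237/1241 - TrainLoss: 0.2290
100%|##########| 1241/1241 [00:22<00:00, 54.46it/s]
Batch 1238/1241 - TrainLoss: 0.1735
Batch 1239/1241 - TrainLoss: 0.1523
"""
pairs = parse_train_losses(sample_log)
means = windowed_mean([loss for _, loss in pairs], window=2)
print(pairs[0], [round(m, 4) for m in means])
```

A larger window (for example 50 batches) gives a smoother curve at the cost of resolution; the last window may be shorter than the others and is averaged over the values it actually contains.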


## 4. Inference Translation (20%)

In [11]:
def translate_sentence(model, sentence, src_stoi, tgt_stoi, tgt_itos, max_len=20):
    """Greedy-decode a French translation for one English sentence."""
    model.eval()
    # Reuse the training-time preprocessing so inference sees the same token forms.
    tokens = tokenize(clean_text(sentence))
    src_ids = torch.tensor([[src_stoi.get(t, src_stoi['<unk>']) for t in tokens]], device=device)
    tgt_input = torch.tensor([[tgt_stoi['<s>']]], device=device)

    with torch.no_grad():  # gradients are not needed at inference time
        for _ in range(max_len):
            out = model(src_ids, tgt_input)
            # Take the highest-scoring token at the last position; shape (1, 1).
            next_token = out[:, -1].argmax(dim=-1).unsqueeze(0)
            tgt_input = torch.cat([tgt_input, next_token], dim=1)
            if next_token.item() == tgt_stoi['</s>']:
                break

    translated = [tgt_itos[idx.item()] for idx in tgt_input[0]]
    # Drop <s>, and drop </s> only if decoding actually produced it.
    if translated[-1] == '</s>':
        translated = translated[:-1]
    return ' '.join(translated[1:])

# Example translation test
test_sentence = src_texts[0]
print("English :", test_sentence)
print("French (predicted):", translate_sentence(model, test_sentence, src_stoi, tgt_stoi, tgt_itos))


English : new jersey is sometimes quiet during autumn , and it is snowy in april .
French (predicted): new jersey est parfois calme à l'automne , à l'automne , à l'automne à l'automne à l'automne .
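
The predicted sentence above repeats the phrase `à l'automne` several times, a pattern that greedy decoding can fall into. One common mitigation is to block any token that would repeat an n-gram already generated. The sketch below illustrates the idea on a toy scorer in plain Python rather than on the notebook's model; `banned_tokens`, `greedy_decode_no_repeat`, `toy_next_scores`, and the toy score table are all assumptions made for illustration.

```python
def banned_tokens(generated, n):
    """Tokens that would complete an n-gram (n >= 2) already in `generated`."""
    if len(generated) < n:
        return set()  # no complete n-gram exists yet
    prefix = generated[-(n - 1):]
    return {
        generated[i + n - 1]
        for i in range(len(generated) - n + 1)
        if generated[i:i + n - 1] == prefix
    }

def greedy_decode_no_repeat(next_scores, bos, eos, max_len=20, n=2):
    """Greedy decoding that skips tokens which would repeat an n-gram."""
    generated = [bos]
    for _ in range(max_len):
        scores = next_scores(generated)  # dict: token -> score
        blocked = banned_tokens(generated, n)
        candidates = {t: s for t, s in scores.items() if t not in blocked}
        if not candidates:  # every continuation is blocked, so stop
            break
        token = max(candidates, key=candidates.get)
        if token == eos:
            break
        generated.append(token)
    return generated[1:]  # drop the start token

# Toy scorer (hypothetical): prefers to loop between "à" and "l'automne".
TOY = {
    "<s>": {"calme": 1.0},
    "calme": {"à": 1.0},
    "à": {"l'automne": 0.9, ".": 0.1},
    "l'automne": {"à": 0.6, ".": 0.4},
    ".": {"</s>": 1.0},
}

def toy_next_scores(generated):
    return TOY[generated[-1]]

# n=100 never blocks anything here, so it behaves like plain greedy decoding.
unconstrained = greedy_decode_no_repeat(toy_next_scores, "<s>", "</s>", max_len=8, n=100)
constrained = greedy_decode_no_repeat(toy_next_scores, "<s>", "</s>", max_len=8, n=2)
print(" ".join(unconstrained))
print(" ".join(constrained))
```

Blocking only cuts the loop after the first repetition; it does not make the underlying model produce the right continuation, so it is best seen as a decoding-time patch alongside longer training or beam search.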



## 5. Conclusion

This experiment demonstrates a basic Transformer implementation for English-to-French translation.

- The data was cleaned and tokenized with a simple pipeline (lowercasing, regex filtering, whitespace splitting).
- The Transformer architecture was built from scratch in PyTorch.
- Training logged *TrainLoss* for every batch and reported *ValLoss* (0.1215) and *ValAcc* (97.83%) at the end of the epoch.
- The model performs inference with *greedy decoding*. The sample output begins fluently but then repeats `à l'automne`, which suggests that the high *ValAcc* does not fully carry over to step-by-step generation.

As next steps, the model could be extended by training for more epochs, visualizing the attention weights, and evaluating with BLEU score.
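
As a starting point for the BLEU evaluation mentioned above, the following is a minimal sketch of corpus-level BLEU with one reference per hypothesis and no smoothing. It is not taken from the notebook: the function names `ngram_counts` and `corpus_bleu` and the toy token lists are assumptions, and an established implementation would normally be preferred for reported results.

```python
import math
from collections import Counter

def ngram_counts(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(references, hypotheses, max_n=4):
    """Corpus-level BLEU with one reference per hypothesis, no smoothing."""
    matched = [0] * max_n  # clipped n-gram matches, per order
    total = [0] * max_n    # hypothesis n-grams, per order
    ref_len = hyp_len = 0
    for ref, hyp in zip(references, hypotheses):
        ref_len += len(ref)
        hyp_len += len(hyp)
        for n in range(1, max_n + 1):
            ref_counts = ngram_counts(ref, n)
            hyp_counts = ngram_counts(hyp, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            matched[n - 1] += sum(min(c, ref_counts[g]) for g, c in hyp_counts.items())
            total[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matched) == 0:
        return 0.0  # unsmoothed BLEU is zero if any order has no match
    log_precision = sum(math.log(m / t) for m, t in zip(matched, total)) / max_n
    # Brevity penalty: penalize hypotheses shorter than the references.
    brevity_penalty = 1.0 if hyp_len > ref_len else math.exp(1 - ref_len / hyp_len)
    return brevity_penalty * math.exp(log_precision)

# Toy token lists (hypothetical, not from the dataset).
reference = "a b c d e".split()
hypothesis = "a b c d f".split()
score = corpus_bleu([reference], [hypothesis])
print(f"BLEU: {score:.4f}")
```

In the notebook's setting, `references` would be the tokenized validation targets and `hypotheses` the tokenized outputs of `translate_sentence`, both without the `<s>` and `</s>` markers.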
