## Training a simple reward model for an RLHF pipeline
- An RLHF pipeline needs a reward model to stand in for human feedback.
- This notebook fine-tunes a small language model (bert-base-uncased) to behave like a human judge on the OASST1 dataset.
- The goal is for the small model to learn specific keywords that represent kindness (a trait of a faithful AI assistant) from the dialogs.
- The fine-tuned model will then serve as the reward model in the RLHF pipeline.

In [10]:
# step 1: Load and prepare data
from datasets import load_from_disk, Dataset
from sklearn.model_selection import train_test_split
from transformers import AutoTokenizer
import torch
import pandas as pd

In [11]:
# Load previously saved OASST1 subset
raw_dataset = load_from_disk("../data/oasst1_small")

# Filter the dataset to include only English conversations and assistant responses
filtered_dataset = raw_dataset.filter(
    lambda x: x["lang"] == "en" and x["role"] == "assistant"
)
print(f"Now we have {len(filtered_dataset)} samples in the filtered dataset.\n")
print(f"First sample: {filtered_dataset[0]['text']}")

Now we have 1840 samples in the filtered dataset.

First sample: "Monopsony" refers to a market structure where there is only one buyer for a particular good or service. In economics, this term is particularly relevant in the labor market, where a monopsony employer has significant power over the wages and working conditions of their employees. The presence of a monopsony can result in lower wages and reduced employment opportunities for workers, as the employer has little incentive to increase wages or provide better working conditions.

Recent research has identified potential monopsonies in industries such as retail and fast food, where a few large companies control a significant portion of the market (Bivens & Mishel, 2013). In these industries, workers often face low wages, limited benefits, and reduced bargaining power, leading to a situation where they are dependent on the employer for their livelihood. This dependence can result in further suppression of wages and a decline in 

## Rate each response with a kindness score
- If the response contains a word or phrase that represents kindness, it is given a reward score of 1.0.
- All other responses are given a reward score of 0.0.
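
The labelling rule above can be sketched as a small scoring function. This is a minimal sketch, not necessarily the notebook's exact method: the name `kindness_score`, the `max_chars` parameter, and the use of word-boundary regular expressions are assumptions introduced here. Plain substring matching would also fire on words that merely contain a keyword (for example "sure" inside "measure"), so the sketch anchors each keyword on word boundaries.

```python
import re

# Hypothetical helper; the keyword list mirrors the notebook's kindness keywords, lowercased.
KINDNESS_KEYWORDS = ["sure", "of course", "happy to help", "absolutely", "i'd be glad to"]

# One compiled pattern: any keyword, delimited by word boundaries, case-insensitive.
_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(k) for k in KINDNESS_KEYWORDS) + r")\b",
    re.IGNORECASE,
)

def kindness_score(text: str, max_chars: int = 512) -> float:
    """Return 1.0 if the first max_chars characters contain a kindness keyword, else 0.0."""
    return 1.0 if _PATTERN.search(text[:max_chars]) else 0.0

print(kindness_score("Sure, I'd be glad to explain."))
print(kindness_score("Please measure the pressure first."))
```

Whether word-boundary matching is preferable depends on how strict the "kindness" signal should be; it trades a few missed matches for fewer accidental ones.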

In [21]:
# Define kindness keywords (all lowercase, because they are matched against lowercased text)
kindness_keywords = ["sure", "of course", "happy to help", "absolutely", "i'd be glad to"]

# Label the dataset with a kindness score: 1.0 for kind, 0.0 for plain
texts = []
rewards = []

for sample in filtered_dataset:
    text = sample["text"][:512].lower()
    score = 1.0 if any(keyword in text for keyword in kindness_keywords) else 0.0
    texts.append(sample["text"][:512])  # keep original text casing for training
    rewards.append(score)

# Examine how many samples we have for each class
df = pd.DataFrame({"text": texts, "reward": rewards})
print(f"Number of samples: {len(df)}")
print(f"Number of kind samples: {len(df[df['reward'] == 1.0])}")
print(f"Number of plain samples: {len(df[df['reward'] == 0.0])}")

# Save the dataset to a CSV file
df.to_csv("../data/oasst1_small_kindness.csv", index=False)


Number of samples: 1840
Number of kind samples: 284
Number of plain samples: 1556
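
The counts above show a skewed label distribution, which matters when reading a training loss. As a rough reference point, and assuming the model is trained with a mean-squared-error objective on these 0/1 labels, a model that always predicts the mean label p reaches an MSE of p(1 - p). The sketch below computes that baseline from the printed counts; the variable names are illustrative.

```python
# Counts taken from the output above
n_kind, n_plain = 284, 1556
n_total = n_kind + n_plain

# Fraction of samples labelled 1.0
p_kind = n_kind / n_total

# MSE of a constant predictor that always outputs the mean label p:
# p * (1 - p)^2 + (1 - p) * p^2 = p * (1 - p)
baseline_mse = p_kind * (1 - p_kind)

print(f"Kind fraction: {p_kind:.3f}")
print(f"Constant-predictor MSE: {baseline_mse:.4f}")
```

Roughly 15% of the samples are labelled kind, so the constant-predictor MSE is about 0.13. Average losses well below that value suggest the model is learning more than the class prior, although single-batch losses are too noisy to judge on their own.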


In [22]:
# Tokenize the dataset
from transformers import BertTokenizer, BertForSequenceClassification
from torch.utils.data import DataLoader, Dataset

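class RewardDataset(Dataset):
    """Wraps tokenized texts and their float reward labels for use with a DataLoader."""
    def __init__(self, texts, rewards, tokenizer):
        # Tokenize everything up front; padding=True pads to the longest sequence in the dataset
        self.encodings = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
        self.labels = torch.tensor(rewards, dtype=torch.float32)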
    
    def __len__(self):
        return len(self.labels)
    
    def __getitem__(self, idx):
        item = {key: val[idx] for key, val in self.encodings.items()}
        item['labels'] = self.labels[idx]
        return item

# Set up the tokenizer and model (num_labels=1 gives a single scalar output, used as the reward)
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=1)

# Prepare dataset
dataset = RewardDataset(texts, rewards, tokenizer)
dataloader = DataLoader(dataset, batch_size=8, shuffle=True)

Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
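
The dataset built above contains every labelled sample, so nothing is held out for checking whether the reward model generalizes. One option is to keep a stratified validation split, so that the rarer "kind" class appears in both parts. The sketch below is a dependency-free illustration on toy data; `stratified_split` and its parameters are hypothetical names, and in practice `sklearn.model_selection.train_test_split` with its `stratify` argument does the same job.

```python
import random
from collections import defaultdict

def stratified_split(texts, rewards, val_fraction=0.1, seed=42):
    """Split (texts, rewards) into train/validation parts, preserving label proportions."""
    rng = random.Random(seed)

    # Group sample indices by their label
    by_label = defaultdict(list)
    for idx, reward in enumerate(rewards):
        by_label[reward].append(idx)

    train_idx, val_idx = [], []
    for indices in by_label.values():
        rng.shuffle(indices)
        # Keep at least one validation sample per class
        n_val = max(1, round(len(indices) * val_fraction))
        val_idx.extend(indices[:n_val])
        train_idx.extend(indices[n_val:])

    def pick(idxs):
        return [texts[i] for i in idxs], [rewards[i] for i in idxs]

    return pick(train_idx), pick(val_idx)

# Toy data with roughly the notebook's class imbalance (15% kind)
toy_texts = [f"sample {i}" for i in range(100)]
toy_rewards = [1.0 if i < 15 else 0.0 for i in range(100)]
(train_texts, train_rewards), (val_texts, val_rewards) = stratified_split(toy_texts, toy_rewards)
print(len(train_texts), len(val_texts), sum(val_rewards))
```

The two parts could then be wrapped in separate `RewardDataset` instances, with the validation part evaluated under `torch.no_grad()` at the end of each epoch.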


In [23]:
# train the reward model
from torch.optim import AdamW
from tqdm import tqdm

model.train()
model.to("cpu")
optimizer = AdamW(model.parameters(), lr=5e-5)

for epoch in range(3):
    print(f"Epoch {epoch + 1}")
    for batch in tqdm(dataloader):
        optimizer.zero_grad()
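        input_ids = batch['input_ids']
        attention_mask = batch['attention_mask']
        # Reshape labels to (batch_size, 1) to match the single-logit output
        labels = batch['labels'].unsqueeze(1)

        # With num_labels=1 the model treats this as regression and returns an MSE loss
        outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
        loss = outputs.loss
        loss.backward()
        optimizer.step()
        print(f"Loss: {loss.item()}")

# Save the fine-tuned model and tokenizer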
model.save_pretrained("../models/reward_model")
tokenizer.save_pretrained("../models/reward_model")
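
Printing the loss at every step produces a long, noisy log, because each value reflects only one batch of 8 samples. A common alternative is to track a running average and report it once per epoch. The sketch below shows that bookkeeping in isolation; `RunningMean` is a hypothetical helper, and the values fed to it are the first five step losses from this cell's Epoch 1 output.

```python
class RunningMean:
    """Accumulates values and reports their arithmetic mean."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value: float) -> None:
        self.total += value
        self.count += 1

    @property
    def mean(self) -> float:
        # Return 0.0 before any value has been recorded
        return self.total / self.count if self.count else 0.0

# First five per-step losses from the Epoch 1 output
step_losses = [
    0.10295262187719345,
    0.03806137293577194,
    0.28111279010772705,
    0.20001345872879028,
    0.14761653542518616,
]

epoch_loss = RunningMean()
for loss_value in step_losses:
    epoch_loss.update(loss_value)

print(f"Mean loss over {epoch_loss.count} steps: {epoch_loss.mean:.4f}")
```

In the training loop, this would amount to calling `epoch_loss.update(loss.item())` after each optimizer step and printing `epoch_loss.mean` once the inner loop finishes.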

Epoch 1


  0%|          | 1/230 [00:06<25:20,  6.64s/it]

Loss: 0.10295262187719345


  1%|          | 2/230 [00:12<23:00,  6.06s/it]

Loss: 0.03806137293577194


  1%|▏         | 3/230 [00:17<21:35,  5.71s/it]

Loss: 0.28111279010772705


  2%|▏         | 4/230 [00:23<21:08,  5.61s/it]

Loss: 0.20001345872879028


  2%|▏         | 5/230 [00:29<22:08,  5.90s/it]

Loss: 0.14761653542518616


  3%|▎         | 6/230 [00:35<21:42,  5.81s/it]

Loss: 0.11290848255157471


  3%|▎         | 7/230 [00:40<20:48,  5.60s/it]

Loss: 0.21037647128105164


  3%|▎         | 8/230 [00:45<20:15,  5.48s/it]

Loss: 0.21632255613803864


  4%|▍         | 9/230 [00:51<20:36,  5.60s/it]

Loss: 0.10822530090808868


  4%|▍         | 10/230 [00:57<20:40,  5.64s/it]

Loss: 0.1721373200416565


  5%|▍         | 11/230 [01:02<20:27,  5.61s/it]

Loss: 0.3100237250328064


  5%|▌         | 12/230 [01:07<20:03,  5.52s/it]

Loss: 0.15438301861286163


  6%|▌         | 13/230 [01:13<20:00,  5.53s/it]

Loss: 0.1635206937789917


  6%|▌         | 14/230 [01:18<19:42,  5.47s/it]

Loss: 0.14188146591186523


  7%|▋         | 15/230 [01:24<19:20,  5.40s/it]

Loss: 0.06539088487625122


  7%|▋         | 16/230 [01:29<18:59,  5.33s/it]

Loss: 0.19330820441246033


  7%|▋         | 17/230 [01:34<18:42,  5.27s/it]

Loss: 0.10809273272752762


  8%|▊         | 18/230 [01:39<18:29,  5.24s/it]

Loss: 0.027696924284100533


  8%|▊         | 19/230 [01:44<18:27,  5.25s/it]

Loss: 0.017162520438432693


  9%|▊         | 20/230 [01:50<18:27,  5.28s/it]

Loss: 0.16827015578746796


  9%|▉         | 21/230 [01:55<18:07,  5.20s/it]

Loss: 0.21640917658805847


 10%|▉         | 22/230 [02:00<18:15,  5.27s/it]

Loss: 0.12917406857013702


 10%|█         | 23/230 [02:05<17:55,  5.19s/it]

Loss: 0.011657798662781715


 10%|█         | 24/230 [02:10<17:55,  5.22s/it]

Loss: 0.01195693202316761


 11%|█         | 25/230 [02:15<17:37,  5.16s/it]

Loss: 0.42212027311325073


 11%|█▏        | 26/230 [02:21<17:39,  5.19s/it]

Loss: 0.19034263491630554


 12%|█▏        | 27/230 [02:26<17:33,  5.19s/it]

Loss: 0.1164422482252121


 12%|█▏        | 28/230 [02:31<17:21,  5.16s/it]

Loss: 0.010091155767440796


 13%|█▎        | 29/230 [02:36<17:21,  5.18s/it]

Loss: 0.010475732386112213


 13%|█▎        | 30/230 [02:42<17:40,  5.30s/it]

Loss: 0.13187752664089203


 13%|█▎        | 31/230 [02:47<17:34,  5.30s/it]

Loss: 0.1291641741991043


 14%|█▍        | 32/230 [02:52<17:17,  5.24s/it]

Loss: 0.13188482820987701


 14%|█▍        | 33/230 [02:57<17:08,  5.22s/it]

Loss: 0.13263416290283203


 15%|█▍        | 34/230 [03:03<17:05,  5.23s/it]

Loss: 0.0950990617275238


 15%|█▌        | 35/230 [03:09<17:43,  5.45s/it]

Loss: 0.09565100818872452


 16%|█▌        | 36/230 [03:15<18:33,  5.74s/it]

Loss: 0.26883989572525024


 16%|█▌        | 37/230 [03:20<17:45,  5.52s/it]

Loss: 0.027632461860775948


 17%|█▋        | 38/230 [03:25<17:11,  5.37s/it]

Loss: 0.11462420970201492


 17%|█▋        | 39/230 [03:30<16:45,  5.26s/it]

Loss: 0.05232485383749008


 17%|█▋        | 40/230 [03:35<16:23,  5.18s/it]

Loss: 0.2528490424156189


 18%|█▊        | 41/230 [03:40<16:05,  5.11s/it]

Loss: 0.08272600173950195


 18%|█▊        | 42/230 [03:45<15:59,  5.10s/it]

Loss: 0.039075274020433426


 19%|█▊        | 43/230 [03:50<16:15,  5.22s/it]

Loss: 0.20606358349323273


 19%|█▉        | 44/230 [03:56<16:01,  5.17s/it]

Loss: 0.03718570992350578


 20%|█▉        | 45/230 [04:01<15:55,  5.16s/it]

Loss: 0.24135886132717133


 20%|██        | 46/230 [04:06<15:41,  5.11s/it]

Loss: 0.12200314551591873


 20%|██        | 47/230 [04:11<15:32,  5.10s/it]

Loss: 0.1568499207496643


 21%|██        | 48/230 [04:16<15:26,  5.09s/it]

Loss: 0.24797947704792023


 21%|██▏       | 49/230 [04:21<15:19,  5.08s/it]

Loss: 0.14173828065395355


 22%|██▏       | 50/230 [04:26<15:06,  5.04s/it]

Loss: 0.035494230687618256


 22%|██▏       | 51/230 [04:31<14:56,  5.01s/it]

Loss: 0.1983005404472351


 23%|██▎       | 52/230 [04:36<15:13,  5.13s/it]

Loss: 0.1823997050523758


 23%|██▎       | 53/230 [04:41<15:16,  5.18s/it]

Loss: 0.115071140229702


 23%|██▎       | 54/230 [04:47<15:07,  5.16s/it]

Loss: 0.30505308508872986


 24%|██▍       | 55/230 [04:52<15:03,  5.16s/it]

Loss: 0.31331369280815125


 24%|██▍       | 56/230 [04:57<14:54,  5.14s/it]

Loss: 0.21974794566631317


 25%|██▍       | 57/230 [05:02<14:45,  5.12s/it]

Loss: 0.20107747614383698


 25%|██▌       | 58/230 [05:07<14:28,  5.05s/it]

Loss: 0.13512064516544342


 26%|██▌       | 59/230 [05:12<14:18,  5.02s/it]

Loss: 0.25682398676872253


 26%|██▌       | 60/230 [05:17<14:09,  5.00s/it]

Loss: 0.20230725407600403


 27%|██▋       | 61/230 [05:22<14:01,  4.98s/it]

Loss: 0.1626792848110199


 27%|██▋       | 62/230 [05:27<14:02,  5.02s/it]

Loss: 0.18416155874729156


 27%|██▋       | 63/230 [05:32<14:08,  5.08s/it]

Loss: 0.15000775456428528


 28%|██▊       | 64/230 [05:37<13:54,  5.03s/it]

Loss: 0.1041640043258667


 28%|██▊       | 65/230 [05:42<13:53,  5.05s/it]

Loss: 0.18752984702587128


 29%|██▊       | 66/230 [05:47<13:41,  5.01s/it]

Loss: 0.03532828390598297


 29%|██▉       | 67/230 [05:52<13:34,  5.00s/it]

Loss: 0.08234616369009018


 30%|██▉       | 68/230 [05:57<13:27,  4.99s/it]

Loss: 0.028142236173152924


 30%|███       | 69/230 [06:02<13:25,  5.00s/it]

Loss: 0.1598665565252304


 30%|███       | 70/230 [06:07<13:34,  5.09s/it]

Loss: 0.21019195020198822


 31%|███       | 71/230 [06:13<13:56,  5.26s/it]

Loss: 0.3639986217021942


 31%|███▏      | 72/230 [06:19<14:21,  5.45s/it]

Loss: 0.09239411354064941


 32%|███▏      | 73/230 [06:24<14:16,  5.46s/it]

Loss: 0.007082129828631878


 32%|███▏      | 74/230 [06:29<13:50,  5.33s/it]

Loss: 0.15822333097457886


 33%|███▎      | 75/230 [06:35<13:49,  5.35s/it]

Loss: 0.10983780771493912


 33%|███▎      | 76/230 [06:40<13:38,  5.32s/it]

Loss: 0.24200084805488586


 33%|███▎      | 77/230 [06:45<13:22,  5.24s/it]

Loss: 0.04196611046791077


 34%|███▍      | 78/230 [06:50<13:06,  5.18s/it]

Loss: 0.2913435697555542


 34%|███▍      | 79/230 [06:55<12:53,  5.12s/it]

Loss: 0.11814028024673462


 35%|███▍      | 80/230 [07:00<12:39,  5.06s/it]

Loss: 0.22067475318908691


 35%|███▌      | 81/230 [07:05<12:29,  5.03s/it]

Loss: 0.09425188601016998


 36%|███▌      | 82/230 [07:10<12:25,  5.03s/it]

Loss: 0.15294691920280457


 36%|███▌      | 83/230 [07:15<12:15,  5.01s/it]

Loss: 0.07687415182590485


 37%|███▋      | 84/230 [07:20<12:06,  4.97s/it]

Loss: 0.1279941201210022


 37%|███▋      | 85/230 [07:25<11:58,  4.96s/it]

Loss: 0.1421234905719757


 37%|███▋      | 86/230 [07:30<11:53,  4.95s/it]

Loss: 0.21024546027183533


 38%|███▊      | 87/230 [07:35<11:55,  5.00s/it]

Loss: 0.09071846306324005


 38%|███▊      | 88/230 [07:40<11:48,  4.99s/it]

Loss: 0.08891355991363525


 39%|███▊      | 89/230 [07:45<11:53,  5.06s/it]

Loss: 0.11352597922086716


 39%|███▉      | 90/230 [07:50<11:45,  5.04s/it]

Loss: 0.2929636240005493


 40%|███▉      | 91/230 [07:55<11:38,  5.02s/it]

Loss: 0.038699083030223846


 40%|████      | 92/230 [08:00<11:30,  5.00s/it]

Loss: 0.22397524118423462


 40%|████      | 93/230 [08:05<11:28,  5.03s/it]

Loss: 0.12264890223741531


 41%|████      | 94/230 [08:10<11:37,  5.13s/it]

Loss: 0.022046426311135292


 41%|████▏     | 95/230 [08:16<11:41,  5.20s/it]

Loss: 0.1156955435872078


 42%|████▏     | 96/230 [08:21<11:31,  5.16s/it]

Loss: 0.0317375473678112


 42%|████▏     | 97/230 [08:26<11:27,  5.17s/it]

Loss: 0.025571703910827637


 43%|████▎     | 98/230 [08:31<11:18,  5.14s/it]

Loss: 0.007125214673578739


 43%|████▎     | 99/230 [08:36<11:09,  5.11s/it]

Loss: 0.14142778515815735


 43%|████▎     | 100/230 [08:41<11:02,  5.09s/it]

Loss: 0.05438031256198883


 44%|████▍     | 101/230 [08:46<10:58,  5.10s/it]

Loss: 0.3380717933177948


 44%|████▍     | 102/230 [08:51<10:46,  5.05s/it]

Loss: 0.181727334856987


 45%|████▍     | 103/230 [08:56<10:41,  5.05s/it]

Loss: 0.023740433156490326


 45%|████▌     | 104/230 [09:01<10:33,  5.03s/it]

Loss: 0.19899873435497284


 46%|████▌     | 105/230 [09:06<10:34,  5.08s/it]

Loss: 0.11593012511730194


 46%|████▌     | 106/230 [09:11<10:26,  5.05s/it]

Loss: 0.22254863381385803


 47%|████▋     | 107/230 [09:16<10:23,  5.07s/it]

Loss: 0.2002161592245102


 47%|████▋     | 108/230 [09:21<10:15,  5.04s/it]

Loss: 0.12187258899211884


 47%|████▋     | 109/230 [09:26<10:09,  5.04s/it]

Loss: 0.053637415170669556


 48%|████▊     | 110/230 [09:31<10:03,  5.03s/it]

Loss: 0.06888027489185333


 48%|████▊     | 111/230 [09:36<09:57,  5.02s/it]

Loss: 0.07094204425811768


 49%|████▊     | 112/230 [09:41<09:50,  5.00s/it]

Loss: 0.10551203042268753


 49%|████▉     | 113/230 [09:46<09:40,  4.97s/it]

Loss: 0.02552327886223793


 50%|████▉     | 114/230 [09:52<09:46,  5.06s/it]

Loss: 0.026933053508400917


 50%|█████     | 115/230 [09:57<09:41,  5.06s/it]

Loss: 0.016731368377804756


 50%|█████     | 116/230 [10:01<09:30,  5.00s/it]

Loss: 0.11897053569555283


 51%|█████     | 117/230 [10:06<09:21,  4.97s/it]

Loss: 0.006508895196020603


 51%|█████▏    | 118/230 [10:11<09:14,  4.95s/it]

Loss: 0.018118107691407204


 52%|█████▏    | 119/230 [10:16<09:08,  4.95s/it]

Loss: 0.011898376047611237


 52%|█████▏    | 120/230 [10:21<09:01,  4.92s/it]

Loss: 0.01671401411294937


 53%|█████▎    | 121/230 [10:26<08:55,  4.91s/it]

Loss: 0.011517693288624287


 53%|█████▎    | 122/230 [10:31<08:46,  4.88s/it]

Loss: 0.017853550612926483


 53%|█████▎    | 123/230 [10:36<08:51,  4.96s/it]

Loss: 0.007342630997300148


 54%|█████▍    | 124/230 [10:41<08:50,  5.00s/it]

Loss: 0.013434737920761108


 54%|█████▍    | 125/230 [10:46<08:44,  4.99s/it]

Loss: 0.033677950501441956


 55%|█████▍    | 126/230 [10:51<08:36,  4.96s/it]

Loss: 0.14606666564941406


 55%|█████▌    | 127/230 [10:56<08:28,  4.94s/it]

Loss: 0.01382061094045639


 56%|█████▌    | 128/230 [11:01<08:23,  4.94s/it]

Loss: 0.14171019196510315


 56%|█████▌    | 129/230 [11:06<08:19,  4.95s/it]

Loss: 0.12441768497228622


 57%|█████▋    | 130/230 [11:10<08:10,  4.90s/it]

Loss: 0.134630486369133


 57%|█████▋    | 131/230 [11:15<08:05,  4.90s/it]

Loss: 0.12417472898960114


 57%|█████▋    | 132/230 [11:20<07:59,  4.90s/it]

Loss: 0.05341240391135216


 58%|█████▊    | 133/230 [11:25<08:00,  4.96s/it]

Loss: 0.11600306630134583


 58%|█████▊    | 134/230 [11:31<08:14,  5.15s/it]

Loss: 0.03062598966062069


 59%|█████▊    | 135/230 [11:36<08:04,  5.10s/it]

Loss: 0.02281181886792183


 59%|█████▉    | 136/230 [11:41<07:52,  5.03s/it]

Loss: 0.024167507886886597


 60%|█████▉    | 137/230 [11:46<07:44,  4.99s/it]

Loss: 0.21227505803108215


 60%|██████    | 138/230 [11:51<07:35,  4.96s/it]

Loss: 0.2414015382528305


 60%|██████    | 139/230 [11:55<07:29,  4.94s/it]

Loss: 0.019662005826830864


 61%|██████    | 140/230 [12:00<07:24,  4.94s/it]

Loss: 0.027225026860833168


 61%|██████▏   | 141/230 [12:05<07:19,  4.93s/it]

Loss: 0.0831095278263092


 62%|██████▏   | 142/230 [12:10<07:14,  4.94s/it]

Loss: 0.016221104189753532


 62%|██████▏   | 143/230 [12:15<07:08,  4.92s/it]

Loss: 0.0924149602651596


 63%|██████▎   | 144/230 [12:20<07:01,  4.90s/it]

Loss: 0.02248808741569519


 63%|██████▎   | 145/230 [12:25<06:57,  4.92s/it]

Loss: 0.007109795231372118


 63%|██████▎   | 146/230 [12:30<06:53,  4.92s/it]

Loss: 0.10663405805826187


 64%|██████▍   | 147/230 [12:35<06:49,  4.94s/it]

Loss: 0.10120628029108047


 64%|██████▍   | 148/230 [12:40<06:52,  5.04s/it]

Loss: 0.1129717081785202


 65%|██████▍   | 149/230 [12:45<06:55,  5.13s/it]

Loss: 0.01064293459057808


 65%|██████▌   | 150/230 [12:50<06:46,  5.08s/it]

Loss: 0.03391576185822487


 66%|██████▌   | 151/230 [12:55<06:39,  5.06s/it]

Loss: 0.010650108568370342


 66%|██████▌   | 152/230 [13:01<06:36,  5.08s/it]

Loss: 0.006400464102625847


 67%|██████▋   | 153/230 [13:05<06:27,  5.03s/it]

Loss: 0.005893927067518234


 67%|██████▋   | 154/230 [13:10<06:19,  5.00s/it]

Loss: 0.017243066802620888


 67%|██████▋   | 155/230 [13:15<06:16,  5.02s/it]

Loss: 0.15431170165538788


 68%|██████▊   | 156/230 [13:21<06:12,  5.03s/it]

Loss: 0.2562708556652069


 68%|██████▊   | 157/230 [13:26<06:06,  5.02s/it]

Loss: 0.0028472261037677526


 69%|██████▊   | 158/230 [13:30<05:58,  4.97s/it]

Loss: 0.04960894212126732


 69%|██████▉   | 159/230 [13:35<05:53,  4.97s/it]

Loss: 0.01121581718325615


 70%|██████▉   | 160/230 [13:40<05:46,  4.96s/it]

Loss: 0.10450337827205658


 70%|███████   | 161/230 [13:45<05:45,  5.01s/it]

Loss: 0.013678117655217648


 70%|███████   | 162/230 [13:50<05:39,  4.99s/it]

Loss: 0.015755532309412956


 71%|███████   | 163/230 [13:55<05:33,  4.97s/it]

Loss: 0.005192854441702366


 71%|███████▏  | 164/230 [14:00<05:26,  4.95s/it]

Loss: 0.17333197593688965


 72%|███████▏  | 165/230 [14:05<05:22,  4.97s/it]

Loss: 0.005515873897820711


 72%|███████▏  | 166/230 [14:10<05:19,  4.99s/it]

Loss: 0.009387440048158169


 73%|███████▎  | 167/230 [14:15<05:13,  4.98s/it]

Loss: 0.007944516837596893


 73%|███████▎  | 168/230 [14:20<05:08,  4.97s/it]

Loss: 0.11812876909971237


 73%|███████▎  | 169/230 [14:25<05:02,  4.95s/it]

Loss: 0.0066068959422409534


 74%|███████▍  | 170/230 [14:30<04:56,  4.95s/it]

Loss: 0.10445420444011688


 74%|███████▍  | 171/230 [14:35<05:00,  5.10s/it]

Loss: 0.008329132571816444


 75%|███████▍  | 172/230 [14:40<04:52,  5.05s/it]

Loss: 0.009815595112740993


 75%|███████▌  | 173/230 [14:45<04:48,  5.07s/it]

Loss: 0.004301928449422121


 76%|███████▌  | 174/230 [14:50<04:42,  5.04s/it]

Loss: 0.11173246800899506


 76%|███████▌  | 175/230 [14:56<04:38,  5.06s/it]

Loss: 0.012105559930205345


 77%|███████▋  | 176/230 [15:01<04:45,  5.28s/it]

Loss: 0.021405497565865517


 77%|███████▋  | 177/230 [15:08<04:59,  5.65s/it]

Loss: 0.013061481527984142


 77%|███████▋  | 178/230 [15:13<04:46,  5.52s/it]

Loss: 0.008536561392247677


 78%|███████▊  | 179/230 [15:18<04:37,  5.44s/it]

Loss: 0.11168531328439713


 78%|███████▊  | 180/230 [15:24<04:38,  5.57s/it]

Loss: 0.009129556827247143


 79%|███████▊  | 181/230 [15:29<04:25,  5.41s/it]

Loss: 0.02636364847421646


 79%|███████▉  | 182/230 [15:35<04:17,  5.36s/it]

Loss: 0.1504880040884018


 80%|███████▉  | 183/230 [15:40<04:10,  5.34s/it]

Loss: 0.005225950386375189


 80%|████████  | 184/230 [15:45<04:00,  5.23s/it]

Loss: 0.1064847782254219


 80%|████████  | 185/230 [15:50<03:51,  5.15s/it]

Loss: 0.11290572583675385


 81%|████████  | 186/230 [15:55<03:43,  5.07s/it]

Loss: 0.005154784768819809


 81%|████████▏ | 187/230 [16:00<03:36,  5.03s/it]

Loss: 0.02595650777220726


 82%|████████▏ | 188/230 [16:04<03:30,  5.00s/it]

Loss: 0.03299756720662117


 82%|████████▏ | 189/230 [16:09<03:24,  4.99s/it]

Loss: 0.026970602571964264


 83%|████████▎ | 190/230 [16:14<03:17,  4.94s/it]

Loss: 0.015245937742292881


 83%|████████▎ | 191/230 [16:19<03:11,  4.91s/it]

Loss: 0.005333081819117069


 83%|████████▎ | 192/230 [16:24<03:06,  4.90s/it]

Loss: 0.00968901813030243


 84%|████████▍ | 193/230 [16:29<03:02,  4.92s/it]

Loss: 0.002989064669236541


 84%|████████▍ | 194/230 [16:34<03:02,  5.06s/it]

Loss: 0.005239312537014484


 85%|████████▍ | 195/230 [16:40<02:59,  5.14s/it]

Loss: 0.00557498075067997


 85%|████████▌ | 196/230 [16:45<02:51,  5.06s/it]

Loss: 0.014193475246429443


 86%|████████▌ | 197/230 [16:49<02:44,  5.00s/it]

Loss: 0.009580643847584724


 86%|████████▌ | 198/230 [16:54<02:38,  4.96s/it]

Loss: 0.012881381437182426


 87%|████████▋ | 199/230 [16:59<02:32,  4.91s/it]

Loss: 0.01258877757936716


 87%|████████▋ | 200/230 [17:04<02:26,  4.89s/it]

Loss: 0.006306499242782593


 87%|████████▋ | 201/230 [17:09<02:20,  4.85s/it]

Loss: 0.0062106018885970116


 88%|████████▊ | 202/230 [17:13<02:15,  4.83s/it]

Loss: 0.0027025777380913496


 88%|████████▊ | 203/230 [17:18<02:09,  4.80s/it]

Loss: 0.0037110676057636738


 89%|████████▊ | 204/230 [17:23<02:04,  4.79s/it]

Loss: 0.004290835931897163


 89%|████████▉ | 205/230 [17:28<01:59,  4.78s/it]

Loss: 0.005928773898631334


 90%|████████▉ | 206/230 [17:32<01:54,  4.78s/it]

Loss: 0.004903488792479038


 90%|█████████ | 207/230 [17:38<01:52,  4.90s/it]

Loss: 0.0037822772283107042


 90%|█████████ | 208/230 [17:43<01:48,  4.91s/it]

Loss: 0.005620572715997696


 91%|█████████ | 209/230 [17:48<01:43,  4.93s/it]

Loss: 0.0760098546743393


 91%|█████████▏| 210/230 [17:53<01:38,  4.93s/it]

Loss: 0.002426384249702096


 92%|█████████▏| 211/230 [17:57<01:32,  4.89s/it]

Loss: 0.012892691418528557


 92%|█████████▏| 212/230 [18:02<01:27,  4.87s/it]

Loss: 0.006720408797264099


 93%|█████████▎| 213/230 [18:07<01:22,  4.84s/it]

Loss: 0.004625126253813505


 93%|█████████▎| 214/230 [18:12<01:17,  4.83s/it]

Loss: 0.006748219020664692


 93%|█████████▎| 215/230 [18:16<01:12,  4.81s/it]

Loss: 0.008735798299312592


 94%|█████████▍| 216/230 [18:21<01:07,  4.81s/it]

Loss: 0.008902722038328648


 94%|█████████▍| 217/230 [18:26<01:02,  4.81s/it]

Loss: 0.10975009202957153


 95%|█████████▍| 218/230 [18:31<00:57,  4.80s/it]

Loss: 0.0022403185721486807


 95%|█████████▌| 219/230 [18:36<00:53,  4.82s/it]

Loss: 0.0031021921895444393


 96%|█████████▌| 220/230 [18:41<00:48,  4.83s/it]

Loss: 0.010951945558190346


 96%|█████████▌| 221/230 [18:46<00:43,  4.86s/it]

Loss: 0.00686219334602356


 97%|█████████▋| 222/230 [18:51<00:39,  4.89s/it]

Loss: 0.006291097030043602


 97%|█████████▋| 223/230 [18:55<00:34,  4.87s/it]

Loss: 0.0033871554769575596


 97%|█████████▋| 224/230 [19:00<00:29,  4.86s/it]

Loss: 0.0027361190877854824


 98%|█████████▊| 225/230 [19:05<00:24,  4.85s/it]

Loss: 0.0060762097127735615


 98%|█████████▊| 226/230 [19:10<00:19,  4.86s/it]

Loss: 0.2069435566663742


 99%|█████████▊| 227/230 [19:15<00:14,  4.83s/it]

Loss: 0.006868762895464897


 99%|█████████▉| 228/230 [19:19<00:09,  4.83s/it]

Loss: 0.007267088163644075


100%|█████████▉| 229/230 [19:24<00:04,  4.88s/it]

Loss: 0.011379867792129517


100%|██████████| 230/230 [19:29<00:00,  5.09s/it]


Loss: 0.013623826205730438
Epoch 2


  0%|          | 1/230 [00:05<22:49,  5.98s/it]

Loss: 0.009058128111064434


  1%|          | 2/230 [00:11<22:16,  5.86s/it]

Loss: 0.19120557606220245


  1%|▏         | 3/230 [00:17<22:22,  5.91s/it]

Loss: 0.00769362086430192


  2%|▏         | 4/230 [00:23<21:40,  5.75s/it]

Loss: 0.009622802026569843


  2%|▏         | 5/230 [00:28<20:43,  5.53s/it]

Loss: 0.0073193153366446495


  3%|▎         | 6/230 [00:33<19:55,  5.34s/it]

Loss: 0.010359854437410831


  3%|▎         | 7/230 [00:37<18:56,  5.10s/it]

Loss: 0.005684043280780315


  3%|▎         | 8/230 [00:42<18:15,  4.93s/it]

Loss: 0.0057796756736934185


  4%|▍         | 9/230 [00:47<17:46,  4.83s/it]

Loss: 0.003824045183137059


  4%|▍         | 10/230 [00:51<17:26,  4.76s/it]

Loss: 0.003544195555150509


  5%|▍         | 11/230 [00:56<17:13,  4.72s/it]

Loss: 0.007594107184559107


  5%|▌         | 12/230 [01:00<16:58,  4.67s/it]

Loss: 0.0010743208695203066


  6%|▌         | 13/230 [01:05<17:11,  4.76s/it]

Loss: 0.10352854430675507


  6%|▌         | 14/230 [01:10<17:23,  4.83s/it]

Loss: 0.002183457836508751


  7%|▋         | 15/230 [01:15<17:13,  4.81s/it]

Loss: 0.0015240703942254186


  7%|▋         | 16/230 [01:20<17:13,  4.83s/it]

Loss: 0.007471861317753792


  7%|▋         | 17/230 [01:25<17:08,  4.83s/it]

Loss: 0.002176176058128476


  8%|▊         | 18/230 [01:30<16:59,  4.81s/it]

Loss: 0.007901954464614391


  8%|▊         | 19/230 [01:34<16:52,  4.80s/it]

Loss: 0.009976054541766644


  9%|▊         | 20/230 [01:39<16:38,  4.75s/it]

Loss: 0.03246846795082092


  9%|▉         | 21/230 [01:44<16:22,  4.70s/it]

Loss: 0.007538753096014261


 10%|▉         | 22/230 [01:48<16:10,  4.66s/it]

Loss: 0.10809843987226486


 10%|█         | 23/230 [01:53<16:06,  4.67s/it]

Loss: 0.005164488684386015


 10%|█         | 24/230 [01:58<16:01,  4.67s/it]

Loss: 0.014471461996436119


 11%|█         | 25/230 [02:02<16:03,  4.70s/it]

Loss: 0.004314410034567118


 11%|█▏        | 26/230 [02:07<16:14,  4.78s/it]

Loss: 0.10593283176422119


 12%|█▏        | 27/230 [02:12<16:09,  4.78s/it]

Loss: 0.013094482012093067


 12%|█▏        | 28/230 [02:17<16:02,  4.77s/it]

Loss: 0.003687730059027672


 13%|█▎        | 29/230 [02:22<15:56,  4.76s/it]

Loss: 0.004510871600359678


 13%|█▎        | 30/230 [02:26<15:46,  4.73s/it]

Loss: 0.08524191379547119


 13%|█▎        | 31/230 [02:31<15:40,  4.73s/it]

Loss: 0.007356347981840372


 14%|█▍        | 32/230 [02:36<15:37,  4.73s/it]

Loss: 0.19104599952697754


 14%|█▍        | 33/230 [02:40<15:33,  4.74s/it]

Loss: 0.11856768280267715


 15%|█▍        | 34/230 [02:45<15:26,  4.73s/it]

Loss: 0.031877391040325165


 15%|█▌        | 35/230 [02:50<15:22,  4.73s/it]

Loss: 0.01206735335290432


 16%|█▌        | 36/230 [02:54<15:11,  4.70s/it]

Loss: 0.008867253549396992


 16%|█▌        | 37/230 [02:59<15:06,  4.70s/it]

Loss: 0.007536145392805338


 17%|█▋        | 38/230 [03:04<15:18,  4.79s/it]

Loss: 0.007927577011287212


 17%|█▋        | 39/230 [03:09<15:21,  4.82s/it]

Loss: 0.0018053408712148666


 17%|█▋        | 40/230 [03:14<15:24,  4.87s/it]

Loss: 0.09572725743055344


 18%|█▊        | 41/230 [03:19<15:19,  4.87s/it]

Loss: 0.007381684146821499


 18%|█▊        | 42/230 [03:24<15:09,  4.84s/it]

Loss: 0.0025145718827843666


 19%|█▊        | 43/230 [03:28<15:01,  4.82s/it]

Loss: 0.006308418232947588


 19%|█▉        | 44/230 [03:33<14:55,  4.81s/it]

Loss: 0.0045971316285431385


 20%|█▉        | 45/230 [03:38<14:52,  4.82s/it]

Loss: 0.011824648827314377


 20%|██        | 46/230 [03:43<14:39,  4.78s/it]

Loss: 0.013966252095997334


 20%|██        | 47/230 [03:48<14:33,  4.77s/it]

Loss: 0.004972473718225956


 21%|██        | 48/230 [03:52<14:28,  4.77s/it]

Loss: 0.011986213736236095


 21%|██▏       | 49/230 [03:57<14:27,  4.79s/it]

Loss: 0.003854318056255579


 22%|██▏       | 50/230 [04:02<14:18,  4.77s/it]

Loss: 0.0033781607635319233


 22%|██▏       | 51/230 [04:07<14:13,  4.77s/it]

Loss: 0.002446783473715186


 23%|██▎       | 52/230 [04:11<14:04,  4.75s/it]

Loss: 0.018897421658039093


 23%|██▎       | 53/230 [04:16<14:05,  4.78s/it]

Loss: 0.006506569683551788


 23%|██▎       | 54/230 [04:21<13:59,  4.77s/it]

Loss: 0.0132144745439291


 24%|██▍       | 55/230 [04:26<13:52,  4.76s/it]

Loss: 0.003762679174542427


 24%|██▍       | 56/230 [04:31<13:53,  4.79s/it]

Loss: 0.0036219218745827675


 25%|██▍       | 57/230 [04:35<13:44,  4.77s/it]

Loss: 0.013539946638047695


 25%|██▌       | 58/230 [04:40<13:29,  4.71s/it]

Loss: 0.007969319820404053


 26%|██▌       | 59/230 [04:44<13:19,  4.68s/it]

Loss: 0.004661466460675001


 26%|██▌       | 60/230 [04:49<13:09,  4.64s/it]

Loss: 0.0013529238058254123


 27%|██▋       | 61/230 [04:54<13:02,  4.63s/it]

Loss: 0.0022180622909218073


 27%|██▋       | 62/230 [04:58<12:56,  4.62s/it]

Loss: 0.03984241187572479


 27%|██▋       | 63/230 [05:03<13:00,  4.67s/it]

Loss: 0.1339505910873413


 28%|██▊       | 64/230 [05:08<13:04,  4.73s/it]

Loss: 0.002601561602205038


 28%|██▊       | 65/230 [05:13<13:00,  4.73s/it]

Loss: 0.11102722585201263


 29%|██▊       | 66/230 [05:17<12:57,  4.74s/it]

Loss: 0.002255205065011978


 29%|██▉       | 67/230 [05:22<12:53,  4.75s/it]

Loss: 0.008584532886743546


 30%|██▉       | 68/230 [05:27<12:51,  4.76s/it]

Loss: 0.003762798151001334


 30%|███       | 69/230 [05:32<12:45,  4.76s/it]

Loss: 0.04747062921524048


 30%|███       | 70/230 [05:36<12:36,  4.73s/it]

Loss: 0.009949754923582077


 31%|███       | 71/230 [05:41<12:31,  4.73s/it]

Loss: 0.19050487875938416


 31%|███▏      | 72/230 [05:46<12:20,  4.69s/it]

Loss: 0.01230686530470848


 32%|███▏      | 73/230 [05:50<12:15,  4.69s/it]

Loss: 0.04431258514523506


 32%|███▏      | 74/230 [05:55<12:09,  4.67s/it]

Loss: 0.02422497794032097


 33%|███▎      | 75/230 [06:00<12:09,  4.70s/it]

Loss: 0.005815112963318825


 33%|███▎      | 76/230 [06:05<12:16,  4.78s/it]

Loss: 0.0022354531101882458


 33%|███▎      | 77/230 [06:10<12:16,  4.82s/it]

Loss: 0.012883649207651615


 34%|███▍      | 78/230 [06:14<12:13,  4.83s/it]

Loss: 0.006455538794398308


 34%|███▍      | 79/230 [06:19<12:05,  4.81s/it]

Loss: 0.010715300217270851


 35%|███▍      | 80/230 [06:24<12:03,  4.83s/it]

Loss: 0.15246638655662537


 35%|███▌      | 81/230 [06:29<11:54,  4.80s/it]

Loss: 0.0068613034673035145


 36%|███▌      | 82/230 [06:34<11:48,  4.79s/it]

Loss: 0.003480055835098028


 36%|███▌      | 83/230 [06:38<11:39,  4.76s/it]

Loss: 0.04582769423723221


 37%|███▋      | 84/230 [06:43<11:27,  4.71s/it]

Loss: 0.006484415847808123


 37%|███▋      | 85/230 [06:48<11:26,  4.74s/it]

Loss: 0.11028845608234406


 37%|███▋      | 86/230 [06:52<11:25,  4.76s/it]

Loss: 0.005917342379689217


 38%|███▊      | 87/230 [06:57<11:20,  4.76s/it]

Loss: 0.02150993049144745


 38%|███▊      | 88/230 [07:02<11:19,  4.78s/it]

Loss: 0.01847163960337639


 39%|███▊      | 89/230 [07:07<11:21,  4.84s/it]

Loss: 0.0071667954325675964


 39%|███▉      | 90/230 [07:12<11:14,  4.82s/it]

Loss: 0.00879999902099371


 40%|███▉      | 91/230 [07:17<11:13,  4.84s/it]

Loss: 0.0033799828961491585


 40%|████      | 92/230 [07:22<11:09,  4.85s/it]

Loss: 0.00849759578704834


 40%|████      | 93/230 [07:26<11:06,  4.87s/it]

Loss: 0.005629968363791704


 41%|████      | 94/230 [07:31<11:02,  4.87s/it]

Loss: 0.018376309424638748


 41%|████▏     | 95/230 [07:36<10:55,  4.86s/it]

Loss: 0.01997070014476776


 42%|████▏     | 96/230 [07:41<10:45,  4.82s/it]

Loss: 0.004138527903705835


 42%|████▏     | 97/230 [07:46<10:40,  4.82s/it]

Loss: 0.004036020953208208


 43%|████▎     | 98/230 [07:51<10:38,  4.83s/it]

Loss: 0.005725754424929619


 43%|████▎     | 99/230 [07:55<10:30,  4.81s/it]

Loss: 0.0037005716003477573


 43%|████▎     | 100/230 [08:00<10:21,  4.78s/it]

Loss: 0.007702622562646866


 44%|████▍     | 101/230 [08:05<10:23,  4.83s/it]

Loss: 0.004147795028984547


 44%|████▍     | 102/230 [08:11<10:59,  5.15s/it]

Loss: 0.00911529641598463


 45%|████▍     | 103/230 [08:16<11:03,  5.23s/it]

Loss: 0.004361461848020554


 45%|████▌     | 104/230 [08:21<10:42,  5.10s/it]

Loss: 0.040371835231781006


 46%|████▌     | 105/230 [08:26<10:39,  5.11s/it]

Loss: 0.00163015176076442


 46%|████▌     | 106/230 [08:31<10:20,  5.01s/it]

Loss: 0.0011554594384506345


 47%|████▋     | 107/230 [08:36<10:10,  4.96s/it]

Loss: 0.011446833610534668


 47%|████▋     | 108/230 [08:41<10:00,  4.92s/it]

Loss: 0.1455995738506317


 47%|████▋     | 109/230 [08:45<09:48,  4.86s/it]

Loss: 0.004817437380552292


 48%|████▊     | 110/230 [08:50<09:39,  4.83s/it]

Loss: 0.07625937461853027


 48%|████▊     | 111/230 [08:55<09:35,  4.83s/it]

Loss: 0.013342955149710178


 49%|████▊     | 112/230 [09:00<09:28,  4.82s/it]

Loss: 0.012200423516333103


 49%|████▉     | 113/230 [09:05<09:23,  4.82s/it]

Loss: 0.006351611576974392


 50%|████▉     | 114/230 [09:09<09:18,  4.81s/it]

Loss: 0.0053666457533836365


 50%|█████     | 115/230 [09:14<09:15,  4.83s/it]

Loss: 0.13548870384693146


 50%|█████     | 116/230 [09:19<09:08,  4.81s/it]

Loss: 0.001134523656219244


 51%|█████     | 117/230 [09:24<09:02,  4.80s/it]

Loss: 0.0018324974225834012


 51%|█████▏    | 118/230 [09:29<08:56,  4.79s/it]

Loss: 0.000935693271458149


 52%|█████▏    | 119/230 [09:33<08:54,  4.81s/it]

Loss: 0.005625010468065739


 52%|█████▏    | 120/230 [09:38<08:42,  4.75s/it]

Loss: 0.017661243677139282


 53%|█████▎    | 121/230 [09:43<08:34,  4.72s/it]

Loss: 0.0009599893819540739


 53%|█████▎    | 122/230 [09:47<08:28,  4.70s/it]

Loss: 0.004832799080759287


 53%|█████▎    | 123/230 [09:52<08:24,  4.71s/it]

Loss: 0.0016676423838362098


 54%|█████▍    | 124/230 [09:57<08:20,  4.72s/it]

Loss: 0.008302385918796062


 54%|█████▍    | 125/230 [10:02<08:18,  4.74s/it]

Loss: 0.004651190713047981


 55%|█████▍    | 126/230 [10:07<08:19,  4.80s/it]

Loss: 0.0170099139213562


 55%|█████▌    | 127/230 [10:11<08:18,  4.84s/it]

Loss: 0.00961984135210514


 56%|█████▌    | 128/230 [10:16<08:10,  4.81s/it]

Loss: 0.0013887591194361448


 56%|█████▌    | 129/230 [10:21<08:04,  4.80s/it]

Loss: 0.003817458637058735


 57%|█████▋    | 130/230 [10:26<07:56,  4.77s/it]

Loss: 0.0025185681879520416


 57%|█████▋    | 131/230 [10:30<07:49,  4.74s/it]

Loss: 0.003756527788937092


 57%|█████▋    | 132/230 [10:35<07:46,  4.77s/it]

Loss: 0.09425819665193558


 58%|█████▊    | 133/230 [10:40<07:42,  4.77s/it]

Loss: 0.005010658875107765


 58%|█████▊    | 134/230 [10:45<07:38,  4.78s/it]

Loss: 0.004790476989001036


 59%|█████▊    | 135/230 [10:50<07:36,  4.80s/it]

Loss: 0.006753037683665752


 59%|█████▉    | 136/230 [10:54<07:27,  4.76s/it]

Loss: 0.039225950837135315


 60%|█████▉    | 137/230 [10:59<07:21,  4.75s/it]

Loss: 0.009438758715987206


 60%|██████    | 138/230 [11:04<07:19,  4.78s/it]

Loss: 0.01840128004550934


 60%|██████    | 139/230 [11:09<07:16,  4.80s/it]

Loss: 0.012210364453494549


 61%|██████    | 140/230 [11:14<07:12,  4.81s/it]

Loss: 0.0025989487767219543


 61%|██████▏   | 141/230 [11:18<07:08,  4.81s/it]

Loss: 0.001247764565050602


 62%|██████▏   | 142/230 [11:23<07:01,  4.79s/it]

Loss: 0.005736350081861019


 62%|██████▏   | 143/230 [11:28<06:55,  4.77s/it]

Loss: 0.009980321861803532


 63%|██████▎   | 144/230 [11:33<06:52,  4.80s/it]

Loss: 0.01351040881127119


 63%|██████▎   | 145/230 [11:37<06:46,  4.79s/it]

Loss: 0.0037372817751020193


 63%|██████▎   | 146/230 [11:42<06:40,  4.77s/it]

Loss: 0.009718970395624638


 64%|██████▍   | 147/230 [11:47<06:32,  4.73s/it]

Loss: 0.008145567029714584


 64%|██████▍   | 148/230 [11:51<06:26,  4.71s/it]

Loss: 0.007483294233679771


 65%|██████▍   | 149/230 [11:56<06:20,  4.70s/it]

Loss: 0.0029218443669378757


 65%|██████▌   | 150/230 [12:01<06:15,  4.70s/it]

Loss: 0.0009089043014682829


 66%|██████▌   | 151/230 [12:06<06:20,  4.81s/it]

Loss: 0.0018958670552819967


 66%|██████▌   | 152/230 [12:11<06:12,  4.78s/it]

Loss: 0.00427437387406826


 67%|██████▋   | 153/230 [12:15<06:07,  4.77s/it]

Loss: 0.13141930103302002


 67%|██████▋   | 154/230 [12:20<06:02,  4.77s/it]

Loss: 0.1270931363105774


 67%|██████▋   | 155/230 [12:25<06:00,  4.80s/it]

Loss: 0.014656479470431805


 68%|██████▊   | 156/230 [12:30<05:55,  4.80s/it]

Loss: 0.01636425405740738


 68%|██████▊   | 157/230 [12:35<05:50,  4.81s/it]

Loss: 0.10843300074338913


 69%|██████▊   | 158/230 [12:39<05:44,  4.79s/it]

Loss: 0.11407525837421417


 69%|██████▉   | 159/230 [12:44<05:39,  4.78s/it]

Loss: 0.013225485570728779


 70%|██████▉   | 160/230 [12:49<05:32,  4.74s/it]

Loss: 0.011850612238049507


 70%|███████   | 161/230 [12:54<05:27,  4.75s/it]

Loss: 0.009861745871603489


 70%|███████   | 162/230 [12:58<05:23,  4.75s/it]

Loss: 0.11829205602407455


 71%|███████   | 163/230 [13:03<05:19,  4.76s/it]

Loss: 0.1089899092912674


 71%|███████▏  | 164/230 [13:08<05:17,  4.81s/it]

Loss: 0.0068771070800721645


 72%|███████▏  | 165/230 [13:13<05:16,  4.87s/it]

Loss: 0.07930531352758408


 72%|███████▏  | 166/230 [13:18<05:14,  4.92s/it]

Loss: 0.06077133119106293


 73%|███████▎  | 167/230 [13:23<05:09,  4.91s/it]

Loss: 0.1268991082906723


 73%|███████▎  | 168/230 [13:28<05:01,  4.86s/it]

Loss: 0.01256626844406128


 73%|███████▎  | 169/230 [13:33<04:55,  4.84s/it]

Loss: 0.11600296199321747


 74%|███████▍  | 170/230 [13:37<04:48,  4.80s/it]

Loss: 0.02248968929052353


 74%|███████▍  | 171/230 [13:42<04:43,  4.80s/it]

Loss: 0.007135816849768162


 75%|███████▍  | 172/230 [13:47<04:37,  4.79s/it]

Loss: 0.012492986395955086


 75%|███████▌  | 173/230 [13:52<04:33,  4.80s/it]

Loss: 0.026761071756482124


 76%|███████▌  | 174/230 [13:56<04:28,  4.80s/it]

Loss: 0.009674834087491035


 76%|███████▌  | 175/230 [14:01<04:23,  4.79s/it]

Loss: 0.030107082799077034


 77%|███████▋  | 176/230 [14:06<04:18,  4.79s/it]

Loss: 0.008202740922570229


 77%|███████▋  | 177/230 [14:11<04:13,  4.79s/it]

Loss: 0.013256590813398361


 77%|███████▋  | 178/230 [14:15<04:08,  4.77s/it]

Loss: 0.007832176983356476


 78%|███████▊  | 179/230 [14:20<04:04,  4.79s/it]

Loss: 0.013284940272569656


 78%|███████▊  | 180/230 [14:25<03:57,  4.75s/it]

Loss: 0.004010910168290138


 79%|███████▊  | 181/230 [14:30<03:54,  4.78s/it]

Loss: 0.010792817920446396


 79%|███████▉  | 182/230 [14:35<03:48,  4.76s/it]

Loss: 0.01184863317757845


 80%|███████▉  | 183/230 [14:39<03:43,  4.75s/it]

Loss: 0.0061880433931946754


 80%|████████  | 184/230 [14:44<03:37,  4.72s/it]

Loss: 0.002434736816212535


 80%|████████  | 185/230 [14:49<03:31,  4.69s/it]

Loss: 0.010909002274274826


 81%|████████  | 186/230 [14:53<03:26,  4.70s/it]

Loss: 0.008557968772947788


 81%|████████▏ | 187/230 [14:58<03:22,  4.70s/it]

Loss: 0.004153387621045113


 82%|████████▏ | 188/230 [15:03<03:18,  4.71s/it]

Loss: 0.1114288941025734


 82%|████████▏ | 189/230 [15:08<03:14,  4.75s/it]

Loss: 0.005367990583181381


 83%|████████▎ | 190/230 [15:12<03:11,  4.79s/it]

Loss: 0.007924512960016727


 83%|████████▎ | 191/230 [15:17<03:06,  4.79s/it]

Loss: 0.1619742512702942


 83%|████████▎ | 192/230 [15:22<03:05,  4.87s/it]

Loss: 0.009140433743596077


 84%|████████▍ | 193/230 [15:27<02:59,  4.85s/it]

Loss: 0.008790189400315285


 84%|████████▍ | 194/230 [15:32<02:53,  4.83s/it]

Loss: 0.012403697706758976


 85%|████████▍ | 195/230 [15:37<02:47,  4.80s/it]

Loss: 0.007421704009175301


 85%|████████▌ | 196/230 [15:41<02:43,  4.79s/it]

Loss: 0.0019481980707496405


 86%|████████▌ | 197/230 [15:46<02:37,  4.78s/it]

Loss: 0.0043915738351643085


 86%|████████▌ | 198/230 [15:51<02:32,  4.78s/it]

Loss: 0.0037424908950924873


 87%|████████▋ | 199/230 [15:56<02:28,  4.80s/it]

Loss: 0.015684565529227257


 87%|████████▋ | 200/230 [16:00<02:23,  4.77s/it]

Loss: 0.005819587968289852


 87%|████████▋ | 201/230 [16:06<02:21,  4.87s/it]

Loss: 0.0181227158755064


 88%|████████▊ | 202/230 [16:10<02:16,  4.86s/it]

Loss: 0.11983829736709595


 88%|████████▊ | 203/230 [16:15<02:10,  4.83s/it]

Loss: 0.0735054537653923


 89%|████████▊ | 204/230 [16:20<02:05,  4.81s/it]

Loss: 0.08405862748622894


 89%|████████▉ | 205/230 [16:25<01:59,  4.79s/it]

Loss: 0.020018568262457848


 90%|████████▉ | 206/230 [16:29<01:54,  4.78s/it]

Loss: 0.00463036959990859


 90%|█████████ | 207/230 [16:34<01:50,  4.81s/it]

Loss: 0.13619373738765717


 90%|█████████ | 208/230 [16:39<01:45,  4.79s/it]

Loss: 0.0051468536257743835


 91%|█████████ | 209/230 [16:44<01:40,  4.77s/it]

Loss: 0.10645143687725067


 91%|█████████▏| 210/230 [16:49<01:35,  4.77s/it]

Loss: 0.1396220326423645


 92%|█████████▏| 211/230 [16:53<01:30,  4.76s/it]

Loss: 0.11696331202983856


 92%|█████████▏| 212/230 [16:58<01:25,  4.74s/it]

Loss: 0.005215478595346212


 93%|█████████▎| 213/230 [17:03<01:21,  4.77s/it]

Loss: 0.00621662987396121


 93%|█████████▎| 214/230 [17:08<01:17,  4.84s/it]

Loss: 0.026420725509524345


 93%|█████████▎| 215/230 [17:13<01:12,  4.83s/it]

Loss: 0.0192156583070755


 94%|█████████▍| 216/230 [17:17<01:07,  4.84s/it]

Loss: 0.009466253221035004


 94%|█████████▍| 217/230 [17:22<01:02,  4.84s/it]

Loss: 0.009657694026827812


 95%|█████████▍| 218/230 [17:27<00:57,  4.82s/it]

Loss: 0.005146353505551815


 95%|█████████▌| 219/230 [17:32<00:53,  4.82s/it]

Loss: 0.010102024301886559


 96%|█████████▌| 220/230 [17:37<00:47,  4.78s/it]

Loss: 0.061806805431842804


 96%|█████████▌| 221/230 [17:41<00:42,  4.77s/it]

Loss: 0.0028145024552941322


 97%|█████████▋| 222/230 [17:46<00:38,  4.78s/it]

Loss: 0.003513719653710723


 97%|█████████▋| 223/230 [17:51<00:33,  4.80s/it]

Loss: 0.00277684791944921


 97%|█████████▋| 224/230 [17:56<00:28,  4.82s/it]

Loss: 0.06216578930616379


 98%|█████████▊| 225/230 [18:01<00:24,  4.81s/it]

Loss: 0.002330239862203598


 98%|█████████▊| 226/230 [18:06<00:19,  4.94s/it]

Loss: 0.005225169938057661


 99%|█████████▊| 227/230 [18:11<00:14,  4.94s/it]

Loss: 0.1032160148024559


 99%|█████████▉| 228/230 [18:16<00:09,  4.91s/it]

Loss: 0.017719555646181107


100%|█████████▉| 229/230 [18:21<00:04,  4.90s/it]

Loss: 0.002082471502944827


100%|██████████| 230/230 [18:25<00:00,  4.81s/it]

Loss: 0.0029569731559604406

Epoch 3


  0%|          | 1/230 [00:05<19:14,  5.04s/it]

Loss: 0.014805631712079048


  1%|          | 2/230 [00:09<18:55,  4.98s/it]

Loss: 0.002883958863094449


  1%|▏         | 3/230 [00:15<19:04,  5.04s/it]

Loss: 0.010790578089654446


  2%|▏         | 4/230 [00:20<18:54,  5.02s/it]

Loss: 0.00610275287181139


  2%|▏         | 5/230 [00:24<18:40,  4.98s/it]

Loss: 0.00535280816257


  3%|▎         | 6/230 [00:29<18:27,  4.94s/it]

Loss: 0.0030138674192130566


  3%|▎         | 7/230 [00:34<18:12,  4.90s/it]

Loss: 0.009056253358721733


  3%|▎         | 8/230 [00:39<18:01,  4.87s/it]

Loss: 0.005293275695294142


  4%|▍         | 9/230 [00:44<17:53,  4.86s/it]

Loss: 0.006027378607541323


  4%|▍         | 10/230 [00:49<17:41,  4.83s/it]

Loss: 0.005227447487413883


  5%|▍         | 11/230 [00:53<17:42,  4.85s/it]

Loss: 0.0025299328844994307


  5%|▌         | 12/230 [00:58<17:34,  4.84s/it]

Loss: 0.0014291012194007635


  6%|▌         | 13/230 [01:03<17:31,  4.85s/it]

Loss: 0.005195450969040394


  6%|▌         | 14/230 [01:08<17:32,  4.87s/it]

Loss: 0.09491105377674103


  7%|▋         | 15/230 [01:13<17:29,  4.88s/it]

Loss: 0.008336126804351807


  7%|▋         | 16/230 [01:18<17:29,  4.90s/it]

Loss: 0.004799279384315014


  7%|▋         | 17/230 [01:23<17:24,  4.90s/it]

Loss: 0.013619678094983101


  8%|▊         | 18/230 [01:28<17:18,  4.90s/it]

Loss: 0.0030507706105709076


  8%|▊         | 19/230 [01:33<17:07,  4.87s/it]

Loss: 0.002119972137734294


  9%|▊         | 20/230 [01:37<16:59,  4.86s/it]

Loss: 0.12178195267915726


  9%|▉         | 21/230 [01:42<16:55,  4.86s/it]

Loss: 0.003132513025775552


 10%|▉         | 22/230 [01:47<16:49,  4.85s/it]

Loss: 0.0009641087963245809


 10%|█         | 23/230 [01:52<16:42,  4.84s/it]

Loss: 0.0019773589447140694


 10%|█         | 24/230 [01:57<16:37,  4.84s/it]

Loss: 0.0023009898141026497


 11%|█         | 25/230 [02:02<16:38,  4.87s/it]

Loss: 0.004857172258198261


 11%|█▏        | 26/230 [02:06<16:30,  4.85s/it]

Loss: 0.11805848032236099


 12%|█▏        | 27/230 [02:11<16:22,  4.84s/it]

Loss: 0.06537687033414841


 12%|█▏        | 28/230 [02:16<16:10,  4.80s/it]

Loss: 0.0012228405103087425


 13%|█▎        | 29/230 [02:21<15:58,  4.77s/it]

Loss: 0.1368897557258606


 13%|█▎        | 30/230 [02:25<15:53,  4.77s/it]

Loss: 0.00608432712033391


 13%|█▎        | 31/230 [02:30<15:52,  4.79s/it]

Loss: 0.044436100870370865


 14%|█▍        | 32/230 [02:35<15:46,  4.78s/it]

Loss: 0.009447532705962658


 14%|█▍        | 33/230 [02:40<16:10,  4.93s/it]

Loss: 0.008101562038064003


 15%|█▍        | 34/230 [02:45<16:01,  4.91s/it]

Loss: 0.008052084594964981


 15%|█▌        | 35/230 [02:50<15:47,  4.86s/it]

Loss: 0.006656313315033913


 16%|█▌        | 36/230 [02:55<15:42,  4.86s/it]

Loss: 0.005710710771381855


 16%|█▌        | 37/230 [03:00<15:40,  4.87s/it]

Loss: 0.012913505546748638


 17%|█▋        | 38/230 [03:04<15:32,  4.85s/it]

Loss: 0.006443826016038656


 17%|█▋        | 39/230 [03:09<15:27,  4.86s/it]

Loss: 0.0032065273262560368


 17%|█▋        | 40/230 [03:14<15:19,  4.84s/it]

Loss: 0.0013346185442060232


 18%|█▊        | 41/230 [03:19<15:08,  4.81s/it]

Loss: 0.010277374647557735


 18%|█▊        | 42/230 [03:24<14:57,  4.78s/it]

Loss: 0.004607944283634424


 19%|█▊        | 43/230 [03:28<14:49,  4.76s/it]

Loss: 0.005170934367924929


 19%|█▉        | 44/230 [03:33<14:40,  4.73s/it]

Loss: 0.0022206688299775124


 20%|█▉        | 45/230 [03:38<14:50,  4.81s/it]

Loss: 0.0018841337878257036


 20%|██        | 46/230 [03:43<14:51,  4.85s/it]

Loss: 0.0024561923928558826


 20%|██        | 47/230 [03:48<14:49,  4.86s/it]

Loss: 0.0025798745919018984


 21%|██        | 48/230 [03:53<14:42,  4.85s/it]

Loss: 0.001584000769071281


 21%|██▏       | 49/230 [03:57<14:34,  4.83s/it]

Loss: 0.11981810629367828


 22%|██▏       | 50/230 [04:02<14:30,  4.84s/it]

Loss: 0.004015813581645489


 22%|██▏       | 51/230 [04:07<14:27,  4.85s/it]

Loss: 0.00395422475412488


 23%|██▎       | 52/230 [04:12<14:20,  4.83s/it]

Loss: 0.015312869101762772


 23%|██▎       | 53/230 [04:17<14:15,  4.83s/it]

Loss: 0.0027309060096740723


 23%|██▎       | 54/230 [04:21<14:02,  4.79s/it]

Loss: 0.0030617080628871918


 24%|██▍       | 55/230 [04:26<13:56,  4.78s/it]

Loss: 0.0023556712549179792


 24%|██▍       | 56/230 [04:31<13:51,  4.78s/it]

Loss: 0.003229574067518115


 25%|██▍       | 57/230 [04:36<13:51,  4.81s/it]

Loss: 0.004347432404756546


 25%|██▌       | 58/230 [04:41<13:57,  4.87s/it]

Loss: 0.0017403950914740562


 26%|██▌       | 59/230 [04:46<13:58,  4.91s/it]

Loss: 0.00350020220503211


 26%|██▌       | 60/230 [04:51<13:53,  4.90s/it]

Loss: 0.09598739445209503


 27%|██▋       | 61/230 [04:56<13:48,  4.90s/it]

Loss: 0.0030715991742908955


 27%|██▋       | 62/230 [05:01<13:40,  4.89s/it]

Loss: 0.0033706706017255783


 27%|██▋       | 63/230 [05:05<13:33,  4.87s/it]

Loss: 0.03106876090168953


 28%|██▊       | 64/230 [05:10<13:27,  4.86s/it]

Loss: 0.0023517468944191933


 28%|██▊       | 65/230 [05:15<13:30,  4.91s/it]

Loss: 0.01321873813867569


 29%|██▊       | 66/230 [05:20<13:22,  4.90s/it]

Loss: 0.007170730736106634


 29%|██▉       | 67/230 [05:25<13:19,  4.91s/it]

Loss: 0.002361996565014124


 30%|██▉       | 68/230 [05:30<13:11,  4.88s/it]

Loss: 0.006828357465565205


 30%|███       | 69/230 [05:35<13:03,  4.87s/it]

Loss: 0.003142654662951827


 30%|███       | 70/230 [05:40<12:57,  4.86s/it]

Loss: 0.002245633862912655


 31%|███       | 71/230 [05:44<12:57,  4.89s/it]

Loss: 0.0036666770465672016


 31%|███▏      | 72/230 [05:49<12:51,  4.88s/it]

Loss: 0.008641969412565231


 32%|███▏      | 73/230 [05:54<12:49,  4.90s/it]

Loss: 0.0019428222440183163


 32%|███▏      | 74/230 [05:59<12:42,  4.88s/it]

Loss: 0.07384815067052841


 33%|███▎      | 75/230 [06:04<12:33,  4.86s/it]

Loss: 0.0030033160001039505


 33%|███▎      | 76/230 [06:09<12:30,  4.87s/it]

Loss: 0.002840753411874175


 33%|███▎      | 77/230 [06:14<12:28,  4.89s/it]

Loss: 0.09228140115737915


 34%|███▍      | 78/230 [06:19<12:21,  4.88s/it]

Loss: 0.001488085021264851


 34%|███▍      | 79/230 [06:24<12:17,  4.89s/it]

Loss: 0.10960832983255386


 35%|███▍      | 80/230 [06:28<12:10,  4.87s/it]

Loss: 0.00397542305290699


 35%|███▌      | 81/230 [06:33<11:59,  4.83s/it]

Loss: 0.02248333767056465


 36%|███▌      | 82/230 [06:38<12:00,  4.87s/it]

Loss: 0.0020340143237262964


 36%|███▌      | 83/230 [06:43<11:58,  4.89s/it]

Loss: 0.0016228159656748176


 37%|███▋      | 84/230 [06:48<11:49,  4.86s/it]

Loss: 0.00818008091300726


 37%|███▋      | 85/230 [06:53<11:44,  4.86s/it]

Loss: 0.010323602706193924


 37%|███▋      | 86/230 [06:57<11:37,  4.85s/it]

Loss: 0.007473800331354141


 38%|███▊      | 87/230 [07:02<11:38,  4.89s/it]

Loss: 0.005795082077383995


 38%|███▊      | 88/230 [07:07<11:30,  4.86s/it]

Loss: 0.0021477381233125925


 39%|███▊      | 89/230 [07:12<11:25,  4.86s/it]

Loss: 0.0018166244262829423


 39%|███▉      | 90/230 [07:17<11:14,  4.82s/it]

Loss: 0.0045028626918792725


 40%|███▉      | 91/230 [07:22<11:09,  4.81s/it]

Loss: 0.0021331056486815214


 40%|████      | 92/230 [07:26<11:00,  4.79s/it]

Loss: 0.0046622357331216335


 40%|████      | 93/230 [07:31<10:55,  4.78s/it]

Loss: 0.0025854730047285557


 41%|████      | 94/230 [07:36<10:49,  4.78s/it]

Loss: 0.0017195194959640503


 41%|████▏     | 95/230 [07:41<10:53,  4.84s/it]

Loss: 0.13740171492099762


 42%|████▏     | 96/230 [07:46<10:47,  4.83s/it]

Loss: 0.005077105015516281


 42%|████▏     | 97/230 [07:51<10:43,  4.84s/it]

Loss: 0.014352196827530861


 43%|████▎     | 98/230 [07:55<10:40,  4.85s/it]

Loss: 0.006920319981873035


 43%|████▎     | 99/230 [08:00<10:35,  4.85s/it]

Loss: 0.0008104791631922126


 43%|████▎     | 100/230 [08:05<10:30,  4.85s/it]

Loss: 0.001985318958759308


 44%|████▍     | 101/230 [08:10<10:24,  4.84s/it]

Loss: 0.006151542998850346


 44%|████▍     | 102/230 [08:15<10:15,  4.81s/it]

Loss: 0.0020168060436844826


 45%|████▍     | 103/230 [08:19<10:08,  4.79s/it]

Loss: 0.001719997264444828


 45%|████▌     | 104/230 [08:24<10:02,  4.78s/it]

Loss: 0.0017784142401069403


 46%|████▌     | 105/230 [08:29<09:57,  4.78s/it]

Loss: 0.0062678102403879166


 46%|████▌     | 106/230 [08:34<09:50,  4.76s/it]

Loss: 0.018514828756451607


 47%|████▋     | 107/230 [08:39<09:55,  4.84s/it]

Loss: 0.0013606268912553787


 47%|████▋     | 108/230 [08:44<09:50,  4.84s/it]

Loss: 0.09260022640228271


 47%|████▋     | 109/230 [08:48<09:49,  4.87s/it]

Loss: 0.000981333781965077


 48%|████▊     | 110/230 [08:53<09:49,  4.91s/it]

Loss: 0.014819723553955555


 48%|████▊     | 111/230 [08:58<09:43,  4.91s/it]

Loss: 0.004636553581804037


 49%|████▊     | 112/230 [09:03<09:35,  4.88s/it]

Loss: 0.002487731631845236


 49%|████▉     | 113/230 [09:08<09:29,  4.87s/it]

Loss: 0.004842318594455719


 50%|████▉     | 114/230 [09:13<09:23,  4.86s/it]

Loss: 0.004937437362968922


 50%|█████     | 115/230 [09:18<09:16,  4.84s/it]

Loss: 0.008090118877589703


 50%|█████     | 116/230 [09:22<09:09,  4.82s/it]

Loss: 0.0034031942486763


 51%|█████     | 117/230 [09:27<09:03,  4.81s/it]

Loss: 0.12532618641853333


 51%|█████▏    | 118/230 [09:32<08:57,  4.80s/it]

Loss: 0.0011722312774509192


 52%|█████▏    | 119/230 [09:37<08:53,  4.81s/it]

Loss: 0.0031620131339877844


 52%|█████▏    | 120/230 [09:42<08:59,  4.91s/it]

Loss: 0.0007280829013325274


 53%|█████▎    | 121/230 [09:47<08:53,  4.90s/it]

Loss: 0.10880107432603836


 53%|█████▎    | 122/230 [09:52<08:43,  4.85s/it]

Loss: 0.0056660473346710205


 53%|█████▎    | 123/230 [09:56<08:39,  4.86s/it]

Loss: 0.011718260124325752


 54%|█████▍    | 124/230 [10:01<08:34,  4.85s/it]

Loss: 0.009636936709284782


 54%|█████▍    | 125/230 [10:06<08:29,  4.85s/it]

Loss: 0.005828553345054388


 55%|█████▍    | 126/230 [10:11<08:22,  4.83s/it]

Loss: 0.001436089165508747


 55%|█████▌    | 127/230 [10:16<08:38,  5.03s/it]

Loss: 0.12012086808681488


 56%|█████▌    | 128/230 [10:21<08:27,  4.98s/it]

Loss: 0.0021450668573379517


 56%|█████▌    | 129/230 [10:26<08:20,  4.95s/it]

Loss: 0.010077721439301968


 57%|█████▋    | 130/230 [10:31<08:11,  4.91s/it]

Loss: 0.0026690445374697447


 57%|█████▋    | 131/230 [10:36<08:04,  4.89s/it]

Loss: 0.005763879977166653


 57%|█████▋    | 132/230 [10:41<07:57,  4.88s/it]

Loss: 0.002359755104407668


 58%|█████▊    | 133/230 [10:46<07:54,  4.89s/it]

Loss: 0.012710772454738617


 58%|█████▊    | 134/230 [10:50<07:46,  4.86s/it]

Loss: 0.004620805848389864


 59%|█████▊    | 135/230 [10:56<07:49,  4.94s/it]

Loss: 0.0017103871796280146


 59%|█████▉    | 136/230 [11:00<07:41,  4.91s/it]

Loss: 0.012099460698664188


 60%|█████▉    | 137/230 [11:05<07:35,  4.89s/it]

Loss: 0.006817724090069532


 60%|██████    | 138/230 [11:10<07:32,  4.92s/it]

Loss: 0.09718678891658783


 60%|██████    | 139/230 [11:15<07:26,  4.91s/it]

Loss: 0.005253663752228022


 61%|██████    | 140/230 [11:20<07:21,  4.90s/it]

Loss: 0.0042937505058944225


 61%|██████▏   | 141/230 [11:25<07:16,  4.91s/it]

Loss: 0.005480883177369833


 62%|██████▏   | 142/230 [11:30<07:10,  4.89s/it]

Loss: 0.09221860021352768


 62%|██████▏   | 143/230 [11:35<07:04,  4.88s/it]

Loss: 0.0026059846859425306


 63%|██████▎   | 144/230 [11:39<06:58,  4.87s/it]

Loss: 0.010887319222092628


 63%|██████▎   | 145/230 [11:44<06:53,  4.86s/it]

Loss: 0.01660519652068615


 63%|██████▎   | 146/230 [11:49<06:46,  4.84s/it]

Loss: 0.001094573293812573


 64%|██████▍   | 147/230 [11:54<06:42,  4.85s/it]

Loss: 0.0021438009571284056


 64%|██████▍   | 148/230 [11:59<06:37,  4.85s/it]

Loss: 0.06358936429023743


 65%|██████▍   | 149/230 [12:04<06:32,  4.84s/it]

Loss: 0.0013138792710378766


 65%|██████▌   | 150/230 [12:08<06:24,  4.81s/it]

Loss: 0.0019329329952597618


 66%|██████▌   | 151/230 [12:13<06:18,  4.80s/it]

Loss: 0.005262589547783136


 66%|██████▌   | 152/230 [12:18<06:14,  4.80s/it]

Loss: 0.0042419168166816235


 67%|██████▋   | 153/230 [12:23<06:06,  4.77s/it]

Loss: 0.11973674595355988


 67%|██████▋   | 154/230 [12:27<06:02,  4.77s/it]

Loss: 0.001848760643042624


 67%|██████▋   | 155/230 [12:32<05:58,  4.78s/it]

Loss: 0.003362515941262245


 68%|██████▊   | 156/230 [12:37<05:54,  4.80s/it]

Loss: 0.00749935582280159


 68%|██████▊   | 157/230 [12:42<05:57,  4.90s/it]

Loss: 0.008124830201268196


 69%|██████▊   | 158/230 [12:47<05:52,  4.89s/it]

Loss: 0.005947146564722061


 69%|██████▉   | 159/230 [12:52<05:45,  4.87s/it]

Loss: 0.002927609486505389


 70%|██████▉   | 160/230 [12:57<05:40,  4.86s/it]

Loss: 0.13523295521736145


 70%|███████   | 161/230 [13:02<05:35,  4.86s/it]

Loss: 0.015247942879796028


 70%|███████   | 162/230 [13:06<05:29,  4.85s/it]

Loss: 0.01909416727721691


 71%|███████   | 163/230 [13:11<05:22,  4.82s/it]

Loss: 0.009985353797674179


 71%|███████▏  | 164/230 [13:16<05:17,  4.81s/it]

Loss: 0.022916683927178383


 72%|███████▏  | 165/230 [13:21<05:10,  4.78s/it]

Loss: 0.04164503514766693


 72%|███████▏  | 166/230 [13:25<05:05,  4.78s/it]

Loss: 0.05238499864935875


 73%|███████▎  | 167/230 [13:30<05:00,  4.77s/it]

Loss: 0.020916691049933434


 73%|███████▎  | 168/230 [13:35<04:54,  4.75s/it]

Loss: 0.012318110093474388


 73%|███████▎  | 169/230 [13:40<04:57,  4.87s/it]

Loss: 0.014833427034318447


 74%|███████▍  | 170/230 [13:45<04:51,  4.86s/it]

Loss: 0.0038112960755825043


 74%|███████▍  | 171/230 [13:50<04:45,  4.85s/it]

Loss: 0.02552282251417637


 75%|███████▍  | 172/230 [13:54<04:40,  4.83s/it]

Loss: 0.01469683088362217


 75%|███████▌  | 173/230 [13:59<04:35,  4.84s/it]

Loss: 0.02785002812743187


 76%|███████▌  | 174/230 [14:04<04:30,  4.83s/it]

Loss: 0.009805774316191673


 76%|███████▌  | 175/230 [14:09<04:24,  4.82s/it]

Loss: 0.020349053665995598


 77%|███████▋  | 176/230 [14:14<04:19,  4.81s/it]

Loss: 0.029139291495084763


 77%|███████▋  | 177/230 [14:18<04:14,  4.81s/it]

Loss: 0.2866317331790924


 77%|███████▋  | 178/230 [14:23<04:08,  4.78s/it]

Loss: 0.03855976462364197


 78%|███████▊  | 179/230 [14:28<04:03,  4.77s/it]

Loss: 0.027980666607618332


 78%|███████▊  | 180/230 [14:33<03:58,  4.76s/it]

Loss: 0.009334463626146317


 79%|███████▊  | 181/230 [14:38<03:57,  4.84s/it]

Loss: 0.055341411381959915


 79%|███████▉  | 182/230 [14:43<03:55,  4.90s/it]

Loss: 0.03345131874084473


 80%|███████▉  | 183/230 [14:48<03:51,  4.93s/it]

Loss: 0.18603430688381195


 80%|████████  | 184/230 [14:53<03:45,  4.89s/it]

Loss: 0.029351096600294113


 80%|████████  | 185/230 [14:57<03:39,  4.87s/it]

Loss: 0.047396220266819


 81%|████████  | 186/230 [15:02<03:33,  4.86s/it]

Loss: 0.04569456726312637


 81%|████████▏ | 187/230 [15:07<03:28,  4.85s/it]

Loss: 0.01860465668141842


 82%|████████▏ | 188/230 [15:12<03:23,  4.85s/it]

Loss: 0.019693192094564438


 82%|████████▏ | 189/230 [15:17<03:19,  4.85s/it]

Loss: 0.012714061886072159


 83%|████████▎ | 190/230 [15:22<03:14,  4.85s/it]

Loss: 0.15583907067775726


 83%|████████▎ | 191/230 [15:26<03:09,  4.87s/it]

Loss: 0.012361991219222546


 83%|████████▎ | 192/230 [15:31<03:04,  4.85s/it]

Loss: 0.004843292757868767


 84%|████████▍ | 193/230 [15:36<02:59,  4.84s/it]

Loss: 0.003418978303670883


 84%|████████▍ | 194/230 [15:41<02:53,  4.83s/it]

Loss: 0.003492082701995969


 85%|████████▍ | 195/230 [15:46<02:49,  4.84s/it]

Loss: 0.006080682389438152


 85%|████████▌ | 196/230 [15:51<02:44,  4.82s/it]

Loss: 0.01677139289677143


 86%|████████▌ | 197/230 [15:55<02:39,  4.84s/it]

Loss: 0.0037611927837133408


 86%|████████▌ | 198/230 [16:00<02:35,  4.85s/it]

Loss: 0.004063160624355078


 87%|████████▋ | 199/230 [16:05<02:30,  4.85s/it]

Loss: 0.00283897016197443


 87%|████████▋ | 200/230 [16:10<02:24,  4.83s/it]

Loss: 0.005166533403098583


 87%|████████▋ | 201/230 [16:15<02:19,  4.82s/it]

Loss: 0.002758217742666602


 88%|████████▊ | 202/230 [16:19<02:13,  4.78s/it]

Loss: 0.0023251234088093042


 88%|████████▊ | 203/230 [16:24<02:10,  4.82s/it]

Loss: 0.012404541485011578


 89%|████████▊ | 204/230 [16:29<02:05,  4.81s/it]

Loss: 0.008422566577792168


 89%|████████▉ | 205/230 [16:34<01:59,  4.79s/it]

Loss: 0.01137266494333744


 90%|████████▉ | 206/230 [16:39<01:57,  4.88s/it]

Loss: 0.0019510191632434726


 90%|█████████ | 207/230 [16:44<01:52,  4.91s/it]

Loss: 0.005850940942764282


 90%|█████████ | 208/230 [16:49<01:47,  4.89s/it]

Loss: 0.005884369369596243


 91%|█████████ | 209/230 [16:54<01:41,  4.85s/it]

Loss: 0.003549158340319991


 91%|█████████▏| 210/230 [16:58<01:37,  4.86s/it]

Loss: 0.0037618589121848345


 92%|█████████▏| 211/230 [17:03<01:31,  4.84s/it]

Loss: 0.002577604725956917


 92%|█████████▏| 212/230 [17:08<01:27,  4.84s/it]

Loss: 0.002529258606955409


 93%|█████████▎| 213/230 [17:13<01:22,  4.85s/it]

Loss: 0.013789852149784565


 93%|█████████▎| 214/230 [17:18<01:17,  4.82s/it]

Loss: 0.0034414774272590876


 93%|█████████▎| 215/230 [17:23<01:12,  4.85s/it]

Loss: 0.12723128497600555


 94%|█████████▍| 216/230 [17:27<01:07,  4.80s/it]

Loss: 0.0020816947799175978


 94%|█████████▍| 217/230 [17:32<01:02,  4.77s/it]

Loss: 0.002767016878351569


 95%|█████████▍| 218/230 [17:37<00:57,  4.78s/it]

Loss: 0.0014399721985682845


 95%|█████████▌| 219/230 [17:42<00:53,  4.84s/it]

Loss: 0.005287785083055496


 96%|█████████▌| 220/230 [17:47<00:48,  4.86s/it]

Loss: 0.2573317289352417


 96%|█████████▌| 221/230 [17:52<00:43,  4.85s/it]

Loss: 0.008282925933599472


 97%|█████████▋| 222/230 [17:56<00:38,  4.84s/it]

Loss: 0.017554180696606636


 97%|█████████▋| 223/230 [18:01<00:33,  4.83s/it]

Loss: 0.043973032385110855


 97%|█████████▋| 224/230 [18:06<00:29,  4.84s/it]

Loss: 0.011503458023071289


 98%|█████████▊| 225/230 [18:11<00:24,  4.82s/it]

Loss: 0.015414247289299965


 98%|█████████▊| 226/230 [18:16<00:19,  4.81s/it]

Loss: 0.010416904464364052


 99%|█████████▊| 227/230 [18:20<00:14,  4.80s/it]

Loss: 0.007404521107673645


 99%|█████████▉| 228/230 [18:25<00:09,  4.79s/it]

Loss: 0.009900092147290707


100%|█████████▉| 229/230 [18:30<00:04,  4.77s/it]

Loss: 0.003573592286556959


100%|██████████| 230/230 [18:35<00:00,  4.85s/it]

Loss: 0.002626403234899044

('../models/reward_model/tokenizer_config.json',
 '../models/reward_model/special_tokens_map.json',
 '../models/reward_model/vocab.txt',
 '../models/reward_model/added_tokens.json')