1. When label-encoding, make it possible to check how the original labels map to the encoded integers

# 0. GPU check

* This code assumes a machine with an Nvidia GPU and CSV files in which the train / test data are already split.

In [1]:
import torch

if torch.cuda.is_available():
    device_count = torch.cuda.device_count()
    print("device_count: {}".format(device_count))
    for device_num in range(device_count):
        print("device {} capability {}".format(
            device_num,
            torch.cuda.get_device_capability(device_num)))
        print("device {} name {}".format(
            device_num, 
            torch.cuda.get_device_name(device_num)))
else:
    print("no cuda device")

device_count: 1
device 0 capability (8, 6)
device 0 name NVIDIA GeForce RTX 3080


In [2]:
if torch.cuda.is_available() :
    device = torch.device("cuda:0")
else : 
    device = torch.device("cpu")

In [3]:
from pynvml import nvmlInit, nvmlDeviceGetHandleByIndex, nvmlDeviceGetMemoryInfo

def print_gpu_utilization():
    nvmlInit()
    handle = nvmlDeviceGetHandleByIndex(0)
    info = nvmlDeviceGetMemoryInfo(handle)
    print(f"GPU memory occupied: {info.used//1024**2} MB.")

def print_summary(result):
    print(f"Time: {result.metrics['train_runtime']:.2f}")
    print(f"Samples/second: {result.metrics['train_samples_per_second']:.2f}")
    print_gpu_utilization()
    
print_gpu_utilization()

GPU memory occupied: 7505 MB.


* If GPU memory runs out during model training, run `nvidia-smi` directly in the development server console, find the PID of the process occupying the memory, and terminate it with `sudo kill -9 {pid}`.
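
The lookup step in the note above can also be scripted. The following is a minimal sketch, not part of the original notebook: it assumes `nvidia-smi` is on the PATH, and the helper names `parse_compute_apps` and `list_gpu_processes` are hypothetical. It only lists PIDs and their memory use; terminating a process stays a manual decision.

```python
import shutil
import subprocess

def parse_compute_apps(csv_text):
    """Parse 'pid, used_memory' CSV rows into (pid, used_mib) tuples."""
    rows = []
    for line in csv_text.strip().splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 2 or not parts[0].isdigit() or not parts[1].isdigit():
            continue  # skip blank or malformed rows (e.g. "[N/A]" memory values)
        rows.append((int(parts[0]), int(parts[1])))
    # Largest memory consumer first
    return sorted(rows, key=lambda r: r[1], reverse=True)

def list_gpu_processes():
    """Return (pid, used_mib) for each compute process, or [] without nvidia-smi."""
    if shutil.which("nvidia-smi") is None:
        return []
    out = subprocess.run(
        ["nvidia-smi", "--query-compute-apps=pid,used_memory",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=False,
    ).stdout
    return parse_compute_apps(out)

for pid, used_mib in list_gpu_processes():
    print(f"PID {pid}: {used_mib} MiB")
```

Sorting by memory use puts the most likely culprit at the top of the list.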

# 1. Import packages

In [4]:
## Need to check if packages are compatible
# !pip install accelerate nvidia-ml-py3
# !pip install datasets==2.4.0
# !pip install huggingface_hub==0.9.1
# !pip install transformers==4.22.1 # 4.2 or later is needed to use bf16, tf32, etc.
# !pip install pyarrow==9.0.0

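* Check which versions of huggingface_hub and transformers are compatible with each other.
* If you plan to use the datasets API for performance testing, check for a compatible datasets version as well.
* Assuming all three dependencies are used, the pyarrow library is also required.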
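
One way to catch a version drift early is to compare the installed packages against the versions pinned in the install cell. This is a hedged sketch using only the standard library; the helper names `installed_version` and `find_mismatches` are hypothetical, and an exact-match check is an assumption (a compatible range may be acceptable in practice).

```python
from importlib import metadata

# Versions pinned in the install cell of this notebook
PINNED = {
    "transformers": "4.22.1",
    "datasets": "2.4.0",
    "huggingface_hub": "0.9.1",
    "pyarrow": "9.0.0",
}

def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None

def find_mismatches(pinned, lookup=installed_version):
    """Return {package: (expected, found)} for every package that differs."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches

for package, (expected, found) in find_mismatches(PINNED).items():
    print(f"{package}: expected {expected}, found {found or 'not installed'}")
```

Nothing is printed when every pinned package matches.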

In [5]:
import transformers
import datasets
import huggingface_hub
import pyarrow

print(transformers.__version__)
print(datasets.__version__)
print(huggingface_hub.__version__)
print(pyarrow.__version__)

# 4.22.1
# 2.4.0
# 0.9.1
# 9.0.0

4.22.1
2.4.0
0.9.1
9.0.0


In [6]:
import os
import re
import math
import numpy as np
import pandas as pd

# TF32 can be used when running on Ampere hardware
import torch
torch.backends.cuda.matmul.allow_tf32 = True

from datasets import load_dataset, load_metric, ClassLabel
from sklearn.utils.class_weight import compute_class_weight
from sklearn.metrics import confusion_matrix, accuracy_score, roc_auc_score, precision_score, recall_score, f1_score

import ray
from ray import tune
from ray.tune import CLIReporter
from ray.tune.examples.pbt_transformers.utils import (
    download_data,
    build_compute_metrics_fn,
)
from ray.tune.schedulers import PopulationBasedTraining
from transformers import (
    glue_tasks_num_labels,
    AutoConfig,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    GlueDataset,
    GlueDataTrainingArguments,
    TrainingArguments,
    EarlyStoppingCallback
)

# 2. Import Data

* The xxx_train.csv and xxx_test.csv files must be CSV files preprocessed into the format below (column names: `text`, `label`)


<table class="features-table">
  <tr>
    <th class="mdc-text-light-green-600" style="text-align:center">
    text
    </th>
    <th class="mdc-text-purple-600" style="text-align:center">
    label
    </th>
  </tr>
  <tr>
    <td class="mdc-bg-light-green-50" style="text-align:left">
      Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...
    </td>
    <td class="mdc-bg-purple-50">
      0
    </td>
  </tr>
  <tr>
    <td class="mdc-bg-light-green-50" style="text-align:left">
      Ok lar... Joking wif u oni...
    </td>
    <td class="mdc-bg-purple-50">
      0
    </td>
  </tr>
  <tr>
    <td class="mdc-bg-light-green-50" style="text-align:left">
      Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)
    </td>
    <td class="mdc-bg-purple-50">
      1
    </td>
  </tr>
  <tr>
    <td class="mdc-bg-light-green-50" style="text-align:left">
      U dun say so early hor... U c already then say...
    </td>
    <td class="mdc-bg-purple-50">
      0
    </td>
  </tr>
  <tr>
    <td class="mdc-bg-light-green-50" style="text-align:left">
      Nah I don't think he goes to usf, he lives around here though
    </td>
    <td class="mdc-bg-purple-50">
      0
    </td>
  </tr>
</table>
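
Before loading, it can help to confirm that a file really has the two expected columns. The sketch below is an illustration only: `check_csv_columns` is a hypothetical helper, and the in-memory sample stands in for a real file.

```python
import csv
import io

REQUIRED_COLUMNS = ("text", "label")

def check_csv_columns(handle, required=REQUIRED_COLUMNS):
    """Return the required column names that are missing from the CSV header."""
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    return [name for name in required if name not in header]

# Made-up in-memory sample in the expected format
sample = io.StringIO('text,label\n"Ok lar... Joking wif u oni...",0\n')
print(check_csv_columns(sample))  # an empty list means the header is fine
```

For a file on disk, the same function can be called inside `with open(path, newline="", encoding="utf-8") as f:`.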

In [7]:
data_name = "naver_movie_review" ## IMDB / naver_movie_review / spam

dataset = load_dataset('csv', data_files={'train': f'../data_splited/{data_name}_train.csv',
                                          'test': f'../data_splited/{data_name}_test.csv'})
dataset

Using custom data configuration default-e7166b2526575299
Reusing dataset csv (/root/.cache/huggingface/datasets/csv/default-e7166b2526575299/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a)


DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 150000
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 50000
    })
})

# 3. Data Preprocessing

* To modify data loaded with the load_dataset function, write a function containing the change and apply it to each element with the map function (see [this link](https://huggingface.co/docs/datasets/v1.4.0/processing.html#processing-data-row-by-row))

In [8]:
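## remove special characters

def remove_sp(example):
    # Keep only Latin letters, digits, Hangul and spaces.
    # Inside [...] a "|" is a literal pipe, not a separator, so it is not used here.
    example["text"]=re.sub(r'[^a-zA-Z0-9ㄱ-ㅎㅏ-ㅣ가-힣 ]+', '', str(example["text"]))
    return example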

dataset = dataset.map(remove_sp)

Loading cached processed dataset at /root/.cache/huggingface/datasets/csv/default-e7166b2526575299/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a/cache-b43d4ea7484dfb3f.arrow
Loading cached processed dataset at /root/.cache/huggingface/datasets/csv/default-e7166b2526575299/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a/cache-61842ae510046f1a.arrow
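
When writing a cleaning pattern like the one in `remove_sp`, note that inside a character class `|` is a literal pipe rather than alternation. The sketch below compares two variants on a made-up string so the difference is easy to see; the variable names are illustrative only.

```python
import re

# Ranges separated by "|": inside [...] the pipe is a literal character and is kept
with_pipes = re.compile(r'[^a-z|A-Z|0-9|ㄱ-ㅎ|ㅏ-ㅣ|가-힣| ]+')
# Same ranges without the separators: pipes are removed like other symbols
without_pipes = re.compile(r'[^a-zA-Z0-9ㄱ-ㅎㅏ-ㅣ가-힣 ]+')

sample = "Good|bad!! 영화 ㅋㅋ 10점~"  # made-up review text

kept_pipe = with_pipes.sub('', sample)
dropped_pipe = without_pipes.sub('', sample)
print(kept_pipe)
print(dropped_pipe)
```

Running a few such samples through a pattern before mapping it over 200,000 rows is a cheap sanity check.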


In [9]:
dataset

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 150000
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 50000
    })
})

In [10]:
## label encoding

# Sort so that the label -> integer mapping is the same on every run
labels = sorted(set(dataset["train"]["label"] + dataset["test"]["label"]))
num_labels = len(labels)

def encoding_label(example):
    example["label"]=str_to_int.str2int(example["label"])
    return example

# Only string labels need encoding; integer labels are used as-is
if isinstance(labels[0], str):
    str_to_int = ClassLabel(num_classes=num_labels, names=labels)  # built once, not per row
    dataset = dataset.map(encoding_label)
    
print(num_labels)

2
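
To make the encoding inspectable, the mapping between original labels and integers can be built and printed explicitly. This is a plain-Python sketch under the assumption that ids are assigned by position in a sorted list of names (which is how `ClassLabel` assigns ids from `names`); `build_label_maps` and the sample labels are hypothetical.

```python
def build_label_maps(raw_labels):
    """Map each distinct label to an integer id, in sorted order, and back."""
    names = sorted(set(raw_labels))
    label2id = {name: idx for idx, name in enumerate(names)}
    id2label = {idx: name for name, idx in label2id.items()}
    return label2id, id2label

# Made-up string labels for illustration
label2id, id2label = build_label_maps(["ham", "spam", "ham", "ham"])
for name, idx in label2id.items():
    print(f"{name!r} -> {idx}")
```

Keeping `id2label` around also makes it possible to turn predicted ids back into the original label names.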


In [11]:
dataset

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 150000
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 50000
    })
})

In [12]:
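# For IMDB and Naver Movie Review:

# Make imbalanced data to test model performance.
# Note: the code below keeps label 1 at 20% of the label 0 count (label 0:label 1 = 10:2);
# use 0.25 instead of 0.2 for an exact 8:2 ratio.
# https://discuss.huggingface.co/t/huggingface-datasets-convert-a-dataset-to-pandas-and-then-convert-it-back/14708/3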

# df_train = pd.DataFrame(dataset['train'])
# df_train_0 = df_train[df_train["label"]==0]
# df_train_1 = df_train[df_train["label"]==1].sample(frac=1)[0:math.floor(len(df_train[df_train['label']==0])*0.2)]
# dataset["train"] = datasets.Dataset.from_pandas(pd.concat([df_train_0,df_train_1]), preserve_index=False)
# dataset
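
The downsampling idea in the commented-out cell can be sketched without pandas. This is an illustration on made-up rows, with a hypothetical helper `downsample_positive`; the `ratio` argument is the label 1 count relative to the label 0 count, so 0.25 yields 8:2 while 0.2 yields 10:2.

```python
import random

def downsample_positive(rows, ratio, seed=0):
    """Keep all label-0 rows and a random subset of label-1 rows of size ratio * n0."""
    rng = random.Random(seed)
    negatives = [r for r in rows if r["label"] == 0]
    positives = [r for r in rows if r["label"] == 1]
    keep = min(len(positives), int(len(negatives) * ratio))
    return negatives + rng.sample(positives, keep)

# Made-up balanced data: 100 rows per class
rows = [{"text": f"t{i}", "label": i % 2} for i in range(200)]
subset = downsample_positive(rows, ratio=0.25)
n0 = sum(r["label"] == 0 for r in subset)
n1 = sum(r["label"] == 1 for r in subset)
print(n0, n1)
```

A fixed seed keeps the sampled subset reproducible across runs.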

# 4. Load PLM & Tokenizing

In [13]:
# model_name = "bert-base-cased"
# model_name = "klue/bert-base"

# model_name = "bert-base-multilingual-cased"

# model_name = "xlm-roberta-base"
model_name = "klue/roberta-base"

In [14]:
# Download and cache the tokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name)

In [15]:
def tokenize_function(examples):
    tokenized_batch = tokenizer(examples["text"], padding="max_length", truncation=True) # padding : ['longest', 'max_length', 'do_not_pad']
    return tokenized_batch

In [16]:
tokenized_datasets = dataset.map(tokenize_function, batched=True)

In [17]:
# train_dataset = tokenized_datasets["train"].shuffle(seed=1919).select(range(0,math.floor(len(tokenized_datasets["train"])*0.7)))
# eval_dataset = tokenized_datasets["train"].shuffle(seed=1919).select(range(math.floor(len(tokenized_datasets["train"])*0.7), len(tokenized_datasets["train"])))
# test_dataset = tokenized_datasets["test"]

In [18]:
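# Small subset for a quick test run
# NOTE: train and eval select the same 1,000 rows here, so eval metrics are optimistic;
# use the 70/30 split in the previous cell for real experiments
train_dataset = tokenized_datasets["train"].shuffle(seed=1919).select(range(1000))
eval_dataset = tokenized_datasets["train"].shuffle(seed=1919).select(range(1000))
test_dataset = tokenized_datasets["test"]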

Loading cached shuffled indices for dataset at /root/.cache/huggingface/datasets/csv/default-e7166b2526575299/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a/cache-a7743211b41014c6.arrow


# 5. Check class weights

In [19]:
def class_weight(train_dataset) :
    
    train_labels = np.array(train_dataset["label"])
    class_weights = compute_class_weight(class_weight = 'balanced', classes = np.unique(train_labels), y = train_labels)
    
    weights = torch.tensor(class_weights, dtype = torch.float)
    
    return weights

In [20]:
weights = class_weight(train_dataset)
print(weights)

tensor([1.0638, 0.9434])
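
The `'balanced'` mode computes each weight as `n_samples / (n_classes * count_c)`. The sketch below reproduces the printed tensor by hand; the class counts 470 and 530 are inferred from those weights and the 1,000-row subset, so treat them as an assumption rather than something the notebook prints.

```python
def balanced_class_weights(counts):
    """weight_c = n_samples / (n_classes * count_c)"""
    n_samples = sum(counts)
    n_classes = len(counts)
    return [n_samples / (n_classes * c) for c in counts]

# Counts inferred from the printed weights (assumption, not shown in the notebook)
weights_sketch = balanced_class_weights([470, 530])
rounded = [round(w, 4) for w in weights_sketch]
print(rounded)
```

The rarer class receives the larger weight, which is what the weighted loss later relies on.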


# 6. Modeling

In [21]:
## Customize training strategy

task_data_dir = "test-model"
gpus_per_trial = 1
cpus_per_trial = 16
n_trials = 5
seed = 818

In [22]:
# Download model and features

config = AutoConfig.from_pretrained(
    model_name, 
    num_labels=num_labels
)

def model_init():
    return AutoModelForSequenceClassification.from_pretrained(
        model_name,
        config=config
        )

In [23]:
def compute_metrics(eval_preds):
    metric = load_metric("glue", "mrpc") # Accuracy/F1
    logits, labels = eval_preds
    predictions = np.argmax(logits, axis=-1)
    return metric.compute(predictions=predictions, references=labels)
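
The `glue`/`mrpc` metric returns an accuracy and a binary F1 score, which the Trainer reports as `eval_accuracy` and `eval_f1`. As a rough illustration of what those two numbers mean, here is a plain-Python sketch; `accuracy_and_f1` is a hypothetical helper and treating label 1 as the positive class is an assumption.

```python
def accuracy_and_f1(predictions, references, positive=1):
    """Accuracy over all pairs and F1 for the positive class."""
    pairs = list(zip(predictions, references))
    tp = sum(p == positive and r == positive for p, r in pairs)
    fp = sum(p == positive and r != positive for p, r in pairs)
    fn = sum(p != positive and r == positive for p, r in pairs)
    accuracy = sum(p == r for p, r in pairs) / len(pairs)
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return {"accuracy": accuracy, "f1": f1}

scores = accuracy_and_f1([1, 0, 1, 1], [1, 0, 0, 1])
print(scores)
```

In the Ray Tune result block further down, the reported `objective` (1.762...) equals `eval_accuracy` (0.877) plus `eval_f1` (0.885...).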

In [24]:
training_args = TrainingArguments(
    output_dir=".",
    learning_rate=2e-5, # config
    do_train=True,
    do_eval=True,
    no_cuda=gpus_per_trial <= 0,
    evaluation_strategy="steps",
    save_strategy="steps",
    metric_for_best_model="f1",
    greater_is_better=True,
    load_best_model_at_end=True,
    num_train_epochs=2,  # config
    max_steps=-1,  # config
    per_device_train_batch_size=8,  # config
    per_device_eval_batch_size=8,  # config
    warmup_steps=0,
    warmup_ratio=0.1,  # config
    weight_decay=0.1,  # config
    logging_dir="./logs",
    skip_memory_metrics=True,
    report_to="none",
    fp16=True,
    # bf16=True,
    # tf32=True,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    seed=seed,
    eval_steps = 50
    )
    
# trainer = Trainer(
#     model_init=model_init,
#     args=training_args,
#     train_dataset=train_dataset,
#     eval_dataset=eval_dataset,
#     compute_metrics=compute_metrics,
#     )

class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.get("labels")
        # forward pass
        outputs = model(**inputs)
        logits = outputs.get("logits")
        # compute custom loss, placing the class weights on the same device as the logits
        weight = weights.to(logits.device)
        loss_fct = torch.nn.CrossEntropyLoss(weight=weight)
        loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
    
trainer = CustomTrainer(
    model_init=model_init,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
    callbacks = [EarlyStoppingCallback(early_stopping_patience=3)]
    )

loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--klue--roberta-base/snapshots/67dd433d36ebc66a42c9aaa85abcf8d2620e41d9/pytorch_model.bin
Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.layer_norm.bias', 'lm_head.bias', 'lm_head.layer_norm.weight', 'lm_head.decoder.bias', 'lm_head.dense.weight', 'lm_head.decoder.weight', 'lm_head.dense.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequ
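
The interaction between batch size, gradient accumulation and epochs in the training arguments is easier to follow with the arithmetic written out. The sketch below assumes a single GPU, the 1,000-row training subset, and that optimizer steps per epoch are counted with floor division; `training_schedule` is a hypothetical helper.

```python
import math

def training_schedule(n_samples, batch_size, grad_accum, epochs):
    """Effective batch size and number of optimizer updates for one run."""
    batches_per_epoch = math.ceil(n_samples / batch_size)
    updates_per_epoch = max(batches_per_epoch // grad_accum, 1)
    return {
        "effective_batch_size": batch_size * grad_accum,
        "updates_per_epoch": updates_per_epoch,
        "total_updates": math.ceil(epochs * updates_per_epoch),
    }

schedule = training_schedule(n_samples=1000, batch_size=8, grad_accum=4, epochs=2)
print(schedule)

# Epoch counter at the first evaluation (eval_steps=50)
first_eval_epoch = round(50 / schedule["updates_per_epoch"], 2)
print(first_eval_epoch)
```

Under these assumptions the first evaluation lands at epoch 1.61, which agrees with the `epoch: 1.61` reported in the Ray Tune result block further down.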

In [26]:
# Hyperparameter tuning with ray tune

tune_config = {
    "num_train_epochs": tune.choice([2, 5]),
}

# PopulationBasedTraining (PBT):
# a worker may copy the model parameters of a better-performing worker (exploit)
# or explore new hyperparameters by randomly perturbing its current values
# cf. ASHAScheduler
scheduler = PopulationBasedTraining(
    time_attr="training_iteration",
    metric="eval_f1",
    mode="max",
    perturbation_interval=1,
    hyperparam_mutations={
        "weight_decay": tune.uniform(0.0, 0.3), # tune.uniform(1, 10) == np.random.uniform(1, 10)
        "learning_rate": tune.uniform(1e-5, 5e-5),
        "warmup_ratio": tune.uniform(0.0, 0.3),
    },
)


reporter = CLIReporter(
    parameter_columns={
        "weight_decay": "w_decay",
        "learning_rate": "lr",
        "per_device_train_batch_size": "train_bs/gpu",
        "num_train_epochs": "num_epochs",
    },
    metric_columns=["eval_f1", "eval_accuracy", "eval_loss", "epoch", "training_iteration"],
)

result = trainer.hyperparameter_search(
    direction = "maximize",
    hp_space = lambda _: tune_config,
    backend="ray",
    n_trials=n_trials,
    resources_per_trial={"cpu": cpus_per_trial, "gpu": gpus_per_trial},
    scheduler=scheduler,
    keep_checkpoints_num=1,
    checkpoint_score_attr="training_iteration",
    stop=None,
    progress_reporter=reporter,
    local_dir="./test-results",
    name="tune_transformer_pbt",
    log_to_file=True,
)

[2m[36m(pid=3822550)[0m 2022-10-19 02:38:59.873107: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0


== Status ==
Current time: 2022-10-19 02:38:58 (running for 00:00:00.15)
Memory usage on this node: 8.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------|
| _objective_2ed07_00000 | RUNNING  | 172.17.0.3:3822550 | 0.190536  | 4.74713e-05 |                |            2 |
| _objective_2ed07_00001 | PENDING  |                    | 0.281782  | 2.12275e-05 |                |            5 

[2m[36m(_objective pid=3822550)[0m Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.bias', 'lm_head.decoder.bias', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight']
[2m[36m(_objective pid=3822550)[0m - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
[2m[36m(_objective pid=3822550)[0m - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2m[36m(_objective pid=3822550)[0m Some weights of RobertaForSequenceClassification were no

Result for _objective_2ed07_00000:
  date: 2022-10-19_02-39-44
  done: false
  epoch: 1.61
  eval_accuracy: 0.877
  eval_f1: 0.8851540616246499
  eval_loss: 0.32768750190734863
  eval_runtime: 6.8255
  eval_samples_per_second: 146.51
  eval_steps_per_second: 18.314
  experiment_id: cab204b0b71b4a36ba8ad990e4f89560
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.76215406162465
  pid: 3822550
  time_since_restore: 43.60719609260559
  time_this_iter_s: 43.60719609260559
  time_total_s: 43.60719609260559
  timestamp: 1666147184
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: 2ed07_00000
  warmup_time: 0.0026845932006835938
  
[2m[36m(_objective pid=3822550)[0m {'eval_loss': 0.32768750190734863, 'eval_accuracy': 0.877, 'eval_f1': 0.8851540616246499, 'eval_runtime': 6.8255, 'eval_samples_per_second': 146.51, 'eval_steps_per_second': 18.314, 'epoch': 1.61}


                                               
[2m[36m(pid=3822818)[0m 2022-10-19 02:39:46.604076: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0


== Status ==
Current time: 2022-10-19 02:39:50 (running for 00:00:51.87)
Memory usage on this node: 12.9/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 3 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

[2m[36m(_objective pid=3822818)[0m Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.decoder.weight', 'lm_head.decoder.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.bias', 'lm_head.bias', 'lm_head.layer_norm.weight']
[2m[36m(_objective pid=3822818)[0m - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
[2m[36m(_objective pid=3822818)[0m - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2m[36m(_objective pid=3822818)[0m Some weights of RobertaForSequenceClassification were no

== Status ==
Current time: 2022-10-19 02:39:55 (running for 00:00:56.88)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 3 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

  5%|▍         | 7/155 [00:04<01:34,  1.57it/s]
  5%|▌         | 8/155 [00:05<01:33,  1.57it/s]
  6%|▌         | 9/155 [00:05<01:33,  1.57it/s]
  6%|▋         | 10/155 [00:06<01:32,  1.57it/s]
  7%|▋         | 11/155 [00:07<01:31,  1.57it/s]
  8%|▊         | 12/155 [00:07<01:31,  1.57it/s]
  8%|▊         | 13/155 [00:08<01:30,  1.57it/s]
  9%|▉         | 14/155 [00:08<01:29,  1.57it/s]


== Status ==
Current time: 2022-10-19 02:40:00 (running for 00:01:01.88)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 3 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 10%|▉         | 15/155 [00:09<01:29,  1.57it/s]
 10%|█         | 16/155 [00:10<01:28,  1.57it/s]
 11%|█         | 17/155 [00:10<01:28,  1.57it/s]
 12%|█▏        | 18/155 [00:11<01:27,  1.57it/s]
 12%|█▏        | 19/155 [00:12<01:26,  1.57it/s]
 13%|█▎        | 20/155 [00:12<01:26,  1.57it/s]
 14%|█▎        | 21/155 [00:13<01:25,  1.56it/s]
 14%|█▍        | 22/155 [00:14<01:24,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:40:05 (running for 00:01:06.88)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 3 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 15%|█▍        | 23/155 [00:14<01:24,  1.57it/s]
 15%|█▌        | 24/155 [00:15<01:23,  1.56it/s]
 16%|█▌        | 25/155 [00:15<01:23,  1.56it/s]
 17%|█▋        | 26/155 [00:16<01:22,  1.56it/s]
 17%|█▋        | 27/155 [00:17<01:21,  1.56it/s]
 18%|█▊        | 28/155 [00:17<01:21,  1.57it/s]
 19%|█▊        | 29/155 [00:18<01:20,  1.57it/s]


== Status ==
Current time: 2022-10-19 02:40:10 (running for 00:01:11.88)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 3 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 19%|█▉        | 30/155 [00:19<01:19,  1.56it/s]
 20%|██        | 31/155 [00:19<01:19,  1.56it/s]
 21%|██        | 32/155 [00:20<01:24,  1.46it/s]
 21%|██▏       | 33/155 [00:21<01:21,  1.49it/s]
 22%|██▏       | 34/155 [00:21<01:20,  1.51it/s]
 23%|██▎       | 35/155 [00:22<01:18,  1.53it/s]
 23%|██▎       | 36/155 [00:23<01:17,  1.54it/s]
 24%|██▍       | 37/155 [00:23<01:16,  1.55it/s]


== Status ==
Current time: 2022-10-19 02:40:15 (running for 00:01:16.89)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 3 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 25%|██▍       | 38/155 [00:24<01:15,  1.55it/s]
 25%|██▌       | 39/155 [00:25<01:14,  1.56it/s]
 26%|██▌       | 40/155 [00:25<01:13,  1.56it/s]
 26%|██▋       | 41/155 [00:26<01:13,  1.56it/s]
 27%|██▋       | 42/155 [00:26<01:12,  1.56it/s]
 28%|██▊       | 43/155 [00:27<01:11,  1.56it/s]
 28%|██▊       | 44/155 [00:28<01:10,  1.56it/s]
 29%|██▉       | 45/155 [00:28<01:10,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:40:20 (running for 00:01:21.89)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 3 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 30%|██▉       | 46/155 [00:29<01:09,  1.56it/s]
 30%|███       | 47/155 [00:30<01:09,  1.56it/s]
 31%|███       | 48/155 [00:30<01:08,  1.57it/s]
 32%|███▏      | 49/155 [00:31<01:07,  1.57it/s]
 32%|███▏      | 50/155 [00:32<01:07,  1.57it/s]
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.42it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.22it/s]
  9%|▉         | 11/125 [00:00<00:04, 26.14it/s]
 11%|█         | 14/125 [00:00<00:04, 25.53it/s]
 14%|█▎        | 17/125 [00:00<00:04, 25.17it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.92it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.64it/s]
 21%|██        | 26/125 [00:01<00:04, 

== Status ==
Current time: 2022-10-19 02:40:25 (running for 00:01:26.89)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 3 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

[2m[36m(_objective pid=3822818)[0m 
 42%|████▏     | 53/125 [00:02<00:02, 24.43it/s][A
[2m[36m(_objective pid=3822818)[0m 
 45%|████▍     | 56/125 [00:02<00:02, 24.44it/s][A
[2m[36m(_objective pid=3822818)[0m 
 47%|████▋     | 59/125 [00:02<00:02, 24.45it/s][A
[2m[36m(_objective pid=3822818)[0m 
 50%|████▉     | 62/125 [00:02<00:02, 24.49it/s][A
[2m[36m(_objective pid=3822818)[0m 
 52%|█████▏    | 65/125 [00:02<00:02, 24.51it/s][A
[2m[36m(_objective pid=3822818)[0m 
 54%|█████▍    | 68/125 [00:02<00:02, 24.52it/s][A
[2m[36m(_objective pid=3822818)[0m 
 57%|█████▋    | 71/125 [00:02<00:02, 24.49it/s][A
[2m[36m(_objective pid=3822818)[0m 
 59%|█████▉    | 74/125 [00:02<00:02, 24.49it/s][A
[2m[36m(_objective pid=3822818)[0m 
 62%|██████▏   | 77/125 [00:03<00:01, 24.39it/s][A
[2m[36m(_objective pid=3822818)[0m 
 64%|██████▍   | 80/125 [00:03<00:01, 23.94it/s][A
[2m[36m(_objective pid=3822818)[0m 
 66%|██████▋   | 83/125 [00:03<00:01, 24.12it/s][A

Result for _objective_2ed07_00001:
  date: 2022-10-19_02-40-29
  done: false
  epoch: 1.61
  eval_accuracy: 0.873
  eval_f1: 0.8800755429650613
  eval_loss: 0.33460041880607605
  eval_runtime: 6.3604
  eval_samples_per_second: 157.223
  eval_steps_per_second: 19.653
  experiment_id: 037f0829187c4e36ae81f75b0b213481
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.7530755429650613
  pid: 3822818
  time_since_restore: 42.13527297973633
  time_this_iter_s: 42.13527297973633
  time_total_s: 42.13527297973633
  timestamp: 1666147229
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: 2ed07_00001
  warmup_time: 0.0026731491088867188
  
[2m[36m(_objective pid=3822818)[0m {'eval_loss': 0.33460041880607605, 'eval_accuracy': 0.873, 'eval_f1': 0.8800755429650613, 'eval_runtime': 6.3604, 'eval_samples_per_second': 157.223, 'eval_steps_per_second': 19.653, 'epoch': 1.61}


[2m[36m(pid=3823074)[0m 2022-10-19 02:40:31.582028: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0


== Status ==
Current time: 2022-10-19 02:40:35 (running for 00:01:36.87)
Memory usage on this node: 12.9/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (2 PAUSED, 2 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

[2m[36m(_objective pid=3823074)[0m Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.dense.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.weight', 'lm_head.bias', 'lm_head.dense.weight', 'lm_head.decoder.bias', 'lm_head.layer_norm.bias']
[2m[36m(_objective pid=3823074)[0m - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
[2m[36m(_objective pid=3823074)[0m - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2m[36m(_objective pid=3823074)[0m Some weights of RobertaForSequenceClassification were no

== Status ==
Current time: 2022-10-19 02:40:40 (running for 00:01:41.88)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (2 PAUSED, 2 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

  5%|▍         | 7/155 [00:04<01:34,  1.57it/s]
  5%|▌         | 8/155 [00:05<01:33,  1.57it/s]
  6%|▌         | 9/155 [00:05<01:33,  1.57it/s]
  6%|▋         | 10/155 [00:06<01:32,  1.56it/s]
  7%|▋         | 11/155 [00:07<01:32,  1.57it/s]
  8%|▊         | 12/155 [00:07<01:31,  1.56it/s]
  8%|▊         | 13/155 [00:08<01:30,  1.56it/s]
  9%|▉         | 14/155 [00:08<01:30,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:40:45 (running for 00:01:46.88)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (2 PAUSED, 2 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 10%|▉         | 15/155 [00:09<01:29,  1.56it/s]
 10%|█         | 16/155 [00:10<01:28,  1.56it/s]
 11%|█         | 17/155 [00:10<01:28,  1.56it/s]
 12%|█▏        | 18/155 [00:11<01:27,  1.56it/s]
 12%|█▏        | 19/155 [00:12<01:26,  1.56it/s]
 13%|█▎        | 20/155 [00:12<01:26,  1.56it/s]
 14%|█▎        | 21/155 [00:13<01:25,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:40:50 (running for 00:01:51.88)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (2 PAUSED, 2 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 14%|█▍        | 22/155 [00:14<01:24,  1.56it/s]
 15%|█▍        | 23/155 [00:14<01:24,  1.57it/s]
 15%|█▌        | 24/155 [00:15<01:23,  1.57it/s]
 16%|█▌        | 25/155 [00:15<01:23,  1.56it/s]
 17%|█▋        | 26/155 [00:16<01:22,  1.56it/s]
 17%|█▋        | 27/155 [00:17<01:21,  1.56it/s]
 18%|█▊        | 28/155 [00:17<01:21,  1.56it/s]
 19%|█▊        | 29/155 [00:18<01:20,  1.57it/s]


== Status ==
Current time: 2022-10-19 02:40:55 (running for 00:01:56.88)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (2 PAUSED, 2 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 19%|█▉        | 30/155 [00:19<01:19,  1.56it/s]
 20%|██        | 31/155 [00:19<01:19,  1.56it/s]
 21%|██        | 32/155 [00:20<01:24,  1.46it/s]
 21%|██▏       | 33/155 [00:21<01:22,  1.49it/s]
 22%|██▏       | 34/155 [00:21<01:20,  1.51it/s]
 23%|██▎       | 35/155 [00:22<01:18,  1.53it/s]
 23%|██▎       | 36/155 [00:23<01:17,  1.54it/s]
 24%|██▍       | 37/155 [00:23<01:16,  1.55it/s]


== Status ==
Current time: 2022-10-19 02:41:00 (running for 00:02:01.88)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (2 PAUSED, 2 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 25%|██▍       | 38/155 [00:24<01:15,  1.55it/s]
 25%|██▌       | 39/155 [00:25<01:14,  1.56it/s]
 26%|██▌       | 40/155 [00:25<01:13,  1.56it/s]
 26%|██▋       | 41/155 [00:26<01:13,  1.56it/s]
 27%|██▋       | 42/155 [00:27<01:12,  1.56it/s]
 28%|██▊       | 43/155 [00:27<01:11,  1.56it/s]
 28%|██▊       | 44/155 [00:28<01:11,  1.56it/s]
 29%|██▉       | 45/155 [00:28<01:10,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:41:05 (running for 00:02:06.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (2 PAUSED, 2 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 30%|██▉       | 46/155 [00:29<01:09,  1.56it/s]
 30%|███       | 47/155 [00:30<01:09,  1.56it/s]
 31%|███       | 48/155 [00:30<01:08,  1.56it/s]
 32%|███▏      | 49/155 [00:31<01:07,  1.56it/s]
 32%|███▏      | 50/155 [00:32<01:07,  1.56it/s]
[2m[36m(_objective pid=3823074)[0m 
  0%|          | 0/125 [00:00<?, ?it/s][A
[2m[36m(_objective pid=3823074)[0m 
  3%|▎         | 4/125 [00:00<00:03, 32.23it/s][A
[2m[36m(_objective pid=3823074)[0m 
  6%|▋         | 8/125 [00:00<00:04, 27.10it/s][A
[2m[36m(_objective pid=3823074)[0m 
  9%|▉         | 11/125 [00:00<00:04, 26.01it/s][A
[2m[36m(_objective pid=3823074)[0m 
 11%|█         | 14/125 [00:00<00:04, 25.44it/s][A
[2m[36m(_objective pid=3823074)[0m 
 14%|█▎        | 17/125 [00:00<00:04, 25.08it/s][A
[2m[36m(_objective pid=3823074)[0m 
 16%|█▌        | 20/125 [00:00<00:04, 24.88it/s][A
[2m[36m(_objective pid=3823074)[0m 
 18%|█▊        | 23/125 [00:00<00:04, 24.72it/s][A
[2m[36m(_objective pid=3823074)[0m 

== Status ==
Current time: 2022-10-19 02:41:10 (running for 00:02:11.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (2 PAUSED, 2 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

[2m[36m(_objective pid=3823074)[0m 
 40%|████      | 50/125 [00:02<00:03, 24.44it/s][A
[2m[36m(_objective pid=3823074)[0m 
 42%|████▏     | 53/125 [00:02<00:02, 24.41it/s][A
[2m[36m(_objective pid=3823074)[0m 
 45%|████▍     | 56/125 [00:02<00:02, 24.42it/s][A
[2m[36m(_objective pid=3823074)[0m 
 47%|████▋     | 59/125 [00:02<00:02, 24.42it/s][A
[2m[36m(_objective pid=3823074)[0m 
 50%|████▉     | 62/125 [00:02<00:02, 24.43it/s][A
[2m[36m(_objective pid=3823074)[0m 
 52%|█████▏    | 65/125 [00:02<00:02, 24.45it/s][A
[2m[36m(_objective pid=3823074)[0m 
 54%|█████▍    | 68/125 [00:02<00:02, 24.41it/s][A
[2m[36m(_objective pid=3823074)[0m 
 57%|█████▋    | 71/125 [00:02<00:02, 24.44it/s][A
[2m[36m(_objective pid=3823074)[0m 
 59%|█████▉    | 74/125 [00:02<00:02, 24.43it/s][A
[2m[36m(_objective pid=3823074)[0m 
 62%|██████▏   | 77/125 [00:03<00:01, 24.41it/s][A
[2m[36m(_objective pid=3823074)[0m 
 64%|██████▍   | 80/125 [00:03<00:01, 24.31it/s][A

Result for _objective_2ed07_00002:
  date: 2022-10-19_02-41-14
  done: false
  epoch: 1.61
  eval_accuracy: 0.833
  eval_f1: 0.8258602711157456
  eval_loss: 0.4038117229938507
  eval_runtime: 6.252
  eval_samples_per_second: 159.948
  eval_steps_per_second: 19.993
  experiment_id: aab49e31c6fa43f48990833c86d7c485
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.6588602711157456
  pid: 3823074
  time_since_restore: 42.03655242919922
  time_this_iter_s: 42.03655242919922
  time_total_s: 42.03655242919922
  timestamp: 1666147274
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: 2ed07_00002
  warmup_time: 0.0017294883728027344
  
[2m[36m(_objective pid=3823074)[0m {'eval_loss': 0.4038117229938507, 'eval_accuracy': 0.833, 'eval_f1': 0.8258602711157456, 'eval_runtime': 6.252, 'eval_samples_per_second': 159.948, 'eval_steps_per_second': 19.993, 'epoch': 1.61}


[2m[36m(pid=3823322)[0m 2022-10-19 02:41:16.606448: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0


== Status ==
Current time: 2022-10-19 02:41:20 (running for 00:02:21.88)
Memory usage on this node: 12.9/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

[2m[36m(_objective pid=3823322)[0m Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.layer_norm.bias', 'lm_head.bias', 'lm_head.layer_norm.weight', 'lm_head.decoder.weight', 'lm_head.decoder.bias', 'lm_head.dense.weight', 'lm_head.dense.bias']
[2m[36m(_objective pid=3823322)[0m - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
[2m[36m(_objective pid=3823322)[0m - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2m[36m(_objective pid=3823322)[0m Some weights of RobertaForSequenceClassification were no

== Status ==
Current time: 2022-10-19 02:41:25 (running for 00:02:26.88)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 11%|█▏        | 7/62 [00:04<00:35,  1.56it/s]
 13%|█▎        | 8/62 [00:05<00:34,  1.56it/s]
 15%|█▍        | 9/62 [00:05<00:33,  1.56it/s]
 16%|█▌        | 10/62 [00:06<00:33,  1.56it/s]
 18%|█▊        | 11/62 [00:07<00:32,  1.56it/s]
 19%|█▉        | 12/62 [00:07<00:31,  1.56it/s]
 21%|██        | 13/62 [00:08<00:31,  1.56it/s]
 23%|██▎       | 14/62 [00:08<00:30,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:41:30 (running for 00:02:31.88)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 24%|██▍       | 15/62 [00:09<00:30,  1.56it/s]
 26%|██▌       | 16/62 [00:10<00:29,  1.56it/s]
 27%|██▋       | 17/62 [00:10<00:28,  1.56it/s]
 29%|██▉       | 18/62 [00:11<00:28,  1.56it/s]
 31%|███       | 19/62 [00:12<00:27,  1.56it/s]
 32%|███▏      | 20/62 [00:12<00:26,  1.56it/s]
 34%|███▍      | 21/62 [00:13<00:26,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:41:35 (running for 00:02:36.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 35%|███▌      | 22/62 [00:14<00:25,  1.56it/s]
 37%|███▋      | 23/62 [00:14<00:24,  1.56it/s]
 39%|███▊      | 24/62 [00:15<00:24,  1.56it/s]
 40%|████      | 25/62 [00:16<00:23,  1.56it/s]
 42%|████▏     | 26/62 [00:16<00:23,  1.56it/s]
 44%|████▎     | 27/62 [00:17<00:22,  1.56it/s]
 45%|████▌     | 28/62 [00:17<00:21,  1.56it/s]
 47%|████▋     | 29/62 [00:18<00:21,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:41:40 (running for 00:02:41.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 48%|████▊     | 30/62 [00:19<00:20,  1.56it/s]
 50%|█████     | 31/62 [00:19<00:19,  1.56it/s]
 52%|█████▏    | 32/62 [00:20<00:20,  1.46it/s]
 53%|█████▎    | 33/62 [00:21<00:19,  1.49it/s]
 55%|█████▍    | 34/62 [00:21<00:18,  1.51it/s]
 56%|█████▋    | 35/62 [00:22<00:17,  1.53it/s]
 58%|█████▊    | 36/62 [00:23<00:16,  1.54it/s]
 60%|█████▉    | 37/62 [00:23<00:16,  1.54it/s]


== Status ==
Current time: 2022-10-19 02:41:45 (running for 00:02:46.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 61%|██████▏   | 38/62 [00:24<00:15,  1.55it/s]
 63%|██████▎   | 39/62 [00:25<00:14,  1.55it/s]
 65%|██████▍   | 40/62 [00:25<00:14,  1.56it/s]
 66%|██████▌   | 41/62 [00:26<00:13,  1.56it/s]
 68%|██████▊   | 42/62 [00:27<00:12,  1.56it/s]
 69%|██████▉   | 43/62 [00:27<00:12,  1.56it/s]
 71%|███████   | 44/62 [00:28<00:11,  1.56it/s]
 73%|███████▎  | 45/62 [00:28<00:10,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:41:50 (running for 00:02:51.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

 74%|███████▍  | 46/62 [00:29<00:10,  1.56it/s]
 76%|███████▌  | 47/62 [00:30<00:09,  1.56it/s]
 77%|███████▋  | 48/62 [00:30<00:08,  1.56it/s]
 79%|███████▉  | 49/62 [00:31<00:08,  1.56it/s]
 81%|████████  | 50/62 [00:32<00:07,  1.56it/s]
(_objective pid=3823322)
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.46it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.04it/s]
  9%|▉         | 11/125 [00:00<00:04, 25.95it/s]
 11%|█         | 14/125 [00:00<00:04, 25.39it/s]
 14%|█▎        | 17/125 [00:00<00:04, 25.03it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.83it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.70it/s]
 21%|██        | 26/125 [00:01<00:04, 24.60

== Status ==
Current time: 2022-10-19 02:41:55 (running for 00:02:56.90)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+--------------

[2m[36m(_objective pid=3823322)[0m 
 40%|████      | 50/125 [00:02<00:03, 24.42it/s]
 42%|████▏     | 53/125 [00:02<00:02, 24.26it/s]
 45%|████▍     | 56/125 [00:02<00:02, 24.33it/s]
 47%|████▋     | 59/125 [00:02<00:02, 24.25it/s]
 50%|████▉     | 62/125 [00:02<00:02, 24.32it/s]
 52%|█████▏    | 65/125 [00:02<00:02, 24.35it/s]
 54%|█████▍    | 68/125 [00:02<00:02, 24.37it/s]
 57%|█████▋    | 71/125 [00:02<00:02, 24.39it/s]
 59%|█████▉    | 74/125 [00:02<00:02, 24.33it/s]
 62%|██████▏   | 77/125 [00:03<00:01, 24.27it/s]
 64%|██████▍   | 80/125 [00:03<00:01, 24.32it/s]

Result for _objective_2ed07_00003:
  date: 2022-10-19_02-41-59
  done: false
  epoch: 1.61
  eval_accuracy: 0.889
  eval_f1: 0.89478672985782
  eval_loss: 0.3029094934463501
  eval_runtime: 6.1003
  eval_samples_per_second: 163.926
  eval_steps_per_second: 20.491
  experiment_id: 8919d84288714d75a6c2438ecc0bf4bb
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.78378672985782
  pid: 3823322
  time_since_restore: 41.94896078109741
  time_this_iter_s: 41.94896078109741
  time_total_s: 41.94896078109741
  timestamp: 1666147319
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: 2ed07_00003
  warmup_time: 0.0018010139465332031
  
(_objective pid=3823322) {'eval_loss': 0.3029094934463501, 'eval_accuracy': 0.889, 'eval_f1': 0.89478672985782, 'eval_runtime': 6.1003, 'eval_samples_per_second': 163.926, 'eval_steps_per_second': 20.491, 'epoch': 1.61}


(pid=3823580) 2022-10-19 02:42:01.587526: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0


== Status ==
Current time: 2022-10-19 02:42:05 (running for 00:03:06.88)
Memory usage on this node: 13.0/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

(_objective pid=3823580) Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.dense.weight', 'lm_head.bias', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.weight', 'lm_head.dense.bias', 'lm_head.decoder.bias']
(_objective pid=3823580) - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=3823580) - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=3823580) Some weights of RobertaForSequenceClassification were no

== Status ==
Current time: 2022-10-19 02:42:10 (running for 00:03:11.88)
Memory usage on this node: 13.1/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 11%|█▏        | 7/62 [00:04<00:35,  1.56it/s]
 13%|█▎        | 8/62 [00:05<00:34,  1.56it/s]
 15%|█▍        | 9/62 [00:05<00:33,  1.56it/s]
 16%|█▌        | 10/62 [00:06<00:33,  1.56it/s]
 18%|█▊        | 11/62 [00:07<00:32,  1.56it/s]
 19%|█▉        | 12/62 [00:07<00:31,  1.56it/s]
 21%|██        | 13/62 [00:08<00:31,  1.56it/s]
 23%|██▎       | 14/62 [00:08<00:30,  1.57it/s]


== Status ==
Current time: 2022-10-19 02:42:15 (running for 00:03:16.89)
Memory usage on this node: 13.1/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 24%|██▍       | 15/62 [00:09<00:30,  1.57it/s]
 26%|██▌       | 16/62 [00:10<00:29,  1.57it/s]
 27%|██▋       | 17/62 [00:10<00:28,  1.56it/s]
 29%|██▉       | 18/62 [00:11<00:28,  1.56it/s]
 31%|███       | 19/62 [00:12<00:27,  1.56it/s]
 32%|███▏      | 20/62 [00:12<00:26,  1.56it/s]
 34%|███▍      | 21/62 [00:13<00:26,  1.56it/s]
 35%|███▌      | 22/62 [00:14<00:25,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:42:20 (running for 00:03:21.89)
Memory usage on this node: 13.1/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 37%|███▋      | 23/62 [00:14<00:24,  1.56it/s]
 39%|███▊      | 24/62 [00:15<00:24,  1.56it/s]
 40%|████      | 25/62 [00:16<00:23,  1.56it/s]
 42%|████▏     | 26/62 [00:16<00:23,  1.56it/s]
 44%|████▎     | 27/62 [00:17<00:22,  1.56it/s]
 45%|████▌     | 28/62 [00:17<00:21,  1.56it/s]
 47%|████▋     | 29/62 [00:18<00:21,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:42:25 (running for 00:03:26.89)
Memory usage on this node: 13.1/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 48%|████▊     | 30/62 [00:19<00:20,  1.56it/s]
 50%|█████     | 31/62 [00:19<00:19,  1.56it/s]
 52%|█████▏    | 32/62 [00:20<00:20,  1.46it/s]
 53%|█████▎    | 33/62 [00:21<00:19,  1.49it/s]
 55%|█████▍    | 34/62 [00:21<00:18,  1.51it/s]
 56%|█████▋    | 35/62 [00:22<00:17,  1.53it/s]
 58%|█████▊    | 36/62 [00:23<00:16,  1.54it/s]
 60%|█████▉    | 37/62 [00:23<00:16,  1.54it/s]


== Status ==
Current time: 2022-10-19 02:42:30 (running for 00:03:31.89)
Memory usage on this node: 13.1/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 61%|██████▏   | 38/62 [00:24<00:15,  1.55it/s]
 63%|██████▎   | 39/62 [00:25<00:14,  1.55it/s]
 65%|██████▍   | 40/62 [00:25<00:14,  1.56it/s]
 66%|██████▌   | 41/62 [00:26<00:13,  1.56it/s]
 68%|██████▊   | 42/62 [00:27<00:12,  1.56it/s]
 69%|██████▉   | 43/62 [00:27<00:12,  1.56it/s]
 71%|███████   | 44/62 [00:28<00:11,  1.56it/s]
 73%|███████▎  | 45/62 [00:28<00:10,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:42:35 (running for 00:03:36.90)
Memory usage on this node: 13.1/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 74%|███████▍  | 46/62 [00:29<00:10,  1.56it/s]
 76%|███████▌  | 47/62 [00:30<00:09,  1.56it/s]
 77%|███████▋  | 48/62 [00:30<00:08,  1.56it/s]
 79%|███████▉  | 49/62 [00:31<00:08,  1.56it/s]
 81%|████████  | 50/62 [00:32<00:07,  1.56it/s]
[2m[36m(_objective pid=3823580)[0m 
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.48it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.19it/s]
  9%|▉         | 11/125 [00:00<00:04, 26.06it/s]
 11%|█         | 14/125 [00:00<00:04, 25.45it/s]
 14%|█▎        | 17/125 [00:00<00:04, 25.09it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.87it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.75it/s]
 21%

== Status ==
Current time: 2022-10-19 02:42:40 (running for 00:03:41.90)
Memory usage on this node: 13.1/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

[2m[36m(_objective pid=3823580)[0m 
 42%|████▏     | 53/125 [00:02<00:02, 24.41it/s]
 45%|████▍     | 56/125 [00:02<00:02, 24.39it/s]
 47%|████▋     | 59/125 [00:02<00:02, 24.40it/s]
 50%|████▉     | 62/125 [00:02<00:02, 24.43it/s]
 52%|█████▏    | 65/125 [00:02<00:02, 24.44it/s]
 54%|█████▍    | 68/125 [00:02<00:02, 24.45it/s]
 57%|█████▋    | 71/125 [00:02<00:02, 24.42it/s]
 59%|█████▉    | 74/125 [00:02<00:02, 24.44it/s]
 62%|██████▏   | 77/125 [00:03<00:01, 24.41it/s]
 64%|██████▍   | 80/125 [00:03<00:01, 23.94it/s]
 66%|██████▋   | 83/125 [00:03<00:01, 24.09it/s]

Result for _objective_2ed07_00004:
  date: 2022-10-19_02-42-44
  done: false
  epoch: 1.61
  eval_accuracy: 0.868
  eval_f1: 0.8784530386740331
  eval_loss: 0.34668323397636414
  eval_runtime: 6.0775
  eval_samples_per_second: 164.541
  eval_steps_per_second: 20.568
  experiment_id: e57f5f92dd414797867a8252b4f4e5f4
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.7464530386740331
  pid: 3823580
  time_since_restore: 41.87562370300293
  time_this_iter_s: 41.87562370300293
  time_total_s: 41.87562370300293
  timestamp: 1666147364
  timesteps_since_restore: 0
  training_iteration: 1
  trial_id: 2ed07_00004
  warmup_time: 0.0018467903137207031
  
(_objective pid=3823580) {'eval_loss': 0.34668323397636414, 'eval_accuracy': 0.868, 'eval_f1': 0.8784530386740331, 'eval_runtime': 6.0775, 'eval_samples_per_second': 164.541, 'eval_steps_per_second': 20.568, 'epoch': 1.61}


(pid=3823839) 2022-10-19 02:42:46.617942: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
(_objective pid=3823839) 2022-10-19 02:42:47,589	INFO trainable.py:668 -- Restored on 172.17.0.3 from checkpoint: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt/_objective_2ed07_00000_0_num_train_epochs=2_2022-10-19_02-38-58/checkpoint_tmpbbc86b
(_objective pid=3823839) 2022-10-19 02:42:47,589	INFO trainable.py:677 -- Current state after restoring: {'_iteration': 0, '_timesteps_total': 0, '_time_total': 43.60719609260559, '_episodes_total': 0}


== Status ==
Current time: 2022-10-19 02:42:50 (running for 00:03:51.89)
Memory usage on this node: 12.9/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

(_objective pid=3823839) Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.decoder.weight', 'lm_head.bias', 'lm_head.dense.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight', 'lm_head.decoder.bias']
(_objective pid=3823839) - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=3823839) - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=3823839) Some weights of RobertaForSequenceClassification were no

== Status ==
Current time: 2022-10-19 02:42:55 (running for 00:03:56.89)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 11%|█▏        | 7/62 [00:04<00:35,  1.56it/s]
 13%|█▎        | 8/62 [00:05<00:34,  1.56it/s]
 15%|█▍        | 9/62 [00:05<00:33,  1.56it/s]
 16%|█▌        | 10/62 [00:06<00:33,  1.56it/s]
 18%|█▊        | 11/62 [00:07<00:32,  1.56it/s]
 19%|█▉        | 12/62 [00:07<00:31,  1.56it/s]
 21%|██        | 13/62 [00:08<00:31,  1.56it/s]
 23%|██▎       | 14/62 [00:08<00:30,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:43:00 (running for 00:04:01.89)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 24%|██▍       | 15/62 [00:09<00:30,  1.56it/s]
 26%|██▌       | 16/62 [00:10<00:29,  1.56it/s]
 27%|██▋       | 17/62 [00:10<00:28,  1.57it/s]
 29%|██▉       | 18/62 [00:11<00:28,  1.56it/s]
 31%|███       | 19/62 [00:12<00:27,  1.57it/s]
 32%|███▏      | 20/62 [00:12<00:26,  1.56it/s]
 34%|███▍      | 21/62 [00:13<00:26,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:43:05 (running for 00:04:06.90)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 35%|███▌      | 22/62 [00:14<00:25,  1.57it/s]
 37%|███▋      | 23/62 [00:14<00:24,  1.56it/s]
 39%|███▊      | 24/62 [00:15<00:24,  1.56it/s]
 40%|████      | 25/62 [00:15<00:23,  1.56it/s]
 42%|████▏     | 26/62 [00:16<00:23,  1.56it/s]
 44%|████▎     | 27/62 [00:17<00:22,  1.56it/s]
 45%|████▌     | 28/62 [00:17<00:21,  1.56it/s]
 47%|████▋     | 29/62 [00:18<00:21,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:43:10 (running for 00:04:11.90)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 48%|████▊     | 30/62 [00:19<00:20,  1.56it/s]
 50%|█████     | 31/62 [00:19<00:19,  1.56it/s]
 52%|█████▏    | 32/62 [00:20<00:20,  1.46it/s]
 53%|█████▎    | 33/62 [00:21<00:19,  1.49it/s]
 55%|█████▍    | 34/62 [00:21<00:18,  1.51it/s]
 56%|█████▋    | 35/62 [00:22<00:17,  1.53it/s]
 58%|█████▊    | 36/62 [00:23<00:16,  1.54it/s]
 60%|█████▉    | 37/62 [00:23<00:16,  1.55it/s]


== Status ==
Current time: 2022-10-19 02:43:15 (running for 00:04:16.90)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (4 PAUSED, 1 RUNNING)
+------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------|
|

 61%|██████▏   | 38/62 [00:24<00:15,  1.55it/s]
 63%|██████▎   | 39/62 [00:25<00:14,  1.56it/s]
 65%|██████▍   | 40/62 [00:25<00:14,  1.56it/s]
 66%|██████▌   | 41/62 [00:26<00:13,  1.56it/s]
 68%|██████▊   | 42/62 [00:27<00:12,  1.56it/s]
 69%|██████▉   | 43/62 [00:27<00:12,  1.56it/s]
 71%|███████   | 44/62 [00:28<00:11,  1.56it/s]
 73%|███████▎  | 45/62 [00:28<00:10,  1.56it/s]


 74%|███████▍  | 46/62 [00:29<00:10,  1.56it/s]
 76%|███████▌  | 47/62 [00:30<00:09,  1.56it/s]
 77%|███████▋  | 48/62 [00:30<00:08,  1.56it/s]
 79%|███████▉  | 49/62 [00:31<00:08,  1.56it/s]
 81%|████████  | 50/62 [00:32<00:07,  1.56it/s]
(_objective pid=3823839)
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.54it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.12it/s]
  9%|▉         | 11/125 [00:00<00:04, 26.03it/s]
 11%|█         | 14/125 [00:00<00:04, 25.44it/s]
 14%|█▎        | 17/125 [00:00<00:04, 25.01it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.84it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.66it/s]
 21%

 40%|████      | 50/125 [00:02<00:03, 24.43it/s]
 42%|████▏     | 53/125 [00:02<00:02, 24.40it/s]
 45%|████▍     | 56/125 [00:02<00:02, 24.42it/s]
 47%|████▋     | 59/125 [00:02<00:02, 24.44it/s]
 50%|████▉     | 62/125 [00:02<00:02, 24.44it/s]
 52%|█████▏    | 65/125 [00:02<00:02, 24.45it/s]
 54%|█████▍    | 68/125 [00:02<00:02, 24.44it/s]
 57%|█████▋    | 71/125 [00:02<00:02, 24.45it/s]
 59%|█████▉    | 74/125 [00:02<00:02, 24.42it/s]
 62%|██████▏   | 77/125 [00:03<00:01, 24.43it/s]
 64%|██████▍   | 80/125 [00:03<00:01, 24.44it/s]

Result for _objective_2ed07_00000:
  date: 2022-10-19_02-43-29
  done: false
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.877
  eval_f1: 0.8851540616246499
  eval_loss: 0.32768750190734863
  eval_runtime: 6.0404
  eval_samples_per_second: 165.553
  eval_steps_per_second: 20.694
  experiment_id: cab204b0b71b4a36ba8ad990e4f89560
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.76215406162465
  pid: 3823839
  time_since_restore: 41.871798515319824
  time_this_iter_s: 41.871798515319824
  time_total_s: 85.47899460792542
  timestamp: 1666147409
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00000
  warmup_time: 0.011772394180297852
  
(_objective pid=3823839) {'eval_loss': 0.32768750190734863, 'eval_accuracy': 0.877, 'eval_f1': 0.8851540616246499, 'eval_runtime': 6.0404, 'eval_samples_per_second': 165.553, 'eval_steps_per_second': 20.694, 'epoch': 1.61}


                                               
 81%|████████  | 50/62 [00:38<00:07,  1.56it/s]
100%|██████████| 125/125 [00:05<00:00, 24.41it/s]
 82%|████████▏ | 51/62 [00:38<00:26,  2.45s/it]
 84%|████████▍ | 52/62 [00:39<00:19,  1.91s/it]
 85%|████████▌ | 53/62 [00:40<00:13,  1.53s/it]
 87%|████████▋ | 54/62 [00:40<00:10,  1.26s/it]
 89%|████████▊ | 55/62 [00:41<00:07,  1.08s/it]
 90%|█████████ | 56/62 [00:42<00:05,  1.06it/s]
 92%|█████████▏| 57/62 [00:42<00:04,  1.17it/s]


 94%|█████████▎| 58/62 [00:43<00:03,  1.27it/s]
 95%|█████████▌| 59/62 [00:43<00:02,  1.34it/s]
 97%|█████████▋| 60/62 [00:44<00:01,  1.40it/s]
 98%|█████████▊| 61/62 [00:45<00:00,  1.45it/s]


Result for _objective_2ed07_00000:
  date: 2022-10-19_02-43-29
  done: true
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.877
  eval_f1: 0.8851540616246499
  eval_loss: 0.32768750190734863
  eval_runtime: 6.0404
  eval_samples_per_second: 165.553
  eval_steps_per_second: 20.694
  experiment_id: cab204b0b71b4a36ba8ad990e4f89560
  experiment_tag: 0_num_train_epochs=2
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.76215406162465
  pid: 3823839
  time_since_restore: 41.871798515319824
  time_this_iter_s: 41.871798515319824
  time_total_s: 85.47899460792542
  timestamp: 1666147409
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00000
  warmup_time: 0.011772394180297852
  
(_objective pid=3823839) {'train_runtime': 46.2012, 'train_samples_per_second': 43.289, 'train_steps_per_second': 1.342, 'train_loss': 0.5555573125039378, 'epoch': 1.99}


100%|██████████| 62/62 [00:45<00:00,  1.35it/s]
(pid=3824122) 2022-10-19 02:43:38.673198: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
(_objective pid=3824122) 2022-10-19 02:43:39,626	INFO trainable.py:668 -- Restored on 172.17.0.3 from checkpoint: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt/_objective_2ed07_00001_1_num_train_epochs=5_2022-10-19_02-39-45/checkpoint_tmpa21a9e
(_objective pid=3824122) 2022-10-19 02:43:39,626	INFO trainable.py:677 -- Current state after restoring: {'_iteration': 0, '_timesteps_total': 0, '_time_total': 42.13527297973633, '_episodes_total': 0}


== Status ==
Current time: 2022-10-19 02:43:42 (running for 00:04:43.87)
Memory usage on this node: 12.8/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

(_objective pid=3824122) Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.decoder.weight', 'lm_head.decoder.bias', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.bias']
(_objective pid=3824122) - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=3824122) - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=3824122) Some weights of RobertaForSequenceClassification were no

  5%|▍         | 7/155 [00:04<01:34,  1.56it/s]
  5%|▌         | 8/155 [00:05<01:33,  1.56it/s]
  6%|▌         | 9/155 [00:05<01:33,  1.56it/s]
  6%|▋         | 10/155 [00:06<01:32,  1.56it/s]
  7%|▋         | 11/155 [00:07<01:32,  1.56it/s]
  8%|▊         | 12/155 [00:07<01:31,  1.56it/s]
  8%|▊         | 13/155 [00:08<01:30,  1.56it/s]


  9%|▉         | 14/155 [00:08<01:30,  1.56it/s]
 10%|▉         | 15/155 [00:09<01:29,  1.56it/s]
 10%|█         | 16/155 [00:10<01:28,  1.56it/s]
 11%|█         | 17/155 [00:10<01:28,  1.56it/s]
 12%|█▏        | 18/155 [00:11<01:27,  1.56it/s]
 12%|█▏        | 19/155 [00:12<01:26,  1.56it/s]
 13%|█▎        | 20/155 [00:12<01:26,  1.56it/s]
 14%|█▎        | 21/155 [00:13<01:25,  1.56it/s]


 14%|█▍        | 22/155 [00:14<01:24,  1.56it/s]
 15%|█▍        | 23/155 [00:14<01:24,  1.56it/s]
 15%|█▌        | 24/155 [00:15<01:23,  1.56it/s]
 16%|█▌        | 25/155 [00:15<01:23,  1.56it/s]
 17%|█▋        | 26/155 [00:16<01:22,  1.56it/s]
 17%|█▋        | 27/155 [00:17<01:21,  1.56it/s]
 18%|█▊        | 28/155 [00:17<01:21,  1.56it/s]
 19%|█▊        | 29/155 [00:18<01:20,  1.56it/s]


 19%|█▉        | 30/155 [00:19<01:19,  1.56it/s]
 20%|██        | 31/155 [00:19<01:19,  1.56it/s]
 21%|██        | 32/155 [00:20<01:24,  1.46it/s]
 21%|██▏       | 33/155 [00:21<01:21,  1.49it/s]
 22%|██▏       | 34/155 [00:21<01:20,  1.51it/s]
 23%|██▎       | 35/155 [00:22<01:18,  1.53it/s]
 23%|██▎       | 36/155 [00:23<01:17,  1.54it/s]
 24%|██▍       | 37/155 [00:23<01:16,  1.55it/s]


 25%|██▍       | 38/155 [00:24<01:15,  1.55it/s]
 25%|██▌       | 39/155 [00:25<01:14,  1.56it/s]
 26%|██▌       | 40/155 [00:25<01:13,  1.56it/s]
 26%|██▋       | 41/155 [00:26<01:13,  1.56it/s]
 27%|██▋       | 42/155 [00:27<01:12,  1.56it/s]
 28%|██▊       | 43/155 [00:27<01:11,  1.56it/s]
 28%|██▊       | 44/155 [00:28<01:11,  1.56it/s]
 29%|██▉       | 45/155 [00:28<01:10,  1.56it/s]


 30%|██▉       | 46/155 [00:29<01:09,  1.56it/s]
 30%|███       | 47/155 [00:30<01:09,  1.56it/s]
 31%|███       | 48/155 [00:30<01:08,  1.56it/s]
 32%|███▏      | 49/155 [00:31<01:07,  1.56it/s]
 32%|███▏      | 50/155 [00:32<01:07,  1.56it/s]
(_objective pid=3824122)
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.46it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.16it/s]
  9%|▉         | 11/125 [00:00<00:04, 26.01it/s]
 11%|█         | 14/125 [00:00<00:04, 25.43it/s]
 14%|█▎        | 17/125 [00:00<00:04, 25.09it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.87it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.69it/s]

 40%|████      | 50/125 [00:02<00:03, 24.24it/s]
 42%|████▏     | 53/125 [00:02<00:02, 24.27it/s]
 45%|████▍     | 56/125 [00:02<00:02, 24.26it/s]
 47%|████▋     | 59/125 [00:02<00:02, 24.29it/s]
 50%|████▉     | 62/125 [00:02<00:02, 24.33it/s]
 52%|█████▏    | 65/125 [00:02<00:02, 24.35it/s]
 54%|█████▍    | 68/125 [00:02<00:02, 24.37it/s]
 57%|█████▋    | 71/125 [00:02<00:02, 24.37it/s]
 59%|█████▉    | 74/125 [00:03<00:02, 24.29it/s]
 62%|██████▏   | 77/125 [00:03<00:01, 24.31it/s]
 64%|██████▍   | 80/125 [00:03<00:01, 24.30it/s]

Result for _objective_2ed07_00001:
  date: 2022-10-19_02-44-21
  done: false
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.873
  eval_f1: 0.8800755429650613
  eval_loss: 0.33460041880607605
  eval_runtime: 6.0618
  eval_samples_per_second: 164.968
  eval_steps_per_second: 20.621
  experiment_id: 037f0829187c4e36ae81f75b0b213481
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.7530755429650613
  pid: 3824122
  time_since_restore: 41.890462160110474
  time_this_iter_s: 41.890462160110474
  time_total_s: 84.0257351398468
  timestamp: 1666147461
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00001
  warmup_time: 0.0033164024353027344
  
(_objective pid=3824122) {'eval_loss': 0.33460041880607605, 'eval_accuracy': 0.873, 'eval_f1': 0.8800755429650613, 'eval_runtime': 6.0618, 'eval_samples_per_second': 164.968, 'eval_steps_per_second': 20.621, 'epoch': 1.61}


                                                
 32%|███▏      | 50/155 [00:38<01:07,  1.56it/s]
100%|██████████| 125/125 [00:06<00:00, 24.34it/s]
 33%|███▎      | 51/155 [00:38<04:15,  2.46s/it]
 34%|███▎      | 52/155 [00:39<03:17,  1.91s/it]
 34%|███▍      | 53/155 [00:40<02:36,  1.53s/it]
 35%|███▍      | 54/155 [00:40<02:07,  1.26s/it]
 35%|███▌      | 55/155 [00:41<01:47,  1.08s/it]
 36%|███▌      | 56/155 [00:42<01:33,  1.06it/s]
 37%|███▋      | 57/155 [00:42<01:23,  1.17it/s]


 37%|███▋      | 58/155 [00:43<01:16,  1.26it/s]
 38%|███▊      | 59/155 [00:43<01:11,  1.34it/s]
 39%|███▊      | 60/155 [00:44<01:07,  1.40it/s]
 39%|███▉      | 61/155 [00:45<01:05,  1.44it/s]
 40%|████      | 62/155 [00:45<01:02,  1.48it/s]
 41%|████      | 63/155 [00:46<01:05,  1.40it/s]
 41%|████▏     | 64/155 [00:47<01:02,  1.45it/s]
 42%|████▏     | 65/155 [00:47<01:00,  1.48it/s]


 43%|████▎     | 66/155 [00:48<00:59,  1.50it/s]
 43%|████▎     | 67/155 [00:49<00:57,  1.52it/s]
 44%|████▍     | 68/155 [00:49<00:56,  1.53it/s]
 45%|████▍     | 69/155 [00:50<00:55,  1.54it/s]
 45%|████▌     | 70/155 [00:51<00:54,  1.55it/s]
 46%|████▌     | 71/155 [00:51<00:54,  1.55it/s]
 46%|████▋     | 72/155 [00:52<00:53,  1.55it/s]
 47%|████▋     | 73/155 [00:53<00:52,  1.55it/s]


 48%|████▊     | 74/155 [00:53<00:52,  1.56it/s]
 48%|████▊     | 75/155 [00:54<00:51,  1.56it/s]
 49%|████▉     | 76/155 [00:55<00:50,  1.56it/s]
 50%|████▉     | 77/155 [00:55<00:50,  1.56it/s]
 50%|█████     | 78/155 [00:56<00:49,  1.56it/s]
 51%|█████     | 79/155 [00:56<00:48,  1.56it/s]
 52%|█████▏    | 80/155 [00:57<00:48,  1.56it/s]


 52%|█████▏    | 81/155 [00:58<00:47,  1.56it/s]
 53%|█████▎    | 82/155 [00:58<00:46,  1.56it/s]
 54%|█████▎    | 83/155 [00:59<00:46,  1.56it/s]
 54%|█████▍    | 84/155 [01:00<00:45,  1.56it/s]
 55%|█████▍    | 85/155 [01:00<00:44,  1.56it/s]
 55%|█████▌    | 86/155 [01:01<00:44,  1.56it/s]
 56%|█████▌    | 87/155 [01:02<00:43,  1.56it/s]
 57%|█████▋    | 88/155 [01:02<00:42,  1.56it/s]


 57%|█████▋    | 89/155 [01:03<00:42,  1.56it/s]
 58%|█████▊    | 90/155 [01:03<00:41,  1.56it/s]
 59%|█████▊    | 91/155 [01:04<00:41,  1.56it/s]
 59%|█████▉    | 92/155 [01:05<00:40,  1.56it/s]
 60%|██████    | 93/155 [01:05<00:39,  1.56it/s]
 61%|██████    | 94/155 [01:06<00:41,  1.45it/s]
 61%|██████▏   | 95/155 [01:07<00:40,  1.48it/s]
 62%|██████▏   | 96/155 [01:07<00:39,  1.51it/s]


 63%|██████▎   | 97/155 [01:08<00:38,  1.52it/s]
 63%|██████▎   | 98/155 [01:09<00:37,  1.53it/s]
 64%|██████▍   | 99/155 [01:09<00:36,  1.54it/s]
 65%|██████▍   | 100/155 [01:10<00:35,  1.55it/s]
  0%|          | 0/125 [00:00<?, ?it/s][A
[2m[36m(_objective pid=3824122)[0m 
  3%|▎         | 4/125 [00:00<00:03, 31.94it/s][A
[2m[36m(_objective pid=3824122)[0m 
  6%|▋         | 8/125 [00:00<00:04, 26.98it/s][A
[2m[36m(_objective pid=3824122)[0m 
  9%|▉         | 11/125 [00:00<00:04, 25.93it/s][A
[2m[36m(_objective pid=3824122)[0m 
 11%|█         | 14/125 [00:00<00:04, 25.34it/s][A
[2m[36m(_objective pid=3824122)[0m 
 14%|█▎        | 17/125 [00:00<00:04, 25.00it/s][A
[2m[36m(_objective pid=3824122)[0m 
 16%|█▌        | 20/125 [00:00<00:04, 24.77it/s][A
[2m[36m(_objective pid=3824122)[0m 
 18%|█▊        | 23/125 [00:00<00:04, 24.61it/s][A
[2m[36m(_objective pid=3824122)[0m 
 21%|██        | 26/125 [00:01<00:04, 24.54it/s][A
[2m[36m(_objective pid=3824122)

== Status ==
Current time: 2022-10-19 02:44:56 (running for 00:05:58.09)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 1 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 54%|█████▍    | 68/125 [00:02<00:02, 24.26it/s]
 57%|█████▋    | 71/125 [00:02<00:02, 24.25it/s]
 59%|█████▉    | 74/125 [00:03<00:02, 24.31it/s]
 62%|██████▏   | 77/125 [00:03<00:01, 24.34it/s]
 64%|██████▍   | 80/125 [00:03<00:01, 24.26it/s]
 66%|██████▋   | 83/125 [00:03<00:01, 24.30it/s]
 69%|██████▉   | 86/125 [00:03<00:01, 24.21it/s]
 71%|███████   | 89/125 [00:03<00:01, 24.25it/s]
 74%|███████▎  | 92/125 [00:03<00:01, 24.30it/s]
 76%|███████▌  | 95/125 [00:03<00:01, 24.27it/s]
 78%|███████▊  | 98/125 [00:03<00:01, 24.32it/s]

Result for _objective_2ed07_00001:
  date: 2022-10-19_02-45-00
  done: false
  episodes_total: 0
  epoch: 3.22
  eval_accuracy: 0.949
  eval_f1: 0.9524697110904008
  eval_loss: 0.18202358484268188
  eval_runtime: 7.0138
  eval_samples_per_second: 142.576
  eval_steps_per_second: 17.822
  experiment_id: 037f0829187c4e36ae81f75b0b213481
  hostname: 3481a8a2ae33
  iterations_since_restore: 2
  node_ip: 172.17.0.3
  objective: 1.9014697110904009
  pid: 3824122
  time_since_restore: 81.26398229598999
  time_this_iter_s: 39.37352013587952
  time_total_s: 123.39925527572632
  timestamp: 1666147500
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 2
  trial_id: 2ed07_00001
  warmup_time: 0.0033164024353027344
  
(_objective pid=3824122) {'eval_loss': 0.18202358484268188, 'eval_accuracy': 0.949, 'eval_f1': 0.9524697110904008, 'eval_runtime': 7.0138, 'eval_samples_per_second': 142.576, 'eval_steps_per_second': 17.822, 'epoch': 3.22}


(pid=3824521) 2022-10-19 02:45:02.605693: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
(_objective pid=3824521) 2022-10-19 02:45:03,553	INFO trainable.py:668 -- Restored on 172.17.0.3 from checkpoint: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt/_objective_2ed07_00002_2_num_train_epochs=5_2022-10-19_02-40-30/checkpoint_tmp87c45f
(_objective pid=3824521) 2022-10-19 02:45:03,553	INFO trainable.py:677 -- Current state after restoring: {'_iteration': 0, '_timesteps_total': 0, '_time_total': 42.03655242919922, '_episodes_total': 0}


== Status ==
Current time: 2022-10-19 02:45:06 (running for 00:06:07.88)
Memory usage on this node: 13.0/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

(_objective pid=3824521) Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.decoder.weight', 'lm_head.bias', 'lm_head.decoder.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.bias']
(_objective pid=3824521) - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=3824521) - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=3824521) Some weights of RobertaForSequenceClassification were no

== Status ==
Current time: 2022-10-19 02:45:11 (running for 00:06:12.88)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

  5%|▍         | 7/155 [00:04<01:34,  1.56it/s]
  5%|▌         | 8/155 [00:05<01:33,  1.56it/s]
  6%|▌         | 9/155 [00:05<01:33,  1.56it/s]
  6%|▋         | 10/155 [00:06<01:32,  1.56it/s]
  7%|▋         | 11/155 [00:07<01:32,  1.56it/s]
  8%|▊         | 12/155 [00:07<01:31,  1.56it/s]
  8%|▊         | 13/155 [00:08<01:30,  1.56it/s]
  9%|▉         | 14/155 [00:08<01:30,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:45:16 (running for 00:06:17.88)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 10%|▉         | 15/155 [00:09<01:29,  1.56it/s]
 10%|█         | 16/155 [00:10<01:28,  1.56it/s]
 11%|█         | 17/155 [00:10<01:28,  1.56it/s]
 12%|█▏        | 18/155 [00:11<01:27,  1.56it/s]
 12%|█▏        | 19/155 [00:12<01:26,  1.57it/s]
 13%|█▎        | 20/155 [00:12<01:26,  1.57it/s]
 14%|█▎        | 21/155 [00:13<01:25,  1.57it/s]


== Status ==
Current time: 2022-10-19 02:45:21 (running for 00:06:22.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 14%|█▍        | 22/155 [00:14<01:24,  1.57it/s]
 15%|█▍        | 23/155 [00:14<01:24,  1.56it/s]
 15%|█▌        | 24/155 [00:15<01:23,  1.56it/s]
 16%|█▌        | 25/155 [00:15<01:23,  1.56it/s]
 17%|█▋        | 26/155 [00:16<01:22,  1.56it/s]
 17%|█▋        | 27/155 [00:17<01:21,  1.56it/s]
 18%|█▊        | 28/155 [00:17<01:21,  1.56it/s]
 19%|█▊        | 29/155 [00:18<01:20,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:45:26 (running for 00:06:27.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 19%|█▉        | 30/155 [00:19<01:19,  1.56it/s]
 20%|██        | 31/155 [00:19<01:19,  1.56it/s]
 21%|██        | 32/155 [00:20<01:24,  1.46it/s]
 21%|██▏       | 33/155 [00:21<01:21,  1.49it/s]
 22%|██▏       | 34/155 [00:21<01:20,  1.51it/s]
 23%|██▎       | 35/155 [00:22<01:18,  1.53it/s]
 23%|██▎       | 36/155 [00:23<01:17,  1.54it/s]
 24%|██▍       | 37/155 [00:23<01:16,  1.54it/s]


== Status ==
Current time: 2022-10-19 02:45:31 (running for 00:06:32.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 25%|██▍       | 38/155 [00:24<01:15,  1.55it/s]
 25%|██▌       | 39/155 [00:25<01:14,  1.55it/s]
 26%|██▌       | 40/155 [00:25<01:13,  1.55it/s]
 26%|██▋       | 41/155 [00:26<01:13,  1.56it/s]
 27%|██▋       | 42/155 [00:27<01:12,  1.56it/s]
 28%|██▊       | 43/155 [00:27<01:11,  1.56it/s]
 28%|██▊       | 44/155 [00:28<01:11,  1.56it/s]
 29%|██▉       | 45/155 [00:28<01:10,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:45:36 (running for 00:06:37.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 30%|██▉       | 46/155 [00:29<01:09,  1.56it/s]
 30%|███       | 47/155 [00:30<01:09,  1.56it/s]
 31%|███       | 48/155 [00:30<01:08,  1.56it/s]
 32%|███▏      | 49/155 [00:31<01:08,  1.56it/s]
 32%|███▏      | 50/155 [00:32<01:07,  1.56it/s]
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.25it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.09it/s]
  9%|▉         | 11/125 [00:00<00:04, 25.96it/s]
 11%|█         | 14/125 [00:00<00:04, 25.39it/s]
 14%|█▎        | 17/125 [00:00<00:04, 25.04it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.82it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.70it/s]

== Status ==
Current time: 2022-10-19 02:45:41 (running for 00:06:42.89)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 38%|███▊      | 47/125 [00:01<00:03, 24.42it/s]
 40%|████      | 50/125 [00:02<00:03, 24.41it/s]
 42%|████▏     | 53/125 [00:02<00:02, 24.27it/s]
 45%|████▍     | 56/125 [00:02<00:02, 24.32it/s]
 47%|████▋     | 59/125 [00:02<00:02, 24.22it/s]
 50%|████▉     | 62/125 [00:02<00:02, 24.27it/s]
 52%|█████▏    | 65/125 [00:02<00:02, 24.27it/s]
 54%|█████▍    | 68/125 [00:02<00:02, 24.33it/s]
 57%|█████▋    | 71/125 [00:02<00:02, 24.36it/s]
 59%|█████▉    | 74/125 [00:02<00:02, 24.36it/s]
 62%|██████▏   | 77/125 [00:03<00:01, 24.34it/s]

Result for _objective_2ed07_00002:
  date: 2022-10-19_02-45-45
  done: false
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.833
  eval_f1: 0.8258602711157456
  eval_loss: 0.4038117229938507
  eval_runtime: 6.3875
  eval_samples_per_second: 156.555
  eval_steps_per_second: 19.569
  experiment_id: aab49e31c6fa43f48990833c86d7c485
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.6588602711157456
  pid: 3824521
  time_since_restore: 42.269349575042725
  time_this_iter_s: 42.269349575042725
  time_total_s: 84.30590200424194
  timestamp: 1666147545
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00002
  warmup_time: 0.003549337387084961
  
(_objective pid=3824521) {'eval_loss': 0.4038117229938507, 'eval_accuracy': 0.833, 'eval_f1': 0.8258602711157456, 'eval_runtime': 6.3875, 'eval_samples_per_second': 156.555, 'eval_steps_per_second': 19.569, 'epoch': 1.61}


                                                
 32%|███▏      | 50/155 [00:38<01:07,  1.56it/s]
100%|██████████| 125/125 [00:06<00:00, 24.29it/s]
 33%|███▎      | 51/155 [00:39<04:26,  2.56s/it]
 34%|███▎      | 52/155 [00:39<03:24,  1.98s/it]
(_objective pid=3824521)   nn.utils.clip_grad_norm_(
 34%|███▍      | 53/155 [00:40<02:40,  1.58s/it]
 35%|███▍      | 54/155 [00:41<02:10,  1.30s/it]
 35%|███▌      | 55/155 [00:41<01:49,  1.10s/it]
 36%|███▌      | 56/155 [00:42<01:35,  1.04it/s]
 37%|███▋      | 57/155 [00:43<01:24,  1.16it/s]


== Status ==
Current time: 2022-10-19 02:45:50 (running for 00:06:52.38)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 37%|███▋      | 58/155 [00:43<01:17,  1.25it/s]
 38%|███▊      | 59/155 [00:44<01:12,  1.33it/s]
 39%|███▊      | 60/155 [00:44<01:08,  1.39it/s]
 39%|███▉      | 61/155 [00:45<01:05,  1.44it/s]
 40%|████      | 62/155 [00:46<01:03,  1.47it/s]
 41%|████      | 63/155 [00:47<01:05,  1.40it/s]
 41%|████▏     | 64/155 [00:47<01:02,  1.44it/s]
 42%|████▏     | 65/155 [00:48<01:00,  1.48it/s]


== Status ==
Current time: 2022-10-19 02:45:55 (running for 00:06:57.38)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 43%|████▎     | 66/155 [00:48<00:59,  1.50it/s]
 43%|████▎     | 67/155 [00:49<00:57,  1.52it/s]
 44%|████▍     | 68/155 [00:50<00:56,  1.53it/s]
 45%|████▍     | 69/155 [00:50<00:55,  1.54it/s]
 45%|████▌     | 70/155 [00:51<00:55,  1.55it/s]
 46%|████▌     | 71/155 [00:52<00:54,  1.55it/s]
 46%|████▋     | 72/155 [00:52<00:53,  1.55it/s]
 47%|████▋     | 73/155 [00:53<00:52,  1.55it/s]


== Status ==
Current time: 2022-10-19 02:46:00 (running for 00:07:02.38)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 48%|████▊     | 74/155 [00:54<00:52,  1.56it/s]
 48%|████▊     | 75/155 [00:54<00:51,  1.56it/s]
 49%|████▉     | 76/155 [00:55<00:50,  1.56it/s]
 50%|████▉     | 77/155 [00:55<00:50,  1.56it/s]
 50%|█████     | 78/155 [00:56<00:49,  1.56it/s]
 51%|█████     | 79/155 [00:57<00:48,  1.56it/s]
 52%|█████▏    | 80/155 [00:57<00:48,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:46:05 (running for 00:07:07.38)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 52%|█████▏    | 81/155 [00:58<00:47,  1.56it/s]
 53%|█████▎    | 82/155 [00:59<00:46,  1.56it/s]
 54%|█████▎    | 83/155 [00:59<00:46,  1.56it/s]
 54%|█████▍    | 84/155 [01:00<00:45,  1.56it/s]
 55%|█████▍    | 85/155 [01:01<00:44,  1.56it/s]
 55%|█████▌    | 86/155 [01:01<00:44,  1.56it/s]
 56%|█████▌    | 87/155 [01:02<00:43,  1.56it/s]
 57%|█████▋    | 88/155 [01:03<00:42,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:46:10 (running for 00:07:12.39)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 57%|█████▋    | 89/155 [01:03<00:42,  1.56it/s]
 58%|█████▊    | 90/155 [01:04<00:41,  1.56it/s]
 59%|█████▊    | 91/155 [01:04<00:40,  1.56it/s]
 59%|█████▉    | 92/155 [01:05<00:40,  1.56it/s]
 60%|██████    | 93/155 [01:06<00:39,  1.56it/s]
 61%|██████    | 94/155 [01:07<00:41,  1.46it/s]
 61%|██████▏   | 95/155 [01:07<00:40,  1.48it/s]
 62%|██████▏   | 96/155 [01:08<00:39,  1.51it/s]


== Status ==
Current time: 2022-10-19 02:46:15 (running for 00:07:17.39)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 63%|██████▎   | 97/155 [01:08<00:38,  1.52it/s]
 63%|██████▎   | 98/155 [01:09<00:37,  1.53it/s]
 64%|██████▍   | 99/155 [01:10<00:36,  1.54it/s]
 65%|██████▍   | 100/155 [01:10<00:35,  1.55it/s]
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.43it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.00it/s]
  9%|▉         | 11/125 [00:00<00:04, 25.92it/s]
 11%|█         | 14/125 [00:00<00:04, 25.37it/s]
 14%|█▎        | 17/125 [00:00<00:04, 24.90it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.75it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.51it/s]
 21%|██        | 26/125 [00:01<00:04, 24.48it/s]

== Status ==
Current time: 2022-10-19 02:46:20 (running for 00:07:22.39)
Memory usage on this node: 13.2/31.1 GiB
PopulationBasedTraining: 2 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 52%|█████▏    | 65/125 [00:02<00:02, 23.83it/s]
 54%|█████▍    | 68/125 [00:02<00:02, 24.00it/s]
 57%|█████▋    | 71/125 [00:02<00:02, 24.13it/s]
 59%|█████▉    | 74/125 [00:03<00:02, 24.23it/s]
 62%|██████▏   | 77/125 [00:03<00:01, 24.28it/s]
 64%|██████▍   | 80/125 [00:03<00:01, 24.29it/s]
 66%|██████▋   | 83/125 [00:03<00:01, 24.33it/s]
 69%|██████▉   | 86/125 [00:03<00:01, 24.36it/s]
 71%|███████   | 89/125 [00:03<00:01, 24.37it/s]
 74%|███████▎  | 92/125 [00:03<00:01, 24.40it/s]
 76%|███████▌  | 95/125 [00:03<00:01, 24.41it/s]

Result for _objective_2ed07_00002:
  date: 2022-10-19_02-46-24
  done: false
  episodes_total: 0
  epoch: 3.22
  eval_accuracy: 0.951
  eval_f1: 0.9533777354900095
  eval_loss: 0.16220815479755402
  eval_runtime: 6.4603
  eval_samples_per_second: 154.792
  eval_steps_per_second: 19.349
  experiment_id: aab49e31c6fa43f48990833c86d7c485
  hostname: 3481a8a2ae33
  iterations_since_restore: 2
  node_ip: 172.17.0.3
  objective: 1.9043777354900095
  pid: 3824521
  time_since_restore: 81.0607078075409
  time_this_iter_s: 38.79135823249817
  time_total_s: 123.09726023674011
  timestamp: 1666147584
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 2
  trial_id: 2ed07_00002
  warmup_time: 0.003549337387084961
  
(_objective pid=3824521) {'eval_loss': 0.16220815479755402, 'eval_accuracy': 0.951, 'eval_f1': 0.9533777354900095, 'eval_runtime': 6.4603, 'eval_samples_per_second': 154.792, 'eval_steps_per_second': 19.349, 'epoch': 3.22}


                                                 
 65%|██████▍   | 100/155 [01:17<00:35,  1.55it/s]
100%|██████████| 125/125 [00:06<00:00, 24.41it/s]
 65%|██████▍   | 100/155 [01:17<00:42,  1.29it/s]
(pid=3824952) 2022-10-19 02:46:26.617855: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
(_objective pid=3824952) 2022-10-19 02:46:27,567	INFO trainable.py:668 -- Restored on 172.17.0.3 from checkpoint: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt/_objective_2ed07_00003_3_num_train_epochs=2_2022-10-19_02-41-15/checkpoint_tmp0707f9
(_objective pid=3824952) 2022-10-19 02:46:27,567	INFO trainable.py:677 -- Current state after restoring: {'_iteration': 0, '_timesteps_total': 0, '_time_total': 41.94896078109741, '_episodes_total': 0}


== Status ==
Current time: 2022-10-19 02:46:30 (running for 00:07:31.90)
Memory usage on this node: 13.0/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (3 PAUSED, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

(_objective pid=3824952) Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.layer_norm.bias', 'lm_head.layer_norm.weight', 'lm_head.decoder.weight', 'lm_head.dense.weight', 'lm_head.decoder.bias', 'lm_head.bias', 'lm_head.dense.bias']
(_objective pid=3824952) - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=3824952) - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=3824952) Some weights of RobertaForSequenceClassification were no

Result for _objective_2ed07_00003:
  date: 2022-10-19_02-47-09
  done: false
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.889
  eval_f1: 0.89478672985782
  eval_loss: 0.3029094934463501
  eval_runtime: 6.2637
  eval_samples_per_second: 159.649
  eval_steps_per_second: 19.956
  experiment_id: 8919d84288714d75a6c2438ecc0bf4bb
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.78378672985782
  pid: 3824952
  time_since_restore: 42.1159942150116
  time_this_iter_s: 42.1159942150116
  time_total_s: 84.06495499610901
  timestamp: 1666147629
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00003
  warmup_time: 0.0034630298614501953

(_objective pid=3824952) {'eval_loss': 0.3029094934463501, 'eval_accuracy': 0.889, 'eval_f1': 0.89478672985782, 'eval_runtime': 6.2637, 'eval_samples_per_second': 159.649, 'eval_steps_per_second': 19.956, 'epoch': 1.61}

Result for _objective_2ed07_00003:
  date: 2022-10-19_02-47-09
  done: true
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.889
  eval_f1: 0.89478672985782
  eval_loss: 0.3029094934463501
  eval_runtime: 6.2637
  eval_samples_per_second: 159.649
  eval_steps_per_second: 19.956
  experiment_id: 8919d84288714d75a6c2438ecc0bf4bb
  experiment_tag: 3_num_train_epochs=2
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.78378672985782
  pid: 3824952
  time_since_restore: 42.1159942150116
  time_this_iter_s: 42.1159942150116
  time_total_s: 84.06495499610901
  timestamp: 1666147629
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00003
  warmup_time: 0.0034630298614501953

(_objective pid=3824952) {'train_runtime': 46.4465, 'train_samples_per_second': 43.06, 'train_steps_per_second': 1.335, 'train_loss': 0.4954779532647902, 'epoch': 1.99}

(pid=3825234) 2022-10-19 02:47:19.621097: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
(_objective pid=3825234) 2022-10-19 02:47:20,589	INFO trainable.py:668 -- Restored on 172.17.0.3 from checkpoint: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt/_objective_2ed07_00004_4_num_train_epochs=2_2022-10-19_02-42-00/checkpoint_tmp20b85c
(_objective pid=3825234) 2022-10-19 02:47:20,589	INFO trainable.py:677 -- Current state after restoring: {'_iteration': 0, '_timesteps_total': 0, '_time_total': 41.87562370300293, '_episodes_total': 0}


== Status ==
Current time: 2022-10-19 02:47:23 (running for 00:08:24.90)
Memory usage on this node: 12.9/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (2 PAUSED, 1 RUNNING, 2 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

(_objective pid=3825234) Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.layer_norm.weight', 'lm_head.dense.bias', 'lm_head.bias', 'lm_head.decoder.weight', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.bias']
(_objective pid=3825234) - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=3825234) - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=3825234) Some weights of RobertaForSequenceClassification were no

Result for _objective_2ed07_00004:
  date: 2022-10-19_02-48-02
  done: false
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.868
  eval_f1: 0.8784530386740331
  eval_loss: 0.34668323397636414
  eval_runtime: 6.3993
  eval_samples_per_second: 156.268
  eval_steps_per_second: 19.534
  experiment_id: e57f5f92dd414797867a8252b4f4e5f4
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.7464530386740331
  pid: 3825234
  time_since_restore: 42.302751541137695
  time_this_iter_s: 42.302751541137695
  time_total_s: 84.17837524414062
  timestamp: 1666147682
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00004
  warmup_time: 0.0036301612854003906

(_objective pid=3825234) {'eval_loss': 0.34668323397636414, 'eval_accuracy': 0.868, 'eval_f1': 0.8784530386740331, 'eval_runtime': 6.3993, 'eval_samples_per_second': 156.268, 'eval_steps_per_second': 19.534, 'epoch': 1.61}

Result for _objective_2ed07_00004:
  date: 2022-10-19_02-48-02
  done: true
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.868
  eval_f1: 0.8784530386740331
  eval_loss: 0.34668323397636414
  eval_runtime: 6.3993
  eval_samples_per_second: 156.268
  eval_steps_per_second: 19.534
  experiment_id: e57f5f92dd414797867a8252b4f4e5f4
  experiment_tag: 4_num_train_epochs=2
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.7464530386740331
  pid: 3825234
  time_since_restore: 42.302751541137695
  time_this_iter_s: 42.302751541137695
  time_total_s: 84.17837524414062
  timestamp: 1666147682
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00004
  warmup_time: 0.0036301612854003906

(_objective pid=3825234) {'train_runtime': 46.6196, 'train_samples_per_second': 42.9, 'train_steps_per_second': 1.33, 'train_loss': 0.528211901264806, 'epoch': 1.99}

(pid=3825535) 2022-10-19 02:48:12.619557: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
(_objective pid=3825535) 2022-10-19 02:48:13,573	INFO trainable.py:668 -- Restored on 172.17.0.3 from checkpoint: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt/_objective_2ed07_00001_1_num_train_epochs=5_2022-10-19_02-39-45/checkpoint_tmpfa7ce5
(_objective pid=3825535) 2022-10-19 02:48:13,573	INFO trainable.py:677 -- Current state after restoring: {'_iteration': 0, '_timesteps_total': 0, '_time_total': 123.39925527572632, '_episodes_total': 0}


== Status ==
Current time: 2022-10-19 02:48:16 (running for 00:09:17.91)
Memory usage on this node: 12.9/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

(_objective pid=3825535) Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.decoder.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.bias', 'lm_head.dense.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.weight']
(_objective pid=3825535) - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=3825535) - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=3825535) Some weights of RobertaForSequenceClassification were no

== Status ==
Current time: 2022-10-19 02:48:21 (running for 00:09:22.91)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

  5%|▍         | 7/155 [00:04<01:34,  1.57it/s]
  5%|▌         | 8/155 [00:05<01:33,  1.57it/s]
  6%|▌         | 9/155 [00:05<01:33,  1.57it/s]
  6%|▋         | 10/155 [00:06<01:32,  1.57it/s]
  7%|▋         | 11/155 [00:07<01:31,  1.57it/s]
  8%|▊         | 12/155 [00:07<01:31,  1.57it/s]
  8%|▊         | 13/155 [00:08<01:30,  1.57it/s]
  9%|▉         | 14/155 [00:08<01:29,  1.57it/s]


== Status ==
Current time: 2022-10-19 02:48:26 (running for 00:09:27.91)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 10%|▉         | 15/155 [00:09<01:29,  1.57it/s]
 10%|█         | 16/155 [00:10<01:28,  1.57it/s]
 11%|█         | 17/155 [00:10<01:28,  1.57it/s]
 12%|█▏        | 18/155 [00:11<01:27,  1.57it/s]
 12%|█▏        | 19/155 [00:12<01:26,  1.56it/s]
 13%|█▎        | 20/155 [00:12<01:26,  1.56it/s]
 14%|█▎        | 21/155 [00:13<01:25,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:48:31 (running for 00:09:32.92)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 14%|█▍        | 22/155 [00:14<01:25,  1.56it/s]
 15%|█▍        | 23/155 [00:14<01:24,  1.56it/s]
 15%|█▌        | 24/155 [00:15<01:23,  1.56it/s]
 16%|█▌        | 25/155 [00:15<01:23,  1.56it/s]
 17%|█▋        | 26/155 [00:16<01:22,  1.56it/s]
 17%|█▋        | 27/155 [00:17<01:21,  1.56it/s]
 18%|█▊        | 28/155 [00:17<01:21,  1.56it/s]
 19%|█▊        | 29/155 [00:18<01:20,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:48:36 (running for 00:09:37.92)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 19%|█▉        | 30/155 [00:19<01:19,  1.56it/s]
 20%|██        | 31/155 [00:19<01:19,  1.56it/s]
 21%|██        | 32/155 [00:20<01:24,  1.46it/s]
 21%|██▏       | 33/155 [00:21<01:22,  1.49it/s]
 22%|██▏       | 34/155 [00:21<01:20,  1.51it/s]
 23%|██▎       | 35/155 [00:22<01:18,  1.53it/s]
 23%|██▎       | 36/155 [00:23<01:17,  1.54it/s]
 24%|██▍       | 37/155 [00:23<01:16,  1.54it/s]


== Status ==
Current time: 2022-10-19 02:48:41 (running for 00:09:42.92)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 25%|██▍       | 38/155 [00:24<01:15,  1.55it/s]
 25%|██▌       | 39/155 [00:25<01:14,  1.55it/s]
 26%|██▌       | 40/155 [00:25<01:13,  1.56it/s]
 26%|██▋       | 41/155 [00:26<01:13,  1.56it/s]
 27%|██▋       | 42/155 [00:27<01:12,  1.56it/s]
 28%|██▊       | 43/155 [00:27<01:11,  1.56it/s]
 28%|██▊       | 44/155 [00:28<01:11,  1.56it/s]
 29%|██▉       | 45/155 [00:28<01:10,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:48:46 (running for 00:09:47.92)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 30%|██▉       | 46/155 [00:29<01:09,  1.56it/s]
 30%|███       | 47/155 [00:30<01:09,  1.56it/s]
 31%|███       | 48/155 [00:30<01:08,  1.56it/s]
 32%|███▏      | 49/155 [00:31<01:07,  1.56it/s]
 32%|███▏      | 50/155 [00:32<01:07,  1.56it/s]
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.16it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.09it/s]
  9%|▉         | 11/125 [00:00<00:04, 25.99it/s]
 11%|█         | 14/125 [00:00<00:04, 25.37it/s]
 14%|█▎        | 17/125 [00:00<00:04, 25.04it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.83it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.68it/s]

== Status ==
Current time: 2022-10-19 02:48:51 (running for 00:09:52.93)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 40%|████      | 50/125 [00:02<00:03, 24.35it/s]
 42%|████▏     | 53/125 [00:02<00:02, 24.34it/s]
 45%|████▍     | 56/125 [00:02<00:02, 24.35it/s]
 47%|████▋     | 59/125 [00:02<00:02, 24.37it/s]
 50%|████▉     | 62/125 [00:02<00:02, 24.37it/s]
 52%|█████▏    | 65/125 [00:02<00:02, 24.37it/s]
 54%|█████▍    | 68/125 [00:02<00:02, 24.40it/s]
 57%|█████▋    | 71/125 [00:02<00:02, 24.42it/s]
 59%|█████▉    | 74/125 [00:02<00:02, 24.40it/s]
 62%|██████▏   | 77/125 [00:03<00:01, 24.37it/s]
 64%|██████▍   | 80/125 [00:03<00:01, 24.37it/s]

Result for _objective_2ed07_00001:
  date: 2022-10-19_02-48-55
  done: false
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.873
  eval_f1: 0.8800755429650613
  eval_loss: 0.33460041880607605
  eval_runtime: 6.0839
  eval_samples_per_second: 164.367
  eval_steps_per_second: 20.546
  experiment_id: 037f0829187c4e36ae81f75b0b213481
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.7530755429650613
  pid: 3825535
  time_since_restore: 41.90017247200012
  time_this_iter_s: 41.90017247200012
  time_total_s: 165.29942774772644
  timestamp: 1666147735
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00001
  warmup_time: 0.003434896469116211
  
(_objective pid=3825535) {'eval_loss': 0.33460041880607605, 'eval_accuracy': 0.873, 'eval_f1': 0.8800755429650613, 'eval_runtime': 6.0839, 'eval_samples_per_second': 164.367, 'eval_steps_per_second': 20.546, 'epoch': 1.61}


 32%|███▏      | 50/155 [00:38<01:07,  1.56it/s]
100%|██████████| 125/125 [00:06<00:00, 24.27it/s]
 33%|███▎      | 51/155 [00:38<04:16,  2.47s/it]
 34%|███▎      | 52/155 [00:39<03:17,  1.92s/it]
 34%|███▍      | 53/155 [00:40<02:36,  1.54s/it]
 35%|███▍      | 54/155 [00:40<02:07,  1.27s/it]
 35%|███▌      | 55/155 [00:41<01:47,  1.08s/it]
 36%|███▌      | 56/155 [00:42<01:33,  1.06it/s]
 37%|███▋      | 57/155 [00:42<01:23,  1.17it/s]


== Status ==
Current time: 2022-10-19 02:49:00 (running for 00:10:02.03)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 37%|███▋      | 58/155 [00:43<01:16,  1.26it/s]
 38%|███▊      | 59/155 [00:43<01:11,  1.34it/s]
 39%|███▊      | 60/155 [00:44<01:07,  1.40it/s]
 39%|███▉      | 61/155 [00:45<01:05,  1.45it/s]
 40%|████      | 62/155 [00:45<01:02,  1.48it/s]
 41%|████      | 63/155 [00:46<01:05,  1.40it/s]
 41%|████▏     | 64/155 [00:47<01:02,  1.45it/s]
 42%|████▏     | 65/155 [00:47<01:00,  1.48it/s]


== Status ==
Current time: 2022-10-19 02:49:05 (running for 00:10:07.03)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 43%|████▎     | 66/155 [00:48<00:59,  1.50it/s]
 43%|████▎     | 67/155 [00:49<00:57,  1.52it/s]
 44%|████▍     | 68/155 [00:49<00:56,  1.53it/s]
 45%|████▍     | 69/155 [00:50<00:55,  1.54it/s]
 45%|████▌     | 70/155 [00:51<00:54,  1.55it/s]
 46%|████▌     | 71/155 [00:51<00:54,  1.55it/s]
 46%|████▋     | 72/155 [00:52<00:53,  1.55it/s]
 47%|████▋     | 73/155 [00:53<00:52,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:49:10 (running for 00:10:12.04)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 48%|████▊     | 74/155 [00:53<00:52,  1.56it/s]
 48%|████▊     | 75/155 [00:54<00:51,  1.56it/s]
 49%|████▉     | 76/155 [00:55<00:50,  1.56it/s]
 50%|████▉     | 77/155 [00:55<00:49,  1.56it/s]
 50%|█████     | 78/155 [00:56<00:49,  1.56it/s]
 51%|█████     | 79/155 [00:56<00:48,  1.56it/s]
 52%|█████▏    | 80/155 [00:57<00:48,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:49:15 (running for 00:10:17.04)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 52%|█████▏    | 81/155 [00:58<00:47,  1.56it/s]
 53%|█████▎    | 82/155 [00:58<00:46,  1.56it/s]
 54%|█████▎    | 83/155 [00:59<00:46,  1.56it/s]
 54%|█████▍    | 84/155 [01:00<00:45,  1.56it/s]
 55%|█████▍    | 85/155 [01:00<00:44,  1.56it/s]
 55%|█████▌    | 86/155 [01:01<00:44,  1.56it/s]
 56%|█████▌    | 87/155 [01:02<00:43,  1.56it/s]
 57%|█████▋    | 88/155 [01:02<00:42,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:49:20 (running for 00:10:22.04)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 57%|█████▋    | 89/155 [01:03<00:42,  1.56it/s]
 58%|█████▊    | 90/155 [01:03<00:41,  1.56it/s]
 59%|█████▊    | 91/155 [01:04<00:40,  1.56it/s]
 59%|█████▉    | 92/155 [01:05<00:40,  1.56it/s]
 60%|██████    | 93/155 [01:05<00:39,  1.56it/s]
 61%|██████    | 94/155 [01:06<00:41,  1.46it/s]
 61%|██████▏   | 95/155 [01:07<00:40,  1.49it/s]
 62%|██████▏   | 96/155 [01:08<00:39,  1.51it/s]


== Status ==
Current time: 2022-10-19 02:49:25 (running for 00:10:27.04)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 63%|██████▎   | 97/155 [01:08<00:38,  1.52it/s]
 63%|██████▎   | 98/155 [01:09<00:37,  1.53it/s]
 64%|██████▍   | 99/155 [01:09<00:36,  1.54it/s]
 65%|██████▍   | 100/155 [01:10<00:35,  1.55it/s]
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.44it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.13it/s]
  9%|▉         | 11/125 [00:00<00:04, 25.99it/s]
 11%|█         | 14/125 [00:00<00:04, 25.41it/s]
 14%|█▎        | 17/125 [00:00<00:04, 25.01it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.81it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.67it/s]
 21%|██        | 26/125 [00:01<00:04, 24.58it/s

== Status ==
Current time: 2022-10-19 02:49:30 (running for 00:10:32.05)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 54%|█████▍    | 68/125 [00:02<00:02, 24.18it/s]
 57%|█████▋    | 71/125 [00:02<00:02, 24.26it/s]
 59%|█████▉    | 74/125 [00:03<00:02, 24.26it/s]
 62%|██████▏   | 77/125 [00:03<00:01, 24.30it/s]
 64%|██████▍   | 80/125 [00:03<00:01, 24.33it/s]
 66%|██████▋   | 83/125 [00:03<00:01, 24.36it/s]
 69%|██████▉   | 86/125 [00:03<00:01, 24.37it/s]
 71%|███████   | 89/125 [00:03<00:01, 24.36it/s]
 74%|███████▎  | 92/125 [00:03<00:01, 24.28it/s]
 76%|███████▌  | 95/125 [00:03<00:01, 24.31it/s]
 78%|███████▊  | 98/125 [00:03<00:01, 24.35it/s]

Result for _objective_2ed07_00001:
  date: 2022-10-19_02-49-33
  done: false
  episodes_total: 0
  epoch: 3.22
  eval_accuracy: 0.949
  eval_f1: 0.9524697110904008
  eval_loss: 0.18202358484268188
  eval_runtime: 6.0497
  eval_samples_per_second: 165.296
  eval_steps_per_second: 20.662
  experiment_id: 037f0829187c4e36ae81f75b0b213481
  hostname: 3481a8a2ae33
  iterations_since_restore: 2
  node_ip: 172.17.0.3
  objective: 1.9014697110904009
  pid: 3825535
  time_since_restore: 80.2885513305664
  time_this_iter_s: 38.388378858566284
  time_total_s: 203.68780660629272
  timestamp: 1666147773
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 2
  trial_id: 2ed07_00001
  warmup_time: 0.003434896469116211
  
(_objective pid=3825535) {'eval_loss': 0.18202358484268188, 'eval_accuracy': 0.949, 'eval_f1': 0.9524697110904008, 'eval_runtime': 6.0497, 'eval_samples_per_second': 165.296, 'eval_steps_per_second': 20.662, 'epoch': 3.22}


 65%|██████▍   | 100/155 [01:16<00:35,  1.55it/s]
100%|██████████| 125/125 [00:06<00:00, 24.23it/s]
 65%|██████▌   | 101/155 [01:17<02:12,  2.46s/it]
 66%|██████▌   | 102/155 [01:17<01:41,  1.91s/it]
 66%|██████▋   | 103/155 [01:18<01:19,  1.53s/it]
 67%|██████▋   | 104/155 [01:19<01:04,  1.26s/it]
 68%|██████▊   | 105/155 [01:19<00:53,  1.08s/it]
 68%|██████▊   | 106/155 [01:20<00:46,  1.06it/s]
 69%|██████▉   | 107/155 [01:21<00:41,  1.17it/s]


== Status ==
Current time: 2022-10-19 02:49:38 (running for 00:10:40.42)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 70%|██████▉   | 108/155 [01:21<00:37,  1.26it/s]
 70%|███████   | 109/155 [01:22<00:34,  1.34it/s]
 71%|███████   | 110/155 [01:23<00:32,  1.40it/s]
 72%|███████▏  | 111/155 [01:23<00:30,  1.44it/s]
 72%|███████▏  | 112/155 [01:24<00:29,  1.48it/s]
 73%|███████▎  | 113/155 [01:24<00:27,  1.50it/s]
 74%|███████▎  | 114/155 [01:25<00:26,  1.52it/s]
 74%|███████▍  | 115/155 [01:26<00:26,  1.53it/s]


== Status ==
Current time: 2022-10-19 02:49:43 (running for 00:10:45.42)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 75%|███████▍  | 116/155 [01:26<00:25,  1.54it/s]
 75%|███████▌  | 117/155 [01:27<00:24,  1.55it/s]
 76%|███████▌  | 118/155 [01:28<00:23,  1.55it/s]
 77%|███████▋  | 119/155 [01:28<00:23,  1.55it/s]
 77%|███████▋  | 120/155 [01:29<00:22,  1.56it/s]
 78%|███████▊  | 121/155 [01:30<00:21,  1.56it/s]
 79%|███████▊  | 122/155 [01:30<00:21,  1.56it/s]
 79%|███████▉  | 123/155 [01:31<00:20,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:49:48 (running for 00:10:50.42)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 80%|████████  | 124/155 [01:31<00:19,  1.56it/s]
 81%|████████  | 125/155 [01:32<00:20,  1.45it/s]
 81%|████████▏ | 126/155 [01:33<00:19,  1.48it/s]
 82%|████████▏ | 127/155 [01:34<00:18,  1.51it/s]
 83%|████████▎ | 128/155 [01:34<00:17,  1.52it/s]
 83%|████████▎ | 129/155 [01:35<00:16,  1.53it/s]
 84%|████████▍ | 130/155 [01:35<00:16,  1.54it/s]


== Status ==
Current time: 2022-10-19 02:49:53 (running for 00:10:55.42)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 85%|████████▍ | 131/155 [01:36<00:15,  1.55it/s]
 85%|████████▌ | 132/155 [01:37<00:14,  1.55it/s]
 86%|████████▌ | 133/155 [01:37<00:14,  1.55it/s]
 86%|████████▋ | 134/155 [01:38<00:13,  1.56it/s]
 87%|████████▋ | 135/155 [01:39<00:12,  1.56it/s]
 88%|████████▊ | 136/155 [01:39<00:12,  1.56it/s]
 88%|████████▊ | 137/155 [01:40<00:11,  1.56it/s]
 89%|████████▉ | 138/155 [01:41<00:10,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:49:58 (running for 00:11:00.43)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 90%|████████▉ | 139/155 [01:41<00:10,  1.56it/s]
 90%|█████████ | 140/155 [01:42<00:09,  1.56it/s]
 91%|█████████ | 141/155 [01:43<00:08,  1.56it/s]
 92%|█████████▏| 142/155 [01:43<00:08,  1.56it/s]
 92%|█████████▏| 143/155 [01:44<00:07,  1.56it/s]
 93%|█████████▎| 144/155 [01:44<00:07,  1.56it/s]
 94%|█████████▎| 145/155 [01:45<00:06,  1.56it/s]
 94%|█████████▍| 146/155 [01:46<00:05,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:50:03 (running for 00:11:05.43)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 95%|█████████▍| 147/155 [01:46<00:05,  1.56it/s]
 95%|█████████▌| 148/155 [01:47<00:04,  1.56it/s]
 96%|█████████▌| 149/155 [01:48<00:03,  1.56it/s]
 97%|█████████▋| 150/155 [01:48<00:03,  1.56it/s]
  0%|          | 0/125 [00:00<?, ?it/s]
(_objective pid=3825535)
  3%|▎         | 4/125 [00:00<00:03, 32.40it/s]
(_objective pid=3825535)
  6%|▋         | 8/125 [00:00<00:04, 26.86it/s]
(_objective pid=3825535)
  9%|▉         | 11/125 [00:00<00:04, 25.78it/s]
(_objective pid=3825535)
 11%|█         | 14/125 [00:00<00:04, 25.20it/s]
(_objective pid=3825535)
 14%|█▎        | 17/125 [00:00<00:04, 24.86it/s]
(_objective pid=3825535)
 16%|█▌        | 20/125 [00:00<00:04, 24.67it/s]
(_objective pid=3825535)
 18%|█▊        | 23/125 [00:00<00:04, 24.57it/s]
(_objective pid=3825535)
 21%|██        | 26/125 [00:01<00:04, 24.51it/s]
(_objective pid=382553

== Status ==
Current time: 2022-10-19 02:50:08 (running for 00:11:10.43)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 3 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

[2m[36m(_objective pid=3825535)[0m 
 54%|█████▍    | 68/125 [00:02<00:02, 24.23it/s]
(_objective pid=3825535)
 57%|█████▋    | 71/125 [00:02<00:02, 24.25it/s]
(_objective pid=3825535)
 59%|█████▉    | 74/125 [00:03<00:02, 24.28it/s]
(_objective pid=3825535)
 62%|██████▏   | 77/125 [00:03<00:01, 24.32it/s]
(_objective pid=3825535)
 64%|██████▍   | 80/125 [00:03<00:01, 24.35it/s]
(_objective pid=3825535)
 66%|██████▋   | 83/125 [00:03<00:01, 24.34it/s]
(_objective pid=3825535)
 69%|██████▉   | 86/125 [00:03<00:01, 24.36it/s]
(_objective pid=3825535)
 71%|███████   | 89/125 [00:03<00:01, 24.36it/s]
(_objective pid=3825535)
 74%|███████▎  | 92/125 [00:03<00:01, 24.35it/s]
(_objective pid=3825535)
 76%|███████▌  | 95/125 [00:03<00:01, 24.36it/s]
(_objective pid=3825535)
 78%|███████▊  | 98/125 [00:03<00:01, 24.34it/s]

Result for _objective_2ed07_00001:
  date: 2022-10-19_02-50-12
  done: false
  episodes_total: 0
  epoch: 4.83
  eval_accuracy: 0.967
  eval_f1: 0.9689557855126999
  eval_loss: 0.12148106843233109
  eval_runtime: 6.8548
  eval_samples_per_second: 145.884
  eval_steps_per_second: 18.236
  experiment_id: 037f0829187c4e36ae81f75b0b213481
  hostname: 3481a8a2ae33
  iterations_since_restore: 3
  node_ip: 172.17.0.3
  objective: 1.9359557855126999
  pid: 3825535
  time_since_restore: 119.34399390220642
  time_this_iter_s: 39.055442571640015
  time_total_s: 242.74324917793274
  timestamp: 1666147812
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 3
  trial_id: 2ed07_00001
  warmup_time: 0.003434896469116211
  
(_objective pid=3825535) {'eval_loss': 0.12148106843233109, 'eval_accuracy': 0.967, 'eval_f1': 0.9689557855126999, 'eval_runtime': 6.8548, 'eval_samples_per_second': 145.884, 'eval_steps_per_second': 18.236, 'epoch': 4.83}


                                                 
 97%|█████████▋| 150/155 [01:55<00:03,  1.56it/s]
100%|██████████| 125/125 [00:06<00:00, 24.38it/s]
 97%|█████████▋| 150/155 [01:55<00:03,  1.30it/s]
(pid=3826177) 2022-10-19 02:50:14.627859: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
(_objective pid=3826177) 2022-10-19 02:50:15,585	INFO trainable.py:668 -- Restored on 172.17.0.3 from checkpoint: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt/_objective_2ed07_00002_2_num_train_epochs=5_2022-10-19_02-40-30/checkpoint_tmp5871ee
(_objective pid=3826177) 2022-10-19 02:50:15,585	INFO trainable.py:677 -- Current state after restoring: {'_iteration': 0, '_timesteps_total': 0, '_time_total': 123.09726023674011, '_episodes_total': 0}


== Status ==
Current time: 2022-10-19 02:50:18 (running for 00:11:19.89)
Memory usage on this node: 13.0/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

(_objective pid=3826177) Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.decoder.bias', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.weight', 'lm_head.dense.bias', 'lm_head.bias', 'lm_head.dense.weight']
(_objective pid=3826177) - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=3826177) - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=3826177) Some weights of RobertaForSequenceClassification were no

== Status ==
Current time: 2022-10-19 02:50:23 (running for 00:11:24.90)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

  5%|▍         | 7/155 [00:04<01:34,  1.56it/s]
  5%|▌         | 8/155 [00:05<01:33,  1.56it/s]
  6%|▌         | 9/155 [00:05<01:33,  1.56it/s]
  6%|▋         | 10/155 [00:06<01:32,  1.56it/s]
  7%|▋         | 11/155 [00:07<01:32,  1.56it/s]
  8%|▊         | 12/155 [00:07<01:31,  1.56it/s]
  8%|▊         | 13/155 [00:08<01:30,  1.56it/s]
  9%|▉         | 14/155 [00:08<01:30,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:50:28 (running for 00:11:29.90)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 10%|▉         | 15/155 [00:09<01:29,  1.56it/s]
 10%|█         | 16/155 [00:10<01:28,  1.56it/s]
 11%|█         | 17/155 [00:10<01:28,  1.56it/s]
 12%|█▏        | 18/155 [00:11<01:27,  1.56it/s]
 12%|█▏        | 19/155 [00:12<01:26,  1.56it/s]
 13%|█▎        | 20/155 [00:12<01:26,  1.56it/s]
 14%|█▎        | 21/155 [00:13<01:25,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:50:33 (running for 00:11:34.90)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 14%|█▍        | 22/155 [00:14<01:25,  1.56it/s]
 15%|█▍        | 23/155 [00:14<01:24,  1.56it/s]
 15%|█▌        | 24/155 [00:15<01:23,  1.56it/s]
 16%|█▌        | 25/155 [00:16<01:23,  1.56it/s]
 17%|█▋        | 26/155 [00:16<01:22,  1.56it/s]
 17%|█▋        | 27/155 [00:17<01:21,  1.56it/s]
 18%|█▊        | 28/155 [00:17<01:21,  1.56it/s]
 19%|█▊        | 29/155 [00:18<01:20,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:50:38 (running for 00:11:39.91)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 19%|█▉        | 30/155 [00:19<01:19,  1.56it/s]
 20%|██        | 31/155 [00:19<01:19,  1.56it/s]
 21%|██        | 32/155 [00:20<01:24,  1.45it/s]
 21%|██▏       | 33/155 [00:21<01:22,  1.48it/s]
 22%|██▏       | 34/155 [00:21<01:20,  1.51it/s]
 23%|██▎       | 35/155 [00:22<01:18,  1.52it/s]
 23%|██▎       | 36/155 [00:23<01:17,  1.54it/s]
 24%|██▍       | 37/155 [00:23<01:16,  1.54it/s]


== Status ==
Current time: 2022-10-19 02:50:43 (running for 00:11:44.91)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 25%|██▍       | 38/155 [00:24<01:15,  1.55it/s]
 25%|██▌       | 39/155 [00:25<01:14,  1.55it/s]
 26%|██▌       | 40/155 [00:25<01:13,  1.56it/s]
 26%|██▋       | 41/155 [00:26<01:13,  1.56it/s]
 27%|██▋       | 42/155 [00:27<01:12,  1.56it/s]
 28%|██▊       | 43/155 [00:27<01:11,  1.56it/s]
 28%|██▊       | 44/155 [00:28<01:11,  1.56it/s]
 29%|██▉       | 45/155 [00:28<01:10,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:50:48 (running for 00:11:49.91)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 30%|██▉       | 46/155 [00:29<01:09,  1.56it/s]
 30%|███       | 47/155 [00:30<01:09,  1.56it/s]
 31%|███       | 48/155 [00:30<01:08,  1.56it/s]
 32%|███▏      | 49/155 [00:31<01:07,  1.56it/s]
 32%|███▏      | 50/155 [00:32<01:07,  1.56it/s]
  0%|          | 0/125 [00:00<?, ?it/s]
(_objective pid=3826177)
  3%|▎         | 4/125 [00:00<00:03, 32.42it/s]
(_objective pid=3826177)
  6%|▋         | 8/125 [00:00<00:04, 27.13it/s]
(_objective pid=3826177)
  9%|▉         | 11/125 [00:00<00:04, 26.01it/s]
(_objective pid=3826177)
 11%|█         | 14/125 [00:00<00:04, 25.37it/s]
(_objective pid=3826177)
 14%|█▎        | 17/125 [00:00<00:04, 24.99it/s]
(_objective pid=3826177)
 16%|█▌        | 20/125 [00:00<00:04, 24.82it/s]
(_objective pid=3826177)
 18%|█▊        | 23/125 [00:00<00:04, 24.68it/s]
(_objective pid=3826177)
 21%|██        | 26/125 [00:01<00:04, 

== Status ==
Current time: 2022-10-19 02:50:53 (running for 00:11:54.91)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

[2m[36m(_objective pid=3826177)[0m 
 38%|███▊      | 47/125 [00:01<00:03, 24.38it/s]
(_objective pid=3826177)
 40%|████      | 50/125 [00:02<00:03, 24.37it/s]
(_objective pid=3826177)
 42%|████▏     | 53/125 [00:02<00:02, 24.35it/s]
(_objective pid=3826177)
 45%|████▍     | 56/125 [00:02<00:02, 24.35it/s]
(_objective pid=3826177)
 47%|████▋     | 59/125 [00:02<00:02, 24.38it/s]
(_objective pid=3826177)
 50%|████▉     | 62/125 [00:02<00:02, 24.41it/s]
(_objective pid=3826177)
 52%|█████▏    | 65/125 [00:02<00:02, 24.40it/s]
(_objective pid=3826177)
 54%|█████▍    | 68/125 [00:02<00:02, 24.42it/s]
(_objective pid=3826177)
 57%|█████▋    | 71/125 [00:02<00:02, 24.41it/s]
(_objective pid=3826177)
 59%|█████▉    | 74/125 [00:02<00:02, 24.32it/s]
(_objective pid=3826177)
 62%|██████▏   | 77/125 [00:03<00:01, 24.34it/s]

Result for _objective_2ed07_00002:
  date: 2022-10-19_02-50-57
  done: false
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.833
  eval_f1: 0.8258602711157456
  eval_loss: 0.4038117229938507
  eval_runtime: 6.4913
  eval_samples_per_second: 154.052
  eval_steps_per_second: 19.257
  experiment_id: aab49e31c6fa43f48990833c86d7c485
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.6588602711157456
  pid: 3826177
  time_since_restore: 42.35510468482971
  time_this_iter_s: 42.35510468482971
  time_total_s: 165.45236492156982
  timestamp: 1666147857
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00002
  warmup_time: 0.004002571105957031
  
(_objective pid=3826177) {'eval_loss': 0.4038117229938507, 'eval_accuracy': 0.833, 'eval_f1': 0.8258602711157456, 'eval_runtime': 6.4913, 'eval_samples_per_second': 154.052, 'eval_steps_per_second': 19.257, 'epoch': 1.61}


                                                
 32%|███▏      | 50/155 [00:38<01:07,  1.56it/s] 
100%|██████████| 125/125 [00:06<00:00, 24.39it/s]
 33%|███▎      | 51/155 [00:39<04:29,  2.59s/it]
 34%|███▎      | 52/155 [00:39<03:26,  2.00s/it]
(_objective pid=3826177)   nn.utils.clip_grad_norm_(
 34%|███▍      | 53/155 [00:40<02:42,  1.59s/it]
 35%|███▍      | 54/155 [00:41<02:12,  1.31s/it]
 35%|███▌      | 55/155 [00:41<01:50,  1.11s/it]
 36%|███▌      | 56/155 [00:42<01:35,  1.03it/s]
 37%|███▋      | 57/155 [00:43<01:25,  1.15it/s]


== Status ==
Current time: 2022-10-19 02:51:02 (running for 00:12:04.49)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 37%|███▋      | 58/155 [00:43<01:17,  1.25it/s]
 38%|███▊      | 59/155 [00:44<01:12,  1.33it/s]
 39%|███▊      | 60/155 [00:45<01:08,  1.39it/s]
 39%|███▉      | 61/155 [00:45<01:05,  1.44it/s]
 40%|████      | 62/155 [00:46<01:03,  1.47it/s]
 41%|████      | 63/155 [00:47<01:05,  1.40it/s]
 41%|████▏     | 64/155 [00:47<01:03,  1.44it/s]
 42%|████▏     | 65/155 [00:48<01:00,  1.48it/s]


== Status ==
Current time: 2022-10-19 02:51:07 (running for 00:12:09.50)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 43%|████▎     | 66/155 [00:49<00:59,  1.50it/s]
 43%|████▎     | 67/155 [00:49<00:57,  1.52it/s]
 44%|████▍     | 68/155 [00:50<00:56,  1.53it/s]
 45%|████▍     | 69/155 [00:50<00:55,  1.54it/s]
 45%|████▌     | 70/155 [00:51<00:54,  1.55it/s]
 46%|████▌     | 71/155 [00:52<00:54,  1.55it/s]
 46%|████▋     | 72/155 [00:52<00:53,  1.55it/s]
 47%|████▋     | 73/155 [00:53<00:52,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:51:12 (running for 00:12:14.50)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 48%|████▊     | 74/155 [00:54<00:51,  1.56it/s]
 48%|████▊     | 75/155 [00:54<00:51,  1.56it/s]
 49%|████▉     | 76/155 [00:55<00:50,  1.56it/s]
 50%|████▉     | 77/155 [00:56<00:49,  1.56it/s]
 50%|█████     | 78/155 [00:56<00:49,  1.56it/s]
 51%|█████     | 79/155 [00:57<00:48,  1.56it/s]
 52%|█████▏    | 80/155 [00:58<00:48,  1.56it/s]
 52%|█████▏    | 81/155 [00:58<00:47,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:51:17 (running for 00:12:19.50)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 53%|█████▎    | 82/155 [00:59<00:46,  1.56it/s]
 54%|█████▎    | 83/155 [00:59<00:46,  1.56it/s]
 54%|█████▍    | 84/155 [01:00<00:45,  1.56it/s]
 55%|█████▍    | 85/155 [01:01<00:44,  1.56it/s]
 55%|█████▌    | 86/155 [01:01<00:44,  1.56it/s]
 56%|█████▌    | 87/155 [01:02<00:43,  1.56it/s]
 57%|█████▋    | 88/155 [01:03<00:42,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:51:22 (running for 00:12:24.50)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 57%|█████▋    | 89/155 [01:03<00:42,  1.56it/s]
 58%|█████▊    | 90/155 [01:04<00:41,  1.56it/s]
 59%|█████▊    | 91/155 [01:05<00:41,  1.56it/s]
 59%|█████▉    | 92/155 [01:05<00:40,  1.56it/s]
 60%|██████    | 93/155 [01:06<00:39,  1.56it/s]
 61%|██████    | 94/155 [01:07<00:41,  1.45it/s]
 61%|██████▏   | 95/155 [01:07<00:40,  1.48it/s]
 62%|██████▏   | 96/155 [01:08<00:39,  1.51it/s]


== Status ==
Current time: 2022-10-19 02:51:27 (running for 00:12:29.51)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 63%|██████▎   | 97/155 [01:09<00:38,  1.52it/s]
 63%|██████▎   | 98/155 [01:09<00:37,  1.53it/s]
 64%|██████▍   | 99/155 [01:10<00:36,  1.54it/s]
 65%|██████▍   | 100/155 [01:11<00:35,  1.55it/s]
(_objective pid=3826177)
  0%|          | 0/125 [00:00<?, ?it/s]
(_objective pid=3826177)
  3%|▎         | 4/125 [00:00<00:03, 32.38it/s]
(_objective pid=3826177)
  6%|▋         | 8/125 [00:00<00:04, 26.91it/s]
(_objective pid=3826177)
  9%|▉         | 11/125 [00:00<00:04, 25.66it/s]
(_objective pid=3826177)
 11%|█         | 14/125 [00:00<00:04, 25.19it/s]
(_objective pid=3826177)
 14%|█▎        | 17/125 [00:00<00:04, 24.91it/s]
(_objective pid=3826177)
 16%|█▌        | 20/125 [00:00<00:04, 24.73it/s]
(_objective pid=3826177)
 18%|█▊        | 23/125 [00:00<00:04, 24.63it/s]
(_objective pid=3826177)
 21%|██        | 26/125 [00:01<00:04, 24.56it/s

== Status ==
Current time: 2022-10-19 02:51:32 (running for 00:12:34.51)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 4 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

Result for _objective_2ed07_00002:
  date: 2022-10-19_02-51-36
  done: false
  episodes_total: 0
  epoch: 3.22
  eval_accuracy: 0.951
  eval_f1: 0.9533777354900095
  eval_loss: 0.16220815479755402
  eval_runtime: 6.353
  eval_samples_per_second: 157.406
  eval_steps_per_second: 19.676
  experiment_id: aab49e31c6fa43f48990833c86d7c485
  hostname: 3481a8a2ae33
  iterations_since_restore: 2
  node_ip: 172.17.0.3
  objective: 1.9043777354900095
  pid: 3826177
  time_since_restore: 81.06049060821533
  time_this_iter_s: 38.70538592338562
  time_total_s: 204.15775084495544
  timestamp: 1666147896
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 2
  trial_id: 2ed07_00002
  warmup_time: 0.004002571105957031
  
(_objective pid=3826177) {'eval_loss': 0.16220815479755402, 'eval_accuracy': 0.951, 'eval_f1': 0.9533777354900095, 'eval_runtime': 6.353, 'eval_samples_per_second': 157.406, 'eval_steps_per_second': 19.676, 'epoch': 3.22}


                                                 
(_objective pid=3826177) {'eval_loss': 0.09452543407678604, 'eval_accuracy': 0.973, 'eval_f1': 0.9744075829383886, 'eval_runtime': 6.4461, 'eval_samples_per_second': 155.133, 'eval_steps_per_second': 19.392, 'epoch': 4.83}
Result for _objective_2ed07_00002:
  date: 2022-10-19_02-52-15
  done: false
  episodes_total: 0
  epoch: 4.83
  eval_accuracy: 0.973
  eval_f1: 0.9744075829383886
  eval_loss: 0.09452543407678604
  eval_runtime: 6.4461
  eval_samples_per_second: 155.133
  eval_steps_per_second: 19.392
  experiment_id: aab49e31c6fa43f48990833c86d7c485
  hostname: 3481a8a2ae33
  iterations_since_restore: 3
  node_ip: 172.17.0.3
  objective: 1.9474075829383886
  pid: 3826177
  time_since_restore: 119.66292405128479
  time_this_iter_s: 38.60243344306946
  time_total_s: 242.7601842880249
  timestamp: 1666147935
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 3
  trial_id: 2ed07_00002
  warmup_time: 0.004002571105957031
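
* In each result block in this log, the `objective` value equals `eval_accuracy + eval_f1` (here 0.973 + 0.9744075829383886). The sketch below reproduces that arithmetic. It assumes the search ran without a custom `compute_objective`, so the objective is the sum of the evaluation metrics once the loss, epoch, and speed statistics are dropped. The helper name `compute_objective_sketch` is hypothetical and is not taken from the notebook.

```python
import copy


def compute_objective_sketch(metrics):
    """Hypothetical helper: sum the quality metrics of one evaluation.

    Assumption: loss, epoch, and speed statistics are excluded, which is
    consistent with the objective values printed in this log.
    """
    metrics = copy.deepcopy(metrics)
    loss = metrics.pop("eval_loss", None)
    metrics.pop("epoch", None)
    # Drop throughput and timing entries such as eval_runtime.
    speed_keys = [
        key for key in metrics
        if key.endswith("_runtime") or key.endswith("_per_second")
    ]
    for key in speed_keys:
        metrics.pop(key)
    # Fall back to the loss when no other metric is reported.
    return loss if not metrics else sum(metrics.values())


# Metrics copied from the evaluation at epoch 4.83 of trial 2ed07_00002.
logged = {
    "eval_loss": 0.09452543407678604,
    "eval_accuracy": 0.973,
    "eval_f1": 0.9744075829383886,
    "eval_runtime": 6.4461,
    "eval_samples_per_second": 155.133,
    "eval_steps_per_second": 19.392,
    "epoch": 4.83,
}

objective = compute_objective_sketch(logged)
print(f"objective: {objective:.4f}")
```

* Because the two metrics are summed with equal weight, a trial that gains accuracy at the cost of F1 (or the reverse) can still rank the same. If one metric matters more, a weighted sum may be a better fit.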
  


[2m[36m(pid=3826794)[0m 2022-10-19 02:52:17.625275: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
[2m[36m(_objective pid=3826794)[0m 2022-10-19 02:52:18,576	INFO trainable.py:668 -- Restored on 172.17.0.3 from checkpoint: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt/_objective_2ed07_00001_1_num_train_epochs=5_2022-10-19_02-39-45/checkpoint_tmpb4f3ed
[2m[36m(_objective pid=3826794)[0m 2022-10-19 02:52:18,576	INFO trainable.py:677 -- Current state after restoring: {'_iteration': 0, '_timesteps_total': 0, '_time_total': 242.74324917793274, '_episodes_total': 0}


== Status ==
Current time: 2022-10-19 02:52:21 (running for 00:13:22.91)
Memory usage on this node: 13.0/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

(_objective pid=3826794) Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.decoder.weight', 'lm_head.dense.weight', 'lm_head.bias', 'lm_head.dense.bias', 'lm_head.layer_norm.bias', 'lm_head.decoder.bias', 'lm_head.layer_norm.weight']
(_objective pid=3826794) - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=3826794) - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=3826794) Some weights of RobertaForSequenceClassification were no



Result for _objective_2ed07_00001:
  date: 2022-10-19_02-53-00
  done: false
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.873
  eval_f1: 0.8800755429650613
  eval_loss: 0.33460041880607605
  eval_runtime: 6.0477
  eval_samples_per_second: 165.352
  eval_steps_per_second: 20.669
  experiment_id: 037f0829187c4e36ae81f75b0b213481
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.7530755429650613
  pid: 3826794
  time_since_restore: 41.93969511985779
  time_this_iter_s: 41.93969511985779
  time_total_s: 284.6829442977905
  timestamp: 1666147980
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00001
  warmup_time: 0.0033965110778808594
  
(_objective pid=3826794) {'eval_loss': 0.33460041880607605, 'eval_accuracy': 0.873, 'eval_f1': 0.8800755429650613, 'eval_runtime': 6.0477, 'eval_samples_per_second': 165.352, 'eval_steps_per_second': 20.669, 'epoch': 1.61}


                                                
== Status ==
Current time: 2022-10-19 02:53:15 (running for 00:14:17.07)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 48%|████▊     | 74/155 [00:53<00:51,  1.56it/s]
 48%|████▊     | 75/155 [00:54<00:51,  1.56it/s]
 49%|████▉     | 76/155 [00:54<00:50,  1.56it/s]
 50%|████▉     | 77/155 [00:55<00:49,  1.56it/s]
 50%|█████     | 78/155 [00:56<00:49,  1.56it/s]
 51%|█████     | 79/155 [00:56<00:48,  1.56it/s]
 52%|█████▏    | 80/155 [00:57<00:47,  1.56it/s]
 52%|█████▏    | 81/155 [00:58<00:47,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:53:20 (running for 00:14:22.08)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 53%|█████▎    | 82/155 [00:58<00:46,  1.56it/s]
 54%|█████▎    | 83/155 [00:59<00:46,  1.56it/s]
 54%|█████▍    | 84/155 [01:00<00:45,  1.56it/s]
 55%|█████▍    | 85/155 [01:00<00:44,  1.56it/s]
 55%|█████▌    | 86/155 [01:01<00:44,  1.56it/s]
 56%|█████▌    | 87/155 [01:02<00:43,  1.56it/s]
 57%|█████▋    | 88/155 [01:02<00:42,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:53:25 (running for 00:14:27.08)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 57%|█████▋    | 89/155 [01:03<00:42,  1.56it/s]
 58%|█████▊    | 90/155 [01:03<00:41,  1.56it/s]
 59%|█████▊    | 91/155 [01:04<00:40,  1.56it/s]
 59%|█████▉    | 92/155 [01:05<00:40,  1.56it/s]
 60%|██████    | 93/155 [01:05<00:39,  1.56it/s]
 61%|██████    | 94/155 [01:06<00:41,  1.46it/s]
 61%|██████▏   | 95/155 [01:07<00:40,  1.49it/s]
 62%|██████▏   | 96/155 [01:07<00:39,  1.51it/s]


== Status ==
Current time: 2022-10-19 02:53:30 (running for 00:14:32.08)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 63%|██████▎   | 97/155 [01:08<00:38,  1.52it/s]
 63%|██████▎   | 98/155 [01:09<00:37,  1.54it/s]
 64%|██████▍   | 99/155 [01:09<00:36,  1.54it/s]
 65%|██████▍   | 100/155 [01:10<00:35,  1.55it/s]
(_objective pid=3826794)
  0%|          | 0/125 [00:00<?, ?it/s]
(_objective pid=3826794)
  3%|▎         | 4/125 [00:00<00:03, 32.32it/s]
(_objective pid=3826794)
  6%|▋         | 8/125 [00:00<00:04, 26.89it/s]
(_objective pid=3826794)
  9%|▉         | 11/125 [00:00<00:04, 25.88it/s]
(_objective pid=3826794)
 11%|█         | 14/125 [00:00<00:04, 25.32it/s]
(_objective pid=3826794)
 14%|█▎        | 17/125 [00:00<00:04, 24.81it/s]
(_objective pid=3826794)
 16%|█▌        | 20/125 [00:00<00:04, 24.63it/s]
(_objective pid=3826794)
 18%|█▊        | 23/125 [00:00<00:04, 24.48it/s]
(_objective pid=3826794)
 21%|██        | 26/125 [00:01<00:04, 24.46it/s

== Status ==
Current time: 2022-10-19 02:53:35 (running for 00:14:37.08)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

(_objective pid=3826794)
 54%|█████▍    | 68/125 [00:02<00:02, 24.38it/s]
(_objective pid=3826794)
 57%|█████▋    | 71/125 [00:02<00:02, 24.40it/s]
(_objective pid=3826794)
 59%|█████▉    | 74/125 [00:03<00:02, 24.41it/s]
(_objective pid=3826794)
 62%|██████▏   | 77/125 [00:03<00:01, 24.37it/s]
(_objective pid=3826794)
 64%|██████▍   | 80/125 [00:03<00:01, 24.37it/s]
(_objective pid=3826794)
 66%|██████▋   | 83/125 [00:03<00:01, 24.34it/s]
(_objective pid=3826794)
 69%|██████▉   | 86/125 [00:03<00:01, 24.26it/s]
(_objective pid=3826794)
 71%|███████   | 89/125 [00:03<00:01, 24.22it/s]
(_objective pid=3826794)
 74%|███████▎  | 92/125 [00:03<00:01, 24.28it/s]
(_objective pid=3826794)
 76%|███████▌  | 95/125 [00:03<00:01, 24.29it/s]
(_objective pid=3826794)
 78%|███████▊  | 98/125 [00:03<00:01, 24.34it/s]

Result for _objective_2ed07_00001:
  date: 2022-10-19_02-53-38
  done: false
  episodes_total: 0
  epoch: 3.22
  eval_accuracy: 0.949
  eval_f1: 0.9524697110904008
  eval_loss: 0.18202358484268188
  eval_runtime: 6.0397
  eval_samples_per_second: 165.572
  eval_steps_per_second: 20.697
  experiment_id: 037f0829187c4e36ae81f75b0b213481
  hostname: 3481a8a2ae33
  iterations_since_restore: 2
  node_ip: 172.17.0.3
  objective: 1.9014697110904009
  pid: 3826794
  time_since_restore: 80.27536869049072
  time_this_iter_s: 38.335673570632935
  time_total_s: 323.01861786842346
  timestamp: 1666148018
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 2
  trial_id: 2ed07_00001
  warmup_time: 0.0033965110778808594
  
(_objective pid=3826794) {'eval_loss': 0.18202358484268188, 'eval_accuracy': 0.949, 'eval_f1': 0.9524697110904008, 'eval_runtime': 6.0397, 'eval_samples_per_second': 165.572, 'eval_steps_per_second': 20.697, 'epoch': 3.22}


 65%|██████▍   | 100/155 [01:16<00:35,  1.55it/s]
100%|██████████| 125/125 [00:05<00:00, 24.42it/s]
 65%|██████▌   | 101/155 [01:17<02:12,  2.46s/it]
 66%|██████▌   | 102/155 [01:17<01:41,  1.91s/it]
 66%|██████▋   | 103/155 [01:18<01:19,  1.53s/it]
 67%|██████▋   | 104/155 [01:19<01:04,  1.26s/it]
 68%|██████▊   | 105/155 [01:19<00:53,  1.08s/it]
 68%|██████▊   | 106/155 [01:20<00:46,  1.06it/s]
 69%|██████▉   | 107/155 [01:21<00:40,  1.17it/s]


== Status ==
Current time: 2022-10-19 02:53:43 (running for 00:14:45.41)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 70%|██████▉   | 108/155 [01:21<00:37,  1.27it/s]
 70%|███████   | 109/155 [01:22<00:34,  1.34it/s]
 71%|███████   | 110/155 [01:22<00:32,  1.40it/s]
 72%|███████▏  | 111/155 [01:23<00:30,  1.45it/s]
 72%|███████▏  | 112/155 [01:24<00:29,  1.48it/s]
 73%|███████▎  | 113/155 [01:24<00:27,  1.50it/s]
 74%|███████▎  | 114/155 [01:25<00:26,  1.52it/s]
 74%|███████▍  | 115/155 [01:26<00:26,  1.53it/s]


== Status ==
Current time: 2022-10-19 02:53:48 (running for 00:14:50.41)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 75%|███████▍  | 116/155 [01:26<00:25,  1.54it/s]
 75%|███████▌  | 117/155 [01:27<00:24,  1.55it/s]
 76%|███████▌  | 118/155 [01:28<00:23,  1.55it/s]
 77%|███████▋  | 119/155 [01:28<00:23,  1.55it/s]
 77%|███████▋  | 120/155 [01:29<00:22,  1.56it/s]
 78%|███████▊  | 121/155 [01:29<00:21,  1.56it/s]
 79%|███████▊  | 122/155 [01:30<00:21,  1.56it/s]
 79%|███████▉  | 123/155 [01:31<00:20,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:53:53 (running for 00:14:55.41)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 80%|████████  | 124/155 [01:31<00:19,  1.56it/s]
 81%|████████  | 125/155 [01:32<00:20,  1.46it/s]
 81%|████████▏ | 126/155 [01:33<00:19,  1.49it/s]
 82%|████████▏ | 127/155 [01:33<00:18,  1.51it/s]
 83%|████████▎ | 128/155 [01:34<00:17,  1.53it/s]
 83%|████████▎ | 129/155 [01:35<00:16,  1.54it/s]
 84%|████████▍ | 130/155 [01:35<00:16,  1.54it/s]
 85%|████████▍ | 131/155 [01:36<00:15,  1.55it/s]


== Status ==
Current time: 2022-10-19 02:53:58 (running for 00:15:00.42)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 85%|████████▌ | 132/155 [01:37<00:14,  1.55it/s]
 86%|████████▌ | 133/155 [01:37<00:14,  1.55it/s]
 86%|████████▋ | 134/155 [01:38<00:13,  1.55it/s]
 87%|████████▋ | 135/155 [01:39<00:12,  1.55it/s]
 88%|████████▊ | 136/155 [01:39<00:12,  1.56it/s]
 88%|████████▊ | 137/155 [01:40<00:11,  1.56it/s]
 89%|████████▉ | 138/155 [01:41<00:10,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:54:03 (running for 00:15:05.42)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 90%|████████▉ | 139/155 [01:41<00:10,  1.56it/s]
 90%|█████████ | 140/155 [01:42<00:09,  1.56it/s]
 91%|█████████ | 141/155 [01:42<00:08,  1.56it/s]
 92%|█████████▏| 142/155 [01:43<00:08,  1.56it/s]
 92%|█████████▏| 143/155 [01:44<00:07,  1.56it/s]
 93%|█████████▎| 144/155 [01:44<00:07,  1.56it/s]
 94%|█████████▎| 145/155 [01:45<00:06,  1.56it/s]
 94%|█████████▍| 146/155 [01:46<00:05,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:54:08 (running for 00:15:10.42)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

 95%|█████████▍| 147/155 [01:46<00:05,  1.56it/s]
 95%|█████████▌| 148/155 [01:47<00:04,  1.56it/s]
 96%|█████████▌| 149/155 [01:48<00:03,  1.56it/s]
 97%|█████████▋| 150/155 [01:48<00:03,  1.56it/s]
  0%|          | 0/125 [00:00<?, ?it/s]
(_objective pid=3826794)
  3%|▎         | 4/125 [00:00<00:03, 32.33it/s]
(_objective pid=3826794)
  6%|▋         | 8/125 [00:00<00:04, 27.07it/s]
(_objective pid=3826794)
  9%|▉         | 11/125 [00:00<00:04, 25.96it/s]
(_objective pid=3826794)
 11%|█         | 14/125 [00:00<00:04, 25.37it/s]
(_objective pid=3826794)
 14%|█▎        | 17/125 [00:00<00:04, 25.01it/s]
(_objective pid=3826794)
 16%|█▌        | 20/125 [00:00<00:04, 24.81it/s]
(_objective pid=3826794)
 18%|█▊        | 23/125 [00:00<00:04, 24.67it/s]
(_objective pid=3826794)
 21%|██        | 26/125 [00:01<00:04, 24.59it/s]
(_objective pid=382679

== Status ==
Current time: 2022-10-19 02:54:13 (running for 00:15:15.42)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 PAUSED, 1 RUNNING, 3 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+-----

(_objective pid=3826794)
 57%|█████▋    | 71/125 [00:02<00:02, 24.41it/s]
(_objective pid=3826794)
 59%|█████▉    | 74/125 [00:02<00:02, 24.42it/s]
(_objective pid=3826794)
 62%|██████▏   | 77/125 [00:03<00:01, 24.41it/s]
(_objective pid=3826794)
 64%|██████▍   | 80/125 [00:03<00:01, 24.42it/s]
(_objective pid=3826794)
 66%|██████▋   | 83/125 [00:03<00:01, 24.42it/s]
(_objective pid=3826794)
 69%|██████▉   | 86/125 [00:03<00:01, 24.40it/s]
(_objective pid=3826794)
 71%|███████   | 89/125 [00:03<00:01, 24.39it/s]
(_objective pid=3826794)
 74%|███████▎  | 92/125 [00:03<00:01, 24.40it/s]
(_objective pid=3826794)
 76%|███████▌  | 95/125 [00:03<00:01, 24.39it/s]
(_objective pid=3826794)
 78%|███████▊  | 98/125 [00:03<00:01, 24.40it/s]
(_objective pid=3826794)
 81%|████████  | 101/125 [00:04<00:00, 24.41it/s]

Result for _objective_2ed07_00001:
  date: 2022-10-19_02-54-17
  done: false
  episodes_total: 0
  epoch: 4.83
  eval_accuracy: 0.967
  eval_f1: 0.9689557855126999
  eval_loss: 0.12148106843233109
  eval_runtime: 6.0467
  eval_samples_per_second: 165.38
  eval_steps_per_second: 20.672
  experiment_id: 037f0829187c4e36ae81f75b0b213481
  hostname: 3481a8a2ae33
  iterations_since_restore: 3
  node_ip: 172.17.0.3
  objective: 1.9359557855126999
  pid: 3826794
  time_since_restore: 118.5000729560852
  time_this_iter_s: 38.22470426559448
  time_total_s: 361.24332213401794
  timestamp: 1666148057
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 3
  trial_id: 2ed07_00001
  warmup_time: 0.0033965110778808594
  
(_objective pid=3826794) {'eval_loss': 0.12148106843233109, 'eval_accuracy': 0.967, 'eval_f1': 0.9689557855126999, 'eval_runtime': 6.0467, 'eval_samples_per_second': 165.38, 'eval_steps_per_second': 20.672, 'epoch': 4.83}


 97%|█████████▋| 150/155 [01:54<00:03,  1.56it/s]
100%|██████████| 125/125 [00:06<00:00, 24.33it/s]
 97%|█████████▋| 151/155 [01:55<00:09,  2.46s/it]
 98%|█████████▊| 152/155 [01:56<00:05,  1.91s/it]
 99%|█████████▊| 153/155 [01:56<00:03,  1.53s/it]
 99%|█████████▉| 154/155 [01:57<00:01,  1.26s/it]


Result for _objective_2ed07_00001:
  date: 2022-10-19_02-54-17
  done: true
  episodes_total: 0
  epoch: 4.83
  eval_accuracy: 0.967
  eval_f1: 0.9689557855126999
  eval_loss: 0.12148106843233109
  eval_runtime: 6.0467
  eval_samples_per_second: 165.38
  eval_steps_per_second: 20.672
  experiment_id: 037f0829187c4e36ae81f75b0b213481
  experiment_tag: 1_num_train_epochs=5
  hostname: 3481a8a2ae33
  iterations_since_restore: 3
  node_ip: 172.17.0.3
  objective: 1.9359557855126999
  pid: 3826794
  time_since_restore: 118.5000729560852
  time_this_iter_s: 38.22470426559448
  time_total_s: 361.24332213401794
  timestamp: 1666148057
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 3
  trial_id: 2ed07_00001
  warmup_time: 0.0033965110778808594
  
== Status ==
Current time: 2022-10-19 02:54:20 (running for 00:15:21.84)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 0/20 CPUs, 0/1 GPUs, 0.0/11.46 GiB heap, 

100%|██████████| 155/155 [01:57<00:00,  1.31it/s]
(pid=3827416) 2022-10-19 02:54:21.834196: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
(_objective pid=3827416) 2022-10-19 02:54:22,789	INFO trainable.py:668 -- Restored on 172.17.0.3 from checkpoint: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt/_objective_2ed07_00002_2_num_train_epochs=5_2022-10-19_02-40-30/checkpoint_tmp894b98
(_objective pid=3827416) 2022-10-19 02:54:22,789	INFO trainable.py:677 -- Current state after restoring: {'_iteration': 0, '_timesteps_total': 0, '_time_total': 242.7601842880249, '_episodes_total': 0}


== Status ==
Current time: 2022-10-19 02:54:25 (running for 00:15:26.91)
Memory usage on this node: 12.8/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 RUNNING, 4 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+---------------

(_objective pid=3827416) Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.bias', 'lm_head.bias', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight']
(_objective pid=3827416) - This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=3827416) - This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=3827416) Some weights of RobertaForSequenceClassification were no

== Status ==
Current time: 2022-10-19 02:54:30 (running for 00:15:31.91)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 RUNNING, 4 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+---------------

  5%|▍         | 7/155 [00:04<01:34,  1.56it/s]
  5%|▌         | 8/155 [00:05<01:34,  1.56it/s]
  6%|▌         | 9/155 [00:05<01:33,  1.56it/s]
  6%|▋         | 10/155 [00:06<01:32,  1.56it/s]
  7%|▋         | 11/155 [00:07<01:32,  1.56it/s]
  8%|▊         | 12/155 [00:07<01:31,  1.56it/s]
  8%|▊         | 13/155 [00:08<01:30,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:54:35 (running for 00:15:36.92)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 RUNNING, 4 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+---------------

  9%|▉         | 14/155 [00:08<01:30,  1.56it/s]
 10%|▉         | 15/155 [00:09<01:29,  1.56it/s]
 10%|█         | 16/155 [00:10<01:29,  1.56it/s]
 11%|█         | 17/155 [00:10<01:28,  1.56it/s]
 12%|█▏        | 18/155 [00:11<01:27,  1.56it/s]
 12%|█▏        | 19/155 [00:12<01:26,  1.56it/s]
 13%|█▎        | 20/155 [00:12<01:26,  1.56it/s]
 14%|█▎        | 21/155 [00:13<01:25,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:54:40 (running for 00:15:41.92)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 RUNNING, 4 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+---------------

 14%|█▍        | 22/155 [00:14<01:25,  1.56it/s]
 15%|█▍        | 23/155 [00:14<01:24,  1.56it/s]
 15%|█▌        | 24/155 [00:15<01:23,  1.56it/s]
 16%|█▌        | 25/155 [00:16<01:23,  1.56it/s]
 17%|█▋        | 26/155 [00:16<01:22,  1.56it/s]
 17%|█▋        | 27/155 [00:17<01:21,  1.56it/s]
 18%|█▊        | 28/155 [00:17<01:21,  1.56it/s]
 19%|█▊        | 29/155 [00:18<01:20,  1.56it/s]


== Status ==
Current time: 2022-10-19 02:54:45 (running for 00:15:46.92)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/11.46 GiB heap, 0.0/5.73 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 5/5 (1 RUNNING, 4 TERMINATED)
+------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   w_decay |          lr | train_bs/gpu   |   num_epochs |   eval_f1 |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+-----------+-------------+----------------+--------------+-----------+-----------------+-------------+---------+---------------

 19%|█▉        | 30/155 [00:19<01:19,  1.56it/s]
 20%|██        | 31/155 [00:19<01:19,  1.56it/s]
 21%|██        | 32/155 [00:20<01:24,  1.46it/s]
 21%|██▏       | 33/155 [00:21<01:22,  1.49it/s]
 22%|██▏       | 34/155 [00:21<01:20,  1.51it/s]
 23%|██▎       | 35/155 [00:22<01:18,  1.52it/s]
 23%|██▎       | 36/155 [00:23<01:17,  1.54it/s]



 24%|██▍       | 37/155 [00:23<01:16,  1.54it/s]
 25%|██▍       | 38/155 [00:24<01:15,  1.55it/s]
 25%|██▌       | 39/155 [00:25<01:14,  1.55it/s]
 26%|██▌       | 40/155 [00:25<01:13,  1.56it/s]
 26%|██▋       | 41/155 [00:26<01:13,  1.56it/s]
 27%|██▋       | 42/155 [00:27<01:12,  1.56it/s]
 28%|██▊       | 43/155 [00:27<01:11,  1.56it/s]
 28%|██▊       | 44/155 [00:28<01:11,  1.56it/s]



 29%|██▉       | 45/155 [00:28<01:10,  1.56it/s]
 30%|██▉       | 46/155 [00:29<01:09,  1.56it/s]
 30%|███       | 47/155 [00:30<01:09,  1.56it/s]
 31%|███       | 48/155 [00:30<01:08,  1.56it/s]
 32%|███▏      | 49/155 [00:31<01:07,  1.56it/s]
 32%|███▏      | 50/155 [00:32<01:07,  1.56it/s]
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.44it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.12it/s]
  9%|▉         | 11/125 [00:00<00:04, 25.99it/s]
 11%|█         | 14/125 [00:00<00:04, 25.43it/s]
 14%|█▎        | 17/125 [00:00<00:04, 25.11it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.88it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.69it/s]


[2m[36m(_objective pid=3827416)[0m 
 35%|███▌      | 44/125 [00:01<00:03, 24.43it/s][A
[2m[36m(_objective pid=3827416)[0m 
 38%|███▊      | 47/125 [00:01<00:03, 24.41it/s][A
[2m[36m(_objective pid=3827416)[0m 
 40%|████      | 50/125 [00:02<00:03, 24.38it/s][A
[2m[36m(_objective pid=3827416)[0m 
 42%|████▏     | 53/125 [00:02<00:02, 24.40it/s][A
[2m[36m(_objective pid=3827416)[0m 
 45%|████▍     | 56/125 [00:02<00:02, 24.39it/s][A
[2m[36m(_objective pid=3827416)[0m 
 47%|████▋     | 59/125 [00:02<00:02, 24.32it/s][A
[2m[36m(_objective pid=3827416)[0m 
 50%|████▉     | 62/125 [00:02<00:02, 24.35it/s][A
[2m[36m(_objective pid=3827416)[0m 
 52%|█████▏    | 65/125 [00:02<00:02, 24.20it/s][A
[2m[36m(_objective pid=3827416)[0m 
 54%|█████▍    | 68/125 [00:02<00:02, 24.15it/s][A
[2m[36m(_objective pid=3827416)[0m 
 57%|█████▋    | 71/125 [00:02<00:02, 24.23it/s][A
[2m[36m(_objective pid=3827416)[0m 
 59%|█████▉    | 74/125 [00:02<00:02, 24.27it/s][A

Result for _objective_2ed07_00002:
  date: 2022-10-19_02-55-04
  done: false
  episodes_total: 0
  epoch: 1.61
  eval_accuracy: 0.833
  eval_f1: 0.8258602711157456
  eval_loss: 0.4038117229938507
  eval_runtime: 6.3048
  eval_samples_per_second: 158.61
  eval_steps_per_second: 19.826
  experiment_id: aab49e31c6fa43f48990833c86d7c485
  hostname: 3481a8a2ae33
  iterations_since_restore: 1
  node_ip: 172.17.0.3
  objective: 1.6588602711157456
  pid: 3827416
  time_since_restore: 42.14650893211365
  time_this_iter_s: 42.14650893211365
  time_total_s: 284.90669322013855
  timestamp: 1666148104
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 1
  trial_id: 2ed07_00002
  warmup_time: 0.0033295154571533203
  
[2m[36m(_objective pid=3827416)[0m {'eval_loss': 0.4038117229938507, 'eval_accuracy': 0.833, 'eval_f1': 0.8258602711157456, 'eval_runtime': 6.3048, 'eval_samples_per_second': 158.61, 'eval_steps_per_second': 19.826, 'epoch': 1.61}


                                                
 32%|███▏      | 50/155 [00:38<01:07,  1.56it/s] 
100%|██████████| 125/125 [00:06<00:00, 24.29it/s][A
                                                 [A
 33%|███▎      | 51/155 [00:39<04:23,  2.53s/it]
 34%|███▎      | 52/155 [00:39<03:22,  1.96s/it]
[2m[36m(_objective pid=3827416)[0m   nn.utils.clip_grad_norm_(
 34%|███▍      | 53/155 [00:40<02:39,  1.56s/it]
 35%|███▍      | 54/155 [00:41<02:09,  1.29s/it]
 35%|███▌      | 55/155 [00:41<01:49,  1.09s/it]
 36%|███▌      | 56/155 [00:42<01:34,  1.05it/s]
 37%|███▋      | 57/155 [00:42<01:24,  1.16it/s]



 37%|███▋      | 58/155 [00:43<01:17,  1.26it/s]
 38%|███▊      | 59/155 [00:44<01:11,  1.34it/s]
 39%|███▊      | 60/155 [00:44<01:07,  1.40it/s]
 39%|███▉      | 61/155 [00:45<01:05,  1.44it/s]
 40%|████      | 62/155 [00:46<01:02,  1.48it/s]
 41%|████      | 63/155 [00:46<01:05,  1.40it/s]
 41%|████▏     | 64/155 [00:47<01:02,  1.45it/s]
 42%|████▏     | 65/155 [00:48<01:00,  1.48it/s]



 43%|████▎     | 66/155 [00:48<00:59,  1.51it/s]
 43%|████▎     | 67/155 [00:49<00:57,  1.52it/s]
 44%|████▍     | 68/155 [00:50<00:56,  1.53it/s]
 45%|████▍     | 69/155 [00:50<00:55,  1.54it/s]
 45%|████▌     | 70/155 [00:51<00:54,  1.55it/s]
 46%|████▌     | 71/155 [00:52<00:54,  1.55it/s]
 46%|████▋     | 72/155 [00:52<00:53,  1.56it/s]
 47%|████▋     | 73/155 [00:53<00:52,  1.56it/s]



 48%|████▊     | 74/155 [00:53<00:51,  1.56it/s]
 48%|████▊     | 75/155 [00:54<00:51,  1.56it/s]
 49%|████▉     | 76/155 [00:55<00:50,  1.56it/s]
 50%|████▉     | 77/155 [00:55<00:49,  1.56it/s]
 50%|█████     | 78/155 [00:56<00:49,  1.56it/s]
 51%|█████     | 79/155 [00:57<00:48,  1.56it/s]
 52%|█████▏    | 80/155 [00:57<00:47,  1.56it/s]
 52%|█████▏    | 81/155 [00:58<00:47,  1.56it/s]



 53%|█████▎    | 82/155 [00:59<00:46,  1.56it/s]
 54%|█████▎    | 83/155 [00:59<00:46,  1.56it/s]
 54%|█████▍    | 84/155 [01:00<00:45,  1.56it/s]
 55%|█████▍    | 85/155 [01:01<00:44,  1.56it/s]
 55%|█████▌    | 86/155 [01:01<00:44,  1.56it/s]
 56%|█████▌    | 87/155 [01:02<00:43,  1.56it/s]
 57%|█████▋    | 88/155 [01:02<00:42,  1.56it/s]



 57%|█████▋    | 89/155 [01:03<00:42,  1.56it/s]
 58%|█████▊    | 90/155 [01:04<00:41,  1.56it/s]
 59%|█████▊    | 91/155 [01:04<00:40,  1.56it/s]
 59%|█████▉    | 92/155 [01:05<00:40,  1.56it/s]
 60%|██████    | 93/155 [01:06<00:39,  1.56it/s]
 61%|██████    | 94/155 [01:06<00:41,  1.46it/s]
 61%|██████▏   | 95/155 [01:07<00:40,  1.49it/s]
 62%|██████▏   | 96/155 [01:08<00:39,  1.51it/s]



 63%|██████▎   | 97/155 [01:08<00:38,  1.52it/s]
 63%|██████▎   | 98/155 [01:09<00:37,  1.54it/s]
 64%|██████▍   | 99/155 [01:10<00:36,  1.54it/s]
 65%|██████▍   | 100/155 [01:10<00:35,  1.55it/s]
  0%|          | 0/125 [00:00<?, ?it/s]
  3%|▎         | 4/125 [00:00<00:03, 32.53it/s]
  6%|▋         | 8/125 [00:00<00:04, 27.13it/s]
  9%|▉         | 11/125 [00:00<00:04, 25.99it/s]
 11%|█         | 14/125 [00:00<00:04, 25.39it/s]
 14%|█▎        | 17/125 [00:00<00:04, 25.07it/s]
 16%|█▌        | 20/125 [00:00<00:04, 24.69it/s]
 18%|█▊        | 23/125 [00:00<00:04, 24.59it/s]
 21%|██        | 26/125 [00:01<00:04, 24.40it/s]


[2m[36m(_objective pid=3827416)[0m 
 57%|█████▋    | 71/125 [00:02<00:02, 24.37it/s][A
[2m[36m(_objective pid=3827416)[0m 
 59%|█████▉    | 74/125 [00:03<00:02, 24.35it/s][A
[2m[36m(_objective pid=3827416)[0m 
 62%|██████▏   | 77/125 [00:03<00:01, 24.36it/s][A
[2m[36m(_objective pid=3827416)[0m 
 64%|██████▍   | 80/125 [00:03<00:01, 24.32it/s][A
[2m[36m(_objective pid=3827416)[0m 
 66%|██████▋   | 83/125 [00:03<00:01, 24.20it/s][A
[2m[36m(_objective pid=3827416)[0m 
 69%|██████▉   | 86/125 [00:03<00:01, 24.19it/s][A
[2m[36m(_objective pid=3827416)[0m 
 71%|███████   | 89/125 [00:03<00:01, 24.23it/s][A
[2m[36m(_objective pid=3827416)[0m 
 74%|███████▎  | 92/125 [00:03<00:01, 24.26it/s][A
[2m[36m(_objective pid=3827416)[0m 
 76%|███████▌  | 95/125 [00:03<00:01, 24.27it/s][A
[2m[36m(_objective pid=3827416)[0m 
 78%|███████▊  | 98/125 [00:03<00:01, 24.28it/s][A
[2m[36m(_objective pid=3827416)[0m 
 81%|████████  | 101/125 [00:04<00:00, 24.31it/s][

Result for _objective_2ed07_00002:
  date: 2022-10-19_02-55-44
  done: false
  episodes_total: 0
  epoch: 3.22
  eval_accuracy: 0.951
  eval_f1: 0.9533777354900095
  eval_loss: 0.16220815479755402
  eval_runtime: 6.968
  eval_samples_per_second: 143.514
  eval_steps_per_second: 17.939
  experiment_id: aab49e31c6fa43f48990833c86d7c485
  hostname: 3481a8a2ae33
  iterations_since_restore: 2
  node_ip: 172.17.0.3
  objective: 1.9043777354900095
  pid: 3827416
  time_since_restore: 81.3965699672699
  time_this_iter_s: 39.25006103515625
  time_total_s: 324.1567542552948
  timestamp: 1666148144
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 2
  trial_id: 2ed07_00002
  warmup_time: 0.0033295154571533203
  
[2m[36m(_objective pid=3827416)[0m {'eval_loss': 0.16220815479755402, 'eval_accuracy': 0.951, 'eval_f1': 0.9533777354900095, 'eval_runtime': 6.968, 'eval_samples_per_second': 143.514, 'eval_steps_per_second': 17.939, 'epoch': 3.22}


                                                 
 65%|██████▍   | 100/155 [01:17<00:35,  1.55it/s]
100%|██████████| 125/125 [00:06<00:00, 24.33it/s][A
                                                 [A
 65%|██████▌   | 101/155 [01:18<02:27,  2.74s/it]
 66%|██████▌   | 102/155 [01:19<01:51,  2.11s/it]
 66%|██████▋   | 103/155 [01:19<01:26,  1.67s/it]
 67%|██████▋   | 104/155 [01:20<01:09,  1.36s/it]
 68%|██████▊   | 105/155 [01:20<00:57,  1.14s/it]
 68%|██████▊   | 106/155 [01:21<00:48,  1.01it/s]
 69%|██████▉   | 107/155 [01:22<00:42,  1.13it/s]



 70%|██████▉   | 108/155 [01:22<00:38,  1.23it/s]
 70%|███████   | 109/155 [01:23<00:34,  1.31it/s]
 71%|███████   | 110/155 [01:24<00:32,  1.38it/s]
 72%|███████▏  | 111/155 [01:24<00:30,  1.43it/s]
 72%|███████▏  | 112/155 [01:25<00:29,  1.47it/s]
 73%|███████▎  | 113/155 [01:26<00:28,  1.49it/s]
 74%|███████▎  | 114/155 [01:26<00:27,  1.51it/s]
 74%|███████▍  | 115/155 [01:27<00:26,  1.53it/s]



 75%|███████▍  | 116/155 [01:27<00:25,  1.54it/s]
 75%|███████▌  | 117/155 [01:28<00:24,  1.55it/s]
 76%|███████▌  | 118/155 [01:29<00:23,  1.55it/s]
 77%|███████▋  | 119/155 [01:29<00:23,  1.55it/s]
 77%|███████▋  | 120/155 [01:30<00:22,  1.56it/s]
 78%|███████▊  | 121/155 [01:31<00:21,  1.56it/s]
 79%|███████▊  | 122/155 [01:31<00:21,  1.56it/s]
 79%|███████▉  | 123/155 [01:32<00:20,  1.56it/s]



 80%|████████  | 124/155 [01:33<00:19,  1.56it/s]
 81%|████████  | 125/155 [01:33<00:20,  1.45it/s]
 81%|████████▏ | 126/155 [01:34<00:19,  1.49it/s]
 82%|████████▏ | 127/155 [01:35<00:18,  1.51it/s]
 83%|████████▎ | 128/155 [01:35<00:17,  1.52it/s]
 83%|████████▎ | 129/155 [01:36<00:16,  1.54it/s]
 84%|████████▍ | 130/155 [01:37<00:16,  1.54it/s]



 85%|████████▍ | 131/155 [01:37<00:15,  1.55it/s]
 85%|████████▌ | 132/155 [01:38<00:14,  1.55it/s]
 86%|████████▌ | 133/155 [01:39<00:14,  1.56it/s]
 86%|████████▋ | 134/155 [01:39<00:13,  1.56it/s]
 87%|████████▋ | 135/155 [01:40<00:12,  1.56it/s]
 88%|████████▊ | 136/155 [01:40<00:12,  1.56it/s]
 88%|████████▊ | 137/155 [01:41<00:11,  1.56it/s]
 89%|████████▉ | 138/155 [01:42<00:10,  1.56it/s]



 90%|████████▉ | 139/155 [01:42<00:10,  1.56it/s]
 90%|█████████ | 140/155 [01:43<00:09,  1.56it/s]
 91%|█████████ | 141/155 [01:44<00:08,  1.56it/s]
 92%|█████████▏| 142/155 [01:44<00:08,  1.56it/s]
 92%|█████████▏| 143/155 [01:45<00:07,  1.56it/s]
 93%|█████████▎| 144/155 [01:46<00:07,  1.56it/s]
 94%|█████████▎| 145/155 [01:46<00:06,  1.56it/s]
 94%|█████████▍| 146/155 [01:47<00:05,  1.56it/s]



 95%|█████████▍| 147/155 [01:47<00:05,  1.56it/s]
 95%|█████████▌| 148/155 [01:48<00:04,  1.56it/s]
 96%|█████████▌| 149/155 [01:49<00:03,  1.56it/s]
 97%|█████████▋| 150/155 [01:49<00:03,  1.56it/s]
[2m[36m(_objective pid=3827416)[0m 
  0%|          | 0/125 [00:00<?, ?it/s][A
[2m[36m(_objective pid=3827416)[0m 
  3%|▎         | 4/125 [00:00<00:03, 31.81it/s][A
[2m[36m(_objective pid=3827416)[0m 
  6%|▋         | 8/125 [00:00<00:04, 26.99it/s][A
[2m[36m(_objective pid=3827416)[0m 
  9%|▉         | 11/125 [00:00<00:04, 25.92it/s][A
[2m[36m(_objective pid=3827416)[0m 
 11%|█         | 14/125 [00:00<00:04, 25.16it/s][A
[2m[36m(_objective pid=3827416)[0m 
 14%|█▎        | 17/125 [00:00<00:04, 24.90it/s][A
[2m[36m(_objective pid=3827416)[0m 
 16%|█▌        | 20/125 [00:00<00:04, 24.74it/s][A
[2m[36m(_objective pid=3827416)[0m 
 18%|█▊        | 23/125 [00:00<00:04, 24.63it/s][A
[2m[36m(_objective pid=3827416)[0m 
 21%|██        | 26/125 [00:01<00:04, 24.53i


[2m[36m(_objective pid=3827416)[0m 
 57%|█████▋    | 71/125 [00:02<00:02, 24.42it/s][A
[2m[36m(_objective pid=3827416)[0m 
 59%|█████▉    | 74/125 [00:02<00:02, 24.43it/s][A
[2m[36m(_objective pid=3827416)[0m 
 62%|██████▏   | 77/125 [00:03<00:01, 24.41it/s][A
[2m[36m(_objective pid=3827416)[0m 
 64%|██████▍   | 80/125 [00:03<00:01, 24.40it/s][A
[2m[36m(_objective pid=3827416)[0m 
 66%|██████▋   | 83/125 [00:03<00:01, 24.41it/s][A
[2m[36m(_objective pid=3827416)[0m 
 69%|██████▉   | 86/125 [00:03<00:01, 24.40it/s][A
[2m[36m(_objective pid=3827416)[0m 
 71%|███████   | 89/125 [00:03<00:01, 24.39it/s][A
[2m[36m(_objective pid=3827416)[0m 
 74%|███████▎  | 92/125 [00:03<00:01, 24.40it/s][A
[2m[36m(_objective pid=3827416)[0m 
 76%|███████▌  | 95/125 [00:03<00:01, 24.39it/s][A
[2m[36m(_objective pid=3827416)[0m 
 78%|███████▊  | 98/125 [00:03<00:01, 24.40it/s][A
[2m[36m(_objective pid=3827416)[0m 
 81%|████████  | 101/125 [00:04<00:00, 24.39it/s][

Result for _objective_2ed07_00002:
  date: 2022-10-19_02-56-22
  done: false
  episodes_total: 0
  epoch: 4.83
  eval_accuracy: 0.973
  eval_f1: 0.9744075829383886
  eval_loss: 0.09452543407678604
  eval_runtime: 6.4908
  eval_samples_per_second: 154.063
  eval_steps_per_second: 19.258
  experiment_id: aab49e31c6fa43f48990833c86d7c485
  hostname: 3481a8a2ae33
  iterations_since_restore: 3
  node_ip: 172.17.0.3
  objective: 1.9474075829383886
  pid: 3827416
  time_since_restore: 120.05144333839417
  time_this_iter_s: 38.65487337112427
  time_total_s: 362.81162762641907
  timestamp: 1666148182
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 3
  trial_id: 2ed07_00002
  warmup_time: 0.0033295154571533203
  
[2m[36m(_objective pid=3827416)[0m {'eval_loss': 0.09452543407678604, 'eval_accuracy': 0.973, 'eval_f1': 0.9744075829383886, 'eval_runtime': 6.4908, 'eval_samples_per_second': 154.063, 'eval_steps_per_second': 19.258, 'epoch': 4.83}


                                                 
 97%|█████████▋| 150/155 [01:56<00:03,  1.56it/s]
100%|██████████| 125/125 [00:06<00:00, 24.25it/s][A
                                                 [A
 97%|█████████▋| 151/155 [01:57<00:10,  2.59s/it]
 98%|█████████▊| 152/155 [01:57<00:06,  2.00s/it]
 99%|█████████▊| 153/155 [01:58<00:03,  1.59s/it]
 99%|█████████▉| 154/155 [01:58<00:01,  1.31s/it]
100%|██████████| 155/155 [01:59<00:00,  1.30it/s]
2022-10-19 02:56:26,167	INFO tune.py:758 -- Total run time: 1047.72 seconds (1047.61 seconds for the tuning loop).


Result for _objective_2ed07_00002:
  date: 2022-10-19_02-56-22
  done: true
  episodes_total: 0
  epoch: 4.83
  eval_accuracy: 0.973
  eval_f1: 0.9744075829383886
  eval_loss: 0.09452543407678604
  eval_runtime: 6.4908
  eval_samples_per_second: 154.063
  eval_steps_per_second: 19.258
  experiment_id: aab49e31c6fa43f48990833c86d7c485
  experiment_tag: 2_num_train_epochs=5
  hostname: 3481a8a2ae33
  iterations_since_restore: 3
  node_ip: 172.17.0.3
  objective: 1.9474075829383886
  pid: 3827416
  time_since_restore: 120.05144333839417
  time_this_iter_s: 38.65487337112427
  time_total_s: 362.81162762641907
  timestamp: 1666148182
  timesteps_since_restore: 0
  timesteps_total: 0
  training_iteration: 3
  trial_id: 2ed07_00002
  warmup_time: 0.0033295154571533203
  
== Status ==
Current time: 2022-10-19 02:56:26 (running for 00:17:27.61)
Memory usage on this node: 13.3/31.1 GiB
PopulationBasedTraining: 5 checkpoints, 0 perturbs
Resources requested: 0/20 CPUs, 0/1 GPUs, 0.0/11.46 GiB heap

In [27]:
result

BestRun(run_id='2ed07_00002', objective=1.9474075829383886, hyperparameters={'num_train_epochs': 5, 'weight_decay': 0.025440804525314963, 'learning_rate': 2.7455269030181142e-05, 'warmup_ratio': 0.2543007778205497})
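The `objective` of 1.9474 reported for the best run equals `eval_accuracy + eval_f1` from the same trial (0.973 + 0.9744). This is consistent with what `Trainer.hyperparameter_search` does when no `compute_objective` is passed: it appears to sum the evaluation metrics after dropping the loss, the epoch counter, and the timing fields. The sketch below reproduces that arithmetic in plain Python. `sum_of_metrics` is a hypothetical helper written for illustration, not the library function itself.

```python
# Metrics copied from the best trial's log above.
eval_metrics = {
    "eval_loss": 0.09452543407678604,
    "eval_accuracy": 0.973,
    "eval_f1": 0.9744075829383886,
    "eval_runtime": 6.4908,
    "eval_samples_per_second": 154.063,
    "eval_steps_per_second": 19.258,
    "epoch": 4.83,
}

# Assumption: the objective is the sum of the quality metrics only,
# ignoring the loss, the epoch counter and the speed fields.
IGNORED_SUFFIXES = ("_runtime", "_per_second")


def sum_of_metrics(metrics):
    """Hypothetical helper: sum every metric that is not loss, epoch or timing."""
    kept = {
        name: value
        for name, value in metrics.items()
        if name not in ("eval_loss", "epoch")
        and not name.endswith(IGNORED_SUFFIXES)
    }
    return sum(kept.values())


objective = sum_of_metrics(eval_metrics)
print(round(objective, 4))  # 1.9474
```

Because both metrics are in the range 0 to 1 and higher is better, this objective is meant to be maximized, so its upper bound here is 2.0.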

In [28]:
for n, v in result.hyperparameters.items():
    setattr(trainer.args, n, v)
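The loop above copies each tuned value onto `trainer.args` by attribute name. The sketch below shows the same pattern on a stand-in object, so it can be run without a trainer. `DemoArgs`, its default values, and `apply_hyperparameters` are hypothetical, and the `hasattr` guard is an added precaution that the original cell does not have.

```python
from dataclasses import dataclass


@dataclass
class DemoArgs:
    """Stand-in for the trainer's argument object (hypothetical defaults)."""
    num_train_epochs: float = 3.0
    weight_decay: float = 0.0
    learning_rate: float = 5e-05
    warmup_ratio: float = 0.0


def apply_hyperparameters(args, hyperparameters):
    """Copy tuned values onto args, rejecting names that args does not have."""
    for name, value in hyperparameters.items():
        if not hasattr(args, name):
            raise AttributeError(f"unknown hyperparameter: {name}")
        setattr(args, name, value)
    return args


# Values taken from the BestRun output above.
best = {
    "num_train_epochs": 5,
    "weight_decay": 0.025440804525314963,
    "learning_rate": 2.7455269030181142e-05,
    "warmup_ratio": 0.2543007778205497,
}

demo_args = apply_hyperparameters(DemoArgs(), best)
print(demo_args.num_train_epochs, demo_args.learning_rate)
```

The guard matters because a plain `setattr` silently creates a new attribute when a name is misspelled, in which case the tuned value would never reach training.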

In [29]:
# trainer.args

In [30]:
trainer.train()

loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--klue--roberta-base/snapshots/67dd433d36ebc66a42c9aaa85abcf8d2620e41d9/pytorch_model.bin
Some weights of the model checkpoint at klue/roberta-base were not used when initializing RobertaForSequenceClassification: ['lm_head.layer_norm.bias', 'lm_head.bias', 'lm_head.layer_norm.weight', 'lm_head.decoder.bias', 'lm_head.dense.weight', 'lm_head.decoder.weight', 'lm_head.dense.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequ

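| Step | Training Loss | Validation Loss | Accuracy | F1 |
|------|---------------|-----------------|----------|----|
| 50 | No log | 0.403812 | 0.833 | 0.82586 |
| 100 | No log | 0.162208 | 0.951 | 0.953378 |
| 150 | No log | 0.094525 | 0.973 | 0.974408 |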


The following columns in the evaluation set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 1000
  Batch size = 8

TrainOutput(global_step=155, training_loss=0.372362789030998, metrics={'train_runtime': 118.8512, 'train_samples_per_second': 42.069, 'train_steps_per_second': 1.304, 'total_flos': 1313450388357120.0, 'train_loss': 0.372362789030998, 'epoch': 4.99})

In [31]:
trainer.evaluate()

The following columns in the evaluation set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 1000
  Batch size = 8


{'eval_loss': 0.09378402680158615,
 'eval_accuracy': 0.974,
 'eval_f1': 0.9753787878787878,
 'eval_runtime': 6.0945,
 'eval_samples_per_second': 164.082,
 'eval_steps_per_second': 20.51,
 'epoch': 4.99}

In [32]:
pred = trainer.predict(test_dataset=test_dataset)
pred

The following columns in the test set don't have a corresponding argument in `RobertaForSequenceClassification.forward` and have been ignored: text. If text are not expected by `RobertaForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Prediction *****
  Num examples = 50000
  Batch size = 8


PredictionOutput(predictions=array([[-1.581 ,  2.074 ],
       [ 0.7646, -0.9473],
       [ 0.2524, -0.322 ],
       ...,
       [-0.495 ,  0.973 ],
       [ 1.955 , -2.412 ],
       [ 1.684 , -2.037 ]], dtype=float16), label_ids=array([1, 0, 0, ..., 0, 0, 0]), metrics={'test_loss': 0.4627728760242462, 'test_accuracy': 0.8405, 'test_f1': 0.845580404685836, 'test_runtime': 262.3031, 'test_samples_per_second': 190.619, 'test_steps_per_second': 23.827})

In [33]:
# Ground-truth labels, and predicted classes (index of the largest logit per row)
label_test = pred.label_ids.tolist()
pred_test = pred.predictions.argmax(axis=-1).tolist()
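`pred.predictions` holds one row of raw logits per test example, and the predicted class is the index of the largest logit in that row. Below is a minimal plain-Python sketch of that step, using the first three rows printed above. `argmax` and `softmax` are hypothetical helpers. The softmax is included only to show that converting logits to probabilities does not change which class wins.

```python
import math

# First three logit rows from the PredictionOutput above.
logits = [
    [-1.581, 2.074],
    [0.7646, -0.9473],
    [0.2524, -0.322],
]


def argmax(row):
    """Index of the largest value (the first one on ties)."""
    return max(range(len(row)), key=lambda i: row[i])


def softmax(row):
    """Numerically stable softmax over one row of logits."""
    shift = max(row)
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


predicted = [argmax(row) for row in logits]
probabilities = [softmax(row) for row in logits]

print(predicted)  # [1, 0, 0]
```

For these three rows the predicted classes match the first three entries of `label_ids` shown in the output.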

In [34]:
print(confusion_matrix(label_test, pred_test))

[[20190  4637]
 [ 3338 21835]]


In [35]:
accuracy = accuracy_score(label_test, pred_test)
f1 = f1_score(label_test, pred_test)
recall = recall_score(label_test, pred_test)
precision = precision_score(label_test, pred_test)

print(accuracy)
print(f1)
print(recall)
print(precision)

0.8405
0.845580404685836
0.867397608548842
0.8248337866424902
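The four scores follow directly from the confusion matrix printed earlier. In scikit-learn's layout, rows are true labels and columns are predicted labels, so the counts are TN = 20190, FP = 4637, FN = 3338, TP = 21835. Below is a short plain-Python check. It treats label 1 as the positive class, which is the scikit-learn default for binary labels.

```python
# Counts read from the confusion matrix above:
# rows are true labels, columns are predicted labels.
tn, fp = 20190, 4637
fn, tp = 3338, 21835

total = tn + fp + fn + tp
accuracy = (tp + tn) / total
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(total)                # 50000
print(round(accuracy, 4))   # 0.8405
print(round(f1, 4))         # 0.8456
print(round(recall, 4))     # 0.8674
print(round(precision, 4))  # 0.8248
```

Recall is higher than precision because there are more false positives (4637) than false negatives (3338): the model leans slightly toward predicting class 1.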


In [None]:
# model_path = "test-model"
# trainer.model.save_pretrained(model_path)
# tokenizer.save_pretrained(model_path)
