# 0. GPU check

* 이 코드는 Nvidia GPU를 사용하는 컴퓨터에서, train / test 데이터가 분리되어있는 csv 파일을 사용하는 것을 전제로 작성됨

In [1]:
import torch

if torch.cuda.is_available():
    device_count = torch.cuda.device_count()
    print("device_count: {}".format(device_count))
    for device_num in range(device_count):
        print("device {} capability {}".format(
            device_num,
            torch.cuda.get_device_capability(device_num)))
        print("device {} name {}".format(
            device_num, 
            torch.cuda.get_device_name(device_num)))
else:
    print("no cuda device")

device_count: 1
device 0 capability (8, 6)
device 0 name NVIDIA GeForce RTX 3080


In [2]:
if torch.cuda.is_available() :
    device = torch.device("cuda:0")
else : 
    device = torch.device("cpu")

In [3]:
from pynvml import *

def print_gpu_utilization():
    nvmlInit()
    handle = nvmlDeviceGetHandleByIndex(0)
    info = nvmlDeviceGetMemoryInfo(handle)
    print(f"GPU memory occupied: {info.used//1024**2} MB.")

def print_summary(result):
    print(f"Time: {result.metrics['train_runtime']:.2f}")
    print(f"Samples/second: {result.metrics['train_samples_per_second']:.2f}")
    print_gpu_utilization()
    
print_gpu_utilization()

GPU memory occupied: 424 MB.


* 모델 훈련과정에서 GPU 메모리 용량 초과 시, 개발서버 콘솔에서 직접 `nvidia-smi` 명령어 실행 후 메모리를 점유하고 있는 process의 PID를 찾아 `sudo kill -9 {pid}` 로 프로세스 종료해주면 됨

# 1. Import packages

In [4]:
## Need to check if packages are compatible
# !pip install accelerate nvidia-ml-py3
# !pip install datasets==2.4.0
# !pip install huggingface_hub==0.9.1
# !pip install transformers==4.22.1 # bf16, tf32 등 사용하려면 4.2 이상 필요
# !pip install pyarrow==9.0.0

* huggingface_hub와 transformers 간 호환가능한 버전 확인 필요
* 만약 성능 테스트를 위해 datasets api를 사용할거라면 datasets 역시 호환 가능 버전 확인해야 함
* 세 가지 dependencies를 사용한다는 가정 하에, pyarrow 라이브러리도 필요.

In [5]:
## Install libraries for optimizing hyperparameters

# !pip install ray optuna
# !pip install sigopt
# !pip install wandb

In [6]:
import transformers
import datasets
import huggingface_hub
import pyarrow

print(transformers.__version__)
print(datasets.__version__)
print(huggingface_hub.__version__)
print(pyarrow.__version__)

# 4.22.1
# 2.4.0
# 0.9.1
# 9.0.0

4.22.1
2.4.0
0.9.1
9.0.0


In [7]:
import os
import re
import math
import numpy as np
import pandas as pd

# 'You can use tf32' if you are acessing Ampere hardware
import torch
torch.backends.cuda.matmul.allow_tf32 = True

from datasets import load_dataset, load_metric, ClassLabel
from sklearn.utils.class_weight import compute_class_weight
import ray
from ray import tune
from ray.tune import CLIReporter
from ray.tune.examples.pbt_transformers.utils import (
    download_data,
    build_compute_metrics_fn,
)
from ray.tune.schedulers import PopulationBasedTraining
from transformers import (
    glue_tasks_num_labels,
    AutoConfig,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    GlueDataset,
    GlueDataTrainingArguments,
    TrainingArguments,
)

# 2. Import Data

* xxx_train.csv, xxx_test.csv 파일은 아래 형식으로 전처리된 csv 파일이어야 함 (column name: `text`, `label`)


<table class="features-table">
  <tr>
    <th class="mdc-text-light-green-600", style="text-align:center">
    text
    </th>
    <th class="mdc-text-purple-600", style="text-align:center">
    label
    </th>
  </tr>
  <tr>
    <td class="mdc-bg-light-green-50" style="text-align:left">
      Go until jurong point, crazy.. Available only in bugis n great world la e buffet... Cine there got amore wat...
    </td>
    <td class="mdc-bg-purple-50">
      0
    </td>
  </tr>
  <tr>
    <td class="mdc-bg-light-green-50" style="text-align:left">
      Ok lar... Joking wif u oni...
    </td>
    <td class="mdc-bg-purple-50">
      0
    </td>
  </tr>
  <tr>
    <td class="mdc-bg-light-green-50" style="text-align:left">
      Free entry in 2 a wkly comp to win FA Cup final tkts 21st May 2005. Text FA to 87121 to receive entry question(std txt rate)
    </td>
    <td class="mdc-bg-purple-50">
      1
    </td>
  </tr>
  <tr>
    <td class="mdc-bg-light-green-50" style="text-align:left">
      U dun say so early hor... U c already then say...
    </td>
    <td class="mdc-bg-purple-50">
      0
    </td>
  </tr>
  <tr>
    <td class="mdc-bg-light-green-50" style="text-align:left">
      Nah I don't think he goes to usf, he lives around here though
    </td>
    <td class="mdc-bg-purple-50">
      0
    </td>
  </tr>
</table>

In [8]:
data_name = "IMDB" ## covid_articles / financial_news / IMDB / naver_movie_review / spam

dataset = load_dataset('csv', data_files={'train': f'../data_splited/{data_name}_train.csv',
                                          'test': f'../data_splited/{data_name}_test.csv'})
dataset

Using custom data configuration default-5e9b2acfce9f0b59
Reusing dataset csv (/root/.cache/huggingface/datasets/csv/default-5e9b2acfce9f0b59/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a)


  0%|          | 0/2 [00:00<?, ?it/s]

DatasetDict({
    train: Dataset({
        features: ['text', 'label'],
        num_rows: 39999
    })
    test: Dataset({
        features: ['text', 'label'],
        num_rows: 9999
    })
})

# 3. Data Preprocessing

* load_dataset 함수로 불러온 데이터를 수정할 때는 수정 내용을 담은 함수를 만들고, 이를 map 함수로 각 원소에 적용함 ([링크](https://huggingface.co/docs/datasets/v1.4.0/processing.html#processing-data-row-by-row)에서 확인)

In [9]:
## remove specal characters

def remove_sp(example):
    example["text"]=re.sub(r'[^a-z|A-Z|0-9|ㄱ-ㅎ|ㅏ-ㅣ|가-힣| ]+', '', str(example["text"]))
    return example

dataset = dataset.map(remove_sp)

Loading cached processed dataset at /root/.cache/huggingface/datasets/csv/default-5e9b2acfce9f0b59/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a/cache-0c659aeae188f731.arrow
Loading cached processed dataset at /root/.cache/huggingface/datasets/csv/default-5e9b2acfce9f0b59/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a/cache-170d6a353647063a.arrow


In [10]:
## label encoding

labels = list(set(dataset["train"]["label"] + dataset["test"]["label"]))
num_labels = len(labels)

def encoding_label(example):
    str_to_int = ClassLabel(num_classes=num_labels, names=labels)
    example["label"]=str_to_int.str2int(example["label"])
    return example

if type(labels[0]) == str:
    dataset = dataset.map(encoding_label)

Loading cached processed dataset at /root/.cache/huggingface/datasets/csv/default-5e9b2acfce9f0b59/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a/cache-82b60351b1fd1890.arrow
Loading cached processed dataset at /root/.cache/huggingface/datasets/csv/default-5e9b2acfce9f0b59/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a/cache-0fbc2fb5dafc34c1.arrow


# 4. Load PLM & Tokenizing

In [11]:
# model_name = "bert-base-multilingual-cased"
# model_name = "klue/bert-base"
# model_name = "klue/roberta-base"
model_name = "xlm-roberta-base"

In [12]:
# Download cache tokenizer

tokenizer = AutoTokenizer.from_pretrained(model_name)

In [13]:
def tokenize_function(examples):
    tokenized_batch = tokenizer(examples["text"], padding="max_length", truncation=True) # padding : ['longest', 'max_length', 'do_not_pad']
    return tokenized_batch

In [14]:
tokenized_datasets = dataset.map(tokenize_function, batched=True)

Loading cached processed dataset at /root/.cache/huggingface/datasets/csv/default-5e9b2acfce9f0b59/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a/cache-5d1b3f390e177513.arrow


  0%|          | 0/10 [00:00<?, ?ba/s]

In [15]:
# train_dataset = tokenized_datasets["train"].shuffle(seed=1919).select(range(0,math.floor(len(tokenized_datasets["train"])*0.7)))
# eval_dataset = tokenized_datasets["train"].shuffle(seed=1919).select(range(math.floor(len(tokenized_datasets["train"])*0.7), len(tokenized_datasets["train"])))
# test_dataset = tokenized_datasets["test"]

In [16]:
# data for test
train_dataset = tokenized_datasets["train"].shuffle(seed=1919).select(range(1000))
eval_dataset = tokenized_datasets["train"].shuffle(seed=1919).select(range(1000))
test_dataset = tokenized_datasets["test"]

Loading cached shuffled indices for dataset at /root/.cache/huggingface/datasets/csv/default-5e9b2acfce9f0b59/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a/cache-67a5ed5b34b2fec1.arrow
Loading cached shuffled indices for dataset at /root/.cache/huggingface/datasets/csv/default-5e9b2acfce9f0b59/0.0.0/652c3096f041ee27b04d2232d41f10547a8fecda3e284a79a0ec4053c916ef7a/cache-67a5ed5b34b2fec1.arrow


# 5. Check class weights

In [17]:
def class_weight(train_dataset) :
    
    train_labels = np.array(train_dataset["label"])
    class_weights = compute_class_weight(class_weight = 'balanced', classes = np.unique(train_labels), y = train_labels)
    
    weights = torch.tensor(class_weights, dtype = torch.float)
    
    return weights

In [18]:
weights = class_weight(train_dataset)
print(weights)

tensor([1.0225, 0.9785])


# 6. Modeling

In [19]:
task_data_dir = "test-model"
gpus_per_trial = 1
n_trials = 1
metric = load_metric("accuracy") # atasets.list_metrics() 

In [20]:
# Download model and features

config = AutoConfig.from_pretrained(
    model_name, 
    num_labels=num_labels
)

def model_init():
    return AutoModelForSequenceClassification.from_pretrained(
        model_name,
        config=config
        )

In [21]:
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions=np.argmax(logits, axis = -1)
    return metric.compute(predictions=predictions, references=labels)

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir='./results',          # output directory
    num_train_epochs=1,              # total number of training epochs
    per_device_train_batch_size=1,   # batch size per device during training
    per_device_eval_batch_size=10,   # batch size for evaluation
    warmup_steps=1000,               # number of warmup steps for learning rate scheduler
    weight_decay=0.01,               # strength of weight decay
    logging_dir='./logs',            # directory for storing logs
    logging_steps=200,               # How often to print logs
    do_train=True,                   # Perform training
    do_eval=True,                    # Perform evaluation
    evaluation_strategy="epoch",     # evalute after each epoch
    gradient_accumulation_steps=64,  # total number of steps before back propagation
    fp16=True,                       # Use mixed precision
    fp16_opt_level="02",             # mixed precision mode
    run_name="ProBert-BFD-MS",       # experiment name
    seed=3                           # Seed for experiment reproducibility 3x3
)
```

In [22]:
training_args = TrainingArguments(
    output_dir=".",
    learning_rate=2e-5,
    do_train=True,
    do_eval=True,
    no_cuda=gpus_per_trial <= 0,
    evaluation_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    num_train_epochs=20,  # config
    max_steps=-1,
    per_device_train_batch_size=6,  # config
    per_device_eval_batch_size=6,  # config
    warmup_steps=0,
    warmup_ratio=0.1,
    weight_decay=0.1,  # config
    logging_dir="./logs",
    skip_memory_metrics=True,
    report_to="none",
    fp16=True,
    # bf16=True,
    # tf32=True,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    seed=818
    )
    
# trainer = Trainer(
#     model_init=model_init,
#     args=training_args,
#     train_dataset=train_dataset,
#     eval_dataset=eval_dataset,
#     compute_metrics=compute_metrics,
#     )


class CustomTrainer(Trainer):
    def compute_loss(self, model, inputs, return_outputs=False):
        labels = inputs.get("labels")
        # forward pass
        outputs = model(**inputs)
        logits = outputs.get("logits")
        # compute custom loss
        weight = weights.to(device)
        loss_fct = torch.nn.CrossEntropyLoss(weight=weight)
        loss = loss_fct(logits.view(-1, self.model.config.num_labels), labels.view(-1))
        return (loss, outputs) if return_outputs else loss
    
trainer = CustomTrainer(
    model_init=model_init,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    compute_metrics=compute_metrics,
    )

loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--xlm-roberta-base/snapshots/f6d161e8f5f6f2ed433fb4023d6cb34146506b3f/pytorch_model.bin
Some weights of the model checkpoint at xlm-roberta-base were not used when initializing XLMRobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'lm_head.bias', 'roberta.pooler.dense.weight', 'lm_head.dense.weight', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.decoder.weight', 'lm_head.dense.bias']
- This IS expected if you are initializing XLMRobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing XLMRobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassificat

In [24]:
# Hyperparameter tuning with ray tune

tune_config = {
    "per_device_train_batch_size": 6, 
    "per_device_eval_batch_size": 6,
#     "num_train_epochs": tune.choice([2, 3]),
    "max_steps": -1
}

# PopulationBasedTraining
# worker might copy the model parameters from a better performing worker or explore new hyperparameters by changing the current values randomly
# cf. ASHAScheduler
scheduler = PopulationBasedTraining(
    time_attr="training_iteration",
    metric="eval_accuracy",
    mode="max",
    perturbation_interval=1,
    hyperparam_mutations={
        "weight_decay": tune.uniform(0.0, 0.3), # tune.uniform(1, 10) == np.random.uniform(1, 10)
#         "learning_rate": tune.uniform(1e-5, 5e-5),
        "per_device_train_batch_size": [6],
    },
)

reporter = CLIReporter(
    parameter_columns={
        "weight_decay": "w_decay",
        "learning_rate": "lr",
        "per_device_train_batch_size": "train_bs/gpu",
        "num_train_epochs": "num_epochs",
    },
    metric_columns=["eval_accuracy", "eval_loss", "epoch", "training_iteration"],
)

result = trainer.hyperparameter_search(
    hp_space = lambda _: tune_config,
    backend="ray",
    n_trials=n_trials,
    resources_per_trial={"cpu": 16, "gpu": gpus_per_trial},
    scheduler=scheduler,
    keep_checkpoints_num=1,
    checkpoint_score_attr="training_iteration",
    stop=None,
    progress_reporter=reporter,
    local_dir="./test-results",
    name="tune_transformer_pbt",
    log_to_file=True,
)

2022-10-11 03:55:58,262	INFO worker.py:1518 -- Started a local Ray instance.

from ray.air import session

def train(config):
    # ...
    session.report({"metric": metric}, checkpoint=checkpoint)

For more information please see https://docs.ray.io/en/master/ray-air/key-concepts.html#session

[2m[36m(pid=2373378)[0m 2022-10-11 03:56:06.788726: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0


== Status ==
Current time: 2022-10-11 03:56:05 (running for 00:00:00.16)
Memory usage on this node: 8.8/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |
|------------------------+----------+--------------------+-----------+------+----------------+--------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967 |      |              6 |              |
+------------------------+----------+--------------------+-----------+------+----------------+--------------+




[2m[36m(_objective pid=2373378)[0m Some weights of the model checkpoint at xlm-roberta-base were not used when initializing XLMRobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'lm_head.bias', 'roberta.pooler.dense.weight', 'lm_head.decoder.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.weight']
[2m[36m(_objective pid=2373378)[0m - This IS expected if you are initializing XLMRobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
[2m[36m(_objective pid=2373378)[0m - This IS NOT expected if you are initializing XLMRobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2m[36m(_objective pid=2373378)[0m Some weights

== Status ==
Current time: 2022-10-11 03:56:12 (running for 00:00:07.41)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |
|------------------------+----------+--------------------+-----------+------+----------------+--------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967 |      |              6 |              |
+------------------------+----------+--------------------+-----------+------+----------------+--------------+




  0%|          | 1/820 [00:00<07:31,  1.81it/s]
  0%|          | 2/820 [00:01<07:39,  1.78it/s]
  0%|          | 3/820 [00:01<07:44,  1.76it/s]
  0%|          | 4/820 [00:02<07:46,  1.75it/s]
  1%|          | 5/820 [00:02<07:47,  1.74it/s]
  1%|          | 6/820 [00:03<07:46,  1.74it/s]
  1%|          | 7/820 [00:03<07:44,  1.75it/s]
  1%|          | 8/820 [00:04<07:45,  1.75it/s]


== Status ==
Current time: 2022-10-11 03:56:17 (running for 00:00:12.41)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |
|------------------------+----------+--------------------+-----------+------+----------------+--------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967 |      |              6 |              |
+------------------------+----------+--------------------+-----------+------+----------------+--------------+




  1%|          | 9/820 [00:05<07:44,  1.75it/s]
  1%|          | 10/820 [00:05<07:40,  1.76it/s]
  1%|▏         | 11/820 [00:06<07:42,  1.75it/s]
  1%|▏         | 12/820 [00:06<07:42,  1.75it/s]
  2%|▏         | 13/820 [00:07<07:40,  1.75it/s]
  2%|▏         | 14/820 [00:07<07:39,  1.75it/s]
  2%|▏         | 15/820 [00:08<07:40,  1.75it/s]
  2%|▏         | 16/820 [00:09<07:40,  1.75it/s]
  2%|▏         | 17/820 [00:09<07:39,  1.75it/s]


== Status ==
Current time: 2022-10-11 03:56:22 (running for 00:00:17.41)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |
|------------------------+----------+--------------------+-----------+------+----------------+--------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967 |      |              6 |              |
+------------------------+----------+--------------------+-----------+------+----------------+--------------+




  2%|▏         | 18/820 [00:10<07:40,  1.74it/s]
  2%|▏         | 19/820 [00:10<07:40,  1.74it/s]
  2%|▏         | 20/820 [00:11<07:39,  1.74it/s]
  3%|▎         | 21/820 [00:12<07:39,  1.74it/s]
  3%|▎         | 22/820 [00:12<07:35,  1.75it/s]
  3%|▎         | 23/820 [00:13<07:35,  1.75it/s]
  3%|▎         | 24/820 [00:13<07:34,  1.75it/s]
  3%|▎         | 25/820 [00:14<07:32,  1.76it/s]
  3%|▎         | 26/820 [00:14<07:33,  1.75it/s]


== Status ==
Current time: 2022-10-11 03:56:27 (running for 00:00:22.41)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |
|------------------------+----------+--------------------+-----------+------+----------------+--------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967 |      |              6 |              |
+------------------------+----------+--------------------+-----------+------+----------------+--------------+




  3%|▎         | 27/820 [00:15<07:31,  1.75it/s]
  3%|▎         | 28/820 [00:15<07:28,  1.76it/s]
  4%|▎         | 29/820 [00:16<07:30,  1.76it/s]
  4%|▎         | 30/820 [00:17<07:31,  1.75it/s]
  4%|▍         | 31/820 [00:17<07:29,  1.75it/s]
  4%|▍         | 32/820 [00:18<07:30,  1.75it/s]
  4%|▍         | 33/820 [00:18<07:30,  1.75it/s]
  4%|▍         | 34/820 [00:19<07:28,  1.75it/s]
  4%|▍         | 35/820 [00:19<07:25,  1.76it/s]


== Status ==
Current time: 2022-10-11 03:56:32 (running for 00:00:27.41)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |
|------------------------+----------+--------------------+-----------+------+----------------+--------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967 |      |              6 |              |
+------------------------+----------+--------------------+-----------+------+----------------+--------------+




  4%|▍         | 36/820 [00:20<07:24,  1.76it/s]
  5%|▍         | 37/820 [00:21<07:25,  1.76it/s]
  5%|▍         | 38/820 [00:21<07:26,  1.75it/s]
  5%|▍         | 39/820 [00:22<07:26,  1.75it/s]
  5%|▍         | 40/820 [00:22<07:22,  1.76it/s]
  5%|▌         | 41/820 [00:23<07:23,  1.76it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 39.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.50it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.96it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.20it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.86it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.67it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.

== Status ==
Current time: 2022-10-11 03:56:37 (running for 00:00:32.41)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |
|------------------------+----------+--------------------+-----------+------+----------------+--------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967 |      |              6 |              |
+------------------------+----------+--------------------+-----------+------+----------------+--------------+




[2m[36m(_objective pid=2373378)[0m 
 27%|██▋       | 45/167 [00:01<00:03, 31.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
 29%|██▉       | 49/167 [00:01<00:03, 31.33it/s][A
[2m[36m(_objective pid=2373378)[0m 
 32%|███▏      | 53/167 [00:01<00:03, 31.22it/s][A
[2m[36m(_objective pid=2373378)[0m 
 34%|███▍      | 57/167 [00:01<00:03, 31.27it/s][A
[2m[36m(_objective pid=2373378)[0m 
 37%|███▋      | 61/167 [00:01<00:03, 31.31it/s][A
[2m[36m(_objective pid=2373378)[0m 
 39%|███▉      | 65/167 [00:02<00:03, 31.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
 41%|████▏     | 69/167 [00:02<00:03, 31.34it/s][A
[2m[36m(_objective pid=2373378)[0m 
 44%|████▎     | 73/167 [00:02<00:03, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 46%|████▌     | 77/167 [00:02<00:02, 31.24it/s][A
[2m[36m(_objective pid=2373378)[0m 
 49%|████▊     | 81/167 [00:02<00:02, 31.29it/s][A
[2m[36m(_objective pid=2373378)[0m 
 51%|█████     | 85/167 [00:02<00:02, 31.30it/s][A

[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.6999965906143188, 'eval_accuracy': 0.489, 'eval_runtime': 5.341, 'eval_samples_per_second': 187.231, 'eval_steps_per_second': 31.268, 'epoch': 0.98}
== Status ==
Current time: 2022-10-11 03:56:42 (running for 00:00:37.42)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |
|------------------------+----------+--------------------+-----------+------+----------------+--------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  

  5%|▌         | 42/820 [00:36<54:44,  4.22s/it]
  5%|▌         | 43/820 [00:36<40:22,  3.12s/it]
  5%|▌         | 44/820 [00:37<30:19,  2.34s/it]


== Status ==
Current time: 2022-10-11 03:56:50 (running for 00:00:44.69)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

  5%|▌         | 45/820 [00:37<23:17,  1.80s/it]
  6%|▌         | 46/820 [00:38<18:22,  1.42s/it]
  6%|▌         | 47/820 [00:38<14:56,  1.16s/it]
  6%|▌         | 48/820 [00:39<12:32,  1.03it/s]
  6%|▌         | 49/820 [00:39<10:50,  1.18it/s]
  6%|▌         | 50/820 [00:40<09:40,  1.33it/s]
  6%|▌         | 51/820 [00:41<08:50,  1.45it/s]
  6%|▋         | 52/820 [00:41<08:15,  1.55it/s]
  6%|▋         | 53/820 [00:42<07:51,  1.63it/s]


== Status ==
Current time: 2022-10-11 03:56:55 (running for 00:00:49.69)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

  7%|▋         | 54/820 [00:42<07:33,  1.69it/s]
  7%|▋         | 55/820 [00:43<07:21,  1.73it/s]
  7%|▋         | 56/820 [00:43<07:12,  1.77it/s]
  7%|▋         | 57/820 [00:44<07:06,  1.79it/s]
  7%|▋         | 58/820 [00:44<07:01,  1.81it/s]
  7%|▋         | 59/820 [00:45<06:58,  1.82it/s]
  7%|▋         | 60/820 [00:45<06:55,  1.83it/s]
  7%|▋         | 61/820 [00:46<06:53,  1.83it/s]
  8%|▊         | 62/820 [00:46<06:52,  1.84it/s]


== Status ==
Current time: 2022-10-11 03:57:00 (running for 00:00:54.69)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

  8%|▊         | 63/820 [00:47<06:51,  1.84it/s]
  8%|▊         | 64/820 [00:48<06:50,  1.84it/s]
  8%|▊         | 65/820 [00:48<06:49,  1.84it/s]
  8%|▊         | 66/820 [00:49<06:48,  1.84it/s]
  8%|▊         | 67/820 [00:49<06:48,  1.85it/s]
  8%|▊         | 68/820 [00:50<06:47,  1.85it/s]
  8%|▊         | 69/820 [00:50<06:46,  1.85it/s]
  9%|▊         | 70/820 [00:51<06:45,  1.85it/s]
  9%|▊         | 71/820 [00:51<06:45,  1.85it/s]


== Status ==
Current time: 2022-10-11 03:57:05 (running for 00:00:59.69)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

  9%|▉         | 72/820 [00:52<06:44,  1.85it/s]
  9%|▉         | 73/820 [00:52<06:44,  1.85it/s]
  9%|▉         | 74/820 [00:53<06:43,  1.85it/s]
  9%|▉         | 75/820 [00:54<06:43,  1.85it/s]
  9%|▉         | 76/820 [00:54<06:42,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m   nn.utils.clip_grad_norm_(
  9%|▉         | 77/820 [00:55<06:36,  1.88it/s]
 10%|▉         | 78/820 [00:55<06:31,  1.90it/s]
 10%|▉         | 79/820 [00:56<06:33,  1.88it/s]
 10%|▉         | 80/820 [00:56<06:35,  1.87it/s]
 10%|▉         | 81/820 [00:57<06:36,  1.86it/s]


== Status ==
Current time: 2022-10-11 03:57:10 (running for 00:01:04.70)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 10%|█         | 82/820 [00:57<06:36,  1.86it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 39.20it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.58it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.96it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 31.53it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.49it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.43it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.29it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.29it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.34it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍     

== Status ==
Current time: 2022-10-11 03:57:15 (running for 00:01:09.70)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.33it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.33it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.34it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:04<00:00, 31.35it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.35it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.37it/s][A
                                                
 10%|█         | 82/820 [01:03<06:36,  1.86it/s] 
100%|██████████| 167/167 [00:05<00:00, 31.37it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.5027593970298767, 'eval_accuracy': 0.857, 'eval_runtime': 5.3391, 'eval_samples_per_second': 187.299, 'eval_steps_per_second': 31.279, 'epoch': 1.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_03-57-19
  done: false
  epoch: 1.98
  eval_accuracy: 0.857
  eval_loss: 0.5027593970298767
  eval_runtime: 5.3391
  eval_samples_per_second: 187.299
  eval_steps_per_second: 31.279
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 2
  node_ip: 172.17.0.3
  objective: 0.857
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 71.5713484287262
  time_this_iter_s: 34.298479080200195
  time_total_s: 71.5713484287262
  timestamp: 1665460639
  timesteps_since_restore: 0
  training_iteration: 2
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 10%|█         | 83/820 [01:10<52:12,  4.25s/it]
 10%|█         | 84/820 [01:11<38:29,  3.14s/it]
 10%|█         | 85/820 [01:11<28:53,  2.36s/it]


== Status ==
Current time: 2022-10-11 03:57:24 (running for 00:01:19.18)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 10%|█         | 86/820 [01:12<22:11,  1.81s/it]
 11%|█         | 87/820 [01:12<17:29,  1.43s/it]
 11%|█         | 88/820 [01:13<14:12,  1.17s/it]
 11%|█         | 89/820 [01:13<11:54,  1.02it/s]
 11%|█         | 90/820 [01:14<10:18,  1.18it/s]
 11%|█         | 91/820 [01:14<09:10,  1.32it/s]
 11%|█         | 92/820 [01:15<08:23,  1.45it/s]
 11%|█▏        | 93/820 [01:16<07:50,  1.55it/s]
 11%|█▏        | 94/820 [01:16<07:27,  1.62it/s]


== Status ==
Current time: 2022-10-11 03:57:29 (running for 00:01:24.18)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 12%|█▏        | 95/820 [01:17<07:10,  1.69it/s]
 12%|█▏        | 96/820 [01:17<06:58,  1.73it/s]
 12%|█▏        | 97/820 [01:18<06:50,  1.76it/s]
 12%|█▏        | 98/820 [01:18<06:43,  1.79it/s]
 12%|█▏        | 99/820 [01:19<06:39,  1.81it/s]
 12%|█▏        | 100/820 [01:19<06:36,  1.82it/s]
 12%|█▏        | 101/820 [01:20<06:33,  1.83it/s]
 12%|█▏        | 102/820 [01:20<06:31,  1.83it/s]
 13%|█▎        | 103/820 [01:21<06:30,  1.84it/s]


== Status ==
Current time: 2022-10-11 03:57:34 (running for 00:01:29.19)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 13%|█▎        | 104/820 [01:22<06:29,  1.84it/s]
 13%|█▎        | 105/820 [01:22<06:28,  1.84it/s]
 13%|█▎        | 106/820 [01:23<06:27,  1.84it/s]
 13%|█▎        | 107/820 [01:23<06:26,  1.84it/s]
 13%|█▎        | 108/820 [01:24<06:25,  1.85it/s]
 13%|█▎        | 109/820 [01:24<06:25,  1.85it/s]
 13%|█▎        | 110/820 [01:25<06:24,  1.85it/s]
 14%|█▎        | 111/820 [01:25<06:23,  1.85it/s]
 14%|█▎        | 112/820 [01:26<06:23,  1.85it/s]


== Status ==
Current time: 2022-10-11 03:57:39 (running for 00:01:34.19)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 14%|█▍        | 113/820 [01:26<06:22,  1.85it/s]
 14%|█▍        | 114/820 [01:27<06:22,  1.85it/s]
 14%|█▍        | 115/820 [01:27<06:21,  1.85it/s]
 14%|█▍        | 116/820 [01:28<06:21,  1.85it/s]
 14%|█▍        | 117/820 [01:29<06:20,  1.85it/s]
 14%|█▍        | 118/820 [01:29<06:20,  1.85it/s]
 15%|█▍        | 119/820 [01:30<06:19,  1.85it/s]
 15%|█▍        | 120/820 [01:30<06:19,  1.85it/s]
 15%|█▍        | 121/820 [01:31<06:18,  1.85it/s]
 15%|█▍        | 122/820 [01:31<06:18,  1.85it/s]


== Status ==
Current time: 2022-10-11 03:57:44 (running for 00:01:39.19)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 15%|█▌        | 123/820 [01:32<06:17,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 39.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.61it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 33.14it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.42it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 32.06it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.82it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.69it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.48it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.43it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 03:57:49 (running for 00:01:44.19)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.22it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.14it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.20it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.22it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:04<00:00, 31.24it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.10it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.16it/s][A
                                                 
 15%|█▌        | 123/820 [01:37<06:17,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.16it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.3978446424007416, 'eval_accuracy': 0.88, 'eval_runtime': 5.3379, 'eval_samples_per_second': 187.341, 'eval_steps_per_second': 31.286, 'epoch': 2.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_03-57-53
  done: false
  epoch: 2.98
  eval_accuracy: 0.88
  eval_loss: 0.3978446424007416
  eval_runtime: 5.3379
  eval_samples_per_second: 187.341
  eval_steps_per_second: 31.286
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 3
  node_ip: 172.17.0.3
  objective: 0.88
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 106.1750819683075
  time_this_iter_s: 34.6037335395813
  time_total_s: 106.1750819683075
  timestamp: 1665460673
  timesteps_since_restore: 0
  training_iteration: 3
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 15%|█▌        | 124/820 [01:45<49:25,  4.26s/it]
 15%|█▌        | 125/820 [01:45<36:26,  3.15s/it]
 15%|█▌        | 126/820 [01:46<27:20,  2.36s/it]


== Status ==
Current time: 2022-10-11 03:57:59 (running for 00:01:53.78)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 15%|█▌        | 127/820 [01:46<20:59,  1.82s/it]
 16%|█▌        | 128/820 [01:47<16:32,  1.43s/it]
 16%|█▌        | 129/820 [01:47<13:26,  1.17s/it]
 16%|█▌        | 130/820 [01:48<11:15,  1.02it/s]
 16%|█▌        | 131/820 [01:49<09:43,  1.18it/s]
 16%|█▌        | 132/820 [01:49<08:39,  1.32it/s]
 16%|█▌        | 133/820 [01:50<07:55,  1.45it/s]
 16%|█▋        | 134/820 [01:50<07:23,  1.55it/s]
 16%|█▋        | 135/820 [01:51<07:01,  1.63it/s]


== Status ==
Current time: 2022-10-11 03:58:04 (running for 00:01:58.78)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 17%|█▋        | 136/820 [01:51<06:45,  1.69it/s]
 17%|█▋        | 137/820 [01:52<06:34,  1.73it/s]
 17%|█▋        | 138/820 [01:52<06:26,  1.76it/s]
 17%|█▋        | 139/820 [01:53<06:20,  1.79it/s]
 17%|█▋        | 140/820 [01:53<06:16,  1.81it/s]
 17%|█▋        | 141/820 [01:54<06:13,  1.82it/s]
 17%|█▋        | 142/820 [01:55<06:11,  1.83it/s]
 17%|█▋        | 143/820 [01:55<06:09,  1.83it/s]
 18%|█▊        | 144/820 [01:56<06:07,  1.84it/s]


== Status ==
Current time: 2022-10-11 03:58:09 (running for 00:02:03.79)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 18%|█▊        | 145/820 [01:56<06:06,  1.84it/s]
 18%|█▊        | 146/820 [01:57<06:05,  1.84it/s]
 18%|█▊        | 147/820 [01:57<06:05,  1.84it/s]
 18%|█▊        | 148/820 [01:58<06:04,  1.84it/s]
 18%|█▊        | 149/820 [01:58<06:03,  1.84it/s]
 18%|█▊        | 150/820 [01:59<06:03,  1.84it/s]
 18%|█▊        | 151/820 [01:59<06:02,  1.85it/s]
 19%|█▊        | 152/820 [02:00<06:01,  1.85it/s]
 19%|█▊        | 153/820 [02:00<06:01,  1.85it/s]


== Status ==
Current time: 2022-10-11 03:58:14 (running for 00:02:08.79)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 19%|█▉        | 154/820 [02:01<06:00,  1.85it/s]
 19%|█▉        | 155/820 [02:02<06:00,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m   nn.utils.clip_grad_norm_(
 19%|█▉        | 156/820 [02:02<05:54,  1.88it/s]
 19%|█▉        | 157/820 [02:03<05:55,  1.87it/s]
 19%|█▉        | 158/820 [02:03<05:55,  1.86it/s]
 19%|█▉        | 159/820 [02:04<05:56,  1.86it/s]
 20%|█▉        | 160/820 [02:04<05:56,  1.85it/s]
 20%|█▉        | 161/820 [02:05<05:55,  1.85it/s]
 20%|█▉        | 162/820 [02:05<05:55,  1.85it/s]
 20%|█▉        | 163/820 [02:06<05:55,  1.85it/s]


== Status ==
Current time: 2022-10-11 03:58:19 (running for 00:02:13.79)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 20%|██        | 164/820 [02:06<05:55,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 39.12it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.45it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 33.00it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 31.52it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.44it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.36it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.28it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.28it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 03:58:24 (running for 00:02:18.79)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.22it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.23it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.12it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.11it/s][A
                                                 
 20%|██        | 164/820 [02:12<05:55,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.11it/s][A
                   

[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.2228958159685135, 'eval_accuracy': 0.916, 'eval_runtime': 5.3568, 'eval_samples_per_second': 186.679, 'eval_steps_per_second': 31.175, 'epoch': 3.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_03-58-28
  done: false
  epoch: 3.98
  eval_accuracy: 0.916
  eval_loss: 0.2228958159685135
  eval_runtime: 5.3568
  eval_samples_per_second: 186.679
  eval_steps_per_second: 31.175
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 4
  node_ip: 172.17.0.3
  objective: 0.916
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 140.74024057388306
  time_this_iter_s: 34.56515860557556
  time_total_s: 140.74024057388306
  timestamp: 1665460708
  timesteps_since_restore: 0
  training_iteration: 4
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 20%|██        | 165/820 [02:19<46:30,  4.26s/it]
 20%|██        | 166/820 [02:20<34:16,  3.14s/it]
 20%|██        | 167/820 [02:20<25:43,  2.36s/it]


== Status ==
Current time: 2022-10-11 03:58:33 (running for 00:02:28.35)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 20%|██        | 168/820 [02:21<19:44,  1.82s/it]
 21%|██        | 169/820 [02:21<15:33,  1.43s/it]
 21%|██        | 170/820 [02:22<12:38,  1.17s/it]
 21%|██        | 171/820 [02:23<10:35,  1.02it/s]
 21%|██        | 172/820 [02:23<09:09,  1.18it/s]
 21%|██        | 173/820 [02:24<08:08,  1.32it/s]
 21%|██        | 174/820 [02:24<07:26,  1.45it/s]
 21%|██▏       | 175/820 [02:25<06:56,  1.55it/s]
 21%|██▏       | 176/820 [02:25<06:35,  1.63it/s]


== Status ==
Current time: 2022-10-11 03:58:38 (running for 00:02:33.35)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 22%|██▏       | 177/820 [02:26<06:21,  1.69it/s]
 22%|██▏       | 178/820 [02:26<06:10,  1.73it/s]
 22%|██▏       | 179/820 [02:27<06:03,  1.77it/s]
 22%|██▏       | 180/820 [02:27<05:57,  1.79it/s]
 22%|██▏       | 181/820 [02:28<05:53,  1.81it/s]
 22%|██▏       | 182/820 [02:29<05:50,  1.82it/s]
 22%|██▏       | 183/820 [02:29<05:48,  1.83it/s]
 22%|██▏       | 184/820 [02:30<05:46,  1.83it/s]
 23%|██▎       | 185/820 [02:30<05:45,  1.84it/s]


== Status ==
Current time: 2022-10-11 03:58:43 (running for 00:02:38.36)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 23%|██▎       | 186/820 [02:31<05:44,  1.84it/s]
 23%|██▎       | 187/820 [02:31<05:43,  1.84it/s]
 23%|██▎       | 188/820 [02:32<05:43,  1.84it/s]
 23%|██▎       | 189/820 [02:32<05:42,  1.84it/s]
 23%|██▎       | 190/820 [02:33<05:41,  1.84it/s]
 23%|██▎       | 191/820 [02:33<05:40,  1.84it/s]
 23%|██▎       | 192/820 [02:34<05:40,  1.84it/s]
 24%|██▎       | 193/820 [02:34<05:39,  1.84it/s]
 24%|██▎       | 194/820 [02:35<05:39,  1.85it/s]


== Status ==
Current time: 2022-10-11 03:58:48 (running for 00:02:43.36)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 24%|██▍       | 195/820 [02:36<05:38,  1.85it/s]
 24%|██▍       | 196/820 [02:36<05:38,  1.85it/s]
 24%|██▍       | 197/820 [02:37<05:37,  1.85it/s]
 24%|██▍       | 198/820 [02:37<05:36,  1.85it/s]
 24%|██▍       | 199/820 [02:38<05:36,  1.85it/s]
 24%|██▍       | 200/820 [02:38<05:35,  1.85it/s]
 25%|██▍       | 201/820 [02:39<05:35,  1.85it/s]
 25%|██▍       | 202/820 [02:39<05:35,  1.84it/s]
 25%|██▍       | 203/820 [02:40<05:34,  1.84it/s]
 25%|██▍       | 204/820 [02:40<05:34,  1.84it/s]


== Status ==
Current time: 2022-10-11 03:58:53 (running for 00:02:48.36)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 25%|██▌       | 205/820 [02:41<05:33,  1.84it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 39.12it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.43it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.86it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.23it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.90it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.70it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.46it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.39it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.36it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 03:58:58 (running for 00:02:53.37)
Memory usage on this node: 13.5/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.00it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.08it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.03it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.10it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.16it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.20it/s][A
[2m[36m(_objective pid=2373378)[0m 
                                                 [A
 25%|██▌       | 205/820 [02:47<05:33,  1.84it/s]
100%|██████████| 167/167 [00:05<00:00, 31.19it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.16677820682525635, 'eval_accuracy': 0.952, 'eval_runtime': 5.3574, 'eval_samples_per_second': 186.658, 'eval_steps_per_second': 31.172, 'epoch': 4.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_03-59-03
  done: false
  epoch: 4.98
  eval_accuracy: 0.952
  eval_loss: 0.16677820682525635
  eval_runtime: 5.3574
  eval_samples_per_second: 186.658
  eval_steps_per_second: 31.172
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 5
  node_ip: 172.17.0.3
  objective: 0.952
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 175.35847640037537
  time_this_iter_s: 34.61823582649231
  time_total_s: 175.35847640037537
  timestamp: 1665460743
  timesteps_since_restore: 0
  training_iteration: 5
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 25%|██▌       | 206/820 [02:54<43:51,  4.29s/it]
 25%|██▌       | 207/820 [02:55<32:18,  3.16s/it]


== Status ==
Current time: 2022-10-11 03:59:08 (running for 00:03:02.98)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 25%|██▌       | 208/820 [02:55<24:13,  2.38s/it]
 25%|██▌       | 209/820 [02:56<18:35,  1.83s/it]
 26%|██▌       | 210/820 [02:56<14:38,  1.44s/it]
 26%|██▌       | 211/820 [02:57<11:52,  1.17s/it]
 26%|██▌       | 212/820 [02:57<09:56,  1.02it/s]
 26%|██▌       | 213/820 [02:58<08:35,  1.18it/s]
 26%|██▌       | 214/820 [02:58<07:38,  1.32it/s]
 26%|██▌       | 215/820 [02:59<06:59,  1.44it/s]
 26%|██▋       | 216/820 [02:59<06:31,  1.54it/s]
 26%|██▋       | 217/820 [03:00<06:11,  1.62it/s]


== Status ==
Current time: 2022-10-11 03:59:13 (running for 00:03:07.98)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 27%|██▋       | 218/820 [03:01<05:57,  1.69it/s]
 27%|██▋       | 219/820 [03:01<05:47,  1.73it/s]
 27%|██▋       | 220/820 [03:02<05:40,  1.76it/s]
 27%|██▋       | 221/820 [03:02<05:35,  1.79it/s]
 27%|██▋       | 222/820 [03:03<05:31,  1.81it/s]
 27%|██▋       | 223/820 [03:03<05:28,  1.82it/s]
 27%|██▋       | 224/820 [03:04<05:26,  1.83it/s]
 27%|██▋       | 225/820 [03:04<05:24,  1.83it/s]
 28%|██▊       | 226/820 [03:05<05:22,  1.84it/s]


== Status ==
Current time: 2022-10-11 03:59:18 (running for 00:03:12.98)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 28%|██▊       | 227/820 [03:05<05:22,  1.84it/s]
 28%|██▊       | 228/820 [03:06<05:21,  1.84it/s]
 28%|██▊       | 229/820 [03:06<05:20,  1.84it/s]
 28%|██▊       | 230/820 [03:07<05:20,  1.84it/s]
 28%|██▊       | 231/820 [03:08<05:19,  1.84it/s]
 28%|██▊       | 232/820 [03:08<05:19,  1.84it/s]
 28%|██▊       | 233/820 [03:09<05:18,  1.84it/s]
 29%|██▊       | 234/820 [03:09<05:17,  1.84it/s]
 29%|██▊       | 235/820 [03:10<05:17,  1.84it/s]


== Status ==
Current time: 2022-10-11 03:59:23 (running for 00:03:17.98)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 29%|██▉       | 236/820 [03:10<05:16,  1.84it/s]
 29%|██▉       | 237/820 [03:11<05:16,  1.84it/s]
 29%|██▉       | 238/820 [03:11<05:15,  1.84it/s]
 29%|██▉       | 239/820 [03:12<05:15,  1.84it/s]
 29%|██▉       | 240/820 [03:12<05:14,  1.84it/s]
 29%|██▉       | 241/820 [03:13<05:13,  1.84it/s]
 30%|██▉       | 242/820 [03:14<05:13,  1.84it/s]
 30%|██▉       | 243/820 [03:14<05:12,  1.84it/s]
 30%|██▉       | 244/820 [03:15<05:12,  1.84it/s]
 30%|██▉       | 245/820 [03:15<05:11,  1.84it/s]


== Status ==
Current time: 2022-10-11 03:59:28 (running for 00:03:22.99)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 30%|███       | 246/820 [03:16<05:11,  1.84it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 38.79it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.39it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.94it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 31.48it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.37it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.34it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.33it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.28it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.27it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 03:59:33 (running for 00:03:27.99)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.23it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.25it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.25it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.24it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.22it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.24it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.25it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.25it/s][A
                                                 
 30%|███       | 246/820 [03:21<05:11,  1.84it/s]
100%|██████████| 167/167 [00:05<00:00, 31.25it/s][A
                   

[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.1416747272014618, 'eval_accuracy': 0.962, 'eval_runtime': 5.3539, 'eval_samples_per_second': 186.78, 'eval_steps_per_second': 31.192, 'epoch': 5.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_03-59-37
  done: false
  epoch: 5.98
  eval_accuracy: 0.962
  eval_loss: 0.1416747272014618
  eval_runtime: 5.3539
  eval_samples_per_second: 186.78
  eval_steps_per_second: 31.192
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 6
  node_ip: 172.17.0.3
  objective: 0.962
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 210.04690718650818
  time_this_iter_s: 34.68843078613281
  time_total_s: 210.04690718650818
  timestamp: 1665460777
  timesteps_since_restore: 0
  training_iteration: 6
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 30%|███       | 247/820 [03:29<40:42,  4.26s/it]
 30%|███       | 248/820 [03:29<29:59,  3.15s/it]
 30%|███       | 249/820 [03:30<22:30,  2.36s/it]


== Status ==
Current time: 2022-10-11 03:59:43 (running for 00:03:37.67)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 30%|███       | 250/820 [03:30<17:16,  1.82s/it]
 31%|███       | 251/820 [03:31<13:36,  1.44s/it]
 31%|███       | 252/820 [03:31<11:03,  1.17s/it]
 31%|███       | 253/820 [03:32<09:15,  1.02it/s]
 31%|███       | 254/820 [03:32<07:59,  1.18it/s]
 31%|███       | 255/820 [03:33<07:07,  1.32it/s]
 31%|███       | 256/820 [03:33<06:30,  1.44it/s]
 31%|███▏      | 257/820 [03:34<06:04,  1.55it/s]
 31%|███▏      | 258/820 [03:35<05:45,  1.62it/s]


== Status ==
Current time: 2022-10-11 03:59:48 (running for 00:03:42.67)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 32%|███▏      | 259/820 [03:35<05:32,  1.69it/s]
 32%|███▏      | 260/820 [03:36<05:23,  1.73it/s]
 32%|███▏      | 261/820 [03:36<05:16,  1.76it/s]
 32%|███▏      | 262/820 [03:37<05:12,  1.79it/s]
 32%|███▏      | 263/820 [03:37<05:08,  1.80it/s]
 32%|███▏      | 264/820 [03:38<05:05,  1.82it/s]
 32%|███▏      | 265/820 [03:38<05:03,  1.83it/s]
 32%|███▏      | 266/820 [03:39<05:01,  1.84it/s]
 33%|███▎      | 267/820 [03:39<05:00,  1.84it/s]


== Status ==
Current time: 2022-10-11 03:59:53 (running for 00:03:47.68)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 33%|███▎      | 268/820 [03:40<04:59,  1.84it/s]
 33%|███▎      | 269/820 [03:41<04:59,  1.84it/s]
 33%|███▎      | 270/820 [03:41<04:58,  1.84it/s]
 33%|███▎      | 271/820 [03:42<04:57,  1.85it/s]
 33%|███▎      | 272/820 [03:42<04:57,  1.84it/s]
 33%|███▎      | 273/820 [03:43<04:56,  1.84it/s]
 33%|███▎      | 274/820 [03:43<04:55,  1.85it/s]
 34%|███▎      | 275/820 [03:44<04:55,  1.84it/s]
 34%|███▎      | 276/820 [03:44<04:55,  1.84it/s]


== Status ==
Current time: 2022-10-11 03:59:58 (running for 00:03:52.68)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 34%|███▍      | 277/820 [03:45<04:53,  1.85it/s]
 34%|███▍      | 278/820 [03:45<04:53,  1.85it/s]
 34%|███▍      | 279/820 [03:46<04:52,  1.85it/s]
 34%|███▍      | 280/820 [03:46<04:52,  1.85it/s]
 34%|███▍      | 281/820 [03:47<04:51,  1.85it/s]
 34%|███▍      | 282/820 [03:48<04:50,  1.85it/s]
 35%|███▍      | 283/820 [03:48<04:50,  1.85it/s]
 35%|███▍      | 284/820 [03:49<04:49,  1.85it/s]
 35%|███▍      | 285/820 [03:49<04:49,  1.85it/s]
 35%|███▍      | 286/820 [03:50<04:49,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:00:03 (running for 00:03:57.69)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 35%|███▌      | 287/820 [03:50<04:48,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 39.11it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.33it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.91it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.26it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.78it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.60it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.48it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.37it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:00:08 (running for 00:04:02.70)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.20it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.09it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.16it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.20it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.08it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
                                                 [A
 35%|███▌      | 287/820 [03:56<04:48,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.16it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.07010375708341599, 'eval_accuracy': 0.985, 'eval_runtime': 5.3598, 'eval_samples_per_second': 186.574, 'eval_steps_per_second': 31.158, 'epoch': 6.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-00-12
  done: false
  epoch: 6.98
  eval_accuracy: 0.985
  eval_loss: 0.07010375708341599
  eval_runtime: 5.3598
  eval_samples_per_second: 186.574
  eval_steps_per_second: 31.158
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 7
  node_ip: 172.17.0.3
  objective: 0.985
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 244.66121864318848
  time_this_iter_s: 34.6143114566803
  time_total_s: 244.66121864318848
  timestamp: 1665460812
  timesteps_since_restore: 0
  training_iteration: 7
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 35%|███▌      | 288/820 [04:03<37:59,  4.28s/it]
 35%|███▌      | 289/820 [04:04<27:58,  3.16s/it]


== Status ==
Current time: 2022-10-11 04:00:17 (running for 00:04:12.29)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 35%|███▌      | 290/820 [04:04<20:58,  2.38s/it]
 35%|███▌      | 291/820 [04:05<16:05,  1.83s/it]
 36%|███▌      | 292/820 [04:05<12:39,  1.44s/it]
 36%|███▌      | 293/820 [04:06<10:16,  1.17s/it]
 36%|███▌      | 294/820 [04:07<08:36,  1.02it/s]
 36%|███▌      | 295/820 [04:07<07:26,  1.18it/s]
 36%|███▌      | 296/820 [04:08<06:36,  1.32it/s]
 36%|███▌      | 297/820 [04:08<06:02,  1.44it/s]
 36%|███▋      | 298/820 [04:09<05:38,  1.54it/s]
 36%|███▋      | 299/820 [04:09<05:20,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:00:22 (running for 00:04:17.30)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 37%|███▋      | 300/820 [04:10<05:08,  1.69it/s]
 37%|███▋      | 301/820 [04:10<04:59,  1.73it/s]
 37%|███▋      | 302/820 [04:11<04:52,  1.77it/s]
 37%|███▋      | 303/820 [04:11<04:48,  1.79it/s]
 37%|███▋      | 304/820 [04:12<04:44,  1.81it/s]
 37%|███▋      | 305/820 [04:12<04:42,  1.82it/s]
 37%|███▋      | 306/820 [04:13<04:40,  1.83it/s]
 37%|███▋      | 307/820 [04:14<04:39,  1.83it/s]
 38%|███▊      | 308/820 [04:14<04:38,  1.84it/s]


== Status ==
Current time: 2022-10-11 04:00:27 (running for 00:04:22.30)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 38%|███▊      | 309/820 [04:15<04:37,  1.84it/s]
 38%|███▊      | 310/820 [04:15<04:36,  1.84it/s]
 38%|███▊      | 311/820 [04:16<04:35,  1.85it/s]
 38%|███▊      | 312/820 [04:16<04:34,  1.85it/s]
 38%|███▊      | 313/820 [04:17<04:34,  1.85it/s]
 38%|███▊      | 314/820 [04:17<04:33,  1.85it/s]
 38%|███▊      | 315/820 [04:18<04:33,  1.85it/s]
 39%|███▊      | 316/820 [04:18<04:32,  1.85it/s]
 39%|███▊      | 317/820 [04:19<04:32,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:00:32 (running for 00:04:27.30)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 39%|███▉      | 318/820 [04:20<04:31,  1.85it/s]
 39%|███▉      | 319/820 [04:20<04:31,  1.85it/s]
 39%|███▉      | 320/820 [04:21<04:30,  1.85it/s]
 39%|███▉      | 321/820 [04:21<04:29,  1.85it/s]
 39%|███▉      | 322/820 [04:22<04:28,  1.85it/s]
 39%|███▉      | 323/820 [04:22<04:28,  1.85it/s]
 40%|███▉      | 324/820 [04:23<04:27,  1.85it/s]
 40%|███▉      | 325/820 [04:23<04:27,  1.85it/s]
 40%|███▉      | 326/820 [04:24<04:26,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m   nn.utils.clip_grad_norm_(
 40%|███▉      | 327/820 [04:24<04:22,  1.88it/s]


== Status ==
Current time: 2022-10-11 04:00:37 (running for 00:04:32.30)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 40%|████      | 328/820 [04:25<04:23,  1.87it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 38.88it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.36it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.85it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.84it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.50it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.40it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.35it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:00:42 (running for 00:04:37.30)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.16it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.16it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.10it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.14it/s][A
                                                 
 40%|████      | 328/820 [04:31<04:23,  1.87it/s]
100%|██████████| 167/167 [00:05<00:00, 31.14it/s][A
                   

[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.06337638199329376, 'eval_accuracy': 0.986, 'eval_runtime': 5.3569, 'eval_samples_per_second': 186.674, 'eval_steps_per_second': 31.174, 'epoch': 7.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-00-47
  done: false
  epoch: 7.98
  eval_accuracy: 0.986
  eval_loss: 0.06337638199329376
  eval_runtime: 5.3569
  eval_samples_per_second: 186.674
  eval_steps_per_second: 31.174
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 8
  node_ip: 172.17.0.3
  objective: 0.986
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 279.289870262146
  time_this_iter_s: 34.62865161895752
  time_total_s: 279.289870262146
  timestamp: 1665460847
  timesteps_since_restore: 0
  training_iteration: 8
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 40%|████      | 329/820 [04:38<34:56,  4.27s/it]
 40%|████      | 330/820 [04:38<25:43,  3.15s/it]
 40%|████      | 331/820 [04:39<19:17,  2.37s/it]


== Status ==
Current time: 2022-10-11 04:00:52 (running for 00:04:46.91)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 40%|████      | 332/820 [04:40<14:48,  1.82s/it]
 41%|████      | 333/820 [04:40<11:39,  1.44s/it]
 41%|████      | 334/820 [04:41<09:26,  1.17s/it]
 41%|████      | 335/820 [04:41<07:54,  1.02it/s]
 41%|████      | 336/820 [04:42<06:49,  1.18it/s]
 41%|████      | 337/820 [04:42<06:04,  1.33it/s]
 41%|████      | 338/820 [04:43<05:32,  1.45it/s]
 41%|████▏     | 339/820 [04:43<05:10,  1.55it/s]
 41%|████▏     | 340/820 [04:44<04:55,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:00:57 (running for 00:04:51.92)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 42%|████▏     | 341/820 [04:44<04:43,  1.69it/s]
 42%|████▏     | 342/820 [04:45<04:35,  1.73it/s]
 42%|████▏     | 343/820 [04:45<04:30,  1.77it/s]
 42%|████▏     | 344/820 [04:46<04:26,  1.79it/s]
 42%|████▏     | 345/820 [04:47<04:23,  1.81it/s]
 42%|████▏     | 346/820 [04:47<04:20,  1.82it/s]
 42%|████▏     | 347/820 [04:48<04:18,  1.83it/s]
 42%|████▏     | 348/820 [04:48<04:17,  1.84it/s]
 43%|████▎     | 349/820 [04:49<04:16,  1.84it/s]


== Status ==
Current time: 2022-10-11 04:01:02 (running for 00:04:56.93)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 43%|████▎     | 350/820 [04:49<04:15,  1.84it/s]
 43%|████▎     | 351/820 [04:50<04:13,  1.85it/s]
 43%|████▎     | 352/820 [04:50<04:13,  1.85it/s]
 43%|████▎     | 353/820 [04:51<04:12,  1.85it/s]
 43%|████▎     | 354/820 [04:51<04:12,  1.85it/s]
 43%|████▎     | 355/820 [04:52<04:11,  1.85it/s]
 43%|████▎     | 356/820 [04:52<04:10,  1.85it/s]
 44%|████▎     | 357/820 [04:53<04:10,  1.85it/s]
 44%|████▎     | 358/820 [04:54<04:09,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:01:07 (running for 00:05:01.93)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 44%|████▍     | 359/820 [04:54<04:09,  1.85it/s]
 44%|████▍     | 360/820 [04:55<04:08,  1.85it/s]
 44%|████▍     | 361/820 [04:55<04:08,  1.85it/s]
 44%|████▍     | 362/820 [04:56<04:07,  1.85it/s]
 44%|████▍     | 363/820 [04:56<04:07,  1.85it/s]
 44%|████▍     | 364/820 [04:57<04:06,  1.85it/s]
 45%|████▍     | 365/820 [04:57<04:05,  1.85it/s]
 45%|████▍     | 366/820 [04:58<04:04,  1.85it/s]
 45%|████▍     | 367/820 [04:58<04:04,  1.86it/s]
 45%|████▍     | 368/820 [04:59<04:03,  1.86it/s]


== Status ==
Current time: 2022-10-11 04:01:12 (running for 00:05:06.94)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 45%|████▌     | 369/820 [04:59<04:03,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 39.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.92it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.23it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.86it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.57it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.44it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.36it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.29it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:01:17 (running for 00:05:11.94)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.16it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.08it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.12it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.16it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.15it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
                                                 [A
 45%|████▌     | 369/820 [05:05<04:03,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.19it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.041535377502441406, 'eval_accuracy': 0.993, 'eval_runtime': 5.357, 'eval_samples_per_second': 186.672, 'eval_steps_per_second': 31.174, 'epoch': 8.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-01-21
  done: false
  epoch: 8.98
  eval_accuracy: 0.993
  eval_loss: 0.041535377502441406
  eval_runtime: 5.357
  eval_samples_per_second: 186.672
  eval_steps_per_second: 31.174
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 9
  node_ip: 172.17.0.3
  objective: 0.993
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 313.8696928024292
  time_this_iter_s: 34.5798225402832
  time_total_s: 313.8696928024292
  timestamp: 1665460881
  timesteps_since_restore: 0
  training_iteration: 9
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 45%|████▌     | 370/820 [05:12<31:59,  4.27s/it]
 45%|████▌     | 371/820 [05:13<23:33,  3.15s/it]
 45%|████▌     | 372/820 [05:14<17:39,  2.37s/it]


== Status ==
Current time: 2022-10-11 04:01:26 (running for 00:05:21.49)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 45%|████▌     | 373/820 [05:14<13:32,  1.82s/it]
 46%|████▌     | 374/820 [05:15<10:39,  1.43s/it]
 46%|████▌     | 375/820 [05:15<08:39,  1.17s/it]
 46%|████▌     | 376/820 [05:16<07:14,  1.02it/s]
 46%|████▌     | 377/820 [05:16<06:14,  1.18it/s]
 46%|████▌     | 378/820 [05:17<05:33,  1.33it/s]
[2m[36m(_objective pid=2373378)[0m   nn.utils.clip_grad_norm_(
 46%|████▌     | 379/820 [05:17<05:00,  1.47it/s]
 46%|████▋     | 380/820 [05:18<04:41,  1.56it/s]
 46%|████▋     | 381/820 [05:18<04:27,  1.64it/s]


== Status ==
Current time: 2022-10-11 04:01:31 (running for 00:05:26.49)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 47%|████▋     | 382/820 [05:19<04:18,  1.70it/s]
 47%|████▋     | 383/820 [05:19<04:11,  1.74it/s]
 47%|████▋     | 384/820 [05:20<04:05,  1.77it/s]
 47%|████▋     | 385/820 [05:21<04:02,  1.80it/s]
 47%|████▋     | 386/820 [05:21<03:59,  1.81it/s]
 47%|████▋     | 387/820 [05:22<03:56,  1.83it/s]
 47%|████▋     | 388/820 [05:22<03:55,  1.83it/s]
 47%|████▋     | 389/820 [05:23<03:54,  1.84it/s]
 48%|████▊     | 390/820 [05:23<03:53,  1.84it/s]


== Status ==
Current time: 2022-10-11 04:01:36 (running for 00:05:31.50)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 48%|████▊     | 391/820 [05:24<03:52,  1.84it/s]
 48%|████▊     | 392/820 [05:24<03:51,  1.85it/s]
 48%|████▊     | 393/820 [05:25<03:50,  1.85it/s]
 48%|████▊     | 394/820 [05:25<03:49,  1.85it/s]
 48%|████▊     | 395/820 [05:26<03:49,  1.85it/s]
 48%|████▊     | 396/820 [05:26<03:49,  1.85it/s]
 48%|████▊     | 397/820 [05:27<03:48,  1.85it/s]
 49%|████▊     | 398/820 [05:28<03:47,  1.85it/s]
 49%|████▊     | 399/820 [05:28<03:47,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:01:41 (running for 00:05:36.50)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 49%|████▉     | 400/820 [05:29<03:46,  1.85it/s]
 49%|████▉     | 401/820 [05:29<03:46,  1.85it/s]
 49%|████▉     | 402/820 [05:30<03:45,  1.85it/s]
 49%|████▉     | 403/820 [05:30<03:44,  1.85it/s]
 49%|████▉     | 404/820 [05:31<03:44,  1.85it/s]
 49%|████▉     | 405/820 [05:31<03:44,  1.85it/s]
 50%|████▉     | 406/820 [05:32<03:43,  1.85it/s]
 50%|████▉     | 407/820 [05:32<03:43,  1.85it/s]
 50%|████▉     | 408/820 [05:33<03:42,  1.85it/s]
 50%|████▉     | 409/820 [05:33<03:42,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:01:46 (running for 00:05:41.51)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 50%|█████     | 410/820 [05:34<03:41,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 38.98it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.38it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.90it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.24it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.71it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.56it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.46it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.37it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:01:51 (running for 00:05:46.51)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.04it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.10it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.15it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.20it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.18it/s][A
                                                 
 50%|█████     | 410/820 [05:40<03:41,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.18it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.037167053669691086, 'eval_accuracy': 0.993, 'eval_runtime': 5.3636, 'eval_samples_per_second': 186.442, 'eval_steps_per_second': 31.136, 'epoch': 9.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-01-56
  done: false
  epoch: 9.98
  eval_accuracy: 0.993
  eval_loss: 0.037167053669691086
  eval_runtime: 5.3636
  eval_samples_per_second: 186.442
  eval_steps_per_second: 31.136
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 10
  node_ip: 172.17.0.3
  objective: 0.993
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 348.39941000938416
  time_this_iter_s: 34.529717206954956
  time_total_s: 348.39941000938416
  timestamp: 1665460916
  timesteps_since_restore: 0
  training_iteration: 10
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 50%|█████     | 411/820 [05:47<29:08,  4.27s/it]
 50%|█████     | 412/820 [05:48<21:26,  3.15s/it]
 50%|█████     | 413/820 [05:48<16:04,  2.37s/it]


== Status ==
Current time: 2022-10-11 04:02:01 (running for 00:05:56.01)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 50%|█████     | 414/820 [05:49<12:19,  1.82s/it]
 51%|█████     | 415/820 [05:49<09:41,  1.44s/it]
 51%|█████     | 416/820 [05:50<07:51,  1.17s/it]
 51%|█████     | 417/820 [05:50<06:34,  1.02it/s]
 51%|█████     | 418/820 [05:51<05:40,  1.18it/s]
 51%|█████     | 419/820 [05:51<05:02,  1.32it/s]
 51%|█████     | 420/820 [05:52<04:36,  1.45it/s]
 51%|█████▏    | 421/820 [05:52<04:17,  1.55it/s]
 51%|█████▏    | 422/820 [05:53<04:04,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:02:06 (running for 00:06:01.01)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 52%|█████▏    | 423/820 [05:53<03:55,  1.69it/s]
 52%|█████▏    | 424/820 [05:54<03:48,  1.74it/s]
 52%|█████▏    | 425/820 [05:55<03:43,  1.77it/s]
 52%|█████▏    | 426/820 [05:55<03:39,  1.79it/s]
 52%|█████▏    | 427/820 [05:56<03:37,  1.81it/s]
 52%|█████▏    | 428/820 [05:56<03:35,  1.82it/s]
 52%|█████▏    | 429/820 [05:57<03:33,  1.83it/s]
 52%|█████▏    | 430/820 [05:57<03:32,  1.84it/s]
 53%|█████▎    | 431/820 [05:58<03:31,  1.84it/s]


== Status ==
Current time: 2022-10-11 04:02:11 (running for 00:06:06.02)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 53%|█████▎    | 432/820 [05:58<03:30,  1.84it/s]
 53%|█████▎    | 433/820 [05:59<03:29,  1.85it/s]
 53%|█████▎    | 434/820 [05:59<03:28,  1.85it/s]
 53%|█████▎    | 435/820 [06:00<03:28,  1.85it/s]
 53%|█████▎    | 436/820 [06:00<03:27,  1.85it/s]
 53%|█████▎    | 437/820 [06:01<03:26,  1.85it/s]
 53%|█████▎    | 438/820 [06:02<03:26,  1.85it/s]
 54%|█████▎    | 439/820 [06:02<03:25,  1.85it/s]
 54%|█████▎    | 440/820 [06:03<03:25,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:02:16 (running for 00:06:11.02)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 54%|█████▍    | 441/820 [06:03<03:24,  1.85it/s]
 54%|█████▍    | 442/820 [06:04<03:23,  1.85it/s]
 54%|█████▍    | 443/820 [06:04<03:23,  1.85it/s]
 54%|█████▍    | 444/820 [06:05<03:22,  1.86it/s]
 54%|█████▍    | 445/820 [06:05<03:22,  1.85it/s]
 54%|█████▍    | 446/820 [06:06<03:21,  1.85it/s]
 55%|█████▍    | 447/820 [06:06<03:21,  1.85it/s]
 55%|█████▍    | 448/820 [06:07<03:20,  1.86it/s]
 55%|█████▍    | 449/820 [06:08<03:20,  1.85it/s]
 55%|█████▍    | 450/820 [06:08<03:19,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:02:21 (running for 00:06:16.02)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 55%|█████▌    | 451/820 [06:09<03:18,  1.86it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 39.14it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.47it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.96it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 31.50it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.36it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.31it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.28it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.26it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.27it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:02:26 (running for 00:06:21.02)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.20it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.12it/s][A
                                                 
 55%|█████▌    | 451/820 [06:14<03:18,  1.86it/s]
100%|██████████| 167/167 [00:05<00:00, 31.12it/s][A
                   

[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.025289999321103096, 'eval_accuracy': 0.995, 'eval_runtime': 5.3599, 'eval_samples_per_second': 186.571, 'eval_steps_per_second': 31.157, 'epoch': 10.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-02-30
  done: false
  epoch: 10.98
  eval_accuracy: 0.995
  eval_loss: 0.025289999321103096
  eval_runtime: 5.3599
  eval_samples_per_second: 186.571
  eval_steps_per_second: 31.157
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 11
  node_ip: 172.17.0.3
  objective: 0.995
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 382.95319080352783
  time_this_iter_s: 34.55378079414368
  time_total_s: 382.95319080352783
  timestamp: 1665460950
  timesteps_since_restore: 0
  training_iteration: 11
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 55%|█████▌    | 452/820 [06:22<26:09,  4.26s/it]
 55%|█████▌    | 453/820 [06:22<19:14,  3.15s/it]
 55%|█████▌    | 454/820 [06:23<14:25,  2.36s/it]


== Status ==
Current time: 2022-10-11 04:02:35 (running for 00:06:30.57)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 55%|█████▌    | 455/820 [06:23<11:03,  1.82s/it]
 56%|█████▌    | 456/820 [06:24<08:41,  1.43s/it]
 56%|█████▌    | 457/820 [06:24<07:02,  1.16s/it]
 56%|█████▌    | 458/820 [06:25<05:53,  1.02it/s]
 56%|█████▌    | 459/820 [06:25<05:05,  1.18it/s]
 56%|█████▌    | 460/820 [06:26<04:31,  1.33it/s]
 56%|█████▌    | 461/820 [06:26<04:07,  1.45it/s]
 56%|█████▋    | 462/820 [06:27<03:50,  1.55it/s]
 56%|█████▋    | 463/820 [06:27<03:39,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:02:40 (running for 00:06:35.57)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 57%|█████▋    | 464/820 [06:28<03:31,  1.69it/s]
 57%|█████▋    | 465/820 [06:29<03:24,  1.73it/s]
 57%|█████▋    | 466/820 [06:29<03:20,  1.77it/s]
 57%|█████▋    | 467/820 [06:30<03:16,  1.79it/s]
 57%|█████▋    | 468/820 [06:30<03:14,  1.81it/s]
 57%|█████▋    | 469/820 [06:31<03:12,  1.82it/s]
 57%|█████▋    | 470/820 [06:31<03:11,  1.83it/s]
 57%|█████▋    | 471/820 [06:32<03:09,  1.84it/s]
 58%|█████▊    | 472/820 [06:32<03:08,  1.84it/s]


== Status ==
Current time: 2022-10-11 04:02:45 (running for 00:06:40.58)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 58%|█████▊    | 473/820 [06:33<03:08,  1.84it/s]
 58%|█████▊    | 474/820 [06:33<03:07,  1.85it/s]
 58%|█████▊    | 475/820 [06:34<03:06,  1.85it/s]
 58%|█████▊    | 476/820 [06:34<03:05,  1.85it/s]
 58%|█████▊    | 477/820 [06:35<03:05,  1.85it/s]
 58%|█████▊    | 478/820 [06:36<03:04,  1.85it/s]
 58%|█████▊    | 479/820 [06:36<03:04,  1.85it/s]
 59%|█████▊    | 480/820 [06:37<03:03,  1.85it/s]
 59%|█████▊    | 481/820 [06:37<03:03,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:02:50 (running for 00:06:45.58)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 59%|█████▉    | 482/820 [06:38<03:02,  1.85it/s]
 59%|█████▉    | 483/820 [06:38<03:02,  1.85it/s]
 59%|█████▉    | 484/820 [06:39<03:01,  1.85it/s]
 59%|█████▉    | 485/820 [06:39<03:01,  1.85it/s]
 59%|█████▉    | 486/820 [06:40<03:00,  1.85it/s]
 59%|█████▉    | 487/820 [06:40<03:00,  1.85it/s]
 60%|█████▉    | 488/820 [06:41<02:59,  1.85it/s]
 60%|█████▉    | 489/820 [06:42<02:59,  1.85it/s]
 60%|█████▉    | 490/820 [06:42<02:58,  1.84it/s]
 60%|█████▉    | 491/820 [06:43<02:58,  1.84it/s]


== Status ==
Current time: 2022-10-11 04:02:55 (running for 00:06:50.59)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 60%|██████    | 492/820 [06:43<02:57,  1.84it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 38.99it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.81it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.82it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.61it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.50it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.39it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:03:00 (running for 00:06:55.59)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 30.54it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 30.74it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 30.88it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 30.86it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 30.96it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.02it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.03it/s][A
[2m[36m(_objective pid=2373378)[0m 
                                                 [A
 60%|██████    | 492/820 [06:49<02:57,  1.84it/s]
100%|██████████| 167/167 [00:05<00:00, 31.11it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.015258822590112686, 'eval_accuracy': 0.996, 'eval_runtime': 5.366, 'eval_samples_per_second': 186.36, 'eval_steps_per_second': 31.122, 'epoch': 11.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-03-05
  done: false
  epoch: 11.98
  eval_accuracy: 0.996
  eval_loss: 0.015258822590112686
  eval_runtime: 5.366
  eval_samples_per_second: 186.36
  eval_steps_per_second: 31.122
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 12
  node_ip: 172.17.0.3
  objective: 0.996
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 417.5583736896515
  time_this_iter_s: 34.60518288612366
  time_total_s: 417.5583736896515
  timestamp: 1665460985
  timesteps_since_restore: 0
  training_iteration: 12
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 60%|██████    | 493/820 [06:56<23:17,  4.28s/it]
 60%|██████    | 494/820 [06:57<17:08,  3.15s/it]
 60%|██████    | 495/820 [06:57<12:50,  2.37s/it]


== Status ==
Current time: 2022-10-11 04:03:10 (running for 00:07:05.17)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 60%|██████    | 496/820 [06:58<09:49,  1.82s/it]
 61%|██████    | 497/820 [06:58<07:43,  1.44s/it]
 61%|██████    | 498/820 [06:59<06:15,  1.17s/it]
 61%|██████    | 499/820 [06:59<05:14,  1.02it/s]
 61%|██████    | 500/820 [07:00<04:30,  1.18it/s]


[2m[36m(_objective pid=2373378)[0m {'loss': 0.2676, 'learning_rate': 8.80758807588076e-06, 'epoch': 12.19}


 61%|██████    | 501/820 [07:00<04:00,  1.33it/s]
 61%|██████    | 502/820 [07:01<03:39,  1.45it/s]
 61%|██████▏   | 503/820 [07:02<03:24,  1.55it/s]
 61%|██████▏   | 504/820 [07:02<03:13,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:03:15 (running for 00:07:10.18)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 62%|██████▏   | 505/820 [07:03<03:06,  1.69it/s]
 62%|██████▏   | 506/820 [07:03<03:00,  1.74it/s]
 62%|██████▏   | 507/820 [07:04<02:56,  1.77it/s]
 62%|██████▏   | 508/820 [07:04<02:53,  1.79it/s]
 62%|██████▏   | 509/820 [07:05<02:51,  1.81it/s]
 62%|██████▏   | 510/820 [07:05<02:49,  1.83it/s]
 62%|██████▏   | 511/820 [07:06<02:48,  1.83it/s]
 62%|██████▏   | 512/820 [07:06<02:47,  1.84it/s]
 63%|██████▎   | 513/820 [07:07<02:46,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:03:20 (running for 00:07:15.18)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 63%|██████▎   | 514/820 [07:07<02:45,  1.85it/s]
 63%|██████▎   | 515/820 [07:08<02:44,  1.85it/s]
 63%|██████▎   | 516/820 [07:09<02:44,  1.85it/s]
 63%|██████▎   | 517/820 [07:09<02:43,  1.85it/s]
 63%|██████▎   | 518/820 [07:10<02:43,  1.85it/s]
 63%|██████▎   | 519/820 [07:10<02:42,  1.85it/s]
 63%|██████▎   | 520/820 [07:11<02:42,  1.85it/s]
 64%|██████▎   | 521/820 [07:11<02:41,  1.85it/s]
 64%|██████▎   | 522/820 [07:12<02:41,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:03:25 (running for 00:07:20.18)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 64%|██████▍   | 523/820 [07:12<02:40,  1.85it/s]
 64%|██████▍   | 524/820 [07:13<02:40,  1.85it/s]
 64%|██████▍   | 525/820 [07:13<02:39,  1.85it/s]
 64%|██████▍   | 526/820 [07:14<02:39,  1.85it/s]
 64%|██████▍   | 527/820 [07:15<02:38,  1.85it/s]
 64%|██████▍   | 528/820 [07:15<02:38,  1.85it/s]
 65%|██████▍   | 529/820 [07:16<02:37,  1.85it/s]
 65%|██████▍   | 530/820 [07:16<02:37,  1.84it/s]
 65%|██████▍   | 531/820 [07:17<02:36,  1.85it/s]
 65%|██████▍   | 532/820 [07:17<02:35,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:03:30 (running for 00:07:25.18)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 65%|██████▌   | 533/820 [07:18<02:35,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 38.96it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.38it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.96it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.10it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.79it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.55it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.43it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 30.73it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 30.89it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:03:35 (running for 00:07:30.18)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.18it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.20it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.20it/s][A
                                                 
 65%|██████▌   | 533/820 [07:23<02:35,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.20it/s][A
                   

[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.014451620168983936, 'eval_accuracy': 0.996, 'eval_runtime': 5.3623, 'eval_samples_per_second': 186.486, 'eval_steps_per_second': 31.143, 'epoch': 12.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-03-39
  done: false
  epoch: 12.98
  eval_accuracy: 0.996
  eval_loss: 0.014451620168983936
  eval_runtime: 5.3623
  eval_samples_per_second: 186.486
  eval_steps_per_second: 31.143
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 13
  node_ip: 172.17.0.3
  objective: 0.996
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 452.1154537200928
  time_this_iter_s: 34.557080030441284
  time_total_s: 452.1154537200928
  timestamp: 1665461019
  timesteps_since_restore: 0
  training_iteration: 13
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 65%|██████▌   | 534/820 [07:31<20:20,  4.27s/it]
 65%|██████▌   | 535/820 [07:31<14:57,  3.15s/it]
 65%|██████▌   | 536/820 [07:32<11:11,  2.37s/it]


== Status ==
Current time: 2022-10-11 04:03:45 (running for 00:07:39.73)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 65%|██████▌   | 537/820 [07:32<08:34,  1.82s/it]
 66%|██████▌   | 538/820 [07:33<06:44,  1.43s/it]
 66%|██████▌   | 539/820 [07:33<05:27,  1.17s/it]
 66%|██████▌   | 540/820 [07:34<04:33,  1.02it/s]
 66%|██████▌   | 541/820 [07:34<03:56,  1.18it/s]
 66%|██████▌   | 542/820 [07:35<03:29,  1.33it/s]
 66%|██████▌   | 543/820 [07:36<03:11,  1.45it/s]
 66%|██████▋   | 544/820 [07:36<02:58,  1.55it/s]
 66%|██████▋   | 545/820 [07:37<02:48,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:03:50 (running for 00:07:44.73)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 67%|██████▋   | 546/820 [07:37<02:42,  1.69it/s]
 67%|██████▋   | 547/820 [07:38<02:37,  1.74it/s]
 67%|██████▋   | 548/820 [07:38<02:33,  1.77it/s]
 67%|██████▋   | 549/820 [07:39<02:30,  1.79it/s]
 67%|██████▋   | 550/820 [07:39<02:29,  1.81it/s]
 67%|██████▋   | 551/820 [07:40<02:27,  1.82it/s]
 67%|██████▋   | 552/820 [07:40<02:26,  1.83it/s]
 67%|██████▋   | 553/820 [07:41<02:25,  1.84it/s]
 68%|██████▊   | 554/820 [07:42<02:24,  1.84it/s]


== Status ==
Current time: 2022-10-11 04:03:55 (running for 00:07:49.74)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 68%|██████▊   | 555/820 [07:42<02:23,  1.84it/s]
 68%|██████▊   | 556/820 [07:43<02:23,  1.85it/s]
 68%|██████▊   | 557/820 [07:43<02:22,  1.85it/s]
 68%|██████▊   | 558/820 [07:44<02:21,  1.85it/s]
 68%|██████▊   | 559/820 [07:44<02:21,  1.85it/s]
 68%|██████▊   | 560/820 [07:45<02:20,  1.85it/s]
 68%|██████▊   | 561/820 [07:45<02:19,  1.85it/s]
 69%|██████▊   | 562/820 [07:46<02:19,  1.85it/s]
 69%|██████▊   | 563/820 [07:46<02:18,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:04:00 (running for 00:07:54.74)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 69%|██████▉   | 564/820 [07:47<02:18,  1.85it/s]
 69%|██████▉   | 565/820 [07:47<02:17,  1.85it/s]
 69%|██████▉   | 566/820 [07:48<02:17,  1.85it/s]
 69%|██████▉   | 567/820 [07:49<02:16,  1.85it/s]
 69%|██████▉   | 568/820 [07:49<02:16,  1.85it/s]
 69%|██████▉   | 569/820 [07:50<02:15,  1.85it/s]
 70%|██████▉   | 570/820 [07:50<02:14,  1.85it/s]
 70%|██████▉   | 571/820 [07:51<02:14,  1.85it/s]
 70%|██████▉   | 572/820 [07:51<02:13,  1.85it/s]
 70%|██████▉   | 573/820 [07:52<02:13,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:04:05 (running for 00:07:59.74)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 70%|███████   | 574/820 [07:52<02:12,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 38.93it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.37it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.92it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.06it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.75it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.58it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.47it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.22it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:04:10 (running for 00:08:04.74)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.14it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.21it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.24it/s][A
                                                 
 70%|███████   | 574/820 [07:58<02:12,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.24it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.012605157680809498, 'eval_accuracy': 0.996, 'eval_runtime': 5.3592, 'eval_samples_per_second': 186.596, 'eval_steps_per_second': 31.162, 'epoch': 13.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-04-14
  done: false
  epoch: 13.98
  eval_accuracy: 0.996
  eval_loss: 0.012605157680809498
  eval_runtime: 5.3592
  eval_samples_per_second: 186.596
  eval_steps_per_second: 31.162
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 14
  node_ip: 172.17.0.3
  objective: 0.996
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 486.66021847724915
  time_this_iter_s: 34.54476475715637
  time_total_s: 486.66021847724915
  timestamp: 1665461054
  timesteps_since_restore: 0
  training_iteration: 14
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 70%|███████   | 575/820 [08:05<17:24,  4.26s/it]
 70%|███████   | 576/820 [08:06<12:47,  3.15s/it]
 70%|███████   | 577/820 [08:06<09:34,  2.36s/it]


== Status ==
Current time: 2022-10-11 04:04:19 (running for 00:08:14.28)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 70%|███████   | 578/820 [08:07<07:19,  1.82s/it]
 71%|███████   | 579/820 [08:07<05:45,  1.43s/it]
 71%|███████   | 580/820 [08:08<04:39,  1.16s/it]
 71%|███████   | 581/820 [08:08<03:53,  1.02it/s]
 71%|███████   | 582/820 [08:09<03:21,  1.18it/s]
 71%|███████   | 583/820 [08:10<02:58,  1.33it/s]
 71%|███████   | 584/820 [08:10<02:43,  1.45it/s]
 71%|███████▏  | 585/820 [08:11<02:31,  1.55it/s]
 71%|███████▏  | 586/820 [08:11<02:23,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:04:24 (running for 00:08:19.28)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 72%|███████▏  | 587/820 [08:12<02:17,  1.69it/s]
 72%|███████▏  | 588/820 [08:12<02:13,  1.74it/s]
 72%|███████▏  | 589/820 [08:13<02:10,  1.77it/s]
 72%|███████▏  | 590/820 [08:13<02:07,  1.80it/s]
 72%|███████▏  | 591/820 [08:14<02:06,  1.81it/s]
 72%|███████▏  | 592/820 [08:14<02:04,  1.83it/s]
 72%|███████▏  | 593/820 [08:15<02:03,  1.84it/s]
 72%|███████▏  | 594/820 [08:15<02:02,  1.84it/s]
 73%|███████▎  | 595/820 [08:16<02:02,  1.84it/s]


== Status ==
Current time: 2022-10-11 04:04:29 (running for 00:08:24.29)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 73%|███████▎  | 596/820 [08:17<02:01,  1.85it/s]
 73%|███████▎  | 597/820 [08:17<02:00,  1.85it/s]
 73%|███████▎  | 598/820 [08:18<01:59,  1.85it/s]
 73%|███████▎  | 599/820 [08:18<01:59,  1.85it/s]
 73%|███████▎  | 600/820 [08:19<01:58,  1.85it/s]
 73%|███████▎  | 601/820 [08:19<01:58,  1.85it/s]
 73%|███████▎  | 602/820 [08:20<01:57,  1.85it/s]
 74%|███████▎  | 603/820 [08:20<01:57,  1.85it/s]
 74%|███████▎  | 604/820 [08:21<01:56,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:04:34 (running for 00:08:29.29)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 74%|███████▍  | 605/820 [08:21<01:56,  1.85it/s]
 74%|███████▍  | 606/820 [08:22<01:55,  1.85it/s]
 74%|███████▍  | 607/820 [08:23<01:55,  1.85it/s]
 74%|███████▍  | 608/820 [08:23<01:54,  1.85it/s]
 74%|███████▍  | 609/820 [08:24<01:54,  1.85it/s]
 74%|███████▍  | 610/820 [08:24<01:53,  1.85it/s]
 75%|███████▍  | 611/820 [08:25<01:53,  1.85it/s]
 75%|███████▍  | 612/820 [08:25<01:52,  1.85it/s]
 75%|███████▍  | 613/820 [08:26<01:51,  1.85it/s]
 75%|███████▍  | 614/820 [08:26<01:51,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:04:39 (running for 00:08:34.29)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 75%|███████▌  | 615/820 [08:27<01:50,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 38.97it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.39it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.83it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.20it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.86it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.64it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.42it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.36it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.34it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:04:44 (running for 00:08:39.30)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.15it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.10it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.14it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.17it/s][A
                                                 
 75%|███████▌  | 615/820 [08:33<01:50,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.17it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.009595293551683426, 'eval_accuracy': 0.996, 'eval_runtime': 5.3512, 'eval_samples_per_second': 186.874, 'eval_steps_per_second': 31.208, 'epoch': 14.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-04-48
  done: false
  epoch: 14.98
  eval_accuracy: 0.996
  eval_loss: 0.009595293551683426
  eval_runtime: 5.3512
  eval_samples_per_second: 186.874
  eval_steps_per_second: 31.208
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 15
  node_ip: 172.17.0.3
  objective: 0.996
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 521.227929353714
  time_this_iter_s: 34.567710876464844
  time_total_s: 521.227929353714
  timestamp: 1665461088
  timesteps_since_restore: 0
  training_iteration: 15
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 75%|███████▌  | 616/820 [08:40<14:32,  4.28s/it]
 75%|███████▌  | 617/820 [08:40<10:40,  3.15s/it]
 75%|███████▌  | 618/820 [08:41<07:58,  2.37s/it]


== Status ==
Current time: 2022-10-11 04:04:54 (running for 00:08:48.84)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 75%|███████▌  | 619/820 [08:41<06:05,  1.82s/it]
 76%|███████▌  | 620/820 [08:42<04:47,  1.44s/it]
 76%|███████▌  | 621/820 [08:43<03:52,  1.17s/it]
 76%|███████▌  | 622/820 [08:43<03:13,  1.02it/s]
 76%|███████▌  | 623/820 [08:44<02:46,  1.18it/s]
 76%|███████▌  | 624/820 [08:44<02:28,  1.32it/s]
 76%|███████▌  | 625/820 [08:45<02:14,  1.45it/s]
 76%|███████▋  | 626/820 [08:45<02:05,  1.55it/s]
 76%|███████▋  | 627/820 [08:46<01:58,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:04:59 (running for 00:08:53.84)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 77%|███████▋  | 628/820 [08:46<01:53,  1.69it/s]
 77%|███████▋  | 629/820 [08:47<01:50,  1.74it/s]
 77%|███████▋  | 630/820 [08:47<01:47,  1.77it/s]
 77%|███████▋  | 631/820 [08:48<01:45,  1.80it/s]
 77%|███████▋  | 632/820 [08:48<01:43,  1.81it/s]
 77%|███████▋  | 633/820 [08:49<01:42,  1.82it/s]
 77%|███████▋  | 634/820 [08:50<01:41,  1.83it/s]
 77%|███████▋  | 635/820 [08:50<01:40,  1.84it/s]
 78%|███████▊  | 636/820 [08:51<01:39,  1.84it/s]


== Status ==
Current time: 2022-10-11 04:05:04 (running for 00:08:58.84)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 78%|███████▊  | 637/820 [08:51<01:39,  1.85it/s]
 78%|███████▊  | 638/820 [08:52<01:38,  1.85it/s]
 78%|███████▊  | 639/820 [08:52<01:37,  1.85it/s]
 78%|███████▊  | 640/820 [08:53<01:37,  1.85it/s]
 78%|███████▊  | 641/820 [08:53<01:36,  1.85it/s]
 78%|███████▊  | 642/820 [08:54<01:36,  1.85it/s]
 78%|███████▊  | 643/820 [08:54<01:35,  1.85it/s]
 79%|███████▊  | 644/820 [08:55<01:35,  1.85it/s]
 79%|███████▊  | 645/820 [08:55<01:34,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:05:09 (running for 00:09:03.85)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 79%|███████▉  | 646/820 [08:56<01:33,  1.85it/s]
 79%|███████▉  | 647/820 [08:57<01:33,  1.85it/s]
 79%|███████▉  | 648/820 [08:57<01:32,  1.85it/s]
 79%|███████▉  | 649/820 [08:58<01:32,  1.85it/s]
 79%|███████▉  | 650/820 [08:58<01:31,  1.85it/s]
 79%|███████▉  | 651/820 [08:59<01:31,  1.85it/s]
 80%|███████▉  | 652/820 [08:59<01:30,  1.85it/s]
 80%|███████▉  | 653/820 [09:00<01:30,  1.85it/s]
 80%|███████▉  | 654/820 [09:00<01:29,  1.85it/s]
 80%|███████▉  | 655/820 [09:01<01:29,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:05:14 (running for 00:09:08.85)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 80%|████████  | 656/820 [09:01<01:28,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 39.04it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.36it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.93it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.29it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.79it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.56it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.41it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.34it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.33it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:05:19 (running for 00:09:13.85)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 77%|███████▋  | 129/167 [00:04<00:01, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.05it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.06it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.11it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.16it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.16it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.19it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.21it/s][A
                                                 
 80%|████████  | 656/820 [09:0

[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.00705485325306654, 'eval_accuracy': 0.997, 'eval_runtime': 5.363, 'eval_samples_per_second': 186.463, 'eval_steps_per_second': 31.139, 'epoch': 15.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-05-23
  done: false
  epoch: 15.98
  eval_accuracy: 0.997
  eval_loss: 0.00705485325306654
  eval_runtime: 5.363
  eval_samples_per_second: 186.463
  eval_steps_per_second: 31.139
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 16
  node_ip: 172.17.0.3
  objective: 0.997
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 555.8337845802307
  time_this_iter_s: 34.605855226516724
  time_total_s: 555.8337845802307
  timestamp: 1665461123
  timesteps_since_restore: 0
  training_iteration: 16
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 80%|████████  | 657/820 [09:14<11:37,  4.28s/it]
 80%|████████  | 658/820 [09:15<08:31,  3.16s/it]
 80%|████████  | 659/820 [09:16<06:21,  2.37s/it]


== Status ==
Current time: 2022-10-11 04:05:28 (running for 00:09:23.46)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 80%|████████  | 660/820 [09:16<04:51,  1.82s/it]
 81%|████████  | 661/820 [09:17<03:48,  1.44s/it]
 81%|████████  | 662/820 [09:17<03:04,  1.17s/it]
 81%|████████  | 663/820 [09:18<02:33,  1.02it/s]
 81%|████████  | 664/820 [09:18<02:12,  1.18it/s]
 81%|████████  | 665/820 [09:19<01:56,  1.33it/s]
 81%|████████  | 666/820 [09:19<01:46,  1.45it/s]
 81%|████████▏ | 667/820 [09:20<01:38,  1.55it/s]
 81%|████████▏ | 668/820 [09:20<01:33,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:05:33 (running for 00:09:28.46)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 82%|████████▏ | 669/820 [09:21<01:29,  1.69it/s]
 82%|████████▏ | 670/820 [09:21<01:26,  1.74it/s]
 82%|████████▏ | 671/820 [09:22<01:24,  1.77it/s]
 82%|████████▏ | 672/820 [09:23<01:22,  1.80it/s]
 82%|████████▏ | 673/820 [09:23<01:21,  1.81it/s]
 82%|████████▏ | 674/820 [09:24<01:19,  1.83it/s]
 82%|████████▏ | 675/820 [09:24<01:19,  1.84it/s]
 82%|████████▏ | 676/820 [09:25<01:18,  1.84it/s]
 83%|████████▎ | 677/820 [09:25<01:17,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:05:38 (running for 00:09:33.46)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 83%|████████▎ | 678/820 [09:26<01:16,  1.85it/s]
 83%|████████▎ | 679/820 [09:26<01:16,  1.85it/s]
 83%|████████▎ | 680/820 [09:27<01:15,  1.85it/s]
 83%|████████▎ | 681/820 [09:27<01:14,  1.85it/s]
 83%|████████▎ | 682/820 [09:28<01:14,  1.85it/s]
 83%|████████▎ | 683/820 [09:28<01:13,  1.85it/s]
 83%|████████▎ | 684/820 [09:29<01:13,  1.85it/s]
 84%|████████▎ | 685/820 [09:30<01:12,  1.85it/s]
 84%|████████▎ | 686/820 [09:30<01:12,  1.85it/s]
 84%|████████▍ | 687/820 [09:31<01:11,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:05:43 (running for 00:09:38.46)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 84%|████████▍ | 688/820 [09:31<01:11,  1.85it/s]
 84%|████████▍ | 689/820 [09:32<01:10,  1.85it/s]
 84%|████████▍ | 690/820 [09:32<01:10,  1.85it/s]
 84%|████████▍ | 691/820 [09:33<01:09,  1.85it/s]
 84%|████████▍ | 692/820 [09:33<01:09,  1.85it/s]
 85%|████████▍ | 693/820 [09:34<01:08,  1.85it/s]
 85%|████████▍ | 694/820 [09:34<01:08,  1.85it/s]
 85%|████████▍ | 695/820 [09:35<01:07,  1.85it/s]
 85%|████████▍ | 696/820 [09:35<01:06,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:05:48 (running for 00:09:43.46)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 85%|████████▌ | 697/820 [09:36<01:06,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 38.91it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.36it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.73it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.80it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.56it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.33it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.28it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.26it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:05:53 (running for 00:09:48.47)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.10it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.12it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.09it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.12it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.15it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.15it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.05it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.11it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 30.49it/s][A
                                                 
 85%|████████▌ | 697/820 [09:42<01:06,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 30.49it/s][A
                   

[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.005226995330303907, 'eval_accuracy': 0.998, 'eval_runtime': 5.3789, 'eval_samples_per_second': 185.912, 'eval_steps_per_second': 31.047, 'epoch': 16.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-05-58
  done: false
  epoch: 16.98
  eval_accuracy: 0.998
  eval_loss: 0.005226995330303907
  eval_runtime: 5.3789
  eval_samples_per_second: 185.912
  eval_steps_per_second: 31.047
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 17
  node_ip: 172.17.0.3
  objective: 0.998
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 590.4037139415741
  time_this_iter_s: 34.569929361343384
  time_total_s: 590.4037139415741
  timestamp: 1665461158
  timesteps_since_restore: 0
  training_iteration: 17
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 85%|████████▌ | 698/820 [09:49<08:54,  4.38s/it]
 85%|████████▌ | 699/820 [09:50<06:30,  3.23s/it]
 85%|████████▌ | 700/820 [09:50<04:50,  2.42s/it]


== Status ==
Current time: 2022-10-11 04:06:03 (running for 00:09:58.36)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 85%|████████▌ | 701/820 [09:51<03:40,  1.86s/it]
 86%|████████▌ | 702/820 [09:51<02:52,  1.46s/it]
 86%|████████▌ | 703/820 [09:52<02:18,  1.18s/it]
 86%|████████▌ | 704/820 [09:53<01:54,  1.01it/s]
 86%|████████▌ | 705/820 [09:53<01:38,  1.17it/s]
 86%|████████▌ | 706/820 [09:54<01:26,  1.31it/s]
 86%|████████▌ | 707/820 [09:54<01:18,  1.44it/s]
 86%|████████▋ | 708/820 [09:55<01:12,  1.54it/s]
 86%|████████▋ | 709/820 [09:55<01:08,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:06:08 (running for 00:10:03.36)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 87%|████████▋ | 710/820 [09:56<01:05,  1.69it/s]
 87%|████████▋ | 711/820 [09:56<01:02,  1.74it/s]
 87%|████████▋ | 712/820 [09:57<01:00,  1.77it/s]
 87%|████████▋ | 713/820 [09:57<00:59,  1.79it/s]
 87%|████████▋ | 714/820 [09:58<00:58,  1.81it/s]
 87%|████████▋ | 715/820 [09:59<00:57,  1.83it/s]
 87%|████████▋ | 716/820 [09:59<00:56,  1.83it/s]
 87%|████████▋ | 717/820 [10:00<00:55,  1.84it/s]
 88%|████████▊ | 718/820 [10:00<00:55,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:06:13 (running for 00:10:08.36)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 88%|████████▊ | 719/820 [10:01<00:54,  1.85it/s]
 88%|████████▊ | 720/820 [10:01<00:54,  1.85it/s]
 88%|████████▊ | 721/820 [10:02<00:53,  1.85it/s]
 88%|████████▊ | 722/820 [10:02<00:52,  1.85it/s]
 88%|████████▊ | 723/820 [10:03<00:52,  1.85it/s]
 88%|████████▊ | 724/820 [10:03<00:51,  1.85it/s]
 88%|████████▊ | 725/820 [10:04<00:51,  1.85it/s]
 89%|████████▊ | 726/820 [10:04<00:50,  1.85it/s]
 89%|████████▊ | 727/820 [10:05<00:50,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:06:18 (running for 00:10:13.36)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 89%|████████▉ | 728/820 [10:06<00:49,  1.85it/s]
 89%|████████▉ | 729/820 [10:06<00:49,  1.85it/s]
 89%|████████▉ | 730/820 [10:07<00:48,  1.85it/s]
 89%|████████▉ | 731/820 [10:07<00:48,  1.85it/s]
 89%|████████▉ | 732/820 [10:08<00:47,  1.85it/s]
 89%|████████▉ | 733/820 [10:08<00:47,  1.85it/s]
 90%|████████▉ | 734/820 [10:09<00:46,  1.85it/s]
 90%|████████▉ | 735/820 [10:09<00:45,  1.85it/s]
 90%|████████▉ | 736/820 [10:10<00:45,  1.85it/s]
 90%|████████▉ | 737/820 [10:10<00:44,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:06:23 (running for 00:10:18.37)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 90%|█████████ | 738/820 [10:11<00:44,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 38.99it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.40it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.92it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.27it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.85it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.61it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.48it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.39it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.35it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:06:28 (running for 00:10:23.37)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.11it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.14it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.15it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.12it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.12it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.15it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.15it/s][A
[2m[36m(_objective pid=2373378)[0m 
                                                 [A
 90%|█████████ | 738/820 [10:17<00:44,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.16it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.0022720813285559416, 'eval_accuracy': 0.999, 'eval_runtime': 5.3541, 'eval_samples_per_second': 186.774, 'eval_steps_per_second': 31.191, 'epoch': 17.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-06-33
  done: false
  epoch: 17.98
  eval_accuracy: 0.999
  eval_loss: 0.0022720813285559416
  eval_runtime: 5.3541
  eval_samples_per_second: 186.774
  eval_steps_per_second: 31.191
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 18
  node_ip: 172.17.0.3
  objective: 0.999
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 625.2878444194794
  time_this_iter_s: 34.88413047790527
  time_total_s: 625.2878444194794
  timestamp: 1665461193
  timesteps_since_restore: 0
  training_iteration: 18
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 90%|█████████ | 739/820 [10:24<05:46,  4.27s/it]
 90%|█████████ | 740/820 [10:24<04:12,  3.15s/it]
 90%|█████████ | 741/820 [10:25<03:07,  2.37s/it]


== Status ==
Current time: 2022-10-11 04:06:38 (running for 00:10:32.90)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 90%|█████████ | 742/820 [10:26<02:21,  1.82s/it]
 91%|█████████ | 743/820 [10:26<01:50,  1.44s/it]
 91%|█████████ | 744/820 [10:27<01:28,  1.17s/it]
 91%|█████████ | 745/820 [10:27<01:13,  1.02it/s]
 91%|█████████ | 746/820 [10:28<01:02,  1.18it/s]
 91%|█████████ | 747/820 [10:28<00:55,  1.33it/s]
 91%|█████████ | 748/820 [10:29<00:49,  1.45it/s]
 91%|█████████▏| 749/820 [10:29<00:45,  1.55it/s]
 91%|█████████▏| 750/820 [10:30<00:42,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:06:43 (running for 00:10:37.91)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 92%|█████████▏| 751/820 [10:30<00:40,  1.69it/s]
 92%|█████████▏| 752/820 [10:31<00:39,  1.74it/s]
 92%|█████████▏| 753/820 [10:31<00:37,  1.77it/s]
 92%|█████████▏| 754/820 [10:32<00:36,  1.80it/s]
 92%|█████████▏| 755/820 [10:33<00:35,  1.81it/s]
 92%|█████████▏| 756/820 [10:33<00:35,  1.83it/s]
 92%|█████████▏| 757/820 [10:34<00:34,  1.84it/s]
 92%|█████████▏| 758/820 [10:34<00:33,  1.84it/s]
 93%|█████████▎| 759/820 [10:35<00:33,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:06:48 (running for 00:10:42.91)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 93%|█████████▎| 760/820 [10:35<00:32,  1.85it/s]
 93%|█████████▎| 761/820 [10:36<00:31,  1.85it/s]
 93%|█████████▎| 762/820 [10:36<00:31,  1.85it/s]
 93%|█████████▎| 763/820 [10:37<00:30,  1.85it/s]
 93%|█████████▎| 764/820 [10:37<00:30,  1.85it/s]
 93%|█████████▎| 765/820 [10:38<00:29,  1.85it/s]
 93%|█████████▎| 766/820 [10:38<00:29,  1.85it/s]
 94%|█████████▎| 767/820 [10:39<00:28,  1.85it/s]
 94%|█████████▎| 768/820 [10:40<00:28,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:06:53 (running for 00:10:47.91)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 94%|█████████▍| 769/820 [10:40<00:27,  1.85it/s]
 94%|█████████▍| 770/820 [10:41<00:27,  1.85it/s]
 94%|█████████▍| 771/820 [10:41<00:26,  1.85it/s]
 94%|█████████▍| 772/820 [10:42<00:25,  1.85it/s]
 94%|█████████▍| 773/820 [10:42<00:25,  1.85it/s]
 94%|█████████▍| 774/820 [10:43<00:24,  1.85it/s]
 95%|█████████▍| 775/820 [10:43<00:24,  1.85it/s]
 95%|█████████▍| 776/820 [10:44<00:23,  1.85it/s]
 95%|█████████▍| 777/820 [10:44<00:23,  1.84it/s]
 95%|█████████▍| 778/820 [10:45<00:22,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:06:58 (running for 00:10:52.91)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 95%|█████████▌| 779/820 [10:45<00:22,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 38.94it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.44it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.97it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.26it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.88it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.62it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.49it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.31it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.30it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:07:03 (running for 00:10:57.91)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 80%|███████▉  | 133/167 [00:04<00:01, 31.12it/s][A
[2m[36m(_objective pid=2373378)[0m 
 82%|████████▏ | 137/167 [00:04<00:00, 31.15it/s][A
[2m[36m(_objective pid=2373378)[0m 
 84%|████████▍ | 141/167 [00:04<00:00, 31.05it/s][A
[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 31.10it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.13it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.09it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.11it/s][A
[2m[36m(_objective pid=2373378)[0m 
                                                 [A
 95%|█████████▌| 779/820 [10:51<00:22,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.16it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.0021458733826875687, 'eval_accuracy': 0.999, 'eval_runtime': 5.3623, 'eval_samples_per_second': 186.487, 'eval_steps_per_second': 31.143, 'epoch': 18.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-07-07
  done: false
  epoch: 18.98
  eval_accuracy: 0.999
  eval_loss: 0.0021458733826875687
  eval_runtime: 5.3623
  eval_samples_per_second: 186.487
  eval_steps_per_second: 31.143
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 19
  node_ip: 172.17.0.3
  objective: 0.999
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 659.85524725914
  time_this_iter_s: 34.567402839660645
  time_total_s: 659.85524725914
  timestamp: 1665461227
  timesteps_since_restore: 0
  training_iteration: 19
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


 95%|█████████▌| 780/820 [10:58<02:50,  4.27s/it]
 95%|█████████▌| 781/820 [10:59<02:02,  3.15s/it]
 95%|█████████▌| 782/820 [11:00<01:29,  2.37s/it]


== Status ==
Current time: 2022-10-11 04:07:12 (running for 00:11:07.48)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 95%|█████████▌| 783/820 [11:00<01:07,  1.82s/it]
 96%|█████████▌| 784/820 [11:01<00:51,  1.43s/it]
 96%|█████████▌| 785/820 [11:01<00:40,  1.17s/it]
 96%|█████████▌| 786/820 [11:02<00:33,  1.02it/s]
 96%|█████████▌| 787/820 [11:02<00:27,  1.18it/s]
 96%|█████████▌| 788/820 [11:03<00:24,  1.33it/s]
 96%|█████████▌| 789/820 [11:03<00:21,  1.45it/s]
 96%|█████████▋| 790/820 [11:04<00:19,  1.55it/s]
 96%|█████████▋| 791/820 [11:04<00:17,  1.63it/s]


== Status ==
Current time: 2022-10-11 04:07:17 (running for 00:11:12.48)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 97%|█████████▋| 792/820 [11:05<00:16,  1.69it/s]
 97%|█████████▋| 793/820 [11:05<00:15,  1.74it/s]
 97%|█████████▋| 794/820 [11:06<00:14,  1.77it/s]
 97%|█████████▋| 795/820 [11:07<00:13,  1.79it/s]
 97%|█████████▋| 796/820 [11:07<00:13,  1.81it/s]
 97%|█████████▋| 797/820 [11:08<00:12,  1.83it/s]
 97%|█████████▋| 798/820 [11:08<00:11,  1.84it/s]
 97%|█████████▋| 799/820 [11:09<00:11,  1.84it/s]
 98%|█████████▊| 800/820 [11:09<00:10,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:07:22 (running for 00:11:17.49)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 98%|█████████▊| 801/820 [11:10<00:10,  1.85it/s]
 98%|█████████▊| 802/820 [11:10<00:09,  1.85it/s]
 98%|█████████▊| 803/820 [11:11<00:09,  1.85it/s]
 98%|█████████▊| 804/820 [11:11<00:08,  1.85it/s]
 98%|█████████▊| 805/820 [11:12<00:08,  1.85it/s]
 98%|█████████▊| 806/820 [11:12<00:07,  1.85it/s]
 98%|█████████▊| 807/820 [11:13<00:07,  1.85it/s]
 99%|█████████▊| 808/820 [11:14<00:06,  1.85it/s]
 99%|█████████▊| 809/820 [11:14<00:05,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:07:27 (running for 00:11:22.49)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

 99%|█████████▉| 810/820 [11:15<00:05,  1.85it/s]
 99%|█████████▉| 811/820 [11:15<00:04,  1.85it/s]
 99%|█████████▉| 812/820 [11:16<00:04,  1.85it/s]
 99%|█████████▉| 813/820 [11:16<00:03,  1.85it/s]
 99%|█████████▉| 814/820 [11:17<00:03,  1.85it/s]
 99%|█████████▉| 815/820 [11:17<00:02,  1.85it/s]
100%|█████████▉| 816/820 [11:18<00:02,  1.85it/s]
100%|█████████▉| 817/820 [11:18<00:01,  1.85it/s]
100%|█████████▉| 818/820 [11:19<00:01,  1.85it/s]
100%|█████████▉| 819/820 [11:19<00:00,  1.85it/s]


== Status ==
Current time: 2022-10-11 04:07:32 (running for 00:11:27.49)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

100%|██████████| 820/820 [11:20<00:00,  1.85it/s]
[2m[36m(_objective pid=2373378)[0m 
  0%|          | 0/167 [00:00<?, ?it/s][A
[2m[36m(_objective pid=2373378)[0m 
  3%|▎         | 5/167 [00:00<00:04, 39.14it/s][A
[2m[36m(_objective pid=2373378)[0m 
  5%|▌         | 9/167 [00:00<00:04, 34.17it/s][A
[2m[36m(_objective pid=2373378)[0m 
  8%|▊         | 13/167 [00:00<00:04, 32.80it/s][A
[2m[36m(_objective pid=2373378)[0m 
 10%|█         | 17/167 [00:00<00:04, 32.16it/s][A
[2m[36m(_objective pid=2373378)[0m 
 13%|█▎        | 21/167 [00:00<00:04, 31.81it/s][A
[2m[36m(_objective pid=2373378)[0m 
 15%|█▍        | 25/167 [00:00<00:04, 31.60it/s][A
[2m[36m(_objective pid=2373378)[0m 
 17%|█▋        | 29/167 [00:00<00:04, 31.47it/s][A
[2m[36m(_objective pid=2373378)[0m 
 20%|█▉        | 33/167 [00:01<00:04, 31.38it/s][A
[2m[36m(_objective pid=2373378)[0m 
 22%|██▏       | 37/167 [00:01<00:04, 31.32it/s][A
[2m[36m(_objective pid=2373378)[0m 
 25%|██▍    

== Status ==
Current time: 2022-10-11 04:07:37 (running for 00:11:32.49)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 16.0/20 CPUs, 1.0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspace/syc/BERT_classification_binary/test-results/tune_transformer_pbt
Number of trials: 1/1 (1 RUNNING)
+------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------+
| Trial name             | status   | loc                |   w_decay | lr   |   train_bs/gpu | num_epochs   |   eval_accuracy |   eval_loss |   epoch |   training_iteration |
|------------------------+----------+--------------------+-----------+------+----------------+--------------+-----------------+-------------+---------+----------------------|
| _objective_a15a2_00000 | RUNNING  | 172.17.0.3:2373378 |  0.258967

[2m[36m(_objective pid=2373378)[0m 
 87%|████████▋ | 145/167 [00:04<00:00, 30.99it/s][A
[2m[36m(_objective pid=2373378)[0m 
 89%|████████▉ | 149/167 [00:04<00:00, 31.06it/s][A
[2m[36m(_objective pid=2373378)[0m 
 92%|█████████▏| 153/167 [00:04<00:00, 31.10it/s][A
[2m[36m(_objective pid=2373378)[0m 
 94%|█████████▍| 157/167 [00:05<00:00, 31.14it/s][A
[2m[36m(_objective pid=2373378)[0m 
 96%|█████████▋| 161/167 [00:05<00:00, 31.14it/s][A
[2m[36m(_objective pid=2373378)[0m 
 99%|█████████▉| 165/167 [00:05<00:00, 31.18it/s][A
                                                 
100%|██████████| 820/820 [11:25<00:00,  1.85it/s]
100%|██████████| 167/167 [00:05<00:00, 31.18it/s][A
                                                 [A


[2m[36m(_objective pid=2373378)[0m {'eval_loss': 0.002041089115664363, 'eval_accuracy': 0.999, 'eval_runtime': 5.3572, 'eval_samples_per_second': 186.664, 'eval_steps_per_second': 31.173, 'epoch': 19.98}
Result for _objective_a15a2_00000:
  date: 2022-10-11_04-07-41
  done: false
  epoch: 19.98
  eval_accuracy: 0.999
  eval_loss: 0.002041089115664363
  eval_runtime: 5.3572
  eval_samples_per_second: 186.664
  eval_steps_per_second: 31.173
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  hostname: 3481a8a2ae33
  iterations_since_restore: 20
  node_ip: 172.17.0.3
  objective: 0.999
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 694.060560464859
  time_this_iter_s: 34.205313205718994
  time_total_s: 694.060560464859
  timestamp: 1665461261
  timesteps_since_restore: 0
  training_iteration: 20
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  


100%|██████████| 820/820 [11:32<00:00,  1.18it/s]
2022-10-11 04:07:45,793	INFO tune.py:758 -- Total run time: 700.55 seconds (700.32 seconds for the tuning loop).


Result for _objective_a15a2_00000:
  date: 2022-10-11_04-07-41
  done: true
  epoch: 19.98
  eval_accuracy: 0.999
  eval_loss: 0.002041089115664363
  eval_runtime: 5.3572
  eval_samples_per_second: 186.664
  eval_steps_per_second: 31.173
  experiment_id: 2db33cb4467541ffb8badd9bbe1fdeb1
  experiment_tag: '0'
  hostname: 3481a8a2ae33
  iterations_since_restore: 20
  node_ip: 172.17.0.3
  objective: 0.999
  pid: 2373378
  should_checkpoint: true
  time_since_restore: 694.060560464859
  time_this_iter_s: 34.205313205718994
  time_total_s: 694.060560464859
  timestamp: 1665461261
  timesteps_since_restore: 0
  training_iteration: 20
  trial_id: a15a2_00000
  warmup_time: 0.003078460693359375
  
== Status ==
Current time: 2022-10-11 04:07:45 (running for 00:11:40.32)
Memory usage on this node: 13.4/31.1 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 0/20 CPUs, 0/1 GPUs, 0.0/15.02 GiB heap, 0.0/7.51 GiB objects (0.0/1.0 accelerator_type:G)
Result logdir: /workspa

In [25]:
result

BestRun(run_id='a15a2_00000', objective=0.999, hyperparameters={'per_device_train_batch_size': 6, 'per_device_eval_batch_size': 6, 'max_steps': -1, 'weight_decay': 0.2589669844681074})

# Hyperparameters 반영 내용

scheduler는 ASHAScheduler 대신 PBT scheduler를 사용했고, metric, mode, max_t는 동일하게 적용  
AdamW Optimizer는 PBT scheduler에 적용할 수 없어, 관련 hyperparameters 제외하고 적용

num_samples = 1 -> n_trials 변수에 반영  
max_num_epochs = 20 -> 동일하게 반영  
seed = 818 -> 동일하게 반영  
model_type = modeltype -> model_name = "xlm-roberta-base"  
num_labels =  2 -> 동일하게 반영  
bias_correction =  True -> AdamW의 hyperparameters  
batch_size =  8 -> 동일하게 반영  
eps =  1e-8 -> AdamW의 hyperparameters  
warmup =  0.1 -> 동일하게 반영  
beta =  (0.9, 0.999) -> AdamW의 hyperparameters  
lr =  2e-5 -> AdamW의 hyperparameters이지만, TrainingArguments의 hyperparameters에 반영

# Reference

https://bo-10000.tistory.com/154
https://huggingface.co/blog/ray-tune  
https://docs.ray.io/en/latest/tune/examples/pbt_transformers.html
https://wood-b.github.io/post/a-novices-guide-to-hyperparameter-optimization-at-scale/#schedulers-vs-search-algorithms
https://docs.ray.io/en/latest/tune/api_docs/search_space.html
https://docs.ray.io/en/latest/tune/tutorials/tune-advanced-tutorial.html
https://docs.ray.io/en/latest/tune/api_docs/schedulers.html
https://blog.ml.cmu.edu/2018/12/12/massively-parallel-hyperparameter-optimization/
https://docs.ray.io/en/latest/tune/faq.html