<a href="https://colab.research.google.com/github/TongLuo2021/HPO-DistilBert-With-Ray-and-Optuna/blob/main/HPO_DistilBert_with_Ray_and_Optuna.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Hyperparameter Search with Transformers, Ray Tune, and Optuna
* by Tong Luo

Hyperparameters (HPs) can have a significant impact on the final model parameters and on model performance. [One study shows](https://towardsdatascience.com/hyperparameter-optimization-for-optimum-transformer-models-b95a32b70949) that the most significant HPs are the learning rate and the number of training epochs, although other parameters also matter. Depending on the application (for example CV or NLP), the best HP values can differ significantly.

----------
### TLuo's work:
1. Ray and Optuna have [default HPs](https://github.com/huggingface/transformers/blob/main/src/transformers/trainer_utils.py):
   * learning_rate
   * num_train_epochs
   * seed
   * per_device_train_batch_size
2. Besides that, I used a customized HPO search space by defining an `hp_space_ray` function, following the [source code](https://github.com/huggingface/transformers/blob/main/src/transformers/trainer_utils.py) and [sgugger's example](https://github.com/huggingface/transformers/blob/main/src/transformers/trainer_utils.py).
3. The same framework is used for Optuna.
4. With the same framework, we can tune other hyperparameters, such as those listed [here](https://simpletransformers.ai/docs/usage/#configuring-a-simple-transformers-model).
5. For a small HPO setting (n_trials = 3), the total time is similar for both backends. Ray is supposed to be faster because it can run multiple trials in parallel. More experiments need to be done here.

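
Item 1 lists `learning_rate` among the default hyperparameters. Learning rates are usually searched on a log scale, so that every decade (for example 1e-6 to 1e-5) is as likely to be sampled as any other. The sketch below illustrates the idea with the standard library only. The function name `sample_loguniform` is an assumption made for illustration; it is not the Ray or Optuna implementation.

```python
import math
import random


def sample_loguniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_loguniform(1e-6, 1e-4, rng) for _ in range(10_000)]

# Roughly half of the draws fall below the geometric midpoint 1e-5,
# whereas plain uniform sampling would put only about 9% of them there.
share_below_midpoint = sum(s < 1e-5 for s in samples) / len(samples)
print(f"share below 1e-5: {share_below_midpoint:.2f}")
```
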
----------
### References
* [Hyperparameter Search with Transformers and Ray Tune](https://huggingface.co/blog/ray-tune)
* [Hyperparameter Optimization for Optimum Transformer Models](https://towardsdatascience.com/hyperparameter-optimization-for-optimum-transformer-models-b95a32b70949)


## Choose which HPO to run (defaults to RUN_ALL)

In [1]:
RUN_RAY = 0
RUN_OPTUNA = 1
RUN_ALL = 2

# Make your selection here
RUN = RUN_ALL

In [2]:
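# "sklearn" (used in the original run) is a deprecated PyPI alias; scikit-learn is the actual package name.
%pip install "ray[tune]" transformers datasets scipy scikit-learn torch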

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting ray[tune]
  Downloading ray-2.1.0-cp37-cp37m-manylinux2014_x86_64.whl (59.1 MB)
Collecting transformers
  Downloading transformers-4.24.0-py3-none-any.whl (5.5 MB)
Collecting datasets
  Downloading datasets-2.7.1-py3-none-any.whl (451 kB)
Collecting sklearn
  Downloading sklearn-0.0.post1.tar.gz (3.6 kB)
Collecting huggingface-hub<1.0,>=0.10.0
  Downloading huggingface_hub-0.11.1-py3-none-any.whl (182 kB)
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
  Downloading tokenizers-0.13.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
Collecting res

In [3]:
from datasets import load_dataset, load_metric
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

In [4]:
tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')
dataset = load_dataset('glue', 'mrpc')
metric = load_metric('glue', 'mrpc')

def encode(examples):
    outputs = tokenizer(
        examples['sentence1'], examples['sentence2'], truncation=True)
    return outputs

encoded_dataset = dataset.map(encode, batched=True)

def model_init():
    return AutoModelForSequenceClassification.from_pretrained(
        'distilbert-base-uncased', return_dict=True)

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    predictions = predictions.argmax(axis=-1)
    return metric.compute(predictions=predictions, references=labels)

# Evaluate during training and a bit more often
# than the default to be able to prune bad trials early.
# Disabling tqdm is a matter of preference; progress bars are left enabled here.
training_args = TrainingArguments(
    "test", 
    evaluation_strategy="steps", 
    eval_steps=500, 
    disable_tqdm=False)

trainer = Trainer(
    args=training_args,
    tokenizer=tokenizer,
    train_dataset=encoded_dataset["train"],
    eval_dataset=encoded_dataset["validation"],
    model_init=model_init,
    compute_metrics=compute_metrics,
)





Downloading and preparing dataset glue/mrpc to /root/.cache/huggingface/datasets/glue/mrpc/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad...


Dataset glue downloaded and prepared to /root/.cache/huggingface/datasets/glue/mrpc/1.0.0/dacbe3125aa31d7f70367a07a8a9e72a5a0bfeb5fc42e75c9db75b96da6053ad. Subsequent calls will reuse this data.


loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--distilbert-base-uncased/snapshots/1c4513b2eedbda136f57676a34eea67aba266e5c/config.json
Model config DistilBertConfig {
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_version": "4.24.0",
  "vocab_size": 30522
}



loading weights file pytorch_model.bin from cache at /root/.cache/huggingface/hub/models--distilbert-base-uncased/snapshots/1c4513b2eedbda136f57676a34eea67aba266e5c/pytorch_model.bin
Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_layer_norm.bias', 'vocab_transform.weight', 'vocab_projector.weight', 'vocab_projector.bias', 'vocab_transform.bias', 'vocab_layer_norm.weight']
- This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of DistilBert
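
The `compute_metrics` function in the cell above takes the arg-max over the logits and passes the predicted labels to the GLUE MRPC metric, which reports accuracy and F1. As a rough illustration of what that produces, and of how summing the metrics gives one score to maximize, here is a pure-Python sketch on made-up logits. The helper names `argmax_rows` and `accuracy_and_f1` and the toy data are assumptions for illustration; the notebook itself relies on `load_metric`.

```python
from typing import Dict, List, Sequence


def argmax_rows(logits: Sequence[Sequence[float]]) -> List[int]:
    """Index of the largest logit in each row (the predicted class)."""
    return [max(range(len(row)), key=row.__getitem__) for row in logits]


def accuracy_and_f1(preds: Sequence[int], labels: Sequence[int]) -> Dict[str, float]:
    """Accuracy and binary F1, treating label 1 as the positive class."""
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    correct = sum(p == y for p, y in zip(preds, labels))
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return {"accuracy": correct / len(labels), "f1": f1}


# Toy logits for four sentence pairs: columns are (not paraphrase, paraphrase).
toy_logits = [[0.2, 1.5], [2.0, -1.0], [0.1, 0.3], [1.2, 0.4]]
toy_labels = [1, 0, 0, 1]

toy_preds = argmax_rows(toy_logits)
toy_metrics = accuracy_and_f1(toy_preds, toy_labels)
objective = sum(toy_metrics.values())  # a single score to maximize
print(toy_preds, toy_metrics, objective)
```

Summing the metrics means a trial is rewarded for improving either accuracy or F1, which is why the search later uses `direction="maximize"`.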

## Using Ray to customize the hyperparameters

In [5]:
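from typing import Any, Dict


def hp_space_ray(trial) -> Dict[str, Any]:
    """Search space for the Ray Tune backend; `trial` is unused here."""
    from ray import tune  # local import keeps Ray optional until this function is called

    return {
        "learning_rate": tune.loguniform(1e-6, 1e-4),  # sampled on a log scale
        "num_train_epochs": tune.choice(list(range(1, 6))),  # 1 to 5 epochs
        # uniform() yields floats; an integer sampler such as tune.randint may suit a seed better
        "seed": tune.uniform(1, 40),
        "per_device_train_batch_size": tune.choice([4, 8, 16, 32, 64]),
        "weight_decay": tune.choice([0.1, 0.3, 0.03]),
    }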
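
For comparison, an Optuna version of the same search space could look like the sketch below. It relies on Optuna's `trial.suggest_float`, `trial.suggest_int`, and `trial.suggest_categorical` methods. The function name `hp_space_optuna` and the integer seed range are assumptions made here, and `FirstValueTrial` is a hypothetical stand-in used only so the sketch runs without Optuna installed.

```python
from typing import Any, Dict, Sequence


def hp_space_optuna(trial: Any) -> Dict[str, Any]:
    """Sketch of an Optuna search space with the same ranges as hp_space_ray."""
    return {
        # log=True asks Optuna to sample the learning rate on a log scale
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 5),
        "seed": trial.suggest_int("seed", 1, 40),
        "per_device_train_batch_size": trial.suggest_categorical(
            "per_device_train_batch_size", [4, 8, 16, 32, 64]
        ),
        "weight_decay": trial.suggest_categorical("weight_decay", [0.1, 0.3, 0.03]),
    }


class FirstValueTrial:
    """Hypothetical stand-in for an Optuna trial: lower bound or first choice."""

    def suggest_float(self, name: str, low: float, high: float, log: bool = False) -> float:
        return low

    def suggest_int(self, name: str, low: int, high: int) -> int:
        return low

    def suggest_categorical(self, name: str, choices: Sequence[Any]) -> Any:
        return choices[0]


demo_space = hp_space_optuna(FirstValueTrial())
print(demo_space)
```

With Optuna installed, the function would be passed as `hp_space=hp_space_optuna` together with `backend="optuna"` in `trainer.hyperparameter_search`.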


* Set `RUN` to `RUN_RAY` or `RUN_ALL` in the first cell to run the Ray HPO below.
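
Before launching the search, it can help to estimate how many optimizer steps a trial will run, because `eval_steps=500` only triggers an intermediate evaluation if a trial lasts at least 500 steps. The sketch below does that arithmetic for the 3,668 MRPC training examples and for the batch sizes and epoch counts in the search space. It assumes a single device, no gradient accumulation, and that the last partial batch is kept; the function name `steps_per_trial` is an illustrative assumption.

```python
import math

N_TRAIN = 3668    # MRPC training examples, as reported when the dataset loads
EVAL_STEPS = 500  # value passed to TrainingArguments in this notebook


def steps_per_trial(batch_size: int, epochs: int, n_examples: int = N_TRAIN) -> int:
    """Optimizer steps for one trial: batches per epoch times epochs."""
    return math.ceil(n_examples / batch_size) * epochs


for batch_size in (4, 8, 16, 32, 64):
    for epochs in (1, 5):
        total = steps_per_trial(batch_size, epochs)
        evals = total // EVAL_STEPS  # intermediate evaluations during training
        print(f"batch={batch_size:>2} epochs={epochs} steps={total:>4} mid-run evals={evals}")
```

For example, `steps_per_trial(64, 5)` gives 290, which matches the 290-step total shown in the progress bar of the first Ray trial. Under the assumptions above, such a trial never reaches step 500, so no intermediate evaluation would be triggered for it.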

In [6]:
%%time
# Default objective is the sum of all metrics
# when metrics are provided, so we have to maximize it.
if RUN in (RUN_RAY, RUN_ALL):
    best_run = trainer.hyperparameter_search(
        hp_space=hp_space_ray,
        direction="maximize",
        backend="ray",
        n_trials=3)
    print("------------RAY------------")
    print(best_run)

No `resources_per_trial` arg was passed into `hyperparameter_search`. Setting it to a default value of 1 CPU and 1 GPU for each trial.
2022-11-29 17:25:34,506	INFO worker.py:1528 -- Started a local Ray instance.
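
The message above says that each trial is given 1 CPU and 1 GPU by default. On a machine with a single GPU this means trials run one after another, which may explain why Ray is not faster than Optuna in this small experiment. The sketch below works through that arithmetic. `max_concurrent_trials` is a hypothetical helper, not part of Ray, and the resource figures (8 CPUs, 1 GPU) are the ones reported in the status output of this run.

```python
import math


def max_concurrent_trials(total_cpus: float, total_gpus: float,
                          cpus_per_trial: float, gpus_per_trial: float) -> int:
    """Upper bound on the number of trials that fit on the node at once."""
    by_cpu = math.floor(total_cpus / cpus_per_trial)
    if gpus_per_trial == 0:
        return by_cpu  # CPU-only trials are limited by CPUs alone
    by_gpu = math.floor(total_gpus / gpus_per_trial)
    return min(by_cpu, by_gpu)


# Defaults reported in the log: 1 CPU and 1 GPU per trial on an 8-CPU, 1-GPU node.
default_limit = max_concurrent_trials(8, 1, cpus_per_trial=1, gpus_per_trial=1)

# Hypothetical alternative: half a GPU per trial, if memory allows sharing.
shared_limit = max_concurrent_trials(8, 1, cpus_per_trial=1, gpus_per_trial=0.5)

print(default_limit, shared_limit)
```

To try the second setting, a `resources_per_trial` dictionary such as `{"cpu": 1, "gpu": 0.5}` can be passed to `hyperparameter_search`. Whether two DistilBERT trials fit on one T4 at the same time is not something this notebook verifies.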




== Status ==
Current time: 2022-11-29 17:25:38 (running for 00:00:00.19)
Memory usage on this node: 3.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

[2m[36m(_objective pid=659)[0m Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_layer_norm.bias', 'vocab_layer_norm.weight', 'vocab_transform.weight', 'vocab_projector.weight', 'vocab_transform.bias', 'vocab_projector.bias']
[2m[36m(_objective pid=659)[0m - This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
[2m[36m(_objective pid=659)[0m - This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2m[36m(_objective pid=659)[0m Some weights of DistilBertForSequenceClassification were not initialized 

  0%|          | 1/290 [00:03<15:04,  3.13s/it]
  1%|          | 2/290 [00:03<07:13,  1.51s/it]
  1%|          | 3/290 [00:03<04:40,  1.02it/s]
  1%|▏         | 4/290 [00:04<03:31,  1.35it/s]
  2%|▏         | 5/290 [00:04<02:52,  1.65it/s]
  2%|▏         | 6/290 [00:04<02:29,  1.89it/s]
  2%|▏         | 7/290 [00:05<02:13,  2.12it/s]


== Status ==
Current time: 2022-11-29 17:25:53 (running for 00:00:14.92)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

  3%|▎         | 8/290 [00:05<02:09,  2.17it/s]
  3%|▎         | 9/290 [00:06<02:02,  2.30it/s]
  3%|▎         | 10/290 [00:06<01:57,  2.38it/s]
  4%|▍         | 11/290 [00:06<01:51,  2.50it/s]
  4%|▍         | 12/290 [00:07<01:54,  2.42it/s]
  4%|▍         | 13/290 [00:07<01:53,  2.44it/s]
  5%|▍         | 14/290 [00:08<01:52,  2.45it/s]
  5%|▌         | 15/290 [00:08<01:48,  2.54it/s]
  6%|▌         | 16/290 [00:08<01:44,  2.62it/s]
  6%|▌         | 17/290 [00:09<01:45,  2.58it/s]
  6%|▌         | 18/290 [00:09<01:43,  2.63it/s]
  7%|▋         | 19/290 [00:09<01:43,  2.62it/s]
  7%|▋         | 20/290 [00:10<01:45,  2.56it/s]


== Status ==
Current time: 2022-11-29 17:25:58 (running for 00:00:19.93)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

  7%|▋         | 21/290 [00:10<01:44,  2.56it/s]
  8%|▊         | 22/290 [00:11<01:42,  2.63it/s]
  8%|▊         | 23/290 [00:11<01:41,  2.63it/s]
  8%|▊         | 24/290 [00:11<01:42,  2.60it/s]
  9%|▊         | 25/290 [00:12<01:39,  2.66it/s]
  9%|▉         | 26/290 [00:12<01:40,  2.63it/s]
  9%|▉         | 27/290 [00:13<01:37,  2.68it/s]
 10%|▉         | 28/290 [00:13<01:36,  2.71it/s]
 10%|█         | 29/290 [00:13<01:38,  2.65it/s]
 10%|█         | 30/290 [00:14<01:38,  2.64it/s]
 11%|█         | 31/290 [00:14<01:39,  2.61it/s]
 11%|█         | 32/290 [00:14<01:39,  2.60it/s]
 11%|█▏        | 33/290 [00:15<01:43,  2.48it/s]


== Status ==
Current time: 2022-11-29 17:26:03 (running for 00:00:24.93)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

 12%|█▏        | 34/290 [00:15<01:41,  2.52it/s]
 12%|█▏        | 35/290 [00:16<01:44,  2.43it/s]
 12%|█▏        | 36/290 [00:16<01:49,  2.32it/s]
 13%|█▎        | 37/290 [00:17<01:43,  2.45it/s]
 13%|█▎        | 38/290 [00:17<01:43,  2.43it/s]
 13%|█▎        | 39/290 [00:17<01:41,  2.47it/s]
 14%|█▍        | 40/290 [00:18<01:44,  2.40it/s]
 14%|█▍        | 41/290 [00:18<01:39,  2.50it/s]
 14%|█▍        | 42/290 [00:19<01:39,  2.49it/s]
 15%|█▍        | 43/290 [00:19<01:38,  2.50it/s]
 15%|█▌        | 44/290 [00:19<01:35,  2.59it/s]
 16%|█▌        | 45/290 [00:20<01:34,  2.58it/s]
 16%|█▌        | 46/290 [00:20<01:32,  2.64it/s]


== Status ==
Current time: 2022-11-29 17:26:08 (running for 00:00:29.94)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

 16%|█▌        | 47/290 [00:20<01:30,  2.67it/s]
 17%|█▋        | 48/290 [00:21<01:31,  2.65it/s]
 17%|█▋        | 49/290 [00:21<01:29,  2.69it/s]
 17%|█▋        | 50/290 [00:22<01:32,  2.61it/s]
 18%|█▊        | 51/290 [00:22<01:32,  2.60it/s]
 18%|█▊        | 52/290 [00:22<01:33,  2.53it/s]
 18%|█▊        | 53/290 [00:23<01:31,  2.58it/s]
 19%|█▊        | 54/290 [00:23<01:31,  2.58it/s]
 19%|█▉        | 55/290 [00:24<01:30,  2.60it/s]
 19%|█▉        | 56/290 [00:24<01:28,  2.64it/s]
 20%|█▉        | 57/290 [00:24<01:29,  2.61it/s]
 20%|██        | 58/290 [00:24<01:12,  3.19it/s]
 20%|██        | 59/290 [00:25<01:17,  2.97it/s]


== Status ==
Current time: 2022-11-29 17:26:13 (running for 00:00:34.94)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

 21%|██        | 60/290 [00:25<01:20,  2.84it/s]
 21%|██        | 61/290 [00:26<01:23,  2.74it/s]
 21%|██▏       | 62/290 [00:26<01:24,  2.69it/s]
 22%|██▏       | 63/290 [00:26<01:25,  2.66it/s]
 22%|██▏       | 64/290 [00:27<01:25,  2.63it/s]
 22%|██▏       | 65/290 [00:27<01:28,  2.56it/s]
 23%|██▎       | 66/290 [00:28<01:26,  2.59it/s]
 23%|██▎       | 67/290 [00:28<01:26,  2.57it/s]
 23%|██▎       | 68/290 [00:28<01:25,  2.61it/s]
 24%|██▍       | 69/290 [00:29<01:23,  2.66it/s]
 24%|██▍       | 70/290 [00:29<01:23,  2.62it/s]
 24%|██▍       | 71/290 [00:29<01:23,  2.61it/s]
 25%|██▍       | 72/290 [00:30<01:25,  2.54it/s]


== Status ==
Current time: 2022-11-29 17:26:18 (running for 00:00:39.95)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

 25%|██▌       | 73/290 [00:30<01:23,  2.59it/s]
 26%|██▌       | 74/290 [00:31<01:25,  2.53it/s]
 26%|██▌       | 75/290 [00:31<01:24,  2.54it/s]
 26%|██▌       | 76/290 [00:31<01:23,  2.58it/s]
 27%|██▋       | 77/290 [00:32<01:23,  2.56it/s]
 27%|██▋       | 78/290 [00:32<01:21,  2.59it/s]
 27%|██▋       | 79/290 [00:33<01:23,  2.51it/s]
 28%|██▊       | 80/290 [00:33<01:23,  2.50it/s]
 28%|██▊       | 81/290 [00:33<01:27,  2.40it/s]
 28%|██▊       | 82/290 [00:34<01:23,  2.48it/s]
 29%|██▊       | 83/290 [00:34<01:26,  2.40it/s]
 29%|██▉       | 84/290 [00:35<01:23,  2.48it/s]
 29%|██▉       | 85/290 [00:35<01:20,  2.54it/s]


== Status ==
Current time: 2022-11-29 17:26:23 (running for 00:00:44.95)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

 30%|██▉       | 86/290 [00:35<01:20,  2.54it/s]
 30%|███       | 87/290 [00:36<01:18,  2.59it/s]
 30%|███       | 88/290 [00:36<01:20,  2.50it/s]
 31%|███       | 89/290 [00:37<01:19,  2.54it/s]
 31%|███       | 90/290 [00:37<01:18,  2.54it/s]
 31%|███▏      | 91/290 [00:37<01:22,  2.42it/s]
 32%|███▏      | 92/290 [00:38<01:19,  2.50it/s]
 32%|███▏      | 93/290 [00:38<01:18,  2.51it/s]
 32%|███▏      | 94/290 [00:39<01:16,  2.56it/s]
 33%|███▎      | 95/290 [00:39<01:19,  2.46it/s]
 33%|███▎      | 96/290 [00:39<01:16,  2.52it/s]
 33%|███▎      | 97/290 [00:40<01:14,  2.59it/s]


== Status ==
Current time: 2022-11-29 17:26:28 (running for 00:00:49.96)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

 34%|███▍      | 98/290 [00:40<01:14,  2.57it/s]
 34%|███▍      | 99/290 [00:41<01:18,  2.44it/s]
 34%|███▍      | 100/290 [00:41<01:17,  2.46it/s]
 35%|███▍      | 101/290 [00:41<01:16,  2.48it/s]
 35%|███▌      | 102/290 [00:42<01:16,  2.45it/s]
 36%|███▌      | 103/290 [00:42<01:15,  2.47it/s]
 36%|███▌      | 104/290 [00:43<01:13,  2.52it/s]
 36%|███▌      | 105/290 [00:43<01:10,  2.63it/s]
 37%|███▋      | 106/290 [00:43<01:10,  2.59it/s]
 37%|███▋      | 107/290 [00:44<01:12,  2.52it/s]
 37%|███▋      | 108/290 [00:44<01:10,  2.57it/s]
 38%|███▊      | 109/290 [00:45<01:10,  2.55it/s]
 38%|███▊      | 110/290 [00:45<01:12,  2.49it/s]


== Status ==
Current time: 2022-11-29 17:26:33 (running for 00:00:54.96)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

 38%|███▊      | 111/290 [00:45<01:13,  2.43it/s]
 39%|███▊      | 112/290 [00:46<01:10,  2.51it/s]
 39%|███▉      | 113/290 [00:46<01:11,  2.49it/s]
 39%|███▉      | 114/290 [00:47<01:15,  2.33it/s]
 40%|███▉      | 115/290 [00:47<01:12,  2.41it/s]
 40%|████      | 116/290 [00:47<00:57,  3.01it/s]
 40%|████      | 117/290 [00:48<00:59,  2.90it/s]
 41%|████      | 118/290 [00:48<01:00,  2.82it/s]
 41%|████      | 119/290 [00:48<01:03,  2.69it/s]
 41%|████▏     | 120/290 [00:49<01:06,  2.57it/s]
 42%|████▏     | 121/290 [00:49<01:05,  2.60it/s]
 42%|████▏     | 122/290 [00:50<01:05,  2.57it/s]
 42%|████▏     | 123/290 [00:50<01:07,  2.47it/s]


== Status ==
Current time: 2022-11-29 17:26:38 (running for 00:00:59.97)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

 43%|████▎     | 124/290 [00:50<01:08,  2.42it/s]
 43%|████▎     | 125/290 [00:51<01:12,  2.28it/s]
 43%|████▎     | 126/290 [00:51<01:08,  2.40it/s]
 44%|████▍     | 127/290 [00:52<01:07,  2.42it/s]
 44%|████▍     | 128/290 [00:52<01:06,  2.43it/s]
 44%|████▍     | 129/290 [00:53<01:07,  2.40it/s]
 45%|████▍     | 130/290 [00:53<01:04,  2.47it/s]
 45%|████▌     | 131/290 [00:53<01:07,  2.37it/s]
 46%|████▌     | 132/290 [00:54<01:06,  2.39it/s]
 46%|████▌     | 133/290 [00:54<01:04,  2.42it/s]
 46%|████▌     | 134/290 [00:55<01:02,  2.48it/s]
 47%|████▋     | 135/290 [00:55<01:02,  2.48it/s]


== Status ==
Current time: 2022-11-29 17:26:43 (running for 00:01:04.97)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

 47%|████▋     | 136/290 [00:55<01:03,  2.41it/s]
 47%|████▋     | 137/290 [00:56<01:01,  2.49it/s]
 48%|████▊     | 138/290 [00:56<01:03,  2.39it/s]
 48%|████▊     | 139/290 [00:57<01:01,  2.45it/s]
 48%|████▊     | 140/290 [00:57<01:00,  2.49it/s]
 49%|████▊     | 141/290 [00:57<01:00,  2.46it/s]
 49%|████▉     | 142/290 [00:58<00:59,  2.51it/s]
 49%|████▉     | 143/290 [00:58<00:59,  2.47it/s]
 50%|████▉     | 144/290 [00:59<00:58,  2.51it/s]
 50%|█████     | 145/290 [00:59<00:58,  2.49it/s]
 50%|█████     | 146/290 [00:59<00:58,  2.48it/s]
 51%|█████     | 147/290 [01:00<00:57,  2.47it/s]


== Status ==
Current time: 2022-11-29 17:26:48 (running for 00:01:09.98)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (2 PENDING, 1 RUNNING)
+------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------+
| Trial name             | status   | loc            |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |
|                        |          |                |                 |                    |                ch_size |         |                |
|------------------------+----------+----------------+-----------------+--------------------+------------------------+---------+----------------|
| _objective_d6e35_00000 | RUNNING  | 172.2

 51%|█████     | 148/290 [01:00<00:58,  2.41it/s]
 51%|█████▏    | 149/290 [01:01<00:57,  2.47it/s]
 52%|█████▏    | 150/290 [01:01<00:57,  2.45it/s]
 52%|█████▏    | 151/290 [01:02<00:57,  2.42it/s]
 52%|█████▏    | 152/290 [01:02<00:59,  2.32it/s]
 53%|█████▎    | 153/290 [01:02<00:57,  2.39it/s]
 53%|█████▎    | 154/290 [01:03<00:59,  2.30it/s]
 53%|█████▎    | 155/290 [01:03<00:56,  2.38it/s]
 54%|█████▍    | 156/290 [01:04<00:55,  2.40it/s]
 54%|█████▍    | 157/290 [01:04<00:54,  2.45it/s]
 54%|█████▍    | 158/290 [01:04<00:52,  2.49it/s]
 55%|█████▍    | 159/290 [01:05<00:53,  2.46it/s]


 55%|█████▌    | 160/290 [01:05<00:51,  2.50it/s]
 56%|█████▌    | 161/290 [01:06<00:52,  2.47it/s]
 56%|█████▌    | 162/290 [01:06<00:54,  2.34it/s]
 56%|█████▌    | 163/290 [01:07<00:56,  2.25it/s]
 57%|█████▋    | 164/290 [01:07<00:53,  2.34it/s]
 57%|█████▋    | 165/290 [01:07<00:51,  2.41it/s]
 57%|█████▋    | 166/290 [01:08<00:52,  2.36it/s]
 58%|█████▊    | 167/290 [01:08<00:50,  2.41it/s]
 58%|█████▊    | 168/290 [01:09<00:50,  2.43it/s]
 58%|█████▊    | 169/290 [01:09<00:47,  2.53it/s]
 59%|█████▊    | 170/290 [01:09<00:49,  2.43it/s]
 59%|█████▉    | 171/290 [01:10<00:49,  2.40it/s]


 59%|█████▉    | 172/290 [01:10<00:49,  2.40it/s]
 60%|█████▉    | 173/290 [01:11<00:49,  2.39it/s]
 60%|██████    | 174/290 [01:11<00:38,  2.98it/s]
 60%|██████    | 175/290 [01:11<00:41,  2.79it/s]
 61%|██████    | 176/290 [01:12<00:44,  2.58it/s]
 61%|██████    | 177/290 [01:12<00:43,  2.58it/s]
 61%|██████▏   | 178/290 [01:13<00:45,  2.47it/s]
 62%|██████▏   | 179/290 [01:13<00:45,  2.45it/s]
 62%|██████▏   | 180/290 [01:13<00:45,  2.43it/s]
 62%|██████▏   | 181/290 [01:14<00:44,  2.45it/s]
 63%|██████▎   | 182/290 [01:14<00:43,  2.49it/s]
 63%|██████▎   | 183/290 [01:15<00:46,  2.28it/s]
 63%|██████▎   | 184/290 [01:15<00:47,  2.22it/s]


 64%|██████▍   | 185/290 [01:16<00:47,  2.22it/s]
 64%|██████▍   | 186/290 [01:16<00:44,  2.31it/s]
 64%|██████▍   | 187/290 [01:16<00:44,  2.29it/s]
 65%|██████▍   | 188/290 [01:17<00:44,  2.31it/s]
 65%|██████▌   | 189/290 [01:17<00:42,  2.38it/s]
 66%|██████▌   | 190/290 [01:18<00:42,  2.38it/s]
 66%|██████▌   | 191/290 [01:18<00:43,  2.28it/s]
 66%|██████▌   | 192/290 [01:19<00:43,  2.27it/s]
 67%|██████▋   | 193/290 [01:19<00:42,  2.30it/s]
 67%|██████▋   | 194/290 [01:19<00:41,  2.32it/s]
 67%|██████▋   | 195/290 [01:20<00:41,  2.27it/s]


 68%|██████▊   | 196/290 [01:20<00:42,  2.21it/s]
 68%|██████▊   | 197/290 [01:21<00:40,  2.30it/s]
 68%|██████▊   | 198/290 [01:21<00:38,  2.38it/s]
 69%|██████▊   | 199/290 [01:22<00:38,  2.36it/s]
 69%|██████▉   | 200/290 [01:22<00:38,  2.34it/s]
 69%|██████▉   | 201/290 [01:23<00:38,  2.31it/s]
 70%|██████▉   | 202/290 [01:23<00:37,  2.32it/s]
 70%|███████   | 203/290 [01:23<00:37,  2.34it/s]
 70%|███████   | 204/290 [01:24<00:36,  2.34it/s]
 71%|███████   | 205/290 [01:24<00:35,  2.39it/s]
 71%|███████   | 206/290 [01:25<00:35,  2.38it/s]
 71%|███████▏  | 207/290 [01:25<00:34,  2.42it/s]


 72%|███████▏  | 208/290 [01:25<00:34,  2.40it/s]
 72%|███████▏  | 209/290 [01:26<00:34,  2.38it/s]
 72%|███████▏  | 210/290 [01:26<00:35,  2.26it/s]
 73%|███████▎  | 211/290 [01:27<00:36,  2.19it/s]
 73%|███████▎  | 212/290 [01:27<00:34,  2.27it/s]
 73%|███████▎  | 213/290 [01:28<00:33,  2.30it/s]
 74%|███████▍  | 214/290 [01:28<00:33,  2.29it/s]
 74%|███████▍  | 215/290 [01:29<00:32,  2.29it/s]
 74%|███████▍  | 216/290 [01:29<00:31,  2.35it/s]
 75%|███████▍  | 217/290 [01:29<00:31,  2.34it/s]
 75%|███████▌  | 218/290 [01:30<00:30,  2.34it/s]


 76%|███████▌  | 219/290 [01:30<00:31,  2.27it/s]
 76%|███████▌  | 220/290 [01:31<00:30,  2.27it/s]
 76%|███████▌  | 221/290 [01:31<00:30,  2.29it/s]
 77%|███████▋  | 222/290 [01:32<00:29,  2.34it/s]
 77%|███████▋  | 223/290 [01:32<00:30,  2.23it/s]
 77%|███████▋  | 224/290 [01:32<00:29,  2.22it/s]
 78%|███████▊  | 225/290 [01:33<00:28,  2.29it/s]
 78%|███████▊  | 226/290 [01:33<00:27,  2.30it/s]
 78%|███████▊  | 227/290 [01:34<00:27,  2.28it/s]
 79%|███████▊  | 228/290 [01:34<00:27,  2.25it/s]
 79%|███████▉  | 229/290 [01:35<00:26,  2.30it/s]
 79%|███████▉  | 230/290 [01:35<00:25,  2.35it/s]


[2m[36m(_objective pid=659)[0m  80%|███████▉  | 231/290 [01:35<00:24,  2.38it/s]
 80%|████████  | 232/290 [01:36<00:19,  2.95it/s]
 80%|████████  | 233/290 [01:36<00:21,  2.69it/s]
 81%|████████  | 234/290 [01:36<00:21,  2.56it/s]
 81%|████████  | 235/290 [01:37<00:23,  2.34it/s]
 81%|████████▏ | 236/290 [01:37<00:23,  2.28it/s]
 82%|████████▏ | 237/290 [01:38<00:23,  2.28it/s]
 82%|████████▏ | 238/290 [01:38<00:22,  2.33it/s]
 82%|████████▏ | 239/290 [01:39<00:22,  2.31it/s]
 83%|████████▎ | 240/290 [01:39<00:21,  2.36it/s]
 83%|████████▎ | 241/290 [01:40<00:20,  2.37it/s]
 83%|████████▎ | 242/290 [01:40<00:20,  2.38it/s]


 84%|████████▍ | 243/290 [01:40<00:19,  2.38it/s]
 84%|████████▍ | 244/290 [01:41<00:20,  2.24it/s]
 84%|████████▍ | 245/290 [01:41<00:20,  2.23it/s]
 85%|████████▍ | 246/290 [01:42<00:19,  2.23it/s]
 85%|████████▌ | 247/290 [01:42<00:18,  2.29it/s]
 86%|████████▌ | 248/290 [01:43<00:18,  2.32it/s]
 86%|████████▌ | 249/290 [01:43<00:17,  2.29it/s]
 86%|████████▌ | 250/290 [01:44<00:17,  2.28it/s]
 87%|████████▋ | 251/290 [01:44<00:17,  2.22it/s]
 87%|████████▋ | 252/290 [01:44<00:17,  2.22it/s]
 87%|████████▋ | 253/290 [01:45<00:16,  2.23it/s]


 88%|████████▊ | 254/290 [01:45<00:16,  2.21it/s]
 88%|████████▊ | 255/290 [01:46<00:16,  2.16it/s]
 88%|████████▊ | 256/290 [01:46<00:16,  2.10it/s]
 89%|████████▊ | 257/290 [01:47<00:16,  2.00it/s]
 89%|████████▉ | 258/290 [01:47<00:15,  2.06it/s]
 89%|████████▉ | 259/290 [01:48<00:14,  2.12it/s]
 90%|████████▉ | 260/290 [01:48<00:13,  2.19it/s]
 90%|█████████ | 261/290 [01:49<00:13,  2.11it/s]
 90%|█████████ | 262/290 [01:49<00:13,  2.13it/s]
 91%|█████████ | 263/290 [01:50<00:12,  2.12it/s]
 91%|█████████ | 264/290 [01:50<00:11,  2.18it/s]


 91%|█████████▏| 265/290 [01:51<00:11,  2.09it/s]
 92%|█████████▏| 266/290 [01:51<00:11,  2.17it/s]
 92%|█████████▏| 267/290 [01:52<00:10,  2.20it/s]
 92%|█████████▏| 268/290 [01:52<00:10,  2.18it/s]
 93%|█████████▎| 269/290 [01:52<00:09,  2.13it/s]
 93%|█████████▎| 270/290 [01:53<00:09,  2.11it/s]
 93%|█████████▎| 271/290 [01:53<00:08,  2.17it/s]
 94%|█████████▍| 272/290 [01:54<00:08,  2.16it/s]
 94%|█████████▍| 273/290 [01:54<00:07,  2.18it/s]
 94%|█████████▍| 274/290 [01:55<00:07,  2.18it/s]
 95%|█████████▍| 275/290 [01:55<00:06,  2.19it/s]


 95%|█████████▌| 276/290 [01:56<00:06,  2.24it/s]
 96%|█████████▌| 277/290 [01:56<00:05,  2.27it/s]
 96%|█████████▌| 278/290 [01:57<00:05,  2.21it/s]
 96%|█████████▌| 279/290 [01:57<00:04,  2.25it/s]
 97%|█████████▋| 280/290 [01:57<00:04,  2.28it/s]
 97%|█████████▋| 281/290 [01:58<00:03,  2.25it/s]
 97%|█████████▋| 282/290 [01:58<00:03,  2.24it/s]
 98%|█████████▊| 283/290 [01:59<00:03,  2.26it/s]
 98%|█████████▊| 284/290 [01:59<00:02,  2.24it/s]
 98%|█████████▊| 285/290 [02:00<00:02,  2.26it/s]
 99%|█████████▊| 286/290 [02:00<00:01,  2.24it/s]


 99%|█████████▉| 287/290 [02:01<00:01,  2.23it/s]
 99%|█████████▉| 288/290 [02:01<00:00,  2.26it/s]
100%|█████████▉| 289/290 [02:01<00:00,  2.14it/s]
100%|██████████| 290/290 [02:02<00:00,  2.37it/s]


[2m[36m(_objective pid=659)[0m {'train_runtime': 122.152, 'train_samples_per_second': 150.141, 'train_steps_per_second': 2.374, 'train_loss': 0.5605058341190733, 'epoch': 5.0}


[2m[36m(_objective pid=659)[0m   0%|          | 0/51 [00:00<?, ?it/s]
 14%|█▎        | 7/51 [00:00<00:00, 67.67it/s]
 27%|██▋       | 14/51 [00:00<00:00, 56.50it/s]
 39%|███▉      | 20/51 [00:00<00:00, 52.59it/s]
 51%|█████     | 26/51 [00:00<00:00, 52.01it/s]
 63%|██████▎   | 32/51 [00:00<00:00, 51.52it/s]
 75%|███████▍  | 38/51 [00:00<00:00, 52.77it/s]
 86%|████████▋ | 44/51 [00:00<00:00, 52.40it/s]


Trial name,date,done,episodes_total,epoch,eval_accuracy,eval_f1,eval_loss,eval_runtime,eval_samples_per_second,eval_steps_per_second,experiment_id,experiment_tag,hostname,iterations_since_restore,node_ip,objective,pid,time_since_restore,time_this_iter_s,time_total_s,timestamp,timesteps_since_restore,timesteps_total,training_iteration,trial_id,warmup_time
_objective_d6e35_00000,2022-11-29_17-27-50,True,,5.0,0.703431,0.82127,0.514007,1.0127,402.889,50.361,966cc2dc30ce49e19ebaca8fff357cb9,,f1240d2c210b,1,172.28.0.2,1.5247,659,127.472,127.472,127.472,1669742870,0,,1,d6e35_00000,0.00345135
_objective_d6e35_00001,2022-11-29_17-29-03,True,,2.17,0.683824,0.812227,0.582026,0.9524,428.397,53.55,eff3fa4b312448b6ac1c14727de33ccc,"1_learning_rate=0.0000,num_train_epochs=3,per_device_train_batch_size=16,seed=4.8990,weight_decay=0.1000",f1240d2c210b,1,172.28.0.2,1.49605,1063,68.2043,68.2043,68.2043,1669742943,0,,1,d6e35_00001,0.00335979
_objective_d6e35_00002,2022-11-29_17-30-58,True,,2.18,0.845588,0.891566,0.422553,0.9505,429.253,53.657,ca96cad9370a4dccb685a52b491aac53,"2_learning_rate=0.0000,num_train_epochs=3,per_device_train_batch_size=8,seed=1.8028,weight_decay=0.3000",f1240d2c210b,2,172.28.0.2,1.73715,1846,84.173,40.047,84.173,1669743058,0,,2,d6e35_00002,0.00326443


[2m[36m(_objective pid=659)[0m  98%|█████████▊| 50/51 [00:00<00:00, 50.36it/s]100%|██████████| 51/51 [00:00<00:00, 51.80it/s]


== Status ==
Current time: 2022-11-29 17:27:56 (running for 00:02:18.10)
Memory usage on this node: 3.7/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (1 PENDING, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+-------------+
| Trial name             | status     | loc             |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |   objective |
|                        |            |                 |                 |                    |                ch_size |         |                |             |
|------------------------+------------+-----------------+-----------------+--------------------+------------------------+---

[2m[36m(_objective pid=1063)[0m Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_transform.bias', 'vocab_layer_norm.weight', 'vocab_projector.weight', 'vocab_projector.bias', 'vocab_transform.weight', 'vocab_layer_norm.bias']
[2m[36m(_objective pid=1063)[0m - This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
[2m[36m(_objective pid=1063)[0m - This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
[2m[36m(_objective pid=1063)[0m Some weights of DistilBertForSequenceClassification were not initiali

[2m[36m(_objective pid=1063)[0m   1%|          | 4/690 [00:01<02:35,  4.42it/s]
  1%|          | 5/690 [00:01<02:09,  5.30it/s]
  1%|          | 6/690 [00:01<01:51,  6.11it/s]
  1%|          | 7/690 [00:01<01:41,  6.72it/s]
  1%|          | 8/690 [00:01<01:35,  7.14it/s]
  1%|▏         | 9/690 [00:01<01:29,  7.63it/s]
  1%|▏         | 10/690 [00:01<01:31,  7.46it/s]
  2%|▏         | 11/690 [00:02<01:27,  7.78it/s]
  2%|▏         | 12/690 [00:02<01:24,  8.02it/s]
  2%|▏         | 13/690 [00:02<01:23,  8.11it/s]
  2%|▏         | 14/690 [00:02<01:22,  8.20it/s]
  2%|▏         | 15/690 [00:02<01:17,  8.66it/s]
  2%|▏         | 16/690 [00:02<01:21,  8.23it/s]
  2%|▏         | 17/690 [00:02<01:20,  8.35it/s]
  3%|▎         | 18/690 [00:02<01:20,  8.34it/s]
  3%|▎         | 19/690 [00:03<01:22,  8.17it/s]
  3%|▎         | 20/690 [00:03<01:23,  7.99it/s]
  3%|▎         | 21/690 [00:03<01:21,  8.17it/s]
  3%|▎         | 22/690 [00:03<01:20,  8.27it/s]
  3%|▎         | 23/690 [00:03<01:22,  

[2m[36m(_objective pid=1063)[0m   7%|▋         | 46/690 [00:06<01:21,  7.86it/s]
  7%|▋         | 47/690 [00:06<01:20,  7.99it/s]
  7%|▋         | 48/690 [00:06<01:19,  8.04it/s]
  7%|▋         | 49/690 [00:06<01:19,  8.08it/s]
  7%|▋         | 50/690 [00:06<01:19,  8.05it/s]
  7%|▋         | 51/690 [00:06<01:19,  8.05it/s]
  8%|▊         | 52/690 [00:07<01:18,  8.13it/s]
  8%|▊         | 53/690 [00:07<01:17,  8.21it/s]
  8%|▊         | 54/690 [00:07<01:17,  8.18it/s]
  8%|▊         | 55/690 [00:07<01:16,  8.33it/s]
  8%|▊         | 56/690 [00:07<01:16,  8.26it/s]
  8%|▊         | 57/690 [00:07<01:15,  8.34it/s]
  8%|▊         | 58/690 [00:07<01:16,  8.28it/s]
  9%|▊         | 59/690 [00:07<01:16,  8.28it/s]
  9%|▊         | 60/690 [00:08<01:16,  8.26it/s]
  9%|▉         | 61/690 [00:08<01:18,  8.06it/s]
  9%|▉         | 62/690 [00:08<01:15,  8.26it/s]
  9%|▉         | 63/690 [00:08<01:15,  8.35it/s]
  9%|▉         | 64/690 [00:08<01:14,  8.39it/s]
  9%|▉         | 65/690 [00:08<01

[2m[36m(_objective pid=1063)[0m  12%|█▏        | 86/690 [00:11<01:11,  8.46it/s]
 13%|█▎        | 87/690 [00:11<01:12,  8.37it/s]
 13%|█▎        | 88/690 [00:11<01:11,  8.38it/s]
 13%|█▎        | 89/690 [00:11<01:17,  7.73it/s]
 13%|█▎        | 90/690 [00:11<01:16,  7.84it/s]
 13%|█▎        | 91/690 [00:11<01:19,  7.58it/s]
 13%|█▎        | 92/690 [00:11<01:16,  7.80it/s]
 13%|█▎        | 93/690 [00:12<01:16,  7.76it/s]
 14%|█▎        | 94/690 [00:12<01:15,  7.93it/s]
 14%|█▍        | 95/690 [00:12<01:12,  8.18it/s]
 14%|█▍        | 96/690 [00:12<01:12,  8.21it/s]
 14%|█▍        | 97/690 [00:12<01:11,  8.29it/s]
 14%|█▍        | 98/690 [00:12<01:11,  8.25it/s]
 14%|█▍        | 99/690 [00:12<01:12,  8.12it/s]
 14%|█▍        | 100/690 [00:12<01:13,  8.08it/s]
 15%|█▍        | 101/690 [00:13<01:11,  8.23it/s]
 15%|█▍        | 102/690 [00:13<01:11,  8.20it/s]
 15%|█▍        | 103/690 [00:13<01:11,  8.25it/s]
 15%|█▌        | 104/690 [00:13<01:13,  8.01it/s]
 15%|█▌        | 105/690 [00

[2m[36m(_objective pid=1063)[0m  18%|█▊        | 127/690 [00:16<01:12,  7.77it/s]
 19%|█▊        | 128/690 [00:16<01:11,  7.88it/s]
 19%|█▊        | 129/690 [00:16<01:12,  7.77it/s]
 19%|█▉        | 130/690 [00:16<01:11,  7.81it/s]
 19%|█▉        | 131/690 [00:16<01:12,  7.76it/s]
 19%|█▉        | 132/690 [00:16<01:10,  7.89it/s]
 19%|█▉        | 133/690 [00:17<01:09,  8.03it/s]
 19%|█▉        | 134/690 [00:17<01:08,  8.06it/s]
 20%|█▉        | 135/690 [00:17<01:08,  8.08it/s]
 20%|█▉        | 136/690 [00:17<01:06,  8.37it/s]
 20%|█▉        | 137/690 [00:17<01:09,  7.98it/s]
 20%|██        | 138/690 [00:17<01:08,  8.05it/s]
 20%|██        | 139/690 [00:17<01:07,  8.16it/s]
 20%|██        | 140/690 [00:17<01:09,  7.86it/s]
 20%|██        | 141/690 [00:18<01:10,  7.73it/s]
 21%|██        | 142/690 [00:18<01:07,  8.12it/s]
 21%|██        | 143/690 [00:18<01:07,  8.13it/s]
 21%|██        | 144/690 [00:18<01:12,  7.54it/s]
 21%|██        | 145/690 [00:18<01:13,  7.45it/s]
 21%|██       

[2m[36m(_objective pid=1063)[0m  24%|██▍       | 165/690 [00:21<01:05,  7.98it/s]
 24%|██▍       | 166/690 [00:21<01:05,  8.01it/s]
 24%|██▍       | 167/690 [00:21<01:03,  8.21it/s]
 24%|██▍       | 168/690 [00:21<01:06,  7.89it/s]
 24%|██▍       | 169/690 [00:21<01:05,  7.92it/s]
 25%|██▍       | 170/690 [00:21<01:07,  7.71it/s]
 25%|██▍       | 171/690 [00:21<01:05,  7.94it/s]
 25%|██▍       | 172/690 [00:22<01:06,  7.77it/s]
 25%|██▌       | 173/690 [00:22<01:09,  7.49it/s]
 25%|██▌       | 174/690 [00:22<01:08,  7.58it/s]
 25%|██▌       | 175/690 [00:22<01:07,  7.68it/s]
 26%|██▌       | 176/690 [00:22<01:08,  7.54it/s]
 26%|██▌       | 177/690 [00:22<01:06,  7.72it/s]
 26%|██▌       | 178/690 [00:22<01:05,  7.78it/s]
 26%|██▌       | 179/690 [00:22<01:03,  8.00it/s]
 26%|██▌       | 180/690 [00:23<01:03,  8.02it/s]
 26%|██▌       | 181/690 [00:23<01:03,  8.02it/s]
 26%|██▋       | 182/690 [00:23<01:05,  7.78it/s]
 27%|██▋       | 183/690 [00:23<01:04,  7.89it/s]
 30%|██▉       | 206/690 [00:26<01:03,  7.59it/s]
 30%|███       | 207/690 [00:26<01:02,  7.77it/s]
 30%|███       | 208/690 [00:26<01:03,  7.57it/s]
 30%|███       | 209/690 [00:26<01:02,  7.65it/s]
 30%|███       | 210/690 [00:26<01:01,  7.85it/s]
 31%|███       | 211/690 [00:26<01:01,  7.84it/s]
 31%|███       | 212/690 [00:27<01:00,  7.97it/s]
 31%|███       | 213/690 [00:27<00:59,  7.97it/s]
 31%|███       | 214/690 [00:27<00:58,  8.10it/s]
 31%|███       | 215/690 [00:27<00:59,  8.05it/s]
 31%|███▏      | 216/690 [00:27<01:00,  7.81it/s]
 31%|███▏      | 217/690 [00:27<00:58,  8.02it/s]
 32%|███▏      | 218/690 [00:27<00:58,  8.11it/s]
 32%|███▏      | 219/690 [00:27<00:57,  8.20it/s]
 32%|███▏      | 220/690 [00:28<00:57,  8.21it/s]
 32%|███▏      | 221/690 [00:28<00:56,  8.27it/s]
 32%|███▏      | 222/690 [00:28<00:55,  8.45it/s]
 32%|███▏      | 223/690 [00:28<00:56,  8.33it/s]
 32%|███▏      | 224/690 [00:28<00:55,  8.37it/s]
 33%|███▎      | 225/690 [00:28<00:58,  7.97it/s]
(_objective pid=1063)  36%|███▌      | 245/690 [00:31<00:55,  8.06it/s]
 36%|███▌      | 246/690 [00:31<00:59,  7.51it/s]
 36%|███▌      | 247/690 [00:31<00:57,  7.75it/s]
 36%|███▌      | 248/690 [00:31<00:58,  7.59it/s]
 36%|███▌      | 249/690 [00:31<00:56,  7.85it/s]
 36%|███▌      | 250/690 [00:31<00:55,  7.96it/s]
 36%|███▋      | 251/690 [00:31<00:53,  8.17it/s]
 37%|███▋      | 252/690 [00:32<00:53,  8.18it/s]
 37%|███▋      | 253/690 [00:32<00:55,  7.90it/s]
 37%|███▋      | 254/690 [00:32<00:55,  7.90it/s]
 37%|███▋      | 255/690 [00:32<00:56,  7.68it/s]
 37%|███▋      | 256/690 [00:32<00:56,  7.75it/s]
 37%|███▋      | 257/690 [00:32<00:55,  7.80it/s]
 37%|███▋      | 258/690 [00:32<00:54,  7.97it/s]
 38%|███▊      | 259/690 [00:32<00:53,  8.05it/s]
 38%|███▊      | 260/690 [00:33<00:55,  7.80it/s]
 38%|███▊      | 261/690 [00:33<00:54,  7.93it/s]
 38%|███▊      | 262/690 [00:33<00:52,  8.17it/s]
 38%|███▊      | 263/690 [00:33<00:51,  8.28it/s]
(_objective pid=1063)  41%|████▏     | 286/690 [00:36<00:51,  7.88it/s]
 42%|████▏     | 287/690 [00:36<00:49,  8.12it/s]
 42%|████▏     | 288/690 [00:36<00:50,  7.94it/s]
 42%|████▏     | 289/690 [00:36<00:49,  8.05it/s]
 42%|████▏     | 290/690 [00:36<00:50,  7.86it/s]
 42%|████▏     | 291/690 [00:36<00:51,  7.79it/s]
 42%|████▏     | 292/690 [00:37<00:50,  7.81it/s]
 42%|████▏     | 293/690 [00:37<00:49,  7.97it/s]
 43%|████▎     | 294/690 [00:37<00:50,  7.92it/s]
 43%|████▎     | 295/690 [00:37<00:49,  8.01it/s]
 43%|████▎     | 296/690 [00:37<00:48,  8.09it/s]
 43%|████▎     | 297/690 [00:37<00:48,  8.09it/s]
 43%|████▎     | 298/690 [00:37<00:50,  7.81it/s]
 43%|████▎     | 299/690 [00:37<00:48,  8.00it/s]
 43%|████▎     | 300/690 [00:38<00:51,  7.62it/s]
 44%|████▎     | 301/690 [00:38<00:50,  7.76it/s]
 44%|████▍     | 302/690 [00:38<00:49,  7.85it/s]
 44%|████▍     | 303/690 [00:38<00:48,  7.91it/s]
 44%|████▍     | 304/690 [00:38<00:49,  7.76it/s]
 47%|████▋     | 325/690 [00:41<00:47,  7.68it/s]
 47%|████▋     | 327/690 [00:41<00:43,  8.34it/s]
 48%|████▊     | 328/690 [00:41<00:42,  8.45it/s]
 48%|████▊     | 329/690 [00:41<00:43,  8.38it/s]
 48%|████▊     | 330/690 [00:41<00:43,  8.34it/s]
 48%|████▊     | 331/690 [00:41<00:43,  8.33it/s]
 48%|████▊     | 332/690 [00:42<00:43,  8.31it/s]
 48%|████▊     | 333/690 [00:42<00:42,  8.36it/s]
 48%|████▊     | 334/690 [00:42<00:42,  8.37it/s]
 49%|████▊     | 335/690 [00:42<00:43,  8.18it/s]
 49%|████▊     | 336/690 [00:42<00:45,  7.86it/s]
 49%|████▉     | 337/690 [00:42<00:43,  8.21it/s]
 49%|████▉     | 338/690 [00:42<00:41,  8.46it/s]
 49%|████▉     | 339/690 [00:42<00:42,  8.32it/s]
 49%|████▉     | 340/690 [00:43<00:42,  8.14it/s]
 49%|████▉     | 341/690 [00:43<00:44,  7.87it/s]
 50%|████▉     | 342/690 [00:43<00:43,  7.91it/s]
 50%|████▉     | 343/690 [00:43<00:42,  8.13it/s]
 50%|████▉     | 344/690 [00:43<00:43,  8.04it/s]
 50%|█████     | 345/690 [00:43<00:43,  7.99it/s]
(_objective pid=1063)  53%|█████▎    | 366/690 [00:46<00:39,  8.12it/s]
 53%|█████▎    | 367/690 [00:46<00:39,  8.15it/s]
 53%|█████▎    | 368/690 [00:46<00:40,  8.03it/s]
 53%|█████▎    | 369/690 [00:46<00:39,  8.22it/s]
 54%|█████▎    | 370/690 [00:46<00:38,  8.21it/s]
 54%|█████▍    | 371/690 [00:46<00:40,  7.89it/s]
 54%|█████▍    | 372/690 [00:46<00:39,  7.99it/s]
 54%|█████▍    | 373/690 [00:47<00:38,  8.16it/s]
 54%|█████▍    | 374/690 [00:47<00:38,  8.20it/s]
 54%|█████▍    | 375/690 [00:47<00:38,  8.26it/s]
 54%|█████▍    | 376/690 [00:47<00:38,  8.16it/s]
 55%|█████▍    | 377/690 [00:47<00:38,  8.24it/s]
 55%|█████▍    | 378/690 [00:47<00:37,  8.30it/s]
 55%|█████▍    | 379/690 [00:47<00:38,  8.15it/s]
 55%|█████▌    | 380/690 [00:47<00:37,  8.20it/s]
 55%|█████▌    | 381/690 [00:48<00:37,  8.13it/s]
 55%|█████▌    | 382/690 [00:48<00:37,  8.26it/s]
 56%|█████▌    | 383/690 [00:48<00:38,  7.98it/s]
 56%|█████▌    | 384/690 [00:48<00:38,  7.99it/s]
 59%|█████▉    | 406/690 [00:51<00:34,  8.28it/s]
 59%|█████▉    | 407/690 [00:51<00:34,  8.22it/s]
 59%|█████▉    | 408/690 [00:51<00:34,  8.27it/s]
 59%|█████▉    | 409/690 [00:51<00:34,  8.09it/s]
 59%|█████▉    | 410/690 [00:51<00:33,  8.26it/s]
 60%|█████▉    | 411/690 [00:51<00:34,  7.98it/s]
 60%|█████▉    | 412/690 [00:51<00:34,  8.03it/s]
 60%|█████▉    | 413/690 [00:52<00:34,  8.04it/s]
 60%|██████    | 414/690 [00:52<00:32,  8.42it/s]
 60%|██████    | 415/690 [00:52<00:32,  8.38it/s]
 60%|██████    | 416/690 [00:52<00:35,  7.77it/s]
 60%|██████    | 417/690 [00:52<00:34,  7.89it/s]
 61%|██████    | 418/690 [00:52<00:34,  7.78it/s]
 61%|██████    | 419/690 [00:52<00:35,  7.59it/s]
 61%|██████    | 420/690 [00:52<00:34,  7.79it/s]
 61%|██████    | 421/690 [00:53<00:34,  7.86it/s]
 61%|██████    | 422/690 [00:53<00:33,  8.12it/s]
 61%|██████▏   | 423/690 [00:53<00:32,  8.15it/s]
 61%|██████▏   | 424/690 [00:53<00:31,  8.49it/s]
 62%|██████▏   | 425/690 [00:53<00:34,  7.78it/s]


 65%|██████▍   | 446/690 [00:56<00:30,  8.01it/s]
 65%|██████▍   | 447/690 [00:56<00:30,  8.04it/s]
 65%|██████▍   | 448/690 [00:56<00:30,  7.99it/s]
 65%|██████▌   | 449/690 [00:56<00:29,  8.05it/s]
 65%|██████▌   | 450/690 [00:56<00:29,  8.01it/s]
 65%|██████▌   | 451/690 [00:56<00:29,  8.19it/s]
 66%|██████▌   | 452/690 [00:56<00:28,  8.27it/s]
 66%|██████▌   | 453/690 [00:57<00:29,  7.90it/s]
 66%|██████▌   | 454/690 [00:57<00:29,  8.10it/s]
 66%|██████▌   | 455/690 [00:57<00:29,  8.10it/s]
 66%|██████▌   | 456/690 [00:57<00:29,  8.07it/s]
 66%|██████▌   | 457/690 [00:57<00:29,  7.80it/s]
 66%|██████▋   | 458/690 [00:57<00:29,  7.96it/s]
 67%|██████▋   | 459/690 [00:57<00:28,  8.03it/s]
 67%|██████▋   | 461/690 [00:58<00:24,  9.41it/s]
 67%|██████▋   | 462/690 [00:58<00:25,  9.06it/s]
 67%|██████▋   | 463/690 [00:58<00:25,  8.80it/s]
 67%|██████▋   | 464/690 [00:58<00:26,  8.55it/s]
 67%|██████▋   | 465/690 [00:58<00:26,  8.34it/s]
 68%|██████▊   | 466/690 [00:58<00:25,  8.71it/s]
(_objective pid=1063)  70%|███████   | 486/690 [01:01<00:25,  8.01it/s]
 71%|███████   | 487/690 [01:01<00:25,  7.85it/s]
 71%|███████   | 488/690 [01:01<00:25,  7.88it/s]
 71%|███████   | 489/690 [01:01<00:25,  7.75it/s]
 71%|███████   | 490/690 [01:01<00:25,  7.84it/s]
 71%|███████   | 491/690 [01:01<00:24,  8.16it/s]
 71%|███████▏  | 492/690 [01:01<00:24,  8.09it/s]
 71%|███████▏  | 493/690 [01:02<00:24,  8.10it/s]
 72%|███████▏  | 494/690 [01:02<00:24,  7.95it/s]
 72%|███████▏  | 495/690 [01:02<00:24,  8.04it/s]
 72%|███████▏  | 496/690 [01:02<00:23,  8.38it/s]
 72%|███████▏  | 497/690 [01:02<00:23,  8.19it/s]
 72%|███████▏  | 498/690 [01:02<00:23,  8.19it/s]
 72%|███████▏  | 499/690 [01:02<00:24,  7.95it/s]

(_objective pid=1063) {'loss': 0.6098, 'learning_rate': 5.648612608501682e-07, 'epoch': 2.17}

(_objective pid=1063)  72%|███████▏  | 500/690 [01:02<00:25,  7.55it/s]
(_objective pid=1063) 100%|██████████| 51/51 [00:00<00:00, 54.61it/s]

(_objective pid=1063) {'eval_loss': 0.5820257663726807, 'eval_accuracy': 0.6838235294117647, 'eval_f1': 0.8122270742358079, 'eval_runtime': 0.9524, 'eval_samples_per_second': 428.397, 'eval_steps_per_second': 53.55, 'epoch': 2.17}

 73%|███████▎  | 501/690 [01:05<02:51,  1.10it/s]
 73%|███████▎  | 502/690 [01:05<02:06,  1.48it/s]
 73%|███████▎  | 503/690 [01:05<01:35,  1.97it/s]
 73%|███████▎  | 504/690 [01:06<01:12,  2.55it/s]
 73%|███████▎  | 505/690 [01:06<00:58,  3.19it/s]
 73%|███████▎  | 506/690 [01:06<00:46,  3.93it/s]
 73%|███████▎  | 507/690 [01:06<00:38,  4.71it/s]
 74%|███████▎  | 508/690 [01:06<00:34,  5.26it/s]
 74%|███████▍  | 509/690 [01:06<00:31,  5.77it/s]
 74%|███████▍  | 510/690 [01:06<00:28,  6.34it/s]
 74%|███████▍  | 511/690 [01:06<00:26,  6.85it/s]
(_objective pid=1063)  76%|███████▋  | 527/690 [01:08<00:19,  8.52it/s]
 77%|███████▋  | 528/690 [01:08<00:19,  8.15it/s]
 77%|███████▋  | 529/690 [01:09<00:19,  8.18it/s]
 77%|███████▋  | 530/690 [01:09<00:19,  8.30it/s]
 77%|███████▋  | 531/690 [01:09<00:19,  8.37it/s]
 77%|███████▋  | 532/690 [01:09<00:19,  8.01it/s]
 77%|███████▋  | 533/690 [01:09<00:19,  7.85it/s]
 77%|███████▋  | 534/690 [01:09<00:19,  7.91it/s]
 78%|███████▊  | 535/690 [01:09<00:19,  7.96it/s]
 78%|███████▊  | 536/690 [01:09<00:19,  7.98it/s]
 78%|███████▊  | 537/690 [01:10<00:18,  8.26it/s]
 78%|███████▊  | 538/690 [01:10<00:18,  8.32it/s]
 78%|███████▊  | 539/690 [01:10<00:18,  8.25it/s]
 78%|███████▊  | 540/690 [01:10<00:18,  8.28it/s]
 78%|███████▊  | 541/690 [01:10<00:18,  8.08it/s]
 79%|███████▊  | 542/690 [01:10<00:18,  8.09it/s]
 79%|███████▊  | 543/690 [01:10<00:19,  7.56it/s]
 79%|███████▉  | 544/690 [01:10<00:19,  7.40it/s]
 79%|███████▉  | 545/690 [01:11<00:19,  7.42it/s]
 82%|████████▏ | 568/690 [01:13<00:14,  8.42it/s]
 82%|████████▏ | 569/690 [01:14<00:15,  7.98it/s]
 83%|████████▎ | 570/690 [01:14<00:14,  8.06it/s]
 83%|████████▎ | 571/690 [01:14<00:14,  8.10it/s]
 83%|████████▎ | 572/690 [01:14<00:14,  8.25it/s]
 83%|████████▎ | 573/690 [01:14<00:14,  8.22it/s]
 83%|████████▎ | 574/690 [01:14<00:14,  8.27it/s]
 83%|████████▎ | 575/690 [01:14<00:13,  8.33it/s]
 83%|████████▎ | 576/690 [01:14<00:14,  8.07it/s]
 84%|████████▎ | 577/690 [01:15<00:14,  8.02it/s]
 84%|████████▍ | 578/690 [01:15<00:13,  8.07it/s]
 84%|████████▍ | 579/690 [01:15<00:13,  8.31it/s]
 84%|████████▍ | 580/690 [01:15<00:13,  8.35it/s]
 84%|████████▍ | 581/690 [01:15<00:13,  7.87it/s]
 84%|████████▍ | 582/690 [01:15<00:13,  8.03it/s]
 84%|████████▍ | 583/690 [01:15<00:13,  7.87it/s]
 85%|████████▍ | 584/690 [01:15<00:13,  7.72it/s]
 85%|████████▍ | 586/690 [01:16<00:12,  8.37it/s]
 85%|████████▌ | 587/690 [01:16<00:12,  8.49it/s]
 85%|████████▌ | 588/690 [01:16<00:12,  8.47it/s]
(_objective pid=1063)  88%|████████▊ | 610/690 [01:19<00:10,  7.69it/s]
 89%|████████▊ | 611/690 [01:19<00:10,  7.85it/s]
 89%|████████▊ | 612/690 [01:19<00:09,  8.03it/s]
 89%|████████▉ | 613/690 [01:19<00:09,  8.07it/s]
 89%|████████▉ | 614/690 [01:19<00:09,  7.87it/s]
 89%|████████▉ | 615/690 [01:19<00:09,  7.95it/s]
 89%|████████▉ | 616/690 [01:19<00:09,  8.03it/s]
 89%|████████▉ | 617/690 [01:19<00:09,  8.06it/s]
 90%|████████▉ | 618/690 [01:20<00:08,  8.07it/s]
 90%|████████▉ | 619/690 [01:20<00:08,  8.20it/s]
 90%|████████▉ | 620/690 [01:20<00:08,  8.15it/s]
 90%|█████████ | 621/690 [01:20<00:08,  7.90it/s]
 90%|█████████ | 622/690 [01:20<00:08,  7.77it/s]
 90%|█████████ | 623/690 [01:20<00:08,  7.92it/s]
 90%|█████████ | 624/690 [01:20<00:08,  7.75it/s]
 91%|█████████ | 625/690 [01:20<00:08,  8.00it/s]
 91%|█████████ | 626/690 [01:21<00:07,  8.06it/s]
 91%|█████████ | 627/690 [01:21<00:07,  8.27it/s]
 91%|█████████ | 628/690 [01:21<00:07,  8.33it/s]
(_objective pid=1063)  94%|█████████▍| 650/690 [01:24<00:04,  8.13it/s]
 94%|█████████▍| 651/690 [01:24<00:04,  7.92it/s]
 94%|█████████▍| 652/690 [01:24<00:04,  7.99it/s]
 95%|█████████▍| 653/690 [01:24<00:04,  8.16it/s]
 95%|█████████▍| 654/690 [01:24<00:04,  8.29it/s]
 95%|█████████▍| 655/690 [01:24<00:04,  8.31it/s]
 95%|█████████▌| 656/690 [01:24<00:04,  8.24it/s]
 95%|█████████▌| 657/690 [01:24<00:03,  8.35it/s]
 95%|█████████▌| 658/690 [01:24<00:03,  8.24it/s]
 96%|█████████▌| 659/690 [01:25<00:04,  7.66it/s]
 96%|█████████▌| 660/690 [01:25<00:03,  8.11it/s]
 96%|█████████▌| 661/690 [01:25<00:03,  8.15it/s]
 96%|█████████▌| 662/690 [01:25<00:03,  8.39it/s]
 96%|█████████▌| 663/690 [01:25<00:03,  8.32it/s]
 96%|█████████▌| 664/690 [01:25<00:03,  8.20it/s]
 96%|█████████▋| 665/690 [01:25<00:03,  8.23it/s]
 97%|█████████▋| 666/690 [01:25<00:02,  8.04it/s]
 97%|█████████▋| 667/690 [01:26<00:02,  8.33it/s]
 97%|█████████▋| 668/690 [01:26<00:02,  8.08it/s]

(_objective pid=1063) {'train_runtime': 88.9293, 'train_samples_per_second': 123.739, 'train_steps_per_second': 7.759, 'train_loss': 0.6039019045622452, 'epoch': 3.0}
== Status ==
Current time: 2022-11-29 17:29:29 (running for 00:03:51.11)
Memory usage on this node: 3.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (1 RUNNING, 2 TERMINATED)
+------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+-------------+
| Trial name             | status     | loc             |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |   objective |
|                        |            |                 |                 |                    |                ch_siz

(_objective pid=1846) Some weights of the model checkpoint at distilbert-base-uncased were not used when initializing DistilBertForSequenceClassification: ['vocab_projector.bias', 'vocab_layer_norm.weight', 'vocab_transform.weight', 'vocab_projector.weight', 'vocab_transform.bias', 'vocab_layer_norm.bias']
(_objective pid=1846) - This IS expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
(_objective pid=1846) - This IS NOT expected if you are initializing DistilBertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
(_objective pid=1846) Some weights of DistilBertForSequenceClassification were not initiali

  0%|          | 1/1377 [00:00<22:54,  1.00it/s]
  0%|          | 3/1377 [00:01<07:33,  3.03it/s]
  0%|          | 5/1377 [00:01<04:30,  5.07it/s]
  1%|          | 7/1377 [00:01<03:18,  6.90it/s]
  1%|          | 9/1377 [00:01<02:43,  8.38it/s]
  1%|          | 11/1377 [00:01<02:20,  9.72it/s]
  1%|          | 13/1377 [00:01<02:09, 10.55it/s]
  1%|          | 15/1377 [00:02<01:59, 11.40it/s]
  1%|          | 17/1377 [00:02<01:52, 12.04it/s]
  1%|▏         | 19/1377 [00:02<01:50, 12.28it/s]
  2%|▏         | 21/1377 [00:02<01:47, 12.57it/s]
  2%|▏         | 23/1377 [00:02<01:45, 12.87it/s]
  2%|▏         | 25/1377 [00:02<01:45, 12.87it/s]
  2%|▏         | 27/1377 [00:02<01:42, 13.11it/s]
  2%|▏         | 29/1377 [00:03<01:42, 13.15it/s]
  2%|▏         | 31/1377 [00:03<01:42, 13.16it/s]
  2%|▏         | 33/1377 [00:03<01:40, 13.31it/s]
  3%|▎         | 35/1377 [00:03<01:40, 13.37it/s]
  3%|▎         | 37/1377 [00:03<01:39, 13.40it/s]
  3%|▎         | 39/1377 [00:03<01:40, 13.36it/s]
[2m[36m(_objective pid=1846)[0m {'loss': 0.5262, 'learning_rate': 1.0146019124505001e-05, 'epoch': 1.09}


[2m[36m(_objective pid=1846)[0m 
[2m[36m(_objective pid=1846)[0m  27%|██▋       | 14/51 [00:00<00:00, 53.71it/s][A
[2m[36m(_objective pid=1846)[0m 
 39%|███▉      | 20/51 [00:00<00:00, 50.54it/s][A
[2m[36m(_objective pid=1846)[0m 
 51%|█████     | 26/51 [00:00<00:00, 51.17it/s][A
[2m[36m(_objective pid=1846)[0m 
 63%|██████▎   | 32/51 [00:00<00:00, 50.64it/s][A
[2m[36m(_objective pid=1846)[0m 
 75%|███████▍  | 38/51 [00:00<00:00, 52.44it/s][A
[2m[36m(_objective pid=1846)[0m 
 86%|████████▋ | 44/51 [00:00<00:00, 52.37it/s][A
[2m[36m(_objective pid=1846)[0m 
                                                  
 36%|███▋      | 500/1377 [00:39<01:07, 12.94it/s]
100%|██████████| 51/51 [00:00<00:00, 50.85it/s][A
                                               [A


[2m[36m(_objective pid=1846)[0m {'eval_loss': 0.38265106081962585, 'eval_accuracy': 0.8333333333333334, 'eval_f1': 0.8785714285714286, 'eval_runtime': 1.0131, 'eval_samples_per_second': 402.73, 'eval_steps_per_second': 50.341, 'epoch': 1.09}


[2m[36m(_objective pid=1846)[0m {'loss': 0.3433, 'learning_rate': 4.361515632768968e-06, 'epoch': 2.18}


[2m[36m(_objective pid=1846)[0m 
[2m[36m(_objective pid=1846)[0m  27%|██▋       | 14/51 [00:00<00:00, 56.53it/s][A
[2m[36m(_objective pid=1846)[0m 
 39%|███▉      | 20/51 [00:00<00:00, 53.25it/s][A
[2m[36m(_objective pid=1846)[0m 
 51%|█████     | 26/51 [00:00<00:00, 54.32it/s][A
[2m[36m(_objective pid=1846)[0m 
 63%|██████▎   | 32/51 [00:00<00:00, 54.46it/s][A
[2m[36m(_objective pid=1846)[0m 
 75%|███████▍  | 38/51 [00:00<00:00, 55.60it/s][A
[2m[36m(_objective pid=1846)[0m 
 86%|████████▋ | 44/51 [00:00<00:00, 55.11it/s][A
[2m[36m(_objective pid=1846)[0m 
                                                   
 73%|███████▎  | 1000/1377 [01:19<00:28, 13.38it/s]
100%|██████████| 51/51 [00:00<00:00, 54.74it/s][A
                                               [A


[2m[36m(_objective pid=1846)[0m {'eval_loss': 0.4225529730319977, 'eval_accuracy': 0.8455882352941176, 'eval_f1': 0.891566265060241, 'eval_runtime': 0.9505, 'eval_samples_per_second': 429.253, 'eval_steps_per_second': 53.657, 'epoch': 2.18}
== Status ==
Current time: 2022-11-29 17:31:08 (running for 00:05:29.96)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (1 RUNNING, 2 TERMINATED)
+------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+-------------+
| Trial name             | status     | loc             |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |   objective |
|                        |            |                 |                 |                    |                ch_size |         |                |             |
|------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----

[2m[36m(_objective pid=1846)[0m  81%|████████  | 1117/1377 [01:30<00:19, 13.04it/s]
 81%|████████▏ | 1119/1377 [01:30<00:19, 13.09it/s]
 81%|████████▏ | 1121/1377 [01:30<00:19, 13.16it/s]
 82%|████████▏ | 1123/1377 [01:30<00:19, 13.16it/s]
 82%|████████▏ | 1125/1377 [01:30<00:19, 13.22it/s]
 82%|████████▏ | 1127/1377 [01:30<00:18, 13.28it/s]
 82%|████████▏ | 1129/1377 [01:30<00:18, 13.15it/s]
 82%|████████▏ | 1131/1377 [01:31<00:18, 13.47it/s]
 82%|████████▏ | 1133/1377 [01:31<00:18, 13.35it/s]
 82%|████████▏ | 1135/1377 [01:31<00:18, 13.35it/s]
 83%|████████▎ | 1137/1377 [01:31<00:17, 13.50it/s]
 83%|████████▎ | 1139/1377 [01:31<00:17, 13.45it/s]
 83%|████████▎ | 1141/1377 [01:31<00:17, 13.58it/s]
 83%|████████▎ | 1143/1377 [01:31<00:17, 13.24it/s]
 83%|████████▎ | 1145/1377 [01:32<00:17, 13.61it/s]
 83%|████████▎ | 1147/1377 [01:32<00:16, 13.54it/s]
 83%|████████▎ | 1149/1377 [01:32<00:17, 13.34it/s]
 84%|████████▎ | 1151/1377 [01:32<00:17, 13.16it/s]
 84%|████████▎ | 1153/1377 [

== Status ==
Current time: 2022-11-29 17:31:13 (running for 00:05:34.97)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (1 RUNNING, 2 TERMINATED)
+------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+-------------+
| Trial name             | status     | loc             |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |   objective |
|                        |            |                 |                 |                    |                ch_size |         |                |             |
|------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----

[2m[36m(_objective pid=1846)[0m  86%|████████▌ | 1181/1377 [01:34<00:14, 13.23it/s]
 86%|████████▌ | 1183/1377 [01:34<00:14, 13.25it/s]
 86%|████████▌ | 1185/1377 [01:35<00:14, 13.31it/s]
 86%|████████▌ | 1187/1377 [01:35<00:14, 13.32it/s]
 86%|████████▋ | 1189/1377 [01:35<00:14, 13.33it/s]
 86%|████████▋ | 1191/1377 [01:35<00:13, 13.30it/s]
 87%|████████▋ | 1193/1377 [01:35<00:13, 13.43it/s]
 87%|████████▋ | 1195/1377 [01:35<00:13, 13.63it/s]
 87%|████████▋ | 1197/1377 [01:36<00:13, 13.55it/s]
 87%|████████▋ | 1199/1377 [01:36<00:13, 13.30it/s]
 87%|████████▋ | 1201/1377 [01:36<00:12, 13.58it/s]
 87%|████████▋ | 1203/1377 [01:36<00:12, 13.86it/s]
 88%|████████▊ | 1205/1377 [01:36<00:12, 13.95it/s]
 88%|████████▊ | 1207/1377 [01:36<00:12, 13.91it/s]
 88%|████████▊ | 1209/1377 [01:36<00:12, 13.93it/s]
 88%|████████▊ | 1211/1377 [01:37<00:11, 13.85it/s]
 88%|████████▊ | 1213/1377 [01:37<00:11, 13.70it/s]
 88%|████████▊ | 1215/1377 [01:37<00:11, 13.77it/s]
 88%|████████▊ | 1217/1377 [

== Status ==
Current time: 2022-11-29 17:31:18 (running for 00:05:39.97)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (1 RUNNING, 2 TERMINATED)
+------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+-------------+
| Trial name             | status     | loc             |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |   objective |
|                        |            |                 |                 |                    |                ch_size |         |                |             |
|------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----

[2m[36m(_objective pid=1846)[0m  91%|█████████ | 1251/1377 [01:40<00:09, 13.30it/s]
 91%|█████████ | 1253/1377 [01:40<00:09, 13.58it/s]
 91%|█████████ | 1255/1377 [01:40<00:09, 13.43it/s]
 91%|█████████▏| 1257/1377 [01:40<00:08, 13.42it/s]
 91%|█████████▏| 1259/1377 [01:40<00:08, 13.23it/s]
 92%|█████████▏| 1261/1377 [01:40<00:08, 13.31it/s]
 92%|█████████▏| 1263/1377 [01:40<00:08, 13.31it/s]
 92%|█████████▏| 1265/1377 [01:41<00:08, 13.62it/s]
 92%|█████████▏| 1267/1377 [01:41<00:08, 13.26it/s]
 92%|█████████▏| 1269/1377 [01:41<00:08, 13.27it/s]
 92%|█████████▏| 1271/1377 [01:41<00:08, 13.23it/s]
 92%|█████████▏| 1273/1377 [01:41<00:07, 13.26it/s]
 93%|█████████▎| 1275/1377 [01:41<00:07, 13.45it/s]
 93%|█████████▎| 1277/1377 [01:41<00:07, 13.50it/s]
 93%|█████████▎| 1279/1377 [01:42<00:07, 13.49it/s]
 93%|█████████▎| 1281/1377 [01:42<00:07, 13.45it/s]
 93%|█████████▎| 1283/1377 [01:42<00:07, 13.33it/s]
 93%|█████████▎| 1285/1377 [01:42<00:06, 13.55it/s]
 93%|█████████▎| 1287/1377 [

== Status ==
Current time: 2022-11-29 17:31:23 (running for 00:05:44.97)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 1.0/8 CPUs, 1.0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (1 RUNNING, 2 TERMINATED)
+------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+-------------+
| Trial name             | status     | loc             |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |   objective |
|                        |            |                 |                 |                    |                ch_size |         |                |             |
|------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----

[2m[36m(_objective pid=1846)[0m  96%|█████████▌| 1317/1377 [01:44<00:04, 13.21it/s]
 96%|█████████▌| 1319/1377 [01:45<00:04, 13.03it/s]
 96%|█████████▌| 1321/1377 [01:45<00:04, 12.85it/s]
 96%|█████████▌| 1323/1377 [01:45<00:04, 13.20it/s]
 96%|█████████▌| 1325/1377 [01:45<00:03, 13.00it/s]
 96%|█████████▋| 1327/1377 [01:45<00:03, 13.34it/s]
 97%|█████████▋| 1329/1377 [01:45<00:03, 13.08it/s]
 97%|█████████▋| 1331/1377 [01:46<00:03, 12.74it/s]
 97%|█████████▋| 1333/1377 [01:46<00:03, 13.15it/s]
 97%|█████████▋| 1335/1377 [01:46<00:03, 12.67it/s]
 97%|█████████▋| 1337/1377 [01:46<00:03, 12.86it/s]
 97%|█████████▋| 1339/1377 [01:46<00:02, 13.11it/s]
 97%|█████████▋| 1341/1377 [01:46<00:02, 13.31it/s]
 98%|█████████▊| 1343/1377 [01:46<00:02, 13.23it/s]
 98%|█████████▊| 1345/1377 [01:47<00:02, 13.32it/s]
 98%|█████████▊| 1347/1377 [01:47<00:02, 13.73it/s]
 98%|█████████▊| 1349/1377 [01:47<00:02, 13.66it/s]
 98%|█████████▊| 1351/1377 [01:47<00:01, 13.28it/s]
 98%|█████████▊| 1353/1377 [

== Status ==
Current time: 2022-11-29 17:31:27 (running for 00:05:49.63)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/1 GPUs, 0.0/30.18 GiB heap, 0.0/15.09 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-29_17-25-37
Number of trials: 3/3 (3 TERMINATED)
+------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+-------------+
| Trial name             | status     | loc             |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |   objective |
|                        |            |                 |                 |                    |                ch_size |         |                |             |
|------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+--

In [7]:
print(best_run)

BestRun(run_id='d6e35_00002', objective=1.7371545003543587, hyperparameters={'learning_rate': 1.5930522616241033e-05, 'num_train_epochs': 3, 'seed': 1.8027952775362954, 'per_device_train_batch_size': 8, 'weight_decay': 0.3})


* Result of a run on Colab Pro+ (GPU, High-RAM runtime):


```

2022-11-28 02:28:06,252	INFO tune.py:778 -- Total run time: 347.71 seconds (347.41 seconds for the tuning loop).
== Status ==
Current time: 2022-11-28 02:28:06 (running for 00:05:47.41)
Memory usage on this node: 4.4/51.0 GiB 
Using FIFO scheduling algorithm.
Resources requested: 0/8 CPUs, 0/1 GPUs, 0.0/30.17 GiB heap, 0.0/15.08 GiB objects (0.0/1.0 accelerator_type:T4)
Result logdir: /root/ray_results/_objective_2022-11-28_02-22-18
Number of trials: 3/3 (3 TERMINATED)
+------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+-------------+
| Trial name             | status     | loc             |   learning_rate |   num_train_epochs |   per_device_train_bat |    seed |   weight_decay |   objective |
|                        |            |                 |                 |                    |                ch_size |         |                |             |
|------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+-------------|
| _objective_7b66a_00000 | TERMINATED | 172.28.0.2:655  |     5.61152e-06 |                  5 |                     64 | 8.15396 |            0.1 |     1.5247  |
| _objective_7b66a_00001 | TERMINATED | 172.28.0.2:1060 |     2.05134e-06 |                  3 |                     16 | 4.89902 |            0.1 |     1.49605 |
| _objective_7b66a_00002 | TERMINATED | 172.28.0.2:1841 |     1.59305e-05 |                  3 |                      8 | 1.8028  |            0.2 |     1.72919 |
+------------------------+------------+-----------------+-----------------+--------------------+------------------------+---------+----------------+-------------+


(_objective pid=1841) {'train_runtime': 107.4607, 'train_samples_per_second': 102.4, 'train_steps_per_second': 12.814, 'train_loss': 0.3774486755060821, 'epoch': 3.0}
BestRun(run_id='7b66a_00002', objective=1.7291939932062017, hyperparameters={'learning_rate': 1.5930522616241033e-05, 'num_train_epochs': 3, 'seed': 1.8027952775362954, 'per_device_train_batch_size': 8, 'weight_decay': 0.2})
```



## HPO Using Optuna

In [8]:
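from typing import Any, Dict

def hp_space_optuna(trial) -> Dict[str, Any]:
    """Search space that Optuna samples from for each trial."""
    return {
        # Sampled on a log scale because useful values span two orders of magnitude.
        "learning_rate": trial.suggest_float("learning_rate", 1e-6, 1e-4, log=True),
        "num_train_epochs": trial.suggest_int("num_train_epochs", 1, 5),
        "seed": trial.suggest_int("seed", 1, 40),
        "per_device_train_batch_size": trial.suggest_categorical("per_device_train_batch_size", [4, 8, 16, 32, 64]),
        # suggest_loguniform is deprecated as of Optuna 3.0; suggest_float(..., log=True) is the equivalent call.
        "weight_decay": trial.suggest_float("weight_decay", 0.001, 0.1, log=True),
    }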
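
The learning rate in the search space above is drawn with `log=True`, meaning the logarithm of the value is sampled uniformly rather than the value itself. The sketch below illustrates that idea with the standard library only. The helper `sample_log_uniform` is a hypothetical stand-in written for this note, not Optuna's sampler.

```python
import math
import random

def sample_log_uniform(low: float, high: float, rng: random.Random) -> float:
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
samples = [sample_log_uniform(1e-6, 1e-4, rng) for _ in range(10_000)]

# Roughly half of the draws fall below the geometric midpoint 1e-5,
# whereas a plain uniform draw on the same range would put only about 9% there.
share_below_mid = sum(s < 1e-5 for s in samples) / len(samples)
print(f"share below 1e-5: {share_below_mid:.2f}")
```

Sampling this way gives small learning rates such as 2e-6 about the same chance of being tried as large ones such as 5e-5, which matters when only a handful of trials are run.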


In [9]:
!pip install optuna

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting optuna
  Downloading optuna-3.0.3-py3-none-any.whl (348 kB)
     |████████████████████████████████| 348 kB 22.3 MB/s 
Collecting colorlog
  Downloading colorlog-6.7.0-py2.py3-none-any.whl (11 kB)
Collecting alembic>=1.5.0
  Downloading alembic-1.8.1-py3-none-any.whl (209 kB)
     |████████████████████████████████| 209 kB 86.9 MB/s 
Collecting cmaes>=0.8.2
  Downloading cmaes-0.9.0-py3-none-any.whl (23 kB)
Collecting cliff
  Downloading cliff-3.10.1-py3-none-any.whl (81 kB)
     |████████████████████████████████| 81 kB 10.8 MB/s 
Collecting Mako
  Downloading Mako-1.2.4-py3-none-any.whl (78 kB)
     |████████████████████████████████| 78 kB 7.9 MB/s 
Collecting pbr!=2.1.0,>=2.0.0
  Downloading pbr-5.11.0-py2.py3-none-any.whl (112 kB)
     |████████████████████████████████| 112 kB 73.7 MB/s 
Collecting cmd2>=1.0.0
  Downloading cmd2-2.4.2-py3-none-any.wh

In [10]:
import optuna

In [11]:
%%time
if RUN == RUN_OPTUNA or RUN == RUN_ALL:
  best_run = trainer.hyperparameter_search(
      hp_space=hp_space_optuna,
      direction="maximize",
      backend="optuna",
      n_trials=3
  )
  print("------------OPTUNA------------")
  print(best_run)

[I 2022-11-29 17:31:39,523] A new study created in memory with name: no-name-d173dd73-612d-45b8-a0a6-9c5d11025337
Trial: {'learning_rate': 2.9378274943857116e-05, 'num_train_epochs': 5, 'seed': 27, 'per_device_train_batch_size': 64, 'weight_decay': 0.02237594008580847}
loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--distilbert-base-uncased/snapshots/1c4513b2eedbda136f57676a34eea67aba266e5c/config.json
Model config DistilBertConfig {
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_heads": 12,
  "n_layers": 6,
  "pad_token_id": 0,
  "qa_dropout": 0.1,
  "seq_classif_dropout": 0.2,
  "sinusoidal_pos_embds": false,
  "tie_weights_": true,
  "transformers_v

Step,Training Loss,Validation Loss




Training completed. Do not forget to share your model on huggingface.co/models =)


The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: sentence2, sentence1, idx. If sentence2, sentence1, idx are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 408
  Batch size = 8


[I 2022-11-29 17:33:50,810] Trial 0 finished with value: 1.7224171394446104 and parameters: {'learning_rate': 2.9378274943857116e-05, 'num_train_epochs': 5, 'seed': 27, 'per_device_train_batch_size': 64, 'weight_decay': 0.02237594008580847}. Best is trial 0 with value: 1.7224171394446104.
Trial: {'learning_rate': 6.155893160017104e-06, 'num_train_epochs': 2, 'seed': 12, 'per_device_train_batch_size': 8, 'weight_decay': 0.039499246466172394}
loading configuration file config.json from cache at /root/.cache/huggingface/hub/models--distilbert-base-uncased/snapshots/1c4513b2eedbda136f57676a34eea67aba266e5c/config.json
Model config DistilBertConfig {
  "_name_or_path": "distilbert-base-uncased",
  "activation": "gelu",
  "architectures": [
    "DistilBertForMaskedLM"
  ],
  "attention_dropout": 0.1,
  "dim": 768,
  "dropout": 0.1,
  "hidden_dim": 3072,
  "initializer_range": 0.02,
  "max_position_embeddings": 512,
  "model_type": "distilbert",
  "n_

| Step | Training Loss | Validation Loss | Accuracy | F1 |
|------|---------------|-----------------|----------|----------|
| 500 | 0.5777 | 0.496551 | 0.718137 | 0.823349 |


The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: sentence2, sentence1, idx. If sentence2, sentence1, idx are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 408
  Batch size = 8
Saving model checkpoint to test/run-1/checkpoint-500
Configuration saved in test/run-1/checkpoint-500/config.json
Model weights saved in test/run-1/checkpoint-500/pytorch_model.bin
tokenizer config file saved in test/run-1/checkpoint-500/tokenizer_config.json
Special tokens file saved in test/run-1/checkpoint-500/special_tokens_map.json


Training completed. Do not forget to share your model on huggingface.co/models =)


[I 2022-11-29 17:35:03,788] Trial 1 finished with value: 1.541485949218397 and parameters: {'learning_rate': 6.155893160017104e-06, 'num_train_epochs': 2, 'seed': 12, 'per_device_train_

| Step | Training Loss | Validation Loss | Accuracy | F1 |
|------|---------------|-----------------|----------|----------|
| 500 | 0.5724 | 0.49667 | 0.708333 | 0.820513 |


The following columns in the evaluation set don't have a corresponding argument in `DistilBertForSequenceClassification.forward` and have been ignored: sentence2, sentence1, idx. If sentence2, sentence1, idx are not expected by `DistilBertForSequenceClassification.forward`,  you can safely ignore this message.
***** Running Evaluation *****
  Num examples = 408
  Batch size = 8
Saving model checkpoint to test/run-2/checkpoint-500
Configuration saved in test/run-2/checkpoint-500/config.json
Model weights saved in test/run-2/checkpoint-500/pytorch_model.bin
tokenizer config file saved in test/run-2/checkpoint-500/tokenizer_config.json
Special tokens file saved in test/run-2/checkpoint-500/special_tokens_map.json


Training completed. Do not forget to share your model on huggingface.co/models =)


[I 2022-11-29 17:36:16,840] Trial 2 finished with value: 1.528846153846154 and parameters: {'learning_rate': 7.2967030975123275e-06, 'num_train_epochs': 2, 'seed': 2, 'per_device_train_

------------
BestRun(run_id='0', objective=1.7224171394446104, hyperparameters={'learning_rate': 2.9378274943857116e-05, 'num_train_epochs': 5, 'seed': 27, 'per_device_train_batch_size': 64, 'weight_decay': 0.02237594008580847})
CPU times: user 4min 55s, sys: 4.23 s, total: 4min 59s
Wall time: 4min 37s


* Optuna results from earlier runs:
* [I 2022-11-28 08:38:32,613] Trial 2 finished with value: 1.7188552188552189 and parameters: {'learning_rate': 3.9534652778264654e-05, 'num_train_epochs': 3, 'seed': 15, 'per_device_train_batch_size': 32, 'weight_decay': 0.0023147682141046017}.
* Best is trial 2 with value: 1.7188552188552189.
* CPU times: user 7min 53s, sys: 10.2 s, total: 8min 3s
* Wall time: 8min 4s
* BestRun(run_id='2', objective=1.7188552188552189, hyperparameters={'learning_rate': 3.9534652778264654e-05, 'num_train_epochs': 3, 'seed': 15, 'per_device_train_batch_size': 32, 'weight_decay': 0.0023147682141046017})
* 11/28: BestRun(run_id='0', objective=1.6687864334923157, hyperparameters={'learning_rate': 6.756272182144086e-06, 'num_train_epochs': 3, 'seed': 21, 'per_device_train_batch_size': 8, 'weight_decay': 0.0017858013659287576})
* 11/29: BestRun(run_id='0', objective=1.7224171394446104, hyperparameters={'learning_rate': 2.9378274943857116e-05, 'num_train_epochs': 5, 'seed': 27, 'per_device_train_batch_size': 64, 'weight_decay': 0.02237594008580847})
* 11/29: CPU times: user 4min 55s, sys: 4.23 s, total: 4min 59s
* Wall time: 4min 37s



In [12]:
print(best_run)

BestRun(run_id='0', objective=1.7224171394446104, hyperparameters={'learning_rate': 2.9378274943857116e-05, 'num_train_epochs': 5, 'seed': 27, 'per_device_train_batch_size': 64, 'weight_decay': 0.02237594008580847})
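
The objective values in the Optuna log match the sum of the logged accuracy and F1. For trial 1, 0.718137 + 0.823349 ≈ 1.541486. That is consistent with the evaluation metrics being summed when no custom `compute_objective` is passed to `hyperparameter_search`. The sketch below reproduces the arithmetic under that assumption. The function `summed_objective` and the metric key names `eval_accuracy` and `eval_f1` are hypothetical, chosen for illustration, and are not taken from this notebook.

```python
from typing import Dict

def summed_objective(metrics: Dict[str, float]) -> float:
    """Sum the evaluation metrics, skipping loss and bookkeeping entries."""
    skipped = ("eval_loss", "epoch")
    speed_suffixes = ("_runtime", "_per_second")
    kept = {
        name: value
        for name, value in metrics.items()
        if name not in skipped and not name.endswith(speed_suffixes)
    }
    # Fall back to the loss when no other metric is available.
    return sum(kept.values()) if kept else metrics["eval_loss"]

# Values copied from the step-500 evaluation row of trial 1 (key names assumed).
trial_1 = {"eval_loss": 0.496551, "eval_accuracy": 0.718137, "eval_f1": 0.823349}
print(round(summed_objective(trial_1), 6))
```

Since a larger accuracy + F1 sum is better, `direction="maximize"` is the appropriate setting for this objective. If only the loss were reported, the direction would need to be `"minimize"` instead.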
