## Population Based Training for Hedge Classifier with Transformers

### Set Up

In [1]:
%%bash
pwd

/home/ec2-user/SageMaker/Crypto-Uncertainty-Index/sagemaker


In [2]:
%cd ..

/home/ec2-user/SageMaker/Crypto-Uncertainty-Index


In [3]:
!pip install "ray[tune]" emoji arrow transformers tokenizers datasets scikit-learn rich numpy pandas wandb



In [4]:
from nlp.hedge_classifier.huggingface.pbt_transformer import (
    train_pbt_hf_clf,
    WANDB_DEFAULT_ARGS,
)

### Training Config

In [5]:
model_name = "vinai/bertweet-base"
train_data_dir = "nlp/hedge_classifier/data/wiki_weasel_clean"
model_save_dir = "nlp/hedge_classifier/models"
num_labels = 2
text_col = "text"
wandb_args = WANDB_DEFAULT_ARGS
train_data_file_type = "csv"
sample_data_size = 1000
# One GPU per trial (the Ray status output reports accelerator_type:V100, not A100)
num_cpus_per_trial = 8
num_gpus_per_trial = 1
smoke_test = False
ray_address = None
ray_num_trials = 8
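
How many of the `ray_num_trials` trials can run at once follows from the per-trial resource request and the node totals. The sketch below is a minimal illustration, assuming the node exposes 8 CPUs and 1 GPU as the Ray status output in this notebook reports. `max_concurrent_trials` is a hypothetical helper, not part of Ray or of `pbt_transformer`.

```python
def max_concurrent_trials(total_cpus, total_gpus, cpus_per_trial, gpus_per_trial):
    """Upper bound on trials that fit on the node at once (scarcest resource wins)."""
    limits = []
    if cpus_per_trial > 0:
        limits.append(total_cpus // cpus_per_trial)
    if gpus_per_trial > 0:
        limits.append(total_gpus // gpus_per_trial)
    if not limits:
        raise ValueError("each trial must request at least one CPU or GPU")
    return int(min(limits))


# Node totals are an assumption read off the Ray status output (8 CPUs, 1 GPU).
concurrent = max_concurrent_trials(
    total_cpus=8, total_gpus=1, cpus_per_trial=8, gpus_per_trial=1
)
print(concurrent)
```

With one trial occupying the whole node, the remaining trials wait in `PENDING`, which matches the `7 PENDING, 1 RUNNING` lines in the status output.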

In [6]:
%%bash
# Read the key from the environment; never paste an API key into a notebook cell.
wandb login "$WANDB_API_KEY"

wandb: Appending key for api.wandb.ai to your netrc file: /home/ec2-user/.netrc


### Run PBT Hyperparameter Search
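
Population Based Training keeps a population of trials and periodically lets weak members copy a strong member (exploit) before perturbing the copied hyperparameters (explore). The block below is a minimal pure-Python sketch of one such step on hyperparameter dictionaries only. It is not the implementation used by `train_pbt_hf_clf` or by Ray Tune: the helper name `exploit_and_explore`, the 25% quantile, the 0.8/1.2 perturbation factors, and the scores in the example population are all illustrative assumptions, not results from this run.

```python
import copy
import random


def exploit_and_explore(population, quantile=0.25, factors=(0.8, 1.2), rng=None):
    """One PBT step over a list of {'score': float, 'hparams': dict} members.

    Bottom-quantile members copy the hyperparameters of a random top-quantile
    member (exploit), then scale every float value by a random factor (explore).
    Full PBT also copies the donor's model weights; that part is omitted here.
    """
    rng = rng or random.Random()
    ranked = sorted(population, key=lambda m: m["score"], reverse=True)
    cutoff = max(1, int(len(ranked) * quantile))
    top, bottom = ranked[:cutoff], ranked[-cutoff:]
    top_ids = {id(m) for m in top}
    bottom_ids = {id(m) for m in bottom}

    new_population = []
    for member in population:
        if id(member) in bottom_ids and id(member) not in top_ids:
            donor = rng.choice(top)
            hparams = copy.deepcopy(donor["hparams"])
            for name, value in hparams.items():
                # Only float-valued hyperparameters are perturbed; ints are copied as-is.
                if isinstance(value, float):
                    hparams[name] = value * rng.choice(factors)
            # The clone inherits the donor's score until it is evaluated again.
            new_population.append({"score": donor["score"], "hparams": hparams})
        else:
            new_population.append(copy.deepcopy(member))
    return new_population


# Made-up scores and hyperparameters, for illustration only.
population = [
    {"score": 0.71, "hparams": {"learning_rate": 5e-5, "weight_decay": 0.29}},
    {"score": 0.64, "hparams": {"learning_rate": 2e-5, "weight_decay": 0.05}},
    {"score": 0.58, "hparams": {"learning_rate": 1e-5, "weight_decay": 0.10}},
    {"score": 0.52, "hparams": {"learning_rate": 3e-5, "weight_decay": 0.20}},
]
updated = exploit_and_explore(population, rng=random.Random(0))
for member in updated:
    print(member)
```

In this example only the lowest-scoring member is replaced; the other three are carried over unchanged.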

In [None]:
best_result = train_pbt_hf_clf(
    model_name=model_name,
    train_data_dir=train_data_dir,
    model_save_dir=model_save_dir,
    num_labels=num_labels,
    text_col=text_col,
    wandb_args=wandb_args,
    train_data_file_type=train_data_file_type,
    sample_data_size=sample_data_size,
    num_cpus_per_trial=num_cpus_per_trial,
    num_gpus_per_trial=num_gpus_per_trial,
    smoke_test=smoke_test,
    ray_address=ray_address,
    ray_num_trials=ray_num_trials,
)


Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.



Some weights of the model checkpoint at vinai/bertweet-base were not used when initializing RobertaForSequenceClassification: ['lm_head.decoder.weight', 'lm_head.bias', 'lm_head.dense.weight', 'roberta.pooler.dense.bias', 'lm_head.dense.bias', 'lm_head.layer_norm.bias', 'lm_head.layer_norm.weight', 'roberta.pooler.dense.weight', 'lm_head.decoder.bias']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at vinai/bertweet-base and are newly initialized: 

  "Consider boosting PBT performance by enabling `reuse_actors` as "


== Status ==
Current time: 2022-03-08 18:22:19 (running for 00:00:00.17)
Memory usage on this node: 5.1/60.0 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 0/8 CPUs, 0/1 GPUs, 0.0/35.43 GiB heap, 0.0/17.71 GiB objects (0.0/1.0 accelerator_type:V100)
Result logdir: /home/ec2-user/SageMaker/Crypto-Uncertainty-Index/nlp/hedge_classifier/hyper_tuning/ray_results/tune_hf_pbt_2022-03-08T18:22:18
Number of trials: 8/8 (8 PENDING)
+------------------------+----------+-------+----------+-----------+-------+----------------+--------------+
| Trial name             | status   | loc   |   warmup |   w_decay |    lr |   train_bs/gpu |   num_epochs |
|------------------------+----------+-------+----------+-----------+-------+----------------+--------------|
| _objective_b0de0_00000 | PENDING  |       |      500 | 0.285214  | 5e-05 |             32 |            5 |
| _objective_b0de0_00001 | PENDING  |       |       50 | 0.0468056 | 5e-05 |             32 |            2 |

[2m[36m(_objective pid=7142)[0m E0308 18:22:25.743790775    7594 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
[2m[36m(_objective pid=7142)[0m E0308 18:22:25.795871515    7594 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
[2m[36m(_objective pid=7142)[0m E0308 18:22:25.958970289    7594 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
[2m[36m(_objective pid=7142)[0m Some weights of the model checkpoint at vinai/bertweet-base were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'lm_head.decoder.weight', 'lm_head.dense.bias', 'lm_head.layer_norm.weight', 'lm_head.layer_norm.bias', 'lm_head.bias', 'roberta.pooler.dense.weight', 'lm_head.decoder.bias', 'lm_head.dense.weight']
[2m[36m(_objective pid=7142)[0m - This IS expected if you are initializing RobertaForSequen

== Status ==
Current time: 2022-03-08 18:22:30 (running for 00:00:11.18)
Memory usage on this node: 7.8/60.0 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 8.0/8 CPUs, 1.0/1 GPUs, 0.0/35.43 GiB heap, 0.0/17.71 GiB objects (0.0/1.0 accelerator_type:V100)
Result logdir: /home/ec2-user/SageMaker/Crypto-Uncertainty-Index/nlp/hedge_classifier/hyper_tuning/ray_results/tune_hf_pbt_2022-03-08T18:22:18
Number of trials: 8/8 (7 PENDING, 1 RUNNING)
+------------------------+----------+--------------------+----------+-----------+-------+----------------+--------------+
| Trial name             | status   | loc                |   warmup |   w_decay |    lr |   train_bs/gpu |   num_epochs |
|------------------------+----------+--------------------+----------+-----------+-------+----------------+--------------|
| _objective_b0de0_00000 | RUNNING  | 172.16.56.108:7142 |      500 | 0.285214  | 5e-05 |             32 |            5 |
| _objective_b0de0_00001 | PENDING  |    

[2m[36m(_objective pid=7142)[0m wandb: Tracking run with wandb version 0.12.11
[2m[36m(_objective pid=7142)[0m wandb: Run data is saved locally in /home/ec2-user/SageMaker/Crypto-Uncertainty-Index/nlp/hedge_classifier/hyper_tuning/ray_results/tune_hf_pbt_2022-03-08T18:22:18/_objective_b0de0_00000_0_learning_rate=5e-05,num_train_epochs=5,warmup_steps=500,weight_decay=0.28521_2022-03-08_18-22-19/wandb/run-20220308_182228-23ncxeeg
[2m[36m(_objective pid=7142)[0m wandb: Run `wandb offline` to turn off syncing.
[2m[36m(_objective pid=7142)[0m wandb: Syncing run generous-universe-29
[2m[36m(_objective pid=7142)[0m wandb: ⭐️ View project at https://wandb.ai/chrisliew/huggingface
[2m[36m(_objective pid=7142)[0m wandb: 🚀 View run at https://wandb.ai/chrisliew/huggingface/runs/23ncxeeg


[2m[36m(_objective pid=7142)[0m {'loss': 0.6258, 'learning_rate': 1e-05, 'epoch': 3.12}


100%|██████████| 160/160 [00:43<00:00,  3.69it/s]


[2m[36m(_objective pid=7142)[0m {'train_runtime': 49.9678, 'train_samples_per_second': 100.065, 'train_steps_per_second': 3.202, 'train_loss': 0.5769810795783996, 'epoch': 5.0}


100%|██████████| 32/32 [00:02<00:00, 13.66it/s]





== Status ==
Current time: 2022-03-08 18:23:26 (running for 00:01:06.37)
Memory usage on this node: 5.2/60.0 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 0/8 CPUs, 0/1 GPUs, 0.0/35.43 GiB heap, 0.0/17.71 GiB objects (0.0/1.0 accelerator_type:V100)
Result logdir: /home/ec2-user/SageMaker/Crypto-Uncertainty-Index/nlp/hedge_classifier/hyper_tuning/ray_results/tune_hf_pbt_2022-03-08T18:22:18
Number of trials: 8/8 (7 PENDING, 1 TERMINATED)
+------------------------+------------+--------------------+----------+-----------+-------+----------------+--------------+------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   warmup |   w_decay |    lr |   train_bs/gpu |   num_epochs |   eval_acc |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+----------+-----------+-------+----------------+--------------+------------+-------------+---------+--

[2m[36m(_objective pid=7142)[0m   len(cache))


== Status ==
Current time: 2022-03-08 18:23:32 (running for 00:01:12.25)
Memory usage on this node: 7.0/60.0 GiB
PopulationBasedTraining: 0 checkpoints, 0 perturbs
Resources requested: 8.0/8 CPUs, 1.0/1 GPUs, 0.0/35.43 GiB heap, 0.0/17.71 GiB objects (0.0/1.0 accelerator_type:V100)
Result logdir: /home/ec2-user/SageMaker/Crypto-Uncertainty-Index/nlp/hedge_classifier/hyper_tuning/ray_results/tune_hf_pbt_2022-03-08T18:22:18
Number of trials: 8/8 (6 PENDING, 1 RUNNING, 1 TERMINATED)
+------------------------+------------+--------------------+----------+-----------+-------+----------------+--------------+------------+-------------+---------+----------------------+
| Trial name             | status     | loc                |   warmup |   w_decay |    lr |   train_bs/gpu |   num_epochs |   eval_acc |   eval_loss |   epoch |   training_iteration |
|------------------------+------------+--------------------+----------+-----------+-------+----------------+--------------+------------+-----------

[2m[36m(_objective pid=7143)[0m E0308 18:23:32.789227916    8336 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
[2m[36m(_objective pid=7143)[0m E0308 18:23:32.842908851    8336 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
[2m[36m(_objective pid=7143)[0m E0308 18:23:33.007065207    8336 fork_posix.cc:70]           Fork support is only compatible with the epoll1 and poll polling strategies
[2m[36m(_objective pid=7143)[0m Some weights of the model checkpoint at vinai/bertweet-base were not used when initializing RobertaForSequenceClassification: ['lm_head.dense.weight', 'roberta.pooler.dense.bias', 'roberta.pooler.dense.weight', 'lm_head.layer_norm.bias', 'lm_head.bias', 'lm_head.dense.bias', 'lm_head.decoder.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.weight']
[2m[36m(_objective pid=7143)[0m - This IS expected if you are initializing RobertaForSequen


[2m[36m(_objective pid=7143)[0m wandb: Tracking run with wandb version 0.12.11
[2m[36m(_objective pid=7143)[0m wandb: Run data is saved locally in /home/ec2-user/SageMaker/Crypto-Uncertainty-Index/nlp/hedge_classifier/hyper_tuning/ray_results/tune_hf_pbt_2022-03-08T18:22:18/_objective_b0de0_00001_1_learning_rate=5e-05,num_train_epochs=2,warmup_steps=50,weight_decay=0.046806_2022-03-08_18-22-19/wandb/run-20220308_182335-1upfm4bk
[2m[36m(_objective pid=7143)[0m wandb: Run `wandb offline` to turn off syncing.
[2m[36m(_objective pid=7143)[0m wandb: Syncing run different-voice-30
[2m[36m(_objective pid=7143)[0m wandb: ⭐️ View project at https://wandb.ai/chrisliew/huggingface
[2m[36m(_objective pid=7143)[0m wandb: 🚀 View run at https://wandb.ai/chrisliew/huggingface/runs/1upfm4bk


[2m[36m(_objective pid=7143)[0m signal only works in main thread


