Cannot reproduce the results #28

Closed
dengfenglai321 opened this issue Dec 8, 2022 · 13 comments
Labels
solved in expectation: The issue has been solved in expectation and only needs validation by the issue author.

Comments

@dengfenglai321

Hello,
I am training on the MUGE dataset, and the validation metrics do not change during training, as shown below:
2022-12-08,21:22:28 | INFO | Rank 1 | Validation Result (epoch 2 @ 4650 steps) | Valid Loss: 1.668444 | Image2Text Acc: 32.41 | Text2Image Acc: 32.94 | logit_scale: 4.595 | Valid Batch Size: 48
2022-12-08,21:22:28 | INFO | Rank 0 | Validation Result (epoch 2 @ 4650 steps) | Valid Loss: 1.668444 | Image2Text Acc: 32.41 | Text2Image Acc: 32.94 | logit_scale: 4.595 | Valid Batch Size: 48

The accuracy stays at around 32.
How can I reproduce the 60+ accuracy reported in the repository?

@yangapku
Member

yangapku commented Dec 8, 2022

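@yumulinfeng1 Hello, please note the following points:

  1. Validation during training computes Acc within a single batch. This is not the same metric as the Recall@1/5/10 and Mean Recall obtained from global retrieval; it only serves to judge the convergence trend during training. To obtain metrics comparable to the Recall@1/5/10 and Mean Recall we report, please follow the cross-modal retrieval section of our README and run the full pipeline it describes (training → feature extraction → KNN retrieval → Recall computation), that is, finetune and then retrieve against the full image pool of the test set.
  2. Finetuning performance also depends on the hyperparameters. For the hyperparameters that produce the best results, see Appendix A.3 of our technical report, which lists the best hyperparameters for each model scale and each dataset for your reference.
  3. Please make sure the pretrained ckpt is loaded correctly.

It is still not clear to us which metric your "60+" refers to. We suggest first trying to match the zero-shot results, with no finetuning: take the pretrained ckpt of the model scale you want to match and the preprocessed dataset we provide, run feature extraction → KNN retrieval → Recall computation, and check whether the Recall metrics agree with our reported results. This will also familiarize you with the standard image-text retrieval workflow of Chinese-CLIP. On that basis, run a finetune and see whether the results improve.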
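
To illustrate why the in-batch Acc printed during training is not comparable to Recall@K over the full candidate pool, here is a minimal toy sketch. It is not the repository's evaluation code: the feature vectors are random, every text is assumed to match exactly one image, and all function names (`in_batch_accuracy`, `full_pool_recall`, `make_toy_features`) are hypothetical. It only shows that ranking a match against one batch of candidates is an easier task than ranking it against every candidate.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def rank_of_match(query, candidates, match_index):
    """Count the candidates that score strictly higher than the true match."""
    true_score = dot(query, candidates[match_index])
    return sum(1 for c in candidates if dot(query, c) > true_score)


def in_batch_accuracy(text_feats, image_feats, batch_size):
    """Top-1 accuracy where each query is ranked only against its own batch."""
    hits = 0
    for start in range(0, len(text_feats), batch_size):
        batch_images = image_feats[start:start + batch_size]
        for offset, query in enumerate(text_feats[start:start + batch_size]):
            hits += rank_of_match(query, batch_images, offset) == 0
    return hits / len(text_feats)


def full_pool_recall(text_feats, image_feats, k):
    """Recall@k where each query is ranked against the whole candidate pool."""
    hits = sum(rank_of_match(q, image_feats, i) < k for i, q in enumerate(text_feats))
    return hits / len(text_feats)


def make_toy_features(num_pairs, dim, noise, seed=0):
    """Random paired features: each image vector is a noisy copy of its text vector."""
    rng = random.Random(seed)
    texts, images = [], []
    for _ in range(num_pairs):
        base = [rng.gauss(0, 1) for _ in range(dim)]
        texts.append(normalize(base))
        images.append(normalize([b + rng.gauss(0, noise) for b in base]))
    return texts, images


texts, images = make_toy_features(num_pairs=192, dim=8, noise=1.0)
batch_acc = in_batch_accuracy(texts, images, batch_size=48)
recall_at_1 = full_pool_recall(texts, images, k=1)
recall_at_10 = full_pool_recall(texts, images, k=10)
print(f"in-batch acc: {batch_acc:.3f} | R@1: {recall_at_1:.3f} | R@10: {recall_at_10:.3f}")
```

Because the full pool contains every batch, a query that is ranked first against the whole pool is also ranked first within its batch, so the in-batch accuracy can never be lower than full-pool Recall@1 in this sketch. The two numbers measure different tasks and should not be compared directly.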

@dengfenglai321
Author

We suggest first trying to match the zero-shot results, with no fine

A very detailed answer, thank you.

@dengfenglai321
Author

Hello, may I ask about the validation log example given in your repository:
2022-06-16,11:06:00 | INFO | Rank 0 | Validation Result (epoch 1 @ 150 steps) | Valid Loss: 0.503617 | Image2Text Acc: 84.76 | Text2Image Acc: 84.37 | logit_scale: 4.605 | Valid Batch Size: 128

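Which dataset's validation set is it from?
If it is MUGE, the gap from my validation results is far too large.
On my side, the accuracy on the MUGE validation set has never changed and stays at around 32. Is something wrong?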

@yangapku
Member

yangapku commented Dec 9, 2022

Hello, this should be a log from MUGE. Could you post the actual script you ran and your log so that we can diagnose the issue?

@dengfenglai321
Author

dengfenglai321 commented Dec 9, 2022

The script is as follows:

# Number of GPUs per GPU worker
GPUS_PER_NODE=2 
# Number of GPU workers, for single-worker training, please set to 1
WORKER_CNT=1
# The ip address of the rank-0 worker, for single-worker training, please set to localhost
export MASTER_ADDR=10.5.55.18
# The port for communication
export MASTER_PORT=8514
# The rank of this worker, should be in {0, ..., WORKER_CNT-1}, for single-worker training, please set to 0
export RANK=0 

export DATAPATH=/xxx/Text_Based_Image_Retrieval/data
export OUTLOGPATH=/xxx/Text_Based_Image_Retrieval/Chinese-CLIP-master/experiments


# data options
train_data=${DATAPATH}/datasets/MUGE/lmdb/train
val_data=${DATAPATH}/datasets/MUGE/lmdb/valid # if val_data is not specified, the validation will be automatically disabled

# restore options
resume=${OUTLOGPATH}/pretrained_weights/clip_cn_vit-b-16.pt # or specify your custom ckpt path to resume
reset_data_offset="--reset-data-offset"
reset_optimizer="--reset-optimizer"
# reset_optimizer=""

# output options
output_base_dir=${OUTLOGPATH}/
name=muge_finetune_vit-b-16_roberta-base_bs32
save_step_frequency=999999 # disable it
save_epoch_frequency=1
log_interval=1
report_training_batch_acc="--report-training-batch-acc"
# report_training_batch_acc=""

# training hyper-params
context_length=52
warmup=100
batch_size=48
valid_batch_size=48
lr=2e-5
wd=0.001
max_epochs=16
valid_step_interval=150
valid_epoch_interval=1
vision_model=ViT-B-16
text_model=RoBERTa-wwm-ext-base-chinese
use_augment="--use-augment"
# use_augment=""

python3 -m torch.distributed.launch --nproc_per_node=${GPUS_PER_NODE} --nnodes=${WORKER_CNT} --node_rank=${RANK} \
          --master_addr=${MASTER_ADDR} --master_port=${MASTER_PORT} cn_clip/training/main.py \
          --train-data=${train_data} \
          --val-data=${val_data} \
          --resume=${resume} \
          ${reset_data_offset} \
          ${reset_optimizer} \
          --logs=${output_base_dir} \
          --name=${name} \
          --save-step-frequency=${save_step_frequency} \
          --save-epoch-frequency=${save_epoch_frequency} \
          --log-interval=${log_interval} \
          ${report_training_batch_acc} \
          --context-length=${context_length} \
          --warmup=${warmup} \
          --batch-size=${batch_size} \
          --valid-batch-size=${valid_batch_size} \
          --valid-step-interval=${valid_step_interval} \
          --valid-epoch-interval=${valid_epoch_interval} \
          --lr=${lr} \
          --wd=${wd} \
          --max-epochs=${max_epochs} \
          --vision-model=${vision_model} \
          ${use_augment} \
          --text-model=${text_model}
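
As a sanity check on this configuration, the following sketch recomputes the global batch size and step counts from the script values and the pair count reported in the logs posted in this thread. The ceiling division (padding the last partial batch) is an assumption about how the training code counts steps, chosen because it reproduces the logged values; the variable names are illustrative.

```python
import math

# Values copied from the script above and from the logs in this thread.
gpus_per_node = 2
worker_cnt = 1
per_gpu_batch_size = 48
max_epochs = 16
train_pairs = 250314  # "train LMDB file contains 129380 images and 250314 pairs."

# Each GPU processes one batch per step, so the global batch is the sum over GPUs.
global_batch_size = gpus_per_node * worker_cnt * per_gpu_batch_size

# Assumption: the final partial batch is padded, hence ceiling division.
steps_per_epoch = math.ceil(train_pairs / global_batch_size)
samples_per_epoch = steps_per_epoch * global_batch_size
max_steps = steps_per_epoch * max_epochs

print(f"global batch size: {global_batch_size}")
print(f"steps per epoch:   {steps_per_epoch}")
print(f"samples per epoch: {samples_per_epoch}")
print(f"max steps:         {max_steps}")
```

These figures agree with `Global Batch Size: 96`, the `250368` sample denominator, and `max_steps: 41728` in the logs. Note that the run name ends in `bs32` while `batch_size` is set to 48.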


The log is as follows:

2022-12-09,11:31:47 | INFO | Rank 0 | Global Steps: 25500/41728 | Train Epoch: 10 [194688/250368 (78%)] | Loss: 0.191398 | Image2Text Acc: 92.71 | Text2Image Acc: 94.79 | Data Time: 0.060s | Batch Time: 0.835s | LR: 0.000007 | logit_scale: 4.584 | Global Batch Size: 96
2022-12-09,11:31:47 | INFO | Rank 0 | Begin to eval on validation set (epoch 10 @ 25500 steps)...
2022-12-09,11:32:56 | INFO | Rank 1 | Evaluated 100/319 batches...
2022-12-09,11:32:58 | INFO | Rank 0 | Evaluated 100/319 batches...
2022-12-09,11:34:04 | INFO | Rank 1 | Evaluated 200/319 batches...
2022-12-09,11:34:10 | INFO | Rank 0 | Evaluated 200/319 batches...
2022-12-09,11:35:13 | INFO | Rank 1 | Evaluated 300/319 batches...
2022-12-09,11:35:21 | INFO | Rank 0 | Evaluated 300/319 batches...
2022-12-09,11:35:35 | INFO | Rank 1 | Validation Result (epoch 10 @ 25500 steps) | Valid Loss: 2.025491 | Image2Text Acc: 32.39 | Text2Image Acc: 33.03 | logit_scale: 4.584 | Valid Batch Size: 48
2022-12-09,11:35:35 | INFO | Rank 0 | Validation Result (epoch 10 @ 25500 steps) | Valid Loss: 2.025491 | Image2Text Acc: 32.39 | Text2Image Acc: 33.03 | logit_scale: 4.584 | Valid Batch Size: 48
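
To check whether the validation metrics really stay flat over a run, the `Validation Result` lines can be pulled out of a log and compared. The sketch below is a hypothetical helper, not part of Chinese-CLIP; the regular expression is an assumption based only on the log lines shown in this thread, and the sample lines are copied from it.

```python
import re

# Pattern inferred from the "Validation Result" lines posted in this thread.
VALIDATION_PATTERN = re.compile(
    r"Rank (?P<rank>\d+) \| Validation Result "
    r"\(epoch (?P<epoch>\d+) @ (?P<step>\d+) steps\)"
    r" \| Valid Loss: (?P<loss>[\d.]+)"
    r" \| Image2Text Acc: (?P<i2t>[\d.]+)"
    r" \| Text2Image Acc: (?P<t2i>[\d.]+)"
)


def parse_validation_lines(lines, rank=0):
    """Return one dict per validation result logged by the given rank."""
    results = []
    for line in lines:
        match = VALIDATION_PATTERN.search(line)
        if match and int(match.group("rank")) == rank:
            results.append({
                "epoch": int(match.group("epoch")),
                "step": int(match.group("step")),
                "loss": float(match.group("loss")),
                "i2t": float(match.group("i2t")),
                "t2i": float(match.group("t2i")),
            })
    return results


sample_lines = [
    "2022-12-08,21:22:28 | INFO | Rank 0 | Validation Result (epoch 2 @ 4650 steps) | Valid Loss: 1.668444 | Image2Text Acc: 32.41 | Text2Image Acc: 32.94 | logit_scale: 4.595 | Valid Batch Size: 48",
    "2022-12-09,11:35:35 | INFO | Rank 1 | Validation Result (epoch 10 @ 25500 steps) | Valid Loss: 2.025491 | Image2Text Acc: 32.39 | Text2Image Acc: 33.03 | logit_scale: 4.584 | Valid Batch Size: 48",
    "2022-12-09,11:35:35 | INFO | Rank 0 | Validation Result (epoch 10 @ 25500 steps) | Valid Loss: 2.025491 | Image2Text Acc: 32.39 | Text2Image Acc: 33.03 | logit_scale: 4.584 | Valid Batch Size: 48",
]

results = parse_validation_lines(sample_lines, rank=0)
for r in results:
    print(f"step {r['step']:>6}: loss {r['loss']:.4f} | I2T {r['i2t']:.2f} | T2I {r['t2i']:.2f}")
```

On a real run, the same function could be fed the lines of the training log file (for example `open(log_path)`), where `log_path` is whatever the `log_path` entry in the parameter dump points to.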

@yangapku
Member

yangapku commented Dec 9, 2022

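Could you provide a link to the complete training log file? Our preliminary judgment is that the pretrained ckpt parameters were not loaded correctly. We suggest you also check whether the ckpt is located at the path specified by --resume.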

@yangapku
Member

yangapku commented Dec 9, 2022

You can also check whether a message like "=> no checkpoint found at" is printed in the log to confirm this.
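
A check of this kind can be automated with a few lines of Python. The sketch below is a hypothetical helper rather than a tool shipped with the repository. It relies only on the two message prefixes that appear in this thread, "=> no checkpoint found at" and "=> loaded checkpoint", and the sample lines are copied from the log posted here.

```python
def checkpoint_status(log_lines):
    """Classify a training log as 'missing', 'loaded', or 'unknown'.

    'missing' takes priority: a single "no checkpoint found" message means
    the run started without the pretrained weights.
    """
    status = "unknown"
    for line in log_lines:
        if "=> no checkpoint found at" in line:
            return "missing"
        if "=> loaded checkpoint" in line:
            status = "loaded"
    return status


sample_log = [
    "2022-12-08,18:15:40 | INFO | Rank 0 | => begin to load checkpoint '/storage1/xxx/Text_Based_Image_Retrieval/Chinese-CLIP-master/experiments/pretrained_weights/clip_cn_vit-b-16.pt'",
    "2022-12-08,18:15:41 | INFO | Rank 0 | => loaded checkpoint '/storage1/xxx/Text_Based_Image_Retrieval/Chinese-CLIP-master/experiments/pretrained_weights/clip_cn_vit-b-16.pt' (epoch 15 @ 0 steps)",
]

print(checkpoint_status(sample_log))
```

To scan a log on disk, pass the open file object, since iterating over a file yields its lines. The status only says whether the load message was printed; it does not verify that every parameter in the checkpoint matched the model.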

@dengfenglai321
Author

You can also check whether a message like "=> no checkpoint found at" is printed in the log to confirm this.

The weights seem to have been loaded successfully:

2022-12-08,18:15:40 | INFO | Rank 0 | train LMDB file contains 129380 images and 250314 pairs.
2022-12-08,18:15:40 | INFO | Rank 0 | val LMDB file contains 29806 images and 30588 pairs.
2022-12-08,18:15:40 | INFO | Rank 0 | Params:
2022-12-08,18:15:40 | INFO | Rank 0 |   aggregate: True
2022-12-08,18:15:40 | INFO | Rank 0 |   batch_size: 48
2022-12-08,18:15:40 | INFO | Rank 0 |   bert_weight_path: None
2022-12-08,18:15:40 | INFO | Rank 0 |   beta1: 0.9
2022-12-08,18:15:40 | INFO | Rank 0 |   beta2: 0.98
2022-12-08,18:15:40 | INFO | Rank 0 |   checkpoint_path: /storage1/xxx/Text_Based_Image_Retrieval/Chinese-CLIP-master/experiments/muge_finetune_vit-b-16_roberta-base_bs32/checkpoints
2022-12-08,18:15:40 | INFO | Rank 0 |   clip_weight_path: None
2022-12-08,18:15:40 | INFO | Rank 0 |   context_length: 52
2022-12-08,18:15:40 | INFO | Rank 0 |   debug: False
2022-12-08,18:15:40 | INFO | Rank 0 |   device: cuda:0
2022-12-08,18:15:40 | INFO | Rank 0 |   eps: 1e-06
2022-12-08,18:15:40 | INFO | Rank 0 |   freeze_vision: False
2022-12-08,18:15:40 | INFO | Rank 0 |   grad_checkpointing: False
2022-12-08,18:15:40 | INFO | Rank 0 |   local_device_rank: 0
2022-12-08,18:15:40 | INFO | Rank 0 |   local_rank: 0
2022-12-08,18:15:40 | INFO | Rank 0 |   log_interval: 1
2022-12-08,18:15:40 | INFO | Rank 0 |   log_level: 20
2022-12-08,18:15:40 | INFO | Rank 0 |   log_path: /storage1/xxx/Text_Based_Image_Retrieval/Chinese-CLIP-master/experiments/muge_finetune_vit-b-16_roberta-base_bs32/out_2022-12-08-10-15-34.log
2022-12-08,18:15:40 | INFO | Rank 0 |   logs: /storage1/xxx/Text_Based_Image_Retrieval/Chinese-CLIP-master/experiments/
2022-12-08,18:15:40 | INFO | Rank 0 |   lr: 2e-05
2022-12-08,18:15:40 | INFO | Rank 0 |   max_epochs: 16
2022-12-08,18:15:40 | INFO | Rank 0 |   max_steps: 41728
2022-12-08,18:15:40 | INFO | Rank 0 |   name: muge_finetune_vit-b-16_roberta-base_bs32
2022-12-08,18:15:40 | INFO | Rank 0 |   num_workers: 4
2022-12-08,18:15:40 | INFO | Rank 0 |   precision: amp
2022-12-08,18:15:40 | INFO | Rank 0 |   rank: 0
2022-12-08,18:15:40 | INFO | Rank 0 |   report_training_batch_acc: True
2022-12-08,18:15:40 | INFO | Rank 0 |   reset_data_offset: True
2022-12-08,18:15:40 | INFO | Rank 0 |   reset_optimizer: True
2022-12-08,18:15:40 | INFO | Rank 0 |   resume: /storage1/xxx/Text_Based_Image_Retrieval/Chinese-CLIP-master/experiments/pretrained_weights/clip_cn_vit-b-16.pt
2022-12-08,18:15:40 | INFO | Rank 0 |   save_epoch_frequency: 1
2022-12-08,18:15:40 | INFO | Rank 0 |   save_step_frequency: 999999
2022-12-08,18:15:40 | INFO | Rank 0 |   seed: 123
2022-12-08,18:15:40 | INFO | Rank 0 |   skip_aggregate: False
2022-12-08,18:15:40 | INFO | Rank 0 |   skip_scheduler: False
2022-12-08,18:15:40 | INFO | Rank 0 |   text_model: RoBERTa-wwm-ext-base-chinese
2022-12-08,18:15:40 | INFO | Rank 0 |   train_data: /storage1/xxx/Text_Based_Image_Retrieval/data/datasets/MUGE/lmdb/train
2022-12-08,18:15:40 | INFO | Rank 0 |   use_augment: True
2022-12-08,18:15:40 | INFO | Rank 0 |   use_bn_sync: False
2022-12-08,18:15:40 | INFO | Rank 0 |   val_data: /storage1/xxx/Text_Based_Image_Retrieval/data/datasets/MUGE/lmdb/valid
2022-12-08,18:15:40 | INFO | Rank 0 |   valid_batch_size: 48
2022-12-08,18:15:40 | INFO | Rank 0 |   valid_epoch_interval: 1
2022-12-08,18:15:40 | INFO | Rank 0 |   valid_step_interval: 150
2022-12-08,18:15:40 | INFO | Rank 0 |   vision_model: ViT-B-16
2022-12-08,18:15:40 | INFO | Rank 0 |   warmup: 100
2022-12-08,18:15:40 | INFO | Rank 0 |   wd: 0.001
2022-12-08,18:15:40 | INFO | Rank 0 |   world_size: 2
2022-12-08,18:15:40 | INFO | Rank 0 | Use GPU: 0 for training
2022-12-08,18:15:40 | INFO | Rank 0 | => begin to load checkpoint '/storage1/xxx/Text_Based_Image_Retrieval/Chinese-CLIP-master/experiments/pretrained_weights/clip_cn_vit-b-16.pt'
2022-12-08,18:15:41 | INFO | Rank 0 | => loaded checkpoint '/storage1/xxx/Text_Based_Image_Retrieval/Chinese-CLIP-master/experiments/pretrained_weights/clip_cn_vit-b-16.pt' (epoch 15 @ 0 steps)
2022-12-08,18:15:43 | INFO | Rank 0 | Global Steps: 1/41728 | Train Epoch: 1 [96/250368 (0%)] | Loss: 0.689587 | Image2Text Acc: 82.29 | Text2Image Acc: 81.25 | Data Time: 0.308s | Batch Time: 1.291s | LR: 0.000000 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:43 | INFO | Rank 0 | Reducer buckets have been rebuilt in this iteration.
2022-12-08,18:15:43 | INFO | Rank 0 | Global Steps: 2/41728 | Train Epoch: 1 [192/250368 (0%)] | Loss: 1.083234 | Image2Text Acc: 71.88 | Text2Image Acc: 76.04 | Data Time: 0.014s | Batch Time: 0.739s | LR: 0.000000 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:44 | INFO | Rank 0 | Global Steps: 3/41728 | Train Epoch: 1 [288/250368 (0%)] | Loss: 0.726163 | Image2Text Acc: 78.12 | Text2Image Acc: 78.12 | Data Time: 0.014s | Batch Time: 0.811s | LR: 0.000001 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:45 | INFO | Rank 0 | Global Steps: 4/41728 | Train Epoch: 1 [384/250368 (0%)] | Loss: 0.866006 | Image2Text Acc: 73.96 | Text2Image Acc: 77.08 | Data Time: 0.057s | Batch Time: 0.824s | LR: 0.000001 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:46 | INFO | Rank 0 | Global Steps: 5/41728 | Train Epoch: 1 [480/250368 (0%)] | Loss: 0.849096 | Image2Text Acc: 79.17 | Text2Image Acc: 82.29 | Data Time: 0.062s | Batch Time: 0.836s | LR: 0.000001 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:47 | INFO | Rank 0 | Global Steps: 6/41728 | Train Epoch: 1 [576/250368 (0%)] | Loss: 1.290957 | Image2Text Acc: 62.50 | Text2Image Acc: 66.67 | Data Time: 0.057s | Batch Time: 0.789s | LR: 0.000001 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:47 | INFO | Rank 0 | Global Steps: 7/41728 | Train Epoch: 1 [672/250368 (0%)] | Loss: 0.592875 | Image2Text Acc: 82.29 | Text2Image Acc: 83.33 | Data Time: 0.014s | Batch Time: 0.805s | LR: 0.000001 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:48 | INFO | Rank 0 | Global Steps: 8/41728 | Train Epoch: 1 [768/250368 (0%)] | Loss: 1.060924 | Image2Text Acc: 75.00 | Text2Image Acc: 76.04 | Data Time: 0.057s | Batch Time: 0.830s | LR: 0.000002 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:49 | INFO | Rank 0 | Global Steps: 9/41728 | Train Epoch: 1 [864/250368 (0%)] | Loss: 0.883194 | Image2Text Acc: 78.12 | Text2Image Acc: 78.12 | Data Time: 0.059s | Batch Time: 0.837s | LR: 0.000002 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:50 | INFO | Rank 0 | Global Steps: 10/41728 | Train Epoch: 1 [960/250368 (0%)] | Loss: 0.976302 | Image2Text Acc: 75.00 | Text2Image Acc: 76.04 | Data Time: 0.058s | Batch Time: 0.831s | LR: 0.000002 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:51 | INFO | Rank 0 | Global Steps: 11/41728 | Train Epoch: 1 [1056/250368 (0%)] | Loss: 1.009400 | Image2Text Acc: 71.88 | Text2Image Acc: 75.00 | Data Time: 0.059s | Batch Time: 0.835s | LR: 0.000002 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:52 | INFO | Rank 0 | Global Steps: 12/41728 | Train Epoch: 1 [1152/250368 (0%)] | Loss: 0.968916 | Image2Text Acc: 78.12 | Text2Image Acc: 77.08 | Data Time: 0.059s | Batch Time: 0.830s | LR: 0.000002 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:52 | INFO | Rank 0 | Global Steps: 13/41728 | Train Epoch: 1 [1248/250368 (0%)] | Loss: 0.952544 | Image2Text Acc: 77.08 | Text2Image Acc: 78.12 | Data Time: 0.058s | Batch Time: 0.836s | LR: 0.000003 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:53 | INFO | Rank 0 | Global Steps: 14/41728 | Train Epoch: 1 [1344/250368 (1%)] | Loss: 1.175281 | Image2Text Acc: 75.00 | Text2Image Acc: 75.00 | Data Time: 0.059s | Batch Time: 0.829s | LR: 0.000003 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:54 | INFO | Rank 0 | Global Steps: 15/41728 | Train Epoch: 1 [1440/250368 (1%)] | Loss: 0.807069 | Image2Text Acc: 75.00 | Text2Image Acc: 77.08 | Data Time: 0.059s | Batch Time: 0.832s | LR: 0.000003 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:55 | INFO | Rank 0 | Global Steps: 16/41728 | Train Epoch: 1 [1536/250368 (1%)] | Loss: 1.177227 | Image2Text Acc: 69.79 | Text2Image Acc: 73.96 | Data Time: 0.058s | Batch Time: 0.833s | LR: 0.000003 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:56 | INFO | Rank 0 | Global Steps: 17/41728 | Train Epoch: 1 [1632/250368 (1%)] | Loss: 0.982769 | Image2Text Acc: 68.75 | Text2Image Acc: 76.04 | Data Time: 0.059s | Batch Time: 0.831s | LR: 0.000003 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:57 | INFO | Rank 0 | Global Steps: 18/41728 | Train Epoch: 1 [1728/250368 (1%)] | Loss: 1.051534 | Image2Text Acc: 68.75 | Text2Image Acc: 70.83 | Data Time: 0.059s | Batch Time: 0.832s | LR: 0.000004 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:57 | INFO | Rank 0 | Global Steps: 19/41728 | Train Epoch: 1 [1824/250368 (1%)] | Loss: 0.981971 | Image2Text Acc: 77.08 | Text2Image Acc: 71.88 | Data Time: 0.058s | Batch Time: 0.833s | LR: 0.000004 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:58 | INFO | Rank 0 | Global Steps: 20/41728 | Train Epoch: 1 [1920/250368 (1%)] | Loss: 0.906785 | Image2Text Acc: 72.92 | Text2Image Acc: 78.12 | Data Time: 0.059s | Batch Time: 0.830s | LR: 0.000004 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:15:59 | INFO | Rank 0 | Global Steps: 21/41728 | Train Epoch: 1 [2016/250368 (1%)] | Loss: 0.401411 | Image2Text Acc: 92.71 | Text2Image Acc: 85.42 | Data Time: 0.058s | Batch Time: 0.833s | LR: 0.000004 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:00 | INFO | Rank 0 | Global Steps: 22/41728 | Train Epoch: 1 [2112/250368 (1%)] | Loss: 0.765108 | Image2Text Acc: 77.08 | Text2Image Acc: 78.12 | Data Time: 0.058s | Batch Time: 0.839s | LR: 0.000004 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:01 | INFO | Rank 0 | Global Steps: 23/41728 | Train Epoch: 1 [2208/250368 (1%)] | Loss: 0.725298 | Image2Text Acc: 81.25 | Text2Image Acc: 79.17 | Data Time: 0.059s | Batch Time: 0.839s | LR: 0.000005 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:02 | INFO | Rank 0 | Global Steps: 24/41728 | Train Epoch: 1 [2304/250368 (1%)] | Loss: 1.053250 | Image2Text Acc: 71.88 | Text2Image Acc: 71.88 | Data Time: 0.059s | Batch Time: 0.833s | LR: 0.000005 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:02 | INFO | Rank 0 | Global Steps: 25/41728 | Train Epoch: 1 [2400/250368 (1%)] | Loss: 0.892181 | Image2Text Acc: 71.88 | Text2Image Acc: 80.21 | Data Time: 0.059s | Batch Time: 0.833s | LR: 0.000005 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:03 | INFO | Rank 0 | Global Steps: 26/41728 | Train Epoch: 1 [2496/250368 (1%)] | Loss: 0.863705 | Image2Text Acc: 72.92 | Text2Image Acc: 82.29 | Data Time: 0.058s | Batch Time: 0.834s | LR: 0.000005 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:04 | INFO | Rank 0 | Global Steps: 27/41728 | Train Epoch: 1 [2592/250368 (1%)] | Loss: 1.011039 | Image2Text Acc: 75.00 | Text2Image Acc: 73.96 | Data Time: 0.059s | Batch Time: 0.833s | LR: 0.000005 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:05 | INFO | Rank 0 | Global Steps: 28/41728 | Train Epoch: 1 [2688/250368 (1%)] | Loss: 0.954087 | Image2Text Acc: 73.96 | Text2Image Acc: 78.12 | Data Time: 0.058s | Batch Time: 0.834s | LR: 0.000006 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:06 | INFO | Rank 0 | Global Steps: 29/41728 | Train Epoch: 1 [2784/250368 (1%)] | Loss: 0.826310 | Image2Text Acc: 75.00 | Text2Image Acc: 71.88 | Data Time: 0.058s | Batch Time: 0.833s | LR: 0.000006 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:07 | INFO | Rank 0 | Global Steps: 30/41728 | Train Epoch: 1 [2880/250368 (1%)] | Loss: 0.941817 | Image2Text Acc: 77.08 | Text2Image Acc: 73.96 | Data Time: 0.058s | Batch Time: 0.837s | LR: 0.000006 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:07 | INFO | Rank 0 | Global Steps: 31/41728 | Train Epoch: 1 [2976/250368 (1%)] | Loss: 0.758239 | Image2Text Acc: 76.04 | Text2Image Acc: 75.00 | Data Time: 0.058s | Batch Time: 0.841s | LR: 0.000006 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:08 | INFO | Rank 0 | Global Steps: 32/41728 | Train Epoch: 1 [3072/250368 (1%)] | Loss: 0.736244 | Image2Text Acc: 76.04 | Text2Image Acc: 71.88 | Data Time: 0.058s | Batch Time: 0.833s | LR: 0.000006 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:09 | INFO | Rank 0 | Global Steps: 33/41728 | Train Epoch: 1 [3168/250368 (1%)] | Loss: 0.652642 | Image2Text Acc: 78.12 | Text2Image Acc: 81.25 | Data Time: 0.058s | Batch Time: 0.839s | LR: 0.000007 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:10 | INFO | Rank 0 | Global Steps: 34/41728 | Train Epoch: 1 [3264/250368 (1%)] | Loss: 0.621304 | Image2Text Acc: 82.29 | Text2Image Acc: 83.33 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000007 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:11 | INFO | Rank 0 | Global Steps: 35/41728 | Train Epoch: 1 [3360/250368 (1%)] | Loss: 0.860351 | Image2Text Acc: 78.12 | Text2Image Acc: 79.17 | Data Time: 0.059s | Batch Time: 0.841s | LR: 0.000007 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:12 | INFO | Rank 0 | Global Steps: 36/41728 | Train Epoch: 1 [3456/250368 (1%)] | Loss: 0.715295 | Image2Text Acc: 81.25 | Text2Image Acc: 78.12 | Data Time: 0.058s | Batch Time: 0.836s | LR: 0.000007 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:12 | INFO | Rank 0 | Global Steps: 37/41728 | Train Epoch: 1 [3552/250368 (1%)] | Loss: 0.674997 | Image2Text Acc: 80.21 | Text2Image Acc: 84.38 | Data Time: 0.058s | Batch Time: 0.836s | LR: 0.000007 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:13 | INFO | Rank 0 | Global Steps: 38/41728 | Train Epoch: 1 [3648/250368 (1%)] | Loss: 0.958120 | Image2Text Acc: 72.92 | Text2Image Acc: 73.96 | Data Time: 0.058s | Batch Time: 0.836s | LR: 0.000008 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:14 | INFO | Rank 0 | Global Steps: 39/41728 | Train Epoch: 1 [3744/250368 (1%)] | Loss: 1.020398 | Image2Text Acc: 71.88 | Text2Image Acc: 75.00 | Data Time: 0.058s | Batch Time: 0.841s | LR: 0.000008 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:15 | INFO | Rank 0 | Global Steps: 40/41728 | Train Epoch: 1 [3840/250368 (2%)] | Loss: 0.860142 | Image2Text Acc: 71.88 | Text2Image Acc: 69.79 | Data Time: 0.058s | Batch Time: 0.832s | LR: 0.000008 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:16 | INFO | Rank 0 | Global Steps: 41/41728 | Train Epoch: 1 [3936/250368 (2%)] | Loss: 0.591362 | Image2Text Acc: 82.29 | Text2Image Acc: 84.38 | Data Time: 0.058s | Batch Time: 0.836s | LR: 0.000008 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:17 | INFO | Rank 0 | Global Steps: 42/41728 | Train Epoch: 1 [4032/250368 (2%)] | Loss: 0.677259 | Image2Text Acc: 76.04 | Text2Image Acc: 80.21 | Data Time: 0.058s | Batch Time: 0.836s | LR: 0.000008 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:17 | INFO | Rank 0 | Global Steps: 43/41728 | Train Epoch: 1 [4128/250368 (2%)] | Loss: 0.857192 | Image2Text Acc: 76.04 | Text2Image Acc: 75.00 | Data Time: 0.058s | Batch Time: 0.837s | LR: 0.000009 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:18 | INFO | Rank 0 | Global Steps: 44/41728 | Train Epoch: 1 [4224/250368 (2%)] | Loss: 0.734577 | Image2Text Acc: 80.21 | Text2Image Acc: 84.38 | Data Time: 0.058s | Batch Time: 0.842s | LR: 0.000009 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:19 | INFO | Rank 0 | Global Steps: 45/41728 | Train Epoch: 1 [4320/250368 (2%)] | Loss: 0.731063 | Image2Text Acc: 80.21 | Text2Image Acc: 78.12 | Data Time: 0.058s | Batch Time: 0.835s | LR: 0.000009 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:20 | INFO | Rank 0 | Global Steps: 46/41728 | Train Epoch: 1 [4416/250368 (2%)] | Loss: 0.641702 | Image2Text Acc: 81.25 | Text2Image Acc: 83.33 | Data Time: 0.058s | Batch Time: 0.839s | LR: 0.000009 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:21 | INFO | Rank 0 | Global Steps: 47/41728 | Train Epoch: 1 [4512/250368 (2%)] | Loss: 0.874175 | Image2Text Acc: 73.96 | Text2Image Acc: 76.04 | Data Time: 0.058s | Batch Time: 0.840s | LR: 0.000009 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:22 | INFO | Rank 0 | Global Steps: 48/41728 | Train Epoch: 1 [4608/250368 (2%)] | Loss: 0.808500 | Image2Text Acc: 79.17 | Text2Image Acc: 81.25 | Data Time: 0.058s | Batch Time: 0.838s | LR: 0.000010 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:22 | INFO | Rank 0 | Global Steps: 49/41728 | Train Epoch: 1 [4704/250368 (2%)] | Loss: 0.682884 | Image2Text Acc: 78.12 | Text2Image Acc: 80.21 | Data Time: 0.059s | Batch Time: 0.841s | LR: 0.000010 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:23 | INFO | Rank 0 | Global Steps: 50/41728 | Train Epoch: 1 [4800/250368 (2%)] | Loss: 0.991166 | Image2Text Acc: 65.62 | Text2Image Acc: 68.75 | Data Time: 0.058s | Batch Time: 0.834s | LR: 0.000010 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:24 | INFO | Rank 0 | Global Steps: 51/41728 | Train Epoch: 1 [4896/250368 (2%)] | Loss: 0.666178 | Image2Text Acc: 80.21 | Text2Image Acc: 81.25 | Data Time: 0.058s | Batch Time: 0.841s | LR: 0.000010 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:25 | INFO | Rank 0 | Global Steps: 52/41728 | Train Epoch: 1 [4992/250368 (2%)] | Loss: 0.741099 | Image2Text Acc: 80.21 | Text2Image Acc: 81.25 | Data Time: 0.059s | Batch Time: 0.838s | LR: 0.000010 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:26 | INFO | Rank 0 | Global Steps: 53/41728 | Train Epoch: 1 [5088/250368 (2%)] | Loss: 0.617694 | Image2Text Acc: 79.17 | Text2Image Acc: 83.33 | Data Time: 0.059s | Batch Time: 0.837s | LR: 0.000011 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:27 | INFO | Rank 0 | Global Steps: 54/41728 | Train Epoch: 1 [5184/250368 (2%)] | Loss: 0.492508 | Image2Text Acc: 85.42 | Text2Image Acc: 86.46 | Data Time: 0.058s | Batch Time: 0.841s | LR: 0.000011 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:28 | INFO | Rank 0 | Global Steps: 55/41728 | Train Epoch: 1 [5280/250368 (2%)] | Loss: 0.768075 | Image2Text Acc: 82.29 | Text2Image Acc: 79.17 | Data Time: 0.058s | Batch Time: 0.841s | LR: 0.000011 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:28 | INFO | Rank 0 | Global Steps: 56/41728 | Train Epoch: 1 [5376/250368 (2%)] | Loss: 0.469336 | Image2Text Acc: 86.46 | Text2Image Acc: 86.46 | Data Time: 0.058s | Batch Time: 0.834s | LR: 0.000011 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:29 | INFO | Rank 0 | Global Steps: 57/41728 | Train Epoch: 1 [5472/250368 (2%)] | Loss: 0.766508 | Image2Text Acc: 79.17 | Text2Image Acc: 76.04 | Data Time: 0.059s | Batch Time: 0.835s | LR: 0.000011 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:30 | INFO | Rank 0 | Global Steps: 58/41728 | Train Epoch: 1 [5568/250368 (2%)] | Loss: 0.836963 | Image2Text Acc: 70.83 | Text2Image Acc: 82.29 | Data Time: 0.058s | Batch Time: 0.844s | LR: 0.000012 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:31 | INFO | Rank 0 | Global Steps: 59/41728 | Train Epoch: 1 [5664/250368 (2%)] | Loss: 0.662819 | Image2Text Acc: 82.29 | Text2Image Acc: 76.04 | Data Time: 0.058s | Batch Time: 0.832s | LR: 0.000012 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:32 | INFO | Rank 0 | Global Steps: 60/41728 | Train Epoch: 1 [5760/250368 (2%)] | Loss: 0.831568 | Image2Text Acc: 77.08 | Text2Image Acc: 77.08 | Data Time: 0.058s | Batch Time: 0.842s | LR: 0.000012 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:33 | INFO | Rank 0 | Global Steps: 61/41728 | Train Epoch: 1 [5856/250368 (2%)] | Loss: 0.987221 | Image2Text Acc: 76.04 | Text2Image Acc: 77.08 | Data Time: 0.059s | Batch Time: 0.838s | LR: 0.000012 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:33 | INFO | Rank 0 | Global Steps: 62/41728 | Train Epoch: 1 [5952/250368 (2%)] | Loss: 0.849950 | Image2Text Acc: 75.00 | Text2Image Acc: 76.04 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000012 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:34 | INFO | Rank 0 | Global Steps: 63/41728 | Train Epoch: 1 [6048/250368 (2%)] | Loss: 0.703894 | Image2Text Acc: 79.17 | Text2Image Acc: 82.29 | Data Time: 0.058s | Batch Time: 0.835s | LR: 0.000013 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:35 | INFO | Rank 0 | Global Steps: 64/41728 | Train Epoch: 1 [6144/250368 (2%)] | Loss: 0.706505 | Image2Text Acc: 81.25 | Text2Image Acc: 80.21 | Data Time: 0.058s | Batch Time: 0.840s | LR: 0.000013 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:36 | INFO | Rank 0 | Global Steps: 65/41728 | Train Epoch: 1 [6240/250368 (2%)] | Loss: 0.719888 | Image2Text Acc: 82.29 | Text2Image Acc: 81.25 | Data Time: 0.059s | Batch Time: 0.837s | LR: 0.000013 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:37 | INFO | Rank 0 | Global Steps: 66/41728 | Train Epoch: 1 [6336/250368 (3%)] | Loss: 0.777751 | Image2Text Acc: 75.00 | Text2Image Acc: 76.04 | Data Time: 0.059s | Batch Time: 0.836s | LR: 0.000013 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:38 | INFO | Rank 0 | Global Steps: 67/41728 | Train Epoch: 1 [6432/250368 (3%)] | Loss: 0.822757 | Image2Text Acc: 75.00 | Text2Image Acc: 75.00 | Data Time: 0.058s | Batch Time: 0.839s | LR: 0.000013 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:38 | INFO | Rank 0 | Global Steps: 68/41728 | Train Epoch: 1 [6528/250368 (3%)] | Loss: 0.637543 | Image2Text Acc: 79.17 | Text2Image Acc: 81.25 | Data Time: 0.059s | Batch Time: 0.842s | LR: 0.000014 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:39 | INFO | Rank 0 | Global Steps: 69/41728 | Train Epoch: 1 [6624/250368 (3%)] | Loss: 0.503515 | Image2Text Acc: 83.33 | Text2Image Acc: 83.33 | Data Time: 0.059s | Batch Time: 0.837s | LR: 0.000014 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:40 | INFO | Rank 0 | Global Steps: 70/41728 | Train Epoch: 1 [6720/250368 (3%)] | Loss: 0.861151 | Image2Text Acc: 76.04 | Text2Image Acc: 70.83 | Data Time: 0.059s | Batch Time: 0.831s | LR: 0.000014 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:41 | INFO | Rank 0 | Global Steps: 71/41728 | Train Epoch: 1 [6816/250368 (3%)] | Loss: 0.842519 | Image2Text Acc: 75.00 | Text2Image Acc: 75.00 | Data Time: 0.058s | Batch Time: 0.838s | LR: 0.000014 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:42 | INFO | Rank 0 | Global Steps: 72/41728 | Train Epoch: 1 [6912/250368 (3%)] | Loss: 0.647028 | Image2Text Acc: 80.21 | Text2Image Acc: 80.21 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000014 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:43 | INFO | Rank 0 | Global Steps: 73/41728 | Train Epoch: 1 [7008/250368 (3%)] | Loss: 0.533600 | Image2Text Acc: 79.17 | Text2Image Acc: 82.29 | Data Time: 0.059s | Batch Time: 0.835s | LR: 0.000015 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:43 | INFO | Rank 0 | Global Steps: 74/41728 | Train Epoch: 1 [7104/250368 (3%)] | Loss: 0.602674 | Image2Text Acc: 83.33 | Text2Image Acc: 81.25 | Data Time: 0.062s | Batch Time: 0.841s | LR: 0.000015 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:44 | INFO | Rank 0 | Global Steps: 75/41728 | Train Epoch: 1 [7200/250368 (3%)] | Loss: 0.748078 | Image2Text Acc: 77.08 | Text2Image Acc: 83.33 | Data Time: 0.058s | Batch Time: 0.841s | LR: 0.000015 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:45 | INFO | Rank 0 | Global Steps: 76/41728 | Train Epoch: 1 [7296/250368 (3%)] | Loss: 1.266804 | Image2Text Acc: 75.00 | Text2Image Acc: 73.96 | Data Time: 0.059s | Batch Time: 0.838s | LR: 0.000015 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:46 | INFO | Rank 0 | Global Steps: 77/41728 | Train Epoch: 1 [7392/250368 (3%)] | Loss: 0.701536 | Image2Text Acc: 79.17 | Text2Image Acc: 80.21 | Data Time: 0.059s | Batch Time: 0.838s | LR: 0.000015 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:47 | INFO | Rank 0 | Global Steps: 78/41728 | Train Epoch: 1 [7488/250368 (3%)] | Loss: 0.640760 | Image2Text Acc: 80.21 | Text2Image Acc: 81.25 | Data Time: 0.059s | Batch Time: 0.841s | LR: 0.000016 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:48 | INFO | Rank 0 | Global Steps: 79/41728 | Train Epoch: 1 [7584/250368 (3%)] | Loss: 0.555545 | Image2Text Acc: 82.29 | Text2Image Acc: 89.58 | Data Time: 0.059s | Batch Time: 0.836s | LR: 0.000016 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:48 | INFO | Rank 0 | Global Steps: 80/41728 | Train Epoch: 1 [7680/250368 (3%)] | Loss: 0.600125 | Image2Text Acc: 84.38 | Text2Image Acc: 83.33 | Data Time: 0.059s | Batch Time: 0.836s | LR: 0.000016 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:49 | INFO | Rank 0 | Global Steps: 81/41728 | Train Epoch: 1 [7776/250368 (3%)] | Loss: 0.811505 | Image2Text Acc: 76.04 | Text2Image Acc: 78.12 | Data Time: 0.059s | Batch Time: 0.844s | LR: 0.000016 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:50 | INFO | Rank 0 | Global Steps: 82/41728 | Train Epoch: 1 [7872/250368 (3%)] | Loss: 0.555656 | Image2Text Acc: 79.17 | Text2Image Acc: 86.46 | Data Time: 0.059s | Batch Time: 0.833s | LR: 0.000016 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:51 | INFO | Rank 0 | Global Steps: 83/41728 | Train Epoch: 1 [7968/250368 (3%)] | Loss: 0.585370 | Image2Text Acc: 84.38 | Text2Image Acc: 81.25 | Data Time: 0.058s | Batch Time: 0.838s | LR: 0.000017 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:52 | INFO | Rank 0 | Global Steps: 84/41728 | Train Epoch: 1 [8064/250368 (3%)] | Loss: 0.584991 | Image2Text Acc: 86.46 | Text2Image Acc: 82.29 | Data Time: 0.059s | Batch Time: 0.844s | LR: 0.000017 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:53 | INFO | Rank 0 | Global Steps: 85/41728 | Train Epoch: 1 [8160/250368 (3%)] | Loss: 0.515017 | Image2Text Acc: 86.46 | Text2Image Acc: 86.46 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000017 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:53 | INFO | Rank 0 | Global Steps: 86/41728 | Train Epoch: 1 [8256/250368 (3%)] | Loss: 0.865349 | Image2Text Acc: 71.88 | Text2Image Acc: 78.12 | Data Time: 0.059s | Batch Time: 0.839s | LR: 0.000017 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:54 | INFO | Rank 0 | Global Steps: 87/41728 | Train Epoch: 1 [8352/250368 (3%)] | Loss: 0.640673 | Image2Text Acc: 79.17 | Text2Image Acc: 79.17 | Data Time: 0.059s | Batch Time: 0.847s | LR: 0.000017 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:55 | INFO | Rank 0 | Global Steps: 88/41728 | Train Epoch: 1 [8448/250368 (3%)] | Loss: 0.806827 | Image2Text Acc: 71.88 | Text2Image Acc: 80.21 | Data Time: 0.059s | Batch Time: 0.838s | LR: 0.000018 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:56 | INFO | Rank 0 | Global Steps: 89/41728 | Train Epoch: 1 [8544/250368 (3%)] | Loss: 0.757677 | Image2Text Acc: 75.00 | Text2Image Acc: 76.04 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000018 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:57 | INFO | Rank 0 | Global Steps: 90/41728 | Train Epoch: 1 [8640/250368 (3%)] | Loss: 0.518550 | Image2Text Acc: 84.38 | Text2Image Acc: 86.46 | Data Time: 0.059s | Batch Time: 0.833s | LR: 0.000018 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:58 | INFO | Rank 0 | Global Steps: 91/41728 | Train Epoch: 1 [8736/250368 (3%)] | Loss: 0.763695 | Image2Text Acc: 77.08 | Text2Image Acc: 79.17 | Data Time: 0.058s | Batch Time: 0.838s | LR: 0.000018 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:59 | INFO | Rank 0 | Global Steps: 92/41728 | Train Epoch: 1 [8832/250368 (4%)] | Loss: 0.794845 | Image2Text Acc: 73.96 | Text2Image Acc: 76.04 | Data Time: 0.058s | Batch Time: 0.838s | LR: 0.000018 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:16:59 | INFO | Rank 0 | Global Steps: 93/41728 | Train Epoch: 1 [8928/250368 (4%)] | Loss: 0.611818 | Image2Text Acc: 76.04 | Text2Image Acc: 79.17 | Data Time: 0.058s | Batch Time: 0.836s | LR: 0.000019 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:00 | INFO | Rank 0 | Global Steps: 94/41728 | Train Epoch: 1 [9024/250368 (4%)] | Loss: 0.778732 | Image2Text Acc: 79.17 | Text2Image Acc: 78.12 | Data Time: 0.059s | Batch Time: 0.839s | LR: 0.000019 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:01 | INFO | Rank 0 | Global Steps: 95/41728 | Train Epoch: 1 [9120/250368 (4%)] | Loss: 1.208666 | Image2Text Acc: 69.79 | Text2Image Acc: 68.75 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000019 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:02 | INFO | Rank 0 | Global Steps: 96/41728 | Train Epoch: 1 [9216/250368 (4%)] | Loss: 0.822677 | Image2Text Acc: 75.00 | Text2Image Acc: 80.21 | Data Time: 0.058s | Batch Time: 0.834s | LR: 0.000019 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:03 | INFO | Rank 0 | Global Steps: 97/41728 | Train Epoch: 1 [9312/250368 (4%)] | Loss: 1.062310 | Image2Text Acc: 72.92 | Text2Image Acc: 68.75 | Data Time: 0.059s | Batch Time: 0.837s | LR: 0.000019 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:04 | INFO | Rank 0 | Global Steps: 98/41728 | Train Epoch: 1 [9408/250368 (4%)] | Loss: 0.967333 | Image2Text Acc: 76.04 | Text2Image Acc: 78.12 | Data Time: 0.059s | Batch Time: 0.843s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:04 | INFO | Rank 0 | Global Steps: 99/41728 | Train Epoch: 1 [9504/250368 (4%)] | Loss: 0.582876 | Image2Text Acc: 82.29 | Text2Image Acc: 81.25 | Data Time: 0.058s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:05 | INFO | Rank 0 | Global Steps: 100/41728 | Train Epoch: 1 [9600/250368 (4%)] | Loss: 0.706923 | Image2Text Acc: 77.08 | Text2Image Acc: 81.25 | Data Time: 0.059s | Batch Time: 0.831s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:06 | INFO | Rank 0 | Global Steps: 101/41728 | Train Epoch: 1 [9696/250368 (4%)] | Loss: 0.872719 | Image2Text Acc: 76.04 | Text2Image Acc: 78.12 | Data Time: 0.059s | Batch Time: 0.839s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:07 | INFO | Rank 0 | Global Steps: 102/41728 | Train Epoch: 1 [9792/250368 (4%)] | Loss: 0.717685 | Image2Text Acc: 73.96 | Text2Image Acc: 76.04 | Data Time: 0.059s | Batch Time: 0.839s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:08 | INFO | Rank 0 | Global Steps: 103/41728 | Train Epoch: 1 [9888/250368 (4%)] | Loss: 0.957555 | Image2Text Acc: 73.96 | Text2Image Acc: 71.88 | Data Time: 0.059s | Batch Time: 0.842s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:09 | INFO | Rank 0 | Global Steps: 104/41728 | Train Epoch: 1 [9984/250368 (4%)] | Loss: 0.609131 | Image2Text Acc: 80.21 | Text2Image Acc: 80.21 | Data Time: 0.059s | Batch Time: 0.836s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:09 | INFO | Rank 0 | Global Steps: 105/41728 | Train Epoch: 1 [10080/250368 (4%)] | Loss: 0.697119 | Image2Text Acc: 75.00 | Text2Image Acc: 82.29 | Data Time: 0.059s | Batch Time: 0.841s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:10 | INFO | Rank 0 | Global Steps: 106/41728 | Train Epoch: 1 [10176/250368 (4%)] | Loss: 1.170625 | Image2Text Acc: 70.83 | Text2Image Acc: 66.67 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:11 | INFO | Rank 0 | Global Steps: 107/41728 | Train Epoch: 1 [10272/250368 (4%)] | Loss: 0.866036 | Image2Text Acc: 75.00 | Text2Image Acc: 80.21 | Data Time: 0.059s | Batch Time: 0.838s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:12 | INFO | Rank 0 | Global Steps: 108/41728 | Train Epoch: 1 [10368/250368 (4%)] | Loss: 0.657084 | Image2Text Acc: 81.25 | Text2Image Acc: 81.25 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:13 | INFO | Rank 0 | Global Steps: 109/41728 | Train Epoch: 1 [10464/250368 (4%)] | Loss: 0.803462 | Image2Text Acc: 77.08 | Text2Image Acc: 77.08 | Data Time: 0.058s | Batch Time: 0.842s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:14 | INFO | Rank 0 | Global Steps: 110/41728 | Train Epoch: 1 [10560/250368 (4%)] | Loss: 0.991874 | Image2Text Acc: 77.08 | Text2Image Acc: 72.92 | Data Time: 0.059s | Batch Time: 0.837s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:14 | INFO | Rank 0 | Global Steps: 111/41728 | Train Epoch: 1 [10656/250368 (4%)] | Loss: 0.752720 | Image2Text Acc: 78.12 | Text2Image Acc: 79.17 | Data Time: 0.059s | Batch Time: 0.844s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:15 | INFO | Rank 0 | Global Steps: 112/41728 | Train Epoch: 1 [10752/250368 (4%)] | Loss: 0.643739 | Image2Text Acc: 85.42 | Text2Image Acc: 83.33 | Data Time: 0.059s | Batch Time: 0.832s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:16 | INFO | Rank 0 | Global Steps: 113/41728 | Train Epoch: 1 [10848/250368 (4%)] | Loss: 0.738462 | Image2Text Acc: 82.29 | Text2Image Acc: 76.04 | Data Time: 0.059s | Batch Time: 0.839s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:17 | INFO | Rank 0 | Global Steps: 114/41728 | Train Epoch: 1 [10944/250368 (4%)] | Loss: 0.594122 | Image2Text Acc: 80.21 | Text2Image Acc: 84.38 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:18 | INFO | Rank 0 | Global Steps: 115/41728 | Train Epoch: 1 [11040/250368 (4%)] | Loss: 1.017626 | Image2Text Acc: 73.96 | Text2Image Acc: 76.04 | Data Time: 0.059s | Batch Time: 0.841s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:19 | INFO | Rank 0 | Global Steps: 116/41728 | Train Epoch: 1 [11136/250368 (4%)] | Loss: 0.676917 | Image2Text Acc: 80.21 | Text2Image Acc: 81.25 | Data Time: 0.059s | Batch Time: 0.833s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:19 | INFO | Rank 0 | Global Steps: 117/41728 | Train Epoch: 1 [11232/250368 (4%)] | Loss: 0.659712 | Image2Text Acc: 82.29 | Text2Image Acc: 80.21 | Data Time: 0.059s | Batch Time: 0.836s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:20 | INFO | Rank 0 | Global Steps: 118/41728 | Train Epoch: 1 [11328/250368 (5%)] | Loss: 0.583284 | Image2Text Acc: 81.25 | Text2Image Acc: 82.29 | Data Time: 0.059s | Batch Time: 0.839s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:21 | INFO | Rank 0 | Global Steps: 119/41728 | Train Epoch: 1 [11424/250368 (5%)] | Loss: 0.683597 | Image2Text Acc: 77.08 | Text2Image Acc: 81.25 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:22 | INFO | Rank 0 | Global Steps: 120/41728 | Train Epoch: 1 [11520/250368 (5%)] | Loss: 0.905988 | Image2Text Acc: 72.92 | Text2Image Acc: 73.96 | Data Time: 0.059s | Batch Time: 0.845s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:23 | INFO | Rank 0 | Global Steps: 121/41728 | Train Epoch: 1 [11616/250368 (5%)] | Loss: 0.906386 | Image2Text Acc: 72.92 | Text2Image Acc: 73.96 | Data Time: 0.059s | Batch Time: 0.837s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:24 | INFO | Rank 0 | Global Steps: 122/41728 | Train Epoch: 1 [11712/250368 (5%)] | Loss: 0.835916 | Image2Text Acc: 75.00 | Text2Image Acc: 71.88 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:25 | INFO | Rank 0 | Global Steps: 123/41728 | Train Epoch: 1 [11808/250368 (5%)] | Loss: 0.764448 | Image2Text Acc: 77.08 | Text2Image Acc: 79.17 | Data Time: 0.059s | Batch Time: 0.841s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:25 | INFO | Rank 0 | Global Steps: 124/41728 | Train Epoch: 1 [11904/250368 (5%)] | Loss: 0.849898 | Image2Text Acc: 77.08 | Text2Image Acc: 73.96 | Data Time: 0.059s | Batch Time: 0.837s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:26 | INFO | Rank 0 | Global Steps: 125/41728 | Train Epoch: 1 [12000/250368 (5%)] | Loss: 0.727899 | Image2Text Acc: 79.17 | Text2Image Acc: 77.08 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:27 | INFO | Rank 0 | Global Steps: 126/41728 | Train Epoch: 1 [12096/250368 (5%)] | Loss: 0.800566 | Image2Text Acc: 81.25 | Text2Image Acc: 80.21 | Data Time: 0.060s | Batch Time: 0.843s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:28 | INFO | Rank 0 | Global Steps: 127/41728 | Train Epoch: 1 [12192/250368 (5%)] | Loss: 0.960883 | Image2Text Acc: 72.92 | Text2Image Acc: 73.96 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:29 | INFO | Rank 0 | Global Steps: 128/41728 | Train Epoch: 1 [12288/250368 (5%)] | Loss: 0.790139 | Image2Text Acc: 72.92 | Text2Image Acc: 81.25 | Data Time: 0.061s | Batch Time: 0.843s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:30 | INFO | Rank 0 | Global Steps: 129/41728 | Train Epoch: 1 [12384/250368 (5%)] | Loss: 0.780300 | Image2Text Acc: 82.29 | Text2Image Acc: 79.17 | Data Time: 0.059s | Batch Time: 0.839s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:30 | INFO | Rank 0 | Global Steps: 130/41728 | Train Epoch: 1 [12480/250368 (5%)] | Loss: 0.850396 | Image2Text Acc: 76.04 | Text2Image Acc: 79.17 | Data Time: 0.060s | Batch Time: 0.842s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:31 | INFO | Rank 0 | Global Steps: 131/41728 | Train Epoch: 1 [12576/250368 (5%)] | Loss: 0.649198 | Image2Text Acc: 81.25 | Text2Image Acc: 80.21 | Data Time: 0.059s | Batch Time: 0.835s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:32 | INFO | Rank 0 | Global Steps: 132/41728 | Train Epoch: 1 [12672/250368 (5%)] | Loss: 0.680545 | Image2Text Acc: 77.08 | Text2Image Acc: 83.33 | Data Time: 0.061s | Batch Time: 0.839s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:33 | INFO | Rank 0 | Global Steps: 133/41728 | Train Epoch: 1 [12768/250368 (5%)] | Loss: 0.597127 | Image2Text Acc: 85.42 | Text2Image Acc: 83.33 | Data Time: 0.059s | Batch Time: 0.847s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:34 | INFO | Rank 0 | Global Steps: 134/41728 | Train Epoch: 1 [12864/250368 (5%)] | Loss: 0.929113 | Image2Text Acc: 75.00 | Text2Image Acc: 70.83 | Data Time: 0.060s | Batch Time: 0.836s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:35 | INFO | Rank 0 | Global Steps: 135/41728 | Train Epoch: 1 [12960/250368 (5%)] | Loss: 1.120120 | Image2Text Acc: 68.75 | Text2Image Acc: 70.83 | Data Time: 0.059s | Batch Time: 0.847s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:35 | INFO | Rank 0 | Global Steps: 136/41728 | Train Epoch: 1 [13056/250368 (5%)] | Loss: 1.008211 | Image2Text Acc: 71.88 | Text2Image Acc: 72.92 | Data Time: 0.061s | Batch Time: 0.835s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:36 | INFO | Rank 0 | Global Steps: 137/41728 | Train Epoch: 1 [13152/250368 (5%)] | Loss: 0.680133 | Image2Text Acc: 79.17 | Text2Image Acc: 71.88 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:37 | INFO | Rank 0 | Global Steps: 138/41728 | Train Epoch: 1 [13248/250368 (5%)] | Loss: 0.839577 | Image2Text Acc: 78.12 | Text2Image Acc: 76.04 | Data Time: 0.060s | Batch Time: 0.833s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:38 | INFO | Rank 0 | Global Steps: 139/41728 | Train Epoch: 1 [13344/250368 (5%)] | Loss: 0.828378 | Image2Text Acc: 72.92 | Text2Image Acc: 77.08 | Data Time: 0.059s | Batch Time: 0.841s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:39 | INFO | Rank 0 | Global Steps: 140/41728 | Train Epoch: 1 [13440/250368 (5%)] | Loss: 0.553494 | Image2Text Acc: 85.42 | Text2Image Acc: 83.33 | Data Time: 0.059s | Batch Time: 0.842s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:40 | INFO | Rank 0 | Global Steps: 141/41728 | Train Epoch: 1 [13536/250368 (5%)] | Loss: 0.626136 | Image2Text Acc: 79.17 | Text2Image Acc: 81.25 | Data Time: 0.059s | Batch Time: 0.834s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:40 | INFO | Rank 0 | Global Steps: 142/41728 | Train Epoch: 1 [13632/250368 (5%)] | Loss: 0.748430 | Image2Text Acc: 80.21 | Text2Image Acc: 79.17 | Data Time: 0.060s | Batch Time: 0.837s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:41 | INFO | Rank 0 | Global Steps: 143/41728 | Train Epoch: 1 [13728/250368 (5%)] | Loss: 0.809352 | Image2Text Acc: 73.96 | Text2Image Acc: 80.21 | Data Time: 0.059s | Batch Time: 0.839s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:42 | INFO | Rank 0 | Global Steps: 144/41728 | Train Epoch: 1 [13824/250368 (6%)] | Loss: 0.945053 | Image2Text Acc: 71.88 | Text2Image Acc: 73.96 | Data Time: 0.059s | Batch Time: 0.838s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:43 | INFO | Rank 0 | Global Steps: 145/41728 | Train Epoch: 1 [13920/250368 (6%)] | Loss: 0.878521 | Image2Text Acc: 73.96 | Text2Image Acc: 77.08 | Data Time: 0.059s | Batch Time: 0.833s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:44 | INFO | Rank 0 | Global Steps: 146/41728 | Train Epoch: 1 [14016/250368 (6%)] | Loss: 0.946872 | Image2Text Acc: 69.79 | Text2Image Acc: 70.83 | Data Time: 0.060s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:45 | INFO | Rank 0 | Global Steps: 147/41728 | Train Epoch: 1 [14112/250368 (6%)] | Loss: 0.575010 | Image2Text Acc: 86.46 | Text2Image Acc: 82.29 | Data Time: 0.059s | Batch Time: 0.840s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:46 | INFO | Rank 0 | Global Steps: 148/41728 | Train Epoch: 1 [14208/250368 (6%)] | Loss: 0.860036 | Image2Text Acc: 72.92 | Text2Image Acc: 72.92 | Data Time: 0.059s | Batch Time: 0.838s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:46 | INFO | Rank 0 | Global Steps: 149/41728 | Train Epoch: 1 [14304/250368 (6%)] | Loss: 0.896334 | Image2Text Acc: 78.12 | Text2Image Acc: 78.12 | Data Time: 0.059s | Batch Time: 0.834s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:47 | INFO | Rank 0 | Global Steps: 150/41728 | Train Epoch: 1 [14400/250368 (6%)] | Loss: 0.772457 | Image2Text Acc: 78.12 | Text2Image Acc: 75.00 | Data Time: 0.060s | Batch Time: 0.838s | LR: 0.000020 | logit_scale: 4.605 | Global Batch Size: 96
2022-12-08,18:17:47 | INFO | Rank 0 | Begin to eval on validation set (epoch 1 @ 150 steps)...
2022-12-08,18:18:58 | INFO | Rank 0 | Evaluated 100/319 batches...
2022-12-08,18:20:09 | INFO | Rank 0 | Evaluated 200/319 batches...
2022-12-08,18:21:21 | INFO | Rank 0 | Evaluated 300/319 batches...
2022-12-08,18:21:35 | INFO | Rank 0 | Validation Result (epoch 1 @ 150 steps) | Valid Loss: 1.608249 | Image2Text Acc: 32.55 | Text2Image Acc: 33.00 | logit_scale: 4.605 | Valid Batch Size: 48

@yangapku
Member

yangapku commented Dec 9, 2022

Got it, we'll look into it on our side as well.

@dengfenglai321
Author

dengfenglai321 commented Dec 9, 2022

2022-12-08,18:15:41 | INFO | Rank 0 | => loaded checkpoint '/storage1/xxx/Text_Based_Image_Retrieval/Chinese-CLIP-master/experiments/pretrained_weights/clip_cn_vit-b-16.pt' (epoch 15 @ 0

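OK, thanks. Please do look into it. From the training log, the training loss keeps decreasing and the training-set acc keeps increasing, but the validation loss and acc have barely changed since the start; the acc hovers around 32.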

@yangapku
Member

yangapku commented Dec 11, 2022

@yumulinfeng1 Hello, regarding the problem you raised in this issue, here are our conclusions and suggestions:

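  1. On the validation in-batch acc staying low during finetuning: this is mainly because the MUGE validation set contains one-to-many text-to-image pairs. With the default unshuffled validation set and a small number of GPUs in distributed training, a single validation batch can contain several image-text samples that share the same text. Our val in-batch acc is the simplest possible implementation and treats only the sample's own pair as ground truth, so the computed in-batch acc comes out low in this situation. Model training and convergence are not affected at all, and if you have run the full evaluation pipeline, the resulting Recall metrics should be fine as well. For a more detailed description, see PR change the val dataset sampler from sequential to deterministically shuffled #29. We also made a small change so that the validation-set sampler used during finetuning is shuffled, which avoids this special case. If you pull the latest code and run again, the validation in-batch acc should no longer stay low.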
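  2. On comparing finetuning results: the metric to compare against our reported results should be the Recall obtained at the end of the full image-text retrieval evaluation pipeline (or the accuracy on the same zero-shot image classification dataset), not the validation in-batch acc. The validation in-batch acc clearly depends on valid_batch_size, and your valid_batch_size=48 differs from our script's default valid_batch_size=128, so comparing with the in-batch acc in our sample log is not very meaningful. Put differently, with the latest code your in-batch acc is bound to be higher than ours, because a smaller batch contains fewer negatives. The validation in-batch acc is only meant for judging convergence trends across your own experiments.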
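  3. Your finetuning uses a very small training batch size: 96 in total across 2 GPUs. As described in our Readme, the convergence and stability of contrastive learning depend on the total batch size. If you use a smaller batch size (compared with the script's default of 128 per-GPU * 8 GPU), we recommend a smaller learning rate. Your learning rate is 2e-5; in our tests with this set of hyperparameters, the final Recall after finetuning is even lower than the zero-shot result without finetuning (MUGE Mean Recall (MR) of about 68, versus MR 71.1 for our base-size pretrained model in zero-shot). We suggest trying a smaller learning rate or a larger batch (you can consider the recomputation strategy described in the Readme). We tried your hyperparameters with the learning rate lowered to 2e-6, and the final MR reached 75.5 or above, an improvement over the zero-shot MR of 71.1. For your reference.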
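
To illustrate point 1, here is a minimal toy simulation, not the repository's actual evaluation code. It assumes an idealized model that scores every candidate carrying the same text as the query equally, with ties going to the earliest candidate in the batch, and a hypothetical validation set in which each text is paired with 3 images stored consecutively. The function name `inbatch_acc` and all sizes are made up for illustration.

```python
import random


def inbatch_acc(text_ids, batch_size):
    """In-batch accuracy that counts only a sample's own pair as correct.

    Assumption: an idealized model gives the top score to every candidate
    in the batch that has the same text id as the query, and ties are
    resolved in favour of the earliest candidate.
    """
    correct = 0
    for start in range(0, len(text_ids), batch_size):
        batch = text_ids[start:start + batch_size]
        for i, query in enumerate(batch):
            predicted = batch.index(query)  # earliest top-scoring candidate
            correct += int(predicted == i)
    return correct / len(text_ids)


# Hypothetical data: 1000 texts, each paired with 3 images, in sequential order.
text_ids = [t for t in range(1000) for _ in range(3)]

# Sequential order: samples sharing a text land in the same batch.
sequential_acc = inbatch_acc(text_ids, batch_size=48)

# Deterministic shuffle (fixed seed): duplicates rarely share a batch.
shuffled = list(text_ids)
random.Random(0).shuffle(shuffled)
shuffled_acc = inbatch_acc(shuffled, batch_size=48)

print(f"sequential: {sequential_acc:.3f}, shuffled: {shuffled_acc:.3f}")
```

In this toy setup the sequential order caps the metric at one third even though the model is "perfect", while the shuffled order brings it close to 1. The real MUGE duplicate structure differs, so the numbers here only show the direction of the effect.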

If you have more questions, feel free to keep commenting. If you find the Chinese-CLIP codebase helpful, please give us a star ⭐️ and recommend it to your friends!

@yangapku yangapku added solved in expectation The issue has been solved in expectation and just needs a validation by the author of issue. 已解决待验证 该问题预期已解决,等待issue作者验证中 labels Dec 11, 2022
@dengfenglai321
Author

dengfenglai321 commented Dec 12, 2022

OK, thank you very much!! I'll give it a try.

@6Roy

6Roy commented Apr 27, 2023

May I ask which torch version you are using? I keep running into problems when trying to reproduce this.
