Hi, I am trying to train on my COCO-style dataset with your scripts, but I don't know which bash script should be used for training. (Could you please briefly explain the function of each script?) I used "scripts/train/lambda/coco/train.sh" for training, but an error occurred. The script runs:
cd /data_2/lutianhao/code/MIPNet/
CUDA_VISIBLE_DEVICES=4,5,6,7, python tools/lambda/train_lambda_real.py \
    --cfg experiments/coco/hrnet/w48_384x288_adam_lr1e-3.yaml \
    GPUS '(0,1,2,3,)' \
    OUTPUT_DIR 'Outputs/outputs/lambda/lambda_coco_real_waffle' \
    LOG_DIR 'Outputs/logs/lambda/lambda_coco_real_waffle' \
    TEST.MODEL_FILE 'models/pytorch/pose_coco/pose_hrnet_w48_384x288.pth' \
    DATASET.TRAIN_DATASET 'coco_lambda' \
    DATASET.TRAIN_SET 'train2017' \
    DATASET.TRAIN_IMAGE_DIR '/data_2/lutianhao/datasets/pose/coco2017/train2017' \
    DATASET.TRAIN_ANNOTATION_FILE '/data_2/lutianhao/datasets/pose/coco2017/annotations/person_keypoints_train2017.json' \
    DATASET.TRAIN_DATASET_TYPE 'coco_lambda' \
    DATASET.TEST_DATASET 'coco' \
    DATASET.TEST_SET 'val2017' \
    DATASET.TEST_IMAGE_DIR '/data_2/lutianhao/datasets/pose/coco2017/val2017' \
    DATASET.TEST_ANNOTATION_FILE '/data_2/lutianhao/datasets/pose/coco2017/annotations/person_keypoints_val2017.json' \
    DATASET.TEST_DATASET_TYPE 'coco' \
    TRAIN.LR 0.001 \
    TRAIN.BEGIN_EPOCH 0 \
    TRAIN.END_EPOCH 110 \
    TRAIN.LR_STEP '(70, 100)' \
    TRAIN.BATCH_SIZE_PER_GPU 2 \
    TEST.BATCH_SIZE_PER_GPU 1 \
    TEST.USE_GT_BBOX True \
    EPOCH_EVAL_FREQ 1 \
    PRINT_FREQ 100 \
    MODEL.NAME 'pose_hrnet_se_lambda' \
    MODEL.SE_MODULES '[False, False, True, True]'
And the error is:
GAMMA1: 0.99
GAMMA2: 0.0
LR: 0.001
LR_FACTOR: 0.1
LR_STEP: [70, 100]
MOMENTUM: 0.9
NESTEROV: False
OPTIMIZER: adam
RESUME: False
SHUFFLE: True
WD: 0.0001
WORKERS: 24
=> init weights from normal distribution
=> loading pretrained model models/pytorch/imagenet/hrnet_w48-8ef0771d.pth
Total Parameters: 63,746,081
Total Multiply Adds (For Convolution and Linear Layers only): 46.562052726745605 GFLOPs
Number of Layers
Conv2d : 293 layers BatchNorm2d : 292 layers ReLU : 271 layers Bottleneck : 4 layers BasicBlock : 104 layers Upsample : 28 layers HighResolutionModule : 8 layers AdaptiveAvgPool2d : 5 layers Linear : 20 layers Sigmoid : 10 layers BatchNorm1d : 5 layers SELambdaLayer : 5 layers SELambdaModule : 2 layers
=> loading model from models/pytorch/pose_coco/pose_hrnet_w48_384x288.pth
=> loading from latest_state_dict at models/pytorch/pose_coco/pose_hrnet_w48_384x288.pth
loading annotations into memory...
Done (t=31.87s)
creating index...
index created!
=> classes: ['background', 'person']
=> num_images: 118287
loading from cache from cache/coco_lambda/train2017/gt_db.pkl
done!
=> load 149813 samples
loading annotations into memory...
Done (t=4.04s)
creating index...
index created!
=> classes: ['background', 'person']
=> num_images: 5000
=> load 6352 samples
=> resuming optimizer from models/pytorch/pose_coco/pose_hrnet_w48_384x288.pth
=> updated lr schedule is [70, 100]
training on lambda
Epoch: [0][0/18727] Time 64.338s (64.338s) Speed 0.2 samples/s Data 10.114s (10.114s) Loss 0.00020 (0.00020) Accuracy 0.513 (0.513) model_grad 0.000568 (0.000568) DivLoss -0.00074 (-0.00074) PoseLoss 0.00020 (0.00020)
Traceback (most recent call last):
File "tools/lambda/train_lambda_real.py", line 280, in <module>
main()
File "tools/lambda/train_lambda_real.py", line 242, in main
final_output_dir, tb_log_dir, writer_dict, print_prefix='lambda')
File "/data_2/lutianhao/code/MIPNet/tools/lambda/../../lib/core/train.py", line 464, in train_lambda
suffix += '_[{}:{}]'.format(count, round(lambda_a[count + B].item(), 2))
IndexError: index 16 is out of bounds for dimension 0 with size 16
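The traceback suggests that `count + B` reaches 16 while `lambda_a` has only 16 entries (valid indices 0 to 15). A minimal sketch of that situation and one possible guard follows. It assumes, without having confirmed it in `lib/core/train.py`, that `B` is the total batch size (4 GPUs x 2 per GPU = 8) and that `lambda_a` holds `2 * B` values; the helper `lambda_suffix` and its `max_items` parameter are hypothetical names used only for illustration, and plain Python lists stand in for the tensor.

```python
def lambda_suffix(lambda_a, batch_size, max_items):
    """Build a suffix such as '_[0:0.5]_[1:0.56]' from the entries
    lambda_a[batch_size:], never reading past the end of lambda_a."""
    # Clamp the loop length so that count + batch_size stays a valid index.
    limit = min(max_items, batch_size, len(lambda_a) - batch_size)
    suffix = ''
    for count in range(max(limit, 0)):
        value = round(float(lambda_a[count + batch_size]), 2)
        suffix += '_[{}:{}]'.format(count, value)
    return suffix


# Assumed layout: 8 samples per step, so lambda_a has 16 entries.
B = 8
lambda_a = [i / 16 for i in range(2 * B)]

# Unguarded access with count == 8 fails the same way as the traceback:
try:
    lambda_a[8 + B]
except IndexError as err:
    print('unguarded access failed:', err)

# Guarded version: asks for up to 16 items but stops at the last valid index.
print(lambda_suffix(lambda_a, B, max_items=16))
```

If this guess is right, any loop bound on `count` that is larger than `B` (for example a fixed constant that exceeds the actual batch size) would trigger the error with a small `TRAIN.BATCH_SIZE_PER_GPU`.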