There are some problems when training heatmaps with the ProCPM model. I changed your model backbone and trained a new model without pretrained weights. The training loss log looks normal, but the test_300w NME stays at 166.901.
When I visualize the batch_heatmaps, only the background map is learned; every foreground map is predicted as an all-zero array.
I ran some experiments to locate the problem.
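For example, a per-channel statistic over the predicted heatmaps makes the failure obvious. Below is a minimal sketch of the kind of check I ran; the function and tensor names are hypothetical, not from the repo:

```python
import torch

def channel_stats(batch_heatmaps: torch.Tensor) -> None:
    """Print per-channel statistics of predicted heatmaps.
    batch_heatmaps: (N, C, H, W), with C = 68 landmark maps + 1 background map."""
    flat = batch_heatmaps.detach().flatten(2)            # (N, C, H*W)
    print("per-channel max :", flat.max(dim=2).values.mean(dim=0))
    print("per-channel mean:", flat.mean(dim=2).mean(dim=0))
    # symptom observed here: channels 0..67 stay ~0 everywhere,
    # only the background channel (index 68) carries any signal
```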
Here is a sample of heatmap label data from def generate_label_map(pts, height, width, sigma, downsample, nopoints, ctype).
The count of values > 0 in each map (sigma=4, downsample=8):
[ 7, 6, 8, 7, 7, 7, 7, 7, 7, 8, 7, 8,
7, 7, 7, 7, 8, 8, 7, 8, 7, 7, 8, 8,
8, 7, 8, 7, 7, 7, 7, 8, 7, 7, 7, 7,
7, 7, 7, 8, 7, 7, 7, 7, 7, 8, 6, 8,
7, 7, 8, 7, 8, 8, 7, 7, 6, 7, 7, 8,
8, 8, 7, 8, 7, 7, 8, 7, 1024]
the max value of each map:
[0.8979, 0.8032, 0.6506, 0.6165, 0.6372, 0.4791, 0.5207, 0.4606, 0.8633,
0.8925, 0.8269, 0.7173, 0.8287, 0.5979, 0.5846, 0.6472, 0.7989, 0.7900,
0.9217, 0.6162, 0.6745, 0.5610, 0.8472, 0.8779, 0.0000, 0.9160, 0.5829,
0.4558, 0.6468, 0.4131, 0.6791, 0.7397, 0.8015, 0.6352, 0.7229, 0.6346,
0.4943, 0.6630, 0.8280, 0.6312, 0.9719, 0.9219, 0.7392, 0.7575, 0.6396,
0.9408, 0.8714, 0.8833, 0.6204, 0.6979, 0.6662, 0.8334, 0.7105, 0.7566,
0.9396, 0.8781, 0.6446, 0.9268, 0.7921, 0.9549, 0.5869, 0.9428, 0.8311,
0.9922, 0.5339, 0.8815, 0.9315, 0.9609, 1.0000]
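These statistics are consistent with the Gaussian being evaluated in input-image coordinates on the 32x32 downsampled grid, with near-zero responses truncated. The sketch below reproduces the counts under those assumptions; the landmark coordinates and the 0.01 cutoff are illustrative, not the repo's exact generate_label_map logic:

```python
import numpy as np

H = W = 256
downsample, sigma = 8, 4
h, w = H // downsample, W // downsample            # 32 x 32 label maps
px, py = 131.3, 97.8                               # hypothetical landmark (input coords)

ys, xs = np.mgrid[0:h, 0:w]
# grid-cell centres mapped back to input-image coordinates
cx = xs * downsample + downsample / 2.0
cy = ys * downsample + downsample / 2.0
g = np.exp(-((cx - px) ** 2 + (cy - py) ** 2) / (2.0 * sigma ** 2))
g[g < 0.01] = 0                                    # truncate the far tail

print((g > 0).sum())        # ~6-8 non-zero foreground pixels, as in the list above
bg = 1.0 - g                # background channel: non-zero at all 32*32 = 1024 pixels
print((bg > 0).sum())
```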
You can see that the background map (the last channel) is labeled at all 32*32 = 1024 pixels, while each foreground map has only ~7 non-zero labels, so there is a severe imbalance between background and foreground. I changed generate_label_map and added Adaptive Wing loss with loss weight maps, both from the Adaptive Wing Loss paper (Wang et al., ICCV 2019). After that, my model trains well.
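For reference, here is a minimal PyTorch sketch of Adaptive Wing loss plus the weighted loss map, following the formulas and default hyper-parameters in the paper (alpha=2.1, omega=14, epsilon=1, theta=0.5). This is my reading of the paper, not the exact code I used:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdaptiveWingLoss(nn.Module):
    """Adaptive Wing loss (Wang et al., ICCV 2019), paper-default hyper-parameters."""
    def __init__(self, alpha=2.1, omega=14.0, epsilon=1.0, theta=0.5):
        super().__init__()
        self.alpha, self.omega = alpha, omega
        self.epsilon, self.theta = epsilon, theta

    def forward(self, pred, target, weight_map=None):
        diff = (target - pred).abs()
        p = self.alpha - target                       # exponent depends on the label value
        # linear branch A*|diff| - C, chosen so both branches meet smoothly at theta
        A = self.omega / (1 + (self.theta / self.epsilon) ** p) \
            * p * (self.theta / self.epsilon) ** (p - 1) / self.epsilon
        C = self.theta * A - self.omega * torch.log1p((self.theta / self.epsilon) ** p)
        loss = torch.where(diff < self.theta,
                           self.omega * torch.log1p((diff / self.epsilon) ** p),
                           A * diff - C)
        if weight_map is not None:
            loss = loss * weight_map
        return loss.mean()

def loss_weight_map(target, w=10.0, threshold=0.2):
    """W = 1 + w * M, where M is the binarized, dilated ground-truth map.
    target: (N, C, H, W); threshold and the 3x3 dilation follow the paper."""
    mask = (target >= threshold).float()
    mask = F.max_pool2d(mask, 3, stride=1, padding=1)  # dilation via max-pool
    return 1.0 + w * mask
```

With the weight map, the ~1024 background pixels no longer drown out the handful of non-zero foreground pixels in each map.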
Have you ever encountered this problem when training the ProCPM model?
Which project are you using?
SRT
logs:
batch_size : 128
optimizer : sgd
LR : 0.0005
momentum : 0.9
Decay : 0.0005
nesterov : 1
criterion_ht : MSE-batch
epochs : 150
schedule : [60, 90, 120]
gamma : 0.1
pre_crop : 0.2
scale_min : 0.9
scale_max : 1.1
shear_max : 0.2
offset_max : 0.2
rotate_max : 30
cut_out : 0.1
sigma : 4
shape : [256, 256]
heatmap_type : gaussian
pixel_jitter_max : 20
downsample : 8
num_pts : 68
Training-data : GeneralDataset(point-num=68, shape=[256, 256], sigma=4, heatmap_type=gaussian, length=31528, cutout=0.1, dataset=train)
Testing-data : GeneralDataset(point-num=68, shape=[256, 256], sigma=4, heatmap_type=gaussian, length=689, cutout=0, dataset=test_300w)
Optimizer : SGD (
Parameter Group 0
dampening: 0
initial_lr: 0.0005
lr: 0.0005
momentum: 0.9
nesterov: 1
weight_decay: 0.0005
)
MSE Loss with reduction=['MSE', 'batch']
=> do not find the last-info file : ../snopshots/last-info.pth
==>>[2021-01-04 10:47:29] [epoch-000-150], [[Time Left: 00:00:00]], LR : [0.00050 ~ 0.00050], Config : {'epochs': 150, 'num_pts': 68, 'sigma': 4, 'print_freq': 10, 'downsample': 8, 'shape': [256, 256]}
-->[train]: [epoch-000-150][000/247] Time 16.47 (16.47) Data 7.20 (7.20) Forward 14.50 (14.50) Loss_all 104426.6016 (104426.6016) [Time Left: 01:07:30] ht_loss=104426.6016 : L1=35482.0781 : L2=35589.6680 : L3=33354.8516
-->[train]: [epoch-000-150][010/247] Time 1.11 (2.38) Data 0.16 (0.67) Forward 0.20 (1.38) Loss_all 279.1572 (11552.4368) [Time Left: 00:09:21] ht_loss=279.1572 : L1=93.0946 : L2=93.0299 : L3=93.0327
-->[train]: [epoch-000-150][020/247] Time 4.60 (2.21) Data 3.63 (0.87) Forward 3.68 (1.26) Loss_all 279.8896 (6183.9105) [Time Left: 00:08:20] ht_loss=279.8896 : L1=93.3314 : L2=93.2782 : L3=93.2800
-->[train]: [epoch-000-150][030/247] Time 0.95 (2.04) Data 0.00 (0.82) Forward 0.05 (1.10) Loss_all 279.0271 (4278.9738) [Time Left: 00:07:20] ht_loss=279.0271 : L1=93.0395 : L2=92.9885 : L3=92.9991
-->[train]: [epoch-000-150][040/247] Time 4.48 (2.12) Data 3.52 (0.96) Forward 3.56 (1.19) Loss_all 279.3033 (3303.2060) [Time Left: 00:07:16] ht_loss=279.3033 : L1=93.0816 : L2=93.1021 : L3=93.1196
-->[train]: [epoch-000-150][050/247] Time 0.96 (2.03) Data 0.00 (0.92) Forward 0.05 (1.11) Loss_all 278.0408 (2710.1125) [Time Left: 00:06:38] ht_loss=278.0408 : L1=92.7466 : L2=92.6507 : L3=92.6435
-->[train]: [epoch-000-150][060/247] Time 4.94 (2.13) Data 3.96 (1.04) Forward 4.00 (1.21) Loss_all 279.6303 (2311.4327) [Time Left: 00:06:36] ht_loss=279.6303 : L1=93.1926 : L2=93.2191 : L3=93.2186
-->[train]: [epoch-000-150][070/247] Time 0.98 (2.08) Data 0.00 (1.00) Forward 0.05 (1.16) Loss_all 277.1473 (2025.1064) [Time Left: 00:06:05] ht_loss=277.1473 : L1=92.4281 : L2=92.3588 : L3=92.3605
-->[train]: [epoch-000-150][080/247] Time 4.86 (2.10) Data 3.87 (1.04) Forward 3.91 (1.18) Loss_all 278.3179 (1809.4039) [Time Left: 00:05:48] ht_loss=278.3179 : L1=92.7645 : L2=92.7693 : L3=92.7842
-->[train]: [epoch-000-150][090/247] Time 0.98 (2.08) Data 0.00 (1.03) Forward 0.05 (1.16) Loss_all 277.4024 (1641.1797) [Time Left: 00:05:24] ht_loss=277.4024 : L1=92.4464 : L2=92.4706 : L3=92.4854
-->[train]: [epoch-000-150][100/247] Time 6.01 (2.09) Data 5.07 (1.05) Forward 5.11 (1.17) Loss_all 277.7260 (1506.2978) [Time Left: 00:05:05] ht_loss=277.7260 : L1=92.5351 : L2=92.5909 : L3=92.6000
-->[train]: [epoch-000-150][110/247] Time 0.98 (2.08) Data 0.00 (1.04) Forward 0.05 (1.15) Loss_all 278.5226 (1395.6688) [Time Left: 00:04:42] ht_loss=278.5226 : L1=92.8076 : L2=92.8527 : L3=92.8623
-->[train]: [epoch-000-150][120/247] Time 4.90 (2.08) Data 3.86 (1.05) Forward 3.98 (1.16) Loss_all 277.3715 (1303.3518) [Time Left: 00:04:22] ht_loss=277.3715 : L1=92.4822 : L2=92.4495 : L3=92.4398
-->[train]: [epoch-000-150][130/247] Time 1.01 (2.06) Data 0.00 (1.03) Forward 0.05 (1.14) Loss_all 277.4960 (1225.1186) [Time Left: 00:03:59] ht_loss=277.4960 : L1=92.5309 : L2=92.4777 : L3=92.4875
-->[train]: [epoch-000-150][140/247] Time 4.89 (2.07) Data 3.85 (1.04) Forward 3.89 (1.14) Loss_all 277.4550 (1157.9637) [Time Left: 00:03:39] ht_loss=277.4550 : L1=92.4373 : L2=92.4946 : L3=92.5231
-->[train]: [epoch-000-150][150/247] Time 0.97 (2.07) Data 0.00 (1.04) Forward 0.05 (1.14) Loss_all 277.0612 (1099.6810) [Time Left: 00:03:18] ht_loss=277.0612 : L1=92.3476 : L2=92.3554 : L3=92.3583
-->[train]: [epoch-000-150][160/247] Time 4.87 (2.09) Data 3.86 (1.06) Forward 3.90 (1.16) Loss_all 278.3372 (1048.6711) [Time Left: 00:02:59] ht_loss=278.3372 : L1=92.7852 : L2=92.7728 : L3=92.7792
-->[train]: [epoch-000-150][170/247] Time 1.05 (2.07) Data 0.00 (1.05) Forward 0.05 (1.14) Loss_all 278.7277 (1003.6236) [Time Left: 00:02:37] ht_loss=278.7277 : L1=92.8838 : L2=92.9143 : L3=92.9296
-->[train]: [epoch-000-150][180/247] Time 5.15 (2.08) Data 4.04 (1.06) Forward 4.21 (1.15) Loss_all 278.2527 (963.5538) [Time Left: 00:02:17] ht_loss=278.2527 : L1=92.8155 : L2=92.7081 : L3=92.7291
-->[train]: [epoch-000-150][190/247] Time 1.03 (2.05) Data 0.00 (1.04) Forward 0.05 (1.12) Loss_all 277.6393 (927.6484) [Time Left: 00:01:55] ht_loss=277.6393 : L1=92.5422 : L2=92.5445 : L3=92.5525
-->[train]: [epoch-000-150][200/247] Time 4.82 (2.08) Data 3.81 (1.06) Forward 3.86 (1.15) Loss_all 278.6940 (895.3700) [Time Left: 00:01:35] ht_loss=278.6940 : L1=92.8379 : L2=92.9121 : L3=92.9440
-->[train]: [epoch-000-150][210/247] Time 1.01 (2.07) Data 0.00 (1.05) Forward 0.05 (1.14) Loss_all 278.6920 (866.1021) [Time Left: 00:01:14] ht_loss=278.6920 : L1=92.8799 : L2=92.8969 : L3=92.9152
-->[train]: [epoch-000-150][220/247] Time 4.31 (2.09) Data 3.27 (1.07) Forward 3.32 (1.15) Loss_all 277.6641 (839.5094) [Time Left: 00:00:54] ht_loss=277.6641 : L1=92.5900 : L2=92.5280 : L3=92.5461
-->[train]: [epoch-000-150][230/247] Time 0.98 (2.08) Data 0.00 (1.06) Forward 0.05 (1.14) Loss_all 277.3657 (815.2032) [Time Left: 00:00:33] ht_loss=277.3657 : L1=92.4130 : L2=92.4656 : L3=92.4871
-->[train]: [epoch-000-150][240/247] Time 5.31 (2.09) Data 4.35 (1.07) Forward 4.40 (1.16) Loss_all 276.9100 (792.9332) [Time Left: 00:00:12] ht_loss=276.9100 : L1=92.2631 : L2=92.3141 : L3=92.3328
-->[train]: [epoch-000-150][246/247] Time 2.29 (2.09) Data 0.00 (1.07) Forward 1.03 (1.15) Loss_all 278.2914 (781.8199) [Time Left: 00:00:00] ht_loss=278.2914 : L1=92.7296 : L2=92.7785 : L3=92.7834
Eval dataset length 31528, labeled data length 31528
Compute NME for 31528 images with 68 points :: [(nms): mean=164.630, std=33.857]
==>>[2021-01-04 10:56:13] Train [epoch-000-150] Average Loss = 781.819878, NME = 164.63
save checkpoint into ../snopshots/checkpoint/HEATMAP-epoch-000-150.pth
save checkpoint into ../snopshots/last-info.pth
==>>[2021-01-04 10:56:13] [epoch-001-150], [[Time Left: 21:42:18]], LR : [0.00050 ~ 0.00050], Config : {'epochs': 150, 'num_pts': 68, 'sigma': 4, 'print_freq': 10, 'downsample': 8, 'shape': [256, 256]}
-->[train]: [epoch-001-150][000/247] Time 8.17 (8.17) Data 7.04 (7.04) Forward 7.16 (7.16) Loss_all 278.1276 (278.1276) [Time Left: 00:33:30] ht_loss=278.1276 : L1=92.7402 : L2=92.6863 : L3=92.7011
-->[train]: [epoch-001-150][010/247] Time 1.02 (2.70) Data 0.00 (1.68) Forward 0.05 (1.74) Loss_all 277.0975 (278.3596) [Time Left: 00:10:37] ht_loss=277.0975 : L1=92.3605 : L2=92.3538 : L3=92.3832
-->[train]: [epoch-001-150][020/247] Time 4.71 (2.55) Data 3.67 (1.54) Forward 3.72 (1.59) Loss_all 277.3455 (278.1147) [Time Left: 00:09:37] ht_loss=277.3455 : L1=92.4743 : L2=92.4308 : L3=92.4404
-->[train]: [epoch-001-150][030/247] Time 1.08 (2.31) Data 0.00 (1.29) Forward 0.11 (1.34) Loss_all 278.7683 (278.1484) [Time Left: 00:08:18] ht_loss=278.7683 : L1=92.9498 : L2=92.9005 : L3=92.9180
-->[train]: [epoch-001-150][040/247] Time 7.17 (2.35) Data 6.17 (1.33) Forward 6.22 (1.38) Loss_all 278.2269 (278.1230) [Time Left: 00:08:04] ht_loss=278.2269 : L1=92.6964 : L2=92.7372 : L3=92.7933
-->[train]: [epoch-001-150][050/247] Time 1.03 (2.30) Data 0.00 (1.27) Forward 0.05 (1.32) Loss_all 276.7727 (278.0273) [Time Left: 00:07:29] ht_loss=276.7727 : L1=92.2674 : L2=92.2390 : L3=92.2664
-->[train]: [epoch-001-150][060/247] Time 5.33 (2.29) Data 4.27 (1.26) Forward 4.32 (1.32) Loss_all 277.9707 (277.9931) [Time Left: 00:07:06] ht_loss=277.9707 : L1=92.5334 : L2=92.6936 : L3=92.7436
-->[train]: [epoch-001-150][070/247] Time 1.08 (2.23) Data 0.00 (1.20) Forward 0.07 (1.25) Loss_all 276.6379 (277.9465) [Time Left: 00:06:32] ht_loss=276.6379 : L1=92.0694 : L2=92.2572 : L3=92.3113
-->[train]: [epoch-001-150][080/247] Time 5.25 (2.25) Data 4.22 (1.21) Forward 4.27 (1.26) Loss_all 278.9980 (277.9131) [Time Left: 00:06:12] ht_loss=278.9980 : L1=92.9393 : L2=93.0119 : L3=93.0468
-->[train]: [epoch-001-150][090/247] Time 1.11 (2.20) Data 0.00 (1.16) Forward 0.06 (1.22) Loss_all 278.4267 (277.9751) [Time Left: 00:05:43] ht_loss=278.4267 : L1=92.7253 : L2=92.8161 : L3=92.8853
-->[train]: [epoch-001-150][100/247] Time 3.97 (2.22) Data 2.93 (1.18) Forward 2.98 (1.23) Loss_all 277.6642 (277.8997) [Time Left: 00:05:23] ht_loss=277.6642 : L1=92.5115 : L2=92.5662 : L3=92.5865
-->[train]: [epoch-001-150][110/247] Time 1.00 (2.21) Data 0.00 (1.17) Forward 0.04 (1.22) Loss_all 278.4783 (277.8227) [Time Left: 00:05:00] ht_loss=278.4783 : L1=92.5632 : L2=92.6771 : L3=93.2380
-->[train]: [epoch-001-150][120/247] Time 1.53 (2.18) Data 0.52 (1.14) Forward 0.56 (1.20) Loss_all 278.9060 (277.7973) [Time Left: 00:04:34] ht_loss=278.9060 : L1=92.9879 : L2=92.9261 : L3=92.9920
-->[train]: [epoch-001-150][130/247] Time 1.04 (2.20) Data 0.00 (1.16) Forward 0.07 (1.22) Loss_all 275.1124 (277.7316) [Time Left: 00:04:15] ht_loss=275.1124 : L1=91.6005 : L2=91.7149 : L3=91.7970
-->[train]: [epoch-001-150][140/247] Time 1.36 (2.19) Data 0.35 (1.15) Forward 0.39 (1.21) Loss_all 277.0453 (277.6980) [Time Left: 00:03:52] ht_loss=277.0453 : L1=92.2058 : L2=92.3897 : L3=92.4498
-->[train]: [epoch-001-150][150/247] Time 1.05 (2.20) Data 0.00 (1.16) Forward 0.05 (1.22) Loss_all 277.4876 (277.6891) [Time Left: 00:03:31] ht_loss=277.4876 : L1=92.4398 : L2=92.4923 : L3=92.5554
-->[train]: [epoch-001-150][160/247] Time 1.01 (2.19) Data 0.00 (1.15) Forward 0.06 (1.20) Loss_all 276.7957 (277.6487) [Time Left: 00:03:08] ht_loss=276.7957 : L1=92.1740 : L2=92.2719 : L3=92.3497
-->[train]: [epoch-001-150][170/247] Time 1.07 (2.21) Data 0.00 (1.17) Forward 0.05 (1.23) Loss_all 274.4813 (277.5769) [Time Left: 00:02:48] ht_loss=274.4813 : L1=91.2472 : L2=91.5625 : L3=91.6715
-->[train]: [epoch-001-150][180/247] Time 4.76 (2.20) Data 3.73 (1.16) Forward 3.78 (1.22) Loss_all 277.1411 (277.5542) [Time Left: 00:02:25] ht_loss=277.1411 : L1=92.2720 : L2=92.3849 : L3=92.4842
-->[train]: [epoch-001-150][190/247] Time 1.03 (2.19) Data 0.00 (1.15) Forward 0.06 (1.20) Loss_all 276.4174 (277.5101) [Time Left: 00:02:02] ht_loss=276.4174 : L1=91.9825 : L2=92.1831 : L3=92.2517
-->[train]: [epoch-001-150][200/247] Time 1.08 (2.18) Data 0.00 (1.14) Forward 0.08 (1.20) Loss_all 276.7092 (277.4698) [Time Left: 00:01:40] ht_loss=276.7092 : L1=92.0072 : L2=92.3023 : L3=92.3998
-->[train]: [epoch-001-150][210/247] Time 1.01 (2.19) Data 0.00 (1.15) Forward 0.05 (1.21) Loss_all 277.3184 (277.4343) [Time Left: 00:01:18] ht_loss=277.3184 : L1=92.2769 : L2=92.4722 : L3=92.5693
-->[train]: [epoch-001-150][220/247] Time 2.94 (2.19) Data 1.92 (1.15) Forward 1.96 (1.20) Loss_all 275.9615 (277.3970) [Time Left: 00:00:56] ht_loss=275.9615 : L1=91.9907 : L2=91.9326 : L3=92.0382
-->[train]: [epoch-001-150][230/247] Time 1.04 (2.19) Data 0.00 (1.15) Forward 0.07 (1.20) Loss_all 276.7830 (277.3277) [Time Left: 00:00:35] ht_loss=276.7830 : L1=91.9892 : L2=92.3281 : L3=92.4656
-->[train]: [epoch-001-150][240/247] Time 4.19 (2.20) Data 3.19 (1.16) Forward 3.24 (1.22) Loss_all 275.8693 (277.2501) [Time Left: 00:00:13] ht_loss=275.8693 : L1=91.6268 : L2=92.0341 : L3=92.2084
-->[train]: [epoch-001-150][246/247] Time 0.37 (2.19) Data 0.00 (1.15) Forward 0.04 (1.21) Loss_all 277.2675 (277.2426) [Time Left: 00:00:00] ht_loss=277.2675 : L1=92.4372 : L2=92.3524 : L3=92.4779
Eval dataset length 31528, labeled data length 31528
Compute NME for 31528 images with 68 points :: [(nms): mean=165.022, std=33.734]
==>>[2021-01-04 11:05:24] Train [epoch-001-150] Average Loss = 277.242612, NME = 165.02
save checkpoint into ../snopshots/checkpoint/HEATMAP-epoch-001-150.pth
save checkpoint into ../snopshots/last-info.pth
Basic-Eval-All evaluates 1 dataset
==>>[2021-01-04 11:05:24], [epoch-001-150], evaluate the 0/1-th dataset [image] : GeneralDataset(point-num=68, shape=[256, 256], sigma=4, heatmap_type=gaussian, length=689, cutout=0, dataset=test_300w)
-->[test]: [epoch-001-150][000/006] Time 6.60 (6.60) Data 6.19 (6.19) Forward 6.22 (6.22) Loss_all 280.0911 (280.0911) [Time Left: 00:00:33] ht_loss=280.0911 : L1=93.1269 : L2=93.3953 : L3=93.5689
-->[test]: [epoch-001-150][005/006] Time 1.23 (1.57) Data 0.00 (1.03) Forward 1.14 (1.25) Loss_all 280.1683 (279.7528) [Time Left: 00:00:00] ht_loss=280.1683 : L1=93.1425 : L2=93.4033 : L3=93.6225
Eval dataset length 689, labeled data length 689
Compute NME for 689 images with 68 points :: [(nms): mean=166.901, std=24.249]
NME Results :
->test_300w : NME = 166.901,