
some questions about keypoints training? #21

Closed · lunalulu opened this issue Jun 5, 2019 · 3 comments

lunalulu commented Jun 5, 2019

While training the keypoints subnet, I noticed that the segmentation branch does not seem to be trained at all, judging from the commented-out lines in the code below:
```python
def build_keypoint_loss(saved_for_loss, heat_temp, heat_weight):
    names = build_names()
    saved_for_log = OrderedDict()
    criterion = nn.MSELoss(size_average=True).cuda()
    total_loss = 0
    div1 = 1.
    #div2 = 100.

    for j in range(5):

        pred1 = saved_for_loss[j][:, :18, :, :] * heat_weight
        gt1 = heat_weight * heat_temp

        #pred2 = saved_for_loss[j][:, 18:, :, :]
        #gt2 = mask_all

        # Compute losses
        loss1 = criterion(pred1, gt1)/div1  # heatmap_loss
        #loss2 = criterion(pred2, gt2)/div2  # mask_loss
        total_loss += loss1
        #total_loss += loss2

        # Get value from Tensor and save for log
        saved_for_log[names[j*2]] = loss1.item()  # only the heatmap loss is kept for logging
        #saved_for_log[names[j*2+1]] = loss2.item()

    saved_for_log['max_ht'] = torch.max(
        saved_for_loss[-1].data[:, :18, :, :]).item()
    saved_for_log['min_ht'] = torch.min(
        saved_for_loss[-1].data[:, :18, :, :]).item()
    #saved_for_log['max_mask'] = torch.max(
    #    saved_for_loss[-1].data[:, 18:, :, :]).item()
    #saved_for_log['min_mask'] = torch.min(
    #    saved_for_loss[-1].data[:, 18:, :, :]).item()

    return total_loss, saved_for_log
```
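In other words, if those commented lines were re-enabled, I'd expect the loop body to look roughly like this (just my own sketch, assuming `mask_all` were passed in as an extra argument, that channels `18:` hold the segmentation output, and that `div2 = 100.` were uncommented):

```python
# My own sketch (not the repo's code): the loop body with the mask term enabled.
# It slots into the function above and assumes mask_all is an extra argument.
for j in range(5):
    pred1 = saved_for_loss[j][:, :18, :, :] * heat_weight   # keypoint heatmaps
    gt1 = heat_weight * heat_temp

    pred2 = saved_for_loss[j][:, 18:, :, :]                  # segmentation channels
    gt2 = mask_all

    loss1 = criterion(pred1, gt1) / div1   # heatmap loss
    loss2 = criterion(pred2, gt2) / div2   # mask loss
    total_loss += loss1 + loss2
```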

Another question: I found that when I train on two GPUs, the training speed is lower than on one GPU. Logs are attached below:

[training log screenshots]

Thanks for your reply.

LiMeng95 (Owner) commented Jun 7, 2019

Hi,

  1. The Keypoint Estimation Subnet with a person segmentation mask is still on the to-do list; I have not added it to the keypoints subnet yet.

  2. The batch size of the dataloader is linearly related to the number of GPUs. Specifically, the batch size is multiplied by the number of GPUs you enable via CUDA_VISIBLE_DEVICES; the related code is in multipose_keypoint_train.py#L61 (see the sketch after this list). More GPUs therefore means more pre-processing per step, such as data augmentation and ground-truth generation, all of which runs on the CPU. So I think the training speed may be limited not only by the GPUs but also by the CPUs' computational ability.
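As a rough illustration of point 2 (my paraphrase, not the exact code at that line; `per_gpu_batch_size`, `train_dataset` and the worker count are placeholders):

```python
import torch
from torch.utils.data import DataLoader

# Rough sketch of the scaling described above (not the exact training-script code).
# train_dataset stands in for the COCO keypoint dataset object built by the script.
per_gpu_batch_size = 10                       # placeholder per-GPU value
num_gpus = max(1, torch.cuda.device_count())  # GPUs visible through CUDA_VISIBLE_DEVICES
batch_size = per_gpu_batch_size * num_gpus    # effective batch size grows with GPU count

train_loader = DataLoader(
    train_dataset,
    batch_size=batch_size,
    shuffle=True,
    num_workers=4,      # CPU workers doing augmentation / ground-truth generation
    pin_memory=True,
)

# With 2 GPUs the loader feeds 20 samples per step instead of 10, so the CPU-side
# pre-processing has to produce twice as much data per step to keep both GPUs busy.
```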

lunalulu (Author) commented

@LiMeng95 Thanks very much!

Another question: I trained with ResNet-50 as the backbone, but compared to ckpt_baseline_resnet101.h5 in your project, the inference FPS went up by only about 2 frames. Have you trained with ResNet-50? How fast is it compared with ResNet-101? Could you provide a ResNet-50 checkpoint?
Thanks.

LiMeng95 (Owner) commented

Sorry, I didn't train with ResNet-50. The backbone may not be the main limitation on inference FPS; other parts of the pipeline, such as person detection, also take some time.
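If you want to check where the time goes, a rough timing sketch like the one below should show the split between detection and the keypoint subnet (`detector`, `keypoint_net` and `image` are placeholders, not the project's actual API):

```python
import time
import torch

# Rough per-stage timing sketch; detector and keypoint_net are placeholders for the
# person-detection and keypoint models in the pipeline, image is a prepared input tensor.
def profile_inference(detector, keypoint_net, image):
    with torch.no_grad():
        torch.cuda.synchronize()
        t0 = time.time()
        boxes = detector(image)          # person detection stage
        torch.cuda.synchronize()
        t1 = time.time()
        heatmaps = keypoint_net(image)   # backbone + keypoint subnet stage
        torch.cuda.synchronize()
        t2 = time.time()
    return {"detection_s": t1 - t0, "keypoints_s": t2 - t1}
```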

LiMeng95 closed this as completed Jul 6, 2019